Important YARN configuration properties

To configure YARN and MapReduce on top of YARN, we should look into couple of configuration files

  • yarn-site.xml
  • mapred-site.xml

yarn-site.xml

  • yarn.scheduler.minimum-allocation-mb: The minimum allocation for every container request at the RM
  • yarn.scheduler.maximum-allocation-mb: The maximum allocation for every container request at the RM
  • yarn.scheduler.minimum-allocation-vcores: The minimum allocation for every container request at the RM, in terms of virtual CPU cores.
  • yarn.scheduler.maximum-allocation-vcores: The maximum allocation for every container request at the RM, in terms of virtual CPU cores.
  • yarn.nodemanager.resource.memory-mb: Amount of physical memory, that can be allocated for containers. Total RAM on a given node that can be utilized by the node manager to create the containers
  • yarn.nodemanager.resource.cpu-vcores: Number of vcores that can be allocated for containers. This is used by the RM scheduler when allocating resources for containers. This is not used to limit the number of physical cores used by YARN containers.
  • yarn.nodemanager.pmem-check-enabled: Whether physical memory limits will be enforced for containers.
  • yarn.nodemanager.vmem-check-enabled: Whether virtual memory limits will be enforced for containers.
  • yarn.nodemanager.vmem-pmem-ratio: Ratio between virtual memory to physical memory when setting memory limits for containers. Container allocations are expressed in terms of physical memory, and virtual memory usage is allowed to exceed this allocation by this ratio.

Virtual Memory: physical + paged memory

Reference:
https://hadoop.apache.org/docs/r2.7.1/hadoop-yarn/hadoop-yarn-common/yarn-default.xml
http://hortonworks.com/blog/how-to-plan-and-configure-yarn-in-hdp-2-0/

mapred-site.xml

  • mapreduce.framework.name: yarn
  • mapreduce.map.memory.mb: The amount of memory to request from the YARN scheduler for each map task. This is total physical RAM of a Map Task Container.
  • mapreduce.map.java.opts: The JVM Heap Size (0.8 times above RAM), so that JVM memory is within the container physical memory
  • mapreduce.reduce.memory.mb: The amount of memory to request from the YARN scheduler for each reduce task.
  • yarn.app.mapreduce.am.resource.mb: The amount of memory the MR AppMaster needs.

Reference:
https://hadoop.apache.org/docs/r2.7.2/hadoop-mapreduce-client/hadoop-mapreduce-client-core/mapred-default.xml

Advertisement

2 Responses to Important YARN configuration properties

  1. Pingback: Memory Management in Spark | coding algorithms

  2. Pingback: Tuning Spark Applications | coding algorithms

Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out /  Change )

Facebook photo

You are commenting using your Facebook account. Log Out /  Change )

Connecting to %s

Mawazo

Mostly technology with occasional sprinkling of other random thoughts

amintabar

Amir Amintabar's personal page

101 Books

Reading my way through Time Magazine's 100 Greatest Novels since 1923 (plus Ulysses)

Seek, Plunnge and more...

My words, my world...

ARRM Foundation

Do not wait for leaders; do it alone, person to person - Mother Teresa

Executive Management

An unexamined life is not worth living – Socrates

javaproffesionals

A topnotch WordPress.com site

thehandwritinganalyst

Just another WordPress.com site

coding algorithms

"An approximate answer to the right problem is worth a good deal more than an exact answer to an approximate problem." -- John Tukey

%d bloggers like this: