Important YARN configuration properties
February 13, 2017
To configure YARN, and MapReduce on top of YARN, we need to look at a couple of configuration files:
- yarn-site.xml
- mapred-site.xml
yarn-site.xml
- yarn.scheduler.minimum-allocation-mb: The minimum memory allocation, in MB, for every container request at the RM.
- yarn.scheduler.maximum-allocation-mb: The maximum memory allocation, in MB, for every container request at the RM.
- yarn.scheduler.minimum-allocation-vcores: The minimum allocation for every container request at the RM, in terms of virtual CPU cores.
- yarn.scheduler.maximum-allocation-vcores: The maximum allocation for every container request at the RM, in terms of virtual CPU cores.
- yarn.nodemanager.resource.memory-mb: Amount of physical memory, in MB, that can be allocated for containers. This is the total RAM on a given node that the NodeManager can use to create containers.
- yarn.nodemanager.resource.cpu-vcores: Number of vcores that can be allocated for containers. This is used by the RM scheduler when allocating resources for containers. This is not used to limit the number of physical cores used by YARN containers.
- yarn.nodemanager.pmem-check-enabled: Whether physical memory limits will be enforced for containers.
- yarn.nodemanager.vmem-check-enabled: Whether virtual memory limits will be enforced for containers.
- yarn.nodemanager.vmem-pmem-ratio: Ratio of virtual memory to physical memory when setting memory limits for containers. Container allocations are expressed in terms of physical memory, and virtual memory usage is allowed to exceed this allocation by this ratio.
Virtual memory here means physical memory plus paged memory.
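As a rough sketch, the properties above might appear in yarn-site.xml as shown below. Every value is an illustrative assumption for a hypothetical worker node with 64 GB of RAM and 16 cores, not a recommendation; size them to your own hardware and workload.

```xml
<?xml version="1.0"?>
<configuration>
  <!-- Assumed node: 64 GB RAM, 16 cores. 48 GB and 12 vcores are offered to containers. -->
  <property>
    <name>yarn.nodemanager.resource.memory-mb</name>
    <value>49152</value>
  </property>
  <property>
    <name>yarn.nodemanager.resource.cpu-vcores</name>
    <value>12</value>
  </property>

  <!-- Assumed per-container bounds enforced by the RM scheduler -->
  <property>
    <name>yarn.scheduler.minimum-allocation-mb</name>
    <value>2048</value>
  </property>
  <property>
    <name>yarn.scheduler.maximum-allocation-mb</name>
    <value>16384</value>
  </property>
  <property>
    <name>yarn.scheduler.minimum-allocation-vcores</name>
    <value>1</value>
  </property>
  <property>
    <name>yarn.scheduler.maximum-allocation-vcores</name>
    <value>8</value>
  </property>

  <!-- Enforce both physical and virtual memory limits on containers -->
  <property>
    <name>yarn.nodemanager.pmem-check-enabled</name>
    <value>true</value>
  </property>
  <property>
    <name>yarn.nodemanager.vmem-check-enabled</name>
    <value>true</value>
  </property>
  <!-- With a ratio of 2.1, a 2048 MB container may use up to
       2048 x 2.1 = 4300.8 MB of virtual memory -->
  <property>
    <name>yarn.nodemanager.vmem-pmem-ratio</name>
    <value>2.1</value>
  </property>
</configuration>
```

With these assumed numbers, a node could host at most 49152 / 2048 = 24 minimum-sized containers by memory, although the vcore setting may limit the count further depending on the scheduler's resource calculator.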
Reference:
https://hadoop.apache.org/docs/r2.7.1/hadoop-yarn/hadoop-yarn-common/yarn-default.xml
http://hortonworks.com/blog/how-to-plan-and-configure-yarn-in-hdp-2-0/
mapred-site.xml
- mapreduce.framework.name: Set this to yarn so that MapReduce jobs run on YARN.
- mapreduce.map.memory.mb: The amount of memory, in MB, to request from the YARN scheduler for each map task. This is the total physical RAM of a map task container.
- mapreduce.map.java.opts: The JVM options for each map task, chiefly the maximum heap size. Set the heap to about 0.8 times mapreduce.map.memory.mb, so that the JVM's total memory use stays within the container's physical memory.
- mapreduce.reduce.memory.mb: The amount of memory to request from the YARN scheduler for each reduce task.
- yarn.app.mapreduce.am.resource.mb: The amount of memory the MR AppMaster needs.
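A minimal sketch of how these properties might look in mapred-site.xml follows. The memory sizes are assumed values chosen only to illustrate the 0.8 heap-to-container rule; they are not tuned recommendations.

```xml
<?xml version="1.0"?>
<configuration>
  <!-- Run MapReduce jobs on YARN -->
  <property>
    <name>mapreduce.framework.name</name>
    <value>yarn</value>
  </property>

  <!-- Assumed container size for each map task: 2048 MB -->
  <property>
    <name>mapreduce.map.memory.mb</name>
    <value>2048</value>
  </property>
  <!-- Heap at roughly 0.8 of the container: 0.8 x 2048 = 1638.4, rounded down to 1638 MB -->
  <property>
    <name>mapreduce.map.java.opts</name>
    <value>-Xmx1638m</value>
  </property>

  <!-- Assumed container size for each reduce task: 4096 MB -->
  <property>
    <name>mapreduce.reduce.memory.mb</name>
    <value>4096</value>
  </property>

  <!-- Assumed container size for the MR AppMaster: 2048 MB -->
  <property>
    <name>yarn.app.mapreduce.am.resource.mb</name>
    <value>2048</value>
  </property>
</configuration>
```

The remaining 20% or so of each container is headroom for JVM overhead outside the heap, such as thread stacks and native memory. Each memory request should also fall within the range set by yarn.scheduler.minimum-allocation-mb and yarn.scheduler.maximum-allocation-mb in yarn-site.xml.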