System Configuration defines the computers, processes, and devices that compose the system and its boundary. More generally, the system configuration is the specific definition of the elements that define or prescribe what a system is composed of.
It consists of various configuration properties for MapReduce, ResourceManager and NodeManager.
To configure ResourceManager, NodeManager, or MapReduce properties, click Configure MapReduce under the ADMIN menu tab. Change the properties as required and click Save to apply the changes.
Various properties that can be configured are:
Type | Key | Default Value | Description |
---|---|---|---|
Map Reduce | yarn.resourcemanager.address | 0.0.0.0:8040 | The address of the applications manager interface in the RM. |
Map Reduce | yarn.resourcemanager.scheduler.address | 0.0.0.0:8141 | The address of the scheduler interface. |
Map Reduce | yarn.resourcemanager.webapp.address | 0.0.0.0:8088 | The address of the RM web application. |
Map Reduce | yarn.resourcemanager.resource-tracker.address | 0.0.0.0:8025 | The address of the RM Resource Tracker. |
Map Reduce | yarn.resourcemanager.admin.address | 0.0.0.0:8141 | The address of the RM admin interface. |
Map Reduce | mapreduce.job.hdfs-servers | ${fs.default.name} | HDFS Server URI. |
Map Reduce | mapreduce.framework.name | yarn | The runtime framework for executing MapReduce jobs. Can be one of local, classic or yarn. |
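Map Reduce | mapreduce.map.memory.mb | 1536 | The amount of memory, in MB, to request from the scheduler for each map task. |
Map Reduce | mapreduce.reduce.memory.mb | 3072 | The amount of memory, in MB, to request from the scheduler for each reduce task. Larger than the limit for maps. |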
Map Reduce | mapreduce.task.io.sort.mb | 512 | The total amount of buffer memory to use while sorting files, in megabytes. By default, gives each merge stream 1MB, which should minimize seeks. |
Map Reduce | mapreduce.task.io.sort.factor | 100 | The number of streams to merge at once while sorting files. This determines the number of open file handles. |
Map Reduce | mapreduce.reduce.shuffle.parallelcopies | 50 | The default number of parallel transfers run by reduce during the copy (shuffle) phase. |
Map Reduce | queryio.yarn.log-dir | | Where log files are stored. Used by queryio server for yarn runtime configuration. |
Map Reduce | queryio.yarn.pid-dir | | The directory where pid files are stored. Used by queryio server for yarn runtime configuration. |
Map Reduce | queryio.yarn.heap-size | 4096 | The maximum amount of heap to use, in MB. Used by queryio server for yarn runtime configuration. |
Node Manager | yarn.nodemanager.address | 0.0.0.0:0 | Address of node manager IPC. |
Node Manager | yarn.nodemanager.localizer.address | 0.0.0.0:4344 | Address where the localizer IPC is. |
Node Manager | yarn.nodemanager.container-manager.thread-count | 5 | Number of threads container manager uses. |
Node Manager | yarn.nodemanager.localizer.client.thread-count | 5 | Number of threads to handle localization requests. |
Node Manager | yarn.nodemanager.heartbeat.interval-ms | 1000 | Heartbeat interval to the RM, in milliseconds. |
Node Manager | yarn.nodemanager.local-dirs | /tmp/nm-local-dir | List of directories to store localized files in. |
Node Manager | yarn.nodemanager.log-dirs | /tmp/logs | Where to store container logs. |
Node Manager | yarn.nodemanager.resource.memory-mb | 8192 | Amount of physical memory, in MB, that can be allocated for containers. |
Node Manager | yarn.nodemanager.webapp.address | 0.0.0.0:9999 | NM Webapp address. |
Node Manager | yarn.nodemanager.aux-services | mapreduce.shuffle | Shuffle service that needs to be set for MapReduce applications. |
Node Manager | queryio.nodemanager.options | -Dcom.sun.management.jmxremote $YARN_NODEMANAGER_OPTS -Dcom.sun.management.jmxremote.port=9010 -Dcom.sun.management.jmxremote.authenticate=false -Dcom.sun.management.jmxremote.ssl.need.client.auth=false -Dcom.sun.management.jmxremote.ssl=false | Node Manager specific runtime options. Used by queryio server for yarn runtime configuration. |
Resource Manager | yarn.resourcemanager.client.thread-count | 10 | The number of threads used to handle applications manager requests. |
Resource Manager | yarn.resourcemanager.scheduler.client.thread-count | 10 | Number of threads to handle scheduler interface. |
Resource Manager | yarn.resourcemanager.admin.client.thread-count | 1 | Number of threads used to handle RM admin interface. |
Resource Manager | yarn.resourcemanager.resource-tracker.client.thread-count | 10 | Number of threads to handle resource tracker calls. |
Resource Manager | yarn.scheduler.minimum-allocation-mb | 128 | The minimum allocation size for every container request at the RM, in MB. Memory requests lower than this won't take effect, and the specified value will get allocated at minimum. |
Resource Manager | yarn.scheduler.maximum-allocation-mb | 10240 | The maximum allocation size for every container request at the RM, in MB. Memory requests higher than this won't take effect, and will get capped to this value. |
Resource Manager | mapreduce.jobhistory.address | 0.0.0.0:10020 | MapReduce JobHistory Server IPC host:port. |
Resource Manager | mapreduce.jobhistory.webapp.address | 0.0.0.0:19888 | MapReduce JobHistory Server Web UI host:port. |
Resource Manager | mapreduce.jobhistory.intermediate-done-dir | /mr-history/tmp | Directory where history files are written by MapReduce jobs. |
Resource Manager | mapreduce.jobhistory.done-dir | /mr-history/done | Directory where history files are managed by the MR JobHistory Server. |
Resource Manager | queryio.unit.num.splits | 100 | Number of splits for each mapper. |
Resource Manager | queryio.resourcemanager.options | -Dcom.sun.management.jmxremote $YARN_RESOURCEMANAGER_OPTS -Dcom.sun.management.jmxremote.port=9008 -Dcom.sun.management.jmxremote.authenticate=false -Dcom.sun.management.jmxremote.ssl.need.client.auth=false -Dcom.sun.management.jmxremote.ssl=false | Resource Manager specific runtime options. Used by queryio server for yarn runtime configuration. |
NOTE: All descriptions are part of Apache Hadoop documentation.
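The memory-related properties in the table interact: a container request is bounded by the scheduler's minimum and maximum allocation, and the result has to fit within the memory a NodeManager offers. The following Python snippet is a minimal sketch of that relationship using the default values listed above. The function name `effective_allocation_mb` is hypothetical, and real schedulers may apply additional rounding that is not modeled here.

```python
# Default values copied from the table above.
MIN_ALLOCATION_MB = 128      # yarn.scheduler.minimum-allocation-mb
MAX_ALLOCATION_MB = 10240    # yarn.scheduler.maximum-allocation-mb
NODE_MEMORY_MB = 8192        # yarn.nodemanager.resource.memory-mb


def effective_allocation_mb(requested_mb: int) -> int:
    """Clamp a container memory request to the scheduler's allowed range.

    Hypothetical helper: it only models the min/max bounds described in
    the table, not any scheduler-specific rounding.
    """
    return max(MIN_ALLOCATION_MB, min(requested_mb, MAX_ALLOCATION_MB))


# Per-task memory requests, also taken from the table defaults.
requests = {
    "mapreduce.map.memory.mb": 1536,
    "mapreduce.reduce.memory.mb": 3072,
}

for key, requested in requests.items():
    granted = effective_allocation_mb(requested)
    # A container can only be placed if it fits in one node's memory.
    fits = granted <= NODE_MEMORY_MB
    print(f"{key}: requested={requested} granted={granted} fits_on_node={fits}")
```

With the defaults shown, both the map and reduce requests already lie inside the allowed range, so they pass through unchanged. A check like this can be useful before saving changed values, since a request larger than `yarn.nodemanager.resource.memory-mb` could not be placed on any single node.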
You can also add custom configuration properties related to any MapReduce cluster component.
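Hadoop configuration files store each property as a `<property>` element with `<name>` and `<value>` children inside a `<configuration>` root. The sketch below shows how a set of key/value pairs would look in that format. It is illustrative only: whether QueryIO writes properties to such a file is an assumption, and both `render_configuration` and the key `example.custom.property` are hypothetical names.

```python
import xml.etree.ElementTree as ET


def render_configuration(properties: dict) -> str:
    """Render key/value pairs in the Hadoop <configuration> XML layout."""
    root = ET.Element("configuration")
    for name, value in properties.items():
        prop = ET.SubElement(root, "property")
        ET.SubElement(prop, "name").text = name
        ET.SubElement(prop, "value").text = str(value)
    ET.indent(root)  # pretty-print; requires Python 3.9 or later
    return ET.tostring(root, encoding="unicode")


if __name__ == "__main__":
    print(render_configuration({
        # A key and default taken from the table above.
        "yarn.nodemanager.resource.memory-mb": 8192,
        # Hypothetical custom property, for illustration only.
        "example.custom.property": "some-value",
    }))
```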