Tuning Hadoop on Dell PowerEdge Servers - Dell Community

This Dell Technical White Paper explains how to tune BIOS, OS and Hadoop settings to increase performance in Hadoop workloads.

Donald Russell, Solutions Performance Analysis


Contents

Executive summary
  Introduction
  Key findings
    Compression
    Block size
    Number of hard drives
    File system
    OS settings
    Java settings
    Hadoop settings
Methodology
  Hadoop cluster configuration
  BIOS
    Power management
    ASPM
    RAID controller or SATA power management
    Summary
  OS and file system
    Open file descriptors and files
    File system
    Network
    Huge pages
    Linux kernel swappiness parameter
    Linux IO scheduler
    Summary
Hadoop settings
  Hard drive scaling
  Hadoop file system compression
  HDFS block size
  MapReduce
  The MapReduce process
  Map process
  Reduce process
  Java reuse option
  Summary
Conclusion
Appendix A — LZO installation process
Appendix B — Hadoop cluster hardware configuration
Appendix C — Glossary of Hadoop parameters affecting performance
  MapReduce parameters that affect timing
Appendix D — Optimal Hadoop parameters for test cluster
Appendix E — Further information

Tables

Table 1. Compression formats supported in Hadoop
Table 2. HDFS block size and MapReduce tasks — change needed for io.sort.mb parameter
Table 3. Dell C8220 DN cluster hardware configuration
Table 4. Dell R620 M,S,E,HA
Table 5. Optimal Hadoop parameters for test cluster

Figures

Figure 1. Normalized execution time versus hard drives in Hadoop



Add to MapReduce Service Environment Safety Valve:
HADOOP_CLASSPATH=/usr/lib/hadoop/lib/*



Appendix B — Hadoop cluster hardware configuration

Table 3. Dell C8220 DN cluster hardware configuration

Dell C8220 DN Cluster

Hardware Details
Number of nodes: 4
Processor, per node: 2 x E5-2640 (2.5GHz, 55W)
Memory, per node: 8 x 8GB 1333MHz RDIMMs
HDD, per node: 12 x 1TB 2.5" 7.2k SATA
HBA: LSI 2008
BMC firmware revision: 1.28
Power Supply: 4 x 1400W

BIOS Options
BIOS revision FBC revision IOMMU HW Prefetch HW Prefetch Training on SW SVMM SR-IOV Turbo Memory Speed PM MEZZ.LOM, PCI-e ALL nonused USB ports disabled RM disabled / RM LOM Shared SATA Power Management Power Mgmt Mode
2.03 1.20 Disabled Enabled Enabled Enabled Disabled Auto Auto(1600) Disabled N/A Disabled Enabled Max Performance N/A

OS config
JVM version: 1.6.31
OS version: CentOS 6.4



Table 4. Dell R620 M,S,E,HA

Dell R620 M,S,E,HA

Hardware Details
Number of nodes: 3
Processor, per node: E5-2680 (2.7GHz, 130W)
Memory, per node: 8 x 8GB 1600MHz RDIMMs
HDD, per node: 8 x 146GB 2.5" 15k SAS
HBA: H310
BMC firmware revision: 1.11
Power Supply: 2 x 1100W

BIOS Options
BIOS revision FBC revision IOMMU HW Prefetch HW Prefetch Training on SW SVMM SR-IOV Turbo Memory Speed PM MEZZ.LOM, PCI-e ALL nonused USB ports disabled RM disabled / RM LOM Shared SATA Power Management Power Mgmt Mode
3.0.0 1.20 Disabled Enabled Enabled Enabled Disabled Auto Auto(1333) Disabled N/A N/A Disabled Disabled Max Performance N/A

OS config
JVM version: 1.6.31
OS version: CentOS 6.4



Appendix C — Glossary of Hadoop parameters affecting performance

dfs.namenode.handler.count
The number of threads the NameNode uses to serve requests. The default value is 10. Changing the value produced no noticeable change in performance on the test cluster. On larger clusters, changing this value may increase performance, because there are more file operations on the NameNode. Change the value, and then use a tool such as nnbench for verification.

dfs.datanode.handler.count
The number of threads that the data nodes use. The default value is 3. Setting this value to the number of hard drives in the data node seems to yield the best results. Use TestDFSIO to verify whether this works for your cluster.

dfs.datanode.max.xcievers
The maximum number of threads used to access the local file system on a data node. The default value is 256. Increasing this to 2048 increased performance on the test cluster. If the data nodes in your cluster have more than 12 hard drives attached, increasing this value even more might increase data node HDFS performance.

io.file.buffer.size
The memory buffer to which file IO copies, stores and writes data. It should be a multiple of 4096. It should be safe to use 131072, but we used double that value. The performance gain is not enormous. When using HBase, be careful not to set this value too high.

dfs.block.size (hdfs-site.xml)
This parameter controls the block size written by the data node onto the HDFS file system. A larger value in megabytes is generally better, until bottlenecks such as the storage controller or memory come into play.

io.sort.factor
The number of streams to merge concurrently when shuffling and sorting files.

io.sort.mb
The amount of memory used as a buffer when sorting the output of the Mapping process.

mapred.reduce.parallel.copies
The number of concurrent connections that the Reducer uses to fetch data from the Mappers.

tasktracker.http.threads
The number of connections that the tasktracker uses to provide intermediate data to the Reducers.



mapred.tasktracker.map.tasks.maximum
mapred.tasktracker.reduce.tasks.maximum
These parameters control the maximum number of Mapping and Reducer tasks that run on a data node. A good general rule is to use one-half to two times the number of cores on a node.

mapred.max.split.size
mapred.min.split.size
These two parameters, along with dfs.block.size, determine the data-chunk size that is fed into the MapReduce process.

mapred.map.tasks.speculative.execution
mapred.reduce.tasks.speculative.execution
These parameters cause the jobtracker to run a copy of the same task on another node. Once one of the copies has finished writing its output, the other, incomplete copy is ended.

mapred.job.reuse.jvm.num.tasks
This parameter tells Hadoop to reuse the JVM already created for a task, instead of destroying and recreating the JVM. When set to -1, this setting can give up to a five percent performance increase in MapReduce jobs.

mapred.compress.map.output
mapred.map.output.compression.codec
mapred.output.compress
mapred.output.compression.type
These parameters tell Hadoop what compression codec and type to use, as well as in what phase of the MapReduce process to use it. Compression helps with faster disk writes, saves HDFS partition space and decreases the IO transfer time between the Mapper and the Reducers. These benefits come, however, at the cost of processor cycles to encode and decode the compression stream.
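The sizing rules in this glossary can be turned into a configuration fragment with a few lines of script. The following is a minimal sketch, not part of the original paper: the helper names `slot_range` and `to_configuration_xml`, and the 12-core, 12-drive node, are assumptions made for illustration. Picking the upper bound for map slots and the lower bound for reduce slots is likewise an arbitrary choice; the glossary only gives the range.

```python
import xml.etree.ElementTree as ET


def slot_range(cores):
    """Return (low, high) task slots: one-half to two times the core count."""
    return max(1, cores // 2), cores * 2


def to_configuration_xml(settings):
    """Render a dict of settings as a Hadoop-style <configuration> document."""
    root = ET.Element("configuration")
    for name, value in settings.items():
        prop = ET.SubElement(root, "property")
        ET.SubElement(prop, "name").text = name
        ET.SubElement(prop, "value").text = str(value)
    return ET.tostring(root, encoding="unicode")


# Hypothetical data node used only for this example.
cores, drives = 12, 12
low, high = slot_range(cores)

hdfs_site = {
    "dfs.datanode.handler.count": drives,  # one thread per hard drive
    "dfs.datanode.max.xcievers": 2048,     # value that helped on the test cluster
}
mapred_site = {
    "mapred.tasktracker.map.tasks.maximum": high,    # arbitrary pick within the range
    "mapred.tasktracker.reduce.tasks.maximum": low,  # arbitrary pick within the range
    "mapred.job.reuse.jvm.num.tasks": -1,
}

print(to_configuration_xml(hdfs_site))
print(to_configuration_xml(mapred_site))
```

The printed fragments would still need to be merged into the cluster's own hdfs-site.xml and mapred-site.xml and verified with a benchmark such as TestDFSIO before being relied on.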

MapReduce parameters that affect timing

The following parameters can help reduce the latencies inherent in the execution process of MapReduce. These parameters can be added or found in the mapred-site.xml configuration file.

mapreduce.tasktracker.outofband.heartbeat
Setting this parameter to true, instead of using the default of false, allows the tasktracker to send an out-of-band heartbeat when a task is completed, to reduce latency.

jobclient.progress.monitor.poll.interval
By default this parameter is set to 1000 milliseconds. It controls how often the status of the Hadoop job is reported while the job is running, so setting it lower on small clusters can decrease the time lost waiting to verify job completion.



mapreduce.jobtracker.heartbeat.interval.min
This is the interval at which the services check on each other while a job is running. On small clusters, changing this parameter from the default of 10 to a smaller value can increase performance.

mapred.reduce.slowstart.completed.maps
This parameter controls how much of the Mapping phase must complete before the Reducer phase starts. Set to 0, it tells the Reducer phase to start immediately when the Mapping phase starts, instead of waiting for Mapping process output files to be created. Having the Reducer process spun up and ready to go can increase performance of the whole MapReduce process. Set this parameter to 0 on small jobs; larger jobs must use a higher value.
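To see why a shorter poll interval trims wall-clock time on short jobs, consider a simplified model in which the client checks job status at fixed multiples of the poll interval, counted from job start. This is an assumption made for illustration, not a description of Hadoop's actual client logic, and the function name `detection_delay_ms` is hypothetical.

```python
def detection_delay_ms(job_finish_ms, poll_interval_ms):
    """Time between job completion and the first poll that can observe it.

    Assumes polls happen at exact multiples of the interval from job start.
    """
    # Ceiling division with integers: number of polls up to and past completion.
    polls_needed = -(-job_finish_ms // poll_interval_ms)
    return polls_needed * poll_interval_ms - job_finish_ms


# A hypothetical job that finishes 12,340 ms after it starts.
for interval in (1000, 100):
    delay = detection_delay_ms(12_340, interval)
    print(f"poll every {interval} ms: completion noticed {delay} ms late")
```

Under this model the delay is always smaller than one poll interval, so the saving matters most when jobs are short and numerous, which matches the small-cluster guidance above.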



Appendix D — Optimal Hadoop parameters for test cluster

Table 5. Optimal Hadoop parameters for test cluster

Parameter                                  Value        Notes

core-site.xml
io.file.buffer.size                        131072       Size of read/write buffer used in sequence files.

hdfs-site.xml
dfs.blocksize                              404750336    HDFS blocksize 384MB for large file systems.
dfs.namenode.handler.count                 100          NameNode server threads to handle RPCs.

mapred-site.xml
mapreduce.output.compress                  true         Compresses map output.
mapreduce.output.compression.type          record       The file compression type.
mapreduce.output.compression.codec         lzo          Sets the compression codec to be used.
mapreduce.map.java.opts                    -Xmx1536M    Larger memory heap size for child JVMs during the Map process.
mapreduce.task.io.sort.mb                  450          Higher memory limit while sorting data for map tasks.
mapreduce.task.io.sort.factor              100          More streams merged at once while sorting files.
mapreduce.job.shuffle.merge.percent        0.66         Memory reserved for storing map outputs.
mapred.job.reduce.input.buffer.percent     0.65         Percentage of JVM memory used to merge map output files.
mapred.job.reuse.jvm.num.tasks             -1           Reuses the Java JVM over and over.
mapred.reduce.slowstart.completed.maps     0            Starts the reducer before the mapper is finished.
mapreduce.reduce.shuffle.parallelcopies    50           Higher number of parallel copies run by reduces to fetch outputs from a very large number of maps.
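Because dfs.blocksize is specified in bytes, it is worth converting the Table 5 value back to megabytes before copying it. The short check below is a sketch added for illustration (the helper names are hypothetical). It shows that 404750336 bytes corresponds to 386 MiB, whereas 384 MiB would be 402653184 bytes; the source does not say which of the two was intended, so confirm the figure against your own cluster before using it.

```python
MIB = 1024 * 1024  # bytes per mebibyte


def bytes_to_mib(n_bytes):
    """Convert a byte count to mebibytes."""
    return n_bytes / MIB


def mib_to_bytes(mib):
    """Convert mebibytes to a byte count suitable for dfs.blocksize."""
    return mib * MIB


table_value = 404750336           # dfs.blocksize as listed in Table 5
print(bytes_to_mib(table_value))  # 386.0
print(mib_to_bytes(384))          # 402653184
```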



Appendix E — Further information

For information on the Dell Crowbar tool for installation of the Hadoop cluster, see Dell Cloudera Apache Hadoop install with Crowbar.

For information on the HiBench test suite for Hadoop clusters, see GitHub.com/HiBench.

For more information on using HiBench, see the Dell white paper, Dell Apache Hadoop Performance Analysis.

See the Dell TechCenter blog, Getting the Right Mix for Hadoop.
