Tuning Hadoop on Dell PowerEdge Servers - Dell Community

This Dell Technical White Paper explains how to tune BIOS, OS, and Hadoop settings to increase performance in Hadoop workloads.

Donald Russell
Solutions Performance Analysis

Contents

Executive summary
  Introduction
  Key findings
    Compression
    Block size
    Number of hard drives
    File system
    OS settings
    Java settings
    Hadoop settings
Methodology
  Hadoop cluster configuration
  BIOS
    Power management
    ASPM
    Raid controller or SATA power management
    Summary
  OS and file system
    Open file descriptors and files
    File system
    Network
    Huge pages
    Linux kernel swappiness parameter
    Linux IO scheduler
    Summary
  Hadoop settings