Thursday, August 14, 2014

Improve Hadoop Job Performance - Best Practices

Rules and Parameters to Improve Hadoop Job Performance:

Default values are from Hortonworks Hadoop 2.0; please check your configuration files for other distributions and versions.

io.sort.mb: Buffer size for sorting
Default Size: 100 MB
Description: This is the total memory available for sorting per node.
Suggestion 1: This should be around 10 times the value of io.sort.factor. For large jobs (jobs in which map output is very large), its value
should be increased (considering the total memory available to each node) so that less spilling to disk happens.
Suggestion 2: 2x the block size.
To avoid a "task out of memory" error, set io.sort.mb to a value greater than 0.25 * mapred.child.java.opts and less than 0.5 * mapred.child.java.opts (i.e., between a quarter and a half of the map task heap).
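As a quick sketch of the rule above, the valid io.sort.mb window can be computed from the map task heap size. The 512 MB heap here (from a hypothetical mapred.child.java.opts of -Xmx512m) is an illustrative assumption, not a recommendation:

```python
# Sketch: sanity-check an io.sort.mb setting against the map task heap.
# The 512 MB heap below is an assumed example value (-Xmx512m in
# mapred.child.java.opts); substitute your cluster's actual setting.

def valid_sort_mb_range(child_heap_mb):
    """Return the (min, max) io.sort.mb bounds suggested above:
    greater than 0.25x and less than 0.5x of the task heap."""
    return 0.25 * child_heap_mb, 0.5 * child_heap_mb

low, high = valid_sort_mb_range(512)   # assuming -Xmx512m
print(low, high)                       # 128.0 256.0
```

With a 512 MB heap, io.sort.mb should land between 128 MB and 256 MB, which is why the 100 MB default is often too small for large jobs.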

dfs.blocksize: File system block size
Default Size: 128 MB
Description: If input data size = 2000 GB and dfs.blocksize = 256 MB, then the minimum no. of maps = (2000 * 1024) / 256 = 8,000 maps.
Suggestions: If you have fewer resources and a large data set, dfs.blocksize should be larger, because of the considerable overhead of map task creation.
For a large number of small files: if the files are smaller than the block size, aggregate them.
Advantage: This can potentially make it possible for a client to read/write more data without interacting with the NameNode, and it also
reduces the metadata size of the NameNode, reducing NameNode load (this can be an important consideration for extremely large file
systems).
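The map-count arithmetic above can be reproduced with a small helper (a sketch; it assumes one map task per HDFS block, which holds for the default splittable input formats):

```python
# Sketch: minimum number of map tasks for a given input size and block
# size, reproducing the 2000 GB / 256 MB example above. Assumes one
# map task per HDFS block (true for default splittable input formats).

def min_map_tasks(input_gb, block_mb):
    return (input_gb * 1024) // block_mb

print(min_map_tasks(2000, 256))  # 8000
print(min_map_tasks(2000, 128))  # 16000 with the 128 MB default
```

Doubling the block size halves the number of map tasks, which is exactly the overhead-reduction argument made above.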

io.sort.factor: Stream merge factor
Default Size: 10
Description: Increasing its value helps the reducer, because the rear-most collection of streams (up to io.sort.factor of them) is sent directly to the reduce function without merging, saving merge time.
For a job that has a large number of maps and whose map output is also large, its value should be increased so that it remains about
one-tenth of the value of io.sort.mb.
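As a sketch, the two sort parameters might be raised together in mapred-site.xml while preserving the one-tenth relationship (the values below are illustrative for a job with large map output, not a tuned recommendation):

```xml
<!-- mapred-site.xml: illustrative values only, keep io.sort.factor
     at roughly one-tenth of io.sort.mb -->
<property>
  <name>io.sort.mb</name>
  <value>200</value>   <!-- raised from the 100 MB default -->
</property>
<property>
  <name>io.sort.factor</name>
  <value>20</value>    <!-- raised from the default of 10 -->
</property>
```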

mapred.compress.map.output: Map Output Compression
Default Value: False
Description: For large jobs, and if the data is splittable, its value should be set to true, because disk I/O is usually the performance
bottleneck, and compressed data also transfers between nodes faster.
Recommended codec: LZO
If you are emitting sequence files for your output, then you can set the mapred.output.compression.type property to control the type of
compression to use. The default is RECORD, which compresses individual records. Changing this to BLOCK, which compresses groups of records, is recommended since it compresses better.
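A sketch of these compression settings in XML form (the LZO codec class shown requires the separately installed hadoop-lzo libraries; it is not bundled with Hadoop):

```xml
<!-- illustrative per-job or mapred-site.xml settings -->
<property>
  <name>mapred.compress.map.output</name>
  <value>true</value>
</property>
<property>
  <!-- requires the hadoop-lzo libraries to be installed -->
  <name>mapred.map.output.compression.codec</name>
  <value>com.hadoop.compression.lzo.LzoCodec</value>
</property>
<property>
  <!-- for SequenceFile output: compress groups of records -->
  <name>mapred.output.compression.type</name>
  <value>BLOCK</value>
</property>
```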

mapred.map.tasks.speculative.execution: Map/Reduce task speculative execution
Default value: True
Description: On a busy cluster speculative execution can reduce overall throughput, since redundant tasks are being executed in an
attempt to bring down the execution time for a single job.
Suggestions: For large jobs where average task completion time is significant (> 1 hr) due to complex, large calculations and high
throughput is required, speculative execution should be set to false.
For a high-priority or time-sensitive job, it should be TRUE.
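For the long-running, heavy-job case above, both map- and reduce-side speculation can be switched off in configuration (a sketch using the classic property names):

```xml
<!-- disable speculative execution for a long-running, heavy job -->
<property>
  <name>mapred.map.tasks.speculative.execution</name>
  <value>false</value>
</property>
<property>
  <name>mapred.reduce.tasks.speculative.execution</name>
  <value>false</value>
</property>
```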

mapreduce.reduce.shuffle.parallelcopies
Default value: 5
Description: Number of threads to copy map outputs to the Reducer.
Suggestion: Its value should be the square root of the number of nodes in your cluster.
Suggestion: For large jobs (jobs in which map output is very large), the value of this property can be increased, keeping in mind that it will increase total CPU usage.
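The square-root rule of thumb can be sketched as below; the floor at the default of 5 is my assumption, so that the suggestion never lowers the shipped default on small clusters:

```python
# Sketch: square-root rule of thumb for
# mapreduce.reduce.shuffle.parallelcopies. Flooring at the default
# of 5 is an assumption, so small clusters keep the shipped value.
import math

def suggested_parallel_copies(num_nodes, minimum=5):
    return max(minimum, round(math.sqrt(num_nodes)))

print(suggested_parallel_copies(100))  # 10
print(suggested_parallel_copies(9))    # 5 (default floor)
```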

Other Performance Parameters:
CombineFileInputFormat class in the org.apache.hadoop.mapreduce.lib.input package: encourages aggregating a large number of small files so that the number of mappers can be reduced (recommended especially when there are a large number of small files that will be processed more than once).

Use combiners whenever possible: a combiner is a mini-reducer on the map side. It reduces the amount of data sent to the reducers, and it does not have to be the same class as the reducer.

Encourage users to configure reducers in such a way that each of them generates at least 1 GB of output. This will increase performance, as the number of reducers will be smaller.

Number of task slots per node:
Mappers : Reducers = 4 : 3
Map slots/node = 0.75 * total no. of cores
Reduce slots/node = 0.50 * total no. of cores
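The slot formulas above can be sketched directly; the 16-core node is an assumed example:

```python
# Sketch: task slots per node from the 0.75 / 0.50 ratios above.
# The 16-core node below is an assumed example.

def slots_per_node(cores):
    map_slots = int(0.75 * cores)
    reduce_slots = int(0.50 * cores)
    return map_slots, reduce_slots

print(slots_per_node(16))  # (12, 8)
```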

Feel free to contact me on:
priyanks@asu.edu

