Full utilization of all cores in Hadoop pseudo-distributed mode

I am running a task in pseudo-distributed mode on my 4-core laptop. How can I ensure that all cores are effectively used? Currently my job tracker shows that only one job is executing at a time. Does that mean only one core is used?

The following are my configuration files.

conf/core-site.xml:

<configuration>
   <property>
       <name>fs.default.name</name>
       <value>hdfs://localhost:9000</value>
   </property>
</configuration>

conf/hdfs-site.xml:

<configuration>
  <property>
       <name>dfs.replication</name>
       <value>1</value>
  </property>
</configuration>

conf/mapred-site.xml:

<configuration>
   <property>
        <name>mapred.job.tracker</name>
        <value>localhost:9001</value>
   </property>
</configuration>

EDIT: As per the answer, I need to add the following properties to mapred-site.xml:

   <property>
        <name>mapred.map.tasks</name>
        <value>4</value>
   </property>
   <property>
        <name>mapred.reduce.tasks</name>
        <value>4</value>
   </property>
Ligature answered 2/12, 2011 at 13:47 Comment(3)
mapred.map.tasks and mapred.reduce.tasks don't control the number of map/reduce tasks per node. Please try it and make sure before selecting an answer. - Advance
@Praveen That's correct, but since he has only one node he also needs to at least hint to it that it should use more mappers, not just raise the max per tracker. - Bakeman
@Ligature In addition to adding the mapred.(map | reduce).tasks values of 4 in mapred-site.xml, did you change the values of mapreduce.tasktracker.(map | reduce).tasks.maximum in order to fully utilize all cores? - Antimalarial

mapred.map.tasks and mapred.reduce.tasks will control this, and (I believe) would be set in mapred-site.xml. However, setting them there establishes them as cluster-wide defaults; more usually you would configure them on a per-job basis. You can set the same parameters on the java command line with -D.

Bakeman answered 2/12, 2011 at 13:53 Comment(6)
How many map and reduce tasks would be optimal for a 4 core system? - Ligature
4 would probably be a good start -- though you may quickly be I/O bound on one machine rather than CPU-bound. - Bakeman
I think that is the new problem. #8358130 - Ligature
Any hints on how to go about this? - Ligature
Please see my response in a separate answer. - Advance
PS I think Praveen's answer is also necessary. The default max is 2, and so if you want to run more than 2 on your 1 node, the max has to be raised as well. - Bakeman

The mapreduce.tasktracker.map.tasks.maximum and mapreduce.tasktracker.reduce.tasks.maximum properties control the number of map and reduce tasks per node. For a 4-core processor, start with 2/2 and change the values from there if required. Each slot is either a map slot or a reduce slot, so setting the values to 4/4 will make the Hadoop framework launch up to 4 map and 4 reduce tasks simultaneously, for a total of up to 8 map and reduce tasks running at a time on a node.

The mapred.map.tasks and mapred.reduce.tasks properties control the total number of map/reduce tasks for the job, not the number of tasks per node. Also, mapred.map.tasks is only a hint to the Hadoop framework; the total number of map tasks for the job equals the number of InputSplits.
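
As a rough sketch, the per-node limits could be added to the asker's conf/mapred-site.xml along these lines. The 4/4 values are only an illustrative assumption for a 4-core laptop, not a tuned recommendation. The property names are the ones used in this answer; as far as I know, 0.20.x/1.x releases spell them mapred.tasktracker.map.tasks.maximum and mapred.tasktracker.reduce.tasks.maximum instead, so check which form your version expects.

```xml
<configuration>
   <property>
        <name>mapred.job.tracker</name>
        <value>localhost:9001</value>
   </property>
   <!-- Assumed value: allow up to 4 map tasks to run concurrently on this node -->
   <property>
        <name>mapreduce.tasktracker.map.tasks.maximum</name>
        <value>4</value>
   </property>
   <!-- Assumed value: allow up to 4 reduce tasks to run concurrently on this node -->
   <property>
        <name>mapreduce.tasktracker.reduce.tasks.maximum</name>
        <value>4</value>
   </property>
</configuration>
```

These limits are read by the TaskTracker daemon, so it most likely has to be restarted before a change takes effect.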

Advance answered 2/12, 2011 at 16:27 Comment(0)