2014-07-06

I have a simple MapReduce job that uses the default (identity) mapper and reducer. The input is a few text files. I am running Hadoop 2.x in pseudo-distributed mode, and mapred.reduce.tasks is not behaving as expected.

Even though I set mapred.reduce.tasks=2, as far as I can tell only one reducer is ever invoked.

package org.priya.sort;

import org.apache.hadoop.conf.Configured;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
import org.apache.hadoop.mapreduce.lib.input.TextInputFormat;
import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;
import org.apache.hadoop.mapreduce.lib.output.TextOutputFormat;
import org.apache.hadoop.util.Tool;
import org.apache.hadoop.util.ToolRunner;

public class TestingReduce extends Configured implements Tool {

    @Override
    public int run(String[] arg0) throws Exception {
        System.out.println("###########I am in TestingReduce###########");
        Job job = Job.getInstance(getConf());
        job.setInputFormatClass(TextInputFormat.class);
        job.setOutputFormatClass(TextOutputFormat.class);
        job.setJarByClass(TestingReduce.class);
        System.out.println("#########The number of reducers :: " + job.getNumReduceTasks());
        FileInputFormat.addInputPath(job, new Path("/input"));
        FileOutputFormat.setOutputPath(job, new Path("/totalOrderOutput"));
        return job.waitForCompletion(true) ? 0 : 1;
    }

    public static void main(String[] args) throws Exception {
        int i = ToolRunner.run(new TestingReduce(), args);
        System.out.println("Retun value is " + i);
    }
}

When I run this job, only one reducer is still created even though I have set the number of reducers to 2:

###########I am in TestingReduce########### 
OpenJDK 64-Bit Server VM warning: You have loaded library /home/priya/workspace/hadoop-2.2.0/lib/native/libhadoop.so.1.0.0 which might have disabled stack guard. The VM will try to fix the stack guard now. 
It's highly recommended that you fix the library with 'execstack -c <libfile>', or link it with '-z noexecstack'. 
14/07/06 15:24:47 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable 
#########The number of reducers :: 2 
14/07/06 15:24:48 INFO Configuration.deprecation: session.id is deprecated. Instead, use dfs.metrics.session-id 
14/07/06 15:24:48 INFO jvm.JvmMetrics: Initializing JVM Metrics with processName=JobTracker, sessionId= 
14/07/06 15:24:49 INFO input.FileInputFormat: Total input paths to process : 3 
14/07/06 15:24:50 INFO mapreduce.JobSubmitter: number of splits:3 
14/07/06 15:24:50 INFO Configuration.deprecation: user.name is deprecated. Instead, use mapreduce.job.user.name 
14/07/06 15:24:50 INFO Configuration.deprecation: mapred.jar is deprecated. Instead, use mapreduce.job.jar 
14/07/06 15:24:50 INFO Configuration.deprecation: mapred.reduce.tasks is deprecated. Instead, use mapreduce.job.reduces 
14/07/06 15:24:50 INFO Configuration.deprecation: mapred.job.name is deprecated. Instead, use mapreduce.job.name 
14/07/06 15:24:50 INFO Configuration.deprecation: mapreduce.inputformat.class is deprecated. Instead, use mapreduce.job.inputformat.class 
14/07/06 15:24:50 INFO Configuration.deprecation: mapred.input.dir is deprecated. Instead, use mapreduce.input.fileinputformat.inputdir 
14/07/06 15:24:50 INFO Configuration.deprecation: mapred.output.dir is deprecated. Instead, use mapreduce.output.fileoutputformat.outputdir 
14/07/06 15:24:50 INFO Configuration.deprecation: mapreduce.outputformat.class is deprecated. Instead, use mapreduce.job.outputformat.class 
14/07/06 15:24:50 INFO Configuration.deprecation: mapred.map.tasks is deprecated. Instead, use mapreduce.job.maps 
14/07/06 15:24:50 INFO Configuration.deprecation: mapred.working.dir is deprecated. Instead, use mapreduce.job.working.dir 
14/07/06 15:24:51 INFO mapreduce.JobSubmitter: Submitting tokens for job: job_local1851811203_0001 
14/07/06 15:24:51 WARN conf.Configuration: file:/home/priya/hdfs-tmp/mapred/staging/priya1851811203/.staging/job_local1851811203_0001/job.xml:an attempt to override final parameter: mapreduce.job.end-notification.max.retry.interval; Ignoring. 
14/07/06 15:24:51 WARN conf.Configuration: file:/home/priya/hdfs-tmp/mapred/staging/priya1851811203/.staging/job_local1851811203_0001/job.xml:an attempt to override final parameter: mapreduce.job.end-notification.max.attempts; Ignoring. 
14/07/06 15:24:52 WARN conf.Configuration: file:/home/priya/hdfs-tmp/mapred/local/localRunner/priya/job_local1851811203_0001/job_local1851811203_0001.xml:an attempt to override final parameter: mapreduce.job.end-notification.max.retry.interval; Ignoring. 
14/07/06 15:24:52 WARN conf.Configuration: file:/home/priya/hdfs-tmp/mapred/local/localRunner/priya/job_local1851811203_0001/job_local1851811203_0001.xml:an attempt to override final parameter: mapreduce.job.end-notification.max.attempts; Ignoring. 
14/07/06 15:24:52 INFO mapreduce.Job: The url to track the job: http://localhost:8080/ 
14/07/06 15:24:52 INFO mapreduce.Job: Running job: job_local1851811203_0001 
14/07/06 15:24:52 INFO mapred.LocalJobRunner: OutputCommitter set in config null 
14/07/06 15:24:52 INFO mapred.LocalJobRunner: OutputCommitter is org.apache.hadoop.mapreduce.lib.output.FileOutputCommitter 
14/07/06 15:24:53 INFO mapred.LocalJobRunner: Waiting for map tasks 
14/07/06 15:24:53 INFO mapred.LocalJobRunner: Starting task: attempt_local1851811203_0001_m_000000_0 
14/07/06 15:24:53 INFO mapreduce.Job: Job job_local1851811203_0001 running in uber mode : false 
14/07/06 15:24:53 INFO mapreduce.Job: map 0% reduce 0% 
14/07/06 15:24:53 INFO mapred.Task: Using ResourceCalculatorProcessTree : [ ] 
14/07/06 15:24:53 INFO mapred.MapTask: Processing split: hdfs://localhost/input/2.txt:0+15 
14/07/06 15:24:53 INFO mapred.MapTask: Map output collector class = org.apache.hadoop.mapred.MapTask$MapOutputBuffer 
14/07/06 15:24:53 INFO mapred.MapTask: (EQUATOR) 0 kvi 26214396(104857584) 
14/07/06 15:24:53 INFO mapred.MapTask: mapreduce.task.io.sort.mb: 100 
14/07/06 15:24:53 INFO mapred.MapTask: soft limit at 83886080 
14/07/06 15:24:53 INFO mapred.MapTask: bufstart = 0; bufvoid = 104857600 
14/07/06 15:24:53 INFO mapred.MapTask: kvstart = 26214396; length = 6553600 
14/07/06 15:24:54 INFO mapred.LocalJobRunner: 
14/07/06 15:24:54 INFO mapred.MapTask: Starting flush of map output 
14/07/06 15:24:54 INFO mapred.MapTask: Spilling map output 
14/07/06 15:24:54 INFO mapred.MapTask: bufstart = 0; bufend = 79; bufvoid = 104857600 
14/07/06 15:24:54 INFO mapred.MapTask: kvstart = 26214396(104857584); kvend = 26214368(104857472); length = 29/6553600 
14/07/06 15:24:54 INFO mapred.MapTask: Finished spill 0 
14/07/06 15:24:54 INFO mapred.Task: Task:attempt_local1851811203_0001_m_000000_0 is done. And is in the process of committing 
14/07/06 15:24:54 INFO mapred.LocalJobRunner: map 
14/07/06 15:24:54 INFO mapred.Task: Task 'attempt_local1851811203_0001_m_000000_0' done. 
14/07/06 15:24:54 INFO mapred.LocalJobRunner: Finishing task: attempt_local1851811203_0001_m_000000_0 
14/07/06 15:24:54 INFO mapred.LocalJobRunner: Starting task: attempt_local1851811203_0001_m_000001_0 
14/07/06 15:24:54 INFO mapred.Task: Using ResourceCalculatorProcessTree : [ ] 
14/07/06 15:24:54 INFO mapred.MapTask: Processing split: hdfs://localhost/input/1.txt:0+10 
14/07/06 15:24:54 INFO mapred.MapTask: Map output collector class = org.apache.hadoop.mapred.MapTask$MapOutputBuffer 
14/07/06 15:24:54 INFO mapred.MapTask: (EQUATOR) 0 kvi 26214396(104857584) 
14/07/06 15:24:54 INFO mapred.MapTask: mapreduce.task.io.sort.mb: 100 
14/07/06 15:24:54 INFO mapred.MapTask: soft limit at 83886080 
14/07/06 15:24:54 INFO mapred.MapTask: bufstart = 0; bufvoid = 104857600 
14/07/06 15:24:54 INFO mapred.MapTask: kvstart = 26214396; length = 6553600 
14/07/06 15:24:54 INFO mapreduce.Job: map 100% reduce 0% 
14/07/06 15:24:54 INFO mapred.LocalJobRunner: 
14/07/06 15:24:54 INFO mapred.MapTask: Starting flush of map output 
14/07/06 15:24:54 INFO mapred.MapTask: Spilling map output 
14/07/06 15:24:54 INFO mapred.MapTask: bufstart = 0; bufend = 50; bufvoid = 104857600 
14/07/06 15:24:54 INFO mapred.MapTask: kvstart = 26214396(104857584); kvend = 26214380(104857520); length = 17/6553600 
14/07/06 15:24:54 INFO mapred.MapTask: Finished spill 0 
14/07/06 15:24:54 INFO mapred.Task: Task:attempt_local1851811203_0001_m_000001_0 is done. And is in the process of committing 
14/07/06 15:24:54 INFO mapred.LocalJobRunner: map 
14/07/06 15:24:54 INFO mapred.Task: Task 'attempt_local1851811203_0001_m_000001_0' done. 
14/07/06 15:24:54 INFO mapred.LocalJobRunner: Finishing task: attempt_local1851811203_0001_m_000001_0 
14/07/06 15:24:54 INFO mapred.LocalJobRunner: Starting task: attempt_local1851811203_0001_m_000002_0 
14/07/06 15:24:54 INFO mapred.Task: Using ResourceCalculatorProcessTree : [ ] 
14/07/06 15:24:54 INFO mapred.MapTask: Processing split: hdfs://localhost/input/3.txt:0+10 
14/07/06 15:24:54 INFO mapred.MapTask: Map output collector class = org.apache.hadoop.mapred.MapTask$MapOutputBuffer 
14/07/06 15:24:54 INFO mapred.MapTask: (EQUATOR) 0 kvi 26214396(104857584) 
14/07/06 15:24:54 INFO mapred.MapTask: mapreduce.task.io.sort.mb: 100 
14/07/06 15:24:54 INFO mapred.MapTask: soft limit at 83886080 
14/07/06 15:24:54 INFO mapred.MapTask: bufstart = 0; bufvoid = 104857600 
14/07/06 15:24:54 INFO mapred.MapTask: kvstart = 26214396; length = 6553600 
14/07/06 15:24:55 INFO mapred.LocalJobRunner: 
14/07/06 15:24:55 INFO mapred.MapTask: Starting flush of map output 
14/07/06 15:24:55 INFO mapred.MapTask: Spilling map output 
14/07/06 15:24:55 INFO mapred.MapTask: bufstart = 0; bufend = 50; bufvoid = 104857600 
14/07/06 15:24:55 INFO mapred.MapTask: kvstart = 26214396(104857584); kvend = 26214380(104857520); length = 17/6553600 
14/07/06 15:24:55 INFO mapred.MapTask: Finished spill 0 
14/07/06 15:24:55 INFO mapred.Task: Task:attempt_local1851811203_0001_m_000002_0 is done. And is in the process of committing 
14/07/06 15:24:55 INFO mapred.LocalJobRunner: map 
14/07/06 15:24:55 INFO mapred.Task: Task 'attempt_local1851811203_0001_m_000002_0' done. 
14/07/06 15:24:55 INFO mapred.LocalJobRunner: Finishing task: attempt_local1851811203_0001_m_000002_0 
14/07/06 15:24:55 INFO mapred.LocalJobRunner: Map task executor complete. 
14/07/06 15:24:55 INFO mapred.Task: Using ResourceCalculatorProcessTree : [ ] 
14/07/06 15:24:55 INFO mapred.Merger: Merging 3 sorted segments 
14/07/06 15:24:55 INFO mapred.Merger: Down to the last merge-pass, with 3 segments left of total size: 191 bytes 
14/07/06 15:24:55 INFO mapred.LocalJobRunner: 
14/07/06 15:24:55 INFO Configuration.deprecation: mapred.skip.on is deprecated. Instead, use mapreduce.job.skiprecords 
14/07/06 15:24:55 INFO mapred.Task: Task:attempt_local1851811203_0001_r_000000_0 is done. And is in the process of committing 
14/07/06 15:24:55 INFO mapred.LocalJobRunner: 
14/07/06 15:24:55 INFO mapred.Task: Task attempt_local1851811203_0001_r_000000_0 is allowed to commit now 
14/07/06 15:24:55 INFO output.FileOutputCommitter: Saved output of task 'attempt_local1851811203_0001_r_000000_0' to hdfs://localhost/totalOrderOutput/_temporary/0/task_local1851811203_0001_r_000000 
14/07/06 15:24:55 INFO mapred.LocalJobRunner: reduce > reduce 
14/07/06 15:24:55 INFO mapred.Task: Task 'attempt_local1851811203_0001_r_000000_0' done. 
14/07/06 15:24:56 INFO mapreduce.Job: map 100% reduce 100% 
14/07/06 15:24:56 INFO mapreduce.Job: Job job_local1851811203_0001 completed successfully 
14/07/06 15:24:56 INFO mapreduce.Job: Counters: 32 
    File System Counters 
     FILE: Number of bytes read=21871 
     FILE: Number of bytes written=768178 
     FILE: Number of read operations=0 
     FILE: Number of large read operations=0 
     FILE: Number of write operations=0 
     HDFS: Number of bytes read=110 
     HDFS: Number of bytes written=74 
     HDFS: Number of read operations=37 
     HDFS: Number of large read operations=0 
     HDFS: Number of write operations=6 
    Map-Reduce Framework 
     Map input records=18 
     Map output records=18 
     Map output bytes=179 
     Map output materialized bytes=233 
     Input split bytes=279 
     Combine input records=0 
     Combine output records=0 
     Reduce input groups=8 
     Reduce shuffle bytes=0 
     Reduce input records=18 
     Reduce output records=18 
     Spilled Records=36 
     Shuffled Maps =0 
     Failed Shuffles=0 
     Merged Map outputs=0 
     GC time elapsed (ms)=54 
     CPU time spent (ms)=0 
     Physical memory (bytes) snapshot=0 
     Virtual memory (bytes) snapshot=0 
     Total committed heap usage (bytes)=1372061696 
    File Input Format Counters 
     Bytes Read=35 
    File Output Format Counters 
     Bytes Written=74 
Retun value is 0 

The command I am using to run it is hadoop jar TestingReducer.jar -D mapred.reduce.tasks=2.
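Since TestingReduce is launched through ToolRunner, the -D generic option is parsed into the job Configuration before run() is called, so the flag placement above is valid. For reference, the non-deprecated equivalent of the same invocation would be (assuming the jar's manifest points at org.priya.sort.TestingReduce):

```
# Same command using the non-deprecated property name;
# ToolRunner/GenericOptionsParser folds -D key=value pairs into the Configuration.
hadoop jar TestingReducer.jar -D mapreduce.job.reduces=2
```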

Answer


There is a reason: you are running in local mode.

If you look at the source code of the LocalJobRunner, you will see:

int numReduceTasks = job.getNumReduceTasks(); 
if (numReduceTasks > 1 || numReduceTasks < 0) { 
    // we only allow 0 or 1 reducer in local mode 
    numReduceTasks = 1; 
    job.setNumReduceTasks(1); 
} 

To switch to pseudo-distributed mode, you need to configure

mapreduce.framework.name = yarn 

At the moment it is set to local, which is why the LocalJobRunner is used.
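In a typical pseudo-distributed setup this property is set in mapred-site.xml, a minimal sketch of which might look like the following (the file path, e.g. etc/hadoop/mapred-site.xml, depends on your installation):

```xml
<!-- mapred-site.xml: run MapReduce jobs on YARN instead of the LocalJobRunner -->
<configuration>
  <property>
    <name>mapreduce.framework.name</name>
    <value>yarn</value>
  </property>
</configuration>
```

With this in place (and the YARN daemons running), the job is submitted to YARN, where multiple reduce tasks are honored.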


Thank you, Thomas! Yes, I was running the program in local mode for debugging purposes. I have now moved it to pseudo-distributed mode. – priyaranjan
