Class WordCount$TokenizerMapper not found in Hadoop example program

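I am working through the Hadoop example on this page and I get a class-not-found error. Eclipse reports no syntax errors, and when I highlight the reference in job.setMapperClass(TokenizerMapper.class), it highlights the TokenizerMapper class. Is this because it is a nested class, or am I overlooking something here? I am running the command hadoop jar word.jar input output (I am running it on the main Hadoop node, and I have already figured out that the directory (args) parameters are HDFS paths relative to /user/[myuser], so that is not the problem).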

Comment: You are missing the jar name in your command line.

Any help would be appreciated. Here is the code:

    import java.io.IOException;
    import java.util.StringTokenizer;

    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.fs.Path;
    import org.apache.hadoop.io.IntWritable;
    import org.apache.hadoop.io.Text;
    import org.apache.hadoop.mapreduce.Job;
    import org.apache.hadoop.mapreduce.Mapper;
    import org.apache.hadoop.mapreduce.Reducer;
    import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
    import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;

    public class WordCount {

        Configuration configuration = null;

        public static class TokenizerMapper
                extends Mapper<Object, Text, Text, IntWritable> {

            private final static IntWritable one = new IntWritable(1);
            private Text word = new Text();

            public void map(Object key, Text value, Context context)
                    throws IOException, InterruptedException {
                StringTokenizer itr = new StringTokenizer(value.toString());
                while (itr.hasMoreTokens()) {
                    word.set(itr.nextToken());
                    context.write(word, one);
                }
            }
        }

        public static class IntSumReducer
                extends Reducer<Text, IntWritable, Text, IntWritable> {

            private IntWritable result = new IntWritable();

            public void reduce(Text key, Iterable<IntWritable> values, Context context)
                    throws IOException, InterruptedException {
                int sum = 0;
                for (IntWritable val : values) {
                    sum += val.get();
                }
                result.set(sum);
                context.write(key, result);
            }
        }

        public static void main(String[] args) {
            Configuration conf = new Configuration();
            Job job;
            conf.addResource(new Path("/work/hadoop/config", "core-site.xml"));
            conf.addResource(new Path("/work/hadoop/config", "hdfs-site.xml"));

            try {
                job = Job.getInstance(conf, "word count");

                job.setJarByClass(WordCount.class);
                job.setMapperClass(TokenizerMapper.class);
                job.setCombinerClass(IntSumReducer.class);
                job.setReducerClass(IntSumReducer.class);

                job.setOutputKeyClass(Text.class);
                job.setOutputValueClass(IntWritable.class);

                FileInputFormat.addInputPath(job, new Path(args[0]));
                FileOutputFormat.setOutputPath(job, new Path(args[1]));

                System.exit(job.waitForCompletion(true) ? 0 : 1);

            } catch (IOException e) {
                // TODO Auto-generated catch block
                e.printStackTrace();
            } catch (ClassNotFoundException e) {
                // TODO Auto-generated catch block
                e.printStackTrace();
            } catch (InterruptedException e) {
                // TODO Auto-generated catch block
                e.printStackTrace();
            }
        }
    }
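
On the nested-class part of the question: the `$` in the error message is simply how Java names a nested class at runtime, so `WordCount$TokenizerMapper` is the expected binary name of `WordCount.TokenizerMapper` and not a sign that nesting is wrong. A minimal sketch with plain Java and no Hadoop types (`NestedNameDemo` and its helper methods are hypothetical stand-ins, not part of the example above):

```java
final class NestedNameDemo {

    // Stand-in for the question's WordCount.TokenizerMapper (hypothetical, no Hadoop types).
    static class TokenizerMapper {}

    // The binary name is what class loaders and error messages use.
    static String binaryName(Class<?> c) {
        return c.getName();
    }

    // True when the binary name can be resolved back to the same class.
    static boolean loadsBack(Class<?> c) {
        try {
            return Class.forName(c.getName()) == c;
        } catch (ClassNotFoundException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        // Prints the outer class name, a '$', then the nested class name.
        System.out.println(binaryName(TokenizerMapper.class));
        System.out.println(loadsBack(TokenizerMapper.class));
    }
}
```

If the name resolves locally but not on the cluster, the nested class itself is usually fine and the question becomes whether the JAR containing it reaches the job.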

Source

2014-11-26 Bennett

This comes from the documentation. Besides wc.jar, you also need to pass WordCount, which is the name of your driver class:

bin/hadoop jar wc.jar WordCount /user/joe/wordcount/input /user/joe/wordcount/output

Source

2014-11-26 16:19:55 SMA

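That makes arg[0] WordCount, arg[1] the input, and arg[2] the output. Does the code need to change for the I/O directories? I get the error "Output directory [stuff]/input already exists." – Bennett

No. See [this](http://hadoop.apache.org/docs/current/hadoop-project-dist/hadoop-common/CommandsManual.html#jar). The "directory already exists" error means you may have run the job several times with the same arguments. Change the output directory on every run, or delete the output of the previous run. – SMA

I get the error because it is looking at the input where the output is expected (arg[1]), so it contradicts what the documentation says. – Bennett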
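
The argument shift discussed in these comments can be sketched as follows. This is a simplified model of how, as far as I understand it, `hadoop jar` picks the class to launch: if the JAR manifest names a Main-Class, everything after the JAR name is passed to `main()`; otherwise the first argument is taken as the class name. `HadoopJarArgsSketch`, `Launch`, and `resolve` are hypothetical names for illustration, not Hadoop APIs.

```java
import java.util.Arrays;

final class HadoopJarArgsSketch {

    /** Which class gets launched and which arguments its main() receives. */
    static final class Launch {
        final String mainClass;
        final String[] programArgs;

        Launch(String mainClass, String[] programArgs) {
            this.mainClass = mainClass;
            this.programArgs = programArgs;
        }

        @Override
        public String toString() {
            return mainClass + ".main(" + Arrays.toString(programArgs) + ")";
        }
    }

    /**
     * manifestMainClass: the Main-Class from the JAR manifest, or null if none is set.
     * argsAfterJar: everything typed after "hadoop jar word.jar".
     */
    static Launch resolve(String manifestMainClass, String... argsAfterJar) {
        if (manifestMainClass != null) {
            // Manifest wins: all arguments go to the program unchanged.
            return new Launch(manifestMainClass, argsAfterJar.clone());
        }
        if (argsAfterJar.length == 0) {
            throw new IllegalArgumentException("no main class given");
        }
        // No manifest entry: the first argument is consumed as the class name.
        return new Launch(argsAfterJar[0],
                Arrays.copyOfRange(argsAfterJar, 1, argsAfterJar.length));
    }

    public static void main(String[] args) {
        // No Main-Class, class name omitted: "input" is treated as the class to run.
        System.out.println(resolve(null, "input", "output"));
        // No Main-Class, class name given: main() sees [input, output].
        System.out.println(resolve(null, "WordCount", "input", "output"));
        // Main-Class set AND class name given: args[1] is "input", so the driver
        // would use the input directory as its output path.
        System.out.println(resolve("WordCount", "WordCount", "input", "output"));
    }
}
```

Under this model, the third case would explain an "Output directory [stuff]/input already exists" error: with a Main-Class in the manifest, the extra `WordCount` argument pushes `input` into `args[1]`, which the driver passes to `FileOutputFormat.setOutputPath`.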

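If the JAR's application entry point is WordCount.java, you can run this command without any problem: hadoop jar word.jar input output. Currently your JAR has no entry point, so you get an error unless you specify the fully qualified name of the class after the JAR file.
With Eclipse, you can set the class as the main class/entry point when exporting the JAR, as follows: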

Project=> Right click => Export => JAR File => Next => Specify Jar Path => Next 
=> JAR Manifest specification. 
select the class of the application entry point. 
Main class. Browse and select the main class(WordCount in your case). 
Finish.

Hope this helps.
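
To check whether an exported JAR actually carries an entry point, the Main-Class attribute of its manifest can be read with the standard `java.util.jar` API. The sketch below is only an illustration: `ManifestCheck` and its methods are hypothetical names, and the demo writes a throwaway manifest-only JAR instead of reading the real word.jar.

```java
import java.io.IOException;
import java.io.OutputStream;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.jar.Attributes;
import java.util.jar.JarFile;
import java.util.jar.JarOutputStream;
import java.util.jar.Manifest;

final class ManifestCheck {

    /** Returns the Main-Class attribute of the JAR, or null when it is absent. */
    static String mainClassOf(Path jar) throws IOException {
        try (JarFile jf = new JarFile(jar.toFile())) {
            Manifest m = jf.getManifest();
            return m == null ? null
                    : m.getMainAttributes().getValue(Attributes.Name.MAIN_CLASS);
        }
    }

    /** Writes a manifest-only JAR for the demo; mainClass may be null. */
    static Path writeDemoJar(String mainClass) throws IOException {
        Manifest m = new Manifest();
        // Manifest-Version must be present or the main attributes are not written.
        m.getMainAttributes().put(Attributes.Name.MANIFEST_VERSION, "1.0");
        if (mainClass != null) {
            m.getMainAttributes().put(Attributes.Name.MAIN_CLASS, mainClass);
        }
        Path jar = Files.createTempFile("demo", ".jar");
        try (OutputStream out = Files.newOutputStream(jar);
             JarOutputStream jos = new JarOutputStream(out, m)) {
            // No class entries needed: only the manifest matters here.
        }
        return jar;
    }

    /** Writes a demo JAR, reads its Main-Class back, and cleans up. */
    static String roundTrip(String mainClass) {
        try {
            Path jar = writeDemoJar(mainClass);
            try {
                return mainClassOf(jar);
            } finally {
                Files.deleteIfExists(jar);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(roundTrip("WordCount")); // JAR exported with an entry point
        System.out.println(roundTrip(null));        // JAR exported without one
    }
}
```

Calling `mainClassOf` on the real JAR path shows which situation applies: a null result means the class name has to be given on the command line after the JAR name.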

Source

2014-11-27 16:17:21

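Exporting the JAR file to a directory that is defined on the classpath solves the problem. When exporting the JAR file from Eclipse, you may get a "Resource is out of sync with the file system" error. In that case, right-click the project in Eclipse and select Refresh. This resolves the out-of-sync problem.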

Source

2015-06-24 05:00:54 user5043049
