2017-03-28

Spark "Task not serializable" with the Hadoop-MongoDB Connector Enron example

I am trying to run the EnronMail example of the Hadoop-MongoDB Connector for Spark. For this I am using the Java example code from GitHub: https://github.com/mongodb/mongo-hadoop/blob/master/examples/enron/spark/src/main/java/com/mongodb/spark/examples/enron/Enron.java I changed the server name as needed and added a username and password. I then get the following error message:

Exception in thread "main" org.apache.spark.SparkException: Task not serializable 
    at org.apache.spark.util.ClosureCleaner$.ensureSerializable(ClosureCleaner.scala:304) 
    at org.apache.spark.util.ClosureCleaner$.org$apache$spark$util$ClosureCleaner$$clean(ClosureCleaner.scala:294) 
    at org.apache.spark.util.ClosureCleaner$.clean(ClosureCleaner.scala:122) 
    at org.apache.spark.SparkContext.clean(SparkContext.scala:2066) 
    at org.apache.spark.rdd.RDD$$anonfun$flatMap$1.apply(RDD.scala:333) 
    at org.apache.spark.rdd.RDD$$anonfun$flatMap$1.apply(RDD.scala:332) 
    at org.apache.spark.rdd.RDDOperationScope$.withScope(RDDOperationScope.scala:150) 
    at org.apache.spark.rdd.RDDOperationScope$.withScope(RDDOperationScope.scala:111) 
    at org.apache.spark.rdd.RDD.withScope(RDD.scala:316) 
    at org.apache.spark.rdd.RDD.flatMap(RDD.scala:332) 
    at org.apache.spark.api.java.JavaRDDLike$class.flatMap(JavaRDDLike.scala:130) 
    at org.apache.spark.api.java.AbstractJavaRDDLike.flatMap(JavaRDDLike.scala:46) 
    at Enron.run(Enron.java:43) 
    at Enron.main(Enron.java:104) 
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) 
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) 
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) 
    at java.lang.reflect.Method.invoke(Method.java:498) 
    at org.apache.spark.deploy.SparkSubmit$.org$apache$spark$deploy$SparkSubmit$$runMain(SparkSubmit.scala:731) 
    at org.apache.spark.deploy.SparkSubmit$.doRunMain$1(SparkSubmit.scala:181) 
    at org.apache.spark.deploy.SparkSubmit$.submit(SparkSubmit.scala:206) 
    at org.apache.spark.deploy.SparkSubmit$.main(SparkSubmit.scala:121) 
    at org.apache.spark.deploy.SparkSubmit.main(SparkSubmit.scala) 
Caused by: java.io.NotSerializableException: Enron 
Serialization stack: 
    - object not serializable (class: Enron, value: Enron@<hash>) 
    - field (class: Enron$1, name: this$0, type: class Enron) 
    - object (class Enron$1, Enron$1@<hash>) 
    - field (class: org.apache.spark.api.java.JavaRDDLike$$anonfun$fn$1$1, name: f$3, type: interface org.apache.spark.api.java.function.FlatMapFunction) 
    - object (class org.apache.spark.api.java.JavaRDDLike$$anonfun$fn$1$1, <function1>) 
    at org.apache.spark.serializer.SerializationDebugger$.improveException(SerializationDebugger.scala:40) 
    at org.apache.spark.serializer.JavaSerializationStream.writeObject(JavaSerializer.scala:47) 
    at org.apache.spark.serializer.JavaSerializerInstance.serialize(JavaSerializer.scala:101) 
    at org.apache.spark.util.ClosureCleaner$.ensureSerializable(ClosureCleaner.scala:301) 
    ... 22 more 

I then created a new class for the FlatMapFunction and had the Enron class extend it, but this did not solve the problem. Any ideas on how to fix this?

Answer

The problem was solved by including mongo-hadoop-spark-2.0.2.jar in the call, together with the following pom dependencies:

    <dependencies>
        <dependency>
            <groupId>junit</groupId>
            <artifactId>junit</artifactId>
            <version>3.8.1</version>
            <scope>test</scope>
        </dependency>

        <dependency>
            <groupId>org.apache.spark</groupId>
            <artifactId>spark-sql_2.11</artifactId>
            <version>1.5.1</version>
        </dependency>
        <dependency>
            <groupId>org.apache.spark</groupId>
            <artifactId>spark-core_2.11</artifactId>
            <version>1.5.1</version>
        </dependency>
        <dependency>
            <groupId>log4j</groupId>
            <artifactId>log4j</artifactId>
            <version>1.2.14</version>
        </dependency>

        <!-- https://mvnrepository.com/artifact/org.mongodb.mongo-hadoop/mongo-hadoop-core -->
        <dependency>
            <groupId>org.mongodb.mongo-hadoop</groupId>
            <artifactId>mongo-hadoop-core</artifactId>
            <version>1.4.1</version>
        </dependency>
        <!-- https://mvnrepository.com/artifact/org.mongodb/bson -->
        <dependency>
            <groupId>org.mongodb</groupId>
            <artifactId>bson</artifactId>
            <version>3.4.2</version>
        </dependency>
    </dependencies>
</project>
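
As a side note on the trace itself: the serialization stack in the question shows an anonymous class (Enron$1) holding a this$0 field that points at the enclosing Enron object, which is not Serializable. The sketch below reproduces that mechanism with plain Java serialization and no Spark on the classpath. All names in it (ClosureCaptureSketch, Driver, LineFunction, PrefixFunction) are hypothetical stand-ins, not part of the Enron example or of any library, and the static nested class is only one common way to avoid the capture, not necessarily what fixed the original setup.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.NotSerializableException;
import java.io.ObjectOutputStream;
import java.io.Serializable;

public class ClosureCaptureSketch {

    // Hypothetical stand-in for a Spark function interface; like
    // FlatMapFunction, it extends Serializable.
    interface LineFunction extends Serializable {
        String call(String line);
    }

    // Plays the role of the Enron driver class: NOT Serializable.
    static class Driver {
        private final String prefix;

        Driver(String prefix) {
            this.prefix = prefix;
        }

        // Anonymous inner class that reads an outer field. The compiler
        // stores a hidden this$0 reference to the Driver inside it.
        LineFunction anonymous() {
            return new LineFunction() {
                @Override
                public String call(String line) {
                    return prefix + line;
                }
            };
        }

        // Static nested class: it carries only the values handed to it.
        LineFunction standalone() {
            return new PrefixFunction(prefix);
        }
    }

    static class PrefixFunction implements LineFunction {
        private final String prefix;

        PrefixFunction(String prefix) {
            this.prefix = prefix;
        }

        @Override
        public String call(String line) {
            return prefix + line;
        }
    }

    // Returns true if Java serialization accepts the object, false if it
    // fails with NotSerializableException.
    static boolean isSerializable(Object o) {
        try (ObjectOutputStream out =
                 new ObjectOutputStream(new ByteArrayOutputStream())) {
            out.writeObject(o);
            return true;
        } catch (NotSerializableException e) {
            return false;
        } catch (IOException e) {
            throw new RuntimeException(e);
        }
    }

    public static void main(String[] args) {
        Driver driver = new Driver("mail:");
        // The anonymous function drags the Driver along and is rejected.
        System.out.println("anonymous:  " + isSerializable(driver.anonymous()));
        // The static nested function serializes on its own.
        System.out.println("standalone: " + isSerializable(driver.standalone()));
    }
}
```

In this sketch the anonymous function is rejected because serializing it also requires serializing the enclosing Driver, while the static nested PrefixFunction serializes without it. Making the enclosing class implement Serializable is another option, assuming all of its fields are serializable too.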