2016-08-11 4 views
1

I have successfully installed Spark 1.6 and Anaconda2. When I try to use IPython, I run into the following problem. What does this error message related to IPython on Spark mean?

Py4JJavaError: An error occurred while calling z:org.apache.spark.api.python.PythonRDD.runJob. 

: org.apache.spark.SparkException: Job aborted due to stage failure: Task 0 in stage 0.0 failed 1 times, most recent failure: Lost task 0.0 in stage 0.0 (TID 0, localhost): 
java.io.IOException: Cannot run program "/root/anaconda2/bin": error=13, Permission denied 
at java.lang.ProcessBuilder.start(ProcessBuilder.java:1047) 
at org.apache.spark.api.python.PythonWorkerFactory.startDaemon(PythonWorkerFactory.scala:161) 
at org.apache.spark.api.python.PythonWorkerFactory.createThroughDaemon(PythonWorkerFactory.scala:87) 
at org.apache.spark.api.python.PythonWorkerFactory.create(PythonWorkerFactory.scala:63) 
at org.apache.spark.SparkEnv.createPythonWorker(SparkEnv.scala:134) 
at org.apache.spark.api.python.PythonRunner.compute(PythonRDD.scala:101) 
at org.apache.spark.api.python.PythonRDD.compute(PythonRDD.scala:70) 
at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:306) 
at org.apache.spark.rdd.RDD.iterator(RDD.scala:270) 
at org.apache.spark.scheduler.ResultTask.runTask(ResultTask.scala:66) 
at org.apache.spark.scheduler.Task.run(Task.scala:89) 
at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:214) 
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145) 
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615) 
at java.lang.Thread.run(Thread.java:745) 

Caused by: java.io.IOException: error=13, Permission denied 
at java.lang.UNIXProcess.forkAndExec(Native Method) 
at java.lang.UNIXProcess.<init>(UNIXProcess.java:186) 
at java.lang.ProcessImpl.start(ProcessImpl.java:130) 
at java.lang.ProcessBuilder.start(ProcessBuilder.java:1028) 
... 14 more 

Driver stacktrace: 
at org.apache.spark.scheduler.DAGScheduler.org$apache$spark$scheduler$DAGScheduler$$failJobAndIndependentStages(DAGScheduler.scala:1431) 
at org.apache.spark.scheduler.DAGScheduler$$anonfun$abortStage$1.apply(DAGScheduler.scala:1419) 
at org.apache.spark.scheduler.DAGScheduler$$anonfun$abortStage$1.apply(DAGScheduler.scala:1418) 
at scala.collection.mutable.ResizableArray$class.foreach(ResizableArray.scala:59) 
at scala.collection.mutable.ArrayBuffer.foreach(ArrayBuffer.scala:47) 
at org.apache.spark.scheduler.DAGScheduler.abortStage(DAGScheduler.scala:1418) 
at org.apache.spark.scheduler.DAGScheduler$$anonfun$handleTaskSetFailed$1.apply(DAGScheduler.scala:799) 
at org.apache.spark.scheduler.DAGScheduler$$anonfun$handleTaskSetFailed$1.apply(DAGScheduler.scala:799) 
at scala.Option.foreach(Option.scala:236) 
at org.apache.spark.scheduler.DAGScheduler.handleTaskSetFailed(DAGScheduler.scala:799) 
at org.apache.spark.scheduler.DAGSchedulerEventProcessLoop.doOnReceive(DAGScheduler.scala:1640) 
at org.apache.spark.scheduler.DAGSchedulerEventProcessLoop.onReceive(DAGScheduler.scala:1599) 
at org.apache.spark.scheduler.DAGSchedulerEventProcessLoop.onReceive(DAGScheduler.scala:1588) 
at org.apache.spark.util.EventLoop$$anon$1.run(EventLoop.scala:48) 
at org.apache.spark.scheduler.DAGScheduler.runJob(DAGScheduler.scala:620) 
at org.apache.spark.SparkContext.runJob(SparkContext.scala:1832) 
at org.apache.spark.SparkContext.runJob(SparkContext.scala:1845) 
at org.apache.spark.SparkContext.runJob(SparkContext.scala:1858) 
at org.apache.spark.api.python.PythonRDD$.runJob(PythonRDD.scala:393) 
at org.apache.spark.api.python.PythonRDD.runJob(PythonRDD.scala) 
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) 
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57) 
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) 
at java.lang.reflect.Method.invoke(Method.java:606) 
at py4j.reflection.MethodInvoker.invoke(MethodInvoker.java:231) 
at py4j.reflection.ReflectionEngine.invoke(ReflectionEngine.java:381) 
at py4j.Gateway.invoke(Gateway.java:259) 
at py4j.commands.AbstractCommand.invokeMethod(AbstractCommand.java:133) 
at py4j.commands.CallCommand.execute(CallCommand.java:79) 
at py4j.GatewayConnection.run(GatewayConnection.java:209) 
at java.lang.Thread.run(Thread.java:745) 

Caused by: java.io.IOException: Cannot run program "/root/anaconda2/bin": error=13, Permission denied 
at java.lang.ProcessBuilder.start(ProcessBuilder.java:1047) 
at org.apache.spark.api.python.PythonWorkerFactory.startDaemon(PythonWorkerFactory.scala:161) 
at org.apache.spark.api.python.PythonWorkerFactory.createThroughDaemon(PythonWorkerFactory.scala:87) 
at org.apache.spark.api.python.PythonWorkerFactory.create(PythonWorkerFactory.scala:63) 
at org.apache.spark.SparkEnv.createPythonWorker(SparkEnv.scala:134) 
at org.apache.spark.api.python.PythonRunner.compute(PythonRDD.scala:101) 
at org.apache.spark.api.python.PythonRDD.compute(PythonRDD.scala:70) 
at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:306) 
at org.apache.spark.rdd.RDD.iterator(RDD.scala:270) 
at org.apache.spark.scheduler.ResultTask.runTask(ResultTask.scala:66) 
at org.apache.spark.scheduler.Task.run(Task.scala:89) 
at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:214) 
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145) 
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615) 
... 1 more 
Caused by: java.io.IOException: error=13, Permission denied 
at java.lang.UNIXProcess.forkAndExec(Native Method) 
at java.lang.UNIXProcess.<init>(UNIXProcess.java:186) 
at java.lang.ProcessImpl.start(ProcessImpl.java:130) 
at java.lang.ProcessBuilder.start(ProcessBuilder.java:1028) 
... 14 more 

Here is the IPython code; the error occurred when I ran the last line:

from pyspark.mllib.regression import LabeledPoint, LinearRegressionWithSGD, LinearRegressionModel 

# Load and parse the data 
def parsePoint(line): 
    values = [float(x) for x in line.replace(',', ' ').split(' ')] 
    return LabeledPoint(values[0], values[1:]) 

data = sc.textFile("data/mllib/ridge-data/lpsa.data") 
parsedData = data.map(parsePoint) 

# Build the model 
model = LinearRegressionWithSGD.train(parsedData, iterations=100, step=0.00000001) 
+0

Please provide more information. Which system are you using (Windows, Linux, or OSX)? What is the Anaconda installation directory? If it is Linux, you probably used sudo while installing Anaconda (sudo bash Anaconda.xx.sh), since that requires root permissions. – ashwinids

+0

The system I am using is Linux (CentOS). I installed Anaconda on the root user's desktop, as the root user, and I am running these commands as the root user as well. – Chauncey

+0

How did you run the code above: a notebook, the pyspark shell, or spark-submit? What is your PYSPARK_PYTHON? – ShuaiYuan

Answers

0

It is recommended to first remove the Anaconda distribution:

sudo rm -r anaconda_installation_path 

and then install it again without sudo:

sh Anaconda.xx.sh 

See page for details.
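
Note that the stack trace shows Spark trying to execute the directory "/root/anaconda2/bin" itself as the Python worker, so after reinstalling it is also worth pointing the driver at the full interpreter path via PYSPARK_PYTHON (mentioned in the comments above). A minimal sketch, assuming the fresh install lands at ~/anaconda2 (a placeholder path) and that you create the SparkContext yourself rather than using the pyspark shell:

import os 
from pyspark import SparkConf, SparkContext 

# PYSPARK_PYTHON must name the python binary itself -- the stack trace 
# above shows Spark trying to execute the bin/ directory instead. 
os.environ["PYSPARK_PYTHON"] = os.path.expanduser("~/anaconda2/bin/python") 

sc = SparkContext(conf=SparkConf().setAppName("lpsa-regression")) 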

0

Reinstall Anaconda, this time into the /opt/anaconda directory instead of under /root.
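
Whichever prefix you choose, a quick smoke test (a sketch; it assumes sc was created against the new interpreter, as in the snippet above) is to run any action that forces Spark to start a Python worker:

# Any map + collect makes Spark fork a Python worker process; if 
# PYSPARK_PYTHON still points at a directory or an unreadable path, 
# this reproduces the error=13 from the stack trace above. 
print(sc.parallelize([1, 2, 3]).map(lambda x: x * 2).collect()) 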
