2016-06-21 1 views

Connecting PySpark to a Cassandra database from the PyCharm IDE

To connect to a Cassandra database in PyCharm, I wrote the following:

from pyspark import SparkContext, SparkConf
from pyspark.sql import SQLContext
import os

# Raw string so the backslashes in the Windows path are not treated as escapes
os.environ['SPARK_HOME'] = r"C:\Users\MyEnv\Documents\spark-1.6.1-bin-hadoop2.4"

conf = SparkConf()
conf.setAppName("Spark Cassandra")
conf.set("spark.cassandra.connection.host", "xxx.xxx.xxx.xxx").set("spark.cassandra.connection.port", "9000")
sc = SparkContext(conf=conf)
sql = SQLContext(sc)
print("it means that ")
dataFrame = sql.read.format("org.apache.spark.sql.cassandra").options(table="table_name", keyspace="MyDb").load()
dataFrame.printSchema()

The print statement executes, but the line

sql.read.format("org.apache.spark.sql.cassandra").options(table="table_name", keyspace="MyDb").load()

raises the following error:

Traceback (most recent call last): 
File "C:/Users/MyEnv/PycharmProjects/Big_Spark/Cassandra_connector2.py", line 16, in <module> 
dataFrame = sql.read.format("org.apache.spark.sql.cassandra").options(table="tmf_pm1", keyspace="framework20").load() 
File "C:\Users\MyEnv\Documents\spark-1.6.1-bin-hadoop2.4\python\pyspark\sql\readwriter.py", line 139, in load 
return self._df(self._jreader.load()) 
File "C:\Users\MyEnv\AppData\Local\Continuum\Anaconda\lib\site-packages\py4j\java_gateway.py", line 1026, in __call__ 
answer, self.gateway_client, self.target_id, self.name) 
File "C:\Users\MyEnv\Documents\spark-1.6.1-bin-hadoop2.4\python\pyspark\sql\utils.py", line 45, in deco 
return f(*a, **kw) 
File "C:\Users\MyEnv\AppData\Local\Continuum\Anaconda\lib\site-packages\py4j\protocol.py", line 316, in get_return_value 
format(target_id, ".", name), value) 
py4j.protocol.Py4JJavaError: An error occurred while calling o26.load. 
: java.lang.ClassNotFoundException: Failed to find data source: org.apache.spark.sql.cassandra. Please find packages at http://spark-packages.org 
at org.apache.spark.sql.execution.datasources.ResolvedDataSource$.lookupDataSource(ResolvedDataSource.scala:77) 
at org.apache.spark.sql.execution.datasources.ResolvedDataSource$.apply(ResolvedDataSource.scala:102) 
at org.apache.spark.sql.DataFrameReader.load(DataFrameReader.scala:119) 
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) 
at sun.reflect.NativeMethodAccessorImpl.invoke(Unknown Source) 
at sun.reflect.DelegatingMethodAccessorImpl.invoke(Unknown Source) 
at java.lang.reflect.Method.invoke(Unknown Source) 
at py4j.reflection.MethodInvoker.invoke(MethodInvoker.java:231) 
at py4j.reflection.ReflectionEngine.invoke(ReflectionEngine.java:381) 
at py4j.Gateway.invoke(Gateway.java:259) 
at py4j.commands.AbstractCommand.invokeMethod(AbstractCommand.java:133) 
at py4j.commands.CallCommand.execute(CallCommand.java:79) 
at py4j.GatewayConnection.run(GatewayConnection.java:209) 
at java.lang.Thread.run(Unknown Source) 
Caused by: java.lang.ClassNotFoundException: org.apache.spark.sql.cassandra.DefaultSource 
at java.net.URLClassLoader$1.run(Unknown Source) 
at java.net.URLClassLoader$1.run(Unknown Source) 
at java.security.AccessController.doPrivileged(Native Method) 
at java.net.URLClassLoader.findClass(Unknown Source) 
at java.lang.ClassLoader.loadClass(Unknown Source) 
at java.lang.ClassLoader.loadClass(Unknown Source) 
at org.apache.spark.sql.execution.datasources.ResolvedDataSource$$anonfun$4$$anonfun$apply$1.apply(ResolvedDataSource.scala:62) 
at org.apache.spark.sql.execution.datasources.ResolvedDataSource$$anonfun$4$$anonfun$apply$1.apply(ResolvedDataSource.scala:62) 
at scala.util.Try$.apply(Try.scala:161) 
at org.apache.spark.sql.execution.datasources.ResolvedDataSource$$anonfun$4.apply(ResolvedDataSource.scala:62) 
at org.apache.spark.sql.execution.datasources.ResolvedDataSource$$anonfun$4.apply(ResolvedDataSource.scala:62) 
at scala.util.Try.orElse(Try.scala:82) 
at org.apache.spark.sql.execution.datasources.ResolvedDataSource$.lookupDataSource(ResolvedDataSource.scala:62) 
... 13 more 

16/06/21 13:31:43 INFO SparkContext: Invoking stop() from shutdown hook 

What could be the problem?

Answer


Add:

spark.jars.packages com.datastax.spark:spark-cassandra-connector_2.10:1.6.0 

to:

SPARK_HOME\conf\spark-defaults.conf 
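If you would rather not edit spark-defaults.conf, an alternative (a sketch, assuming the standard PySpark behavior of reading PYSPARK_SUBMIT_ARGS) is to request the connector package from the script itself, before the SparkContext is created:

```python
import os

# The connector version must match your Scala (2.10) and Spark (1.6.x) builds.
packages = "com.datastax.spark:spark-cassandra-connector_2.10:1.6.0"

# PYSPARK_SUBMIT_ARGS must be set BEFORE SparkContext is constructed;
# the trailing "pyspark-shell" token is required by PySpark.
os.environ["PYSPARK_SUBMIT_ARGS"] = "--packages {} pyspark-shell".format(packages)
```

Either way, the point is the same: the org.apache.spark.sql.cassandra data source lives in the spark-cassandra-connector JAR, and the ClassNotFoundException means that JAR was never put on the driver/executor classpath.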