더 많은 실습 경험을 얻으려면 프로젝트 단어 수를 시험해보고 싶었습니다.
다음은 내가 가지고있는 샘플 데이터입니다. python으로 mapreduce 용 프로그램을 시도하고 도움이 필요합니다.
유엔 (UN)
국제 협력을 촉진 10 월 1945 년 24 일에 설립 된 정부 간기구 이다. A 비효율적 인 국제 연맹을 대신하여, 조직은이 제 2 차 세계 대전 이후 또 다른 그러한 갈등을 막기 위해 만들어졌습니다.
[...]
이와 나는 또한 2[[email protected] Desktop]# python RatingsBreakdown.py UN.txt
Traceback (most recent call last):
File "RatingsBreakdown.py", line 1, in <module>
from mrjob.job import MRJob
File "/usr/lib/python2.6/site-packages/mrjob/job.py", line 1106
for k, v in unfiltered_jobconf.items() if v is not None
^
SyntaxError: invalid syntax
파이썬으로 다음과 같은 오류가 점점 오전 내 결과
from mrjob.job import MRJob
from mrjob.step import MRStep
class MovieRatings(MRJob):
def steps(self):
return [
MRStep(mapper=self.mapper_get_ratings,
reducer=self.reducer_count_ratings),
]
def mapper_get_ratings(self, _, line):
(word) = line.split(' ')
yield word, 1
def reducer_count_ratings(self, key, values):
yield Key, sum(values)
if __name__ == '__main__':
MovieRatings.run()
를 얻기 위해 다음 파이썬 코드를 사용 파이썬 3
[[email protected] Desktop]# python3 RatingsBreakdown.py UN.txt
No configs found; falling back on auto-configuration
No configs specified for inline runner
Running step 1 of 2...
Creating temp directory /tmp/RatingsBreakdown.training.20171128.083536.602598
Error while reading from /tmp/RatingsBreakdown.training.20171128.083536.602598/step/000/mapper/00000/input:
Traceback (most recent call last):
File "RatingsBreakdown.py", line 25, in <module>
RatingsBreakdown.run()
File "/usr/lib/python3.4/site-packages/mrjob/job.py", line 424, in run
mr_job.execute()
File "/usr/lib/python3.4/site-packages/mrjob/job.py", line 445, in execute
super(MRJob, self).execute()
File "/usr/lib/python3.4/site-packages/mrjob/launch.py", line 185, in execute
self.run_job()
File "/usr/lib/python3.4/site-packages/mrjob/launch.py", line 233, in run_job
runner.run()
File "/usr/lib/python3.4/site-packages/mrjob/runner.py", line 511, in run
self._run()
File "/usr/lib/python3.4/site-packages/mrjob/sim.py", line 144, in _run
self._run_mappers_and_combiners(step_num, map_splits)
File "/usr/lib/python3.4/site-packages/mrjob/sim.py", line 185, in _run_mappers_and_combiners
for task_num, map_split in enumerate(map_splits)
File "/usr/lib/python3.4/site-packages/mrjob/sim.py", line 120, in _run_multiple
func()
File "/usr/lib/python3.4/site-packages/mrjob/sim.py", line 662, in _run_mapper_and_combiner
run_mapper()
File "/usr/lib/python3.4/site-packages/mrjob/sim.py", line 685, in _run_task
stdin, stdout, stderr, wd, env)
File "/usr/lib/python3.4/site-packages/mrjob/inline.py", line 92, in invoke_task
task.execute()
File "/usr/lib/python3.4/site-packages/mrjob/job.py", line 433, in execute
self.run_mapper(self.options.step_num)
File "/usr/lib/python3.4/site-packages/mrjob/job.py", line 517, in run_mapper
for out_key, out_value in mapper(key, value) or():
File "RatingsBreakdown.py", line 13, in mapper_get_ratings
(userID, movieID, rating, timestamp) = line.split('\t')
ValueError: need more than 1 value to unpack
내가 오류를 해결하고 뉴욕의 실수가 무엇인지 이해하고 싶은 내 MovieRatings
[[email protected] Desktop]# python3 MovieRatings.py UN.txt
No configs found; falling back on auto-configuration
No configs specified for inline runner
Running step 1 of 1...
Creating temp directory /tmp/MovieRatings.training.20171128.083635.368889
Error while reading from /tmp/MovieRatings.training.20171128.083635.368889/step/000/reducer/00000/input:
Traceback (most recent call last):
File "MovieRatings.py", line 20, in <module>
MovieRatings.run()
File "/usr/lib/python3.4/site-packages/mrjob/job.py", line 424, in run
mr_job.execute()
File "/usr/lib/python3.4/site-packages/mrjob/job.py", line 445, in execute
super(MRJob, self).execute()
File "/usr/lib/python3.4/site-packages/mrjob/launch.py", line 185, in execute
self.run_job()
File "/usr/lib/python3.4/site-packages/mrjob/launch.py", line 233, in run_job
runner.run()
File "/usr/lib/python3.4/site-packages/mrjob/runner.py", line 511, in run
self._run()
File "/usr/lib/python3.4/site-packages/mrjob/sim.py", line 150, in _run
self._run_reducers(step_num, num_reducer_tasks)
File "/usr/lib/python3.4/site-packages/mrjob/sim.py", line 246, in _run_reducers
for task_num in range(num_reducer_tasks)
File "/usr/lib/python3.4/site-packages/mrjob/sim.py", line 120, in _run_multiple
func()
File "/usr/lib/python3.4/site-packages/mrjob/sim.py", line 685, in _run_task
stdin, stdout, stderr, wd, env)
File "/usr/lib/python3.4/site-packages/mrjob/inline.py", line 92, in invoke_task
task.execute()
File "/usr/lib/python3.4/site-packages/mrjob/job.py", line 439, in execute
self.run_reducer(self.options.step_num)
File "/usr/lib/python3.4/site-packages/mrjob/job.py", line 560, in run_reducer
for out_key, out_value in reducer(key, values) or():
File "MovieRatings.py", line 17, in reducer_count_ratings
yield Key, sum(values)
NameError: name 'Key' is not defined
와 또한
. 이 라이브러리는 Python3에
File "RatingsBreakdown.py", line 13, in mapper_get_ratings
(userID, movieID, rating, timestamp) = line.split('\t')
ValueError: need more than 1 value to unpack
먼저 작동처럼
'Key' 대신'key'입니까? – Mel
'steps()'에'[]'이 (가) 들여 쓰기 문제가 있습니다. 여기에 몇 가지 문제가있을 수 있습니다. – cdarke