2017-11-28 1 views
-3

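To get more hands-on experience, I wanted to try a word-count project.
Below is my sample data. I am trying to write a MapReduce program in Python and need help.

United Nations (UN)

An intergovernmental organization established on 24 October 1945 to promote international cooperation. A replacement for the ineffective League of Nations, the organization was created after the Second World War to prevent another such conflict.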

[...]

With Python 2, I get the following error:

[[email protected] Desktop]# python RatingsBreakdown.py UN.txt 
Traceback (most recent call last): 
    File "RatingsBreakdown.py", line 1, in <module> 
    from mrjob.job import MRJob 
    File "/usr/lib/python2.6/site-packages/mrjob/job.py", line 1106 
    for k, v in unfiltered_jobconf.items() if v is not None 
    ^
SyntaxError: invalid syntax 

I am using the following Python code to get my result:

from mrjob.job import MRJob
from mrjob.step import MRStep


class MovieRatings(MRJob):
    def steps(self):
        return [
            MRStep(mapper=self.mapper_get_ratings,
                   reducer=self.reducer_count_ratings),
    ]

    def mapper_get_ratings(self, _, line):
        (word) = line.split(' ')
        yield word, 1

    def reducer_count_ratings(self, key, values):
        yield Key, sum(values)

if __name__ == '__main__':
    MovieRatings.run()

With Python 3, I also get the following error:

[[email protected] Desktop]# python3 RatingsBreakdown.py UN.txt 
No configs found; falling back on auto-configuration 
No configs specified for inline runner 
Running step 1 of 2... 
Creating temp directory /tmp/RatingsBreakdown.training.20171128.083536.602598 
Error while reading from /tmp/RatingsBreakdown.training.20171128.083536.602598/step/000/mapper/00000/input: 
Traceback (most recent call last): 
    File "RatingsBreakdown.py", line 25, in <module> 
    RatingsBreakdown.run() 
    File "/usr/lib/python3.4/site-packages/mrjob/job.py", line 424, in run 
    mr_job.execute() 
    File "/usr/lib/python3.4/site-packages/mrjob/job.py", line 445, in execute 
    super(MRJob, self).execute() 
    File "/usr/lib/python3.4/site-packages/mrjob/launch.py", line 185, in execute 
    self.run_job() 
    File "/usr/lib/python3.4/site-packages/mrjob/launch.py", line 233, in run_job 
    runner.run() 
    File "/usr/lib/python3.4/site-packages/mrjob/runner.py", line 511, in run 
    self._run() 
    File "/usr/lib/python3.4/site-packages/mrjob/sim.py", line 144, in _run 
    self._run_mappers_and_combiners(step_num, map_splits) 
    File "/usr/lib/python3.4/site-packages/mrjob/sim.py", line 185, in _run_mappers_and_combiners 
    for task_num, map_split in enumerate(map_splits) 
    File "/usr/lib/python3.4/site-packages/mrjob/sim.py", line 120, in _run_multiple 
    func() 
    File "/usr/lib/python3.4/site-packages/mrjob/sim.py", line 662, in _run_mapper_and_combiner 
    run_mapper() 
    File "/usr/lib/python3.4/site-packages/mrjob/sim.py", line 685, in _run_task 
    stdin, stdout, stderr, wd, env) 
    File "/usr/lib/python3.4/site-packages/mrjob/inline.py", line 92, in invoke_task 
    task.execute() 
    File "/usr/lib/python3.4/site-packages/mrjob/job.py", line 433, in execute 
    self.run_mapper(self.options.step_num) 
    File "/usr/lib/python3.4/site-packages/mrjob/job.py", line 517, in run_mapper 
    for out_key, out_value in mapper(key, value) or(): 
    File "RatingsBreakdown.py", line 13, in mapper_get_ratings 
    (userID, movieID, rating, timestamp) = line.split('\t') 
ValueError: need more than 1 value to unpack 
I want to resolve the error and understand what my mistake is. Running my MovieRatings script gives:

[[email protected] Desktop]# python3 MovieRatings.py UN.txt 
No configs found; falling back on auto-configuration 
No configs specified for inline runner 
Running step 1 of 1... 
Creating temp directory /tmp/MovieRatings.training.20171128.083635.368889 
Error while reading from /tmp/MovieRatings.training.20171128.083635.368889/step/000/reducer/00000/input: 
Traceback (most recent call last): 
    File "MovieRatings.py", line 20, in <module> 
    MovieRatings.run() 
    File "/usr/lib/python3.4/site-packages/mrjob/job.py", line 424, in run 
    mr_job.execute() 
    File "/usr/lib/python3.4/site-packages/mrjob/job.py", line 445, in execute 
    super(MRJob, self).execute() 
    File "/usr/lib/python3.4/site-packages/mrjob/launch.py", line 185, in execute 
    self.run_job() 
    File "/usr/lib/python3.4/site-packages/mrjob/launch.py", line 233, in run_job 
    runner.run() 
    File "/usr/lib/python3.4/site-packages/mrjob/runner.py", line 511, in run 
    self._run() 
    File "/usr/lib/python3.4/site-packages/mrjob/sim.py", line 150, in _run 
    self._run_reducers(step_num, num_reducer_tasks) 
    File "/usr/lib/python3.4/site-packages/mrjob/sim.py", line 246, in _run_reducers 
    for task_num in range(num_reducer_tasks) 
    File "/usr/lib/python3.4/site-packages/mrjob/sim.py", line 120, in _run_multiple 
    func() 
    File "/usr/lib/python3.4/site-packages/mrjob/sim.py", line 685, in _run_task 
    stdin, stdout, stderr, wd, env) 
    File "/usr/lib/python3.4/site-packages/mrjob/inline.py", line 92, in invoke_task 
    task.execute() 
    File "/usr/lib/python3.4/site-packages/mrjob/job.py", line 439, in execute 
    self.run_reducer(self.options.step_num) 
    File "/usr/lib/python3.4/site-packages/mrjob/job.py", line 560, in run_reducer 
    for out_key, out_value in reducer(key, values) or(): 
    File "MovieRatings.py", line 17, in reducer_count_ratings 
    yield Key, sum(values) 
NameError: name 'Key' is not defined 

Also, at first glance this library does seem to work on Python 3; the first failure comes from the script itself:

File "RatingsBreakdown.py", line 13, in mapper_get_ratings
    (userID, movieID, rating, timestamp) = line.split('\t')
ValueError: need more than 1 value to unpack

+1

Should that be 'key' instead of 'Key'? – Mel

+0

The '[]' in 'steps()' has an indentation problem. There may be several issues here. – cdarke

Answers

0

It looks like you ran RatingsBreakdown.py ... Also, the input you showed does not contain any tabs, yet you tried to extract 4 columns from it. You have not made clear what you expected here.
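
To illustrate the point about tabs, here is a minimal sketch in plain Python. The helper name parse_rating and the tab-separated sample record are made up for illustration; only the prose line resembles the input shown in the question. Unpacking four names from line.split('\t') can only succeed when the line really has four tab-separated fields, so a line of ordinary prose yields a single field and the unpacking fails.

```python
def parse_rating(line):
    """Return (userID, movieID, rating, timestamp), or None when the line
    does not contain exactly four tab-separated fields."""
    fields = line.rstrip('\n').split('\t')
    if len(fields) != 4:
        return None
    return tuple(fields)


# A line of prose, similar to the UN sample: it contains no tab characters.
prose_line = "The United Nations (UN) is an intergovernmental organization"
# Hypothetical tab-separated record, invented for this example.
ratings_line = "196\t242\t3\t881250949"

print(len(prose_line.split('\t')))   # number of fields found in the prose line
print(parse_rating(prose_line))      # guarded parse instead of a ValueError
print(parse_rating(ratings_line))
```

Checking the field count before unpacking is one possible design; whether to skip, log, or reject such lines depends on what the job is meant to do.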

File "MovieRatings.py", line 17, in reducer_count_ratings 
    yield Key, sum(values) 
NameError: name 'Key' is not defined 

Self-explanatory ... your variable is named key. Try the examples from this course (link).
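
Python names are case-sensitive, so Key and key are two different names. A minimal sketch of the reducer with the name corrected follows; it is written as a standalone function so it runs without mrjob, which is an assumption made for illustration (in the job it would be a method that also takes self).

```python
def reducer_count_ratings(key, values):
    # The yielded name must match the parameter exactly:
    # 'key' is defined here, while 'Key' would raise NameError.
    yield key, sum(values)


print(list(reducer_count_ratings('UN', [1, 1, 1])))
```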

+0

I don't think my question could be answered completely as asked. Next time I will try to explain my question better. Thank you for your help –

0

I tried the following, and it worked.

from mrjob.job import MRJob

class wordcount(MRJob):

    def mapper(self, _, line):
        (word) = line.split(' ')
        yield word, 1

    def reducer(self, x, count):
        yield x, sum(count)

if __name__ == '__main__':
    wordcount.run()
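
One thing worth noting about the mapper above: line.split(' ') returns a list, and (word) = ... binds that whole list to word, so the key emitted for each line is the full list of its tokens rather than one key per word. The job runs without error, but it is not really counting individual words. Below is a sketch of a per-word count in plain Python, so it can be tried without mrjob. The function names and the run_local grouping step are illustrative assumptions and not part of mrjob's API; in an MRJob class, the same two bodies would go into the mapper and reducer methods.

```python
from collections import defaultdict


def mapper(line):
    # split() with no argument splits on any whitespace and drops empty strings
    for word in line.split():
        yield word, 1


def reducer(word, counts):
    yield word, sum(counts)


def run_local(lines):
    """Tiny stand-in for the shuffle step: group mapper output by key,
    then hand each group to the reducer."""
    grouped = defaultdict(list)
    for line in lines:
        for word, count in mapper(line):
            grouped[word].append(count)

    result = {}
    for word in sorted(grouped):
        for key, total in reducer(word, grouped[word]):
            result[key] = total
    return result


print(run_local(["the UN and the League", "the UN"]))
```

This sketch treats "UN" and "un" as different words and leaves punctuation attached to tokens; normalizing case or stripping punctuation would be a separate decision.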