2014-01-17

I'm trying to run a Scrapy crawling process with Celery. I've gone through a lot of tutorials, and this seems to be how everyone does it, but it doesn't work for me: AttributeError: 'CrawlerProcess' object has no attribute 'install'.

tasks.py

from multiprocessing import Process 
from scrapy.crawler import CrawlerProcess 
from scrapy.conf import settings 
from scraper.ubuntu_scraper.ubuntu_spider import UbuntuSpider 
from celery.task.schedules import crontab 
from celery.decorators import periodic_task 

# this will run every minute 
@periodic_task(run_every=crontab(hour="*", minute="*", day_of_week="*")) 
def crawl(): 
    crawler = DomainCrawlerScript() 
    return crawler.crawl() 

class DomainCrawlerScript():
    def __init__(self):
        self.crawler = CrawlerProcess(settings)
        self.crawler.install()
        self.crawler.configure()

    def _crawl(self):
        self.crawler.crawl(UbuntuSpider)
        self.crawler.start()
        self.crawler.stop()

    def crawl(self):
        p = Process(target=self._crawl)
        p.start()
        p.join()

celery.py

from __future__ import absolute_import 
import os 
from celery import Celery 

os.environ.setdefault('DJANGO_SETTINGS_MODULE', 'hardware.settings') 

app = Celery('hardware', broker = 'django://', include=['scraper.tasks']) 
app.config_from_object('django.conf:settings') 
When I run python manage.py celeryd -v 2 -B -s celery -E -l INFO -I scraper.tasks,

I get:

Traceback (most recent call last): 
    File "/usr/local/lib/python2.7/dist-packages/celery/app/trace.py", line 238, in trace_task 
    R = retval = fun(*args, **kwargs) 
    File "/usr/local/lib/python2.7/dist-packages/celery/app/trace.py", line 416, in __protected_call__ 
    return self.run(*args, **kwargs) 
    File "/home/olyazavr/Moka5/hardware/scraper/tasks.py", line 12, in crawl 
    crawler.install() 
AttributeError: 'CrawlerProcess' object has no attribute 'install' 

Answer


Look at how the scrapy crawl command does it and do the same:

crawler = self.crawler_process.create_crawler() 
spider = crawler.spiders.create(spname, **opts.spargs) 
crawler.crawl(spider) 
self.crawler_process.start() 