2013-04-22 3 views

Error in Scrapy web crawler

Hello. I tried to implement the following code:

from scrapy.spider import BaseSpider
from scrapy.selector import HtmlXPathSelector
from bs4 import BeautifulSoup

class spider_aicte(BaseSpider):
    name = "Indian_Colleges"
    allowed_domains = ["http://www.domain.org"]
    start_urls = [
        "http://www.domain.org/appwebsite.html",
    ]

    def parse(self, response):
        filename = response.url.split("/")[-2]
        soup = BeautifulSoup(response.body)
        for link in soup.find_all('a'):
            download_link = link.get('href')
            if '.pdf' in download_link:
                pdf_link = "http://www.domain.org" + download_link
                print pdf_link
                class FileSpider(BaseSpider):
                    name = "fspider"
                    allowed_domains = ["www.domain.org"]
                    start_urls = [
                        pdf_link
                    ]
        for url in pdf_link:
            yield Request(url, callback=self.save_pdf)

    def save_pdf(self, response):
        path = self.get_path(response.url)
        with open(path, "wb") as f:
            f.write(response.body)

Answers

You need to import Request before using it:

from scrapy.http import Request

Or, there is also a "shortcut" import:

from scrapy import Request

Or, if you have an import scrapy line, use scrapy.Request.
