How to use a recursive method in Ruby
I wrote a simple web crawler using Mechanize. Now I'm stuck on how to fetch the next page recursively. Please see the code below.
def self.generate_page # build the Mechanize page object for the first page
  agent = Mechanize.new
  url = "http://www.baidu.com/s?wd=intitle:#{URI.encode(WORD)}%20site:sina.com.cn&rn=50&gpc=stf#{URI.encode(TIME)}"
  page = agent.get(url)
  page
end
def self.next_page(n_page) # get the next page recursively by clicking the "next" link shown on each page
  puts n_page
  # If I don't use puts, I get nothing; when using puts, I get
  #<Mechanize::Page:0x007fd341c70fd0>
  #<Mechanize::Page:0x007fd342f2ce08>
  #<Mechanize::Page:0x007fd341d0cf70>
  #<Mechanize::Page:0x007fd3424ff5c0>
  #<Mechanize::Page:0x007fd341e1f660>
  #<Mechanize::Page:0x007fd3425ec618>
  #<Mechanize::Page:0x007fd3433f3e28>
  #<Mechanize::Page:0x007fd3433a2410>
  #<Mechanize::Page:0x007fd342446ca0>
  #<Mechanize::Page:0x007fd343462490>
  #<Mechanize::Page:0x007fd341c2fe18>
  #<Mechanize::Page:0x007fd342d18040>
  #<Mechanize::Page:0x007fd3432c76a8>
  # which are the results I want
  np = Mechanize.new.click(n_page.link_with(:text=>/next/)) unless n_page.link_with(:text=>/next/).nil?
  result = next_page(np) unless np.nil?
  result # here the value is empty; I don't know what is wrong
end
def self.get_page # trying to pass on the result of the next_page method
  puts next_page(generate_page)
  # it seems the result is never passed back here
end
I have read these two links,
What is recursion and how does it work? and
Ruby recursive function, but I still can't figure out what is wrong. I hope someone can help me. Thanks.
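One likely reason the value comes back empty: next_page returns only the result of the recursive call, and the deepest call (where no "next" link is found) returns nil, so nil travels all the way back up. A minimal sketch of a recursive version that accumulates the pages instead is shown below. The names collect_pages, fetch_next and FakePage are hypothetical and not part of Mechanize; FakePage only stands in for Mechanize::Page so the sketch runs without network access.

```ruby
# Recursively collect a page and every page reachable through its "next" link.
# `fetch_next` is a hypothetical callable: it takes a page and returns the
# following page, or nil when there is no further page.
def collect_pages(page, fetch_next)
  return [] if page.nil? # base case: nothing left to collect
  [page] + collect_pages(fetch_next.call(page), fetch_next)
end

# With Mechanize, fetch_next might look like this (assumes the link text
# matches /next/, as in the question):
#   fetch_next = ->(page) { link = page.link_with(text: /next/); link && link.click }
#   pages = collect_pages(agent.get(url), fetch_next)

# Stand-in pages so the sketch can be run offline.
FakePage = Struct.new(:name, :next_page)
third  = FakePage.new("p3", nil)
second = FakePage.new("p2", third)
first  = FakePage.new("p1", second)

pages = collect_pages(first, ->(page) { page.next_page })
puts pages.map(&:name).inspect # prints ["p1", "p2", "p3"]
```

Each call returns its own page plus whatever the deeper calls found, so the caller receives the full list rather than the nil produced by the last call.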
Thanks! I tried to do it iteratively at first, but I couldn't get the result that way either. The code 'results.push(np) until !np np = click_to_next_page(np) np && results.push(np)' really helped me a lot! – roccia
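The iterative approach mentioned in the comment could be sketched roughly as follows. This is only an illustration under assumptions: click_to_next_page is the helper name used in the comment, passed in here as a callable, and StubPage is a hypothetical stand-in for Mechanize::Page so that the example runs offline.

```ruby
# Iteratively follow the "next" chain, pushing each page onto `results`.
# `click_to_next_page` is assumed to return the following page, or nil
# when the current page has no "next" link.
def collect_pages_iteratively(first_page, click_to_next_page)
  results = []
  np = first_page
  while np # stop once there is no further page
    results.push(np)
    np = click_to_next_page.call(np)
  end
  results
end

# Stand-in pages so the sketch can be run without network access.
StubPage = Struct.new(:name, :next_page)
last_page   = StubPage.new("page 2", nil)
start_page  = StubPage.new("page 1", last_page)

all_pages = collect_pages_iteratively(start_page, ->(page) { page.next_page })
puts all_pages.map(&:name).inspect # prints ["page 1", "page 2"]
```

A loop like this avoids deep call stacks when a search has many result pages, at the cost of carrying the accumulator explicitly.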