2016-12-20 1 views

I'm trying to scrape table data from Wikipedia, but I keep getting a NoneType error:

AttributeError: 'NoneType' object has no attribute 'findAll'

Here is my code.

from bs4 import BeautifulSoup 
import urllib 
import urllib.request 



wiki = "https://en.wikipedia.org/wiki/List_of_current_United_States_Senators" 
page = urllib.request.urlopen(wiki) 
soup = BeautifulSoup(page, "lxml") 

name = "" 
party = "" 
state = "" 
picture = "" 
link = "" 
district = "" 

table = soup.find("table", { "class" : "wikitable sortable" }) 

f = open('output.csv', 'w') 

for row in table.findAll("tr"): 
    cells = row.findAll("td") 


    state = cells[0].find(text=True) 
    picture = cells[2].findAll(text=True) 
    name = cells[3].find(text=True) 
    party = cells[4].find(text=True) 


    write_to_file = name + "," + state + "," + party + "," + link + "," + picture + "," + district + "\n" 
    print (write_to_file) 
    f.write(write_to_file) 

f.close() 

Any help would be appreciated, even if it takes a different approach (I thought about using the wiki API, but I'm rather lost on how to use it).

Answers


The main problem is that soup.find("table", { "class" : "wikitable sortable" }) returns None. There is an element with the class sortable wikitable sortable, though, so maybe that is the one you want.
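
To illustrate why the lookup comes back empty, here is a minimal sketch on a tiny inline HTML string instead of the live page (the markup is a stand-in, and the html.parser backend and the select_one alternative are my choices, not part of the original code). A class string with spaces is compared against the whole attribute value, while a CSS selector only asks that each class be present.

```python
from bs4 import BeautifulSoup

# Tiny stand-in for the Wikipedia page; the class attribute mirrors the one named above.
html = '<table class="sortable wikitable sortable"><tr><td>Alabama</td></tr></table>'
soup = BeautifulSoup(html, "html.parser")

# A multi-word class string must equal the whole attribute value,
# so this lookup finds nothing and returns None.
missing = soup.find("table", {"class": "wikitable sortable"})

# A CSS selector only requires each listed class to be present.
table = soup.select_one("table.wikitable.sortable")

# Guard against None before calling methods on the result.
if table is None:
    raise SystemExit("table not found; check the class names in the page source")

cells = [td.get_text(strip=True) for td in table.find_all("td")]
print(missing, cells)
```

Checking for None right after the lookup turns the confusing AttributeError into a clear message about the selector.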

I fixed that and added an if and a few prints. It still doesn't work, but the remaining problem is now easy to track down. Now it's your turn :)

from bs4 import BeautifulSoup 
import urllib 
import urllib.request 

wiki = "https://en.wikipedia.org/wiki/List_of_current_United_States_Senators" 
page = urllib.request.urlopen(wiki) 
soup = BeautifulSoup(page, "lxml") 

name = "" 
party = "" 
state = "" 
picture = "" 
link = "" 
district = "" 

table = soup.find("table", { "class" : "sortable wikitable sortable" }) 

f = open('output.csv', 'w') 

for row in table.findAll("tr"): 
    cells = row.findAll("td") 
    if cells: 
        state = cells[0].find(text=True) 
        picture = cells[2].findAll(text=True) 
        name = cells[3].find(text=True) 
        party = cells[4].find(text=True) 

        print(state, type(state)) 
        print(picture, type(picture)) 
        print(name, type(name)) 
        print(party, type(party)) 
        write_to_file = name + "," + state + "," + party + "," + link + "," + picture + "," + district + "\n" 
        print(write_to_file) 
        f.write(write_to_file) 
        f.flush() 

f.close() 
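
A note on the failure that remains in the code above: picture comes from findAll(text=True), which returns a list, so concatenating it with strings raises a TypeError. The sketch below shows one possible fix, joining the list into a string and letting csv.writer build the row. The inline HTML, its portrait cell and the in-memory StringIO target are stand-ins for illustration, not the real page or file.

```python
import csv
import io

from bs4 import BeautifulSoup

# Hypothetical two-row table laid out like the one indexed above:
# cells[0] state, cells[2] picture, cells[3] name, cells[4] party.
html = """
<table>
  <tr><th>State</th><th>Class</th><th>Portrait</th><th>Name</th><th>Party</th></tr>
  <tr><td>Alabama</td><td>3</td><td><a href="/wiki/File:Example.jpg">portrait</a></td>
      <td>Richard Shelby</td><td>Republican</td></tr>
</table>
"""
soup = BeautifulSoup(html, "html.parser")

out = io.StringIO()  # stands in for open('output.csv', 'w', newline='')
writer = csv.writer(out)

for row in soup.find_all("tr"):
    cells = row.find_all("td")
    if not cells:
        continue  # the header row has only <th> cells
    state = cells[0].get_text(strip=True)
    # find_all(string=True) returns a list of strings; join it before writing
    picture = " ".join(cells[2].find_all(string=True)).strip()
    name = cells[3].get_text(strip=True)
    party = cells[4].get_text(strip=True)
    writer.writerow([name, state, party, picture])

print(out.getvalue())
```

Using csv.writer also takes care of quoting any field that itself contains a comma.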

Thank you very much! I got it working (with text, anyway). But I'm still trying to figure out how to get the picture links.


I tried using something like "picture = cells.findAll("a")", but it returns a result set. How can I get the link out of it? Thank you!
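
One way to get from the result set to the actual links, sketched on a made-up cell (the href and src values are placeholders, not taken from the real page): call find_all on a single cell such as cells[2], loop over the returned tags, and read each tag's attributes like dictionary keys.

```python
from bs4 import BeautifulSoup

# Hypothetical picture cell: a link wrapping a thumbnail image.
html = (
    '<table><tr><td><a href="/wiki/File:Example.jpg">'
    '<img src="//upload.example.org/thumb/Example.jpg"></a></td></tr></table>'
)
soup = BeautifulSoup(html, "html.parser")
cell = soup.find("td")  # in the loop above this would be one element of cells

# find_all returns a ResultSet (a list of Tag objects); iterate over it
# and read the href attribute from each Tag.
links = [a["href"] for a in cell.find_all("a", href=True)]

# The image URL itself lives in the <img> tag's src attribute.
img = cell.find("img")
src = img.get("src") if img is not None else ""

print(links, src)
```

Passing href=True skips anchors without an href, and .get() avoids a KeyError when an attribute is missing.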

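import csv

import bs4
import requests

base_url = 'https://en.wikipedia.org/wiki/List_of_current_United_States_Senators'
response = requests.get(base_url)
soup = bs4.BeautifulSoup(response.text, 'lxml')

# The snippet as posted never defined `table`; this lookup is an assumption
# that reuses the class string from the answer above and may need adjusting.
table = soup.find('table', {'class': 'sortable wikitable sortable'})

with open('out.txt', 'w', newline='') as out:
    writer = csv.writer(out)
    # Calling a tag, as in table('tr'), is shorthand for table.find_all('tr')
    for row in table('tr'):
        # Header rows contain only <th> cells, so they produce an empty list
        row_text = [td.get_text(strip=True) for td in row('td') if td.text]
        writer.writerow(row_text)
        print(row_text)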

Prints:

[] 
['Alabama', '3', 'Shelby, RichardRichard Shelby', 'Republican', 'None', 'U.S. House,Alabama Senate', 'University of Alabama, Tuscaloosa(BA;LLB)Birmingham School of Law(JD)', 'January 3, 1987', '(1934-05-06)May 6, 1934(age\xa082)', '2022'] 
['Alabama', '2', 'Sessions, JeffJeff Sessions', 'Republican', 'Lawyer in private practice', 'Alabama Attorney General,U.S. Attorneyfor theSouthern District of Alabama', 'Huntingdon College(BA)University of Alabama, Tuscaloosa(JD)', 'January 3, 1997', '(1946-12-24)December 24, 1946(age\xa069)', '2020'] 
['Alaska', '3', 'Murkowski, LisaLisa Murkowski', 'Republican', 'Lawyer in private practice', 'Alaska House', 'Georgetown University(BA)Willamette University(JD)', 'December 20, 2002', '(1957-05-22)May 22, 1957(age\xa059)', '2022'] 
['Alaska', '2', 'Sullivan, DanDan Sullivan', 'Republican', 'Lawyer in private practice', 'Alaska Natural Resources Commissioner,Alaska Attorney General,U.S. Assistant Secretary of State for Economic and Business Affairs', 'Harvard University(BA)Georgetown University(MS;JD)', 'January 3, 2015', '(1964-11-13)November 13, 1964(age\xa052)', '2020'] 

out.txt:

Alabama,3,"Shelby, RichardRichard Shelby",Republican,None,"U.S. House,Alabama Senate","University of Alabama, Tuscaloosa(BA;LLB)Birmingham School of Law(JD)","January 3, 1987","(1934-05-06)May 6, 1934(age 82)",2022 
Alabama,2,"Sessions, JeffJeff Sessions",Republican,Lawyer in private practice,"Alabama Attorney General,U.S. Attorneyfor theSouthern District of Alabama","Huntingdon College(BA)University of Alabama, Tuscaloosa(JD)","January 3, 1997","(1946-12-24)December 24, 1946(age 69)",2020 
Alaska,3,"Murkowski, LisaLisa Murkowski",Republican,Lawyer in private practice,Alaska House,Georgetown University(BA)Willamette University(JD),"December 20, 2002","(1957-05-22)May 22, 1957(age 59)",2022 
Alaska,2,"Sullivan, DanDan Sullivan",Republican,Lawyer in private practice,"Alaska Natural Resources Commissioner,Alaska Attorney General,U.S. Assistant Secretary of State for Economic and Business Affairs",Harvard University(BA)Georgetown University(MS;JD),"January 3, 2015","(1964-11-13)November 13, 1964(age 52)",2020 
Arizona,3,"McCain, JohnJohn McCain",Republican,None,"U.S. House,U.S. NavyCaptain",United States Naval Academy(BS),"January 3, 1987","(1936-08-29)August 29, 1936(age 80)",2022 
Arizona,1,"Flake, JeffJeff Flake",Republican,Nonprofit director,U.S. House,"Brigham Young University, Utah(BA;MA)","January 3, 2013","(1962-12-31)December 31, 1962(age 53)",2018 
Arkansas,3,"Boozman, JohnJohn Boozman",Republican,Optometrist,"Rogers Public School Board,U.S. House","University of Arkansas, Fayetteville(attended)Southern College of Optometry(OD)","January 3, 2011","(1950-12-10)December 10, 1950(age 66)",2022
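
The fused names in the output, such as "Shelby, RichardRichard Shelby", are what get_text(strip=True) produces when a cell holds several text fragments. The sketch below assumes the cell contains a hidden sort key followed by the visible link text (the markup is a guess for illustration, not copied from the page) and shows how a separator argument keeps the fragments apart.

```python
from bs4 import BeautifulSoup

# Assumed cell layout: a hidden sort key followed by the visible name link.
html = (
    '<table><tr><td><span style="display:none">Shelby, Richard</span>'
    '<a href="/wiki/Richard_Shelby">Richard Shelby</a></td></tr></table>'
)
soup = BeautifulSoup(html, "html.parser")
row = soup.find("tr")

# row("td") is shorthand for row.find_all("td").
td = row("td")[0]

# Without a separator the fragments are glued together.
fused = td.get_text(strip=True)

# With a separator each fragment stays distinguishable.
spaced = td.get_text(" | ", strip=True)

print(fused)
print(spaced)
```

If only the visible name is wanted, reading the text of the link inside the cell, for example td.find("a").get_text(), is another option.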