2016-12-23 1 views
1

BeautifulSoup - Scraping multiple tables from a page?

I am trying to scrape the contents of this URL, which contains multiple tables. The desired output looks like this:
NAME  FG% FT% 3PM REB AST STL BLK TO PTS  SCORE 
Team Jackson (0-8)  .4313 .7500 21 71 34 11 12 15 189  1-8-0 
Team Keyrouze (4-4)  .4441 .8090 31 130 71 18 13 45 373  8-1-0 
Nutz Vs. Draymond Green (4-4)  .4292 .8769 30 86 66 15 9 28 269  3-6-0 
Team Pauls 2 da Wall (3-5)  .4784 .8438 40 123 64 18 20 30 316  6-3-0 
Team Noey (2-6)  .4350 .7679 21 125 62 20 9 33 278  7-2-0 
YOU REACH, I TEACH (2-5-1)  .4810 .7432 20 114 56 30 7 50 277  2-7-0 
Kris Kaman His Pants (5-3)  .4328 .8000 20 74 59 20 5 27 238  3-6-0 
Duke's Balls In Daniels Face (3-4-1)  .5000 .7045 42 139 38 27 22 30 303  6-3-0 
Knicks Tape (5-3)  .5000 .8152 34 143 92 12 9 47 397  4-5-0 
Suck MyDirk (5-3)  .4734 .8814 29 106 86 22 17 40 435  5-4-0 
In Porzingod We Trust (4-4)  .4928 .7222 27 180 95 16 16 46 423  7-2-0 
Team Aguilar (6-1-1)  .4718 .7053 28 177 65 12 35 48 413  2-7-0 
Team Li (7-0-1)  .4714 .8118 35 134 74 17 17 47 368  6-3-0 
Team Iannetta (4-4)  .4527 .7302 22 125 90 20 13 44 288  3-6-0 

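If producing output in that table format is too difficult, I would at least like to know how to scrape all of the tables. My code for scraping all the rows is as follows: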

tableStats = soup.find('table', {'class': 'tableBody'}) 
rows = tableStats.findAll('tr') 

for row in rows: 
    print(row.string) 

However, it only prints the value "Team" and nothing else. Why doesn't the output include all the rows of the table?

Thanks.

+0

Why is everyone so keen on scraping basketball scores these days? – martianwars

Answers

0

I found a way to get exactly the 2-D matrix specified in the question. It is stored in teams as a list of lists, one inner list per team.

Code:

from bs4 import BeautifulSoup
import requests

source_code = requests.get("http://games.espn.com/fba/scoreboard?leagueId=224165&seasonId=2017")
plain_text = source_code.text
soup = BeautifulSoup(plain_text, 'lxml')
teams = []
rows = soup.find_all('tr', {'class': 'linescoreTeamRow'})

# Build a 2-D matrix: one list of cell texts per team row.
for row in rows:
    team_row = [column.get_text() for column in row.find_all('td')]
    print(team_row)
    # Add each team's row to the teams matrix.
    teams.append(team_row)

Output:

['Team Jackson (0-10)', '', '.4510', '.8375', '41', '135', '101', '23', '11', '50', '384', '', '5-4-0'] 
['YOU REACH, I TEACH (3-6-1)', '', '.4684', '.7907', '22', '169', '103', '22', '10', '32', '342', '', '4-5-0'] 
['Nutz Vs. Draymond Green (4-6)', '', '.4552', '.8372', '30', '157', '68', '15', '16', '39', '356', '', '2-7-0'] 
["Jesse's Blue Balls (4-5-1)", '', '.4609', '.7576', '47', '158', '71', '30', '20', '38', '333', '', '7-2-0'] 
['Team Noey (4-6)', '', '.4763', '.8261', '42', '164', '70', '25', '29', '44', '480', '', '5-4-0'] 
['Suck MyDirk (6-3-1)', '', '.4733', '.8403', '54', '160', '132', '23', '11', '47', '544', '', '4-5-0'] 
['Kris Kaman His Pants (5-5)', '', '.4569', '.8732', '53', '138', '105', '27', '21', '53', '465', '', '6-3-0'] 
['Team Aguilar (6-3-1)', '', '.4433', '.7229', '40', '202', '68', '30', '22', '54', '452', '', '3-6-0'] 
['Knicks Tape (6-3-1)', '', '.4406', '.8824', '52', '172', '108', '24', '13', '49', '513', '', '6-3-0'] 
['Team Iannetta (4-6)', '', '.5321', '.6923', '24', '146', '94', '32', '16', '60', '428', '', '3-6-0'] 
['In Porzingod We Trust (6-4)', '', '.4694', '.6364', '37', '216', '133', '31', '21', '77', '468', '', '4-5-0'] 
['Team Keyrouze (6-4)', '', '.4705', '.8854', '51', '135', '108', '25', '17', '43', '550', '', '5-4-0'] 
['Team Li (8-1-1)', '', '.4369', '.8182', '57', '203', '130', '34', '22', '54', '525', '', '6-3-0'] 
['Team Pauls 2 da Wall (5-5)', '', '.4780', '.5970', '27', '141', '47', '19', '25', '28', '263', '', '3-6-0'] 
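
The rows above still contain two empty strings each, which the desired output in the question does not have. A minimal sketch of one way to drop them and print the rows under a header follows. It assumes the empty entries are just spacer cells, takes the header names from the question's desired output, and uses two rows copied from the output above instead of fetching the page.

```python
# Two rows copied from the output above; the '' entries are assumed to be spacer cells.
teams = [
    ['Team Jackson (0-10)', '', '.4510', '.8375', '41', '135', '101', '23', '11', '50', '384', '', '5-4-0'],
    ['Team Li (8-1-1)', '', '.4369', '.8182', '57', '203', '130', '34', '22', '54', '525', '', '6-3-0'],
]

# Header names taken from the desired output in the question.
header = ['NAME', 'FG%', 'FT%', '3PM', 'REB', 'AST', 'STL', 'BLK', 'TO', 'PTS', 'SCORE']

# Drop empty cells so that each row lines up with the header.
cleaned = [[cell for cell in row if cell.strip()] for row in teams]

# Print the header followed by the cleaned rows, cells separated by two spaces.
for row in [header] + cleaned:
    print('  '.join(row))
```

If a real stat cell could ever be empty, filtering by value would shift the columns; in that case removing the cells by position would be safer.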
3

Instead of looking for the table tag, you should look for the rows directly by a more reliable class, such as linescoreTeamRow. This code snippet does the trick.

from bs4 import BeautifulSoup
import requests

response = requests.get("http://games.espn.com/fba/scoreboard?leagueId=224165&seasonId=2017")
soup = BeautifulSoup(response.text, 'lxml')
# Search for the rows directly by their class.
rows = soup.find_all('tr', {'class': 'linescoreTeamRow'})
# You will still need to isolate the cells in each row to rebuild the table.
for row in rows:
    print(row.text)
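
A likely reason the code in the question printed almost nothing is that a tag's .string attribute is None whenever the tag has more than one child, which is the case for any row with several cells. The sketch below illustrates this on a small inline table. The markup is hypothetical and only stands in for the real page, and it uses the built-in 'html.parser' so that it runs without lxml.

```python
from bs4 import BeautifulSoup

# Hypothetical markup for illustration only; the real ESPN page will differ.
html = """
<table class="tableBody">
  <tr><td>Team</td></tr>
  <tr class="linescoreTeamRow"><td>Team Li (8-1-1)</td><td>.4369</td><td>.8182</td></tr>
</table>
"""

soup = BeautifulSoup(html, 'html.parser')
rows = soup.find_all('tr')

# A row with a single cell: .string resolves to that cell's text.
single = rows[0].string
# A row with several cells: .string is None, so print(row.string) shows no data.
multi = rows[1].string

# Isolating the cells and calling get_text() works for any number of cells.
cells = [td.get_text(strip=True) for td in rows[1].find_all('td')]

print(single)
print(multi)
print(cells)
```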