BeautifulSoup: extracting content within a tag

I would like to extract the content "Hello world". Note that the page also contains multiple <table> elements and similar <td colspan="2"> cells.

I tried the following:

hello = soup.find(text='Name: ') 
hello.findPreviousSiblings

However, it returned nothing. This is the markup I am working with:

<table border="0" cellspacing="2" width="800"> 
<tr> 
<td colspan="2"><b>Name: </b>Hello world</td> 
</tr> 
<tr>

In addition, I am also having trouble extracting "My home address" from the following:

<td><b>Address:</b></td> 

<td>My home address</td>

I am using the same method to search for text="Address: ", but how do I move down to the next line and extract the content of that <td>?

Source

2011-05-14 ready

Use next instead:

>>> s = '<table border="0" cellspacing="2" width="800"><tr><td colspan="2"><b>Name: </b>Hello world</td></tr><tr>' 
>>> soup = BeautifulSoup(s) 
>>> hello = soup.find(text='Name: ') 
>>> hello.next 
u'Hello world'

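next and previous let you move through the document elements in the order they were processed by the parser, while the sibling methods work with the parse tree.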
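The difference between parse-order navigation and sibling navigation can be sketched as follows. This is only an illustration and assumes the bs4 package (Beautiful Soup 4) with its next_element and next_sibling attributes and the built-in html.parser, rather than the older API shown in the session above; the variable names are made up for the example.

```python
from bs4 import BeautifulSoup

html = '<table><tr><td colspan="2"><b>Name: </b>Hello world</td></tr></table>'
soup = BeautifulSoup(html, "html.parser")

# The text node that lives inside <b>
label = soup.find(string="Name: ")

# next_element follows parse order: the node parsed right after "Name: "
after_label = label.next_element

# next_sibling stays inside the same parent; "Name: " is the only child of <b>
sibling_of_label = label.next_sibling

# The <b> tag itself does have a following sibling: the text after it
sibling_of_b = label.parent.next_sibling

print(repr(after_label), repr(sibling_of_label), repr(sibling_of_b))
```

This is probably why findPreviousSiblings on the text node found nothing: the text node inside <b> has no siblings at all, while the value sits next to the <b> tag one level up.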

Source

2011-05-14 02:26:53

It returns nothing. hello = soup.find(text='Name: ') hello.next – ready

Do you use 'Name: ' elsewhere in the document? –

Sorry, that was my mistake earlier. It works now. – ready

The contents operator works well for extracting text from <tag>text</tag>.

Example for <td>My home address</td>:

from bs4 import BeautifulSoup

s = '<td>My home address</td>'
soup = BeautifulSoup(s, 'html.parser')
td = soup.find('td')  # <td>My home address</td>
td.contents  # ['My home address'] (a list of child nodes)
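One detail worth keeping in mind is that contents gives back a list of child nodes, not a single string, which matters for a cell that mixes a label with a value, as in the "Name: " cell from the question. A minimal sketch, assuming bs4 and the built-in html.parser (the variable names are only for illustration):

```python
from bs4 import BeautifulSoup

html = '<td colspan="2"><b>Name: </b>Hello world</td>'
soup = BeautifulSoup(html, "html.parser")
td = soup.find("td")

# Direct children of the cell: the <b> tag and the text node after it
children = td.contents

# The value is the last child, i.e. the text that follows <b>
value = td.contents[-1]

# get_text() joins all text in the cell, label included
full_text = td.get_text()

print(children, repr(value), repr(full_text))
```

Indexing into contents depends on the exact shape of the markup, so it is best treated as a quick way to pull a value from a known layout.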

Example for <td><b>Address:</b></td>:

from bs4 import BeautifulSoup

s = '<td><b>Address:</b></td>'
soup = BeautifulSoup(s, 'html.parser')
b = soup.find('td').find('b')  # <b>Address:</b>
b.contents  # ['Address:'] (a list of child nodes)
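For the second part of the question, getting from the "Address:" label to the cell that holds the value, one possible approach is to climb to the enclosing <td> and then step to the next <td> sibling. The sketch below assumes bs4 with html.parser, and assumes the two cells share one <tr>; the wrapping <table> and <tr> are added here for illustration and are not shown in the question.

```python
from bs4 import BeautifulSoup

html = """
<table>
<tr>
<td><b>Address:</b></td>
<td>My home address</td>
</tr>
</table>
"""
soup = BeautifulSoup(html, "html.parser")

# Locate the label tag by its exact text
label = soup.find("b", string="Address:")

# Climb to the <td> that wraps the label
label_cell = label.find_parent("td")

# Step to the next <td> in the same row; whitespace text nodes are skipped
value_cell = label_cell.find_next_sibling("td")

address = value_cell.get_text(strip=True)
print(address)
```

Using find_next_sibling("td") instead of next_sibling avoids landing on the newline text node that sits between the two cells.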

Source

2013-01-09 18:21:05 solvingPuzzles
