Python/BeautifulSoup - How to remove all tags from an element?

Is there a simple way to remove all the tags from an element found with BeautifulSoup?

2013-04-25 Daniele B

Assuming you want to remove the tags but keep their contents, see the accepted answer to this question: Remove a tag using BeautifulSoup but keep its contents

2013-04-25 04:31:04 Shaun

It is as simple as this. The line below joins together all the text parts within the current element:

''.join(htmlelement.find_all(string=True))
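As a minimal sketch of the joining approach, here is a runnable version. It assumes bs4 4.4 or later, where the filter argument is spelled string (older releases spell it text). The helper name strip_tags and the sample markup are made up for illustration.

```python
from bs4 import BeautifulSoup


def strip_tags(element):
    """Join every text node under element into a single string (hypothetical helper)."""
    # find_all(string=True) returns the text nodes at any depth, in document order
    return ''.join(element.find_all(string=True))


# Illustrative markup, not taken from the question
soup = BeautifulSoup('<p>I linked to <i>example.com</i> today</p>', 'html.parser')
print(strip_tags(soup.p))  # I linked to example.com today
```

The element itself is left untouched; only a plain string is built from its text nodes.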

Source

2013-04-25 04:46:12

You can use the decompose method in bs4. Note that decompose() removes each nested tag together with its contents:

import bs4

soup = bs4.BeautifulSoup('<body><a href="http://example.com/">I linked to <i>example.com</i></a></body>', 'html.parser')

# Iterate over a copy of the children, since decompose() modifies the tree
for child in list(soup.find('a').children):
    if isinstance(child, bs4.element.Tag):
        child.decompose()

print(soup)

Out: <body><a href="http://example.com/">I linked to </a></body>

Source

2013-10-17 22:37:41 danblack

Why has no answer I've seen mentioned the unwrap method? Or, simpler still, the get_text method:

http://www.crummy.com/software/BeautifulSoup/bs4/doc/#unwrap
http://www.crummy.com/software/BeautifulSoup/bs4/doc/#get-text
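To make the difference concrete, here is a small sketch of unwrap(), which removes a tag but leaves its contents in place. The markup is the sample from the decompose answer; the choice of html.parser is an assumption.

```python
from bs4 import BeautifulSoup

markup = '<a href="http://example.com/">I linked to <i>example.com</i></a>'
soup = BeautifulSoup(markup, 'html.parser')

# Collect the nested tags first, then unwrap each one: the tag itself is
# removed while its children stay where they were.
for tag in soup.a.find_all(True):
    tag.unwrap()

print(soup.a)
```

Unlike decompose(), this keeps the text "example.com" inside the link; only the surrounding <i> markup disappears.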

Source

2014-04-29 00:40:34 Bobby

With BeautifulStoneSoup gone in bs4, it is even simpler in Python 3:

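from bs4 import BeautifulSoup

html = '<p>Some <b>bold</b> text</p>'  # placeholder markup, not from the original answer
soup = BeautifulSoup(html, 'html.parser')  # name the parser explicitly
text = soup.get_text()
print(text)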

Source

2015-01-27 02:47:02 shawnl

It would be better to use get_text() instead of getText(). – SparkAndShine

Why is that? It may well be true, but it would help to understand the reason.

getText() is bs3 syntax and does not comply with PEP 8; that is most likely why.

Use get_text(), which is even simpler: it returns all the text in a document or beneath a tag as a single Unicode string.

For example, to remove all the tags from the following text:

<td><a href="http://www.irit.fr/SC">Signal et Communication</a>
<br/><a href="http://www.irit.fr/IRT">Ingénierie Réseaux et Télécommunications</a>
</td>

The expected result (ignoring line breaks) is:

Signal et Communication Ingénierie Réseaux et Télécommunications

Here is the source code:

#!/usr/bin/env python3 
from bs4 import BeautifulSoup 

text = ''' 
<td><a href="http://www.irit.fr/SC">Signal et Communication</a> 
<br/><a href="http://www.irit.fr/IRT">Ingénierie Réseaux et Télécommunications</a> 
</td> 
''' 
soup = BeautifulSoup(text, 'html.parser')  # name the parser explicitly

print(soup.get_text())
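Because get_text() keeps the whitespace between tags, the raw result for this sample contains stray line breaks. A hedged variation, using the separator and strip parameters of get_text(), collapses the output onto one line. The variable name clean and the choice of html.parser are assumptions for illustration.

```python
from bs4 import BeautifulSoup

text = '''
<td><a href="http://www.irit.fr/SC">Signal et Communication</a>
<br/><a href="http://www.irit.fr/IRT">Ingénierie Réseaux et Télécommunications</a>
</td>
'''
soup = BeautifulSoup(text, 'html.parser')

# strip=True trims each text node and drops the ones that are empty after
# trimming; separator is placed between the remaining pieces.
clean = soup.get_text(separator=' ', strip=True)
print(clean)
```

Choosing a single space as the separator keeps the two link labels from running together.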

Source

2015-07-20 16:37:08 SparkAndShine
