I'm new to Python, and I'm sure the cause of this is simply a blonde moment, but it has been driving me up the wall for the past few days. I keep getting an AttributeError on a re.compile pattern:
"elif conName.search(line): AttributeError: 'list' object has no attribute 'search'" at line 106.
If I replace the pattern on line 54 with the one on line 50, lines 106-113 run fine, but the same error occurs at line 114.
If I then comment out lines 106-109, lines 110-113 run fine, but I still get the same error at line 114. If I comment out lines 105-123, the rest of the code works fine.
## This should be line 19
html_doc = """
<title>Flickr: username</title>
<meta http-equiv="Content-Type" content="text/html; charset=UTF-8">
<meta property="og:title" content="username" />
<meta property="og:type" content="flickr_photos:profile" />
<meta property="og:url" content="http://www.flickr.com/people/Something/" />
<meta property="og:site_name" content="Flickr" />
<meta property="og:image" content="http://farm79.staticflickr.com/1111/buddyicons/[email protected]?1234567890#[email protected]" />
<li>
<a href="/groups/theportraitgroup/">The Portrait Group</a>
<span class="text-list-item">
1,939,830 photos, 125,874 members
</span>
</li>
<li>
<a href="/groups/[email protected]/">Seagulls Gone Wild</a>
<span class="text-list-item">
2,266 photos, 464 members
</span>
</li> """
from urllib.request import urlopen
from bs4 import BeautifulSoup
import fileinput
import re
## This should be line 46
## Strips for basic group data
Tab = re.compile(r'(\t){1,}') # strip tabs
ID = re.compile(r'^.*/">') # Group ID, could be ID or Href
Href = re.compile(r'(\s)*<a href="/groups/') # Strips to beginning of ID
GName = re.compile(r'/">(<b>)*') # Strips from end of Href to GName
## Persons contact info
conName = re.compile(r'(\s)*<meta property="og:title" content="') # Contact Name
##conName = re.compile(r'(\s)*<a href="/groups/')
conID = re.compile(r'(\s)*<meta property="og:image.*#') # Gets conName's @N ID
conRef = re.compile(r'(\s)*<meta property="og:url.*com/people/')
Amp = re.compile("&amp;")
Qt = re.compile("&quot;")
Gt = re.compile("&gt;")
Lt = re.compile("&lt;")
exfile = 1 ## 0 = use internal data, 1 = use external file
InFile = html_doc
## Raw strings keep "\t" in the Windows paths from being read as a tab
if exfile:
    InFile = open(r'\Python\test\Group\group 50 min.ttxt', 'r', encoding = "utf-8", errors = "backslashreplace")
    closein = 1 ## Only close input file if it was opened
else:
    closein = 0
OutFile = open(r'C:\Python\test\Group\Output.ttxt', 'w', encoding = "utf-8", errors = "backslashreplace")
cOutFile = open(r'C:\Python\test\Group\ContactOutput.ttxt', 'w', encoding = "utf-8", errors = "backslashreplace")
i = 1 ## counter for debugging
## This should be line 80
for line in InFile:
    ## print('{}'.format(i), end = ', ') ## this is just a debugging line, to see where the program errors out
    ## i += 1
    if Href.search(line):
        ln = line
        ln = re.sub(Href, "", ln)
        gID, Name = ln.split("/\">")
        Name = Name[:-5] ## this removes the "\n" at EOL as well
        if "@N" in gID:
            rH = ""
        else:
            rH = gID
            gID = ""
        ## sLn = '{3}\t{0}\t{1}\t{2}\n'.format(Name, gID, rH, conName)
        sLn = '{0}\t{1}\t{2}\n'.format(Name, gID, rH, conName)
        ## Replace HTML codes
        sLn = re.sub(Gt, ">", sLn)
        sLn = re.sub(Lt, "<", sLn)
        sLn = re.sub(Qt, "\"", sLn)
        sLn = re.sub(Amp, "&", sLn)
        OutFile.write(sLn)
    ## This should be line 104
    #################################################
    elif conName.search(line):
        ln = line
        ln = re.sub(conName, "", ln)
        conName = ln.split("\" />")
    elif conID.search(line) is not None:
        ln = line
        ln = re.sub(conID, "", ln)
        conID = ln.split("\" />")
    elif conRef.search(line) is not None:
        ln = line
        ln = re.sub(conRef, "", ln)
        conRef = ln.split("\" />")
    else:
        pass
    sLn = '{0}\t{1}\t{2}\n'.format(conID, conRef, conName)
    cOutFile.write(sLn) ## I know, this will make a massive file with duplicated data, but deal w/ it later
    #################################################
if closein:
    InFile.close()
OutFile.close()
cOutFile.close()
Does anyone have any ideas?
Thanks.
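For reference, the traceback is consistent with a name being rebound inside the loop: conName starts out as a compiled pattern, but the assignment conName = ln.split(...) replaces it with a list, so the next call to conName.search(line) fails. The same applies to conID and conRef. Below is a minimal sketch of that behaviour and one possible fix, assuming the extracted value is simply stored under a separate name. The names con_name_pattern, contact_name and extract_contact_name are hypothetical and do not appear in the script above.

```python
import re

# Same pattern as conName in the script, written as a raw string.
con_name_pattern = re.compile(r'(\s)*<meta property="og:title" content="')


def extract_contact_name(lines):
    """Return the og:title content from the first matching line, else None."""
    contact_name = None  # the result has its own name; the pattern is never rebound
    for line in lines:
        if con_name_pattern.search(line):
            remainder = con_name_pattern.sub("", line)
            contact_name = remainder.split('" />')[0]
            break
    return contact_name


# Reproducing the failure: rebinding a pattern's name to a list.
pattern = re.compile("x")
pattern = "x y".split()  # pattern is now a list, not a compiled regex
try:
    pattern.search("x")
except AttributeError as exc:
    error_message = str(exc)  # the list has no search() method

sample = [
    '<title>Flickr: username</title>\n',
    '<meta property="og:title" content="username" />\n',
]
print(extract_contact_name(sample))
print(error_message)
```

Keeping the compiled patterns and the extracted strings under different names means each elif branch can run on every iteration without changing what the next iteration tests against.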
I see you have 'from bs4 import BeautifulSoup'; why don't you use it to parse the HTML fragment instead of regular expressions? – jfs
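The comment's suggestion, parsing the markup instead of matching it line by line, could look roughly like the sketch below. To keep it runnable without extra installs it uses the standard library's html.parser rather than BeautifulSoup, so it illustrates the approach and is not the BeautifulSoup API. The class name GroupParser and its attributes are hypothetical, and the sample markup is a shortened copy of html_doc from the question.

```python
from html.parser import HTMLParser


class GroupParser(HTMLParser):
    """Collect og: meta properties and /groups/ links from a profile page."""

    def __init__(self):
        super().__init__()
        self.meta = {}     # e.g. {"og:title": "username"}
        self.groups = []   # (group id or href slug, group name) pairs
        self._slug = None  # set while inside an <a href="/groups/...">

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        prop = attrs.get("property") or ""
        href = attrs.get("href") or ""
        if tag == "meta" and prop.startswith("og:"):
            self.meta[prop] = attrs.get("content") or ""
        elif tag == "a" and href.startswith("/groups/"):
            self._slug = href[len("/groups/"):].strip("/")

    def handle_data(self, data):
        # The first text chunk inside a group link is taken as the group name.
        if self._slug is not None and data.strip():
            self.groups.append((self._slug, data.strip()))
            self._slug = None

    def handle_endtag(self, tag):
        if tag == "a":
            self._slug = None


sample = """
<meta property="og:title" content="username" />
<meta property="og:url" content="http://www.flickr.com/people/Something/" />
<li>
<a href="/groups/theportraitgroup/">The Portrait Group</a>
</li>
"""

parser = GroupParser()
parser.feed(sample)
parser.close()
print(parser.meta["og:title"])
print(parser.groups)
```

With a parser the attribute values arrive already split out, so there is no need to strip prefixes with re.sub or to split on the closing '" />'.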