Cannot download a file over HTTP using urllib.urlretrieve

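I'm still working on my MP3 downloader, but now I'm having trouble with the file being downloaded. I have two versions of the part that does the download. The first gives me a proper file but raises an error. The second gives me a file that is far too small but raises no error. I've tried opening the file in binary mode, but that didn't help. I'm pretty new to doing anything with HTML, so any help would be appreciated.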

import urllib 
import urllib2 

def milk(): 
    SongList = [] 
    SongStrings = [] 
    SongNames = [] 
    earmilk = urllib.urlopen("http://www.earmilk.com/category/pop") 
    reader = earmilk.read() 
    #gets the position of the playlist 
    PlaylistPos = reader.find("var newPlaylistTracks = ") 
    #finds the number of songs in the playlist 
    NumberSongs = reader[reader.find("var newPlaylistIds = "): PlaylistPos].count(",") + 1 
    initPos = PlaylistPos 

    #goes through the playlist and records the html address and name of the song

    for song in range(0, NumberSongs):
        songPos = reader[initPos:].find("http:") + initPos
        namePos = reader[songPos:].find("name") + songPos
        namePos += reader[namePos:].find(">")
        nameEndPos = reader[namePos:].find("<") + namePos
        SongStrings.append(reader[songPos: reader[songPos:].find('"') + songPos])
        SongNames.append(reader[namePos + 1: nameEndPos])
        initPos = nameEndPos

    for correction in range(0, NumberSongs):
        SongStrings[correction] = SongStrings[correction].replace('\\/', "/")

    #downloading songs 

    fileName = ''.join([a.isalnum() and a or '_' for a in SongNames[0]]) 
    fileName = fileName.replace("_", " ") + ".mp3" 


#   This version writes a file that can be played but gives an error saying: "TypeError: expected a character buffer object" 
## songDL = open(fileName, "wb") 
## songDL.write(urllib.urlretrieve(SongStrings[0], fileName)) 


#   This version creates the file but it cannot be played (file size is much smaller than it should be) 
## url = urllib.urlretrieve(SongStrings[0], fileName) 
## url = str(url) 
## songDL = open(fileName, "wb") 
## songDL.write(url) 


    songDL.close() 

    earmilk.close()

Source

2013-12-15 johnsona

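Reread the documentation for urllib.urlretrieve:

"Return a tuple (filename, headers) where filename is the local file name under which the object can be found, and headers is whatever the info() method of the object returned by urlopen() returned (for a remote object, possibly cached)."

You appear to be expecting it to return the bytes of the file itself. The point of urlretrieve is that it handles writing to a file for you and returns the filename it was written to (which will generally be the same as the second argument you passed to the function).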
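
As a minimal sketch of what that means for the code in the question (not necessarily what the asker ended up with): let urlretrieve do the writing and just unpack the tuple it returns. The helper name `download` is hypothetical, and the import fallback is only there so the same snippet also runs on Python 3, where the function lives in urllib.request.

```python
try:
    from urllib import urlretrieve          # Python 2, as used in the question
except ImportError:
    from urllib.request import urlretrieve  # Python 3 home of the same function


def download(url, file_name):
    """Save the resource at url to file_name and return the path written.

    'download' is a hypothetical helper name, not part of urllib.
    """
    # urlretrieve writes the file itself and returns (filename, headers),
    # so there is nothing to open() or write() afterwards.
    saved_path, headers = urlretrieve(url, file_name)
    return saved_path
```

Inside milk() this would be called as download(SongStrings[0], fileName), with no open(), write(), or close() around it. It also accounts for both symptoms: in the first version urlretrieve had already saved the song before write() rejected the tuple with the TypeError, and in the second version the finished download was overwritten with the short string form of that tuple.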

Source

2013-12-15 21:10:08 Iguananaut

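By the way, this kind of thing is a good reason to learn to use [pdb](http://docs.python.org/2/library/pdb.html). If you run your function in the Python REPL, then when it crashes you can call 'import pdb; pdb.pm()' to get a debugger prompt at the point where your code crashed. From there you can check for yourself what functions like 'urlretrieve' actually return, which should show you why the various things you tried to do with the return value fail. – Iguananaut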
