UnicodeEncodeError: 'ascii' codec can't encode character u'\u0446' in position 32: ordinal not in range(128)


I'm trying to debug some code written by a previous intern, and I'm having difficulty resolving this issue using the answers from other Unicode error posts: UnicodeEncodeError: 'ascii' codec can't encode character u'\u0446' in position 32: ordinal not in range(128)

def dumpTextPacket(self, header, bugLog, offset, outfile):
    bugLog.seek(offset)
    data = bugLog.read(header[1])  # header[1] = size of the packet
    outString = data.decode("utf-8", "ignore")
    if header[3] == 8:  # Removing ugly characters from packet that has bTag = 8.
        outString = outString[1:]
        outString = outString.strip('\0')  # Remove all 'null' characters from text
    outString = "{:.3f}".format(header[5]) + ' ms: ' + outString  # Prepend the timestamp to the line
    outfile.write(outString)

I don't have much experience with Unicode, so I would really appreciate any pointers on this issue.

The error occurs on the last line of this function.

Edit: I'm using Python 2.7, and the entire file is below. Another thing I should mention is that the code works when parsing some files but fails on others. I think the error happens when the timestamp gets too big?

When I call the LogInterpreter.execute() method from main.py, the error in the title is raised on the line "outfile.write(outString)", the last line of the dumpTextPacket method, which is called from the execute method:

import sys 
import os 
from struct import unpack 
class LogInterpreter:

    def __init__(self):
        self.RTCUpdated = False
        self.RTCOffset = 0.0
        self.LastTimeStamp = 0.0
        self.TimerRolloverCount = 0
        self.ThisTimeStamp = 0.0

        self.m_RTCSeconds = 0.0
        self.m_StartTimeInSec = 0.0

    def GetRTCOffset(self):
        return self.m_RTCSeconds - self.m_StartTimeInSec

    def convertTimeStamp(self, uTime, LogRev):
        TicsPerSecond = 24000000.0

        self.ThisTimeStamp = uTime
        self.RTCOffset = self.GetRTCOffset()

        if int(LogRev) == 2:
            if self.RTCUpdated:
                self.LastTimeStamp = 0.0
            if self.LastTimeStamp > self.ThisTimeStamp:
                self.TimerRolloverCount += 1
            self.LastTimeStamp = self.ThisTimeStamp

        ULnumber = (-1 & 0xffffffff)

        return ((ULnumber/TicsPerSecond)*self.TimerRolloverCount + (uTime/TicsPerSecond) + self.RTCOffset) * 1000.0

    ##########################################################################
    # Information about the header for the current packet we are looking at. #
    ##########################################################################
    def grabHeader(self, bugLog, offset):
        '''
        s_PktHdrRev1
        /*0*/ u16 StartOfPacketMarker; # uShort 2
        /*2*/ u16 SizeOfPacket;        # uShort 2
        /*4*/ u08 LogRev;              # uChar  1
        /*5*/ u08 bTag;                # uChar  1
        /*6*/ u16 iSeq;                # uShort 2
        /*8*/ u32 uTime;               # uLong  4
        '''
        headerSize = 12  # Header size in bytes
        bType = 'HHBBHL'  # codes for our byte type
        bugLog.seek(offset)
        data = bugLog.read(headerSize)

        if len(data) < headerSize:
            print('Error in the format of BBLog file')
            sys.exit()

        headerArray = unpack(bType, data)
        convertedTime = self.convertTimeStamp(headerArray[5], headerArray[2])
        headerArray = headerArray[:5] + (convertedTime,)
        return headerArray

    ################################################################
    # bTag = 8 or bTag = 16 --> just write the data to LogMsgs.txt #
    ################################################################
    def dumpTextPacket(self, header, bugLog, offset, outfile):
        bugLog.seek(offset)
        data = bugLog.read(header[1])  # header[1] = size of the packet
        outString = data.decode("utf-8", "ignore")
        if header[3] == 8:  # Removing ugly characters from packet that has bTag = 8.
            outString = outString[1:]
            outString = outString.strip('\0')  # Remove all 'null' characters from text
        outString = "{:.3f}".format(header[5]) + ' ms: ' + outString  # Prepend the timestamp to the line
        outfile.write(outString)

    def execute(self):
        path = './Logs/'
        for fn in os.listdir(path):
            fileName = fn
            print fn
            if fileName.endswith(".bin"):
            # if(fileName.split('.')[1] == "bin"):
                print("Parsing "+fileName)
                outfile = open(path+fileName.split('.')[0]+".txt", "w")  # Open a file for output
                fileSize = os.path.getsize(path+fileName)
                packetOffset = 0
                with open(path+fileName, 'rb') as bugLog:
                    while packetOffset < fileSize:
                        currHeader = self.grabHeader(bugLog, packetOffset)  # Grab the header for the current packet
                        packetOffset = packetOffset + 12  # Increment the pointer by 12 bytes (size of a header packet)
                        if currHeader[3] == 8 or currHeader[3] == 16:  # Look at the bTag and see if it is a text packet
                            self.dumpTextPacket(currHeader, bugLog, packetOffset, outfile)
                        packetOffset = packetOffset + currHeader[1]  # Move on to the next packet by incrementing the pointer by the size of the current packet
                outfile.close()
                print(fileName+" completed.")

Source

2016-08-08 Jennifer

Could you also add the input to the function?

The file you are writing to is probably opened with the ascii codec. Are you using Python 2 or 3?

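@DennisKuypers: If this is Python 2 and 'data' is already a 'unicode' object, then calling 'decode' on it implicitly encodes it using the default locale settings (which means ASCII) before the actual 'decode' step. The traceback should make that clear, though. – ShadowRanger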

When you add two strings together and one of them is Unicode, Python 2 will coerce the result to Unicode too.

>>> 'a' + u'b'
u'ab'

Since you use data.decode, outString will be Unicode.
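A minimal sketch of why the write fails, assuming the packet payload contains UTF-8 encoded Cyrillic text. The sample bytes and the names payload, text, prefix, and line are made up for illustration. The timestamp prefix stays pure ASCII however large the value gets; it is the decoded payload that carries the non-ASCII character. The explicit .encode('ascii') call stands in for the conversion Python 2 performs implicitly inside outfile.write().

```python
# Hypothetical payload: U+0446 (Cyrillic "tse") is the two bytes D1 86 in UTF-8.
payload = b'log \xd1\x86 text\n'

text = payload.decode('utf-8', 'ignore')             # bytes -> Unicode text
prefix = '{:.3f}'.format(4294967295000.0) + ' ms: '  # a deliberately large timestamp
line = prefix + text                                 # the result is Unicode text

# The timestamp part never leaves the ASCII range, whatever its magnitude.
prefix_is_ascii = all(ord(ch) < 128 for ch in prefix)

# Explicit version of the conversion Python 2 attempts during write().
try:
    line.encode('ascii')
    error = None
except UnicodeEncodeError as exc:
    error = exc

print(prefix_is_ascii)
print(error)
```

The position reported in the exception is the index of the offending character within the whole line, which is why it points past the timestamp and into the packet text.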

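When you write to a binary file, you must have a byte string. Python 2 will attempt to convert your Unicode string to a byte string, but it uses the most generic codec it has, which is 'ascii'. This codec fails on many Unicode characters, specifically those with a code point above '\u007f'. You can encode it yourself with a more capable codec to get around this problem:

outfile.write(outString.encode('utf-8'))

Everything changes in Python 3, which won't let you mix byte strings with Unicode strings and won't attempt any automatic conversions.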
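As an alternative to encoding every string by hand, the output file can be opened with an explicit encoding so that write() accepts Unicode text directly. This is a sketch and not part of the answer above: io.open is available in the standard library of both Python 2.6+ and Python 3, while the temporary output path and the sample line are assumptions made for the example.

```python
import io
import os
import tempfile

# Hypothetical output location; the original code writes next to the .bin file.
out_path = os.path.join(tempfile.mkdtemp(), 'LogMsgs.txt')

# Unicode text of the kind dumpTextPacket builds (sample content is made up).
line = u'12.345 ms: log \u0446 text\n'

# io.open encodes on write, so no implicit ASCII conversion takes place.
with io.open(out_path, 'w', encoding='utf-8') as outfile:
    outfile.write(line)

# Read the raw bytes back to confirm the character was stored as UTF-8.
with io.open(out_path, 'rb') as check:
    raw = check.read()
```

With this approach everything passed to write() has to be Unicode text. In Python 2 a file opened through io.open in text mode rejects a plain byte string with a TypeError, so the string must already be Unicode, as outString is after being concatenated with the decoded payload.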

Source

2016-08-09 19:00:57
