2016-07-12

Error reading a CSV into a data frame

I am trying to open a data file and load it into pandas for some analysis:

raw = pd.read_csv('/home/chris/Desktop/Cambridge/SOURCE_DATA/Node_56_Nairobi_OutputFile.xls', encoding='utf16', error_bad_lines=False) 

I tried the suggestions from other threads. Then this happens:

Skipping line 3: expected 1 fields, saw 20 
Skipping line 21: expected 1 fields, saw 2 
Skipping line 22: expected 1 fields, saw 6 
Skipping line 23: expected 1 fields, saw 3 
Skipping line 27: expected 1 fields, saw 2 
Skipping line 28: expected 1 fields, saw 2 
Skipping line 30: expected 1 fields, saw 2 
Skipping line 34: expected 1 fields, saw 2 
Skipping line 35: expected 1 fields, saw 2 
Skipping line 36: expected 1 fields, saw 2 
Skipping line 37: expected 1 fields, saw 2 
Skipping line 38: expected 1 fields, saw 2 
Skipping line 39: expected 1 fields, saw 2 
Skipping line 40: expected 1 fields, saw 2 
Skipping line 111: expected 1 fields, saw 2 
Skipping line 113: expected 1 fields, saw 2 
Skipping line 116: expected 1 fields, saw 2 
Skipping line 117: expected 1 fields, saw 2 
Skipping line 161: expected 1 fields, saw 2 
Skipping line 162: expected 1 fields, saw 2 
Skipping line 182: expected 1 fields, saw 2 
Skipping line 184: expected 1 fields, saw 3 
Skipping line 202: expected 1 fields, saw 2 
Skipping line 204: expected 1 fields, saw 2 
Skipping line 218: expected 1 fields, saw 3 
Skipping line 222: expected 1 fields, saw 2 
Skipping line 223: expected 1 fields, saw 2 
Skipping line 232: expected 1 fields, saw 5 
Skipping line 233: expected 1 fields, saw 2 
Skipping line 234: expected 1 fields, saw 2 
Skipping line 235: expected 1 fields, saw 3 
Skipping line 237: expected 1 fields, saw 2 
Skipping line 259: expected 1 fields, saw 4 
Skipping line 265: expected 1 fields, saw 3 
Skipping line 275: expected 1 fields, saw 2 
Skipping line 290: expected 1 fields, saw 2 
Skipping line 294: expected 1 fields, saw 2 
Skipping line 301: expected 1 fields, saw 2 
Skipping line 303: expected 1 fields, saw 3 
Skipping line 307: expected 1 fields, saw 3 
Skipping line 323: expected 1 fields, saw 2 
Skipping line 326: expected 1 fields, saw 3 
Skipping line 332: expected 1 fields, saw 2 
Skipping line 334: expected 1 fields, saw 2 
Skipping line 340: expected 1 fields, saw 4 
Skipping line 345: expected 1 fields, saw 4 
Skipping line 349: expected 1 fields, saw 2 
Skipping line 351: expected 1 fields, saw 2 
Skipping line 361: expected 1 fields, saw 2 
Skipping line 370: expected 1 fields, saw 2 

And it goes on. Why? And in the end I still get this error, which I really do not understand:

CParserError                              Traceback (most recent call last)
<ipython-input-21-ab444ae5f5e9> in <module>() 
----> 1 raw = pd.read_csv('/home/chris/Desktop/Cambridge/SOURCE_DATA/Node_56_Nairobi_OutputFile.xls', encoding='utf16', error_bad_lines=False) 

/home/chris/anaconda2/lib/python2.7/site-packages/pandas/io/parsers.pyc in parser_f(filepath_or_buffer, sep, delimiter, header, names, index_col, usecols, squeeze, prefix, mangle_dupe_cols, dtype, engine, converters, true_values, false_values, skipinitialspace, skiprows, skipfooter, nrows, na_values, keep_default_na, na_filter, verbose, skip_blank_lines, parse_dates, infer_datetime_format, keep_date_col, date_parser, dayfirst, iterator, chunksize, compression, thousands, decimal, lineterminator, quotechar, quoting, escapechar, comment, encoding, dialect, tupleize_cols, error_bad_lines, warn_bad_lines, skip_footer, doublequote, delim_whitespace, as_recarray, compact_ints, use_unsigned, low_memory, buffer_lines, memory_map, float_precision) 
    527      skip_blank_lines=skip_blank_lines) 
    528 
--> 529   return _read(filepath_or_buffer, kwds) 
    530 
    531  parser_f.__name__ = name 

/home/chris/anaconda2/lib/python2.7/site-packages/pandas/io/parsers.pyc in _read(filepath_or_buffer, kwds) 
    303   return parser 
    304 
--> 305  return parser.read() 
    306 
    307 _parser_defaults = { 

/home/chris/anaconda2/lib/python2.7/site-packages/pandas/io/parsers.pyc in read(self, nrows) 
    761     raise ValueError('skip_footer not supported for iteration') 
    762 
--> 763   ret = self._engine.read(nrows) 
    764 
    765   if self.options.get('as_recarray'): 

/home/chris/anaconda2/lib/python2.7/site-packages/pandas/io/parsers.pyc in read(self, nrows) 
    1211  def read(self, nrows=None): 
    1212   try: 
-> 1213    data = self._reader.read(nrows) 
    1214   except StopIteration: 
    1215    if self._first_chunk: 

pandas/parser.pyx in pandas.parser.TextReader.read (pandas/parser.c:7988)() 

pandas/parser.pyx in pandas.parser.TextReader._read_low_memory (pandas/parser.c:8244)() 

pandas/parser.pyx in pandas.parser.TextReader._read_rows (pandas/parser.c:8970)() 

pandas/parser.pyx in pandas.parser.TextReader._tokenize_rows (pandas/parser.c:8838)() 

pandas/parser.pyx in pandas.parser.raise_parser_error (pandas/parser.c:22649)() 

CParserError: Error tokenizing data. C error: Buffer overflow caught - possible malformed input file. 

Nothing more than that gets thrown at me.


Can you upload a sample of the CSV file ... are you mixing up a .CSV file with a .XLS file? It could also be a CSV layout in which the delimiters and end-of-line characters are inconsistent.
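One way to check the .CSV versus .XLS question raised above is to look at the first bytes of the file rather than at its name. The following is only a minimal sketch: the function name `guess_container` and the helper `_write_temp` are made up for illustration, and the demo files are fabricated rather than taken from the original data. It relies on the fact that legacy binary .xls files are OLE2 compound documents and .xlsx files are zip archives, each with a well-known signature.

```python
import os
import tempfile

# File signatures (magic bytes) for the two common Excel containers.
OLE2_MAGIC = bytes.fromhex("D0CF11E0A1B11AE1")  # legacy binary .xls
ZIP_MAGIC = b"PK\x03\x04"                        # .xlsx (a zip archive)


def guess_container(path):
    """Return 'xls', 'xlsx-or-zip', or 'text-or-unknown' from the first bytes."""
    with open(path, "rb") as handle:
        head = handle.read(8)
    if head.startswith(OLE2_MAGIC):
        return "xls"
    if head.startswith(ZIP_MAGIC):
        return "xlsx-or-zip"
    return "text-or-unknown"


def _write_temp(payload):
    """Hypothetical helper: write bytes to a temporary file, return its path."""
    fd, path = tempfile.mkstemp()
    with os.fdopen(fd, "wb") as handle:
        handle.write(payload)
    return path


# Tiny demonstration with fabricated file contents (not the real data).
binary_path = _write_temp(OLE2_MAGIC + b"\x00" * 16)
text_path = _write_temp(b"a,b,c\n1,2,3\n")
binary_kind = guess_container(binary_path)
text_kind = guess_container(text_path)
os.remove(binary_path)
os.remove(text_path)
print(binary_kind, text_kind)
```

If a file reports as `xls` here, a text parser such as `read_csv` is probably the wrong tool for it, whatever encoding is passed.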


Also, is there a particular reason you need to use utf-16?


Sure, what is the best way to do that? And actually, I did try other encodings, with similar results.
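The "expected 1 fields, saw N" messages in the question suggest that the parser took its column count from a first line containing no delimiter, so every later line with more fields was treated as bad. A rough way to see this pattern in any text file is to count fields per row with the standard `csv` module. This is a sketch under assumptions: the function names and the sample text are invented for illustration, and it does not reproduce pandas' own tokenizer.

```python
import csv
import io


def field_counts(text, delimiter=","):
    """Map each row number (1-indexed) to the number of fields on that row."""
    reader = csv.reader(io.StringIO(text), delimiter=delimiter)
    return {number: len(row) for number, row in enumerate(reader, start=1)}


def mismatched_rows(text, delimiter=","):
    """List (row number, field count) pairs that disagree with the first row."""
    counts = field_counts(text, delimiter)
    expected = counts.get(1, 0)
    return [(number, count) for number, count in counts.items()
            if count != expected]


# Fabricated sample: a title row without delimiters, then rows with fields.
sample = "Report title\nsecond line\na,b,c\nplain\nx,y\n"
counts = field_counts(sample)
bad_rows = mismatched_rows(sample)
print(counts)
print(bad_rows)
```

When the mismatches are this widespread, skipping bad lines only hides the symptom; it is usually more productive to check the delimiter, the header position, or whether the file is a text file at all.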

Answers


Wait, this is solved. In a moment of sheer, monumental stupidity and coding frustration, I had not tried the obvious:

raw = pd.read_excel('/home/chris/Desktop/Cambridge/SOURCE_DATA/Node_56_Nairobi_OutputFile.xls') 

Facepalm.
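To avoid repeating this mistake, the reader function can be chosen from the file extension before loading. This is a minimal sketch, not a complete solution: `pick_reader_name`, `load_table`, and `EXCEL_EXTENSIONS` are hypothetical names introduced here, and the extension is only a hint about the real file format. Note that `pd.read_excel` needs an optional Excel engine installed (for legacy .xls files this is typically `xlrd`).

```python
import os

# Extensions treated as Excel workbooks in this sketch (an assumption).
EXCEL_EXTENSIONS = {".xls", ".xlsx"}


def pick_reader_name(path):
    """Return the name of the pandas reader that matches the file extension."""
    extension = os.path.splitext(path)[1].lower()
    return "read_excel" if extension in EXCEL_EXTENSIONS else "read_csv"


def load_table(path, **kwargs):
    """Load a table with pd.read_excel or pd.read_csv, chosen by extension."""
    import pandas as pd  # imported lazily so the name check works without pandas
    reader = getattr(pd, pick_reader_name(path))
    return reader(path, **kwargs)


excel_choice = pick_reader_name("Node_56_Nairobi_OutputFile.xls")
csv_choice = pick_reader_name("example.csv")
print(excel_choice, csv_choice)
```

With this in place, `load_table` called on the .xls path from the question would go through `pd.read_excel`, the same call that resolved the problem above.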