2017-03-21

pandas: create a new column based on the values of several other columns

I have the following pandas DataFrame, my_df:

col_A  col_B 
------------------- 
blue  medium 
red   small 
yellow  big 

I want to add a new column, col_C, according to the following conditions:

if col_A == 'blue', col_C = 'A_blue' 
if col_B == 'big', col_C = 'B_big' 

For all other cases, col_C = '' 

def my_bad_data(row):
    if row['col_A'] == 'blue':
        return 'A_blue'
    elif row['col_B'] == 'big':
        return 'B_big'
    else:
        return ''

my_df['col_C'] = my_df.apply(lambda row: my_bad_data(row))

That is what I did to achieve this. However, I got the following error:

--------------------------------------------------------------------------- 
TypeError         Traceback (most recent call last) 
pandas/index.pyx in pandas.index.IndexEngine.get_loc (pandas/index.c:4028)() 

pandas/src/hashtable_class_helper.pxi in pandas.hashtable.Int64HashTable.get_item (pandas/hashtable.c:8125)() 

TypeError: an integer is required 

During handling of the above exception, another exception occurred: 

KeyError         Traceback (most recent call last) 
<ipython-input-20-3898742c4378> in <module>() 
----> 1 my_df['col_C'] = my_df.apply(lambda row: my_bad_data(row)) 
     2 asset_df 

/usr/local/lib/python3.4/dist-packages/pandas/core/frame.py in apply(self, func, axis, broadcast, raw, reduce, args, **kwds) 
    4161      if reduce is None: 
    4162       reduce = True 
-> 4163      return self._apply_standard(f, axis, reduce=reduce) 
    4164    else: 
    4165     return self._apply_broadcast(f, axis) 

/usr/local/lib/python3.4/dist-packages/pandas/core/frame.py in _apply_standard(self, func, axis, ignore_failures, reduce) 
    4257    try: 
    4258     for i, v in enumerate(series_gen): 
-> 4259      results[i] = func(v) 
    4260      keys.append(v.name) 
    4261    except Exception as e: 

<ipython-input-20-3898742c4378> in <lambda>(row) 
----> 1 asset_df['quality_flag'] = my_df.apply(lambda row: my_bad_data(row)) 
     2 my_df 

<ipython-input-19-2a09810e2dd4> in my_bad_data(row) 
     1 def bug_function(row): 
----> 2  if row['col_A'] == 'blue': 
     3   return 'A_blue' 
     4  elif row['col_B'] == 'big': 
     5   return 'B_big' 

/usr/local/lib/python3.4/dist-packages/pandas/core/series.py in __getitem__(self, key) 
    599   key = com._apply_if_callable(key, self) 
    600   try: 
--> 601    result = self.index.get_value(self, key) 
    602 
    603    if not is_scalar(result): 

/usr/local/lib/python3.4/dist-packages/pandas/indexes/base.py in get_value(self, series, key) 
    2167   try: 
    2168    return self._engine.get_value(s, k, 
-> 2169           tz=getattr(series.dtype, 'tz', None)) 
    2170   except KeyError as e1: 
    2171    if len(self) > 0 and self.inferred_type in ['integer', 'boolean']: 

pandas/index.pyx in pandas.index.IndexEngine.get_value (pandas/index.c:3342)() 

pandas/index.pyx in pandas.index.IndexEngine.get_value (pandas/index.c:3045)() 

pandas/index.pyx in pandas.index.IndexEngine.get_loc (pandas/index.c:4094)() 

KeyError: ('col_A', 'occurred at index id') 

Any idea what I did wrong here? Thanks!

Answers

1

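Yeah, I run into this semi-frequently. You want dataframe.apply(func, axis=1). With the default axis=0, the function receives each column as a Series indexed by the row labels, so row['col_A'] raises a KeyError. From the documentation: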

axis : {0 or ‘index’, 1 or ‘columns’}, default 0 
    0 or ‘index’: apply function to each column 
    1 or ‘columns’: apply function to each row 
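
As a minimal sketch of the fix described above, the snippet below rebuilds the example frame from the question and passes axis=1 so the function is called once per row. The function name label_row and the column name col_C_vec are hypothetical, chosen only for this illustration. The second half shows a vectorized alternative with numpy.select; that approach is not part of the original answer and is included only as an option, on the assumption that NumPy is available.

```python
import numpy as np
import pandas as pd

# Sample frame mirroring the data shown in the question.
my_df = pd.DataFrame({
    "col_A": ["blue", "red", "yellow"],
    "col_B": ["medium", "small", "big"],
})


def label_row(row):
    """Return the col_C label for one row (hypothetical helper name)."""
    if row["col_A"] == "blue":
        return "A_blue"
    elif row["col_B"] == "big":
        return "B_big"
    return ""


# axis=1 passes each row as a Series indexed by column name,
# so row["col_A"] and row["col_B"] resolve as intended.
my_df["col_C"] = my_df.apply(label_row, axis=1)

# Alternative (not from the original answer): numpy.select evaluates the
# conditions in order and takes the first match, like the if/elif chain.
conditions = [my_df["col_A"] == "blue", my_df["col_B"] == "big"]
choices = ["A_blue", "B_big"]
my_df["col_C_vec"] = np.select(conditions, choices, default="")

print(my_df)
```

Both columns should hold the same labels. The apply version stays closest to the code in the question, while the numpy.select version avoids a Python-level call per row, which may matter on larger frames.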