
Answer


Yes, it is possible. You have to give GridSearchCV a scoring function that returns the log loss (negated, because grid search selects the model with the highest score, and we want the model with the lowest loss), and you have to use the estimator's best iteration, like this:
from xgboost import XGBClassifier
from sklearn.model_selection import GridSearchCV  # sklearn.grid_search was removed in scikit-learn 0.20
from sklearn import metrics

tuned_parameters = {
    'learning_rate': [0.4, 0.5],
    'max_depth': [6, 7],
}

# Parameters forwarded to XGBClassifier.fit(); early stopping monitors the eval_set
fit_params = {
    'eval_set': [(X_test_tr_boost, y_test)],
    'eval_metric': 'mlogloss',
    'early_stopping_rounds': 100,
    'verbose': True,
}

# XGBClassifier with early stopping returns the model from the last iteration (not the best one).
# To give GridSearchCV the score of the best model, we use a custom score function that
# evaluates log_loss, calling the estimator with the appropriate ntree_limit parameter
# (instead of passing scoring='neg_log_loss' when creating GridSearchCV),
# so that the best iteration of the estimator (ntree_limit) is used.

def _score_func(estimator, X, y):
    # Score with the best iteration found by early stopping; negate because
    # GridSearchCV maximizes the score. The labels list matches the 9 classes
    # of this problem. (Recent xgboost versions replace ntree_limit with
    # iteration_range in predict_proba.)
    score = metrics.log_loss(
        y,
        estimator.predict_proba(X, ntree_limit=estimator.best_ntree_limit),
        labels=[0, 1, 2, 3, 4, 5, 6, 7, 8])
    return -score

model = XGBClassifier(objective='multi:softprob', seed=0, n_estimators=1000)

# fit_params are passed to fit(); the fit_params constructor argument
# was removed from GridSearchCV in newer scikit-learn versions
gridsearch = GridSearchCV(model, tuned_parameters, verbose=999999,
                          scoring=_score_func)
gridsearch.fit(X_train_tr_boost, y_train, **fit_params)

print(gridsearch.best_params_)
print(gridsearch.best_score_)
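
For completeness, a minimal sketch of using the winning model afterwards. GridSearchCV refits the best parameter combination on the full training data by default, so best_estimator_ carries its own best_ntree_limit; X_new here is a hypothetical array of new samples:

# Hypothetical follow-up: predict with the refit best estimator,
# limited to its best early-stopping iteration (X_new is assumed new data)
best_model = gridsearch.best_estimator_
proba = best_model.predict_proba(X_new, ntree_limit=best_model.best_ntree_limit)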