XGBoost (eXtreme Gradient Boosting)
XGBoost is one of the most popular tree-based ensemble learning algorithms. For classification, it generally shows better predictive performance than other machine learning methods. XGBoost is based on GBM, but it has drawn a great deal of attention because it resolves GBM's weaknesses, such as slow training time and the absence of overfitting regularization.
In particular, XGBoost supports parallel training on multi-core CPUs, so it can finish training faster than conventional GBM.
Advantages of XGBoost
Item | Description |
---|---|
Excellent predictive performance | Generally delivers strong predictive performance in both classification and regression. |
Faster execution than GBM | Parallel execution and various optimizations give XGBoost faster training than GBM. |
Overfitting regularization | Built-in regularization makes XGBoost more robust against overfitting. |
Tree pruning | Besides limiting split depth with the max_depth parameter, XGBoost prunes splits that no longer yield a positive gain, further reducing the number of splits. |
Built-in cross validation | At each boosting iteration, XGBoost can internally run cross validation on the training and evaluation datasets to find an optimized number of iterations. With early stopping, it can halt before the specified number of iterations once the evaluation metric on the dataset stops improving. |
Native missing-value handling | XGBoost can handle missing values on its own. |
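As an illustration of the missing-value handling noted in the table above, XGBoost accepts NaN inputs directly; the following is a minimal sketch on toy data (the array values are made up for illustration):

import numpy as np
from xgboost import XGBClassifier

# XGBoost learns a default split direction for missing values, so np.nan
# can be passed as-is, with no manual imputation required.
X = np.array([[1.0, np.nan],
              [2.0, 3.0],
              [np.nan, 1.0],
              [4.0, 5.0]])
y = np.array([0, 0, 1, 1])

clf = XGBClassifier(n_estimators=10)
clf.fit(X, y)            # NaNs are handled natively during tree construction
print(clf.predict(X))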
Python Wrapper XGBoost Hyperparameters
XGBoost has hyperparameters similar to those of GBM, with additions such as early stopping and hyperparameters for overfitting regularization.
Key general parameters
- booster: choose gbtree (tree-based model) or gblinear (linear model); the default is gbtree.
- silent: the default is 0; set it to 1 to suppress output messages.
- nthread: the number of CPU execution threads; the default uses all CPU threads.
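The general parameters above are passed as a dict to the native Python wrapper. Below is a minimal sketch with hypothetical values; note that recent xgboost versions replace the old silent parameter with verbosity:

import numpy as np
import xgboost as xgb

params = {
    'booster': 'gbtree',            # tree-based model (the default)
    'nthread': 4,                   # CPU threads; default is all threads
    'verbosity': 0,                 # suppress messages (older versions: silent=1)
    'objective': 'binary:logistic'
}

# DMatrix is the native data structure of the Python wrapper (toy data here).
X = np.random.rand(20, 4)
y = np.random.randint(0, 2, size=20)
dtrain = xgb.DMatrix(X, label=y)
model = xgb.train(params, dtrain, num_boost_round=10)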
Key booster parameters

Key learning task parameters
If overfitting is a serious problem with XGBoost, consider the following adjustments (a minimal sketch follows the list).
- Lower the eta value (0.01~0.1); when lowering eta, raise num_round (or n_estimators) to compensate.
- Lower the max_depth value.
- Raise the min_child_weight value.
- Raise the gamma value.
- Adjusting subsample and colsample_bytree also keeps trees from growing too complex, which can help with overfitting.
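Here is a minimal sketch of these overfitting-oriented settings with the scikit-learn wrapper; the values are hypothetical starting points, not recommendations:

from xgboost import XGBClassifier

reg_clf = XGBClassifier(
    n_estimators=1000,     # raised because the learning rate (eta) is lowered
    learning_rate=0.05,    # eta: lower for slower, more robust learning
    max_depth=3,           # shallower trees
    min_child_weight=3,    # larger value makes splits more conservative
    gamma=0.1,             # minimum loss reduction required for a split
    subsample=0.8,         # row sampling ratio per tree
    colsample_bytree=0.8   # column sampling ratio per tree
)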
Summary
XGBoost itself provides cross validation, performance evaluation, and feature-importance visualization. Fundamentally, it adds performance improvements on top of GBM.
A representative example is early stopping, which improves execution speed:
even before the specified number of boosting iterations is reached, the iterations stop if the prediction error no longer improves.
XGBoost in Practice: the Wisconsin Breast Cancer Dataset
First, let's examine the dataset.
import xgboost as xgb
from xgboost import plot_importance
import pandas as pd
import numpy as np
from sklearn.datasets import load_breast_cancer
from sklearn.model_selection import train_test_split
import warnings
warnings.filterwarnings('ignore')

# Load the Wisconsin breast cancer dataset and build a DataFrame from it
dataset = load_breast_cancer()
X_features = dataset.data
y_label = dataset.target

cancer_df = pd.DataFrame(data=X_features, columns=dataset.feature_names)
cancer_df['target'] = y_label
cancer_df.iloc[:3, :6]
 | mean radius | mean texture | mean perimeter | mean area | mean smoothness | mean compactness |
---|---|---|---|---|---|---|
0 | 17.99 | 10.38 | 122.8 | 1001.0 | 0.11840 | 0.27760 |
1 | 20.57 | 17.77 | 132.9 | 1326.0 | 0.08474 | 0.07864 |
2 | 19.69 | 21.25 | 130.0 | 1203.0 | 0.10960 | 0.15990 |
Many attributes related to tumor size and shape are numeric values.
print(dataset.target_names)
print(cancer_df['target'].value_counts())
['malignant' 'benign']
1 357
0 212
Name: target, dtype: int64
There are 357 instances of value 1 (benign) and 212 instances of value 0 (malignant). Let's split the dataset into 80% for training and 20% for testing.
# Split off 80% of the data for training and 20% for testing
X_train, X_test, y_train, y_test = train_test_split(X_features, y_label,
                                                    test_size=0.2, random_state=156)
print(X_train.shape , X_test.shape)
(455, 30) (114, 30)
We train and predict with the fit() and predict() methods of the XGBClassifier class.
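The code below calls get_clf_eval(), an evaluation helper defined in an earlier section. For reference, a minimal reconstruction of such a helper, assuming the standard scikit-learn metrics (a sketch, not necessarily the exact earlier code), could look like this:

from sklearn.metrics import (confusion_matrix, accuracy_score, precision_score,
                             recall_score, f1_score, roc_auc_score)

# Hypothetical reconstruction of the evaluation helper used below
def get_clf_eval(y_test, pred, pred_proba):
    confusion = confusion_matrix(y_test, pred)
    accuracy = accuracy_score(y_test, pred)
    precision = precision_score(y_test, pred)
    recall = recall_score(y_test, pred)
    f1 = f1_score(y_test, pred)
    roc_auc = roc_auc_score(y_test, pred_proba)
    print('Confusion matrix')
    print(confusion)
    print('Accuracy: {0:.4f}, Precision: {1:.4f}, Recall: {2:.4f}, '
          'F1: {3:.4f}, AUC: {4:.4f}'.format(accuracy, precision, recall, f1, roc_auc))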
# Import XGBClassifier, the scikit-learn wrapper class for XGBoost
from xgboost import XGBClassifier

xgb_wrapper = XGBClassifier(n_estimators=400, learning_rate=0.1, max_depth=3)
xgb_wrapper.fit(X_train, y_train)
w_preds = xgb_wrapper.predict(X_test)
w_pred_proba = xgb_wrapper.predict_proba(X_test)[:, 1]
get_clf_eval(y_test, w_preds, w_pred_proba)
Confusion matrix
[[35  2]
 [ 1 76]]
Accuracy: 0.9737, Precision: 0.9744, Recall: 0.9870, F1: 0.9806, AUC: 0.9951
The scikit-learn wrapper for XGBoost also supports early stopping.
You simply pass the early-stopping parameters to fit().
These are early_stopping_rounds, the number of iterations within which the evaluation metric must improve; eval_metric, the evaluation metric used for early stopping; and eval_set, the dataset on which the performance evaluation is run.
The evaluation dataset should be a separate dataset, not the training data. Here early_stopping_rounds is set to 100, eval_metric to logloss, and eval_set to the test dataset.
The following example uses the test dataset for evaluation, which is not good practice:
the test dataset must remain completely unseen during training, because evaluating on it leaks information into training and can lead to overfitting.
It is used for evaluation here only because the dataset is small, so keep this caveat in mind.
from xgboost import XGBClassifier

xgb_wrapper = XGBClassifier(n_estimators=400, learning_rate=0.1, max_depth=3)
evals = [(X_test, y_test)]  # evaluation set monitored for early stopping
xgb_wrapper.fit(X_train, y_train, early_stopping_rounds=100, eval_metric="logloss",
                eval_set=evals, verbose=True)
ws100_preds = xgb_wrapper.predict(X_test)
ws100_pred_proba = xgb_wrapper.predict_proba(X_test)[:, 1]
[0] validation_0-logloss:0.61352
Will train until validation_0-logloss hasn't improved in 100 rounds.
[1] validation_0-logloss:0.547842
[2] validation_0-logloss:0.494247
[3] validation_0-logloss:0.447986
[4] validation_0-logloss:0.409109
[5] validation_0-logloss:0.374977
[6] validation_0-logloss:0.345714
[7] validation_0-logloss:0.320529
[8] validation_0-logloss:0.29721
[9] validation_0-logloss:0.277991
[10] validation_0-logloss:0.260302
[11] validation_0-logloss:0.246037
[12] validation_0-logloss:0.231556
[13] validation_0-logloss:0.22005
[14] validation_0-logloss:0.208572
[15] validation_0-logloss:0.199993
[16] validation_0-logloss:0.190118
[17] validation_0-logloss:0.181818
[18] validation_0-logloss:0.174729
[19] validation_0-logloss:0.167657
[20] validation_0-logloss:0.158202
[21] validation_0-logloss:0.154725
[22] validation_0-logloss:0.148947
[23] validation_0-logloss:0.143308
[24] validation_0-logloss:0.136344
[25] validation_0-logloss:0.132778
[26] validation_0-logloss:0.127912
[27] validation_0-logloss:0.125263
[28] validation_0-logloss:0.119978
[29] validation_0-logloss:0.116412
[30] validation_0-logloss:0.114502
[31] validation_0-logloss:0.112572
[32] validation_0-logloss:0.11154
[33] validation_0-logloss:0.108681
[34] validation_0-logloss:0.106681
[35] validation_0-logloss:0.104207
[36] validation_0-logloss:0.102962
[37] validation_0-logloss:0.100576
[38] validation_0-logloss:0.098683
[39] validation_0-logloss:0.096444
[40] validation_0-logloss:0.095869
[41] validation_0-logloss:0.094242
[42] validation_0-logloss:0.094715
[43] validation_0-logloss:0.094272
[44] validation_0-logloss:0.093894
[45] validation_0-logloss:0.094184
[46] validation_0-logloss:0.09402
[47] validation_0-logloss:0.09236
[48] validation_0-logloss:0.093012
[49] validation_0-logloss:0.091272
[50] validation_0-logloss:0.090051
[51] validation_0-logloss:0.089605
[52] validation_0-logloss:0.089577
[53] validation_0-logloss:0.090703
[54] validation_0-logloss:0.089579
[55] validation_0-logloss:0.090357
[56] validation_0-logloss:0.091587
[57] validation_0-logloss:0.091527
[58] validation_0-logloss:0.091986
[59] validation_0-logloss:0.091951
[60] validation_0-logloss:0.091939
[61] validation_0-logloss:0.091461
[62] validation_0-logloss:0.090311
[63] validation_0-logloss:0.089407
[64] validation_0-logloss:0.089719
[65] validation_0-logloss:0.089743
[66] validation_0-logloss:0.089622
[67] validation_0-logloss:0.088734
[68] validation_0-logloss:0.088621
[69] validation_0-logloss:0.089739
[70] validation_0-logloss:0.089981
[71] validation_0-logloss:0.089782
[72] validation_0-logloss:0.089584
[73] validation_0-logloss:0.089533
[74] validation_0-logloss:0.088748
[75] validation_0-logloss:0.088597
[76] validation_0-logloss:0.08812
[77] validation_0-logloss:0.088396
[78] validation_0-logloss:0.088736
[79] validation_0-logloss:0.088153
[80] validation_0-logloss:0.087577
[81] validation_0-logloss:0.087412
[82] validation_0-logloss:0.08849
[83] validation_0-logloss:0.088575
[84] validation_0-logloss:0.08807
[85] validation_0-logloss:0.087641
[86] validation_0-logloss:0.087416
[87] validation_0-logloss:0.087611
[88] validation_0-logloss:0.087065
[89] validation_0-logloss:0.08727
[90] validation_0-logloss:0.087161
[91] validation_0-logloss:0.086962
[92] validation_0-logloss:0.087166
[93] validation_0-logloss:0.087067
[94] validation_0-logloss:0.086592
[95] validation_0-logloss:0.086116
[96] validation_0-logloss:0.087139
[97] validation_0-logloss:0.086768
[98] validation_0-logloss:0.086694
[99] validation_0-logloss:0.086547
[100] validation_0-logloss:0.086498
[101] validation_0-logloss:0.08641
[102] validation_0-logloss:0.086288
[103] validation_0-logloss:0.086258
[104] validation_0-logloss:0.086835
[105] validation_0-logloss:0.086767
[106] validation_0-logloss:0.087321
[107] validation_0-logloss:0.087304
[108] validation_0-logloss:0.08728
[109] validation_0-logloss:0.087298
[110] validation_0-logloss:0.087289
[111] validation_0-logloss:0.088002
[112] validation_0-logloss:0.087936
[113] validation_0-logloss:0.087843
[114] validation_0-logloss:0.088066
[115] validation_0-logloss:0.087649
[116] validation_0-logloss:0.087298
[117] validation_0-logloss:0.087799
[118] validation_0-logloss:0.087751
[119] validation_0-logloss:0.08768
[120] validation_0-logloss:0.087626
[121] validation_0-logloss:0.08757
[122] validation_0-logloss:0.087547
[123] validation_0-logloss:0.087156
[124] validation_0-logloss:0.08767
[125] validation_0-logloss:0.087737
[126] validation_0-logloss:0.088275
[127] validation_0-logloss:0.088309
[128] validation_0-logloss:0.088266
[129] validation_0-logloss:0.087886
[130] validation_0-logloss:0.088861
[131] validation_0-logloss:0.088675
[132] validation_0-logloss:0.088743
[133] validation_0-logloss:0.089218
[134] validation_0-logloss:0.089179
[135] validation_0-logloss:0.088821
[136] validation_0-logloss:0.088512
[137] validation_0-logloss:0.08848
[138] validation_0-logloss:0.088386
[139] validation_0-logloss:0.089145
[140] validation_0-logloss:0.08911
[141] validation_0-logloss:0.088765
[142] validation_0-logloss:0.088678
[143] validation_0-logloss:0.088389
[144] validation_0-logloss:0.089271
[145] validation_0-logloss:0.089238
[146] validation_0-logloss:0.089139
[147] validation_0-logloss:0.088907
[148] validation_0-logloss:0.089416
[149] validation_0-logloss:0.089388
[150] validation_0-logloss:0.089108
[151] validation_0-logloss:0.088735
[152] validation_0-logloss:0.088717
[153] validation_0-logloss:0.088484
[154] validation_0-logloss:0.088471
[155] validation_0-logloss:0.088545
[156] validation_0-logloss:0.088521
[157] validation_0-logloss:0.088547
[158] validation_0-logloss:0.088275
[159] validation_0-logloss:0.0883
[160] validation_0-logloss:0.08828
[161] validation_0-logloss:0.088013
[162] validation_0-logloss:0.087758
[163] validation_0-logloss:0.087784
[164] validation_0-logloss:0.087777
[165] validation_0-logloss:0.087517
[166] validation_0-logloss:0.087542
[167] validation_0-logloss:0.087642
[168] validation_0-logloss:0.08739
[169] validation_0-logloss:0.087377
[170] validation_0-logloss:0.087298
[171] validation_0-logloss:0.087368
[172] validation_0-logloss:0.087395
[173] validation_0-logloss:0.087385
[174] validation_0-logloss:0.087132
[175] validation_0-logloss:0.087159
[176] validation_0-logloss:0.086955
[177] validation_0-logloss:0.087053
[178] validation_0-logloss:0.08697
[179] validation_0-logloss:0.086973
[180] validation_0-logloss:0.087038
[181] validation_0-logloss:0.086799
[182] validation_0-logloss:0.086826
[183] validation_0-logloss:0.086582
[184] validation_0-logloss:0.086588
[185] validation_0-logloss:0.086614
[186] validation_0-logloss:0.086372
[187] validation_0-logloss:0.086369
[188] validation_0-logloss:0.086297
[189] validation_0-logloss:0.086104
[190] validation_0-logloss:0.086023
[191] validation_0-logloss:0.08605
[192] validation_0-logloss:0.086149
[193] validation_0-logloss:0.085916
[194] validation_0-logloss:0.085915
[195] validation_0-logloss:0.085984
[196] validation_0-logloss:0.086012
[197] validation_0-logloss:0.085922
[198] validation_0-logloss:0.085853
[199] validation_0-logloss:0.085874
[200] validation_0-logloss:0.085888
[201] validation_0-logloss:0.08595
[202] validation_0-logloss:0.08573
[203] validation_0-logloss:0.08573
[204] validation_0-logloss:0.085753
[205] validation_0-logloss:0.085821
[206] validation_0-logloss:0.08584
[207] validation_0-logloss:0.085776
[208] validation_0-logloss:0.085686
[209] validation_0-logloss:0.08571
[210] validation_0-logloss:0.085806
[211] validation_0-logloss:0.085593
[212] validation_0-logloss:0.085801
[213] validation_0-logloss:0.085806
[214] validation_0-logloss:0.085744
[215] validation_0-logloss:0.085658
[216] validation_0-logloss:0.085843
[217] validation_0-logloss:0.085632
[218] validation_0-logloss:0.085726
[219] validation_0-logloss:0.085783
[220] validation_0-logloss:0.085791
[221] validation_0-logloss:0.085817
[222] validation_0-logloss:0.085757
[223] validation_0-logloss:0.085674
[224] validation_0-logloss:0.08586
[225] validation_0-logloss:0.085871
[226] validation_0-logloss:0.085927
[227] validation_0-logloss:0.085954
[228] validation_0-logloss:0.085874
[229] validation_0-logloss:0.086057
[230] validation_0-logloss:0.086002
[231] validation_0-logloss:0.085922
[232] validation_0-logloss:0.086102
[233] validation_0-logloss:0.086115
[234] validation_0-logloss:0.086169
[235] validation_0-logloss:0.086263
[236] validation_0-logloss:0.086292
[237] validation_0-logloss:0.086217
[238] validation_0-logloss:0.086395
[239] validation_0-logloss:0.086342
[240] validation_0-logloss:0.08618
[241] validation_0-logloss:0.086195
[242] validation_0-logloss:0.086248
[243] validation_0-logloss:0.086263
[244] validation_0-logloss:0.086293
[245] validation_0-logloss:0.086222
[246] validation_0-logloss:0.086398
[247] validation_0-logloss:0.086347
[248] validation_0-logloss:0.086276
[249] validation_0-logloss:0.086448
[250] validation_0-logloss:0.086294
[251] validation_0-logloss:0.086312
[252] validation_0-logloss:0.086364
[253] validation_0-logloss:0.086394
[254] validation_0-logloss:0.08649
[255] validation_0-logloss:0.086441
[256] validation_0-logloss:0.08629
[257] validation_0-logloss:0.08646
[258] validation_0-logloss:0.086391
[259] validation_0-logloss:0.086441
[260] validation_0-logloss:0.086461
[261] validation_0-logloss:0.086491
[262] validation_0-logloss:0.086445
[263] validation_0-logloss:0.086466
[264] validation_0-logloss:0.086319
[265] validation_0-logloss:0.086488
[266] validation_0-logloss:0.086538
[267] validation_0-logloss:0.086471
[268] validation_0-logloss:0.086501
[269] validation_0-logloss:0.086522
[270] validation_0-logloss:0.086689
[271] validation_0-logloss:0.086738
[272] validation_0-logloss:0.08683
[273] validation_0-logloss:0.086684
[274] validation_0-logloss:0.08664
[275] validation_0-logloss:0.086496
[276] validation_0-logloss:0.086355
[277] validation_0-logloss:0.086519
[278] validation_0-logloss:0.086567
[279] validation_0-logloss:0.08659
[280] validation_0-logloss:0.086679
[281] validation_0-logloss:0.086637
[282] validation_0-logloss:0.086499
[283] validation_0-logloss:0.086356
[284] validation_0-logloss:0.086405
[285] validation_0-logloss:0.086429
[286] validation_0-logloss:0.086456
[287] validation_0-logloss:0.086504
[288] validation_0-logloss:0.08637
[289] validation_0-logloss:0.086457
[290] validation_0-logloss:0.086453
[291] validation_0-logloss:0.086322
[292] validation_0-logloss:0.086284
[293] validation_0-logloss:0.086148
[294] validation_0-logloss:0.086196
[295] validation_0-logloss:0.086221
[296] validation_0-logloss:0.086308
[297] validation_0-logloss:0.086178
[298] validation_0-logloss:0.086263
[299] validation_0-logloss:0.086131
[300] validation_0-logloss:0.086179
[301] validation_0-logloss:0.086052
[302] validation_0-logloss:0.086016
[303] validation_0-logloss:0.086101
[304] validation_0-logloss:0.085977
[305] validation_0-logloss:0.086059
[306] validation_0-logloss:0.085971
[307] validation_0-logloss:0.085998
[308] validation_0-logloss:0.085999
[309] validation_0-logloss:0.085877
[310] validation_0-logloss:0.085923
[311] validation_0-logloss:0.085948
Stopping. Best iteration:
[211] validation_0-logloss:0.085593
Even though n_estimators was set to 400, training finished after 311 iterations instead of running all 400.
It stopped at iteration 311 because the evaluation metric had not improved during the 100 rounds from iteration 211 to 311.
Looking at the predictive performance of the early-stopped XGBClassifier, it is slightly lower than the result without early stopping, but the difference is small.
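As a side note, the fitted wrapper exposes where early stopping settled; assuming a reasonably recent xgboost version, attributes like these are available:

# Inspect the early-stopping result (attribute names can vary slightly
# across xgboost versions)
print('best iteration:', xgb_wrapper.best_iteration)  # 211 in the run above
print('best logloss:', xgb_wrapper.best_score)        # 0.085593 in the run above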
get_clf_eval(y_test, ws100_preds, ws100_pred_proba)
Confusion matrix
[[34  3]
 [ 1 76]]
Accuracy: 0.9649, Precision: 0.9620, Recall: 0.9870, F1: 0.9744, AUC: 0.9954
# Retrain with early_stopping_rounds set to 10
xgb_wrapper.fit(X_train, y_train, early_stopping_rounds=10,
                eval_metric="logloss", eval_set=evals, verbose=True)
ws10_preds = xgb_wrapper.predict(X_test)
ws10_pred_proba = xgb_wrapper.predict_proba(X_test)[:, 1]
get_clf_eval(y_test, ws10_preds, ws10_pred_proba)
[0] validation_0-logloss:0.61352
Will train until validation_0-logloss hasn't improved in 10 rounds.
[1] validation_0-logloss:0.547842
[2] validation_0-logloss:0.494247
[3] validation_0-logloss:0.447986
[4] validation_0-logloss:0.409109
[5] validation_0-logloss:0.374977
[6] validation_0-logloss:0.345714
[7] validation_0-logloss:0.320529
[8] validation_0-logloss:0.29721
[9] validation_0-logloss:0.277991
[10] validation_0-logloss:0.260302
[11] validation_0-logloss:0.246037
[12] validation_0-logloss:0.231556
[13] validation_0-logloss:0.22005
[14] validation_0-logloss:0.208572
[15] validation_0-logloss:0.199993
[16] validation_0-logloss:0.190118
[17] validation_0-logloss:0.181818
[18] validation_0-logloss:0.174729
[19] validation_0-logloss:0.167657
[20] validation_0-logloss:0.158202
[21] validation_0-logloss:0.154725
[22] validation_0-logloss:0.148947
[23] validation_0-logloss:0.143308
[24] validation_0-logloss:0.136344
[25] validation_0-logloss:0.132778
[26] validation_0-logloss:0.127912
[27] validation_0-logloss:0.125263
[28] validation_0-logloss:0.119978
[29] validation_0-logloss:0.116412
[30] validation_0-logloss:0.114502
[31] validation_0-logloss:0.112572
[32] validation_0-logloss:0.11154
[33] validation_0-logloss:0.108681
[34] validation_0-logloss:0.106681
[35] validation_0-logloss:0.104207
[36] validation_0-logloss:0.102962
[37] validation_0-logloss:0.100576
[38] validation_0-logloss:0.098683
[39] validation_0-logloss:0.096444
[40] validation_0-logloss:0.095869
[41] validation_0-logloss:0.094242
[42] validation_0-logloss:0.094715
[43] validation_0-logloss:0.094272
[44] validation_0-logloss:0.093894
[45] validation_0-logloss:0.094184
[46] validation_0-logloss:0.09402
[47] validation_0-logloss:0.09236
[48] validation_0-logloss:0.093012
[49] validation_0-logloss:0.091272
[50] validation_0-logloss:0.090051
[51] validation_0-logloss:0.089605
[52] validation_0-logloss:0.089577
[53] validation_0-logloss:0.090703
[54] validation_0-logloss:0.089579
[55] validation_0-logloss:0.090357
[56] validation_0-logloss:0.091587
[57] validation_0-logloss:0.091527
[58] validation_0-logloss:0.091986
[59] validation_0-logloss:0.091951
[60] validation_0-logloss:0.091939
[61] validation_0-logloss:0.091461
[62] validation_0-logloss:0.090311
Stopping. Best iteration:
[52] validation_0-logloss:0.089577
Confusion matrix
[[34  3]
 [ 2 75]]
Accuracy: 0.9561, Precision: 0.9615, Recall: 0.9740, F1: 0.9677, AUC: 0.9947
Reducing the early-stopping threshold too aggressively can degrade predictive performance, so set early_stopping_rounds to an appropriate value.
The plot_importance() API, which visualizes feature importance, produces the same visualization when given the scikit-learn wrapper class as when given the Python wrapper class shown earlier.
from xgboost import plot_importance
import matplotlib.pyplot as plt
%matplotlib inline

fig, ax = plt.subplots(figsize=(10, 12))
# Passing the scikit-learn wrapper class works just as well
plot_importance(xgb_wrapper, ax=ax)
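As a follow-up, plot_importance() ranks features by split count ('weight') by default; its importance_type and max_num_features parameters let you rank by information gain and limit the display, for example:

# Rank by information gain and show only the top 10 features
fig, ax = plt.subplots(figsize=(10, 6))
plot_importance(xgb_wrapper, ax=ax, importance_type='gain', max_num_features=10)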