python - Print decision tree and feature_importance when using BaggingClassifier
Obtaining the decision tree and the important features is easy when using DecisionTreeClassifier in scikit-learn. However, I am not able to obtain either of them when I apply a bagging function, e.g., BaggingClassifier.
Since we need to fit the model using the BaggingClassifier, I cannot return the results (print the trees (graphs), feature_importances_, ...) related to the DecisionTreeClassifier.
Here is my script:
```python
seed = 7
n_iterations = 199

dtc = DecisionTreeClassifier(random_state=seed,
                             max_depth=None,
                             min_impurity_split=0.2,
                             min_samples_leaf=6,
                             max_features=None,  # if None, max_features=n_features
                             max_leaf_nodes=20,
                             criterion='gini',
                             splitter='best')

# parametersDTC = {'max_depth': range(3, 10), 'max_leaf_nodes': range(10, 30)}
parameters = {'max_features': range(1, 200)}

dt = RandomizedSearchCV(BaggingClassifier(base_estimator=dtc,
                                          # max_samples=1,
                                          n_estimators=100,
                                          # max_features=1,
                                          bootstrap=False,
                                          bootstrap_features=True,
                                          random_state=seed),
                        parameters,
                        n_iter=n_iterations,
                        n_jobs=14,
                        cv=kfold,
                        error_score='raise',
                        random_state=seed,
                        refit=True)  # min_samples_leaf=10

# Fit the model
fit_dt = dt.fit(X_train, Y_train)
print(dir(fit_dt))
tree_model = dt.best_estimator_

# Print the important features (NOT WORKING)
features = tree_model.feature_importances_
print(features)
rank = np.argsort(features)[::-1]
print(rank[:12])
print(sorted(list(zip(features))))

# Importing the image (NOT WORKING)
from sklearn.externals.six import StringIO
tree.export_graphviz(dt.best_estimator_, out_file='tree.dot')  # necessary to plot the graph

dot_data = StringIO()  # need to understand how this relates to the reading of strings
tree.export_graphviz(dt.best_estimator_, out_file=dot_data, filled=True,
                     class_names=target_names, rounded=True,
                     special_characters=True)
graph = pydotplus.graph_from_dot_data(dot_data.getvalue())

img = Image(graph.create_png())
print(dir(img))  # with dir we can check the possibilities of graph.create_png
with open("my_tree.png", "wb") as png:
    png.write(img.data)
```
I obtain errors like: `'BaggingClassifier' object has no attribute 'tree_'` and `'BaggingClassifier' object has no attribute 'feature_importances'`. Do you know how I can obtain them? Thanks.
Based on the documentation, the BaggingClassifier object indeed doesn't have the attribute `feature_importances`. You could still compute it as described in the answer to the question: Feature importances - Bagging, scikit-learn.
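A minimal sketch of that approach, i.e. averaging `feature_importances_` over the fitted base trees (the iris data and parameter values here are just illustrative):

```python
import numpy as np
from sklearn.datasets import load_iris
from sklearn.ensemble import BaggingClassifier
from sklearn.tree import DecisionTreeClassifier

iris = load_iris()
clf = BaggingClassifier(DecisionTreeClassifier(), n_estimators=10,
                        random_state=0)
clf.fit(iris.data, iris.target)

# Average the per-tree importances to get an ensemble-level estimate
importances = np.mean(
    [tree.feature_importances_ for tree in clf.estimators_], axis=0
)
print(importances)  # one value per feature, summing to ~1
```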
You can access the trees produced during the fitting of BaggingClassifier using the attribute `estimators_`, as in the following example:
```python
from sklearn import svm, datasets
from sklearn.model_selection import GridSearchCV
from sklearn.ensemble import BaggingClassifier

iris = datasets.load_iris()
clf = BaggingClassifier(n_estimators=3)
clf.fit(iris.data, iris.target)
clf.estimators_
```
`clf.estimators_` is a list of the 3 fitted decision trees:
```python
[DecisionTreeClassifier(class_weight=None, criterion='gini', max_depth=None,
            max_features=None, max_leaf_nodes=None, min_impurity_split=1e-07,
            min_samples_leaf=1, min_samples_split=2,
            min_weight_fraction_leaf=0.0, presort=False,
            random_state=1422640898, splitter='best'),
 DecisionTreeClassifier(class_weight=None, criterion='gini', max_depth=None,
            max_features=None, max_leaf_nodes=None, min_impurity_split=1e-07,
            min_samples_leaf=1, min_samples_split=2,
            min_weight_fraction_leaf=0.0, presort=False,
            random_state=1968165419, splitter='best'),
 DecisionTreeClassifier(class_weight=None, criterion='gini', max_depth=None,
            max_features=None, max_leaf_nodes=None, min_impurity_split=1e-07,
            min_samples_leaf=1, min_samples_split=2,
            min_weight_fraction_leaf=0.0, presort=False,
            random_state=2103976874, splitter='best')]
```
So you can iterate over that list and access each one of the trees individually.
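For instance, each fitted tree can be passed to `export_graphviz` in turn; a sketch (using `out_file=None` to get the dot source as a string, though you could equally write one `.dot` file per tree):

```python
from sklearn.datasets import load_iris
from sklearn.ensemble import BaggingClassifier
from sklearn.tree import DecisionTreeClassifier, export_graphviz

iris = load_iris()
clf = BaggingClassifier(DecisionTreeClassifier(), n_estimators=3,
                        random_state=0)
clf.fit(iris.data, iris.target)

# One Graphviz dot description per tree in the ensemble;
# pass a filename to out_file instead to write tree_0.dot etc. to disk.
dot_strings = [export_graphviz(t, out_file=None) for t in clf.estimators_]
print(len(dot_strings))  # 3
```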