Feature Power: A New Variable Importance Measure for Random Forests
Katherine A. Fotion
Oral Defence Date: Monday, May 14, 2018 - 15:00
Assoc. Prof. Kazunori Okada, Prof. Dragutin Petkovic, Assoc. Prof. Hui Yang
Variable importance and interaction measures are crucial to breaking open the "black box" of machine-learned classifiers. The existing metrics, however, are data-driven and lack a solid mathematical foundation, which can lead to misleading conclusions on certain types of data. We propose feature power: a new variable importance measure based on the Shapley value of cooperative game theory. We evaluate the validity of this measure and compare its behavior with that of existing variable importance metrics. We also introduce coalition power: a method for quantifying the collective power of a group of features. We demonstrate that both methods produce consistent, correct results on toy data, and we gain interesting insights by applying feature power to real data sets. Finally, we discuss the extensibility of both power measures to other tree-based ensembles and to neural networks.
random forests, machine learning, variable importance, model explainability, voting games
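
For readers unfamiliar with the Shapley value named in the abstract, the following is a minimal sketch of the classical definition from cooperative game theory, applied to a small weighted voting game. It is not the thesis's feature power computation: the abstract does not say how the characteristic function over features is defined, so the function names (shapley_values, weighted_voting_game), the player labels, and the weights and quota are all hypothetical and chosen only for illustration.

```python
from itertools import combinations
from math import factorial


def shapley_values(players, value):
    """Exact Shapley value of each player for a characteristic function.

    `value` maps a frozenset of players to a number. Every coalition is
    enumerated, so this is only practical for a handful of players.
    """
    n = len(players)
    phi = {p: 0.0 for p in players}
    for p in players:
        others = [q for q in players if q != p]
        for k in range(n):
            # Probability that p joins right after a given coalition of
            # size k when all n! orderings are equally likely.
            weight = factorial(k) * factorial(n - k - 1) / factorial(n)
            for subset in combinations(others, k):
                s = frozenset(subset)
                # Marginal contribution of p to coalition s.
                phi[p] += weight * (value(s | {p}) - value(s))
    return phi


def weighted_voting_game(weights, quota):
    """Hypothetical toy game: a coalition wins (value 1) iff its total
    weight reaches the quota, otherwise it loses (value 0)."""
    def value(coalition):
        return 1.0 if sum(weights[p] for p in coalition) >= quota else 0.0
    return value


# Illustrative data only: three voters, quota of 3 out of 4 total votes.
weights = {"a": 2, "b": 1, "c": 1}
phi = shapley_values(list(weights), weighted_voting_game(weights, quota=3))
print(phi)
```

In this toy game no coalition can win without voter "a", so "a" receives the largest share (2/3), while "b" and "c" split the remainder equally (1/6 each); the shares sum to the value of the full coalition. Treating features as players in such a game is the general idea behind Shapley-based importance measures, though the specific game used for feature power is defined in the thesis itself.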