Flow-Disentangled Feature Importance
Xingshu Chen · Yifeng Guo · Jin-Hong Du
Abstract
Quantifying feature importance with valid statistical uncertainty is central to interpretable machine learning, yet classical model-agnostic methods often fail under feature correlation, producing unreliable attributions and compromising downstream inference. Statistical approaches that address correlation through feature decorrelation have shown promise but remain restricted to the $\ell_2$ loss, limiting their applicability across diverse machine learning tasks. We introduce Flow-Disentangled Feature Importance (FDFI), a model-agnostic framework that resolves these limitations by combining principled statistical inference with computational flexibility. FDFI leverages flow matching to learn flexible disentanglement maps that not only accommodate arbitrary feature distributions but also provide an interpretable pathway for understanding how importance is attributed through the data's correlation structure. The framework extends decorrelation-based attribution to arbitrary differentiable loss functions, enabling statistically valid importance assessment for black-box predictors in both regression and classification. We establish the statistical inference theory of FDFI, deriving the semiparametric efficiency of its estimators and thereby enabling valid confidence intervals and hypothesis tests with Type I error control. Experiments demonstrate that FDFI achieves substantially higher statistical power than removal-based and conditional-permutation approaches while maintaining robust, interpretable attributions even under severe feature interdependence. These findings hold across synthetic benchmarks and a broad collection of real datasets spanning diverse domains.