A Mathematical Optimization Approach for Expert-Informed Bayesian Best Subset Selection
Abstract
A central challenge in statistical modeling is identifying the subset of features that belong in the true regression model.
The classical best subset selection problem, recently made tractable via mixed-integer optimization (MIO), finds the globally optimal sparse solution.
It does not, however, make use of any information beyond the observed data.
In many applied settings, domain experts can meaningfully rank or score the relevance of candidate predictors, yet no existing framework integrates such probabilistic expert assessments directly into the best-subsets objective.
This paper presents Expert-Implied Bayesian Best Subsets (EBBS), a method that incorporates domain-expert probability estimates of feature relevance into the MIO best-subsets problem through a maximum a posteriori (MAP) framework.
Expert views from multiple respondents are aggregated into a single prior probability per feature using the Poisson binomial distribution for marginal probability estimates, the pairwise win rate for pairwise comparisons, or the normalized mean rank for ordinal rankings.
This probability enters the objective function as a log-odds penalty term that smoothly encourages or discourages the selection of each feature consistent with the expert consensus.
This paper provides analytic derivations of the MAP formulation and characterizes its theoretical properties.
The proposed model reduces to classical best subset selection when no expert expresses a view.
Empirical results on synthetic and real datasets are forthcoming.
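
To make the log-odds penalty concrete, the following is a minimal sketch and not the paper's MIO formulation. It assumes Gaussian noise with known variance `sigma2`, independent Bernoulli priors on feature inclusion with expert-implied probabilities `prior`, and a cardinality limit `k`. Under those assumptions the negative log-posterior is, up to a constant, RSS / (2 * sigma2) minus the sum of log(p_j / (1 - p_j)) over selected features. The function names, the brute-force enumeration (standing in for a MIO solver), and the toy data are all illustrative assumptions.

```python
from itertools import combinations

import numpy as np


def log_odds(p, eps=1e-9):
    """Log-odds log(p / (1 - p)), clipped away from 0 and 1 for stability."""
    p = np.clip(np.asarray(p, dtype=float), eps, 1.0 - eps)
    return np.log(p / (1.0 - p))


def map_best_subset(X, y, prior, k, sigma2=1.0):
    """Enumerate all subsets of size <= k and return the MAP subset.

    Objective (assumed form): RSS / (2 * sigma2) - sum_{j in S} log_odds(prior[j]).
    Brute force is only feasible for a handful of features.
    """
    n_features = X.shape[1]
    w = log_odds(prior)
    best_obj, best_subset = np.inf, ()
    for size in range(k + 1):
        for S in combinations(range(n_features), size):
            if S:
                cols = list(S)
                # Least-squares fit restricted to the chosen columns.
                beta, *_ = np.linalg.lstsq(X[:, cols], y, rcond=None)
                resid = y - X[:, cols] @ beta
            else:
                resid = y
            obj = resid @ resid / (2.0 * sigma2) - sum(w[j] for j in S)
            if obj < best_obj:
                best_obj, best_subset = obj, S
    return best_subset, best_obj


# Toy orthonormal design: selecting feature j removes y[j]**2 from the RSS.
X = np.eye(4)
y = np.array([3.0, 0.1, -2.0, 0.05])

# Uninformative experts: every prior is 0.5, so every log-odds term is zero.
subset_flat, _ = map_best_subset(X, y, prior=[0.5, 0.5, 0.5, 0.5], k=2)

# Experts strongly doubt feature 0, so selecting it incurs a large penalty.
subset_expert, _ = map_best_subset(X, y, prior=[0.001, 0.5, 0.5, 0.5], k=2)

print(subset_flat, subset_expert)
```

With every prior at 0.5 the penalty vanishes and the search coincides with ordinary best subsets under the same cardinality limit, which is one way to read the reduction property stated in the abstract. Lowering the prior on a feature raises the cost of selecting it, so the data must supply correspondingly stronger evidence before that feature enters the model.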