The generalized underlap coefficient with an application in clustering
Abstract
Quantifying distributional separation across groups is fundamental in statistical learning and scientific discovery, yet most classical discrepancy measures are tailored to two-group comparisons.
We generalize the underlap coefficient (UNL), a multi-group separation measure, to multivariate settings.
We study its relationship with Bayes risk and mutual information, and further interpret the UNL as a measure of dependence between group labels and variables of interest.
We propose an efficient importance sampling estimator of the UNL that can be combined with flexible density estimation methods.
A key application is the assessment of partition-covariate dependence in clustering, where the UNL provides an interpretable measure of whether latent group structure can be explained by specific covariates.
The methodology is illustrated on two real-world datasets.