Assessing Monotone Dependence: Area Under the Curve Meets Rank Correlation
Abstract
The assessment of monotone dependence between random variables is a classical problem in statistics and arises across a wide range of application domains. Researchers have therefore sought measures of association that are invariant under strictly increasing transformations of the margins, but the extant literature is fragmented. For continuous variables, symmetric rank correlation coefficients, such as Spearman's Rho and Kendall's Tau, have been studied at great length in the statistical literature. For dichotomous outcomes, the asymmetric area under the curve (AUC) measure is used to assess monotone dependence. We unify and complete these thus far disconnected strands of literature by establishing common population-level theory, common estimators, and common tests that bridge the continuous and dichotomous settings and apply to all linearly ordered outcomes.
Originating in the biomedical literature, the C index provides a bridge between AUC, to which it reduces for a dichotomous outcome, and Kendall's Tau, to which it relates linearly under continuity. To establish the same kind of bridge between AUC and Spearman's Rho, we introduce asymmetric grade correlation, $\mathrm{AGC}(X,Y)$, defined as the covariance of the mid-distribution function transforms, or grades, of $X$ and $Y$, divided by the variance of the grade of $Y$. The coefficient of monotone association is then $\mathrm{CMA}(X,Y) = \frac{1}{2} (\mathrm{AGC}(X,Y) + 1)$. When $X$ and $Y$ are continuous, AGC is symmetric and equals Spearman's Rho. When $Y$ is dichotomous, CMA equals AUC. We establish central limit theorems for the sample versions of these measures, and we develop tests of DeLong type for the equality of two such measures that share the outcome $Y$. In case studies, we assess progress in data-driven weather prediction and evaluate methods of uncertainty quantification for large language models.
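The definitions above can be illustrated with a minimal plug-in sketch. It assumes that the sample grade of an observation is the empirical mid-distribution function evaluated at that observation, that is, the fraction of strictly smaller values plus half the fraction of tied values. The function names `mid_grades`, `agc`, `cma`, and `auc_pairwise` are hypothetical and chosen for illustration only; this is not presented as the authors' estimator or software.

```python
def mid_grades(values):
    """Empirical mid-distribution transform (assumed plug-in grade):
    fraction of strictly smaller values plus half the fraction of ties."""
    n = len(values)
    return [
        (sum(w < v for w in values) + 0.5 * sum(w == v for w in values)) / n
        for v in values
    ]


def _cov(a, b):
    """Covariance with divisor n; the divisor cancels in the AGC ratio."""
    mean_a = sum(a) / len(a)
    mean_b = sum(b) / len(b)
    return sum((ai - mean_a) * (bi - mean_b) for ai, bi in zip(a, b)) / len(a)


def agc(x, y):
    """Asymmetric grade correlation: cov(grade X, grade Y) / var(grade Y)."""
    gx, gy = mid_grades(x), mid_grades(y)
    return _cov(gx, gy) / _cov(gy, gy)


def cma(x, y):
    """Coefficient of monotone association: (AGC + 1) / 2."""
    return 0.5 * (agc(x, y) + 1.0)


def auc_pairwise(x, y):
    """AUC for a 0/1 outcome y by direct pair counting, ties counted as 1/2."""
    pos = [xi for xi, yi in zip(x, y) if yi == 1]
    neg = [xi for xi, yi in zip(x, y) if yi == 0]
    wins = sum((p > q) + 0.5 * (p == q) for p in pos for q in neg)
    return wins / (len(pos) * len(neg))


if __name__ == "__main__":
    # Dichotomous outcome: CMA is compared with the pairwise AUC.
    x_bin = [0.1, 0.4, 0.35, 0.8, 0.4, 0.7]
    y_bin = [0, 0, 1, 1, 1, 0]
    print(cma(x_bin, y_bin), auc_pairwise(x_bin, y_bin))

    # No ties in either variable: AGC is compared with Spearman's Rho,
    # which for these ranks can be checked by hand via 1 - 6*sum(d^2)/(n(n^2-1)).
    x_cont = [1, 2, 3, 4, 5]
    y_cont = [2, 1, 4, 3, 5]
    print(agc(x_cont, y_cont), agc(y_cont, x_cont))
```

The quadratic-time pair counting keeps the sketch dependency-free; for larger samples, a rank-based routine would be the more natural choice. Note that `agc(x, y)` and `agc(y, x)` generally differ when ties are present, which reflects the asymmetry described above.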