Aggregation with Exponential Weights is Optimal in Expectation
이 뉴스, 어떠셨어요?
한 번의 탭으로 반응을 남겨요 · 로그인 불필요
Abstract
The aggregation with exponential weights (AEW) estimator is not fully understood in the basic setting of model selection aggregation with squared loss.
In particular, whether it is minimax-rate optimal in expectation for large enough fixed temperatures and under random design has been an open problem since its introduction, which was explicitly posed by Lecué and Mendelson (2013).
In this paper, we settle this problem by showing that \emph{without} requiring a Bernstein-type assumption, the AEW indeed achieves the excess risk $T \log (M) / (n+1)$ in expectation, whenever the temperature $T$ satisfies $(L^2/T)\exp(B/T)\leq \mu /2$.
Here, the number of dictionary elements is $M$, the estimator has observed $n$ i.i.d. samples from any distribution, and the loss is assumed to be bounded by $B$, $L$-Lipschitz continuous and $\mu$-strongly convex.
For squared loss, we show that $T\geq 4 b^2$ suffices when the predictions and labels are $[0,b]$-valued.
Because AEW is known to be suboptimal in expectation for temperatures below some constant, this shows that AEW has a sharp phase transition when the temperature is large enough but constant, as conjectured by Lecué and Mendelson.