Generalization error of min-norm interpolators in transfer learning
Abstract
This paper establishes the generalization error of pooled min-$\ell_2$-norm interpolation in transfer learning, where data from diverse distributions are available.
Min-norm interpolators arise naturally as implicit regularized limits of modern machine learning algorithms.
Prior work has characterized their out-of-distribution risk when samples from the test distribution are unavailable during training.
In many applications, however, limited test samples may be available at training time, yet properties of min-norm interpolation in this regime remain poorly understood.
We address this gap by characterizing the bias and variance of pooled min-$\ell_2$-norm interpolation under both covariate shift and model shift.
Our results yield several important implications.
In certain cases under model shift, we show that adding data always hurts when the signal-to-noise ratio (SNR) is low.
At higher SNR levels, transfer learning is beneficial provided the shift-to-signal ratio falls below a threshold that we characterize explicitly.
Under covariate shift, we find that when the source sample size is small relative to the dimension, greater heterogeneity between domains reduces risk, whereas when the source sample size is large relative to the dimension, greater heterogeneity increases risk.
While our model shift results are initially established for Gaussian designs, we extend them to more general designs through a universality argument.
To illustrate the broader applicability of our technical tools beyond interpolation learning, we characterize the risk of a bias-corrected estimator that uses the pooled interpolator as an initialization and corrects the resulting bias with target data.
On the technical side, we develop a novel anisotropic local law and a Lindeberg-swapping argument, yielding tools that may be of independent interest in random matrix theory and universality analysis.
Finally, we supplement our theory with simulations demonstrating the finite-sample efficacy of our results.
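As a concrete illustration of the object studied here, the pooled min-$\ell_2$-norm interpolator can be written as $\hat\beta = \arg\min\{\|\beta\|_2 : X\beta = y\}$, where $X$ and $y$ stack the source and target samples. The following is a minimal sketch under a model-shift setup, not the paper's own experiment: the linear model, the isotropic Gaussian design, and all names and values (`p`, `n_source`, `n_target`, `sigma`, the shift scale `0.3`) are assumptions chosen for illustration.

```python
import numpy as np


def pooled_min_norm_interpolator(X_source, y_source, X_target, y_target):
    """Return the min-l2-norm solution of the pooled system X beta = y."""
    X = np.vstack([X_source, X_target])
    y = np.concatenate([y_source, y_target])
    # The pseudoinverse yields the minimum-l2-norm least-squares solution;
    # it interpolates the data when X has full row rank (n < p).
    return np.linalg.pinv(X) @ y


rng = np.random.default_rng(0)

# Hypothetical problem sizes in the overparameterized regime: n_source + n_target < p.
p, n_source, n_target = 200, 60, 20
sigma = 0.5  # hypothetical noise level

# Model shift (assumed form): the source signal is the target signal plus a perturbation.
beta_target = rng.normal(size=p) / np.sqrt(p)
beta_source = beta_target + 0.3 * rng.normal(size=p) / np.sqrt(p)

# Isotropic Gaussian designs for both domains (no covariate shift in this sketch).
X_source = rng.normal(size=(n_source, p))
X_target = rng.normal(size=(n_target, p))
y_source = X_source @ beta_source + sigma * rng.normal(size=n_source)
y_target = X_target @ beta_target + sigma * rng.normal(size=n_target)

beta_pooled = pooled_min_norm_interpolator(X_source, y_source, X_target, y_target)
beta_target_only = np.linalg.pinv(X_target) @ y_target

# With isotropic test covariates, the excess prediction risk on the target
# equals the squared l2 estimation error of the target coefficient vector.
risk_pooled = float(np.sum((beta_pooled - beta_target) ** 2))
risk_target_only = float(np.sum((beta_target_only - beta_target) ** 2))
print(f"pooled interpolator risk:      {risk_pooled:.4f}")
print(f"target-only interpolator risk: {risk_target_only:.4f}")
```

Varying `sigma` and the shift scale in this sketch gives a rough empirical feel for the trade-off the abstract describes between the SNR and the shift-to-signal ratio. A single random draw is noisy, so averaging over many seeds would be needed before drawing any conclusion.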