Source-Optimal Training is Transfer-Suboptimal
C. Evans Hedges
We prove a fundamental misalignment in transfer learning: the source regularization that minimizes source risk almost never coincides with the regularization maximizing transfer benefit. Through sharp phase boundaries for L2-SP ridge regression, we characterize the transfer-optimal source penalty τ_0^* and show it diverges predictably from task-optimal values, requiring stronger regularization in high-SNR regimes and weaker regularization in low-SNR regimes. Additionally, in isotropic settings, the decision to transfer is remarkably independent of target sample size and noise, depending only on task alignment and source characteristics. Experiments on CIFAR-10 and MNIST confirm that this counterintuitive pattern persists in nonlinear networks.
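The L2-SP ridge objective referenced above penalizes deviation from a fixed source solution rather than from the origin, and it admits a closed-form minimizer. A minimal sketch, assuming the standard formulation min_w ||y − Xw||² + τ||w − w₀||² (the function name and data here are illustrative, not from the paper):

```python
import numpy as np

def l2sp_ridge(X, y, w0, tau):
    """L2-SP ridge regression: shrink toward a source solution w0.

    Solves  min_w ||y - X w||^2 + tau * ||w - w0||^2,
    whose closed-form solution is
        w = (X^T X + tau I)^{-1} (X^T y + tau w0).
    tau = 0 recovers ordinary least squares; tau -> inf returns w0.
    """
    d = X.shape[1]
    return np.linalg.solve(X.T @ X + tau * np.eye(d), X.T @ y + tau * w0)

# Sanity check on noiseless synthetic data: with a tiny penalty,
# the estimate should match the generating weights.
rng = np.random.default_rng(0)
X = rng.normal(size=(50, 3))
w_true = np.array([1.0, -2.0, 0.5])
y = X @ w_true
w0 = np.zeros(3)          # source solution (here: uninformative)
w_hat = l2sp_ridge(X, y, w0, tau=1e-8)
```

Sweeping `tau` in this estimator is what traces out the source-penalty axis along which the paper's transfer-optimal value τ_0^* diverges from the source-risk-optimal value.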