Learning on a Razor's Edge: the Singularity Bias of Polynomial Neural Networks
Vahid Shahverdi, Giovanni Luca Marchetti, Kathlén Kohn
Deep neural networks often infer sparse representations, converging to a subnetwork during the learning process. In this work, we theoretically analyze subnetworks and their bias through the lens of algebraic geometry. We consider fully-connected networks with polynomial activation functions, and focus on the geometry of the function space they parametrize, often referred to as the neuromanifold. First, we compute the dimension of the subspace of the neuromanifold parametrized by subnetworks. Second, we show that this subspace is singular. Third, we argue that these singularities often correspond to critical points of the training dynamics. Lastly, we discuss convolutional networks, for which subnetworks and singularities are analogously related, but where the bias does not arise.
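
To make the singularity claim concrete, here is a minimal numerical sketch; it illustrates the phenomenon in the smallest interesting case and is not the paper's actual construction. For a two-layer network with quadratic activation, f_θ(x) = Σ_i v_i ⟨w_i, x⟩² is the quadratic form with symmetric matrix M(θ) = Σ_i v_i w_i w_iᵀ, so the neuromanifold is a set of bounded-rank symmetric matrices. The sketch checks that the rank of the Jacobian of the parametrization θ ↦ M(θ) drops when a neuron is switched off (v_i = 0), i.e. at a subnetwork. The function names quadratic_net and jacobian_rank and the sizes d = 3, k = 2 are our illustrative choices.

import jax
import jax.numpy as jnp
import numpy as np

d, k = 3, 2  # input dimension and hidden width (illustrative choice)

def quadratic_net(params):
    # f_theta(x) = sum_i v_i <w_i, x>^2 is the quadratic form with symmetric
    # matrix M(theta) = sum_i v_i w_i w_i^T; return its upper-triangular
    # entries, i.e. coordinates of f_theta in function space.
    W = params[: k * d].reshape(k, d)
    v = params[k * d :]
    M = (W.T * v) @ W  # equals W^T diag(v) W = sum_i v_i w_i w_i^T
    return M[jnp.triu_indices(d)]

def jacobian_rank(params):
    # Rank of the differential of the parametrization theta -> M(theta).
    J = jax.jacfwd(quadratic_net)(params)  # shape (6, 8) for d = 3, k = 2
    return np.linalg.matrix_rank(np.asarray(J), tol=1e-8)

generic = jax.random.normal(jax.random.PRNGKey(0), (k * d + k,))
print(jacobian_rank(generic))  # 5: dimension of the rank-<=2 symmetric 3x3 matrices

subnet = generic.at[-1].set(0.0)  # zero out v_2, i.e. pass to a subnetwork
print(jacobian_rank(subnet))  # 4: the differential degenerates at the subnetwork

The rank drop occurs because all partial derivatives with respect to w_2 vanish once v_2 = 0; such a degeneration of the differential along the subnetwork locus is the hallmark of the singular points that, per the abstract, the paper connects to critical points of the training dynamics.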