活水快报 - 42Digest

A Unified Geometric Field Theory Framework for Transformers: From Manifold Embeddings to Kernel Modulation

Xianshuai Shi, Jianfeng Zhu, and Leibo Liu

arXiv

November 11, 2025

The Transformer architecture has achieved tremendous success in natural language processing, computer vision, and scientific computing through its self-attention mechanism. However, its core components, positional encoding and attention mechanisms, have lacked a unified physical or mathematical interpretation. This paper proposes a structural theoretical framework that integrates positional encoding, kernel integral operators, and attention mechanisms for in-depth theoretical investigation. We map discrete positions (such as text token indices and image pixel coordinates) to spatial functions on continuous manifolds, enabling a field-theoretic interpretation of Transformer layers as kernel-modulated operators acting over embedded manifolds.
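
To make the kernel-operator reading of attention concrete, here is a minimal sketch in plain Python. It is not taken from the paper: the unit-circle embedding, the exponential dot-product kernel, and the function names `embed_positions`, `kernel_weights`, and `apply_kernel` are all illustrative assumptions. It shows only the general idea that discrete indices can be placed on a continuous manifold and that softmax-style attention then acts as a row-normalized kernel sum over those points.

```python
import math

def embed_positions(n):
    """Map token indices 0..n-1 to points on the unit circle (assumed manifold)."""
    return [(math.cos(2 * math.pi * i / n), math.sin(2 * math.pi * i / n))
            for i in range(n)]

def kernel_weights(points, temperature=1.0):
    """Row-normalized kernel w_ij proportional to exp(<x_i, x_j> / temperature)."""
    weights = []
    for xi in points:
        scores = [(xi[0] * xj[0] + xi[1] * xj[1]) / temperature for xj in points]
        top = max(scores)  # subtract the max for numerical stability
        k = [math.exp(s - top) for s in scores]
        total = sum(k)
        weights.append([v / total for v in k])
    return weights

def apply_kernel(weights, field):
    """Discrete analogue of a kernel integral: (K f)(x_i) = sum_j w_ij f(x_j)."""
    return [sum(w * f for w, f in zip(row, field)) for row in weights]

# Hypothetical usage: smooth an alternating 0/1 signal defined on 8 positions.
points = embed_positions(8)
weights = kernel_weights(points, temperature=0.5)
field = [float(i % 2) for i in range(8)]
smoothed = apply_kernel(weights, field)
```

Because each row of `weights` sums to one, the operator leaves constant fields unchanged and averages neighboring values on the circle, which is the sense in which attention resembles a normalized integral operator in this toy setting.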

Machine Learning
