Strategic Intelligence in Large Language Models: Evidence from Evolutionary Game Theory
Kenneth Payne and Baptiste Alloui-Cros
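Are Large Language Models (LLMs) a new form of strategic intelligence, able to reason about goals in competitive settings? We present compelling supporting evidence. The Iterated Prisoner's Dilemma (IPD) has long served as a model for studying decision-making. We conduct the first ever series of evolutionary IPD tournaments, pitting canonical strategies (e.g., Tit-for-Tat, Grim Trigger) against agents from the leading frontier AI companies OpenAI, Google, and Anthropic. By varying the termination probability in each tournament (the "shadow of the future"), we introduce complexity and chance, confounding memorisation. Our results show that LLMs are highly competitive, consistently surviving and sometimes even proliferating in these complex ecosystems. Furthermore, they exhibit distinctive and persistent "strategic fingerprints": Google's Gemini models proved strategically ruthless, exploiting cooperative opponents and retaliating against defectors, while OpenAI's models remained highly cooperative, a trait that proved catastrophic in hostile environments. Anthropic's Claude emerged as the most forgiving reciprocator, showing a marked willingness to restore cooperation even after being exploited or after successfully defecting. Analysis of nearly 32,000 prose rationales provided by the models reveals that they actively reason about both the time horizon and their opponent's likely strategy, and we demonstrate that this reasoning is instrumental to their decisions. This work connects classical game theory with machine psychology, offering a rich and granular view of algorithmic decision-making under uncertainty.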
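To make the "shadow of the future" concrete, the following is a minimal sketch of a single IPD match in which the game ends after each round with a fixed termination probability. It is an illustration under stated assumptions, not the authors' tournament code: the payoff values (5, 3, 1, 0) are the conventional ones from the IPD literature and are not given in this abstract, and the names `PAYOFFS`, `play_match`, and `always_defect` are hypothetical.

```python
import random

# Assumed payoffs (player A, player B); conventional values, not from the abstract.
PAYOFFS = {
    ("C", "C"): (3, 3),  # mutual cooperation
    ("C", "D"): (0, 5),  # A is exploited
    ("D", "C"): (5, 0),  # A exploits B
    ("D", "D"): (1, 1),  # mutual defection
}


def tit_for_tat(own_history, opp_history):
    """Cooperate first, then copy the opponent's previous move."""
    return "C" if not opp_history else opp_history[-1]


def grim_trigger(own_history, opp_history):
    """Cooperate until the opponent defects once, then defect forever."""
    return "D" if "D" in opp_history else "C"


def always_defect(own_history, opp_history):
    """Hypothetical baseline opponent that never cooperates."""
    return "D"


def play_match(strategy_a, strategy_b, termination_prob, rng):
    """Play rounds until a random draw ends the match.

    After every round the match stops with probability `termination_prob`,
    so the number of rounds is geometrically distributed with mean
    1 / termination_prob. Returns (score_a, score_b, rounds_played).
    """
    hist_a, hist_b = [], []
    score_a = score_b = 0
    while True:
        move_a = strategy_a(hist_a, hist_b)
        move_b = strategy_b(hist_b, hist_a)
        pay_a, pay_b = PAYOFFS[(move_a, move_b)]
        score_a += pay_a
        score_b += pay_b
        hist_a.append(move_a)
        hist_b.append(move_b)
        if rng.random() < termination_prob:
            break
    return score_a, score_b, len(hist_a)


if __name__ == "__main__":
    rng = random.Random(42)  # seeded for repeatability
    for p in (0.1, 0.25, 0.75):
        print(p, play_match(tit_for_tat, always_defect, p, rng))
```

A lower termination probability lengthens the expected match, which rewards sustained reciprocity. A higher one shortens it, so a single early defection accounts for a larger share of the total score.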