From Model Training to Model Raising
Roland Aydin, Christian Cyron, Steve Bachelor, Ashton Anderson, Robert West
Current AI training methods align models with human values only after their core capabilities have been established, resulting in models that are easily misaligned and lack deep-rooted value systems. We propose a paradigm shift from "model training" to "model raising", in which alignment is woven into a model's development from the start. We identify several key components for this paradigm, all centered around redesigning the training corpus: reframing training data from a first-person perspective, recontextualizing information as lived experience, simulating social interactions, and scaffolding the ordering of training data. We expect that this redesign of the training corpus will lead to an early commitment to values from the first training token onward, such that knowledge, skills, and values are intrinsically much harder to separate. In an ecosystem in which large language model capabilities start overtaking human capabilities in many tasks, this seems to us like a critical need.