PurpCode: Reasoning for Safer Code Generation
Jiawei Liu, Nirav Diwan, Zhe Wang, Haoyu Zhai, Xiaona Zhou, Kiet A. Nguyen, Tianjiao Yu, Muntasir Wahed, Yinlin Deng, Hadjer Benkraouda, Yuxiang Wei, Lingming Zhang, Ismini Lourentzou, Gang Wang
We introduce PurpCode, the first post-training recipe for training safe code reasoning models towards generating secure code and defending against malicious cyberactivities. PurpCode trains a reasoning model in two stages: (i) Rule Learning, which explicitly teaches the model to reference cybersafety rules to generate vulnerability-free code and to avoid facilitating malicious cyberactivities; and (ii) Reinforcement Learning, which optimizes model safety and preserves model utility through diverse, multi-objective reward mechanisms. To supply the training pipeline with comprehensive cybersafety data, we conduct internal red-teaming to synthesize comprehensive, high-coverage prompts based on real-world tasks that induce unsafe cyberactivities in the model. Based on PurpCode, we develop a reasoning-based coding model, PurpCode-32B, which demonstrates state-of-the-art cybersafety, outperforming various frontier models. Meanwhile, our alignment method decreases the model's overrefusal rates in both general and cybersafety-specific scenarios, while preserving model utility in both code generation and common security knowledge.