Prudential Reliability of Large Language Models in Reinsurance: Governance, Assurance, and Capital Efficiency
Stella C. Dong
This paper develops a prudential framework for assessing the reliability of large language models (LLMs) in reinsurance. A five-pillar architecture (governance, data lineage, assurance, resilience, and regulatory alignment) translates supervisory expectations from Solvency II, SR 11-7, and guidance from EIOPA (2025), NAIC (2023), and IAIS (2024) into measurable lifecycle controls. The framework is implemented through the Reinsurance AI Reliability and Assurance Benchmark (RAIRAB), which evaluates whether governance-embedded LLMs meet prudential standards of grounding, transparency, and accountability. Across six task families, retrieval-grounded configurations achieved higher grounding accuracy (0.90), reduced hallucination and interpretive drift by roughly 40%, and nearly doubled transparency. These mechanisms lower informational frictions in risk transfer and capital allocation, suggesting that existing prudential theory already accommodates reliable AI when governance is explicit, data are traceable, and assurance is verifiable.