活水快报 - 42Digest


Multimodal Behavioral Patterns Analysis with Eye-Tracking and LLM-Based Reasoning

Dongyang Guo, Yasmeen Abdrabou, Enkeleda Thaqi, Enkelejda Kasneci

arXiv

July 24, 2025

Eye-tracking data reveals valuable insights into users' cognitive states but is difficult to analyze due to its structured, non-linguistic nature. While large language models (LLMs) excel at reasoning over text, they struggle with temporal and numerical data. This paper presents a multimodal human-AI collaborative framework designed to enhance cognitive pattern extraction from eye-tracking signals. The framework includes: (1) a multi-stage pipeline using horizontal and vertical segmentation alongside LLM reasoning to uncover latent gaze patterns; (2) an expert-model co-scoring module that integrates expert judgment with LLM output to generate trust scores for behavioral interpretations; and (3) a hybrid anomaly detection module that combines LSTM-based temporal modeling with LLM-driven semantic analysis. Results across multiple LLMs and prompt strategies show improvements in consistency, interpretability, and performance, up to 50...
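As a rough illustration of the first component, the sketch below turns numeric gaze samples into short text lines that an LLM could reason over. The abstract does not define "horizontal" and "vertical" segmentation, so the sketch assumes that horizontal means splitting rows into consecutive time windows and vertical means splitting columns into feature groups. The function names, the field names (`x`, `y`, `pupil`), and the mean-based summary are all hypothetical and are not taken from the paper.

```python
from statistics import mean


def horizontal_segments(samples, window):
    """Split rows into consecutive windows of `window` samples (assumed meaning)."""
    return [samples[i:i + window] for i in range(0, len(samples), window)]


def vertical_segments(samples, feature_groups):
    """Split columns into named feature groups (assumed meaning)."""
    return {
        name: [{k: s[k] for k in keys} for s in samples]
        for name, keys in feature_groups.items()
    }


def summarize(segment):
    """Describe a numeric segment as text: the mean of each field."""
    keys = segment[0].keys()
    return ", ".join(f"mean {k} = {mean(s[k] for s in segment):.2f}" for k in keys)


def build_prompt(samples, window, feature_groups):
    """Emit one text line per (time window, feature group) pair."""
    lines = []
    for w, rows in enumerate(horizontal_segments(samples, window)):
        for name, cols in vertical_segments(rows, feature_groups).items():
            lines.append(f"window {w}, {name}: {summarize(cols)}")
    return "\n".join(lines)


# Hypothetical gaze samples, one dict per timestamped reading.
samples = [
    {"x": 1.0, "y": 2.0, "pupil": 3.0},
    {"x": 3.0, "y": 4.0, "pupil": 5.0},
    {"x": 5.0, "y": 6.0, "pupil": 7.0},
]
groups = {"position": ["x", "y"], "pupil": ["pupil"]}
prompt = build_prompt(samples, window=2, feature_groups=groups)
```

The resulting `prompt` string would be passed to an LLM together with an instruction; that call is left out here because the abstract does not specify the models or prompt strategies.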

Human-Computer Interaction, Artificial Intelligence, Computation and Language, Machine Learning
