InterFeat: An Automated Pipeline for Finding Interesting Hypotheses in Structured Biomedical Data

Dan Ofer, Michal Linial, Dafna Shahaf

arXiv

May 18, 2025

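Finding interesting phenomena is the core of scientific discovery, but it is a manual, ill-defined concept. We present an integrative pipeline for automating the discovery of interesting simple hypotheses (feature-target relations with effect direction and a potential underlying mechanism) in structured biomedical data. The pipeline combines machine learning, knowledge graphs, literature search and Large Language Models. We formalize "interestingness" as a combination of novelty, utility and plausibility. On 8 major diseases from the UK Biobank, our pipeline consistently recovers risk factors before their appearance in the literature. 40-53% of candidates were validated as interesting, compared to 0-7% for the baseline. Overall, 28% [...] The pipeline addresses the challenge of "interestingness" for any target. We release data and code: https://github.com/LinialLab/InterFeat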
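To make the idea of "interestingness" as a combination of novelty, utility and plausibility concrete, here is a minimal sketch in Python. It is not the authors' implementation: the `Candidate` class, the `interestingness` and `rank` functions, the [0, 1] score range and the geometric-mean combination are all assumptions chosen for illustration, and the feature and target names are placeholders. The actual pipeline is in the repository linked above.

```python
from dataclasses import dataclass
from math import prod


@dataclass(frozen=True)
class Candidate:
    """A hypothetical feature-target hypothesis with an effect direction."""
    feature: str
    target: str
    direction: str       # e.g. "increases" or "decreases"
    novelty: float       # assumed to lie in [0, 1]
    utility: float       # assumed to lie in [0, 1]
    plausibility: float  # assumed to lie in [0, 1]


def interestingness(c: Candidate) -> float:
    """Combine the three component scores with a geometric mean.

    The geometric mean is an assumption of this sketch: it forces a
    candidate to score well on all three components, since a zero in
    any one of them drives the combined score to zero.
    """
    scores = (c.novelty, c.utility, c.plausibility)
    if any(not 0.0 <= s <= 1.0 for s in scores):
        raise ValueError("component scores must lie in [0, 1]")
    return prod(scores) ** (1.0 / 3.0)


def rank(candidates: list[Candidate]) -> list[Candidate]:
    """Return candidates sorted from most to least interesting."""
    return sorted(candidates, key=interestingness, reverse=True)


# Placeholder candidates; names and scores are invented for illustration.
known = Candidate("feature_a", "disease_x", "increases", 0.0, 0.9, 0.9)
balanced = Candidate("feature_b", "disease_x", "decreases", 0.6, 0.6, 0.6)
perfect = Candidate("feature_c", "disease_x", "increases", 1.0, 1.0, 1.0)

ranked = rank([known, balanced, perfect])
```

In this sketch a well-known association (`known`, with zero novelty) ranks last no matter how useful or plausible it is, which is one simple way to encode the requirement that a hypothesis be interesting on all three axes at once. A weighted sum would be an equally valid alternative if partial credit is preferred.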

Quantitative Methods, Artificial Intelligence, Computation and Language, Information Retrieval
