Poster in Workshop: Learning Meaningful Representations of Life (LMRL) Workshop @ ICLR 2025
From Medical Literature to Predictive Features: An Evidence-based Knowledge Graph Approach
Donghee Choi · Antoine Lain · Joram Posma · Mark Kozdoba · Binyamin Perets · Shie Mannor
We present a novel approach to augmenting medical and biological prediction tasks with knowledge derived from the literature. Typical modern medical and biological datasets may contain a large number of biomarker and genetic features per subject, yet the number of subjects often remains limited. In addition, while many of the collected features are relatively easy to measure, each feature on its own is typically not strongly informative with regard to the higher-order prediction tasks of interest. This small-sample setting limits and complicates the applicability of standard machine learning prediction methods because of the risk of overfitting. At the same time, decades of medical research have produced extensive knowledge in the form of documented associations between various biological entities. Here we propose a framework for integrating this evidence-based knowledge into predictive models, addressing several challenges in using qualitative literature findings to obtain more informative representations of quantitative data. The approach has three stages: constructing a Knowledge Graph by extracting entity relationships from the literature, constructing a probability model consistent with those relationships, and using the model to improve predictions via feature augmentation and sparsity. Our initial evaluation results demonstrate improved prediction accuracy on biomarkers in the NutriTech dataset.
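To make the graph-based feature augmentation and sparsity ideas concrete, here is a minimal sketch in Python. It is not the authors' pipeline: the edge list, entity names, and the functions `build_graph`, `select_features`, and `augment` are all hypothetical, and the probability-model stage is skipped entirely. The sketch only assumes that literature associations are available as signed entity pairs and that subject measurements are already standardized.

```python
from collections import defaultdict

# Hypothetical signed associations standing in for relations extracted
# from the literature (+1: positive association, -1: negative association).
EDGES = [
    ("glucose", "insulin", +1),
    ("insulin", "hdl", -1),
    ("crp", "il6", +1),
]


def build_graph(edges):
    """Build an undirected signed adjacency map: node -> {neighbor: sign}."""
    graph = defaultdict(dict)
    for a, b, sign in edges:
        graph[a][b] = sign
        graph[b][a] = sign
    return graph


def select_features(graph, target, max_hops):
    """Sparsity step (assumed): keep entities within max_hops of the target."""
    seen = {target}
    frontier = {target}
    for _ in range(max_hops):
        # Expand one hop, dropping nodes that were already reached.
        frontier = {n for node in frontier for n in graph.get(node, {})} - seen
        seen |= frontier
    return sorted(seen - {target})


def augment(subject, graph, features):
    """Augmentation step (assumed): add a signed mean over measured neighbors."""
    out = {f: subject[f] for f in features if f in subject}
    for f in features:
        vals = [sign * subject[n]
                for n, sign in graph.get(f, {}).items() if n in subject]
        if vals:
            out[f + "_kg"] = sum(vals) / len(vals)
    return out


graph = build_graph(EDGES)
kept = select_features(graph, "hdl", max_hops=2)

# One hypothetical subject with standardized measurements.
subject = {"glucose": 1.0, "insulin": 2.0, "crp": 0.5}
row = augment(subject, graph, kept)
```

In this toy graph, predicting `hdl` keeps only `glucose` and `insulin`, since `crp` and `il6` are not connected to the target, and each kept feature gains a `_kg` companion summarizing its measured neighbors. The resulting rows could then be passed to any standard regressor.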