SPIREX: Improving LLM-based relation extraction from RNA-focused scientific literature using graph machine learning
Conference proceedings contribution
Publication date:
2024
Citation:
SPIREX: Improving LLM-based relation extraction from RNA-focused scientific literature using graph machine learning / E. Cavalleri, M. Soto Gomez, A. Pashaeibarough, D. Malchiodi, J.H. Caufield, J.T. Reese, C. Mungall, P.N. Robinson, E. Casiraghi, G. Valentini, M. Mesiti - In: Proceedings of Workshops at the 50th International Conference on Very Large Data Bases. - [s.l] : VLDB.org, 2024. - pp. 1-11 (Paper presented at the 50th International Conference on Very Large Data Bases, held in Guangzhou in 2024.)
Abstract:
Relation extraction from scientific literature to align with a domain ontology is a well-known challenge in natural language processing, particularly critical in precision medicine. The advent of large language models (LLMs) has enabled the development of new and effective approaches to this problem. However, the extracted relations can be prone to problems (e.g., hallucination) that must be minimized. In this paper, we present the initial development of SPIREX, an extension of the SPIRES-based system designed to extract triples from scientific literature involving RNA molecules. Our system leverages schema constraints in the formulation of LLM prompts and utilizes graph machine learning on our RNA-based knowledge graph, RNA-KG, to assess the plausibility of the extracted triples. RNA-KG comprises more than 12.5M edges representing various types of relationships involving RNA molecules.
IRIS type:
03 - Contribution in volume
Keywords:
LLMs; Machine Learning; RNA; NLP; Biomedical Knowledge Graphs
Author list:
E. Cavalleri, M. Soto Gomez, A. Pashaeibarough, D. Malchiodi, J.H. Caufield, J.T. Reese, C. Mungall, P.N. Robinson, E. Casiraghi, G. Valentini, M. Mesiti
Book title:
Proceedings of Workshops at the 50th International Conference on Very Large Data Bases