On the extraction of meaningful RNA interactions from Scientific Publications through LLMs and SPIRES
Contribution in conference proceedings
Publication date:
2024
Citation:
On the extraction of meaningful RNA interactions from Scientific Publications through LLMs and SPIRES / E. Cavalleri, M. Mesiti (CEUR WORKSHOP PROCEEDINGS). - In: EDBT/ICDT-WS 2024 : EDBT/ICDT 2024 Workshops / [edited by] T. Palpanas, H.V. Jagadish. - [s.l.] : CEUR-WS, 2024. - pp. 1-6 (EDBT/ICDT 2024 Joint Conference, held in Paestum in 2024).
Abstract:
Knowledge graphs (KGs) are useful tools to uniformly represent and integrate heterogeneous information about a domain
of interest. However, they are inherently incomplete; therefore, new facts should be introduced by extracting them from
structured and unstructured data sources. Starting from RNA-KG, the first KG tailored for representing different kinds of
RNA molecules that we recently developed, in this paper we evaluate the use of SPIRES for extracting interactions among
bio-entities involving RNA molecules from scientific papers guided by the RNA-KG schema. SPIRES is a general-purpose
knowledge extraction system for mining information conforming to a specified schema. A customized prompt is generated
and submitted to a Large Language Model (LLM) along with a text to extract a set of RDF triples adhering to the schema
constraints. The experiments show a high accuracy in extracting interactions from the scientific literature.
IRIS type:
03 - Contribution in volume
Keywords:
RNA-based technologies; Knowledge Graphs; RNA-drug discovery; Large Language Models
Author list:
E. Cavalleri, M. Mesiti
Book title:
EDBT/ICDT-WS 2024 : EDBT/ICDT 2024 Workshops