Research hypothesis generation over scientific knowledge graphs

Borrego, Agustín; Dessì, Danilo; Ayala, Daniel; Hernández, Inma; Osborne, Francesco; Recupero, Diego Reforgiato; Buscaldi, Davide; Ruiz, David and Motta, Enrico (2025). Research hypothesis generation over scientific knowledge graphs. Knowledge-Based Systems (Early access).

DOI: https://doi.org/10.1016/j.knosys.2025.113280

Abstract

Generating research hypotheses is a crucial step in scientific investigation that involves the creation of precise, verifiable, and logically valid statements that can be empirically examined. Many efforts have therefore been made to automate or assist this process with Artificial Intelligence solutions. However, most existing methods are tailored to very specific domains, particularly within the biomedical field. Recent work has attempted to formalize hypothesis generation as a link prediction task over knowledge graphs, a formulation that is potentially domain-independent and applicable across diverse disciplines. Nevertheless, current approaches to link prediction, which typically rely on embedding models or path-based methods, have shown limited success in accurately predicting new hypotheses. To address these limitations, this paper introduces ResearchLink, a domain-independent methodology for hypothesis generation over knowledge graphs. ResearchLink combines path-based features and knowledge graph embeddings with text embeddings, which capture the semantic context of entities within a given corpus, and integrates additional information from bibliometric databases to improve research collaboration predictions. To evaluate ResearchLink rigorously, we constructed CSKG-600, a new dataset for hypothesis generation consisting of 600 statements that were manually labelled by domain experts. ResearchLink achieved 78.7% P@20, significantly outperforming alternative approaches such as TransH (71.8%), TransD (71.7%), and RotatE (70.7%).
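
The P@20 figures reported in the abstract are precision-at-k scores: the fraction of the k highest-ranked candidate statements that experts labelled as correct. The following is a minimal sketch of that metric in Python, not the evaluation code used in the paper. The function names `rank_by_score` and `precision_at_k`, and the toy scores and labels, are assumptions made for illustration only; the abstract does not describe how the paper's evaluation protocol handles ties or averaging.

```python
from typing import List, Sequence, Tuple


def rank_by_score(candidates: Sequence[Tuple[float, bool]]) -> List[bool]:
    """Sort (score, is_correct) pairs by descending score and return the labels.

    The pairs are hypothetical: a model-assigned score for a candidate link
    and an expert judgement of whether the resulting statement is valid.
    """
    ordered = sorted(candidates, key=lambda pair: pair[0], reverse=True)
    return [is_correct for _, is_correct in ordered]


def precision_at_k(ranked_labels: Sequence[bool], k: int) -> float:
    """Return the fraction of the top-k ranked candidates labelled correct."""
    if k <= 0:
        raise ValueError("k must be a positive integer")
    top_k = ranked_labels[:k]
    # Dividing by k (not len(top_k)) penalises rankings shorter than k.
    return sum(top_k) / k


# Toy data, invented for this example only.
toy_candidates = [(0.91, True), (0.40, False), (0.85, True), (0.77, False)]
toy_ranking = rank_by_score(toy_candidates)
toy_p_at_4 = precision_at_k(toy_ranking, 4)
```

With k = 20, a score of 78.7% means that on average just under 16 of the 20 top-ranked statements were judged correct.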
