Tsaneva, Stefani; Dessì, Danilo; Osborne, Francesco and Sabou, Marta (2025).
DOI: https://doi.org/10.1016/j.ipm.2025.104145
Abstract
Ensuring the quality of knowledge graphs (KGs) is crucial for the success of the intelligent applications they support. Recent advances in large language models (LLMs) have demonstrated human-level performance across various tasks, raising the question of their potential for KG validation. In this work, we explore the role of LLMs in human-centric KG validation workflows, examining different collaboration strategies between LLMs and domain experts. We propose and evaluate nine distinct approaches, ranging from fully automated validation to hybrid methods that combine expert oversight with AI assistance. These workflows are tested within a real-world KG construction pipeline used to generate the Computer Science Knowledge Graph (CS-KG), a large-scale resource designed to support scientometric tasks such as trend forecasting and hypothesis generation. CS-KG comprises 41 million statements represented as 350 million triples within the Computer Science domain. Our findings show that integrating LLMs into the CS-KG verification process enhances precision by 12%, improving alignment with expert-level validation. However, this comes at the cost of recall, resulting in a 5% decrease in the overall F1 score. In contrast, a hybrid approach that involves both human-in-the-loop and LLM modules yields the best overall results, improving the F1 score by 5% with minimal human involvement.