Identifying Semantic Relationships Between Research Topics Using Large Language Models in a Zero-Shot Learning Setting

Aggarwal, Tanay; Salatino, Angelo; Osborne, Francesco and Motta, Enrico (2024). Identifying Semantic Relationships Between Research Topics Using Large Language Models in a Zero-Shot Learning Setting. In: 4th International Workshop on Scientific Knowledge: Representation, Discovery, and Assessment, Sci-K 2024, 12 Nov 2024, Baltimore.

URL: https://ceur-ws.org/Vol-3780/paper3.pdf

Abstract

Knowledge Organization Systems (KOS), such as ontologies, taxonomies, and thesauri, play a crucial role in organising scientific knowledge. They help scientists navigate the vast landscape of research literature and are essential for building intelligent systems such as smart search engines, recommendation systems, conversational agents, and advanced analytics tools. However, the manual creation of these KOSs is costly, time-consuming, and often leads to outdated and overly broad representations. As a result, researchers have been exploring automated or semi-automated methods for generating ontologies of research topics. This paper analyses the use of large language models (LLMs) to identify semantic relationships between research topics. We specifically focus on six open and lightweight LLMs (up to 10.7 billion parameters) and use two zero-shot reasoning strategies to identify four types of relationships: broader, narrower, same-as, and other. Our preliminary analysis indicates that Dolphin2.1-OpenOrca-7B performs strongly in this task, achieving a 0.853 F1-score against a gold standard of 1,000 relationships derived from the IEEE Thesaurus. These promising results bring us one step closer to the next generation of tools for automatically curating KOSs, ultimately making the scientific literature easier to explore.
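To make the task concrete, the setup described above can be sketched as a zero-shot classification routine: the model receives a pair of research topics and must answer with one of the four relationship labels. The prompt wording and the answer-parsing logic below are illustrative assumptions, not the paper's actual prompts or pipeline.

```python
# Hypothetical sketch of the zero-shot setting from the abstract: an LLM is
# asked to classify the relationship between two research topics as one of
# broader, narrower, same-as, or other. Prompt text and parsing heuristics
# are assumptions for illustration only.

LABELS = ("broader", "narrower", "same-as", "other")

def build_prompt(topic_a: str, topic_b: str) -> str:
    """Compose a zero-shot classification prompt for a topic pair."""
    return (
        f"Given the research topics '{topic_a}' and '{topic_b}', "
        "state the semantic relationship of the first topic to the second. "
        "Answer with exactly one of: broader, narrower, same-as, other."
    )

def parse_answer(raw: str) -> str:
    """Map a raw model completion onto one of the four relationship labels."""
    text = raw.strip().lower()
    # Check the specific labels first; fall back to "other" if none match.
    for label in ("same-as", "broader", "narrower"):
        if label in text:
            return label
    return "other"

# Example: a completion for the pair (deep learning, machine learning).
print(parse_answer("The first topic is narrower than the second."))  # narrower
```

In practice the completion would come from one of the evaluated open models (e.g. Dolphin2.1-OpenOrca-7B) and the parsed labels would be scored against the gold standard derived from the IEEE Thesaurus.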
