Language Models for Citation Classification

Nambanoor Kunnath, Suchetha (2024). Language Models for Citation Classification. PhD thesis, The Open University.



Authors reference academic works for a variety of reasons. As a result, not all citations in a research article have the same purpose. The need to understand and distinguish these citation purposes led to the development of automated approaches that consider semantic cues in the form of the context surrounding the citations. Identifying the semantic aspects of citations has proven valuable in various applications including research assessment, information retrieval, document summarisation, and more.
Although automated citation classification has been studied since the early 2000s, current efforts to determine citation types from their contexts remain largely domain-specific. There is also a lack of standard benchmarks for evaluating citation classification models. Extracting metadata that captures the reason behind a citation in scientific articles is laborious, particularly across multiple domains, and researchers still lack consensus on the optimal context size for detecting citation function effectively. Current methods are data-centric: their performance depends heavily on the amount of annotated data available for training. In recent years, self-supervised language models, which efficiently learn contextual relationships from vast unannotated datasets, have brought substantial changes to Natural Language Processing. Despite these advancements, the few-shot predictive capability of language models remains under-utilised in this field.
This thesis addresses the above shortcomings of citation classification. We systematically and comprehensively review the methodologies used in previous work and identify the research gaps and potential directions for future work. This meta-analysis forms the foundation for the research problems addressed in Chapters 3, 4, 5 and 6.
First, in Chapter 3, we introduce a novel benchmark in the form of an open shared task competition for multi-disciplinary citation classification. The methods submitted to this shared task highlight the superiority of deep learning-based approaches and hint at the importance of incorporating additional context to enhance the performance of citation classification models.
Secondly, in Chapter 4, we create a new open access, feature-enriched, multi-disciplinary citation classification dataset to overcome the challenges associated with extracting metadata from both citing and cited articles. The feature extraction process, which draws on multiple sources and still leaves some metadata values missing, illustrates the complexities involved in extracting features for a heterogeneous dataset.
In Chapter 5, we fine-tune pre-trained scientific language models on domain-specific and multi-disciplinary datasets, exploring various fixed citation context windows. We also introduce a new method for automatically extracting dynamic context windows in an unsupervised manner. Both sets of experiments emphasise the significance of additional context in citation context classification. Moreover, the experimental results show that the best citation context window depends on the domain, providing evidence for the benefit of extracting context dynamically.
Lastly, Chapter 6 presents novel prompting strategies for scientific and general-purpose language models to reduce the dependence on labelled citation classification datasets. The analysis of model performance under zero-shot and few-shot settings reveals the effectiveness of large language models with minimal supervision, particularly when employing the newly proposed dynamic citation context-based prompting strategy.
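
As an illustration of two ideas mentioned above, the following is a minimal Python sketch of a fixed citation context window and a zero-shot or few-shot prompt built from it. It is not the thesis's implementation: the function names, the prompt wording, and the label names are hypothetical, and the dynamic context extraction method is not reproduced here.

```python
from typing import List, Tuple


def fixed_context_window(sentences: List[str], cite_idx: int,
                         before: int = 1, after: int = 1) -> str:
    """Return the citing sentence plus a fixed number of neighbours.

    `sentences` is a paragraph already split into sentences and `cite_idx`
    is the position of the sentence containing the citation. The window is
    clipped at the paragraph boundaries. (Hypothetical helper.)
    """
    if not 0 <= cite_idx < len(sentences):
        raise IndexError("cite_idx is outside the sentence list")
    start = max(0, cite_idx - before)
    end = min(len(sentences), cite_idx + after + 1)
    return " ".join(sentences[start:end])


def build_prompt(examples: List[Tuple[str, str]], query_context: str,
                 labels: List[str]) -> str:
    """Assemble a classification prompt for a language model.

    An empty `examples` list gives a zero-shot prompt; one or more
    (context, label) pairs give a few-shot prompt. The wording is an
    assumption made for illustration only.
    """
    parts = ["Classify the purpose of the citation. Options: "
             + ", ".join(labels) + "."]
    for context, label in examples:
        parts.append(f"Context: {context}\nLabel: {label}")
    # The query is left unlabelled so that the model completes the label.
    parts.append(f"Context: {query_context}\nLabel:")
    return "\n\n".join(parts)


if __name__ == "__main__":
    paragraph = [
        "Prior work used hand-crafted features.",
        "We follow the setup of Smith et al. [1].",
        "Our model differs in its encoder.",
    ]
    window = fixed_context_window(paragraph, cite_idx=1, before=1, after=1)
    # Placeholder labels, not the scheme used in the thesis.
    print(build_prompt([], window, ["BACKGROUND", "METHOD"]))
```

Varying `before` and `after` corresponds to trying different fixed window sizes; a dynamic approach would instead choose the window boundaries per citation.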
