Ao, Shuang (2025).
DOI: https://doi.org/10.21954/ou.ro.00102247
Abstract
Despite the impressive performance of AI systems across many fields, determining when to trust their predictions remains difficult. In this thesis, we address the critical challenges of model calibration, failure detection, and uncertainty quantification to enhance the reliability and trustworthiness of AI systems. Firstly, we investigate the problem of model miscalibration, where a model's confidence in its predictions does not align with its actual accuracy. We propose novel metrics and calibration techniques that tackle both over-confidence and under-confidence, enhancing the reliability of model predictions. This is particularly important in safety-critical tasks, where under-confidence can lead to missed detections or alerts. Secondly, we turn to the issue of failure detection. We propose new evaluation metrics, including the Excess Area Under the Optimal Risk-Coverage curve (E-AUoptRC) and the Trust Index (TI), to better assess model trustworthiness and learning ability. These metrics provide a more nuanced understanding of model performance than traditional accuracy measures. Thirdly, we explore uncertainty quantification in natural language generation (NLG) tasks, particularly in the context of large language models (LLMs). We introduce Contrastive Semantic Similarity, a Contrastive Language–Image Pre-training (CLIP)-based feature extraction module, to measure semantic similarity between text pairs and estimate uncertainty. This method improves the trustworthiness of LLM-generated responses by enabling selective NLG, in which unreliable generations are detected and rejected.
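To make the failure-detection metrics above concrete, the sketch below computes a standard risk-coverage curve and the excess area between the empirical curve and the oracle (optimal) curve, the kind of quantity E-AUoptRC builds on. It is a minimal illustration under stated assumptions, not the thesis's exact definition: the normalisation used for E-AUoptRC may differ, and the function names and toy data here are hypothetical.

import numpy as np

def risk_coverage_curve(confidences, correct):
    """Sort predictions by descending confidence and return selective risk
    (error rate among covered samples) at each coverage level."""
    order = np.argsort(-confidences)
    errors = 1.0 - correct[order].astype(float)
    counts = np.arange(1, len(errors) + 1)
    coverage = counts / len(errors)
    risk = np.cumsum(errors) / counts
    return coverage, risk

def aurc(coverage, risk):
    """Area under the risk-coverage curve (lower is better)."""
    return np.trapz(risk, coverage)

# Toy example: six predictions with confidence scores and correctness flags.
conf = np.array([0.95, 0.90, 0.80, 0.70, 0.60, 0.55])
correct = np.array([1, 1, 1, 0, 1, 0])

cov, risk = risk_coverage_curve(conf, correct)
empirical = aurc(cov, risk)

# The optimal curve ranks samples by an oracle that defers all errors last;
# the excess area is the gap between the empirical and optimal curves.
cov_opt, risk_opt = risk_coverage_curve(correct.astype(float), correct)
excess = empirical - aurc(cov_opt, risk_opt)
print(f"AURC={empirical:.3f}, optimal AURC={aurc(cov_opt, risk_opt):.3f}, excess={excess:.3f}")

A smaller excess area means the model's confidence ordering is closer to the oracle ordering, i.e. the model is better at knowing when it is likely to fail.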
Throughout the thesis, we conduct extensive experiments across language and vision modalities, using various benchmark datasets and model architectures, and demonstrate the effectiveness of our proposed methods in improving model calibration, failure detection, and uncertainty quantification. This research contributes to the development of more reliable and safe AI systems, particularly in safety-critical domains where the consequences of model errors can be severe.