The Open UniversitySkip to content

Improving Information Retrieval Bug Localisation Using Contextual Heuristics

Dilshener, Tezcan (2017). Improving Information Retrieval Bug Localisation Using Contextual Heuristics. PhD thesis. The Open University.

Full text available as:
PDF (Version of Record) - Requires a PDF viewer such as GSview, Xpdf or Adobe Acrobat Reader
Download (5MB) | Preview
Google Scholar: Look up in Google Scholar


Software developers working on unfamiliar systems are challenged to identify where and how high-level concepts are implemented in the source code prior to performing maintenance tasks. Bug localisation is a core program comprehension activity in software maintenance: given the observation of a bug, e.g. via a bug report, where is it located in the source code?

Information retrieval (IR) approaches see the bug report as the query, and the source files as the documents to be retrieved, ranked by relevance. Current approaches rely on project history, in particular previously fixed bugs and versions of the source code. Existing IR techniques fall short of providing adequate solutions in finding all the source code files relevant for a bug. Without additional help, bug localisation can become a tedious, time- consuming and error-prone task.

My research contributes a novel algorithm that, given a bug report and the application’s source files, uses a combination of lexical and structural information to suggest, in a ranked order, files that may have to be changed to resolve the reported bug without requiring past code and similar reports.

I study eight applications for which I had access to the user guide, the source code, and some bug reports. I compare the relative importance and the occurrence of the domain concepts in the project artefacts and measure the effectiveness of using only concept key words to locate files relevant for a bug compared to using all the words of a bug report.

Measuring my approach against six others, using their five metrics and eight projects, I position an effected file in the top-1, top-5 and top-10 ranks on average for 44%, 69% and 76% of the bug reports respectively. This is an improvement of 23%, 16% and 11% respectively over the best performing current state-of-the-art tool.

Finally, I evaluate my algorithm with a range of industrial applications in user studies, and found that it is superior to simple string search, as often performed by developers. These results show the applicability of my approach to software projects without history and offers a simpler light-weight solution.

Item Type: Thesis (PhD)
Copyright Holders: 2016 The Author
Keywords: computer software development; debugging; heuristic; computer algorithms
Academic Unit/School: Faculty of Science, Technology, Engineering and Mathematics (STEM)
Faculty of Science, Technology, Engineering and Mathematics (STEM) > Computing and Communications
Item ID: 50204
Depositing User: Tezcan Dilshener
Date Deposited: 09 Aug 2017 12:36
Last Modified: 28 Mar 2020 00:31
Share this page:

Download history for this item

These details should be considered as only a guide to the number of downloads performed manually. Algorithmic methods have been applied in an attempt to remove automated downloads from the displayed statistics but no guarantee can be made as to the accuracy of the figures.

Actions (login may be required)

Policies | Disclaimer

© The Open University   contact the OU