The Open University

INTT: Identifier Name Tokenisation Tool

Butler, Simon (2011). INTT: Identifier Name Tokenisation Tool. The Open University.

Full text available as: ZIP archive (Version of Record), 7MB


Identifier names are the main vehicle for semantic information during program comprehension. For tool-supported program comprehension tasks, including concept location and requirements traceability, identifier names need to be tokenised into their semantic constituents.
We present INTT, a Java library that implements an approach to the automated tokenisation of identifier names that improves on existing techniques in two ways. First, it improves tokenisation accuracy for single-case identifier names and for identifier names containing digits, both of which existing techniques largely ignore. Second, it achieves its performance gains over existing techniques using smaller oracles, making the approach easier to deploy.
Our tokenisation library and the datasets used for its evaluation are made available in this package. Also included is a database of unique identifier names extracted from 60 Java projects, as a resource for further research on program comprehension.
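
To make the tokenisation problem concrete, the following is a minimal sketch and not INTT's API or algorithm. It assumes two hypothetical helpers: splitOnBoundaries, which applies the conventional separator and case-transition rules, and splitWithWordList, a simple greedy longest-match split against a small word list standing in for an oracle. The class name, method names and word list are illustrative assumptions only.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;
import java.util.Set;

// Illustrative sketch only: NOT the INTT library or its API.
final class IdentifierSplitSketch {

    // Conventional splitting: break on '_' and '$' separators, on
    // lower-case-or-digit to upper-case transitions, and before the last
    // capital of an acronym run (e.g. "HTML|Editor").
    static List<String> splitOnBoundaries(String name) {
        List<String> tokens = new ArrayList<>();
        for (String part : name.split("[_$]+")) {
            if (part.isEmpty()) {
                continue; // produced by a leading separator
            }
            for (String token : part.split(
                    "(?<=[a-z0-9])(?=[A-Z])|(?<=[A-Z])(?=[A-Z][a-z])")) {
                tokens.add(token);
            }
        }
        return tokens;
    }

    // Assumed stand-in for an oracle: greedy longest-match against a word
    // list. Returns the token unsplit if the words cannot cover it fully.
    static List<String> splitWithWordList(String token, Set<String> words) {
        List<String> parts = new ArrayList<>();
        String lower = token.toLowerCase(Locale.ROOT);
        int start = 0;
        while (start < lower.length()) {
            int end = lower.length();
            while (end > start && !words.contains(lower.substring(start, end))) {
                end--;
            }
            if (end == start) {
                return List.of(token); // no known word starts here; give up
            }
            parts.add(token.substring(start, end));
            start = end;
        }
        return parts;
    }

    public static void main(String[] args) {
        System.out.println(splitOnBoundaries("getUserName"));    // [get, User, Name]
        System.out.println(splitOnBoundaries("HTMLEditorKit"));  // [HTML, Editor, Kit]
        System.out.println(splitOnBoundaries("outputfilename")); // [outputfilename]
        System.out.println(splitOnBoundaries("int2String"));     // [int2, String]

        Set<String> words = Set.of("output", "file", "name"); // toy word list
        System.out.println(splitWithWordList("outputfilename", words)); // [output, file, name]
    }
}
```

The boundary rules leave the single-case name outputfilename whole and attach the digit in int2String to the preceding token without considering alternatives, which are the two cases the abstract says existing techniques largely ignore. The greedy word-list split is only one possible way to handle single-case names and is shown here to illustrate the role a word list can play.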

Item Type: Other
Copyright Holders: 2011 The Open University
Academic Unit/School: Faculty of Science, Technology, Engineering and Mathematics (STEM) > Computing and Communications
Item ID: 28352
Depositing User: Michel Wermelinger
Date Deposited: 24 Mar 2011 11:58
Last Modified: 07 Dec 2018 09:52