Copy the page URI to the clipboard
Yu, Yijun
(2019).
DOI: https://doi.org/10.1109/ICSE-Companion.2019.00113
Abstract
Frequently source code analysis tools need to exchange internal representations of abstract syntax trees (AST) with each other. Conveniently, and intuitively, the externalised representations are in the form of hierarchical trees. We argue, counter-intuitively, that hierarchical representation is not the most efficient way for source analysis tools to exchange parsed AST. In this work, we propose to speed up AST parsing whilst preserving the equivalence of hierarchies in binary forms: (1) AST could be saved as a flat one-dimensional array where pointers to tree nodes are converted into integer offsets, and (2) such flattened AST are more efficient to access by programming tools through the generated application programming interfaces (API). In programming language-agnostic evaluations, we show that parsing flattened AST becomes 100x faster than in textual form AST on a benchmark of open-source projects of 6 different programming languages.