Automated information extraction from web APIs documentation

Ly, Papa Alioune; Pedrinaci, Carlos and Domingue, John (2012). Automated information extraction from web APIs documentation. In: The 13th International Conference on Web Information System Engineering (WISE 2012), 28-30 Nov 2012, Paphos, Cyprus, pp. 497–511.

A fundamental characteristic of Web APIs is that, de facto, providers hardly follow any standard practices when implementing, publishing, and documenting their APIs. As a consequence, the discovery and use of these services by third parties is significantly hampered. In order to achieve further automation while exploiting Web APIs, we present an approach for automatically extracting relevant technical information from the Web pages documenting them. In particular, we have devised two algorithms that automatically extract technical details such as operation names, operation descriptions or URI templates from the documentation of Web APIs adopting either RPC or RESTful interfaces. The algorithms, which exploit advanced DOM processing as well as state-of-the-art Information Extraction and Natural Language Processing techniques, have been evaluated against a detailed dataset. They exhibit high precision and recall (around 90% for both REST and RPC APIs) and outperform state-of-the-art information extraction algorithms.
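
To give a rough sense of what "extracting URI templates from documentation pages" can mean in practice, the following is a minimal sketch in Python. It is not the algorithm described in the paper: it only collects the text nodes of an HTML page and applies a regular expression. The pattern `URI_TEMPLATE`, the class `TextCollector`, the function `extract_uri_templates` and the example URL are all hypothetical names chosen for illustration.

```python
import re
from html.parser import HTMLParser

# Hypothetical pattern (an assumption, not taken from the paper):
# an optional HTTP verb followed by a URL or path whose segments
# may contain {placeholder} parts.
URI_TEMPLATE = re.compile(
    r"(?:(GET|POST|PUT|DELETE|PATCH)\s+)?"
    r"((?:https?://[^\s/{}]+)?(?:/[\w.\-~{}]+)+)"
)


class TextCollector(HTMLParser):
    """Collects the non-empty text nodes of an HTML page."""

    def __init__(self):
        super().__init__()
        self.chunks = []

    def handle_data(self, data):
        text = data.strip()
        if text:
            self.chunks.append(text)


def extract_uri_templates(html):
    """Return (method, uri) pairs found in the page text.

    A candidate is kept only if it has an HTTP verb in front of it
    or contains a {placeholder}; plain paths are ignored.
    """
    collector = TextCollector()
    collector.feed(html)
    collector.close()
    found = []
    for chunk in collector.chunks:
        for method, uri in URI_TEMPLATE.findall(chunk):
            if method or "{" in uri:
                found.append((method or None, uri))
    return found


page = (
    "<h3>Get a photo</h3>"
    "<pre>GET https://api.example.com/photos/{id}</pre>"
    "<p>Returns one photo.</p>"
)
print(extract_uri_templates(page))
```

A purely pattern-based pass like this ignores page structure entirely, which is the kind of limitation that DOM processing and NLP techniques, as mentioned in the abstract, are meant to address.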

Item Type: Conference or Workshop Item
Copyright Holders: Not known
Keywords: Web API; RESTful service; Web Page segmentation; information extraction; service discovery
Academic Unit/School: Faculty of Science, Technology, Engineering and Mathematics (STEM) > Knowledge Media Institute (KMi); Faculty of Science, Technology, Engineering and Mathematics (STEM)
Research Group: Centre for Research in Computing (CRC)
Item ID: 34934
Depositing User: Kay Dave
Date Deposited: 31 Oct 2012 11:56
Last Modified: 22 Jun 2020 21:03