Harnessing the crowds for automating the identification of Web APIs

Pedrinaci, Carlos; Liu, Dong; Lin, Chenghua and Domingue, John (2012). Harnessing the crowds for automating the identification of Web APIs. In: AAAI Spring Symposium 2012, 26-28 Mar 2012, Stanford, California, USA.

URL: http://www.aaai.org/Symposia/Spring/sss12.php


Supporting the efficient discovery and use of Web APIs is increasingly important as their use and popularity grow. Yet a simple task like finding potentially interesting APIs and their related documentation turns out to be hard and time-consuming, even when using the best resources currently available on the Web. In this paper we describe our research towards an automated Web API documentation crawler and search engine. This paper presents two main contributions. First, we have devised and exploited crowdsourcing techniques to generate a curated dataset of Web API documentation. Second, thanks to this dataset, we have devised an engine able to automatically detect documentation pages. Our preliminary experiments have shown that we obtain an accuracy of 80% and a precision increase of 15 points over a keyword-based heuristic we have used as a baseline.
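To make the evaluation terms concrete, the following is a minimal sketch of what a keyword-based baseline and its scoring might look like. It is not the heuristic used in the paper: the keyword list, the threshold, and the function names are all assumptions introduced purely for illustration.

```python
import re

# Hypothetical keyword list: the abstract does not say which keywords
# the baseline heuristic relies on.
KEYWORDS = ("api", "rest", "endpoint", "oauth", "json", "developer")


def keyword_baseline(page_text, threshold=2):
    """Flag a page as API documentation if it mentions enough keywords.

    The threshold of 2 distinct keywords is an assumption, not a value
    taken from the paper.
    """
    tokens = set(re.findall(r"[a-z]+", page_text.lower()))
    hits = sum(1 for keyword in KEYWORDS if keyword in tokens)
    return hits >= threshold


def accuracy(predicted, actual):
    """Fraction of pages whose predicted label matches the curated label."""
    correct = sum(p == a for p, a in zip(predicted, actual))
    return correct / len(actual)


def precision(predicted, actual):
    """Fraction of pages flagged as documentation that really are."""
    true_positives = sum(p and a for p, a in zip(predicted, actual))
    flagged = sum(predicted)
    return true_positives / flagged if flagged else 0.0
```

Under these assumptions, a "precision increase of 15 points" would mean that the `precision` value of the learned detector exceeds that of `keyword_baseline` by 0.15 on the same labelled pages.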

