Early analysis and debugging of linked open data cubes

Daga, Enrico; d'Aquin, Mathieu; Gangemi, Aldo and Motta, Enrico (2014). Early analysis and debugging of linked open data cubes. In: Second International Workshop on Semantic Statistics, 7 Sep 2014, Riva del Garda, Trentino, Italy.

URL: https://semstats2014.files.wordpress.com/2014/10/s...


The release of the Data Cube Vocabulary specification introduces a standardised method for publishing statistics following the linked data principles. However, a statistical dataset can be very complex, so understanding how to get value out of it may be hard. Analysts need to grasp the content of the data quickly in order to use it appropriately. In addition, while remodelling the data, data cube publishers need support to detect bugs and issues in the structure or content of the dataset. Several aspects of RDF, the Data Cube vocabulary and linked data can help with these issues. One feature of an RDF dataset is that it is "self-descriptive". Here, we attempt to answer the question "How feasible is it to use this feature to give an overview of the data in a way that would facilitate debugging and exploration of statistical linked open data?" We present a tool for debugging and early analysis that automatically builds interactive facets, rendered as diagrams, from a Data Cube representation without prior knowledge of the data content. We show how this tool can be used on a large, complex dataset and we discuss the potential of this approach.
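
To make the "self-descriptive" idea concrete, the following is a minimal sketch of how facets might be derived from a cube's own structure definition. It is not the tool described in the paper. It assumes the cube is available as in-memory (subject, predicate, object) string triples, and the `http://example.org/` namespace, the sample data and the function names are all hypothetical. Only the `qb:component`, `qb:dimension`, `qb:measure` and `qb:Observation` terms come from the Data Cube vocabulary.

```python
from collections import Counter, defaultdict

QB = "http://purl.org/linked-data/cube#"
RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"
EX = "http://example.org/"  # hypothetical namespace, for illustration only

# A tiny made-up cube as (subject, predicate, object) triples.
TRIPLES = [
    (EX + "dsd", QB + "component", EX + "c1"),
    (EX + "c1", QB + "dimension", EX + "year"),
    (EX + "dsd", QB + "component", EX + "c2"),
    (EX + "c2", QB + "dimension", EX + "region"),
    (EX + "dsd", QB + "component", EX + "c3"),
    (EX + "c3", QB + "measure", EX + "students"),
    (EX + "o1", RDF_TYPE, QB + "Observation"),
    (EX + "o1", EX + "year", "2012"),
    (EX + "o1", EX + "region", "North"),
    (EX + "o1", EX + "students", "120"),
    (EX + "o2", RDF_TYPE, QB + "Observation"),
    (EX + "o2", EX + "year", "2013"),
    (EX + "o2", EX + "region", "North"),
    (EX + "o2", EX + "students", "135"),
]


def component_properties(triples, role):
    """Return the properties declared with a given qb role, e.g. 'dimension'."""
    # Component specifications are the objects of qb:component.
    components = {o for s, p, o in triples if p == QB + "component"}
    return sorted(o for s, p, o in triples if s in components and p == QB + role)


def build_facets(triples):
    """Count observations per value for every declared dimension."""
    dimensions = set(component_properties(triples, "dimension"))
    observations = {
        s for s, p, o in triples if p == RDF_TYPE and o == QB + "Observation"
    }
    facets = defaultdict(Counter)
    for s, p, o in triples:
        if s in observations and p in dimensions:
            facets[p][o] += 1
    return {dim: dict(counts) for dim, counts in facets.items()}


facets = build_facets(TRIPLES)
```

No dimension name is hard-coded in the sketch: the dimensions are discovered from the structure definition, which is the property of the data that an overview "without prior knowledge" relies on. A dimension that is declared but never appears in `facets`, or a value with an unexpectedly low count, would be a candidate issue for a publisher to inspect.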

