Daga, Enrico; Meroño-Peñuela, Albert and Motta, Enrico
(2019).
URL: http://ceur-ws.org/Vol-2496/paper2.pdf
Abstract
Many Linked Data datasets model elements in their domains in the form of lists: a countable number of ordered resources. When publishing these lists in RDF, an important concern is making them easy to consume. Therefore, a well-known recommendation is to find an existing list modelling solution, and reuse it. However, a specific domain model can be implemented in different ways and vocabularies may provide alternative solutions. In this paper, we argue that a wrong decision could have a significant impact in terms of performance and, ultimately, the availability of the data. We take the case of RDF Lists and make the hypothesis that the efficiency of retrieving sequential linked data depends primarily on how they are modelled (triple-store invariance hypothesis). To demonstrate this, we survey different solutions for modelling sequences in RDF, and propose a pragmatic approach for assessing their impact on data availability. Finally, we derive good (and bad) practices on how to publish lists as linked open data. By doing this, we sketch the foundations of an empirical, task-oriented methodology for benchmarking linked data modelling solutions.
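To make the modelling choice concrete, the following is a minimal illustrative sketch, not the paper's benchmark or code. It encodes the same ordered items under two standard RDF patterns, the rdf:List linked list (rdf:first / rdf:rest / rdf:nil) and the rdf:Seq container (rdf:_1, rdf:_2, ...), using plain Python tuples as triples. The function names, the blank-node labels, and the "lookups" counter are assumptions introduced only for illustration.

```python
# Illustrative sketch only: triples are plain (subject, predicate, object) tuples.
RDF = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"


def as_rdf_list(items, prefix="_:n"):
    """Encode items with the linked-list pattern (rdf:first / rdf:rest / rdf:nil)."""
    triples = []
    for i, item in enumerate(items):
        node = f"{prefix}{i}"  # hypothetical blank-node label
        nxt = f"{prefix}{i + 1}" if i + 1 < len(items) else RDF + "nil"
        triples.append((node, RDF + "first", item))
        triples.append((node, RDF + "rest", nxt))
    return triples


def as_rdf_seq(items, container="_:seq"):
    """Encode items with the container pattern (rdf:Seq, rdf:_1, rdf:_2, ...)."""
    triples = [(container, RDF + "type", RDF + "Seq")]
    for i, item in enumerate(items, start=1):
        triples.append((container, f"{RDF}_{i}", item))
    return triples


def read_rdf_list(triples, head):
    """Walk the linked list node by node; also count the lookups performed."""
    index = {(s, p): o for s, p, o in triples}
    items, lookups, node = [], 0, head
    while node != RDF + "nil":
        items.append(index[(node, RDF + "first")])
        node = index[(node, RDF + "rest")]
        lookups += 2  # one rdf:first and one rdf:rest lookup per element
    return items, lookups


def read_rdf_seq(triples, container):
    """Collect all membership properties of the container and sort by position."""
    members = []
    for s, p, o in triples:
        if s == container and p.startswith(RDF + "_"):
            members.append((int(p[len(RDF) + 1:]), o))
    return [o for _, o in sorted(members)]


if __name__ == "__main__":
    data = ["ex:a", "ex:b", "ex:c"]
    print(read_rdf_list(as_rdf_list(data), "_:n0"))
    print(read_rdf_seq(as_rdf_seq(data), "_:seq"))
```

In this toy setting the linked list needs a chain of dependent lookups to reach the n-th element, while the container exposes each position directly in the predicate. Whether such differences matter in a real triple store is the kind of question the paper sets out to assess empirically.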