Quantifying the bias in data links

Tiddi, Ilaria; d'Aquin, Mathieu and Motta, Enrico (2014). Quantifying the bias in data links. In: Knowledge Engineering and Knowledge Management, Lecture Notes in Computer Science, Springer International Publishing, pp. 531–546.

DOI: https://doi.org/10.1007/978-3-319-13704-9_40


The main idea behind Linked Data is to connect data from different sources together, in order to develop a hub of shared and publicly accessible knowledge. While the benefit of sharing knowledge is universally recognised, what is less visible is how much results can be affected when the knowledge in one dataset and in the connected ones is not equally distributed. This lack of balance in information, or bias, generally assumed a priori, can actually be quantified to improve the quality of the results of applications and analytics relying on such linked data. In this paper, we propose a process to measure how much bias one dataset contains when compared to another one, by identifying the most affected RDF properties and values within the set of entities that those datasets have in common (defined as the linkset). This process was run on a wide range of linksets from Linked Data, and in the experiment section we present the results as well as measures of its performance.
