The Open University

User data discovery and aggregation: the CS-UDD algorithm

Carmagnola, Francesca; Osborne, Francesco and Torre, Ilaria (2014). User data discovery and aggregation: the CS-UDD algorithm. Information Sciences, 270(20) pp. 41–72.

Full text available as: PDF (Accepted Manuscript), 3MB


In the social web, people use social systems for sharing content and opinions, for communicating with friends, for tagging, etc. People usually have a different account and a different profile on each of these systems. Several tools for user data aggregation and people search have been developed, and protocols and standards for data portability have been defined. This paper presents an approach and an algorithm, named Cross-System User Data Discovery (CS-UDD), to retrieve and aggregate user data distributed across social websites. It is designed to crawl websites, retrieve profiles that may belong to the searched user, correlate them, aggregate the discovered data and return them to the searcher, which may, for example, be an adaptive system. Each retrieved user attribute, expressed as an attribute-value pair, is associated with a certainty factor that expresses the confidence that it is true for the searched user. To test the algorithm, we ran it on two popular social networks, MySpace and Flickr. The evaluation demonstrated the ability of the CS-UDD algorithm to discover unknown user attributes and showed high precision for the discovered attributes.
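The aggregation step described in the abstract can be illustrated with a small sketch. This is not the CS-UDD algorithm itself: the abstract does not say how profiles are matched or how certainty factors are computed, so everything below is an assumption made for illustration. The function name `aggregate_profiles`, the `threshold` parameter, the per-profile match scores and the combination rule `cf = cf_old + score * (1 - cf_old)` (a common way of merging independent evidence) are all hypothetical.

```python
from collections import defaultdict


def aggregate_profiles(profiles, threshold=0.5):
    """Merge attribute-value pairs from candidate profiles.

    `profiles` is a list of (match_score, attributes) pairs, where
    match_score in [0, 1] is an ASSUMED confidence that the profile
    belongs to the searched user and attributes is a dict of strings.
    Returns {(attribute, value): certainty}.
    """
    certainty = defaultdict(float)
    for match_score, attributes in profiles:
        # Skip profiles unlikely to belong to the searched user.
        if match_score < threshold:
            continue
        for attribute, value in attributes.items():
            # Normalise values so "Turin" and "turin" count as agreement.
            key = (attribute, value.strip().lower())
            prior = certainty[key]
            # ASSUMED combination rule: agreeing evidence raises the
            # certainty without ever exceeding 1.
            certainty[key] = prior + match_score * (1.0 - prior)
    return dict(certainty)


# Hypothetical candidate profiles retrieved from different systems.
candidates = [
    (0.8, {"city": "Turin", "gender": "female"}),
    (0.6, {"city": "turin", "age": "34"}),
    (0.2, {"city": "Milan"}),  # below threshold, ignored
]
result = aggregate_profiles(candidates)
```

In this toy run the two profiles that agree on the city reinforce each other, so the pair `("city", "turin")` ends up with a certainty of about 0.92, higher than either profile alone provides, while the low-scoring profile contributes nothing.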

Item Type: Journal Item
Copyright Holders: 2014 Elsevier Inc.
ISSN: 0020-0255
Keywords: social web; user model; user profiling; user data discovery; information retrieval; entity matching; entity linkage
Academic Unit/School: Faculty of Science, Technology, Engineering and Mathematics (STEM) > Knowledge Media Institute (KMi)
Faculty of Science, Technology, Engineering and Mathematics (STEM)
Item ID: 40076
Depositing User: Francesco Osborne
Date Deposited: 21 Jul 2014 09:16
Last Modified: 07 Dec 2018 12:21