Comparing taxonomies for organising collections of documents

Fernando, Samuel; Hall, Mark; Agirre, Eneko; Soroa, Aitor; Clough, Paul and Stevenson, Mark (2012). Comparing taxonomies for organising collections of documents. In: Proceedings of COLING 2012, The COLING 2012 Organizing Committee, Mumbai, India, pp. 879–894.



There is a demand for taxonomies to organise large collections of documents into categories for browsing and exploration. This paper examines four existing taxonomies that have been manually created, along with two methods for deriving taxonomies automatically from data items. We use these taxonomies to organise items from a large online cultural heritage collection. We then present two human evaluations of the taxonomies. The first measures the cohesion of the taxonomies to determine how well they group together similar items under the same concept node. The second analyses the concept relations in the taxonomies. The results show that the manual taxonomies have high-quality, well-defined relations. However, the novel automatic method is found to produce taxonomies with very high cohesion.
