The Open University

Concept-based Document Readability in Domain Specific Information Retrieval

Yan, Xin; Song, Dawei and Li, Xue (2006). Concept-based Document Readability in Domain Specific Information Retrieval. In: 2006 ACM CIKM International Conference on Information and Knowledge Management (CIKM 2006), 6-11 November 2006, Arlington, Virginia, USA.

DOI (Digital Object Identifier) Link: http://dx.doi.org/10.1145/1183614.1183692

Abstract

Domain-specific information retrieval is increasingly in demand. Not only domain experts but also average non-expert users are interested in searching for domain-specific (e.g., medical and health) information from online resources. However, a typical problem for average users is that the search results are always a mixture of documents with different levels of readability. Non-expert users may want to see documents with higher readability at the top of the list. Consequently, the search results need to be re-ranked in descending order of readability. It is often not practical for domain experts to manually label the readability of documents in large databases, so computational models of readability need to be investigated. However, traditional readability formulas are designed for general-purpose text and are insufficient to deal with technical materials in domain-specific information retrieval. More advanced algorithms, such as the textual coherence model, are computationally expensive for re-ranking a large number of retrieved documents. In this paper, we propose an effective and computationally tractable concept-based model of text readability. In addition to the textual genres of a document, our model also takes into account domain-specific knowledge, i.e., how the domain-specific concepts contained in the document affect the document's readability. Three major readability formulas are proposed and applied to health and medical information retrieval. Experimental results show that our proposed readability formulas lead to remarkable improvements in terms of correlation with users' readability ratings over four traditional readability measures.
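
As a rough illustration of the re-ranking step described in the abstract, the sketch below orders retrieved documents by a traditional, general-purpose readability score. It is not the concept-based model proposed in the paper, whose formulas are not given on this page. The Flesch Reading Ease formula is used here only as a well-known example of a traditional measure (the abstract does not name the four measures it compares against), and the syllable counter, function names, and sample documents are all assumptions made for illustration.

```python
import re


def count_syllables(word: str) -> int:
    """Crude heuristic (an assumption): one syllable per run of vowels."""
    groups = re.findall(r"[aeiouy]+", word.lower())
    return max(1, len(groups))


def flesch_reading_ease(text: str) -> float:
    """Flesch Reading Ease: higher scores indicate easier text."""
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    words = re.findall(r"[A-Za-z]+", text)
    if not sentences or not words:
        return 0.0  # arbitrary fallback for empty input
    syllables = sum(count_syllables(w) for w in words)
    return (
        206.835
        - 1.015 * (len(words) / len(sentences))
        - 84.6 * (syllables / len(words))
    )


def rerank_by_readability(docs: list[str]) -> list[str]:
    """Return documents in descending order of readability score."""
    return sorted(docs, key=flesch_reading_ease, reverse=True)


if __name__ == "__main__":
    # Hypothetical retrieved documents, hardest first.
    retrieved = [
        "Myocardial infarction necessitates immediate percutaneous coronary intervention.",
        "The cat sat. The dog ran.",
    ]
    for doc in rerank_by_readability(retrieved):
        print(round(flesch_reading_ease(doc), 2), doc)
```

A score of this kind depends only on sentence length and syllable counts, which is the limitation the abstract points to: it has no notion of how difficult the domain-specific concepts in a document are.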

Item Type: Conference Item
Extra Information: Published in Proceedings of the 15th ACM international conference on Information and knowledge management.
ISBN 1-59593-433-2
Keywords: Document Ranking; Document Readability; Document Scope and Cohesion
Academic Unit/Department: Mathematics, Computing and Technology > Computing & Communications
Item ID: 6731
Date Deposited: 08 Feb 2007
Last Modified: 22 Jun 2012 11:43
URI: http://oro.open.ac.uk/id/eprint/6731