A language modeling approach to search distributed text databases

Yang, Hui and Zhang, Minjie (2003). A language modeling approach to search distributed text databases. In: Carbonell, J. G. and Siekmann, J. eds. AI 2003: Advances in Artificial Intelligence. Lecture Notes in Computer Science, 2903. UK: Springer, pp. 196–207.

URL: http://www.informatik.uni-trier.de/~ley/db/conf/au...
DOI (Digital Object Identifier) Link: http://dx.doi.org/10.1007/b94701

Abstract

As the number and diversity of distributed information sources on the Internet increase exponentially, it is difficult for users to know which databases are appropriate to search. Given database language models that describe the content of each database, database selection services can help locate the databases relevant to the user's information need. In this paper, we propose a database selection approach based on statistical language modeling. The basic idea is that, for databases categorized into a topic hierarchy, individual language models are estimated at different search stages, and the databases are then ranked by their similarity to the query according to the estimated language model. Two-stage smoothed language models are presented to circumvent the inaccuracy caused by word sparseness. Experimental results demonstrate that such a language modeling approach is competitive with current state-of-the-art database selection approaches.
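
The ranking idea in the abstract can be illustrated with a generic query-likelihood sketch. This is not the authors' two-stage model, whose formulas are not given here. It assumes each database is represented by a bag of sampled tokens and smooths each database model by linear interpolation with a background model pooled over all databases. The function name `rank_databases`, the weight `lam`, and the toy data are hypothetical.

```python
import math
from collections import Counter


def rank_databases(query, databases, lam=0.5):
    """Rank databases by smoothed query log-likelihood (illustrative only).

    query: list of query terms.
    databases: dict mapping a database name to a list of sampled tokens
        (hypothetical stand-in for a database description).
    lam: interpolation weight on the database model (assumed value).
    """
    db_counts = {name: Counter(tokens) for name, tokens in databases.items()}

    # Background model: term counts pooled over all databases.
    background = Counter()
    for counts in db_counts.values():
        background.update(counts)
    bg_total = sum(background.values())

    scores = {}
    for name, counts in db_counts.items():
        total = sum(counts.values())
        score = 0.0
        for term in query:
            p_db = counts[term] / total if total else 0.0
            p_bg = background[term] / bg_total if bg_total else 0.0
            # Smoothing keeps a term unseen in this database from zeroing the score.
            p = lam * p_db + (1 - lam) * p_bg
            if p == 0.0:
                # The term occurs in no database, so it cannot discriminate; skip it.
                continue
            score += math.log(p)
        scores[name] = score

    # Highest log-likelihood first.
    return sorted(scores, key=scores.get, reverse=True)


if __name__ == "__main__":
    toy = {
        "med": ["heart", "disease", "heart", "drug"],
        "cs": ["database", "query", "language", "model"],
    }
    print(rank_databases(["language", "model"], toy))
```

Smoothing matters because a sparse sample would otherwise give zero probability to any query term missing from a database, which is the word-sparseness problem the abstract mentions.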

Item Type: Book Chapter
ISBN: 3-540-20646-9, 978-3-540-20646-0
Extra Information: 16th Australian Conference on AI, Perth, Australia, December 3-5, 2003. Proceedings
Academic Unit/Department: Mathematics, Computing and Technology > Computing & Communications
Interdisciplinary Research Centre: Centre for Research in Computing (CRC)
Item ID: 12997
Depositing User: Hui Yang
Date Deposited: 02 Feb 2009 04:22
Last Modified: 02 Dec 2010 20:18
URI: http://oro.open.ac.uk/id/eprint/12997