
Simple yet effective methods for cross-lingual link discovery (CLLD) - KMI @ NTCIR-10 CrossLink-2

Knoth, Petr and Herrmannova, Drahomira (2013). Simple yet effective methods for cross-lingual link discovery (CLLD) - KMI @ NTCIR-10 CrossLink-2. In: NTCIR-10 Evaluation of Information Access Technologies, 18 - 21 June 2013, Tokyo, Japan, pp. 39–46.

Full text available as: PDF (Accepted Manuscript), 810kB
URL: http://research.nii.ac.jp/ntcir/workshop/OnlinePro...

Abstract

Cross-Lingual Link Discovery (CLLD) aims to automatically find links between documents written in different languages. In this paper, we first present relatively simple yet effective methods for CLLD in Wiki collections, explaining the findings that motivated their design. In the NTCIR-10 CrossLink-2 evaluation, our methods (team KMI) achieved the best overall results in the English to Chinese, Japanese and Korean (E2CJK) task and were the top performers in the Chinese, Japanese and Korean to English (CJK2E) task [Tang et al., 2013]. Though tested on these language combinations, the methods are language agnostic and can be easily applied to any other language combination with sufficient corpora and available pre-processing tools. In the second part of the paper, we provide an in-depth analysis of the nature of the task, the evaluation metrics and the impact of the system components on the overall CLLD performance. We believe a good understanding of these aspects is the key to improving CLLD systems in the future.

Item Type: Conference or Workshop Item
Copyright Holders: 2013 National Institute of Informatics
Extra Information: Proceedings of the 10th NTCIR Conference on Evaluation of Information Access Technologies, June 18-21, 2013, Tokyo, Japan
Edited by Noriko Kando, Kazuaki Kishida
2013 National Institute of Informatics
ISBN: 978-4-86049-062-1
Keywords: cross-lingual link discovery; link discovery; semantic similarity; explicit semantic analysis; NTCIR; Wikipedia
Academic Unit/School: Faculty of Science, Technology, Engineering and Mathematics (STEM) > Knowledge Media Institute (KMi)
Faculty of Science, Technology, Engineering and Mathematics (STEM)
Interdisciplinary Research Centre: Centre for Research in Computing (CRC)
Item ID: 37825
Depositing User: Kay Dave
Date Deposited: 25 Jun 2013 08:32
Last Modified: 10 Jul 2017 10:03
URI: http://oro.open.ac.uk/id/eprint/37825