He, Yulan and Lin, Chenghua
(2009).
DOI: https://doi.org/10.1007/978-3-642-12550-8_15
URL: http://www.springerlink.com/content/978-3-642-1254...
Abstract
Text classification is essential for narrowing down the number of documents relevant to a particular topic for further examination, especially when searching through large biomedical databases. Protein-protein interactions are an example of such a topic, with databases devoted specifically to them. This paper proposes a semi-supervised learning algorithm via local learning with class priors (LL-CP) for biomedical text classification, in which unlabeled data points are classified in a vector space based on their proximity to labeled nodes. The algorithm has been evaluated on a corpus of biomedical documents to identify abstracts containing information about protein-protein interactions, with promising results. Experimental results show that LL-CP outperforms traditional semi-supervised learning algorithms such as SVM, and that it also performs better than local learning without class priors.