The Open University
Beyond redundancies: a metric-invariant method for unsupervised feature selection

Hou, Yuexian; Zhang, Peng; Yan, Tingxu; Li, Wenjie and Song, Dawei (2010). Beyond redundancies: a metric-invariant method for unsupervised feature selection. IEEE Transactions on Knowledge and Data Engineering, 22(3) pp. 348–364.

DOI: http://dx.doi.org/10.1109/TKDE.2009.84

Abstract

A fundamental goal of unsupervised feature selection is denoising, which aims to identify and reduce noisy features that are not discriminative. Due to the lack of information about the real classes, denoising is a challenging task. Noisy features can disturb a reasonable distance metric and result in unreasonable feature spaces, i.e., feature spaces in which common clustering algorithms cannot effectively find the real classes. To overcome this problem, we make a primary observation: the relevance of features is intrinsic and independent of any metric scaling on the feature space. This observation implies that feature selection should be invariant, at least to some extent, with respect to metric scaling. In this paper, we clarify the necessity of considering metric invariance in unsupervised feature selection and propose a novel model incorporating it. Our method is motivated by the following observation: if the statistic that guides the unsupervised feature selection process is invariant with respect to possible metric scalings, the solution of the model will also be invariant. Hence, if a metric-invariant model can distinguish discriminative features from noisy ones in a reasonable feature space, it will also work on the unreasonable counterpart obtained from the reasonable one by metric scaling. A theoretical justification of the metric invariance of the proposed model is given, and an empirical evaluation demonstrates its promising performance.
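The invariance idea in the abstract can be illustrated with a small sketch. This is not the paper's method; it is a hypothetical comparison showing that a variance-based feature score changes under per-feature metric scaling, while a rank-based statistic is unchanged under any positive scaling, and so would yield the same feature ordering in the "reasonable" space and its scaled counterpart.

```python
import numpy as np

rng = np.random.default_rng(0)
X = rng.normal(size=(100, 3))          # a "reasonable" feature space
scale = np.array([1.0, 100.0, 0.01])   # arbitrary positive per-feature metric scaling
X_scaled = X * scale                   # the transformed ("unreasonable") counterpart

# A variance-based score is NOT metric invariant: it is rescaled feature by feature.
print("original variances:", X.var(axis=0))        # all roughly 1
print("scaled variances:  ", X_scaled.var(axis=0)) # spread over many orders of magnitude

# A rank-based statistic IS invariant under positive scaling, because scaling
# by a positive constant preserves the ordering of values within each feature.
def ranks(M):
    return np.argsort(np.argsort(M, axis=0), axis=0)

assert np.array_equal(ranks(X), ranks(X_scaled))
print("rank statistic unchanged by metric scaling")
```

Any selection criterion built only from such invariant statistics would pick the same features before and after the scaling, which is the behaviour the abstract argues for.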

Item Type: Journal Article
Copyright Holders: 2010 IEEE
ISSN: 1041-4347
Keywords: information theory; feature evaluation and selection; metric invariant
Academic Unit/Department: Mathematics, Computing and Technology > Computing & Communications
Item ID: 33893
Depositing User: Dawei Song
Date Deposited: 21 Jun 2012 09:39
Last Modified: 23 Oct 2012 14:29
URI: http://oro.open.ac.uk/id/eprint/33893
