Heuristic Non Parametric Collateral Missing Value Imputation: A Step Towards Robust Post-Genomic Knowledge Discovery

Sehgal, Muhammad Shoaib B.; Gondal, Iqbal; Dooley, Laurence S. and Coppel, Ross (2008). Heuristic Non Parametric Collateral Missing Value Imputation: A Step Towards Robust Post-Genomic Knowledge Discovery. In: Chetty, Madhu; Ngom, Alioune and Ahmad, Shandar eds. Pattern Recognition in Bioinformatics. Lecture Notes in Computer Science, 5265. Berlin, Germany: Springer-Verlag, pp. 373–387.

DOI: https://doi.org/10.1007/978-3-540-88436-1_32

Abstract

Microarrays are able to measure the patterns of expression of thousands of genes in a genome to give profiles that facilitate much faster analysis of biological processes for diagnosis, prognosis and tailored drug discovery. Microarrays, however, commonly have missing values which can result in erroneous downstream analysis. To impute these missing values, various algorithms have been proposed including Collateral Missing Value Estimation (CMVE), Bayesian Principal Component Analysis (BPCA), Least Square Impute (LSImpute), Local Least Square Impute (LLSImpute) and K-Nearest Neighbour (KNN). Most of these imputation algorithms exploit either the global or the local correlation structure of the data, but not both, which normally leads to larger estimation errors. This paper presents an enhanced Heuristic Non Parametric Collateral Missing Value Imputation (HCMVI) algorithm which uses CMVE as its core estimator and a Heuristic Non Parametric strategy to compute the optimal number of estimator genes, so that both local and global correlations are exploited optimally.
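To make the idea of choosing the number of estimator genes concrete, the following is a minimal sketch in Python. It is not HCMVI or CMVE: the abstract does not specify either procedure, so the sketch substitutes a plain weighted KNN imputer and picks k by hiding a fraction of the observed entries and measuring the error on them. The function names `knn_impute` and `choose_k`, the `frac` and `seed` parameters, and the inverse-distance weighting are all assumptions made for illustration.

```python
import numpy as np


def knn_impute(X, k):
    """Fill NaNs in a genes x samples matrix from the k most similar genes.

    Illustrative stand-in for a core estimator; this is NOT CMVE.
    """
    X = np.asarray(X, dtype=float)
    filled = X.copy()
    missing = np.isnan(X)
    for i in np.where(missing.any(axis=1))[0]:
        obs_i = ~missing[i]
        for j in np.where(missing[i])[0]:
            # Candidate genes must have a value in the column being imputed.
            cand = [g for g in range(X.shape[0]) if g != i and not missing[g, j]]
            dists = []
            for g in cand:
                shared = obs_i & ~missing[g]
                if shared.any():
                    # Root-mean-square distance over the shared observed samples.
                    d = np.sqrt(np.mean((X[i, shared] - X[g, shared]) ** 2))
                    dists.append((d, g))
            if not dists:
                # No comparable gene: fall back to the mean of the candidates,
                # or leave the entry missing if the column has no values.
                if cand:
                    filled[i, j] = np.mean(X[cand, j])
                continue
            dists.sort()
            nearest = dists[:k]
            w = np.array([1.0 / (d + 1e-8) for d, _ in nearest])
            vals = np.array([X[g, j] for _, g in nearest])
            filled[i, j] = np.sum(w * vals) / np.sum(w)
    return filled


def choose_k(X, candidates, frac=0.1, seed=0):
    """Pick k by hiding a fraction of observed entries and scoring each k.

    Hypothetical selection heuristic; the paper's own strategy may differ.
    """
    X = np.asarray(X, dtype=float)
    rng = np.random.default_rng(seed)
    obs = np.argwhere(~np.isnan(X))
    n_hide = max(1, int(frac * len(obs)))
    hide = obs[rng.choice(len(obs), size=n_hide, replace=False)]
    masked = X.copy()
    masked[hide[:, 0], hide[:, 1]] = np.nan
    truth = X[hide[:, 0], hide[:, 1]]
    errors = {}
    for k in candidates:
        est = knn_impute(masked, k)[hide[:, 0], hide[:, 1]]
        # RMSE on the hidden entries; lower is better.
        errors[k] = float(np.sqrt(np.nanmean((est - truth) ** 2)))
    best = min(errors, key=errors.get)
    return best, errors
```

A typical use would be `best_k, errors = choose_k(X, [1, 3, 5, 10])` followed by `knn_impute(X, best_k)`, where `X` is a genes-by-samples array with `NaN` marking missing values. Scoring on artificially hidden entries is only one possible selection criterion and is used here because it needs no distributional assumptions.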
