Twala, B. E. T. H.; Jones, M. C. and Hand, D. J. (2008).
DOI: https://doi.org/10.1016/j.patrec.2008.01.010
Abstract
We propose a simple and effective method for dealing with missing data in decision trees used for classification. We call this approach 'missingness incorporated in attributes' (MIA). It is very closely related to the technique of treating 'missing' as a category in its own right, generalizing it for use with continuous as well as categorical variables. We show through a substantial data-based study of classification accuracy that MIA exhibits consistently good performance across a broad range of data types and of sources and amounts of missingness. It is competitive with the best of the rest (particularly, a multiple imputation EM algorithm method; EMMI) while being conceptually and computationally simpler. A simple combination of MIA and EMMI is slower but even more accurate.
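As an illustration of the idea in the abstract, the sketch below evaluates MIA-style candidate splits on a single continuous attribute. It is a minimal sketch under stated assumptions, not the authors' implementation: it assumes missing values are encoded as `None`, uses Gini impurity as the split criterion, and considers three kinds of split (missing values sent left with `x <= t`, missing values sent right with `x > t`, and missing versus observed). The function names `gini`, `weighted_gini`, and `best_mia_split` are hypothetical.

```python
from collections import Counter


def gini(labels):
    """Gini impurity of a list of class labels (0.0 for an empty list)."""
    n = len(labels)
    if n == 0:
        return 0.0
    return 1.0 - sum((count / n) ** 2 for count in Counter(labels).values())


def weighted_gini(left, right):
    """Size-weighted average impurity of the two child nodes."""
    n = len(left) + len(right)
    return (len(left) * gini(left) + len(right) * gini(right)) / n


def best_mia_split(x, y):
    """Return (impurity, kind, threshold) for the best MIA-style split.

    x: attribute values, with None marking a missing value (assumption).
    y: class labels, same length as x.
    Returns None if no candidate split exists.
    """
    observed = sorted({v for v in x if v is not None})
    # Candidate thresholds: midpoints between consecutive distinct values.
    thresholds = [(a + b) / 2 for a, b in zip(observed, observed[1:])]

    candidates = []
    for t in thresholds:
        # Try the same threshold twice, sending missing values to each side.
        for missing_side in ("left", "right"):
            left, right = [], []
            for value, label in zip(x, y):
                if value is None:
                    goes_left = missing_side == "left"
                else:
                    goes_left = value <= t
                (left if goes_left else right).append(label)
            score = weighted_gini(left, right)
            candidates.append((score, f"missing_{missing_side}", t))

    # Third kind of split: missing values versus all observed values.
    missing = [label for value, label in zip(x, y) if value is None]
    present = [label for value, label in zip(x, y) if value is not None]
    if missing and present:
        score = weighted_gini(missing, present)
        candidates.append((score, "missing_vs_observed", None))

    if not candidates:
        return None
    # Lowest impurity wins; ties go to the earliest candidate.
    return min(candidates, key=lambda c: c[0])


if __name__ == "__main__":
    x = [1.0, 2.0, None, 8.0, 9.0, None]
    y = ["a", "a", "a", "b", "b", "a"]
    print(best_mia_split(x, y))
```

In this toy data the missing cases all belong to class "a", so the split that sends them left at threshold 5.0 separates the classes perfectly. The point of the sketch is that missingness is handled inside the split search itself, with no imputation step beforehand.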
About
- Item ORO ID: 22531
- Item Type: Journal Item
- ISSN: 0167-8655
- Keywords: C4.5; CART; EM algorithm; fractional cases; missingness as attribute; multiple imputation
- Academic Unit or School: Faculty of Science, Technology, Engineering and Mathematics (STEM) > Mathematics and Statistics; Faculty of Science, Technology, Engineering and Mathematics (STEM)
- Copyright Holders: © 2008 Elsevier B.V.
- Depositing User: Sarah Frain