Moran, Stuart; He, Yulan and Liu, Kecheng (2009).
URL: http://www.iaeng.org/publication/WCE2009/
Abstract
Data miners have access to a significant number of classifiers and apply them to many different types of dataset. This large selection makes it difficult to know which classifier will perform most effectively in any given case. Usually, an understanding of learning algorithms is combined with detailed domain knowledge of the dataset at hand to guide the choice of classifier. We propose an empirical framework that quantitatively assesses the accuracy of a selection of classifiers on different datasets, resulting in a set of classification rules generated by the J48 decision tree algorithm. Data miners can follow these rules to select the most effective classifier for their work. By optimising the parameters used for learning and the sampling techniques applied, a set of rules was learned that selects the most effective classifier with 78% accuracy (within a 0.5% classification-accuracy tolerance).
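A minimal sketch of the kind of meta-learning pipeline the abstract describes, under stated assumptions: scikit-learn is used here purely as a stand-in (J48 is WEKA's C4.5 implementation, approximated below by DecisionTreeClassifier), and the base classifiers, example datasets, and meta-features are illustrative choices, not those evaluated in the paper. The idea is to score each base classifier on each dataset, label the dataset with the best performer, and learn human-readable selection rules from dataset characteristics.

```python
# Illustrative sketch (not the authors' implementation): learn classifier-selection
# rules from a small meta-dataset of dataset characteristics.
import numpy as np
from sklearn.datasets import load_iris, load_wine, load_breast_cancer
from sklearn.model_selection import cross_val_score
from sklearn.naive_bayes import GaussianNB
from sklearn.neighbors import KNeighborsClassifier
from sklearn.svm import SVC
from sklearn.tree import DecisionTreeClassifier, export_text

# Hypothetical pool of base classifiers to choose between.
base_classifiers = {
    "naive_bayes": GaussianNB(),
    "knn": KNeighborsClassifier(),
    "svm": SVC(),
}
datasets = [load_iris(), load_wine(), load_breast_cancer()]

meta_features, best_labels = [], []
for ds in datasets:
    X, y = ds.data, ds.target
    # Simple meta-features describing the dataset (an assumed, illustrative set).
    meta_features.append([X.shape[0], X.shape[1], len(np.unique(y))])
    # Label the dataset with whichever classifier scores highest in cross-validation.
    scores = {name: cross_val_score(clf, X, y, cv=5).mean()
              for name, clf in base_classifiers.items()}
    best_labels.append(max(scores, key=scores.get))

# Train a decision tree on the meta-dataset and print its selection rules.
meta_tree = DecisionTreeClassifier(max_depth=3, random_state=0)
meta_tree.fit(meta_features, best_labels)
print(export_text(meta_tree,
                  feature_names=["n_instances", "n_features", "n_classes"]))
```

With only three datasets this tree is trivial; the framework in the paper derives its rules from a much larger collection of classifier-dataset evaluations, with tuned learning parameters and sampling techniques.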