Progress in data-based bandwidth selection for kernel density estimation

Jones, Chris; Marron, J. S. and Sheather, S. J. (1996). Progress in data-based bandwidth selection for kernel density estimation. Computational Statistics, 11, pp. 337–381.



We review the extensive recent literature on automatic, data-based selection of a global smoothing parameter in univariate kernel density estimation. Proposals are presented in a unified framework, making considerable reference to their theoretical properties as we go. The results of a major simulation study of the practical performance of many of these methods are summarised. Also, our remarks are further consolidated by describing a small portion of our practical experience on real datasets. Our comparison of methods' practical performance demonstrates that improvements to be gained by using the better methods can be, and often are, considerable. It will be seen that achieving optimal theoretical performance (up to bounds derived by Hall and Marron, 1991) and acceptable practical performance is not accomplished by the same techniques. We put much effort into making good practical choices whenever options arise. We emphasise that arguably the two best known bandwidth selection methods cannot be advocated for general practical use; these are "least squares cross-validation" (which suffers from too much variability) and normal-based "rules-of-thumb" (which are too biased towards oversmoothing). A number of methods that do seem to be worthy of further consideration are listed. We show why our overall current preference is for the method of Sheather and Jones (1991). It is hoped that the lessons learned in this comparatively simple setting will also prove useful in many other smoothing situations.
