Anaya-Izquierdo, Karim; Critchley, Frank and Vines, Karen
|DOI (Digital Object Identifier) Link:||http://dx.doi.org/10.1214/10-AOAS374|
Combining principles with pragmatism, a new approach and accompanying algorithm are presented for a longstanding problem in applied statistics: the interpretation of principal components. Following Rousson and Gasser [53 (2004) 539–555],
'the ultimate goal is not to propose a method that leads automatically to a unique solution, but rather to develop tools for assisting the user in his or her choice of an interpretable solution'.
Accordingly, our approach is essentially exploratory. Calling a vector ‘simple’ if it has small integer elements, it poses the open question:
'What sets of simply interpretable orthogonal axes, if any, are angle-close to the principal components of interest?'
Its answer is presented in summary form as an automated visual display of the solutions found, ordered in terms of overall measures of simplicity, accuracy and star quality, from which the user may choose. Here, ‘star quality’ refers to striking overall patterns in the sets of axes found, which deserve to be drawn especially to the user’s attention precisely because they have emerged from the data, rather than being imposed on it by (implicitly) adopting a model. Indeed, other things being equal, explicit models can be checked by seeing whether their fits occur in our exploratory analysis, as we illustrate. Because orthogonality is required, the attractive visualization and dimension reduction features of principal component analysis are retained.
Exact implementation of this principled approach is shown to provide an exhaustive set of solutions, but is combinatorially hard. Pragmatically, we provide an efficient, approximate algorithm. Throughout, worked examples show how this new tool adds to the applied statistician’s armoury, effectively combining simplicity, retention of optimality and computational efficiency, while complementing existing methods. Examples are also given where simple structure in the population principal components is recovered using only information from the sample. Further developments are briefly indicated.
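To make the idea of a ‘simple’ vector that is angle-close to a principal component concrete, the following is a minimal sketch and not the authors' algorithm. It assumes a single loading vector and a hypothetical bound `max_abs` on the integer entries, and it ignores the orthogonality requirement across several axes. The function names `angle_deg` and `closest_simple_vector` are illustrative only.

```python
import itertools
import math


def angle_deg(u, v):
    """Angle in degrees between the lines spanned by u and v (sign is ignored)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    # Clamp to 1.0 to guard against rounding slightly above 1.
    cos = min(1.0, abs(dot) / (norm_u * norm_v))
    return math.degrees(math.acos(cos))


def closest_simple_vector(pc, max_abs=2):
    """Brute-force search over integer vectors with entries in [-max_abs, max_abs].

    Ties in angle are broken in favour of the smaller sum of absolute entries,
    an assumed stand-in for 'simplicity'.
    """
    best, best_key = None, None
    values = range(-max_abs, max_abs + 1)
    for cand in itertools.product(values, repeat=len(pc)):
        if not any(cand):
            continue  # skip the zero vector, which defines no axis
        key = (round(angle_deg(cand, pc), 9), sum(abs(c) for c in cand))
        if best_key is None or key < best_key:
            best, best_key = cand, key
    return best, best_key[0]


if __name__ == "__main__":
    # Made-up loadings, close to an equal-weights average of three variables.
    vec, angle = closest_simple_vector((0.58, 0.57, 0.58))
    print(vec, angle)
```

The search visits (2 × `max_abs` + 1)^p − 1 candidates for a p-dimensional vector, which illustrates why an exhaustive treatment becomes combinatorially hard as the dimension grows.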
|Item Type:||Journal Article|
|Copyright Holders:||2011 Institute of Mathematical Statistics|
|Keywords:||simplified principal components; orthogonal integer loadings|
|Academic Unit/Department:||Mathematics, Computing and Technology > Mathematics and Statistics|
|Depositing User:||Karim Anaya Izquierdo|
|Date Deposited:||24 Mar 2011 16:15|
|Last Modified:||24 Oct 2012 19:27|