Jolliffe, I.T.; Trendafilov, N.T. and Uddin, M.
|DOI (Digital Object Identifier) Link:||http://doi.org/10.1198/1061860032148|
In many multivariate statistical techniques, a set of linear functions of the original p variables is produced. One of the more difficult aspects of these techniques is the interpretation of the linear functions, as these functions usually have nonzero coefficients on all p variables. A common approach is to effectively ignore (treat as zero) any coefficients less than some threshold value, so that the function becomes simple and the interpretation becomes easier for the users. Such a procedure can be misleading. There are alternatives to principal component analysis which restrict the coefficients to a smaller number of possible values in the derivation of the linear functions, or replace the principal components by “principal variables.” This article introduces a new technique, borrowing an idea proposed by Tibshirani in the context of multiple regression where similar problems arise in interpreting regression equations. This approach is the so-called LASSO, the “least absolute shrinkage and selection operator,” in which a bound is introduced on the sum of the absolute values of the coefficients, and in which some coefficients consequently become zero. We explore some of the properties of the new technique, both theoretically and using simulation studies, and apply it to an example.
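To make the contrast in the abstract concrete, the following is a minimal sketch and not the algorithm of the article. It compares the ad hoc approach (zeroing small loadings of an ordinary principal component) with a direction whose loadings satisfy a bound `t` on the sum of their absolute values. The bounded direction is computed here by a soft-thresholded power iteration, a stand-in technique chosen for brevity. The matrix `S`, the cutoff `0.3`, the bound `t=1.5`, and all function names are hypothetical choices made for illustration only.

```python
import numpy as np


def first_pc(S):
    """Ordinary first principal component: leading unit eigenvector of S."""
    _, vecs = np.linalg.eigh(S)          # eigenvalues come back in ascending order
    a = vecs[:, -1]
    return a if a.sum() >= 0 else -a     # fix the arbitrary sign


def hard_threshold(a, cutoff):
    """Ad hoc simplification: treat loadings smaller than `cutoff` as zero."""
    return np.where(np.abs(a) < cutoff, 0.0, a)


def soft_threshold(v, delta):
    """Shrink every entry of v toward zero by delta."""
    return np.sign(v) * np.maximum(np.abs(v) - delta, 0.0)


def l1_bounded_unit(v, t, tol=1e-10):
    """Soft-threshold v just enough that the rescaled unit vector has
    sum(|a_j|) <= t. Assumes 1 <= t <= sqrt(len(v))."""
    u = v / np.linalg.norm(v)
    if np.abs(u).sum() <= t:
        return u                         # bound already satisfied
    lo, hi = 0.0, np.abs(v).max()        # delta = hi would zero every entry
    while hi - lo > tol:                 # bisection on the shrinkage amount
        mid = 0.5 * (lo + hi)
        w = soft_threshold(v, mid)
        norm = np.linalg.norm(w)
        if norm == 0.0 or np.abs(w).sum() <= t * norm:
            hi = mid                     # feasible: try less shrinkage
        else:
            lo = mid
    w = soft_threshold(v, hi)
    norm = np.linalg.norm(w)
    if norm == 0.0:                      # degenerate case: keep the largest loading
        j = np.argmax(np.abs(v))
        w = np.zeros_like(v)
        w[j] = np.sign(v[j])
        return w
    return w / norm


def l1_bounded_pc(S, t, n_iter=200):
    """Stand-in for an L1-bounded component: power iteration on S with the
    bound re-imposed after every step, started at the ordinary component."""
    a = first_pc(S)
    for _ in range(n_iter):
        a = l1_bounded_unit(S @ a, t)
    return a


# Hypothetical correlation matrix: three strongly related variables, two weak ones.
S = np.array([
    [1.00, 0.80, 0.70, 0.10, 0.10],
    [0.80, 1.00, 0.75, 0.10, 0.10],
    [0.70, 0.75, 1.00, 0.10, 0.10],
    [0.10, 0.10, 0.10, 1.00, 0.20],
    [0.10, 0.10, 0.10, 0.20, 1.00],
])
p = S.shape[0]

a_pca = first_pc(S)                       # nonzero loadings on all p variables
a_cut = hard_threshold(a_pca, 0.3)        # small loadings simply ignored
a_l1 = l1_bounded_pc(S, t=1.5)            # bound on the sum of |loadings|
a_free = l1_bounded_pc(S, t=np.sqrt(p))   # bound too loose to bind

for name, a in [("PCA", a_pca), ("thresholded", a_cut), ("L1-bounded", a_l1)]:
    print(name, np.round(a, 3), "sum|a| =", round(float(np.abs(a).sum()), 3))
```

With `t` set to the square root of p the bound cannot bind and the sketch returns the ordinary component, while smaller values of `t` force the loadings to shrink, typically driving the smallest ones to exactly zero. Unlike the thresholded vector, the bounded direction is still a unit vector chosen with the constraint in force, rather than a post hoc edit of the unconstrained solution.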
|Item Type:||Journal Article|
|Keywords:||Interpretation; Principal component analysis; Simplification|
|Academic Unit/Department:||Faculty of Science, Technology, Engineering and Mathematics (STEM) > Mathematics and Statistics; Faculty of Science, Technology, Engineering and Mathematics (STEM)|
|Depositing User:||Nickolay Trendafilov|
|Date Deposited:||13 Nov 2006|
|Last Modified:||03 Aug 2016 03:25|