Mathematics – Statistics Theory
Scientific paper
2009-01-21
Annals of Statistics 2008, Vol. 36, No. 6, 2791-2817
Published in the Annals of Statistics (http://www.imstat.org/aos/) at http://dx.doi.org/10.1214/08-AOS618 by the Institute of
10.1214/08-AOS618
Principal component analysis (PCA) is a standard tool for dimensionality reduction of a set of $n$ observations (samples), each with $p$ variables. In this paper, using a matrix perturbation approach, we study the nonasymptotic relation between the eigenvalues and eigenvectors of PCA computed on a finite sample of size $n$ and those of the limiting population PCA as $n\to\infty$. As is common in machine learning, we present a finite sample theorem, holding with high probability, on the closeness between the leading eigenvalue and eigenvector of sample PCA and those of population PCA under a spiked covariance model. We also consider the relation between finite sample PCA and the asymptotic results in the joint limit $p,n\to\infty$, with $p/n=c$. We present a matrix perturbation view of the "phase transition phenomenon," and a simple linear-algebra based derivation of the eigenvalue and eigenvector overlap in this asymptotic limit. Moreover, our analysis also applies to finite $p,n$, where we show that although there is no sharp phase transition as in the infinite case, either as a function of noise level or as a function of sample size $n$, the eigenvector of sample PCA may exhibit a sharp "loss of tracking," suddenly losing its relation to the (true) eigenvector of the population PCA matrix. This occurs due to a crossover between the eigenvalue due to the signal and the largest eigenvalue due to noise, whose eigenvector points in a random direction.
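The phase transition and eigenvector overlap described above can be explored numerically. The following is a minimal sketch, not the paper's own code: it assumes a single-spike Gaussian model $x = u\,v + \sigma\xi$ with signal variance $\kappa^2$ and noise variance $\sigma^2$, and compares sample PCA against the standard asymptotic expressions for a single spike in the limit $p/n=c$ (threshold $\kappa^2 = \sigma^2\sqrt{c}$). The function names, parameter names (`kappa2`, `sigma2`) and the choice of signal direction are illustrative assumptions.

```python
import numpy as np

def sample_pca_top(n, p, kappa2, sigma2, rng):
    """Leading eigenvalue of the sample covariance and its squared overlap
    with the true signal direction, for one draw from a single-spike model.
    Assumption: the signal direction v is the first coordinate axis."""
    u = np.sqrt(kappa2) * rng.standard_normal(n)        # signal amplitudes
    noise = np.sqrt(sigma2) * rng.standard_normal((n, p))
    X = noise
    X[:, 0] += u                                        # x_i = u_i * v + noise_i
    S = X.T @ X / n      # population mean is zero, so no centering is applied
    evals, evecs = np.linalg.eigh(S)                    # eigenvalues ascending
    lam = float(evals[-1])
    overlap2 = float(evecs[0, -1] ** 2)                 # (v . v_pca)^2
    return lam, overlap2

def asymptotic_prediction(c, kappa2, sigma2):
    """Standard large-(p, n) limits for a single spike with p/n = c.
    Stated here as an assumption, not quoted from the paper."""
    if kappa2 <= sigma2 * np.sqrt(c):
        # Below threshold: top eigenvalue sits at the noise bulk edge,
        # and the eigenvector carries no information about v.
        return sigma2 * (1 + np.sqrt(c)) ** 2, 0.0
    lam = (kappa2 + sigma2) * (1 + c * sigma2 / kappa2)
    r2 = (1 - c * sigma2**2 / kappa2**2) / (1 + c * sigma2 / kappa2)
    return lam, r2

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    n, p = 2000, 400
    for kappa2 in (0.1, 0.3, 0.6, 1.0, 2.0):
        lam, r2 = sample_pca_top(n, p, kappa2, 1.0, rng)
        lam_inf, r2_inf = asymptotic_prediction(p / n, kappa2, 1.0)
        print(f"kappa2={kappa2:4.1f}  lam={lam:.3f} (limit {lam_inf:.3f})  "
              f"overlap2={r2:.3f} (limit {r2_inf:.3f})")
```

Sweeping `kappa2` (or `n` at fixed `p`) across the threshold gives a rough picture of how the finite-sample overlap departs from the sharp asymptotic transition.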
Finite sample approximation results for principal component analysis: a matrix perturbation approach