PCA
- $m$-dimensional affine hyperplane spanned by the first $m$ eigenvectors. A pure manifold method: only a manifold, no codebook vectors
- Goal: be able to reconstruct $x$ from $f(x)$ via a decoding function $d$: $x \approx d \circ f(x)$

Steps
- Center data (A)
- Subtract the mean from each pattern
- $\mu = \frac{1}{N}\sum_i x_i$, giving centered patterns $\bar{x}_i = x_i - \mu$
- The point cloud now has its center of gravity at the origin
- The cloud extends more in some “directions”, characterized by unit-norm direction vectors $u \in \mathbb{R}^n$
- Distance of a point from the origin in the direction of $u$: the projection of $\bar{x}_i$ on $u$, i.e. the inner product $u'\bar{x}_i$
- Extension of the cloud in direction $u$: mean squared distance to the origin
- Largest extension: $u_1 = \operatorname{argmax}_{\|u\|=1}\ \frac{1}{N}\sum_i (u'\bar{x}_i)^2$
- Since the data are centered, the projections have mean 0, so $\frac{1}{N}\sum_i (u'\bar{x}_i)^2$ is their variance
- $u_1$, the direction of largest extension, is the first principal component: PC1
- Project points (B)
- Find the orthogonal $(n-1)$-dimensional linear subspace
- Map all points $\bar{x}$ to $\bar{x}^* = \bar{x} - (u_1'\bar{x})\,u_1$, removing the component along $u_1$
- The direction of largest extension of the projected cloud is the second PC: PC2
- Rinse and repeat (C)
- New PCs plotted in the original cloud (D); the sketch below walks through steps (A)–(C)
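A minimal sketch of steps (A)–(C) in NumPy, assuming patterns are stacked as rows of `X`; the function name `pca_iterative` and all variable names are my own, not from the notes:

```python
import numpy as np

def pca_iterative(X, m):
    """First m principal components via the iterative
    maximize-variance-then-project scheme (steps A-C). X is N x n."""
    # (A) center: subtract the mean from each pattern
    mu = X.mean(axis=0)
    Xbar = X - mu
    pcs = []
    for _ in range(m):
        # the unit vector u maximizing (1/N) * sum_i (u' xbar_i)^2
        # is the leading eigenvector of C = (1/N) Xbar' Xbar
        C = Xbar.T @ Xbar / len(Xbar)
        _, eigvecs = np.linalg.eigh(C)   # eigenvalues in ascending order
        u = eigvecs[:, -1]               # largest eigenvalue -> next PC
        pcs.append(u)
        # (B) project onto the orthogonal subspace: xbar* = xbar - (u' xbar) u
        Xbar = Xbar - np.outer(Xbar @ u, u)
        # (C) rinse and repeat on the flattened cloud
    return mu, np.column_stack(pcs)      # U_m: n x m
```

In practice one decomposition yields all PCs at once (see the SVD bullets further down); the loop form only mirrors the conceptual steps.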
- For features $f_k : \mathbb{R}^n \to \mathbb{R}$, $x \mapsto u_k'\bar{x}$
- Reconstruction: $x = \mu + \sum_{k=1}^{n} f_k(x)\,u_k$
- Keep only the first few PCs, up to index $m$
- Encode $x$ as $(f_1(x),\dots,f_m(x))'$
- Decoding function $d : (f_1(x),\dots,f_m(x))' \mapsto \mu + \sum_{k=1}^{m} f_k(x)\,u_k$ (sketched below)
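A sketch of the encoder and decoder, assuming `mu` and `Um` (columns $u_1,\dots,u_m$) come from something like the function above:

```python
def encode(x, mu, Um):
    # features f_k(x) = u_k' xbar, stacked as (f_1(x), ..., f_m(x))'
    return Um.T @ (x - mu)

def decode(z, mu, Um):
    # d: (f_1(x), ..., f_m(x))' -> mu + sum_k f_k(x) u_k
    return mu + Um @ z

# round trip: decode(encode(x, mu, Um), mu, Um) is approximately x
```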
- How good is the reconstruction?
- Reconstruction error: $\sum_{k=m+1}^{n} \frac{1}{N}\sum_i f_k(x_i)^2$, the variance in the discarded PCs
- Relative amount of dissimilarity: divide by the total empirical variance of the patterns
- $\frac{\sum_{k=m+1}^{n} \frac{1}{N}\sum_i f_k(x_i)^2}{\sum_{k=1}^{n} \frac{1}{N}\sum_i f_k(x_i)^2}$ (1)
- This ratio is typically very small because the variances shrink quickly with growing index $k$: very little information is lost by reducing dimension, which makes PCA well suited to very high-dimensional data (computed below)
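Ratio (1) drops out directly from the variances; a sketch, assuming `sigma2` holds $\sigma_1^2,\dots,\sigma_n^2$ in descending order:

```python
import numpy as np

def lost_variance_ratio(sigma2, m):
    # ratio (1): variance in the discarded PCs k = m+1, ..., n
    # divided by the total variance over all n PCs;
    # sigma2[k-1] equals (1/N) * sum_i f_k(x_i)^2
    sigma2 = np.asarray(sigma2)
    return sigma2[m:].sum() / sigma2.sum()
```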
- Compute SVD
- Applied to the covariance matrix $C = \frac{1}{N}\sum_i \bar{x}_i \bar{x}_i'$ of the centered patterns
- $u_1,\dots,u_n$ form an orthonormal set of real eigenvectors of $C$
- the variances $\sigma_1^2,\dots,\sigma_n^2$ are the corresponding eigenvalues
- $C = U\Sigma U'$ gives all PC vectors $u_k$ lined up as columns of $U$ and the variances $\sigma_k^2$ as eigenvalues on the diagonal of $\Sigma$ (for symmetric $C$ the SVD coincides with the eigendecomposition)
- If we want to preserve 98% of the variance: choose the smallest $m$ such that the ratio in (1) is at most $1 - 0.98 = 0.02$, as sketched below
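Putting it together, a sketch that computes $C = U\Sigma U'$ and picks the smallest $m$ preserving 98% of the variance; function and variable names are mine:

```python
import numpy as np

def pca_svd(X, keep=0.98):
    mu = X.mean(axis=0)
    Xbar = X - mu
    C = Xbar.T @ Xbar / len(Xbar)        # covariance of centered patterns
    # C = U Sigma U': columns of U are the PCs u_k,
    # singular values are the variances sigma_k^2, sorted descending
    U, sigma2, _ = np.linalg.svd(C)
    kept = np.cumsum(sigma2) / sigma2.sum()   # variance kept by first m PCs
    m = int(np.argmax(kept >= keep)) + 1      # smallest m with ratio (1) <= 0.02
    return mu, U[:, :m], sigma2[:m]
```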