import numpy as np
import scipy as sp
import sympy as sy
import matplotlib.pyplot as plt
"ggplot") plt.style.use(
=3)
np.set_printoptions(precision=True) np.set_printoptions(suppress
We have discussed what Spectral Decomposition which can decompose any symmetric matrices unconditionally into three special matrices.However only square matrices have eigenvalues and vectors, however we want to extend a similar concept for any \(m \times n\) matrices.
If \(A\) is an \(m \times n\) matrix, then \(A^TA\) and \(AA^T\) are both symmetric and orthogonally diagonalizable.
= np.array([[4, -3], [-3, 2], [2, -1], [1, 0]])
A A
array([[ 4, -3],
[-3, 2],
[ 2, -1],
[ 1, 0]])
Compute eigenvalues and eigenvectors of \(A^TA\).
= A.T @ A
ATA ATA
array([[ 30, -20],
[-20, 14]])
= np.linalg.eig(ATA) D1, P1
Check if \(P\) is an orthonormal matrix.
@ P1.T P1
array([[1., 0.],
[0., 1.]])
D1
array([43.541, 0.459])
The square roots of eigenvalues of \(A^TA\) are called singular values of \(A\), denoted by \(\sigma_1, ..., \sigma_n\) in decreasing order.
We can also show that singular values of \(A\) are the lengths of vectors \(A\mathbf{v}_1,..., A\mathbf{v}_n\), where \(\mathbf{v}_i\) is the eigenvalue of \(A^TA\).
The length of \(A\mathbf{v}_i\) is \(\|A\mathbf{v}_i\|\)
\[ \|A\mathbf{v}_i\| = \sqrt{(A\mathbf{v}_i)^TA\mathbf{v}_i} = \sqrt{\mathbf{v}_i^TA^T A\mathbf{v}_i}=\sqrt{\mathbf{v}_i^T(\lambda_i\mathbf{v}_i)} = \sqrt{\lambda_i}=\sigma_1 \]
where \(\sqrt{\mathbf{v}_i^T\mathbf{v}_i} = 1\) and \(\lambda_i\)’s are eigenvalues of \(A^TA\).
# Singular Value Decomposition
Singular Value Decomposition (SVD) is probably the most important decomposition technique in the history of linear algebra, it combines all the theory we discussed, then culminate at this point.
\(A\) is a \(m\times n\) matrix. However \(AA^T\) and \(A^TA\) are symmetric matrices,then both are orthogonally diagonalizable.
\[ AA^T = U\Sigma\Sigma^T U^T=(U\Sigma V^T)(V\Sigma U^T)\\ A^TA = V\Sigma^T \Sigma V^T = (V\Sigma^T U^T)(U \Sigma V^T) \]
where \(\Sigma\Sigma^T\) is a diagonal matrix with all eigenvalues of \(AA^T\) and \(\Sigma^T \Sigma\) is a diagonal matrix with all eigenvalues of \(VV^T\).
Because both \(AA^T\) and \(A^TA\) are symmetric, then \(UU^T= U^TU=I_{m\times m}\) and \(VV^T= V^TV=I_{n\times n}\).
We have implicitly shown the singular value decompositions above, one of the most important concept in linear algebra.
\[ \Large SVD:\quad A_{m\times n} = U_{m\times m}\Sigma_{m \times n} V^T_{n \times n} \]
The SVD theory guarantees that any matrix \(A\), no matter its ranks or shapes, can be unconditionally decomposed into three special matrices.
So next question: what is \(\Sigma\)?
It is an \(m\times n\) main diagonal matrix, with all singular values on the main diagonal. Rewrite
\[ A^TA = V\Sigma^T \Sigma V^T = V\Sigma^2 V^T \]
Post-multiply both sides by \(V\)
\[ A^TAV = V\Sigma^2 \]
This is the matrix version of \(A\mathbf{v}_i = \lambda_i \mathbf{v}_i\), but here the matrix of interest is \(A^TA\) rather than \(A\). Similarly it can be written with singular values
\[ A^TA\mathbf{v}_i = \sigma_i^2\mathbf{v}_i \]
Because \(U\) and \(V\) are not unique, we tend to standardize the solution by arranging \(\sigma_1 \geq \sigma_2 \geq \sigma_3\geq ... \geq\sigma_r\).
Why we only arrange \(r\) singular values? Because it is the rank of \(A\), so is the rank of \(A^TA\). Explicitly \(\Sigma\) looks like
\[\Sigma =\left[\begin{array}{cccccc} \sqrt{\lambda_{1}} & & & & &\\ & \sqrt{\lambda_{2}} & & & &\\ & & \ddots & & &\\ & & & \sqrt{\lambda}_{\mathrm{r}} & &\\ & & & & \ddots &\\ & & & & & 0 \end{array}\right] =\left[\begin{array}{cccccc} \sigma_1 & & & & &\\ & \sigma_2 & & & &\\ & & \ddots & & &\\ & & & \sigma_r & &\\ & & & & \ddots &\\ & & & & & 0 \end{array}\right] \]
We can do the same for \(AA^T\) and get
\[ AA^TU = U \Sigma^2 \]
or
\[ AA^T\mathbf{u}_i = \sigma_i^2\mathbf{u}_i \]
We have shown why \(A_{m\times n} = U_{m\times m}\Sigma_{m \times n} V^T_{n \times n}\) holds.
To perfomr a SVD on \(A\), we just need two equations and this is also a mannual procedure to decompose any matrix.
\[ A^TA = V\Sigma^T \Sigma V^T\\ AV = U\Sigma \]
Here’s an example, let’s say we have we have a data set \(A\)
= np.random.rand(10, 2) A
Give it a \(\text{SVD}\) decomposition
= np.linalg.svd(A, full_matrices=False) U, S, VT
Let’s say we want to reduce \(A_{10\times 3}\) in to \(A_{10\times 2}\).
# Keep only the first dimension
= np.dot(U[:, :1], np.dot(np.diag(S[:1]), VT[:1, :]))
A_reduced
# Plot original data
0], A[:, 1])
plt.scatter(A[:, "Original $A$")
plt.title(
plt.show()
# Plot reduced data
0], A_reduced[:, 1])
plt.scatter(A_reduced[:, "$A$ with Reduced Dimensionality")
plt.title( plt.show()
import numpy as np
import matplotlib.pyplot as plt
# Generate some random data
0)
np.random.seed(= np.random.randn(200, 2)
data
# Perform SVD on the data
= np.linalg.svd(data)
U, s, VT
# Perform PCA on the data
= np.mean(data, axis=0)
mean = data - mean
data_pca = np.cov(data_pca.T)
cov = np.linalg.eig(cov)
eigenvalues, eigenvectors
# Plot the data
0], data[:, 1])
plt.scatter(data[:, "Original Data")
plt.title(
plt.show()
# Plot the SVD components
0], U[:, 1])
plt.scatter(U[:, "SVD Components")
plt.title(
plt.show()
# Plot the PCA components
0], eigenvectors[:, 1])
plt.scatter(eigenvectors[:, "PCA Components")
plt.title( plt.show()
Reformulate SVD
Rewrite \(SVD\)
\[ AV = U\Sigma \]
vector version is
\[ A\mathbf{v}_i = \sigma_i \mathbf{u}_i \]
There two implications from the equation above: \((a)\) \(A\) can be decomposed into
\[ A = \sum_{i}^r\sigma_i\mathbf{u}_i \mathbf{v}_i^T \]
\((b)\) We can compute \(\mathbf{u}_i\) by using
\[\mathbf{u}_i = \frac{A\mathbf{v}_i}{\sigma_i}\]