Singular Value Decomposition (SVD)#
The SVD separates any matrix \(\boldsymbol A \in \mathbb R^{m\times n}\) into rank-one pieces:

\[
\boldsymbol A = \boldsymbol U \boldsymbol \Sigma \boldsymbol V^\mathsf{T} = \sigma_1 \boldsymbol u_1 \boldsymbol v_1^\mathsf{T} + \ldots + \sigma_r \boldsymbol u_r \boldsymbol v_r^\mathsf{T}.
\]
The columns \(\boldsymbol u_i\) of \(\boldsymbol U\) and the rows \(\boldsymbol v_i^\mathsf{T}\) of \(\boldsymbol V^\mathsf{T}\) are eigenvectors of the symmetric matrices \(\boldsymbol{AA}^\mathsf{T}\) and \(\boldsymbol A^\mathsf{T} \boldsymbol A\) respectively.
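This connection can be checked numerically. Below is a minimal sketch (the matrix B is an arbitrary example chosen here, not taken from the text) verifying that \(\boldsymbol{AA}^\mathsf{T}\boldsymbol u_i = \sigma_i^2 \boldsymbol u_i\) and \(\boldsymbol A^\mathsf{T}\boldsymbol A\boldsymbol v_i = \sigma_i^2 \boldsymbol v_i\):

```python
import numpy as np

# an arbitrary example matrix (not from the text)
B = np.array([[1., 2., 0.],
              [0., 1., 3.]])

U, S, Vt = np.linalg.svd(B)   # the third output is V^T, not V
V = Vt.T

for i, s in enumerate(S):
    # u_i is an eigenvector of B B^T with eigenvalue sigma_i^2
    print(np.allclose(B @ B.T @ U[:, i], s**2 * U[:, i]))
    # v_i is an eigenvector of B^T B with eigenvalue sigma_i^2
    print(np.allclose(B.T @ B @ V[:, i], s**2 * V[:, i]))
```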
Singular value theorem#
Any rectangular matrix \(\boldsymbol A \in \mathbb R^{m\times n}\) has a singular value decomposition

\[
\boldsymbol A = \boldsymbol U \boldsymbol \Sigma \boldsymbol V^\mathsf{T}.
\]
Here
- \(\boldsymbol U = [\boldsymbol u_1 \ \ldots \ \boldsymbol u_m] \in\mathbb R^{m\times m}\) is an orthogonal matrix;
- \(\boldsymbol \Sigma = \begin{pmatrix} \boldsymbol \Lambda_r & \boldsymbol 0 \\ \boldsymbol 0 & \boldsymbol 0 \end{pmatrix} \in \mathbb R^{m\times n}\);
- \(\boldsymbol \Lambda_r = \mathrm{diag}\{\sigma_1, \ldots, \sigma_r\}\), \(r \leqslant \min\{m, n\}\), \(\sigma_1 \geqslant \sigma_2 \geqslant \ldots \geqslant \sigma_r > 0\);
- \(\boldsymbol V = [\boldsymbol v_1 \ \ldots \ \boldsymbol v_n] \in\mathbb R^{n\times n}\) is an orthogonal matrix.
The numbers \(\sigma_1, \ldots, \sigma_r\) are called the singular values of \(\boldsymbol A\).
Q. What are the shapes of the blocks of the matrix \(\boldsymbol \Sigma\)?
Proof of SVD
Let \(\mathrm{rank}(\boldsymbol A) = r\). By the spectral theorem applied to the symmetric matrix \(\boldsymbol A^\mathsf{T} \boldsymbol A\) we can find an orthonormal system \(\boldsymbol v_1, \ldots, \boldsymbol v_r\) such that

\[
\boldsymbol A^\mathsf{T} \boldsymbol A \boldsymbol v_i = \sigma_i^2 \boldsymbol v_i, \quad i = 1, \ldots, r,
\]
for some values \(\sigma_1 \geqslant \ldots \geqslant \sigma_r > 0\). The vectors \(\boldsymbol u_i\) are defined by the equalities \(\boldsymbol{Av}_i = \sigma_i \boldsymbol u_i\), \(i=1,\ldots, r\). Now observe that

\[
\boldsymbol u_i^\mathsf{T} \boldsymbol u_j = \frac{(\boldsymbol{Av}_i)^\mathsf{T} \boldsymbol{Av}_j}{\sigma_i \sigma_j} = \frac{\boldsymbol v_i^\mathsf{T} \boldsymbol A^\mathsf{T} \boldsymbol{Av}_j}{\sigma_i \sigma_j} = \frac{\sigma_j^2 \boldsymbol v_i^\mathsf{T} \boldsymbol v_j}{\sigma_i \sigma_j} = \begin{cases} 1, & i = j, \\ 0, & i \ne j. \end{cases}
\]
Hence, the system \(\boldsymbol u_1, \ldots, \boldsymbol u_r\) is also orthonormal. We can complete both systems \(\boldsymbol u_i\) and \(\boldsymbol v_j\) to orthonormal bases of \(\mathbb R^m\) and \(\mathbb R^n\) respectively. Since

\[
\mathrm{span}(\boldsymbol v_1, \ldots, \boldsymbol v_r) = C(\boldsymbol A^\mathsf{T}),
\]
due to the fundamental theorem of linear algebra the remaining vectors span the nullspace of \(\boldsymbol A\):

\[
\mathrm{span}(\boldsymbol v_{r+1}, \ldots, \boldsymbol v_n) = N(\boldsymbol A).
\]
Therefore, \(\boldsymbol{Av}_i = \boldsymbol 0\) for all \(i = r+1, \ldots, n\).
Finally, establish the equality \(\boldsymbol{AV} = \boldsymbol{U \Sigma}\). The left-hand side is

\[
\boldsymbol{AV} = [\boldsymbol{Av}_1 \ \ldots \ \boldsymbol{Av}_n] = [\sigma_1 \boldsymbol u_1 \ \ldots \ \sigma_r \boldsymbol u_r \ \boldsymbol 0 \ \ldots \ \boldsymbol 0],
\]

while the right-hand side is

\[
\boldsymbol{U\Sigma} = [\boldsymbol u_1 \ \ldots \ \boldsymbol u_m] \begin{pmatrix} \boldsymbol \Lambda_r & \boldsymbol 0 \\ \boldsymbol 0 & \boldsymbol 0 \end{pmatrix} = [\sigma_1 \boldsymbol u_1 \ \ldots \ \sigma_r \boldsymbol u_r \ \boldsymbol 0 \ \ldots \ \boldsymbol 0].
\]

Thus \(\boldsymbol{AV} = \boldsymbol{U\Sigma}\), and multiplying by \(\boldsymbol V^\mathsf{T}\) on the right gives \(\boldsymbol A = \boldsymbol{U \Sigma V}^\mathsf{T}\).
SVD in NumPy#
import numpy as np
A = np.arange(6).reshape(2, 3)
A
array([[0, 1, 2],
       [3, 4, 5]])
To find the SVD, use np.linalg.svd. Note that it returns the matrix \(\boldsymbol V^\mathsf{T}\) rather than \(\boldsymbol V\), so the variable V below actually holds \(\boldsymbol V^\mathsf{T}\):
U, S, V = np.linalg.svd(A)
print("U:")
print(U)
print("Singular values:", S)
print("V:")
print(V)
U:
[[-0.27472113 -0.96152395]
 [-0.96152395  0.27472113]]
Singular values: [7.34846923 1.        ]
V:
[[-0.39254051 -0.56077215 -0.7290038 ]
 [ 0.82416338  0.13736056 -0.54944226]
 [ 0.40824829 -0.81649658  0.40824829]]
Check the decomposition; the diagonal matrix of singular values is padded with a zero column so that \(\boldsymbol \Sigma\) has shape \(2\times 3\):
Sigma = np.hstack((np.diag(S), np.zeros(shape=(2, 1))))
U @ Sigma @ V
array([[9.6365988e-16, 1.0000000e+00, 2.0000000e+00],
       [3.0000000e+00, 4.0000000e+00, 5.0000000e+00]])
Check that \(\Vert A \Vert_F = \Vert S\Vert_2\):
np.linalg.norm(S), np.linalg.norm(A)
(7.416198487095663, 7.416198487095663)
If \(\boldsymbol A\) is a positive semidefinite square matrix (\(m=n\)), then according to the spectral theorem

\[
\boldsymbol A = \boldsymbol Q \boldsymbol \Lambda \boldsymbol Q^\mathsf{T},
\]

where \(\boldsymbol Q\) is orthogonal and \(\boldsymbol \Lambda\) is diagonal with nonnegative elements on the main diagonal. In this case the spectral decomposition is itself an SVD: \(\boldsymbol U = \boldsymbol V = \boldsymbol Q\) and \(\boldsymbol \Sigma = \boldsymbol \Lambda\) (after sorting the diagonal in descending order).
A = np.array([[3, 2, 1], [2, 3, 2], [1, 2, 3]])
A
array([[3, 2, 1],
       [2, 3, 2],
       [1, 2, 3]])
The spectral decomposition can therefore also be calculated by np.linalg.svd:
U, S, V = np.linalg.svd(A)
U, S, V
(array([[-5.41774320e-01,  7.07106781e-01,  4.54401349e-01],
        [-6.42620551e-01,  3.88578059e-16, -7.66184591e-01],
        [-5.41774320e-01, -7.07106781e-01,  4.54401349e-01]]),
 array([6.37228132, 2.        , 0.62771868]),
 array([[-5.41774320e-01, -6.42620551e-01, -5.41774320e-01],
        [ 7.07106781e-01,  5.55111512e-17, -7.07106781e-01],
        [ 4.54401349e-01, -7.66184591e-01,  4.54401349e-01]]))
Since \(\boldsymbol A\) is symmetric, here \(\boldsymbol U = \boldsymbol V\); recall that the variable V holds \(\boldsymbol V^\mathsf{T}\), so we compare U with V.T:
np.linalg.norm(U - V.T)
8.563843312570936e-16
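As a cross-check (a minimal sketch, assuming the variables A, U, S from the cells above are still defined), np.linalg.eigh computes the spectral decomposition directly; up to the ascending ordering of its eigenvalues and the signs of the eigenvectors, it should agree with the SVD:

```python
# spectral decomposition of the symmetric matrix A via np.linalg.eigh
eigvals, Q = np.linalg.eigh(A)

# eigh returns eigenvalues in ascending order; reverse to match the SVD
eigvals, Q = eigvals[::-1], Q[:, ::-1]

print(np.allclose(eigvals, S))                      # eigenvalues = singular values
print(np.allclose(np.abs(Q), np.abs(U)))            # eigenvectors match up to sign
print(np.allclose(Q @ np.diag(eigvals) @ Q.T, A))   # reconstructs A
```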
Reduced form of SVD#
The full SVD \(\boldsymbol A = \boldsymbol U \boldsymbol \Sigma \boldsymbol V^\mathsf{T}\) is equivalent to the reduced form \(\boldsymbol{A} = \boldsymbol U_r \boldsymbol \Lambda_r \boldsymbol V_r^\mathsf{T}\), where

\[
\boldsymbol U_r = [\boldsymbol u_1 \ \ldots \ \boldsymbol u_r] \in \mathbb R^{m\times r}, \quad \boldsymbol \Lambda_r = \mathrm{diag}\{\sigma_1, \ldots, \sigma_r\}, \quad \boldsymbol V_r = [\boldsymbol v_1 \ \ldots \ \boldsymbol v_r] \in \mathbb R^{n\times r}.
\]
Also, this can be written as

\[
\boldsymbol A = \sum\limits_{i=1}^r \sigma_i \boldsymbol u_i \boldsymbol v_i^\mathsf{T}.
\]
Hence, the SVD represents a matrix as a sum of rank-one matrices.
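In NumPy, passing full_matrices=False to np.linalg.svd keeps only \(\min(m, n)\) singular vectors on each side, which coincides with the reduced form whenever \(r = \min(m, n)\), as for the \(2\times 3\) example above. A minimal sketch (the matrix is re-created here):

```python
A = np.arange(6).reshape(2, 3)     # the 2x3 example matrix from above

# reduced SVD: full_matrices=False keeps min(m, n) = 2 singular vectors on each side
U_r, S_r, Vt_r = np.linalg.svd(A, full_matrices=False)
print(U_r.shape, S_r.shape, Vt_r.shape)    # (2, 2) (2,) (2, 3)

# reconstruction from the reduced form
print(np.allclose(U_r @ np.diag(S_r) @ Vt_r, A))

# the same matrix as a sum of rank-one pieces sigma_i * u_i v_i^T
A_sum = sum(s * np.outer(U_r[:, i], Vt_r[i, :]) for i, s in enumerate(S_r))
print(np.allclose(A_sum, A))
```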
Truncated SVD#
Let \(\boldsymbol A = \boldsymbol U \boldsymbol \Sigma \boldsymbol V^\mathsf{T}\) be the SVD of \(\boldsymbol A\). The truncated SVD is

\[
\hat{\boldsymbol A}_K = \sum\limits_{i=1}^K \sigma_i \boldsymbol u_i \boldsymbol v_i^\mathsf{T}, \quad K \leqslant r.
\]
Note that \(\hat{\boldsymbol A}_K\) has rank \(K\) and \(\hat{\boldsymbol A}_K = \boldsymbol A\) if \(K=r\). It turns out that \(\hat{\boldsymbol A}_K\) is the best approximation of \(\boldsymbol A\) among all matrices of rank \(K\):

\[
\Vert \boldsymbol A - \hat{\boldsymbol A}_K \Vert_F \leqslant \Vert \boldsymbol A - \boldsymbol B \Vert_F \quad \text{for all } \boldsymbol B \text{ with } \mathrm{rank}(\boldsymbol B) = K.
\]
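The approximation error satisfies \(\Vert \boldsymbol A - \hat{\boldsymbol A}_K\Vert_F^2 = \sum_{i=K+1}^r \sigma_i^2\) (the last exercise below), which is easy to verify numerically. A minimal sketch on an arbitrary random matrix (not from the text):

```python
rng = np.random.default_rng(0)
A = rng.normal(size=(5, 4))        # arbitrary test matrix, full rank r = 4

U, S, Vt = np.linalg.svd(A, full_matrices=False)

K = 2
# truncated SVD: keep the K largest singular values
A_K = U[:, :K] @ np.diag(S[:K]) @ Vt[:K, :]

print(np.linalg.matrix_rank(A_K))              # K
err = np.linalg.norm(A - A_K, 'fro') ** 2
print(np.isclose(err, np.sum(S[K:] ** 2)))     # equals sum of discarded sigma_i^2
```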
Image compression#
Based on [Murphy, 2022], section 7.5.
import numpy as np
import matplotlib.pyplot as plt
import requests
import io
from PIL import Image
from sys import getsizeof
%config InlineBackend.figure_format = 'svg'
r = requests.get('https://github.com/probml/probml-data/blob/main/data/clown.png?raw=true', stream=True)
img = Image.open(io.BytesIO(r.content))
plt.imshow(img);
X = np.array(img)
rk = np.linalg.matrix_rank(X)
print("Shape:", X.shape)
print("Rank:", rk)
Shape: (200, 320)
Rank: 200
def compress_img(X, r):
    # np.linalg.svd returns V^T, so the rows V[:r, :] are the top right singular vectors
    U, sigma, V = np.linalg.svd(X, full_matrices=True)
    # rank-r approximation: keep only the r largest singular values
    x_hat = np.dot(np.dot(U[:, :r], np.diag(sigma[:r])), V[:r, :])
    plt.imshow(x_hat, cmap='gray')
\(1\)-rank approximation:
compress_img(X, 1)
\(2\)-rank approximation:
compress_img(X, 2)
\(5\)-rank approximation:
compress_img(X, 5)
\(10\)-rank approximation:
compress_img(X, 10)
The \(20\)-rank approximation already looks quite acceptable:
compress_img(X, 20)
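A rough way to quantify the savings: storing \(\boldsymbol U_r\), the \(r\) singular values and \(\boldsymbol V_r\) takes about \(r(m + n + 1)\) numbers instead of the \(mn\) pixels of the original image. The helper compression_ratio below is ours, not from [Murphy, 2022]:

```python
def compression_ratio(m, n, r):
    # storage for U_r (m*r numbers), the r singular values, and V_r (n*r numbers),
    # relative to storing all m*n pixels
    return r * (m + n + 1) / (m * n)

m, n = X.shape
for r in [1, 2, 5, 10, 20]:
    print(f"rank {r:2d}: {compression_ratio(m, n, r):.3f} of original size")
```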
Now plot the approximation error.
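The helper plot_SVD_approx is not defined in this section; here is a minimal sketch of what it might do, assuming it plots the relative Frobenius error \(\Vert \boldsymbol X - \hat{\boldsymbol X}_K \Vert_F / \Vert \boldsymbol X \Vert_F\) of the rank-\(K\) approximation against \(K\):

```python
def plot_SVD_approx(X, ranks=range(1, 101, 5)):
    # a sketch of the helper: relative Frobenius error of the rank-K approximation
    U, sigma, Vt = np.linalg.svd(X, full_matrices=False)
    errors = []
    for K in ranks:
        X_hat = U[:, :K] @ np.diag(sigma[:K]) @ Vt[:K, :]
        errors.append(np.linalg.norm(X - X_hat) / np.linalg.norm(X))
    plt.plot(list(ranks), errors, marker='o')
    plt.xlabel("rank $K$")
    plt.ylabel("relative error")
```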
plot_SVD_approx(X)
Exercises#
1. Show that \(\Vert \boldsymbol A \Vert_F^2 = \sum\limits_{i = 1}^r \sigma_i^2\).

2. Find the SVD of a rank-one matrix \(\boldsymbol A = \boldsymbol{uv}^\mathsf{T}\), \(\boldsymbol u \in \mathbb R^m\), \(\boldsymbol v \in \mathbb R^n\).

3. Prove that the vectors \(\boldsymbol u_i\) are eigenvectors of \(\boldsymbol {AA}^\mathsf{T}\).

4. Explain why \(\mathrm{span}(\boldsymbol v_1, \ldots, \boldsymbol v_r) = C(\boldsymbol A^\mathsf{T})\).

5. Find the SVD of the matrices

   \[\begin{split} \begin{pmatrix} 3 & 0 \\ 4 & 5 \end{pmatrix}, \quad \begin{pmatrix} 0 & 1 & 0 & 0 \\ 0 & 0 & 2 & 0 \\ 0 & 0 & 0 & 3 \\ 0 & 0 & 0 & 0 \\ \end{pmatrix}, \quad \begin{pmatrix} 1 & 1 & 0 \\ 0 & 1 & 1 \end{pmatrix}. \end{split}\]

6. Show that \(\Vert \boldsymbol A - \hat{\boldsymbol A}_K\Vert_F^2 = \sum\limits_{i=K+1}^r \sigma_i^2\).