The Lanczos Method
Central to the theory of matrix factorization is the spectral theorem, which provides conditions under which a linear operator can be diagonalized in terms of its eigenvectors:
When is symmetric, the eigen-decomposition is not only guarenteed to exist, but its canonical form may be obtained via orthogonal diagonalization. Such matrices are among the most commonly encountered matrices in applications.
In 1950, Cornelius Lanczos proposed an algorithm for obtaining by different kind of factorization of by means of tri-diagonalization:
Like the eigen-decomposition, the “outer” term(s) orthogonally project to a highly structured “inner” matrix . Unlike the eigen-decomposition, this factorization has no canonical form and does not readily yield the eigensets of , prompting questions of its utility.
Nonetheless, this decomposition ended up proving to be one of the most revealing spectral decompositions of all time. Despite its age, Lanczos’ method of tridiagonalization remains the standard algorithm1 for both computing eigensets and solving linear systems in the large-scale regime. An IEEE guest editorial places it among the top 10 most influential algorithms of the 20th century.
Lanczos on a bumper sticker
Given any non-zero , Lanczos generates a Krylov subspace via successive powers of :
These vectors are linearly independent, so orthogonalizing them yields not only an orthonormal basis for but also a change-of-basis , re-parameterizing by a new matrix :
Since is symmetric, is guaranteed to have a symmetric tridiagonal structure:
Since spans , the change-of-basis is a similarity transform—an equivalence relation on —thus we can obtain by diagonalizing :
As is quite structured, it is easy to diagonalize, and in doing so we have effectively solved the eigenvalue problem for . To quote the Lanczos introduction from Parlett, could anything be more simple?
The “iteration” part
Lanczos originally referred to his algorithm as the method of minimized iterations, and indeed nowadays it is often considered an iterative method. Where’s the iterative component?
If you squint hard enough, you can deduce that for every :
Equating the -th columns on both sides and rearranging yields a three-term recurrence:
The equation above is a variable-coefficient second-order linear difference equation, and it is known such equations have unique solutions; they are given below:
In other words, if () are known, then (, ) are completely determined. In theory, this means we can iteratively generate both and using just a couple vectors at a time—no explicit QR call needed. Pretty nifty, eh!
Wait, isn’t arbitrary?
Unlike the spectral decomposition2, there is no canonical choice of . Indeed, as is a family with degrees of freedom and was chosen arbitrarily, there are infinitely many essentially distinct such decompositions.
Not all hope is lost though. It turns out that is actually fully characterized by the pair . To see this, notice that since is an orthogonal matrix, we have:
By extension, given an initial pair satisfying , the following holds:
…this is a QR factorization, which is essentially unique! Indeed, the Implicit Q Theorem asserts that if an upper Hessenburg matrix has only positive elements on its first subdiagonal and there exists an orthogonal matrix such that , then and are uniquely determined by .
The end of the beginning
Nowadays, the Lanczos method lies at the core of a host of methods aimed at approximating various spectral quantities, and modern implementations of the Lanczos method deviate wildly from the original conception. Whether for computing eigenvalues (Parlett 1993), solving linear systems (Saad 1987), or approximating matrix functions (Musco, Musco, and Sidford 2018), applications of the Lanczos method are quite diverse.
In fact, the theory underlying the Lanczos method is intrinsically connected to many other algorithms and mathematical constructs, popular examples including the Rayleigh Ritz procedure, the conjugate gradient method, and notions of Gaussian quadrature. As each of these deserve their own post, I’ll defer them to later postings.
In the next post, I’ll cover more of the algorithmic components of the Lanczos method, including the pseudo-code, actual code, the time and space complexities, and additional nuances regarding the stability of the computation.
References
Musco, Cameron, Christopher Musco, and Aaron Sidford. 2018. “Stability of the Lanczos Method for Matrix Function Approximation.” In Proceedings of the Twenty-Ninth Annual ACM-SIAM Symposium on Discrete Algorithms, 1605–24. SIAM.
Parlett, Beresford N. 1993. “Do We Fully Understand the Symmetric Lanczos Algorithm Yet.” Brown Et Al 3: 93–107.
Saad, Youcef. 1987. “On the Lanczos Method for Solving Symmetric Linear Systems with Several Right-Hand Sides.” Mathematics of Computation 48 (178): 651–62.
Footnotes
- A variant of the Lanczos method is actually at the heart of
scipy.sparse.linalg
’s defaulteigsh
solver (which is a port of ARPACK). - The spectral decomposition is canonical in the sense that it identifies a diagonalizable with its spectrum up to a change of basis