LU Decomposition
Earlier, we explored how to transform a matrix \(\mathbf{A}\) into its upper triangular form \(\mathbf{U}\) through elimination. Now, we want to achieve the same result using matrices. We’ll call these matrices \(\mathbf{E}\), for elimination (a rather creative choice).
Elimination with Matrices
First, recall that when we multiply by a matrix on the left, we’re essentially taking linear combinations of the rows of the matrix on the right. Consider the following example:
\(
\underbrace{\begin{bmatrix}
0 & 0 & 1\\
0 & 2 & 0\\
3 & 0 & 0\\
\end{bmatrix}}_{\text{left matrix}}
\underbrace{\begin{bmatrix}
1 & 2 & 3\\
4 & 5 & 6\\
7 & 8 & 9\\
\end{bmatrix}}_{\text{right matrix}}
=
\begin{bmatrix}
7 & 8 & 9\\
8 & 10 & 12\\
3 & 6 & 9\\
\end{bmatrix}
\)
Here, we take the third row of the right matrix once and place it in the first row of the resulting matrix. Then, we take the second row twice and place it in the second row of the new matrix. Finally, we take the first row three times and place it in the last row of the resulting matrix.
The general idea is that if a row \(i\) of the left matrix looks like \((x,0,0)\), it means you multiply the first row of the matrix on the right by \(x\) and place it in the same row \(i\) of the new matrix (so, the second row of the left matrix would place its result in the second row of the new matrix). If a row of the left matrix looks like \((x,y,0)\), you take the first and second rows of the right matrix, multiply them by \(x\) and \(y\) respectively, and then add them together. This sum is placed into the same row \(i\) of the new matrix.
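If you want to verify this row-combination view numerically, here’s a minimal NumPy sketch of the example above:

```python
import numpy as np

left = np.array([[0, 0, 1],
                 [0, 2, 0],
                 [3, 0, 0]])
right = np.array([[1, 2, 3],
                  [4, 5, 6],
                  [7, 8, 9]])

print(left @ right)   # [[7  8  9]   <- 1 * row 3 of the right matrix
                      #  [8 10 12]   <- 2 * row 2
                      #  [3  6  9]]  <- 3 * row 1
```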
Let’s say we are presented with the following matrix \(\mathbf{A}\):
\(
\begin{bmatrix}
1 & 2 & 3\\
2 & 6 & 8\\
4 & 10 & 20\\
\end{bmatrix}
\)
Now, if we want to eliminate the (2,1) entry below the first pivot of matrix \(\mathbf{A}\), we can use the following matrix \(\mathbf{E}_{21}\) to achieve this.
\(
\mathbf{E}_{21}\mathbf{A}
=
\begin{bmatrix}
1 & 0 & 0\\
-2 & 1 & 0\\
0 & 0 & 1\\
\end{bmatrix}
\begin{bmatrix}
1 & 2 & 3\\
2 & 6 & 8\\
4 & 10 & 20\\
\end{bmatrix}
=
\begin{bmatrix}
1 & 2 & 3\\
0 & 2 & 2\\
4 & 10 & 20\\
\end{bmatrix}
\)
Notice how we got a zero under the pivot? This makes sense because the elimination matrix subtracts two times row 1 from row 2, leaving the other rows unchanged. How did we get the \(-2\) in the (2,1) position? That’s the multiplier \(l_{21}\) we already know; the minus sign just means we subtract row 1 rather than add it.
Now you’ve got the perfect recipe for creating your own delicious elimination matrices. First, find the multiplier. Then, take the identity matrix and place the negated multiplier in the correct position. What’s that position? The same one you’re clearing: to eliminate the (2,1) entry under the first pivot, the negated multiplier goes into position (2,1) of the elimination matrix. Since we’re working downwards, it will always land in the lower triangular part of the identity matrix.
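As a sketch, here’s that recipe in NumPy; the helper name `elimination_matrix` is purely illustrative, and it assumes the pivot is nonzero:

```python
import numpy as np

def elimination_matrix(A, i, j):
    """Identity with the negated multiplier -A[i, j] / A[j, j] at (i, j).

    0-based indices: the (2,1) entry in the text is i=1, j=0 here.
    Assumes the pivot A[j, j] is nonzero.
    """
    E = np.eye(A.shape[0])
    E[i, j] = -A[i, j] / A[j, j]
    return E

A = np.array([[1.0, 2, 3],
              [2, 6, 8],
              [4, 10, 20]])
E21 = elimination_matrix(A, 1, 0)
print(E21 @ A)   # [[1 2 3], [0 2 2], [4 10 20]] -- the (2,1) entry is cleared
```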
To keep things simple, we create separate elimination matrices for each number we want to eliminate:
\(
\mathbf{E}_{31}\mathbf{E}_{21}\mathbf{A}
=
\begin{bmatrix}
1 & 0 & 0\\
0 & 1 & 0\\
-4 & 0 & 1\\
\end{bmatrix}
\begin{bmatrix}
1 & 0 & 0\\
-2 & 1 & 0\\
0 & 0 & 1\\
\end{bmatrix}
\begin{bmatrix}
1 & 2 & 3\\
2 & 6 & 8\\
4 & 10 & 20\\
\end{bmatrix}
=
\begin{bmatrix}
1 & 2 & 3\\
0 & 2 & 2\\
0 & 2 & 8\\
\end{bmatrix}
\)
We continue this process until we achieve our desired result: the upper triangular matrix \(\mathbf{U}\). Finally, we multiply all of these elimination matrices together into a single matrix, which we’ll simply call \(\mathbf{E}\) (without a subscript). This is the matrix that transforms \(\mathbf{A}\) into \(\mathbf{U}\):
\(
\begin{align}
\underbrace{\mathbf{E}_{32}\mathbf{E}_{31}\mathbf{E}_{21}}_{\mathbf{E}}\mathbf{A} &= \mathbf{U} \\
\mathbf{E}\mathbf{A} &= \mathbf{U}
\end{align}
\)
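Here’s the whole chain for our example, sketched in NumPy. The last multiplier is \(l_{32} = 2/2 = 1\), read off the intermediate result above:

```python
import numpy as np

A = np.array([[1.0, 2, 3],
              [2, 6, 8],
              [4, 10, 20]])

E21 = np.array([[1.0, 0, 0], [-2, 1, 0], [0, 0, 1]])
E31 = np.array([[1.0, 0, 0], [0, 1, 0], [-4, 0, 1]])
E32 = np.array([[1.0, 0, 0], [0, 1, 0], [0, -1, 1]])  # multiplier l_32 = 2/2 = 1

E = E32 @ E31 @ E21   # the combined elimination matrix
U = E @ A
print(U)   # [[1. 2. 3.]
           #  [0. 2. 2.]
           #  [0. 0. 6.]]
```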
Note
If you want to solve \(\mathbf{Ax} = \mathbf{b}\) while performing the triangulation, you also need to apply \(\mathbf{E}_{21}\) (and every later elimination matrix) to \(\mathbf{b}\), so:
\(
\mathbf{E}_{ij}\mathbf{A}\mathbf{x} = \mathbf{E}_{ij}\mathbf{b}
\)
You could also just apply it to the augmented matrix:
\(
\begin{bmatrix}
\mathbf{A} & \mathbf{b}
\end{bmatrix}
\)
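For example, a quick NumPy sketch with a made-up right-hand side \(\mathbf{b}\):

```python
import numpy as np

A = np.array([[1.0, 2, 3], [2, 6, 8], [4, 10, 20]])
b = np.array([[6.0], [16], [34]])   # arbitrary example values

E21 = np.array([[1.0, 0, 0], [-2, 1, 0], [0, 0, 1]])

# One multiplication updates A and b together.
print(E21 @ np.hstack([A, b]))   # last column is the transformed b
```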
A = LU
To create \(\mathbf{L}\), we need to know how to invert the individual \(\mathbf{E}\)’s. Fortunately, this is simple: keep everything else the same and just flip the sign of the multiplier \(l\). This makes sense: if the matrix subtracts a multiple of a row, the inverse operation is to add that multiple back.
\(
\mathbf{E}_{21} =
\begin{bmatrix}
1 & 0 & 0\\
-2 & 1 & 0\\
0 & 0 & 1\\
\end{bmatrix}
\)
So, we have:
\(
\mathbf{E}_{21}^{-1} = \begin{bmatrix}
1 & 0 & 0\\
2 & 1 & 0\\
0 & 0 & 1\\
\end{bmatrix}
\)
Now, if we take the inverse of \( \mathbf{E} \) in \( \mathbf{EA} = \mathbf{U} \), we get the following:
\(
\begin{align}
\mathbf{A} &= \mathbf{E^{-1}U} \\
\mathbf{A} &= (\mathbf{E}_{32}\mathbf{E}_{31}\mathbf{E}_{21})^{-1}\mathbf{U} \\
\mathbf{A} &= \underbrace{\mathbf{E}_{21}^{-1}\mathbf{E}_{31}^{-1}\mathbf{E}_{32}^{-1}}_{\mathbf{L}}\mathbf{U} \\
\mathbf{A} &= \mathbf{LU}
\end{align}
\)
Notice that we swapped the order of the \(\mathbf{E}\)’s after taking the inverse, because \( (\mathbf{ABC})^{-1} = \mathbf{C}^{-1} \mathbf{B}^{-1} \mathbf{A}^{-1}\). The beauty of the \(\mathbf{L}\) matrix is that the multipliers fit perfectly into place, something we didn’t have with the full \(\mathbf{E}\). For example, in position (2,1) of \(\mathbf{L}\), we have the multiplier \(l_{21}\). Also, note that \(\mathbf{L}\) is lower triangular, which is why we call it \(\mathbf{L}\). In lower triangular matrices, all entries above the diagonal are zeros.
\(
\mathbf{L} = \begin{bmatrix}
1 & 0 & 0\\
l_{21} & 1 & 0\\
l_{31} & l_{32} & 1\\
\end{bmatrix}
\)
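Continuing the running example in NumPy, here’s a sketch that builds \(\mathbf{L}\) from the inverted \(\mathbf{E}\)’s and checks that \(\mathbf{LU}\) gives back \(\mathbf{A}\):

```python
import numpy as np

# Inverses of the elimination matrices: same shape, sign flipped.
E21_inv = np.array([[1.0, 0, 0], [2, 1, 0], [0, 0, 1]])
E31_inv = np.array([[1.0, 0, 0], [0, 1, 0], [4, 0, 1]])
E32_inv = np.array([[1.0, 0, 0], [0, 1, 0], [0, 1, 1]])

L = E21_inv @ E31_inv @ E32_inv
print(L)   # [[1. 0. 0.]
           #  [2. 1. 0.]
           #  [4. 1. 1.]]  <- l_21, l_31, l_32 sit exactly in place

U = np.array([[1.0, 2, 3], [0, 2, 2], [0, 0, 6]])
print(L @ U)   # reproduces the original A: [[1 2 3], [2 6 8], [4 10 20]]
```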
PA = LU
Remember the issue when there was a zero in a pivot position, and we had to swap rows to solve it, if possible? In that case, \(\mathbf{A} = \mathbf{LU}\) would fail. This is where \(\mathbf{P}\) comes in. In this case, \(\mathbf{P}\) stands for permutation (it can also stand for projection). It does exactly what the name suggests: when you multiply a matrix by \(\mathbf{P}\) on the left, it swaps its rows. Just what we need:
\(
\begin{align}
\mathbf{PA} =
\begin{bmatrix}
0 & 1\\
1 & 0 \\
\end{bmatrix}
\begin{bmatrix}
0 & 7\\
1 & 3 \\
\end{bmatrix} & = \begin{bmatrix}
1 & 3\\
0 & 7 \\
\end{bmatrix}
\end{align}
\)
Notice that we swapped rows 1 and 2. To create your own permutation matrix, simply identify the rows you want to switch — for example, rows 1 and 3 — then perform that operation on the identity matrix. The resulting matrix will be your \(\mathbf{P}\).
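A minimal NumPy sketch of the 2×2 example above:

```python
import numpy as np

# Swapping rows 1 and 2 of the identity gives P (0-based indices below).
P = np.eye(2)[[1, 0]]

A = np.array([[0.0, 7],
              [1, 3]])
print(P @ A)   # [[1. 3.]
               #  [0. 7.]] -- the zero pivot is gone
```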
Why LU?
We create the matrix \(\mathbf{L}\) through Gaussian elimination. If we start with the augmented matrix \([\mathbf{A} \; | \; \mathbf{b}]\), or simply apply all row operations to both \(\mathbf{A}\) and \(\mathbf{b}\), then by the end of the elimination process, \(\mathbf{b}\) is already transformed and ready for back substitution.
So, what advantage does \(\mathbf{L}\) actually give us? The answer is: speed. In real-world scenarios, the matrix \(\mathbf{A}\) often stays the same while the right-hand side vector \(\mathbf{b}\) changes. Without LU decomposition, we would need to perform elimination from scratch every time \(\mathbf{b}\) changes. With LU, we only do Gaussian elimination once to compute \(\mathbf{L}\) and \(\mathbf{U}\). Then, for each new \(\mathbf{b}\), we solve two triangular systems: first \(\mathbf{Lc} = \mathbf{b}\) by forward substitution, then \(\mathbf{Ux} = \mathbf{c}\) by back substitution.
This is much faster: elimination is an \(\mathcal{O}(n^3)\) operation, while solving triangular systems is only \(\mathcal{O}(n^2)\).
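As an illustration, SciPy exposes exactly this workflow through `lu_factor` and `lu_solve` (the right-hand sides below are arbitrary):

```python
import numpy as np
from scipy.linalg import lu_factor, lu_solve

A = np.array([[1.0, 2, 3],
              [2, 6, 8],
              [4, 10, 20]])

# The O(n^3) factorization happens once.
lu, piv = lu_factor(A)

# Each new right-hand side costs only O(n^2): two triangular solves.
for b in (np.array([1.0, 0, 2]), np.array([6.0, 16, 34])):
    x = lu_solve((lu, piv), b)
    print(x, np.allclose(A @ x, b))   # True for each b
```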