KadoMin

Gaussian Elimination

You work at KadoMin, a company that specializes in creating advanced androids. Your task is to determine the amount of metal used by each type of android. You’ve been given data on the number of androids produced in each batch, as well as the total metal consumption for each batch.

KadoMin produces two types of androids: Type X, designed for cleaning and housekeeping, and Type Y, built for construction tasks. Below is the latest production report.

| Production Batch | Type X | Type Y | Total Metal Used (kg) |
|---|---|---|---|
| 1 | 1 | 2 | 40 |
| 2 | 2 | 5 | 90 |

Since you were the math whiz back in school, you instantly know what to do. You quickly translate the data into a system of linear equations:

\( \begin{align}
x + 2y &= 40 \\
2x + 5y &=90
\end{align}\)

Next, rearrange the second equation to isolate y on one side:

\( \begin{align*}
2x + 5y &= 90 \\
y &= \frac{90 - 2x}{5}
\end{align*}\)

After that, substitute the isolated y into the first equation:

\( \begin{align*}
x + \frac{180 - 4x}{5} &= 40 \\
5x + 180 - 4x &= 200 \\
x + 180 &= 200 \\
x &= 20
\end{align*}\)

Finally, substitute the x value back into the isolated y-equation to obtain:

\( \begin{align*}
y &= \frac{90 - 2 \cdot 20}{5} = \frac{90 - 40}{5} \\
y &= 10
\end{align*}\)

Each Type X android requires 20 kg of metal, while each Type Y uses 10 kg. With the results in hand, your job contract is secured for another month.

You solved this small system using substitution, but this isn’t how linear algebra or numerical linear algebra would typically approach the problem.

Notice how these equations resemble matrix-vector multiplication? We can take the coefficients of the equations and organize them into a matrix, which we’ll call \(\mathbf{A}\). The corresponding vector will contain the variables \(x\) and \(y\), and on the right-hand side, we’ll define another vector, \(\mathbf{b}\).

\(\underbrace{
\begin{bmatrix}
1 & 2\\
2 & 5\\
\end{bmatrix}
}_{\mathbf{A}} \underbrace{
\begin{bmatrix}
x \\
y \\
\end{bmatrix}
}_{\mathbf{x}} = \underbrace{
\begin{bmatrix}
40 \\
90 \\
\end{bmatrix}
}_{\mathbf{b}}
\)

\(\mathbf{Ax}=\mathbf{b}\)

Now, we’re left with the key equation in linear algebra: \(\mathbf{Ax}=\mathbf{b}\).
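
This is also exactly the form numerical libraries work with. As a quick sanity check, here is a minimal NumPy sketch; under the hood, `np.linalg.solve` performs an LU factorization, which is Gaussian elimination in matrix form:

```python
import numpy as np

# Coefficient matrix and right-hand side from the production report.
A = np.array([[1.0, 2.0],
              [2.0, 5.0]])
b = np.array([40.0, 90.0])

# np.linalg.solve factors A (LU with partial pivoting) and solves Ax = b.
x = np.linalg.solve(A, b)
print(x)  # [20. 10.] -> 20 kg per Type X android, 10 kg per Type Y
```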

Remember, these equations are linear: each one is a sum of coefficients multiplying unknown variables. In such systems, you won’t encounter terms like \(x \cdot y\), \(x^2\), \(e^x\), \(\sqrt{x}\), and so on.

Now, how does linear algebra help solve these equations? Let’s begin with the first step:

Performing Elimination

Our goal is to transform the matrix \(\mathbf{A}\) into an upper triangular form, which makes solving the system easy through backward substitution. But first, what exactly is an upper triangular matrix?

Picture a square matrix with some values. Now, set all the values below the diagonal to zero. What you’re left with is an upper triangular matrix.

\(
\begin{bmatrix}
1 & 2 & 3\\
0 & 2 & 2\\
0 & 0 & 6\\
\end{bmatrix}
\)
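
To see why this shape is so convenient, here is a minimal back-substitution sketch in Python. The matrix is the one above; the right-hand side \([6, 4, 6]\) is a made-up example vector chosen for illustration:

```python
import numpy as np

def back_substitution(U, c):
    """Solve Ux = c for upper triangular U, starting from the bottom row."""
    n = len(c)
    x = np.zeros(n)
    for i in range(n - 1, -1, -1):
        # Subtract the contributions of the already-solved unknowns,
        # then divide by the pivot on the diagonal.
        x[i] = (c[i] - U[i, i + 1:] @ x[i + 1:]) / U[i, i]
    return x

U = np.array([[1.0, 2.0, 3.0],
              [0.0, 2.0, 2.0],
              [0.0, 0.0, 6.0]])
c = np.array([6.0, 4.0, 6.0])
print(back_substitution(U, c))  # [1. 1. 1.]
```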

Now, how do we transform \(\mathbf{A}\) into that state? Easy.
Look at the number in the top-left corner of the matrix, \(a_{11}\). (Here, it’s 1.) This will be our starting point, called the pivot. It cannot be zero!

\(
\begin{bmatrix}
\color{red}{1} & 2 & 3\\
2 & 6 & 8\\
4 & 10 & 20\\
\end{bmatrix}
\)

Next, we look at all the numbers below it (that is, all the entries in the same column from \(a_{21}\) downward). Our goal is to zero out these entries in the matrix.

To do this, we create a multiplier \(l_{21}\) (so named because we are eliminating the entry in the second row, first column), defined as the ratio of the entry below the pivot to the pivot itself:

\(l_{21} = \frac{a_{21}}{a_{11}}\)

Simply put, this tells us how much of row 1 (equation 1) we need to subtract from row 2 (equation 2) to get zero in the right position. We always subtract.

\(\text{row 2} = \text{row 2} - l_{21} \cdot \text{row 1}\)

In this case, the ratio is \(l_{21} = \frac{2}{1} = 2\). So we subtract 2 times row 1 from row 2. The result becomes our new row 2. (Remember to multiply the entire row by the multiplier, not just the pivot!) Notice that the first entry in this row is now zero, just as planned.
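
The same step takes only a couple of lines in NumPy (a sketch using the example matrix above):

```python
import numpy as np

A = np.array([[1.0, 2.0, 3.0],
              [2.0, 6.0, 8.0],
              [4.0, 10.0, 20.0]])

# The multiplier is the entry below the pivot divided by the pivot.
l21 = A[1, 0] / A[0, 0]    # 2 / 1 = 2
A[1] = A[1] - l21 * A[0]   # new row 2 = row 2 - 2 * row 1
print(A[1])                # [0. 2. 2.]
```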

\(
\begin{bmatrix}
\color{red}{1} & 2 & 3\\
0 & 2 & 2\\
4 & 10 & 20\\
\end{bmatrix}
\)

We repeat this process for every row below the pivot, zeroing out each entry under it. Here, \(l_{31} = \frac{4}{1} = 4\), so we subtract 4 times row 1 from row 3. The result becomes our new row 3:

\(
\begin{bmatrix}
\color{red}{1} & 2 & 3\\
0 & 2 & 2\\
0 & 2 & 8\\
\end{bmatrix}
\)

Next, we move to the next pivot by going diagonally down: one step to the right and one step down. Our new pivot is \(a_{22}\). Again, it cannot be zero! We then repeat the same process as before, zeroing out every entry below this pivot.

\(
\begin{bmatrix}
1 & 2 & 3\\
0 & \color{red}{2} & 2\\
0 & 2 & 8\\
\end{bmatrix}
\)

\(l_{32} = \frac{2}{2} = 1\), so we subtract 1 times row 2 from row 3:

\(
\begin{bmatrix}
1 & 2 & 3\\
0 & \color{red}{2} & 2\\
0 & 0 & 6\\
\end{bmatrix}
\)

We switch our pivot one final time, and since there are no more rows below it, we’re done. We have transformed the input matrix \(\mathbf{A}\) into an upper triangular matrix \(\mathbf{U}\).

\(\underbrace{
\begin{bmatrix}
1 & 2 & 3\\
0 & 2 & 2\\
0 & 0 & \color{red}{6}\\
\end{bmatrix}}_{\mathbf{U}}
\)
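
Collecting all of these steps into a loop gives a minimal sketch of the full forward pass. It simply raises an error on a zero pivot; row exchanges are covered in the next section:

```python
import numpy as np

def forward_elimination(A):
    """Reduce a square matrix to upper triangular form (no row swaps)."""
    U = A.astype(float)               # work on a float copy, leaving A untouched
    n = U.shape[0]
    for k in range(n - 1):            # current pivot column
        if U[k, k] == 0:
            raise ZeroDivisionError("zero pivot; see the next section")
        for i in range(k + 1, n):     # every row below the pivot
            l = U[i, k] / U[k, k]     # multiplier l_ik
            U[i, k:] -= l * U[k, k:]  # row i = row i - l * row k
    return U

A = np.array([[1, 2, 3],
              [2, 6, 8],
              [4, 10, 20]])
print(forward_elimination(A))
# [[1. 2. 3.]
#  [0. 2. 2.]
#  [0. 0. 6.]]
```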

What to Do When the Pivot Position is Zero?

What happens when there’s a zero in a position where we expect a pivot? Let’s take a look at this simple example:

\(
\begin{bmatrix}
1 & 2\\
2 & 4\\
\end{bmatrix}
\)

Applying the steps explained above leaves us with the following:

\(
\begin{bmatrix}
1 & 2\\
0 & 0\\
\end{bmatrix}
\)

This is a problem, because we cannot have a zero as a pivot. When this happens, it often means that \(\mathbf{A}\) is singular, and the elimination process will reveal that fact by producing a zero row.

It’s also easy to see that the matrix is singular in this case, since the second column is a multiple of the first. Depending on the right-hand side vector \(\mathbf{b}\), the system \(\mathbf{Ax} = \mathbf{b}\) will either have no solution or infinitely many solutions.

Specifically, if \(\mathbf{b}\) lies in the span of the columns of \(\mathbf{A}\), then there will be infinitely many solutions. If not, there will be no solution, because no combination of the columns of \(\mathbf{A}\) can produce \(\mathbf{b}\).

With this in mind, we can identify whether a square matrix is singular: if an \(n \times n\) matrix has \(n\) pivots, it is invertible; if it has fewer than \(n\) pivots, it is singular.
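
A short sketch makes this concrete: one elimination step wipes out the entire second row of the singular example above, and NumPy’s `matrix_rank` confirms there is only one pivot (rank less than \(n\) means singular):

```python
import numpy as np

A = np.array([[1.0, 2.0],
              [2.0, 4.0]])

# One elimination step zeroes out the whole second row.
A[1] -= (A[1, 0] / A[0, 0]) * A[0]
print(A)  # [[1. 2.]
          #  [0. 0.]]  -> only one pivot

# The rank counts the pivots: rank 1 < n = 2, so the matrix is singular.
print(np.linalg.matrix_rank(np.array([[1.0, 2.0], [2.0, 4.0]])))  # 1
```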

However, there is another case where we encounter a zero in a position that should contain a pivot, but where we can still fix the issue. Let’s take a look at this example:

\(
\begin{bmatrix}
0 & 7\\
1 & 3\\
\end{bmatrix}
\)

We immediately encounter a zero, but we can’t simply declare this matrix singular and move on, because it isn’t. We can fix this by exchanging rows, which gives us the following:

\(
\begin{bmatrix}
1 & 3\\
0 & 7\\
\end{bmatrix}
\)

Problem solved. So, whenever we encounter a zero, we can check whether there’s a non-zero entry below the pivot position and swap rows accordingly.

Remember, we can only look downwards, not upwards. Once we’ve eliminated all the entries below a pivot, we move on to the second pivot in the \((n-1) \times (n-1)\) submatrix. In simpler terms, once we’ve completed operations on the first row and pivot, we can’t change it anymore; we continue down the matrix.
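
As a closing sketch, here is the forward pass extended with exactly this downward search and row exchange:

```python
import numpy as np

def eliminate_with_swaps(A):
    """Forward elimination that swaps in a row from below whenever
    the pivot position holds a zero."""
    U = A.astype(float)
    n = U.shape[0]
    for k in range(n - 1):
        if U[k, k] == 0:
            # Look strictly downward (never upward) for a nonzero entry.
            candidates = np.flatnonzero(U[k + 1:, k])
            if candidates.size == 0:
                raise ValueError("no usable pivot: the matrix is singular")
            swap = k + 1 + candidates[0]
            U[[k, swap]] = U[[swap, k]]  # exchange the two rows
        for i in range(k + 1, n):
            U[i, k:] -= (U[i, k] / U[k, k]) * U[k, k:]
    return U

A = np.array([[0.0, 7.0],
              [1.0, 3.0]])
print(eliminate_with_swaps(A))
# [[1. 3.]
#  [0. 7.]]
```

In practice, numerical libraries swap even when the pivot is merely small rather than exactly zero: they pick the largest entry below the pivot (partial pivoting) to keep round-off error under control.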