Systems of Linear Equations#

import numpy as np
import scipy.linalg as la
import matplotlib.pyplot as plt

Terminology#

A system of linear equations is a collection of equations of the form

\[\begin{split} \begin{array}{ccccccccc} a_{0,0}x_0 & + & a_{0,1}x_1 & + & \cdots & + & a_{0,n-1} x_{n-1} & = & b_0 \\ a_{1,0}x_0 & + & a_{1,1}x_1 & + & \cdots & + & a_{1,n-1} x_{n-1} & = & b_1 \\ \vdots & & & & & & & & \vdots & \\ a_{m-1,0}x_0 & + & a_{m-1,1}x_1 & + & \cdots & + & a_{m-1,n-1} x_{n-1} & = & b_{m-1} \end{array} \end{split}\]

where the numbers \(a_{i,j}\) and \(b_i\) are known coefficients, and \(x_j\) are the variables. We can write the equation is matrix form as

\[ A \mathbf{x} = \mathbf{b} \]

where

\[\begin{split} A = \begin{bmatrix} a_{0,0} & a_{0,1} & \cdots & a_{0,n-1} \\ a_{1,0} & a_{1,1} & \cdots & a_{1,n-1} \\ \vdots & \vdots & \ddots & \vdots \\ a_{m-1,0} & a_{m-1,1} & \cdots & a_{m-1,n-1} \\ \end{bmatrix} \hspace{20mm} \mathbf{b} = \begin{bmatrix} b_0 \\ b_1 \\ \vdots \\ b_{m-1} \end{bmatrix} \hspace{20mm} \mathbf{x} = \begin{bmatrix} x_0 \\ x_1 \\ \vdots \\ x_{n-1} \end{bmatrix} \end{split}\]

Combine the matrix \(A\) and the vector \(\mathbf{b}\) to form the augmented matrix

\[\begin{split} [ A \ | \ \mathbf{b} ] = \begin{bmatrix} a_{0,0} & a_{0,1} & \cdots & a_{0,n-1} & b_0 \\ a_{1,0} & a_{1,1} & \cdots & a_{1,n-1} & b_1 \\ \vdots & \vdots & \ddots & \vdots & \vdots \\ a_{m-1,0} & a_{m-1,1} & \cdots & a_{m-1,n-1} & b_{m-1} \\ \end{bmatrix} \end{split}\]

Elementary Row Operations#

There are three types of row operations:

  1. Add a multiple of one row to another

  2. Multiply a row by a scalar

  3. Switch rows

Write a function called add_row which takes A, c, i and j and returns the matrix \(B\) obtained from \(A\) by adding \(c\) times row \(i\) to row \(j\) in \(A\).

def add_row(A,c,i,j):
    B = A.astype(np.float64).copy() # Create a copy of the input matrix
    B[j,:] = c*A[i,:] + A[j,:]
    return B
A = np.array([[1,2,3],[-2,2,2]])
print(A)
[[ 1  2  3]
 [-2  2  2]]
A1 = add_row(A,2,0,1)
print(A1)
[[1. 2. 3.]
 [0. 6. 8.]]

Write a function called scale_row which takes A, c and i and returns the matrix \(B\) obtained from \(A\) by multiplying \(c\) times row \(i\).

def scale_row(A,c,i):
    B = A.astype(np.float64).copy() # Create a copy of the input matrix
    B[i,:] = c*A[i,:]
    return B
A2 = scale_row(A1,1/6,1)
print(A2)
[[1.         2.         3.        ]
 [0.         1.         1.33333333]]

Write a function called switch_rows which takes A, i and j and returns the matrix \(B\) obtained from \(A\) by switching rows i and j.

def switch_rows(A,i,j):
    B = A.astype(np.float64).copy() # Create a copy of the input matrix
    B[i,:] = A[j,:]
    B[j,:] = A[i,:]
    return B
A = np.array([[0,-1],[1,0]])
print(A)
[[ 0 -1]
 [ 1  0]]
switch_rows(A,0,1)
array([[ 1.,  0.],
       [ 0., -1.]])

Row Echelon Form#

A pivot is the first nonzero entry of a row of a matrix. A matrix is in row echelon form if:

  1. All zero rows are at the bottom.

  2. Each pivot is to the right of the pivot in each row above.

  3. All entries are zero below each pivot.

A matrix is in reduced row echelon form if it is in row echelon form and:

  1. Each pivot is equal to 1.

  2. Each pivot is the only nonzero entry in its column.

Apply elementary row operations to transform the following matrix into row echelon form:

\[\begin{split} M = \left[ \begin{array}{rrrr} -2 & 1 & 0 & 1 \\ 1 & -2 & 1 & 1 \\ 0 & 1 & -2 & 1 \end{array} \right] \end{split}\]
M = np.array([[-2,1,0,1],[1,-2,1,1],[0,1,-2,1]])
print(M)
[[-2  1  0  1]
 [ 1 -2  1  1]
 [ 0  1 -2  1]]
M1 = add_row(M,1/2,0,1)
print(M1)
[[-2.   1.   0.   1. ]
 [ 0.  -1.5  1.   1.5]
 [ 0.   1.  -2.   1. ]]
M2 = add_row(M1,2/3,1,2)
print(M2)
[[-2.          1.          0.          1.        ]
 [ 0.         -1.5         1.          1.5       ]
 [ 0.          0.         -1.33333333  2.        ]]

Success! Now let’s take it a few steps farther and compute the reduced row echelon form:

M3 = scale_row(M2,-1/2,0)
print(M3)
[[ 1.         -0.5        -0.         -0.5       ]
 [ 0.         -1.5         1.          1.5       ]
 [ 0.          0.         -1.33333333  2.        ]]
M4 = scale_row(M3,-2/3,1)
print(M4)
[[ 1.         -0.5        -0.         -0.5       ]
 [-0.          1.         -0.66666667 -1.        ]
 [ 0.          0.         -1.33333333  2.        ]]
M5 = add_row(M4,1/2,1,0)
print(M5)
[[ 1.          0.         -0.33333333 -1.        ]
 [-0.          1.         -0.66666667 -1.        ]
 [ 0.          0.         -1.33333333  2.        ]]
M6 = scale_row(M5,-3/4,2)
print(M6)
[[ 1.          0.         -0.33333333 -1.        ]
 [-0.          1.         -0.66666667 -1.        ]
 [-0.         -0.          1.         -1.5       ]]
M7 = add_row(M6,2/3,2,1)
print(M7)
[[ 1.          0.         -0.33333333 -1.        ]
 [-0.          1.          0.         -2.        ]
 [-0.         -0.          1.         -1.5       ]]
M8 = add_row(M7,1/3,2,0)
print(M8)
[[ 1.   0.   0.  -1.5]
 [-0.   1.   0.  -2. ]
 [-0.  -0.   1.  -1.5]]

Success! In fact, we have cmputed the solution \(\mathbf{x}\) of the system \(A \mathbf{x} = \mathbf{b}\) where

\[\begin{split} A = \left[ \begin{array}{rrrr} -2 & 1 & 0 \\ 1 & -2 & 1 \\ 0 & 1 & -2 \end{array} \right] \hspace{10mm} \mathbf{b} = \begin{bmatrix} 1 \\ 1 \\ 1 \end{bmatrix} \hspace{10mm} \mathbf{x} = \begin{bmatrix} -1.5 \\ -2 \\ -1.5 \end{bmatrix} \end{split}\]

Computing Solutions with SciPy#

The function scipy.linalg.solve computes the unique solution of the system \(A \mathbf{x} = \mathbf{b}\) for a nonsingular matrix \(A\). See the SciPy documentation for more information. For example, let’s compute the solution of the example from the previous section.

A = np.array([[-2,1,0],[1,-2,1],[0,1,-2]])
print(A)
[[-2  1  0]
 [ 1 -2  1]
 [ 0  1 -2]]
b = np.array([1,1,1])
print(b)
[1 1 1]
x = la.solve(A,b)
print(x)
[-1.5 -2.  -1.5]

See also

Check out Mathematical Python > Linear Algebra for more about solving linear systems with Python.