Simplex Method#

import numpy as np
import matplotlib.pyplot as plt
import scipy.linalg as la

Standard Form#

The standard form of a linear optimization problem is:

\[\begin{split} \begin{array}{rc} \text{maximize:} & \mathbf{c}^T \mathbf{x} \\ \text{subject to:} & A \mathbf{x} \leq \mathbf{b} \\ & \mathbf{x} \geq 0 \end{array} \end{split}\]

where:

  • \(\mathbf{x} \in \mathbb{R}^n\) is the vector of decision variables \(x_0,\dots,x_{n-1}\)

  • \(\mathbf{c} \in \mathbb{R}^n\) defines the objective function \(f(\mathbf{x}) = \mathbf{c}^T \mathbf{x}\)

  • \(A\) is an \(m \times n\) matrix defining the coefficients of the linear constraint inequalities

  • \(\mathbf{b} \in \mathbb{R}^m\) defines the right side of the linear constraint inequalities

Note: vector inequality \(\mathbf{u} \leq \mathbf{v}\) means \(u_i \leq v_i\) for each \(i\).

The matrix equations above can be written explicitly as:

\[\begin{split} \begin{array}{rc} \text{maximize:} & c_0x_0 + \cdots + c_{n-1}x_{n-1} \\ \text{subject to:} & a_{0,0}x_0 + \cdots + a_{0,n-1}x_{n-1} \leq b_0 \\ & a_{1,0}x_0 + \cdots + a_{1,n-1}x_{n-1} \leq b_1 \\ & \vdots \\ & a_{m-1,0}x_0 + \cdots + a_{m-1,n-1}x_{n-1} \leq b_{m-1} \\ & x_0 \geq 0, \dots , x_{n-1} \geq 0 \end{array} \end{split}\]

Note: we use 0-indexing, therefore vectors in \(\mathbb{R}^n\) begin at index 0

\[\begin{split} \mathbf{x} = \begin{bmatrix} x_0 \\ x_1 \\ \vdots \\ x_{n-1} \end{bmatrix} \end{split}\]

Feasible Solutions#

A vector \(\mathbf{x} \in \mathbb{R}^n\) is a feasible solution if it satisfies the constraints: \(A \mathbf{x} \leq \mathbf{b}\) and \(\mathbf{x} \geq 0\). A feasible solution \(\mathbf{x} \in \mathbb{R}^n\) is optimal if \(\mathbf{c}^T \mathbf{y} \leq \mathbf{c}^T \mathbf{x}\) for all other feasible solutions \(\mathbf{y} \in \mathbb{R}^n\). A linear optimization problem is infeasible if there are no feasible solutions. A linear optimization problem is unbounded if the objective function takes arbitrarily large values on the set of feasible solutions.
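Checking feasibility is a direct translation of these inequalities. Here is a minimal sketch (the helper name is_feasible is our own, introduced for illustration):

def is_feasible(A,b,x):
    "Return True if Ax <= b and x >= 0 (componentwise)."
    return np.all(A @ x <= b) and np.all(x >= 0)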

A set \(S \subset \mathbb{R}^n\) is convex if for all \(\mathbf{x},\mathbf{y} \in S\) the line segment connecting \(\mathbf{x}\) and \(\mathbf{y}\) is also contained in \(S\). The set of feasible solutions \(S = \{ \mathbf{x} \in \mathbb{R}^n : A \mathbf{x} \leq \mathbf{b} \ , \ \mathbf{x} \geq 0 \}\) is a convex set. A polytope in \(\mathbb{R}^n\) is a bounded set whose boundary consists of hyperplanes (of dimension \(n-1\)). A vertex of a polytope in \(\mathbb{R}^n\) is a point on the boundary which lies in the intersection of \(n\) hyperplanes (of dimension \(n-1\)) on the boundary.

What’s the point?! The set of feasible solutions of a bounded linear optimization problem is a convex polytope and an optimal solution exists at a vertex.

Slack Variables#

Introduce slack variables \(w_0,\dots,w_{m-1}\) for each inequality constraint:

\[\begin{split} \begin{array}{rc} \text{maximize:} & c_0x_0 + \cdots + c_{n-1}x_{n-1} \\ \text{subject to:} & a_{0,0}x_0 + \cdots + a_{0,n-1}x_{n-1} + w_0 = b_0 \\ & a_{1,0}x_0 + \cdots + a_{1,n-1}x_{n-1} + w_1 = b_1 \\ & \vdots \\ & a_{m-1,0}x_0 + \cdots + a_{m-1,n-1}x_{n-1} + w_{m-1} = b_{m-1} \\ & x_0 \geq 0, \dots , x_{n-1} \geq 0, w_0 \geq 0, \dots , w_{m-1} \geq 0 \end{array} \end{split}\]

The equality constraints can be written in matrix notation:

\[ A \mathbf{x} + I \mathbf{w} = \mathbf{b} \]

Equivalently, we can write the equality constraints as matrix multiplication:

\[\begin{split} [ \ A \ I \ ] \begin{bmatrix} \mathbf{x} \\ \mathbf{w} \end{bmatrix} = \mathbf{b} \end{split}\]

Let’s make some observations:

  • The equation \(x_j = 0\) defines a hyperplane in \(\mathbb{R}^n\) for each \(j\)

  • The equation \(w_i = 0\) defines the hyperplane \(\langle \mathbf{a}_i, \mathbf{x} \rangle = b_i\) in \(\mathbb{R}^n\) for each \(i\) (where \(\mathbf{a}_i\) is the \(i\)th row of \(A\))

  • The slack variable \(w_i\) is the “distance” to the constraint \(\langle \mathbf{a}_i, \mathbf{x} \rangle = b_i\)

  • The intersection of \(n\) hyperplanes in \(\mathbb{R}^n\) defines a vertex on the boundary of the convex polytope of feasible solutions (or the intersection is empty or outside the feasible set)
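For example, the slack vector \(\mathbf{w} = \mathbf{b} - A\mathbf{x}\) can be computed directly (here with the example data that appears below); a zero entry would mean the corresponding constraint is active at \(\mathbf{x}\):

A = np.array([[2.,1.,1.,3.],[1.,3.,1.,2.]])
b = np.array([5.,3.])
x = np.array([1.,0.,1.,0.])
print(b - A @ x)
[2. 1.]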

Basic Variables#

An optimal solution exists at a vertex therefore we want to systematically identify vertices:

  • Choose any \(m\) variables from the list \(x_0,\dots,x_{n-1},w_0,\dots,w_{m-1}\) and call them basic variables

  • The remaining \(n\) variables are called nonbasic variables

  • Set all nonbasic variables to 0 and solve for nonnegative values of the basic variables (if possible)

  • The result \(\mathbf{x} = (x_0,\dots,x_{n-1})^T\) is a vertex of the convex polytope of feasible solutions
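This procedure can be automated by brute force: try every choice of \(m\) basic variables, solve the resulting \(m \times m\) system, and keep the nonnegative solutions. A minimal sketch (the function vertices is our own helper; it is exponential in \(n+m\) and only for illustration):

from itertools import combinations

def vertices(A,b):
    "Enumerate feasible vertices by solving [A I] restricted to each choice of m basic columns."
    m,n = A.shape
    M = np.hstack([A,np.eye(m)])
    V = []
    for basis in combinations(range(n+m),m):
        B = M[:,list(basis)]
        if np.linalg.matrix_rank(B) < m:
            continue                        # basic columns must be linearly independent
        xw = np.zeros(n+m)
        xw[list(basis)] = np.linalg.solve(B,b)
        if np.all(xw >= -1e-12):            # keep only nonnegative (feasible) solutions
            V.append(xw[:n])
    return V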

Tableau Matrix#

Construct the initial tableau matrix:

\[\begin{split} T = \begin{bmatrix} A & I & \mathbf{b} \\ \mathbf{c}^T & 0 & 0 \end{bmatrix} \end{split}\]

This definition is different from other sources. The vector \(\mathbf{c}^T\) is in the bottom row so that the row indices of \(T\) match the indices of the slack variables. For example, row 0 of \(T\) corresponds to \(w_0\), row 1 corresponds to \(w_1\), etc.

def tableau(A,b,c):
    "Construct the initial tableau T = [[A I b],[c^T 0 0]]."
    m,n = A.shape
    I = np.eye(m)   # identity block: columns for the slack variables
    T = np.vstack([ np.hstack([A,I,b.reshape((m,1))]) , np.hstack([c,np.zeros(m+1)]) ])
    return T

The basic variables correspond to the columns of \(T\) which are columns of the identity matrix (in the first \(m\) rows). In the initial tableau, the slack variables \(w_0,\dots,w_{m-1}\) are the basic variables and the decision variables \(x_0,\dots,x_{n-1}\) are the nonbasic variables. The corresponding vertex is \(\mathbf{x}=0\).
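Since the basic variables correspond to identity columns, we can read a vertex directly off any tableau. A small helper (our own, with a tolerance for floating point residue):

def vertex(T,n):
    "Read off x_0,...,x_{n-1}: identity columns are basic, all other variables are 0."
    m = T.shape[0] - 1
    x = np.zeros(n)
    for j in range(n):
        col = T[:m,j]
        if np.isclose(col,1).sum() == 1 and np.isclose(col,0).sum() == m - 1:
            x[j] = T[np.argmax(np.isclose(col,1)),-1]   # value is the last-column entry in that row
    return x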

Let’s do an example with the following \(A\), \(\mathbf{b}\) and \(\mathbf{c}\):

\[\begin{split} A = \begin{bmatrix} 2 & 1 & 1 & 3 \\ 1 & 3 & 1 & 2 \end{bmatrix} \hspace{1in} \mathbf{b} = \begin{bmatrix} 5 \\ 3 \end{bmatrix} \hspace{1in} \mathbf{c} = \begin{bmatrix} 6 \\ 8 \\ 5 \\ 9 \end{bmatrix} \end{split}\]
A = np.array([[2.,1.,1.,3.],[1.,3.,1.,2.]])
c = np.array([6.,8.,5.,9.])
b = np.array([5.,3.])
T = tableau(A,b,c)
print(T)
[[2. 1. 1. 3. 1. 0. 5.]
 [1. 3. 1. 2. 0. 1. 3.]
 [6. 8. 5. 9. 0. 0. 0.]]

Pivot Operation#

We want to move to an adjacent vertex by selecting a new basic variable (the entering variable) and removing an existing basic variable (the leaving variable).

We choose the entering and leaving variables such that:

  • The value of the objective function increases.

  • The new list of basic variables defines a vertex. In other words, the values of the basic variables (with nonbasic variables set to 0) are all nonnegative.

How do we do this? Choose the entering variable such that it has a positive coefficient in the objective function. That is, choose a column with a positive value in the last row. Increasing the value of this variable will increase the value of the objective function.

Choose the leaving variable such that the values of the basic variables (with nonbasic variables set to 0) remain nonnegative. To do this, let \(k\) be the column index of the entering variable. Look at the ratios

\[ \frac{t_{i,n+m}}{t_{i,k}} \ , \ i = 0,\dots,m-1 \]

where \(t_{i,n+m}\) are the entries in the last column of \(T\), and \(t_{i,k}\) are the entries in the \(k\)th column of \(T\). Select row \(i\) such that this ratio is positive and minimal.

Let \(k\) be the column index of the entering variable. Let \(\ell\) be the row index of the leaving variable. The pivot operation applies row operations to the tableau \(T\) such that column \(k\) has 1 in row \(\ell\) and all other entries 0. The last column of the tableau should always have nonnegative entries in rows \(0,\dots,m-1\).
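These rules translate directly into code. Here is a sketch of one common choice: enter on the largest positive coefficient in the bottom row, and leave by the minimum ratio test over rows with \(t_{i,k} > 0\). The helper names entering and leaving are our own:

def entering(T):
    "Column index of the largest positive entry in the bottom row, or None if optimal."
    row = T[-1,:-1]
    k = np.argmax(row)
    return k if row[k] > 1e-9 else None

def leaving(T,k):
    "Row index of the minimal ratio T[i,-1]/T[i,k] over rows with T[i,k] > 0, or None if unbounded."
    m = T.shape[0] - 1
    ratios = np.full(m,np.inf)
    for i in range(m):
        if T[i,k] > 1e-9:
            ratios[i] = T[i,-1]/T[i,k]
    l = np.argmin(ratios)
    return l if np.isfinite(ratios[l]) else None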

Simplex Method#

Let \(A\) be an \(m \times n\) matrix, \(\mathbf{b} \in \mathbb{R}^m\) and \(\mathbf{c} \in \mathbb{R}^n\). Consider the linear optimization problem: maximize \(\mathbf{c}^T \mathbf{x}\) subject to \(A \mathbf{x} \leq \mathbf{b}\), \(\mathbf{x} \geq 0\).

The simplex algorithm is:

  • Phase I: Find any vertex

    • If \(\mathbf{b} \geq 0\) then choose \(\mathbf{x} = 0\).

    • Otherwise, solve the initialization problem.

  • Phase II: Move from vertex to vertex to increase the value of objective function

    • Identify entering and leaving variables.

    • Perform pivot operation.

    • Repeat!

Implementation#

Write a function called pivot which takes a tableau matrix \(T\) and indices k and l. The function performs the pivot operation on \(T\) and returns the new tableau matrix.

def pivot(T,k,l):
    "Row reduce T so that column k has 1 in row l and 0 in all other entries."
    E = np.eye(T.shape[0])     # elementary matrix encoding the row operations
    E[:,l] = -T[:,k]/T[l,k]    # subtract the right multiple of row l from every other row
    E[l,l] = 1/T[l,k]          # scale row l by the pivot entry
    return E@T

Let’s do an example with the following \(A\), \(\mathbf{b}\) and \(\mathbf{c}\):

\[\begin{split} A = \begin{bmatrix} 2 & 1 & 1 & 3 \\ 1 & 3 & 1 & 2 \end{bmatrix} \hspace{1in} \mathbf{b} = \begin{bmatrix} 5 \\ 3 \end{bmatrix} \hspace{1in} \mathbf{c} = \begin{bmatrix} 6 \\ 8 \\ 5 \\ 9 \end{bmatrix} \end{split}\]
A = np.array([[2.,1.,1.,3.],[1.,3.,1.,2.]])
c = np.array([6.,8.,5.,9.])
b = np.array([5.,3.])
T = tableau(A,b,c)
print(T)
[[2. 1. 1. 3. 1. 0. 5.]
 [1. 3. 1. 2. 0. 1. 3.]
 [6. 8. 5. 9. 0. 0. 0.]]

We see that column \(k=3\) has the largest positive entry in the last row. Look at the ratios \(\frac{t_{0,6}}{t_{0,3}}=5/3\) and \(\frac{t_{1,6}}{t_{1,3}}=3/2\). Since the ratio for \(i=1\) is positive and minimal we choose \(\ell=1\).

T1 = pivot(T,3,1)
print(T1)
[[  0.5  -3.5  -0.5   0.    1.   -1.5   0.5]
 [  0.5   1.5   0.5   1.    0.    0.5   1.5]
 [  1.5  -5.5   0.5   0.    0.   -4.5 -13.5]]

Repeat!

T2 = pivot(T1,0,0)
print(T2)
[[  1.  -7.  -1.   0.   2.  -3.   1.]
 [  0.   5.   1.   1.  -1.   2.   1.]
 [  0.   5.   2.   0.  -3.   0. -15.]]
T3 = pivot(T2,1,1)
print(T3)
[[ 1.0000000e+00 -4.4408921e-16  4.0000000e-01  1.4000000e+00
   6.0000000e-01 -2.0000000e-01  2.4000000e+00]
 [ 0.0000000e+00  1.0000000e+00  2.0000000e-01  2.0000000e-01
  -2.0000000e-01  4.0000000e-01  2.0000000e-01]
 [ 0.0000000e+00  0.0000000e+00  1.0000000e+00 -1.0000000e+00
  -2.0000000e+00 -2.0000000e+00 -1.6000000e+01]]
T4 = pivot(T3,2,1)
print(T4)
[[ 1.00000000e+00 -2.00000000e+00 -2.22044605e-17  1.00000000e+00
   1.00000000e+00 -1.00000000e+00  2.00000000e+00]
 [ 0.00000000e+00  5.00000000e+00  1.00000000e+00  1.00000000e+00
  -1.00000000e+00  2.00000000e+00  1.00000000e+00]
 [ 0.00000000e+00 -5.00000000e+00  0.00000000e+00 -2.00000000e+00
  -1.00000000e+00 -4.00000000e+00 -1.70000000e+01]]

No entry in the bottom row is positive, therefore we have arrived! The optimal value is \(17\) and occurs at \(\mathbf{x} = (2,0,1,0)\).
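The whole Phase II loop is now a few lines: pivot until the bottom row has no positive entries. A sketch assuming \(\mathbf{b} \geq 0\) and using the entering and leaving helpers above (the largest-coefficient rule can cycle on degenerate problems, so we cap the number of iterations):

def simplex(T,max_iter=100):
    "Phase II: pivot until no entry in the bottom row is positive."
    for _ in range(max_iter):
        k = entering(T)
        if k is None:
            return T                      # optimal tableau
        l = leaving(T,k)
        if l is None:
            raise ValueError("Problem is unbounded.")
        T = pivot(T,k,l)
    raise RuntimeError("Maximum number of iterations reached.")

Running simplex(tableau(A,b,c)) on this example should reproduce the final tableau T4 above, since the selection rules make the same pivot choices we made by hand.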

Let’s try another example:

\[\begin{split} A = \left[ \begin{array}{rrr} 0 & 2 & 3 \\ 1 & 1 & 2 \\ 1 & 2 & 3 \\ \end{array} \right] \hspace{1in} \mathbf{b} = \left[ \begin{array}{r} 5 \\ 4 \\ 7 \end{array} \right] \hspace{1in} \mathbf{c} = \left[ \begin{array}{r} 2 \\ 3 \\ 4 \end{array} \right] \end{split}\]
A = np.array([[0,2,3],[1,1,2],[1,2,3]])
b = np.array([5,4,7])
c = np.array([2,3,4])
T = tableau(A,b,c)
T
array([[0., 2., 3., 1., 0., 0., 5.],
       [1., 1., 2., 0., 1., 0., 4.],
       [1., 2., 3., 0., 0., 1., 7.],
       [2., 3., 4., 0., 0., 0., 0.]])
T1 = pivot(T,2,0)
T1.round(2)
array([[ 0.  ,  0.67,  1.  ,  0.33,  0.  ,  0.  ,  1.67],
       [ 1.  , -0.33,  0.  , -0.67,  1.  ,  0.  ,  0.67],
       [ 1.  ,  0.  ,  0.  , -1.  ,  0.  ,  1.  ,  2.  ],
       [ 2.  ,  0.33,  0.  , -1.33,  0.  ,  0.  , -6.67]])
T2 = pivot(T1,0,1)
T2.round(2)
array([[ 0.  ,  0.67,  1.  ,  0.33,  0.  ,  0.  ,  1.67],
       [ 1.  , -0.33,  0.  , -0.67,  1.  ,  0.  ,  0.67],
       [ 0.  ,  0.33,  0.  , -0.33, -1.  ,  1.  ,  1.33],
       [ 0.  ,  1.  ,  0.  ,  0.  , -2.  ,  0.  , -8.  ]])
T3 = pivot(T2,1,0)
T3.round(2)
array([[  0. ,   1. ,   1.5,   0.5,   0. ,   0. ,   2.5],
       [  1. ,   0. ,   0.5,  -0.5,   1. ,   0. ,   1.5],
       [  0. ,   0. ,  -0.5,  -0.5,  -1. ,   1. ,   0.5],
       [  0. ,   0. ,  -1.5,  -0.5,  -2. ,   0. , -10.5]])

Optimal value \(10.5\) occurs at \(\mathbf{x} = (1.5,2.5,0.0)\).
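We can cross-check with scipy.optimize.linprog, which minimizes by default, so we negate \(\mathbf{c}\) (the default bounds are already \(\mathbf{x} \geq 0\)):

from scipy.optimize import linprog

res = linprog(-c,A_ub=A,b_ub=b)
print(res.x,-res.fun)

This should report the same optimal value \(10.5\) at \(\mathbf{x} = (1.5,2.5,0)\).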

Initialization#

If \(\mathbf{b}\) has negative entries then \(\mathbf{x} = 0\) is not a feasible solution. The initialization process subtracts a new variable \(y\) from the left side of each constraint to define the auxiliary problem:

\[\begin{split} \begin{array}{rc} \text{maximize:} & -y \\ \text{subject to:} & a_{0,0}x_0 + \cdots + a_{0,n-1}x_{n-1} - y \leq b_0 \\ & a_{1,0}x_0 + \cdots + a_{1,n-1}x_{n-1} - y \leq b_1 \\ & \vdots \\ & a_{m-1,0}x_0 + \cdots + a_{m-1,n-1}x_{n-1} - y \leq b_{m-1} \\ & y \geq 0, x_0 \geq 0, \dots , x_{n-1} \geq 0, w_0 \geq 0, \dots , w_{m-1} \geq 0 \end{array} \end{split}\]

You can think of \(y\) as moving each constraint towards the origin. The solution of the auxiliary problem finds a vertex “nearest” to the origin. When the auxiliary problem is solved, drop variable \(y\) and the result is a vertex of the original problem. Proceed to Phase II.
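Constructing the auxiliary tableau is mechanical: append a column of \(-1\)'s to \(A\) for \(y\) and use the objective \(-y\). A small helper (the name auxiliary is our own; the example below builds the same tableau explicitly):

def auxiliary(A,b):
    "Tableau of the auxiliary problem: maximize -y subject to Ax - y <= b, x >= 0, y >= 0."
    m,n = A.shape
    Ay = np.hstack([A,-np.ones((m,1))])   # extra column of -1's for y
    cy = np.zeros(n+1)
    cy[-1] = -1.                          # objective is -y
    return tableau(Ay,b,cy)

The first pivot then brings \(y\) into the basis using the row of the most negative entry of \(\mathbf{b}\), which makes the last column nonnegative.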

Let’s try the following example:

\[\begin{split} A = \left[ \begin{array}{rrr} -1 & -1 & -1 \\ 2 & -1 & 1 \end{array} \right] \hspace{1in} \mathbf{b} = \left[ \begin{array}{r} -2 \\ 1 \end{array} \right] \hspace{1in} \mathbf{c} = \left[ \begin{array}{r} 2 \\ -6 \\ 0 \end{array} \right] \end{split}\]

Append a column to \(A\) for the variable \(y\) and construct the tableau.

Ax = np.array([[-1,-1,-1,-1],[2,-1,1,-1]])
bx = np.array([-2,1])
cx = np.array([0,0,0,-1])
Tx = tableau(Ax,bx,cx)
print(Tx)
[[-1. -1. -1. -1.  1.  0. -2.]
 [ 2. -1.  1. -1.  0.  1.  1.]
 [ 0.  0.  0. -1.  0.  0.  0.]]

Apply pivot operations to solve:

Tx1 = pivot(Tx,3,0)
Tx1
array([[ 1.,  1.,  1.,  1., -1.,  0.,  2.],
       [ 3.,  0.,  2.,  0., -1.,  1.,  3.],
       [ 1.,  1.,  1.,  0., -1.,  0.,  2.]])
Tx2 = pivot(Tx1,0,1)
Tx2.round(2)
array([[ 0.  ,  1.  ,  0.33,  1.  , -0.67, -0.33,  1.  ],
       [ 1.  ,  0.  ,  0.67,  0.  , -0.33,  0.33,  1.  ],
       [ 0.  ,  1.  ,  0.33,  0.  , -0.67, -0.33,  1.  ]])
Tx3 = pivot(Tx2,1,0)
Tx3.round(2)
array([[ 0.  ,  1.  ,  0.33,  1.  , -0.67, -0.33,  1.  ],
       [ 1.  ,  0.  ,  0.67,  0.  , -0.33,  0.33,  1.  ],
       [-0.  ,  0.  ,  0.  , -1.  ,  0.  ,  0.  ,  0.  ]])

We have found the solution \(\mathbf{x} = (1,1,0)\) with \(y=0\). Now set up the tableau for the original problem:

A = np.array([[-1,-1,-1],[2,-1,1]])
b = np.array([-2,1])
c = np.array([2,-6,0])
T = tableau(A,b,c)
print(T)
[[-1. -1. -1.  1.  0. -2.]
 [ 2. -1.  1.  0.  1.  1.]
 [ 2. -6.  0.  0.  0.  0.]]

The solution of the initialization problem tells us that we want to start with \(x_0\) and \(x_1\) as basic variables.

T1 = pivot(T,0,1)
T1.round(2)
array([[ 0. , -1.5, -0.5,  1. ,  0.5, -1.5],
       [ 1. , -0.5,  0.5,  0. ,  0.5,  0.5],
       [ 0. , -5. , -1. ,  0. , -1. , -1. ]])
T2 = pivot(T1,1,0)
T2.round(2)
array([[ 0.  ,  1.  ,  0.33, -0.67, -0.33,  1.  ],
       [ 1.  ,  0.  ,  0.67, -0.33,  0.33,  1.  ],
       [ 0.  ,  0.  ,  0.67, -3.33, -2.67,  4.  ]])

This is a feasible solution. Apply the simplex method.

T3 = pivot(T2,2,1)
T3.round(2)
array([[-0.5,  1. ,  0. , -0.5, -0.5,  0.5],
       [ 1.5,  0. ,  1. , -0.5,  0.5,  1.5],
       [-1. ,  0. ,  0. , -3. , -3. ,  3. ]])

The solution is \(\mathbf{x} = (0,0.5,1.5)\) with \(\mathbf{c}^T \mathbf{x} = -3\).