2.4 Linear Programming

Linear programming (LP) was developed independently by G.B. Dantzig and L. Kantorovich in the first half of the \(20^{\textrm{th}}\) century to solve resource planning problems. Even though linear programming is insufficient for many modern-day applications in operations research, it was used extensively in economic and military contexts in its early days.

To motivate some key ideas in linear programming, we begin with a small example.

Example: A roadside stand sells lemonade and lemon juice. Each unit of lemonade requires 1 lemon and 2 litres of water to prepare, and each unit of lemon juice requires 3 lemons and 1 litre of water to prepare. Each unit of lemonade gives a profit of 3 dollars upon selling, and each unit of lemon juice gives a profit of 2 dollars upon selling. With 6 lemons and 4 litres of water available, how many units of lemonade and lemon juice should be prepared in order to maximize profit?

If we let \(x\) and \(y\) denote the number of units of lemonade and lemon juice, respectively, to prepare, then the profit, which is our objective function, is \(3x + 2y\) dollars. Note that there are a number of constraints that \(x\) and \(y\) must satisfy:

  • \(x\) and \(y\) should be non-negative;

  • the number of lemons needed to make \(x\) units of lemonade and \(y\) units of lemon juice is \(x+3y\) and cannot exceed 6;

  • the number of litres of water needed to make \(x\) units of lemonade and \(y\) units of lemon juice is \(2x+y\) and cannot exceed 4.

Hence, to determine the maximum profit, we need to maximize \(3x + 2y\) subject to \(x\) and \(y\) satisfying the constraints \(x + 3y \leq 6\), \(2x + y \leq 4\), \(x \geq 0\), and \(y \geq 0.\) A more compact way to write the problem is as follows: \[\begin{array}{|rrcrll} \max & 3x & + & 2y & \\ \mbox{s.t.} & x & + & 3y & \leq & 6 \\ & 2x & +& y & \leq & 4 \\ & x & & & \geq & 0 \\ & & & y & \geq & 0 \\ & x & , & y & \in & \mathbb{R}. \end{array}\] It is customary to omit the specification of the domain set in linear programming since the variables always take on real values. Hence, we can simply write \[\begin{array}{|rrcrll} \max & 3x & + & 2y & \\ \mbox{s.t.} & x & + & 3y & \leq & 6 \\ & 2x & +& y & \leq & 4 \\ & x & & & \geq & 0 \\ & & & y & \geq & 0. \end{array}\]

We can solve the above maximization problem graphically, as follows. We first sketch the set of \([x, y]^{\!\top}\) satisfying the constraints, called the feasible region, on the \((x,y)\)-plane.

We then take the objective function \(3x+2y\) and turn it into the equation of a line \(3x+2y = c\), where \(c\) is a parameter. Note that as the value of \(c\) increases, the line defined by the equation \(3x+2y=c\) moves in the direction of the normal vector \([3,2]^{\!\top}\). We call this direction the direction of improvement. Determining the maximum value of the objective function, called the optimal value, subject to the constraints amounts to finding the maximum value of \(c\) so that the line defined by the equation \(3x+2y=c\) still intersects the feasible region.

Figure 2.1 shows the lines with \(c=0,4,6.8\).

Figure 2.1: Graphical solution for the lemonade and lemon juice optimization problem; the feasible region is shown in yellow, and level curves of the objective function in red.

We can see that if \(c\) is greater than 6.8, the line defined by \(3x+2y = c\) will not intersect the feasible region. Hence, the profit cannot exceed 6.8 dollars.

As the line \(3x+2y = 6.8\) does intersect the feasible region, \(6.8\) is the maximum value for the objective function. Note that there is only one point in the feasible region that intersects the line \(3x+2y=6.8\), namely \([x^*, y^*]^{\!\top} = [ 1.2, 1.6]^{\!\top}\). In other words, to maximize profit, we want to prepare 1.2 units of lemonade and 1.6 units of lemon juice.
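As a quick numerical check of this answer, here is a minimal sketch that solves the same problem with SciPy's linprog routine (assuming SciPy is available; linprog minimizes by convention, so we negate the objective):

```python
# A minimal sketch checking the lemonade LP numerically with SciPy (an assumed dependency).
# linprog minimizes, so we negate the objective in order to maximize 3x + 2y.
from scipy.optimize import linprog

c = [-3, -2]                 # maximize 3x + 2y  <=>  minimize -3x - 2y
A_ub = [[1, 3],              #  x + 3y <= 6  (lemons)
        [2, 1]]              # 2x +  y <= 4  (litres of water)
b_ub = [6, 4]

res = linprog(c, A_ub=A_ub, b_ub=b_ub, bounds=[(0, None)] * 2, method="highs")
print(res.x)                 # approximately [1.2, 1.6]
print(-res.fun)              # approximately 6.8
```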

This solution method can hardly be regarded as rigorous because we relied on a picture to conclude that \(3x + 2y \leq 6.8\) for all \([x,y]^{\!\top}\) satisfying the constraints. But we can also obtain this result algebraically.

Note that multiplying both sides of the constraint \(x + 3y \leq 6\) by \(0.2\) yields \[0.2x + 0.6 y \leq 1.2,\] and multiplying both sides of the constraint \(2x + y \leq 4\) by \(1.4\) yields \[2.8x + 1.4 y \leq 5.6.\] Hence, any \([x,y]^{\!\top}\) that satisfies both \[x+3y\leq 6\quad\mbox{and}\quad 2x+y \leq 4\] must also satisfy \[(0.2x+0.6y) + (2.8x+1.4y) \leq 1.2 + 5.6,\] which simplifies to \(3x + 2y \leq 6.8\), as desired.
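The multipliers \(0.2\) and \(1.4\) are chosen precisely so that the combined left-hand side reproduces the objective function. A quick sketch of this arithmetic (using NumPy, which is an assumption here, not part of the argument):

```python
# Verify that the multipliers 0.2 and 1.4 combine the two constraints into 3x + 2y <= 6.8.
import numpy as np

A = np.array([[1.0, 3.0],    #  x + 3y <= 6
              [2.0, 1.0]])   # 2x +  y <= 4
b = np.array([6.0, 4.0])
y = np.array([0.2, 1.4])     # the multipliers used above

print(y @ A)                 # [3. 2.], the objective coefficients
print(y @ b)                 # 6.8, the resulting upper bound on the profit
```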

It is always possible to find an algebraic proof like the one above for linear programming problems, which adds to their appeal. To describe the full result, it is convenient to call on duality, a central notion in mathematical optimization.

2.4.1 Linear Programming Duality

Let (P) denote the following linear programming problem: \[\begin{array}{|rl} \min & \mathbf{c}^\mathsf{T} \mathbf{x} \\ \mbox{s.t.} & \mathbf{A}\mathbf{x} \geq \mathbf{b} \end{array}\] where \(\mathbf{c} \in \mathbb{R}^n\), \(\mathbf{b} \in \mathbb{R}^m\), and \(\mathbf{A} \in \mathbb{R}^{m\times n}\) (the inequality on \(m\)-tuples is applied component-wise).

Then for every \(\mathbf{y} \in \mathbb{R}^m_+\) (that is, all components of \(\mathbf{y}\) are non-negative), the inferred inequality \(\mathbf{y}^\mathsf{T}\mathbf{A}\mathbf{x} \geq \mathbf{y}^\mathsf{T} \mathbf{b}\) is valid for all \(\mathbf{x}\) satisfying \(\mathbf{A}\mathbf{x} \geq \mathbf{b}\).

Furthermore, if \(\mathbf{y}^\mathsf{T}\mathbf{A} = \mathbf{c}^\mathsf{T},\) the inferred inequality becomes \(\mathbf{c}^\mathsf{T} \mathbf{x} \geq \mathbf{y}^\mathsf{T} \mathbf{b}\), making \(\mathbf{y}^\mathsf{T} \mathbf{b}\) a lower bound on the optimal value of (P). To obtain the largest possible bound, we can solve \[\begin{array}{|rl} \max & \mathbf{y}^\mathsf{T} \mathbf{b} \\ \mbox{s.t.} & \mathbf{y}^\mathsf{T} \mathbf{A} = \mathbf{c}^\mathsf{T} \\ & \mathbf{y} \geq \mathbf{0}. \end{array}\]

This problem is called the dual problem of (P), and (P) is called the primal problem. A remarkable result relating (P) and its dual (P’) is the Duality Theorem for Linear Programming: if (P) has an optimal solution, then so does its dual problem (P’), and the optimal values of the two problems are the same.

A weaker result follows easily from the discussion above: the objective function value of any feasible solution to the dual problem (P’) is a lower bound on the objective function value of any feasible solution to (P). This result is known as weak duality. Though it is a simple result, it is significant in practice because it provides a way to gauge the quality of a feasible solution to (P).

For example, suppose we have at hand a feasible solution to (P) with objective function value 3 and a feasible solution to the dual problem (P’) with objective function value 2. Then we know that the objective function value of our current solution to (P) is within \(1.5\) times the actual optimal value since the optimal value cannot be less than \(2\).
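To see both results at work numerically, here is a small sketch, with illustrative data of our own choosing, that solves a problem of the form (P) together with its dual (P’) using SciPy (an assumed tool) and compares the optimal values:

```python
# A sketch (illustrative data, assuming SciPy) solving a primal of the form (P) and
# its dual (P'), then checking that the two optimal values coincide (Duality Theorem).
import numpy as np
from scipy.optimize import linprog

A = np.array([[1.0, 0.0],
              [0.0, 1.0],
              [1.0, 1.0]])
b = np.array([1.0, 1.0, 3.0])
c = np.array([1.0, 2.0])

# (P): min c^T x  s.t.  A x >= b  (x free).  linprog expects <=, so negate A and b.
primal = linprog(c, A_ub=-A, b_ub=-b, bounds=[(None, None)] * 2, method="highs")

# (P'): max y^T b  s.t.  y^T A = c^T, y >= 0.  linprog minimizes, so negate b.
dual = linprog(-b, A_eq=A.T, b_eq=c, bounds=[(0, None)] * 3, method="highs")

print(primal.fun, -dual.fun)      # the two optimal values agree (here, both are 4.0)

# Weak duality: any dual-feasible y gives a lower bound on the primal optimal value.
y = np.array([1.0, 2.0, 0.0])     # satisfies y^T A = c^T and y >= 0
print(y @ b)                      # 3.0, indeed a lower bound on 4.0
```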

In general, a linear programming problem can have a more complicated form. Let \(\mathbf{A} \in \mathbb{R}^{m\times n}\), \(\mathbf{b} \in \mathbb{R}^m\), \(\mathbf{c} \in \mathbb{R}^n\). Let \({\mathbf{a}^{(i)}}^\mathsf{T}\) denote the \(i\)th row of \(\mathbf{A}\), \(\mathbf{A}_j\) denote the \(j\)th column of \(\mathbf{A}\), and (P) denote the minimization problem, with variables in the tuple \(\mathbf{x} = [ x_1, \cdots, x_n]^{\!\top}\), given as follows:

  • the objective function to be minimized is \(\mathbf{c}^\mathsf{T} \mathbf{x}\);

  • the constraints are \({\mathbf{a}^{(i)}}^\mathsf{T}\mathbf{x}~\sqcup_i~b_i\), where \(\sqcup_i\) is \(\leq\), \(\geq\), or \(=\) for \(i = 1,\ldots, m\), and

  • for each \(j \in \{1,\ldots,n\}\), \(x_j\) is constrained to be non-negative, non-positive, or free.

Then the dual problem (P’) is defined to be the maximization problem, with variables in the tuple \(\mathbf{y} = [ y_1, \cdots, y_m]^{\!\top}\), given as follows:

  • the objective function to be maximized is \(\mathbf{y}^\mathsf{T} \mathbf{b}\);

  • for \(j = 1,\ldots, n\), the \(j\)th constraint is \[\left \{\begin{array}{ll} \mathbf{y}^\mathsf{T}\mathbf{A}_j \leq c_j & \text{if } x_j \text{ is constrained to be non-negative} \\ \mathbf{y}^\mathsf{T}\mathbf{A}_j \geq c_j & \text{if } x_j \text{ is constrained to be non-positive} \\ \mathbf{y}^\mathsf{T}\mathbf{A}_j = c_j & \text{if } x_j \text{ is free}. \end{array}\right.\]

  • and for each \(i \in \{1,\ldots,m\}\), \(y_i\) is constrained to be non-negative if \(\sqcup_i\) is \(\geq\); \(y_i\) is constrained to be non-positive if \(\sqcup_i\) is \(\leq\); \(y_i\) is free if \(\sqcup_i\) is \(=\).

The following table can help remember the correspondences:

\[\begin{array}{|ll} \mbox{Primal (min)} & \mbox{Dual (max)} \\ \hline \geq \mbox{ constraint} & \geq 0 \mbox{ variable} \\ \leq \mbox{ constraint} & \leq 0 \mbox{ variable} \\ = \mbox{ constraint} & \mbox{free variable} \\ \geq 0 \mbox{ variable} & \leq \mbox{ constraint} \\ \leq 0 \mbox{ variable} & \geq \mbox{ constraint} \\ \mbox{free variable} & = \mbox{ constraint} \end{array}\]

Below is an example of a primal-dual pair of problems based on the above definition.

Consider the primal problem: \[\begin{array}{|rrcrcrcl} \mbox{min} & x_1 & - & 2x_2 & + & 3x_3 & \\ \mbox{s.t.} & -x_1 & & & + & 4x_3 & = &5 \\ & 2x_1 & + & 3x_2 & - & 5x_3 & \geq & 6 \\ & & & 7x_2 & & & \leq & 8 \\ & x_1 & & & & & \geq & 0 \\ & & & x_2 & & & & \mbox{free} \\ & & & & & x_3 & \leq & 0.\\ \end{array}\] Here, \(\mathbf{A}= \begin{bmatrix} -1 & 0 & 4 \\ 2 & 3 & -5 \\ 0 & 7 & 0 \end{bmatrix}\), \(\mathbf{b} = \begin{bmatrix}5 \\6\\8\end{bmatrix}\), and \(\mathbf{c} = \begin{bmatrix}1 \\-2\\3\end{bmatrix}\).

Since the primal problem has three constraints, the dual problem has three variables:

  • the first constraint in the primal is an equation, the corresponding variable in the dual is free;

  • the second constraint in the primal is a \(\geq\)-inequality, the corresponding variable in the dual is non-negative;

  • the third constraint in the primal is a \(\leq\)-inequality, the corresponding variable in the dual is non-positive.

Since the primal problem has three variables, the dual problem has three constraints:

  • the first variable in the primal is non-negative, the corresponding constraint in the dual is a \(\leq\)-inequality;

  • the second variable in the primal is free, the corresponding constraint in the dual is an equation;

  • the third variable in the primal is non-positive, the corresponding constraint in the dual is a \(\geq\)-inequality.

Hence, the dual problem is: \[\begin{array}{|rrcrcrcl} \max & 5y_1 & + & 6y_2 & + & 8y_3 & \\ \mbox{s.t.} & -y_1 & + & 2y_2 & & & \leq & 1 \\ & & & 3y_2 & + & 7y_3 & = & -2 \\ & 4y_1 & - & 5y_2 & & & \geq & 3 \\ & y_1 & & & & & & \mbox{free} \\ & & & y_2 & & & \geq & 0 \\ & & & & & y_3 & \leq & 0.\\ \end{array}\]
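The correspondence rules are mechanical, so they are easy to script. The following sketch (the helper function and its ad hoc representation of senses and signs are purely illustrative, not part of the text) applies the rules to the data of this example and reproduces the dual stated above:

```python
# A sketch that mechanically forms the dual of a minimization LP from the
# correspondence rules above; the representation of senses/signs is ad hoc.
import numpy as np

def dual_of_min_lp(A, b, c, constraint_senses, variable_signs):
    """constraint_senses[i] in {'<=', '>=', '='}; variable_signs[j] in {'>=0', '<=0', 'free'}.
    Returns the dual data: objective coefficients, constraint matrix, right-hand side,
    constraint senses, and variable sign restrictions."""
    sense_to_sign = {'>=': '>=0', '<=': '<=0', '=': 'free'}   # primal constraint -> dual variable
    sign_to_sense = {'>=0': '<=', '<=0': '>=', 'free': '='}   # primal variable -> dual constraint
    dual_var_signs = [sense_to_sign[s] for s in constraint_senses]
    dual_con_senses = [sign_to_sense[s] for s in variable_signs]
    # Dual: max b^T y  subject to  (A^T y)_j (sense_j) c_j,  with the signs above on y.
    return b, A.T, c, dual_con_senses, dual_var_signs

A = np.array([[-1, 0, 4],
              [ 2, 3, -5],
              [ 0, 7, 0]])
b = np.array([5, 6, 8])
c = np.array([1, -2, 3])

obj, lhs, rhs, con_senses, var_signs = dual_of_min_lp(
    A, b, c, constraint_senses=['=', '>=', '<='], variable_signs=['>=0', 'free', '<=0'])
print(obj)          # [5 6 8]: the dual objective 5y1 + 6y2 + 8y3
print(con_senses)   # ['<=', '=', '>='], matching the three dual constraints above
print(var_signs)    # ['free', '>=0', '<=0'], matching y1 free, y2 >= 0, y3 <= 0
```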

In some references, the primal problem is always a maximization problem; in that case, what we have considered to be the primal problem is their dual problem and vice versa.

Note that the Duality Theorem for Linear Programming remains true for the more general definition of the primal-dual pair of linear programming problems.

2.4.2 Methods for Solving LP Problems

There are currently two families of methods used by modern-day linear programming solvers: simplex methods and interior-point methods.

We will not get into the technical details of these methods, except to say that the algorithms in both families are iterative, that no known simplex method runs in polynomial time, and that there are interior-point methods which run in polynomial time and perform well in practice. One might wonder why anyone would still use simplex methods given that they are not polynomial-time methods: simply put, simplex methods are in general more memory-efficient than interior-point methods, and they tend to return solutions that have few nonzero entries.

More concretely, suppose that we want to solve the following problem: \[\begin{array}{|rl} \min & \mathbf{c}^\mathsf{T}\mathbf{x} \\ \mbox{s.t.} & \mathbf{A}\mathbf{x} = \mathbf{b} \\ & \mathbf{x} \geq \mathbf{0}. \end{array}\]

For ease of exposition, we assume that \(\mathbf{A}\) has full row rank. Then, each iteration of a simplex method maintains a current solution \(\mathbf{x}\) that is basic, in the sense that the columns of \(\mathbf{A}\) corresponding to the nonzero entries of \(\mathbf{x}\) are linearly independent. In contrast, interior-point methods will maintain \(\mathbf{x} > \mathbf{0}\) throughout (whence the name “interior point”).
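For a concrete illustration, the sketch below solves the lemonade problem, rewritten in the equality form above by adding slack variables, once with a simplex-type method and once with an interior-point method; the method names refer to the HiGHS options exposed by recent versions of SciPy and are an assumption about the installed solver:

```python
# A sketch comparing a simplex-type solve and an interior-point solve of the same LP.
# The lemonade problem is rewritten as  min c^T x  s.t.  A x = b, x >= 0  via slacks.
import numpy as np
from scipy.optimize import linprog

c = np.array([-3.0, -2.0, 0.0, 0.0])       # maximize 3x + 2y by minimizing -3x - 2y
A_eq = np.array([[1.0, 3.0, 1.0, 0.0],     #  x + 3y + s1 = 6
                 [2.0, 1.0, 0.0, 1.0]])    # 2x +  y + s2 = 4
b_eq = np.array([6.0, 4.0])

for method in ("highs-ds", "highs-ipm"):   # dual simplex vs. interior point (assumed names)
    res = linprog(c, A_eq=A_eq, b_eq=b_eq, bounds=[(0, None)] * 4, method=method)
    print(method, res.x, -res.fun)         # both should report a profit of 6.8
```

On a problem this small the two approaches return essentially the same answer; the differences in memory use and in the sparsity of the returned solution only become noticeable on large instances.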

When we use an off-the-shelf linear programming solver, the choice of method is usually not too important since solvers have good default settings. Simplex methods are typically used in settings where a problem needs to be re-solved after minor changes to the problem data, or where the problem has additional integrality constraints, as discussed in the next section.