
ORF 522

Linear Programming and Convex Analysis


Canonical & Standard Programs,
Linear Equations in an LP Context
Marco Cuturi
ORF-522 1
Today
A typology for linear programs.
Linear equations reminders,
Basic solutions, the kind of solutions we will be interested in,
Hyperplanes, or how to visualize linear objectives/constraints.
A few grams of topology to define halfspaces.
ORF-522 2
Typology of Linear Programs
ORF-522 3
Remember...
the general form of linear programs:
max or min   z = c_1 x_1 + c_2 x_2 + ... + c_n x_n,

subject to

  a_11 x_1 + a_12 x_2 + ... + a_1n x_n   { <, ≤, =, ≥, > }   b_1,
       ...
  a_m1 x_1 + a_m2 x_2 + ... + a_mn x_n   { <, ≤, =, ≥, > }   b_m,

where x_1, x_2, ..., x_n ≥ 0.
This form is, however, too vague to be easily usable.
First step: get rid of the strict inequalities; they do not bring much and would only add numerical noise.
Second step: use matrix and vector notation to lighten the writing.
ORF-522 4
Notations
Unless explicitly stated otherwise,
A, B, etc. are matrices whose size is clear from context.
x, b, a are vectors; a_1, ..., a_k are members of a family of vectors.
x = (x_1, ..., x_n)^T, with coordinates x_i in R.
x ≥ 0 is meant coordinate-wise, that is x_i ≥ 0 for 1 ≤ i ≤ n.
x ≠ 0 means that x is not the zero vector, i.e. there exists at least one index i such that x_i ≠ 0.
x^T is the transpose [x_1, ..., x_n] of x.
ORF-522 5
Linear Program
Common representation for all these programs?
Would help in developing both theory & algorithms.
Also helps when developing software, solvers, etc.
The answer is yes...
There are two: the standard form and the canonical form.
ORF-522 6
Terminology
A linear program in canonical form is the program

  max or min   c^T x
  subject to   Ax ≤ b,
               x ≥ 0.

b ≥ 0: feasible canonical form (just a convention).

A linear program in standard form is the program

  max or min   c^T x        (1)
  subject to   Ax = b,      (2)
               x ≥ 0.       (3)
ORF-522 7
Linear Programs: a look at the canonical form
Canonical form linear program
Maximize the objective
Only inequality constraints
All variables should be positive
Example:

  maximize    5x_1 + 4x_2 + 3x_3
  subject to  2x_1 + 3x_2 +  x_3 ≤ 5
              4x_1 +  x_2 + 2x_3 ≤ 11
              3x_1 + 4x_2 + 2x_3 ≤ 8
              x_1, x_2, x_3 ≥ 0.
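Not part of the original slides: a minimal sketch checking this example numerically with scipy.optimize.linprog. The solver minimizes, so the objective is negated; for this data the reported optimum should be x = (2, 0, 1) with value 13.

```python
import numpy as np
from scipy.optimize import linprog

c = np.array([5.0, 4.0, 3.0])            # coefficients of the objective to maximize
A_ub = np.array([[2.0, 3.0, 1.0],
                 [4.0, 1.0, 2.0],
                 [3.0, 4.0, 2.0]])
b_ub = np.array([5.0, 11.0, 8.0])

# linprog minimizes, so maximize c^T x by minimizing (-c)^T x; bounds (0, None) encode x >= 0.
res = linprog(-c, A_ub=A_ub, b_ub=b_ub, bounds=[(0, None)] * 3)
print(res.x, -res.fun)                   # expect x = (2, 0, 1) and objective value 13
```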
ORF-522 8
Linear Programs: canonical form
Although more intuitive than the standard form, the canonical form is not the most useful.
We will formulate the simplex method on problems with equality constraints,
that is standard forms.
Solvers do not all agree on this input format. MATLAB, for example, uses:

  minimize    Σ_i c_i x_i
  subject to  Σ_{j=1}^n A_ij x_j ≤ b_i,   i = 1, ..., m_1
              Σ_{j=1}^n B_ij x_j = d_i,   i = 1, ..., m_2
              l_i ≤ x_i ≤ u_i,            i = 1, ..., n

Ultimately this is a non-issue: we can easily switch from one form to the other...
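For comparison (not in the slides), SciPy's linprog accepts essentially the same ingredients: an inequality block, an equality block, and box bounds. The data below is made up purely to illustrate the input layout.

```python
import numpy as np
from scipy.optimize import linprog

# Hypothetical data: m1 = 1 inequality, m2 = 1 equality, box bounds on each x_i.
c    = np.array([1.0, -2.0, 0.5])
A_ub = np.array([[1.0, 1.0, 1.0]]);  b_ub = np.array([10.0])   # sum_j A_ij x_j <= b_i
A_eq = np.array([[1.0, -1.0, 0.0]]); b_eq = np.array([2.0])    # sum_j B_ij x_j  = d_i
bounds = [(0.0, 5.0), (0.0, 5.0), (-1.0, 1.0)]                 # l_i <= x_i <= u_i

res = linprog(c, A_ub=A_ub, b_ub=b_ub, A_eq=A_eq, b_eq=b_eq, bounds=bounds)
print(res.x, res.fun)
```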
ORF-522 9
Linear Programs: standard & canonical form
equalities → inequalities
What if the original problem has equality constraints?
Replace equality constraints by two inequality constraints.
The equality

  2x_1 + 3x_2 + x_3 = 5

is equivalent to

  2x_1 + 3x_2 + x_3 ≤ 5   and   2x_1 + 3x_2 + x_3 ≥ 5.

The new problem is equivalent to the previous one...
ORF-522 10
Linear Programs: standard & canonical form
inequalities → equalities

The opposite direction works too...

Turn inequality constraints into equality constraints by adding variables.

The inequality

  2x_1 + 3x_2 + x_3 ≤ 5

is equivalent to

  2x_1 + 3x_2 + x_3 + w_1 = 5   and   w_1 ≥ 0.

The new variable is called a slack variable (one for each inequality in the program)...

The new problem is equivalent to the previous one...
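A minimal NumPy sketch of this rewriting (not from the slides): stacking an identity block next to A creates one slack column per inequality.

```python
import numpy as np

A = np.array([[2.0, 3.0, 1.0]])              # the inequality 2x1 + 3x2 + x3 <= 5
b = np.array([5.0])

A_std = np.hstack([A, np.eye(A.shape[0])])   # columns for x1, x2, x3 and the slack w1
x_w = np.array([1.0, 0.0, 1.0, 2.0])         # x = (1, 0, 1) gives 2x1 + 3x2 + x3 = 3, so w1 = 2
print(np.allclose(A_std @ x_w, b))           # True: the equality version holds, with w1 >= 0
```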
ORF-522 11
Linear Programs: standard & canonical form
free variable → positive variables

What about free variables?

A free variable is simply the difference of its positive and negative parts; the solution is, again, to add variables.

If the variable y_1 is free, we can write it as

  y_1 = y_2 − y_3   with   y_2, y_3 ≥ 0.
We add two positive variables for each free variable in the program.
Again, the new problem is equivalent to the previous one.
ORF-522 12
Linear Programs: standard & canonical form
minimizing → maximizing

What happens when the objective is to minimize? We can use the fact that

  min_x f(x) = − max_x (−f(x)).

In a linear program this means

  minimize   6x_1 − 3x_2 + 5x_3

becomes

  maximize   −6x_1 + 3x_2 − 5x_3.

That's all we need to convert any linear program into standard form...
ORF-522 13
Linear Programs: standard & canonical form
Example...

  minimize    −2x_1 − 4x_2 + x_3
  subject to   2x_1 + 7x_2 +  x_3 = 5
               4x_1 +  x_2 + 9x_3 ≤ 11
               3x_1 + 4x_2 + 2x_3 = 8
               x_1, x_2 ≥ 0.

This program has one free variable (x_3) and one inequality constraint. It's a minimization problem...
ORF-522 14
Linear Programs: standard & canonical form
We first turn it into a maximization...

  maximize    2x_1 + 4x_2 − x_3
  subject to  2x_1 + 7x_2 +  x_3 = 5
              4x_1 +  x_2 + 9x_3 ≤ 11
              3x_1 + 4x_2 + 2x_3 = 8
              x_1, x_2 ≥ 0.

Just switch the signs in the objective...
ORF-522 15
Linear Programs: standard & canonical form
We then turn the inequality into an equality constraint by adding a slack variable...

  maximize    2x_1 + 4x_2 − x_3
  subject to  2x_1 + 7x_2 +  x_3 = 5
              4x_1 +  x_2 + 9x_3 + w_1 = 11
              3x_1 + 4x_2 + 2x_3 = 8
              x_1, x_2, w_1 ≥ 0.

Now, we only need to get rid of the free variable...
ORF-522 16
Linear Programs: standard & canonical form
We replace the free variable by a difference of two positive ones:

  maximize    2x_1 + 4x_2 − (x_4 − x_5)
  subject to  2x_1 + 7x_2 +  x_4 −  x_5 = 5
              4x_1 +  x_2 + 9x_4 − 9x_5 + w_1 = 11
              3x_1 + 4x_2 + 2x_4 − 2x_5 = 8
              x_1, x_2, x_4, x_5, w_1 ≥ 0.

That's it, we've reached a standard form.
The simplex algorithm is easier to write with this form.
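As a quick sanity check on the rewriting (not in the slides), the sketch below verifies that the standard-form matrix, acting on (x_1, x_2, x_4, x_5), reproduces the constraint left-hand sides of the original matrix acting on (x_1, x_2, x_3) once x_3 is split into its positive and negative parts; the random test point is arbitrary.

```python
import numpy as np

# Original constraint matrix in (x1, x2, x3), x3 free.
A_orig = np.array([[2., 7., 1.],
                   [4., 1., 9.],
                   [3., 4., 2.]])
# Standard-form matrix in (x1, x2, x4, x5, w1), with x3 = x4 - x5 and w1 the
# slack of the second (inequality) constraint.
A_std = np.array([[2., 7., 1., -1., 0.],
                  [4., 1., 9., -9., 1.],
                  [3., 4., 2., -2., 0.]])

rng = np.random.default_rng(0)
x1, x2, x3 = rng.normal(size=3)
x4, x5 = max(x3, 0.0), max(-x3, 0.0)                # positive and negative parts of x3
lhs_orig = A_orig @ np.array([x1, x2, x3])
lhs_std = A_std @ np.array([x1, x2, x4, x5, 0.0])   # slack set to 0 for the comparison
print(np.allclose(lhs_orig, lhs_std))               # True: same left-hand sides
```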
ORF-522 17
To sum up...
A linear program in standard form is the program

  minimize    c^T x
  subject to  Ax = b,        (4)
              x ≥ 0,

where

  c, x ∈ R^n: the objective,
  A ∈ R^{m×n} and b ∈ R^m: the equality constraints,
  x ≥ 0 means that for x = (x_1, ..., x_n), x_i ≥ 0 for 1 ≤ i ≤ n.

From now on we focus on

  the linear constraints Ax = b,
  the objective function c^T x,

separately. The constraint x ≥ 0 will reappear when we study convexity.
ORF-522 18
Linear Equations
ORF-522 19
Linear Equations
The usual linear equations we know: m = n

In the usual linear algebra setting, A is square of size n and invertible.

Straightforward: {x ∈ R^n | Ax = b} is a singleton, {A^{-1} b}.

Focus: find that unique solution efficiently. Many methods (Gaussian elimination, etc.)

In classic statistics, most often m > n

A few explanatory variables, a lot of observations.

Generally {x ∈ R^n | Ax = b} = ∅, so we need to tweak the problem.

Least-squares regression: select x_0 = argmin_x ‖Ax − b‖².

More advanced, penalized LS regression: x_0 = argmin_x (‖Ax − b‖² + λ‖x‖).
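A short NumPy illustration of these regimes (toy random data, not from the slides): np.linalg.solve for the square invertible case and np.linalg.lstsq for the least-squares fit; the penalized variant would simply add a penalty term to the objective and is omitted here.

```python
import numpy as np

rng = np.random.default_rng(1)

# m = n with A invertible: the unique solution A^{-1} b.
A = rng.normal(size=(3, 3)); b = rng.normal(size=3)
x_unique = np.linalg.solve(A, b)

# m > n (many observations, few variables): in general no exact solution,
# so take the least-squares fit argmin ||Ax - b||^2.
A_tall = rng.normal(size=(10, 3)); b_tall = rng.normal(size=10)
x_ls, *_ = np.linalg.lstsq(A_tall, b_tall, rcond=None)

print(x_unique, x_ls)
```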
ORF-522 20
Linear Equations
On the other hand, in an LP setting where usually m < n
{x ∈ R^n | Ax = b} is a wider set of candidates, a convex set.

In LP, a linear criterion is used to choose one of them.

In other fields, such as compressed sensing, other criteria are used.

Today we start studying some simple properties of the set {x ∈ R^n | Ax = b}.
ORF-522 21
Linear Equations
Linear equation: Ax = b, m equations.

  a_11 x_1 + a_12 x_2 + ... + a_1n x_n = b_1,
       ...
  a_m1 x_1 + a_m2 x_2 + ... + a_mn x_n = b_m.

Writing A = [a_1, ..., a_n], we have n columns in R^m.

Add now b: A_b = [A, b] ∈ R^{m×(n+1)}.

Remember: a solution to Ax = b is a vector x such that

  Σ_{i=1}^n x_i a_i = b,

that is, b and the a_i's should be linearly dependent (l.d.) for everything to work.
ORF-522 22
Linear Equations
Two cases (note that Rank(A) cannot be > Rank(A_b)):

(i) Rank(A) < Rank(A_b): b and the a_i's are linearly independent (l.i.), so there is no solution.

(ii) Rank(A) = Rank(A_b) = k: every column of A_b, b in particular, can be expressed as a linear combination of k other columns of the matrix, a_{i_1}, ..., a_{i_k}. Namely, there exists x such that

  Σ_{j=1}^k x_{i_j} a_{i_j} = b.

In practice:

  if m = n = k, then there is a unique solution: x = A^{-1} b;
  usually Rank(A) = k = m < n and we have plenty of solutions;
  we assume from now on that Rank(A) = Rank(A_b) = m.
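A tiny numerical illustration of the two cases (toy data, not from the slides), comparing Rank(A) and Rank(A_b) with np.linalg.matrix_rank:

```python
import numpy as np

A = np.array([[1., 2.],
              [2., 4.]])                       # rank(A) = 1
for b in (np.array([1., 3.]), np.array([1., 2.])):
    A_b = np.hstack([A, b[:, None]])           # augmented matrix [A, b]
    rA, rAb = np.linalg.matrix_rank(A), np.linalg.matrix_rank(A_b)
    print(rA, rAb, "no solution" if rA < rAb else "solutions exist")
```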
ORF-522 23
Linear Equation Solutions
If x_1 and x_2 are two different solutions, then for all λ ∈ R, λ x_1 + (1 − λ) x_2 is a solution.

Rank(A) = m: there are m independent columns. Suppose we reorder them so that a_1, ..., a_m are linearly independent.

Then

  A = [ a_11  a_12  ...  a_1m   a_1,m+1  ...  a_1n
        a_21  a_22  ...  a_2m   a_2,m+1  ...  a_2n
         ...                      ...
        a_m1  a_m2  ...  a_mm   a_m,m+1  ...  a_mn ]  =  [B, R].

B is the m×m square block, R is the m×(n − m) rectangular block.
ORF-522 24
Linear Equation Solutions
Suppose we split x = [x_B; x_R], where x_B ∈ R^m and x_R ∈ R^{n−m}.

If Ax = b then B x_B + R x_R = b. Since B is non-singular, we have

  x_B = B^{-1} (b − R x_R),

which shows that we can assign arbitrary values to x_R and obtain different points x such that Ax = b.

Solutions are parameterized by x_R ... a bit problematic since R is the discarded part.

We choose x_R = 0 and focus on the choice of B.
ORF-522 25
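A minimal sketch of this parameterization (not from the slides), using the same A and b as in the example a few slides below; any choice of x_R yields a solution of Ax = b.

```python
import numpy as np

A = np.array([[1., 2., 1., 0., 3.],
              [0., 1., 2., 1., 3.]])
b = np.array([1., 1.])
m, n = A.shape

B, R = A[:, :m], A[:, m:]                 # the first m columns happen to be independent here
x_R = np.array([0.5, -1.0, 2.0])          # an arbitrary choice of the n - m remaining values
x_B = np.linalg.solve(B, b - R @ x_R)     # x_B = B^{-1} (b - R x_R)

x = np.concatenate([x_B, x_R])
print(np.allclose(A @ x, b))              # True for every choice of x_R
```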
Basic Solutions
ORF-522 26
Basic Solutions
Definition 1. Consider Ax = b and suppose Rank(A) = m < n. Let I = (i_1, ..., i_m) be a list of indexes corresponding to m linearly independent columns taken among the n columns of A.

We call the m variables x_{i_1}, x_{i_2}, ..., x_{i_m} of x its basic variables; the other variables are called non-basic.

If x is a vector such that Ax = b and all its non-basic variables are equal to 0, then x is a basic solution.
ORF-522 27
Basic Solutions
When reordering variables as in the previous slide, and defining B = [a_{i_1}, ..., a_{i_m}], we can set x_R = 0. Then x_B = B^{-1} b and

  x = [x_B; 0],

and we have a basic solution.

Sidenote: a basic feasible solution to the LP of Equation (4) is such that x is basic and x ≥ 0.
ORF-522 28
Basic Solutions
More generally, let

  B_I = [a_{i_1}, ..., a_{i_m}],
  R_O = [a_{o_1}, ..., a_{o_{n−m}}],

where O = {1, ..., n} \ I = (o_1, ..., o_{n−m}) is the complement of I in {1, ..., n}, in increasing order.

I contains the indexes of the vectors in the basis; O contains the indexes of the vectors outside the basis.

Equivalently, set

  x_I = [x_{i_1}; ...; x_{i_m}],   x_O = [x_{o_1}; ...; x_{o_{n−m}}].

Then Ax = B_I x_I + R_O x_O.
ORF-522 29
Basic Solutions
The two things to remember so far:

  A list I of m independent columns ↔ one basic solution x, with x_I = B_I^{-1} b and x_O = 0.

We are not interested in all basic solutions, only a subset: basic feasible solutions.
ORF-522 30
Basic Solutions: Degeneracy
Definition 2. A basic solution to Ax = b is degenerate if one or more of the m basic variables is equal to zero.

For a basic solution, x_O is always 0. On the other hand, we do not expect elements of x_I to be zero.

Degeneracy is what appears whenever one or more components of x_I are zero.
ORF-522 31
Basic Solutions: Example
Consider Ax = b where

  A = [ 1  2  1  0  3
        0  1  2  1  3 ],    b = [ 1
                                  1 ].

We start by choosing I:

  I = (1, 2).  B_I = [a_1, a_2] = [ 1  2 ; 0  1 ],  x_I = [ −1 ; 1 ];  x = (−1, 1, 0, 0, 0)^T is basic.

  I = (1, 4).  B_I = [a_1, a_4] = [ 1  0 ; 0  1 ],  x_I = [ 1 ; 1 ];  x = (1, 0, 0, 1, 0)^T is basic.

  I = (2, 5).  B_I = [a_2, a_5] = [ 2  3 ; 1  3 ],  x_I = [ 0 ; 1/3 ];  x = (0, 0, 0, 0, 1/3)^T is degenerate basic.

Note that a_5 and b are collinear...
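The sketch below (not in the slides) reproduces these three basic solutions with NumPy and flags the degenerate one; indices are 0-based, so I = (1, 2) becomes (0, 1), etc.

```python
import numpy as np

A = np.array([[1., 2., 1., 0., 3.],
              [0., 1., 2., 1., 3.]])
b = np.array([1., 1.])
n = A.shape[1]

for I in [(0, 1), (0, 3), (1, 4)]:        # 0-based versions of I = (1,2), (1,4), (2,5)
    B_I = A[:, list(I)]                   # columns indexed by I
    x_I = np.linalg.solve(B_I, b)         # basic variables: x_I = B_I^{-1} b
    x = np.zeros(n); x[list(I)] = x_I     # non-basic variables set to 0
    tag = "degenerate basic" if np.any(np.isclose(x_I, 0.0)) else "basic"
    print(I, x, tag)
```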
ORF-522 32
Non-degeneracy
Theorem 1. A necessary and sufficient condition for the existence and non-degeneracy of all basic solutions of Ax = b is the linear independence of every set of m columns of A_b, the augmented matrix.

Proof strategy: the existence of all possible basic solutions is already a good sign: all families of m columns of A are l.i. What we need to show is that any m − 1 columns of A together with b are also l.i. Conversely, if all choices of m columns are independent, basic solutions exist, and they are non-degenerate because b is l.i. of any combination of m − 1 columns.
ORF-522 33
Non-degeneracy
Proof. (⇒) Let I = (i_1, ..., i_m) be a family of indexes. By assumption, the basic solution associated with I exists and is non-degenerate (note that then b ≠ 0).

Hence by definition {a_{i_1}, ..., a_{i_m}} is l.i. and b = Σ_{k=1}^m x_k a_{i_k}.

For a given r, suppose {a_{i_1}, ..., a_{i_{r−1}}, a_{i_{r+1}}, ..., a_{i_m}, b} is l.d.

Then there exist (α, α_1, ..., α_{r−1}, α_{r+1}, ..., α_m), not all zero, such that

  α b + Σ_{k=1, k≠r}^m α_k a_{i_k} = 0.

Note that necessarily α ≠ 0 (otherwise {a_{i_1}, ..., a_{i_{r−1}}, a_{i_{r+1}}, ..., a_{i_m}} would be l.d.).

Contradiction: this yields a degenerate basic solution for I, namely (−α_1/α, ..., −α_{r−1}/α, 0, −α_{r+1}/α, ..., −α_m/α).

(⇐) Let I = (i_1, ..., i_m) be a family of indexes.

Since every set of m columns of A_b is l.i., the columns {a_{i_1}, ..., a_{i_m}} are l.i., so the basic solution for I exists: Σ_{k=1}^m x_k a_{i_k} = b.

Suppose it is degenerate, i.e. x_r = 0. Then Σ_{k=1, k≠r}^m x_k a_{i_k} − b = 0.

Contradiction: {a_{i_1}, ..., a_{i_{r−1}}, a_{i_{r+1}}, ..., a_{i_m}, b}, a set of m columns of A_b, would be l.d.
ORF-522 34
Non-degeneracy
Corollary 1. Given a basic solution to Ax = b with basic variables x_{i_1}, ..., x_{i_m}, a necessary and sufficient condition for the solution to be non-degenerate is the l.i. of b with every subset of m − 1 columns of {a_{i_1}, ..., a_{i_m}}.

In our previous example,

  A = [ 1  2  1  0  3
        0  1  2  1  3 ],    b = [ 1
                                  1 ],    m = 2.

Hence if I = (2, 5), [b, a_2] and [b, a_5] should be of rank 2 for the solution not to be degenerate. Yet [b, a_5] = [ 1  3 ; 1  3 ] is clearly of rank 1.
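A one-line numerical check of this rank argument (illustration only):

```python
import numpy as np

A = np.array([[1., 2., 1., 0., 3.],
              [0., 1., 2., 1., 3.]])
b = np.array([1., 1.])
a2, a5 = A[:, 1], A[:, 4]

print(np.linalg.matrix_rank(np.column_stack([b, a2])))   # 2: b and a2 are independent
print(np.linalg.matrix_rank(np.column_stack([b, a5])))   # 1: b and a5 are collinear -> degeneracy
```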
ORF-522 35
Hyperplanes
ORF-522 36
Hyperplane
Definition 3. A hyperplane in R^n is defined by a vector c ≠ 0 in R^n and a scalar z ∈ R as the set H_{c,z} = {x ∈ R^n | c^T x = z}.

z = 0:

  A hyperplane H_{c,z} contains 0 iff z = 0.
  In that case H_{c,0} is a vector subspace and dim(H_{c,0}) = n − 1.

z ≠ 0:

  For x_1, x_2 ∈ H_{c,z}, it is easy to check that c^T (x_1 − x_2) = 0. In other words, c is orthogonal to any direction lying in the hyperplane.
  c is called the normal of the hyperplane.
ORF-522 37
Affine Subspace

Definition 4. Let V be a vector space and let L be a vector subspace of V. Then given x ∈ V, the translation T = L + x = {u + x, u ∈ L} is called an affine subspace of V.
the dimension of T is the dimension of L.
T is parallel to L.
ORF-522 38
Affine Hyperplane

For c ≠ 0, H_{c,0} is a vector subspace of R^n of dimension n − 1.

When z ≠ 0, H_{c,z} is an affine hyperplane: it is easy to see that

  H_{c,z} = H_{c,0} + (z / ‖c‖²) c.

[Figure: H_{c,z} as the translate of H_{c,0} by the vector (z / ‖c‖²) c along the normal c.]
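A quick numerical check of this identity (illustration only, with an arbitrary c and z): the translate (z/‖c‖²)c lies on H_{c,z}, and adding any vector orthogonal to c stays on H_{c,z}.

```python
import numpy as np

c = np.array([1., -2., 2.]); z = 3.0
x0 = (z / np.dot(c, c)) * c               # the translation vector (z / ||c||^2) c
u = np.array([2., 1., 0.])                # c . u = 0, so u is a direction inside H_{c,0}

print(np.isclose(c @ x0, z))              # True: x0 lies on H_{c,z}
print(np.isclose(c @ (x0 + u), z))        # True: moving along H_{c,0} stays on H_{c,z}
```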
ORF-522 39
Some grams of Topology and Halfspaces
ORF-522 40
A bit of topology: open and closed balls
The n-dimensional open ball centered at x_0 with radius r is defined as

  B_r(x_0) = {x ∈ R^n  s.t.  ‖x − x_0‖ < r},

and its closure (the closed ball) is

  B̄_r(x_0) = {x ∈ R^n  s.t.  ‖x − x_0‖ ≤ r}.

[Figure: two balls B_{r_1}(x_1) and B_{r_2}(x_2) with centers x_1, x_2 and radii r_1, r_2.]
ORF-522 41
A bit of topology: boundary
Let S ⊂ R^n. A point x is a boundary point of S if every open ball centered at x contains both a point in S and a point in R^n \ S.

A boundary point can either be in S or not in S.

[Figure: a set C with three marked points; x_1 is a boundary point, x_2 and x_3 are not.]
ORF-522 42
A bit of topology: open and closed sets
The set of all boundary points of S is the boundary ∂S of S.

A set is closed if ∂S ⊂ S. A set is open if R^n \ S is closed.

Note that there are sets that are neither open nor closed.

The closure S̄ of a set S is S ∪ ∂S.

The interior S° of a set S is S \ ∂S.

A set S is closed iff S = S̄ and open iff S = S°.
ORF-522 43
Halfspaces
For a hyperplane H = H_{c,z}, its complement in R^n is the union of two sets called open halfspaces:

  R^n \ H = H⁺ ∪ H⁻,

where

  H⁺ = {x ∈ R^n | c^T x > z},
  H⁻ = {x ∈ R^n | c^T x < z}.

H̄⁺ = H⁺ ∪ H and H̄⁻ = H⁻ ∪ H are closed halfspaces.

[Figure: the hyperplane H with normal c separating the two halfspaces.]
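A small sketch (not from the slides) classifying points by the sign of c^T x − z, with arbitrary c, z, and test points:

```python
import numpy as np

c = np.array([1., 1.]); z = 1.0
points = np.array([[2.0, 0.5],            # c^T x = 2.5 > z : open halfspace H+
                   [0.0, 0.0],            # c^T x = 0.0 < z : open halfspace H-
                   [0.5, 0.5]])           # c^T x = 1.0 = z : the hyperplane H itself
for p in points:
    v = c @ p
    side = "H+" if v > z else ("H-" if v < z else "H")
    print(p, side)
```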
ORF-522 44
Coming Next
some basic convexity,
important interplay between convex sets and hyperplanes,
starting with some nice results, e.g. Carathéodory's theorem,
laying out the theoretical foundations to attack the simplex method.
ORF-522 45
