Péter Gács
Spring 09
Vectors
Vector spaces
Linear dependence
Examples
{(1, 2), (3, 6)}. Two vectors are dependent when one is a scalar
multiple of the other.
{(1, 0, 1), (0, 1, 0), (1, 1, 1)}: dependent, since (1, 1, 1) = (1, 0, 1) + (0, 1, 0).
Theorem
A set is a basis iff it is a minimal generating set.
Examples
A basis of { (x, y, z) : x + y + z = 0 } is {(0, 1, −1), (1, 0, −1)}.
A basis of { (2t + u, u, t − u) : t, u ∈ R } is {(2, 0, 1), (1, 1, −1)}.
Theorem
All bases have the same number of elements.
Example
The set of all n-tuples of real numbers with the property that the
sum of their elements is 0 has dimension n − 1.
Coordinates with respect to a basis b1, . . . , bn:
x = x1 b1 + · · · + xn bn.
Example
If M is the set Rⁿ of all n-tuples of real numbers then the
n-tuples of the form ei = (0, . . . , 1, . . . , 0) (only position i has a 1) form a
basis. Then (x1, . . . , xn) = x1 e1 + · · · + xn en.
Example
If A is the set of all n-tuples whose sum is 0 then the n − 1 vectors ei − en (i = 1, . . . , n − 1) form a basis.
Matrices
A = (aij). Dimensions: m × n.
Diagonal matrix diag(a11, . . . , ann)
Identity matrix.
Triangular (unit triangular) matrices.
Permutation matrix.
Transpose A T . Symmetric matrix.
x1 = a11 y1 + · · · + a1q yq
⋮
xp = ap1 y1 + · · · + apq yq
In matrix form: x = Ay.
Matrix multiplication
y1 = b11 z1 + · · · + b1r zr
⋮
yq = bq1 z1 + · · · + bqr zr
Then x = Cz where C = (cik),
cik = ai1 b1k + · · · + aiq bqk (i = 1, . . . , p; k = 1, . . . , r).
AB = C
x = Ay = A(Bz) = Cz = (AB)z.
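A quick numeric check of this rule (a sketch in Python with NumPy; the matrices are made-up examples): substituting twice must agree with multiplying by C = AB once.

import numpy as np

# x = A y and y = B z, so x should equal (A B) z.
A = np.array([[1., 2.], [3., 4.], [5., 6.]])   # p x q = 3 x 2
B = np.array([[1., 0., 2.], [-1., 1., 0.]])    # q x r = 2 x 3
z = np.array([1., 2., 3.])

x_composed = A @ (B @ z)     # substitute twice
x_product = (A @ B) @ z      # multiply by C = AB once
assert np.allclose(x_composed, x_product)

# c_ik = a_i1 b_1k + ... + a_iq b_qk, computed explicitly:
C = np.array([[sum(A[i, l] * B[l, k] for l in range(A.shape[1]))
               for k in range(B.shape[1])]
              for i in range(A.shape[0])])
assert np.allclose(C, A @ B)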
Transpose of product
Easy to check: (AB)ᵀ = BᵀAᵀ.
Inner product
If a = (ai), b = (bi) are vectors of the same dimension n taken as
column vectors then
aᵀb = a1 b1 + · · · + an bn.
The (less frequently used) outer product makes sense for any two
column vectors of dimensions p, q, and is the p × q matrix
abᵀ = (ai bj).
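In code (a small Python illustration with NumPy; the vectors are arbitrary):

import numpy as np

a = np.array([1., 2., 3.])
b = np.array([4., 5., 6.])

inner = a @ b                 # a^T b = 1*4 + 2*5 + 3*6 = 32, a scalar
outer = np.outer(a, b)        # a b^T, the 3 x 3 matrix (a_i b_j)

assert inner == 32.0
assert outer.shape == (3, 3) and outer[0, 2] == a[0] * b[2]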
Inverse, rank
Example
( 1  1 )⁻¹   ( 0   1 )
( 1  0 )   = ( 1  −1 ).
(AB)⁻¹ = B⁻¹A⁻¹.
(Aᵀ)⁻¹ = (A⁻¹)ᵀ.
A square matrix with no inverse is called singular. Nonsingular
matrices are also called regular.
Example
The matrix ( 1 0 ; 1 0 ) is singular.
If a1, . . . , an are the columns of A then
Ax = x1 a1 + · · · + xn an.
Theorem
A square matrix A is singular iff Ker A ≠ {0}.
Theorem
The two ranks (row rank and column rank) are the same (see proof later). Also, rank(A) is the
smallest r such that there is an m × r matrix B and an r × n
matrix C with A = BC.
Proposition
A triangular matrix with only r nonzero rows (or only r nonzero columns), all of whose diagonal elements in those rows are nonzero, has row rank and column rank r.
Example
The outer product A = bcᵀ of two vectors has rank 1, and the product bcᵀ is itself a decomposition A = BC with r = 1.
Proposition
A square matrix is nonsingular iff it has full rank.
Minors.
Determinant
Definition
A permutation: an invertible map σ : {1, . . . , n} → {1, . . . , n}.
The product of two permutations σ, τ is their consecutive
application: (στ)(x) = σ(τ(x)).
A transposition is a permutation that interchanges just two
elements.
An inversion in a permutation: a pair of numbers i < j with
σ(i) > σ( j). We denote by Inv(σ) the number of inversions in σ.
A permutation σ is even or odd depending on whether Inv(σ)
is even or odd.
Proposition
(a) A transposition is always an odd permutation.
(b) Inv(στ) ≡ Inv(σ) + Inv(τ) (mod 2).
Definition
Let A = (aij) be an n × n matrix, and let Aij denote the submatrix obtained by deleting row i and column j. Expanding along row i:
det(A) = Σ_j (−1)^(i+j) aij det(Aij).
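A direct transcription of the cofactor expansion (a Python sketch; exponential time, so only for small matrices; fast determinants come from elimination later):

def det(A):
    """Determinant by cofactor expansion along the first row."""
    n = len(A)
    if n == 1:
        return A[0][0]
    total = 0
    for j in range(n):
        # minor: delete row 0 and column j
        minor = [row[:j] + row[j+1:] for row in A[1:]]
        total += (-1) ** j * A[0][j] * det(minor)
    return total

assert det([[1, 1], [1, 0]]) == -1
assert det([[2, 0, 0], [0, 3, 0], [0, 0, 4]]) == 24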
Properties
det A = det(A T ).
det(v1, v2, . . . , vn) is multilinear, that is, linear in each
argument separately. For example, in the first argument:
det(αu + βu′, v2, . . . , vn) = α det(u, v2, . . . , vn) + β det(u′, v2, . . . , vn).
Hence det(0, v2, . . . , vn) = 0.
Antisymmetric: the determinant changes sign when any two
arguments are swapped. For example, for the first two arguments:
det(v2, v1, . . . , vn) = − det(v1, v2, . . . , vn).
Hence det(u, u, v3, . . . , vn) = 0.
It follows that any multiple of one row (or column) can be added to
another without changing the determinant. From this it follows:
Theorem
A square matrix is singular iff its determinant is 0.
Theorem
det(AB) = det(A) det(B).
Quadratic forms:
x ↦ xᵀAx = Σ_{ij} aij xi xj.
xᵀBᵀBx = (Bx)ᵀ(Bx) ≥ 0,
Theorem
A is positive definite iff A = B T B for some nonsingular B.
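A numeric illustration of the "if" direction (a Python sketch; B is an arbitrary nonsingular example): with A = BᵀB, every value xᵀAx is the squared length of Bx.

import numpy as np

B = np.array([[2., 1.], [0., 1.]])   # nonsingular (det = 2)
A = B.T @ B                          # symmetric, positive definite

rng = np.random.default_rng(0)
for _ in range(100):
    x = rng.standard_normal(2)
    quad = x @ A @ x                              # x^T A x
    assert np.isclose(quad, np.dot(B @ x, B @ x)) # = (Bx)^T (Bx) >= 0
    assert quad > 0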
f = f(x) = Σ_{i=0}^{n−1} ai x^i,
g = g(x) = Σ_{i=0}^{n−1} bi x^i,
f(x)g(x) = h(x) = Σ_{k=0}^{2n−2} ck x^k,
where ck = a0 bk + a1 bk−1 + · · · + ak b0.
M(n) ≤ n².
Can we do better? Split f = f0 + x^m f1 and g = g0 + x^m g1 with m = n/2. Then
fg = f0 g0 + x^m (f0 g1 + f1 g0) + x^{2m} f1 g1,
giving M(2m) ≤ 4M(m). But
f0 g1 + f1 g0 = (f0 + f1)(g0 + g1) − f0 g0 − f1 g1,      (2)
so three half-size multiplications suffice: M(2m) ≤ 3M(m), and
M(2^k) ≤ 3^k M(1) = 3^k.
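The three-multiplication recursion as code (a Python sketch; assumes the two coefficient lists have equal power-of-2 length):

def poly_mul(f, g):
    """Multiply polynomials given as coefficient lists, using identity (2)."""
    n = len(f)
    if n == 1:
        return [f[0] * g[0]]
    m = n // 2
    f0, f1 = f[:m], f[m:]          # f = f0 + x^m f1
    g0, g1 = g[:m], g[m:]
    p0 = poly_mul(f0, g0)          # f0 g0
    p1 = poly_mul(f1, g1)          # f1 g1
    ps = poly_mul([a + b for a, b in zip(f0, f1)],
                  [a + b for a, b in zip(g0, g1)])
    mid = [s - a - b for s, a, b in zip(ps, p0, p1)]   # f0 g1 + f1 g0
    h = [0] * (2 * n - 1)
    for i, c in enumerate(p0): h[i] += c
    for i, c in enumerate(mid): h[i + m] += c
    for i, c in enumerate(p1): h[i + 2 * m] += c
    return h

# (1 + 2x)(3 + 4x) = 3 + 10x + 8x^2
assert poly_mul([1, 2], [3, 4]) == [3, 10, 8]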
A = ( a  b ),   B = ( e  f ),
    ( c  d )        ( g  h )

C = AB = ( r  s ).
         ( t  u )
Let
P1 = a( f − h),
P2 = (a + b)h,
P3 = (c + d)e,
P4 = d(g − e),
P5 = (a + d)(e + h),
P6 = (b − d)(g + h),
P7 = (a − c)(e + f ).
Then
r = −P2 + P4 + P5 + P6,
s = P1 + P2,                  (3)
t = P3 + P4,
u = P1 − P3 + P5 − P7.
M(n) ≤ n^{log 7} = n^{log₂ 7} ≈ n^{2.81}.
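A check of Strassen's identities (3) on one 2 × 2 numeric example (a Python sketch; the same formulas apply when a, . . . , h are matrix blocks):

def strassen_2x2(A, B):
    """One level of Strassen: 7 multiplications instead of 8."""
    (a, b), (c, d) = A
    (e, f), (g, h) = B
    P1 = a * (f - h)
    P2 = (a + b) * h
    P3 = (c + d) * e
    P4 = d * (g - e)
    P5 = (a + d) * (e + h)
    P6 = (b - d) * (g + h)
    P7 = (a - c) * (e + f)
    return [[-P2 + P4 + P5 + P6, P1 + P2],
            [P3 + P4, P1 - P3 + P5 - P7]]

A = [[1, 2], [3, 4]]
B = [[5, 6], [7, 8]]
# ordinary multiplication gives [[19, 22], [43, 50]]
assert strassen_2x2(A, B) == [[19, 22], [43, 50]]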
Linear equations
Informal treatment first
a11 x1 + · · · + a1n xn = b1,
⋮
am1 x1 + · · · + amn xn = bm.
Cost of elimination: n · (n + (n − 1) + · · ·) ≈ n³/2 operations.
Back substitution: 1 + 2 + · · · + (n − 1) ≈ n²/2 operations.
Example (Chvatal)
A sparse system that fills in.
x1 + x2 + x3 + x4 + x5 + x6 = 4,
x1 + 6x2 = 5,
x1 + 6x3 = 5,
x1 + 6x4 = 5,
x1 + 6x5 = 5,
x1 + 6x6 = 5.
Duality
a11 y1 + · · · + am1 ym = 0,
⋮
a1n y1 + · · · + amn ym = 0,
b1 y1 + · · · + bm ym = 1.
LUP decomposition
Example
The following matrix represents the permutation (2, 3, 1) since its
rows are obtained by this permutation from the unit matrix:
( 0  0  1 )
( 1  0  0 )
( 0  1  0 )
PA = LU
Pb = PAx = LUx.
Repeating:
B3 = L2⁻¹ L1⁻¹ A,

A = L1 L2 B3 =
( 1    0    0  0  · · · ) ( a11  a12      a13      · · · )
( λ2   1    0  0  · · · ) ( 0    a^(1)22  a^(1)23  · · · )
( λ3   μ3   1  0  · · · ) ( 0    0        a^(2)33  · · · )
( λ4   μ4   0  1  · · · ) ( 0    0        a^(2)43  · · · )
( ⋮    ⋮    ⋮  ⋮  ⋱     ) ( ⋮    ⋮        ⋮        ⋱     )
Example: If
A = ( a11  wᵀ )
    ( v    A′ )
then setting
L1 = ( 1       0     ),   L1⁻¹ = ( 1        0     ),
     ( v/a11   In−1  )           ( −v/a11   In−1  )
we get
B2 = L1⁻¹ A = ( a11  wᵀ            )
              ( 0    A′ − vwᵀ/a11  ).
With a permutation (pivoting) step P:
P L⁻¹ A = L3 B4,
P A = P L P⁻¹ · L3 B4 = L̂ L3 B4, where
L̂ = P L P⁻¹ = ( 1        0        0  0  · · ·  0 )
              ( λ2       1        0  0  · · ·  0 )
              ( λπ(3)    μπ(3)    1  0  · · ·  0 )
              ( λπ(4)    μπ(4)    0  1  · · ·  0 )
              ( ⋮        ⋮        ⋮  ⋮  ⋱        )
PA = L Bk+1,
for i = 1 to n do π[i] ← i
for k = 1 to n do
    p ← 0
    // find the largest pivot candidate in column k (partial pivoting)
    for i = k to n do
        if |a[i, k]| > p then
            p ← |a[i, k]|
            k′ ← i
    if p = 0 then error “singular matrix”
    exchange π[k] ↔ π[k′]
    for i = 1 to n do exchange a[k, i] ↔ a[k′, i]    // swap rows k and k′
    for i = k + 1 to n do
        a[i, k] ← a[i, k]/a[k, k]                    // multiplier: an entry of L
        for j = k + 1 to n do a[i, j] ← a[i, j] − a[i, k] · a[k, j]
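The same algorithm in Python (a sketch using NumPy, 0-indexed; as in the pseudocode, a copy of A is overwritten with L below the diagonal and U on and above it):

import numpy as np

def lup_decompose(A):
    """LUP decomposition with partial pivoting: returns (pi, LU) with
    P A = L U, L unit lower triangular, U upper triangular."""
    A = np.array(A, dtype=float)
    n = A.shape[0]
    pi = list(range(n))
    for k in range(n):
        kp = max(range(k, n), key=lambda i: abs(A[i, k]))  # pivot row
        if A[kp, k] == 0:
            raise ValueError("singular matrix")
        pi[k], pi[kp] = pi[kp], pi[k]
        A[[k, kp], :] = A[[kp, k], :]           # swap rows k and kp
        for i in range(k + 1, n):
            A[i, k] /= A[k, k]                  # multiplier, entry of L
            A[i, k+1:] -= A[i, k] * A[k, k+1:]  # eliminate
    return pi, A

M = [[2., 0., 2.], [1., 1., 1.], [2., 1., 0.]]
pi, LU = lup_decompose(M)
L = np.tril(LU, -1) + np.eye(3)
U = np.triu(LU)
P = np.eye(3)[pi]                               # row-permutation matrix
assert np.allclose(P @ np.array(M), L @ U)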
PAQ = LU, PA = LUQ⁻¹.
Pb = PAx = LUQ⁻¹x.
Proposition
For an n × n matrix A, the row rank is the same as the column
rank.
Hadamard inequality: if a1, . . . , an are the rows of A, then
det A ≤ |a1| · · · |an| = Π_{i=1}^{n} ( Σ_{j=1}^{n} aij² )^{1/2}.
a/b + c/d = (ad + bc)/(bd).
If we are lucky, we can simplify the fraction.
It turns out that with Gaussian elimination, we will be lucky
enough.
Theorem
Assume that Gaussian elimination on an integer matrix A
succeeds without pivoting. Every intermediate term in the
Gaussian elimination is a fraction whose numerator and
denominator are some subdeterminants of the original matrix.
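A sketch that runs the elimination in exact rational arithmetic (Python's fractions module), so the intermediate values promised by the theorem can be inspected; the matrix is a made-up integer example:

from fractions import Fraction

def eliminate(A):
    """Gaussian elimination without pivoting, in exact arithmetic.
    Returns the triangularized matrix; assumes pivots are nonzero."""
    A = [[Fraction(x) for x in row] for row in A]
    n = len(A)
    for k in range(n):
        for i in range(k + 1, n):
            m = A[i][k] / A[k][k]
            for j in range(k, n):
                A[i][j] -= m * A[k][j]
    return A

A = [[2, 1, 1], [4, 3, 3], [8, 7, 9]]
for row in eliminate(A):
    print([str(x) for x in row])
# The pivots 2, 1, 2 are ratios of leading principal minors of A (2, 2, 4).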
Example
0.0001x + y = 1,
0.5x + 0.5y = 1.      (4)
Example
a11 x + a12 y = 10,000
a21 x + a22 y = 1      (5)
The problem is that our system is not well scaled. Row scaling
and column scaling:
Σ_j ri aij sj x̃j = ri bi, where xj = sj x̃j.
Example
In (5), let r 1 = 10−4 , all other coeffs are 1: We get back (4), which
we solve by partial pivoting as before.
Sometimes, like here, there are several ways to scale, and not all
are good.
Example
Choose s2 = 10⁻⁴, all other coefficients 1:
x + y′ = 10,000
0.5x + 0.00005y′ = 1
Elimination gives −0.49995y′ = −4999, so
y′ = 10000 after rounding,
x = 0.
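The pivoting phenomenon of example (4), simulated in Python by rounding every intermediate result to 3 significant digits (the precision and the helper rnd are illustrative assumptions):

def rnd(x, digits=3):
    """Round x to `digits` significant digits (simulating short floats)."""
    return float(f"%.{digits - 1}e" % x)

def solve2(a11, a12, b1, a21, a22, b2):
    """Eliminate x from the second equation, rounding every intermediate."""
    m = rnd(a21 / a11)                      # multiplier
    a22p = rnd(a22 - rnd(m * a12))
    b2p = rnd(b2 - rnd(m * b1))
    y = rnd(b2p / a22p)
    x = rnd(rnd(b1 - rnd(a12 * y)) / a11)
    return x, y

# system (4); the exact solution is x = 1.0001..., y = 0.9999...
print(solve2(0.0001, 1, 1, 0.5, 0.5, 1))    # tiny pivot: returns (0.0, 1.0)
print(solve2(0.5, 0.5, 1, 0.0001, 1, 1))    # rows swapped: returns (1.0, 1.0)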
Inverting matrices
Column i of A⁻¹ is obtained by solving
AXi = ei, i = 1, . . . , n.
For a lower block-triangular matrix L = ( B 0 ; C D ):
L⁻¹ = ( I       0 ) ( B⁻¹  0   )   ( B⁻¹        0   )
      ( −D⁻¹C   I ) ( 0    D⁻¹ ) = ( −D⁻¹CB⁻¹   D⁻¹ ).
For an upper triangular matrix U = ( B C ; 0 D ) we get similarly
U⁻¹ = ( B⁻¹  −B⁻¹CD⁻¹ )
      ( 0    D⁻¹      ).
Theorem
Multiplication is no harder than inversion.
Proof. Let
D = L1 L2 = ( I  0  0 )   ( I  0  0 ) ( I  0  0 )
            ( A  I  0 ) = ( A  I  0 ) ( 0  I  0 ).
            ( 0  B  I )   ( 0  0  I ) ( 0  B  I )
Its inverse is
D⁻¹ = L2⁻¹ L1⁻¹ = ( I  0   0 ) ( I   0  0 )   ( I   0   0 )
                  ( 0  I   0 ) ( −A  I  0 ) = ( −A  I   0 ),
                  ( 0  −B  I ) ( 0   0  I )   ( BA  −B  I )
so the product BA can be read off from the inverse (for AB, exchange the roles of A and B).
Theorem
Inversion is no harder than multiplication.
Let n be a power of 2. Assume first that A is symmetric and positive definite,
A = ( B   Cᵀ )
    ( C   D  ).
Trying a block version of the LU decomposition:
A = ( I     0 ) ( B   Cᵀ          )
    ( CB⁻¹  I ) ( 0   D − CB⁻¹Cᵀ  ).
Let Q = B⁻¹Cᵀ and S = D − CB⁻¹Cᵀ (the Schur complement); since B is symmetric, CB⁻¹ = Qᵀ. We have
A = ( I   0 ) ( B  Cᵀ )
    ( Qᵀ  I ) ( 0  S  ).
By the inversion of triangular matrices learned before:
( B  Cᵀ )⁻¹   ( B⁻¹  −B⁻¹CᵀS⁻¹ )   ( B⁻¹  −QS⁻¹ )
( 0  S  )   = ( 0    S⁻¹       ) = ( 0    S⁻¹   ).
A⁻¹ = ( B⁻¹  −QS⁻¹ ) ( I    0 )   ( B⁻¹ + QS⁻¹Qᵀ   −QS⁻¹ )
      ( 0    S⁻¹   ) ( −Qᵀ  I ) = ( −S⁻¹Qᵀ          S⁻¹   ).
The multiplications needed are Q = B⁻¹Cᵀ, QᵀCᵀ (for S), S⁻¹Qᵀ and Q(S⁻¹Qᵀ).
So, besides two inversions of half the size, a constant number of multiplications suffices: I(n) ≤ 2I(n/2) + O(M(n)), hence I(n) = O(M(n)).
Proposition
If the symmetric matrix ( A Bᵀ ; B C ) is positive definite then the Schur complement C − BA⁻¹Bᵀ is also positive definite.
Proof.
(yᵀ, zᵀ) ( A  Bᵀ ) ( y )
         ( B  C  ) ( z ) = yᵀAy + yᵀBᵀz + zᵀBy + zᵀCz
                         = (y + A⁻¹Bᵀz)ᵀ A (y + A⁻¹Bᵀz) + zᵀ(C − BA⁻¹Bᵀ)z.
Choosing y = −A⁻¹Bᵀz shows zᵀ(C − BA⁻¹Bᵀ)z > 0 for z ≠ 0.
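A sketch of recursive inversion along these block formulas (Python with NumPy; assumes A symmetric positive definite with size a power of 2; in practice one would call a library inverse):

import numpy as np

def spd_inverse(A):
    """Invert a symmetric positive definite matrix of size 2^k
    via the Schur complement S = D - C B^{-1} C^T."""
    n = A.shape[0]
    if n == 1:
        return np.array([[1.0 / A[0, 0]]])
    m = n // 2
    B, C, D = A[:m, :m], A[m:, :m], A[m:, m:]
    Binv = spd_inverse(B)
    Q = Binv @ C.T                        # Q = B^{-1} C^T
    S = D - C @ Q                         # Schur complement
    Sinv = spd_inverse(S)
    SQ = Sinv @ Q.T                       # S^{-1} Q^T
    top_left = Binv + Q @ SQ              # B^{-1} + Q S^{-1} Q^T
    return np.block([[top_left, -Q @ Sinv],
                     [-SQ, Sinv]])        # assemble the four blocks

M = np.array([[4., 1., 0., 0.],
              [1., 3., 1., 0.],
              [0., 1., 2., 1.],
              [0., 0., 1., 2.]])
assert np.allclose(spd_inverse(M) @ M, np.eye(4))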
Linear programming
Ax ≤ b.
maximize cᵀx
subject to Ax ≤ b.
Example
Three voting districts: urban, suburban, rural.
Votes needed: 50,000, 100,000, 25,000.
Issues: build roads, gun control, farm subsidies, gasoline tax.
Votes gained, if you spend $ 1000 on advertising on any of these
issues:
minimize x1 + x2 + x3 + x4
subject to −2x1 + 8x2 + 10x4 ≥ 50,000
           5x1 + 2x2 ≥ 100,000
           3x1 − 5x2 + 10x3 − 2x4 ≥ 25,000
Implicit inequalities: xi ≥ 0.
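For concreteness, the advertising LP fed to an off-the-shelf solver (a sketch; assumes SciPy is available, and the ≥ rows are negated because linprog minimizes subject to A_ub x ≤ b_ub):

import numpy as np
from scipy.optimize import linprog

c = [1, 1, 1, 1]                       # minimize total spending x1+...+x4
A_ge = np.array([[-2, 8, 0, 10],       # votes gained per $1000 per issue
                 [5, 2, 0, 0],
                 [3, -5, 10, -2]])
b_ge = [50_000, 100_000, 25_000]

# linprog wants A_ub @ x <= b_ub, so negate the >= constraints.
res = linprog(c, A_ub=-A_ge, b_ub=-np.array(b_ge), bounds=[(0, None)] * 4)
print(res.x, res.fun)                  # optimal advertising plan and its cost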
Two-dimensional example
maximize x1 + x2
subject to 4x1 − x2 ≤ 8
           2x1 + x2 ≤ 10
           5x1 − 2x2 ≥ −2
           x1, x2 ≥ 0
Worry: there may be too many extremal points. For example, the set of 2n inequalities
0 ≤ xi ≤ 1, i = 1, . . . , n
defines an n-dimensional cube, which has 2ⁿ vertices.
Standard form
maximize cᵀx
subject to Ax ≤ b
           x ≥ 0
Slack form
In the slack form, the only inequality constraints are
nonnegativity constraints. For this, we introduce slack variables
on the left:
xn+i = bi − Σ_{j=1}^{n} aij xj.
In this form, they are also called basic variables. The objective
function does not depend on the basic variables. We denote its
value by z.
Shortest paths as a linear program:
maximize d[t]
subject to d[v] ≤ d[u] + w(u, v) for each edge (u, v)
           d[s] = 0
Maximum flow
Capacity c(u, v) ≥ 0.
maximize Σ_v f(s, v)
subject to f(u, v) ≤ c(u, v)
           f(u, v) = −f(v, u)
           Σ_v f(u, v) = 0 for u ∈ V − {s, t}
Minimum-cost flow
Edge cost a(u, v). Send d units of flow from s to t and minimize
the total cost
Σ_{u,v} a(u, v) f(u, v).
Multicommodity flow
k different commodities K i = (s i , t i , d i ), where d i is the demand.
The capacities constrain the aggregate flow. There is nothing to
optimize: just determine the feasibility.
Games
Example
m = n = 2, pure strategies {1, 2} are called “attack left”, “attack
right” for player 1 and “defend left”, “defend right” for player 2.
The matrix is
A = ( −1   1 )
    (  1  −1 ).
minimize t
subject to t ≥ Σ_j aij qj, i = 1, . . . , m
           qj ≥ 0, j = 1, . . . , n
           Σ_j qj = 1.
z  = 3x1 + x2 + 2x3
x4 = 30 − x1 − x2 − 3x3
x5 = 24 − 2x1 − 2x2 − 5x3
x6 = 36 − 4x1 − x2 − 2x3
Pivot: x1 enters the basis, x6 leaves:
x1 = 9 − x2/4 − x3/2 − x6/4
In general:
Lemma
The slack form is uniquely determined by the set of basic variables.
z = 27 + x2 /4 + x3 /2 − 3x6 /4
x1 = 9 − x2 /4 − x3 /2 − x6 /4
x4 = 21 − 3x2 /4 − 5x3 /2 + x6 /4
x5 = 6 − 3x2 /2 − 4x3 + x6 /2
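The pivot step that produced this dictionary, as code (a sketch; the dictionary-of-rows representation is an illustrative choice, not from the notes; the objective row z is carried along like any other row):

def pivot(rows, enter, leave):
    """One simplex pivot. `rows` maps each basic variable to a dict
    {'1': const, var: coeff, ...} meaning basic = const + sum(coeff * var).
    Solves the `leave` row for `enter` and substitutes everywhere."""
    row = rows.pop(leave)
    a = row.pop(enter)
    # leave = const + a*enter + rest  ==>  enter = (leave - const - rest)/a
    new = {v: -c / a for v, c in row.items()}
    new[leave] = 1.0 / a
    out = {enter: new}
    for var, r in rows.items():
        c = r.pop(enter, 0.0)
        merged = dict(r)
        for v, cv in new.items():
            merged[v] = merged.get(v, 0.0) + c * cv
        out[var] = merged
    return out

rows = {'z':  {'1': 0,  'x1': 3,  'x2': 1,  'x3': 2},
        'x4': {'1': 30, 'x1': -1, 'x2': -1, 'x3': -3},
        'x5': {'1': 24, 'x1': -2, 'x2': -2, 'x3': -5},
        'x6': {'1': 36, 'x1': -4, 'x2': -1, 'x3': -2}}
rows = pivot(rows, 'x1', 'x6')
print(rows['z'])   # {'1': 27.0, 'x2': 0.25, 'x3': 0.5, 'x6': -0.75}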
minimize x0
subject to aᵢᵀx − x0 ≤ bᵢ, i = 1, . . . , m,
           x, x0 ≥ 0
Duality
The dual of the standard-form program (maximize cᵀx subject to Ax ≤ b, x ≥ 0) is:
minimize bᵀy subject to Aᵀy ≥ c, y ≥ 0,
equivalently: minimize yᵀb subject to yᵀA ≥ cᵀ, yᵀ ≥ 0.
Weak duality: for any pair of feasible solutions x, y,
cᵀx ≤ yᵀAx ≤ yᵀb = bᵀy.      (6)
Interpretation:
bi = the total amount of resource i that you have (kinds of workers, land, machines).
aij = the amount of resource i needed for activity j.
cj = the income from a unit of activity j.
xj = amount of activity j.
Ax É b says that you can use only the resources you have.
Primal problem: maximize the income c T x achievable with the
given resources.
Dual problem: Suppose that you can buy lacking resources and
sell unused resources.
Let
sup_x f(x) = cᵀx* = z*.
yT (b − Ax) = 0, (yT A − c T )x = 0.
Proposition
Equality of the primal and dual optima implies complementary
slackness.
Interpretation:
Inactive constraints have shadow price yi = 0.
Activities that do not yield the income required by shadow
prices have level x j = 0.
z∗ = max c T x = min yT b = t∗ .
Theorem
If there is an optimum v then there is a basis B ⊂ {1, . . . , m + n}
belonging to a basic feasible solution, and coefficients c̃i ≤ 0 such that
cᵀx = v + c̃ᵀx,
where c̃i = 0 for i ∈ B.
ỹi = −c̃n+i, i = 1, . . . , m.
Ax ≤ b
Aᵀy ≥ c
cᵀx − bᵀy = 0
x, y ≥ 0
Theory of alternatives
Ax ≤ b is unsolvable iff there is a y ≥ 0 with
yᵀA = 0,
yᵀb < 0.
maximize z
subject to Ax + z · e ≤ b      (7)

minimize yᵀb
subject to yᵀA = 0
           yᵀe = 1             (8)
           yᵀ ≥ 0
Separating hyperplane
Vectors u1, . . . , um in an n-dimensional space. Let L be the set of
convex linear combinations of these points: v is in L if
Σ_i yi ui = v,   Σ_i yi = 1,   y ≥ 0.
In matrix form, with U the matrix whose rows are the uᵢᵀ:
yᵀU = vᵀ,   Σ_i yi = 1,   y ≥ 0.      (9)
A separating hyperplane for v ∉ L:
uᵢᵀx ≤ z (i = 1, . . . , m),   vᵀx > z.
Application to games
minimize t
subject to t − Σ_j aij qj ≥ 0, i = 1, . . . , m      (pi)
           Σ_j qj = 1,                               (z)
           qj ≥ 0, j = 1, . . . , n
Dual:
maximize z
subject to Σ_i pi = 1,
           −Σ_i aij pi + z ≤ 0, j = 1, . . . , n
           pi ≥ 0, i = 1, . . . , m.
maximize Σ_{v∈V} f(s, v)
subject to f(u, v) ≤ c(u, v), u, v ∈ V,
           f(u, v) = −f(v, u), u, v ∈ V,
           Σ_{v∈V} f(u, v) = 0, u ∈ V \ {s, t}.
Two variables associated with each edge, f (u, v) and f (v, u).
Simplify. Order the points arbitrarily, but starting with s and
ending with t. Leave f (u, v) when u < v: whenever f (v, u) appears
with u < v, replace with − f (u, v).
maximize Σ_{v>s} f(s, v)
subject to f(u, v) ≤ c(u, v), u < v,
           −f(u, v) ≤ c(v, u), u < v,
           Σ_{v>u} f(u, v) − Σ_{v<u} f(v, u) = 0, u ∈ V \ {s, t}.
Dual variable for the flow-conservation constraint of node u: call it y(u).
There is a dual constraint for each primal variable f(u, v), u < v. Since
f(u, v) is not restricted by sign, the dual constraint is an
equation. If u, v ≠ s then f(u, v) has coefficient 0 in the objective
function. With the conventions y(t) = 0 and y(s) = −1, the dual becomes:
minimize Σ_{u<v} c(u, v) |y(v) − y(u)|
subject to y(s) = −1, y(t) = 0.
Claim
There is an optimal solution in which each y(u) is −1 or 0.
Take S = { u : y(u) = −1 } and T = V \ S; the objective is then the value
of the “cut” (S, T). So the dual problem is about finding a minimum cut,
and the duality theorem implies the max-flow/min-cut theorem.
aᵢᵀx ≤ bᵢ, i = 1, . . . , m.
Size of the input: L = m · n · k.
Ellipsoids
B(c, r) = { x : (x − c)ᵀ(x − c) ≤ r² } (a ball).
E = { Lx : xᵀx ≤ r² } = { y : yᵀA⁻¹y ≤ r² }, where A = LLᵀ.
An ellipsoid with center c:
{ x : (x − c)ᵀA⁻¹(x − c) ≤ r² }.
With E = { x : xᵀA⁻²x ≤ 1 } and A = diag(a, b) in the plane, the boundary is
x²/a² + y²/b² = 1.
The numbers a, b are the lengths of the principal axes of the
ellipse, measured from the center. When they are all equal, we
get the equation of a circle (a sphere in n dimensions).
Volume of an ellipsoid
E = { x : x1²/a1² + · · · + xn²/an² ≤ 1 }.
Vol(E) = a1 · · · an Vn, where Vn is the volume of the n-dimensional unit ball.
N = n^{n/2} 10^{2kn},   δ = 1/(2mN),   ε = δ/(10^k n),
b′i = bi + δ.
Theorem
(a) There is a ball E1 of radius ≤ N√n and center 0 with the
property that if there is a solution then there is a solution in E1.
(b) Ax ≤ b is solvable if and only if Ax ≤ b′ is solvable and its set
of solutions contains a ball of radius ε.
Lemma
If there is a solution then there is one with |xj| ≤ N for all j.
This implies (a).
Now for the lower bound. The coming homework has a problem showing the following lemma, with δ as defined above.
Lemma
If Ax ≤ b has no solution then, defining b′i = bi + δ, the system
Ax ≤ b′ has no solution either.
Corollary
If Ax ≤ b′ is solvable then its set of solutions contains a cube of
size 2ε.
The algorithm
Shrinking rate
Lemma
Let H1 be a half-ball of E1. There is an ellipsoid E2 containing H1 with
Vol(E2) ≤ e^{−1/(4n)} Vol(E1). This is true even if E1 was also an ellipsoid.
Note e^{−1/(4n)} ≈ 1 − 1/(4n).
Proof
Let E1 be the unit ball and H1 = E1 ∩ { x : x1 ≥ 0 }. Take
E2 = { x : (x1 − d)²/(1 − d)² + b⁻² Σ_{j≥2} xj² ≤ 1 }.
It touches the ball E1 at the circle x1 = 0, Σ_{j≥2} xj² = 1:
d²/(1 − d)² + b⁻² = 1.
Hence
b⁻² = 1 − d²/(1 − d)² = (1 − 2d)/(1 − 2d + d²),
b² = 1 + d²/(1 − 2d) ≤ 1 + 2d² if d ≤ 1/4.
Using 1 + z ≤ e^z:
Vol(E2) = Vn (1 − d) b^{n−1} ≤ Vn (1 − d)(1 + 2d²)^{n/2} ≤ Vn e^{nd²−d}.
Choose d = 1/(2n); then this is Vn e^{−1/(4n)}.
This proves the Lemma for the case when E 1 is a ball. When E 1
is an ellipsoid, transform it linearly into a ball, apply the lemma
and then transform back. The transformation takes ellipsoids
into ellipsoids and does not change the ratio of volumes.
If r is large enough then Vol(E_{r+1}) is smaller than the volume of this small ball, so
there is no solution.
It is easy to see from here that r can be chosen to be polynomial
in m, n, k.
NP-completeness
NP problems
Examples
Shortest vs. longest simple paths
Euler tour vs. Hamiltonian cycle
2-SAT vs. 3-SAT. Satisfiability for circuits and for conjunctive
normal form (SAT). Reducing satisfiability for circuits to
3-SAT.
Use of reduction in this course: proving hardness.
Ultrasound test of sex of fetus.
Example
Given a graph G.
Decision Given k, does G have an independent subset of size ≥ k?
Optimization What is the size of the largest independent set?
Search Given k, give an independent set of size k (if there is one).
Optimization+search Give a maximum size independent set.
Polynomial time
Abstract problems
Instance. Solution.
Encodings
Concrete problems: encoded into strings.
Polynomial-time computable functions, polynomial-time
decidable sets.
Polynomially related encodings.
Language: a set of strings. Deciding a language.
Polynomial-time verification
Example
Hamiltonian cycles.
A polynomial-time computable function V(x, w)
with yes/no values that verifies, for a given input x and witness
(certificate) w, whether w is indeed a witness for x.
Example (Compositeness)
Let the decision problem be the question whether a number x is
composite (nonprime). The obvious verifiable witness property: w is a proper divisor of x, 1 < w < x.
Reducibility, completeness
Example
Reducing linear programming to solving a set of linear
inequalities.
NP-hardness.
NP-completeness.
Theorem
Satisfiability is NP-complete.
Theorem
INDEPENDENT SET is NP-complete.
Example
Integer linear programming. In particular, the subset sum
problem.
Approximations
The setting
Greedy algorithms
Try local improvements as long as you can.
Example (Maximum cut)
Graph G = (V, E); a cut is given by a set S ⊆ V with complement S̄ = V \ S. Find a cut S that maximizes
the number of edges crossing it:
|{ {u, v} ∈ E : u ∈ S, v ∈ S̄ }|.
Greedy algorithm:
Repeat: find a point on one side of the cut whose
moving to the other side increases the cutsize.
Theorem
If you cannot improve anymore with this algorithm then you are
within a factor 2 of the optimum.
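The local-improvement algorithm as code (a Python sketch; the graph representation by frozenset edges is an illustrative choice):

def greedy_max_cut(vertices, edges):
    """Local search for maximum cut. `edges` is a set of frozensets {u, v}.
    Moves single vertices across the cut while the cut size grows."""
    S = set()                                   # start from an arbitrary cut
    def crossing(cut):
        return sum(1 for e in edges if len(e & cut) == 1)
    improved = True
    while improved:
        improved = False
        for v in vertices:
            T = S ^ {v}                         # move v to the other side
            if crossing(T) > crossing(S):
                S, improved = T, True
    return S

V = {1, 2, 3, 4}
E = {frozenset(e) for e in [(1, 2), (2, 3), (3, 4), (4, 1), (1, 3)]}
S = greedy_max_cut(V, E)
print(S, sum(1 for e in E if len(e & S) == 1))  # a cut with >= |E|/2 edges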
Randomized algorithms
Generalize maximum cut for the case where edges e have weights
we, that is maximize
Σ_{u∈S, v∈S̄} wuv.
Theorem
Approx_Vertex_Cover has a ratio bound of 2.
minimize wᵀx
subject to xi + xj ≥ 1, (i, j) ∈ E,
           x ≥ 0.
Analysis
Theorem
Greedy_Set_Cover has a ratio bound max_{S∈F} H(|S|).
Lemma
For all S in F we have
Σ_{e∈S} price(e) ≤ w(S) H(|S|).
price(e) ≤ w(S)/|Vi|.
price(ek) ≤ w(S)/(|S| − k + 1).
Proof of the theorem. Let C* be the optimal set cover and C the
cover returned by the algorithm. Then
w(C) = Σ_e price(e) ≤ Σ_{S∈C*} Σ_{e∈S} price(e) ≤ Σ_{S∈C*} w(S) H(|S|) ≤ H(|S*|) Σ_{S∈C*} w(S),
where S* is a largest set of F.
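The algorithm being analyzed, as code (a Python sketch of the greedy rule: repeatedly take the set minimizing weight per newly covered element, recording price(e)):

def greedy_set_cover(universe, sets, w):
    """Greedy weighted set cover. `sets` maps a name to a set of elements,
    `w` maps a name to its weight; assumes the sets cover the universe.
    Returns the chosen names and price(e) for each element e."""
    uncovered = set(universe)
    cover, price = [], {}
    while uncovered:
        # pick the set minimizing weight per newly covered element
        s = min((t for t in sets if sets[t] & uncovered),
                key=lambda t: w[t] / len(sets[t] & uncovered))
        new = sets[s] & uncovered
        for e in new:
            price[e] = w[s] / len(new)
        uncovered -= new
        cover.append(s)
    return cover, price

U = {1, 2, 3, 4, 5}
F = {'a': {1, 2, 3}, 'b': {2, 4}, 'c': {3, 4, 5}, 'd': {5}}
print(greedy_set_cover(U, F, {'a': 1.0, 'b': 1.0, 'c': 1.0, 'd': 1.0}))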
Question
Is this the best possible factor for set cover?
Approximation scheme
maximize wᵀx
subject to aᵀx ≤ b,
           xi ∈ {0, 1}, i = 1, . . . , n.
With rounded weights w′:
maximize (w′)ᵀx
subject to aᵀx ≤ b,
           xi ∈ {0, 1}, i = 1, . . . , n.
If x′ is optimal for the rounded weights then
wᵀx′/OPT ≥ 1 − ε w1/OPT ≥ 1 − ε,
where w1 is the largest weight (so w1 ≤ OPT).
With w̄ = Σ_i wi, the amount of time is of the order of n w̄: dynamic
programming over the achievable values of the total weight, keeping for
each value a subset maximizing the left-over capacity
b − aᵀx.
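The dynamic programming core, as a sketch (Python; for each achievable total weight it keeps the smallest total size, so the best weight that fits in b can be read off; the rounding of the wi is done before calling it):

def knapsack_by_weight(w, a, b):
    """w: integer item weights (values), a: item sizes, b: capacity.
    best[W] = minimal total size of a subset of total weight exactly W.
    Time O(n * sum(w))."""
    INF = float('inf')
    best = [0] + [INF] * sum(w)
    for wi, ai in zip(w, a):
        for W in range(len(best) - 1, wi - 1, -1):   # each item used once
            if best[W - wi] + ai < best[W]:
                best[W] = best[W - wi] + ai
    return max(W for W, size in enumerate(best) if size <= b)

# items: weights 6, 10, 12 with sizes 1, 2, 3; capacity 5 -> take 10 and 12
assert knapsack_by_weight([6, 10, 12], [1, 2, 3], 5) == 22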
Convex programming
Convexity
Equivalently, f is convex if
f(λx + (1 − λ)y) ≤ λ f(x) + (1 − λ) f(y) for all x, y and all 0 ≤ λ ≤ 1.
Examples
Each linear function aᵀx + b is convex.
If a matrix A is positive semidefinite then the quadratic
function xᵀAx is convex.
If f(x), g(x) are convex and α, β ≥ 0 then αf(x) + βg(x) is also
convex.
Definition
A convex program is an optimization problem of the form
min f0(x)
subject to fi(x) ≤ 0 for i = 1, . . . , m,
           x ∈ H
Separation oracle
aᵢᵀx ≤ bᵢ, i = 1, . . . , n.
Definition
Let a : Qⁿ → Qⁿ, b : Qⁿ → Q be functions computable in
polynomial time and H ⊆ Rⁿ a (convex) set. These are a
separating (hyperplane) oracle for H if for all x ∈ Rⁿ, with
a = a(x), b = b(x), we have:
If x ∈ H then a = 0.
If x ∉ H then aᵀy ≤ b for all y ∈ H and aᵀx ≥ b.
Example
For the unit ball H = { x : xᵀx ≤ 1 }, the functions a = x · |xᵀx − 1|₊
and b = xᵀx − 1 give a separation oracle.
To find a separation oracle for an ellipsoid, transform it into a ball
first.
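The unit-ball oracle in code (a Python sketch; this uses the slightly different but easily verified choice a = x/‖x‖, b = 1 outside the ball):

import math

def ball_oracle(x):
    """Separation oracle for H = { x : x^T x <= 1 }.
    Returns (a, b) with a = 0 inside H; otherwise a^T y <= b on H
    and a^T x >= b."""
    norm2 = sum(t * t for t in x)
    if norm2 <= 1:
        return [0.0] * len(x), 0.0
    norm = math.sqrt(norm2)
    return [t / norm for t in x], 1.0

print(ball_oracle([0.3, 0.4]))   # inside: a = 0
print(ball_oracle([3.0, 4.0]))   # outside: a = (0.6, 0.8), b = 1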
Semidefinite programs
The set { X : X ⪰ 0 } of positive semidefinite matrices is convex: for each fixed vector a, the condition
aᵀXa ≥ 0, that is, Σ_{ij} (ai aj) xij ≥ 0,
is a linear inequality in the entries xij.
Rounding: for a vector z, put S = { i : zᵀui ≤ 0 }. The relaxed objective involves
Σ_{i≠j} wij uᵢᵀuj.
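The rounding step in code (a Python sketch with NumPy; the unit vectors ui are made-up stand-ins for an SDP solution, and z is a random Gaussian direction as in Goemans–Williamson rounding):

import numpy as np

rng = np.random.default_rng(1)
u = np.array([[1.0, 0.0], [-1.0, 0.0], [0.0, 1.0]])  # unit vectors u_i

z = rng.standard_normal(u.shape[1])     # random hyperplane normal
S = {i for i in range(len(u)) if z @ u[i] <= 0}
print(S)                                # one side of the random hyperplane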
Proposition
If A is positive definite then A² is also.