
Chapter 3

Unconstrained Optimization:
Functions of Several Variables
Many of the concepts for functions of one variable can be extended to functions of several variables.
For example, the gradient extends the notion of derivative. In this chapter, we review the notion
of gradient, the formula for small changes, how to find extrema, and the notion of convexity.

3.1 Gradient
Given a function $f$ of $n$ variables $x_1, x_2, \ldots, x_n$, we define the partial derivative relative to variable $x_i$, written as $\frac{\partial f}{\partial x_i}$, to be the derivative of $f$ with respect to $x_i$ treating all variables except $x_i$ as constant.

Example 3.1.1 Compute the partial derivatives of $f(x_1, x_2) = (x_1 - 2)^2 + 2(x_2 - 1)^2$.
The answer is:
$$\frac{\partial f}{\partial x_1}(x_1, x_2) = 2(x_1 - 2), \qquad \frac{\partial f}{\partial x_2}(x_1, x_2) = 4(x_2 - 1).$$

Let $x$ denote the vector $(x_1, x_2, \ldots, x_n)$. With this notation, $f(x) = f(x_1, x_2, \ldots, x_n)$, $\frac{\partial f}{\partial x_1}(x) = \frac{\partial f}{\partial x_1}(x_1, x_2, \ldots, x_n)$, etc. The gradient of $f$ at $x$, written $\nabla f(x)$, is the vector
$$\nabla f(x) = \begin{pmatrix} \frac{\partial f}{\partial x_1}(x) \\ \frac{\partial f}{\partial x_2}(x) \\ \vdots \\ \frac{\partial f}{\partial x_n}(x) \end{pmatrix}.$$
The gradient vector $\nabla f(x)$ gives the direction of steepest ascent of the function $f$ at point $x$. The gradient acts like the derivative in that small changes around a given point $x$ can be estimated using the gradient:
$$f(x + \Delta) \approx f(x) + \Delta\, \nabla f(x),$$
where $\Delta = (\Delta_1, \ldots, \Delta_n)$ denotes the vector of changes.
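In practice, the gradient itself can also be approximated numerically. The following is a minimal Python sketch (the helper name `grad` and the step size `h` are our own choices, not part of the text) that forms a forward-difference gradient and uses it in the small-change formula:

```python
# A sketch: forward-difference gradient, then the small-change estimate
# f(x + delta) ~ f(x) + delta . grad f(x).  Helper names are our own.

def grad(f, x, h=1e-7):
    """Approximate the gradient of f at x by forward differences."""
    fx = f(x)
    g = []
    for i in range(len(x)):
        xh = list(x)
        xh[i] += h
        g.append((f(xh) - fx) / h)
    return g

def f(x):
    # The function of Example 3.1.1.
    return (x[0] - 2)**2 + 2*(x[1] - 1)**2

x = [1.0, 3.0]
delta = [0.01, -0.02]
g = grad(f, x)                                        # approximately [-2, 8]
estimate = f(x) + sum(d*gi for d, gi in zip(delta, g))
exact = f([xi + d for xi, d in zip(x, delta)])
print(estimate, exact)                                # ~8.82 vs ~8.8209
```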

Example 3.1.2 If $f(x_1, x_2) = x_1^2 - 3x_1x_2 + x_2^2$, then $f(1, 1) = -1$. What about $f(1.01, 1.01)$?

In this case, $x = (1, 1)$ and $\Delta = (0.01, 0.01)$. Since $\frac{\partial f}{\partial x_1}(x_1, x_2) = 2x_1 - 3x_2$ and $\frac{\partial f}{\partial x_2}(x_1, x_2) = -3x_1 + 2x_2$, we get
$$\nabla f(1, 1) = \begin{pmatrix} -1 \\ -1 \end{pmatrix}.$$
So $f(1.01, 1.01) = f((1, 1) + (0.01, 0.01)) \approx f(1, 1) + (0.01, 0.01)\, \nabla f(1, 1) = -1 + (0.01, 0.01) \begin{pmatrix} -1 \\ -1 \end{pmatrix} = -1.02$.
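A quick numerical check of this example (a sketch; the function names are ours) compares the gradient estimate with the exact value:

```python
# Numerical check of Example 3.1.2 (a sketch; names are our own).

def f(x1, x2):
    return x1**2 - 3*x1*x2 + x2**2

def grad_f(x1, x2):
    # The partial derivatives computed above.
    return (2*x1 - 3*x2, -3*x1 + 2*x2)

x, delta = (1.0, 1.0), (0.01, 0.01)
g = grad_f(*x)                                    # (-1.0, -1.0)
estimate = f(*x) + delta[0]*g[0] + delta[1]*g[1]  # -1.02
exact = f(x[0] + delta[0], x[1] + delta[1])       # -1.0201
print(estimate, exact)
```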

Example 3.1.3 Suppose that we want to put away a fishing pole in a closet having dimensions 3 by 5 by 1 feet. If the ends of the pole are placed at opposite corners, there is room for a pole of length
$$f(3, 5, 1) = \sqrt{3^2 + 5^2 + 1^2} \approx 5.9 \text{ ft}, \qquad \text{where } f(x_1, x_2, x_3) = \sqrt{x_1^2 + x_2^2 + x_3^2}.$$
It turns out that the actual dimensions of the closet are $3 + \Delta_1$, $5 + \Delta_2$ and $1 + \Delta_3$ feet, where $\Delta_1$, $\Delta_2$ and $\Delta_3$ are small correction terms. What is the change in pole length, taking into account these corrections?

By the formula for small changes, the change in pole length is
$$f(3 + \Delta_1, 5 + \Delta_2, 1 + \Delta_3) - f(3, 5, 1) \approx (\Delta_1, \Delta_2, \Delta_3)\, \nabla f(3, 5, 1).$$
So, we need to compute the partial derivatives of $f$. For $i = 1, 2, 3$,
$$\frac{\partial f}{\partial x_i}(x_1, x_2, x_3) = \frac{x_i}{\sqrt{x_1^2 + x_2^2 + x_3^2}}.$$
Now we get
$$(\Delta_1, \Delta_2, \Delta_3)\, \nabla f(3, 5, 1) = (\Delta_1, \Delta_2, \Delta_3) \begin{pmatrix} 0.51 \\ 0.85 \\ 0.17 \end{pmatrix} = 0.51\Delta_1 + 0.85\Delta_2 + 0.17\Delta_3.$$
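The entries 0.51, 0.85 and 0.17 are just $x_i/\sqrt{x_1^2 + x_2^2 + x_3^2}$ evaluated at $(3, 5, 1)$. A minimal Python sketch to reproduce them (variable names are our own):

```python
import math

# Reproduce the gradient of the pole-length function at (3, 5, 1).
x = (3.0, 5.0, 1.0)
length = math.sqrt(sum(xi**2 for xi in x))   # about 5.916 ft
grad = tuple(xi / length for xi in x)        # about (0.51, 0.85, 0.17)
print(length, grad)
```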

Exercise 28 Consider the function $f(x_1, x_2) = x_1 \ln x_2$.

(a) Compute the gradient of $f$.
(b) Give the value of the function $f$ and give its gradient at the point $(3, 1)$.
(c) Use the formula for small changes to obtain an approximate value of the function at the point $(2.99, 1.05)$.

Exercise 29 Consider a conical drinking cup with height $h$ and radius $r$ at the open end. The volume of the cup is $V(r, h) = \frac{\pi}{3} r^2 h$.

(a) Suppose the cone is now 5 cm high with radius 2 cm. Compute its volume.
(b) Compute the partial derivatives $\partial V/\partial r$ and $\partial V/\partial h$ at the current height and radius.
(c) By about what fraction (i.e., percentage) would the volume change if the cone were lengthened 10%? (Use the partial derivatives.)
(d) If the radius were increased 5%?
(e) If both were done simultaneously?


Hessian matrix

Second partials $\frac{\partial^2 f}{\partial x_i \partial x_j}(x)$ are obtained from $f(x)$ by taking the derivative relative to $x_i$ (this yields the first partial $\frac{\partial f}{\partial x_i}(x)$) and then by taking the derivative of $\frac{\partial f}{\partial x_i}(x)$ relative to $x_j$. So we can compute $\frac{\partial^2 f}{\partial x_1 \partial x_1}(x)$, $\frac{\partial^2 f}{\partial x_1 \partial x_2}(x)$ and so on. These values are arranged into the Hessian matrix
$$H(x) = \begin{pmatrix}
\frac{\partial^2 f}{\partial x_1 \partial x_1}(x) & \frac{\partial^2 f}{\partial x_1 \partial x_2}(x) & \cdots & \frac{\partial^2 f}{\partial x_1 \partial x_n}(x) \\
\frac{\partial^2 f}{\partial x_2 \partial x_1}(x) & \frac{\partial^2 f}{\partial x_2 \partial x_2}(x) & \cdots & \frac{\partial^2 f}{\partial x_2 \partial x_n}(x) \\
\vdots & \vdots & \ddots & \vdots \\
\frac{\partial^2 f}{\partial x_n \partial x_1}(x) & \frac{\partial^2 f}{\partial x_n \partial x_2}(x) & \cdots & \frac{\partial^2 f}{\partial x_n \partial x_n}(x)
\end{pmatrix}$$
The Hessian matrix is a symmetric matrix, that is, $\frac{\partial^2 f}{\partial x_i \partial x_j}(x) = \frac{\partial^2 f}{\partial x_j \partial x_i}(x)$.

Example 3.1.1 (continued): Find the Hessian matrix of $f(x_1, x_2) = (x_1 - 2)^2 + 2(x_2 - 1)^2$.
The answer is
$$H(x) = \begin{pmatrix} 2 & 0 \\ 0 & 4 \end{pmatrix}.$$
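As a sanity check, the Hessian can be approximated by finite differences. The sketch below is a minimal illustration (the helper `hessian` and its step size are our own choices, not from the text):

```python
# Approximate the Hessian of f(x1, x2) = (x1 - 2)**2 + 2*(x2 - 1)**2
# by central finite differences; the result should be close to [[2, 0], [0, 4]].

def f(x):
    return (x[0] - 2)**2 + 2*(x[1] - 1)**2

def hessian(f, x, h=1e-5):
    n = len(x)
    H = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            xpp = list(x); xpp[i] += h; xpp[j] += h
            xpm = list(x); xpm[i] += h; xpm[j] -= h
            xmp = list(x); xmp[i] -= h; xmp[j] += h
            xmm = list(x); xmm[i] -= h; xmm[j] -= h
            H[i][j] = (f(xpp) - f(xpm) - f(xmp) + f(xmm)) / (4 * h * h)
    return H

print(hessian(f, [0.0, 0.0]))   # approximately [[2, 0], [0, 4]]
```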

3.2 Maximum and Minimum

Optima can occur in three places:
1. at the boundary of the domain,
2. at a nondifferentiable point, or
3. at a point $x^*$ with $\nabla f(x^*) = 0$.

We will identify the first type of point with Kuhn-Tucker conditions (see next chapter). The second type is found only by ad hoc methods. The third type of point can be found by solving the gradient equations.

In the remainder of this chapter, we discuss the important case where $\nabla f(x^*) = 0$. To identify if a point $x^*$ with zero gradient is a local maximum or local minimum, check the Hessian:

• If $H(x^*)$ is positive definite, then $x^*$ is a local minimum.
• If $H(x^*)$ is negative definite, then $x^*$ is a local maximum.

Remember (Section 1.6) that these properties can be checked by computing the determinants of the principal minors.

Example 3.2.1 Find the local extrema of $f(x_1, x_2) = x_1^3 + x_2^3 - 3x_1x_2$.



This function is everywhere differentiable, so extrema can only occur at points $x$ such that $\nabla f(x) = 0$:
$$\nabla f(x) = \begin{pmatrix} 3x_1^2 - 3x_2 \\ 3x_2^2 - 3x_1 \end{pmatrix}$$
This equals 0 iff $(x_1, x_2) = (0, 0)$ or $(1, 1)$. The Hessian is
$$H(x) = \begin{pmatrix} 6x_1 & -3 \\ -3 & 6x_2 \end{pmatrix}$$
So,
$$H(0, 0) = \begin{pmatrix} 0 & -3 \\ -3 & 0 \end{pmatrix}$$
Let $H_1$ denote the first principal minor of $H(0, 0)$ and let $H_2$ denote its second principal minor (see Section 1.6). Then $\det(H_1) = 0$ and $\det(H_2) = -9$. Therefore $H(0, 0)$ is neither positive nor negative definite.
$$H(1, 1) = \begin{pmatrix} 6 & -3 \\ -3 & 6 \end{pmatrix}$$
Its first principal minor has $\det(H_1) = 6 > 0$ and its second principal minor has $\det(H_2) = 36 - 9 = 27 > 0$. Therefore $H(1, 1)$ is positive definite, which implies that $(1, 1)$ is a local minimum.
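These principal-minor computations are easy to mechanize. A small numpy sketch (the helper `leading_principal_minors` is our own naming) applied to the two critical points:

```python
import numpy as np

def leading_principal_minors(H):
    """Determinants of the leading k-by-k submatrices of H."""
    H = np.asarray(H, dtype=float)
    return [np.linalg.det(H[:k, :k]) for k in range(1, H.shape[0] + 1)]

H00 = [[0, -3], [-3, 0]]   # Hessian at (0, 0)
H11 = [[6, -3], [-3, 6]]   # Hessian at (1, 1)

print(leading_principal_minors(H00))   # [0.0, -9.0]  -> neither
print(leading_principal_minors(H11))   # [6.0, 27.0]  -> positive definite
```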

Example 3.2.2 Jane and Jim invested $20,000 in the design and development of a new product. They can manufacture it for $2 per unit. For the next step, they hired marketing consultants XYZ. In a nutshell, XYZ's conclusions are the following: if Jane and Jim spend $a$ on advertizing and sell the product at price $p$ (per unit), they will sell
$$2000 + 4\sqrt{a} - 20p \text{ units.}$$
Using this figure, express the profit that Jane and Jim will make as a function of $a$ and $p$. What price and level of advertizing will maximize their profits?

The revenue from sales is $(2000 + 4\sqrt{a} - 20p)p$. The production costs are $2\,(2000 + 4\sqrt{a} - 20p)$, the development cost is $20,000 and the cost of advertizing is $a$. Therefore, Jane and Jim's profit is
$$f(p, a) = (2000 + 4\sqrt{a} - 20p)(p - 2) - a - 20000.$$
To find the maximum profit, we compute the partial derivatives of $f$ and set them to 0:
$$\frac{\partial f}{\partial p}(p, a) = 2040 + 4\sqrt{a} - 40p = 0$$
$$\frac{\partial f}{\partial a}(p, a) = 2(p - 2)/\sqrt{a} - 1 = 0$$
Solving this system of two equations yields
$$p = 63.25, \qquad a = 15006.25.$$

We verify that this is a maximum by computing the Hessian.

$$H(x) = \begin{pmatrix} -40 & 2/\sqrt{a} \\ 2/\sqrt{a} & -(p - 2)/(a\sqrt{a}) \end{pmatrix}$$
Here $\det(H_1) = -40 < 0$ and $\det(H_2) = 40(p - 2)/(a\sqrt{a}) - 4/a > 0$ at the point $p = 63.25$, $a = 15006.25$, so the Hessian is negative definite there. So, indeed, this solution maximizes profit.
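One can confirm the critical point numerically. The sketch below (function names are ours) evaluates the two partial derivatives and the profit at $p = 63.25$, $a = 15006.25$:

```python
import math

# Check that the partial derivatives vanish at p = 63.25, a = 15006.25
# and evaluate the profit there.  A sketch; function names are ours.

def profit(p, a):
    return (2000 + 4*math.sqrt(a) - 20*p)*(p - 2) - a - 20000

def dprofit_dp(p, a):
    return 2040 + 4*math.sqrt(a) - 40*p

def dprofit_da(p, a):
    return 2*(p - 2)/math.sqrt(a) - 1

p, a = 63.25, 15006.25
print(dprofit_dp(p, a), dprofit_da(p, a))   # both 0 at the critical point
print(profit(p, a))                         # 40025.0
```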

Example 3.2.3 Find the local extrema of $f(x_1, x_2, x_3) = x_1^2 + (x_1 + x_2)^2 + (x_1 + x_3)^2$.
$$\frac{\partial f}{\partial x_1}(x) = 2x_1 + 2(x_1 + x_2) + 2(x_1 + x_3)$$
$$\frac{\partial f}{\partial x_2}(x) = 2(x_1 + x_2)$$
$$\frac{\partial f}{\partial x_3}(x) = 2(x_1 + x_3)$$
Setting these partial derivatives to 0 yields the unique solution $x_1 = x_2 = x_3 = 0$. The Hessian matrix is
$$H(0, 0, 0) = \begin{pmatrix} 6 & 2 & 2 \\ 2 & 2 & 0 \\ 2 & 0 & 2 \end{pmatrix}$$
The determinants of the principal minors are $\det(H_1) = 6 > 0$, $\det(H_2) = 12 - 4 = 8 > 0$ and $\det(H_3) = 24 - 8 - 8 = 8 > 0$. So $H(0, 0, 0)$ is positive definite and the solution $x_1 = x_2 = x_3 = 0$ is a minimum.
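Equivalently, positive definiteness can be read off the eigenvalues. A one-line numpy check (an alternative to the principal-minor computation above):

```python
import numpy as np

H = np.array([[6, 2, 2],
              [2, 2, 0],
              [2, 0, 2]], dtype=float)

# All eigenvalues are positive, so H is positive definite.
print(np.linalg.eigvalsh(H))   # about [0.54, 2.0, 7.46]
```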

Exercise 30 Find maxima or minima of the following functions when possible.

(a) $f(x_1, x_2, x_3) = -x_1^2 - 3x_2^2 - 10x_3^2 + 4x_1 + 24x_2 + 20x_3$
(b) $f(x_1, x_2, x_3) = x_1x_2 + x_2x_3 + x_3x_1 - 2x_1 - 2x_2 - 2x_3$

Exercise 31 Consider the function of three variables given by
$$f(x_1, x_2, x_3) = x_1^2 - x_1 - x_1x_2 + x_2^2 - x_2 + x_3^4 - 4x_3.$$
(a) Compute the gradient $\nabla f(x_1, x_2, x_3)$.
(b) Compute the Hessian matrix $H(x_1, x_2, x_3)$.
(c) Use the gradient to find a local extremum of $f$.
Hint: if $x_3^3 = 1$, then $x_3 = 1$.
(d) Compute the three principal minors of the Hessian matrix and use them to identify this extremum as a local minimum or a local maximum.

3.3 Global Optima

Finding global maxima and minima is harder. There is one case that is of interest.

We say that a domain is convex if every line segment drawn between two points in the domain lies within the domain.

We say that a function $f$ is convex if the line segment connecting any two points on its graph lies on or above the function. That is, for all $x, y$ in the domain and $0 < \lambda < 1$, we have $f(\lambda x + (1 - \lambda)y) \leq \lambda f(x) + (1 - \lambda)f(y)$, as before (see Chapter 2).

• If a function is convex on a convex domain, then any local minimum is a global minimum.
• If a function is concave on a convex domain, then any local maximum is a global maximum.

To check that a function is convex on a domain, check that its Hessian matrix $H(x)$ is positive semidefinite for every point $x$ in the domain. To check that a function is concave, check that its Hessian is negative semidefinite for every point in the domain.

Example 3.3.1 Show that the function $f(x_1, x_2, x_3) = x_1^4 + (x_1 + x_2)^2 + (x_1 + x_3)^2$ is convex over $\mathbb{R}^3$.
$$\frac{\partial f}{\partial x_1}(x) = 4x_1^3 + 2(x_1 + x_2) + 2(x_1 + x_3)$$
$$\frac{\partial f}{\partial x_2}(x) = 2(x_1 + x_2)$$
$$\frac{\partial f}{\partial x_3}(x) = 2(x_1 + x_3)$$
$$H(x_1, x_2, x_3) = \begin{pmatrix} 12x_1^2 + 4 & 2 & 2 \\ 2 & 2 & 0 \\ 2 & 0 & 2 \end{pmatrix}$$
The determinants of the principal minors are $\det(H_1) = 12x_1^2 + 4 > 0$, $\det(H_2) = 24x_1^2 + 4 > 0$ and $\det(H_3) = 48x_1^2 \geq 0$. So $H(x_1, x_2, x_3)$ is positive semidefinite for all $(x_1, x_2, x_3)$ in $\mathbb{R}^3$. This implies that $f$ is convex over $\mathbb{R}^3$.
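Since $H$ depends on $x_1$, one can spot-check positive semidefiniteness numerically at sample points. The sketch below (our own scaffolding; random sampling is only a sanity check, not a proof) evaluates the smallest eigenvalue at a few values of $x_1$:

```python
import numpy as np

def hessian(x1):
    # Hessian of Example 3.3.1; it depends only on x1.
    return np.array([[12*x1**2 + 4, 2, 2],
                     [2, 2, 0],
                     [2, 0, 2]], dtype=float)

rng = np.random.default_rng(0)
for x1 in rng.uniform(-10, 10, size=5):
    smallest = np.linalg.eigvalsh(hessian(x1)).min()
    print(f"x1 = {x1:7.3f}   smallest eigenvalue = {smallest:.6f}")  # >= 0
```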

Exercise 32 For each of the following, determine whether the function is convex, concave, or neither over $\mathbb{R}^2$.

(a) $f(x) = x_1x_2 - x_1^2 - x_2^2$
(b) $f(x) = 10x_1 + 20x_2$
(c) $f(x) = x_1^4 + x_1x_2$
(d) $f(x) = -x_1^2 - x_1x_2 - 2x_2^2$

Exercise 33 Let the following function be defined for all points $(x, y)$ in the plane:
$$f(x, y) = 2xy - x^4 - x^2 - y^2.$$
(a) Write the gradient of the function $f$.
(b) Write the Hessian matrix of $f$.
(c) Is the function $f$ convex, concave or neither?
(d) Use the gradient to find a local extremum of $f$.
(e) Identify this extremum as a minimum, a maximum or neither.
