
Math 320 Exam I.

Friday Feb 13, 09:55-10:45 Answers


I. (60 points.) (a) Find x = x(t) if

    dx/dt + x/(1 + t^2) = 0   and   x(0) = x_0.

Answer: By separation of variables, dx/x = -dt/(1 + t^2), so

    ln|x| = -tan^{-1}(t) + C,

so (as |x| = x) we get x = exp(-tan^{-1}(t) + C) = c exp(-tan^{-1}(t)), where c = e^C. As tan^{-1}(0) = 0, the initial condition gives x_0 = c, so

    x = x_0 exp(-tan^{-1}(t)).

(b) Find y = y(t) if

    dy/dt + y/(1 + t^2) = exp(-tan^{-1} t)   and   y(0) = y_0.

Answer: This is an inhomogeneous linear equation, and the general solution

    x(t) = c phi(t),   where   phi(t) := exp(-tan^{-1} t),

of the corresponding homogeneous linear equation was found in part (a). We use the Ansatz y(t) = c(t) phi(t). Then

    dy/dt + y/(1 + t^2) = c'(t) phi(t) + c(t) phi'(t) + c(t) phi(t)/(1 + t^2) = c'(t) phi(t),

since phi'(t) = -phi(t)/(1 + t^2). This equals exp(-tan^{-1} t) exactly when c'(t) = 1, i.e. when c(t) = t + constant. Evaluating at t = 0 shows that the constant must be y_0, so c(t) = t + y_0 and the solution is

    y(t) = (t + y_0) exp(-tan^{-1} t).
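As a quick sanity check (not part of the original answer sheet), both formulas can be verified symbolically; the sketch below uses sympy, with x_0 and y_0 left as arbitrary symbols.

    import sympy as sp

    t, x0, y0 = sp.symbols('t x0 y0')

    # Part (a): x(t) = x0*exp(-atan(t)) should satisfy dx/dt + x/(1+t^2) = 0.
    x = x0 * sp.exp(-sp.atan(t))
    print(sp.simplify(sp.diff(x, t) + x/(1 + t**2)))   # 0
    print(x.subs(t, 0))                                 # x0

    # Part (b): y(t) = (t + y0)*exp(-atan(t)) should satisfy
    # dy/dt + y/(1+t^2) = exp(-atan(t)).
    y = (t + y0) * sp.exp(-sp.atan(t))
    print(sp.simplify(sp.diff(y, t) + y/(1 + t**2) - sp.exp(-sp.atan(t))))  # 0
    print(y.subs(t, 0))                                 # y0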

II. (40 points.) (a) State the Existence and Uniqueness Theorem for Ordinary Differential Equations.

Answer: If f(x, y) has continuous partial derivatives, then the initial value problem

    dy/dx = f(x, y),   y(x_0) = y_0

has a unique solution y = y(x).

(b) State the Exactness Criterion for the differential equation M(x, y) dx + N(x, y) dy = 0.

Answer: If M(x, y) and N(x, y) have continuous partial derivatives, then there is a function F = F(x, y) solving the equations

    ∂F/∂x = M,   ∂F/∂y = N

if and only if

    ∂M/∂y = ∂N/∂x.

(c) Does the differential equation (1 + x^2 + x^4 y^6) dx + (1 + x^2 + y^2) dy = 0 have a solution y = y(x) satisfying the initial condition y(0) = 5? (Explain your answer.)

Answer: Yes. The differential equation can be written as

    dy/dx = f(x, y),   f(x, y) = -(1 + x^2 + x^4 y^6)/(1 + x^2 + y^2),

so the Existence and Uniqueness Theorem applies. (The Exactness Criterion is irrelevant here.)

III. (30 points.) Consider the differential equation

    dx/dt = (1 - x) x (1 + x).

(a) Draw a phase diagram.

Answer: The right-hand side has the following signs:

    x < -1:        (1 - x) x (1 + x) > 0
    -1 < x < 0:    (1 - x) x (1 + x) < 0
    0 < x < 1:     (1 - x) x (1 + x) > 0
    1 < x:         (1 - x) x (1 + x) < 0

[Phase line: rest points marked at x = -1, 0, 1; the arrows point toward -1 on both sides, away from 0 on both sides, and toward 1 on both sides.]

(b) Determine the limit lim_{t -> infinity} x(t) if x(t) is the solution with x(0) = 0.5.

Answer: From the phase diagram, lim_{t -> infinity} x(t) = 1.

(c) Determine the limit lim_{t -> infinity} x(t) if x(t) is the solution with x(0) = 1.

Answer: The constant function x(t) = 1 satisfies both the differential equation and the initial condition x(0) = 1. Therefore it is the only solution, by the Existence and Uniqueness Theorem, so lim_{t -> infinity} x(t) = lim_{t -> infinity} 1 = 1. Similarly, if x(0) = 0 then lim_{t -> infinity} x(t) = lim_{t -> infinity} 0 = 0.

IV. (60 points.) A 1200 gallon tank initially holds 900 gallons of salt water with a concentration of 0.5 pounds of salt per gallon. Salt water with a concentration of 11 pounds of salt per gallon flows into the tank at a rate of 8 gallons per minute, and the well stirred mixture flows out of the tank at a rate of 3 gallons per minute. Write a differential equation for the amount x = x(t) of salt in the tank after t minutes. (You need not solve the differential equation, but do give the initial condition.) Show your reasoning.

Answer: After t minutes, 8t gallons of salt water has flowed into the tank and 3t gallons has flowed out, so the volume of the salt water in the tank is V = 900 + 5t. The concentration of salt in this salt water is x/V pounds per gallon. In a tiny time interval of size dt the amount of salt water flowing out of the tank is 3 dt gallons, and the amount of salt in this salt water is (x/V)(3 dt) pounds. In this same tiny time interval the amount of salt water flowing in is 8 dt gallons, and the amount of salt in that salt water is 11 * 8 dt = 88 dt pounds. Hence the net change in the amount of salt is

    dx = 88 dt - (3x/V) dt = (88 - 3x/(900 + 5t)) dt.

Initially V = 900 gallons, so x(0) = 0.5 * 900 = 450 pounds. Thus the ODE is

    dx/dt = 88 - 3x/(900 + 5t),   x(0) = 450.

The fact that the tank holds 1200 gallons means that it is full after 60 minutes. This problem (with different numbers) is Example 5 on page 53 of the text.
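As an illustration (not part of the answer sheet), the tank ODE can be integrated numerically over the 60 minutes before the tank fills; here is a minimal sketch using scipy.

    import numpy as np
    from scipy.integrate import solve_ivp

    # Tank model: dx/dt = 88 - 3x/(900 + 5t), x(0) = 450, valid until t = 60 min.
    def dxdt(t, x):
        return 88 - 3 * x[0] / (900 + 5 * t)

    sol = solve_ivp(dxdt, (0, 60), [450.0], t_eval=np.linspace(0, 60, 7))
    for t, x in zip(sol.t, sol.y[0]):
        print(f"t = {t:5.1f} min   salt = {x:8.1f} lb")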

V. (60 points.) A projectile is launched straight upward from its initial position y_0 with initial velocity v_0 > 0. Air resistance exerts a force proportional to the square of the projectile's velocity, so that Newton's second law gives

    dv/dt = (F_G + F_R)/m = -1 - v|v|.

(To simplify the problem we chose units where the gravitational constant is g = 1.) The projectile goes up for 0 <= t < T and then goes down.

(a) Find a formula for v(t) for 0 < t < T.

Answer: While the projectile goes up, v > 0, so dv/dt = -(1 + v^2). Hence

    dv/(1 + v^2) = -dt,   so   tan^{-1} v = -t + tan^{-1} v_0,

so

    v = tan(tan^{-1} v_0 - t) = (v_0 - tan t)/(1 + v_0 tan t).

(b) Find a formula for v(t) for t > T.

Answer: While the projectile goes down, v < 0, so dv/dt = -1 + v^2. Hence

    dv/(1 - v^2) = -dt,   so   tanh^{-1} v = -t + constant.

As v = 0 when t = T, the constant is T, so v = tanh(T - t).

(c) Find a formula for T.

Answer: From part (a), v_0 - tan T = 0, so T = tan^{-1}(v_0).

Table of Integrals and Identities^1

    ∫ dv/(a^2 + v^2) = (1/a) tan^{-1}(v/a) + C        ∫ du/(a^2 - u^2) = (1/(2a)) ln|(u + a)/(u - a)| + C

    sin(t) = (e^{it} - e^{-it})/(2i)                   sinh(t) = (e^t - e^{-t})/2
    cos(t) = (e^{it} + e^{-it})/2                      cosh(t) = (e^t + e^{-t})/2
    i tan(t) = (e^{it} - e^{-it})/(e^{it} + e^{-it})   tanh(t) = (e^t - e^{-t})/(e^t + e^{-t})
    cos^2(t) + sin^2(t) = 1                            cosh^2(t) - sinh^2(t) = 1
    (d/dt) sin(t) = cos(t)                             (d/dt) sinh(t) = cosh(t)
    (d/dt) cos(t) = -sin(t)                            (d/dt) cosh(t) = sinh(t)
    tan(t + s) = (tan t + tan s)/(1 - tan t tan s)     tanh(t + s) = (tanh t + tanh s)/(1 + tanh t tanh s)

Remark (added on answer sheet). The formulas

    ∫ dv/(a^2 + v^2) = (1/a) tan^{-1}(v/a) + C,        ∫ du/(a^2 - u^2) = (1/a) tanh^{-1}(u/a) + C

can be related as follows. The formula

    ∫ du/(a^2 - u^2) = (1/(2a)) ln((a + u)/(a - u)) + C

can be proved with partial fractions. Assume that -a < u < a so that (a + u)/(a - u) > 0, and introduce the abbreviation

    w := (1/(2a)) ln((a + u)/(a - u)).

Multiply by 2a and exponentiate to get

    e^{2aw} = (a + u)/(a - u),

so a e^{2aw} - u e^{2aw} = a + u, so a(e^{2aw} - 1) = u(e^{2aw} + 1), so

    u/a = (e^{2aw} - 1)/(e^{2aw} + 1) = (e^{aw} - e^{-aw})/(e^{aw} + e^{-aw}) = tanh(aw)

by high school algebra. Thus tanh^{-1}(u/a) = aw, so w = (1/a) tanh^{-1}(u/a). Now the trig functions and hyperbolic functions are related by the formulas

    i sin(t) = sinh(it),   cos(t) = cosh(it),   i tan(t) = tanh(it),

so the substitution u = iv, du = i dv gives

    ∫ du/(a^2 - u^2) = ∫ i dv/(a^2 + v^2) = (1/a) tanh^{-1}(iv/a) + C = (i/a) tan^{-1}(v/a) + C,

which, after dividing by i, is the first formula.

^1 This is the same table that was emailed to the class yesterday morning.

Math 320 Exam II. Friday Mar 27, 09:55-10:45 Answers

I. (50 points.) Complete the definition.

(i) A subset W of a vector space V is called a subspace iff ...

Answer: A subset W of a vector space V is called a subspace iff it is closed under the vector space operations, i.e. iff (i) 0 is in W, (ii) u, v in W implies u + v in W, and (iii) c in R and u in W implies cu in W.

(ii) The vectors v_1, v_2, ..., v_k are said to be linearly independent iff ...

Answer: The vectors v_1, v_2, ..., v_k are said to be linearly independent iff the only solution of the equation x_1 v_1 + x_2 v_2 + ... + x_k v_k = 0 is the trivial solution x_1 = x_2 = ... = x_k = 0.

(iii) The vectors v_1, v_2, ..., v_k are said to span V iff they lie in V and ...

Answer: The vectors v_1, v_2, ..., v_k are said to span V iff they lie in V and every vector in V is a linear combination of v_1, v_2, ..., v_k, i.e. iff for every vector v in V there exist numbers x_1, x_2, ..., x_k such that v = x_1 v_1 + x_2 v_2 + ... + x_k v_k.

II. (50 points.) I have been assigned the task of finding matrices P, P^{-1} and W so that W is in reduced echelon form and PA = W, where

    A = [ 1   2   3  -2 ]
        [ 2   4   6  -4 ]
        [ 5  10  14  -8 ].

After some calculation I found matrices B and N such that NA = B and

    B = [ 1  2  0   4 ]      N = [  0  -7   3 ]      N^{-1} = [ 1  -2   5 ]
        [ 0  0  1  -2 ]          [ -3   4  -1 ]               [ 2  -3   9 ]
        [ 0  0  1  -2 ],         [ -1   3  -1 ],              [ 5  -7  21 ].

Only one elementary row operation to go! Find P, P^{-1}, and W.

Answer: Let E be the elementary matrix

    E = [ 1  0  0 ]        so   E^{-1} = [ 1  0  0 ]
        [ 0  1  0 ]                      [ 0  1  0 ]
        [ 0 -1  1 ],                     [ 0  1  1 ],

so that

    ENA = EB = [ 1  2  0   4 ]
               [ 0  0  1  -2 ]
               [ 0  0  0   0 ] = W

is in reduced echelon form. We can take

    P = EN = [  0  -7   3 ]
             [ -3   4  -1 ]
             [  2  -1   0 ],

so

    P^{-1} = N^{-1} E^{-1} = [ 1   3   5 ]
                             [ 2   6   9 ]
                             [ 5  14  21 ].

III. (50 points.) Here are the matrices from Problem II again.

    A = [ 1   2   3  -2 ]      B = [ 1  2  0   4 ]      N = [  0  -7   3 ]      N^{-1} = [ 1  -2   5 ]
        [ 2   4   6  -4 ]          [ 0  0  1  -2 ]          [ -3   4  -1 ]               [ 2  -3   9 ]
        [ 5  10  14  -8 ],         [ 0  0  1  -2 ],         [ -1   3  -1 ],              [ 5  -7  21 ].

(They still satisfy the equation NA = B.) True or false? The second and fourth columns of A form a basis for its column space. Justify your answer.

Answer: This is true. First note that the second and fourth columns of B = (b_1 b_2 b_3 b_4) form a basis for the column space of B. This is because b_1 = (1/2) b_2 + 0 b_4 and b_3 = -(1/2) b_4 + b_2. Hence the span of b_1, b_2, b_3, b_4 (i.e. the column space of B) is the same as the span of b_2, b_4. The vectors b_2, b_4 are independent since the submatrix [ 2 4 ; 0 -2 ] of (b_2 b_4) has nonzero determinant, so the only solution of x b_2 + y b_4 = 0 is x = y = 0. Because a_i = N^{-1} b_i, we have a_1 = (1/2) a_2 + 0 a_4 and a_3 = -(1/2) a_4 + a_2, so the span of a_1, a_2, a_3, a_4 (i.e. the column space of A) is the same as the span of a_2, a_4. If x a_2 + y a_4 = 0, then x N a_2 + y N a_4 = x b_2 + y b_4 = 0, so x = y = 0, so a_2, a_4 are independent. Thus a_2, a_4 is a basis for the column space of A, and the dimension of this column space is two. (Both the algorithm from the book on pages 250-260 and the proof of Theorem 73 in the notes tell us that the first and third columns also form a basis, and the arguments we just used are the same as the arguments used there.)
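The matrix entries above are garbled in this scan, so the identities they are meant to satisfy are worth checking by machine. The snippet below (not part of the exam sheet) is a small numpy sanity check using the entries as reconstructed here, with signs restored from NA = B, PA = W, P = EN, and P^{-1} = N^{-1}E^{-1}.

    import numpy as np

    A    = np.array([[1, 2, 3, -2], [2, 4, 6, -4], [5, 10, 14, -8]])
    B    = np.array([[1, 2, 0, 4], [0, 0, 1, -2], [0, 0, 1, -2]])
    N    = np.array([[0, -7, 3], [-3, 4, -1], [-1, 3, -1]])
    Ninv = np.array([[1, -2, 5], [2, -3, 9], [5, -7, 21]])
    E    = np.array([[1, 0, 0], [0, 1, 0], [0, -1, 1]])   # subtract row 2 from row 3
    W    = np.array([[1, 2, 0, 4], [0, 0, 1, -2], [0, 0, 0, 0]])

    print(np.array_equal(N @ A, B))                         # True: NA = B
    print(np.array_equal(N @ Ninv, np.eye(3, dtype=int)))   # True: N and N^{-1} are inverses
    print(np.array_equal(E @ N @ A, W))                     # True: PA = W with P = EN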

IV. (50 points.) (i) Find the inverse of the matrix [ 7 3 ; 2 1 ]. Hint: You can use the formula.

Answer:

    [ a  b ]^{-1}  =  1/(ad - bc) [  d  -b ]
    [ c  d ]                      [ -c   a ],

so

    [ 7  3 ]^{-1}  =  [  1  -3 ]
    [ 2  1 ]          [ -2   7 ].
(ii) The vector (-4, 1, b) is in the span of the vectors (7, 2, 3) and (3, 1, 4). What is b? Hint: You can save a little bit of work by using part (i).

Answer: If

    (-4, 1, b) = x_1 (7, 2, 3) + x_2 (3, 1, 4),

then the first two coordinates give

    [ 7  3 ] [ x_1 ]  =  [ -4 ]
    [ 2  1 ] [ x_2 ]     [  1 ],

so

    [ x_1 ]  =  [ 7  3 ]^{-1} [ -4 ]  =  [  1  -3 ] [ -4 ]  =  [ -7 ]
    [ x_2 ]     [ 2  1 ]      [  1 ]     [ -2   7 ] [  1 ]     [ 15 ],

so b = 3 x_1 + 4 x_2 = -21 + 60 = 39.
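A quick numerical check of this computation (not part of the exam solution, and using the vector (-4, 1, b) as reconstructed here):

    import numpy as np

    M = np.array([[7, 3], [2, 1]])
    Minv = np.linalg.inv(M)
    print(Minv)                          # [[ 1. -3.] [-2.  7.]]
    x = Minv @ np.array([-4, 1])         # the coefficients x1, x2
    print(x)                             # [-7. 15.]
    print(3 * x[0] + 4 * x[1])           # 39.0, the value of b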

V. (50 points.) Find a basis for the subspace of R^5 consisting of all vectors w which are orthogonal to the vectors (1, 0, 2, 3, 4) and (0, 1, 4, 5, 6). What is the dimension of this subspace?

Answer: This is the same as the null space of the matrix

    [ 1  0  2  3  4 ]
    [ 0  1  4  5  6 ].

The three vectors (-2, -4, 1, 0, 0), (-3, -5, 0, 1, 0), and (-4, -6, 0, 0, 1) form a basis, and so the dimension is three.
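One quick way to double-check such a basis (not part of the exam solution) is to compute the null space symbolically; here is a minimal sympy sketch.

    import sympy as sp

    M = sp.Matrix([[1, 0, 2, 3, 4],
                   [0, 1, 4, 5, 6]])
    basis = M.nullspace()          # column vectors spanning the null space
    for v in basis:
        print(v.T)
    print("dimension =", len(basis))   # 3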

Math 320 (part 1) : First-Order Differential Equations

(by Evan Dummit, 2012, v. 1.01)

Contents

1 First-Order Differential Equations
  1.1 Introduction and Terminology
  1.2 Some Motivating Applications
  1.3 First-Order: Separable
      1.3.1 The Logistic Equation
  1.4 First-Order: Linear
  1.5 Substitution Methods
      1.5.1 Bernoulli Equations
      1.5.2 Homogeneous Equations
  1.6 First Order: Exact Equations and Integrating Factors
  1.7 First Order: General Procedure
  1.8 First Order: General Problems and Solutions
      1.8.1 Problems
      1.8.2 Solutions
1 First-Order Differential Equations

1.1 Introduction and Terminology

A differential equation is merely an equation involving a derivative (or several derivatives) of a function or functions. In every branch of science, from physics to chemistry to biology (as well as 'other' fields such as engineering, economics, and demography), virtually any interesting kind of process is modeled by a differential equation, or a system of differential equations.

The reason for this is that most anything interesting involves change of some kind, and the derivative expresses the rate of change. Thus, anything that can be measured numerically with values that change in some known manner will give rise to a differential equation.

Example: The populations of several species in an ecosystem which affect one another in some way.
Example: The position, velocity, and acceleration of a physical object which has external forces acting on it.
Example: The concentrations of molecules involved in a chemical reaction.
Example: The production of goods, availability of labor, prices of supplies, and many other quantities over time in economic processes.

Here are some examples of single differential equations and systems of differential equations, with and without additional conditions.

Example: y' + y = 0.  Answer: y(x) = C e^{-x} for any constant C.

Example: y'' + 2y' + y = 3x^2, with y(0) = y'(0) = 1.  Answer: y(x) = 3x^2 - 12x + 18 - 4x e^{-x} - 17 e^{-x}.

Example: f * f'' = (f')^2, with f(1) = f'(1) = 1.  Answer: f(x) = e^{x-1}.

Example: f' = 2f - g and g' = f + 2g, with f(0) = g(0) = 2.  Answer: f(x) = 2 sqrt(2) e^{2x} cos(x + pi/4) and g(x) = 2 sqrt(2) e^{2x} sin(x + pi/4).

Example: df/ds + df/dt = s + t.  Answer: Many solutions. Two examples are f(s, t) = st and f(s, t) = (1/2) s^2 + (1/2) t^2.

Most differential equations (very much unlike the carefully chosen ones above) are difficult if not impossible to find exact solutions to, in the same way that most random integrals or infinite series are hard to evaluate exactly.

Prototypical example: (f'')^7 - 2 e^f f' = sin^{20}(x) + e^x.

However, it is generally possible to find very accurate approximate solutions using numerical techniques or by using Taylor series. In this course we will only cover how to solve some of the more basic types of equations and systems: first-order separable, linear, and exact equations; higher-order linear equations with constant coefficients; and systems of first-order linear equations with constant coefficients.

If a differential equation involves functions of only a single variable (i.e., if y is a function only of x) then it is called an ordinary differential equation (or ODE).

We will only talk about ODEs in this course. But for completeness, differential equations involving functions of several variables are called partial differential equations, or PDEs. (Recall that the derivatives of functions of more than one variable are called partial derivatives, hence the name.)

PDEs, obviously, arise when functions depend on more than one variable. They occur often in physics (with functions that depend on space and time) and economics (with functions that depend on time and other parameters).

An nth order differential equation is an equation in which the highest derivative is the nth derivative.

Example: The equations y' + xy = 3x^2 and y' y = 2 are first-order.
Example: The equation y'' + y' + y = 0 is second-order.
Example: The equation e^x = y'''' (y')^3 is fourth-order.

A differential equation is linear if it is linear in the terms involving y and its derivatives. In other words, if there are no terms like y^2, or (y')^2, or y y', or ln(y), or e^y.

Example: The equations y' + xy = 3x^2 and y'' + y' + y = 0 are linear.
Example: The equations y' y = 3x^2, x + (y')^2 = 1, and y'' = sin(y) are not linear.

1.2 Some Motivating Applications


Simple motivating example: A population (unrestricted by space or resources) tends to grow at a rate proportional to its size. [Reason: imagine each male pairing off with a female and having a fixed number of offspring each year.]

In symbols, this means that dP/dt = k P, where P(t) is the population at time t and k is the growth rate. This is a homogeneous first-order linear differential equation with constant coefficients. It's not hard to see that one population model that works is P(t) = e^{kt}; hence, exponential growth.

More complicated example: The Happy Sunshine Valley is home to Cute Mice and Adorable Kittens. The Cute Mice grow at a rate proportional to their population, minus the number of Mice that are eaten by their predators, the Kittens. The population of Adorable Kittens grows proportional to the number of Mice (since they have to catch Mice to survive and reproduce).

Symbolically, this says dM/dt = k_1 M - k_2 K and dK/dt = k_3 M, where M(t) and K(t) are the populations of Mice and Kittens, and k_1, k_2, k_3 are some constants.

This is a system of two linear differential equations; we will learn how to solve a system like this later in the course. The conditions here are fairly natural for a simple predator-prey system. But in general, there could be non-linear terms too; perhaps when two Kittens meet, they fight with each other and cause injury, which might change the equation to dK/dt = k_3 M - k_4 K^2. This system would get even more difficult if we added additional species each of which interacts in some way with the others.

Non-linear example: A simple pendulum consists of a weight suspended on a string, with gravity the only force acting on the weight. If theta is the angle the pendulum's string makes with a vertical line, then the horizontal force on the weight toward the vertical is proportional to sin(theta).

Symbolically, this says d^2(theta)/dt^2 = -k sin(theta). This is a non-linear second-order differential equation.

This equation cannot be solved exactly for the function theta(t). However, a reasonably good approximation can be found by using the rough estimate sin(theta) ≈ theta, which turns the problem into the linear second-order differential equation d^2(theta)/dt^2 = -k theta, whose solutions are much easier to find.

1.3 First-Order: Separable

A separable equation is of the form y' = f(x) g(y) for some functions f(x) and g(y), or an equation equivalent to something of this form.

Here is the method for solving such equations:

Step 1: Replace y' with dy/dx, and then cross-multiply as necessary to get all the y-stuff (including dy) on one side and all of the x-stuff (including dx) on the other.

Step 2: Integrate both sides (indefinitely, with respect to x). For the integral involving y terms, remember that you can cancel (dy/dx) dx to get dy and an integral in terms of y only. Don't forget to put the +C on the side with the x-terms.

Step 3: If given, plug in the initial condition to solve for the constant C. (Otherwise, just leave it where it is.)

Step 4: Solve for y as a function of x, if possible.

Example: Solve y' = k y, where k is some constant.

Step 1: Rewrite as (1/y) (dy/dx) = k.
Step 2: Integrate to get ∫ (1/y)(dy/dx) dx = ∫ k dx, or ∫ (1/y) dy = ∫ k dx. Evaluate to get ln(y) = kx + C.
Step 4: Exponentiate to get y = e^{kx + C} = C e^{kx}.

Example: Solve the differential equation y' = e^{x - y}.

Step 1: Rewrite as e^y (dy/dx) = e^x.
Step 2: Integrate to get ∫ e^y (dy/dx) dx = ∫ e^x dx. Simplify and then evaluate to get e^y = e^x + C.
Step 4: Take the natural logarithm to get y = ln(e^x + C).

Example: Find y given that y' = x + x y^2 and y(0) = 1.

Step 1: Rewrite as dy/(1 + y^2) = x dx.
Step 2: Integrate to get ∫ (1/(1 + y^2)) (dy/dx) dx = ∫ x dx, then simplify and evaluate to get tan^{-1}(y) = (1/2) x^2 + C.
Step 3: Plug in the initial condition to get tan^{-1}(1) = C, hence C = pi/4.
Step 4: Solve for y to get y = tan((1/2) x^2 + pi/4).
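For readers who want to confirm the last example with a computer algebra system, here is a minimal sympy sketch (not part of the original notes); it checks that y = tan(x^2/2 + pi/4) solves y' = x + x y^2 with y(0) = 1.

    import sympy as sp

    x = sp.symbols('x')
    y = sp.tan(x**2/2 + sp.pi/4)

    print(sp.simplify(sp.diff(y, x) - (x + x*y**2)))   # 0
    print(y.subs(x, 0))                                # 1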

1.3.1 The Logistic Equation

Example: Solve the differential equation P' = a P (b - P), where a and b are positive constants.

Step 1: Rewrite as dP/(P(b - P)) = a dt.

Step 2: Integrate both sides to obtain ∫ dP/(P(b - P)) = ∫ a dt. To evaluate the P-integral, use the partial fraction decomposition

    1/(P(b - P)) = (1/b)/P + (1/b)/(b - P).

Evaluating the integrals therefore yields (1/b) ln(P) - (1/b) ln(b - P) = at + C.

Step 4: Multiply both sides by b and then combine the logarithms to obtain ln(P/(b - P)) = abt + C; now exponentiate to get P/(b - P) = C e^{abt}. Solving for P (and renaming the constant) yields, finally,

    P(t) = b / (1 + C e^{-abt}).

Note: If we want to satisfy the initial condition P(0) = P_0, then plugging in shows C = b/P_0 - 1. Then the solution can be rewritten in the form

    P(t) = b P_0 / (P_0 + (b - P_0) e^{-abt}).
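The closed form above can be compared against a direct numerical integration of P' = aP(b - P); the following sketch (not from the notes, and using the illustrative values a = 0.1, b = 10, P_0 = 1) is one way to do that.

    import numpy as np
    from scipy.integrate import solve_ivp

    a, b, P0 = 0.1, 10.0, 1.0          # illustrative values only

    def logistic(t, P):
        return a * P[0] * (b - P[0])

    ts = np.linspace(0, 10, 6)
    numeric = solve_ivp(logistic, (0, 10), [P0], t_eval=ts).y[0]
    exact = b * P0 / (P0 + (b - P0) * np.exp(-a * b * ts))

    print(np.max(np.abs(numeric - exact)))   # small: the two curves agree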

Remark: Differential equations of this form are called logistic equations. With the explicit solution given here, we can observe some properties of the solution curves.

For example, as t -> infinity, as long as the starting population P_0 is positive, the population P(t) tends toward the carrying capacity b.

We can also see directly from the original differential equation P' = a P (b - P) that if the population is less than b, then P' > 0, so that P is increasing, but that as P approaches b, the value of P' shrinks to 0. (This can also be seen from graphing some of the solution curves.)

Similarly, if the population is greater than b, then P' < 0 and P is decreasing. And if the population is exactly b, then P' = 0 and the population is constant.

Thus we can see that the value b is an attracting point for the population, because as time goes on, nearby values of P all get pushed closer toward b. This is an example of a stable critical point.

By doing a similar analysis near the value P = 0, we can see that the value 0 is a repulsing point, because as time goes on, nearby values of P all get pushed farther away from 0. This is an example of an unstable critical point.

1.4 First-Order: Linear

The general form for a first-order linear differential equation is (upon dividing by the coefficient of y') given by

    y' + P(x) y = Q(x),

where P(x) and Q(x) are some functions of x.

We would really like it if we could just integrate both sides to solve the equation. However, in general, we cannot: the y' term is easy to integrate, but the P(x) y term causes trouble.

To solve this equation we use an integrating factor: we multiply by a function I(x) which will turn the left-hand side into the derivative of a single function.

What we would like to happen is for I(x) y' + I(x) P(x) y to be the derivative of something nice. When written this way, this sum looks sort of like the output of the product rule. If we can find I(x) so that the derivative of I(x) is I(x) P(x), then this sum will be the derivative (d/dx)[I(x) y].

What we want is I(x) P(x) = I'(x). This is now a (very easy) separable equation for the function I(x), and the solution is I(x) = e^{∫ P(x) dx}.

is now a (very easy) separable equation for the function

Motivated by the above logic, here is the method for solving first-order linear equations:

Step 1: Put the equation into the form y' + P(x) y = Q(x).

Step 2: Multiply both sides by the integrating factor e^{∫ P(x) dx} to get

    e^{∫ P(x) dx} y' + e^{∫ P(x) dx} P(x) y = e^{∫ P(x) dx} Q(x).

Step 3: Observe that the left-hand side is (d/dx)[e^{∫ P(x) dx} y], and take the antiderivative on both sides. Don't forget the constant of integration C.

Step 4: If given, plug in the initial condition to solve for the constant C. (Otherwise, just leave it where it is.)

Step 5: Solve for y as a function of x.

Example: Find y given that y' + 2xy = x and y(0) = 1.

Step 1: We have P(x) = 2x and Q(x) = x.
Step 2: Multiply both sides by e^{∫ P(x) dx} = e^{x^2} to get e^{x^2} y' + e^{x^2} 2x y = x e^{x^2}.
Step 3: Taking the antiderivative on both sides yields e^{x^2} y = (1/2) e^{x^2} + C.
Step 4: Plugging in yields e^0 * 1 = (1/2) e^0 + C, hence C = 1/2.
Step 5: Solving for y gives y = 1/2 + 1/(2 e^{x^2}).
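As a sanity check on this example (not in the original notes), one can substitute the answer back into the equation with sympy:

    import sympy as sp

    x = sp.symbols('x')
    y = sp.Rational(1, 2) + sp.exp(-x**2) / 2

    print(sp.simplify(sp.diff(y, x) + 2*x*y - x))   # 0
    print(y.subs(x, 0))                             # 1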

Example: Find all functions y for which x y' = x^4 - 4y.

Step 1: We have y' + (4/x) y = x^3, so P(x) = 4/x and Q(x) = x^3.
Step 2: Multiply both sides by e^{∫ P(x) dx} = e^{4 ln(x)} = x^4 to get x^4 y' + 4x^3 y = x^7.
Step 3: Taking the antiderivative on both sides yields x^4 y = (1/8) x^8 + C.
Step 5: Solving for y gives y = (1/8) x^4 + C x^{-4}.

Example: Find y given that y' cot(x) = y + 2 cos(x) with y(0) = -1/2.

Step 1: We have y' - y tan(x) = 2 sin(x), so P(x) = -tan(x) and Q(x) = 2 sin(x).
Step 2: Multiply both sides by e^{∫ P(x) dx} = e^{ln(cos(x))} = cos(x) to get y' cos(x) - y sin(x) = 2 sin(x) cos(x).
Step 3: Taking the antiderivative on both sides yields y cos(x) = -(1/2) cos(2x) + C.
Step 4: Plugging in yields -1/2 = -1/2 + C, hence C = 0.
Step 5: Solving for y gives y = -cos(2x)/(2 cos(x)).

1.5 Substitution Methods

Just like with integration, sometimes we come across differential equations which we cannot obviously solve, but which, if we change variables, will turn into a form we know how to solve.

Determining what substitutions to try is a matter of practice, in much the same way as in integral calculus. In general, there are two kinds of substitutions: obvious ones that arise from the form of the differential equation, and formulaic ones which are standard substitutions to use if a differential equation has a particular form.

The general procedure is the following:

Step 1: Express the new variable v in terms of y and x.
Step 2: Find dv/dx in terms of y', y, and x using implicit differentiation.
Step 3: Rewrite the original differential equation in y as a differential equation in v.
Step 4: Solve the new equation in v. (The hope is, after making the substitution, the new equation is in a form that can be solved with one of the other methods.)
Step 5: Substitute back for y.

Example: Solve the equation y' = (x + y)^2.

This equation is not linear, nor is it separable as written. The obstruction is that the term x + y involves both x and y.

Step 1: Let us try substituting v = x + y.
Step 2: Differentiating yields dv/dx = 1 + dy/dx, so y' = v' - 1.
Step 3: The new equation is v' - 1 = v^2, or v' = v^2 + 1.
Step 4: The equation in v is separable. Separating it gives dv/(v^2 + 1) = 1 dx, so that tan^{-1}(v) = x + C, or v = tan(x + C).
Step 5: Substituting back yields y = tan(x + C) - x.

1.5.1 Bernoulli Equations

An equation of the form y' + P(x) y = Q(x) y^n for some integer n ≠ 0, 1 is called a Bernoulli equation. (The restriction that n not be 0 or 1 is not really a restriction, because if n = 0 then the equation is first-order linear, and if n = 1 then the equation is the same as y' = (Q(x) - P(x)) y, which is separable.)

As with first-order linear equations, sometimes Bernoulli equations can be hidden in a slightly different form.

The trick for solving a Bernoulli equation is to make the substitution v = y^{1-n}. The algebra is simplified if we first multiply both sides of the original equation by (1 - n) y^{-n}, and then make the substitution.

So we have (1 - n) y^{-n} y' + (1 - n) P(x) y^{1-n} = (1 - n) Q(x). For v = y^{1-n} we have v' = (1 - n) y^{-n} y'. Substituting in to the original equation then yields the first-order linear equation

    v' + (1 - n) P(x) v = (1 - n) Q(x)

for v.

Example: Solve the equation y' + 2xy = x y^3.

This equation is of Bernoulli type, with P(x) = 2x, Q(x) = x, and n = 3. Making the substitution v = y^{-2} thus results in the equation v' - 4x v = -2x.

Next, we compute the integrating factor I(x) = e^{∫ -4x dx} = e^{-2x^2}.

Scaling by the integrating factor gives e^{-2x^2} v' - 4x e^{-2x^2} v = -2x e^{-2x^2}.

Taking the antiderivative on both sides then yields e^{-2x^2} v = (1/2) e^{-2x^2} + C, so that v = 1/2 + C e^{2x^2}.

Finally, solving for y gives y = (1/2 + C e^{2x^2})^{-1/2}.
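A short symbolic check of the Bernoulli example above (a sketch, not part of the notes): substitute y = (1/2 + C e^{2x^2})^{-1/2} back into y' + 2xy = x y^3.

    import sympy as sp

    x, C = sp.symbols('x C')
    y = (sp.Rational(1, 2) + C*sp.exp(2*x**2))**sp.Rational(-1, 2)

    print(sp.simplify(sp.diff(y, x) + 2*x*y - x*y**3))   # 0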

Example: Solve the equation y^2 y' = e^x - y^3.

The equation as written is not of Bernoulli type. However, if to both sides we add y^3 and then divide by y^2, we obtain the equation y' + y = e^x y^{-2}, which is now Bernoulli with P(x) = 1, Q(x) = e^x, and n = -2.

Making the substitution v = y^3 results in the equation v' + 3v = 3e^x.

The integrating factor is I(x) = e^{∫ 3 dx} = e^{3x}, so the new equation is e^{3x} v' + 3 e^{3x} v = 3 e^{4x}.

Taking the antiderivative then gives e^{3x} v = (3/4) e^{4x} + C, so v = (3/4) e^x + C e^{-3x} and

    y = ((3/4) e^x + C e^{-3x})^{1/3}.

1.5.2 Homogeneous Equations

An equation of the form y' = f(y/x) for some function f is called a homogeneous equation.

The trick to solving an equation of this form is to make the substitution v = y/x, or equivalently to set y = vx. Then differentiating y = vx shows y' = v + x v', hence the equation becomes v + x v' = f(v), which is separable once written in the form

    v'/(f(v) - v) = 1/x.

Example: Solve the differential equation 2x^2 y' = x^2 + y^2.

This equation is not separable nor linear, and it is not a Bernoulli equation. If we divide both sides by 2x^2 then we obtain y' = 1/2 + (1/2)(y/x)^2, which is homogeneous.

Setting v = y/x yields the equation x v' = (1/2) v^2 - v + 1/2, and rearranging gives 2 dv/(v - 1)^2 = (1/x) dx.

Then integrating yields ∫ 2 dv/(v - 1)^2 = ∫ (1/x) dx, so -2/(v - 1) = ln(x) + C. Solving for v gives v = 1 - 2/(ln(x) + C), so y = x - 2x/(ln(x) + C).
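Here is a brief symbolic check of the preceding example (a sketch, not from the notes): substitute y = x - 2x/(ln(x) + C) into 2x^2 y' = x^2 + y^2.

    import sympy as sp

    x, C = sp.symbols('x C', positive=True)
    y = x - 2*x/(sp.log(x) + C)

    print(sp.simplify(2*x**2*sp.diff(y, x) - (x**2 + y**2)))   # 0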

Example: Solve the differential equation y' = (x^2 + y^2)/(xy).

If we divide the numerator and denominator of the fraction by x^2, we obtain y' = (1 + (y/x)^2)/(y/x), which is homogeneous.

Setting v = y/x yields x v' = (1 + v^2)/v - v = 1/v.

Separating and integrating then yields ∫ v dv = ∫ (1/x) dx, so that (1/2) v^2 = ln(x) + C, so v = sqrt(2 ln(x) + C) and y = x sqrt(2 ln(x) + C).

1.6 First Order: Exact Equations and Integrating Factors

Theorem (Exact Equations): For functions M(x, y) and N(x, y) with M_y = N_x (on some rectangle), there exists a function F(x, y) with F_x = M and F_y = N (on that rectangle). Then the solutions to the differential equation M(x, y) + N(x, y) y' = 0 are given (implicitly) by F(x, y) = C, where C is an arbitrary constant.

Here M_y denotes the partial derivative of M with respect to y, namely ∂M/∂y, and similarly for the other functions.

The equation M(x, y) + N(x, y) y' = 0 is also sometimes written M(x, y) dx + N(x, y) dy = 0. In this form, it is more symmetric between the variables x and y. I will generally do this.

Fun Fact: The part of the theorem stating that M_y = N_x implies the existence of a function F such that F_x = M and F_y = N is a theorem from vector calculus: the criterion M_y = N_x is equivalent to the vector field (M, N) being conservative. The function F is the corresponding potential function, with ∇F = (M, N). The rest of the theorem is really just an application of this result.

Remark: Note that if M = f(x) is a function only of x and N = -1/g(y) is a function only of y, then our equation looks like f(x) - (1/g(y)) y' = 0. Rearranging it gives the general form y' = f(x) g(y) of a separable equation. Since M_y = 0 = N_x in this case, separable equations are a special case of exact equations.

We can use the theorem to solve exact equations, where M_y = N_x. If the partial derivatives are not equal, we are not necessarily out of luck: like with first-order linear equations, there may exist an integrating factor I(x, y) which we can multiply the equation by, in order to make the equation exact.

Unfortunately, we don't really get much for free: trying to solve for the integrating factor is often as hard as solving the original equation. Finding I(x, y), in general, requires solving the PDE

    M (∂I/∂y) - N (∂I/∂x) + I (M_y - N_x) = 0,

which is just as tricky to solve as the original equation. Only in a few special cases are there methods for computing the integrating factor I(x, y).

Case 1: Suppose we want to see if there exists an integrating factor that depends only on x (and not on y). Then ∂I/∂y would be zero, since I does not depend on y, and so I(x) would need to satisfy I'/I = (M_y - N_x)/N. This can only happen if the ratio (M_y - N_x)/N is a function P(x) only of x (and not y); then I(x) = e^{∫ P(x) dx}.

The form of this integrating factor should look familiar: it is the same as the one from a first-order linear equation. There is a very good reason for this; namely, a first-order linear equation is a special case of this form of equation.

Case 2: We could also look to see if there is an integrating factor that depends only on y and not on x. We can do the same calculation, this time using ∂I/∂x = 0, to see that such an integrating factor exists if the ratio (N_x - M_y)/M is a function Q(y) only of y (and not x); then I(y) = e^{∫ Q(y) dy}.

Remark: There is no really good reason only to consider these cases, aside from the fact that they're the easiest. We could just as well try to look for integrating factors that are a function of the variable t = xy. Or of v = x/y. Or of w = y + ln(x). In each case we'd end up with some other kind of condition. But we won't think about those things; we really just care about the two kinds of integrating factors above.

Example: Solve for y(x), if (4y^2 + 2x) + (8xy) y' = 0.

There is no obvious substitution to make, and it is not separable, linear, homogeneous, or Bernoulli. So we must check for exactness.

In differential form the equation is (4y^2 + 2x) dx + 8xy dy = 0. Therefore, M = 4y^2 + 2x and N = 8xy. Therefore we have M_y = 8y and N_x = 8y. Since these are equal, the equation is exact.

So we want to find F with F_x = M and F_y = N. Taking the anti-partial-derivative of M with respect to x yields F(x, y) = 4xy^2 + x^2 + g(y) for some function g(y). Checking then shows F_y = 8xy + g'(y), so g'(y) = 0.

Therefore, our solutions are given implicitly by 4xy^2 + x^2 = C.
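The exactness check and the implicit solution can both be verified symbolically; the following sketch (not part of the notes) does the bookkeeping for this example.

    import sympy as sp

    x, y = sp.symbols('x y')
    M = 4*y**2 + 2*x
    N = 8*x*y

    print(sp.diff(M, y) == sp.diff(N, x))          # True: M_y = N_x = 8y, so exact

    F = 4*x*y**2 + x**2
    print(sp.diff(F, x) - M, sp.diff(F, y) - N)    # 0 0: F_x = M and F_y = N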

Example: Solve for y(x), if (2xy^2 - 4y) + (3x^2 y - 8x) y' = 0.

There is no obvious substitution to make, and it is not separable, linear, homogeneous, or Bernoulli. So we must check for exactness.

In differential form the equation is (2xy^2 - 4y) dx + (3x^2 y - 8x) dy = 0. Therefore, M = 2xy^2 - 4y and N = 3x^2 y - 8x. Therefore we have M_y = 4xy - 4 and N_x = 6xy - 8. These are not equal, so the equation isn't exact.

We look for integrating factors using the two criteria we know.

First, we have (M_y - N_x)/N = (-2xy + 4)/(3x^2 y - 8x), which is not a function of x only.

Second, we have (N_x - M_y)/M = (2xy - 4)/(2xy^2 - 4y) = 1/y, which is a function of y only. Therefore we need to multiply by the integrating factor I(y) = e^{∫ (1/y) dy} = y.

Our new equation is therefore (2xy^3 - 4y^2) + (3x^2 y^2 - 8xy) y' = 0.

Now we want to find F with F_x = 2xy^3 - 4y^2 and F_y = 3x^2 y^2 - 8xy. Taking the anti-partial-derivative of the first equation gives F(x, y) = x^2 y^3 - 4xy^2 + f(y), and checking in the second equation shows f'(y) = 0.

Therefore, our solutions are given implicitly by x^2 y^3 - 4xy^2 = C.

1.7 First Order: General Procedure
First Order: General Procedure


We can combine all of the techniques for solving rst-order dierential equations into a handy list of steps. For the purposes of this course, if the equation cannot be simplied via a substitution (an obvious substitution, or if it is homogeneous or Bernoulli) then it is either exact, or can be made exact by multiplying by an integrating factor. (If it's not one of those, then in this course we have no idea how to solve it.)

Note: It is not really necessary to check ahead of time whether the equation is separable or a rst-order linear equation. Separable equations are exact, and rst-order linear equations can be made exact after multiplying by an integrating factor, which will be detected using the

two special types at the beginning only because it's faster to solve it using the usual methods.

M y Nx N

test. I check for these

Here is the general procedure to follow to solve rst-order equations:

Step 1: Write the equation in the two standard forms check to see if it is rst-order linear or separable.

y = f (x, y)

and

M (x, y) + N (x, y) y = 0

and

Step 1a: If the equation is rst-order linear  namely, of the form by the integrating factor

y + P (x)y = Q(x)  then multiply

I(x) = e

P (x) dx

and then take the antiderivative of both sides.

Step 1b: If the equation is separable  namely, of the form

y = f (x) g(y)  then separate the y -terms and x-terms on opposite sides of the equation and then take the antiderivative of both sides. y = f (x, y)
form).

Step 2: Look for possible substitutions (generally, using the

Step 2a: Check to see if there is any 'obvious' substitution that would simplify the equation.

Step 2b: Check to see if the equation is of Bernoulli type  namely, of the form If so, multiply both sides by rst-order linear equation

y for some y =F x y dv function F . If so, make the substitution v = to obtain a separable equation x = F (v) v . x dx Step 3: If the equation is not of a special type, use the M (x, y) + N (x, y) y = 0 form to nd the partial derivatives My and Nx .
Step 2c: Check to see if the equation is homogeneous  namely, of the form

(1 n) y and then make the dv + (1 n)P (x) v = (1 n) Q(x). dx

substitution

y +P (x)y = Q(x)y n . v = y 1n to obtain a

Step 4: If factor

My = Nx ,

no integrating factor is needed. Otherwise, if

M y = Nx , x,

look for an integrating

to multiply both sides of the equation by. Compute

Step 3a:

I(x) = e
Step 3b:

P (x) dx

M y Nx . N Nx My . M

If it is a function

P (x)

only of

then the integrating factor is

. If it is a function

I(y) = e

Compute

Q(y)

only of

y,

then the integrating factor is

Q(y) dy

If neither of these methods works, you're out of luck unless you can nd an integrating factor some other way.

Step 5:

Take antiderivatives to nd the function

F (x, y)

with

Fx = M

and

Fy = N ,

and write the

solutions as

F (x, y) = C .
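Below is a minimal sketch (not from the notes) of the exactness bookkeeping in Steps 3-4: it computes M_y, N_x, and the two candidate ratios, using the second worked example from Section 1.6 as input.

    import sympy as sp

    x, y = sp.symbols('x y')
    M = 2*x*y**2 - 4*y
    N = 3*x**2*y - 8*x

    My, Nx = sp.diff(M, y), sp.diff(N, x)
    print("M_y =", My, "  N_x =", Nx)       # not equal, so the equation is not exact

    ratio_x = sp.simplify((My - Nx) / N)    # integrating factor in x exists if this has no y
    ratio_y = sp.simplify((Nx - My) / M)    # integrating factor in y exists if this has no x
    print(ratio_x, ratio_y)                 # ratio_y simplifies to 1/y, so I(y) = y works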

1.8 First Order: General Problems and Solutions


Part of the difficulty of seeing first-order differential equations outside of a homework set (e.g., on exams) is that it is not always immediately obvious which method or methods will solve the problem. Thus, it is good to practice problems without being told which method to use.

1.8.1 Problems

Solve the equation x y' = y + sqrt(xy).

Solve the equation y' = (y - 2xy^2)/(3x^2 y - 2x).

Solve the equation x y' = y + sqrt(x).

Solve the equation y' - 1 = y^2 + x^3 + x^3 y^2.

Solve the equation y' = -(4x^3 y^2 + y)/(2x^4 y + x).

Solve the equation y' = x y^3 - 6y/x.

Solve the equation y' = -(2xy + 2x)/(x^2 + 1).
1.8.2 Solutions

Solve the equation x y' = y + sqrt(xy).

Step 1: The two standard forms are y' = y/x + sqrt(y/x) and (y + sqrt(xy)) - x y' = 0. The equation is not separable or first-order linear.

Step 2: We go down the list and recognize that y' = y/x + sqrt(y/x) is a homogeneous equation.

Setting v = y/x (with y = vx and y' = x v' + v) yields x v' = sqrt(v).

This equation is separable: we have dv/sqrt(v) = (1/x) dx, hence 2 v^{1/2} = ln(x) + C, so v = (ln(x)/2 + C)^2. Solving for y gives y = x (ln(x)/2 + C)^2.

Note: The equation is also of Bernoulli type, and could be solved that way too. Of course, it will give the same answer.

Solve the equation y' = (y - 2xy^2)/(3x^2 y - 2x).

Step 1+2: The other standard form is (2xy^2 - y) + (3x^2 y - 2x) y' = 0. The equation is not separable or linear, nor is it homogeneous or Bernoulli.

Step 3: We have M = 2xy^2 - y and N = 3x^2 y - 2x, and M_y = 4xy - 1 and N_x = 6xy - 2.

Step 4: We need to look for an integrating factor, because M_y ≠ N_x. We have (M_y - N_x)/N = (-2xy + 1)/(3x^2 y - 2x), which is not a function of x alone. Next we try (N_x - M_y)/M = (2xy - 1)/(2xy^2 - y) = 1/y, so the integrating factor is I(y) = e^{∫ (1/y) dy} = y.

Step 5: The new equation is (2xy^3 - y^2) + (3x^2 y^2 - 2xy) y' = 0. Taking the anti-partial of the new M with respect to x gives F(x, y) = x^2 y^3 - xy^2 + f(y), and checking shows that f'(y) = 0. Hence the solutions are x^2 y^3 - xy^2 = C.

Solve the equation x y' = y + sqrt(x).

Step 1: The two standard forms are y' = (1/x) y + x^{-1/2} and (y + sqrt(x)) - x y' = 0. The equation is first-order linear.

Rewrite in the usual first-order linear form: y' - (x^{-1}) y = x^{-1/2}.

We have the integrating factor I(x) = e^{∫ -1/x dx} = e^{-ln(x)} = x^{-1}.

Thus the new equation is x^{-1} y' - x^{-2} y = x^{-3/2}.

Taking the antiderivative on both sides yields x^{-1} y = -2 x^{-1/2} + C, so y = -2 x^{1/2} + Cx.

Solve the equation y' - 1 = y^2 + x^3 + x^3 y^2.

Step 1: Adding 1 and then factoring the right-hand side gives y' = (y^2 + 1)(x^3 + 1). This equation is separable. Separating it gives dy/(y^2 + 1) = (x^3 + 1) dx.

Integrating yields ∫ dy/(y^2 + 1) = ∫ (x^3 + 1) dx, so tan^{-1}(y) = x^4/4 + x + C. Then y = tan(x^4/4 + x + C).

Solve the equation y' = -(4x^3 y^2 + y)/(2x^4 y + x).

Step 1+2: The other standard form is (4x^3 y^2 + y) + (2x^4 y + x) y' = 0. The equation is not separable or linear, nor is it homogeneous or Bernoulli.

Step 3: We have M = 4x^3 y^2 + y and N = 2x^4 y + x, so M_y = 8x^3 y + 1 and N_x = 8x^3 y + 1.

Step 4: No integrating factor is needed since M_y = N_x.

Step 5: Taking the anti-partial of M with respect to x gives F(x, y) = x^4 y^2 + xy + f(y), and checking shows that f'(y) = 0. Hence the solutions are x^4 y^2 + xy = C.

Solve the equation y' = x y^3 - 6y/x.

Step 1: The equation is of Bernoulli type when written as y' + (6/x) y = x y^3.

Multiply both sides by -2y^{-3} to get -2y^{-3} y' - (12/x) y^{-2} = -2x.

Making the substitution v = y^{-2} (with v' = -2y^{-3} y') then yields the linear equation v' - (12/x) v = -2x.

The integrating factor is e^{∫ -(12/x) dx} = e^{-12 ln(x)} = x^{-12}.

The new equation is x^{-12} v' - 12 x^{-13} v = -2 x^{-11}.

Taking the antiderivative on both sides yields x^{-12} v = (1/5) x^{-10} + C, so v = (1/5) x^2 + C x^{12} and y = ((1/5) x^2 + C x^{12})^{-1/2}.
Solve the equation y' = -(2xy + 2x)/(x^2 + 1).

(Method #1)

Step 1: The equation is separable, since after factoring we see that y' = -(2x/(x^2 + 1))(y + 1).

Separating and integrating gives ∫ dy/(y + 1) = ∫ -2x/(x^2 + 1) dx, so that ln(y + 1) = -ln(x^2 + 1) + C.

Exponentiating yields y + 1 = e^{-ln(x^2 + 1) + C} = C/(x^2 + 1), so y = C/(x^2 + 1) - 1.

(Method #2)

Step 1: The other standard form is (2xy + 2x) + (x^2 + 1) y' = 0, so M = 2xy + 2x and N = x^2 + 1.

Step 3: We have M_y = 2x and N_x = 2x.

Step 4: We have M_y = N_x, so the equation is exact.

Step 5: Taking the anti-partial of M with respect to x gives F(x, y) = x^2 y + x^2 + f(y), and checking shows that f'(y) = 1, so f(y) = y. Hence the solutions are x^2 y + x^2 + y = C.

Note of course that we can solve for y explicitly, and we obtain exactly the same expression as in the other solution.

Well, you're at the end of my handout. Hope it was helpful. Copyright notice: This material is copyright Evan Dummit, 2012. You may not reproduce or distribute this material without my express permission.

Math 320 (part 1a) : First-Order Supplement

(by Evan Dummit, 2012, v. 1.00)

Contents

0.1 First Order: Some Applications
    0.1.1 The General Mixing Problem
0.2 Autonomous Equations, Equilibria, and Stability
0.3 Euler's Method

0.1 First Order: Some Applications

0.1.1 The General Mixing Problem
The General Mixing Problem

Setup:

We have some reservoir (pool, lake, ocean, planet, room) of liquid (water, gas) which has some substance (pollution, solute) dissolved in it. The reservoir starts at an initial volume V_0, and there is an initial amount of substance y_0 in the reservoir.

We have some amount of liquid In(t) flowing in with a given concentration k(t) of the substance, and some other amount of liquid Out(t) flowing out.

We assume that the substance is uniformly and perfectly mixed in the reservoir, and want to know the amount y(t) of the substance that remains in the reservoir after time t.

Note that this is the general setup. In more specific examples, the amount of liquid flowing in or out may be constants (i.e., not depending on time), and similarly the concentration of the liquid flowing in could also be a constant. The solution is the same, of course.

Solution:

Let V(t) be the total volume of the reservoir. Then if y(t) is the total amount of substance in the reservoir, the concentration of substance in the reservoir is y(t)/V(t). Thus the total amount of substance moving in is k(t) In(t), and the total amount of substance moving out is the concentration of substance times the volume moving out, or (y(t)/V(t)) Out(t).

Thus we have

    V'(t) = In(t) - Out(t),    y'(t) = k(t) In(t) - (y(t)/V(t)) Out(t).

Now to solve this system, we integrate to find V(t) explicitly. Then we can rewrite the other equation as

    y' + (Out(t)/V(t)) y = k(t) In(t),

which we can now solve because it is first-order linear.
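As an illustration (not part of the handout), the sketch below applies this setup numerically to a simple case with constant rates: In(t) = 8, Out(t) = 3, inflow concentration k(t) = 11, V_0 = 900, y_0 = 450 (the same numbers as the brine-tank exam problem earlier in this file).

    import numpy as np
    from scipy.integrate import solve_ivp

    In_rate, Out_rate, k, V0, y0 = 8.0, 3.0, 11.0, 900.0, 450.0

    def dydt(t, y):
        V = V0 + (In_rate - Out_rate) * t            # V'(t) = In(t) - Out(t)
        return k * In_rate - (y[0] / V) * Out_rate   # y'(t) = k*In - (y/V)*Out

    sol = solve_ivp(dydt, (0, 60), [y0], t_eval=[0, 20, 40, 60])
    print(dict(zip(sol.t, np.round(sol.y[0], 1))))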

0.2 Autonomous Equations, Equilibria, and Stability

An autonomous equation is a first-order equation of the form dy/dt = f(y) for some function f.

An equation of this form is separable, and thus solvable in theory. However, sometimes the function f(y) is sufficiently complicated that we cannot actually solve the equation explicitly. Nonetheless, we would like to be able to say something about what the solutions look like, without actually solving the equation. Happily, this is possible.

An equilibrium solution, also called a steady state solution or a critical point, is a solution of the form y(t) = c for some constant c. (In other words, it is just a constant-valued solution.)

Clearly, if y(t) is constant, then y'(t) is zero everywhere. Thus, in order to find the equilibrium solutions to an autonomous equation y' = f(y), we just need to solve f(y) = 0. (And this is not generally so hard.)

For equilibrium solutions, we have some notions of stability:

An equilibrium solution y = c is stable from above if, when we solve y' = f(y) with the initial condition y(0) = c + eps for some small but positive eps, the solution y(t) moves toward c as t increases. This statement is equivalent to f(c + eps) < 0.

A solution y = c is stable from below if, when we solve y' = f(y) with the initial condition y(0) = c - eps for some small but positive eps, the solution y(t) moves toward c as t increases. This statement is equivalent to f(c - eps) > 0.

A solution y = c is unstable from above if, when we solve y' = f(y) with the initial condition y(0) = c + eps for some small but positive eps, the solution y(t) moves away from c as t increases. This statement is equivalent to f(c + eps) > 0.

A solution y = c is unstable from below if, when we solve y' = f(y) with the initial condition y(0) = c - eps for some small but positive eps, the solution y(t) moves away from c as t increases. This statement is equivalent to f(c - eps) < 0.

We say a solution is stable if it is stable from above and from below. We say it is unstable if it is unstable from above and from below. Otherwise (if it is stable from one side and unstable from the other) we say it is semistable.

From the equivalent conditions about the sign of f, here are the steps to follow to find and classify the equilibrium states of y' = f(y):

Step 1: Find all values of c for which f(c) = 0, to find the equilibrium states.

Step 2: Mark all the equilibrium values on a number line, and then in each interval between two critical points, plug in a test value to f to determine whether f is positive or negative on that interval.

Step 3: On each interval where f is positive, draw right-arrows, and on each interval where f is negative, draw left-arrows.

Step 4: Using the arrows, classify each critical point: if the arrows point toward it from both sides, it is stable. If the arrows point away, it is unstable. If the arrows both point left or both point right, it is semistable.

Step 5 (optional): Draw some solution curves, either by solving the equation or by using the stability information.

Example: Find the equilibrium states of y' = y and determine stability.

Step 1: We have f(y) = y, which obviously is zero only when y = 0.

Step 2: We draw the line and plug in 2 test points (or just think for a second) to see that f is negative to the left of 0 and positive to the right of 0.

Step 3: Changing the sign diagram to arrows gives arrows pointing away from 0 on both sides.

Step 4: So we can see from the diagram that the only equilibrium point, y = 0, is unstable.

Step 5: We can of course solve the equation to see that the solutions are of the form y(t) = C e^t, and indeed, the equilibrium solution y = 0 is unstable.

Example: Find the equilibrium states of y' = y^2 (y - 1)(y - 2) and determine stability.

Step 1: We have f(y) = y^2 (y - 1)(y - 2), which conveniently is factored. We see it is zero when y = 0, y = 1, and y = 2.

Step 2: We draw the line and plug in 4 test points (or just think for a second) to see that f is positive for y < 0, positive for 0 < y < 1, negative for 1 < y < 2, and positive for y > 2.

Step 3: Changing the sign diagram to arrows gives right-arrows on (-infinity, 0) and (0, 1), left-arrows on (1, 2), and right-arrows on (2, infinity).

Step 4: So we can see from the diagram that y = 0 is semistable, y = 1 is stable, and y = 2 is unstable.

Step 5: In this case, it is possible to obtain an implicit solution by integration; however, an explicit solution does not exist. However, we can graph some solution curves to see that, indeed, our classification is accurate.
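The sign-testing procedure above is easy to mechanize; here is a small sketch (not from the handout) that classifies the equilibria of f(y) = y^2 (y - 1)(y - 2) by evaluating f just to the left and right of each critical point.

    import numpy as np

    def f(y):
        return y**2 * (y - 1) * (y - 2)

    critical_points = [0.0, 1.0, 2.0]
    eps = 1e-3
    for c in critical_points:
        left  = f(c - eps) > 0   # True means the arrow on the left points right (toward c)
        right = f(c + eps) > 0   # True means the arrow on the right points right (away from c)
        if left and not right:
            label = "stable"
        elif (not left) and right:
            label = "unstable"
        else:
            label = "semistable"
        print(c, label)          # 0.0 semistable, 1.0 stable, 2.0 unstable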

0.3 Euler's Method

If we have exhausted all of our techniques trying to solve a first-order initial value problem y' = f(x, y) with y(a) = y_0, we are sad. However, perhaps we would like to be able to find an approximate solution on some interval [a, b].

One method we can use to find an approximate solution is Euler's Method, named after the Swiss mathematician Leonhard Euler (pronounced "oiler").

The general idea behind Euler's Method, which should bring back memories of basic calculus, is to break up the interval [a, b] into many small pieces, and then to use a linear approximation to the function y(x) on each interval to trace a rough solution to the equation.

Here is the method, more formally:

Step 1: Choose the number of subintervals n, and let h = (b - a)/n be the width of the subintervals.

Step 2: Define the x-values x_0 = a, x_1 = x_0 + h, x_2 = x_1 + h, ..., x_n = x_{n-1} + h = b.

Step 3: Take y_0 to be the given initial value. Then compute, iteratively, the values y_1 = y_0 + h f(x_0, y_0), y_2 = y_1 + h f(x_1, y_1), ..., y_n = y_{n-1} + h f(x_{n-1}, y_{n-1}). It is easiest to organize this information in a table.

Step 4 (optional): Plot the points (x_0, y_0), (x_1, y_1), ..., (x_n, y_n) and connect them with a smooth curve.

Example: Use Euler's Method to find an approximate solution on the interval [1, 2] to the differential equation y' = ln(x + y) with y(1) = 1.

Step 1: Let's take 10 subintervals. Then h = 0.1.

Steps 2+3: We organize our information in the table below. We fill out the first row with the x-values. Then we fill in the empty columns one at a time: to start the next column, we add the h f(x, y) value to the y-value to get the next y-value.

    x          1       1.1     1.2     1.3     1.4     1.5     1.6     1.7     1.8     1.9     2.0
    y          1       1.0693  1.1467  1.2320  1.3249  1.4251  1.5324  1.6466  1.7674  1.8946  2.0280
    f(x, y)    0.693   0.774   0.853   0.929   1.002   1.073   1.1418  1.208   1.272   1.334   -
    h f(x, y)  0.0693  0.0774  0.0853  0.0929  0.1002  0.1073  0.1142  0.1208  0.1272  0.1334  -

Step 4: Finally, we can plot the points, and (for comparison) the actual solution curve obtained using a computer. As can be seen from the graph, the approximation is very good.
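Euler's Method is straightforward to implement; the sketch below (not part of the handout) reproduces the table above for y' = ln(x + y), y(1) = 1, with h = 0.1.

    import math

    def euler(f, a, b, y0, n):
        """Approximate the solution of y' = f(x, y), y(a) = y0, on [a, b] with n steps."""
        h = (b - a) / n
        x, y = a, y0
        points = [(x, y)]
        for _ in range(n):
            y = y + h * f(x, y)
            x = x + h
            points.append((x, y))
        return points

    for x, y in euler(lambda x, y: math.log(x + y), 1.0, 2.0, 1.0, 10):
        print(f"x = {x:.1f}   y = {y:.4f}")
    # Final value: y(2) is approximately 2.0280, matching the table.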

Well, you're at the end of my handout. Hope it was helpful. Copyright notice: This material is copyright Evan Dummit, 2012. You may not reproduce or distribute this material without my express permission.

Math 320 (part 2) : Matrices

(by Evan Dummit, 2012, v. 1.00)

Contents
1 Matrices
  1.1 Linear Equations, Gauss-Jordan Elimination
  1.2 Matrix Operations: Addition, Multiplication
  1.3 Determinants and Inverses
  1.4 Matrices and Systems of Linear Equations, revisited

1 Matrices

An m x n matrix is an array of numbers with m rows and n columns. For example,

    [ 4  1  3 ]
    [ 2  1  0 ]

is a 2 x 3 matrix, and

    [ 0  0  0 ]
    [ 0  0  0 ]
    [ 0  0  9 ]

is a 3 x 3 matrix.

A square matrix is one with the same number of rows and columns; i.e., an n x n matrix for some n.

Matrices originally arose as a way to simplify the algebra involved in solving systems of linear equations.

1.1 Linear Equations, Gauss-Jordan Elimination

Reminder: A linear equation is an equation of the form a_1 x_1 + a_2 x_2 + ... + a_n x_n = b, for some constants a_1, ..., a_n, b, and variables x_1, ..., x_n.

The traditional method for solving a system of linear equations (typically covered in basic algebra) is by elimination: i.e., by solving one equation for the first variable x_1, then plugging in the result to all the other equations to obtain a reduced system involving fewer variables. Eventually, the system will simplify either to a contradiction (e.g., 1 = 0), a unique solution, or an infinite family of solutions.
Example: Given the system

    x  +  y = 7
    2x - 2y = -2

we can solve the first equation for x to obtain x = 7 - y. Then plugging in this relation to the second equation gives 2(7 - y) - 2y = -2, or 14 - 4y = -2, so that y = 4. Then since x = 7 - y we obtain x = 3.

Another way to perform elimination is to add and subtract multiples of the equations, so as to eliminate variables (and remove the need to solve for each individual variable before eliminating it). In the example above, instead of solving the first equation for x, we could multiply the first equation by -2 and then add it to the second equation, so as to eliminate x from the second equation. This yields the same overall result, but is less computationally difficult.
Example: Given the system

    x  +  y + 3z = 4
    2x + 3y -  z = 1
    -x + 2y + 2z = 1

if we label the equations #1, #2, #3, then we can eliminate x by taking [#2] - 2[#1] and [#3] + [#1]. This gives

    x + y + 3z = 4
        y - 7z = -7
       3y + 5z = 5

and now we can eliminate y by taking [#3] - 3[#2]. This yields

    x + y + 3z = 4
        y - 7z = -7
           26z = 26.

Now the third equation gives z = 1, then the second equation requires y = 0, and the first equation gives x = 1.

Now, this procedure of elimination can be simplified even more, because we don't really need to write the variables down every time. We only need to keep track of the coefficients, which we can do by putting them into a matrix.

Example: The system

    x  +  y + 3z = 4
    2x + 3y -  z = 1
    -x + 2y + 2z = 1

in matrix form becomes

    [  1  1  3 | 4 ]
    [  2  3 -1 | 1 ]
    [ -1  2  2 | 1 ].

Note: When working with a coefficient matrix, I generally draw a line to separate the coefficients of the variables from the constant terms.

Now we can perform the elimination operations on the rows of the matrix (the elementary row operations) in order to reduce the matrix to row-echelon form. The three basic row operations on the coefficient matrix which will not change the solutions to the equations are the following:
1. Interchange two rows.
2. Multiply all entries in a row by a nonzero constant.
3. Add a constant multiple of one row to another row.

Here are the steps to follow to solve a system of linear equations by reducing the coefficient matrix to row-echelon form. This procedure is known as Gauss-Jordan Elimination:

Step 1: Convert the system to its coefficient matrix.

Step 2: If all entries in the first column are zero, then the variable corresponding to that column is a free variable. Ignore this column and repeat this step until an entry in the first column is nonzero. Swap rows, if necessary, so that the upper-left entry in the first column is nonzero.

Step 3: Use row operation #3 (and #2, if it will simplify arithmetic) to clear out all entries in the first column below the first row.

Step 4: Repeat steps 2 and 3 on the submatrix obtained by ignoring the first row and first column, until all remaining rows have all entries equal to zero.

After following the steps, the matrix will now be in row-echelon form, and the system can be fairly easily solved. To put the matrix in reduced row-echelon form, we have an optional extra step:

Step 5: Identify the pivotal columns (columns containing a leading row-term), and then perform row operations to clear out all non-leading entries in each pivotal column.

Reduced row-echelon form is useful because, when the matrix is converted back into a system of equations, it expresses each of the determined variables in terms of the free variables. It is also theoretically useful because it can be proven that every matrix has a unique reduced row-echelon form.

Note: It is rather distressingly difficult to write a clear description of how to reduce a matrix to row-echelon form. Examples illustrate the procedure much better.

Example: Solve the system x + y + 3z = 4, 2x + 3y - z = 1, -x + 2y + 2z = 1.

We write the system in matrix form to get

    [  1  1  3 | 4 ]
    [  2  3 -1 | 1 ]
    [ -1  2  2 | 1 ]

and now we repeatedly apply elementary row operations to clear out the first column (R2 - 2R1, then R3 + R1):

    [ 1  1  3 |  4 ]
    [ 0  1 -7 | -7 ]
    [ 0  3  5 |  5 ].

Now we are done with the first column and look at the second column, ignoring the first row (R3 - 3R2, then (1/26)R3):

    [ 1  1  3 |  4 ]        [ 1  1  3 |  4 ]
    [ 0  1 -7 | -7 ]   ->   [ 0  1 -7 | -7 ]
    [ 0  0 26 | 26 ]        [ 0  0  1 |  1 ].

Now the system is in row-echelon form. To put it in reduced row-echelon form, we can work from the bottom up (R2 + 7R3, then R1 - R2, then R1 - 3R3):

    [ 1  0  0 | 1 ]
    [ 0  1  0 | 0 ]
    [ 0  0  1 | 1 ].

And now from here it is very easy to get the solution to the system: x = 1, y = 0, z = 1, which is the same answer we got when we did the row operations without using a matrix.
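Row reduction is also easy to check by machine; the following sketch (not part of the notes) uses sympy's rref on the augmented matrix from the example above.

    import sympy as sp

    aug = sp.Matrix([[ 1, 1,  3, 4],
                     [ 2, 3, -1, 1],
                     [-1, 2,  2, 1]])

    R, pivots = aug.rref()
    print(R)        # rows (1,0,0,1), (0,1,0,0), (0,0,1,1): so x = 1, y = 0, z = 1
    print(pivots)   # (0, 1, 2)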

Example: Solve the system x + y - 2z = 3, -x + 3y - 5z = 1, 3x - y + z = 2.

The coefficient matrix is

    [  1  1 -2 | 3 ]
    [ -1  3 -5 | 1 ]
    [  3 -1  1 | 2 ].

Now put it in row-echelon form (R2 + R1 and R3 - 3R1, then R3 + R2):

    [ 1  1 -2 |  3 ]        [ 1  1 -2 |  3 ]
    [ 0  4 -7 |  4 ]   ->   [ 0  4 -7 |  4 ]
    [ 0 -4  7 | -7 ]        [ 0  0  0 | -3 ].

At this stage we have reached a contradiction, since the bottom row reads 0 = -3. Therefore, there is no solution.

Example: Find all solutions to the system

x - y + z - t + w = 5,    -x + y - z + t = 4,    2t + 3w = 3.

The coefficient matrix is

[  1 -1  1 -1  1 | 5 ]
[ -1  1 -1  1  0 | 4 ]
[  0  0  0  2  3 | 3 ]

Now put it in row-echelon form. R2 -> R2 + R1, and then swapping R2 and R3, gives

[ 1 -1  1 -1  1 | 5 ]      [ 1 -1  1 -1  1 | 5 ]
[ 0  0  0  0  1 | 9 ]  ->  [ 0  0  0  2  3 | 3 ]
[ 0  0  0  2  3 | 3 ]      [ 0  0  0  0  1 | 9 ]

Now put it in reduced row-echelon form (the pivotal columns are 1, 4, and 5). Applying R1 -> R1 - R3 and R2 -> R2 - 3R3, then R2 -> (1/2)R2, then R1 -> R1 + R2, gives

[ 1 -1  1 -1  0 |  -4 ]      [ 1 -1  1 -1  0 |  -4 ]      [ 1 -1  1  0  0 | -16 ]
[ 0  0  0  2  0 | -24 ]  ->  [ 0  0  0  1  0 | -12 ]  ->  [ 0  0  0  1  0 | -12 ]
[ 0  0  0  0  1 |   9 ]      [ 0  0  0  0  1 |   9 ]      [ 0  0  0  0  1 |   9 ]

And now we see that the general solution to the system is (x, y, z, t, w) = (y - z - 16, y, z, -12, 9), where y and z are our arbitrary free variables.
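If you want to check a row reduction like this one by machine, the sympy library can compute reduced row-echelon forms exactly. This snippet is only an illustration; the handout itself does not use any software.

```python
from sympy import Matrix

# Augmented matrix of the system x - y + z - t + w = 5, -x + y - z + t = 4, 2t + 3w = 3.
M = Matrix([
    [ 1, -1,  1, -1, 1, 5],
    [-1,  1, -1,  1, 0, 4],
    [ 0,  0,  0,  2, 3, 3],
])

R, pivot_columns = M.rref()
print(R)              # the reduced row-echelon form computed above
print(pivot_columns)  # (0, 3, 4): columns 1, 4, and 5 in 1-based numbering
```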

1.2 Matrix Operations: Addition, Multiplication


Notation: If A is a matrix, we will denote by a_(i,j) the entry of A in the i-th column and j-th row of A. This entry will also be called the (i, j)-entry of A.

Memory aid: If you have trouble getting the coordinates mixed up, you can think of i and j as analogous to the x and y coordinates in the Cartesian plane: relative to the entry in the top left, the value of i indicates the horizontal position, and the value of j indicates the vertical position.

We say two matrices are equal if all of their entries are equal. Like with vectors, we can add matrices of the same dimension, and multiply a matrix by a scalar. Each of these operations is done componentwise: to add, we just add corresponding entries of the two matrices; to multiply by a scalar, we just multiply each entry by that scalar.

Example: If A = [ 1 6 ; 2 2 ] and B = [ 3 0 ; 0 2 ], then

A + B = [ 1+3  6+0 ; 2+0  2+2 ] = [ 4 6 ; 2 4 ],

2A = [ 2·1  2·6 ; 2·2  2·2 ] = [ 2 12 ; 4 4 ],

and A - (1/3)·B = [ 0 6 ; 2 4/3 ].

We also have a transposition operation, where we interchange the rows and columns of the matrix. More explicitly, given an n x m matrix A, the transpose of A, denoted A^T, is the m x n matrix whose (i, j)-entry is equal to the (j, i)-entry of A.

Example: If A = [ 1 2 3 ; 4 5 6 ], then A^T = [ 1 4 ; 2 5 ; 3 6 ].

Matrix multiplication, however, is NOT performed componentwise. Instead, the product of two matrices is the row-column product. Explicitly, if A is an m x n matrix and B is an n x q matrix, then the product AB is the m x q matrix whose (i, j)-entry is the dot product of the j-th row of A and the i-th column of B (where the rows and columns are thought of as vectors of length n).

More explicitly, the (i, j)-entry of AB is given by the sum of products

(AB)_(i,j) = sum over k = 1 to n of a_(i,k) b_(k,j).

Note: In order for the matrix product AB to exist, the number of columns of A must equal the number of rows of B. In particular, if A and B are the same size, their product exists only if they are square matrices. Also, if AB exists, then BA may not necessarily exist.
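As a sketch of the row-column rule in code (plain Python lists; the helper name mat_mult is my own, not part of the handout), the entry in row r and column c of the product is the dot product of row r of the first matrix with column c of the second:

```python
def mat_mult(A, B):
    """Row-column product of two matrices given as lists of rows."""
    n = len(B)                          # inner dimension: columns of A must equal rows of B
    assert all(len(row) == n for row in A), "columns of A must equal rows of B"
    q = len(B[0])
    return [[sum(A[r][k] * B[k][c] for k in range(n)) for c in range(q)]
            for r in range(len(A))]

A = [[1, 2],
     [3, 4]]
B = [[2, 3],
     [4, 5]]
print(mat_mult(A, B))   # [[10, 13], [22, 29]]  (the same product appears in an example further below)
```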

If you are wondering why matrix multiplication is defined this way, the general answer is: in order to make compositions of linear transformations of vector spaces work correctly. The more specific answer is: in order to solidify the relationship between solving systems of linear equations and matrices.

Motivating Example #1: The system of equations x + y = 7, 2x - 2y = 2 can be rewritten as a matrix equation

[ 1  1 ] [ x ]   [ 7 ]
[ 2 -2 ] [ y ] = [ 2 ]

since the product on the left-hand side is the column matrix [ x + y ; 2x - 2y ]. It is very useful to rewrite systems of equations as a single matrix equation; we will do this frequently.

Motivating Example #2: Consider what happens if we are given the equations x1 = y1 + y2, x2 = 2y1 - y2 and the equations y1 = 3z1 - z2, y2 = z1 - z2, and want to express x1 and x2 in terms of z1 and z2. You can just plug in and check that x1 = 4z1 - 2z2 and x2 = 5z1 - z2.

In terms of matrices this says

[ x1 ]   [ 1  1 ] [ y1 ]         [ y1 ]   [ 3 -1 ] [ z1 ]
[ x2 ] = [ 2 -1 ] [ y2 ]   and   [ y2 ] = [ 1 -1 ] [ z2 ].

So we would want to be able to say

[ x1 ]   [ 1  1 ] [ 3 -1 ] [ z1 ]
[ x2 ] = [ 2 -1 ] [ 1 -1 ] [ z2 ].

Indeed, we have the matrix product

[ 1  1 ] [ 3 -1 ]   [ 4 -2 ]
[ 2 -1 ] [ 1 -1 ] = [ 5 -1 ].

So the definition of matrix multiplication makes everything consistent with what we'd want to happen.

Example: If A = [ -1 1 2 ; 0 1 1 ] and B = [ 1 2 ; 2 0 ; 3 3 ], then AB is defined and is a 2 x 2 matrix.

The (1, 1) entry of AB equals the dot product <-1, 1, 2> . <1, 2, 3> = (-1)(1) + (1)(2) + (2)(3) = 7.
The (2, 1) entry of AB equals the dot product <-1, 1, 2> . <2, 0, 3> = (-1)(2) + (1)(0) + (2)(3) = 4.
The (1, 2) entry of AB equals the dot product <0, 1, 1> . <1, 2, 3> = (0)(1) + (1)(2) + (1)(3) = 5.
The (2, 2) entry of AB equals the dot product <0, 1, 1> . <2, 0, 3> = (0)(2) + (1)(0) + (1)(3) = 3.

Putting all of this together gives AB = [ 7 4 ; 5 3 ].

Example: If A = [ -1 1 2 ; 0 1 1 ] and B = [ 1 2 ; 2 0 ; 3 3 ], then BA is also defined and is a 3 x 3 matrix.

The (1, 1) entry of BA equals the dot product <1, 2> . <-1, 0> = -1.
The (2, 1) entry of BA equals the dot product <1, 2> . <1, 1> = 3.
The (3, 1) entry of BA equals the dot product <1, 2> . <2, 1> = 4.
The (1, 2) entry of BA equals the dot product <2, 0> . <-1, 0> = -2.
The (2, 2) entry of BA equals the dot product <2, 0> . <1, 1> = 2.
The (3, 2) entry of BA equals the dot product <2, 0> . <2, 1> = 4.
The (1, 3) entry of BA equals the dot product <3, 3> . <-1, 0> = -3.
The (2, 3) entry of BA equals the dot product <3, 3> . <1, 1> = 6.
The (3, 3) entry of BA equals the dot product <3, 3> . <2, 1> = 9.

Putting all of this together gives BA = [ -1 3 4 ; -2 2 4 ; -3 6 9 ].

If we restrict our attention to square matrices, then matrices under addition and multiplication obey some, but not all, of the algebraic properties that real numbers do.

In general, multiplication is NOT commutative: AB typically isn't equal to BA, even if A and B are both square matrices.

Example: Let A = [ 1 2 ; 3 4 ] and B = [ 2 3 ; 4 5 ]. Then AB = [ 10 13 ; 22 29 ] while BA = [ 11 16 ; 19 28 ].
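A quick numerical check of this example (numpy's @ operator is matrix multiplication; this snippet is only an illustration, not part of the original handout):

```python
import numpy as np

A = np.array([[1, 2], [3, 4]])
B = np.array([[2, 3], [4, 5]])

print(A @ B)                          # [[10 13] [22 29]]
print(B @ A)                          # [[11 16] [19 28]]
print(np.array_equal(A @ B, B @ A))   # False: matrix multiplication is not commutative
```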

Matrix multiplication distributes over addition, on both sides: (A + B)C = AC + BC and A(B + C) = AB + AC.

Matrix multiplication is associative: (AB)C = A(BC), if A, B, C are of the proper dimensions. In particular, taking the nth power of a square matrix is well-defined for every positive integer n. This is not easy to see from the definition.

The transpose of the product of two matrices is the product of their transposes, in reverse order: (AB)^T = B^T A^T.

If A is an n x n matrix, then there is a zero matrix Z_n which has the properties Z_n + A = A and Z_n A = A Z_n = Z_n. This matrix Z_n is the matrix whose entries are all zeroes.

Example: The 2 x 2 zero matrix is [ 0 0 ; 0 0 ].

Semi-Related Interesting Example: If A = [ 0 1 ; 0 0 ], then A is not the zero matrix, but A^2 is the zero matrix. This is in contrast to real (or complex) numbers, where x^2 = 0 implies x = 0.

If A is an n x n matrix, then there is an n x n identity matrix I_n which has the property that A I_n = I_n A = A. This matrix I_n is the matrix whose diagonal entries are 1s and whose other entries are 0s.

Example: The 2 x 2 identity matrix is I_2 = [ 1 0 ; 0 1 ].

It is not hard to check that [ a b ; c d ] [ 1 0 ; 0 1 ] = [ 1 0 ; 0 1 ] [ a b ; c d ] = [ a b ; c d ] for any 2 x 2 matrix [ a b ; c d ].

1.3 Determinants and Inverses


Given a square n x n matrix A, we might like to know whether it has a multiplicative inverse: namely, a matrix A^(-1) (necessarily of the same dimension) such that A A^(-1) = A^(-1) A = I_n, where I_n is the n x n identity matrix.

Motivating Example: Suppose we have a system of linear equations, written in matrix form A x = c, for a matrix of coefficients A (with entries a_(1,1) through a_(n,n)), a column vector x = (x1, ..., xn) of variables, and a column vector c = (c1, ..., cn) of constants.

If we knew that A were invertible, then we could multiply both sides of the equation on the left by A^(-1) to see that A^(-1) (A x) = A^(-1) c. Now since A^(-1) A is the identity matrix, we have A^(-1) (A x) = x, so we see x = A^(-1) c. In other words, if we knew the matrix A^(-1), we would immediately be able to write down the solution to the system.

If A and B are invertible matrices with inverses A^(-1) and B^(-1), then AB is also an invertible matrix, with inverse B^(-1) A^(-1): observe that (AB)(B^(-1) A^(-1)) = A (B B^(-1)) A^(-1) = A (I_n) A^(-1) = A A^(-1) = I_n, and similarly for the product in the other order.

By induction, one can verify that (A1 A2 ... An)^(-1) = (An)^(-1) ... (A2)^(-1) (A1)^(-1), provided that each of A1, A2, ..., An are invertible.

Not every matrix has a multiplicative inverse. We say a matrix is singular if it is not invertible, and we say it is nonsingular if it is invertible.

Obviously, [ 0 0 ; 0 0 ] does not have an inverse. (But we wouldn't expect the zero matrix to be invertible.)

But we can also check that [ 0 1 ; 0 0 ] [ a b ; c d ] = [ c d ; 0 0 ], so [ 0 1 ; 0 0 ] does not have an inverse either.

Theorem: An n x n matrix A is invertible if and only if it is row-equivalent to the identity matrix.

To prove this theorem, simply consider the reduced row-echelon form of the matrix A. Because the matrix is square, the reduced row-echelon form is either the identity matrix, or a matrix with a row of all zeroes.

Suppose A is row-equivalent to the identity matrix. Then, because each elementary row operation corresponds to left-multiplication by an invertible matrix, we can write A as a product A = A1 A2 ... Ak, where each of the Ai is an invertible matrix corresponding to an elementary row operation. Then A^(-1) = (Ak)^(-1) ... (A2)^(-1) (A1)^(-1).

If A is not row-equivalent to the identity matrix, then its reduced row-echelon form A_red must contain a row of all zero entries. Clearly A_red cannot be invertible, and since A = A1 A2 ... Ak A_red, if A had an inverse then so would A_red.

In order to compute the inverse of an n x n matrix A using row reduction (or see that it is non-invertible), use the following procedure:

Step 1: Set up the double matrix [ A | I_n ], where I_n is the identity matrix.

Step 2: Perform row operations to put A in reduced row-echelon form. (Carry the computations through on the entire matrix, but only pay attention to the A-side.)

Step 3: If A can be reduced to the n x n identity matrix, then A^(-1) will be what appears on the I_n-side of the double matrix. If A cannot be row-reduced to the n x n identity matrix, then A is not invertible.

Among other things, this algorithm shows that the inverse of a matrix is unique.
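Here is a minimal sketch of this procedure in Python. The function name invert_by_row_reduction is my own choice, and a real program would be more careful about floating-point error; the matrix used at the end is the one from the worked example below.

```python
def invert_by_row_reduction(A, tol=1e-12):
    """Invert a square matrix (list of rows) via [A | I] -> [I | A^(-1)], or return None."""
    n = len(A)
    # Step 1: build the double matrix [A | I_n].
    M = [list(map(float, A[i])) + [1.0 if j == i else 0.0 for j in range(n)] for i in range(n)]
    for col in range(n):
        # find a usable pivot at or below the diagonal and swap it up
        pivot = next((r for r in range(col, n) if abs(M[r][col]) > tol), None)
        if pivot is None:
            return None               # A cannot be reduced to the identity: not invertible
        M[col], M[pivot] = M[pivot], M[col]
        scale = M[col][col]
        M[col] = [x / scale for x in M[col]]
        for r in range(n):
            if r != col and abs(M[r][col]) > tol:
                factor = M[r][col]
                M[r] = [x - factor * y for x, y in zip(M[r], M[col])]
    # Step 3: the right-hand n columns now hold A^(-1).
    return [row[n:] for row in M]

A = [[1, 0, -1], [2, 1, 1], [0, 2, 5]]
print(invert_by_row_reduction(A))
# [[-3.0, 2.0, -1.0], [10.0, -5.0, 3.0], [-4.0, 2.0, -1.0]]
```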

Example: Find the inverse of the matrix A = [ 1 0 -1 ; 2 1 1 ; 0 2 5 ].

We set up the starting matrix

[ 1  0 -1 | 1 0 0 ]
[ 2  1  1 | 0 1 0 ]
[ 0  2  5 | 0 0 1 ]

Now we perform row-reductions. R2 -> R2 - 2R1 and R3 -> R3 - 2R2 give

[ 1  0 -1 |  1  0 0 ]
[ 0  1  3 | -2  1 0 ]
[ 0  0 -1 |  4 -2 1 ]

and then R3 -> -R3, R2 -> R2 - 3R3, and R1 -> R1 + R3 give

[ 1  0  0 | -3  2 -1 ]
[ 0  1  0 | 10 -5  3 ]
[ 0  0  1 | -4  2 -1 ]

We've reduced A to the identity matrix, and so A is invertible, with

A^(-1) = [ -3 2 -1 ; 10 -5 3 ; -4 2 -1 ].

We can work out the calculations explicitly in small cases. For example, the matrix A = [ a b ; c d ] is invertible if and only if ad - bc is nonzero, and if so, the inverse is given by

A^(-1) = 1/(ad - bc) [ d -b ; -c a ].

We might like to know, without performing all of the row-reductions, whether A is invertible. This motivates the idea of the determinant, which will tell us precisely when a matrix is invertible.
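The 2 x 2 formula is easy to test directly; this is a small sketch, and the function name inverse_2x2 is mine, not the handout's.

```python
def inverse_2x2(a, b, c, d):
    """Inverse of [[a, b], [c, d]] via the ad - bc formula, or None if singular."""
    det = a * d - b * c
    if det == 0:
        return None
    return [[d / det, -b / det], [-c / det, a / det]]

print(inverse_2x2(1, 2, 3, 4))   # [[-2.0, 1.0], [1.5, -0.5]], since ad - bc = -2
```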

Definition: The determinant of a square matrix A, denoted det(A) or |A|, is defined inductively. For a 1 x 1 matrix [a], it is just the constant a. For an n x n matrix, we compute the determinant via cofactor expansion: define A_(i,j) to be the matrix obtained from A by deleting the i-th column and j-th row. Then we set

det(A) = sum over k = 1 to n of (-1)^(k+1) a_(k,1) det(A_(k,1)).

Note: The calculation of the determinant this way is called expansion by minors. It can be shown that the same value results from expanding along any row or column. The best way to understand determinants is to work out some examples.
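The inductive definition translates directly into a short recursive function. This is only a sketch (expanding along the first row); it is fine for small matrices but far too slow for large ones.

```python
def det(M):
    """Determinant by cofactor expansion along the first row (M is a list of rows)."""
    n = len(M)
    if n == 1:
        return M[0][0]
    total = 0
    for j in range(n):
        minor = [row[:j] + row[j+1:] for row in M[1:]]   # delete the first row and column j+1
        total += (-1) ** j * M[0][j] * det(minor)
    return total

print(det([[1, 2], [3, 4]]))                      # -2
print(det([[1, 2, 4], [-1, 1, 0], [-2, 1, 3]]))   # 13, as in the example below
```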

Example: The determinant | a b ; c d | is given by ad - bc.

So, as particular cases, | 1 2 ; 3 4 | = (1)(4) - (2)(3) = -2, and | 1 2 ; 1 2 | = (1)(2) - (1)(2) = 0.

Example: The general determinant

| a1 a2 a3 |
| b1 b2 b3 |
| c1 c2 c3 |

is given by a1 | b2 b3 ; c2 c3 | - a2 | b1 b3 ; c1 c3 | + a3 | b1 b2 ; c1 c2 |.

As a particular case,

|  1  2  4 |
| -1  1  0 |  =  1(3) - 2(-3) + 4(1)  =  13.
| -2  1  3 |

Here are some very useful properties of the determinant:

Interchanging two rows multiplies the determinant by -1.

Multiplying all entries in one row by a constant scales the determinant by the same constant.

Adding or subtracting a scalar multiple of one row to another leaves the determinant unchanged.

If a matrix has a row of all zeroes, its determinant is zero. More generally, if one row is a scalar multiple of another, then its determinant is zero.

The determinant is multiplicative: det(AB) = det(A) det(B).

The determinant of the transpose matrix is the same as the original determinant: det(A^T) = det(A).

The determinant of any upper-triangular matrix (a matrix whose entries below the diagonal are all zeroes) is equal to the product of the diagonal entries. In particular, the determinant of the identity matrix is 1.

Example: | 6 1 3 ; 0 2 0 ; 0 0 3 | = 36.

A matrix A is invertible precisely when det(A) is nonzero. We can see this by putting a matrix in reduced row-echelon form, and applying the properties above.
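These properties are easy to spot-check numerically (illustration only; numpy computes determinants by LU factorization rather than by cofactors, so expect tiny floating-point error).

```python
import numpy as np

A = np.array([[1.0, 0, -1], [2, 1, 1], [0, 2, 5]])
B = np.array([[2.0, 3, 1], [0, 1, 4], [1, 0, 2]])

print(np.linalg.det(A))                        # -1.0 (up to round-off)
print(np.linalg.det(A @ B))                    # equals det(A) * det(B)
print(np.linalg.det(A) * np.linalg.det(B))
print(np.linalg.det(A.T) - np.linalg.det(A))   # essentially 0
```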

If A is invertible, then det(A^(-1)) = 1 / det(A).

With some more effort, we can even write down a formula for the inverse:

Theorem: The matrix A is invertible precisely when det(A) is nonzero, and in that case,

A^(-1) = 1/det(A) [adj(A)]^T,

where adj(A) is the matrix whose (i, j)-entry is given by (-1)^(i+j) det(A_(i,j)).

The name adj(A) is short for adjugate. Remember that [adj(A)]^T is the transpose of adj(A), and A_(i,j) is the matrix obtained from A by deleting the row and column containing the (i, j)-entry of A.

Example: Use the formula to compute the inverse of the matrix A = [ 1 0 -1 ; 2 1 1 ; 0 2 5 ].

First, we have det(A) = 1 | 1 1 ; 2 5 | - 0 | 2 1 ; 0 5 | + (-1) | 2 1 ; 0 2 | = 3 - 0 - 4 = -1.

Now we compute all of the entries of the adjugate matrix:

The (1,1)-entry is +| 1 1 ; 2 5 | = 3, the (2,1)-entry is -| 2 1 ; 0 5 | = -10, the (3,1)-entry is +| 2 1 ; 0 2 | = 4.

The (1,2)-entry is -| 0 -1 ; 2 5 | = -2, the (2,2)-entry is +| 1 -1 ; 0 5 | = 5, the (3,2)-entry is -| 1 0 ; 0 2 | = -2.

The (1,3)-entry is +| 0 -1 ; 1 1 | = 1, the (2,3)-entry is -| 1 -1 ; 2 1 | = -3, the (3,3)-entry is +| 1 0 ; 2 1 | = 1.

Thus, we have adj(A) = [ 3 -10 4 ; -2 5 -2 ; 1 -3 1 ].

Since det(A) = -1, we thus obtain A^(-1) = 1/det(A) [adj(A)]^T = [ -3 2 -1 ; 10 -5 3 ; -4 2 -1 ].

Note that this is the same answer we obtained by doing row-reductions.
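The adjugate formula also translates into a short program, reusing a cofactor-expansion determinant. This is only a sketch with my own function names, using exact Fractions so no rounding occurs.

```python
from fractions import Fraction

def det(M):
    """Determinant by cofactor expansion along the first row."""
    if len(M) == 1:
        return M[0][0]
    return sum((-1) ** j * M[0][j] * det([r[:j] + r[j+1:] for r in M[1:]])
               for j in range(len(M)))

def inverse_by_adjugate(M):
    """A^(-1) = [adj(A)]^T / det(A), where the adjugate holds the signed cofactors."""
    n, d = len(M), det(M)
    if d == 0:
        return None
    cof = [[(-1) ** (i + j) * det([r[:j] + r[j+1:] for k, r in enumerate(M) if k != i])
            for j in range(n)] for i in range(n)]
    # transpose the cofactor matrix and divide by the determinant
    return [[Fraction(cof[j][i], d) for j in range(n)] for i in range(n)]

A = [[1, 0, -1], [2, 1, 1], [0, 2, 5]]
print(inverse_by_adjugate(A))
# entries -3, 2, -1 / 10, -5, 3 / -4, 2, -1 (as exact Fractions), matching the example above
```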

1.4 Matrices and Systems of Linear Equations, revisited


As an application of the utility of the matrix approach, let us revisit systems of linear equations. First suppose we have a homogeneous system of k equations in n variables, of the form

a_(1,1) x1 + ... + a_(n,1) xn = 0
...
a_(1,k) x1 + ... + a_(n,k) xn = 0.

Let A be the matrix of coefficients, x = (x1, ..., xn) the column vector of variables, and 0 = (0, ..., 0) the zero column vector. Then in matrix form, the system takes the much simpler form A x = 0.

Clearly, x = 0 (i.e., all variables equal to zero) is a solution.

If we have any solution v (so that A v = 0), then for any constant c, the vector c v is also a solution, because A (c v) = c (A v) = c (0) = 0. In other words, any scalar multiple of a solution to a homogeneous system is also a solution to the system.

Explicit example: Given the system x1 + x2 + x3 = 0, with the solution x1 = 1, x2 = 1, x3 = -2, then we claim that x1 = c, x2 = c, x3 = -2c is also a solution. (Which it is.)

If we have any two solutions v and w (so that A v = A w = 0), then v + w is also a solution, because A (v + w) = A v + A w = 0 + 0 = 0. In other words, the sum of any two solutions to a homogeneous system is also a solution to the system.

Explicit example: Given the system x1 + x2 + x3 = 0, with the solutions (x1, x2, x3) = (1, 1, -2) and (x1, x2, x3) = (-3, -1, 4), then we claim that (x1, x2, x3) = (1 - 3, 1 - 1, -2 + 4) = (-2, 0, 2) is also a solution. (Which it is.)

This means that the set of solutions to the homogeneous linear system forms a vector space (which we will discuss in much more detail later).

Now suppose we have a general system of n linear equations in n variables, written in matrix form A x = c for a square matrix of coefficients A (with entries a_(1,1) through a_(n,n)), a column vector x = (x1, ..., xn) of variables, and a column vector c = (c1, ..., cn) of constants.

We claim that every solution to this system (if there is one) is of the form x = v_particular + v_homogeneous, where v_particular is any one solution to the general system A x = c, and v_homogeneous is a solution to the homogeneous system A x = 0.

To see this, first observe that if A v_particular = c and A v_homogeneous = 0, then A (v_particular + v_homogeneous) = A v_particular + A v_homogeneous = c + 0 = c, and so v_particular + v_homogeneous is also a solution to the original system.

Conversely, if v and w are two solutions to the original system, then A (v - w) = A v - A w = c - c = 0, so that v - w is a solution to the homogeneous system.

The upshot of this result is: if we can find one solution to the original system, then we can find all of them just by solving the homogeneous system.

If A is invertible, then we can multiply both sides of the equation on the left by A^(-1) to see that x = A^(-1) c. In particular, the system has a unique solution. If A is not invertible, then the homogeneous system has infinitely many solutions (as the reduced row-echelon form of A must have a row of all zeroes, and hence at least one free variable). Then the original system either has no solutions, or infinitely many solutions.
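In code, the invertible case is exactly what numpy.linalg.solve computes; the matrix below is the one from the earlier inverse example, and the right-hand side is chosen only for illustration.

```python
import numpy as np

A = np.array([[1.0, 0, -1], [2, 1, 1], [0, 2, 5]])   # invertible: det(A) = -1
c = np.array([1.0, 2, 3])

x = np.linalg.solve(A, c)        # the unique solution of A x = c
print(x)                         # [-2.  9. -3.]
print(np.allclose(A @ x, c))     # True

# For a singular coefficient matrix, numpy.linalg.solve raises LinAlgError instead:
# the corresponding system has either no solutions or infinitely many.
```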

Well, you're at the end of my handout. Hope it was helpful. Copyright notice: This material is copyright Evan Dummit, 2012. You may not reproduce or distribute this material without my express permission.

Math 320 (part 3): Vector Spaces
(by Evan Dummit, 2012, v. 1.00)

Contents
1 Vector Spaces
    1.1 Review of Vectors in Rn
    1.2 Formal Definition of Vector Spaces
    1.3 Subspaces
    1.4 Span, Independence, Bases, Dimension
        1.4.1 Linear Combinations and Span
        1.4.2 Linear Independence
        1.4.3 Bases and Dimension

Vector Spaces

1.1 Review of Vectors in Rn


A vector, as we typically think of it, is a quantity which has both a magnitude and a direction. This is in contrast to a scalar, which carries only a magnitude.

Real-valued vectors are extremely useful in just about every aspect of the physical sciences, since just about everything in Newtonian physics is a vector  position, velocity, acceleration, forces, etc. There is also vector calculus  namely, calculus in the context of vector elds  which is typically part of a multivariable calculus course; it has many applications to physics as well.

We often think of vectors geometrically, as a directed line segment (having a starting point and an endpoint). We denote the n-dimensional vector from the origin to the point (a1, a2, ..., an) as v = <a1, a2, ..., an>, where the ai are scalars.

Some vectors: <1, 2>, <3, 5, 1>, <4/3, e^2, 27, 3, pi, 0, 0, 1>.

Notation: I prefer to use angle brackets < > rather than parentheses ( ) so as to draw a visual distinction between a vector, and the coordinates of a point in space. I also draw arrows above vectors or typeset them in boldface, in order to set them apart from scalars. This is not standard notation everywhere; many other authors use regular parentheses for vectors.

Note/Warning: Vectors are a little bit different from directed line segments, because we don't care where a vector starts: we only care about the difference between the starting and ending positions. Thus: the directed segment whose start is (0, 0) and end is (1, 1), and the segment starting at (1, 1) and ending at (2, 2), represent the same vector, <1, 1>.

We can add vectors (provided they are of the same length!) in the obvious way, one component at a time: if v = <a1, ..., an> and w = <b1, ..., bn>, then v + w = <a1 + b1, ..., an + bn>.

We can justify this using our geometric idea of what a vector does: v moves us from the origin to the point (a1, ..., an). Then w tells us to add b1, ..., bn to the coordinates of our current position, and so w moves us from (a1, ..., an) to (a1 + b1, ..., an + bn). So the net result is that the sum vector v + w moves us from the origin to (a1 + b1, ..., an + bn), meaning that it is just the vector <a1 + b1, ..., an + bn>.

Another way (though it's really the same way) to think of vector addition is via the parallelogram diagram, whose pairs of parallel sides are v and w, and whose long diagonal is v + w.

We can also 'scale' a vector by a scalar, one component at a time: if r is a scalar, then we have r v = <r a1, ..., r an>.

Again, we can justify this by our geometric idea of what a vector does: if v moves us some amount in a direction, then (1/2) v should move us half as far in that direction. Analogously, 2v should move us twice as far in that direction, and -v should move us exactly as far, but in the opposite direction.

Example: If v = <-1, 2, -2> and w = <3, 0, 4>, then 2w = <6, 0, 8>, and v + w = <2, 2, 2>. Furthermore, v - 2w = <-7, 2, -10>.

The arithmetic of vectors in Rn satisfies several algebraic properties (which follow more or less directly from the definition):

Addition of vectors is commutative and associative. There is a zero vector (namely, the vector with all entries zero), and every vector has an additive inverse. Scalar multiplication distributes over addition of both vectors and scalars.

1.2 Formal Denition of Vector Spaces

The two operations of addition and scalar multiplication (and the various algebraic properties they satisfy) are the key properties of vectors in the same properties as vectors in

Rn .
n
.

We would like to investigate other collections of things which possess

Denition: A (real) vector space is a collection

of vectors together with two binary operations, addition of

vectors (+) and scalar multiplication of a vector by a real number (), satisfying the following axioms:

Let

v , v1 , v2 , v3

be any vectors and

, 1 ,2

be any (real number) scalars.

Note: The statement that

and

are binary operations means that

v1 + v2

and

are always dened.

[A1] Addition is commutative: [A2] Addition is associative:

v1 + v2 = v2 + v1 . 0,
with

(v1 + v2 ) + v3 = v1 + (v2 + v3 ). v + 0 = v. v ,
with

[A3] There exists a zero vector [A4] Every vector

has an additive inverse

v + (v) = 0. 1 (2 v) = (1 2 ) v .

[M1] Scalar multiplication is consistent with regular multiplication: [M2] Addition of scalars distributes: [M3] Addition of vectors distributes:

(1 + 2 ) v = 1 v + 2 v . (v1 + v2 ) = v1 + v2 . 1 v1 = v1 .

[M4] The scalar 1 acts like the identity on vectors:

Important Remark: One may also consider vector spaces where the collection of scalars is something other than the real numbers  for example, there exists an equally important notion of a complex vector space, whose scalars are the complex numbers. (The axioms are the same.)

We will principally consider real vector spaces, in which the scalars are the real numbers. The most general notion of a vector space involves scalars from a eld, which is a collection of numbers which possess addition and multiplication operations which are commutative, associative, and distributive, with an additive identity 0 and multiplicative identity 1, such that every element has an additive inverse and every nonzero element has a multiplicative inverse.

Aside from the real and complex numbers, another example of a eld is the rational numbers (i.e., fractions). One can formulate an equally interesting theory of vector spaces over the rational numbers.

Examples: Here are some examples of vector spaces:

The vectors in

Rn

are a vector space, for any

n > 0.

(This had better be true!)

In particular, if we take

n = 1,

then we see that the real numbers themselves are a vector space.

Note: For simplicity I will demonstrate all of the axioms for vectors in

R2 ; there, the vectors are of the form x, y and scalar multiplication is dened as x, y = x, y . [A1]: We have x1 , y1 + x2 , y2 = x1 + x2 , y1 + y2 = x2 , y2 + x1 , y1 . [A2]: We have ( x1 , y1 + x2 , y2 )+ x3 , y3 = x1 + x2 + x3 , y1 + y2 + y3 = x1 , y1 +( x2 , y2 + x3 , y3 ). [A3]: The zero vector is 0, 0 , and clearly x, y + 0, 0 = x, y . [A4]: The additive inverse of x, y is x, y , since x, y + x, y = 0, 0 . [M1]: We have 1 (2 x, y ) = 1 2 x, 1 2 y = (1 2 ) x, y . [M2]: We have (1 + 2 ) x, y = (1 + 2 )x, (1 + 2 )y = 1 x, y + 2 x, y . [M3]: We have ( x1 , y1 + x2 , y2 ) = (x1 + x2 ), (y1 + y2 ) = x1 , y1 + x2 , y2 . [M4]: Finally, we have 1 x, y = x, y . 0,
with

The zero space with a single element

0+0=0

and

0=0

for every

is a vector space.

All of the axioms in this case eventually boil down to

0 = 0.

This space is rather boring: since it only contains one element, there's really not much to say about it.

The set of

mn

matrices for any

and any

n,

forms a vector space.

The various algebraic properties we know about matrix addition give [A1] and [A2] along with [M1], [M2], [M3], and [M4]. The zero vector in this vector space is the zero matrix (with all entries zero), and [A3] and [A4] follow easily. Note of course that in some cases we can also multiply matrices by other matrices. However, the requirements for being a vector space don't care that we can multiply matrices by other matrices! (All we need to be able to do is add them and multiply them by scalars.)

The complex numbers

a + bi,

where

i2 = 1,

are a vector space.

The axioms all follow from the standard properties of complex numbers. As might be expected, the zero vector is just the complex number

0 = 0 + 0i.

Again, note that the complex numbers have more structure to them, because we can also multiply two complex numbers, and the multiplication is also commutative, associative, and distributive over addition. However, the requirements for being a vector space don't care that the complex numbers have these additional properties.

The collection of all real-valued functions on any part of the real line is a vector space, where we dene the sum of two functions as

(f + g)(x) = f (x) + g(x)


then

for every

x,

and scalar multiplication as

( f )(x) = f (x).
To illustrate: if is the function

f (x) = x and g(x) = x2 , with (2f )(x) = 2x.

f +g

is the function with

(f + g)(x) = x + x2 ,

and

2f

The axioms follow from the properties of functions and real numbers. The zero vector in this space is the zero function; namely, the function

which has

z(x) = 0

for every

For example (just to demonstrate a few of the axioms), for any value and

x. x in [a, b]

and any functions

g,

we have

(f + g)(x) = f (x) + g(x) = g(x) + f (x) = (g + f )(x). [M2]: (f + g)(x) = f (x) + g(x) = (f )(x) + (g)(x). [M4]: (1 f )(x) = f (x).
[A1]:

There are many simple algebraic properties that can be derived from the axioms (and thus, are true in every vector space), using some amount of cleverness. For example: 1. Addition has a cancellation law: for any vector

v,

if

v+a=v+b

then

a = b.
to

Idea: Add

to both sides and then use [A1]-[A4] to rearrange

(v + a) + (v) = (v + b) + (v)

a = b.
2. The zero vector is unique: for any vector

v,

if

v + a = v,

then

a = 0.

Idea: Use property (1) applied when

b = 0. v,
if

3. The additive inverse is unique: for any vector Idea: Use property (1) applied when

v+a=0 0v =0 0=0

then

a = v . v. . v.
and then use

b = v .
for any vector via [M2] and then apply property (2). for any scalar

4. The scalar

times any vector gives the zero vector:

Idea: Expand

v = (1 + 0) v = v + 0 v

5. Any scalar times the zero vector is the zero vector: Idea: Expand

0 = (0 + 0) = 0 + 0

via [M1] and then apply property (1).

6. The scalar

times any vector gives the additive inverse:

(1) v = v

for any vector

Idea: Use property (3) and [M2]-[M4] to write property (1) with

0 = 0 v = (1 + (1)) v = v + (1)v , (v) = v


2

a = v .
for any vector

7. The additive inverse of the additive inverse is the original vector:

v.

Idea: Use property (5) and [M1], [M4] to write

(v) = (1) v = 1 v = v .

1.3 Subspaces

Denition: A subspace

of a vector space

is a subset of the vector space

which, under the same addition

and scalar multiplication operations as

V,

is itself a vector space.

Very often, if we want to check that something is a vector space, it is often much easier to verify that it is a subspace of something else we already know is a vector space.

We will make use of this idea when we talk about the solutions to a homogeneous linear dierential equation (see the examples below), and prove that the solutions form a vector space merely by checking that they are a subspace of the set of all functions, rather than going through all of the axioms.

We are aided by the following criterion, which tells us exactly what properties a subspace must satisfy:

Theorem (Subspace Criterion): To check that properties:

is a subspace of

V,

it is enough to check the following three

[S1] [S2] [S3]

W W W

contains the zero vector of

V. w1
and

is closed under addition: For any

w2

in

W,

the vector

w1 + w2 W,

is also in

W.
is also in

is closed under scalar multiplication: For any scalar

and

in

the vector

W.

The reason we don't need to check everything to verify that a collection of vectors forms a subspace is that most of the axioms will automatically be satised in

because they're true in

V. W
because they

As long as all of the operations are dened, axioms [A1]-[A2] and [M1]-[M4] will hold in hold in

V.

But we need to make sure we can always add and scalar-multiply, which is why we need [S2]

and [S3].

In order to get axiom [A3] for [S1]. In order to get axiom [A4] for

W, W

we need to know that the zero vector is in

W,

which is why we need

we can use the result that

(1) w = w,

to see that the closure under

scalar multiplication automatically gives additive inverses. Remark: Any vector space automatically has two easy subspaces: the entire space consisting only of the zero vector. Examples: Here is a rather long list of examples of less trivial subspaces (of vector spaces which are of interest to us):

V , and the trivial

subspace

The vectors of the form

t, t, t

are a subspace of

R3 .

[This is the line

x = y = z .]

[S1]: The zero vector is of this form: take

t = 0.

[S2]: We have we take [S3]:

t1 , t1 , t1 + t2 , t2 , t2 = t1 + t2 , t1 + t2 , t1 + t2 , which is again t = t1 + t2 . We have t1 , t1 , t1 = t1 , t1 , t1 , which is again of the same form if s, t, 0


are a subspace of

of the same form if

we take

t = t1 .

The vectors of the form

. [This is the

xy -plane,

aka the plane

z = 0.]

s = t = 0. s1 , t1 , 0 + s2 , t2 , 0 = s1 + s2 , t1 + t2 , 0 , which is again of the same form, if we take s = s1 + s2 and t = t1 + t2 . [S3]: We have s1 , t1 , 0 = s1 , t1 , 0 , which is again of the same form, if we take s = s1 and t = t1 .
[S1]: The zero vector is of this form: take [S2]: We have

The vectors

x, y, z

with

2x y + z = 0

are a subspace of

R3 .

2(0) 0 + 0 = 0. x2 , y2 , z2 have 2x1 y1 + z1 = 0 and 2x2 y2 + z2 = 0 then adding the equations shows that the sum x1 + x2 , y1 + y2 , z1 + z2 also lies in the space. [S3]: If x1 , y1 , z1 has 2x1 y1 + z1 = 0 then scaling the equation by shows that x1 , x2 , x3
[S1]: The zero vector is of this form, since [S2]: If

x1 , y1 , z1

and

also lies in the space.

More generally, the collection of solution vectors of

homogeneous equations, form a subspace of

x1 , , xn Rn .

to any homogeneous equation, or system

It is possible to check this directly by working with equations. But it is much easier to use matrices: write the system in matrix form, as [S1]: We have

Ax = 0, where x = x1 , , xn is a solution vector. A0 = 0, by the properties of the zero vector. [S2]: If x and y are two solutions, the properties of matrix arithmetic imply A(x + y) = Ax + Ay = 0 + 0 = 0 so that x + y is also a solution. [S3]: If is a scalar and x is a solution, then A( x) = (Ax) = 0 = 0, so that x is also a
solution.

The collection of

22

matrices of the form

a 0

b a

is a subspace of the space of all

22

matrices.

[S1]: The zero matrix is of this form, with [S2]: We have

a1 0

b1 a1 a1 0

+ b1 a1

a2 0 =

[S3]: We have

b2 a2 a1 0

a = b = 0. a1 + a2 b1 + b2 = 0 a1 + a2 b1 , which is also of a1 a + 2ai

, which is also of this form.

this form.

The collection of complex numbers of the form

is a subspace of the complex numbers.

The three requirements should be second nature by now!

The collection of continuous functions on [S1]: The zero function is continuous.

[a, b]

is a subspace of the space of all functions on

[a, b].

[S2]: The sum of two continuous functions is continuous, from basic calculus. [S3]: The product of continuous functions is continuous, so in particular a constant times a continuous function is continuous.

The collection of on

[a, b],

for any positive integer

n-times dierentiable functions on [a, b] is a subspace of the space of continuous functions n. n


times.

The zero function is dierentiable, as are the sum and product of any two functions which are dierentiable

The collection of all polynomials is a vector space.

Observe that polynomials are functions on the entire real line. Therefore, it is sucient to verify the subspace criteria. The zero function is a polynomial, as is the sum of two polynomials, and any scalar multiple of a polynomial.

The collection of solutions to the (homogeneous, linear) dierential equation vector space.

y + 6y + 5y = 0

form a

We show this by verifying that the solutions form a subspace of the space of all functions. [S1]: The zero function is a solution. [S2]: If

y1

and

y2

are solutions, then

properties of derivatives shows that solution.

y1 + 6y1 + 5y1 = 0 and y2 + 6y2 + 5y2 = 0, so adding and using (y1 + y2 ) + 6(y1 + y2 ) + 5(y1 + y2 ) = 0, so y1 + y2 is also a
using properties

[S3]: If

is a scalar and

of derivatives shows that

y1 is a solution, then scaling y1 + 6y1 + 5y1 = 0 by and (y1 ) + 6(y1 ) + 5(y1 ) = 0, so y1 is also a solution.

Note: Observe that we can say something about what the set of solutions to this equation looks like, namely that it is a vector space, without actually solving it!

For completeness, the solutions are

y = Aex + Be5x

for any constants

and

B.

From here,

if we wanted to, we could directly verify that such functions form a vector space. The collection of solutions to any

nth-order homogeneous linear dierential equation y (n) +Pn (x)y (n1) + + P2 (x) y + P1 (x) y = 0 for continuous functions P1 (x), , Pn (x), form a vector space.
Note that

y (n)

means the

nth

derivative of

y.

As in the previous example, we show this by verifying that the solutions form a subspace of the space of all functions. [S1]: The zero function is a solution.

(n) (n1) y1 and y2 are solutions, then by adding the equations y1 +Pn (x)y1 + +P1 (x)y1 = 0 (n) (n1) and y2 + Pn (x) y2 + + P1 (x) y2 = 0 and using properties of derivatives shows that (y1 + y2 )(n) + Pn (x) (y1 + y2 )(n1) + + P1 (x) (y1 + y2 ) = 0, so y1 + y2 is also a solution. (n) (n1) [S3]: If is a scalar and y1 is a solution, then scaling y1 +Pn (x)y1 + +P2 (x)y1 +P1 (x)y1 = 0 (n) by and using properties of derivatives shows that (y1 ) +Pn (x)(y1 )(n1) + +P1 (x)(y1 ) = 0,
[S2]: If so

y1

is also a solution.

Note: This example is a fairly signicant amount of the reason we are interested in linear algebra (as it relates to dierential equations): because the solutions to homogeneous linear dierential equations form a vector space. In general, for arbitrary functions to solve the dierential equation explicitly for the solutions look like.

P1 (x), , Pn (x),

it is not possible

y;

nonetheless, we can still say something about what

1.4 Span, Independence, Bases, Dimension

One thing we would like to know, now that we have the denition of a vector space and a subspace, is what else we can say about elements of a vector space  i.e., we would like to know what kind of structure the elements of a vector space have.

In some of the earlier examples we saw that, in some terminology.

Rn

and a few other vector spaces, subspaces could all be

written down in terms of one or more parameters. In order to discuss this idea more precisely, we rst need

1.4.1

Linear Combinations and Span

Denition: Given a set exist scalars

v1 , , vn
such that

of vectors, we say a vector

is a linear combination of

v1 , , vn

if there

a1 , , an R2 ,
.

w = a1 v1 + + an vn . 1, 1
is a linear combination of

Example: In

the vector

1, 0

and

0, 1

, because

1, 1 = 1 1, 0 +
, and

1 0, 1
because

Example: In

R4 , the vector 4, 0, 5, 9 is a linear combination of 1, 0, 0, 1 4, 0, 5, 9 = 1 1, 1, 2, 3 2 0, 1, 0, 0 + 3 1, 1, 1, 2 .

0, 1, 0, 0

1, 1, 1, 2

Non-Example: In R3, the vector <0, 0, 1> is not a linear combination of <1, 1, 0> and <0, 1, 1>, because there exist no scalars a1 and a2 for which a1 <1, 1, 0> + a2 <0, 1, 1> = <0, 0, 1>: this would require a common solution to the three equations a1 = 0, a1 + a2 = 0, and a2 = 1, and this system has no solution.
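Deciding whether a vector is a linear combination of given vectors is itself a linear system whose unknowns are the coefficients a1, ..., an. A small numpy illustration, using the two vectors from the non-example above plus one extra vector (my own choice) that does lie in their span:

```python
import numpy as np

# Columns are v1 = <1,1,0> and v2 = <0,1,1>; we ask whether each w is in their span.
V = np.array([[1.0, 0.0],
              [1.0, 1.0],
              [0.0, 1.0]])

for w in (np.array([1.0, 3.0, 2.0]), np.array([0.0, 0.0, 1.0])):
    coeffs, residual, rank, _ = np.linalg.lstsq(V, w, rcond=None)
    print(w, "is a linear combination of the columns:", np.allclose(V @ coeffs, w))
# <1,3,2> = 1*v1 + 2*v2 is in the span; <0,0,1> is not, matching the non-example above.
```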

Denition: We dene the span of vectors which are linear combinations of

an vn ,

for some scalars

v1 , , vn , denoted span(v1 , , vn ), to be the set W of all vectors v1 , , vn . Explicitly, the span is the set of vectors of the form a1 v1 + + a1 , , an . 0 v1 + + 0 vn ,
and

Remark 1: The span is always subspace: since the zero vector can be written as the span is closed under addition and scalar multiplication. Remark 2: The span is, in fact, the smallest subspace any scalars

a1 , , an ,

closure under scalar multiplication requires each of

W.

Then closure under vector addition forces the sum

v1 , , vn : because for a1 v1 , a2 v2 , , an vn to be in a1 v1 + + an vn to be in W .
containing the vectors

Remark 3: For technical reasons, we dene the span of the empty set to be the zero vector. Example: The span of the vectors

1, 0, 0

and

0, 1, 0

in

R3

is the set of vectors of the form

a 1, 0, 0 +

b 0, 1, 0 = a, b, 0 . Equivalently, the plane z = 0.


generating set for

span of these vectors is the set of vectors whose

z -coordinate V,

is zero  i.e., the

Denition: Given a vector space

V,

if the span of vectors

v1 , , vn

is all of

we say that

v1 , , vn

are a

V,

or that they generate

V.
generate

Example: The three vectors can write

1, 0, 0 , 0, 1, 0 , and 0, 0, 1 a, b, c = a 1, 0, 0 + b 0, 1, 0 + c 0, 0, 1 .

R3 ,

since for any vector

a, b, c

we

1.4.2

Linear Independence

Denition: We say a nite set of vectors

v1 , , vn

is linearly independent if

a1 v1 + + an vn = 0

implies

a1 = = an = 0.

(Otherwise, we say the collection is linearly dependent.)

Note: For an innite set of vectors, we say it is linearly independent if every nite subset is linearly independent (per the denition above); otherwise (if some nite subset displays a dependence) we say it is dependent.

In other words,

v1 , , vn

are linearly independent precisely when the only way to form the zero vector

as a linear combination of

v1 , , vn

is to have all the scalars equal to zero.

An equivalent way of thinking of linear (in)dependence is that a set is dependent if one of the vectors is a linear combination of the others  i.e., it depends on the others. Explicitly, if

a1 v1 +a2 v2 + +an vn = 1 0 and a1 = 0, then we can rearrange to see that v1 = (a2 v2 + + an vn ). a1 Example: The vectors 1, 1, 0 and 0, 2, 1 in R3 are linearly independent, because if we have scalars a and b with a 1, 1, 0 + b 0, 2, 1 = 0, 0, 0 , then comparing the two sides requires a = 0, a + 2b = 0, b = 0, which has only the solution a = b = 0. Example: The vectors 1, 1, 0 and 2, 2, 0 in R3 are linearly dependent, because we can write 2 1, 1, 0 + (1) 2, 2, 0 = 0, 0, 0 . Or, in the equivalent formulation, we have 2, 2, 0 = 2 1, 1, 0 . Example: The vectors 1, 0, 2, 2 , 2, 2, 0, 3 , 0, 3, 3, 1 , and 0, 4, 2, 1 in R4 are linearly dependent, because we can write 2 1, 0, 2, 2 + (1) 2, 2, 0, 3 + (2) 0, 3, 3, 1 + 1 0, 4, 2, 1 = 0, 0, 0, 0 .
Theorem: The vectors may be uniquely written as a sum

v1 , , vn are linearly independent if and only if every vector w in the span of v1 , , vn w = a1 v1 + a2 v2 + + an vn .


because

a1 v1 + a2 v2 + + an vn = 0 implies 0. For the other direction, suppose we had two dierent ways of decomposing a vector w , say as w = a1 v1 + a2 v2 + + an vn and w = b1 v1 + b2 v2 + + bn vn . Then subtracting and then rearranging the dierence between these two equations yields w w = (a1 b1 ) v1 + + (an bn ) vn . Now w w is the zero vector, so we have (a1 b1 ) v1 + + (an bn ) vn = 0. But now because v1 , , vn are linearly independent, we see that all of the scalar coecients a1 b1 , , an bn are zero. But this says a1 = b1 , a2 = b2 , . . . , an = bn  which is to say, the two
For one direction, if the decomposition is always unique, then

a1 = = an = 0,

0 v1 + + 0 vn = 0

is by assumption the only decomposition of

decompositions are actually the same.

1.4.3

Bases and Dimension

Denition: A linearly independent set of vectors which generate

is called a basis for

V.

Terminology Note: The plural form of the (singular) word basis is bases. Example: The three vectors

1, 0, 0 , 0, 1, 0 , and 0, 0, 1 generate R3 , as we saw above. They are also linearly independent, since a 1, 0, 0 + b 0, 1, 0 + c 0, 0, 1 is the zero vector only when a = b = c = 0. 3 Thus, these three vectors are a basis for R .
Example: More generally, in

Rn ,

the standard unit vectors

e1 , e2 , , en

(where

ej

has a 1 in the

j th
it is

coordinate and 0s elsewhere) are a basis. Non-Example: The vectors not possible to obtain the

1, 1, 0 and 0, 2, 1 in R3 are not a basis, as they fail to generate V : vector 1, 0, 0 as a linear combination of 1, 1, 0 and 0, 2, 1 . 1, 0, 0 , 0, 1, 0 , 0, 0, 1 , and 1, 1, 1 in R3 are not a basis, as 1 1, 0, 0 + 1 0, 1, 0 + 1 0, 0, 1 + (1) 1, 1, 1 = 0, 0, 0 .
are a basis for the vector space of all polynomials.

Non-Example: The vectors linearly dependent: we have Example: The polynomials

they are

1, x, x2 , x3 ,
2 3

First observe that polynomial).

1, x, x , x ,

certainly generate the set of all polynomials (by denition of a

Now we want to see that these polynomials are linearly independent.

So suppose we had scalars

x. nth derivative of both sides (which is allowable because a0 1+a1 x+ +an xn = 0 is assumed to be true for all x) then we obtain n! an = 0, from which we see that an = 0. Then repeat by taking the (n1)st derivative to see an1 = 0, and so on, until nally we are left with 2 n just a0 = 0. Hence the only way to form the zero function as a linear combination of 1, x, x , , x 2 3 is with all coecients zero, which says that 1, x, x , x , is a linearly-independent set.
such that for all values of Then if we take the

a0 , a1 , , an

a0 1 + a1 x + + an xn = 0,

Theorem: A collection of n vectors v1, ..., vn in Rn is a basis if and only if the n x n matrix B, whose columns are the vectors v1, ..., vn, is an invertible matrix.

The idea behind the theorem is to multiply out and compare coordinates, and then analyze the resulting system of equations. So suppose we are looking for scalars a1, ..., an such that a1 v1 + ... + an vn = w, for some vector w in Rn.

This vector equation is the same as the matrix equation B a = w, where B is the matrix whose columns are the vectors v1, ..., vn, a is the column vector whose entries are the scalars a1, ..., an, and w is thought of as a column vector.

Now from what we know about matrix equations, we know that B is an invertible matrix precisely when B a = w has a unique solution for every w. But having a unique way to write any vector as a linear combination of vectors in a set is precisely the statement that the set is a basis. So we are done.
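In numerical terms, this theorem says that checking whether n vectors form a basis of Rn is a determinant (or rank) computation. A small sketch; the particular vectors are chosen only for illustration.

```python
import numpy as np

def is_basis_of_Rn(vectors):
    """True if the given vectors form a basis of R^n, where n = len(vectors)."""
    B = np.column_stack(vectors)      # matrix whose columns are the vectors
    return B.shape[0] == B.shape[1] and abs(np.linalg.det(B)) > 1e-12

print(is_basis_of_Rn([np.array([1.0, 0, 0]),
                      np.array([0.0, 1, 0]),
                      np.array([0.0, 0, 1])]))   # True: the standard basis
print(is_basis_of_Rn([np.array([1.0, 1, 0]),
                      np.array([2.0, 2, 0]),
                      np.array([0.0, 0, 1])]))   # False: the first two vectors are dependent
```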

Theorem: Every vector space generating set for

has a basis. Any two bases of

contain the same number of elements. Any

contains a basis. Any linearly independent set of vectors can be extended to a basis.

Remark: If you only remember one thing about vector spaces, remember that every vector space has a basis ! Remark: That a basis always exists is really, really, really useful. It is without a doubt the most useful fact about vector spaces: vector spaces in the abstract are very hard to think about, but a vector space with a basis is something very concrete (since then we know exactly what the elements of the vector space look like).

To show the rst and last parts of the theorem, we show that we can build any set of linearly independent vectors into a basis:

Start with

being some set of linearly independent vectors. (In any vector space, the empty set is

always linearly independent.)

1. If

S S
in

spans

V,

then we are done, because then

is a linearly independent generating set  i.e., a

basis. 2. If does not span

V,

there is an element

of

which is not in the span of

S.

Then if we put

S,

the new

is still linearly independent. Then start over.

Eventually (to justify this statement in general, some fairly technical and advanced machinery may be needed), it can be proven that we will eventually land in case (1).

If

has dimension

(see below), then we will always be able to construct a basis in at most

steps; it is in the case when

has innite dimension that things get tricky and confusing, and

requires use of what is called the axiom of choice.

To show the third part of the theorem, the idea is to imagine going through the list of elements in a generating set and removing elements until it becomes linearly independent.

This idea is not so easy to formulate with an innite list, but if we have a nite generating set, then we can go through the elements of the generating set one at a time, throwing out an element if it is linearly dependent with the elements that came before it. Then, once we have gotten to the end of the generating set, the collection of elements which we have not thrown away will still be a generating set (since removing a dependent element will not change the span), but the collection will also now be linearly independent (since we threw away elements which were dependent).

To show the second part of the theorem, we will show that if

is a set of vectors with

elements and

is a basis with

elements, with

m > n,

then

is linearly dependent.

To see this, since elements of

is a basis, we can write every element

ai

in

as a linear combination of the

B,

say as

ai =
j=1

ci,j bj

for

1 i m. ai
which is the zero vector. We would like to see

Now suppose we have a linear combination of the that there is some choice of scalars

dk , B

not all zero, such that

dk ak = 0.
k=1

If we substitute in for the vectors in

B,

then we obtain a linear combination of the elements of

equalling the zero vector.

Since

is a basis, this means each coecient of

bj

in the resulting

expression must be zero. If we tabulate the resulting system, we can check that it is equivalent to the matrix equation where

C d = 0,

is the

mn

matrix of coecients with entries

ci,j ,

and

is the

n1

matrix with entries

the scalars

Now since

dk . C is

a matrix which has more rows than columns, by the assumption that

m > n,

we see

that the homogeneous system

C d=0

has a solution vector

which is not the zero vector.

n
But then we have

dk ak = 0
k=1

for scalars

dk

not all zero, so the set

is linearly dependent.

Denition: We dene the number of elements in any basis of

to be the dimension of

V.

The theorem above assures us that this quantity is always well-dened. Example: The dimension of

Rn

is

n,

since the

standard unit vectors form a basis.

This says that the term dimension is reasonable, since it is the same as our usual notion of dimension.

Example: The dimension of the vector space of of the

mn

matrices is

mn

matrices

Ei,j ,

where

Ei,j

is the matrix with a 1 in the

mn, because there is a basis consisting (i, j)-entry and 0s elsewhere. ,


because the (innite list of )

Example:

The dimension of the vector space of all polynomials is

polynomials

1, x, x2 , x3 ,

are a basis for the space.

Well, you're at the end of my handout. Hope it was helpful. Copyright notice: This material is copyright Evan Dummit, 2012. You may not reproduce or distribute this material without my express permission.

Math 320 (part 3a) : Linear Transformations supplement

(by Evan Dummit, 2012, v. 2.00)

This supplement discusses the basic ideas of linear transformations. We do not officially cover linear transformations in this class, for reasons which remain unclear to me. However, I think that knowing about linear transformations is useful, because among other things, the topic ties together vector spaces and matrices very nicely.

Contents
0.1 Linear Transformations
    0.1.1 Kernel and Image
    0.1.2 Isomorphisms of Vector Spaces
    0.1.3 The Derivative as a Linear Transformation

0.1 Linear Transformations
Now that we have a reasonably good idea of what the structure of a vector space is, the next natural question is: what do maps from one vector space to another look like?

It turns out that we don't want to ask about arbitrary functions, but about functions from one vector space to another which preserve the structure (namely, addition and scalar multiplication) of the vector space.

The analogy to the real numbers is: once we know what the real numbers look like, what can we say about arbitrary real-valued functions? The answer is, not much, unless we specify that the functions preserve the structure of the real numbers  which is abstract math-speak for saying that we want to talk about continuous functions, which turn out to behave much more nicely.

This is the idea behind the denition of a linear transformation: it is a map that preserves the structure of a vector space. Denition: If

and

are vector spaces, we say a map

linear transformation if, for any vectors

v, v1 , v2

and scalar

T from V to W , we have the two

(denoted properties

T : V W)

is a

[T1] The map respects addition of vectors:

T (v1 + v2 ) = T (v1 ) + T (v2 ) T ( v) = v . V)

[T2] The map respects scalar multiplication:

Remark: Like with the denition of a vector space, one can show a few simple algebraic properties of linear transformations  for example, that any linear transformation sends the zero vector (of (of to the zero vector

W ). V = W = R2 ,
, then the map , and

Example: If

which sends

x, y

to

x, x + y

is a linear transformation. .

Let

v = x, y

v1 = x1 , y1

v2 = x2 , y2

, so that

v1 + v2 = x1 + x2 , y1 + y2

[T1]: We have [T2]: We have

T (v1 + v2 ) = x1 + x2 , x1 + x2 + y1 + y2 = x1 , x1 + y1 + x2 , x2 + y2 = T (v1 ) + T (v2 ). T ( v) = x, x + y = x, x + y = T (v). V = W = R2 ,


then the map

More General Example: If transformation.

which sends

x, y

to

ax + by, cx + dy

is a linear

Just like in the previous example, we can work out the calculations explicitly. But another way we can think of this map is as a matrix map: column vector

sends the column vector

x y

to the

ax + by cx + dy

a c

b d

x y

So, in fact, this map

is really just (left) multiplication by the matrix

a c b d

b d

When we think of the map in this way, it is easier to see what is happening: [T1]: We have

T (v1 + v2 ) = a c b d

a c

b d

(v1 + v2 ) = a c b d

a c

b d

v1 +

a c

v2 = T (v1 ) + T (v2 ).

[T2]: Also,

T ( v) =
If

v1 =

v1

= T (v).

Really General Example: matrices) and

is any

V = Rm (thought of as m 1 matrices) and W = Rn (thought of as n 1 n m matrix, then the map T sending v to A v is a linear transformation.

The verication is exactly the same as in the previous example. [T1]: We have [T2]: Also,

T (v1 + v2 ) = A (v1 + v2 ) = A v1 + A v2 = T (v1 ) + T (v2 ).

T ( v) = A v1 = (A v1 ) = T (v). Rm
to

This last example is very general: in fact, it is so general that every linear transformation from of this form! Namely, if that

is a linear transformation from

to

, then there is some

mn

matrix

Rn is A such mn

T (v) = A v

(where we think of

as a column matrix). it is just the

The reason is actually very simple, and it is easy to write down what the matrix matrix whose rows are the vectors elements of

Rm (ej v=

is the vector

A is: T (e1 ), T (e2 ), . . . , T (em ), where e1 , , em are with a 1 in the j th position and 0s elsewhere). v
in

the standard basis

To see that this choice of

works, note that every vector

Rm

can be written as a unique linear

combination

aj ej
j=1

of the basis elements.

Then, after applying the two properties of a linear

m
transformation, we obtain

aj T (ej ). If we write down this map one coordinate at a time, we j=1 see that it agrees with the result of computing the matrix product of the matrix A with the coordinates T (v) =
of

v. T
explicitly, we see that the term in each coordinate in

Tangential Remark: If we write down the map is a linear function of the coordinates in and

 e.g., if

A=

a c

b d

then the linear functions are

ax + by

cx + dy .

This is the reason that linear transformations are named so  because they are really just

linear functions, in the traditional sense.

In fact, we can state something far more general is true: if we take any any

m-dimensional

vector space

and

n-dimensional

vector space

and choose bases for each space, then a linear transformation

from

to

behaves just like multiplication by (some)

nm

matrix

A. Rm
to

The proof in this general case is the same as for linear transformations from basis Let the

Rn :

rst choose a

v1 , v2 , , vm

for

and a basis

w1 , w2 , , wn

for

W,

and then look at how the map

behaves.

elements

V . The claim is that if we write v and T (v) as linear combinations of the basis v1 , v2 , , vm and w1 , w2 , , wn (respectively) then, suitably interpreted, the relation between coecients of v and T (v) will be multiplication by a matrix.
be any element of

By the hypothesis that the combination we obtain

vi are a basis for V , every element of V can be written uniquely as a linear v = a1 v1 + a2 v2 + + am vm . Then by using the fact that T is a linear transformation,
m

T (v) =
j=1

aj T (vj ). T (vj )
uniquely in terms of the basis elements of

Now we can also express each element

W
m

 say, as

T (vi ) = ci,1 w1 + ci,2 w2 + + ci,n wn .


then gives us an explicit expression for of the basis elements

Plugging in all of these expressions to

T (v) =
j=1

aj T (vj )

T (v) as a linear combination T (v) = b1 w1 + b2 w2 + + bn wn

w1 , , wn .

If we multiply out, we will (eventually) end up with a system of equations equivalent to the matrix

equality

C a = b,

where

c1,1 c2,1 C= . . . cm,1

c1,2 c2,2
. . .


.. .

cm,1

a1 c1,n a2 c2,n , a = . , . . . . . an cm,n

and

b1 b2 b = . . . . bm

Remark 1: This result underlines one of the reasons that matrices and vector spaces (which initially seem like they have almost nothing to do with one another) are in fact very closely related: because matrices describe the maps from one vector space to another.

Remark 2: One can also use this relationship between maps on vector spaces and matrices to provide almost trivial proofs of some of the algebraic properties of matrix multiplication which are hard to prove by direct computation.

For example: the composition of linear transformations is associative (because linear transformations are functions, and function composition is associative). Multiplication of matrices is the same as composition of functions. Hence multiplication of matrices is associative.
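As a concrete sketch of the correspondence between linear maps and matrices: given a linear map T on R^m, stacking the vectors T(e_1), ..., T(e_m) as the columns of a matrix A (the usual column convention) gives T(v) = A v for every v. The short illustration below uses the map <x, y> -> <x, x + y> from the earlier example; the helper name matrix_of is my own.

```python
import numpy as np

def matrix_of(T, m):
    """Matrix whose columns are T(e_1), ..., T(e_m), so that T(v) = A @ v."""
    basis = np.eye(m)
    return np.column_stack([T(basis[:, j]) for j in range(m)])

# The linear map T : R^2 -> R^2 sending <x, y> to <x, x + y>.
T = lambda v: np.array([v[0], v[0] + v[1]])

A = matrix_of(T, 2)
v = np.array([3.0, -1.0])
print(A)                          # [[1. 0.] [1. 1.]]
print(np.allclose(A @ v, T(v)))   # True
```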

0.1.1

Kernel and Image

Definition: If T : V → W is a linear transformation, then the kernel of T, denoted ker(T), is the set of elements v in V with T(v) = 0. The image of T, denoted im(T), is the set of elements w in W such that there exists a v in V with T(v) = w.

Intuitively, the kernel is the set of elements which are sent to zero by T, and the image is the set of elements in W which are hit by T (the range of T).

Essentially (see below), the kernel measures how far from one-to-one the map T is, and the image measures how far from onto the map T is.

One of the reasons we care about these subspaces is that (for example) the set of solutions to a homogeneous system of linear equations A x = 0 is the kernel of the linear transformation given by multiplication by A. And the set of vectors b for which there exists a solution to A x = b is the image of the linear transformation given by multiplication by A.
The kernel is a subspace of V:
[S1] We have T(0) = 0, by simple properties of linear transformations.
[S2] If v1 and v2 are in the kernel, then T(v1) = 0 and T(v2) = 0. Hence T(v1 + v2) = T(v1) + T(v2) = 0 + 0 = 0.
[S3] If v is in the kernel, then T(v) = 0. Therefore, T(α v) = α T(v) = α · 0 = 0.

The image is a subspace of W:
[S1] We have T(0) = 0, by simple properties of linear transformations.
[S2] If w1 and w2 are in the image, then there exist v1 and v2 such that T(v1) = w1 and T(v2) = w2. Then T(v1 + v2) = T(v1) + T(v2) = w1 + w2, so that w1 + w2 is also in the image.
[S3] If w is in the image, then there exists v with T(v) = w. Then T(α v) = α T(v) = α w, so α w is also in the image.

Theorem: The kernel ker(T) consists of only the zero vector if and only if the map T is one-to-one. The image im(T) consists of all of W if and only if the map T is onto.

The statement about the image is just the definition of onto.
If T is one-to-one, then (at most) one element of V maps to 0. But since the zero vector is taken to the zero vector, we see that T cannot send anything else to 0. Thus ker(T) = 0.
If ker(T) is only the zero vector, then since T is a linear transformation, the statement T(v1) = T(v2) is equivalent to the statement that T(v1) − T(v2) = T(v1 − v2) is the zero vector. But, by the definition of the kernel, T(v1 − v2) = 0 precisely when v1 − v2 is in the kernel. However, this means v1 − v2 = 0, so v1 = v2. Hence T(v1) = T(v2) implies v1 = v2, which means T is one-to-one.

Definitions: The dimension of ker(T) is called the nullity of T, and the dimension of im(T) is called the rank of T.
A linear transformation with a large nullity has a large kernel, which means it sends many elements to zero (hence the name "nullity").

Theorem (Rank-Nullity): For any linear transformation T : V → W, dim(ker(T)) + dim(im(T)) = dim(V).

The idea behind this theorem is that if we have a basis w1, ..., wk for im(T), then there exist v1, ..., vk with T(v1) = w1, ..., T(vk) = wk. Then if a1, ..., al is a basis for ker(T), the goal is to show that the set of vectors {v1, ..., vk, a1, ..., al} is a basis for V.

To do this, given any v, write T(v) = β1 w1 + ... + βk wk = β1 T(v1) + ... + βk T(vk) = T(β1 v1 + ... + βk vk), where the βj are unique. Then subtraction shows that T(v − (β1 v1 + ... + βk vk)) = 0, so that v − (β1 v1 + ... + βk vk) is in ker(T), hence can be written as a sum γ1 a1 + ... + γl al, where the γi are unique.

Putting all this together shows v = (β1 v1 + ... + βk vk) + (γ1 a1 + ... + γl al) for unique scalars βj and γi, which says that {v1, ..., vk, a1, ..., al} is a basis for V.

Remark: Here is another way of interpreting and proving the rank-nullity theorem.

If we fix a basis for V and for W and view the linear transformation as a matrix A, then the kernel of the transformation is the solution space to the homogeneous system A x = 0, and the image is the space of vectors b such that there exists a solution to A x = b.

The value of dim(ker(T)) is the size of a basis for the solutions to the homogeneous equation, which we know is the number of nonpivotal columns in the reduced row-echelon form of A.

The value of dim(im(T)) is the size of a basis for the collection of column vectors of A, since the column vectors span the image. So the dimension of im(T) is the number of pivotal columns in the reduced row-echelon form of A.

Therefore, the sum of these two numbers is the number of columns of the matrix A (since every column is either pivotal or nonpivotal). But this is just dim(V).
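For readers who like to check such statements on a computer, here is a quick sympy illustration (the matrix is my own made-up example, not one from the text):

```python
# Verify rank + nullity = number of columns for the map x -> A*x.
from sympy import Matrix

A = Matrix([[1, 2, 3],
            [2, 4, 6],
            [1, 0, 1]])

kernel_basis = A.nullspace()    # basis for ker(T): solutions of A*x = 0
image_basis  = A.columnspace()  # basis for im(T): the column space of A

nullity, rank = len(kernel_basis), len(image_basis)
print(nullity, rank)                # 1 2
print(rank + nullity == A.cols)     # True: matches dim(V) = 3
```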

0.1.2 Isomorphisms of Vector Spaces

Definition: A linear transformation T : V → W is called an isomorphism if T is also one-to-one and onto. Equivalently, T is an isomorphism if ker(T) = 0 and im(T) = W.

We say that two vector spaces are isomorphic if there exists an isomorphism between them. Saying that two spaces are isomorphic is a very strong statement: it says that the spaces have exactly the same structure, as vector spaces.

Informally, this means that if we used the elements of V to relabel the elements of W so as to have the same names, we wouldn't be able to tell V and W apart at all, as vector spaces.

Example: The space R^4 is isomorphic to the space M_{2×2} of 2 × 2 matrices, with an isomorphism given by T(x1, x2, x3, x4) = [[x1, x2], [x3, x4]].
This map is a linear transformation; it clearly is additive and respects scalar multiplication. Also, ker(T) = 0, since the only element mapping to the zero matrix is (0, 0, 0, 0). And it is also clear that im(T) = M_{2×2}. Thus T is an isomorphism.

Isomorphisms preserve linear independence: for T an isomorphism, the vectors v1, ..., vn are linearly independent if and only if T(v1), ..., T(vn) are.

Because T is a linear transformation, we have a1 T(v1) + ... + an T(vn) = T(a1 v1 + ... + an vn).

To see that v1, ..., vn independent implies T(v1), ..., T(vn) independent: if a1 T(v1) + ... + an T(vn) = 0, then by the above we have T(a1 v1 + ... + an vn) = 0. But now since ker(T) = 0, we get a1 v1 + ... + an vn = 0, and independence of v1, ..., vn gives a1 = ... = an = 0. So T(v1), ..., T(vn) are independent.

To see that T(v1), ..., T(vn) independent implies v1, ..., vn independent: if a1 v1 + ... + an vn = 0, then a1 T(v1) + ... + an T(vn) = T(a1 v1 + ... + an vn) = T(0) = 0. But now the independence of T(v1), ..., T(vn) gives a1 = ... = an = 0, so v1, ..., vn are independent.

If T is an isomorphism, then (because T is one-to-one and onto) there exists an inverse function T^(-1) : W → V, with T^(-1)(T(v)) = v and T(T^(-1)(w)) = w for any v in V and w in W.
As we might hope, the inverse map T^(-1) is actually a linear transformation, too:
[T1] If T(v1) = w1 and T(v2) = w2, then because T(v1 + v2) = w1 + w2, we have T^(-1)(w1 + w2) = v1 + v2 = T^(-1)(w1) + T^(-1)(w2).
[T2] If T(v) = w, then because T(α v) = α w, we have T^(-1)(α w) = α v = α T^(-1)(w).

Theorem: Two (finite-dimensional) vector spaces V and W are isomorphic if they have the same dimension. In particular, any finite-dimensional vector space is isomorphic to R^n for some value of n.

To show the result, choose a basis v1, ..., vn for V and a basis w1, ..., wn for W. We claim the map T defined by T(a1 v1 + ... + an vn) = a1 w1 + ... + an wn is an isomorphism between V and W. We need to check five things: that T is unambiguously defined, that T respects addition, that T respects scalar multiplication, that T is one-to-one, and that T is onto.

[Well-defined]: We need to make sure that we have not made the definition of T ambiguous: that we have defined T on every element of V, and that we haven't tried to send one element of V to two different elements of W. However, we are safe because v1, ..., vn is a basis, which means that for every v in V we have a unique way of writing v as a linear combination of v1, ..., vn.
[Addition]: If v = a1 v1 + ... + an vn and u = b1 v1 + ... + bn vn, then T(v + u) = (a1 + b1) w1 + ... + (an + bn) wn = T(v) + T(u), by the distributive law.
[Multiplication]: For any scalar β we have T(β v) = (β a1) w1 + ... + (β an) wn = β T(v), by consistency of multiplication.
[One-to-one]: Since w1, ..., wn are linearly independent, the only way that a1 w1 + ... + an wn can be the zero vector is if a1 = a2 = ... = an = 0, which means ker(T) = 0.
[Onto]: Since w1, ..., wn span W, every element w in W can be written as w = a1 w1 + ... + an wn for some scalars a1, ..., an. Then for v = a1 v1 + ... + an vn, we have T(v) = w.

Remark 1: Isomorphisms preserve linear independence, hence they also preserve dimension. So the theorem could be strengthened a little, to say that two vector spaces are isomorphic if and only if they have the same dimension.

Remark 2: I think this result is rather unexpected, at least the first time: it certainly doesn't seem obvious, just from the eight axioms of a vector space, that all finite-dimensional vector spaces are really the same as R^n. But they are!

0.1.3 The Derivative as a Linear Transformation

Example: If V and W are any vector spaces, and T1 and T2 are any linear transformations from V to W, then T1 + T2 and α T1 are also linear transformations, for any scalar α.
These follow from the criteria. (They are somewhat confusing to follow when written down, so I won't bother.)

Example: If V is the vector space of real-valued functions and W = R, then the "evaluation at 0" map taking f to the value f(0) is a linear transformation.
[T1]: We have T(f1 + f2) = (f1 + f2)(0) = f1(0) + f2(0) = T(f1) + T(f2).
[T2]: Also, T(α f) = (α f)(0) = α f(0) = α T(f).
Note of course that being a linear transformation has nothing to do with the fact that we are evaluating at 0. We could just as well evaluate at 1, or π, and the map would still be a linear transformation.

Example: If V and W are both the vector space of real-valued functions and P(x) is any real-valued function, then the map taking f(x) to the function P(x) f(x) is a linear transformation.
[T1]: We have T(f1 + f2) = P(x)(f1 + f2)(x) = P(x) f1(x) + P(x) f2(x) = T(f1) + T(f2).
[T2]: Also, T(α f) = P(x)(α f)(x) = α P(x) f(x) = α T(f).

Example: If V is the vector space of all n-times differentiable functions and W is the vector space of all functions, then the nth derivative map, taking f(x) to its nth derivative f^(n)(x), is a linear transformation.
[T1]: The nth derivative of the sum is the sum of the nth derivatives, so we have T(f1 + f2) = (f1 + f2)^(n)(x) = f1^(n)(x) + f2^(n)(x) = T(f1) + T(f2).
[T2]: Also, T(α f) = (α f)^(n)(x) = α f^(n)(x) = α T(f).

If we combine the results from the previous four examples, we can show that if V is the vector space of all n-times differentiable functions, then the map T which sends a function y to the function y^(n) + Pn(x) y^(n-1) + ... + P2(x) y' + P1(x) y is a linear transformation, for any functions Pn(x), ..., P1(x).

In particular, the kernel of this linear transformation is the collection of all functions y such that y^(n) + Pn(x) y^(n-1) + ... + P2(x) y' + P1(x) y = 0, i.e., the set of solutions to this differential equation.

Note that since we know the kernel is a vector space (as it is a subspace of V), we see that the set of solutions to y^(n) + Pn(x) y^(n-1) + ... + P2(x) y' + P1(x) y = 0 forms a vector space. (Of course, we could just show this statement directly, by checking the subspace criteria.)

However, it is very useful to be able to think of this linear differential operator, sending y to y^(n) + Pn(x) y^(n-1) + ... + P2(x) y' + P1(x) y, as a linear transformation.
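As a small computational sketch of this viewpoint (my own illustration, with arbitrary symbolic functions P and Q standing in for the coefficient functions), one can verify the linearity of the operator for n = 2 directly:

```python
# Symbolic check that L(y) = y'' + P(x)*y' + Q(x)*y respects addition and scaling.
from sympy import symbols, Function, simplify

x, alpha = symbols('x alpha')
P, Q, f, g = (Function(name)(x) for name in ('P', 'Q', 'f', 'g'))

def L(y):
    return y.diff(x, 2) + P*y.diff(x) + Q*y

print(simplify(L(f + g) - (L(f) + L(g))))    # 0  (additivity)
print(simplify(L(alpha*f) - alpha*L(f)))     # 0  (homogeneity)
```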

Well, you're at the end of my handout. Hope it was helpful. Copyright notice: This material is copyright Evan Dummit, 2012. You may not reproduce or distribute this material without my express permission.

Math 320 (part 4): Linear Differential Equations
(by Evan Dummit, 2012, v. 1.10)

Contents
1 Linear Differential Equations
  1.1 Terminology
  1.2 General Theory of Linear Differential Equations
  1.3 Homogeneous Linear Equations with Constant Coefficients
  1.4 Non-Homogeneous Linear Equations with Constant Coefficients
      1.4.1 Undetermined Coefficients
      1.4.2 Variation of Parameters
  1.5 Second-Order Equations: Applications to Newtonian Mechanics
      1.5.1 Spring Problems and Damping
      1.5.2 Resonance and Forcing

1 Linear Differential Equations

The general nth-order linear differential equation is of the form
y^(n) + Pn(x) y^(n-1) + ... + P2(x) y' + P1(x) y = Q(x)
for some functions Pn(x), ..., P2(x), P1(x), and Q(x). (Note that y^(n) denotes the nth derivative of y.) The goal is to study the behavior of the solutions to these equations.

Solving general linear differential equations explicitly is generally hard, unless we are lucky and the equation has a particularly nice form. We can only give a method for writing down the full set of solutions for a small class of linear equations: namely, linear differential equations with constant coefficients. There are a few other equation types (e.g., Euler equations like x^2 y'' + x y' + y = 0) which can be reduced to constant-coefficient equations via substitution.

Thus, we will spend most of our effort discussing how to solve linear equations with constant coefficients, because we can always solve them. These are equations of the form y^(n) + a_{n-1} y^(n-1) + ... + a1 y' + a0 y = Q(x), for some constants a_{n-1}, ..., a0 and some function Q(x).

Even harder than the general linear differential equation are non-linear equations of higher order, such as y'' = e^y y' or y'' (y')^2 = y + 1. We will not discuss non-linear equations at all. Compare with the example of first-order equations: we can solve any first-order linear equation, but very few types of first-order nonlinear equations.

1.1 Terminology

The standard form of a differential equation is when it is written with all terms involving y or higher derivatives on one side, and functions of the variable x on the other side.
Example: The equation y'' + y' + y = 0 is in standard form.
Example: The equation y' = 3x^2 - xy is not in standard form.

An equation is homogeneous if, when it is put into standard form, the x-side is zero. An equation is nonhomogeneous otherwise.
Example: The equation y'' + y' + y = 0 is homogeneous.
Example: The equation y' + xy = 3x^2 is nonhomogeneous.

An nth-order differential equation is an equation in which the highest derivative is the nth derivative.
Example: The equations y' + xy = 3x^2 and y y' = 2 are first-order.
Example: The equation y'' + y' + y = 0 is second-order.

A differential equation is linear if it is a linear combination of y and its derivatives. (Note that the coefficients are allowed to be functions of x.) In other words, it is linear if there are no terms like y^2, or (y')^3, or y y', or e^y.
Example: The equations y' + xy = 3x^2 and y'' + y' + y = 0 are linear.
Example: The equations y y' = 3x^2 and y' + e^y = 0 are not linear.

We say a linear differential equation has constant coefficients if the coefficients of y, y', y'', ... are all constants.
Example: The equation y'' + y' + y = 0 has constant coefficients.
Example: The equation y' + xy = 3x^2 does not have constant coefficients.

For n functions y1, y2, ..., yn which are each differentiable (n - 1) times, the Wronskian of the functions is defined to be the determinant

W(y1, y2, ..., yn) = det [[y1, y2, ..., yn], [y1', y2', ..., yn'], ..., [y1^(n-1), y2^(n-1), ..., yn^(n-1)]],

that is, the determinant of the n × n matrix whose rows are the functions and their successive derivatives. Note that the Wronskian will also be a function of x.

The purpose of the Wronskian is to provide a way to show that functions are linearly independent.

Theorem: A collection of n n-times differentiable functions y1, y2, ..., yn is linearly independent (in the vector space of n-times differentiable functions) if their Wronskian is not the zero function.

To prove the theorem, note that if the functions are linearly dependent with a1 y1 + ... + an yn = 0, then by differentiating the appropriate number of times we see that a1 y1^(i) + ... + an yn^(i) = 0 for any 0 ≤ i ≤ n - 1. Hence, in particular, the columns of the matrix above are linearly dependent (as vectors), and so the determinant of the matrix is zero. Therefore, if the determinant is not zero, the functions cannot be dependent.

Remark: The theorem becomes an if-and-only-if statement (i.e., the functions are linearly independent if and only if the Wronskian is nonzero) if we know that the functions y1, y2, ..., yn are infinitely differentiable. The proof of the other direction is significantly more difficult.

Example: The functions 1 and x are linearly independent, because we can compute W(1, x) = det [[1, x], [0, 1]] = 1.
Example: The functions sin(x) and cos(x) are linearly independent, as W(sin(x), cos(x)) = det [[sin(x), cos(x)], [cos(x), -sin(x)]] = -1.
Example: The functions 1, x, and 1 + x are (rather clearly) linearly dependent. We have W(1, x, 1 + x) = det [[1, x, x + 1], [0, 1, 1], [0, 0, 0]] = 0, by expanding along the bottom row.
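These Wronskians are also easy to check by computer; the snippet below (an illustration of the three examples above, not part of the original notes) uses sympy's built-in wronskian function:

```python
# Recomputing the Wronskians from the examples above.
from sympy import symbols, sin, cos, wronskian, simplify

x = symbols('x')
print(simplify(wronskian([1, x], x)))              # 1
print(simplify(wronskian([sin(x), cos(x)], x)))    # -1
print(simplify(wronskian([1, x, 1 + x], x)))       # 0  (dependent functions)
```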

1.2 General Theory of Linear Differential Equations

Theorem (Homogeneous Linear Equations): If Pn(x), ..., P1(x) are continuous functions, then the set of solutions to the homogeneous nth-order equation y^(n) + Pn(x) y^(n-1) + ... + P2(x) y' + P1(x) y = 0 is an n-dimensional vector space.

The fact that the set of solutions is a vector space is not so hard to show using the subspace criteria. The real result of this theorem, which is analogous to the existence-uniqueness theorem for first-order equations, is that the set of solutions is n-dimensional.

Example: Consider the homogeneous equation y''(x) = 0. We can just integrate twice to see that the solutions are y(x) = Ax + B, for any constants A and B. Indeed, as the theorem dictates, the solution functions form a two-dimensional space, spanned by the two basis elements 1 and x.

Theorem (Existence-Uniqueness for Linear Equations): If Pn(x), ..., P1(x) and Q(x) are functions continuous on an interval containing a, then there is a unique solution (possibly on a smaller interval) to the initial value problem y^(n) + Pn(x) y^(n-1) + ... + P2(x) y' + P1(x) y = Q(x), for any initial conditions y(a) = b1, y'(a) = b2, ..., and y^(n-1)(a) = bn. Additionally, every solution y_gen to the general nth-order equation may be written as y_gen = y_par + y_hom, where y_par is any one particular solution to the equation, and y_hom is a solution to the homogeneous equation y^(n) + Pn(x) y^(n-1) + ... + P2(x) y' + P1(x) y = 0.

What this theorem says is: in order to solve the general equation y^(n) + Pn(x) y^(n-1) + ... + P2(x) y' + P1(x) y = Q(x), it is enough to find one solution to this equation, along with the general solution to the homogeneous equation.

The existence-uniqueness part of the theorem is hard, but the second part is fairly simple to show: if y1 and y2 are solutions to the general equation, then their difference y1 − y2 is a solution to the homogeneous equation. To see this, just subtract the two equations and apply derivative rules.

A more advanced way to see the same thing is to use the fact that the map L sending y to y^(n) + Pn(x) y^(n-1) + ... + P2(x) y' + P1(x) y is a linear transformation. Then L(y1) = L(y2) says L(y1 − y2) = 0, so that y1 − y2 is a solution to the homogeneous equation.

Example: In order to solve the equation y''(x) = e^x, the theorem says we only need to find one function which is a solution, and then solve the homogeneous equation y''(x) = 0.
We can just try simple functions until we discover that y(x) = e^x has y''(x) = e^x. Then we need only solve the homogeneous equation y''(x) = 0, whose solutions we know are Ax + B. Thus the general solution to the general equation y''(x) = e^x is y(x) = e^x + Ax + B.
We can also verify that if we impose the initial conditions y(0) = c1 and y'(0) = c2, then (as the theorem dictates) there is the unique solution y = e^x + (c2 − 1)x + (c1 − 1).
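This "particular plus homogeneous" structure is easy to see with sympy's dsolve; the snippet below just re-derives the y''(x) = e^x example (an illustration, not part of the original handout):

```python
# Solve y'' = e^x and recover the structure y = y_particular + y_homogeneous.
from sympy import symbols, Function, Eq, exp, dsolve

x, c1, c2 = symbols('x c1 c2')
y = Function('y')
ode = Eq(y(x).diff(x, 2), exp(x))

print(dsolve(ode, y(x)))    # y(x) = C1 + C2*x + exp(x)

# Imposing the initial conditions y(0) = c1, y'(0) = c2 from the text:
print(dsolve(ode, y(x), ics={y(0): c1, y(x).diff(x).subs(x, 0): c2}))
# y(x) = (c1 - 1) + (c2 - 1)*x + exp(x)
```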

1.3 Homogeneous Linear Equations with Constant Coefficients

The general linear homogeneous differential equation with constant coefficients is y^(n) + a_{n-1} y^(n-1) + ... + a1 y' + a0 y = 0, where a_{n-1}, ..., a0 are some constants. From the existence-uniqueness theorem we know that the set of solutions is an n-dimensional vector space.

Based on solving first-order linear homogeneous equations (i.e., y' + ky = 0), we might expect the solutions to involve exponentials e^(rx). If we try setting y = e^(rx), then after some arithmetic we end up with r^n e^(rx) + a_{n-1} r^(n-1) e^(rx) + ... + a1 r e^(rx) + a0 e^(rx) = 0. Multiplying both sides by e^(-rx) and cancelling yields the characteristic equation r^n + a_{n-1} r^(n-1) + ... + a1 r + a0 = 0.

If we can find n values of r satisfying this nth-degree polynomial (i.e., if we can factor the polynomial and see that it has n distinct roots), then the theorem tells us we will have found all of the solutions. If we are unlucky and the polynomial has a repeated root, then we need to try something else.

If there are non-real roots (note they will come in complex conjugate pairs!) then we would end up with r1 = α + βi and r2 = α − βi, and e^(r1 x) and e^(r2 x) as our solutions. But we really want real-valued solutions, and e^(r1 x) and e^(r2 x) have complex numbers in the exponents. However we can just write out the real and imaginary parts using Euler's Theorem, and take linear combinations to obtain the two real-valued solutions e^(αx) sin(βx) = (1/(2i))[e^(r1 x) − e^(r2 x)] and e^(αx) cos(βx) = (1/2)[e^(r1 x) + e^(r2 x)].

Taking motivation from the case of y^(k) = 0, whose characteristic equation is r^k = 0 (with the k-fold repeated root 0) and whose solutions are y(x) = A1 + A2 x + A3 x^2 + ... + Ak x^(k-1), we guess wildly that if other roots are repeated, we want to multiply the corresponding exponentials e^(rx) by a power of x.

If we put all of these ideas together we can prove that this general outline will, in fact, give us n linearly independent functions, and hence gives the general solution to any homogeneous linear differential equation with constant coefficients.

Advanced Remark: Another way to organize this information is to think of everything in terms of linear transformations. If D represents the linear transformation sending a function to its derivative, then we are looking for functions y with the property that L(y) = 0, where L is the linear operator D^n + a_{n-1} D^(n-1) + ... + a1 D + a0. Note that D^2, for instance, means taking the derivative twice in a row; i.e., D^2 is the second derivative. Then what we are secretly doing is factoring the polynomial D^n + a_{n-1} D^(n-1) + ... + a1 D + a0 into a product of linear terms (D − r1)^(k1) ··· (D − rj)^(kj), solving each of the simpler equations (D − ri)^(ki) y = 0, and adding up the resulting terms.

To solve a linear homogeneous differential equation with constant coefficients, follow these steps (a small computational sketch of Steps 2 and 3 follows this list):

Step 1: Rewrite the differential equation in the standard form y^(n) + a_{n-1} y^(n-1) + ... + a1 y' + a0 y = 0 (if necessary).
Step 2: Factor the characteristic equation r^n + a_{n-1} r^(n-1) + ... + a1 r + a0 = 0.
Step 3: For each irreducible factor in the characteristic equation, write down the corresponding terms in the solution:
  For factors (r − λ)^k where λ is a real number, add the terms of the form e^(λx), x e^(λx), ..., x^(k-1) e^(λx).
  For irreducible factors (r^2 + cr + d)^k with roots r = α ± βi, add the terms of the form e^(αx) sin(βx), x e^(αx) sin(βx), ..., x^(k-1) e^(αx) sin(βx) and e^(αx) cos(βx), x e^(αx) cos(βx), ..., x^(k-1) e^(αx) cos(βx).
Step 4: If given additional conditions on the solution, solve for the coefficients (if necessary).
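Steps 2 and 3 are easy to mechanize; here is a small sympy sketch (mine, not the author's) that factors a characteristic polynomial and reports each root with its multiplicity, from which the solution terms can be read off:

```python
# Roots (with multiplicity) of some characteristic polynomials.
from sympy import symbols, roots

r = symbols('r')
print(roots(r**2 + r - 6, r))      # {2: 1, -3: 1}      -> terms e^(2x), e^(-3x)
print(roots(r**2 - 2*r + 1, r))    # {1: 2}             -> terms e^x, x*e^x
print(roots(r**2 + 4, r))          # {2*I: 1, -2*I: 1}  -> terms cos(2x), sin(2x)
```

The first polynomial here is the characteristic equation of the first worked example below.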

Example: Find all functions

such that

y + y 6 = 0. r2 + r 6 = 0
which has roots

Step 2: The characteristic equation is Step 3:

r=2 e
3x
.

and

r = 3.

We have two distinct real roots, so our terms are

2x

and

So the general solution is

y = C1 e

2x

+ C2 e

3x

Example: Find all functions

such that

y 2y + 1 = 0,

with

y(0) = 1

and

y (0) = 2. r = 1.

Step 2: The characteristic equation is Step 3: There is a double root at

r2 2r + 1 = 0

which has only the solution

r = 1,

so our terms are

and

xe

Hence the general solution is

y = C1 ex + C2 xex .
Step 4: Plugging in the two conditions gives which

1 = C1 e0 + C2 0,

and

2 = C1 e0 + C2 (0 + 1)e0
x x
.

from

C1 = 1

and

C2 = 1.

Hence the particular solution requested is

y = e + xe

Example: Find all real-valued functions

such that

y = 4y .

Step 1: The standard form here is

y + 4y = 0.

Step 2: The characteristic equation is Step 3: write

r2 + 4 = 0

which has roots

r = 2i

and

r = 2i.

We have two complex-conjugate roots. and

Since the problem asks for real-valued functions we to see that the general solution is

er1 x = cos(2x) + i sin(2x) y = C1 cos(2x) + C2 sin(2x) .

er2 x = cos(2x) i sin(2x)

Example: Find all real-valued functions

such that

y (5) + 5y (4) + 10y + 10y + 5y + y = 0.


which factors as

Step 2: The characteristic equation is Step 3: We have a 5-fold repeated root Hence the general solution is

r5 + 5r4 + 10r3 + 10r2 + 5r + 1 = 0

(r + 1)5 = 0.

r = 1. Thus the terms are ex , xex , x2 ex , x3 ex , and x4 ex . y = C1 ex + C2 xex + C3 x2 ex + C4 x3 ex + C5 x4 ex . y


whose fourth derivative is the same as or in standard form,

Example: Find all real-valued functions

y.

Step 1: This is the equation

= y,

y = 0. (r + 1)(r 1)(r + i)(r i) = 0.


x
,

Step 2: The characteristic equation is Step 3: We have the four roots general solution is

r 1=0

which factors as

1, 1, i, i.
x

Thus the terms are .

ex , sin(x),

and

cos(x).

Hence the

y = C1 e + C2 e

+ C3 sin(x) + C4 cos(x)

1.4 Non-Homogeneous Linear Equations with Constant Coefficients



The general linear differential equation with constant coefficients is of the form y^(n) + a_{n-1} y^(n-1) + ... + a1 y' + a0 y = Q(x), where a_{n-1}, ..., a0 are some constants and Q(x) is some function of x.

From the general theory, all we need to do is find one solution to the general equation, and find all solutions to the homogeneous equation. Since we know how to solve the homogeneous equation in full generality, we just need to develop some techniques for finding one solution to the general equation. There are essentially two ways of doing this.

The Method of Undetermined Coefficients is just a fancy way of making an educated guess about what the form of the solution will be and then checking whether it works. It will work whenever the function Q(x) is a linear combination of terms of the form x^k e^(αx) (where k is an integer and α is a complex number): thus, for example, we could use the method for something like Q(x) = x^3 e^(8x) cos(x) − 4 sin(x) + x^10, but not for something like Q(x) = tan(x).

Variation of Parameters is a more complicated method which uses some linear algebra and cleverness to use the solutions of the homogeneous equation to find a solution to the non-homogeneous equation. It will always work, for any function Q(x), but generally requires more setup and computation.

1.4.1 Undetermined Coefficients

The idea behind the method of undetermined coefficients is that we can "guess" what our solution should look like (up to some coefficients we have to solve for), if Q(x) involves sums and products of polynomials, exponentials, and trigonometric functions. Specifically, we try a solution y = [stuff], where the "stuff" is a sum of things similar to the terms in Q(x).

Here is the procedure for generating the trial solution:

Step 1: Generate the first guess for the trial solution as follows: replace all numerical coefficients of terms in Q(x) with variable coefficients. If there is a sine (or cosine) term, add in the companion cosine (or sine) terms, if they are missing. Then group the terms of Q(x) into blocks of terms which are the same up to a power of x, and add in any missing lower-degree terms in each block.
  Thus, if a term of the form x^n e^(rx) appears in Q(x), fill in the terms of the form e^(rx)[A0 + A1 x + ... + An x^n], and if a term of the form x^n e^(ax) sin(bx) or x^n e^(ax) cos(bx) appears in Q(x), fill in the terms of the form e^(ax) cos(bx)[D0 + D1 x + ... + Dn x^n] + e^(ax) sin(bx)[E0 + E1 x + ... + En x^n].
Step 2: Solve the homogeneous equation, and write down the general solution.
Step 3: Compare the first guess for the trial solution with the solutions to the homogeneous equation. If any terms overlap, multiply all terms in the overlapping block by the appropriate power of x which will remove the duplication.

Here is a series of examples demonstrating the procedure for generating the trial solution:

Example:

y y = x. Q(x)
to get

Step 1: We ll in the missing constant term in Step 2: General solution is

D0 + D1 x.
.

A1 ex + A2 ex . D0 + D1 x

Step 3: There is no overlap, so the trial solution is

Example:

y + y = x 2. D0 + D1 x. A + Bex . solution D0 ) so we multiply


the corresponding trial solution terms is the trial solution.

Step 1: We have

Step 2: General homogeneous solution is Step 3: There is an overlap (the by

x,

to get

D0 x + D1 x D0 ex .

. Now there is no overlap, so

D0 x + D1 x2

Example:

y y = ex . Aex + Bex . x solution D0 e ) so we multiply the x is the trial solution. so D0 xe

Step 1: We have

Step 2: General homogeneous solution is Step 3: There is an overlap (the

trial solution term by

x,

to get

D0 xe
Example:

. Now there is no overlap,

y 2y + y = 3ex . D0 ex . Aex + Bxex . D0 ex ) so we multiply D0 x e


2 x
. the trial solution term by

Step 1: We have

Step 2: General homogeneous solution is Step 3: There is an overlap (the solution

x2 ,

to get

rid of the overlap, giving us the trial solution Example:

y 2y + y = x3 ex . D0 ex + D1 xex + D2 x2 ex + D3 x3 ex . x x The general homogeneous solution is A0 e + A1 xe . x x There is an overlap (namely D0 e + D1 xe ) so we multiply the trial solution
2 x 3 x 4 x 5 x
as the trial solution.

Step 1: We ll in the lower-degree terms to get Step 2: Step 3: get

terms by

x2

to

D0 x e + D1 x e + D2 x e + D3 x e y + y = sin(x).

Example:

Step 1: We ll in the missing cosine term to get Step 2: The general homogeneous solution is

D0 cos(x) + E0 sin(x). A cos(x) + B sin(x). Step 3: There is an overlap (all of D0 cos(x) + E0 sin(x)) so we multiply the trial solution terms by x to get D0 x cos(x) + E0 x sin(x). There is now no overlap so D0 x cos(x) + E0 x sin(x) is the trial y + y = x sin(x). D0 cos(x) +

solution.

Example:

Step 1: We ll in the missing cosine term and then all the lower-degree terms to get

E0 sin(x) + D1 x cos(x) + E1 x sin(x).


Step 2: The general homogeneous solution is Step 3: There is an overlap (all of in that group by

A cos(x) + B sin(x). D0 cos(x) + E0 sin(x)) so we


2

multiply the trial solution terms , which is the trial

to get

D0 x cos(x) + E0 x sin(x) + D1 x cos(x) + E1 x2 sin(x)

solution since now there is no overlap.

Example:

y y = x + xex . xex
and the lower-degree term for

Step 1: We ll in the lower-degree term for

x,

to get

A0 + A1 x +

B0 e + B1 xe

C0 + C1 x + Dex . x Step 3: There are overlaps in both groups of terms: A0 + A1 x and B0 e each overlap, so we multiply 2 x the  x group by x and the  e  group by x to get rid of the overlaps. There are now no additional 2 3 x 2 x overlapping terms, so the trial solution is A0 x + A1 x + B0 xe + B1 x e .
Step 2: The general homogeneous solution is

Example:

+ 2y + y = xex + x cos(x). x cos(x)


and

Step 1: We ll in the lower-degree term for the lower-degree terms for

xex , then the missing sine term for x cos(x), and then x sin(x), to get A0 ex + A1 xex + D0 cos(x) + E0 sin(x) + B0 cos(x) + C0 sin(x) + B1 x cos(x) + C1 x sin(x). D0 cos(x) + E0 sin(x) + D1 x cos(x) + D1 x sin(x)) so

D1 x cos(x) + E1 x sin(x).
Step 2: The general homogeneous solution is Step 3: There is an overlap (namely, all of multiply that group by the trial solution is we so

x2 to get rid of the overlap. There are no additional overlapping terms, A0 ex + A1 xex + D0 x2 cos(x) + E0 x2 sin(x) + D1 x3 cos(x) + E1 x3 sin(x) .

Here is a series of examples nding the general trial solution and then solving for the coecients:

Example: Find a function

such that

y + y + y = x. y = D0 + D1 x,
so that because there is no overlap with the

The procedure produces our trial solution as solutions to the homogeneous equation. We plug in and get So our solution is

0 + (D1 ) + (D1 x + D0 ) = x,
.

D1 = 1

and

D0 = 1.

y =x1 y

Example: Find a function

such that

y y = 2ex . y = D0 xex ,
since

The procedure gives the trial solution as homogeneous equation. If

D 0 ex

overlaps with the solution to the

y = D0 xex

then

Solving yields

y = D0 (x + 2)ex so plugging in D0 = 1, so our solution is y = xex . y


such that

yields

y y = [D0 (x + 2)ex ] [D1 x ex ] = 2ex .

Example: Find a function

y 2y + y = x + sin(x). y = (D0 + D1 x) + (D2 cos(x) + D3 sin(x)),


by lling in the

The procedure gives the trial solution as homogeneous equation.

missing constant term and cosine term, and because there is no overlap with the solutions to the

Then we have

y = D2 cos(x) D3 sin(x) and y = D1 D2 sin(x) + D3 cos(x) so plugging in yields

y 2y +y = [D2 cos(x) D3 sin(x)]2 [D1 D2 sin(x) + D3 cos(x)]+[D0 + D1 x + D2 cos(x) + D3 sin(x)]


and setting this equal to

x + sin(x)

then requires

D3 2D2 D3 = 0,

so our solution is

D0 2D1 = 0, D1 = 1, D2 + 2D3 D2 = 1, 1 y = x + 2 + cos(x) . 2 y +y =0 y = C1 cos(x) + C2 sin(x). y = D0 x cos(x) + then multiplying both by x due to the overlap
are

Example: Find all functions

such that

y + y = sin(x).

The solutions to the homogeneous system

Then the procedure gives the trial solution for the non-homogeneous equation as

D1 x sin(x),

by lling in the missing cosine term and

with the solutions to the homogeneous equation.

y = D0 x cos(x) 2D0 sin(x) D1 x sin(x) + 2D1 cos(x). Plugging in yields y +y = (D0 x cos(x) 2D0 sin(x) D1 x sin(x) + 2D1 cos(x))+(D0 x sin(x) + D1 x cos(x)), 1 and so setting this equal to sin(x), we obtain D0 = 0 and D1 = . 2 1 Therefore the set of solutions is y = x cos(x) + C1 cos(x) + C2 sin(x) , for constants C1 and C2 . 2
We can compute (eventually) that
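Here is a brief sympy version of this last computation (an illustrative check, not part of the original solution):

```python
# Undetermined coefficients for y'' + y = sin(x), trial y = D0*x*cos(x) + D1*x*sin(x).
from sympy import symbols, sin, cos, solve, expand

x, D0, D1 = symbols('x D0 D1')
y_trial = D0*x*cos(x) + D1*x*sin(x)

residual = expand(y_trial.diff(x, 2) + y_trial - sin(x))
# residual = -2*D0*sin(x) + 2*D1*cos(x) - sin(x); match the sin and cos coefficients:
print(solve([residual.coeff(sin(x)), residual.coeff(cos(x))], [D0, D1]))
# {D0: -1/2, D1: 0}, giving the particular solution y = -x*cos(x)/2
```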

Advanced Remark: The formal idea behind the method of undetermined coecients is that the terms on the right-hand side are themselves solutions of dierential equations  and hence are sent to zero by a polynomial in the dierential operator

sending a function to its derivative.

(Thus, for example,

D(x2 ) = 2x

and

D(e ) = e

.)

Then if we apply that polynomial in

D which sends Q(x) to zero, to both sides of the original equation, we Q(x). D3 to both sides), in order = 0, which has characteristic

will end up with a homogeneous equation whose characteristic polynomial is the product of the original characteristic polynomial and the characteristic polynomial for

Example: Find the form of a solution to

y y =x

If we dierentiate both sides 3 times (i.e., apply the dierential operator to kill o the polynomial

x term on the right-hand side then we get y y r5 r3 = (r2 1) (r3 ). Observe that this polynomial is the product of the characteristic 2 3 3 polynomial r 1 of the original equation and the polynomial r corresponding to D . Now y (5) y (3) = 0 is homogeneous, so we can write down the general solution to obtain y = C1 + C2 x + C3 x2 + C4 ex + C5 ex . This is the same solution form that the method of undetermined coecients gives, though with some extra lower-degree terms (which are solutions to the homogeneous equation y y = 0).
Example: Find the form of a solution to

(5)

(3)

y + y = x + sin(x).
2

This is the same as the equation We want to apply the operator

(D + 1) y = x + sin(x) D2 to kill the x term, and the operator D2 + 1 to kill the sin(x) term. 2 2 2 The new dierential equation, after we are done, is (D + 1)(D )(D + 1) y = 0. 2 2 2 The characteristic polynomial is (r + 1) r , which has a double root at each of r = 0 and i. Solving the resulting homogeneous equation gives the solutions as y = C1 sin(x) + C2 x sin(x) + C3 cos(x) + C4 x cos(x) + C5 + C6 x, which is the same thing that the method of undetermined

coecients gives, up to some extra terms.

1.4.2 Variation of Parameters

Variation of Parameters will solve any non-homogeneous linear equation provided that the solutions to the homogeneous equation are known. However, the derivation is not entirely enlightening, so I will just give the steps to follow to solve y^(n) + Pn(x) y^(n-1) + ... + P2(x) y' + P1(x) y = Q(x). (The method requires being able to solve the homogeneous equation, so we will typically apply it when the corresponding homogeneous equation y^(n) + a_{n-1} y^(n-1) + ... + a1 y' + a0 y = 0 has constant coefficients.)

Step 1: Solve the corresponding homogeneous equation y^(n) + Pn(x) y^(n-1) + ... + P2(x) y' + P1(x) y = 0 and find n (linearly independent) solutions y1, ..., yn.

Step 2: Look for functions v1, ..., vn making yp = v1 y1 + v2 y2 + ... + vn yn a solution to the original equation: do this by requiring v1', v2', ..., vn' to satisfy the system of equations
  v1' y1 + v2' y2 + ... + vn' yn = 0
  v1' y1' + v2' y2' + ... + vn' yn' = 0
  ...
  v1' y1^(n-2) + v2' y2^(n-2) + ... + vn' yn^(n-2) = 0
  v1' y1^(n-1) + v2' y2^(n-1) + ... + vn' yn^(n-1) = Q(x).
Solve the relations for v1', v2', ..., vn' using Cramer's Rule (or any other method). This yields vi' = Wi(x)/W(x), where W is the Wronskian of the functions y1, y2, ..., yn and Wi is the same Wronskian determinant except with the ith column replaced by the column vector (0, 0, ..., 0, Q(x)).

Step 3: Integrate each of these relations to find v1, ..., vn. (Ignore constants of integration.)

Step 4: Write down the particular solution to the nonhomogeneous equation, yp = v1 y1 + v2 y2 + ... + vn yn.

Step 5: If asked, add the particular solution to the general solution of the homogeneous equation, to find all solutions of the nonhomogeneous equation: y = yp + C1 y1 + ... + Cn yn. Plug in any extra conditions given to solve for the coefficients. (A short symbolic sketch of Steps 2 through 4 is given below, before the worked examples.)
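For a second-order equation, Steps 2 through 4 can be carried out symbolically in a few lines. The sketch below is my own (it assumes the two homogeneous solutions are already known) and it reproduces the y'' + y = sec(x) example that follows:

```python
# Variation of parameters for y'' + y = sec(x), with y1 = cos(x) and y2 = sin(x).
from sympy import symbols, sin, cos, sec, Matrix, integrate, simplify

x = symbols('x')
y1, y2, Q = cos(x), sin(x), sec(x)

W  = Matrix([[y1, y2], [y1.diff(x), y2.diff(x)]]).det()   # Wronskian
W1 = Matrix([[0, y2], [Q, y2.diff(x)]]).det()             # first column replaced
W2 = Matrix([[y1, 0], [y1.diff(x), Q]]).det()             # second column replaced

v1 = integrate(W1 / W, x)     # log(cos(x))
v2 = integrate(W2 / W, x)     # x

print(simplify(v1*y1 + v2*y2))   # log(cos(x))*cos(x) + x*sin(x), the particular solution
```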

Example: Find all functions

for which

y + y = sec(x). y +y = 0
which has two independent solutions of

Step 1: The homogeneous equation is and

y1 = cos(x)

y2 = sin(x). Q(x) = sec(x).

Step 2: We have

cos(x) sin(x) = 1. sin(x) cos(x) 0 sin(x) = sin(x) sec(x). Also, W1 = sec(x) cos(x) cos(x) 0 = cos(x) sec(x) = 1. Finally, W2 = sin(x) sec(x) Thus plugging in to the formulas gives v1 = sin(x) sec(x) = tan(x) and v2 = cos(x) sec(x) = 1.
We have

W =

Step 3: Integrating yields

v1 = ln(cos(x))

and

v2 = x . y = [ln(cos(x)) cos(x) + x sin(x)] + C1 sin(x) + C2 cos(x)


.

Step 4: We obtain the particular solution of

yp = ln(cos(x)) cos(x) + x sin(x).

Step 5: The general solution is, therefore, given by

Example: Find all functions

for which

y y + y y = ex .

We could use undetermined coecients to solve this, but let's use variation of parameters. Step 1: The homogeneous equation is

y y +y y = 0

which has characteristic polynomial

r 1 = (r 1)(r + 1).
Step 2: We have

So three independent solutions are

y1 = cos(x), y2 = sin(x),

and

r3 r2 + y3 = ex .

Q(x) = ex .

cos(x) sin(x) ex cos(x) sin(x) ex R3 +R1 x sin(x) cos(x) ex = 2ex . = We have W = sin(x) cos(x) e x 0 0 2ex cos(x) sin(x) e 0 sin(x) ex cos(x) ex = e2x (sin(x) cos(x)). Next, W1 = 0 x e sin(x) ex cos(x) 0 ex Also, W2 = sin(x) 0 ex = e2x (cos(x) + sin(x)). cos(x) ex ex cos(x) sin(x) 0 0 = ex . Finally, W3 = sin(x) cos(x) cos(x) sin(x) ex 1 1 Thus plugging in to the formulas gives v1 = ex (sin(x) cos(x)), v2 = ex (cos(x) + sin(x)), and 2 2 1 v3 = . 2 1 1 1 Step 3: Integrating yields v1 = ex cos(x), v2 = ex sin(x), and v3 = x. 2 2 2 1 1 1 1 1 Step 4: The particular solution is yp = ex cos2 (x) ex sin2 (x) + xex = ex + xex . 2 2 2 2 2 1 Step 5: The general solution is, therefore, given by y = xex + C1 cos(x) + C2 sin(x) + C3 ex . (Note 2 1 x x that we absorbed the e term from the particular solution into C3 e .) 2
Theorem (Abel's Identity): If y1, y2, ..., yn are any solutions to the homogeneous linear differential equation y^(n) + Pn(x) y^(n-1) + ... + P2(x) y' + P1(x) y = 0, then the Wronskian W(y1, ..., yn) is equal to C e^(−∫ Pn(x) dx), for some constant C.

Abel's Identity allows one to compute the Wronskian (up to a constant factor) without actually finding the solutions y1, y2, ..., yn.
Another result of Abel's Identity is that the Wronskian is either zero everywhere (if C = 0) or nonzero everywhere (if C ≠ 0, since exponentials are never zero).
Abel's Identity provides some relationships between the solution functions, which can sometimes be used to deduce information about them.

For example, for a second-order linear differential equation with two solutions y1 and y2, the Wronskian is W(x) = y1 y2' − y1' y2. Therefore, if one solution y1 is known, Abel's Identity gives the value of W(x) (up to a value for C, which we can take to be 1 by rescaling), and therefore yields a first-order linear differential equation for the other solution y2, which can then be solved for y2. This procedure is known in general as reduction of order.

The (very clever) proof of Abel's Identity involves showing that the Wronskian W satisfies the differential equation W' = −Pn(x) W, by differentiating the determinant that defines W and then using row operations to deduce that W' = −Pn(x) W.
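The key step of this proof can be verified symbolically for n = 2 (a sketch with generic coefficient functions p and q, in my own notation):

```python
# Check that W' = -p(x)*W when y1, y2 both solve y'' + p(x)*y' + q(x)*y = 0.
from sympy import symbols, Function, simplify

x = symbols('x')
p, q, y1, y2 = (Function(name)(x) for name in ('p', 'q', 'y1', 'y2'))

W = y1*y2.diff(x) - y2*y1.diff(x)

# Impose the differential equation on y1 and y2: y'' = -p*y' - q*y.
ode = {y1.diff(x, 2): -p*y1.diff(x) - q*y1,
       y2.diff(x, 2): -p*y2.diff(x) - q*y2}

print(simplify(W.diff(x).subs(ode) + p*W))   # 0, so W' = -p*W and W = C*exp(-(integral of p))
```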

1.5 Second-Order Equations: Applications to Newtonian Mechanics

One of the applications we somewhat care about is the use of second-order dierential equations to solve certain physics problems. Most of the examples involve springs, because springs are easy to talk about.

Note: Second-order linear equations also arise often in basic circuit problems in physics and electrical engineering. All of the discussion of the behaviors of the solutions to these second-order equations also carries over to that setup.

1.5.1 Spring Problems and Damping

Basic setup: An object is attached to one end of a spring whose other end is xed. The mass is displaced some amount from the equilibrium position, and the problem is to nd the object's position as a function of time.

Various modications to this basic setup include any or all of (i) the object slides across a surface thus adding a force (friction) depending on the object's velocity or position, (ii) the object hangs vertically thus adding a constant gravitational force, (iii) a motor or other device imparts some additional nonconstant force (varying with time) to the object.

In order to solve problems like this one, follow these steps:

Step 1: Draw a diagram and label the quantity or quantities of interest (typically, it is the position of a moving object) and identify and label all forces acting on those quantities. Step 2: Find the values of the forces involved, and then use Newton's Second Law (F conditions.

= ma)

to write

down a dierential equation modeling the problem. Also use any information given to write down initial

In the above,

is the net force on the object  i.e., the sum of each of the individual forces acting on

m is the mass of the object, and a is the object's acceleration. Remember that acceleration is the second derivative of position with respect to time  thus, if y(t) is the object's position, acceleration is y (t). You may need to do additional work to solve for unknown constants  e.g., for a spring constant, if
the mass (with the proper sign)  while it is not explicitly given to you  before you can fully set up the problem.

Step 3: Solve the dierential equation and nd its general solution. Step 4: Plug in any initial conditions to nd the specic solution. Step 5: Check that the answer obtained makes sense in the physical context of the problem.

In other words, if you have an object attached to a xed spring sliding on a frictionless surface, you should expect the position to be sinusoidal, something like constants

C1 sin(t) + C2 cos(t) + D

for some

C1 , C2 , , D. t
grows to

If you have an object on a spring sliding on a surface imparting friction, you should expect the position to tend to some equilibrium value as down' as time goes on.

since the object should be 'slowing

Basic Example: An object of mass m is attached to a spring of spring constant k whose other end is fixed. The object is displaced a distance d from the equilibrium position of the spring, and is let go with velocity v0 at time t = 0. If the object is restricted to sliding horizontally on a frictionless surface, find the position of the object as a function of time.

Step 1: Take y(t) to be the displacement of the object from the equilibrium position. The only force acting on the object is from the spring, F_spring.
Step 2: We know that F_spring = −k y from Hooke's Law (aka, the only thing we know about springs). Therefore we have the differential equation −k y = m y''. We are also given the initial conditions y(0) = d and y'(0) = v0.
Step 3: We can rewrite the differential equation as m y'' + k y = 0, or as y'' + (k/m) y = 0. The characteristic equation is then r^2 + k/m = 0, with roots r = ± i sqrt(k/m). Hence the general solution is y = C1 cos(ωt) + C2 sin(ωt), where ω = sqrt(k/m).
Step 4: The initial conditions give d = y(0) = C1 and v0 = y'(0) = ω C2, hence C1 = d and C2 = v0/ω. Hence the solution we want is y = d cos(ωt) + (v0/ω) sin(ωt).
Step 5: The solution we have obtained makes sense in the context of this problem, since on a frictionless surface we should expect that the object's motion would be purely oscillatory: it should just bounce back and forth along the spring forever, since there is nothing to slow its motion. We can even see that the form of the solution agrees with our intuition: the fact that the frequency ω = sqrt(k/m) increases with bigger spring constant but decreases with bigger mass makes sense. A stronger spring with larger k should pull back harder on the object and cause it to oscillate more quickly, while a heavier object should resist the spring's force and oscillate more slowly.
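As a quick numerical sanity check of this answer (an illustration with made-up values m = 1, k = 4, d = 1, v0 = 0.5, using scipy's standard solver):

```python
# Integrate m*y'' + k*y = 0 numerically and compare with d*cos(w*t) + (v0/w)*sin(w*t).
import numpy as np
from scipy.integrate import solve_ivp

m, k, d, v0 = 1.0, 4.0, 1.0, 0.5
w = np.sqrt(k / m)

def rhs(t, state):
    y, v = state
    return [v, -(k / m) * y]          # y' = v, v' = -(k/m)*y

t = np.linspace(0, 10, 200)
numeric  = solve_ivp(rhs, (0, 10), [d, v0], t_eval=t, rtol=1e-8).y[0]
analytic = d*np.cos(w*t) + (v0/w)*np.sin(w*t)

print(np.max(np.abs(numeric - analytic)))   # tiny; the two curves agree
```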

Most General Example: An object of mass m is attached to a spring of spring constant k whose other end is fixed. The object is displaced a distance d from the equilibrium position of the spring, and is let go with velocity v0 at time t = 0. A motor attached to the object imparts a force along its direction of motion given by R(t). If the object is restricted to sliding horizontally on a surface which imparts a frictional force of μ times the velocity of the object (opposite to the object's motion), set up a differential equation modeling the problem.

As before we take y(t) to be the displacement of the object from the equilibrium position. The forces acting on the object are from the spring, F_spring, from friction, F_friction, and from the motor, F_motor.

We know that F_spring = −k y from Hooke's Law (aka, the only thing we know about springs). We are also given that F_friction = −μ y', since the force acts opposite to the direction of motion and velocity is y'. And we are just given F_motor = R(t).

Plugging in gives us the differential equation −k y − μ y' + R(t) = m y'', which in standard form is m y'' + μ y' + k y = R(t). We are also given the initial conditions y(0) = d and y'(0) = v0.
Some Terminology: If we were to solve the differential equation m y'' + μ y' + k y = 0 (here we assume that there is no outside force acting on the object, other than the spring and friction), we would observe a few different kinds of behavior depending on the parameters m, μ, and k.

Overdamped Case: If μ^2 − 4mk > 0 and R(t) = 0, we would end up with general solutions of the form C1 e^(r1 t) + C2 e^(r2 t), which when graphed is just a sum of two exponentially-decaying functions. Physically, as we can see from the condition μ^2 − 4mk > 0, this means we have "too much" friction, since we can just see from the form of the solution function that the position of the object will just slide back towards its equilibrium at y = 0 without oscillating at all. This is the overdamped case. [Overdamped because there is "too much" damping.]

Critically Damped Case: If μ^2 − 4mk = 0 and R(t) = 0, we would end up with general solutions of the form (C1 + C2 t) e^(rt), which when graphed is a slightly-slower-decaying exponential function that still does not oscillate, but could possibly cross the position y = 0 once, depending on the values of C1 and C2. This is the critically damped case. [Critically because we give the name "critical" to values where some kind of behavior transitions from one thing to another.]

Underdamped Case: If μ^2 − 4mk < 0 and R(t) = 0, we end up with general solutions of the form e^(−αt)[C1 cos(βt) + C2 sin(βt)], where α = μ/(2m) and β^2 = (4mk − μ^2)/(4m^2). When graphed this is a sine curve times an exponentially-decaying function. Physically, this means that there is some friction (the exponential), but "not enough" friction to eliminate the oscillations entirely: the position of the object will still tend toward y = 0, but the sine and cosine terms will ensure that it continues oscillating. This is the underdamped case. [Underdamped because there's not enough damping.]

Undamped Case: If there is no friction (i.e., μ = 0), we saw earlier that the solutions are of the form y = C1 cos(ωt) + C2 sin(ωt), where ω^2 = k/m. Since there is no friction, it is not a surprise that this is referred to as the undamped case.
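Since the case distinction depends only on the sign of μ^2 − 4mk (the discriminant of the characteristic equation m r^2 + μ r + k = 0), it is trivial to classify a given system; here is a small helper of my own, purely for illustration:

```python
# Classify the damping regime of m*y'' + mu*y' + k*y = 0 from the discriminant.
def damping_regime(m, mu, k):
    if mu == 0:
        return "undamped"
    disc = mu**2 - 4*m*k
    if disc > 0:
        return "overdamped"
    if disc == 0:
        return "critically damped"
    return "underdamped"

print(damping_regime(1, 5, 4))   # overdamped        (25 - 16 > 0)
print(damping_regime(1, 4, 4))   # critically damped (16 - 16 = 0)
print(damping_regime(1, 1, 4))   # underdamped       (1 - 16 < 0)
print(damping_regime(1, 0, 4))   # undamped
```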

1.5.2 Resonance and Forcing

Suppose an object of mass m (sliding on a frictionless surface) is oscillating on a spring with frequency ω. Examine what happens to the object's motion if an external force F(t) = A cos(ωt) is applied which oscillates at the same frequency ω.

From the solution to the Basic Example above, we know that ω = sqrt(k/m), so we must have k = m ω^2. Then if y(t) is the position of the object once we add in this new force R(t), Newton's Second Law now gives −k y + R(t) = m y'', or m y'' + k y = R(t). If we divide through by m and put in R(t) = A cos(ωt) and k = m ω^2, we get y'' + ω^2 y = (A/m) cos(ωt).

Now we use the method of undetermined coefficients to find a solution to this differential equation. We would like to try something of the form y = D1 cos(ωt) + D2 sin(ωt), but this will not work because functions of that form are already solutions to the homogeneous equation y'' + ω^2 y = 0.

Instead the method instructs that the appropriate solution will be of the form y = D1 t cos(ωt) + D2 t sin(ωt). We can use a trigonometric formula (the sum-to-product formula) to rewrite this as y = D t cos(ωt + φ), where φ is a phase shift. (We can solve for the coefficients in terms of A, m, ω, but it will not be so useful.)

We can see from this formula that as t grows, so does the amplitude D t: in other words, as time goes on, the object will continue oscillating with frequency ω around its equilibrium point, but the swings back and forth will get larger and larger.

You can observe this phenomenon for yourself if you sit in a rocking chair, or swing an object back and forth  you will quickly nd that the most eective way to rock the chair or swing the object is to push back and forth at the same frequency that the object is already moving at.

We may work out the same computation with an external force F(t) = A cos(ω1 t) oscillating at a frequency ω1 ≠ ω.

In this case (using the same argument as above) we have y'' + ω^2 y = (A/m) cos(ω1 t). The trial solution (again by undetermined coefficients) is y(t) = B cos(ω1 t), where B = (A/m)/(ω^2 − ω1^2). Thus the overall solution is B cos(ω1 t), plus a solution to the homogeneous system.

Now as we can see, if ω and ω1 are far apart (i.e., the driving force is oscillating at a very different frequency from the frequency of the original system) then B will be small, and so the overall change B cos(ω1 t) that the driving force adds will be relatively small.

However, if ω and ω1 are very close to one another (i.e., the driving force is oscillating at a frequency close to that of the original system) then B will be large, and so the driving force will cause the system to oscillate with a much bigger amplitude.

As ω1 approaches ω, the amplitude B will go to ∞, which agrees with the behavior seen in the previous example (where we took ω1 = ω).
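Numerically the contrast is easy to see. In the sketch below (my own illustration with ω = 2 and A/m = 1), the off-resonance response amplitude |B| = (A/m)/|ω^2 − ω1^2| stays bounded but blows up as ω1 approaches ω, while exactly at resonance the particular solution (A/(2mω)) t sin(ωt) grows linearly in t:

```python
# Forced oscillator y'' + w^2*y = (A/m)*cos(w1*t): response amplitudes.
import numpy as np

w, A_over_m = 2.0, 1.0

def off_resonance_amplitude(w1):
    return A_over_m / abs(w**2 - w1**2)        # valid only for w1 != w

for w1 in (0.5, 1.5, 1.9, 1.99):
    print(w1, off_resonance_amplitude(w1))     # grows without bound as w1 -> w

# At w1 == w the particular solution is (A/m)/(2*w) * t * sin(w*t):
t = np.array([10.0, 100.0, 1000.0])
print(A_over_m / (2*w) * t)                    # 2.5, 25, 250: unbounded growth in t
```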

Important Remark: Understanding how resonance arises (and how to minimize it!) is a very, very important application of dierential equations to structural engineering.

A poor understanding of resonance is something which, at several times in the not-too-distant past, has caused bridges to fall down, airplanes to crash, and buildings to fall over. We can see from the two examples that resonance arises when an external force acts on a system at (or very close to) the same frequency that the system is already oscillating at. Of course, resonance is not always bad. The general principle, of applying an external driving force at (one of ) a system's natural resonance frequencies, is the underlying physical idea behind the construction of many types of musical instruments.

Well, you're at the end of my handout. Hope it was helpful. Copyright notice: This material is copyright Evan Dummit, 2012. You may not reproduce or distribute this material without my express permission.

Math 320 (part 5): Eigenvalues and Eigenvectors

(by Evan Dummit, 2012, v. 1.00)

Contents
1 Eigenvalues and Eigenvectors
  1.1 The Basic Setup
  1.2 Some Slightly More Advanced Results About Eigenvalues
  1.3 Theory of Similarity and Diagonalization
  1.4 How To Diagonalize A Matrix (if possible)

1 Eigenvalues and Eigenvectors


We have discussed (perhaps excessively) the correspondence between solving a system of homogeneous linear equations and solving the matrix equation A x = 0, for A an n × n matrix and x an n × 1 column vector.

For reasons that will become more apparent soon, a more general version of this question which is also of interest is to solve the matrix equation A x = λ x, where λ is a scalar. (The original homogeneous system problem corresponds to λ = 0.)

In the language of linear transformations, this says the following: given a linear transformation T : V → V from a vector space V to itself, on what vectors x does T act as multiplication by a constant λ?

1.1 The Basic Setup


Definition: For A an n × n matrix, a nonzero vector x with A x = λ x is called an eigenvector of A, and the corresponding scalar λ is called an eigenvalue of A.

Important note: We do not consider the zero vector 0 an eigenvector.

For a fixed value of λ, the set S whose elements are the eigenvectors x with A x = λ x, together with the zero vector, is a subspace of V. (This set S is called the eigenspace associated to the eigenvalue λ.)
[S1]: S contains the zero vector.
[S2]: S is closed under addition, because if A x1 = λ x1 and A x2 = λ x2, then A (x1 + x2) = λ (x1 + x2).
[S3]: S is closed under scalar multiplication, because for any scalar β, A (β x) = β (A x) = β (λ x) = λ (β x).

It turns out that it is fairly straightforward to find all of the eigenvalues: because λ x = (λ I) x, where I is the n × n identity matrix, we can rewrite the eigenvalue equation A x = λ x = (λ I) x as (λ I − A) x = 0. But we know precisely when there will be a nonzero vector x with (λ I − A) x = 0: it is when the matrix (λ I − A) is not invertible, or, in other words, when det(λ I − A) = 0.

Definition: When we expand the determinant det(tI − A) in the variable t, we will obtain a polynomial of degree n. This polynomial is called the characteristic polynomial p(t) of the matrix A, and its roots are precisely the eigenvalues of A.

Notation 1: Some authors instead define the characteristic polynomial as the determinant of the matrix A − tI rather than tI − A. I define it this way because then the coefficient of t^n will always be 1, rather than (−1)^n.

Notation 2: It is often customary, when referring to the eigenvalues of a matrix, to include an eigenvalue the appropriate number of extra times if it is a multiple root of the characteristic polynomial. Thus, for the characteristic polynomial t^2 (t − 1)^3, we could say the eigenvalues are λ = 0, 0, 1, 1, 1 if we wanted to emphasize that the eigenvalues occur more than once.

Remark: The characteristic polynomial may have non-real numbers as roots. Non-real eigenvalues are absolutely acceptable; the only wrinkle is that the eigenvectors for these eigenvalues will also necessarily contain non-real entries. (If A has real number entries, then any non-real roots of the characteristic polynomial will come in complex conjugate pairs, and the eigenvectors for one root will be the complex conjugates of the eigenvectors for the other root.)

Proposition: The eigenvalues of an upper-triangular matrix are the diagonal entries.

This statement follows from the observation that the determinant of an upper-triangular matrix is the product of the diagonal entries, combined with the observation that if of times.)

A is upper-triangular,

then

tI A

is also upper-triangular. (If diagonal entries are repeated, the eigenvalues are repeated the same number


Example: The eigenvalues of 2, and 3.

1 0 0

i 3 0

3 8

are 1, 3, and

and the eigenvalues of

2 0 0

0 3 0

1 2 2

are 2,

To find all the eigenvalues (and eigenvectors) of a matrix A, follow these steps:

Step 1: Write down the matrix tI − A and compute its determinant (using any method) to obtain the characteristic polynomial p(t).
Step 2: Set p(t) equal to zero and solve. The roots are precisely the eigenvalues λ of A.
Step 3: For each eigenvalue λ, solve for all vectors x satisfying A x = λ x. (Either do this directly, or by solving the homogeneous system (λ I − A) x = 0 via row-reduction.) The resulting solution vectors x form the eigenspace associated to λ, and the nonzero vectors in the space are the eigenvectors corresponding to λ.
. A= 1 0 0 1
.
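The same three steps can be run numerically; here is a minimal numpy sketch (the matrix is the one from the third example below, and the code is only an illustration, not part of the notes):

    import numpy as np

    A = np.array([[2.0, 2.0],
                  [3.0, 1.0]])

    # Steps 1 and 2: the eigenvalues are the roots of det(tI - A)
    eigvals, eigvecs = np.linalg.eig(A)
    print(eigvals)                        # approximately [4., -1.] (in some order)

    # Step 3: each column of eigvecs spans the corresponding eigenspace; check A x = lambda x
    for lam, x in zip(eigvals, eigvecs.T):
        assert np.allclose(A @ x, lam * x)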

Example: Find all eigenvalues, and a basis for each eigenspace, for the matrix A = [[1, 0], [0, 1]].

Step 1: We have tI - A = [[t - 1, 0], [0, t - 1]], so p(t) = det(tI - A) = (t - 1)^2.

Step 2: The characteristic equation (t - 1)^2 = 0 has a double root t = 1, so the eigenvalues are λ = 1, 1. (Alternatively, we could have used the fact that the matrix is upper-triangular.)

Step 3: We want to find the vectors with [[1, 0], [0, 1]] (a, b) = 1 · (a, b). Clearly, all vectors (a, b) have this property. Therefore, a basis for the eigenspace with λ = 1 is given by (1, 0) and (0, 1).

Example: Find all eigenvalues, and a basis for each eigenspace, for the matrix A = [[1, 1], [0, 1]].

Step 1: We have tI - A = [[t - 1, -1], [0, t - 1]], so p(t) = det(tI - A) = (t - 1)^2.

Step 2: The characteristic equation (t - 1)^2 = 0 has a double root t = 1, so the eigenvalues are λ = 1, 1. (Alternatively, we could have used the fact that the matrix is upper-triangular.)

Step 3: We want to find the vectors with [[1, 1], [0, 1]] (a, b) = 1 · (a, b), which means (a + b, b) = (a, b). This requires a to be arbitrary and b = 0. So the vectors we want are those of the form (a, 0), and so a basis for the eigenspace with λ = 1 is given by (1, 0).

Remark: Note that this matrix [[1, 1], [0, 1]] and the identity matrix [[1, 0], [0, 1]] have the same characteristic polynomial and eigenvalues, but do not have the same eigenvectors. In fact, for λ = 1, the eigenspace for [[1, 1], [0, 1]] is 1-dimensional, while the eigenspace for [[1, 0], [0, 1]] is 2-dimensional.

Example: Find all eigenvalues, and a basis for each eigenspace, for the matrix A = [[2, 2], [3, 1]].

Step 1: We have tI - A = [[t - 2, -2], [-3, t - 1]], so p(t) = det(tI - A) = (t - 2)(t - 1) - (-2)(-3) = t^2 - 3t - 4.

Step 2: Since p(t) = t^2 - 3t - 4 = (t - 4)(t + 1), the eigenvalues are λ = -1, 4.

Step 3: For λ = -1 we want [[2, 2], [3, 1]] (a, b) = -1 · (a, b), so we need (2a + 2b, 3a + b) = (-a, -b), which reduces to a = -(2/3) b. So the vectors we want are those of the form (-(2/3) b, b), so a basis is given by (-2, 3).

For λ = 4 we want [[2, 2], [3, 1]] (a, b) = 4 · (a, b), so we need (2a + 2b, 3a + b) = (4a, 4b), which reduces to a = b. So the vectors we want are those of the form (b, b), so a basis is given by (1, 1).

Example: Find all eigenvalues, and a basis for each eigenspace, for the matrix A = [[0, 0, 0], [1, 0, -1], [0, 1, 0]].

Step 1: We have tI - A = [[t, 0, 0], [-1, t, 1], [0, -1, t]], so p(t) = det(tI - A) = t (t^2 + 1).

Step 2: Since p(t) = t (t^2 + 1), the eigenvalues are λ = 0, i, -i.

Step 3: For λ = 0 we want A (a, b, c) = (0, 0, 0), so we need a - c = 0 and b = 0, so a = c and b = 0. So the vectors we want are those of the form (a, 0, a), so a basis is given by (1, 0, 1).

For λ = i we want A (a, b, c) = i (a, b, c), so we need (0, a - c, b) = (ia, ib, ic), so a = 0 and b = ic. So the vectors we want are those of the form (0, ic, c), so a basis is given by (0, i, 1).

For λ = -i we want A (a, b, c) = -i (a, b, c), so we need (0, a - c, b) = (-ia, -ib, -ic), so a = 0 and b = -ic. So the vectors we want are those of the form (0, -ic, c), so a basis is given by (0, -i, 1).
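Non-real eigenvalues cause no difficulty numerically either; the sketch below (numpy, using the matrix from the example just worked, and not part of the original notes) exhibits the conjugate-pair behavior described in the earlier remark:

    import numpy as np

    A = np.array([[0.0, 0.0, 0.0],
                  [1.0, 0.0, -1.0],
                  [0.0, 1.0, 0.0]])

    vals, vecs = np.linalg.eig(A)
    print(np.round(vals, 6))              # approximately [0, i, -i] (order may vary)

    # each column of vecs is an eigenvector; the non-real ones occur in conjugate pairs
    for lam, x in zip(vals, vecs.T):
        assert np.allclose(A @ x, lam * x)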

Example: Find all eigenvalues, and a basis for each eigenspace, for the matrix A = [[1, 0, 1], [-1, 1, 3], [-1, 0, 3]].

Step 1: We have tI - A = [[t - 1, 0, -1], [1, t - 1, -3], [1, 0, t - 3]]; expanding along the top row gives p(t) = (t - 1) · [(t - 1)(t - 3)] + (-1) · [-(t - 1)] = (t - 1)^2 (t - 3) + (t - 1).

Step 2: Since p(t) = (t - 1) [(t - 1)(t - 3) + 1] = (t - 1)(t - 2)^2, the eigenvalues are λ = 1, 2, 2.

Step 3: For λ = 1 we want A (a, b, c) = (a, b, c), so we need (a + c, -a + b + 3c, -a + 3c) = (a, b, c), so c = 0 and a = 0. So the vectors we want are those of the form (0, b, 0), so a basis is given by (0, 1, 0).

For λ = 2 we want A (a, b, c) = 2 (a, b, c), so we need (a + c, -a + b + 3c, -a + 3c) = (2a, 2b, 2c), so a = c and b = 2c. So the vectors we want are those of the form (c, 2c, c), so a basis is given by (1, 2, 1).

1.2 Some Slightly More Advanced Results About Eigenvalues

Theorem: If λ is an eigenvalue of the matrix A which appears exactly k times as a root of the characteristic polynomial, then the dimension of the eigenspace corresponding to λ is at least 1 and at most k.

Remark: The number of times that λ appears as a root of the characteristic polynomial is called the algebraic multiplicity of λ, and the dimension of the eigenspace corresponding to λ is called the geometric multiplicity. So what the theorem says is that the geometric multiplicity is at most the algebraic multiplicity.

Example: If the characteristic polynomial is (t - 1)^3 (t - 3)^2, then the eigenspace for λ = 1 is at most 3-dimensional, and the eigenspace for λ = 3 is at most 2-dimensional.

Proof: The statement that the eigenspace has dimension at least 1 is immediate, because (by assumption) λ is a root of the characteristic polynomial and therefore has at least one nonzero eigenvector associated to it.

For the statement that the dimension is at most k, the idea is to look at the homogeneous system (λI - A) x = 0. If λ appears k times as a root of the characteristic polynomial, then when we put the matrix λI - A into its reduced row-echelon form B, the matrix B must have at most k rows of all zeroes. Otherwise, the matrix B (and hence λI - A too, although this requires a check) would have 0 as an eigenvalue more than k times, because B is in echelon form and therefore upper-triangular. But the number of rows of all zeroes in a square matrix is the same as the number of nonpivotal columns, which is the number of free variables, which is the dimension of the solution space.

So, putting all the statements together, we see that the dimension of the eigenspace is at most k.

Theorem: If v1, v2, ..., vn are eigenvectors of A associated to distinct eigenvalues λ1, λ2, ..., λn, then v1, v2, ..., vn are linearly independent.

Proof: Suppose we had a nontrivial dependence relation between v1, ..., vn, say a1 v1 + ... + an vn = 0. (Note that at least two coefficients have to be nonzero, because none of v1, ..., vn is the zero vector.)

Multiply both sides by the matrix A: this gives A (a1 v1 + ... + an vn) = A 0 = 0. Now since v1, ..., vn are eigenvectors this says a1 (λ1 v1) + ... + an (λn vn) = 0.

But now if we scale the original equation by λ1 and subtract (to eliminate v1), we obtain a2 (λ2 - λ1) v2 + a3 (λ3 - λ1) v3 + ... + an (λn - λ1) vn = 0. Since by assumption all of the eigenvalues λ1, λ2, ..., λn were different, this dependence is still nontrivial: each λj - λ1 is nonzero, and at least one of a2, ..., an is nonzero.

But now we can repeat the process to eliminate each of v2, v3, ..., v(n-1) in turn. Eventually we are left with the equation b vn = 0 for some nonzero b. But this is impossible, because it would say that vn = 0, contradicting our definition saying that the zero vector is not an eigenvector. So there cannot be a nontrivial dependence relation, meaning that v1, ..., vn are linearly independent.

Corollary: If A is an n × n matrix with n distinct eigenvalues λ1, λ2, ..., λn, and v1, v2, ..., vn are (any) eigenvectors associated to those eigenvalues, then v1, v2, ..., vn are a basis for R^n.

This result follows from the previous theorem: it guarantees that v1, v2, ..., vn are linearly independent, so since they are n vectors in the n-dimensional vector space R^n, they are a basis.

Theorem: The product of the eigenvalues of A is the determinant of A.

Proof: If we expand out the product p(t) = (t - λ1)(t - λ2) ··· (t - λn), we see that the constant term is equal to (-1)^n λ1 λ2 ··· λn. But the constant term is also just p(0), and since p(t) = det(tI - A) we have p(0) = det(-A) = (-1)^n det(A). Thus, setting the two expressions equal shows that the product of the eigenvalues equals the determinant of A.

Theorem: The sum of the eigenvalues of A equals the trace of A.

Note: The trace of a matrix is defined to be the sum of its diagonal entries.

Proof: If we expand out the product p(t) = (t - λ1)(t - λ2) ··· (t - λn), we see that the coefficient of t^(n-1) is equal to -(λ1 + ... + λn). If we expand out the determinant det(tI - A) to find the coefficient of t^(n-1), we can show (with a little bit of effort) that the coefficient is the negative of the sum of the diagonal entries of A. Therefore, setting the two expressions equal shows that the sum of the eigenvalues equals the trace of A.
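Both facts are easy to spot-check numerically; here is a short numpy sketch (the random matrix is purely illustrative and not from the notes):

    import numpy as np

    rng = np.random.default_rng(0)
    A = rng.standard_normal((4, 4))
    vals = np.linalg.eigvals(A)

    assert np.isclose(np.prod(vals), np.linalg.det(A))   # product of eigenvalues = determinant
    assert np.isclose(np.sum(vals), np.trace(A))         # sum of eigenvalues = trace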

1.3 Theory of Similarity and Diagonalization

Definition: We say two n × n matrices A and B are similar (or conjugate) if there exists an invertible n × n matrix P such that B = P^(-1) A P. (We refer to P^(-1) A P as the conjugation of A by P.)
and

an invertible

nn

Example: The matrices

B = 2 1 3 2

1 1

2 1 3 2

are similar: with

P = =

P 1 =

2 1 P 1 AP = B .

3 2

, we can verify that

1 1

2 1

3 2

2 3 1 2 1 2 1 1 A

, so that , so that

Remark: The matrix

Q=

0 1

1 2

also has

B = Q1 AQ. Q
with

In general, if two matrices A and B are similar, then there can be many different matrices Q with Q^(-1) A Q = B.

Similar matrices have quite a few useful algebraic properties (which justify the name "similar"). If B = P^(-1) A P and D = P^(-1) C P, then we have the following:

The sum of the conjugates is the conjugate of the sum: B + D = P^(-1) A P + P^(-1) C P = P^(-1) (A + C) P.

The product of the conjugates is the conjugate of the product: B D = (P^(-1) A P)(P^(-1) C P) = P^(-1) (A C) P.

The inverse of the conjugate is the conjugate of the inverse: B^(-1) exists if and only if A^(-1) exists, and B^(-1) = P^(-1) A^(-1) P.

The determinant of the conjugate is equal to the original determinant: det(B) = det(P^(-1) A P) = det(P^(-1)) det(A) det(P) = det(A) det(P^(-1) P) = det(A).

The conjugate has the same characteristic polynomial as the original matrix: det(tI - B) = det(P^(-1) (tI) P - P^(-1) A P) = det(P^(-1) (tI - A) P) = det(tI - A). In particular, a matrix and its conjugate have the same eigenvalues (with the same multiplicities). Also, by using the fact that the trace is equal both to the sum of the diagonal elements and to a coefficient in the characteristic polynomial, we see that a matrix and its conjugate have the same trace.

If x is an eigenvector of A with eigenvalue λ, then P^(-1) x is an eigenvector of B with eigenvalue λ: if A x = λx then B (P^(-1) x) = P^(-1) A (P P^(-1)) x = P^(-1) A x = P^(-1) (λx) = λ (P^(-1) x). This is also true in reverse: if y is an eigenvector of B then P y is an eigenvector of A (with the same eigenvalue). In particular, the eigenspaces for B have the same dimensions as the eigenspaces for A.
A. B
that

One question we might have about similarity is: given a matrix similar to?

A,

what is the simplest matrix

is

As observed above, any matrix similar to

has the same eigenvalues for

A.

So, if the eigenvalues are

1 , , n ,
the simplest form we could plausibly hope for would be a diagonal matrix

1
.. .

whose diagonal elements are the eigenvalues of

A.
if it is similar to a diagonal matrix

Denition: We say that a matrix exists an invertible matrix

with

A is diagonalizable D = P 1 AP . 2 3 6 7

D;

that is, if there

Example: The matrix A = [[-2, 3], [-6, 7]] is diagonalizable. We can check that for P = [[1, 1], [2, 1]] and P^(-1) = [[-1, 1], [2, -1]], we have P^(-1) A P = [[4, 0], [0, 1]] = D.
If we know that A is diagonalizable and have D = P^(-1) A P, then it is very easy to compute any power of A:

Since D is diagonal, D^k is the diagonal matrix whose diagonal entries are the k-th powers of the diagonal entries of D. Then D^k = (P^(-1) A P)^k = P^(-1) (A^k) P, so A^k = P D^k P^(-1).

Example: With A = [[-2, 3], [-6, 7]] as above, we have D^k = [[4^k, 0], [0, 1]], so that A^k = P D^k P^(-1) = [[1, 1], [2, 1]] [[4^k, 0], [0, 1]] [[-1, 1], [2, -1]] = [[2 - 4^k, -1 + 4^k], [2 - 2·4^k, -1 + 2·4^k]].

Observation: This formula also makes sense for values of k which are not positive integers. For example, if k = -1 we get the matrix (1/4) [[7, -3], [6, -2]], which is actually the inverse matrix A^(-1). And if we set k = 1/2 we get the matrix B = [[0, 1], [-2, 3]], whose square satisfies B^2 = [[-2, 3], [-6, 7]] = A.
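Here is a brief numerical sketch of this observation with numpy, using the matrix A from the example above (the helper name A_power is mine, not the notes'):

    import numpy as np

    A = np.array([[-2.0, 3.0],
                  [-6.0, 7.0]])

    vals, P = np.linalg.eig(A)            # eigenvalues 1 and 4; eigenvectors are the columns of P
    Pinv = np.linalg.inv(P)

    def A_power(k):
        # A^k = P D^k P^(-1); also sensible for non-integer k since the eigenvalues are positive
        return P @ np.diag(vals**k) @ Pinv

    assert np.allclose(A_power(3), np.linalg.matrix_power(A, 3))
    assert np.allclose(A_power(-1), np.linalg.inv(A))
    B = A_power(0.5)
    assert np.allclose(B @ B, A)          # a square root of A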

Theorem: An n × n matrix A is diagonalizable if and only if it has n linearly independent eigenvectors. In particular, every matrix whose eigenvalues are distinct is diagonalizable.

Proof: If A has n linearly independent eigenvectors v1, ..., vn with respective eigenvalues λ1, ..., λn, then consider the matrix P = [v1 | ... | vn] whose columns are the eigenvectors of A.

Because v1, ..., vn are eigenvectors, we have A P = [A v1 | ... | A vn] = [λ1 v1 | ... | λn vn]. But we also have [λ1 v1 | ... | λn vn] = [v1 | ... | vn] D = P D, where D is the diagonal matrix with entries λ1, ..., λn. Therefore A P = P D. Now since the eigenvectors are linearly independent, P is invertible, and we can write D = P^(-1) A P, as desired.

For the other direction, if D = P^(-1) A P, then (like above) we can rewrite this to say A P = P D. If P = [v1 | ... | vn], then A P = P D says [A v1 | ... | A vn] = [λ1 v1 | ... | λn vn], which (by comparing columns) says that A v1 = λ1 v1, ..., A vn = λn vn. Thus the columns v1, ..., vn of P are eigenvectors, and (because P is invertible) they are linearly independent.

Finally, the last statement follows because (as shown earlier) a matrix with n distinct eigenvalues has n linearly independent eigenvectors.

Advanced Remark: As the theorem demonstrates, if we are trying to diagonalize a matrix, we can run into trouble if the matrix has repeated eigenvalues. However, we might still like to know the simplest form a non-diagonalizable matrix is similar to.

The answer is given by what is called the Jordan Canonical Form (of a matrix): every matrix is similar to a block-diagonal matrix whose diagonal blocks J1, ..., Jn are square "Jordan block" matrices, each of the form J with some λ on the diagonal and 1s directly above the diagonal (where blank entries are zeroes).

Example: The non-diagonalizable matrix [[2, 1, 0], [0, 2, 0], [0, 0, 3]] is in Jordan Canonical Form, with J1 = [[2, 1], [0, 2]] and J2 = [3].

The existence and uniqueness of the Jordan Canonical Form can be proven using a careful analysis of generalized eigenvectors: vectors x satisfying (λI - A)^k x = 0 for some positive integer k. (Regular eigenvectors would correspond to k = 1.)

Roughly speaking, the idea is to use certain carefully-chosen generalized eigenvectors to fill in for the missing eigenvectors; doing this causes the appearance of the extra 1s above the diagonal in the Jordan blocks.

Theorem (Cayley-Hamilton): If p(x) is the characteristic polynomial of a matrix A, then p(A) is the zero matrix (where, in applying a polynomial to a matrix, we replace the constant term with that constant times the identity matrix).

Example: For the matrix A = [[2, 2], [3, 1]], we have det(tI - A) = (t - 1)(t - 2) - 6 = t^2 - 3t - 4. We can compute A^2 = [[10, 6], [9, 7]], and then indeed we have A^2 - 3A - 4I = [[10, 6], [9, 7]] - [[6, 6], [9, 3]] - [[4, 0], [0, 4]] = [[0, 0], [0, 0]].
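A one-off numerical check of this example (a numpy sketch, not part of the notes; it simply evaluates p(A)):

    import numpy as np

    A = np.array([[2.0, 2.0],
                  [3.0, 1.0]])
    I = np.eye(2)

    # p(t) = t^2 - 3t - 4, so p(A) = A^2 - 3A - 4I should be the zero matrix
    assert np.allclose(A @ A - 3*A - 4*I, np.zeros((2, 2)))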

Proof (if A is diagonalizable): If A is diagonalizable, then let D = P^(-1) A P with D diagonal, and let p(x) be the characteristic polynomial of A.

The diagonal entries of D are the eigenvalues λ1, ..., λn of A, hence are roots of the characteristic polynomial of A. So p(λ1) = ... = p(λn) = 0.

Then, because raising D to a power just raises all of its diagonal entries to that power, we can see that p(D) is the diagonal matrix with diagonal entries p(λ1), ..., p(λn), which is the zero matrix.

Now by conjugating each term and adding the results, we see that 0 = p(D) = p(P^(-1) A P) = P^(-1) [p(A)] P. So by conjugating back, we see that p(A) = P · 0 · P^(-1) = 0.

In the case where A is not diagonalizable, the proof is more difficult. One way is to use the Jordan Canonical Form J of A in place of the diagonal matrix D; then (one can verify) p(J) = 0, and then the remainder of the argument is the same.

1.4 How To Diagonalize A Matrix (if possible)

In order to determine whether a matrix A is diagonalizable (and if it is, how to find a diagonalization D = P^(-1) A P), follow these steps:

Step 1: Find the characteristic polynomial and eigenvalues of A.

Step 2: Find a basis for each eigenspace of A.

Step 3a: Determine whether A is diagonalizable: if each eigenspace has the proper dimension (namely, the number of times the corresponding eigenvalue appears as a root of the characteristic polynomial), then the matrix is diagonalizable. Otherwise, the matrix is not diagonalizable.

Step 3b: If the matrix is diagonalizable, then D is the diagonal matrix whose diagonal entries are the eigenvalues of A (with appropriate multiplicities), and P can be taken to be the matrix whose columns are linearly independent eigenvectors of A, in the same order as the eigenvalues appear in D.
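In a computer algebra system these steps are bundled into one call; here is a sketch with sympy, applied to the matrix of the first example below (diagonalize raises an error when the matrix is not diagonalizable, and the snippet itself is not part of the notes):

    import sympy as sp

    A = sp.Matrix([[0, -2], [3, 5]])      # matrix of the first example below
    P, D = A.diagonalize()                # Steps 1-3 in one call
    assert P.inv() * A * P == D
    print(D)                              # diagonal matrix of the eigenvalues 2 and 3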

Example: For A = [[0, -2], [3, 5]], determine whether there exists a diagonal matrix D and an invertible matrix P with D = P^(-1) A P, and if so, find them.

Step 1: We have tI - A = [[t, 2], [-3, t - 5]], so det(tI - A) = t(t - 5) + 6 = t^2 - 5t + 6 = (t - 2)(t - 3). The eigenvalues are therefore λ = 2, 3.

Step 2: For λ = 2 we need to solve [[0, -2], [3, 5]] (a, b) = 2 (a, b), so (-2b, 3a + 5b) = (2a, 2b) and thus a = -b. The eigenvectors are of the form (-b, b), so a basis for the λ = 2 eigenspace is (-1, 1).

For λ = 3 we need to solve [[0, -2], [3, 5]] (a, b) = 3 (a, b), so (-2b, 3a + 5b) = (3a, 3b) and thus a = -(2/3) b. The eigenvectors are of the form (-(2/3) b, b), so a basis for the λ = 3 eigenspace is (-2, 3).

Step 3: Since the eigenvalues are distinct we know that A is diagonalizable, and D = [[2, 0], [0, 3]]. We have two linearly independent eigenvectors, and so we can take P = [[-1, -2], [1, 3]].

To check: we have P^(-1) = [[-3, -2], [1, 1]], and indeed P^(-1) A P = [[2, 0], [0, 3]] = D.

Note: We could also take D = [[3, 0], [0, 2]] if we wanted. There is no particular reason to care much about which diagonal matrix we want, as long as we make sure to arrange the eigenvectors in the correct order.

Example: For A = [[1, 1, 0], [0, 2, 0], [0, 2, 1]], determine whether there exists a diagonal matrix D and an invertible matrix P with D = P^(-1) A P, and if so, find them.

Step 1: We have tI - A = [[t - 1, -1, 0], [0, t - 2, 0], [0, -2, t - 1]], so det(tI - A) = (t - 1)^2 (t - 2). The eigenvalues are therefore λ = 1, 1, 2.

Step 2: For λ = 1 we need to solve A (a, b, c) = (a, b, c), so (a + b, 2b, 2b + c) = (a, b, c) and thus b = 0. The eigenvectors are of the form (a, 0, c), so a basis for the λ = 1 eigenspace is (1, 0, 0), (0, 0, 1).

For λ = 2 we need to solve A (a, b, c) = 2 (a, b, c), so (a + b, 2b, 2b + c) = (2a, 2b, 2c) and thus a = b and c = 2b. The eigenvectors are of the form (b, b, 2b), so a basis for the λ = 2 eigenspace is (1, 1, 2).

Step 3: Since the eigenspace for λ = 1 is 2-dimensional, the matrix A is diagonalizable, and D = [[1, 0, 0], [0, 1, 0], [0, 0, 2]]. We have three linearly independent eigenvectors, so we can take P = [[1, 0, 1], [0, 0, 1], [0, 1, 2]].

To check: we have P^(-1) = [[1, -1, 0], [0, -2, 1], [0, 1, 0]], and indeed P^(-1) A P = [[1, 0, 0], [0, 1, 0], [0, 0, 2]] = D.

Example: For A = [[1, 1, 1], [0, 1, 1], [0, 0, 1]], determine whether there exists a diagonal matrix D and an invertible matrix P with D = P^(-1) A P, and if so, find them.

Step 1: We have tI - A = [[t - 1, -1, -1], [0, t - 1, -1], [0, 0, t - 1]], so det(tI - A) = (t - 1)^3 since tI - A is upper-triangular. The eigenvalues are therefore λ = 1, 1, 1.

Step 2: For λ = 1 we need to solve A (a, b, c) = (a, b, c), so (a + b + c, b + c, c) = (a, b, c) and thus b = c = 0. The eigenvectors are of the form (a, 0, 0), so a basis for the λ = 1 eigenspace is (1, 0, 0).

Step 3: Since the eigenspace for λ = 1 is 1-dimensional but the eigenvalue appears 3 times as a root of the characteristic polynomial, the matrix A is not diagonalizable.

Well, you're at the end of my handout. Hope it was helpful. Copyright notice: This material is copyright Evan Dummit, 2012. You may not reproduce or distribute this material without my express permission.

Math 320 (part 6): Supplement

(by Evan Dummit, 2012, v. 1.00)

0.1 Repeated-Eigenvalue Systems

We would like to be able to solve systems y' = A y where A is an n × n matrix which does not have n linearly-independent eigenvectors (equivalently: for which A is not diagonalizable).

Recall that if v is an eigenvector with eigenvalue λ, then y = v e^{λt} is a solution to the differential equation. By the existence-uniqueness theorem we know that the system y' = A y has an n-dimensional solution space. So if A has n linearly-independent eigenvectors, then we can write down the general solution to the system directly.

If A has fewer than n linearly-independent eigenvectors, we are still missing some of the solutions, and need to construct the missing ones. We do this via generalized eigenvectors: vectors which are not eigenvectors, but are close enough that we can use them to write down more solutions to the system y' = A y.

If λ is a root of the characteristic equation k times, we say that λ has multiplicity k. If the eigenspace for λ has dimension less than k, we say that λ is defective.

Here is the procedure to follow, to solve a system y' = A y which may be defective:

Step 1: Find the eigenvalues of the n × n matrix A.

Step 2: For each eigenvalue λ (appearing as a root of the characteristic polynomial with multiplicity m), find a basis v1, ..., vk for the λ-eigenspace.

Step 2a: If the eigenspace is not defective (in other words, if k = m), then the solutions to the system y' = A y coming from this eigenspace are e^{λt} v1, ..., e^{λt} vk.

Step 2b: If the eigenspace is defective (in other words, if k < m), then above each eigenvector w in the basis v1, ..., vk, look for a chain of generalized eigenvectors w2, w3, ..., wl satisfying
(A - λI) w2 = w
(A - λI) w3 = w2
...
(A - λI) wl = w(l-1).
Then the solutions to the system coming from this chain are
e^{λt} [w], e^{λt} [t w + w2], e^{λt} [(t^2/2) w + t w2 + w3], ..., e^{λt} [(t^(l-1)/(l-1)!) w + (t^(l-2)/(l-2)!) w2 + ... + t w(l-1) + wl].

Note: If the λ-eigenspace is 1-dimensional, then there is only one chain, and it will always be of length m, the multiplicity of λ. If the λ-eigenspace has dimension greater than 1, then the chains may have different lengths, and it may be necessary to toss out some elements of some chains, as they may lead to linearly dependent solutions.

Step 3: If y1, ..., yn are the n solution functions obtained in step 2, then the general solution to the system y' = A y is y = c1 y1 + c2 y2 + ... + cn yn.

If there are complex-conjugate eigenvalues, then we generally want to write the solutions as real-valued functions. To obtain real-valued solutions to the system from a pair of complex-conjugate solutions y and its conjugate, replace them with Re(y) and Im(y), the real and imaginary parts of y.
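The chain vectors in Step 2b come from solving singular but consistent linear systems. Here is a sympy sketch using the matrix of the first example below (gauss_jordan_solve returns a parametric solution; setting the free parameters to zero picks one particular chain vector; the snippet is an illustration, not part of the notes):

    import sympy as sp

    A = sp.Matrix([[5, -9], [4, -7]])     # matrix of the first example below
    lam = -1                              # its double eigenvalue
    M = A - lam*sp.eye(2)

    w = M.nullspace()[0]                  # an ordinary eigenvector (a multiple of (3, 2))
    sol, params = M.gauss_jordan_solve(w) # solve (A - lam I) w2 = w
    w2 = sol.subs({p: 0 for p in params})

    assert M * w2 == w                    # w, w2 form a chain of length 2

The two solutions coming from this chain are then e^{λt} w and e^{λt} (t w + w2), matching the example worked below up to the scaling of w.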

Example: Find the general solution to the system
y1' = 5 y1 - 9 y2
y2' = 4 y1 - 7 y2.

Step 1: In matrix form this is y' = A y, where A = [[5, -9], [4, -7]]. We have A - tI = [[5 - t, -9], [4, -7 - t]], so det(A - tI) = (5 - t)(-7 - t) - (4)(-9) = 1 + 2t + t^2 = (t + 1)^2. Thus there is a double eigenvalue λ = -1.

To compute the eigenvectors for λ = -1, we want [[6, -9], [4, -6]] (a, b) = (0, 0), so that 2a - 3b = 0. So the eigenvectors are of the form (a, (2/3) a), and the eigenspace is 1-dimensional with a basis given by (3, 2).

Step 2: There is only one eigenvector (for the double eigenvalue λ = -1), so we need to compute a chain of generalized eigenvectors to find the remaining solution to the system.

We start with w = (3, 2), and also have A - λI = [[6, -9], [4, -6]]. We want to find w2 = (a, b) with [[6, -9], [4, -6]] (a, b) = (3, 2). Dividing the first row by 3 and the second row by 2 yields 2a - 3b = 1 (twice), so (for example) we can take a = 2 and b = 1. This gives our choice of w2 = (2, 1).

Now we have a chain of the proper length (namely 2), so we can write down the two solutions for this eigenspace: they are e^{-t} (3, 2) and t e^{-t} (3, 2) + e^{-t} (2, 1).

Step 3: We thus obtain the general solution (y1, y2) = c1 e^{-t} (3, 2) + c2 [t e^{-t} (3, 2) + e^{-t} (2, 1)].

Slightly more explicitly, this is
y1 = (3 c1 + 2 c2 + 3 c2 t) e^{-t}
y2 = (2 c1 + c2 + 2 c2 t) e^{-t}.
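Since every solution of y' = A y can also be written as y(t) = e^{At} y(0), the answer above can be spot-checked against the matrix exponential; here is a scipy sketch with arbitrarily chosen constants (not part of the original solution):

    import numpy as np
    from scipy.linalg import expm

    A = np.array([[5.0, -9.0],
                  [4.0, -7.0]])
    v, w2 = np.array([3.0, 2.0]), np.array([2.0, 1.0])

    def y_exact(t, c1, c2):
        # general solution found above: c1 e^{-t} v + c2 e^{-t} (t v + w2)
        return np.exp(-t) * (c1*v + c2*(t*v + w2))

    c1, c2, t = 0.7, -1.3, 2.5
    assert np.allclose(expm(A*t) @ y_exact(0.0, c1, c2), y_exact(t, c1, c2))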

Example: Find the general solution to the system

Step 1: In matrix form this is

y = A y,

where

= 4y1 y2 2y3 = 2y1 + y2 2y3 . = 5y1 3y3 4 1 2 A = 2 1 2 . 5 0 3

4 t 1 2 1 2 4 t 1 1t 2 so det(AtI) = 5 We have AtI = 2 +(3t) = 1 t 2 2 1t 5 0 3 t 2 t + 2t2 t3 = (2 t)(1 + t2 ). Thus the eigenvalues are = 2, i, i. 2 1 2 a 0 = 2: For = 2, we want 2 1 2 b = 0 , so that 2ab2c = 0 and 5a5c = 0. 5 0 5 c 0 a Hence c = a and b = 2a 2c = 0, so the eigenvectors are of the form 0 . So the eigenspace is a 1 1-dimensional, and has a basis given by 0 . 1 4 i 1 2 a 0 1i 2 b = 0 . Subtracting the rst row = i: For = i, we want 2 5 0 3 i c 0


from the second row, and then dividing the second row by

2i

yields

4i 1 5

1 1 0

2 0 3 i
or

a 0 b = 0 . c 0 c =

Hence

a + b = 0

so

b = a;

then the third row gives

5a + (3 i)c = 0,

5(3 i) 5 a = a. 3+i 10
So the eigenvectors are of the form

1-dimensional, and has a basis given by

2 2 . 3i

a a . 3i a 2

So the eigenspace is

= i: For = i 2 2 . by 3+i
tors.

we can just take the conjugate of the eigenvectors for

= i,

so a basis is given

Step 2: The eigenspaces are all the proper sizes, so we do not need to compute any generalized eigenvec-

2 2 y1 1 Step 3: The general solution is y2 = c1 0 e2t + c2 2 eit + c3 2 eit . 3+i 3i y3 1 2 sin(t) 2 cos(t) y1 1 + c3 2 sin(t) 2 cos(t) With real-valued functions: y2 = c1 0 e2t + c2 cos(t) + 3 sin(t) 3 cos(t) + sin(t) 1 y3
Slightly more explicitly, this is

y1 y2 y3

= c1 e2t + 2c2 cos(t) + 2c3 sin(t) = 2c2 cos(t) + 2c3 sin(t) = c1 e2t + c2 (3 cos(t) + sin(t)) + c3 ( cos(t) + 3 sin(t)) y1 y2 y3 = = = 4y1 y3 2y1 + 2y2 y3 3y1 + y2 0 2 1 1 1 . 0 +(1) 2 3

Example: Find the general solution to the system

Step 1: In matrix form this is

y = A y,

where

4 A= 2 3

4t 0 1 2 t 1 2 t 1 so det(AtI) = (4t) We have AtI = 2 1 t 3 1 t 8 12t + 6t2 t3 = (2 t)3 . Thus there is a triple eigenvalue = 2. 2 0 1 a To compute the eigenvectors for = 2, we want 2 0 1 b = 3 1 2 c
and

2t 1

0 0 , 0

so that

2a c = 0 a a . 2a

3a + b 2c = 0.

Hence

c = 2a

and

b = 2c 3a = a,

so the eigenvectors are of the form


So the eigenspace is 1-dimensional, and has a basis given by

1 1 . 2
so we need to compute a chain of

Step 2: There is only one eigenvector (the triple eigenvector

= 2),

generalized eigenvectors to nd the remaining 2 solutions to the system.

2 0 1 0 1 . We start with and also have A I = 2 3 1 2 a 2 0 1 a First we want to nd w2 = b with 2 0 1 b = c 3 1 2 c 2 The corresponding system of equations in matrix form is 2 3
row-reduce:

1 w = 1 , 2

1 1 . 2 0 0 1 1 1 2 1 1 , 2 0 1 0 0 1 0
which we now

2 2 3

0 0 1

1 1 2

1 2 R2 R 1 1 0 2 3

0 0 1

1 0 2

1 2 R 2R1 0 3 0 2 1

1 0 0

1 and hence a + b = 0 and 2a c = 1, so one possibility for w2 is w2 = 1 . 1 d 2 0 1 d 1 Now we want to nd w3 = e with 2 0 1 e = 1 . f 3 1 2 f 1 2 0 1 1 The corresponding system of equations in matrix form is 2 0 1 1 ; 3 1 2 1
same row-reduction as above to obtain

we can use the

2 2 3

0 0 1

1 1 2

2 0 1 R2 R 1 1 0 0 3 1 1

1 0 2

2 1 R 2R1 0 3 0 1 1

0 1 0 0 1 0

1 0 1

and hence

a + b = 1

and

2a c = 1,

so one possibility for

w3

is

1 w3 = 0 . 1

Now we have the chain of the proper length (namely 3), so we can write down the three solutions for this eigenspace:

1 1 1 1 1 1 t2 2t 2t 2t 2t e + 1 te2t + 0 e2t . they are 1 e , 1 te + 1 e , and 1 2 1 2 1 1 2 2

Step 3: We thus obtain the general solution as the (rather unwieldy and complicated) expression

y1 1 1 1 1 1 1 t2 2t 2t 2t y2 = c1 1 e2t + c2 1 te2t + 1 e2t + c3 1 e + 1 te + 0 e 2 y3 2 1 1 2 1 2 y1 y2 y3 = ( c1 + c2 + c3 + c2 t + c3 t + 1 c3 t2 )e2t 2 1 = ( c1 + c2 + c2 t + c3 t + 2 c3 t2 )e2t = (2c1 + c2 + c3 + 2c2 t + c3 t + c3 t2 )e2t

Slightly more explicitly, this is

Well, you're at the end of my handout. Hope it was helpful. Copyright notice: This material is copyright Evan Dummit, 2012. You may not reproduce or distribute this material without my express permission.

Math 320 (part 6): Systems of First-Order Differential Equations

(by Evan Dummit, 2012, v. 1.00)

Contents
1 Systems of First-Order Linear Differential Equations
1.1 General Theory of (First-Order) Linear Systems
1.2 Eigenvalue Method (for diagonalizable coefficient matrices)

1 Systems of First-Order Linear Differential Equations

In many (perhaps most) applications of differential equations, we have not one but several quantities which change over time and interact with one another: populations in an ecosystem, economic quantities, concentrations of molecules in a reaction, etc. Naturally, we would like to develop methods to solve such systems.

1.1 General Theory of (First-Order) Linear Systems

Before we start our discussion of systems of linear differential equations, we first observe that we can reduce any system of linear differential equations to a system of first-order linear differential equations (in more variables): if we define new variables equal to the higher-order derivatives of our old variables, then we can rewrite the old system as a system of first-order equations (in more variables).

Example: Consider the single 3rd-order equation y''' + y' = 0. If we define new variables z = y' and w = y'', then the original equation tells us that w' = y''' = -y' = -z. Thus, this single 3rd-order equation is equivalent to the first-order system y' = z, z' = w, w' = -z.

Example: Consider the system y1'' + y1' + y1 - y2 = 0 and y2'' + y1' + y2' sin(x) = e^x. If we define new variables z1 = y1' and z2 = y2', then z1' = y1'' = -y1' - y1 + y2 = -z1 - y1 + y2 and z2' = y2'' = e^x - y1' - y2' sin(x) = e^x - z1 - z2 sin(x). So this system is equivalent to the first-order system y1' = z1, y2' = z2, z1' = -z1 - y1 + y2, z2' = e^x - z1 - z2 sin(x).

Thus, whatever we can show about solutions of systems of first-order linear equations will carry over to arbitrary systems of linear differential equations. So we will talk only about systems of first-order linear differential equations from now on.

A system of first-order linear differential equations (with unknown functions y1, ..., yn) has the general form
y1' = a_{1,1}(x) y1 + a_{1,2}(x) y2 + ... + a_{1,n}(x) yn + p1(x)
y2' = a_{2,1}(x) y1 + a_{2,2}(x) y2 + ... + a_{2,n}(x) yn + p2(x)
...
yn' = a_{n,1}(x) y1 + a_{n,2}(x) y2 + ... + a_{n,n}(x) yn + pn(x)
for some functions a_{i,j}(x) and pj(x), where 1 ≤ i, j ≤ n.

Most of the time we will be dealing with systems with constant coefficients, in which all of the a_{i,j}(x) are constant functions. We say a first-order system is homogeneous if each of p1(x), p2(x), ..., pn(x) is zero.

An initial condition for this system consists of n pieces of information: y1(x0) = b1, y2(x0) = b2, ..., yn(x0) = bn, where x0 is the starting value for x and the bi are constants.

Many of the theorems about general systems of first-order linear equations are very similar to the theorems about nth-order linear equations.

Theorem (Homogeneous Systems): If the coefficient functions a_{i,j}(x) are continuous, then the set of solutions (y1, y2, ..., yn) to the homogeneous system
y1' = a_{1,1}(x) y1 + a_{1,2}(x) y2 + ... + a_{1,n}(x) yn
y2' = a_{2,1}(x) y1 + a_{2,2}(x) y2 + ... + a_{2,n}(x) yn
...
yn' = a_{n,1}(x) y1 + a_{n,2}(x) y2 + ... + a_{n,n}(x) yn
is an n-dimensional vector space.

The fact that the set of solutions forms a vector space is not so hard to show using the subspace criteria. The real result of this theorem, which follows from the existence-uniqueness theorem below, is that the set of solutions is n-dimensional.
Theorem (Existence-Uniqueness): For a system of first-order linear differential equations, if the coefficient functions a_{i,j}(x) and nonhomogeneous terms pj(x) are each continuous in an interval around x = x0, then the system
y1' = a_{1,1}(x) y1 + ... + a_{1,n}(x) yn + p1(x)
y2' = a_{2,1}(x) y1 + ... + a_{2,n}(x) yn + p2(x)
...
yn' = a_{n,1}(x) y1 + ... + a_{n,n}(x) yn + pn(x)
with initial conditions y1(x0) = b1, ..., yn(x0) = bn has a unique solution (y1, y2, ..., yn) in some (possibly smaller) interval around x = x0.

Example: The system y' = e^x y + sin(x) z, z' = 3x^2 y has a unique solution for every initial condition y(a) = b1, z(a) = b2.
Definition: Given n vectors s1 = (y_{1,1}, y_{1,2}, ..., y_{1,n}), s2 = (y_{2,1}, y_{2,2}, ..., y_{2,n}), ..., sn = (y_{n,1}, y_{n,2}, ..., y_{n,n}) with functions as entries, their Wronskian is defined as the determinant W of the n × n matrix whose entry in row i and column j is y_{i,j}.

The n vectors s1, ..., sn are linearly independent if their Wronskian is nonzero.

1.2 Eigenvalue Method (for diagonalizable coefficient matrices)

We now restrict our discussion to homogeneous first-order systems with constant coefficients: those of the form
y1' = a_{1,1} y1 + a_{1,2} y2 + ... + a_{1,n} yn
y2' = a_{2,1} y1 + a_{2,2} y2 + ... + a_{2,n} yn
...
yn' = a_{n,1} y1 + a_{n,2} y2 + ... + a_{n,n} yn.

We can rewrite this system in matrix form as y' = A y, where y is the column vector with entries y1, ..., yn and A is the n × n matrix whose entries are the coefficients a_{i,j}.

The idea behind the so-called Eigenvalue Method is the following observation:

Observation: If v = (c1, c2, ..., cn) is an eigenvector of A with eigenvalue λ, then y = v e^{λx} is a solution to y' = A y.

Proof: Differentiating y = e^{λx} v with respect to x gives y' = λ e^{λx} v = e^{λx} (λ v) = e^{λx} (A v) = A y.

Theorem: If A has n linearly independent eigenvectors v1, ..., vn with eigenvalues λ1, ..., λn, then the solutions to the matrix differential system y' = A y are given by y = c1 e^{λ1 x} v1 + c2 e^{λ2 x} v2 + ... + cn e^{λn x} vn, where c1, ..., cn are arbitrary constants.

Important Remark: The statement that A has n linearly independent eigenvectors v1, ..., vn with eigenvalues λ1, ..., λn is equivalent to the statement that A is diagonalizable with A = P D P^(-1), where the diagonal elements of D are λ1, ..., λn and the columns of P are the vectors v1, ..., vn.

Proof: By the observation above, each of e^{λ1 x} v1, e^{λ2 x} v2, ..., e^{λn x} vn is a solution to y' = A y. We claim that they are a basis for the solution space.

We can compute the Wronskian of these solutions; after factoring out the exponentials from each column, we obtain W = e^{(λ1 + ... + λn) x} det [v1 | ... | vn]. This product is nonzero because the exponential is nonzero and the vectors v1, ..., vn are linearly independent. Hence e^{λ1 x} v1, e^{λ2 x} v2, ..., e^{λn x} vn are linearly independent.

We also know by the existence-uniqueness theorem that the set of solutions to the system y' = A y is n-dimensional. So since we have n linearly independent elements e^{λ1 x} v1, ..., e^{λn x} vn in an n-dimensional vector space, they are a basis. Finally, since these solutions are a basis, all solutions are of the form y = c1 e^{λ1 x} v1 + c2 e^{λ2 x} v2 + ... + cn e^{λn x} vn, where c1, ..., cn are arbitrary constants.

By the remark, the theorem allows us to solve all homogeneous systems of linear differential equations whose coefficient matrix A is diagonalizable. To do this, follow these steps:

Step 1: Write the system in the form y' = A y for an n × 1 column matrix y and an n × n matrix A (if the system is not already in this form). If the system has equations which are not first order, introduce new variables to make the system first-order.

Step 2: Find the eigenvalues and eigenvectors of A, and check that A is diagonalizable. If A is diagonalizable, generate the list of n linearly independent eigenvectors v1, ..., vn with corresponding eigenvalues λ1, ..., λn. If a pair of complex-conjugate eigenvalues occurs, then the eigenvectors for one of them are the complex conjugates of the eigenvectors for the other.

Step 3: Write down the general solution to the system: y = c1 e^{λ1 x} v1 + c2 e^{λ2 x} v2 + ... + cn e^{λn x} vn, where c1, ..., cn are arbitrary constants.

Note: If there are complex-conjugate eigenvalues, then we generally want to write the solutions as real-valued functions. To do this, we take a linear combination: if λ = a + bi has an eigenvector v = w1 + i w2, so that λ = a - bi has the eigenvector w1 - i w2 (the conjugate of v), then to obtain real-valued solutions to the system, replace the two complex-valued solutions e^{λx} v and its conjugate with the two real-valued solutions e^{ax} (w1 cos(bx) - w2 sin(bx)) and e^{ax} (w1 sin(bx) + w2 cos(bx)).

Step 4 (if necessary): Plug in any initial conditions and solve for c1, ..., cn.

Example: Find all functions y1 and y2 such that
y1' = y1 + 3y2
y2' = -y1 + 5y2.

Step 1: The system is y' = A y, with y = (y1, y2) and A = [[1, 3], [-1, 5]].

Step 2: The characteristic polynomial of A is det(tI - A) = det [[t - 1, -3], [1, t - 5]] = (t - 1)(t - 5) + 3 = t^2 - 6t + 8, so the eigenvalues are λ = 2, 4.

For λ = 2 we want [[1, 3], [-1, 5]] (a, b) = (2a, 2b), so that (a + 3b, -a + 5b) = (2a, 2b). This yields a = 3b, so (3, 1) is an eigenvector.

For λ = 4 we want [[1, 3], [-1, 5]] (a, b) = (4a, 4b), so that (a + 3b, -a + 5b) = (4a, 4b). This yields a = b, so (1, 1) is an eigenvector.

Step 3: The general solution is (y1, y2) = c1 (3, 1) e^{2x} + c2 (1, 1) e^{4x}, that is, y1 = 3c1 e^{2x} + c2 e^{4x} and y2 = c1 e^{2x} + c2 e^{4x}.
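As a quick check, the answer can be compared against the matrix exponential e^{Ax}; here is a scipy sketch with arbitrarily chosen constants (not part of the original notes):

    import numpy as np
    from scipy.linalg import expm

    A = np.array([[1.0, 3.0],
                  [-1.0, 5.0]])

    def y_exact(x, c1, c2):
        # general solution from the example: c1 e^{2x} (3, 1) + c2 e^{4x} (1, 1)
        return c1*np.exp(2*x)*np.array([3.0, 1.0]) + c2*np.exp(4*x)*np.array([1.0, 1.0])

    c1, c2, x = 1.5, -0.4, 0.8
    assert np.allclose(expm(A*x) @ y_exact(0.0, c1, c2), y_exact(x, c1, c2))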

Example: Find all real-valued functions y1 and y2 such that y1' = y2 and y2' = -y1.

Step 1: The system is y' = A y, with y = (y1, y2) and A = [[0, 1], [-1, 0]].

Step 2: The characteristic polynomial of A is det(tI - A) = det [[t, -1], [1, t]] = t^2 + 1, so the eigenvalues are λ = ±i.

For λ = i we want [[0, 1], [-1, 0]] (a, b) = (ia, ib), so b = ia and thus (1, i) is an eigenvector.

For λ = -i we can take the complex conjugate of the eigenvector for λ = i to see that (1, -i) is an eigenvector.

Step 3: The general solution is (y1, y2) = c1 (1, i) e^{ix} + c2 (1, -i) e^{-ix}.

But we want real-valued solutions, so we need to replace the complex-valued solutions (1, i) e^{ix} and (1, -i) e^{-ix} with real-valued ones. We have λ = i and v = (1, i) = (1, 0) + i (0, 1), so that w1 = (1, 0) and w2 = (0, 1). Plugging into the formula in the note gives us the equivalent real-valued solutions (1, 0) cos(x) - (0, 1) sin(x) = (cos(x), -sin(x)) and (1, 0) sin(x) + (0, 1) cos(x) = (sin(x), cos(x)).

This gives the solution to the system as (y1, y2) = c1 (cos(x), -sin(x)) + c2 (sin(x), cos(x)), that is, y1 = c1 cos(x) + c2 sin(x) and y2 = -c1 sin(x) + c2 cos(x).

Remark: If the coecient matrix is not diagonalizable, life is more dicult, as we cannot generate a basis for the solution space using eigenvectors alone. To solve systems with non-diagonalizable coecient matrices requires introducing the exponential of a matrix, and to develop methods for computing it. We do not cover such techniques in this course.

Well, you're at the end of my handout. Hope it was helpful. Copyright notice: This material is copyright Evan Dummit, 2012. You may not reproduce or distribute this material without my express permission.

Some selected proof-based homework problems

Suppose A is an n n matrix with only one eigenvalue and n linearly independent eigenvectors. Show that every vector in Rn is an eigenvector of A. Solution: Say the only eigenvalue of A is , and consider the eigenspace for . We know that the eigenspace is a subspace of Rn . Because is the only eigenvalue, every eigenvector has eigenvalue . So since A has n linearly independent eigenvectors, the -eigenspace contains n linearly independent vectors. But the only subspace of Rn which can contain n linearly independent vectors is Rn itself (as every set of n linearly independent vectors spans Rn ). Therefore the -eigenspace is all of Rn . But this means every vector in Rn is an eigenvector of A with eigenvalue . Note: In fact, the argument tells us that A acts by multiplying every vector by . But this means that A = I (where I is the identity matrix). Suppose that A is an invertible matrix. Show that 0 is not an eigenvalue of A. Solution 1: By denition, 0 is an eigenvalue precisely when there is a nonzero vector v such that A v = 0 (since 0v = 0). But if A is invertible, then Av = 0 only has the trivial solution v = 0. Reason: we have v = (A1 A)v = A1 (Av) = A1 0 = 0. So, if A is invertible, then 0 is not an eigenvalue. Solution 2: A is invertible precisely when det(A) = 0. By denition, the characteristic polynomial is p(t) = det(A tI). In particular, setting t = 0 shows that det(A) = p(0). Since they are equal, we see that det(A) = 0 if and only if p(0) = 0. But the roots of the characteristic polynomial are precisely the eigenvalues. So p(0) = 0 is exactly the same as saying that 0 is not a root of p(t), and hence that 0 is not an eigenvalue. Note: Both solutions work both ways, and show that if A is not invertible, then 0 is an eigenvalue. Prove that if A and B are n n matrices and there exists an invertible n n matrix P with B = P 1 AP , then det(A) = det(B). Also show that if either A or B is invertible, then the other is invertible, and B 1 = P 1 A1 P . Solution:
Taking the determinant of both sides of B = P 1 AP yields det(B) = det(P 1 ) det(A) det(P ). Since determinants are scalars we can rearrange them to get det(B) = det(A) det(P 1 ) det(P ). Using multiplicativity of determinants again we have det(B) = det(A) det(P 1 P ) = det(A) det(I). Since the determinant of the identity matrix is 1, we obtain det(B) = det(A), as desired. The equality of the determinants means that det(B) = 0 precisely when det(A) = 0. But since a determinant is nonzero exactly when that matrix is invertible, we see that A is invertible if and only if B is invertible. Thus, if either is invertible, the other is as well. Finally, if both are invertible, then we can write B 1 = (P 1 AP )1 = P 1 A1 (P 1 )1 , since the inverse of a product is the product of the inverses in the opposite order. Since (P 1 )1 = P we obtain B 1 = P 1 A1 P , as desired.

Prove that if is an eigenvalue of A, then n is an eigenvalue of An for every n > 1. Solution 1: Saying is an eigenvalue of A means that there is a nonzero vector v such that A v = v . If we multiply both sides by A then we have A(A v) = A(v). Since is a scalar, A(v) = (A v) = (v) = 2 v , where in the middle we used the fact that A v = v . Therefore we have A2 v = 2 v . If we multiply both sides by A again, then we can do the same rearrangement to see that A3 v = 3 v . By continuing this process, we will eventually end up with An v = n v , for each n > 1. But this says precisely that v is an eigenvector of An with eigenvalue n . In particular, n is an eigenvector of An . Note: To be fully rigorous, one should set this up as an induction on n. The base case n = 1 is immediate, since A1 v = 1 v . The inductive step is: given An1 v = n1 v , multiply both sides by A to get An v = A(n1 v) = n1 (Av) = n1 (v) = n v . Solution 2: Saying is an eigenvalue of A means that det(A I) = 0. We want to see that det(An n I) = 0, since this is the same as saying that n is an eigenvalue of An . We can factor An n I as (A I) An1 + An2 (I) + An3 (I)2 + + A(I)n2 + (I)n1 . If we write B = An1 +An2 (I)+An3 (I)2 + +A(I)n2 +(I)n1 , then An n I = (AI)B . Taking determinants says det(An n I) = det(A I) det(B). But since det(A I) = 0, we see that det(An n I) = 0 as well. Therefore, since det(An n I) = 0, we see that n is an eigenvalue of An . Suppose U and W are subspaces of a vector space V . Let S be the set of all vectors in V of the form u + w, where u is a vector in U and w is a vector in W . Prove that S is also a subspace of V . Solution: We need to show three things: that S contains the zero vector, that S is closed under addition, and that S is closed under scalar multiplication. [S1]: S contains 0. Because U and W are subspaces, we know that each of them contains the zero vector 0. Therefore, S contains 0 + 0 = 0, so S contains the zero vector. [S2]: S is closed under addition. Suppose v1 and v2 are vectors in S . We want to show that v1 + v2 is also in S . By denition of S , we can write v1 = u1 + w1 and v2 = u2 + w2 , where u1 and u2 are in U and w1 and w2 are in W . Then v1 + v2 = u1 + w1 + u2 + w2 . We can rearrange this to read v1 + v2 = (u1 + u2 ) + (w1 + w2 ). But now since U is a subspace, u1 + u2 is also in U . Similarly, w1 + w2 is in W . So we have written v1 + v2 as the sum of a vector in U and a vector in W . Therefore, v1 + v2 is in S , by the denition of S . [S3]: S is closed under scalar multiplication.
Suppose v is a vector in S , and is a scalar. We want to show that v is also in S . By denition of S , we can write v = u + w, where u is in U and w is in W . Then by the distributive law we have v = (u + w) = ( u) + ( w). But now since U is a subspace, u is also in U . Similarly, w is in W . So we have written v as the sum of a vector in U and a vector in W . Therefore, v is in S , by the denition of S .

If v1 , , vk , vk+1 span a vector space V , and vk+1 is a linear combination of v1 , , vk , show that v1 , , vk span V . Solution: The statement that v1 , , vk , vk+1 span V says that any vector w in V can be written as a linear combination of v1 , , vk , vk+1 : say w = a1 v1 + a2 v2 + + ak vk + ak+1 vk+1 . We are also told that vk+1 is a linear combination of v1 , , vk : say as vk+1 = b1 v1 +b2 v2 + +bk vk . Now we can just substitute this expression for vk+1 into the expression for w: this gives w = a1 v1 + a2 v2 + + ak vk + ak+1 (b1 v1 + b2 v2 + + bk vk ). If we expand out the product and collect terms, we obtain the equivalent expression w = (a1 + ak+1 b1 ) v1 + (a2 + ak+1 b2 ) v2 + + (ak + ak+1 bk ) vk . This expresses w as a linear combination of v1 , , vk . Since w was arbitrary, this says every vector in V is a linear combination of v1 , , vk  which is to say, v1 , , vk span V .

Well, you're at the end of my handout. Hope it was helpful. Copyright notice: This material is copyright Evan Dummit, 2012. You may not reproduce or distribute this material without my express permission.

MATH 320: HOMEWORK 2
1.5: 6, 17, 30, 38; 2.1: 6, 21, 30; 2.2: 10, 20, 24; 2.3: 12, 20

EP1.5.6 Solve the initial value problem x y' + 5y = 7x^2, y(2) = 5.

Solution. The first step is to rewrite the problem appropriately: y' + (5/x) y = 7x. Now we can apply what we know of integrating factors. First, our integrating factor is μ(x) = exp(∫ 5/x dx) = exp(log x^5) = x^5. The general theory of first-order linear ODEs gives us (x^5 y)' = 7x^6, so x^5 y = x^7 + C and y(x) = (x^7 + C)/x^5. Applying the initial condition results in 5 = (128 + C)/32, so C = 5·32 - 128 = 32, and the solution is y(x) = (x^7 + 32)/x^5.
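For reference, sympy's dsolve reproduces this answer; a minimal sketch (assuming a sympy version that accepts initial conditions through the ics argument; the snippet is not part of the original solution):

    import sympy as sp

    x = sp.symbols('x')
    y = sp.Function('y')

    ode = sp.Eq(x*y(x).diff(x) + 5*y(x), 7*x**2)
    sol = sp.dsolve(ode, y(x), ics={y(2): 5})
    print(sol)        # y(x) = x**2 + 32/x**5, i.e. (x**7 + 32)/x**5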

EP1.5.17 Solve the initial value problem (1 + x)y + y = cos x, y(0) = 1 Solution. Again, rewriting is the key rst step. cos x 1 y + y= . 1+x 1+x This gives the integrating factor (x) = exp(log(1 + x)) = 1 + x. Using this in our original ODE gives (1 + x)y = cos x dx y(x) = sin x + C 1+x

Now we apply the initial condition to nd 2= Thus the solution is y(x) =


1+sin x 1+x . 1

1+C C = 1. 1+0


EP1.5.30 Express the solution of the initial value problem 2x dy = y + 2x cos x, y(1) = 0 dx

as an integral as in Example 3 of this section. Solution. As per usual, start with a rewrite. 1 dy y = cos x dx 2x Now we get the integrating factor.
x

(x) = exp
1

1 dt 2t

1 (x) = x

Apply this to the ODE to get y(x) = Ta-da. EP1.5.38 Consider the cascade of two tanks shown in Fig. 1.5.5, with V1 = 100 (gal) and V2 = 200 (gal) the volumes of brine in the two tanks. Each tank also initially contains 50 lb of salt. The three ow rates indicated in the gure are 5 gal/min of pure water owing into tank 1, 5 gal/min owing from tank 1 to tank 2, and 5 gal/min owing out of tank 2. (a) Find the amount x(t) of salt in tank 1 at time t. (b) Suppose that y(t) is the amount of salt in tank 2 at time t. Show rst that dy 5x 5y = dt 100 200 and then solve for y(t), using the function x(t) found in part (a). (c) Finally, nd the maximum amount of salt ever in tank 2. Solution. We must set up the appropriate IVP rst. Since we are losing 5 gal/min of solution and there is x/100 lb/gal of salt in the solution, the rate at which we are losing salt is dx 5x = dt 100 and we start with x(0) = 50. (a) Solving the ODE gives x(t) = Det/20 . Applying the initial condition results in x(t) = 50et/20 . (b) Since tank 1 is losing salt at the rate 5x/100, that is the rate at which tank 2 is gaining salt. Similarly to our reasoning for tank 1, this gives a outow rate of 5y/200 lb/min of salt. Using the previous solution for x(t) we get dy 250et/20 5y = . dt 100 200
x

x y0 +
1

cos t dt t

x
1

cos t dt. t


This can be rewritten as the linear ODE dy 5y 250et/20 + = . dt 200 100 The integrating factor is (t) = exp 1 dt 40 = exp(t/40).

Using the usual formula for linear ODEs (t)y(t) = 5 exp 2 t 2t 40 dt y(t) = 100 exp(t/40) + C exp(t/40)

y(t) = 100et/20 + Cet/40 . Now applying the initial condition gives 50 = 100 + C C = 150. So the solution is y(t) = 150et/40 100et/20 . (c) The max occurs when y = 0, so that means 0= 5x 5y y = 2x. 100 200
225 4

Using our formulas for y and x will show that the max must occur at t = 40 log 3 . 4 Plugging this back into y(t) shows the max is = 56.25.

EP2.1.6 Solve the initial value problem dx = 3x(x 5), x(0) = 2. dt Use the exact solution or a computer-generated slope eld to sketch graphs of several solutions of the ODE, and highlight the particular solution of the IVP. Solution. This is a separable ODE. It goes like this: dx = 3 dt. x(x 5) A direct integration would be dicult, so we use partial fractions to break down the left hand side. 1 A B = + x(x 5) x x5 1 = A(x 5) + Bx x = 0 : 1 = 5A A = x = 5 : 1 = 5B B = 1 5 1 5


So we can rewrite the ODE as 1 1 + 5x 5(x 5) dx = 3 dt

x5 1 log 5 x Doing a little bit of algebra yields x(t) =

= 3t + C

5 . 1 De15t

Applying the initial conditions results in 2= 3 5 D= . 1D 2 10 . 2 3e15t

Multiply the numerator and denominator by 2 to get the particular solution x(t) =

I recommend using DFIELD for the gure portion of this. EP2.1.21 Suppose that the population P (t) of a country satises the dierential equation P = kP (200 P ) with k constant. Its population in 1940 was 100 million and was then growing at the rate of 1 million per year. Predict this countrys population for the year 2000. Solution. The rst step is to solve this separable ODE using partial fractions. dP = k dt P (200 P ) is the separation. For the partial fractions, 1 A B = + P (200 P ) P 200 P 1 = A(200 P ) + BP 1 200 1 P = 200 : 1 = 200A B = . 200 P = 0 : 1 = 200A A = Now doing the integration gives 1 200 Straightforward algebra gives P (t) = 200De200kt 1 + De200kt log P 200 P = kt + C.

and we apply the initial condition P (0) = 100 gives 100 = 200D D = 1. 1+D

Since the initial growth rate is 1 (million per year) at population 100 million, we can find the growth constant k from 1 = k · 100(200 - 100), so k = 10^{-4}. Thus P(t) = 200/(1 + e^{-0.02 t}). To predict the population in the year 2000 we evaluate at t = 60: P(60) = 200/(1 + e^{-6/5}) ≈ 153.7 million, which completes the problem.
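A one-line numerical evaluation of that final expression (plain Python arithmetic, added only as a check):

    import math
    print(200 / (1 + math.exp(-1.2)))     # approximately 153.7 (million)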

Observe that P (t) approaches the nite limiting population P0 e0 / . Solution. We start by separating the variables 0 0 dP = 0 et dt log P = et + C P (t) = exp et + C P We can apply the initial condition at the middle step in the above line, 0 + log P0 = C. Now inserting this at the end P (t) = exp 0 t 0 e + + log P0 = P0 exp 0 (1 et )

which veries the solution. As t then 1et 1, so the solution tends to P0 e0 / . EP2.2.10 For the autonomous ODE (x = f (x)) dx = 7x x2 10 dt nd the critical points. Use the sign (positive or negative) of f (x) to determine each critical points stability and construct a phase diagram for the ODE. Then solve the ODE explicity for x(t). Finally, use either the exact solution or a computer-generated slope eld to sketch the typical solution curves, and verify visually the stability of each critical point. Solution. To nd the critical points we solve the quadratic equation: 0 = 7x x2 10 0 = (x 2)(x 5) x = 2, 5. Now for the stability analysis.


(1) In the region (, 2) the derivative x is negative (evaluate at x = 0 to see this), so the solution x(t) is decreasing. (2) In the region (2, 5) the derivative x is positive (x = 3: x = (1)(2) = 2 > 0) so the solution x is increasing. (3) In the region (5, ) the derivative x is negative (x = 6: x = (4)(1) = 4 < 0) so the solution is decreasing. Using this or the phase diagram it is clear the x = 2 is unstable and x = 5 is stable. For the exact solution we separate variables and use partial fractions: dx = dt, (x 2)(x 5) 1 A B = + (x 2)(x 5) x2 x5 1 = A(x 5) + B(x 2) x = 2 : 1 = 3A A = 1 x = 5 : 1 = 3B B = . 3 Now we can integrate to nd 1 3 and some algebra leads to 5 2De3t . 1 De3t For the graphing portion I would use DFIELD. x(t) = EP2.2.20 The ODE x = x(x 5)/100 + s models a population with stocking rate s. Determine the dependence of the number of critical points c on the parameter s, and then construct the corresponding bifurcation diagram in the sc-plane. Solution. The number of critical points given by the number of solutions to the equation 0= This is a quadratic equation: (0.1) (0.2) 1 2 5 100 c c+ s 100 100 100 0 = c2 5c + 100s 0= Now we use the quadratic formula 25 4(1)(100s) 1 c= = 2 There are three regions of solution to this equation. 5 1 16s . 2
1 16 .

1 3

log

x5 x2

= t + C

c(c 5) + s. 100

(1) This formula has no solutions if 1 16s < 0 which is equivalent to s >


(2) There is exactly one solution if 1 16s = 0 which occurs for s = (3) There are two solutions if 1 16s > 0 which is equivalent to s <

1 16 . 1 16 .

We can graph this in the sc plane by graphing s = c(c 5)/100. The graph is provided.

EP2.2.24 Separate the variables in the logistic harvesting equation x = k(N x)(x H) and then use partial fractions to derive the solution x(t) = N (x0 H) H(x0 N )ek(N H)t . (x0 H) (x0 N )ek(N H)t dx = k dt. (N x)(x H) The partial fraction decomposition goes like this: 1 A B = + (N x)(x H) N x xH 1 = A(x H) + B(N x) 1 N H 1 . x = H : 1 = B(N H) B = N H x = N : 1 = A(N H) A = The integration yields 1 (log(x H) log(N x)) = kt + C N H and resulting algebra gives x(t) = DN + Hek(N H)t . D + ek(N H)t

Solution. Separating the variables we get

Apply the initial condition x(0) = x0 and do the algebra to nd D= H x0 x0 H = . x0 N x0 N


Plug this into our general solution, then multiply numerator and denominator by (x0 N ) to arrive at x(t) = And were done. EP2.3.12 It is proposed to dispose of nuclear wastesin drums with weight W = 640 lb and volume 8 ft3 by dropping them into the ocean (v0 = 0). The force equation for a drum falling through water is dv = W + B + FR dt where the buoyant force B is equal to the weight (at 62.5 lb/ft3 ) of the volume of water m displaced by the drum (Archimedes principle) and FR is the force of water resistance, found empirically to be 1 lb for each foot per second of the velocity of the drum. If the drums are likely to burst upon an impact of more than 75 ft/s, what is the maximum depth they can be dropped in the ocean without likelihood of bursting? Solution. Note that the relation between mass and weight is m = W g. Plugging in the relevant constants, the law of motion is 20 dv dv v = 640 + 62.5 8 v = 7+ . dt dt 20 N (x0 H) H(x0 N )ek(N H)t . (x0 H) (x0 N )ek(N H)t

This is solved by separation of variables, v dv = dt 20 log 7 + = t + C v(t) = Det/20 140. 7 + v/20 20 The initial condition gives D = 140. The drum will burst if it reaches a downward velocity of 75 ft/s, so 65 140 is the time when the drum would reach that velocity. The depth it would be at at that 75 = 140(et/20 1) t = 20 log time is given by integrating the velocity:
20 log(65/140)

x(20 log(65/140)) =
0

v(t) dt
20 log(65/140)

= 140
0

et/20 1 dt
20 log(65/140) 0

= 140 20et/20 t

648.

Provided we drop the drums to a depth less than 648 feet, they should not break up impact. EP2.3.20 An arrow is shot straight upward from the ground with an initial velocity of 160 ft/s. It experiences both the deceleration of gravity and deceleration v 2 /800 due to air resistance. How high in the air does it go?


Solution. The ODE for the law of motion is dv v2 = g . dt 800 We can change this into an ODE for which the independent variable is x, since we want to nd the max height. We use variables: v dv g+
v2 800 dx dt

dv dx dx dt

= v dv . We can solve this using separation of dt v2 = Dex/400 . 800

= dx 400 log g +

v2 800

= x + C g +

Applying the initial condition v(0) = 160 and g = 32, we get that D = 64. The maximum height is reached when the velocity is 0. Applying this to the above equation results in 32 = 64ex/400 x = 400 log .5 277.27 as the maximum height.

Solutions to Midterm 2 Practice Problems


Evan Dummit

April 13, 2012

1. 2. For the matrices

A= 3 2

1 1 1 1

3 2 3 2

and

B= 4 3 9 7

1 2
.

3 1

1 1

, either compute, or explain why it does not exist:

(a)

A2 = A1 =

1 1

3 1
.

(b)

2 1

, either by the formula or by a row-reduction.

(c) (d) (e) (f )

det(A) = 1 B B
2 1

does not exist (B is not square). does not exist (B is not square). does not exist (B is not square).

det(B) AB = BA

(g)

1 1

3 2

1 2

3 1

1 1

7 5

6 5

2 1 23

(h) (i)

does not exist (dimensions don't work:

and

2 2).

det(A

100

) = det(A) S

100

= (1)

100

= 1

3. Consider the set

of matrices of the form

a b

b a V

, where

and

are real numbers.

(a) Show that

is a subspace of the vector space

of

22

matrices with real number entries.

We check the subspace conditions: i. ii.

S S

contains the zero matrix. (True). is closed under addition:

a b

b a

c d a b b a

d c =

a+c b+d (b + d) a + c ra rb rb ra

, which is of the

same form. iii.

is closed under scalar multiplication:

, which is of the same form.

(b) Find a basis for

S. a b b a =a 1 0 0 1 +b 0 1 S 1 0
, so one basis is

We can write

1 0

0 1

and

0 1

1 0

(c) Show that the (matrix) product of two elements in

is also in

S.
, so the product is also of this form.

4.

We have

a b

b a

c d

d c

ac bd ad + bc (ad + bc) ac bd

(a) Find a basis for the solution space to the system of equations

u v 2w x 2y z = w 3x y z =

0 0


In matrix form this is

1 0 0

1 0 0

2 1 0

1 3 1

x+ yz = 0 2 1 0 1 1 0 . 1 1 0

We see that the pivotal columns (columns containing a leading row term) are the rst, third, and fourth columns. Hence the nonpivotal columns are the second, fth, and sixth columns, so the

v , y , and z are the free variables. v = t1 , y = t2 , and z = t3 , then we can write all variables in terms of the arbitrary parameters t1 , t2 , and t3 : The last equation gives x = y + z so x = t2 + t3 . The middle equation gives w = 3x + y + z so w = 3(t2 + t3 ) + t2 + t3 = 2t2 + 4t3 . The rst equation gives u = v + 2w + x + 2y + z so u = t1 + 2(2t2 + 4t3 ) + (t2 + t3 ) + t2 + t3 = t1 4t2 + 10t3 . So our solution vector u, v, w, x, y, z is t1 4t2 + 10t3 , t1 , 2t2 + 4t3 , t2 + t3 , t2 , t3 . To nd the basis we split the vector apart: t1 4t2 + 10t3 , t1 , 2t2 + 4t3 , t2 + t3 , t2 , t3 = t1 1, 1, 0, 0, 0, 0 + t2 4, 0, 2, 1, 1, 0 + t3 10, 0, 4, 1, 0, 1 .
corresponding variables So if we set Then the basis is

1, 1, 0, 0, 0, 0 , 4, 0, 2, 1, 1, 0 , 10, 0, 4, 1, 0, 1
and

(b) Show that, for any

a, b,

c,

there exists a solution to

u v 2w x 2y z = w 3x y z = x+ yz =
[Hint: There is a fairly easy solution in terms of

a b c

a, b, c.]
or,

The idea is to set all of the free variables equal to zero, and then solve for the non-free variables. Doing this eventually leads us to the solution as a vector,

u, v, w, x, y, z =

v = y = z = 0, x = c, w = 3c + b, u = 6c + 2b + a, 6c + 2b + a, 0, 3c + b, c, 0, 0 .

If we were interested in the general solution, we would add this particular solution to the general solution of the homogeneous equation, which we found in the previous part.

5. (a) Are the vectors

2, 1, 0

1, 0, 1

, and

0, 1, 2

linearly independent? If so, explain why; if not, nd

an explicit dependence between them.

a 2, 1, 0 + b 1, 0, 1 + c 0, 1, 2 = 0, 0, 0 then we would need 2a b = 0, a + c = 0, and b + 2c = 0. The rst equation gives b = 2a and the second gives c = a, and then the third reduces to 0 = 0. Thus there is a nontrivial dependence: for example, 1 2, 1, 0 2 1, 0, 1 +1 0, 1, 2 = 0, 0, 0 . 2 1 0 Another way (which is really the same way) is to compute the determinant of the matrix 1 0 1 0 1 2 2 1 0 0 1 and see that it is zero. The coecients in the dependence are then a nonzero solution to 1 0 1 2 a 0 b = 0 . c 0
If we had a dependence

No , they are not linearly independent.

(b) Are the vectors

1, 0, 1, 1

0, 1, 1, 1

, and

1, 2, 0, 2

linearly independent? If so, explain why; if not,

nd an explicit dependence between them.

Yes , they are linearly independent. If we had a dependence need

a 1, 0, 1, 1 + b 0, 1, 1, 1 + c 1, 2, 0, 2 = 0, 0, 0, 0 then we would a + c = 0, b + 2c = 0, a b = 0, and a + b + 2c = 0. The rst equation gives c = a and the third gives b = a, and then the second gives a 2a = 0, so that a = b = c = 0. So the only way for a 1, 0, 1, 1 + b 0, 1, 1, 1 + c 1, 2, 0, 2 = 0, 0, 0, 0 to be true is with a = b = c = 0, which means the vectors are linearly independent. v1
and

6. Suppose that the two vectors

v2

are linearly dependent.

Show that one of the vectors is a scalar

multiple (possibly by zero) of the other.

If the vectors are dependent then we have a relation to zero. If

av1 + bv2 = 0

with at least one of

and

not equal

a b

is not zero, then we can solve the dependence for

v1 : av1 = bv2 v2 : bv2 = av1

so

v1 = v2 =

b a b a

v2 . v1 .

If

is not zero, then we can solve the dependence for

so

7. Suppose that the three vectors

v1 , v2 ,

and

v3

are a basis for a vector space and

V.

(a) Show that the three vectors

v1 , v1 + v2 ,

v1 + v2 + v3

are linearly independent.

Suppose we had a dependence relation that the only possibility is

a1 (v1 ) + a2 (v1 + v2 ) + a3 (v1 + v2 + v3 ) = 0. We want to show a1 = a2 = a3 = 0. By the distributive laws of scalar multiplication, the equation above is the same as a1 v1 +a2 v1 +a2 v2 + a3 v1 + a3 v2 + a3 v3 = 0, which we can regroup and write as (a1 + a2 + a3 )v1 + (a2 + a3 )v2 + (a3 )v3 = 0. But we know that v1 , v2 , and v3 are linearly independent (because they are a basis). So each of the coecients a1 + a2 + a3 , a2 + a3 , and a3 must be zero. Since a3 = 0 and a2 + a3 = 0, then we must have a2 = 0. And then since a1 + a2 + a3 = 0 and a2 = a3 = 0 we get a1 = 0 as well. So we conclude that in fact a1 = a2 = a3 = 0 is the only way to satisfy the equation a1 (v1 ) + a2 (v1 + v2 ) + a3 (v1 + v2 + v3 ) = 0. But this means precisely that v1 , v1 + v2 , and v1 + v2 + v3 are linearly independent. v1 , v1 + v2 ,
and

(b) Show that the three vectors Suppose that

v1 + v2 + v3

span

V.

v1 + v2 + v3 , Since v1 , v2 ,

V : we want to write w as a linear combination of v1 , v1 + v2 , and w = c1 v1 + c2 (v1 + v2 ) + c3 (v1 + v2 + v3 ). and v3 span V (because they are a basis), we know that w = a1 v1 + a2 v2 + a3 v3 for some scalars a1 , a2 , and a3 . Setting the two expressions equal gives a1 v1 + a2 v2 + a3 v3 = c1 v1 + c2 (v1 + v2 ) + c3 (v1 + v2 + v3 ). Our goal is to nd values for c1 , c2 , and c3 . The most natural thing to do is compare coecients of v1 , v2 , and v3 on both sides. Doing this (after expanding with the distributive laws as in the previous part) yields a1 = c1 + c2 + c3 , a2 = c2 + c3 , and a3 = c3 . Substituting gives c3 = a3 , c2 = a2 a3 , and c1 = a1 a2 . So w = (a1 a2 )v1 + (a2 a3 )(v1 + v2 ) + a3 (v1 + v2 + v3 ). We have written a general vector w in V as a linear combination of v1 , v1 + v2 , and v1 + v2 + v3 , which shows that v1 , v1 + v2 , and v1 + v2 + v3 span V .
is a vector in say

(c) Show that the three vectors

v1 , v1 + v2 ,

and

v1 + v2 + v3

are also a basis for

V.

By the previous two parts,

V,

which means they are a basis for

v1 , v1 + v2 , and v1 + v2 + v3 V.

are a linearly independent spanning set for

8. Consider the function

from

R4 to R3 which sends the vector w, x, y, z

to the vector

w + x, y + z, w + x + y + z

(a) Find

ker(T ):

all vectors

w, x, y, z 0, 0, 0

in

R4

which are sent to the zero vector

0, 0, 0

by

T.

The vectors sent to

are those with

see that the vectors sent to

0, 0, 0
in

are

w + x = 0, y + z = 0, and w + x + y + z = 0. We can then those w, x, y, z of the form s, s, t, t for any values of w, x, y, z


with

s
(b) Find

and

t.
all vectors

im(T ): a, b, c .
We have

a, b, c

R3

for which there exists a vector

T (w, x, y, z) =

w + x, y + z, w + x + y + z = a, b, c , which means that a, b, c must satisfy the relation c = a + b. Then it is clear that a and b can be anything, and so the image of T is the vectors of the s, t, s + t for any values of s and t. form

Quiz #2 Solutions, Math 320, Sections 352/353/354/355, February 14/16.

Find the requested solutions to the given dierential equations (implicit solutions are okay):

 The function (x) such that x + 3 = 3 and (1) = 2. Solution 1 (Linear): Putting the equation in the form
+ 3 3 = x x
(3/x) dx

shows that it is linear. The integrating factor is then I(x) = e Multiplying by it and the integrating both sides gives
(x3 + 3x2 ) dx =

= e3 ln(x) = eln(x

= x3 .

3x2 dx.

Evaluating the integrals yields

x3 = x3 + C

so that = 1 + C x3 . Plugging in the initial condition and solving yields C = 1, so the answer is (x) = 1 + x3 .
Solution 2 (Separable): Rearranging the equation as =

rating it yields

3 3 shows that it is separable. Sepax

1 = 3 3 x 1 d = 3 3 1 dx. x

and the integrating both sides gives

Evaluating both sides yields ln(3 3) = ln(x) + C , so that ln(3 3) = 3 ln(x) + C = 3 ln(x3 ) + C . Exponentiating and rearranging then shows (x) = 1 + C x3 , and plugging in the initial condition gives C = 1, so (x) = 1 + x3 .
 The function y(x) with y + y =
8e2x and y(0) = 1. y

Solution (Bernoulli): Rewrite in the form y + y = 8e2x y 1 to see the equation is of Bernoulli type, with n = 1. Set v = y 2 , so that v = 2yy . Multiplying both sides of the original equation by 2y

gives

2yy + 2y 2 = 16e2x ,

and rewriting in terms of v gives the rst-order linear equation


v + 2v = 16e2x .

The integrating factor is I(x) = e

= e2x so scaling by it and then integrating yields 2x 2x (e v + 2e v) dx = 16e4x dx


2 dx

e2x v

4e4x + C

so that v = 4e2x + Ce2x . Since y(0) = 1, v(0) = 12 = 1. Therefore, we see C = 3; then solving for y yields y(x) =
4e2x 3e2x .

 All functions P (t) satisfying et

dP = eP . dt

Solution (Separable): The equation is separable. Rearranging and then integrating gives eP dP = et dt

and evaluating gives

eP = et + C.

This is a correct (implicit) answer, although we can solve for P to obtain P = ln(et + C) .
 All functions y(x) satisfying y =
4x3 y y 2 . x4 2xy

Solution (Exact): Clearing the denominators and rewriting gives the equation as 4x3 y y 2 + x4 2xy y = 0. We then have M = 4x3 y y 2 and N = x4 2xy , so that My Nx = = 4x3 2y 4x3 2y.

These are equal, so the equation is exact. We want Fx = M so F = M dx = x4 y xy 2 + g(y) for some function g(y). Dierentiating gives N = Fy = x4 2xy + g (y) so we can take g(y) = 0. Therefore, the general solution is F (x, y) = x4 y xy 2 = C .
 The function w(x) with w =
x2 + w 2 and w(1) = 4. xw x w + , w x

Solution 1 (Homogeneous): Dividing the fraction gives w =

which is a homogeneous equation. Setting v =

w , so that w = v + xv , transforms the equation into x 1 v

v + xv = v +

so that xv = . Separating and integrating yields


v dv = 1 2 1 dx x

1 v

so that v 2 = ln(x) + C , or v = 2 ln(x) + C , so w = x 2 ln(x) + C . Finally, plugging in the initial condition shows we want C = 16 and the plus sign, so w = x 2 ln(x) + 16 . Solution 2 (Bernoulli): Dividing the fraction and rearranging gives
w 1 w = xw1 , x

which is Bernoulli with n = 1. Setting v = w2 gives v = 2ww , so multiplying the original equation by 2w yields
2ww 2 2 w = 2x, x

or, in terms of v ,
v

2 v = 2x. x

The integrating factor is I(x) = e yields Evaluating gives

= e2 ln(x) = eln(x ) = x2 , so scaling and integrating (x2 v 2x2 v) dx = 2x1 dx.


(2/x) dx
2

x2 v = 2 ln(x) + C,

so that v = x2 (2 ln(x) + C) and w = x 2 ln(x) + C . As before the initial condition yields the answer as w = x 2 ln(x) + 16 . Solution 3 (Almost-Exact): Clearing the fraction and rearranging gives
(x2 w2 ) + (xw)w = 0,

so that M = x2 w2 and N = xw. Thus we have


Mw Nx = 2w = w,

3w 3 M w Nx = = , which is a N xw x 3 function of x only. Hence we have an integrating factor of I(x) = e (3/x) dx = e3 ln(x) = eln(x ) = x3 , so scaling by it gives (x1 x3 w2 ) + (x2 w)w = 0. 1 We want Fx = x1 x3 w2 so F = x1 x3 w2 dx = ln(x) + x2 w2 + g(w) for some 2 function g(w). Dierentiating gives Fy = x2 w + g (w) so we can take g(w) = 0. Then the general 1 solution is ln(x) + x2 w2 = C . Plugging in the initial condition shows C = 8, so the solution is 2 1 2 2 given by ln(x) + x w = 8 . (Note that we can solve for w, and would recover the same answer 2

so the equation is not exact. However, we can observe that

as in the other solutions.)

 The function y(x) satisfying (2ex y + y 4 ) + (ex + 4y 3 )y = 0 which has y(0) = 1. Solution (Almost-Exact): We have M = 2ex y + y 4 and N = ex + 4y 3 , so
My Nx = = 2ex + 4y 3 ex

My Nx ex + 4y 3 = x = 1, which is a N e + 4y 3 function of x only. Hence we have an integrating factor of I(x) = e 1 dx = ex , so scaling by it gives

so the equation is not exact. However, we can observe that

(2e2x y + ex y 4 ) + (e2x + 4ex y 3 )y = 0. We want Fx = 2e2x y + ex y 4 so F = 2e2x y + ex y 4 dx = e2x y + ex y 4 + g(y) for some function g(y). Dierentiating gives Fy = e2x + 4ex y 3 + g (y) so we can take g(y) = 0. Thus the general solution is e2x y + ex y 4 = C . Now plugging in the initial condition shows C = 2, so the solution is given by e2x y + ex y 4 = 2 .

Quiz #3 Solutions, Math 320, Sections 352/353/354/355, March 20/22.

(5 2.5)

Decide whether the following statements are true or false, and circle T or F respectively.

If you

wish to receive possible partial credit in case of a wrong answer, you may explain your answer (and work out computations, if necessary) in the space below the problem.

If

A=

1 0

2 1

and

B=

1 4 7 4

1 1 3 1

, then

AB =

7 4

3 2

False . The correct product is

AB =

The inverse of the matrix

3 1

4 1 1 1

is

1 1

4 3

True . To check, multiply

3 1

4 1

4 3

to see that the product is

1 0

0 1

If

A, B, C

are

2 4, 2 3,

and

3 2 matrices respectively,

then the product

CAB

is dened.

False . The product either.

AB

is not dened (the middle dimensions do not match), so

CAB

is not dened

1 0 0 1 The matrix 0 0 0 0

1 0 0 0

0 1 0

1 0 is in row-echelon form but not reduced row-echelon form. 1 0

True . The matrix has the proper staircase shape for echelon form, but is not reduced because of the


T F The matrix

2 1 0

1 0 1

0 1 2

is invertible.

False . This matrix has determinant zero, and therefore is not invertible.

(6 2.5)

Decide whether the following statements are true or false, and circle T or F respectively.

If you

wish to receive possible partial credit in case of a wrong answer, you may explain your answer (and work out computations, if necessary) in the space below the problem.

If

and

are square matrices of the same size,

(A + B)2 = A2 + 2AB + B 2 .

False . The correct statement is

(A + B)2 = A2 + AB + BA + B 2 ; this is not equal to A2 + 2AB + B 2

because matrix multiplication is not necessarily commutative.

F False .

If

and

are invertible matrices of the same size, then

(A + B)1 = A1 + B 1 .

If

and

B = I ).

Even if

B are invertible, then A + B does not need to be invertible (example: A = I , A + B is invertible, its inverse is generally not A1 + B 1 (example: A = B = I ).

T matrices

If

is an

nn

matrix and

is an

x1

and

x2

such that

A x1 = b

and

n 1 matrix such that there A x2 = b, then det(A) = 0.

exist two dierent

n1

det(A) were nonzero, then A1 would exist. Then we could multiplyA x1 = b = A x2 on 1 1 the left by A to see x1 = A b = x2 , which can't happen because x1 and x2 are dierent. Another way to phrase the problem statement is: if a homogeneous system of n equations in n
True . If variables has two dierent solutions, then the coecient matrix has determinant zero. (Written this way, the statement might seem more familiar.)

If

A=

1 0

2 1

B=

1 1

1 1

, and

C=

3 1

2 1

, then

det(CBABA2 CA) = 100.

True . The determinant is multiplicative, so

det(CBABA2 CA) = det(C) det(B) det(A) det(B) det(A)2 det(C) det(A) = det(A)4 det(B)2 det(C)2 .
Since

det(A) = 1, det(B) = 2, and det(C) = 5, we see that det(CBABA2 CA) = (1)4 (2)2 (5)2 =

100.
The wrong way to do this problem is to actually perform the seven matrix multiplications to nd

CBABA2 CA =
waste of time.)

6 8

14 2

, and then take the determinant. (This does work, but it is a massive

If

and

are matrices such that

AB 1

is invertible, then

BA

is also invertible.

False . Here is a counter-example:

A=

B=

1 1

. Then

AB = [1]

is invertible, but

BA =

1 1

2 2

is not invertible.

The problem is true if the matrices to see that

A and B are the same size: in that case, we can take determinants det(AB) = det(A) det(B) = det(B) det(A) = det(BA).

Matrices are interesting.

True . This is self-evidently a true statement.

Quiz #4 Solutions, Math 320, Sections 352/353/354/355, March 20/22.

(2,2,2) Suppose that

v1 , , vn

are vectors in a vector space

V.

Dene, in one sentence, the span of

v1 , , vn . w
with

The span is the subspace

of all linear combinations of the vectors

More explicitly, the span is the collection of vectors

v1 , , vn . w = a1 v1 + + an vn

for some scalars

a1 , , an .
Equivalently, the span is the smallest subspace of

which contains each of the vectors

v1 , , vn .

Dene, in one sentence, what it means for The vectors are

v1 , , vn

to be linearly independent.

v1 , , vn are linearly independent when the only scalars for which a1 v1 + + an vn = 0 a1 = = an = 0.

Equivalently, the vectors are linearly independent if none of them can be written as a linear combination of the others.

Dene, in one sentence, what it means for

v1 , , vn

to be a basis of

V. v1 , , vn
must be

A basis is a linearly independent spanning set: so, to be a basis, the vectors linearly independent and span Equivalently,

V.
can be written as a unique linear combination

v1 , , vn is a basis if every vector w in V w = a1 v1 + + an vn . (V, S),


determine whether

(2,2,2) For each of the following pairs or why not.

is a subspace of

V,

and briey explain why

Note: The goal of all three parts of this problem was to check the three pieces of the subspace criterion: (i) that the zero vector is in multiple of a vector in

S,

(ii) that the sum of two vectors in

is also in

S,

and (iii) that any scalar

is also in

S.

If any of these conditions fail, then

is not a subspace.

V = R3 , S

is all vectors whose sum of coordinates is 3.

In this case,

is the vectors

x, y, z

where

x + y + z = 3.

We go through the parts of the subspace criterion: 1. The zero vector 2.

0, 0, 0

is not in

S.

3.

S is not closed under addition: for example, 3, 0, 0 and 0, 3, 0 are both in S , but their sum 3, 3, 0 is not. S is not closed under scalar multiplication: for example, 1, 1, 1 is in S , but 2 1, 1, 1 = 2, 2, 2
is not.

Therefore, this satised.

is

not a subspace of

 none of the three parts of the subspace criterion is

V = R2012 , S

is all vectors which have at least 2010 zeroes in them.

We go through the parts of the subspace criterion: 1. The zero vector 2.

0, 0, . . . , 0

is in

S,

since it has 2012 zeroes in it.

is not closed under addition: for example, if

entries equal to 1 and their sum 3.

v1 = 1, 1, 0, 0, , 0 is the vector with rst two v2 = 0, 0, , 0, 0, 1, 1 is the vector with last two entries equal to 1, then v1 + v2 = 1, 1, 0, 0, , 0, 0, 1, 1 only has 2008 zeroes, not 2010.

is closed under scalar multiplication, since scaling a vector will not reduce the number of zero

entries it has.

Therefore, this

is

not a subspace of

, because

is not closed under addition.

is all polynomials in the variable

x, S
are

is all polynomials

p(x)

such that when

p(0) = 0.
each of these polynomials

Some examples of elements in is zero.

x, x

10

3x

, and

0:

x = 0,

We go through the parts of the subspace criterion:

1. The zero vector here is the zero polynomial, and it is in 2.

3.

S is closed 0 + 0 = 0. S is closed

under addition: if

r(x) = p(x) + q(x)

and

S. p(0) = q(0) = 0,
and

then

r(0) = p(0) + q(0) = r(0) = 0 = 0.

under scalar multiplication: if

r(x) = p(x) S

p(0) = 0

then

Therefore, this

is a subspace of

, because

satises all three of the subspace conditions.

v w + 2x y z =
(5) Find a basis for the set of solutions

0 0 0
.

v, w, x, y, z

to the system

w x 2y 3z = y + 2z =

The goal is to nd all solutions to this homogeneous system, and then extract the basis from the collection of solutions.


The system in matrix form is

1 1 0 1 0 0

2 1 0

1 2 1

1 3 2

0 0 0

. This matrix is already in echelon form,

so we need only identify the free variables and then nd the general solution.

The rst, second, and fourth columns are pivotal columns (because they contain leading row terms). Thus the third and fth columns are nonpivotal columns, and so their corresponding variables (x and are the free variables.

z)

So if we set

x=s

and

z = t,

then we can substitute one at a time into the equations to write all the

variables in terms of our free parameters

and

t.

The last equation

The middle equation The rst

y + 2z = 0 gives y = 2z = 2t. w x 2y 3z = 0 gives w = x + 2y + 3z = s 4t + 3t = s t. equation v w + 2x y z = 0 gives v = w 2x + y + z = (s t) 2s 2t + t = s 2t. v, w, x, y, z = s 2t, s t, s, 2t, t


. .

Hence all the solutions are given by

To extract the basis we just split apart this solution vector:

s 2t, s t, s, 2t, t = s, s, s, 0, 0 + 1, 1, 1, 0, 0

2t, t, 0, 2t, t = s 1, 1, 1, 0, 0 + t 2, 1, 0, 2, 1
and

We have written every solution to the system as an explicit linear combination of the vectors

2, 1, 0, 2, 1

: therefore,

1, 1, 1, 0, 0 2, 0, 1
,

and

2, 1, 0, 2, 1
, and

are a basis for the solution space.

(5) Determine (with justication) whether

1, 1, 1

1, 3, 2

form a basis for

R3 . | vn |

| In general, a collection of n vectors v1 , , vn in Rn is a basis precisely when the matrix A = v1 | is invertible. (The columns of A are the coordinates of v1 , , vn .)
would need to look for scalars

The reason is: if we wanted to determine whether there was a dependence between our vectors, we

x1 , , xn

(not all zero) such that

is the same as nding a nonzero solution to the matrix equation

x1 vn + + xn vn = 0. But x1 . A x = 0, where x = . , . xn

this

and

such a solution exists precisely when

is non-invertible.

Because

det(A) = det(AT ),

one could just as well use the matrix whose rows are the vectors

v1 , , vn . 2 Thus we need to determine whether the matrix A = 0 1


determinant by expanding down the rst column:

1 3 is invertible. To do this we compute the 2 1 3 1 1 det(A) = 2 +1 = 2 (1) + 1 2 = 0. 1 2 1 3 1 1 1


not a basis because

Since the determinant is zero, they are linearly dependent.

is not invertible, meaning that these vectors are

Note: We could also solve the problem just by nding an explicit dependence:

1 2, 0, 1 3 1, 1, 1 +

1 1, 3, 2 = 0, 0, 0

(4) Suppose that the vectors

v1 , v2 ,

and

v3

are linearly independent. Show that the vectors

v1 , v1 + v2 ,

and

v1 + v2 + v3

are also linearly independent.

Suppose we had a dependence relation the only possibility is

a1 (v1 ) + a2 (v1 + v2 ) + a3 (v1 + v2 + v3 ) = 0. a1 = a2 = a3 = 0.


which we can regroup and write as

We want to show that

By the distributive laws of scalar multiplication, the equation above is the same as

a3 v1 + a3 v2 + a3 v3 = 0,
But we know that and

a1 v1 + a2 v1 + a2 v2 + (a1 + a2 + a3 )v1 + (a2 + a3 )v2 + (a3 )v3 = 0. a1 + a2 + a3 , a2 + a3 ,

v1 , v2 , and v3

are linearly independent. So each of the coecients

a3

must be zero. And then since

Since

we get

a3 = 0 and a2 +a3 = 0, then we must have a2 = 0. a1 = 0 as well. a1 = a2 = a3 = 0 v1 , v1 + v2 ,


and

a1 +a2 +a3 = 0 and a2 = a3 = 0 a1 (v1 ) + a2 (v1 +

So we conclude that in fact

is the only way to satisfy the equation

v2 ) + a3 (v1 + v2 + v3 ) = 0.
But this means precisely that

v1 + v2 + v3

are linearly independent.

Quiz #5, Math 320, Sections 352/353/354/355, May

2 1.

(2,2,2) Find the general solution to each homogeneous second-order linear dierential equation:

y + 6y + 10y = 0.
The characteristic equation is the quadratic formula. functions,

r + 6r + 10 = 0,
3x

which has roots

The general solution is

y = Ae

(3+i)x

+ Be

(3i)x

62 4 10 = 3 i 2

by

, or, to use real-valued

y = Ae

3x

cos(x) + Be

sin(x)

y + 6y + 5y = 0.
The characteristic equation is

r + 6r + 5 = 0,

which has roots .

62 4 5 = 3 2 2

 namely,

r = 5, 1.

The general solution is

y = Ae5x + Bex

y + 6y + 9y = 0.
The characteristic equation is

r2 + 6r + 9 = 0,

which has roots .

62 4 9 = 3 0 2

 namely,

r = 3, 3.

The general solution is

y = Ae3x + Bxe3x

(4) Find one solution to the non-homogeneous equation

y + 4y = x2 + e3x + sin(2x). y = C1 cos(2x) + C2 sin(2x).

The homogeneous equation is

y + 4y = 0,

which has general solution

The non-homogeneous part of the original equation is

x +e

3x

+ sin(2x).

We replace all coecients with variables; then we ll in the missing cosine term, and nally add in the missing lower-degree terms, to obtain a rst guess of

A2 x2 +A1 x+A0 +Be3x +D sin(2x)+E cos(2x).

There is an overlap (the sine and cosine terms) with the solutions of the homogeneous equation, so we scale the overlapping terms by This gives us the correct guess as

x. ypar = A2 x2 + A1 x + A0 + Be3x + Dx sin(2x) + Ex cos(2x)


.

Now we compute the second derivative

ypar = 2A2 + 9Be3x + D(4 cos(2x) 4x sin(2x)) + E(4 sin(2x) 4x cos(2x))


[note: on the actual quiz, the second derivatives of

x sin(2x)

and

x cos(2x)

were given] to obtain

ypar + 4ypar = 4A2 x2 + 4A1 x + (2A2 + 4A0 ) + 13Be3x + 4D cos(2x) 4E sin(2x).


Equating

ypar +4ypar

with

x2 +e3x +sin(2x) yields A2 = ypar =


and

1 1 1 1 , A1 = 0, A0 = , B = , D = 0, E = . 4 8 13 4
.

Thus the desired answer is

1 2 1 1 1 x + e3x x cos(2x) 4 8 13 4

(4) Show that the functions 1,

x,

ex

are linearly independent functions on the real line. [Hint:

W .]

We compute the Wronskian

W (1, x, e ) =

1 0 0

x ex 1 ex 0 ex

= ex .

(Note the matrix is upper-triangular, so

the determinant is just the product of the diagonal entries

1, 1, ex .)

The Wronskian is nonzero, hence the functions are

linearly independent .

(4) Find a homogeneous linear dierential equation which has

sin(t)1337 cos(t) and t2 +t+ 666

as solutions.

The idea is to think about what factors have to be in the characteristic polynomial of the equation, in order to get these two functions as solutions. In order to have a function characteristic polynomial

A sin(t) + B cos(t) as a solution p(r) should have a factor r2 + 1.

to an equation with constant coecients, the

In order to have a function characteristic polynomial

Ax2 + Bx + C as a solution p(r) should have a factor r3 .

to an equation with constant coecients, the

Thus, in order to get both functions as solutions, the characteristic polynomial should be divisible by both

r2 + 1

and

r3 . r3 (r2 + 1) = r5 + r3 .

The easiest polynomial with this property is just the product, The corresponding dierential equation is

+y

=0

Note: There are many other dierential equations which will work (though this one is the simplest). For example, any constant-coecient equation with characteristic polynomial divisible by work. There are also equations with non-constant coecients which work.

r5 + r3

will also

(4) Find one function equation

yp (x)

such that are

y +
and

y +

1 1 y 2y = 0 x x

y1 = x

1 1 y 2 y = x, x x 1 y2 = . x

given that two solutions to the homogeneous

This is a variation of parameters problem (although it is possible to guess a solution via undetermined coecients). We are given two independent solutions

y1 = x

and

y2 =

1 x

to the homogeneous equation.

The variation of parameters setup then says to take our particular solution as

ypar = v1 y1 + v2 y2 ,

where

v1 =

W1 (x) W (x) x 1

and

v2 =

W2 (x) , W (x)

with

W (x) =

W1 (x) =

0 x

W2 (x) =
Thus we get

x 1 v1 =

1 1 1 2 x 1 = (x) x2 (1) x = x , 2 x 1 1 x 1 = 0 (x) x = 1, and 2 x 0 = x2 . x 1 x = (2/x) 2 x2 4


and and

v2 = x4 . 8

x2 x3 = . (2/x) 2

Integrating gives

v1 =

v2 =

Thus we obtain

ypar = v1 y1 + v2 y2 =

x2 4 1 1

(x) + 5 3

x4 8

1 x

x3 8

(4) Find the eigenvalues of the matrix

A=

The eigenvalues are the zeroes of the characteristic polynomial We have

p(t) = det(tI A).

tI A =

t+1 1

5 t3

, so

p(t) = det(tI A) = (t + 1)(t 3) 5 = t2 2t 8 = (t 4)(t + 2). ( 4)( + 2) = 0


 i.e.,

Therefore the eigenvalues are the solutions to

= 2

and

Quiz #6, Math 320, Sections 352/353/354/355, April 251.

(1 9)

Find general solutions for each of the following dierential equations or systems:

P =

1 P (5 P ). 100 dP = P (5 P ) 1/5 1/5 + dP = P 5P 1 1 ln(P ) ln(5 P ) = 5 5 1 dt 100

This is a logistic equation. The equation is separable:

t +C 100 t +C 100 P = D et/20 . 5P

Rewriting gives

ln

P 5P P =

t + C, 20

so by exponentiating we have

Solving for

gives

5 1 + D et/20

. (Alternatively, one could use the formula.)

y = 2xy + x3 .
This is rst-order linear; write it in the form The integrating factor is

= ex . So we have e y 2xe y=x e . 2 Integrating both sides (substitute u = x and then integrate by parts on the right-hand side) yields 2 1 1 2 2 ex y = (x2 + 1)ex + C , so that y = (x2 + 1) + Cex . 2 2 e
(2x) dx x2 x2 3 x2

y 2xy = x3 .

y + 4xy = 8xy 2 .
This is a Bernoulli substitution, with

n = 2, so that v = y 1 and v = y 2 y . 2 2 Scaling by y gives y y 4xy 1 = 8x, or v 4xv = 8x, which is now rst-order linear. 2 4x dx The integrating factor is e = e2x . 2 2 2x2 So we have e v 4xe2x v = 8xe2x . 2 2 2x2 Integrating both sides (substitute u = x on the right-hand side) yields e v = 2e2x + C , so
that

v = 1 + Ce2x

. Therefore,

y = (2 + Ce2x )1

(6x + y) + (x + 3y 2 )y = 0. y
This is an exact equation: with

M = 6x + y and N = x + 3y 2 we have My = 1 = Nx . 2 So there exists a function f (x, y) with fx = 6x + y and fy = x + 3y . 2 Taking the anti-partial with respect to x gives f = 3x +xy+g(y) for some g(y). Then fy = x+g (y) 2 2 3 must equal x + 3y so g (y) = 3y hence we can take g(y) = y .
Therefore the solutions are

3x2 + xy + y 3 = C

4y + 4y = 0.
This is a homogeneous linear equation with constant coecients. The characteristic equation is

r4 4r3 + 4r2 = 0,

which factors as

r2 (r 2)2 = 0,

with roots

r = 0, 0, 2, 2.
So the general solution is

y = A + Bx + Ce2x + Dxe2x
.

y 6y + 9y = e + e

2x

+e

3x

This is a non-homogeneous equation with constant coecients. We can use undetermined coecients.

The homogeneous equation is with roots

y 6y + 9y = 0

which has characteristic equation

r2 6r + 9 = 0

r = 3, 3.

So the general homogeneous solution is

yhom = C1 e

Now we look for a solution to the non-homogeneous equation: but there is an overlap (namely solution by

+ C2 xe . x 2x the rst guess is A1 e + A2 e + A3 e3x

3x

3x

e3x )

with the homogeneous solutions, so we must multiply that

in order to avoid an overlap.

So our trial solution is

ypar = A1 ex + A2 e2x + A3 x2 e3x . = =

We compute

ypar ypar
so

A1 ex + 2A2 e2x + 2A3 xe3x + 3A3 x2 e3x A1 ex + 4A2 e2x + 2A3 e3x + 12A3 xe3x + 9A3 x2 e3x

ypar 6ypar + 9ypar = 4A1 ex + A2 e2x + 2A3 e3x


whence

A1 =

1 , A2 = 1, 4

and

A3 =

1 . 2 1 x 1 e + e2x + x2 e3x 4 2 + C1 e3x + C2 xe3x


.

Hence the general solution is

ygen = ypar + yhom =

Note: One can also use variation of parameters to solve this problem. It will, of course, lead to the same general solution.

y + 4y = tan(x).
This is a non-homogeneous equation with constant coecients. We cannot use undetermined coecients because of the

tan(x)

term, so we use variation of parameters.

r2 + 4 = 0, with roots r = 2i, 2i. The general homogeneous solution is thus C1 cos(2x) + C2 sin(2x). We then take y1 = cos(2x) and y2 = sin(2x), and want to construct ypar = v1 y1 + v2 y2 where W1 (x) W2 (x) v1 = and v2 = , with W (x) W (x) cos(2x) sin(2x) = 2 cos2 (2x) + 2 sin2 (2x) = 2, W (x) = 2 sin(2x) 2 cos(2x) sin(x) 0 sin(2x) = tan(x) sin(2x) = W1 (x) = 2 sin(x) cos(x) = 2 sin2 (x), tan(x) 2 cos(2x) cos(x) sin(x) cos(2x) 0 = tan(x) cos(2x) = W1 (x) = (2 cos2 (x) 1) = sin(2x) tan(x). 2 sin(2x) tan(x) cos(x) 1 cos(2x) 1 1 2 Thus v1 = sin (x) = , and v2 = sin(2x) tan(x). 2 2 2 1 1 x 1 + sin(2x) and v2 = cos(2x) + ln(cos(x)). Integrating gives v1 = 2 4 4 2 x 1 1 1 Thus we have ypar = v1 y1 + v2 y2 = + sin(2x) cos(2x) + cos(2x) + ln(cos(x)) sin(2x). 2 4 4 2 x 1 We can cancel some terms to simplify this to ypar = cos(2x) + ln(cos(x)) sin(2x). 2 2 x 1 Finally, ygen = cos(2x) + ln(cos(x)) sin(2x) + C1 cos(2x) + C2 sin(2x) . 2 2
The homogeneous equation is

y + 4y = 0

which has characteristic equation

y = 3y + 2z z = y + 2z 3 1 2 2

. The coecient matrix is

This is a rst-order linear system; we use the eigenvalue method. .

A =

We have

det(tI A) = = 1, 4.

t3 1

2 t2

= (t 3)(t 2) 2 = t2 5t + 4 = (t 1)(t 4),

so the

eigenvalues are

For

=1

we solve

3 1

2 2

a b

=1 1 1
.

a b

so that

3a + 2b a + 2b

a b

, or

b = a.

So a

basis for the

=1

eigenspace is

For

=4

we solve

3 1

2 2

a b 2 1

=4
.

a b

so that

3a + 2b a + 2b

4a 4b

, or

a = 2b.

So a

basis for the

=4

eigenspace is

We have the proper number of linearly-independent eigenvectors (namely, 2) so the general solution to the system is

y z
and

= c1 y(1) = 1,

1 1

et + c2

2 1

e4t ,

or

y = c1 et + 2c2 e4t z = c1 et + c2 e4t y(2).

(1) Given

y = sin(x + y)

nd an approximation of

The idea is to use one of the approximation methods (Euler's method, or one of the improvements to it). Here is the setup for the standard Euler's Method with a step size of We ll in all the the new

x-values,

and then the

y -values
1 1 0.909 0.0909

using the recursion 1.1 1.0909 0.814 0.0814 1.2 1.1723 0.696 0.0696

h = 0.1 and f (x, y) = sin(x + y). y , f (x, y), and h f (x, y) values one column at a time, generating yn = yn1 + h f (xn1 , yn1 ).
1.3 1.4 1.2983 0.429 0.0429 . 1.5 1.3412 0.296 0.0296 1.6 1.3708 0.170 0.0170 1.7 1.3878 0.054 0.0054 1.8 1.3932 -0.052 -0.0052 1.9 1.3880 -0.146 -0.0146 2.0 1.3734 -

x y f (x, y) h f (x, y)

1.2419 0.564 0.0564

The approximation thus gives

y(2) 1.37

Note: It is possible to solve this dierential equation explicitly for

y,

via a substitution: the result is

y = 2 cot1

2 1 x x+C

for an appropriate branch of the inverse cotangent, where

C=

1 cot(1) . 1 + cot(1)

This gives the (correct) value

y(2) 1.3375

(1,1) Find all initial conditions

solutions (in some region containing

y(a) = b for (a, b)):

which these dierential equations are guaranteed to have unique

y = (x y)1/3
The existence-uniqueness theorem for rst-order equations states that the initial value problem

y = f (x, y) with y(a) = b f derivative is continuous y


In this case,

has exactly one solution (on some interval containing on a rectangle containing

a)

if the partial

(a, b).

f (x, y) = (x y)1/3 (x, y),

so

and continuous for all

we see

1 f 2/3 = (x y)2/3 = . Since (x y) is dened y (x y)2/3 f that is dened and continuous provided the denominator is y
such that

nonzero: that is, as long as

x = y. a=b
.

Hence the pairs

(a, b) for which the IVP is guaranteed to have a unique solution are (a, b)
this equation can be solved by a substitution, yielding

Notes: If

a = b,

y=

2 x+C 3

3/2

x,

with

2 (a + b)3/2 a. If a = b the equation also has the solution y = 0 (among others). 3 1 1 y + y + 2y = 0 x x The initial value problem should have two conditions y(a) = b1 and y (a) = b2 in order to specify a unique solution. The correct answer to the problem is thus no pairs (a, b) because no initial condition y(a) = b contains enough information to specify a unique solution. C=

Partial credit was also given for the following answer:

The existence-uniqueness theorem for general linear equations states that the initial value problem

y (n) + Pn (x) y (n1) + + P2 (x) y + P1 (x) y = Q(x), with y(a) = b1 and y (a) = b2 has a unique solution (on some interval containing a) if Pn (x), , P1 (x) and Q(x) are continuous on an interval containing a. 1 1 Since the coecient functions are 1, , 2 , and 0, they are continuous everywhere except where x x x = 0. Hence the collections (a, b1 , b2 ) for which the IVP is guaranteed to have a unique solution are (a, b1 , b2 ) such that a = 0 .
Note: If obtain

a = 0, this equation is of Euler type and can be solved by the substitution v = ln |x|, to y = c1 cos(ln |x|) + c2 sin(ln |x|), for appropriate values of c1 and c2 determined by the initial conditions. If a = 0 then since the coecients are not dened when x = 0, the IVP cannot have a
solution.

(1) Find the Wronskian of

ln(x)

and

ln(x2 ),

for

x > 0.

Are the functions linearly independent?

The functions are

not independent , because

2 ln(x) = ln(x2 ).

The Wronskian is

W =

ln(x) 1 x

ln(x2 ) 2x x2

1 1 1 2 ln(x) ln(x2 ) = 2 ln(x) ln(x2 ) = 0 = 0. x x x x

Remark: After nding and that

x = e, to see that 2 ln(x) = ln(x2 ).

1 2 ln(x) ln(x2 ) , like x = 1 W , one can plug in some easy values of x to W = x W = 0 for each of those values of x; this was intended to lead to the observation 3x1 + x2 = 5 9
.

(2) Find all solutions to the system

x1 + x2 + 2x3 =

4x1 2x2 10x3 = 30


The procedure is to put this matrix in reduced row-echelon form, identify any free variables, and then write down the general solution. In matrix form we have

3 1 4

1 1 2

0 2 10

1 5 R1 R2 9 3 4 30

1 1 2

2 0 10

9 5 30

Now clear out the rst column:

1 3 4

1 1 2

2 0 10

9 1 R2 3R1 5 0 30 4

1 2 2

2 6 10

9 1 R3 4R1 22 0 30 0

1 2 6

2 6 18

9 22 . 66

Now clear out the second column, and then rescale the second row:

1 0 0

1 2 6

2 6 18

9 1 R 3R2 22 3 0 66 0

1 2 0

2 6 0

1 9 1 R2 2 22 0 0 0

1 1 0

2 3 0

9 11 0

Finally, we put it in reduced row echelon form:

1 0 0

1 1 0

2 3 0

9 1 R1 R2 11 0 0 0

0 1 0

1 3 0

2 11 . 0

So the system does have a solution (since the bottom row is not a contradiction), and we have one free variable,

z.

If we set

z=t

we obtain

y = 11 3t

and

x = 2 + t.

So the solutions of the systems are

x, y, z = 2 + t, 11 3t, t

2 (18) Let A = 0 1
power of

2 1 0

5 0 . 2

Compute the determinant, inverse, characteristic polynomial, eigenvalues,

eigenvectors (one in each eigenspace), diagonalization

D, conjugating matrix P

with

D = P 1 AP , and 2012th

A. A det(A) = 2 1 0 0 2 A + (1) 2 1 5 0 = 4 5 = 1
.

Determinant of

Expand down the rst column:

Inverse of

A [A|I]
and then row-reduce to the identity; the result will be

We write down the matrix

[I|A1 ]. 5 0 2

2 0 1

2 5 1 0 0 2 1 0 R3 2R1 0 1 0 2

So we see

1 0 2 0 0 1 1 0 2 0 0 1 0 (1)R1 ,R2 R1 R 0 1 0 0 1 0 0 1 0 0 3 0 1 0 1 0 0 0 1 2 2 5 2 2 5 1 0 2 4 0 0 1 1 0 0 1 0 2 2 0 0 1 R 2R3 R +2R2 0 1 0 1 0 1 0 0 1 0 3 0 1 0 0 1 0 1 2 1 2 2 0 0 1 1 0 2 0 0 1 1 2 4 5 1 that A = 0 1 0 . To verify this we can multiply A by this matrix, and we 1 2 2 1 0 0 0 1 0 A

indeed get the identity matrix (as we should).

Characteristic Polynomial of

The characteristic polynomial

Since

tI A = t+1 0

(t 2)
So

t2 0 1 0 t+2

p(t) is given by p(t) = det(tI A). 2 5 t+1 0 by expanding down the rst column we see that det(tI A) = 0 t+2 2 5 = (t 2)(t + 1)(t + 2) 5(t + 1) = (t + 1)(t2 + 1). +1 t+1 0
.

p(t) = (t + 1)(t2 + 1) = t3 + t2 + t + 1 p(t) = det(A tI)

Note: If one uses

instead, the resulting polynomial is

p(t) = t3 t2 t 1

Eigenvalues of

A p(t) = (t + 1)(t2 + 1).


and solving yields

The eigenvalues are the zeroes of the characteristic polynomial Setting

p() = 0 A

= 1, i, i

Eigenvectors of

2 5 a 0 0 0 b = 0 , so that a = c and b = c. Hence a basis For = 1 we want 0 1 c 0 1 for the = 1 eigenspace is 1 . 1 i2 2 5 a 0 i+1 0 b = 0 , so that b = 0 and a = (i + 2)c. Hence For = i we want 0 1 0 i+2 c 0 i 2 . 0 a basis for the = i eigenspace is 1 3 0 1

i 2 0 For = i we want 1
Hence a basis for the

5 a 0 b = 0 , so that b = 0 and a = (i + 2)c. 0 i + 2 c 0 i2 = i eigenspace is 0 . 1 2 i + 1 0

Diagonalization of

A A
are the maximal sizes, we see that

Since the eigenspaces of

is diagonalizable.

The diagonal entries of a diagonalization of

are be the eigenvalues of

A.

So, for example, we can take

1 D= 0 0

0 0 i 0 0 i

Note: Any ordering of the eigenvalues along the diagonal is acceptable.

Conjugating matrix for

A P
has columns given by independent eigenvectors of

The conjugating matrix

A.

So for the

given above, one

would be

1 P = 1 1

i 2 0 1

i2 0 1

2012th power of

A D = P 1 AP
so

We know that

Since

Then

D2012 = P 1 A2012 P , or P D2012 P 1 = A2012 . 1 (1)2012 0 0 = 0 0 i2012 0 D is diagonal we can compute D2012 = 0 0 0 (i)2012 1 0 0 A2012 = P D2012 P 1 = P P 1 = I = 0 1 0 . 0 0 1 x, y, z, w
in

0 1 0

0 0 = I. 1

(1,1) Let

be the vectors

R4

satisfying

x + y + z + w = 0.

Show that

is a subspace of

R4 . 0,
closed under addition, closed under

We check the three parts of the subspace criterion (contains scalar multiplication).

0, 0, 0, 0 satises the condition. So 0 is in S . [S2]: Suppose that x1 , y1 , z1 , w1 and x2 , y2 , z2 , w2 are in S : then x1 + y1 + z1 + w1 = 0 and x2 + y2 + z2 + w2 = 0. Then because (x1 + x2 ) + (y1 + y2 ) + (z1 + z2 ) + (w1 + w2 ) = (x1 + y1 + z1 + w1 ) + (x2 + y2 + z2 + w2 ) = 0 + 0 = 0, we see that the vector x1 , y1 , z1 , w1 + x2 , y2 , z2 , w2 is also in S . [S3]: Suppose that x1 , y1 , z1 , w1 is in S : then x1 + y1 + z1 + w1 = 0. Then because x1 + y1 + z1 + w1 = (x1 + y1 + z1 + w1 ) = 0 = 0, we see that the vector x1 , y1 , z1 , w1 is also in S . Note: Another acceptable answer is to observe that these vectors are the solutions to a homogeneous system of linear equations (in this case, the single equation x + y + z + w = 0). Thus they form a
[S1]: The zero vector subspace, because the solutions to any homogeneous system form a subspace.

Find a basis for

S. x + y + z + w = 0. 1 1 1 1
, which (rather obviously) is already in re-

We seek a basis for the set of solutions to the (single) homogeneous equation The corresponding coecient matrix is are three free variables

duced row-echelon form. The rst column is pivotal and the other three are nonpivotal: thus there

Setting

y , z , and w. y = t1 , z = t2 , and w = t3 gives x = t1 t2 t3 so the solutions t1 t2 t3 , t1 , t2 , t3 = t1 1, 1, 0, 0 + t2 1, 0, 1, 0 + t3 1, 0, 0, 1 .

are precisely

x, y, z, w =

Therefore, one basis of others.)

is

1, 1, 0, 0 , 1, 0, 1, 0 , 1, 0, 0, 1

(Note that there are many

(1,1) Decide whether the given collections of vectors are linearly independent:

1, 1, 1

1, 1, 1

, and

1, 1, 0

in

R3 . (1) 1, 1, 1 + 1 1, 1, 1 + 2 1, 1, 0 = 0, 0, 0 1 1 1 1 1 1 1 1 0 = 0.
.

Not independent : an explicit dependence is

Alternatively, one could see they are dependent by checking that

2, 1, 1, 1, 1

1, 2, 1, 1, 1

1, 1, 2, 1, 1

, and

1, 1, 1, 2, 1

in

R5 .
.

Independent : Suppose Then

a 2, 1, 1, 1, 1 + b 1, 2, 1, 1, 1 + c 1, 1, 2, 1, 1 + d 1, 1, 1, 2, 1 = 0, 0, 0, 0, 0

Subtracting the last equation from each of the other four gives

2a + b + c + d = 0, a + 2b + c + d = 0, a + b + 2c + d = 0, a + b + c + 2d = 0, and a + b + c + d = 0. a = 0, b = 0, c = 0, and d = 0.

Thus there is no nontrivial linear combination of these vectors giving the zero vector, so they are linearly independent.

Chapter 7 Notes David Seal Spring 2009 6. Eigenvalues and Eigenvectors A scalar1 C is an eigenvalue of the n n matrix A if there exists a non-zero vector v Rn such that Av = v. If we subtract v from both sides, the above equation is equivalent to (A I)v = 0 for some non-zero v. Since v = 0, we require A I to be non-invertible, otherwise the only solution to the last equation is v = 0. That is we can perform the following algorithm to nd every eigenvalue and eigenvector: (1) Set det (A I) = |A I| = 0 and solve for . This gives us every possible eigenvalue. (2) For each eigenvalue computed in step (1), solve (A I)v = 0 for v. Note that this is just a restatement of the equation Av = v. 7. Linear Systems of Dierential Equations We oftentimes run into matrices that do not have real-valued eigen-values. In such cases, we have two options: 1) give up and say this matrix has no eigenvalues, or 2) use complex numbers to factor the characteristic polynomial. The term imaginary is a bit of a misnomer because there is nothing imaginary about complex (imaginary) eigenvalues, so I will try to stick to the term complex. 7.3. (1) In this section, we are interested in solving the problem Ax = x

d where A is an n n square matrix. (Im going to use the notation dt x x because its easier to type. This is very common notation, but not used in our textbook). 7.3.1. Preliminary Theory. If {x1 , . . . , xk } are solutions to Ax = x, then so is

(2)

x(t) = c1 x1 (t) + + ck xk (t).

Our goal is to nd as many solutions xi (t) as possible, then take linear combinations to form the general solution. The function xi (t) = ei t vi solves Ax = x where i is any eigenvalue of A with eigenvector vi . You can see this if you just plug it into the equation: d Axi (t) = ei t Avi = ei t i vi = (ei t vi ). dt Taking linear combinations of the solutions xi , equation (??) becomes (3) x(t) = c1 e1 t v1 + + ck ek t vk . If we have k = n distinct eigenvalues, then this completely solves the problem. In section 7.5 we handle the case when we dont have enough eigenvectors from these eigenvalues.
1 Sometimes we take scalars from R, and sometimes we take them from C - the denitions and theory we learn is identical for either choice.

Note: that this theory doesnt care if our eigenvalues are real or complex - the above formula holds for any eigenvalue/eigenvector pair! 7.3.2. Complex Valued Solutions. Suppose we have a complex-valued solution y(t) = x1 (t) + ix2 (t). If we plus this into the dierential equation, we can see that the real and imaginary parts (Re(y) = x1 and Im(y) = x2 ) both solve the dierential equation. On one hand, Ay = A(x1 (t) + ix2 (t)) = Ax1 (t) + iAx2 (t). On the other hand, y = x1 + ix2 . Since Ay = y, the real and imaginary parts 2 must be equal, and hence x1 = Ax1 , x2 = Ax2

are two solutions of (1). So from one complex-valued solution the problem, we obtain two real-valued solutions. If we have a complex eigenvector v with associated eigenvalue = + i, then we know from equation (2) that the function y(t) = et v = et eit v = et (cos(t) + i sin(t))v is a solution to the problem. Hence we have two solutions (they will be linearly independent) from one complex eigenvalue/eigenvector pair by setting x1 (t) := Re(y) and x2 (t) := Im(y). 7.4. Second-Order Systems and Mechanical Vibrations. 7.5. Multiple Eigenvalue Solutions. Suppose we have a 3 3 matrix A with eigenvalues = 2 (mult. 2) and = 1 (mult. 1). With the eigenvalue = 1, we expect exactly 1 eigenvector v1 . With the eigenvalue = 2, we can have either one or two eigenvectors. If = 2 produces two eigenvectors v2 , v3 , then we say this eigenvalue is complete and the solution to (1) is given by equation (3): x(t) = c1 et v1 + c2 e2t v2 + c3 e2t v3 . This is the good case: A has a complete set of eigenvectors and hence A is also diagonalizable. If = 2 produces only one eigenvector (its going to produce at least one!), then we say this eigenvalue is defective. The number of missing eigenvalues is its degeneracy d = 1. The way to handle this case is to perform the following algorithm: (1) Solve (A 2I)2 u2 = 0 for u2 . If this is the zero matrix, then any vector u2 works, and so you can usually get away with choosing the easiest one: u2 = (1, 0, 0). (2) Set u1 = (A 2I)u2 . Doing this actually forces u1 to be an eigenvector since then (A 2I)u1 = (A 2I)2 u2 = 0. and hence Au1 = u1 . (3) Consider the two solutions x1 (t) = e2t u1 and x2 (t) = e2t (tu1 + u2 ). The general solution is given by equation (2) with x3 = et v1 .
2Complex numbers, just like vectors are equal if and only if each component is equal.

CHAPTER 1

First-Order Dierential Equations


1. Di Eqns and Math Models Know what it means for a function to be a solution to a dierential equation. In order to gure out if y = y(x) is a solution to the dierential equation, we plug this into the dierential equation and see if it solves it. For example, in algebra we may be faced with an equation like x2 5x + 10 = 2x. We can verify that x = 2 is a solution to this equation by plugging it in and verifying that it solves the equation. Here is my proof that x = 2 is the solution to the previous equation: LHS RHS = x2 5x + 10 = 22 5(2) + 10 = 4 10 + 10 = 4. = 2x = 2(2) = 4.

Since LHS = RHS, x = 2 is a solution. When verifying your solution, do NOT manipulate both sides of the equation. For example, the following is NOT a valid proof that x = 2 is a solution: 22 5(2) + 10 = 2(2) 4 10 + 10 = 4 4 10 = 4 10 6 = 6. For practice reviewing derivatives, you can look at 1, 4, 7, 10 and 17, 20, 23. 2. Integrals as General and Particular Solutions This section is intended to be more practice with integration. You should be able to solve 18 and 10 without thinking about them too much. Remember integration by parts is your friend: u dv = uv v du.

Try using this tool on problem 10. Another integration technique that is extremely useful is Partial Fractions. In 1 order to do the partial fractions setup for the function f (x) = x3 +3x2 we rst need to completely factor the denominator: f (x) = 1 1 = 2 . x3 + 3x2 x (x + 3)
1

Once you have all the factors, just set up the partial fractions. If any factor has degree higher than 1, you need to put a polynomial that has one less degree on the numerator: A Bx + C 1 . = + x2 (x + 3) x+3 x2 The velocity problems are covered in much more detail in section 2-3. Problems 1118 are good practice with integration and require you to know the relations between a, v and x. You should be able to do these without thinking about the problems too much. 3. Slope Fields and Solution Curves You have one theorem whose statement you need to memorize: Theorem 1. If f (x, y) and the partial derivative f (x, y) is continuous near y the point (x0 , y0 ), then the initial value problem dy = f (x, y), y(x0 ) = y0 dx has a unique solution on some (possibly small) interval containing x0 . If the hypotheses of the theorem are not satised, then anything goes. You may have one solution, innitely many solutions or no solutions whatsoever. For practice with this theorem, you may want to consider trying problems: 11, 12, 13, 14, 17, 18. As a variation on this theorem, you can drop the hypothesis on the continuity of f and obtain existence (possibly without uniqueness). See x http://en.wikipedia.org/wiki/Peano existence theorem. 4. Separable Equations and Applications The technique of separating variable allows us to solve a whole new class of problems that arent covered in the standard 222 course. Any problem from 128 will be extremely good practice for the exam. In addition, this section introduces a few new models not previously covered: (1) Population Growth: dP = kP ;. k > 0 is the growth constant. dt (2) Radioactive Decay: dN = kN ; k > 0 is the decay constant. dt (3) Newtons Law of Cooling: dT = k(A T ); k > 0 is a constant that has dt to do with how well insulated the system is. A constant is the ambient air temperature. Now that we know about phase diagrams, you should be able to sketch one of these for each of these equations. Note that the rst two equations are essentially the same, dy = const y, after youre supplied with enough initial conditions, youll be dt able to determine the correct sign for the constant. For practice, you may want to consider trying problems 3336, 43, 49, 65. 5. Linear First-Order Equations In this section we learned how to solve any linear rst order dierential equation. These equations are anything that can be written in what I call so called standard form: dy + P (x)y = Q(x). dx

In order to solve these equations, we use the INTEGRATING FACTOR METHODS. Memorize the method given on page 47. You should probably add in a step 0, which says write in standard form. The most important thing to remember is how to compute the integrating factor: = exp( P (x) dx). Practice problems 119 (every third) until you become comfortable with this method. Its VERY important to understand this method because it is one of the only two methods that we learn in the rst two chapters. The integrating factor method allows us to solve Mixing Problems. The standard setup is given by the picture on page 51. I think it helps to do the dimensional analysis to come up with the terms: dx = stu in stu out. dt salt salt Since dx = g sec , we know that stu in has units of g sec . Therefore if we take dt L [ri ] = g salt and multiply it by [ci ] = sec well have the correct units: L dx = ri ci ro co . dt The rate in, ri and ci is usually given to us. To nd co we need to compute this using what we know about the problem. The volume V (t) of the tank can usually x be explicitely computed, and hence co = V . The dierential equation becomes: dx x(t) = ri ci ro co = ri ci ro . dt V (t) For practice with this problem redo your homework problem 33, and try solving 36 and 37 on your own.

CHAPTER 2

Mathematical Models and Numerical Methods


1. Population Models Here we encounter a more sophisticated population equation thats called the logistics equation. It can be found on page 79, equation 3: dP = kP (M P ). dt One can think of this as an extension to the unbounded population growth model: dP 2 dt = kP by subtracting another term that is proportional to P . Thus, when P is 2 small, the kP term dominates, and when P is large, the P term dominates. From a phase diagram, you should be able to immediately see what the limiting population is. For review of partial fractions, you may want to consider looking at problems 18. For practice with some population models, try problems 9 and 21. 2. Equilibrium Solutions and Stability When were studying rst order dierential equations of the form dx = f (x) dt where f = f (x) is a function of x only, (note that t is the independent variable, and in this context, x is the dependent variable) we can oftentimes derive qualitative information about what happens as the solution evolves over time from a given initial condition x0 . Know how to nd a critical point. (set f (x) = 0, and solve for x). Know how to determine if your critical point is stable/unstable/semistable. For practice, try problems 1, 3, 5 and 9. Know how to analyze the stability and long term behavior for the logistics equation from section 2-1 as well as the variation that includes the harvesting parameter h 0: dP = kP (M P ) h. dt What qualitative behavior changes as h is increased? Is there a special point where h drastically changes the behavior of the solutions? 3. Acceleration-Velocity Models 3.1. Gravity. Newtons second law (abbreviated N2L) states that F = ma where F is the force applied on the object, m is the mass of the object, and 2 y a = dv = d 2 is the objects acceleration. Recall the relations between position, dt dt velocity and acceleration.
5

In general F is actually a vector, so it has velocity and magnitude. Since we have only ever worked in 1-dimension, the only options for F are positive or negative. When N2L is applied to gravity, we have |FG | = ma. One of the fundamental assumptions concerning gravity is |FG | = GMm where d is the distance between d2 the two objects, G is the gravitational constant and M , m are the masses of the two bodies. After dividing by m, we have GM |a(t)| = 2 , d so that in fact the acceleration due to the force of gravity is independent of the objects mass! When d R doesnt vary too much (i.e. for an object which stays near the surface of the earth), then we can say that GM g is a constant. Hence: d2 dv = g. dt The minus sign is to account for the direction of the force of gravity. When were using units of ft, we have that g = 32. When were measuring distance in terms of meters, g = 9.8. For practice, I suggest you review your homework problems: 25, 30. In addition I suggest looking at problems 3 and 20 for extra practice. 4. Numerical Approximation: Eulers Method Eulers method is the most basic and fundamental Numerical Technique for solving dierential equations. I should emphasize that when one studies any real world applied problem, one usually needs to resort to numerical techniques for solving a dierential equation. The best way to derive Eulers method is to start with what well call the discrete derivative. To see where this comes from recall the denition of the derivative: y(x + h) y(x) . y (x) = lim h0 h From here it makes sense that y (x) y(x+h)=y(x) when h is very small. Now if we h write yn+1 = y(xn + h); yn = y(xn ); xn+1 = xn + h, we have the approximation yn+1 yn . y (xn ) h If were trying to solve the equation dy = f (x, y) dx then we just set this discrete derivative equal to the right hand side function f : yn+1 yn = f (xn , yn ). h Solving this equation for yn+1 gives us the updating formula. Initialize y0 and x0 from the initial conditions. For n = 0, 1, 2, 3, ... do: yn+1 = yn + hf (xn , yn ), For practice try problems 3 and 6. xn+1 = xn + h.

PRACTICE PROBLEMS FOR EXAM 2

1. Matrix Operations and Inverses (1) Problems 3138 in Section 3.4 of Edwards and Penney. (2) Problems 31, 33, 34, 44 in Section 3.5 of Edwards and Penney. (3) Suppose A is a 2 1 matrix and B is a 1 2 matrix. If C = AB, show that C is not invertible. Proof. If A= Then the product AB is AB = a11 b11 a11 b12 a21 b11 a21 b12 . a11 a21 B= b11 b12 .

a21 If we perform the row-operation a11 R1 + R2 R2 we get the matrix

a11 b11 a11 b12 0 0

Since AB is row-equivalent to a matrix with a row of zeros, it is not invertible.

2. Determinants (1) Problems 47, 52, 60 in Section 3.6 of Edwards and Penney. (2) Let A be a matrix with dimensions (2k + 1) (2k + 1). This matrix is a skew-symmetric matrix, which means AT = A. Prove that det A = 0. Proof. Since det A = det AT , we have det A = det(A). Note the A means we are multiplying each row by 1, we can factor 1 out of det(A) once for each row. That is, det(A) = (1)2k+1 det(A) = det(A). Thus, det(A) = det(A) which is only possible if det A = 0.

3. Basic Vector Space Properties and Subspaces (1) Problems 24, 2831 in Section 4.2 of Edwards and Penney. (2) If V is a vector space verify that (v1 + v2 ) + (v3 + v4 ) = [v2 + (v3 + v1 )] + v4 for all vectors v1 , v2 , v3 , v4 in V . Use only the denition of a vector space.
Date: November 1, 2012.
Proof. This is an exercise in the use of associativity and commutativity.
(v1 + v2) + (v3 + v4) = [(v1 + v2) + v3] + v4 = [(v2 + v1) + v3] + v4 = [v2 + (v1 + v3)] + v4 = [v2 + (v3 + v1)] + v4.
Easy as pie.

(3) Let Rn denote the usual vectors of n-tuples of real numbers. However, we now define a new type of vector addition ⊕ and scalar multiplication ⊙:
v1 ⊕ v2 = v1 − v2,    c ⊙ v1 = −c v1,
where the right hand sides of the equations are defined in the usual way for vectors in Rn. Which properties of a vector space are satisfied by ⊕, ⊙?

Solution. We go through all the properties.
(a) v1 ⊕ v2 = v1 − v2 ≠ v2 − v1 = v2 ⊕ v1 in general. Thus commutativity for addition fails.
(b) Now we test for associativity:
u ⊕ (v ⊕ w) = u ⊕ (v − w) = u − (v − w) = (u − v) + w = (u ⊕ v) ⊕ (−w) ≠ (u ⊕ v) ⊕ w.
So the operation is not associative.
(c) There is no zero element for this operation. The usual zero vector is a right-side identity, u ⊕ 0 = u − 0 = u, but it is not a left-side identity: 0 ⊕ u = −u ≠ u.
(d) We cannot even talk about inverses, because there is no identity element for ⊕!
(e) Let b, c be scalars. Then
b ⊙ (c ⊙ u) = b ⊙ (−c u) = (−b)(−c u) = cb u = (−c)(−b u) = c ⊙ (b ⊙ u).
So ⊙ is commutative.
(f) There is an identity element for this operation, however it is −1 rather than 1: (−1) ⊙ u = −(−1)u = u.
(g) The first distributive law
a ⊙ (u ⊕ v) = a ⊙ (u − v) = (−a)(u − v) = (−a)u − (−a)v = a ⊙ u ⊕ a ⊙ v
holds.
(h) The second distributive law holds in the form
(a + b) ⊙ u = −(a + b)u = −a u − b u = a ⊙ u ⊕ [(−b) ⊙ u].

(4) Let R2 denote all pairs of real numbers. We define vector addition to be
(x, y) ⊕ (u, v) = (x + u, 0)
and scalar multiplication as
c ⊙ (x, y) = (cx, 0).
Is R2 with ⊕, ⊙ as the operations a vector space?

Solution. We go through all the properties.
(a) We first check commutativity. We have
(x1, x2) ⊕ (y1, y2) = (x1 + y1, 0) = (y1 + x1, 0) = (y1, y2) ⊕ (x1, x2)
for all vectors. Thus commutativity for addition is true.
(b) Now we test for associativity:
u ⊕ (v ⊕ w) = (u1, u2) ⊕ (v1 + w1, 0) = (u1 + v1 + w1, 0) = (u1 + v1, 0) ⊕ (w1, 0) = (u ⊕ v) ⊕ w.
So the operation is associative.
(c) There is no identity element for this operation:
(u1, u2) ⊕ v = (u1 + v1, 0) ≠ u
regardless of how we choose v, provided u2 ≠ 0. Since one of the properties has failed, these operations cannot form a vector space. If we had planned better we could have gone straight to property (c), then stopped once that property failed.

(5) Let F be the vector space of real-valued functions. Is the subset {f ∈ F | f(0) = f(1)} a subspace of F?

Solution. We test the two subspace properties.
(a) Let f, g be in the subset above. In other words, f(0) = f(1) and g(0) = g(1). Then
(f + g)(0) = f(0) + g(0) = f(1) + g(1) = (f + g)(1),
so f + g is also in the subset.
(b) Let c be a scalar and f a function in the subset. Then
(cf)(0) = c f(0) = c f(1) = (cf)(1),
so cf is in the subset.
Both of the subspace properties are satisfied, so the subset is a subspace.

(6) It is a fact that R^(n×n), the set of all n × n matrices, is a vector space with the usual definitions of A + B and cA for any real c.
(a) Is the set of all invertible matrices a subspace?
(b) Let B be a given matrix in R^(n×n). Is the set of all matrices A such that AB = BA a subspace of R^(n×n)?

Solution.
(a) No, this is not a subspace. It fails the first subspace property. In particular, the identity matrix I and −I are both invertible, but I + (−I) = 0, where 0 is the matrix of all zeros, which is not invertible.
(b) Yes. We verify both properties.
(i) Let A, C be matrices in the above subset, so AB = BA and CB = BC. Then
(A + C)B = AB + CB = BA + BC = B(A + C),
so A + C is in the subset.
(ii) Let c be a scalar. Then (cA)B = cAB = cBA = B(cA), so cA is also in the subset.
Because both properties are satisfied, the subspace theorem tells us this subset is a subspace.

4. Span and Bases

(1) Problems 23-32 of Section 4.3 in Edwards and Penney.
(2) Problems 24, 34, 35 of Section 4.4 in Edwards and Penney.
(3) Suppose two vectors u and v are linearly dependent. Prove that one of them is a scalar multiple of the other.

Proof. If one of these vectors is 0, this is trivially true. So we assume both u and v are nonzero vectors. Then if they are linearly dependent we have
c1 u + c2 v = 0    for c1, c2 ≠ 0. Then
v = −(c1/c2) u,
so the vectors are scalar multiples of each other.

(4) Find three vectors in R3 which are linearly dependent, but any two of them are linearly independent (between just the two of them).

Solution. Visually, this corresponds to 3 vectors lying in the same plane. So we can pick the vectors (1, 0, 0), (0, 1, 0), and (1, 1, 0). We must actually prove the solution is correct, so first we verify the vectors are mutually linearly independent.
Pair 1. First we show (1, 0, 0) and (0, 1, 0) are LI. If
c1 (1, 0, 0) + c2 (0, 1, 0) = (0, 0, 0),
then (c1, 0, 0) + (0, c2, 0) = (c1, c2, 0) = (0, 0, 0), so c1 = c2 = 0. Thus they are linearly independent.
Pair 2. Next we show (1, 0, 0) and (1, 1, 0) are LI. If
c1 (1, 0, 0) + c2 (1, 1, 0) = (0, 0, 0),
then (c1, 0, 0) + (c2, c2, 0) = (c1 + c2, c2, 0) = (0, 0, 0), so c2 = 0 and c1 + c2 = c1 = 0. Thus they are linearly independent.
Pair 3. Finally we show (0, 1, 0) and (1, 1, 0) are LI. If
c1 (0, 1, 0) + c2 (1, 1, 0) = (0, 0, 0),
then (0, c1, 0) + (c2, c2, 0) = (c2, c1 + c2, 0) = (0, 0, 0), so c2 = 0 and c1 + c2 = c1 = 0. Thus they are linearly independent.
Now we must show that together they are linearly dependent. If
c1 (1, 0, 0) + c2 (0, 1, 0) + c3 (1, 1, 0) = (0, 0, 0),
then (c1 + c3, c2 + c3, 0) = (0, 0, 0), so c1 = −c3 and c2 = −c3 with no restrictions on c3. Thus there are nonzero values c1, c2, c3 such that the vectors add to 0, so they are linearly dependent.

5. Inner Products

(1) Problems 26, 30, 32 of Section 4.6 in Edwards and Penney.
(2) Consider the vector space R^(n×n) of n × n matrices with real entries. Determine if (A, B) = tr(ABᵀ) is an inner product on R^(n×n), where tr A is the sum of the diagonal entries of A.

Proof. We must verify that (A, B) has the inner product properties.
(a) The trace is unchanged by taking the transpose, because the transpose leaves the diagonal unchanged: tr A = tr Aᵀ. Thus
(A, B) = tr(ABᵀ) = tr((ABᵀ)ᵀ) = tr((Bᵀ)ᵀ Aᵀ) = tr(BAᵀ) = (B, A).
So the first property is satisfied.
(b) The trace is also linear, so tr(A + B) = tr A + tr B. Then
(A, B + C) = tr(A(B + C)ᵀ) = tr(ABᵀ + ACᵀ) = tr(ABᵀ) + tr(ACᵀ) = (A, B) + (A, C),
so the second property holds.
(c) By the aforementioned linearity of the trace, we have
(cA, B) = tr(cABᵀ) = c tr(ABᵀ) = c(A, B).
The third property is satisfied.
(d) This one seems like a doozy. The expanded definition of matrix multiplication is
[AB]_jj = Σ_{i=1}^{n} a_ji b_ij.
In our case B = Aᵀ, so b_ij = [Aᵀ]_ij = a_ji. Thus the sum above is
[AAᵀ]_jj = Σ_{i=1}^{n} a_ji².
From this we get
(A, A) = tr(AAᵀ) = Σ_{j=1}^{n} [AAᵀ]_jj = Σ_{j=1}^{n} Σ_{i=1}^{n} a_ji² ≥ 0,
because each a_ji² ≥ 0. In fact, we only have (A, A) = 0 if each of those entries is zero, so the fourth and final property holds. Thus (A, B) = tr(ABᵀ) defines an inner product.

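A small numerical check of this computation: tr(ABᵀ) is just the sum of the entrywise products (the Frobenius inner product), so (A, A) is the sum of the squares of the entries of A. The matrices below are arbitrary examples of mine.

```python
import numpy as np

rng = np.random.default_rng(1)
A = rng.standard_normal((3, 3))
B = rng.standard_normal((3, 3))

print(np.isclose(np.trace(A @ B.T), np.sum(A * B)))       # tr(A B^T) equals the entrywise sum a_ij*b_ij
print(np.isclose(np.trace(A @ A.T), np.sum(A ** 2)))       # (A, A) is the sum of squared entries, hence >= 0
```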
EXAM 2 REVIEW
DAVID SEAL

3. Linear Systems and Matrices

3.2. Matrices and Gaussian Elimination. At this point in the course, you all have had plenty of practice with Gaussian elimination. Be able to row reduce any matrix you're given. My advice to you concerning not making mistakes is the following:
(1) Avoid fractions! Use least common multiples rather than deal with them. Computers don't make mistakes adding fractions; we do.
(2) Take small steps. When I do these problems, you will never see me write down an operation like 2R1 + 3R2. Instead I would break this up into three steps: 1.) R2 = 3R2, 2.) R1 = 2R1, 3.) R2 = R1 + R2. In fact you can kind of cheat and do both operations 1 and 2 at once. This isn't an elementary row operation, but it doesn't hurt to do that.
(3) Use back substitution after you have your matrix in row echelon (triangular) form. You don't need your matrix in reduced row echelon form to find a solution, just row echelon form.
What I mean by this third piece of advice is the following. Suppose you've already performed the following row operations:
[A | b]  -- row ops -->  ( 4 2 0 | 0 )
                         ( 0 6 1 | 1 )
                         ( 0 0 2 | 2 ).
This is now in row echelon form, but not reduced row echelon form (see next section) because some off-diagonal entries are nonzero. But as far as we care, this is enough information to solve for all the variables. Write down the equation described by the 3rd row: 2z = 2, so z = 1. Then write down the equation described by the 2nd row: 6y + z = 1, and solve for y. (A short back-substitution sketch in code follows below.)
For practice see problems 11, 12 and 23, 24. Practice enough of 11-18 until you don't have to think about doing these problems; it just becomes mechanical.

3.3. Reduced Row-Echelon Matrices. The most important theorem from this section is
Theorem 1 (The Three Possibilities Theorem). Every linear system has either a unique solution, infinitely many solutions or no solutions.
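To make the back-substitution step concrete, here is a minimal sketch that solves the row-echelon system displayed above. The function itself is my own illustration, not something from the review.

```python
import numpy as np

def back_substitute(U):
    """Solve an upper-triangular augmented system [U | b] by back substitution."""
    n = U.shape[0]
    x = np.zeros(n)
    for i in range(n - 1, -1, -1):
        x[i] = (U[i, -1] - U[i, i + 1:n] @ x[i + 1:n]) / U[i, i]
    return x

aug = np.array([[4.0, 2.0, 0.0, 0.0],
                [0.0, 6.0, 1.0, 1.0],
                [0.0, 0.0, 2.0, 2.0]])
print(back_substitute(aug))   # [0, 0, 1]: z = 1, then 6y + z = 1 gives y = 0, then 4x + 2y = 0 gives x = 0
```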
Date: Updated: March 29, 2009.
Exercise: If A is an n × n matrix, what are the possible outcomes of solving Ax = 0? Hint: what could A's reduced row echelon form look like?
Know how to put a matrix in reduced row echelon form. Usually if you're interested in solving an equation you don't do this, but this vocabulary term could come up and you should know it. For practice try problems 1-4.

3.4. Matrix Operations. Know if and when multiplication/addition is defined for matrices and how to do it. These are easy problems, but in case they show up on the exam you want to make sure you don't make any arithmetic mistakes! I'm a bit reluctant to suggest problems from this section, but you should be able to do any of 1-16 with your eyes closed. See problems 17, 18, 19. Do these look familiar now? Can you produce a vector/vectors that uniquely describe the kernel of a particular matrix for each of these problems? What are the dimensions of these linear subspaces? (Hint: the dimension is the number of free variables that are necessary to describe the space.)

3.5. Inverses of Matrices. The important definition.
Definition. An n × n matrix A (square matrices only!) is said to be invertible if there exists a matrix B such that AB = BA = I, where I is the n × n identity matrix.
Fact. Inverses, when they exist, are unique. This is a very nice feature to have when you define something! You wouldn't want to get in a fight with your friend over who found the better inverse. We denote the (unique) inverse matrix of A by A^(-1).
You DO NOT need to be prepared to find the inverse of a matrix using row operations! Prof. Bertrand said this type of problem is too long for an exam situation. One problem you might run into is: given two matrices, are they inverses of each other? To check this you need to know how to multiply two matrices together and the definition of the inverse. (A minimal check of this kind in code follows below.) The following problem is short enough to appear on an exam.
Exercise: If A and B are invertible matrices, is AB an invertible matrix? If so, what's the inverse matrix? Can you prove this?

3.6. Determinants. We'll first begin with an extremely important fact. If A is a square matrix, then A is invertible if and only if det(A) ≠ 0. You can add this to Theorem 7 given on page 193.
See your notes from discussion section about what I suggested for finding the determinant of a matrix. You can always do this through cofactor expansion; the thing to keep in mind is the checkerboard pattern that shows up when evaluating the sign of each coefficient. See problems 1, 2. I don't imagine you'll be asked to evaluate the determinant of any large (i.e. larger than 4 × 4) matrix.
One other trick to keep in mind is what sort of row operations you can do to a matrix when evaluating the determinant. See property 5 listed in the text. This says you're allowed to add a multiple of any row to another row, and this doesn't change the determinant. You have to be very careful not to misuse this property! It doesn't mean you can arbitrarily multiply a row by a number like we're used to doing with systems of equations. See problems 7, 9.
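For the "are these two matrices inverses of each other?" type of question, the check is literally the definition. A minimal sketch of my own, with arbitrarily chosen matrices:

```python
import numpy as np

def are_inverses(A, B, tol=1e-12):
    """Check AB = BA = I, which is the definition of an inverse."""
    I = np.eye(A.shape[0])
    return np.allclose(A @ B, I, atol=tol) and np.allclose(B @ A, I, atol=tol)

A = np.array([[1.0, 2.0], [3.0, 5.0]])
B = np.array([[-5.0, 2.0], [3.0, -1.0]])   # this happens to be the inverse of A (det A = -1)
print(are_inverses(A, B))                   # True
print(are_inverses(A, np.eye(2)))           # False
```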
Exercise: If A is an n × n matrix and λ is a scalar, express det(λA) in terms of λ and det(A). Hint: the answer is not λ det(A).

4. Vector Spaces

4.2. The Vector Space Rn and Subspaces. If we have a vector space V and a subset W ⊆ V, a natural question to ask is whether or not W itself forms a vector space. This means it needs to satisfy all the properties of a vector space that are listed on page 236 of your text. The bad news is this is quite a long list, but the good news is we don't have to check every property on the list, because most of them are inherited from the original vector space V. In short, in order to see if W is a vector space, we need only check if W passes the following test.

Theorem 2. If V is a vector space and W ⊆ V is a non-empty subset, then W itself is a vector space if and only if it satisfies the following two conditions:
(1) Additive Closure: if a ∈ W and b ∈ W, then a + b ∈ W.
(2) Multiplicative Closure: if λ ∈ R and a ∈ W, then λa ∈ W.

The statement of this theorem has the term non-empty as one hypothesis for the theorem to be true. In most applications of this theorem, we actually replace the statement non-empty with the requirement that 0 ∈ W. I.e., the theorem from above is equivalent to the following theorem.

Theorem 3. If V is a vector space and W ⊆ V, then W itself is a vector space if and only if it satisfies the following three conditions:
(1) Additive Closure: if a ∈ W and b ∈ W, then a + b ∈ W.
(2) Multiplicative Closure: if λ ∈ R and a ∈ W, then λa ∈ W.
(3) Non-Empty: 0 ∈ W.

Note that these closure properties are already on the long laundry list of properties we require for a set to be a vector space.

Example. Consider W := {a = (x, y) ∈ R2 : x = 2y}. Since W ⊆ R2, we may be interested in whether W itself forms a vector space. To answer this question we need only check two items:
(1) Additive Closure: an arbitrary element in W can be described by (2y, y) where y ∈ R. Let (2y, y), (2z, z) ∈ W. Then (2y, y) + (2z, z) = (2y + 2z, y + z) ∈ W since 2y + 2z = 2(y + z).
(2) Multiplicative Closure: we need to check that if λ ∈ R and a ∈ W, then λa ∈ W. Again, an arbitrary element in W can be described by (2y, y) where y ∈ R. Let λ ∈ R and (2y, y) ∈ W. Then λ(2y, y) = (2λy, λy) ∈ W since the first coordinate is exactly twice the second.
Note: it is possible to write this set as the kernel of a matrix. In fact, you can check that W = ker(A) for a suitable 1 × 2 matrix, for instance A = (1  −2). We actually have a theorem that says the kernel of any matrix is indeed a linear subspace.

Example. Consider W := {(x, y, z) ∈ R3 : z ≥ 0}. In order for this to be a linear subspace of R3, it needs to pass two tests. In fact, this set passes the additive closure test, but it doesn't pass multiplicative closure! For example, (0, 0, 5) ∈ W, but (−1) · (0, 0, 5) = (0, 0, −5) ∉ W.
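The failing closure check in the last example is easy to mirror in code. A minimal sketch of my own, not part of the review:

```python
import numpy as np

def in_W(v):                       # W = {(x, y, z) in R^3 : z >= 0}
    return v[2] >= 0

u = np.array([0.0, 0.0, 5.0])
v = np.array([1.0, 2.0, 3.0])
print(in_W(u) and in_W(v) and in_W(u + v))   # True: this particular sum stays in W
print(in_W(-1.0 * u))                         # False: a scalar multiple leaves W, so W is not a subspace
```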
Definition. If A is an m × n matrix, we define ker(A) := {x ∈ Rn : Ax = 0}. This is also called the nullspace of A. Note that ker(A) lives in Rn.
Definition. If A is an m × n matrix, we define Image(A) := {y ∈ Rm : y = Ax for some x ∈ Rn}. This is also called the range of A. Note that Image(A) lives in Rm.
Theorem 4. If A is an m × n matrix, then ker(A) is a linear subspace of Rn and Image(A) is a linear subspace of Rm.
Exercise: show this theorem is true.
In lieu of section 4.4, I think likely candidates from this section will be proving that a set of elements is not a vector space. For example see problems 7, 9, 10, 13. Problems 15-22 serve as excellent practice leading up to section 4.4, but you should get the gist after doing problems from that section.

4.3. Linear Combinations and Independence of Vectors. If we have a collection of vectors {v1, v2, ..., vk}, we can form many vectors by taking linear combinations of these vectors. We call this space the span of a collection of vectors, and we have the following theorem:
Theorem 5. If {v1, v2, ..., vk} is a collection of vectors in some vector space V, then
span{v1, v2, ..., vk} := {w : w = c1 v1 + c2 v2 + ... + ck vk, for some scalars ci ∈ R}
is a linear subspace of V.
For a concrete example, we can take two vectors v1 = (1, 1, 0) and v2 = (1, 0, 0), which both lie in R3. Then the set W = span{(1, 1, 0), (1, 0, 0)} describes a plane that lives in R3. This set is a linear subspace by the previous theorem. In fact, we can be a bit more descriptive and write W = {(x, y, z) ∈ R3 : z = 0}.
If we continue with this example, it is possible to write W in many other ways. In fact, we could have written W = span{(1, 1, 0), (1, 0, 0), (5, 1, 0)} = span{(10, 1, 0), (2, 1, 0)}. These examples illustrate the fact that our choice of vectors need not be unique. What is unique is the least number of vectors that are required to describe the set. In fact this is so important we give it a name, and call it the dimension of a vector space. This is the content of section 4.4. In our example, dim(W) = 2, but right now we don't have enough tools to show this.
In order to make this statement precise, we need to introduce the following important definition.
Definition. Vectors {v1, v2, ..., vk} are said to be linearly independent if, whenever
c1 v1 + c2 v2 + ... + ck vk = 0
for some scalars ci, it must follow that ci = 0 for each i.
OK, so definitions are all fine and good, but how do we check if vectors are linearly independent? The nice thing about this definition is that it always boils down to solving a linear system.
Example. As a concrete example, let's check whether the vectors {v1, v2} are linearly independent, where v1 = (4, 2, 6, 4) and v2 = (2, 6, 1, 4).
We need to solve the problem c1 v1 + c2 v2 = 0. This reduces to asking what are the solutions of
c1 (4, 2, 6, 4) + c2 (2, 6, 1, 4) = (0, 0, 0, 0).
We can write this problem as a matrix equation Ac = 0, where A is the 4 × 2 matrix whose columns are v1 and v2 and c = (c1, c2), and solve it using Gaussian elimination:
( 4 2 | 0 )                  ( 1 0 | 0 )
( 2 6 | 0 )  -- row ops -->  ( 0 1 | 0 )
( 6 1 | 0 )                  ( 0 0 | 0 )
( 4 4 | 0 )                  ( 0 0 | 0 ).
Thus c1 = c2 = 0 is the only solution to this problem, and so these two vectors are linearly independent.
To demonstrate that a collection of vectors is not linearly independent, it suffices to find a non-trivial combination of these vectors and show they sum to 0. For example, see Example 6 in the textbook.
We have another method for checking if vectors are linearly independent. This method is more complicated to apply, so I encourage you to become familiar with using Gaussian elimination (row operations) for checking independence. But to be complete, I'll include this other method as well. Essentially what happened in this last problem was that we were able to do row operations to a matrix and get the identity matrix in a part of it. Being row equivalent to the identity matrix is equivalent to being invertible, and this is equivalent to having a non-zero determinant. What makes this tricky is deciding which part of which matrix we are considering, because determinants are only defined for square matrices. We'll do the special case first.

Theorem 6 (Independence of n Vectors in Rn). The n vectors {v1, v2, ..., vn} are linearly independent if and only if det(A) ≠ 0, where (as usual) A is the square n × n matrix given by putting the vectors vi in as columns.

Note: this theorem currently only applies to a collection of n vectors in Rn, but we can easily extend it to other collections of vectors. We'll do this in a minute. What this theorem doesn't give us is which combination gives us a sum that's 0. I do think seeing a proof of this is instructive. It tells you why we care about even taking determinants here, as well as being a good illustration of how one goes about checking for independence.
Proof (if det(A) ≠ 0, then the vectors are independent). Suppose c1 v1 + ... + cn vn = 0 for some scalars ci. Then as usual, this leads to solving the system Ac = 0, where A is the n × n matrix with columns v1, v2, ..., vn and c = (c1, c2, ..., cn). Since det(A) ≠ 0, we know A is invertible, which also means A is row equivalent to the identity matrix. This means we can do row operations to the augmented matrix [A | 0] and turn the system into [I | 0], the identity matrix augmented with a column of zeros. Evidently c1 = c2 = ... = cn = 0, so the vectors are linearly independent.
To summarize. Know the definition of linear independence. Know that checking for independence always results in asking what are the solutions to the equation Ac = 0. If c = 0 is the only solution, then they are independent, and if there's a non-zero c that solves this, they are dependent. Finding c gives you the coefficients c1, c2, ... that demonstrate linear dependence. Gaussian elimination (row operations) is a method that will give you the coefficients.
Exercise: Is the set of vectors {(0, 0, 1), (1, 0, 1), (0, 0, 0)} linearly independent?
Exercise: Is the set of vectors {(1, 1, 1), (1, 5, 1), (10, 17, 0), (0, 1, 0)} linearly independent? Hint: there is a one line solution to this, or you can write down the full linear system and solve it.

5. Bases and Dimension for Vector Spaces

We'll begin with the major definition of the section.
Definition. A collection of vectors {v1, v2, ..., vn} is a basis for a vector space V if
(1) {v1, v2, ..., vn} are linearly independent;
(2) V = span{v1, v2, ..., vn}.
Fact: The number of vectors in any basis for a finite dimensional vector space is unique. We define dim(V) = n where n is the number of vectors in a basis.
Theorem 7. If dim(V) = n and S = {v1, v2, ..., vn} is a collection of n linearly independent vectors, then S is a basis for V.
Fact: dim(Rn) = n. This is a deep, non-trivial result that depends on the previous theorem! Why is this true? For R3, we have the basis S = {(1, 1, 1), (1, 1, 0), (1, 0, 0)}, and there are 3 vectors here.
I'm guessing a problem very similar to your homework problems will show up with high probability. All of your problems asked you to find a basis for the kernel of a matrix. Know how to do this (a minimal sketch of the computation appears below).
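Since the homework asks for a basis of the kernel of a matrix, here is a minimal sympy sketch of that computation. The matrix is an arbitrary example of mine, not one of the assigned problems.

```python
from sympy import Matrix

A = Matrix([[1, 2, 1, 0],
            [2, 4, 0, 2]])
print(A.rref())          # reduced row-echelon form and the pivot columns
for v in A.nullspace():  # basis vectors for ker(A) = {x : Ax = 0}
    print(v.T)           # two free variables here, so dim(ker A) = 2
```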
Practice 12-20 until you know how to do this! 5.1. Good Luck!

Lecture Notes of Math 320 , Fall 2012


Bing Wang

September 5th, 2012; Differential Equation

An equation relating an unknown function and one or more of its derivatives is called a differential equation. Notation: use a prime or a dot to denote derivatives.
Example 1.1. The movement of a free falling body: x'' = −g.

Example 1.2. Harmonic oscillator. When the body is displaced from its equilibrium position, x = 0, it experiences a restoring force F proportional to the displacement x: F = −kx for some positive k. This gives
x'' + kx = 0.

Example 1.3. Newton's law of cooling. The time rate of change of the temperature T(t) of a body is proportional to the difference between T and the temperature A of the surrounding medium:
dT/dt = −k(T − A).

Example 1.4. The time rate of change of a population P(t) with constant birth and death rates is described by the following equation:
dP/dt = kP.

Terminology: order. The order of a differential equation is the order of the highest derivative that appears in it. Check the order of the previous examples and of the following equation:
y''' + (y')^5 + y = 0.

Answer: 3.
Note that all the differential equations we will study this semester contain only one independent variable. Such a differential equation is called an ordinary differential equation, abbreviated ODE. If the number of independent variables is more than one, then the differential equation is called a partial differential equation, abbreviated PDE.
Return to Newton's cooling rule. Suppose a domain Ω is occupied by some material which is put in an environment with temperature A. Then temperature is a function of both position and time. We denote the temperature function by T = T(x, t). It satisfies the equation
∂T/∂t = k (∂²T/∂x1² + ∂²T/∂x2² + ∂²T/∂x3²).
Moreover, T satisfies the boundary condition T(x, t) = A whenever x is on the boundary of Ω. We will not discuss PDEs in what follows. As you know, ODEs are strongly related to linear algebra. In some sense, PDEs are also related to some advanced version of linear algebra.
Many equations have no solution at all. For example, (y')² = −1 has no real solution. For the equations with solutions, how do we compute them? If a solution exists, is the solution unique?
In many cases, the ODE can be solved formally: the solution can be written down explicitly. The methods of finding exact solutions include, but are not limited to, integrals, separation of variables, solution of linear equations, elementary substitution methods, etc. However, for most ODEs, it is impossible to find the precise solution. In this case, we will use numerical methods to find approximate solutions.
The equations that can be solved include:
1. Integrals: use the example of the free falling body.
2. Separable equations: use the example of Newton's cooling rule.
3. Linear first order equations: use the example of the population model.
4. Substitution methods and exact equations.
In this class and the next, we shall discuss simple examples of the first 3 types, in particular separable equations and linear first order equations. At the end, we discuss the solution of the harmonic oscillator. We don't know how to solve it yet; that's the reason why we need to study linear algebra.

September 7th, 2012; Separable Equation and First Order Linear Equation

Separable ODE

Example 2.1. Solve the initial value problem
dy/dx = 8xy,    y(0) = 2.

It is easy to see that
dy/y = 8x dx,    log y = 4x² + C1,    y = e^(4x² + C1) = C2 e^(4x²).
Now decide the value of C2. Putting in x = 0, y = 2, we have 2 = C2 e^0, so C2 = 2. Therefore, the final solution is y = 2e^(4x²).
Analyze the general rule:
dy/dx = H(x, y) = g(x)h(y) = g(x)/f(y).
Then we have f(y) dy = g(x) dx.
Note that we obtain a relationship between y and x; however, generally it is given implicitly. Then we return to solve the equation
dy/dx = (4 − 2x)/(3y² − 5),    y(1) = 3.
We can rewrite the equation in the form (3y² − 5) dy = (4 − 2x) dx, so y³ − 5y = 4x − x² + C. Now we decide the value of C. Putting x = 1, y = 3 into the last equation, we obtain 12 = 3 + C, so C = 9. Therefore, we obtain the solution
y³ − 5y + x² − 4x − 9 = 0.

First order linear ODE.

Example 2.2. Solve the initial value problem
dy/dx − y = (11/8) e^(−x/3),    y(0) = −1.

Multiply both sides by e^(−x). Applying the product rule in differentiation, we see that
Left = e^(−x) (dy/dx − y) = e^(−x) dy/dx + (d e^(−x)/dx) y = d/dx ( e^(−x) y ),
Right = (11/8) e^(−4x/3) = d/dx ( −(33/32) e^(−4x/3) ).
Integration on both sides yields
e^(−x) y = −(33/32) e^(−4x/3) + C,    y = −(33/32) e^(−x/3) + C e^x.
Now we decide the constant C. Putting in x = 0, y = −1, we have −1 = −33/32 + C, so C = 1/32. Therefore, the solution is
y = (1/32) ( e^x − 33 e^(−x/3) ).
From this example, we can see the general method to solve the first order linear ODE, which has the form

dy/dx + P(x) y = Q(x).
Let ρ(x) = e^(∫P(x)dx). Then we have
d/dx ( y e^(∫P(x)dx) ) = Q(x) e^(∫P(x)dx),
so
y e^(∫P(x)dx) = ∫ Q(x) e^(∫P(x)dx) dx + C,
and therefore
y(x) = e^(−∫P(x)dx) ( ∫ Q(x) e^(∫P(x)dx) dx + C ).

Let's go back to the examples to check this general method.

Example 2.3. Solve the initial value problem
x² dy/dx + x y = sin x,    y(1) = y0.

First, we can write this equation as a first order linear ODE:
dy/dx + y/x = sin x / x².
The integrating factor should be e^(∫(1/x)dx) = e^(log x) = x. Now the equation becomes
x dy/dx + y = sin x / x,    i.e.    d/dx (x y) = sin x / x.
Denote Si(x) = ∫ from 0 to x of (sin t / t) dt. Then we see that
x y = Si(x) + C.
Putting x = 1, y = y0 into the formula, we have y0 = Si(1) + C. Therefore, we have
y = Si(x)/x + (y0 − Si(1))/x = (1/x) ( Si(x) − Si(1) + y0 ).

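These hand computations can be cross-checked with a computer algebra system. Here is a minimal sketch for Example 2.2; the check itself is my own addition, not part of the notes.

```python
from sympy import Function, dsolve, Eq, exp, Rational, symbols

x = symbols('x')
y = Function('y')
# Example 2.2: y' - y = (11/8) e^(-x/3),  y(0) = -1
ode = Eq(y(x).diff(x) - y(x), Rational(11, 8) * exp(-x / 3))
sol = dsolve(ode, y(x), ics={y(0): -1})
print(sol)   # expected: y(x) = exp(x)/32 - 33*exp(-x/3)/32, matching the hand computation
```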
September 10th, 2012; Exact Equations and Substitution Methods

Exact Differential Equations

Let's see the example x dy + y dx = 0, i.e. d(xy) = 0, so xy = C. In general, if we have F(x, y(x)) = C, then taking derivatives yields
dF(x, y(x)) = 0,    i.e.    ∂F/∂x + (∂F/∂y)(dy/dx) = 0.

Example 3.1. Solve the initial value problem
2xy³ dx + 3x²y² dy = 0,    y(1) = 2.

Solution. The equation can be written as d(x²y³) = 0, so x²y³ = C. Putting in x = 1, y = 2, we obtain C = 8. So the solution is x²y³ = 8.
In general, suppose we have a differential equation in the form M(x, y) dx + N(x, y) dy = 0. Can we use the previous method to solve it? To be precise, can we find a function F = F(x, y) such that dF = M dx + N dy? In other words, is the differential equation exact? Actually, there is a necessary condition. If the differential equation is exact, then we have
M = ∂F/∂x,    N = ∂F/∂y,    and so    ∂M/∂y = ∂N/∂x = ∂²F/∂x∂y.
Now check that 8x dx + 9xy dy = 0 is not exact. The amazing thing is that ∂M/∂y = ∂N/∂x is also a sufficient condition if the domain where the differential equation is solved is not too complicated.

Theorem 3.1. Suppose Ω is a rectangular domain of the plane. Then the differential equation M(x, y) dx + N(x, y) dy = 0 is exact if and only if
∂M/∂y = ∂N/∂x
at each point of Ω.

Example 3.2. Solve the differential equation
(2x sin y + 3x²y) dx + (x³ + x² cos y + y²) dy = 0.

Solution. Note that M = 2x sin y + 3x²y and N = x³ + x² cos y + y². Check:
∂M/∂y = 2x cos y + 3x² = ∂N/∂x.
This equation is exact by Theorem 3.1. So there is a function F = F(x, y) such that
dF = (∂F/∂x) dx + (∂F/∂y) dy = (2x sin y + 3x²y) dx + (x³ + x² cos y + y²) dy = 0.
It follows that
∂F/∂x = 2x sin y + 3x²y,    so    F = x² sin y + x³y + f(y);
∂F/∂y = x³ + x² cos y + y²,    so    F = x³y + x² sin y + y³/3 + g(x).
This forces f(y) = y³/3 + C and g(x) ≡ C. Therefore, the solution is
x³y + x² sin y + y³/3 + C = 0,
where C is an arbitrary constant.
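The exactness criterion and the recovered potential F can also be verified symbolically. A minimal sketch of my own for Example 3.2, not part of the notes:

```python
from sympy import symbols, sin, cos, diff, simplify

x, y = symbols('x y')
M = 2*x*sin(y) + 3*x**2*y
N = x**3 + x**2*cos(y) + y**2
print(simplify(diff(M, y) - diff(N, x)) == 0)    # True: the exactness criterion M_y = N_x holds

F = x**3*y + x**2*sin(y) + y**3/3                # candidate potential from the worked solution
print(simplify(diff(F, x) - M) == 0, simplify(diff(F, y) - N) == 0)   # both True
```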
Substitution Methods

Consider the equation dy/dx = (x + y)². It is not separable. Let u = x + y, where u is the new variable; then we have
du/dx = 1 + dy/dx = 1 + u².
This new equation is separable and can be solved:
du/(1 + u²) = dx,    tan⁻¹ u = x + C,    u = tan(x + C).
Therefore, we have the solution of the original equation: y = u − x = tan(x + C) − x.
There are numerous substitutions based on smart observation. In this class, we focus on two basic substitutions: homogeneous equations and Bernoulli equations.

Homogeneous Equations

Consider the equation dy/dx = (y/x)² + 1. Let u = y/x; then y = ux and dy/dx = u + x du/dx. Therefore, we have
u + x du/dx = u² + 1,    x du/dx = u² − u + 1,    du/(u² − u + 1) = dx/x.
So we arrive at a separable equation, which can be solved.
In general, a differential equation in the form dy/dx = F(y/x) is called a homogeneous equation. For a homogeneous equation, the standard substitution is to let u = y/x. Then we have
x du/dx = F(u) − u,
a separable equation.

Example 3.3. Solve the initial value problem
x dy/dx = y + √(x² − y²),    y(1) = 0.

Solution. Dividing both sides by x gives us
dy/dx = y/x + √(1 − (y/x)²),
which is a homogeneous equation. Let u = y/x; we have
x du/dx + u = u + √(1 − u²),    du/√(1 − u²) = dx/x,    sin⁻¹ u = log x + C.
Putting in x = 1, y = 0, which is the same as x = 1, u = 0, we obtain C = 0. Therefore, the solution is y/x = u = sin log x, i.e., y = x sin log x.

Bernoulli Equations

A differential equation in the form
dy/dx + P(x) y = Q(x) yⁿ     (1)
is called a Bernoulli equation, where n is a constant satisfying n ≠ 0, 1.

September 12th, 2012; Slope Fields and Solution Curves

Continue the discussion of Bernoulli equation. Bernoulli equation can be written as yn Let u = y1n , then
du dx

dy + P(x)y1n = Q(x). dx

dy = (1 n)yn dx . Consequently, we have

1 du du + P(x)u = Q(x), + (1 n)P(x)u = (1 n)Q(x). n 1 dx dx

The last equation is rst order linear equation, which can be solved. Example 4.1. Solve the equation x dy + 6y = 3xy2 . dx

Solution Rewrite the equation as dy y + 6 = 3y2 , dx x which is Bernoulli equation with n = 2. Multiply both sides by y2 , we have y2 6 du 6 du 6 dy + = 3, + u = 3, u = 3, dx xy dx x dx x
6 x

where u = y1 . For this linear rst order equation, the integration factor is e ing this factor to both sides of the last equation implies

= x6 . Multiply-

d 6 3 3 (x u) = 3x6 , x6 u = x5 + C, u = x + Cx6 . dx 5 5 The solution is y= The meaning of slope eld:


dy Given the dierential equation dx = f (x, y), there is a simple geometric way to think about its solutions. At each point (x, y), the value of f (x, y) determines a slope f (x, y). Therefore, the dy solution curve of the dierential equation dx = f (x, y) is a curve in xy-plane whose tangent line at each point (x, y) has slope f (x, y). 3 5x

1 . + Cx6

This geometric viewpoint suggests a graphical method for constructing approximate solutions dy of the dierential equation dx = f (x, y). 8

dy Example 4.2. Construct a slope eld for the dierential equation dx = x y and use it to sketch an approximate solution curve that passes through the point (4, 4). Use this solution curve to estimate y(0).

It follows from the table and picture of page 21 of the text book. So y(0) = 0.9. Does solution of a dierential equation exist? Consider the example Consider the example
dy dx dy dx

= x2 . The initial value problem y(0) = 1 has no solution at all. = y 3 . There are at least two solution curves passing through (0, 0): y 0, 2 y= x 3
3 2 1

Is solution of dierential equation unique?

However, we do have a existence and uniqueness theorem for dierential equations. Theorem 4.1. Suppose f and f are continuous on some rectangle R in xy-plane that contains the y point (a, b) in its interior. Then, for some open interval I containing the point a, the initial value problem dy = f (x, y), y(a) = b dx has one and only one solution that is dened on the interval I. In this theorem, we dont know how large I is, we only know it is some interval containing a. Return to previous examples to illustrate this theorem. Mention the idea of the proof: transfer the dierential equation into integration equation. Give a sketchy proof if time permitted.

September 14th, 2012; Some Word Problems and Population Models

Radioactive decay. Let N(t) be the number of atoms of certain radioactive isotope at time t. It has been observed that N obey the following dierential equation dN = kN dt for some positive k, which depends on the particular radioactive isotope. Clearly, we have N = N0 ek . Half life is the time needed for half of the material to decay, which we denote by . Then 1 log2 N0 = N0 ek , k = log 2 = . 2 k For C 14 , the half-life is =
log 2 0.0001216

= 5700 years. 9

Example 5.1. A specimen of charcoal turns out to contain 63% as much C 14 as a sample of present-day charcoal of equal mass. What is the age of the sample? We have 0.63N0 = N0 ekt t = log 0.63 = 3800. 0.0001216

Example 5.2. A tank contains 1000 liters of a solution consisting of 100kg of salt dissolved in water. Pure water is pumped into the tank at the rate of 5L/s, and the mixturekept uniform by stirringis pumped out at the same rate. How long will it be until only 10kg of salt remains in the tank? First, we set up the dierential equation with initial conditions. Let y to be the number (in kg) of salt in the tank, we see that y = This equation can be solved easily: y = 100e0.005t . Let y = 10, we have 10 = 100e0.005t 0.005t = log 1 , t = 200 log 10 = 461(s). 10 5 y, 1000 y(0) = 100.

Note that in the simplest model, the population growth satises some equation which is the same as the radioactive decay: dP = P, dt where P = P(t) is the population of the world at time t. The solution is P(t) = P0 et . This solution is not very satisfactory since
t

lim P(t) = lim P0 et = ,


t

which is impossible since our world is limited. It is reasonable that the world population should have an upper bound. The following dierential equation describes the population which cannot tend to innity. dP = kP(M P), dt (2)

where both k and M are positive numbers. Clearly, if P > M, then the population is decreasing. Equation (2) is called Logistic equation.

10

Note that equation (2) is separable. Suppose P0 = M, then P M is a trivial solution. So we assume that P0 M. We have dP 1 1 = kdt, + dP = kMdt, P(M P) P MP log |P| log |M P| = kMt + C, P | = kMt + C, log | MP P = C1 ekMt . MP Plug in t = 0, P = P0 , we have P0 MP0 kMt MP0 ekMt P P0 ekMt P 1 + ekMt = e P= = . M P M P0 M P0 M P0 M + P0 (ekMt 1) Clearly, let t , we obtain limt P(t) = M. This M is called the carrying capacity of the environment. Quickly review Chapter 1 if time permitted.

September 17th, 2012; Population Models

Lets use the following example to recall the equation we learned in previous section. Example 6.1. Suppose that in 1885 the population of certain country was 50 million and was growing at the rate of 750, 000 people per year. Suppose also that in 1940 its population was 100 million and was then growing at the rate of 1 million per year. Assume that this population satises the logistic equation. Predict the population for the year 2000. The logistic equation is d P = kP(M P), dt with solution P(t) = M 1+
M P0

1 ekMt

(3)

The population can be calculated whenever P0 , k and M are known. The given conditions imply 0.75 = 50k(M 50), 1.00 = 100k(M 100) M = 200, k = 1 . 10000

11

We set t = 0 at year 1940 (Why dont we choose t = 0 at 1885?), then P0 = 100. Therefore, at 2000, when t = 60, we have P(60) = 200 1 + (2 1)e0.000120060 153.7.

The situation where Logistic equation applies: limited resources, competition, joint proportion(speard of disease). The joint proportion interpretation need more attention. The spreading speed of a desease is proportional to the chance of encounters among healthy people and infected people. Here is an important observation: the population growth is proportional to the encounters. Use this observation, we can deal with a dierent population model. Let P be the population of some unsophisticated animals. Suppose the death rate is a constant, the birth rate is proportional to the chance of encounters between males and females. In view of this assumption, we have the following equation dP P P = P + k0 ( ) ( ) = P + kP2 = kP P = kP (P M) , dt 2 2 k where M = . k Solve this equation 1 dP 1 = kdt, dP = kMdt, P(P M) PM P |P M| PM = ekMt+C0 , = CekMt . P P Plug in initial condition, we have 1 which in turn imply the general solution P= It is interesting to compare (4) with (3). Example 6.2. Consider an animal population P(t) that is modeled by the equation dP = 0.0004P(P 150). dt Find P(t) if (a) P(0) = 200; (b) P(0) = 100. In case (a), we have P(t) = 150 . 1 0.25e0.06t 12 M 1+
M P0

log

|P M| = kMt + C0 , P

M M kMt = 1 e , P P0

1 ekMt

(4)

At t log 4 from the left, we see that P(t) blows up. This time t = 0.06 population explosion happens. In case (b), we have P(t) = The solution exists always and lim P(t) = 0.
t

log 4 0.06

is the doomsday, when the

150 . 1 + 0.5e0.06t

Return to the general setting. According to (4), we see that if P0 < M, then the solution exists always and tends to zero as time goes to innity. If P0 > M, then the solution exists until the doomsday time: T =
log
P0 P0 M

kM

Draw the typical solution curves for both equations P = kP(M P) (Logistic equation) and P = kP(P M) (Explosion-Extinction Equation). Explain the stability of the solution P M. For the rst equation, P M is a stable solution. However, for the second equation, P M is not a stable solution. It is clear that one can predict the stability of solutions by writing down every solution explicitly. However, there are many equations whose solutions cannot be written down explicitly. However, the stability of solutions can also be analyzed in many cases.

September 19th, 2012; Equilibrium Solutions and Stability

We have seen the basic idea of stability of solutions last time. In this class, we shall dene the stability more rigorously. The dierential equation of the form dx = f (x) dt (5)

is called an autonomous rst-order dierential equation. Note that the independent variable t does not appear explicitly. We say x = c is a critical point of (5) if f (c) = 0. Clearly, x c is then a constant solution of (5). It is also called an equilibrium solution. Draw the phase diagram for each case. Example 7.1. Calculate the critical points of the following dierential equations. dx dt dx dt dx dt dx dt = x(x 1), = x(1 x), = x3 + 1, = e x 1, x = 0, x = 1. x = 0, x = 1. x = 1. x = 0.

13

Denition 7.1. A critical point x = c is said to be stable provided that, if the initial value x0 is suciently close to c, then x(t) remains close to c for all t > 0. More precisely, the critical point c is stable if, for each > 0, there exists > 0 such that |x(t) c| < for every t > 0 whenever |x0 c| < . The critical point x = c is unstable if it is not stable. Come back to the Logistic equation dx = x(1 x) and the Explosion-Extinction equation dt = x(x 1) to check the concept of stability. Draw pictures of typical solution curves.

dx dt

Phase diagram: A diagram which includes x-axis and the moving direction of x according to the position of x. Check stability on the Phase diagram: A point is stable if it is the converging point in the phase diagram. The point is unstable if it is not a converging point. Example 7.2. Use the phase diagram method to analyze the stability of the critical points of the equations in Example 7.1. Harvesting a logistic equation. Example 7.3. Solve the dierential equation dP = P(4 P) h dt whenever h = 3, 4, 5. Discuss the stability of critical points in each case. If h = 3, critical points are P = 1 and P = 3. We have dP dP 1 dP 1 = = dt, = dt, 2 P3 P1 P2 4P + 3 (P 1)(P 3) P 3 P0 3 2t P3 = 2dt, = e . log P1 P 1 P0 1 P0 3 2t P0 3 2t P 1 e e , =3 P0 1 P0 1 P= 3 1
P0 3 2t P0 1 e . P0 3 2t P0 1 e

The behavior of P depends on the initial value P0 . (a). P0 > 3. Note that 0 < and limt P(t) = 3. (b). 1 < P0 < 3. We have limt P(t) = 3.
P0 3 P0 1

< 1. Therefore, 1 < 0. Clearly 1


P0 3 2t P0 1 e

P0 3 2t P0 1 e

> 0 always. So P(t) always exists

P0 3 P0 1

P0 3 2t P0 1 e

> 0 always and P(t) always exists

(c). P0 < 1. Let T = 1 log P0 3 . Then 1 2 P0 1 whenever t < T , and limtT P(t) = .

< 0 whenever t < T . So the solution exists

14

If h = 4, critical points are P = 2. We have 1 dP dP 1 = dt, d(P 2)1 = dt, = P(4 P) 4 = (P 2)2 , = t, 2 dt P 2 P0 2 (P 2) 1 P0 2 . P=2+ =2+ 1 1 + (P0 2)t t + P 2
0

Therefore, if P0 > 2, then 1 + (P0 2)t > 1 > 0. This means the solution P(t) exists for ever and limt P(t) = 2. However, if P0 < 2, then the denomenator 1 + (P0 2)t approaches 0 as t 1 approaches T = 2P0 . Furthermore, limtT P(t) = . If h = 5, no critical points. we have dP dP = P2 + 4P 5 = (P 2)2 1, = dt, tan1 (P 2) = t + C, dt (P 2)2 + 1 P 2 = tan(C t), P = 2 + tan(C t), where C = tan1 (P0 2). Draw the solution curve for each case and mention the eect of parameters to the solutions of dierential equations. Draw the phase diagram for each case and analyze the stability of each critical point.

September 21st, 2012; Acceleration-Velocity Models

1. The simplest model,

dv = g = 9.8m/s2 . dt 2. There exists resistance. (a). Resistance is proportional to velocity.

dv = v g. dt This is a separable rst-order dierential equation. It can be solved as


g v+ dv g t g e . g = dt, log g = t, v(t) = v0 + v+ v0 + g As t , there is a limit nite speed, , which is the terminal speed. Integrate the veloclity function, we obtain the position function

y(t) = y0 +

v0 + g g (1 et ) t. 2

(b). Resistance is proportional to square of velocity. 15

dv = g v|v|. dt If v > 0, or the body is moving upward, we have dv = g 1 + v2 , dt g Using the substitution such that u2 = v2 , or u = v g
g.

Then du =

g dv.

It follows that

du g du = g(1 + u2 ), = gdt, tan1 u tan1 u0 = gt, u = tan tan1 u0 gt . 2 dt 1+u Recall that v =
g u.

Then g tan tan1 v0


g

v=

gt = g

g tan C1 gt , tan udu = sin u du = cos u

where we denote tan1 v0

by constant C1 for convenience. Note that

d cos u = log | cos u| + C. Integration of the velocity function gives us cos u y(t) = y0 + g 1 tan C1 gs ds = y0 0 cos(C1 gt) 1 = y0 + log . cos C1
s s

tan(C1
0

gs)d(C1 gs)

If v < 0, or the body is moving downward. In this case, we have dv = g + v2 , dt g v(t) = tanh(C2 gt), C2 = tanh1 cosh(C2 gt) 1 y(t) = y0 log . cosh C2 3. Escape Velocity(if time permitted). The movement of a body in the gravitational eld of a planet is described by the following simple model. We assume the direction of the velocity is parallel to the direction of the gravitational force. So dv GM = 2 . dt r 16

v0 . g

Note that dr = vdt. We have v GM 2GM dv dv2 1 1 = 2 , = 2 , v2 (r) v2 (r0 ) = 2GM( ). dr dr r r0 r r

Since v(r) > 0 always according to our requirement, we have v2 (r0 ) v2 (r) = 2GM( The number
2GM r0

1 1 2GM 1 1 ), v2 (r0 ) lim 2GM( ) = , v(r0 ) r r0 r r r0 r0

2GM . r0

is the escape velocity.

September 24th, 2012; Euler's Method

A general dierential equation cannot be solved explicitly. Therefore, numerical method are necessary for the solution of general dierential equation. How do you drive? You follow the instruction of the road sign. Actually, to draw a solution curve is similar to driving. We adjust the drawing direction according to the dierential equation. Suppose we have the equation dy = f (y, t). Starting from the point (t0 , y0 ), we have a solution dt curve y = y(t). How can we calculate the value of y(t0 + 1)? The simplest way is to set the driving direction at time t0 , then drive directly to the time t0 + 1 without adjusting direction. In this case, we approximate y(t0 + 1) by the value y0 + f (t0 , y0 ) 1 = y0 + f (t0 , y0 ). Of course, this approximation may be far away from the real precise value of y(t0 + 1). We divide the time 1 1 into N-pieces of equal length, with each length N . So t0 = t0 , t1 = t0 + 1 , , tN = t0 + 1. N

Now we will drive more carefully, at each time tk , we adjust the driving direction according to the road sign, or the direction given by the dierential equation. So we have 1 f (t0 , y0 ), N 1 y2 = y1 + f (t1 , y1 ), N 1 yN = yN1 + f (tN1 , yN1 ). N y1 = y0 + Then yN is a better approximation of y(t0 +1). The larger the N is, the smaller the error |yN y(t0 +1)| is. If we plot our driving record on the y t plane, we obtain a sequence of straight line segments, which are approximations of the real solution curves. The method described above is called Eu1 lers Method, the number N is called the step size. Generally, the step size could be any positive number h and the algorithm of Eulers method can be described as follows. 17

Given the initial value problem dy = f (t, y), dt y(t0 ) = y0 ,

Eulers method with step size h consists of applying the iterative formula yn+1 = yn + h f (tn , yn ), (n 0)

to calculate successive approximations y1 , y2 , y3 , to the true values y(t1 ), y(t2 ), of the solution y = y(x) at the points t1 , t2 , t3 , respectively. Example 9.1. Apply Eulers method to approximate the solution of the initial value problem dy = t + y, dt y(0) = 5,

(a) rst with step size h = 1 on the interval [0, 3] to approximate y(3). (b) then with step size h = 0.2 on the interval [0, 1] to approximate y(1). (a). We have t0 = 0, y0 = 5; t1 = 1, y1 = 5 + 1 (0 + 5) = 10; t2 = 2, y2 = 10 + 1 (1 + 10) = 21; t3 = 3, y3 = 21 + 1 (2 + 21) = 44. Therefore, y(3) is approximated by 44. (b). We have t0 = 0, t1 = 0.2, t2 = 0.4, t3 = 0.6, t4 = 0.8, t5 = 1.0, y0 = 5; y1 = 5 + 0.2 (0 + 5) = 6; y2 = 6 + 0.2 (0.2 + 6) = 7.24; y3 = 7.24 + 0.2 (0.4 + 7.24) = 7.24 + 1.528 = 8.768; y4 = 8.768 + 0.2 (0.6 + 8.768) = 10.6416; y5 = 10.6416 + 0.2 (0.8 + 10.6416) = 12.92992.

So y(1) is approximated by 12.92992. The error for the Eulers method: local error, accumulative error and round o error in the computation. The disadvantage of Eulers method: there exists example such that the appximation values do not converge as h, the step size, tends to zero. Example 9.2. Solve the intial value problem dy = 1 + y2 , dt What is limt y(t)? 2 18 y(0) = 0.

It is not hard to solve this equation to obtain that y(t) = tan t. Therefore, we have limt = . However, we can also use Eulers method to approximate the 2 solution curve. Let h = 2N for a very large N. One can see that y( ) can always be approximated 2 by yN = yN1 + (1 + y2 ). N1 2N No matter how large N is, the value of yN is always nite. So the dierence between yN and lim y(t) is innite.
t 2

10

September 26th, 2012; More on Euler's Method

The estimate of local error. Using h as step size, we apply Eulers method to the dierential equation dy = f (t, y), y(t0 ) = y0 . dt In the rst step, we want to nd the dierence between the approximation value y1 and the precise value y(t1 ). Clearly, we have y1 = y0 + h f (t0 , y0 ), Recall the mean value theorem of calculus. It states that given an arc between two endpoints, there is at least one point at which the tangent to the arc is parallel to the secant through its endpoints. Applying mean value theorem to the solution curve, we have y(t1 ) = y0 + hy (t0 + ), So the error term is |yactual (t1 ) yapproximation (t1 )| = |y(t1 ) y1 | = h|y (t0 + ) f (t0 , y0 )| = h| f (t0 + , y(t0 + )) f (t0 , y0 )| Note that f (t0 + , y(t0 + )) f (t0 , y0 ) = f (t0 + s, y(t0 + s))| s=1 s=0 =
0 1

0 h.

=
0

d f (t0 + s, y(t0 + s))ds ds f f y + ds t y t f f + f ds t y


1

=
0

=
0

f f + f ds. t y

19

If we know both the actual solution and the aprroximated solution locates in the compact rectangular region [a, b] [c, d], we can dene C= Then we have Local error Ch Ch2 . So the error in the rst step is bounded by Ch2 . If we want to approximate y(t0 + 1) by Eulers method with step size h, we need 1 steps. The total error is then expected to by bounded by h 1 Ch2 h Ch, i.e., |y(t0 + 1) yN | Ch,
1 where N = h . Because of (6), we say the error of Eulers method is of order h.

sup
[a,b][c,d]

f f | + | || f |. t y

(6)

In practice, we need more precise method, whose error is of order higher than h. We now introuce one of them: the improved Euler Method. Given the intial value problem dy = f (t, y), dt y(t0 ) = y0 ,

the improved Euler method with step size h constists in applying the iterative formulas un+1 = yn + h f (tn , yn ), 1 yn+1 = yn + h ( f (tn , yn ) + f (tn+1 , un+1 )). 2 Draw a picture to show this. Example 10.1. Apply Improved Eulers method to approximate the solution of the initial value problem dy = t + y, dt with step size h = 1 on the interval [0, 3]. We have t0 = 0, y0 = 5; t1 = 1, u1 = y0 + 1 (t0 + y0 ) = 10; y1 = y0 + 1 1 ((0 + 5) + (1 + 10)) = 5 + 8 = 13; 2 1 t2 = 2, u2 = y1 + 1 (t1 + y1 ) = 27; y2 = 13 + 1 ((1 + 13) + (2 + 27)) = 34.5; 2 1 t3 = 3, u3 = y2 + 1 (t2 + y2 ) = 71; y3 = 34.5 + 1 ((2 + 34.5) + (3 + 71)) = 34.5 + 55.25 = 89.75. 2 20 y(0) = 5,

The exact solution can be solved as dy y = t, dt d t (e y) = et t, et y 5 = dt


t 0

es sds = tet + 1 et , y = 6et t 1.

Therefore, y(3) = 6e³ − 3 − 1 ≈ 116.51. The approximation given by Euler's method is 44. Clearly, the improved Euler method is more accurate. In general, the error term of the improved Euler method is bounded by Ch². In other words, the error of the improved Euler method is of order h².
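The comparison above is easy to reproduce. Here is a minimal sketch of my own that implements the iterative formulas stated in this section for the example dy/dt = t + y, y(0) = 5 with h = 1; the printed values match the hand computations (44 for Euler, 89.75 for improved Euler).

```python
def euler(f, t0, y0, h, n):
    y = y0
    for k in range(n):
        y = y + h * f(t0 + k * h, y)
    return y

def improved_euler(f, t0, y0, h, n):
    y = y0
    for k in range(n):
        t = t0 + k * h
        u = y + h * f(t, y)                        # predictor step u_{n+1}
        y = y + h / 2 * (f(t, y) + f(t + h, u))    # corrector step y_{n+1}
    return y

f = lambda t, y: t + y
print(euler(f, 0, 5, 1, 3))            # 44, as in Example 9.1
print(improved_euler(f, 0, 5, 1, 3))   # 89.75, as in Example 10.1 (the exact value is about 116.51)
```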

11

September 28th, 2012; Linear Systems

Now we move to the second big topic in this course: Linear Algebra. Linear Algebra is the branch of mathematics concerning vector spaces, as well as linear mappings between such spaces. It originates from the solution linear equations in several unknowns. A linear equation of two variables is an equation in the form ax + by = c. Note that the solution set of the above equation is a straight line. This is the reason why this equation is called linear. A linear equation of three variables is an equation in the form ax + by + cz = d. Put several linear equations together, we obtain a linear system. The solution of a linear system is the intersection of the solution set of each equation in the system. Example 11.1. Solve the linear system of two variables. x + y = 1, 2x + y = 3.

Example 11.2. Solve the linear system of two variables. x + y = 1, 2x + 2y = 3.

Example 11.3. Solve the linear system of two variables. x + y = 1, 2x + 2y = 2.

21

These three systems correspond to the following three pictures: intersection of two lines, parrallel two lines, coincidence to two lines. The geomtric picture is helpful in solving system of linear equations. However, if the unknown variables number is very big, then the geometric picture becomes not very clear. So we need to study the solution in a more abstract way. Lets see how do we solve the following equation. Example 11.4. Solve the lienar system x + 2y + z = 4, 3x + 8y + 7z = 20, 2x + 7y + 9z = 23.

We repeatedly use several operations to simplify the system. These operations are called the elementary operations: 1. Multiply one equation by a nonzero constant. 2. Interchange two equations. 3. Add a constant multiple of one equation to another equation. Conceptually, the solution of the given linear system is very simple. We repeatedly apply the 3 elementary operations until the system arrives the form of x = , y = , z = . It is not surprising that the solution process of the previous example is the same as the solution process of the following system. u + 2v + w = 4, 3u + 8v + 7w = 20, 2u + 7v + 9w = 23. So the name of the variables are not important. The important things are the coecient numbers. For simplicity, we can store these numbers in a rectangle by order of the variables. This leads us to the coecient matrix 1 2 1 3 8 7 2 7 9 and augmented coecient matrix 1 2 1 4 3 8 7 20 2 7 9 23 22

of the given linear system. For augmented matrix, we obviously have 3 elementary row operations. Keep in mind that there is a one-to-one correspondence between linear system and its augemented matrix. So the 3 elementary row operations is the basis for solving a linear system. A solution can be read out directly if the augmented matrix reaches a very simple form. Return the previous example to show this.

12

October 1st, 2012; Gaussian Elimination and Reduced Row-echelon Matrices, I

In this class and the next, we focus on this problem: how to solve a linear system by the matrix method? As we know, the basic tools we can use are the three elementary row operations. Using these operations repeatedly, we can simplify a given augmented matrix to a very simply form, this very simple form is called the reduced echelon form, or reduced echelon matrix. Denition 12.1. A matrix is called a reduced row-echelon matrix, or a reduced row-echelon form, if the following properties are satised. If a row has nonzero entries, then the rst nonzero entry is 1, which is called the leading 1 of this row. If a column contains a leading 1, then all other entries in the column are 0. If a row contains a leading 1, then each row above contains a leading 1 on the further left. Lets justify if the given matrices are reduced echelon forms. 1 0 1 0 0 0 1 1 0 0 0 0 2 0 1 1 1 0 0 1 1 2 0 0 No! The second row has no leading 1.

No! On the column of the second row leading 1, there exists another nonzero term

No! The rst rows leading 1 is on the right side of the second rows leading 1.

0 0 1 1 0 2 0 1 3

Yes!

Why reduced-row echelon form is important. Because it is simplest and one can read the solution out directly from reduced-row echelon form. For example, the linear equation corresponding to the last matrix above is x = 1, y = 2, z = 3, 23

which is a linear system with an obvious solution. So our task to solve a linear system is to transform the augmented matrix to its reduced rowechelon form. Why can we do this? We are guanranteed by the following theorem. Theorem 12.1. Every matrix is row equivalent to one and only one reduced row-echelon matrix. Here, we say two matrices are row equivalent if one can be obtained from the other by nite steps of elementary row operations. Now we x notations for the elementary row operations: Multiply row p by constant c Interchange row p and row q Add c times row p to row q cR p , S WAP(R p , Rq ), cR p + Rq .

Using these notations, we study the following examples. Example 12.1. Solve the linear system 2x + 8y + 4z = 2, 2x + 5y + z = 5, 4x + 10y z = 1.

The augmented matrix is 2 8 4 2 2 5 1 5 4 10 1 1 It can be transformed into a reduced row-echelon form by the following process 2 1 2 8 4 2 1 R 1 4 2 1 4R1 +R3 1 2 1 2R1 +R2 1 4 2 5 1 5 2 5 1 5 0 3 3 3 0 4 10 1 1 0 4 10 1 1 4 10 1 1 2 1 1 1 R 1 4 2 1 1 4 3 3 6R2 +R3 1 4 2 R3 +R2 1 1 R2 3 1 1 0 1 1 1 0 1 1 1 0 0 1 0 0 3 9 0 0 6 9 3 0 0 1 3 1 4 0 5 1 0 0 11 2R3 +R1 0 1 0 4 4R2 +R1 0 1 0 4 . 0 0 1 3 0 0 1 3 From the reduced row-echelon form, we can read the solution out. x = 11, y = 4, z = 3. 24

4 2 1 3 3 3 6 9 3 4 2 1 1 0 4 0 1 3

13

October 3rd, 2012; Gaussian Elimination and Reduced Row-Echelon Matrices, II

The process to obtain the reduced row-echelon form is called the Gauss-Jordan Elimination. 1. Use the elementary row operations to transform the matrix into a matrix of shape upper triangular, and each nonzero row has a leading 1. 2. Use back substitution to obtain reduced row-echelon form. Example 13.1. Solve the linear system x + y + z = 1, 2x + 2y + 5z = 8, 4x + 6y + 8z = 14.

The augmented matrix is 1 1 1 1 2 2 5 8 4 6 8 14 One easily arrives 1 1 1 1 0 0 3 6 0 2 4 10 It seems we have trouble in the second row and second column: 0 can not be made into a leading 1. This diculty can be solved by swapping the second and third row. 1 1 1 1 1 R 1 1 1 1 1 R3 1 1 1 1 2 2 3 0 2 4 10 0 1 2 5 0 1 2 5 0 0 1 2 0 0 3 6 0 0 3 6 So we have arrived a matrix of upper triangular shape, each nonzero row start from a leading 1. Now we are able to use back-substitution to clear all the columns with leading one. This process is done from bottom to the top, from right to the top. This is the reason why do we call it back substitution. 1 1 1 1 2R3 +R2 1 1 1 1 R3 +R1 1 1 0 1 R2 +R1 1 0 0 0 0 1 2 5 0 1 0 1 0 1 0 1 0 1 0 1 0 0 1 2 0 0 1 2 0 0 1 2 0 0 1 2 See more complicated examples.


Example 13.2. Solve the linear system x3 x4 x5 = 4, 2x1 + 4x2 + 2x3 + 4x4 + 2x5 = 4, 2x1 + 4x2 + 3x3 + 3x4 + 3x5 = 4, 3x1 + 6x2 + 6x3 + 3x4 + 6x5 = 6.

Example 13.3. Solve the linear system x3 + x4 = 3, x1 + x2 = 1, x2 + x3 = 2, x1 + 3x2 + 3x3 + x4 = 10.

14 October 5th, 2012; Matrix Operations

A linear system is called consistent if it has at least one solution; otherwise it is called inconsistent. Check the consistency of the previous examples. How do we decide whether a linear system is consistent? Suppose A is the reduced row-echelon form of the augmented matrix of a linear system. Then the linear system is consistent if and only if A does not contain a row of the form (0, 0, ..., 0, 1). If a linear system is consistent, there are two possibilities.
1. Every variable column has a leading 1: there is a unique solution.
2. Some variable column has no leading 1: such a variable is called a free variable and can take an arbitrary value, so in this case the linear system has infinitely many solutions.
Then apply the above discussion to the study of homogeneous linear systems.
Introduce the matrix operations. Why do we need them? One important reason is simplicity of notation. Start from the inner product: the inner product of two vectors v and w can be written as the product of a row matrix v and a column matrix w. Then the simplest linear system can be written as a matrix product. For example, x + 2y + 3z = 1 can be written as Av = 1, where A = (1, 2, 3) and v = (x, y, z)^T.


Now we increase the number of equations. If we define the right relationship, the form Av = b will represent the whole system. For example, the linear system
x + 2y + 3z = 1,  2x + y + 3z = 2,  x + 3y - 5z = 3,  10x + y - 9z = 4
can be written as Av = b, where
A = [1 2 3; 2 1 3; 1 3 -5; 10 1 -9],  v = (x, y, z)^T,  b = (1, 2, 3, 4)^T.

Then let us give the formal definition of the matrix product. Note that the matrix product AB is defined if and only if the column number of A equals the row number of B. We say the matrix A is an n x m matrix if it has n rows and m columns. Therefore, if A is an n x m matrix and B is a p x q matrix, then AB is well defined if and only if m = p. Note that AB being defined does not mean BA is defined. Then show the matrix sum: this is very easy, two matrices can be summed if and only if they are of the same dimension, or the same shape. Then introduce the product of a matrix and a scalar. There is a special type of matrices whose row number equals the column number; such matrices are called square matrices. Give examples of A, B such that AB ≠ BA. Give examples of A, B such that AB = 0, yet A ≠ 0 and B ≠ 0.

Actually, there is one example satisfying both of the previous requirements. Let
A = [0 1; 0 1],  B = [1 1; 0 0].
Then we have
AB = [0 0; 0 0],  BA = [0 2; 0 0].
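A quick NumPy check of this example (an illustrative sketch, not part of the notes): it confirms that AB is the zero matrix while BA is not, so in particular AB ≠ BA.

import numpy as np

A = np.array([[0, 1],
              [0, 1]])
B = np.array([[1, 1],
              [0, 0]])

print(A @ B)   # [[0 0], [0 0]]  -- the zero matrix, even though A != 0 and B != 0
print(B @ A)   # [[0 2], [0 0]]  -- nonzero, so AB != BA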


15 October 8th, 2012; Matrix Operations, II
16 October 10th, 2012; Midterm Exam, I
17 October 12th, 2012; Review for Midterm Exam
18 October 15th, 2012; Inverse matrices
19 October 17th, 2012; Determinants

The meaning of the determinant: volume with orientation. For the matrix
[a b; c d],
the first column vector gives us a line t -> (at, ct), i.e. ay - cx = 0. The distance from this line to the point (b, d) is
|ad - bc| / √(a^2 + c^2),
and the length of the vector (a, c) is √(a^2 + c^2). It follows that the area spanned by (a, c) and (b, d) is
Area = (|ad - bc| / √(a^2 + c^2)) · √(a^2 + c^2) = |ad - bc|.
For 2 x 2 matrices we define
det A = ad - bc;
it has the geometric meaning of area with orientation. In dimension 3, give the cofactor expansion definition. Then explain that the expansion along the first row is the same as the expansion along the first column; this implies that det A = det A^T. Then explain, from the geometric point of view, how the determinant changes under the fundamental row transformations. Then use the reduced row-echelon form to write det A = det E_k · det E_{k-1} ··· det E_1. This then implies that det(AB) = det A · det B.
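The two determinant identities mentioned above, det A = det A^T and det(AB) = det A · det B, are easy to test numerically. A small NumPy sketch (illustrative only; the matrices are generated at random, they are not from the notes):

import numpy as np

rng = np.random.default_rng(0)
A = rng.integers(-3, 4, size=(3, 3)).astype(float)
B = rng.integers(-3, 4, size=(3, 3)).astype(float)

# det A = det A^T
print(np.isclose(np.linalg.det(A), np.linalg.det(A.T)))       # True
# det(AB) = det A * det B
print(np.isclose(np.linalg.det(A @ B),
                 np.linalg.det(A) * np.linalg.det(B)))        # True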

A matrix is invertible if and only if its reduced row-echelon form is the identity, if and only if det A ≠ 0. Calculate the determinants of the following examples:
det [2 5 4; 5 3 1; 1 4 5],   det [2 1 1 1; 0 1 2 0; 2 3 2 3; 0 3 3 3].

20 October 19th, 2012; Determinants, II

Use fundamental row (or column) transformations to transform a matrix into an upper-triangular matrix. In each transformation, record how the determinant changes.

21 October 22nd, 2012; The Vector Space and Subspaces

In this class, we study the explicit vector space R^3. A vector in R^3 is simply an ordered triple (x, y, z) of real numbers. For simplicity of notation, a vector is denoted by v. The sum of vectors and the multiplication of a vector by a scalar are defined in the obvious way. R^3 is the set of all vectors v = (x, y, z). A subset V ⊂ R^3 is called a subspace if the following properties are satisfied. (1) If v ∈ V, then kv ∈ V for arbitrary k ∈ R. (2) If u ∈ V and v ∈ V, then u + v ∈ V. By definition, if V ≠ ∅, then 0 ∈ V.

Example 21.1. Check whether the following sets are linear subspaces.
(a) V = {(x, y, z) | z ≥ 0}.
(b) V = {(x, y, z) | x + y + z = 1}.
(c) V = {(x, y, z) | x + y + z = 0}.
(d) V = {(x, y, z) | x + y + z = 0, x + 2y + 3z = 0}.
The sets V defined in (c) and (d) are linear subspaces. However, they differ geometrically in their dimensions. In order to make the notion of dimension rigorous, we need the concept of linear independence. Two vectors u and v are called linearly independent if and only if the linear system (in the variables a and b)
au + bv = 0          (7)

has the unique solution a = 0, b = 0. Two vectors u and v are called linearly dependent if and only if they are not linearly independent, i.e., there exists a solution of (7) with a ≠ 0 or b ≠ 0. Suppose u and v are linearly dependent; then au + bv = 0 has a nonzero solution. Without loss of generality, assume a ≠ 0; then
u = -(b/a) v.
This means that u is generated by v, so u is redundant.
Example 21.2. Check whether the following vectors are linearly independent.
(a) u = (1, 0, 1), v = (3, 0, 3).
(b) u = (2, 1, 3), v = (1, 2, 3).

Three vectors u, v and w are called linearly independent if and only if the linear system (in the variables a, b and c)
au + bv + cw = 0          (8)
has the unique solution a = 0, b = 0, c = 0. Three vectors u, v and w are called linearly dependent if and only if they are not linearly independent, i.e., there exists a solution of (8) with a ≠ 0, b ≠ 0 or c ≠ 0. Suppose u, v and w are linearly dependent; then au + bv + cw = 0 has a nonzero solution. Without loss of generality, assume a ≠ 0; then
u = -(b/a) v - (c/a) w.
This means that u is generated by v and w, so u is redundant.
Example 21.3. Check whether the following vectors are linearly independent.
(a) u = (1, 1, 0), v = (1, 1, 1), w = (1, 3, 2).
(b) u = (1, 2, 3), v = (1, 4, 9), w = (1, 8, 27).
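One convenient way to check linear independence in practice is to compute the rank of the matrix whose columns are the given vectors; the vectors are independent exactly when the rank equals the number of vectors. A short NumPy sketch (illustrative only), applied to the vectors of Example 21.3:

import numpy as np

def independent(*vectors):
    """Return True if the given vectors are linearly independent."""
    M = np.column_stack(vectors)
    return np.linalg.matrix_rank(M) == len(vectors)

# Example 21.3(a)
print(independent([1, 1, 0], [1, 1, 1], [1, 3, 2]))
# Example 21.3(b): independent (a Vandermonde-type set), so this prints True
print(independent([1, 2, 3], [1, 4, 9], [1, 8, 27]))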


22 October 24th, 2012; Linear combinations and independence of vectors

We discuss the general Euclidean space R^n. One can regard R^n as the collection of all n-vectors, or n-tuples, (x1, x2, ..., xn). Sometimes we will also write vectors as columns for convenience. One can easily define linear subspaces of R^n.
Definition 22.1. A set V in R^n is called a linear subspace of R^n if the following properties are satisfied. (a) If v ∈ V, then kv ∈ V for every k ∈ R. (b) If u, v ∈ V, then u + v ∈ V.
Note that the set {0} is a linear subspace and it is contained in every other non-empty linear subspace. Many interesting sets V appear as the solution set of a linear system. Not every solution set is a linear subspace. For example, let V = {(x1, x2, x3, x4) | x1 - x2 + x3 + 2x4 = 1}. It is not a linear subspace since 0 is not in V. However, if the linear system is homogeneous, then the solution set must be a linear subspace.
Theorem 22.1. Suppose A is an m x n matrix and v is a column vector in R^n. Let V = {v | Av = 0}. Then V is a linear subspace of R^n. V is called the kernel space of the matrix A.
The key of the proof is the following: if Av = 0, then A(kv) = 0 for an arbitrary real number k; if Au = 0 and Av = 0, then A(u + v) = 0.
Example 22.1. Find V = {v | Av = 0}, where v ∈ R^4 and A is the 3 x 4 matrix
A = [8 1 1 3; 1 3 10 5; 1 4 11 2].
It is easy to compute that
rref(A) = [1 0 -1 2; 0 1 -3 -1; 0 0 0 0].
As columns 3 and 4 have no leading 1s, x3 and x4 are free variables. So letting x3 = s, x4 = t, we have the solution
x1 = s - 2t,  x2 = 3s + t,  x3 = s,  x4 = t.

Now we write every solution as a vector:
(x1, x2, x3, x4)^T = s (1, 3, 1, 0)^T + t (-2, 1, 0, 1)^T.

So every solution vector is a linear combination of v1 = (1, 3, 1, 0)^T and v2 = (-2, 1, 0, 1)^T. In other words, the solution space is spanned by the vectors v1 and v2.
Definition 22.2. The vectors v1, v2, ..., vk are called linearly independent if the linear system
c1 v1 + c2 v2 + ··· + ck vk = 0
has only the solution c1 = c2 = ··· = ck = 0. They are called linearly dependent if they are not linearly independent.
Vectors v1, v2, ..., vk are linearly dependent if and only if one of them can be expressed as a linear combination of the others. On one hand, if some vi can be expressed as a linear combination of the others, we can easily see that v1, ..., vk are linearly dependent. For example, assume v1 = a2 v2 + a3 v3 + ··· + ak vk; then
-v1 + a2 v2 + a3 v3 + ··· + ak vk = 0,
so we have a nontrivial solution c1 = -1, c2 = a2, ..., ck = ak. It means that v1, ..., vk are linearly dependent. On the other hand, if v1, ..., vk are linearly dependent, then we have a solution {ci} with some ci nonzero. For simplicity, assume c1 ≠ 0; then
c1 v1 = -c2 v2 - ··· - ck vk,   i.e.   v1 = -(c2/c1) v2 - ··· - (ck/c1) vk.

So v1 can be expressed as a linear combination of the remaining vectors.
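A basis of a kernel space as in Example 22.1 can also be obtained mechanically with SymPy's nullspace method. The sketch below is illustrative only; the matrix used here is a small hypothetical example, not the one from the lecture.

from sympy import Matrix

# A hypothetical 3x4 matrix used to illustrate computing a basis of {v | A v = 0}.
A = Matrix([[1, 0, -1, 2],
            [0, 1, -3, -1],
            [1, 1, -4, 1]])

basis = A.nullspace()          # list of column vectors spanning the kernel
for v in basis:
    print(v.T)                 # prints (1, 3, 1, 0) and (-2, 1, 0, 1) as rows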

23 October 26th, 2012; Bases of Linear Spaces

The basis of a linear subspace V.
Definition 23.1. Vectors v1, ..., vk are called a basis of V if the following properties are satisfied. (a) v1, ..., vk are linearly independent. (b) V is spanned by v1, ..., vk.
The number of vectors in a basis is independent of the choice of basis.
Theorem 23.1. Suppose V has two bases v1, ..., vk and u1, ..., ul. Then k = l.

The point of the proof is to translate this into a problem of solving a linear system, using the definition of linear independence; one then obtains a contradiction if k ≠ l. Therefore, the number of vectors in a basis of V reveals an important property of the linear subspace V: it is exactly the smallest number of vectors one needs to span V. This number is called the dimension of V. Clearly, the dimension of a straight line is 1, the dimension of a plane is 2, and the dimension of R^n is n. For R^n, we define
e1 = (1, 0, 0, ..., 0),  e2 = (0, 1, 0, ..., 0),  ...,  en = (0, 0, ..., 0, 1).
One can easily check that e1, ..., en form a basis of R^n, so the dimension of R^n is n. Therefore, n vectors v1, ..., vn in R^n form a basis if and only if they are linearly independent.
Theorem 23.2. The vectors u = (u1, u2, u3), v = (v1, v2, v3) and w = (w1, w2, w3) are linearly independent if and only if
det [u1 v1 w1; u2 v2 w2; u3 v3 w3] ≠ 0.

Give a sketchy proof and then show an example: check whether the vectors
u = (1, 2, 3),  v = (2, 3, 4),  w = (3, 4, 5)
are linearly independent, and hence whether they form a basis of R^3. In general, we have the following theorem.
Theorem 23.3. Suppose v1, ..., vn are n column vectors in R^n. Then they form a basis of R^n if and only if det(v1, ..., vn) ≠ 0.

Note the determinant symmetry det A = det A^T.
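The determinant test of Theorem 23.2 is easy to run by machine. As an illustrative sketch (using the vectors of Example 21.3(b), which are not part of this section), one can check that the determinant is nonzero, so those three vectors form a basis of R^3:

import numpy as np

u = np.array([1, 2, 3])
v = np.array([1, 4, 9])
w = np.array([1, 8, 27])

# Columns of M are u, v, w; they form a basis of R^3 iff det(M) != 0.
M = np.column_stack([u, v, w])
print(np.linalg.det(M))        # 12.0 (nonzero), so u, v, w are independent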

24 October 29th, 2012; Second order linear equations

Study the special example of the harmonic oscillator: F = -kx, with the equation for the displacement x being
m d^2x/dt^2 + c dx/dt + kx = 0.

Note the following property of the solutions: if x1 and x2 are solutions, then for every pair of constants c1 and c2 the linear combination c1 x1 + c2 x2 is a new solution. A second order linear equation is a differential equation of the form
y'' + p(t) y' + q(t) y = f(t).
The equation is called homogeneous if f(t) ≡ 0.
Lemma 24.1. The solution set of a homogeneous second order linear equation is a linear space, i.e., if y1 and y2 are two solutions, then c1 y1 + c2 y2 is again a solution for every two constants c1 and c2.
The proof is to check the definition. If y1 and y2 are solutions, we have
y1'' + p(x) y1' + q(x) y1 = 0,   y2'' + p(x) y2' + q(x) y2 = 0.
Multiply the first equation by c1 and the second by c2; adding them together gives
(c1 y1 + c2 y2)'' + p(x)(c1 y1 + c2 y2)' + q(x)(c1 y1 + c2 y2) = 0,
which means c1 y1 + c2 y2 is a solution. Let us get some feeling from the following example.
Example 24.1. Check the solutions of the following differential equations:
y'' + y = 0,        y = c1 cos t + c2 sin t;
y'' - y = 0,        y = c1 e^t + c2 e^{-t};
y'' - 2y' + y = 0,  y = c1 e^t + c2 t e^t.

In order to show that y'' + y = 0 has solutions c1 cos t + c2 sin t, by Lemma 24.1 we only need to show that cos t and sin t are solutions. Indeed,
(cos t)'' + cos t = (-sin t)' + cos t = -cos t + cos t = 0,
(sin t)'' + sin t = (cos t)' + sin t = -sin t + sin t = 0.
It is expected that a second order linear equation has more than one solution. But how do we express all the solutions of a given equation? This is guaranteed by the following existence and uniqueness theorem.
Theorem 24.1. Suppose that the functions p, q and f are continuous on the open interval I containing the point a. Then, given any two numbers b0 and b1, the equation y'' + p(x)y' + q(x)y = f(x) has a unique solution on the entire interval I that satisfies the initial conditions y(a) = b0, y'(a) = b1.
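The verifications in Example 24.1 can also be done symbolically. The SymPy sketch below (illustrative only) substitutes each candidate solution into its equation and simplifies to 0.

import sympy as sp

t, c1, c2 = sp.symbols('t c1 c2')

checks = [
    (c1*sp.cos(t) + c2*sp.sin(t), lambda y: sp.diff(y, t, 2) + y),           # y'' + y
    (c1*sp.exp(t) + c2*sp.exp(-t), lambda y: sp.diff(y, t, 2) - y),          # y'' - y
    (c1*sp.exp(t) + c2*t*sp.exp(t),
     lambda y: sp.diff(y, t, 2) - 2*sp.diff(y, t) + y),                      # y'' - 2y' + y
]

for y, lhs in checks:
    print(sp.simplify(lhs(y)))   # each should print 0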


Note that one needs two initial conditions to fix the solution: the initial value and the initial derivative. This is natural: just think of Newton's equation F = ma = m x''. When F is known, one needs the initial position and velocity to predict the position at every time. If f ≡ 0, i.e. the equation is homogeneous, then for the initial conditions y(a) = 0, y'(a) = 0 the unique solution is y ≡ 0. Fix two constants c1 and c2. For the initial condition y(a) = 1, y'(a) = 0 there is a solution y1; for the initial condition y(a) = 0, y'(a) = 1 there is another solution y2. Therefore, the solution Y = c1 y1 + c2 y2 satisfies the initial condition Y(a) = c1, Y'(a) = c2. By Theorem 24.1, Y = c1 y1 + c2 y2 is the unique solution satisfying this initial condition.
The above argument can be generalized. Suppose y1, y2 are two solutions such that the vectors
v1 = (y1(a), y1'(a))^T,   v2 = (y2(a), y2'(a))^T
form a basis of R^2. Then every initial condition vector v = (y(a), y'(a))^T can be expressed as a linear combination of v1 and v2. For every solution Y we can write
(Y(a), Y'(a))^T = c1 v1 + c2 v2.
Clearly, the solution Z = Y - c1 y1 - c2 y2 satisfies the initial condition Z(a) = 0, Z'(a) = 0. Therefore, by the existence and uniqueness theorem, Z ≡ 0, and it follows that Y = c1 y1 + c2 y2. Recall that two vectors form a basis of R^2 if and only if the matrix they form has nonzero determinant. Define
W(y1, y2) = det [y1  y2; y1'  y2'].
The above argument shows that if W(y1, y2)(a) ≠ 0, then every solution can be expressed as a linear combination of y1 and y2. On the other hand, if W(y1, y2)(a) = 0, then v1, v2 are linearly dependent, so we may assume v2 = λ v1. Then the solution Z = y2 - λ y1 satisfies the initial condition Z(a) = 0, Z'(a) = 0, hence Z ≡ 0, which is the same as y2 = λ y1. Then we see that
W(y1, y2) = det [y1  y2; y1'  y2'] = det [y1  λy1; y1'  λy1'] = 0.
Therefore, we have checked the following statement.
Lemma 24.2. Suppose y1, y2 are two solutions of a homogeneous equation y'' + p(x)y' + q(x)y = 0. Then either W(y1, y2) ≠ 0 everywhere, or W(y1, y2) ≡ 0.
For the given examples, calculate their Wronskians.

25 October 31st, 2012; Second order linear equations, II

Calculate the solutions for the examples.
Example 25.1. Find solutions of the equation y'' - 2y' + y = 0.
Find solutions of the equation y'' - 2y' + (1 - ε^2)y = 0. Solve the auxiliary equation
k^2 - 2k + (1 - ε^2) = 0,   (k - 1 - ε)(k - 1 + ε) = 0.
So we have solutions y = c1 e^{(1+ε)t} + c2 e^{(1-ε)t}. In particular, there is a solution
y = (e^{(1+ε)t} - e^{(1-ε)t}) / (2ε)  ->  t e^t   as ε -> 0.

Therefore, it is not surprising that t e^t is a solution of y'' - 2y' + y = 0. In general, for y'' - 2ay' + a^2 y = 0, we have the general solution y = c1 e^{at} + c2 t e^{at}. How do we know we have obtained all the solutions? We have the following theorem.
Theorem 25.1. Suppose y1 and y2 are two solutions of the homogeneous second-order linear equation y'' + p(x)y' + q(x)y = 0 on an open interval I on which p and q are continuous. If W(y1, y2) ≠ 0 at some point, then every solution can be expressed as a linear combination of y1 and y2. In other words, y1 and y2 form a basis of the solution space.
Two functions f1, f2 are linearly independent over the interval I if the equality c1 f1 + c2 f2 ≡ 0 holds only for c1 = c2 = 0. Note that the right hand side is the zero function.

Recall that the Wronskian of two functions f1, f2 is defined as
W(f1, f2) = det [f1  f2; f1'  f2'].
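SymPy has a built-in wronskian helper, which makes Lemma 24.2 easy to check on the examples above. This sketch (illustrative only) computes W(cos t, sin t) and W(e^t, t e^t).

import sympy as sp

t = sp.symbols('t')

# Wronskians of the solution pairs from Example 24.1
print(sp.simplify(sp.wronskian([sp.cos(t), sp.sin(t)], t)))       # 1
print(sp.simplify(sp.wronskian([sp.exp(t), t*sp.exp(t)], t)))     # exp(2*t), never zero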

If f1, f2 are linearly dependent, then W(f1, f2) ≡ 0. Indeed, we can assume f2 = λ f1 without loss of generality, and then W(f1, λ f1) = 0 by direct calculation. We have
f1, f2 linearly dependent  =>  W(f1, f2) ≡ 0  =>(?)  f1, f2 linearly dependent,
where the second implication is questionable for general functions. If y1, y2 are solutions of a homogeneous equation, we have more information. The previous lemma shows:
y1, y2 linearly dependent  <=>  W(y1, y2) ≡ 0.
We also know that W(y1, y2) is either the zero function or a function that is never zero. So we have
y1, y2 linearly dependent  <=>  W(y1, y2)(a) = 0.

26 November 2nd, 2012; Higher Order Linear Equations

Look at the example y''' - 6y'' + 11y' - 6y = 0.

Find solutions and show that they are all the possible solutions. Generalize the previous discussion from 2nd order to higher order. For a linear differential equation of order n with continuous coefficients, we have an existence and uniqueness theorem for the solutions. Note that for an order n differential equation the initial conditions contain n terms: the initial value y(a), the initial derivative y'(a), ..., up to the initial (n-1)-st derivative y^{(n-1)}(a).

27 November 5th, 2012; Homogeneous Equations with constant coefficients

Suppose the characteristic equation of a differential equation is
r^n + p_{n-1} r^{n-1} + ··· + p1 r + p0 = 0.
Suppose it has roots r1 of multiplicity m1, r2 of multiplicity m2, ..., rq of multiplicity mq, with m1 + m2 + ··· + mq = n. Then every solution of the differential equation is of the form
c_{1,1} e^{r1 t} + c_{1,2} t e^{r1 t} + ··· + c_{1,m1} t^{m1-1} e^{r1 t} + ··· + c_{q,1} e^{rq t} + c_{q,2} t e^{rq t} + ··· + c_{q,mq} t^{mq-1} e^{rq t}.
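This recipe is mechanical, so it can be automated: find the roots of the characteristic polynomial with their multiplicities and emit the corresponding terms t^j e^{rt}. A small SymPy sketch (illustrative; the polynomial below is an arbitrary example, not one from the notes):

import sympy as sp

r, t = sp.symbols('r t')

# Characteristic polynomial r^3 - 4r^2 + 5r - 2 = (r - 1)^2 (r - 2)  (arbitrary example)
poly = r**3 - 4*r**2 + 5*r - 2

terms = []
for root, mult in sp.roots(poly, r).items():
    # a root of multiplicity m contributes e^{rt}, t e^{rt}, ..., t^{m-1} e^{rt}
    terms += [t**j * sp.exp(root*t) for j in range(mult)]

print(terms)   # basis terms: exp(t), t*exp(t), exp(2*t)  (order may vary)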


28 November 7th, 2012; Homogeneous Equations with constant coefficients, II

Recall the example of the harmonic oscillator:
x'' + (k/m) x = 0.          (9)
For simplicity, we assume k/m = 1 and obtain the general solution c1 cos t + c2 sin t by checking that cos t and sin t are linearly independent solutions of this second order homogeneous equation. Note that the characteristic equation is r^2 + 1 = 0, which has solutions r = ±i, where i is the imaginary number such that i^2 = -1. Therefore, formally, the solution of (9) is c1 e^{it} + c2 e^{-it}. Actually, this is true if we allow c1, c2 to be complex numbers. Recall that for any real number t we have the Taylor expansions
e^t = Σ_{k=0}^∞ t^k / k!,
cos t = Σ_{k=0}^∞ (-1)^k t^{2k} / (2k)!,
sin t = Σ_{k=0}^∞ (-1)^k t^{2k+1} / (2k+1)!.
Now we can define the complex exponential by the above formal series. So we have
e^{it} = Σ_{k=0}^∞ i^k t^k / k! = 1 + it - t^2/2 - i t^3/3! + t^4/4! + ··· = Σ_{p=0}^∞ (-1)^p t^{2p} / (2p)! + i Σ_{q=0}^∞ (-1)^q t^{2q+1} / (2q+1)! = cos t + i sin t.
Letting t = π, we obtain the famous Euler formula e^{iπ} + 1 = 0. Clearly, cos t and sin t are the real part and the imaginary part of e^{it}.


It can be checked by a formal series expansion that e^{z1} e^{z2} = e^{z1+z2}. Therefore, letting z = a + ib,
e^{a+ib} = e^a e^{ib} = e^a (cos b + i sin b) = e^a cos b + i e^a sin b.
So the real part and imaginary part of e^{a+ib} are e^a cos b and e^a sin b respectively. Look at the example
y'' + 4y' + 5y = 0.
The characteristic equation has solutions r = -2 ± i. The real part and imaginary part of e^{(-2+i)t} are e^{-2t} cos t and e^{-2t} sin t. The general solution is
c1 e^{-2t} cos t + c2 e^{-2t} sin t.
Example 28.1. Find the general solutions of y^{(4)} + 4y^{(3)} + 8y'' + 8y' + 4y = 0.
The characteristic equation is
r^4 + 4r^3 + 8r^2 + 8r + 4 = (r^2 + 2r + 2)^2 = [(r + 1 + i)(r + 1 - i)]^2 = (r + 1 + i)^2 (r + 1 - i)^2.
The root r = -1 + i has multiplicity 2. The real and imaginary parts of e^{(-1+i)t} are e^{-t} cos t and e^{-t} sin t. Therefore, the general solutions are
c1 e^{-t} cos t + c2 t e^{-t} cos t + c3 e^{-t} sin t + c4 t e^{-t} sin t.

29 November 9th, 2012; Mechanical Vibrations



Recall the free harmonic oscillator: m x'' + kx = 0. For simplicity, we write ω0 = √(k/m). The general solution is
x = A cos ω0 t + B sin ω0 t = √(A^2 + B^2) [ (A/√(A^2 + B^2)) cos ω0 t + (B/√(A^2 + B^2)) sin ω0 t ].
Let α = tan^{-1}(B/A); then cos α = A/√(A^2 + B^2) and sin α = B/√(A^2 + B^2). So
x = √(A^2 + B^2) (cos ω0 t cos α + sin ω0 t sin α) = √(A^2 + B^2) cos(ω0 t - α) = C cos(ω0 t - α),


where C = √(A^2 + B^2) is called the amplitude, ω0 is called the circular frequency and α is called the phase angle. Now assume there is also a damping force; the harmonic oscillator becomes the damped harmonic oscillator:
x'' + 2p x' + (k/m) x = 0,   where p = c/(2m).

Solutions of the characteristic equation: there are three cases (depending on the coefficients): two distinct real roots; one real root of multiplicity 2; two complex roots.
Example 29.1. The following equations describe the motion of a damped harmonic oscillator. Find the precise solution of each equation.
(1) x'' + 4x' + 2x = 0,  x(0) = 0, x'(0) = 4;
(2) x'' + 4x' + 4x = 0,  x(0) = 0, x'(0) = 4;
(3) x'' + 4x' + 8x = 0,  x(0) = 0, x'(0) = 4.

For the first example, the characteristic equation is r^2 + 4r + 2 = 0. Therefore, r1 = -2 + √2, r2 = -2 - √2. A general solution is of the form
x(t) = c1 e^{(-2+√2)t} + c2 e^{(-2-√2)t},   x'(t) = (-2+√2) c1 e^{(-2+√2)t} + (-2-√2) c2 e^{(-2-√2)t}.
Plug in the initial conditions:
0 = x(0) = c1 + c2;   4 = x'(0) = (-2+√2) c1 + (-2-√2) c2.
This yields c1 = √2, c2 = -√2. So the solution is
x(t) = √2 e^{(-2+√2)t} - √2 e^{(-2-√2)t}.
The second example: the characteristic equation is r^2 + 4r + 4 = (r + 2)^2 = 0, which has the root r = -2 of multiplicity 2. So the general solution is
x(t) = c1 e^{-2t} + c2 t e^{-2t},   x'(t) = -2c1 e^{-2t} + c2 e^{-2t} - 2c2 t e^{-2t}.
Plugging in the conditions x(0) = 0, x'(0) = 4, we have 0 = c1, 4 = -2c1 + c2. Therefore c1 = 0, c2 = 4, and the solution is x(t) = 4t e^{-2t}.

The third example: the characteristic equation is r^2 + 4r + 8 = (r + 2)^2 + 4 = 0, which has roots r = -2 ± 2i. It follows that e^{-2t} cos 2t and e^{-2t} sin 2t are linearly independent solutions. So the general solution is
x(t) = c1 e^{-2t} cos 2t + c2 e^{-2t} sin 2t,   x'(t) = c1 e^{-2t} (-2cos 2t - 2sin 2t) + c2 e^{-2t} (2cos 2t - 2sin 2t).
It follows that 0 = x(0) = c1; 4 = x'(0) = -2c1 + 2c2. So c1 = 0 and c2 = 2, and we obtain the solution x(t) = 2 e^{-2t} sin 2t.
Draw the picture of each solution and generalize the above example. For the damped harmonic oscillator x'' + 2p x' + (k/m) x = 0, note that p ≥ 0.
(a) p^2 - k/m > 0: two distinct nonpositive real roots of the characteristic equation; no oscillatory behavior. This is the overdamped case.
(b) p^2 - k/m = 0: one real root of multiplicity two; no oscillatory behavior. This is the critically damped case.
(c) p^2 - k/m < 0: a pair of conjugate complex roots; there is oscillatory behavior. This is the underdamped case.
Discuss more examples of using solutions to construct differential equations.
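As a cross-check of case (3) in Example 29.1 (the underdamped case), one can let SymPy solve the initial value problem directly. This sketch is illustrative only.

import sympy as sp

t = sp.symbols('t')
x = sp.Function('x')

ode = sp.Eq(x(t).diff(t, 2) + 4*x(t).diff(t) + 8*x(t), 0)
sol = sp.dsolve(ode, x(t), ics={x(0): 0, x(t).diff(t).subs(t, 0): 4})
print(sp.simplify(sol.rhs))    # 2*exp(-2*t)*sin(2*t)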

30 November 12th, 2012; Nonhomogeneous equations

Suppose we have two solutions y1 and y2 of the nonhomogeneous equation
y^{(n)} + p_{n-1} y^{(n-1)} + p_{n-2} y^{(n-2)} + ··· + p1 y' + p0 y = f(t).          (10)
Clearly, y1 - y2 satisfies the equation
y^{(n)} + p_{n-1} y^{(n-1)} + p_{n-2} y^{(n-2)} + ··· + p1 y' + p0 y = 0.          (11)
Therefore, every solution of (10) can be written as yc + yp, where yp is a particular solution of (10) and yc is the general solution of (11). In order to solve a nonhomogeneous equation we only need one more step than for the homogeneous equation: to find a particular solution.


Example 30.1. Check that t + 1 is a solution of
y'' + 6y' + 5y = 5t + 11.          (12)
Write down the general solution of this equation. We have yp = t + 1 and yc = c1 e^{-t} + c2 e^{-5t}, so the general solution is y = t + 1 + c1 e^{-t} + c2 e^{-5t}.
Example 30.2. Find particular solutions of the following equations: (1) y'' - 4y = 10 e^{3t}; (2) y'' - 4y = 4 e^{2t}.

(1) Try y = A e^{3t}; a calculation gives A = 2. (2) Note that y = e^{2t} solves the corresponding homogeneous equation y'' - 4y = 0, so we try y = A t e^{2t}; a calculation gives A = 1.
If f(x) is a linear combination of products of polynomials, e^{rx}, and cos rx, sin rx, we can always guess the form of a particular solution. If f(x) itself is not a complementary solution, i.e. not a solution of the associated homogeneous linear equation, then we guess directly according to the form of f(x). If f(x) solves the homogeneous equation, then one should consider multiplying by extra factors of t.
Example 30.3. Solve the initial value problem y'' + y = cos t, y(0) = 0, y'(0) = 1.

Since f(t) = cos t, the first guess for a particular solution is y = A cos t + B sin t. However, it is easy to see that
(A cos t + B sin t)'' + (A cos t + B sin t) = 0,
no matter how we choose A, B. Indeed, cos t and sin t are two linearly independent solutions of the homogeneous equation y'' + y = 0. The trick is to replace cos t by t cos t and sin t by t sin t. Let y = A t cos t + B t sin t; then
y' = A cos t - A t sin t + B sin t + B t cos t,
y'' = -A sin t - A sin t - A t cos t + B cos t + B cos t - B t sin t = -t(A cos t + B sin t) + (-2A sin t + 2B cos t).
It follows that
cos t = y'' + y = -2A sin t + 2B cos t,   so A = 0, B = 1/2.

We obtain a particular solution yp = (1/2) t sin t. On the other hand, the general solution of the homogeneous equation y'' + y = 0 is yc = c1 cos t + c2 sin t. Therefore, the general solution is
y = (1/2) t sin t + c1 cos t + c2 sin t,   y' = (1/2) t cos t + (1/2 - c1) sin t + c2 cos t.
Plugging in t = 0, we have 0 = y(0) = c1, 1 = y'(0) = c2. So the solution is
y = (1/2) t sin t + sin t.
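The same answer can be obtained directly from SymPy's dsolve. The sketch below (illustrative only) solves the initial value problem of Example 30.3.

import sympy as sp

t = sp.symbols('t')
y = sp.Function('y')

ode = sp.Eq(y(t).diff(t, 2) + y(t), sp.cos(t))
sol = sp.dsolve(ode, y(t), ics={y(0): 0, y(t).diff(t).subs(t, 0): 1})
print(sp.simplify(sol.rhs))    # t*sin(t)/2 + sin(t)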

31 November 19th, 2012; Eigenvalues and Eigenvectors, I

In many cases we will meet the problem of finding A^k v, where A is an n x n matrix and v is a column vector in R^n. If we do the calculation from the definition, the computation becomes very complicated when n and k are large. However, in some special cases the calculation is very simple. For example, if
A = [1 1 1; 1 1 1; 1 1 1],   v = (1, 1, 1)^T,
then we have
Av = (3, 3, 3)^T = 3v,
A^2 v = A(3v) = 3Av = 3^2 v,  ...,  A^k v = A(A^{k-1} v) = A(3^{k-1} v) = 3^k v.

The multiplication of A with v behaves like the multiplication of a number with v, so the computation is much easier. The above example suggests that we study the following important concepts: eigenvector and eigenvalue.
Definition 31.1. Suppose A is an n x n square matrix. A number λ is called an eigenvalue of A provided there exists a nonzero vector v ∈ R^n such that Av = λv. This v is called an eigenvector associated with the eigenvalue λ.
Note that the requirement v ≠ 0 is necessary, since A0 = 0 = λ0 for arbitrary λ.

In the above example we see that, for the given A, v is an eigenvector associated with the eigenvalue λ = 3.

More examples for eigenvalues. Let
A = [2 3; 4 6].
It is easy to see that 8 is an eigenvalue: the vector v = (1, 2)^T is an eigenvector associated with the eigenvalue 8. What is the general method to find eigenvalues and eigenvectors? Suppose λ is an eigenvalue and v is an eigenvector associated with λ. Then
Av = λv,   Av = λIv,   (A - λI)v = 0.
Note that v ≠ 0, so the linear system (A - λI)x = 0 has at least one nonzero solution v. This means that this system has at least one free variable, so the reduced row-echelon form of A - λI has at most n - 1 leading 1s. Therefore the matrix A - λI is not invertible, which is equivalent to saying
det(A - λI) = 0.
After expansion, we see that det(A - λI) = 0 is a polynomial equation in the variable λ. This equation is called the characteristic equation.
Example 31.1. Find the characteristic equation of the matrix A = [2 3; 4 6].

We see that
det(A - λI) = det [2-λ  3; 4  6-λ] = (2 - λ)(6 - λ) - 12 = λ^2 - 8λ.
The characteristic equation is λ^2 - 8λ = 0. Clearly, the eigenvalues are the roots of the characteristic equation: λ = 0 and λ = 8. Once the eigenvalues are found, one can calculate the eigenvectors by solving the linear system (A - λI)x = 0. Let us continue to calculate the eigenvectors in Example 31.1. For λ = 0, we solve
[2 3; 4 6] (x1, x2)^T = (0, 0)^T,   which row reduces to   [1 3/2; 0 0] (x1, x2)^T = (0, 0)^T.
So the associated eigenvectors are (-(3/2)t, t)^T for arbitrary nonzero t.

For λ = 8, we solve
[-6 3; 4 -2] (x1, x2)^T = (0, 0)^T,   which row reduces to   [1 -1/2; 0 0] (x1, x2)^T = (0, 0)^T.
Therefore, the associated eigenvectors are ((1/2)t, t)^T for arbitrary nonzero t. Note that for each eigenvalue the eigenvector is NOT unique.

Example 31.2. Find eigenvalues and eigenvectors of the following matrices:
A = [0 -4; 1 0],   B = [3 0; 0 3],   C = [1 2; 0 1].
For A, λ = ±2i, with eigenvectors the nonzero multiples of (±2i, 1)^T. For B, λ = 3, with eigenvectors all nonzero vectors. For C, λ = 1, with eigenvectors the nonzero multiples of (1, 0)^T. Example 31.1 and Example 31.2 illustrate all four possibilities for the eigenvalues and eigenvectors of a 2 x 2 matrix A.
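Numerically, eigenvalues and eigenvectors are obtained with numpy.linalg.eig. The sketch below (illustrative only) runs it on the matrices of Example 31.1 and Example 31.2.

import numpy as np

for name, M in [("Example 31.1", np.array([[2., 3.], [4., 6.]])),
                ("A", np.array([[0., -4.], [1., 0.]])),
                ("B", np.array([[3., 0.], [0., 3.]])),
                ("C", np.array([[1., 2.], [0., 1.]]))]:
    vals, vecs = np.linalg.eig(M)
    # eig returns the eigenvalues and a matrix whose columns are (unit) eigenvectors
    print(name, vals)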

32 November 21st, 2012; Eigenvalues and Eigenvectors, II

First, let us calculate the following example.
Example 32.1. Find the eigenvalues and the associated eigenvectors of the matrix
A = [1 2 1; 1 0 1; 0 0 1].
Calculate the characteristic equation:
det(A - λI) = det [1-λ 2 1; 1 -λ 1; 0 0 1-λ] = (1 - λ) det [1-λ 2; 1 -λ] = (1 - λ)(λ^2 - λ - 2) = -(λ - 2)(λ - 1)(λ + 1).
So the eigenvalues are λ1 = 2, λ2 = 1, λ3 = -1. For each eigenvalue we can calculate the eigenvectors. For λ1 = 2, we have
A - λ1 I = [-1 2 1; 1 -2 1; 0 0 -1],   rref(A - λ1 I) = [1 -2 0; 0 0 1; 0 0 0].
So the solution space of (A - λ1 I)x = 0 is spanned by v1 = (2, 1, 0)^T. The eigenvectors associated with λ = 2 are t v1 for arbitrary t ≠ 0.

For λ2 = 1, we have
A - λ2 I = [0 2 1; 1 -1 1; 0 0 0],   rref(A - λ2 I) = [1 0 3/2; 0 1 1/2; 0 0 0].
The solution space is spanned by v2 = (-3/2, -1/2, 1)^T. The eigenvectors associated with λ = 1 are the vectors t v2 for arbitrary t ≠ 0.

For λ3 = -1, we have
A - λ3 I = [2 2 1; 1 1 1; 0 0 2],   rref(A - λ3 I) = [1 1 0; 0 0 1; 0 0 0].
The solution space is spanned by v3 = (-1, 1, 0)^T. The eigenvectors associated with λ = -1 are the vectors t v3 for arbitrary t ≠ 0.
Then show the following general fact: every 3 x 3 matrix has at least one real eigenvalue. Indeed, after a short calculation it is easy to see that f(λ) = det(A - λI) is a polynomial of degree 3 with leading term (-1)^3 λ^3. It follows that
lim_{λ -> -∞} f(λ) = +∞,   lim_{λ -> +∞} f(λ) = -∞.
By the continuity of f, there must exist some λ0 ∈ (-∞, ∞) such that f(λ0) = 0. The same argument applies when A is an odd-dimensional square matrix, so every odd-dimensional square matrix has at least one real eigenvalue.
We already know that, for a given eigenvalue, the associated eigenvectors are not unique. However, all the eigenvectors together with zero form a linear subspace, which is called the eigenspace of λ.
Definition 32.1. Suppose λ is an eigenvalue of A. Then the solution space of (A - λI)x = 0 is called the eigenspace of A associated with the eigenvalue λ.
Example 32.2. Calculate the eigenspaces of the matrix
A = [4 -2 1; 2 0 1; 2 -2 3].

Calculate the characteristic equation:
det(A - λI) = det [4-λ -2 1; 2 -λ 1; 2 -2 3-λ] = -λ^3 + 7λ^2 - 16λ + 12.
We observe an integer root λ = 2 and use long division to obtain the factorization
det(A - λI) = -(λ - 2)^2 (λ - 3).
For λ = 2 we obtain the eigenspace span{v1, v2}, where
v1 = (1, 1, 0)^T,   v2 = (1, 0, -2)^T.
For λ = 3 we obtain the eigenspace span{v3}, where
v3 = (1, 1, 1)^T.
Then use the eigenvectors and eigenvalues to calculate A^4 v in the above example, where v = (2, 3, 4)^T. Indeed, note that
v = v1 - v2 + 2v3,   Av = 2v1 - 2v2 + 2·3 v3,   A^2 v = 2^2 v1 - 2^2 v2 + 2·3^2 v3,   A^4 v = 2^4 v1 - 2^4 v2 + 2·3^4 v3.
Therefore,
A^4 v = 16(v1 - v2) + 162 v3 = 16 (0, 1, 2)^T + 162 (1, 1, 1)^T = (162, 178, 194)^T.
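The computation of A^4 v above can be confirmed numerically. This NumPy sketch (illustrative only) compares a direct matrix power with the eigenvector decomposition.

import numpy as np

A = np.array([[4., -2., 1.],
              [2.,  0., 1.],
              [2., -2., 3.]])
v = np.array([2., 3., 4.])

# Direct computation of A^4 v
direct = np.linalg.matrix_power(A, 4) @ v

# Via the decomposition v = v1 - v2 + 2*v3 with eigenvalues 2, 2, 3
v1 = np.array([1., 1., 0.])
v2 = np.array([1., 0., -2.])
v3 = np.array([1., 1., 1.])
via_eig = 2**4 * v1 - 2**4 * v2 + 2 * 3**4 * v3

print(direct)    # [162. 178. 194.]
print(via_eig)   # [162. 178. 194.]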

33 November 26th, 2012; First-Order Linear Systems and Matrices

Return to the harmonic oscillator and translate it into a first-order linear system; conversely, we can solve a first-order linear system by translating it into a second order equation. Examples of translating a higher-order equation into a linear system:
Example 33.1. Using substitution, translate the following equation into a first-order linear system: y^{(3)} + 10y'' + 3y' - 10y = e^t.

Solve the first order linear system: x' = y, y' = -x.

This system is the same as x'' + x = 0. The characteristic equation of this second order homogeneous equation is r^2 + 1 = 0, so r = ±i and x = c1 cos t + c2 sin t. In general, a first order differential equation system is
y1' = p11(t) y1 + ··· + p1n(t) yn + f1(t),
y2' = p21(t) y1 + ··· + p2n(t) yn + f2(t),
···
yn' = pn1(t) y1 + ··· + pnn(t) yn + fn(t).
Then state the existence and uniqueness theorem: if all functions pij and fk are continuous on an open interval I containing the point a, then given n numbers b1, b2, ..., bn, the above system has a unique solution on the entire interval I satisfying the n initial conditions y1(a) = b1, ..., yn(a) = bn. We can put y1, y2, ..., yn into one vector y and the coefficients pij(t) into one matrix P(t); then we have
dy/dt = P y + f.

34 November 28th, 2012; First-Order Linear Systems and Matrices

Linear property: let y1, ..., yn be n solutions of the homogeneous linear system on the open interval I. If c1, c2, ..., cn are constants, then the linear combination y(t) = c1 y1(t) + ··· + cn yn(t) is also a solution. The vector-valued functions y1, ..., yn are linearly dependent on the interval I if there exist constants c1, c2, ..., cn, not all zero, such that c1 y1(t) + ··· + cn yn(t) = 0 for all t in I. Suppose each yi is an n-dimensional vector-valued function on the interval I; then the Wronskian of y1, ..., yn is
W(t) = det [y11(t) y12(t) ··· y1n(t); y21(t) y22(t) ··· y2n(t); ··· ; yn1(t) yn2(t) ··· ynn(t)],
the determinant of the matrix whose columns are y1(t), ..., yn(t).

Note that if y1, ..., yn are linearly dependent, then the Wronskian is the zero function. Note the difference between the Wronskian here and the Wronskian of a higher order differential equation. By the existence and uniqueness theorem, if y1, ..., yn are solutions of the homogeneous equation, then the Wronskian is the zero function if and only if it is zero at some point. By the existence and uniqueness theorem, the solution of a homogeneous equation is completely determined by its initial value at a point a ∈ I. At the point a the initial value can be an arbitrary vector b ∈ R^n, so the solution space has the same structure as R^n.
Theorem 34.1. Suppose y1, ..., yn are n linearly independent solutions of the homogeneous linear equation y' = P(t)y on an open interval I where P(t) is continuous. Then every solution can be written as
y(t) = c1 y1(t) + ··· + cn yn(t).

Example 34.1. Verify that y1(t) = (3e^{2t}, 2e^{2t})^T and y2(t) = (e^{-5t}, 3e^{-5t})^T are solutions of the first-order system
y1' = 4y1 - 3y2,   y2' = 6y1 - 7y2.

Check the Wronskian and the linear independence; then every solution can be written as c1 y1(t) + c2 y2(t).
Example 34.2. Verify that y1(t) = (2e^t, 2e^t, e^t)^T, y2(t) = (2e^{3t}, 0, -e^{3t})^T, y3(t) = (2e^{5t}, -2e^{5t}, e^{5t})^T are solutions of
dy/dt = [3 -2 0; -1 3 -2; 0 -1 3] y.
Then solve with the initial value y(0) = (0, 2, 6)^T. Nonhomogeneous equation.

35 November 30th, 2012; First-Order Linear Equations and Matrices, II

Consider the simple homogeneous system
dy/dt = Ay,   A = [0 -1; 1 0].


Write y = (y1, y2)^T. Then the system can easily be written as
dy1/dt = -y2,   dy2/dt = y1.
Clearly, both
ya = (cos t, sin t)^T,   yb = (-sin t, cos t)^T
are solutions. The Wronskian is
W(t) = det [cos t  -sin t; sin t  cos t] = cos^2 t + sin^2 t = 1 ≠ 0.
So ya and yb form a basis of the solution space. Every solution can be written as
y = ca ya + cb yb = (ca cos t - cb sin t, ca sin t + cb cos t)^T,
where ca, cb are arbitrary constants.
The method discussed above is not general. The general method is to use eigenvalues and eigenvectors. The characteristic equation is
det [-λ  -1; 1  -λ] = λ^2 + 1 = 0.
So λ = ±i. For λ = i, calculate the corresponding eigenspace (i.e. the solution space, or kernel space, of A - λI):
A - iI = [-i  -1; 1  -i],   which row reduces to   [1  -i; 0  0].
It follows that the eigenspace is spanned by (i, 1)^T. So the following vector-valued function is a solution:
e^{it} (i, 1)^T = (i e^{it}, e^{it})^T = (-sin t + i cos t, cos t + i sin t)^T = (-sin t, cos t)^T + i (cos t, sin t)^T.
Taking real and imaginary parts, we obtain two independent solutions
y1 = (-sin t, cos t)^T,   y2 = (cos t, sin t)^T.
So every solution can be written as
y = c1 y1 + c2 y2 = (-c1 sin t + c2 cos t, c1 cos t + c2 sin t)^T.
Note that the particular expression of solutions as a linear combination of basis vectors may differ; however, the solution space is independent of the choice of basis.

Example 35.1. Solve the initial value problem
dy/dt = Ay,   y(0) = (1, -1)^T,   where A = [4 -3; 3 4].

The characteristic equation is (λ - 4)^2 + 9 = 0, so the eigenvalues are λ = 4 ± 3i. For the eigenvalue λ = 4 + 3i, the eigenspace is the kernel space of
A - λI = [-3i  -3; 3  -3i],   which row reduces to   [1  -i; 0  0],
and is therefore spanned by (i, 1)^T. So we have the complex solution
v e^{λt} = e^{(4+3i)t} (i, 1)^T = e^{4t} (-sin 3t + i cos 3t, cos 3t + i sin 3t)^T = e^{4t} (-sin 3t, cos 3t)^T + i e^{4t} (cos 3t, sin 3t)^T.

The real part and the imaginary part of this complex-valued solution are two independent real solutions. Therefore, every solution is
y = c1 e^{4t} (-sin 3t, cos 3t)^T + c2 e^{4t} (cos 3t, sin 3t)^T.
Plug in the initial value y(0) = (1, -1)^T:
(1, -1)^T = y(0) = c1 (0, 1)^T + c2 (1, 0)^T,   so c1 = -1, c2 = 1.
So the solution of the initial value problem is
y = e^{4t} (cos 3t + sin 3t, -cos 3t + sin 3t)^T.

36 December 3rd, 2012; Multiple Eigenvalue solutions

Suppose A = [1 1; 0 1]. Then what are the general solutions of dy/dt = Ay?

The characteristic equation is (λ - 1)^2 = 0. There is a unique eigenvalue λ = 1 with multiplicity 2. The eigenspace is spanned by (1, 0)^T, so we have a solution e^t (1, 0)^T. However, the solution space of dy/dt = Ay is two-dimensional; where is the other independent solution? An easy guess is
t e^t (1, 0)^T.

However, direct calculation shows that it is not a solution. We can, however, change it into a solution by adding an extra term e^t w. Let v = (1, 0)^T and suppose y = (tv + w)e^t is a solution. Then
dy/dt = (tv + w)e^t + v e^t = e^t [(1 + t)v + w],
Ay = e^t (tAv + Aw) = e^t (tv + Aw).
So y is a solution if and only if w satisfies (A - I)w = v. Choose w = (0, 1)^T. Then we see that
e^t (tv + w) = e^t (t, 1)^T.

It is easy to see that this is a solution independent of e^t v. So the general solutions are
y = c1 e^t v + c2 e^t (tv + w) = e^t (c1 + c2 t, c2)^T.

Note that (A - I)w = v and (A - I)v = 0. Therefore, we can also solve for w via (A - I)^2 w = 0. Then we start the general discussion. An eigenvalue λ is said to have multiplicity k if it is a k-fold root of the characteristic equation. It is possible that for some eigenvalue of multiplicity k the dimension of the eigenspace is less than k. If we denote the dimension of the eigenspace by p, then d = k - p is called the defect. An eigenvalue of multiplicity k is called complete if the corresponding eigenspace has dimension k; so λ is complete if and only if d = 0. Note that if k = 1, then λ is always complete. An eigenvalue of multiplicity k > 1 is called defective if it is not complete, i.e. d > 0. If d > 0, we have to use a method similar to the one in the previous example to find new solutions. If A is 2 x 2 and λ is defective, there is only one possibility: k = 2, p = 1, d = 1. In this case there is a uniform way to solve the system dy/dt = Ay.
1. Find v2 such that (A - λI)^2 v2 = 0 and (A - λI)v2 ≠ 0. Let v1 = (A - λI)v2.
2. Then the solutions are v1 e^{λt} and (v1 t + v2) e^{λt}.
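The two-step recipe above can be carried out by machine. The sketch below (illustrative only) applies it to the matrix A = [1 1; 0 1] from this lecture, constructing the generalized eigenvector with SymPy.

import sympy as sp

A = sp.Matrix([[1, 1],
               [0, 1]])
lam = 1                       # the (defective) eigenvalue, multiplicity 2
N = A - lam * sp.eye(2)       # N = A - lambda*I

# N^2 = 0, so any v2 with N*v2 != 0 works; take v2 = (0, 1)^T.
v2 = sp.Matrix([0, 1])
v1 = N * v2                   # v1 = (A - lam I) v2 = (1, 0)^T, an eigenvector

t = sp.symbols('t')
y1 = v1 * sp.exp(lam * t)                 # first solution
y2 = (v1 * t + v2) * sp.exp(lam * t)      # second (generalized) solution
print(y1.T, y2.T)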

37 December 5th; Multiple Eigenvalue solutions, II

A more complicated example.
Example 37.1. Find the eigenvalues and eigenspaces of the matrix
A = [1 -3; 3 7].

The characteristic equation is ( 4)2 = 0. For 3-dimensional case. Example 37.2. Find the general solutions of d y = Ay, dt 5 1 1 3 0 A=1 3 2 1

Calculate characteristic equation (3 )3 = 0. The eigenvalue = 3 has multiplicity 3. Note that (A I)3 = 0. So v1 can be chosen as arbitrary vector. Without loss of generality, let 1 v1 = 0 0 Then calculate v2 = (A I)v1 , v3 = (A I)v2 . We have 0 2 v2 = 1 , v3 = 2 3 2 Then one can write down the generalized solutions. Discuss the general situation. For an eigenvalue with multiplicity k and defect d. Only need to try from (A I)d+1 .

38 December 7th, Matrix Exponentials and Linear Systems


Suppose y1(t), ..., yn(t) form a basis of the solution space of the homogeneous system dy/dt = Ay. We can combine the yi's into a matrix
Φ(t) = (y1(t), ..., yn(t)),

which is called the fundamental matrix. It is easy to see that Φ(t) solves the matrix differential equation
dΦ(t)/dt = A Φ(t).
As a corollary, the initial value problem
dy/dt = Ay,   y(0) = y0,
has the solution
y(t) = Φ(t) Φ(0)^{-1} y0.
Example 38.1. Solve the initial value problem of Example 35.1 by the fundamental matrix method. According to the solution of Example 35.1, we have
Φ(t) = e^{4t} [-sin 3t  cos 3t; cos 3t  sin 3t].
So we have
Φ(0) = [0 1; 1 0],   Φ(0)^{-1} = [0 1; 1 0].
Recall that the initial condition is y0 = (1, -1)^T. So we obtain
y(t) = Φ(t) Φ(0)^{-1} y0 = e^{4t} [-sin 3t  cos 3t; cos 3t  sin 3t] [0 1; 1 0] (1, -1)^T = (e^{4t}(sin 3t + cos 3t), e^{4t}(-cos 3t + sin 3t))^T.
Since Φ(t) is a solution of
dY(t)/dt = A Y(t),
where Y(t) is a square-matrix-valued function, we can guess that the solution of this equation is Y(t) = e^{At} Y(0), by formally treating Y as a number. Actually, this makes sense if we define the exponential of a square matrix by a formal series: for every square matrix B, define
e^B = Σ_{k=0}^∞ B^k / k!.

By this definition, we can calculate the derivative of e^{At} formally:
d/dt e^{At} = d/dt Σ_{k=0}^∞ (At)^k / k! = d/dt ( I + At + A^2 t^2/2! + A^3 t^3/3! + ··· )
= A + A^2 t + A^3 t^2/2! + ··· = A ( I + At + A^2 t^2/2! + ··· ) = A e^{At}.
Therefore, we can easily check that e^{At} Φ(0) is a solution of dY/dt = AY:
d/dt ( e^{At} Φ(0) ) = ( d/dt e^{At} ) Φ(0) = A e^{At} Φ(0) = A ( e^{At} Φ(0) ).
Note that e^{A·0} = I. So e^{At} Φ(0) is a solution of the matrix differential equation with initial value Φ(0). On the other hand, Φ(t) is another solution of the same equation with the same initial condition. So we have
Φ(t) = e^{At} Φ(0).
Calculate some examples of e^{At}.
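Numerically, the matrix exponential is available as scipy.linalg.expm. The sketch below (illustrative only) checks the closed form e^{At} = e^{4t} [cos 3t, -sin 3t; sin 3t, cos 3t] for the matrix of Example 35.1 at one value of t.

import numpy as np
from scipy.linalg import expm

A = np.array([[4., -3.],
              [3.,  4.]])
t = 0.7

numeric = expm(A * t)
closed = np.exp(4*t) * np.array([[np.cos(3*t), -np.sin(3*t)],
                                 [np.sin(3*t),  np.cos(3*t)]])
print(np.allclose(numeric, closed))   # True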

39 December 10th, Nonhomogeneous Linear Systems

Consider the linear system
dy/dt = P(t) y + f(t).          (13)
It is easy to see that the solution has the decomposition y(t) = yc(t) + yp(t), where yp(t) is a particular solution of (13) and yc(t) is the general solution of the associated homogeneous system dy/dt = P(t)y. How do we find yp? There is a general method called variation of parameters. Let Φ(t) be the fundamental matrix of the associated homogeneous system. Then for every constant vector c we see that Φ(t)c is a solution of the homogeneous differential equation. If c is not a constant vector but a vector function, then the derivative of c comes into effect: by choosing a good c(t), it is possible to make Φ(t)c(t) solve the nonhomogeneous equation. Indeed,
d/dt [Φ(t)c(t)] = (d/dt Φ(t)) c(t) + Φ(t) (d/dt c(t)) = P(t) Φ(t) c(t) + Φ(t) (d/dt c(t)),
so Φ(t)c(t) solves (13) exactly when the extra term Φ(t) (d/dt c(t)) equals f(t).


Therefore, the choice of c(t) is given by the restriction
f(t) = Φ(t) (d/dt c(t)),   i.e.   d/dt c(t) = Φ(t)^{-1} f(t).
So c(t) is any antiderivative of Φ(t)^{-1} f(t). For example, we can choose
c(t) = ∫_a^t Φ(s)^{-1} f(s) ds.
So we obtain a particular solution
yp = Φ(t) ∫_a^t Φ(s)^{-1} f(s) ds.
Theorem 39.1. If Φ(t) is a fundamental matrix for the homogeneous system dy/dt = P(t)y on some interval where P(t) and f(t) are continuous, then a particular solution of the nonhomogeneous system
dy/dt = P(t)y + f(t)
is given by
yp(t) = Φ(t) ∫_a^t Φ(s)^{-1} f(s) ds.
Suppose y(a) is given; then the solution is
y = yc + yp = Φ(t) Φ(a)^{-1} y(a) + Φ(t) ∫_a^t Φ(s)^{-1} f(s) ds.
In particular, if P(t) = A, then we can choose Φ(t) = e^{At}, and Φ(t)^{-1} = e^{-At}. So
y = e^{A(t-a)} y(a) + e^{At} ∫_a^t e^{-As} f(s) ds.
Example 39.1. Solve the initial value problem
dy/dt = Ay + f,   where A = [4 -3; 3 4],  f = (1, 1)^T,  y(0) = (0, 0)^T.


The solution is
y = e^{At} y(0) + e^{At} ∫_0^t e^{-As} f ds = e^{At} ∫_0^t e^{-As} f ds.
Note that
e^{At} = e^{4t} [cos 3t  -sin 3t; sin 3t  cos 3t],
e^{-As} = e^{-4s} [cos 3s  sin 3s; -sin 3s  cos 3s],
e^{-As} f = e^{-4s} [cos 3s  sin 3s; -sin 3s  cos 3s] (1, 1)^T = e^{-4s} (cos 3s + sin 3s, cos 3s - sin 3s)^T.
Note that


∫_0^t e^{-4s} cos 3s ds = (1/3) ∫_0^t e^{-4s} d(sin 3s)
= (1/3) e^{-4s} sin 3s |_0^t - (1/3) ∫_0^t sin 3s · (-4) e^{-4s} ds
= (1/3) e^{-4t} sin 3t + (4/3) ∫_0^t e^{-4s} sin 3s ds
= (1/3) e^{-4t} sin 3t + (4/9)(1 - e^{-4t} cos 3t) - (16/9) ∫_0^t e^{-4s} cos 3s ds.
It follows that
(25/9) ∫_0^t e^{-4s} cos 3s ds = (1/3) e^{-4t} sin 3t + (4/9)(1 - e^{-4t} cos 3t),
∫_0^t e^{-4s} cos 3s ds = (3/25) e^{-4t} sin 3t + (4/25)(1 - e^{-4t} cos 3t).
Similarly, we can calculate
∫_0^t e^{-4s} sin 3s ds = (3/4) [ ∫_0^t e^{-4s} cos 3s ds - (1/3) e^{-4t} sin 3t ]
= -(4/25) e^{-4t} sin 3t + (3/25)(1 - e^{-4t} cos 3t).

So we have
∫_0^t e^{-As} f ds = ∫_0^t e^{-4s} (cos 3s + sin 3s, cos 3s - sin 3s)^T ds
= (1/25) (7 - e^{-4t}(7 cos 3t + sin 3t), 1 - e^{-4t}(cos 3t - 7 sin 3t))^T
= (e^{-4t}/25) (-7 cos 3t - sin 3t + 7 e^{4t}, -cos 3t + 7 sin 3t + e^{4t})^T.

It follows that
y(t) = e^{At} ∫_0^t e^{-As} f ds = e^{4t} [cos 3t  -sin 3t; sin 3t  cos 3t] · (e^{-4t}/25) (-7 cos 3t - sin 3t + 7 e^{4t}, -cos 3t + 7 sin 3t + e^{4t})^T
= (1/25) (-7 + e^{4t}(7 cos 3t - sin 3t), -1 + e^{4t}(cos 3t + 7 sin 3t))^T.
This is the final solution. It is not hard to verify that it satisfies the equation.
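A numerical cross-check of this formula (illustrative only): integrate the system with SciPy and compare with the closed-form solution at a sample time.

import numpy as np
from scipy.integrate import solve_ivp

A = np.array([[4., -3.],
              [3.,  4.]])
f = np.array([1., 1.])

def rhs(t, y):
    return A @ y + f

def exact(t):
    e = np.exp(4*t)
    return np.array([-7 + e*(7*np.cos(3*t) - np.sin(3*t)),
                     -1 + e*(np.cos(3*t) + 7*np.sin(3*t))]) / 25.0

T = 1.0
sol = solve_ivp(rhs, (0.0, T), [0.0, 0.0], rtol=1e-10, atol=1e-12)
print(sol.y[:, -1])   # numerical value at t = 1
print(exact(T))       # closed-form value; the two should agree closely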

40 December 12th, Stability and Phase Plane

Learn how to draw the phase diagram. A first order system of the form
dx/dt = F(x, y),   dy/dt = G(x, y)
is called an autonomous system. In an autonomous system, the variable t does not appear on the right hand side. Starting from time t0, the solution (x(t), y(t)) corresponds to a curve in the xy-plane, which is called the phase plane. This curve is called a trajectory. A critical point is a point (x*, y*) such that F(x*, y*) = G(x*, y*) = 0. Find the critical points of the following autonomous system:
dx/dt = 14x - 2x^2 - xy,   dy/dt = 16y - 2y^2 - xy.
The critical points are (0, 0), (0, 8), (7, 0), (4, 6). A phase portrait is a picture in the phase plane (the xy-plane). It consists of the critical points and typical solution curves.
Example 40.1. Draw the phase diagram for the following linear system:
dz/dt = Az,   where z = (x, y)^T and A is one of
[1 0; 0 1],  [2 0; 0 3],  [1 0; 0 2],  [0 1; 1 0],  [0 1; 0 0].
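Critical points are just the solutions of F = G = 0, so they can be found with a computer algebra system. The sketch below (illustrative only) reproduces the four critical points of the system above.

import sympy as sp

x, y = sp.symbols('x y')
F = 14*x - 2*x**2 - x*y
G = 16*y - 2*y**2 - x*y

critical_points = sp.solve([F, G], [x, y])
print(critical_points)   # (0, 0), (0, 8), (4, 6), (7, 0)  (order may vary)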


41 December 14th, Review for the final

The final will cover the following sections: Chapter 4.1, 4.2, 4.3, 4.4; Chapter 5.1, 5.2, 5.3, 5.4, 5.5; Chapter 6.1; Chapter 7.1, 7.2, 7.3, 7.5; Chapter 8.1, 8.2. You can go to the Math library to request previous exams for practice. Some limited exams are online: http://math.library.wisc.edu/reserve/320.html However, you need to make sure whether the problems in the previous exams are covered by the above list. The final will take place in Social Science 5206, 7:25-9:25 pm, 12/18/2012.


MATH 320: HOMEWORK 1

EP1.1.4 Verify by substitution that each given function is a solution of the differential equation: y'' = 9y; y1 = e^{3x}, y2 = e^{-3x}.
Solution. This is a second order ODE, so we need second derivatives: y1' = 3e^{3x}, y2' = -3e^{-3x}, y1'' = 9e^{3x}, y2'' = 9e^{-3x}. From this we get y1'' = 9y1 and y2'' = 9y2. Thus both functions are solutions.
EP1.1.22 Consider the IVP e^y y' = 1; y(0) = 0. First verify that y(x) = ln(x + C) satisfies the ODE. Then determine a value of the constant C so that y(x) satisfies the initial condition. Sketch several typical solutions of the ODE, then highlight the one that satisfies the initial condition.
Solution. We only need the first derivative of the given function, y' = 1/(x + C). Substituting this into the ODE, we get
e^{ln(x + C)} · 1/(x + C) = (x + C) · 1/(x + C) = 1.
Thus y(x) is a solution of the ODE. For the initial condition we solve some basic algebra:
y(0) = 0  gives  ln(0 + C) = 0,  so  C = e^0 = 1.
So with C = 1 the initial condition is satisfied. I have not included the necessary sketches here.
EP1.1.32 Write a differential equation that models the following: the time rate of change of a population P is proportional to the square root of P.
Solution. dP/dt = k √P
EP1.1.33 Write a differential equation that models the following: the time rate of change of the velocity v of a coasting motorboat is proportional to the square of v.
Solution.
dv/dt = k v^2


EP1.1.34 Write a dierential equation that models the following: The acceleration v of a Lamborghini is proportional to the dierence between 250 km/h and the velocity of the car. Solution. v = k(250 v). EP1.2.5 Find a function y = f (x) satisfying the given ODE and initial condition 1 dy ; y(2) = 1. = dx x+2 Solution. We integrate directly:
x

f (x) = 1 +
2

1 dt = 1 + [log(t + 2)]x 2 t+2

= 1 + (log(x + 2) log 4) and we are done. EP1.2.20 A particle starts at the origin and travels along the x-axis with the velocity function v(t) whose graph is shown. Sketch the graph of the resulting position function x(t) for 0 t 10. Solution. This is just like one of those calculus problems, but they give you a starting point. Your sketch should look something like the included gure. EP1.2.43 Arthur Clarkes The Wind from the Sun describes Diana, a spacecraft propelled by the solar wind. Its aluminized sail provides it with a constant acceleration of .0001 g = .0098 m/s2 . Suppose this spacecraft starts from rest at time t = 0 and simultaneously res a projectile (straight ahead in the same direction) that travels at one-tenth of the speed c = 3.0108 m/s of light. How long will it take the spacecraft to catch up with the projectile, and how far will it have traveled by then? Solution. Let xp (t) denote the position of the projectile and xD (t) denote the position of the spacecraft Diana at time t. The frame of reference that is our best choice places Diana at position zero at time zero, i.e., xD (0) = 0, and we say the projectile was red at time


zero, xp (0) = 0. Since the projectile travels at a constant velocity .1c, its position is given by xp (t) = (.1c)t. We must nd an expression for the position of Diana and nd when the two meet. We have only Dianas acceleration, .0001 g, and that its initial velocity is zero (it starts from rest). From this we can conclude that the velocity at time t is vD (t) = (.0001 g)t. Velocity is the derivative of position, so we have the IVP dxD = .0001gt, xD (0) = 0 dt which is solved by integration:
t

xD (t) = 0 +
0

.0001gs2 .0001gs ds = 2

= .0005gt2 .
0

The two paths cross when xD (t) = xp (t), so .0005gt2 = .1ct t = Thats a really long time. EP1.3.26 Suppose the deer population P (t) in a small forest satises the logistic equation dP = .0225P .0003P 2 . dt Construct a slope eld and appropriate solution curve to answer the following questions: If there are 25 deer at time t = 0 and t is measured in months, how long will it take the number of deer to double? What will be the limiting deer population? Solution. You can use DFIELD to get an idea of the slope eld. By the slope eld I have, it should take about 62 months for the the deer population to double. It also appears that the limiting population should be 75 deer. EP1.4.26 Find a particular solution for the IVP dy = 2xy 2 + 3x2 y 2 , y(1) = 1. dx .1c 193878.625 years. .0005g


Solution. I like breaking problems into separate pieces. Step 1: Find the general solution. We separate the variables like so: 1 dy = (2x + 3x2 ) dx. y2 This integrates easily: 1 . + x3 + C Step 2: Apply the initial conditions. This leads to an algebraic computation. 1 C = 3. 1 = 2 1 + 13 + C y 1 = x2 + x3 + C y(x) = x2 Conclusion. The solution to the given initial value problem is y(x) =
1 . x2 +x3 3

EP1.4.45 The intensity I of light at a depth of x meters below the surface of a lake staises the ODE dI = 1.4I. dx (a) At what depth is the intensity half the intensity I0 at the surface (where x = 0)? (b) What is the intensity at a depth of 10 m (as a fraction of I0 )? (c) At what depth will the intensity by 1% of that at the surface? Solution. Before moving to any of the individual parts of the problem note that we will need to nd I(x) by solving the initial value problem. So lets do that rst. Step 1: Find the general solution. We can separate the variables: 1 dI = 1.4 dx log I = 1.4x + C I(x) = D exp(1.4x) I where D is the parameter determined by the initial condition. Step 2: Apply the initial conditions. In this case we are only given that I(0) = I0 . Substituting this into our expression for I(x) we get I0 = D exp(1.4 0) = D exp(0) = D. Conclusion. The function which solves the initial value problem if I(x) = I0 exp(1.4x). (a) This question is really asking for what value of x do we have I(x) = I0 /2. This is just solving the equation I0 exp(1.4x) = I0 /2 1.4x = log 1 2 x= log 2 .495. 1.4

(b) This is asking for I(10) = I0 exp(1.4 10) 8.3 107 I0 . (c) Very similar to (a). I(x) = .01I0 leads to log .01 1.43. 1.4 So the light intensity is 1% of its surface intensity at only 1.43 meters deep. exp(1.4x) = .01 x = That about does it.

MATH 320: EXAM 1 ADDITIONAL PROBLEMS

(1) Verify the function x(t) = 5e2t is a solution to the dierential equation x 3x + 2x = 0. If we add the initial condition x(0) = 4, is x(t) a solution to the IVP? (2) Integration Review. Compute the following integrals: (a) (b) sin2 (x) dx ex cos x dx Hint. Use a trig identity Hint. Use integration by parts twice

Solution. (a) sin2 (x) dx = 1 1 (1 cos 2x) dx = 2 2 x 1 sin 2x + C 2 ex cos x

(b) 2

ex cos x dx = ex sin x

ex sin x dx = ex sin x ex cos x ex cos x dx =

ex cos x dx = ex (sin x + cos x) + C

1 x e (sin x + cos x) + C 2

(3) How do we change a rst order linear dierential equation to begin solving it? (This is just one line.) Solution. We multiply the ODE by a generic function, (t). That is, (t)y (t) + (t)a(t)y(t) = (t)b(t).

(4) Determine the order of the following ODEs, and classify them as linear or nonlinear. (a) y + 3y 7y = 0, (b) x + 5tx = tan t, (c)
d4 y dx4

+ sin y =

dy dx .

(5) Verify the given function(s) is a solution to the ODE or IVP. (a) y y = 0; y(x) = cosh x Solution. y = sinh x, y = cosh x. Therefore, y y = cosh x cosh x = 0.

(b) tx x = t2 , x(0) = 0; x(t) = 3t + t2


1


Solution. x = 3 + 2t. Thus t(3 + 2t) (3t + t2 ) = t2 , so this is a solution to the ODE. Since x(0) = 3 0 + 02 = 0, this is a solution to the IVP. (c) t2 y + 5ty + 4y = 0, y(e) = e2 ; y1 (t) = t2 , y2 (t) = t2 log t. Solution. y1 = 2t3 , y1 = 6t4 . Therefore, t2 (6t4 ) + 5t(2t3 ) + 4(t2 ) = (6 10 + 4)t2 = 0 so y1 is a solution to the ODE, and y1 (e) = e2 so it solves the IVP. y2 = 2t3 log t + t2 t1 , y2 = 6t4 log t 2t3 t1 3t4 = 6t4 log t 5t4 . So t2 (6t4 log t 5t4 ) + 5t(2t3 log t + t2 t1 ) + 4(t2 log t) = (6 10 + 4)t2 log t + (5 + 5)t2 = 0. Therefore, y2 is a solution to the ODE, and y2 (e) = e2 log e = e2 so it solves the IVP. (6) Sketch a direction eld and a few integral curves for the following ODEs. (a) y = y(4 y) (b) y = y(y 4) (c) y = y(y 3)2 (d) y = sin y (7) Research: Find two ODEs (of any order) from your major and explain the system the describe. Do not solve these equations or explain how they were derived, simply tell me what system they are used to model. (8) Solve the following initial value problems: (a) t2 (1 + x2 ) + 2x dx = 0; dt x(0) = 1, Solution. We do a little algebra to rewrite: 2x dx 2x = t2 dx = t2 dt. 1 + x2 dt 1 + x2 Since this is separable, we directly integrate. 2x dx = 1 + x2 t2 dt log(1 + x2 ) = t3 /3 + C
3 /3

1 + x2 = Det

x(t) =

Det3 /3 1.

Awesome, now we determine the coecient D using the initial conditions. x(0) = 1 = De03 /3 1 1 = D 1 D = 2. 2et3 /3 1.

Thus, the solution is x(t) =


(b)

dy dt

+ ty = t;

y(0) = 2.

Solution. This equation is linear, so we employ those techniques. You multiply by an integrating factor: dy + ty = t. dt We want to take advantage of the product rule, hoping that d [y] = y + y = t. dt This requires y + y = y + ty = t. We solve this by separation, and nd that a solution is (t) = exp(t2 /2). With this integrating factor, we know that d t2 /2 2 2 2 e y = et /2 t et /2 y = et /2 + C dt y(t) = 1 + Cet
2 /2

Now, the initial condition requires y(0) = 2 = 1 + Ce0 , so C = 1. (9) What is the equation of the logistic growth model? How should we solve it? Solution. There are a number of dierent forms, but I prefer dP = rP dt 1 P K .

This is a separable ODE, which we know how to solve. (10) What are the ideas from basic calculus we take advantage of to solve separable equations and linear equations? (Each answer is only two words long). Solution. The solution of separable equations is based on the chain rule. The solution of separable equations is based on the product rule. (11) In this course, I presented a theorem on the existence and uniqueness of solutions to the IVP dy = f (t, y), y(0) = y0 dt assuming some smoothness conditions for f . The theorem is quite general, and it certainly includes linear rst order ODEs. However, there is a dierent theorem specic to linear rst order ODEs. What is the advantage of the theorem for linear rst order ODEs compared to the more general theorem? Solution. Both existence and uniqueness theorems tells us that our solution to the IVP is only valid only a limited interval. The theorem for linear ODEs has the advantage that we know exactly how big the interval is. In particular, it states that the solution is valid in the interval in which a(x) and b(x) are both continuous. The more general merely says that the interval exists. It may be incredibly tiny, or quite large. The general theorem says nothing one way or the other. BD2.2.3 Solve the ODE y + y 2 sin x = 0. Solution. We do a little algebra to rearrange the equation as y 2 dy = sin x dx.


Then we integrate: y 1 = cos x + C. Now do more algebra to arrive at y(x) = BD2.2.4 Solve the ODE y = (3x2 1)/(3 + 2y). Solution. We do a little algebra to rearrange the equation as (3 + 2y) dy = (3x2 1) dx. Then we integrate: 3y + y 2 = x3 x + C. This is actually a quadratic equation in y if we rewrite as y 2 + 3y + (x x3 + D) = 0, so we can solve for y to get y(x) = 3 9 4x + x3 + E . 2 1 . cos x + C

BD2.2.5 Solve the ODE y' = (cos^2 x)(cos^2 2y).
Solution. We do a little algebra to rearrange the equation as
    1/cos^2(2y) dy = cos^2 x dx.
Now use a trig identity to get
    sec^2(2y) dy = (1/2)(1 + cos 2x) dx.
Then we integrate:
    (1/2) tan 2y = x/2 + (1/4) sin 2x + C.
Now do more algebra to arrive at
    y(x) = (1/2) arctan(x + (1/2) sin 2x + D).

BD2.2.6 Solve the ODE xy' = (1 - y^2)^{1/2}.
Solution. We do a little algebra to rearrange the equation as
    1/sqrt(1 - y^2) dy = 1/x dx.
Then we integrate: arcsin y = log x + C. Now do more algebra to arrive at y(x) = sin(log x + C).


BD2.2.21 Solve the IVP y' = (1 + 3x^2)/(3y^2 - 6y), y(0) = 1, and determine the interval in which the solution is valid.
Solution. We do a little algebra to rearrange the equation as (3y^2 - 6y) dy = (1 + 3x^2) dx. Then we integrate: y^3 - 3y^2 = x + x^3 + C. Applying the initial condition, we get 1^3 - 3(1^2) = 0 + 0 + C, so C = -2. So our solution is y^3 - 3y^2 = x + x^3 - 2. We will consider this implicit solution good enough. Now we know this solution is only valid in the interval between two vertical tangents. These vertical tangents occur when 3y^2 - 6y = 0, i.e. y = 0 or y = 2. So the question is, where on our specific solution does y = 0 or y = 2? We substitute these values in to find that 0^3 - 3(0^2) = x + x^3 - 2 gives x = 1, and 2^3 - 3(2^2) = x + x^3 - 2 gives x = -1. So the interval in which the solution is valid is -1 < x < 1.
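The endpoints can also be found numerically: for y = 0 and for y = 2 the implicit solution is a cubic in x, and its real root locates the vertical tangent. The snippet below is an illustrative numpy check only.

    # Sketch: solve x^3 + x = y^3 - 3y^2 + 2 for the two critical values of y.
    import numpy as np

    for y in (0.0, 2.0):
        rhs_const = y**3 - 3*y**2 + 2
        roots = np.roots([1, 0, 1, -rhs_const])      # x^3 + 0x^2 + x - rhs_const = 0
        real = roots[np.isclose(roots.imag, 0)].real
        print(y, real)                               # expect x = 1 for y = 0, x = -1 for y = 2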


The critical points occur when f (y) = 0, and that is y = 0, y = 1, and y = 2. The points are unstable, stable, and unstable. Now, the phase line and some solutions.

BD2.5.4 Sketch f(y) versus y, determine the critical (equilibrium) points, and classify each one as asymptotically stable or unstable. Draw the phase line, and sketch several graphs of solutions in the ty-plane.
    dy/dt = e^y - 1,   -∞ < y_0 < ∞
Solution. First, the graph of f.

The critical points occur when f (y) = 0, and that is y = 0. The point is unstable. Now, the phase line and some solutions.
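For autonomous equations dy/dt = f(y) like the two problems above, the classification can be automated: an equilibrium with f'(y) < 0 is asymptotically stable and one with f'(y) > 0 is unstable. The helper below is only a sketch using sympy; the "semistable?" label merely flags the borderline case f'(y) = 0, which still needs a phase line to settle.

    # Sketch: classify equilibria of dy/dt = f(y) by the sign of f'(y) at each root.
    import sympy as sp

    y = sp.symbols('y')

    def classify(f):
        for r in sp.solve(sp.Eq(f, 0), y):
            slope = sp.diff(f, y).subs(y, r)
            kind = 'stable' if slope < 0 else 'unstable' if slope > 0 else 'semistable?'
            print(r, kind)

    classify(y*(y - 1)*(y - 2))   # BD2.5.3: 0 unstable, 1 stable, 2 unstable
    classify(sp.exp(y) - 1)       # BD2.5.4: 0 unstable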


BD2.5.22 Suppose that a given population can be divided into two parts: those who have a given disease and can infect others, and those who do not have it but are susceptible. Let x be the proportion of susceptible individuals and y the proportion of infectious individuals; then x + y = 1. Assume that the disease spreads by contact between sick and well members of the population and that the rate of spread dy/dt is proportional to the number of such contacts. Further, assume that members of both groups move about freely among each other, so the number of contacts is proportional to the product of x and y. Since x = 1 - y, we obtain the IVP
    dy/dt = αy(1 - y),   y(0) = y_0,
where α is a positive proportionality factor, and y_0 is the initial proportion of infectious individuals.
(a) Find the equilibrium points for the differential equation and determine whether each is asymptotically stable, semistable, or unstable.
(b) Solve the IVP and verify that the conclusions you reached in part (a) are correct. Show that y(t) → 1 as t → ∞, which means that ultimately the disease spreads through the entire population.
Solution. Here we go.
(a) The equilibrium points occur when y(1 - y) = 0, so for y = 0 and y = 1. Doing a phase line we can determine that y = 1 is stable and y = 0 is unstable.
(b) This is a separable equation:
    dy/(y(1 - y)) = α dt.


Now we integrate, using partial fractions:
    ∫ (1/y - 1/(y - 1)) dy = αt + C,   so   log|y/(y - 1)| = αt + C,   so   y/(y - 1) = De^{αt},
and solving for y gives
    y = De^{αt}/(De^{αt} - 1).
Now applying the initial condition, we find that y(0) = y_0 = D/(D - 1), so D = y_0/(y_0 - 1). Thus, we can do a little algebra to find the specific solution to the IVP:
    y(t) = y_0 e^{αt}/(y_0 e^{αt} - y_0 + 1) = y_0/(y_0 + (1 - y_0)e^{-αt}).
We said in (a) that the only stable equilibrium point is 1, so all solutions should tend to 1 (that is, for positive y_0, and this is a population problem). To verify this we note that the limit as t → ∞ is
    lim_{t→∞} y(t) = 1 for y_0 > 0, and 0 for y_0 = 0.
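As a sanity check, the closed form can be compared with a direct numerical integration; the values of α and y_0 below are arbitrary illustrative choices, and the last printed number should be close to 1, exactly as the phase line predicts.

    # Sketch: compare y(t) = y0/(y0 + (1 - y0) e^{-a t}) with a numerical solution
    # of dy/dt = a*y*(1 - y).  Illustrative values of a and y0.
    import numpy as np
    from scipy.integrate import solve_ivp

    a, y0 = 0.8, 0.05
    t = np.linspace(0, 20, 200)
    sol = solve_ivp(lambda t, y: a * y * (1 - y), (0, 20), [y0], t_eval=t, rtol=1e-9)
    exact = y0 / (y0 + (1 - y0) * np.exp(-a * t))
    print(np.max(np.abs(sol.y[0] - exact)))   # tiny
    print(exact[-1])                          # close to 1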

BD2.5.28 A second order chemical reaction involves the interaction (collision) of one molecule of a substance P with one molecule of a substance Q to produce one molecule of a new substance X; this is denoted P + Q → X. Suppose that p and q, where p ≠ q, are the initial concentrations of P and Q, respectively, and let x(t) be the concentration of X at time t. Then p - x(t) and q - x(t) are the concentrations of P and Q at time t, and the rate at which the reaction occurs is given by the equation
    dx/dt = α(p - x)(q - x),
where α is a positive constant.
(a) If x(0) = 0, determine the limiting value of x(t) as t → ∞ without solving the differential equation. Then solve the IVP and find x(t) for any t.
(b) If the substances P and Q are the same, then p = q and the above ODE is replaced by
    dx/dt = α(p - x)^2.
If x(0) = 0, determine the limiting value of x(t) as t → ∞ without solving the differential equation. Then solve the IVP and find x(t) for any t.
Solution. I recommend making a phase line.
(a) The equilibrium points are x = p and x = q. Without loss of generality we may assume p < q. In this case the solution to the IVP has the limit
    lim_{t→∞} x(t) = p.


We solve using separation of variables and partial fractions:
    dx/((p - x)(q - x)) = α dt,   so   (1/(q - p)) log((q - x)/(p - x)) = αt + C,
so (q - x)/(p - x) = De^{α(q - p)t} and, solving for x,
    x(t) = (pDe^{α(q-p)t} - q)/(De^{α(q-p)t} - 1).
Now we apply the initial condition: x(0) = 0 = (pD - q)/(D - 1), so D = q/p. So our final formula is
    x(t) = pq(e^{α(q-p)t} - 1)/(qe^{α(q-p)t} - p).
(b) The equilibrium point is x = p. This point is semistable; for an initial condition starting below x = p, such as our initial condition x(0) = 0, solutions approach it. In other words,
    lim_{t→∞} x(t) = p.
Now we solve our ODE:
    dx/(p - x)^2 = α dt,   so   1/(p - x) = αt + C.
Applying our IC right now, we find that 1/p = C, and we go back to solving our ODE. Making this substitution, we solve for x and find
    x(t) = αp^2 t/(αpt + 1).

MATH 320: PRACTICE EXAM 1

This is merely a practice exam. It is not meant to be your sole source of study. You should aim to complete this in 50 minutes without consulting any references. Just to reiterate: THIS CANNOT BE YOUR ONLY RESOURCE FOR STUDYING. Consult your class notes and textbook for additional examples and practice problems.
(1) Find the general/specific solutions to the following ODE/IVPs.
(a) y' - 7y = sin 2x
(b) dx/dt = 2t(x^2 + 9)
(c) 2xe^{2t} + (1 + e^{2t}) dx/dt = 0,  x(1) = 2
Solution. The tricky part is choosing the correct method.
(a) We may use our formulas for linear equations directly in this case:
    μ(x) = exp(∫ -7 dx) = e^{-7x}
and
    y(x) = (1/μ(x)) ∫ μ(x) sin 2x dx = -(1/53)(7 sin(2x) + 2 cos(2x)) + Ce^{7x}.
(b) Separate the variables:
    dx/(x^2 + 9) = 2t dt,   so   (1/3) arctan(x/3) = t^2 + C.
Now do the algebra for an explicit solution: x(t) = 3 tan(3t^2 + C).
(c) This equation is separable:
    dx/x = -2e^{2t}/(1 + e^{2t}) dt,   so   log x = -log(1 + e^{2t}) + C,   so   x(t) = De^{-log(1 + e^{2t})} = D/(1 + e^{2t}).
Applying the initial condition results in 2 = D/(1 + e^2), so D = 2(1 + e^2). Thus the solution is
    x(t) = 2(1 + e^2)/(1 + e^{2t})
for any t.
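If you want to double-check answers like these, sympy's dsolve will reproduce them. The snippet below is a sketch that checks (a) and (c); the constant name C1 is whatever sympy happens to choose.

    # Sketch: verify practice-exam problem 1 with sympy (illustrative only).
    import sympy as sp

    x = sp.symbols('x')
    y = sp.Function('y')
    # (a) y' - 7y = sin 2x: the particular part should be -(7 sin 2x + 2 cos 2x)/53
    sol_a = sp.dsolve(sp.Eq(y(x).diff(x) - 7*y(x), sp.sin(2*x)), y(x))
    print(sp.simplify(sol_a.rhs))

    t = sp.symbols('t')
    xf = sp.Function('x')
    # (c) 2x e^{2t} + (1 + e^{2t}) x' = 0, x(1) = 2: expect x(t) = 2(1 + e^2)/(1 + e^{2t})
    sol_c = sp.dsolve(sp.Eq(2*xf(t)*sp.exp(2*t) + (1 + sp.exp(2*t))*xf(t).diff(t), 0),
                      xf(t), ics={xf(1): 2})
    print(sp.simplify(sol_c.rhs - 2*(1 + sp.E**2)/(1 + sp.exp(2*t))))  # expect 0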


(2) What formula from basic calculus is being invoked each time separation of variables is used to solve a separable ODE? Which formula is invoked for the use of an integrating factor for 1st order linear ODEs?
Solution. Separation of variables uses the chain rule, and the product rule is used for 1st order linear ODEs.
(3) Consider the ODE
    dy/dt = y(y - 1)(y - 3).
Find the equilibrium points of this autonomous ODE and create a phase line to classify their stability. Sketch a few solution curves for various initial conditions. Then solve the ODE with initial condition y(0) = 2.
Solution. The equilibrium points are y = 0, y = 1, and y = 3. Using f(y) = y(y - 1)(y - 3) as a guide for our phase line, we have the following phase line:

We can use this phase line as a guide to sketch a few solution curves:

For the explicit solution we use separation of variables:
    dy/(y(y - 1)(y - 3)) = dt,   i.e.   (1/(3y) - 1/(2(y - 1)) + 1/(6(y - 3))) dy = dt.


Note that we used partial fractions here. Now integrate to find
    (1/3) log y - (1/2) log(y - 1) + (1/6) log(y - 3) = t + C,   i.e.   2 log y - 3 log(y - 1) + log(y - 3) = 6t + C.
A bit of algebra leads to
    y^2 (y - 3)/(y - 1)^3 = De^{6t}.
Applying the initial condition gives 2^2(2 - 3)/(2 - 1)^3 = De^0, so D = -4. So the (implicit) specific solution is
    y^2 (y - 3)/(y - 1)^3 = -4e^{6t}.
An important fact to observe here is that determining the stability of equilibrium points using this solution would be extremely difficult. In this instance it seems not only easier, but necessary, to use the phase line to examine such issues.
(4) Suppose we have a plate of enchiladas at 300° in a room that is 75°. After a minute out of the oven, the temperature has cooled to 250°.
(a) Assume the rate at which the temperature changes is directly proportional to the difference between the room temperature and the enchilada temperature. Write down the differential equation describing the time evolution of the temperature of the enchiladas. Define any variables you introduce.
(b) Find the formula that gives the temperature of the enchiladas for any time, t.
(c) How long until the enchiladas cool to 125°? Based on this, would you say we have an accurate temperature model?
Solution. This follows my previous lectures.
(a) dT_e/dt = k(T_e - 75). Here T_e is the temperature of the enchiladas and k is simply a constant of proportionality. Note that the equation comes from Newton's cooling law.
(b) This means solving the above equation using separation of variables:
    dT_e/(T_e - 75) = k dt,   so   log(T_e - 75) = kt + C,   so   T_e(t) = De^{kt} + 75.
Applying the initial condition, we find that T_e(0) = 300 = De^0 + 75, so D = 225. To determine k, we apply what we know from a minute later: T_e(1) = 250 = 225e^k + 75, so k = log(7/9).


Putting these pieces together, we see that
    T_e(t) = 225 e^{t log(7/9)} + 75 = 225 (7/9)^t + 75.

(c) This is now just a little algebra:
    225 e^{t log(7/9)} + 75 = 125,   so   t log(7/9) = log(2/9),   so   t = log(2/9)/log(7/9) ≈ 5.98.

That seems like a reasonable amount of time, so I would say the model is okay.
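The arithmetic in parts (b) and (c) is easy to reproduce; the sketch below simply evaluates the fitted model.

    # Sketch: reproduce the enchilada cooling numbers, T(t) = 225*(7/9)**t + 75.
    import math

    k = math.log(7/9)                      # fitted from T(1) = 250
    T = lambda t: 225 * math.exp(k * t) + 75
    print(T(0), T(1))                      # 300.0, 250.0
    t_125 = math.log(2/9) / math.log(7/9)  # solve 225*(7/9)**t + 75 = 125
    print(t_125)                           # about 5.98 minutes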

(5) Consider the linear system x1 + 2x2 + x3 = 0 2x1 3x2 + x3 = 0 3x1 + 5x2 = 0. Rewrite this system as a matrix, and use Gaussian elimination to nd the solutions to this system. Solution. The matrix form of this system and the row operations are 1 3 2 5 1 0
2R1 +R2 R2

1 2 1 0 3 5 0 0 1 0
3R1 +R3 R3

2 3 1 0 0 1 3 0 0 0

2 1

1 3

0
R2 +R3 R3

1 2 1 0 0 0 0 0

0 1 3 0 0

0 1 3 0

The system must have innitely many solutions, so we start by parameterizing x3 = t. Using row 2 of the matrix we have x2 = 3x3 = 3t. Going back up to row 1 gives x1 = 2x2 x3 = 6t + t = 7t. So x1 = 7t x2 = 3t x3 = t
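Hand row reduction is easy to double-check by machine. The sketch below uses sympy's rref and nullspace on an arbitrary illustrative matrix (not the exam system above) that also has a one-parameter family of solutions.

    # Sketch: let sympy row-reduce a matrix and report the solutions of M x = 0.
    import sympy as sp

    M = sp.Matrix([[1, 2, 1],
                   [2, 4, 3],
                   [3, 6, 5]])
    print(M.rref())        # reduced row echelon form and pivot columns
    print(M.nullspace())   # basis of solutions, here the multiples of (-2, 1, 0)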

Math 320 Spring 2009 Part III Linear Systems of Diff EQ


JWR May 14, 2009

Monday March 30

The Existence and Uniqueness Theorem for Ordinary Differential Equations which we studied in the first part of the course has a vector version which is still valid. Here is the version (for a single equation) from the first part of the course, followed by the vector version.
Theorem 1 (Existence and Uniqueness Theorem). Suppose that f(t, y) is a continuous function of two variables defined in a region R in the (t, y) plane and that the partial derivative ∂f/∂y exists and is continuous everywhere in R. Let (t_0, y_0) be a point in R. Then there is a solution y = y(t) to the initial value problem
    dy/dt = f(t, y),   y(t_0) = y_0

defined on some interval I about t_0. The solution is unique in the sense that any two such solutions of the initial value problem are equal where both are defined.
Theorem 2 (Existence and Uniqueness Theorem for Systems). Assume that f(t, x) is a (possibly time dependent) vector field on R^n, i.e. a function which assigns to each time t and each vector x = (x_1, . . . , x_n) in R^n a vector f(t, x) in R^n. Assume that f(t, x) is continuous in (t, x) and that the partial derivatives in the variables x_i are continuous. Then for each initial time t_0 and each point x_0 in R^n the initial value problem
    dx/dt = f(t, x),   x(t_0) = x_0
has a solution x = x(t)

dened on some interval I about t0 . The solution is unique in the sense that any two such solutions of the initial value problem are equal where both are dened. Proof. See Theorem 1 on page 682 in Appendix A.3 and Theorem 1 on page 683 in Appendix A.4 of the text. 3. For the rest of this course we will study the special case of linear system where the vector eld f (t, x) has the following form In that case the vector eld f (t, x) has the form f (t, x) = A(t)x + b(t) where A(t) is a continous n n matrix valued function of t and b(t) is a continuous vector valued function of t with values in Rn . A system of this form is called a linear system of dierential equations. We shall move the term A(t)x to the other side so the system takes the form dx + A(t)x = b(t) dt (1)

In case the the right hand side b is identically zero we say that the system is homogeneous otherwise it is called inhomogeneous or non homogeneous. If a11 a12 . . . a1n x1 b1 a21 a22 . . . a2n x2 b2 A= . x = . , b = . . . .. . , . . . . . . . . . . . xn an1 an2 . . . ann bn Then f = (f1 , . . . , fn ) where fi = ai1 (t)x1 ai2 (t)x2 ain (t)xn + bi (t) and the system (1) can be written as n equations dxi + ai1 (t)x1 + ai2 (t)x2 + + ain (t)xn = bi (t), dt i = 1, 2, . . . n (2)

in n unknowns x1 , . . . , xn where the aij and bi (t) are given functions of t. For the most part we shall study the case where the coecients aij are constant. Using matrix notation makes the theory of equation (1) look very much like the theory of linear rst order dierential equations from the rst part of this course. 2

4. In the case of linear systems of form (2) the partial derivatives are automatically continuous. This is because ∂f_i/∂x_j = -a_{ij}(t) and the matrix A(t) is assumed to be a continuous function of t. Hence the Existence and Uniqueness Theorem 2 applies. But something even better happens. In the general case of a nonlinear system solutions can become infinite in finite time. For example (with n = 1) the solution to the nonlinear equation dx/dt = x^2 is x = x_0/(1 - x_0 t), which becomes infinite when t = 1/x_0. But the following theorem says that in the linear case this doesn't happen.
Theorem 5 (Existence and Uniqueness for Linear Systems). Let t_0 be a real number and x_0 be a point in R^n. Then the differential equation (1) has a unique solution x defined for all t satisfying the initial condition x(t_0) = x_0.
Proof. See Theorem 1 on page 399 of the text and Theorem 1 page 681 of Appendix A.2.
Definition 6. An nth order linear differential equation is of the form
    d^n y/dt^n + p_1(t) d^{n-1}y/dt^{n-1} + ... + p_{n-1}(t) dy/dt + p_n(t) y = f(t)        (3)

where the functions p1 (t), . . . , pn (t), and f (t) are given and the function y is the unknown. If the function f (t) vanishes identically, the equation is called homogeneous otherwise it is called inhomogeneous. For the most part we shall study the case where the coecients p1 , . . . , pn are constant. 7. The text treats nth order equations in chapter 5 and systems in chapter 7, but really the former is a special case of the latter. This is because after introducing the variables di1 y xi = i1 dt the equation (3) becomes the system dxi xi+1 = 0, dt and i = 1, . . . , n 1

dxn + p1 (t)xn1 + + pn2 (t)x2 + pn (t)x1 = f (t). dt For example the 2nd order equation d2 y dy + p1 (t) + p2 (t)y = f (t) 2 dt dt 3

becomes the linear system dx1 x2 =0 dt dx 2 + p1 (t)x2 + p2 (t)x1 = f (t) dt in the new variables x1 = y, x2 = dy/dt. In matrix notation this linear system is d x1 0 1 x1 0 + = . x2 p2 (t) p1 (t) f (t) dt x2 For this reason the terminology and theory in chapter 5 is essentially the same as that in chapter 7. Theorem 8 (Existence and Uniqueness for Higher Order ODEs). If the functions p1 (t), . . . , pn1 (t), pn (t), f (t) are continuous then for any given numbers t0 , y0 , . . . , yn1 the nth order system (3) has a unique solution dened for all t satisfying the initial condition y(t0 ) = y0 , y (t0 ) = y1 , . . . , y (n1) (t0 ) = yn1 Proof. This is a corollary of Theorem 5. See Theorem 2 page 285, Theorem 2 page 297. Theorem 9 (Superposition). Suppose A(t) is a continuous n n matrix valued function of t. Then the solutions of the homogeneous linear system dx + A(t)x = 0 dt (4)

of differential equations form a vector space. In particular, the solutions of a higher order homogeneous linear differential equation form a vector space.
Proof. To show that the set of solutions is a vector space we must check three things:
1. The constant function 0 is a solution.
2. The sum x_1 + x_2 of two solutions x_1 and x_2 is a solution.
3. The product cx of a constant c and a solution x is a solution.

(See Theorem 1 page 283, Theorem 1 page 296, and Theorem 1 page 406 of the text.)
Theorem 10 (Principle of the Particular Solution). Let x_p be a particular solution of the non homogeneous linear system
    dx/dt + A(t)x = b(t).
Then if x is a solution of the corresponding homogeneous linear system
    dx/dt + A(t)x = 0
then x + x_p solves the nonhomogeneous system, and conversely every solution of the nonhomogeneous system has this form.
Proof. The proof is the same as the proof of the superposition principle. The text (see page 490) says it a bit differently: The general solution of the nonhomogeneous system is a particular solution of the nonhomogeneous system plus the general solution of the corresponding homogeneous system.
Corollary 11. Let y_p be a particular solution of the non homogeneous higher order differential equation
    d^n y/dt^n + p_1(t) d^{n-1}y/dt^{n-1} + ... + p_{n-1}(t) dy/dt + p_n(t) y = f(t).
Then if y is a solution of the corresponding homogeneous higher order differential equation
    d^n y/dt^n + p_1(t) d^{n-1}y/dt^{n-1} + ... + p_{n-1}(t) dy/dt + p_n(t) y = 0
then y + y_p solves the nonhomogeneous differential equation, and conversely every solution of the nonhomogeneous differential equation has this form.
Proof. Theorem 4 page 411.
Theorem 12. The vector space of solutions of (4) has dimension n. In particular, the vector space of solutions of (3) has dimension n.

Proof. Let e1 , e2 , . . . , en be the standard basis of Rn . By Theorem 2 there is a unique solution xi of (4) satisfying xi (0) = ei . We show that these solutions form a basis for the vector space of all solutions. The solutions x1 , x2 , . . . , xn span the space of solutions. If x is any solution, the vector x(0) is a vector in Rn and is therefore a linear combination x(0) = c1 e1 + c2 e2 + + cn en of e1 , e2 , . . . , en . Then c1 x1 (t) + c2 x2 (t) + + cn xn (t) is a solution (by the Superposition Principle) and agrees with x at t = 0 (by construction) so it must equal x(t) for all t (by the Existence and Uniqueness Theorem). The solutions x1 , x2 , . . . , xn are independent. If c1 x1 (t) + c2 x2 (t) + + cn xn (t) = 0 then evaluating at t = 0 gives c1 e1 + c2 e2 + + cn en = 0 so c1 = c2 = = cn = 0 as e1 , e2 , . . . , en are independent.
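In practice, the vector version of the theorem is what justifies handing a system directly to a numerical solver: one vector field, one initial vector, one solution curve. The sketch below integrates an illustrative 2 × 2 linear system with scipy; the matrix and initial condition are made up for the example, not taken from the notes.

    # Sketch: integrate x' = A x numerically from a given initial vector.
    import numpy as np
    from scipy.integrate import solve_ivp

    A = np.array([[0.0, 1.0],
                  [-2.0, -3.0]])
    sol = solve_ivp(lambda t, x: A @ x, (0, 5), [1.0, 0.0], dense_output=True)
    print(sol.y[:, -1])   # state vector at t = 5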

Wednesday April 1, Friday April 3

13. To proceed we need to understand the complex exponential function. Euler noticed that when you substitute z = i into the power series

e =
n=0

zn n!

(5)

you get e = =
i

n=0 2k 2k k=0

in n n!

i i2k+1 2k+1 + (2k)! (2k + 1)! k=0 (1)k 2k (1)k 2k+1 +i (2k)! (2k + 1)! k=0

=
k=0

= cos + i sin . This provides a handy way of remembering the trigonometric addition formulas: ei(+) = cos( + ) + i sin( + ) and ei ei = cos + i sin cos + + i sin = (cos cos sin sin ) + i(sin cos + cos sin ) 6

so equating the real and imaginary parts we get cos( + ) = cos cos sin sin , sin( + ) = sin cos + cos sin .

Because cos() = cos and sin() = sin we have cos = Note the similarity to cosh t = et + et , 2 sinh t = et et . 2 ei + ei , 2 sin = ei ei . 2i

It is not dicult to prove that the series (5) converges for complex numbers z and that ez+w = ez ew . In particular, if z = x + iy where x and y are real then ez = ex eiy = ex (cos y + i sin y) = ex cos y + iex sin y so the real and imaginary parts of ez are given ez = ex cos y =
ez + ez , 2

ez = ex sin y =

ez ez . 2i

14. We will use the complex exponential to nd solutions to the nth order linear homgeneous constant coecient dierential equation dn1 y dy dn y + p1 n1 + + pn1 + pn y = 0 n dt dt dt (6)

As a warmup lets nd all the solutions of the linear homogeneous 2nd order equation dy dy a 2 + b + cy = 0 dt dt where a, b, c are constants and a = 0. (Dividing by a puts the equation in the form (3) with n = 1 and p1 = b/a and p2 = c/a constant.) As an ansatz we seek a solution of form y = ert . Substituting gives (ar2 + br + c)ert = 0 so y = ert is a solution i ar2 + br + c = 0, 7 (7)

i.e. if r = r1 or r = r2 where b + b2 4ac , r1 = 2a

r2 =

b2 4ac . 2a

The equation (7) is called the charactistic equation (or sometimes the auxilliary equation). There are three cases. 1. b2 4ac > 0. In this case the solutions r1 and r2 are distinct and real and for any real numbers c1 and c2 the function y = c1 er1 t + c2 er2 t satises the dierential equation. 2. b2 4ac < 0. In this case the roots are complex conjugates: b 4ac b2 4ac b2 b +i , r2 = r = i . r1 = r = 2a 2a 2a 2a
The functions ert and ert are still solutions of the equation because calculus and algebra works the same way for real numbers as for complex numbers. The equation is linear so linear combinations of solutions are solutions so the real and imaginary parts ert + ert , 2 ert ert 2i

y1 =

y2 =

of solutions are solutions and for any real numbers c1 and c2 the function y = c1 y1 + c2 y2 is a real solution. 3. b2 4ac = 0. In this case the characteristic equation (7) has a double root so aD2 + bD + c = a(D r)2 where r = b/(2a). (For the moment interpret D as an indeterminate; well give another interpretation later.) It is easy to check that both ert and tert are solutions so y = c1 ert + c2 tert is a solution for any constants c1 and c2 . Example 15. For any real numbers c1 and c2 the function y = c1 e2t + c2 e3t is a solution the dierential equation d2 y dy 5 + 6y = 0. dt2 dt 8

Example 16. For any real numbers c_1 and c_2 the function y = c_1 e^t cos t + c_2 e^t sin t is a solution of the differential equation
    d^2y/dt^2 - 2 dy/dt + 2y = 0.
Example 17. For any real numbers c_1 and c_2 the function y = c_1 e^t + c_2 t e^t is a solution of the differential equation
    d^2y/dt^2 - 2 dy/dt + y = 0.
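The three cases are easy to sort out by machine once a, b, c are given: compute the roots of the characteristic polynomial and look at the discriminant. The helper below is an illustrative sketch using numpy.

    # Sketch: classify ay'' + by' + cy = 0 by the roots of ar^2 + br + c.
    import numpy as np

    def classify(a, b, c):
        r1, r2 = np.roots([a, b, c])
        disc = b*b - 4*a*c
        if disc > 0:
            kind = 'distinct real roots: c1*e^{r1 t} + c2*e^{r2 t}'
        elif disc < 0:
            kind = 'complex pair p +- qi: e^{pt}(c1 cos qt + c2 sin qt)'
        else:
            kind = 'double root r: (c1 + c2 t) e^{rt}'
        print(r1, r2, '->', kind)

    classify(1, -5, 6)    # Example 15: roots 2 and 3
    classify(1, -2, 2)    # Example 16: roots 1 +- i
    classify(1, -2, 1)    # Example 17: double root 1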

Friday April 3 Wednesday April 8


p(r) = rn + p1 rn1 + + pn1 r + pn

18. It is handy to introduce operator notation. If

and y is a function of t, then p(D)y := dn y dn1 y dy + p1 n1 + + pn1 + pn y n dt dt dt

denotes the left hand side of equation (6). This gives the dierential equation the handy form p(D)y = 0. The characteristic equation of this dierential equation is the algebraic equation p(r) = 0, i.e. y = ert = p(D)y = p(D)ert = p(r)ert = p(r)y so y = ert is a solution when p(r) = 0. When p(x) = xk we have that p(D)y = Dk y, i.e. k dk y d Dk y = k = y dt dt is the result of dierentiating k times. Here is what makes the whole theory work: 9

Theorem 19. Let pq denote the product of the polynomial p and the polynomial q, i.e. (pq)(r) = p(r)q(r). Then for any function y of t we have (pq)(D)y = p(D)q(D)y where D = d/dt. Proof. This is because D(cy) = cDy if c is a constant. Corollary 20. p(D)q(D)y = q(D)p(D)y. Proof. pq = qp. 21. If q(D) = 0 then certainly pq(D)y = 0 by the theorem and if p(D)y = 0 then pq(D)y = 0 by the theorem and the corollary. This means that we can solve a homogeneous linear constant coecient equation by factoring the characteristic polynomial. Theorem 22 (Fundamental Theorem of Algebra). Every polynomial has a complex root. Corollary 23. Every real polynomial p(r) factors as a product of (possibly repeated) linear factors r c where c is real and quadratic factors ar2 + br + c where b2 4ac < 0. 24. A basis for the solution space of (D r)k y = 0 is ert , tert , . . . , tk1 ert . A basis for the solution space of (aD2 + bD + c)k y = 0 is ept cos(qt), ept sin(qt), . . . , tk1 ept cos(qt), tk1 ept sin(qt) where r = p qi are the roots of ar2 + br + c, i.e. p =
b 2a

and q =

b2 4ac . 2a

Example 25. A basis for the solutions of the equation (D2 + 2D + 2)2 (D 1)3 y = 0 is et cos t, et sin t, tet cos t, tet sin t, et , tet , t2 et . 10

26. This reasoning can also be used to solve an inhomogeneous constant coefficient linear differential equation p(D)y = f(t) where the inhomogeneous term f(t) itself solves a homogeneous constant coefficient linear differential equation q(D)f = 0. This is because any solution y of p(D)y = f(t) will then solve the homogeneous equation (qp)(D)y = 0, and we can compute which solutions of (qp)(D)y = 0 also satisfy p(D)y = f(t) by the method of undetermined coefficients. Here's an
Example 27. Solve the initial value problem
    dy/dt + 2y = e^{3t},   y(0) = 7.

Since e^{3t} solves the problem (D - 3)e^{3t} = 0, we can look for the solutions to (D + 2)y = e^{3t} among the solutions of (D - 3)(D + 2)y = 0. These all have the form y = c_1 e^{-2t} + c_2 e^{3t}, but not every solution of the homogeneous second order equation solves the original first order equation. To see which do, we plug in:
    (D + 2)(c_1 e^{-2t} + c_2 e^{3t}) = (D + 2)c_2 e^{3t} = (3c_2 + 2c_2)e^{3t} = e^{3t}
if and only if c_2 = 1/5. Thus y_p = e^{3t}/5 is a particular solution of the inhomogeneous equation. By the Principle of the Particular Solution above, the general solution to the inhomogeneous equation is the particular solution plus the general solution to the homogeneous problem, i.e.
    y = c_1 e^{-2t} + e^{3t}/5.

To satisfy the initial condition y(0) = 7 we must have 7 = c_1 + 1/5, or c_1 = 6.8. The solution we want is y = 6.8e^{-2t} + 0.2e^{3t}.
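Example 27 is also easy to confirm numerically; the sketch below integrates the IVP with scipy and compares against 6.8e^{-2t} + 0.2e^{3t} (illustrative check only).

    # Sketch: check y' + 2y = e^{3t}, y(0) = 7 against the closed form.
    import numpy as np
    from scipy.integrate import solve_ivp

    t = np.linspace(0, 2, 100)
    sol = solve_ivp(lambda t, y: np.exp(3*t) - 2*y, (0, 2), [7.0], t_eval=t, rtol=1e-10)
    exact = 6.8*np.exp(-2*t) + 0.2*np.exp(3*t)
    print(np.max(np.abs(sol.y[0] - exact)))   # tiny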


Friday April 10, Monday April 13

28. The displacement from equilibrium y of a object suspended from the ceiling by a spring is governed by a second order dierential equation d2 y dy + c + ky = F (t) (8) dt2 dt where m is the mass of the object, c is a constant called the damping constant, k is a constant of proportionality called the spring constant, and F (t) is a (generally time dependent) external force. The constants m and k are positive and c is nonnegative. This is of course Newtons law m Ftot = ma, d2 y dt2 = FS + FR + FE of three terms, viz. a := (9) (10)

where the total force Ftot is the sum Ftot the spring force FS = ky, the resistive force FR = c and the external force

dy , dt

FE = F (t). The spring force FS is the combined force of gravity and the force the spring exerts to restore the mass to equilibrium. Equation (9) says that the sign of the spring force FS is opposite to the sign of the displacement y and hence FS pushes the object towards equilibrium. The resistive FR is the resistive force proportional to the velocity of the object. Equation (10) says that the sign of the resistive force FR is opposite to the sign of the velocity dy/dt and hence FR slows the object.. In the text, the resistive force FR is often ascribed to a dashpot (e.g. the device which prevents a screen door from slamming) but in problems it might be ascribed to friction or to the viscosity of the surrounding media. (For example, if the body is submersed in oil, the resistive force FR might be signicant.) The external force will generally arise from some oscillation, e.g. F (t) = F0 cos(t) (11)

and might be caused by the oscillation of the ceiling or of a ywheel. (See gure 5.6.1 in the text.) 12

29. Equation 9 is called Hookes law. It can be understood as follows. The force FS depends solely of the position y of the spring. For y near y0 the Fundamental Idea of Calculus says that the force FS (y) is well approximated by the linear function whose graph is the tangent line to the graph of FS at y0 , i.e. FS (y) = FS (y0 ) + FS (y0 )(y y0 ) + o(y y0 ) (12) where the error term o(y y0 ) is small in the sense that
yy0

lim

o(y y0 ) = 0. y y0

The fact that equilibrium occurs at y = y0 means that FS (y0 ) = 0, i.e. (assuming FE is zero) i.e. if the object is at rest at position y = y0 then Newtons law F = ma holds. The assertion that y is displacement from equilibrium means that y0 = 0, i.e. we are measuring the position of the object as its signed distance from its equilibrium position as opposed to say its height above the oor or distance from the ceiling. From this point of view Hookes law is the approximation that arises when we ignore the error term o(y y0 ) in (12). This is the same reasoning that leads to the equation g d2 + =0 t dt L as an approximation to the equation of motion g d2 + sin = 0 t dt L for the simple pendulum. (See page 322 of the text.) On the other hand, some of the assigned problems begin with a sentence like A weight of 5 pounds stretches a spring 2 feet. In this problem there are apparently two equilibria, one where the weight is not present and another where it is. In this situation the student is supposed to assume that the force FS is a linear (not just approximately linear) function of the position, so that the spring constant k is 5/2 (i.e. the slope of the line) and that the equilibrium position occurs where the weight is suspended at rest. 30. The energy is dened as the sum E := mv 2 ky 2 + , 2 2 13 v := dy dt

of the kinetic energy mv v 2/2 and the potential energy ky 2 /2. If the external force FE is zero, the energy satises dv dy d2 y dE = mv + ky = v m 2 + ky dt dt dt de = c dy dt
2

When c (and FE ) are zero, dE/dt = 0 so E is constant (energy is conserved) while if c is positive, dE/dt < 0 so E is decreasing (energy is dissipated). When we solve the equation (with FE = 0) we will see that the motion is periodic if c = 0 and tends to equilibrium as t becomes innite if c > 0. 31. It is important to keep track of the units in doing problems of this sort if for no other reason than that it helps avoid errors in computation. We never add two terms if they have dierent units and whenever a quantity appears in a nonlinear function like the sine or the exponential function, that quantity must be unitless. In the metric system the various terms have units as follows: y has the units of length: meters (m). dy has the units of velocity: meters/sec. dt d2 y has the units of acceleration: meters/sec2 . dt2

m has the units of mass: kilograms (kg). F has the units of force: newtons = kgm/sec2 . Using the principle that quantities can be added or equated only if they have the same units and that the units of a product (or ratio) of two quantities is the product (or ratio) of the units of the quantities we see that c has the units of kgm/sec and that k has the units of kg/sec2 . The quantity 0 := k m

thus has the units of frequency: sec1 which is good news: When c = 0 and FE = 0 equation (8) becomes the harmonic oscillator equation d2 y 2 + 0 y = 0 dt2 14 (13)

(we divided by m) and the solutions are y = A cos(0 t) + B sin(0 t). (The input 0 t to the trigonometric functions is unitless.) Remark 32. When using English units (lb, ft, etc.) you need to be a bit careful with equations involving mass. Pounds (lb) is a unit of force, not mass. Using mg = F and g=32 ft/sec2 we see that an object at the surface of the earth which weighs 32 lbs (i.e. the force on it is 32 lbs) will have a mass of 1 slug1 So one slug weighs 32 lbs at the surface of the earth (or lb = (1/32)slugft/sec2 ). When using metric units, kilogram is a unit of mass not force or weight. A 1 kilogram mass will weigh 9.8 newtons on the surface of the earth. (g= 9.8 m/sec2 and newton = kgm/sec2 ). Saying that a mass weighs 1 kilogram is technically incorrect usage, but it is often used. What one really means is that it has 1 kilogram of mass and therefore weighs 9.8 newtons. 33. Consider the case where the external force FE is not present. In this case equation (8) reduces to the homogeneous problem m d2 y dy + c + ky = 0 dt2 dt (15) (14)

The roots of the characteristic equation are c + c2 4mk c c2 4mk r1 = , r1 = . 2m 2m We distinguish four cases. 1. Undamped: c = 0. The general solution is y = A cos(0 t) + B sin(0 t), 0 := k m

2. Under damped: c2 4mk < 0. The general solution is y = ept (A cos(1 t) + B sin(0 t)) ,
1

p :=

c , 2m

1 :=

2 0 p2

The unit of mass in the English units is called the slug really!

15

3. Critically damped: c2 4mk = 0. The general solution is y = ept (c1 + c2 t) 4. Over damped: c2 4mk. The general solution is y = c1 er1 t + c2 er2 t . In the undamped case (where c = 0) the motion is oscillatory and the limit of y(t) as t becomes innite does not exists (except of course when A = B = 0). In the three remaining cases (where c > 0) we have limt y = 0. In case 3 this is because lim tept = 0
t

(in a struggle between an exponential and a polynomial the exponential wins) while case 4 we have r2 < r1 < 0 because c2 4mk < c. In cases 1 and 2 we can dene and C by C := A2 + B 2 , cos = A , C sin = B C

and the solution takes the form y = Cept cos(1 t ). In the undamped case p = 0 and the y is a (shifted) sine wave with amplitude C, while in the under damped case the graph bounces back and forth between the two graphs y = ept . 34. Now consider the case of forced oscillation where the external force is given by (11). The right hand side F (t) = F0 cos(t) of (8) solves the ODE (D2 + 2 )F = 0 so we can solve using the method of undetermined coecients. We write equation (8) in the form (mD2 + cD + k)y = F0 cos t (16)

and observe that every solution of this inhomogeneous equation satises the homogeneous equation (D2 + 2 )(mD2 + cD + k)y = 0. The solutions of this last equation all can be written as y + yp where (mD2 + cD + k)y = 0 and (D2 + 2 )yp . We know all the solutions of the former 16

equation by the previous paragraph and the most general solution of the latter is yp = b1 cos t + b2 sin t. We consider three cases. (i) c = 0 and = 0 . The function yp satises (16) i (k m 2 )b1 cos t + (k m 2 )b2 sin t = F0 cos t which (since the functions cos t and sin t are linearly independent) can only be true if b2 = 0 and b1 = F0 /(k m 2 ). Our particular solution is thus yp = F0 cos t F0 cos t = 2 2 k m m(0 2 )

(ii) c = 0 and = 0 . In this case we can look for a solution ihe form yp = b1 t cos 0 t + b2 t sin 0 t but it is easier to let tend to 0 in the solution we found in part (i). Then by lHpitals rule (dierentiate top and bottom o with respect to ) we get yp = lim F0 cos t F0 t sin 0 t = 2 2) m(0 2m0

The solution yp bounces back and forth between the two lines y = F0 t/(2m). The general solution in this case is (by the principle of the particular solution) the general solution of the homogeneous system plus the solution yp and the former remains bounded. Thus every solution oscillates wildly as t becomes innite. This is the phenomenon of resonance and is the cause of many engineering disasters. (See the text page 352.) (iii) c > 0. The high school algebra in this case is the most complicated but at least we know that there are no repeated roots since the roots of r2 + 2 = 0 are pure imaginary and the roots of mr2 + cr + k = 0 are not. The function yp satises (16) i (k m 2 )b1 + cb2 cos t + cb1 + (k m 2 )b2 sin t = F0 cos t so (k m 2 )b1 + cb2 = F0 and 17 (cb1 + (k m 2 )b2 = 0.

In matrix form this becomes k m 2 c k m 2 and the solution is b1 b2 k m 2 c F0 = 2 c k m 0 1 k m 2 c = 2 )2 + (c)2 c k m 2 (k m F0 k m 2 = c (k m 2 )2 + (c)2


1

b1 b2

F0 0

F0 0

so our particular solution is yp = F0 (k m 2 ) cos t + c sin t . (k m 2 )2 + (c)2

As above this can writen as yp = where cos = k m 2 (k m 2 )2 + (c)2 , sin = c (k m 2 )2 + (c)2 . F0 cos(t ) (k m 2 )2 + (c)2

The general solution in this case is (by the principle of the particular solution) the general solution of the homogeneous system plus the solution yp . The former decays to zero as t becomes innite, and the latter has the same frequency as the external force F0 cos t but has a smaller amplitude and a phase shift . (See the text page 355.)

Wednesday April 15 - Friday April 17


dx1 = 3x1 , dt dx2 = 5x2 , dt 18

35. Its easy to solve the initial value problem x1 (0) = 4, x2 (0) = 7.

The answer is x1 = 4e3t , x2 = 7e5t . The reason this is easy is that this really isnt a system of two equations in two unknowns, it is two equations each in one unknown. When we write this system in matrix notation we get dx = Dx, dt x= x1 x2 , D= 3 0 0 5 .

The problem is easy because the matrix D is diagonal. 36. Here is a system which isnt so easy. dy1 = y1 + 4y2 , dt dy2 = 2y1 + 7y2 , dt y1 (0) = 15, y2 (0) = 11.

To solve it we make the change of variables y1 = 2x1 + x2 , Then dx1 = dt dy1 dy2 = dt dt (y1 + 4y2 ) (2y1 + 7y2 ) = 3y1 3y2 = 3x1 , y2 = x1 + x2 ; x1 = y1 y2 , x2 = y1 + 2y2 .

dx2 dy1 dy2 = +2 = (y1 + 4y2 ) + 2(2y1 + 7y2 ) = 5(y1 + 2y2 ) = 5x2 , dt dt dt x1 (0) = y1 (0) y2 (0) = 4, x2 (0) = y1 (0) + 2y2 (0) = 7.

The not so easy problem 36 has been transformed to the easy problem 35 and the solution is y1 = 2x1 + x2 = 8e3t + 7e5t , Its magic! 37. To see how to nd the magic change of variables rewrite problem 36 in matrix form dy = Ay, dt A= 1 4 2 7 19 , y= y1 y2 y2 = x1 + x2 = 4e3t + 7e5t .

and the change of variables as y = Px, In the new variables we have dx dy = P1 = P1 Ay = P1 APx. dt dt so we want to nd a matrix P so that the matrix D := P1 AP = 1 0 0 2 (17) x = P1 y.

is diagonal. Once we have done this the not so easy initial value problem dy = Ay, dt y(0) = y0 := 15 11

is transformed into the easy intial value problem dx = Dx, dt x(0) = x0 := P1 y0

as follows. Let x be the solution to the easy problem and define y := Px. Since the matrix P is constant we can differentiate the relation y = Px to get
    dy/dt = P dx/dt = PDx = PDP^{-1}y = Ay.
Also, since the equation y := Px holds for all t it holds in particular for t = 0, i.e. y(0) = Px(0) = Px_0 = PP^{-1}y_0 = y_0. This shows that y := Px solves the not so easy problem when x solves the easy problem.
38. So how do we find matrices P and D satisfying (17)? Multiplying the equation A = PDP^{-1} on the right by P gives AP = PD. Let v and w be the columns of P so that P = [v  w]. Then, as matrix multiplication distributes over concatenation, we have AP = [Av  Aw].

But PD = v1 w1 v2 w2 1 0 0 2 = 1 v1 2 w1 1 v2 2 w2 = 1 v 2 w .

Thus the equation AP = PD can be written as [Av  Aw] = [λ_1 v  λ_2 w],

i.e. the single matrix equation AP = PD becomes two vector equations Av = λ_1 v and Aw = λ_2 w.
Definition 39. Let A be an n × n matrix. We say that λ is an eigenvalue for A iff there is a nonzero vector v such that
    Av = λv.                                                        (18)

Any nonzero solution v of equation (18) is called an eigenvector of A corresponding to the eigenvalue λ. The set of all solutions to (18) (including v = 0) is called the eigenspace corresponding to the eigenvalue λ. Equation (18) can be rewritten as a homogeneous system (λI - A)v = 0, which has a nonzero solution v if and only if det(λI - A) = 0. The polynomial det(λI - A) is called the characteristic polynomial of A, and so the eigenvalues of a matrix are the roots of its characteristic polynomial.
40. We now find the eigenvalues of the matrix A from section 37. The characteristic equation is
    det(λI - A) = det [[λ - 1, -4], [2, λ - 7]] = (λ - 1)(λ - 7) + 8 = λ^2 - 8λ + 15,


so the eigenvalues are λ_1 = 3 and λ_2 = 5. The eigenvectors v corresponding to the eigenvalue λ_1 = 3 are the solutions of the homogeneous system
    (3I - A)v = [[3 - 1, -4], [2, 3 - 7]] v = [[2, -4], [2, -4]] v = 0,
i.e. the multiples of the column vector v = (2, 1). The eigenvectors w corresponding to the eigenvalue λ_2 = 5 are the solutions of the homogeneous system
    (5I - A)w = [[5 - 1, -4], [2, 5 - 7]] w = [[4, -4], [2, -2]] w = 0,
i.e. the multiples of the column vector w = (1, 1). A solution to (17) is given by
    P = [[2, 1], [1, 1]],   D = [[3, 0], [0, 5]],
and the change of variables y = Px used in problem 36 is
    [y_1, y_2]^T = [[2, 1], [1, 1]] [x_1, x_2]^T = [2x_1 + x_2, x_1 + x_2]^T.
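The computation in paragraph 40 can be confirmed with numpy; note that numpy normalizes eigenvectors to unit length, so they come out as scalar multiples of (2, 1) and (1, 1), possibly listed in the other order.

    # Sketch: check the eigen-decomposition of A = [[1, 4], [-2, 7]].
    import numpy as np

    A = np.array([[1.0, 4.0],
                  [-2.0, 7.0]])
    vals, vecs = np.linalg.eig(A)
    print(vals)                        # 3 and 5 (possibly in the other order)
    print(vecs[:, 0] / vecs[1, 0])     # proportional to (2, 1) if 3 comes first
    print(vecs[:, 1] / vecs[1, 1])     # proportional to (1, 1) if 5 comes second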

Remark 41. The solution of the diagonalization problem (17) is never unique because we can always nd another solutions be replacing each eigenvector (i.e. column of P) by a nonzero multiple of itself. For example if c1 = 0 and c2 = 0, Av = 1 v, Aw = 2 w, then also A(c1 v) = 1 (c1 v), A(c2 w) = 2 (c2 w) so the matrix c1 v c2 w should work as well as the matrix P = v w used above. Indeed c1 v c2 w = PC = where C = c1 0 0 c2

and CD = DC (since both are diagonal) so (PC)D(PC)1 = PCDC1 P1 = PDCC1 P1 = PDP1 which shows that if A = PDP1 then also A = (PC)D(PC)1 .

22

Monday April 20 - Wednesday April 22

Denition 42. The matrix A is similar to the matrix B i there is an invertible matrix P such that A = PBP1 . A square matrix D is said to be diagonal if its o diagonal entries are zero, i.e. i it has the form 1 0 0 0 2 0 D= . ... 0 0 n A square matrix is said to be diagonzlizable i it is similar to a diagonal matrix. Remark 43. (i) Every matrix is similar to itself and hence a diagonal matrix is diagonalizable. (ii) If A is similar to B, then B is similar to A (because A = PBP1 = B = P1 AP). (iii) If A is similar to B and B is similar to C, then A is similar to C (because A = PBP1 and B = QCQ1 = A = (PQ)C(PQ)1 ). Theorem 44. Similar matrices have the same characteristic polynomial. Proof. If A = PBP1 then I A = I PBP1 = PIP1 PBP1 = P(I B)P1 so det(IA) = det P(IB)P1 = det(P) det(IB) det(P)1 = det(IB) as the determinant of the product is the producr of the determinants and the determinant of the inverse is the inverse of the determinant. Remark 45. It follows that similar matrices have the same eigenvalues as the eigenvalues are the roots of the characteristic polynomial. Of course they dont necessarily have the same eigenvectors, but if A = PBP1 and w is an eigenvector for B then Pw is an eigenvector for A: A(Pw) = (PBP1 )(Pw) = P(BW) = P(w) = (PW). Theorem 46. An n n matrix A is diagonalizable i there is a linearly independent sequence (v1 , v2 , . . . , vn ) of eigenvectors of A 23

Proof. The matrix P = v1 , v2 , . . . , vn is invertible if and only if its columns v1 , v2 , . . . , vn are linearly independent and the matrix equation AP = PD holds with D is diagonal if and only if the columns of P are eigenvectors of A. (See Theorem 2 on page 376 of the text.) Theorem 47. Assume that v1 , v2 , . . . , vk are nonzero eigenvectors of A corresponding to distinct eigenvalues 1 , 2 , . . . , k , i.e. Avi = i , vi = 0, and i = j for i = j. Then the vector v1 , v2 , . . . , vk are linearly independent. Proof. By induction on k. The one element sequence v1 is independent because we are are assuming v1 is non zero. Assume as the hypothesis of induction that v2 , . . . , vk are independent. We must show that v1, v2 , . . . , vk are independent. For this assume that c1 v1 + c2 v2 + + ck vk = 0. (We must show that c1 = c2 = = ck = 0.) Multiply by 1 I A: c1 (1 I A)v1 + c2 )1 I A)v2 + + ck (1 I A)vk = 0. Since Avi = i vi this becomes c1 (1 1 )v1 + c2 (1 2 )v2 + + ck (1 k )vk = 0. Since 1 1 = 0 this becomes c2 (1 2 )v2 + + ck (1 k )vk = 0. As v1 , . . . , vk are independent (by the induction hypothesis) we get c2 ( 2 ) = = ck (1 k ) = 0 as the eigenvalues are distinct we have 1 i = 0 for i > 1 so c2 = = ck = 0. But now c1 v1 = 0 so c1 = 0 as well (since v1 = 0) so the ci are all zero as required. (See Theorem 2 on page 376 of the text.) Corollary 48. If an n n matrix A has n distinct real eigenvalues, then it is diagonalizable. Proof. A square matrix is invertible if and only if its columns are independent. (See Theorem 3 on page 377 of the text.) 24

Remark 49. The analog of the corollary remains true if we allow complex eigenvalues and assume that all matrices are complex. Example 50. The characteristic polynomial of the matrix A= is det(I A) = det 3 4 4 3 = ( 3)2 + 16 = 2 6x + 25 3 4 4 3

and the roots are 3 4i. The eigenvalues arent real so the eigenvectors cant be real either. However, the matrix A can be diagonalized if we use complex numbers: For = 3 + 4i the solutions of (I A)v = 4i 4 4 4i v1 v2 =4 iv1 v2 v1 + iv2 =0

arei (v1 , v2 ) = c(1, i) while for = 3 4i the solutions of (I A)v = 4i 4 4 4i v1 v2 =4 iv1 v2 v1 iv2 =0

are (v1 , v2 ) = c(1, i). Hence we have A = PDP1 where D= 3 + 4i 0 0 3 4i , P= 1 1 i i , P1 = 1 2i i 1 i 1 .

Example 51. Not every matrix is diagonalizable even if we use complex numbers. For example, the matrix N= 0 1 0 0

is not diagonalizable. This is because N2 = 0 but N = 0. If N = PDP1 then 0 = N2 = (PDP1 )(PDP1 ) = PD2 P1 so D2 = 0. But if D is diagonal, then D2 = 1 0 0 2
2

2 0 1 0 2 2

is also diagonal so 2 = 2 = 0 so 1 = 2 = 0 so D = 0 so N = 0 1 2 contradicting N = 0. 25

Friday April 24

52. The Wronskian of an n functions x1 (t), x2 (t), . . . , xn (t) taking values in Rn is the determinant W (t) = det(x1 , x2 , . . . , xn ) of the n n matrix whose ith column is xi . Of course for each value of t it is the case that W (t) = 0 if and only if the vectors x1 (t), x2 (t), . . . , xn (t) are linearly independent and certainly it can happen that W (t) is zero for some values of t and non-zero for other values of t. But if the functions xi are solution of a matrix dierential equation, this is not so: Theorem 53. If x1 (t), x2 (t), . . . , xn (t) are solutions of the homogeneous lindx ear system = A(t)x then either W (t) = 0 for all t or W (t) = 0 for all dt t. Proof. This is an immediate consequence of the Existence and Uniqueness Theorem: If W (t0 ) = 0 then there are constants c1 , c2 , . . . , cn such that c1 x1 (t0 ) + c2 x2 (t0 ) + + cn , xn (t0 ). Now x(t) := c1 x1 (t) + c2 x2 (t) + + cn , xn (t) satises the equation and x(t0 ) = 0 so (by uniqueness) x(t) = 0 for all t. Remark 54. As explained in paragraph 7 this specializes to higher order dierential equations. The Wronskian of n functions x1 (t), x2 (t), . . . , xn (t) is Wronskian of of the coresponding sequence xi = xi , xi , xi , . . . , xi
(n1)

of vectors. For example in n = 2 the Wronskian of x1 (t), x2 (t) is W (x1 , x2 ) = det where x := dx/dt. Denition 55. The trace Tr(A) of a square matrix A is the sum of its diagonal entries. Theorem 56. Let x1 (t), x2 (t), . . . , xn (t) be solutions of the homogeneous lindx ear system = A(t)x. Then the Wronskian W (t) satises the dierential dt equation dW = Tr A(t) W (t). dt 26 x1 x1 x1 x2 = x1 x2 x2 x1

Proof. We do the 2 2 case A= a11 a12 a21 a22 , x1 = x11 x21 , x2 = x12 x22 .

Let x = dx/dt. Writing out the equations x1 = Ax1 and x2 = Ax2 gives x11 = a11 x11 + a12 x21 , x21 = a21 x11 + a22 x21 , Since W = x11 x22 x12 x21 we get W = = x11 x22 + x11 x22 x12 x21 x12 x21 (a11 x11 + a12 x21 )x22 + x11 (a21 x12 + a22 x22 ) (a11 x12 + a12 x22 )x21 x12 (a21 x11 + a22 x21 ) = (a11 + a22 )(x11 x22 x12 x21 ) x12 = a11 x12 + a12 x22 , x22 = a21 x12 + a22 x22 .

= Tr A W. as claimed.

Monday April 29

57. Consider a system of tanks containing brine (salt and water) connected by pipes through which brine ows from one tank to another.

Wednesday April 29

58. Consider a collection of masses on a track each connected to the next by a spring with the rst and last connected to opposite walls.

k1

m1

k2

m2

k3

m3

k4

27

There are n masses lying on the x axis. The left wall is at x = a, the right wall at x = b, and the ith mass is at Xi so a < X1 < X2 < Xn < b. The spring constant of the ith spring is ki and the ith mass is mi . The rst spring connects the rst mass to the left wall, the last ((n + 1)st) spring connects the last mass to the right wall, and the (i + 1)st spring (i = 1, 2, . . . , n 1) connects the ith mass to the (i + 1)st mass. We assume that there is an equilibrium conguration a < X0,1 < X0,2 < X0,n < b where the masses are at rest and dene the displacement from equilibrium x1 , x2 , . . . , xn by xi := Xi X0,i . note that the distance Xi+1 Xi between two adjacent masses is related to the dierence xi+1 xi of the displacements by the formula Xi+1 Xi = (X0,i+1 X0,i ) + (xi+1 xi ). (19)

With n = 3 (as in the diagram above) the equations of motion for this system are d2 x1 m1 2 = (k1 + k2 )x1 + k2 x2 dt d2 x2 m2 2 = k2 x1 (k2 + k3 )x2 + k3 x3 dt d2 x3 m3 2 = k3 x2 (k3 + k4 )x3 dt or in matrix notation d2 x M 2 = Kx (20) dt with m1 0 0 x1 x = x2 M = 0 m2 0 , x3 0 0 m3 (k1 + k2 ) k2 0 . k2 (k2 + k3 ) k3 K= 0 k3 (k3 + k4 ) In the general case (arbitrary n) the matrix M is diagonal with diagonal entries m1 , m2 , . . . , mn and the matrix K is symmetric tridiagonal with entries (k1 + k2 ) . . . , (kn + kn+1 ) on the diagonal and entries k2 , . . . , kn on the sub and super diagonal. 28

59. To derive the system of equations from rst principles2 let Ti (i = 1, 2, . . . , n + 1) denote the tension in ith spring. This means that the force exerted by the ith spring on the mass attached to its left end if Ti and the force exerted by the ith spring on the mass attached to its right end is Ti . The tension depends on the length Xi Xi1 of the ith spring so by linear approximation and (19) Ti = Ti,0 + ki (xi+1 xi ) + o(xi+1 xi ) where Ti,0 denotes the tension in the ith spring when the system is in equilibrium and ki is the derivative of the tension at equilibrium. (The tension is assumed positive meaning that each spring is trying to contract, so the mass on its left is pulled to the right, and the mass on the right is pulled to the left.) We ignore the small error term o(xi+1 xi ). The net force on the ith mass is Ti+1 Ti = T0,i+1 + ki+1 (xi+1 xi ) T0,i ki (xi xi1 ). At equilibrium the net force on each mass is zero: Ti+1,0 Ti,0 = 0 so Ti+1 Ti simplies to Ti+1 Ti = ki1 xi1 (ki+1 + ki )xi + ki+1 xi+1 . Remark 60. If the masses are hung in a line from the ceiling with the lowest mass attached only to the mass directly above, the system of equations is essentially the same: one takes kn+1 = 0. 61. Assume for the moment that all the masses are equal to one. Then the system takes the form d2 x = Kx. (21) dt2 The eigenvalues of K are negative so we may write them as the negatives of squares of real numbers. If Kv = 2 v then for any constants A and B the function x = (A cos t + B sin t)v is a solution of (21). The following theorem says that K is diagonalizable so this gives 2n independent solutions of (21).
2

This is not in the text: I worked it out to fulll an inner need of my own.

29

Theorem 62. Assume A is a symmetric real matrix, i.e. AT = A. Then A is diagonalizable. If in addition, Av, v > 0 for all v = 0 then the eigenvalues are positive. Remark 63. This theorem is sometimes called the Spectral Theorem. It is often proved in Math 340 but I havent found it in our text. It is also true that for a symmetric matrix, eigenvectors belonging to distinct eigenvalues are orthogonal. In fact there is an orthonormal basis of eigenvectors i.e. a basis v1 , . . . , vn so that |vi | = 1 and vi , vj = 0 for i = j. (We can always make a non zero eigenvector into a unit vector by dividing it by its length.) Remark 64. We can always make a change of variables to convert (20) to (21) as follows. Let M1/2 denote the diagonal matrix whose entries are mi . Then multiplying (20) by the inverse M1/2 of M1/2 gives M1/2 d2 x = M1/2 Kx. dt2

Now make the change of variables y = M1/2 x to get d2 x d2 y = M1/2 2 = M1/2 Kx = M1/2 KM1/2 y. dt2 dt It is easy to see that M1/2 KM1/2 (the new K) is again symmetric.

10

Friday May 1

65. You can plug square matrix into a polynomial (or more generally a power series) just as if it is a number. For example, if f (x) = 3x2 + 5 then f (A) = 3A2 + 5I. Since you add or multiply diagonal matrices by adding or multiplying corresponding diagonal entries we have f 0 0 = f () 0 0 f () . (22)

Finally, since P(A + B)P1 = PAP1 + PBP1 and (PAP1 )k = PAk P1 we have f (PAP1 ) = Pf (A)P1 . (23)

30

66. This even works for power series. For numbers the exponential function has the power series expansion

ex =
k=0

xk k!

so for square matrices we make the denition

exp(A) :=
k=0

1 k A . k!

Replacing A by tA gives

exp(tA) =
k=0

tk k A . k!

(Some people write etA .) Dierentiating term by term gives d exp(tA) = dt =


k=1

k
k=0

tk1 k A k!

tk1 Ak (k 1)! tj j+1 A j!

=
j=0

=A
j=0

tj j A j!

= A exp(tA) Since exp(tA) = I when t = 0 this means that the solution to the initial value problem dx = Ax, x(0) = x0 dt is x = exp(tA)x0 .

31

67. The moral of the story is that matrix algebra is just like ordinary algebra and matrix calculus is just like ordinary calculus except that the commutative law doesnt always hold. However the commutative law does hold for powers of a single matrix: Ap Aq = Ap+q = Aq+p = Aq Ap . 68. You can compute the exponential of a matrix using equations (22) and (23). If 0 A = PDP1 , D= , 0 then exp(tA) = P exp(tD)P1 , exp(tD) = et 0 0 et .

Example 69. In paragraph 40 we saw how to diagonalize the matrix
    A = [[1, 4], [-2, 7]].
We found that A = PDP^{-1} where
    P = [[2, 1], [1, 1]],   P^{-1} = [[1, -1], [-1, 2]],   D = [[3, 0], [0, 5]].
Now
    exp(tD) = [[e^{3t}, 0], [0, e^{5t}]],
so exp(tA) = P exp(tD) P^{-1} is computed by matrix multiplication:
    exp(tA) = [[2, 1], [1, 1]] [[e^{3t}, 0], [0, e^{5t}]] [[1, -1], [-1, 2]]
            = [[2e^{3t} - e^{5t}, -2e^{3t} + 2e^{5t}], [e^{3t} - e^{5t}, -e^{3t} + 2e^{5t}]].
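The closed form for exp(tA) can be checked against scipy's matrix exponential at any particular time; the value t = 0.3 below is arbitrary.

    # Sketch: compare the Example 69 formula with scipy.linalg.expm.
    import numpy as np
    from scipy.linalg import expm

    A = np.array([[1.0, 4.0], [-2.0, 7.0]])
    t = 0.3
    closed = np.array([[2*np.exp(3*t) - np.exp(5*t), -2*np.exp(3*t) + 2*np.exp(5*t)],
                       [np.exp(3*t) - np.exp(5*t),   -np.exp(3*t) + 2*np.exp(5*t)]])
    print(np.max(np.abs(expm(t*A) - closed)))   # essentially zero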

Example 70. If N2 = 0 then

exp(tN) =
k=0

tk k N = k! 32

k=0

tk k N = I + tN. k!

In particular, exp 0 t 0 0 = 1 t 0 1 .

Theorem 71. If AB = BA then exp(A + B) = exp(A) exp(B). Proof. The formula ea+b = aa eb holds for numbers. Here is a proof using power series. By the Binomial Theorem (a + b)k = so k! p q ab p!q! p+q=k

k=0

(a + b)k = k!

ap bq = p!q! k=0 p+q=k

p=0

ap p!

q=0

bq q!

If AB = BA the same proof works to prove the theorem.
Example 72. (Problem 4 page 487 of the text) We compute exp(tA) where
    A = [[3, -1], [1, 1]].
The characteristic polynomial is
    det(λI - A) = det [[λ - 3, 1], [-1, λ - 1]] = (λ - 3)(λ - 1) + 1 = (λ - 2)^2

and has a double root. But the null space of (2I A) is one dimensional so we cant diagonalize the matrix. However (A 2I) =
2

3 2 1 1 12

1 1 1 1

=0

so exp t(A 2I) = I + t(A 2I) as in Example 70. But the matrices 2tI and t(A 2I) commute (the identity matrix commutes with every matrix) and tA = 2tI + t(A 2I) so exp(tA) = exp(tI) exp t(A 2I) = e2t I + t(A 2I) i.e. exp(tA) = e2t 1 + t t t 1t 33

11

Monday May 4
dx = F(x) dt

73. A solution of a two dimensional system

is a parametric curve in the plane R2 . The collection of all solutions is called the phase portrait of the system. When we draw the phase portrait we only draw a few representative solutions. We put arrows on the solutions to indicate the direction of the parameterization just like we did when we drew the phase line in the frst part of this course. We shall only draw phase portraits for linear systems F(x) = Ax where A is a (constant) 2 2 matrix.

12

Wednesday May 6
dx = A(t)x + f (y) dt

74. Consider the inhomogeneous system (24)

where A(t) is a continuous n n matrix valued function, f (t) is a continuous function with values in Rn and the unknown x also takes values in Rn . We shall call the system dv = A(t)v (25) dt the homogeneous system corresponding to the inhomogenoous system (24). 75. Let v1 , v2 , . . . , vn be n linearly independent solutions to the homogeneous system (25) and form the matrix (t) := v1 (t) v2 (t) vn (t) .

The matrix is called a fundamental matrix for the system (25); this means that the columns are solutions of (25) and they form a basis for Rn for some (and hence3 every) value of t.
3

by Theorem 53

34

Theorem 76. A fundamental matrix satises the matrix dierential equation d = A dt Proof. d dt v1 v2 vn = = dvn dv1 dv2 dt dt dt Av1 Av2 Avn .

= A v1 v2 vn

Theorem 77. If is a fundamental matrix for the system (25) then the function v(t) = (t)(t0 )1 v0 is the solution of the initial value problem dv = A(t)v, dt v(t0 ) = v0 .

Proof. The columns of (t)(t0 )1 are linear combinations of the columns of (t) (see the Remark 79 below) and hence are solutions of the homogeneous system. The initial condition holds because (t0 )(t0 )1 = I Theorem 78. If the matrix A is constant the matrix (t) = exp(tA) is a fundamental matrix for the system dv = Av. dt

Proof. det((0)) = det(0A) = det(I) = 1 = 0. Remark 79. The proof of Theorem 77 asserted that The columns of PC are linear combnations of the columns of P. Because P c1 c2 . . . ck = Pc1 Pc2 . . . Pck

it is enough to see that this is true when C is a single column, i.e. when C is n 1. In this case it is the denition of matrix multiplication. 35

80. Now we show how to solve the inhomogeneous system (24) once we have a fundamental matrix for the corresponding homogeneous system (25). By the Superposition Principle (more precisely the Principle of the Particular Solution) it is enough to nd a particular solution xp of (24) for then the general solution of (24) is the particular solution plus the general solution of (25). The method we use is called variation of parameters. We already saw a one dimensional example of this in the rst part of the course. We use the Ansatz xp (t) = (t)u(t) dxp d du du du = u+ = Au + = Axp + dt dt dt dt dt which solves (24) if du =f dt so we can solve by integration
t t

Then

u(t) = u(0) +
0

u ( ) d = u(0) +
0

( )1 f ( ) d.

(Since we only want one solution not all the solutions we can take u(0) to be anything.) The solution of the initial value problem is x = (t)u(t) + (t) (0)1 x0 u(0) . Example 81. We solve the inhomogeneous system dx = Ax + f dt where A= 1 4 2 7 , f= 1 0 . (26) du = f, dt x(0) = x0

In Example 69 we found the fundamental matrix

Φ(t) = exp(tA) = [ 2e^{3t} − e^{5t},  −2e^{3t} + 2e^{5t} ;  e^{3t} − e^{5t},  −e^{3t} + 2e^{5t} ]

for the corresponding homogeneous system. We take

du/dt = Φ(t)^{-1} f = exp(tA)^{-1} f = exp(−tA) f = [ 2e^{−3t} − e^{−5t} ;  e^{−3t} − e^{−5t} ].

Integrating gives

u = (1/15) [ −10e^{−3t} + 3e^{−5t} ;  −5e^{−3t} + 3e^{−5t} ]

so a particular solution is

xp = exp(tA) u
   = (1/15) [ (2e^{3t} − e^{5t})(−10e^{−3t} + 3e^{−5t}) + (−2e^{3t} + 2e^{5t})(−5e^{−3t} + 3e^{−5t}) ;
              (e^{3t} − e^{5t})(−10e^{−3t} + 3e^{−5t}) + (−e^{3t} + 2e^{5t})(−5e^{−3t} + 3e^{−5t}) ]
   = (1/15) [ (−23 + 6e^{−2t} + 10e^{2t}) + (16 − 6e^{−2t} − 10e^{2t}) ;
              (−13 + 3e^{−2t} + 10e^{2t}) + (11 − 3e^{−2t} − 10e^{2t}) ]
   = (1/15) [ −7 ; −2 ].

Whew! This means that if we haven't made a mistake, the constant functions

x1 = −7/15,   x2 = −2/15

should satisfy

dx1/dt = x1 + 4x2 + 1,   dx2/dt = −2x1 + 7x2.

Let's check:

dx1/dt = 0 = (−7/15) + 4(−2/15) + 1 = x1 + 4x2 + 1,
dx2/dt = 0 = −2(−7/15) + 7(−2/15) = −2x1 + 7x2.


Using equation (26) we see that the solution of the initial value problem

dx1/dt = x1 + 4x2 + 1,   dx2/dt = −2x1 + 7x2,   x1(0) = 17,  x2(0) = 29

is x = exp(tA)u(t) + exp(tA)( x0 − u(0) ), i.e.

x1 = −7/15 + (2e^{3t} − e^{5t})(17 + 7/15) + (−2e^{3t} + 2e^{5t})(29 + 2/15),
x2 = −2/15 + (e^{3t} − e^{5t})(17 + 7/15) + (−e^{3t} + 2e^{5t})(29 + 2/15).
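A quick numerical check of Example 81 in Octave/Matlab; this is only a sketch using the equivalent closed form x(t) = xp + exp(tA)(x0 − xp), with ode45 (available in Matlab and recent Octave) used to confirm it.

A  = [1 4; -2 7];  f = [1; 0];  x0 = [17; 29];
xp = -A\f                                   % constant particular solution, (-7/15; -2/15)

xexact = @(t) xp + expm(t*A)*(x0 - xp);     % solves x' = A x + f, x(0) = x0
[t, xnum] = ode45(@(t,x) A*x + f, [0 1], x0);
max(abs(xnum(end,:)' - xexact(1)))          % small, up to solver tolerance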

Remark 82. This particular problem can be more easily solved by differentiating the equation x′ = Ax + f. Since f is constant we get x″ = Ax′, which is a homogeneous equation.

Final Exam 07:45 A.M. THU. MAY 14


VECTOR SPACE NOTES: CHAPTER 4


DAVID SEAL

4.2. Vector Space Rn and Subspaces. Vector spaces are (in the abstract sense) sets of elements, called vectors, that are endowed with a certain collection of properties. You can add two vectors and multiply vectors by scalars. They satisfy the usual commutative, associative and distributive laws you would expect. These properties are listed on page 236 of your textbook. Many problems we encounter can actually be viewed as vector spaces. In fact, this may come as a surprise to you, but functions are vectors. In fact, solutions to linear differential equations can also be thought of as a linear subspace of a certain class of functions. This is the motivation for studying vectors in the abstract sense.

If we have a vector space V, and we have a subset W ⊆ V, a natural question to ask is whether or not W itself forms a vector space. This means it needs to satisfy all the properties of a vector space from above! The bad news is this is quite a long list; the good news is we don't have to check every property on the list, because most of them are inherited from the original vector space V. In short, in order to see if W is a vector space, we need only check if W passes the following test.

Theorem 1. If V is a vector space and W ⊆ V is a non-empty subset, then W itself is a vector space provided it satisfies the following two conditions:
a. Additive Closure. If a ∈ W and b ∈ W, then a + b ∈ W.
b. Multiplicative Closure. If c ∈ R and a ∈ W, then ca ∈ W.

Note that these are two properties that are on the long laundry list of properties we require for a set to be a vector space.

Example. Consider W := {a = (x, y) ∈ R^2 : x = 2y}. Since W ⊆ R^2, we may be interested in whether W itself forms a vector space. To answer this question we need only check two items:
a. Additive Closure: An arbitrary element in W can be described by (2y, y) where y ∈ R. Let (2y, y), (2z, z) ∈ W. Then (2y, y) + (2z, z) = (2y + 2z, y + z) ∈ W since 2y + 2z = 2(y + z).
b. Multiplicative Closure: We need to check that if c ∈ R and a ∈ W, then ca ∈ W. Again, an arbitrary element in W can be described by (2y, y) where y ∈ R. Let c ∈ R and (2y, y) ∈ W. Then c(2y, y) = (2cy, cy) ∈ W since the first coordinate is exactly twice the second coordinate.

Note: it is possible to write this set as the kernel of a matrix. In fact, you can check that W = ker(A) for the 1 × 2 matrix A = [ 1  −2 ]. We actually have a theorem that says the kernel of any matrix is indeed a linear subspace.
Date: Updated: March 19, 2009.


Example. Consider W := {a = (x, y, z) ∈ R^3 : z ≥ 0}. In order for this to be a linear subspace of R^3, it needs to pass two tests. In fact, this set passes the additive closure test, but it doesn't pass multiplicative closure! For example, (0, 0, 5) ∈ W, but (−1)·(0, 0, 5) = (0, 0, −5) ∉ W.

Definition. If A is an m × n matrix, we define ker(A) := {x ∈ R^n : Ax = 0}. This is also called the nullspace of A. Note that ker(A) lives in R^n.

Definition. If A is an m × n matrix, we define Image(A) := {y ∈ R^m : Ax = y for some x ∈ R^n}. This is also called the range of A. Note that Image(A) lives in R^m.

Theorem 2. If A is an m × n matrix, then ker(A) is a linear subspace of R^n and Image(A) is a linear subspace of R^m.

4.3. Linear Combinations and Independence of Vectors. If we have a collection of vectors {v1, v2, . . . , vk}, we can form many vectors by taking linear combinations of these vectors. We call this space the span of a collection of vectors, and we have the following theorem:

Theorem 3. If {v1, v2, . . . , vk} is a collection of vectors in some vector space V, then

W := span{v1, v2, . . . , vk} := {w : w = c1 v1 + c2 v2 + · · · + ck vk, for some scalars ci}

is a linear subspace of V.

For a concrete example, we can take two vectors v1 = (1, 1, 0) and v2 = (1, 0, 0), which both lie in R^3. Then the set W = span{(1, 1, 0), (1, 0, 0)} describes a plane that lives in R^3. This set is a linear subspace by this previous theorem. In fact, we can be a bit more descriptive and write W = {(x, y, z) ∈ R^3 : z = 0}. If we continue with this example, it is possible to write W in many other ways. In fact, we could have written W = span{(1, 1, 0), (1, 0, 0), (5, 1, 0)} = span{(10, 1, 0), (2, 1, 0)}. These examples illustrate the fact that our choice of vectors need not be unique. What is unique is the least number of vectors that are required to describe the set. In fact this is so important we give it a name, and call it the dimension of a vector space. This is the content of section 4.4. In our example, dim(W) = 2, but right now we don't have enough tools to show this. In order to make this statement precise, we need to introduce a definition.

Definition. Vectors {v1, v2, . . . , vk} are said to be linearly independent if whenever

c1 v1 + c2 v2 + · · · + ck vk = 0

for some scalars ci, it must follow that ci = 0 for each i.

OK, so definitions are all fine and good, but how do we check if vectors are linearly independent? The nice thing about this definition is it always boils down to solving a linear system.

Example. As a concrete example, let's check if the vectors {v1, v2} are linearly independent where v1 = (4, 2, 6, 4) and v2 = (2, 6, 1, 4).


We need to solve the problem c1 v1 + c2 v2 = 0. This reduces to asking what are the solutions to

c1 [ 4 ; 2 ; 6 ; 4 ] + c2 [ 2 ; 6 ; 1 ; 4 ] = [ 0 ; 0 ; 0 ; 0 ].

We can write this problem as a matrix equation Ac = 0 where

A = [ 4 2 ; 2 6 ; 6 1 ; 4 4 ],   c = [ c1 ; c2 ],

and solve this using Gaussian elimination:

[ 4 2 0 ; 2 6 0 ; 6 1 0 ; 4 4 0 ]  →(row ops)→  [ 1 0 0 ; 0 1 0 ; 0 0 0 ; 0 0 0 ].

Thus c1 = c2 = 0 is the only solution to this problem, and so these two vectors are linearly independent. To demonstrate that a collection of vectors is not linearly independent, it suffices to find a non-trivial combination of these vectors and show they sum to 0. For example, see Example 6 in the textbook.
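A quick way to run this test in Octave/Matlab (a sketch; the vectors are the ones from the example above): the vectors are independent exactly when the matrix having them as columns has full column rank.

v1 = [4; 2; 6; 4];  v2 = [2; 6; 1; 4];
A  = [v1 v2];
rank(A)                 % 2, so the columns are independent
rref([A zeros(4,1)])    % reduced echelon form of the augmented system A c = 0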

Math 320 Spring 2009 Part II Linear Algebra


JWR April 24, 2009

Monday February 16
1. The equation ax = b has a unique solution x = b/a if a ≠ 0, no solution if a = 0 and b ≠ 0, and infinitely many solutions (namely any x) if a = b = 0.

2. The graph of the equation ax + by = c is a line (assuming that a and b are not both zero). Two lines intersect in a unique point if they have dierent slopes, do not intersect at all if they have the same slope but are not the same line (i.e. if they are parallel), and intersect in innitely many points if they are the same line. In other words, the linear system a11 x + a12 y = b1 , a21 x + a22 y = b2

has a unique solution if a11 a22 ≠ a12 a21, no solution if a11 a22 = a12 a21 but neither equation is a multiple of the other, and infinitely many solutions if one equation is a multiple of the other.

3. The graph of the equation ax + by + cz = d is a plane (assuming that a, b, c are not all zero). Two planes intersect in a line unless they are parallel or

identical, and a line and a plane intersect in a point unless the line is parallel to the plane or lies in the plane. Hence the linear system

a11 x + a12 y + a13 z = b1
a21 x + a22 y + a23 z = b2
a31 x + a32 y + a33 z = b3

has either a unique solution, no solution, or infinitely many solutions.

4. A linear system of m equations in n unknowns x1, x2, . . . , xn has the form

a11 x1 + a12 x2 + · · · + a1n xn = b1,
a21 x1 + a22 x2 + · · · + a2n xn = b2,
   . . .
am1 x1 + am2 x2 + · · · + amn xn = bm.      (*)

The system is called inconsistent iff it has no solution and consistent iff it is not inconsistent. Some authors call the system underdetermined iff it has infinitely many solutions, i.e. if the equations do not contain enough information to determine a unique solution. The system is called homogeneous iff b1 = b2 = · · · = bm = 0. A homogeneous system is always consistent because x1 = x2 = · · · = xn = 0 is a solution.

5. The following operations leave the set of solutions unchanged as they can be undone by another operation of the same kind.

Swap. Interchange two of the equations.
Scale. Multiply an equation by a nonzero number.
Shear. Add a multiple of one equation to a different equation.

It is easy to see that the elementary row operations do not change the set of solutions of the system (*): each operation can be undone by another operation of the same type. Swapping two equations twice returns to the original system, scaling a row by c and then scaling it again by c^{-1} returns to the original system, and finally adding a multiple of one row to another and then subtracting the multiple returns to the original system.

Wednesday February 18
6. A matrix is an m × n array

A = [ a11 a12 · · · a1n ; a21 a22 · · · a2n ; . . . ; am1 am2 · · · amn ]

of numbers. One says that A has size m × n or shape m × n or that A has m rows and n columns. The augmented matrix

M := [A b] = [ a11 a12 · · · a1n b1 ; a21 a22 · · · a2n b2 ; . . . ; am1 am2 · · · amn bm ]      (**)

represents the system of linear equations (*) in section 4. The elementary row operations described above to transform a system into an equivalent system may be represented as

Swap.   M([p,q],:)=M([q,p],:)       Interchange the pth and qth rows.
Scale.  M(p,:)=c*M(p,:)             Multiply the pth row by c.
Shear.  M(p,:)=M(p,:)+c*M(q,:)      Add c times the qth row to the pth row.

The notations used here are those of the computer language Matlab.1 The equal sign denotes assignment, not equality, i.e. after the command X=Y is executed the old value of X is replaced by the value of Y. For example, if the value of a variable x is 7, the eect of the command x=x+2 is to change the value of x to 9.
An implementation of Matlab called Octave is available free on the internet. (Do a Google search on Octave Matlab.) I believe it was written here at UW. A more primitive version of Matlab called MiniMat (short for Minimal Matlab) which I wrote in 1989 is available on my website. There is a link to it on the Moodle main page for this course. It is adequate for everything in this course. Its advantage is that it is a Java Applet and doesnt need to be downloaded to and installed on your computer. (It does require the Java Plugin for your web browser.)

Denition 7. A matrix is said to be in echelon form i (i) all zero rows (if any) occur at the bottom, and (ii) the leading entry (i.e. the rst nonzero entry) in any row occurs to the right of the leading entry in any row above. It is said to be in reduced echelon form i it is in echelon form and in addition (iii) each leading entry is one, and (iv) any other entry in the same column as a leading entry is zero. When the matrix represents a system of linear equations, the variables corresponding to the leading entries are called leading variables and the other variable are called free variables. Remark 8. In military lingo an echelon formation is a formation of troops, ships, aircraft, or vehicles in parallel rows with the end of each row projecting farther than the one in front. Some books use the term row echelon form for echelon form and reduced row echelon form or Gauss Jordan normal form for reduced echelon form. Denition 9. Two matrices are said to be row equivalent i one can be transformed to the other by elementary row operations. Theorem 10. If the augmented coecient matrices () of two linear systems () are row equivalent, then the two systems have the same solution set. Proof. See Paragraph 5 above. Remark 11. It is not hard to prove that the converse of Theorem 10 is true if the linear systems are consistent, in particular if the linear systems are homogeneous. Any two inconsistent systems have the same solution set (namely the empty set) but need not have row equivalent augmented coecient matrices. For example, the augmented coecient matrices 1 0 0 0 0 1 x1 = 0, 0x2 = 1 are not row equivalent. 4 and 0 0 1 0 0 0 0x1 = 1, 0x2 = 0

corresponding to the two inconsistent systems and

Theorem 12. Every matrix is row equivalent to exactly one reduced echelon matrix.

Proof. The Gauss Jordan Elimination Algorithm described in the text (Edwards & Penney) on page 165 proves "at least one". Figure 1 shows an implementation of this algorithm in the Matlab programming language. "At most one" is tricky. A proof appears in my book (Robbin: Matrix Algebra Using MINImal MATlab) on page 182 (see also page 105).

Remark 13. Theorem 12 says that it doesn't matter which elementary row operations you apply to a matrix to transform it to reduced echelon form; you always get the same reduced echelon form.

14. Once we find an equivalent system whose augmented coefficient matrix is in reduced echelon form it is easy to say what all the solutions to the system are: the free variables can take any values and then the other variables are uniquely determined. If the last nonzero row is [0 0 · · · 0 1] (corresponding to an equation 0x1 + 0x2 + · · · + 0xn = 1) then the system is inconsistent. For example, the system corresponding to the reduced echelon form

[ 0 1 5 0 6 7 ; 0 0 0 1 8 5 ]

is

x2 + 5x3 + 6x5 = 7,
x4 + 8x5 = 5.

The free variables are x1, x3, x5 and the general solution is

x2 = 7 − 5x3 − 6x5,   x4 = 5 − 8x5,

where x1, x3, x5 are arbitrary.
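In Octave/Matlab the same example can be checked with rref (a sketch reproducing the reduced echelon form above):

R = [0 1 5 0 6 7;
     0 0 0 1 8 5];
rref(R)     % already in reduced echelon form, so rref returns R itself
% Setting the free variables x1 = x3 = x5 = 0 gives the particular solution
% x = (0, 7, 0, 5, 0); other choices give x2 = 7 - 5*x3 - 6*x5 and x4 = 5 - 8*x5.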

Friday February 20

Now we define the operations of matrix algebra. This algebra is very useful for (among other things) manipulating linear systems. The crucial point is that all the usual laws of arithmetic hold except for the commutative law.

Figure 1: Reduced Echelon (Gauss Jordan) Form

function [R, lead, free] = gj(A)
[m n] = size(A);
R=A; lead=zeros(1,0); free=zeros(1,0);
r = 0;                                   % rank of first k columns
for k=1:n
  if r==m, free=[free, k:n]; return; end
  [y,h] = max(abs(R(r+1:m, k))); h=r+h;  % (*)
  if (y < 1.0E-9)                        % (i.e. if y == 0)
    free = [free, k];
  else
    lead = [lead, k]; r=r+1;
    R([r h],:) = R([h r],:);             % swap
    R(r,:) = R(r,:)/R(r,k);              % scale
    for i = [1:r-1,r+1:m]                % shear
      R(i,:) = R(i,:) - R(i,k)*R(r,:);
    end
  end % if
end % for

(The effect of the line marked (*) in the program is to test that the column being considered contains a leading entry. The swap means that the subsequent rescaling is by the largest possible entry; this minimizes the relative roundoff error in the calculation.)
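For example (assuming the function above is saved as gj.m on the Matlab/Octave path; the matrix is arbitrary test data) one might call it like this:

A = [1 2 3; 2 4 7];
[R, lead, free] = gj(A)
% R    -> the reduced echelon form of A, here [1 2 0; 0 0 1]
% lead -> indices of the leading (pivot) columns, here [1 3]
% free -> indices of the free columns, here [2]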

Denition 15. Two matrices are equal i they have the same size (SIZE MATTERS!) and corresponding entries are equal, i.e. A = B i A and B are both m n and entryij (A) = entryij (B) for i = 1, 2, . . . , m and j = 1, 2, . . . , n. The text sometimes writes A = [aij ] to indicate that aij = entryij (A). Denition 16. Matrix Addition. Two matrices may be added only if they are the same size; addition is performed elementwise, i.e if A and B are m n matrices then entryij (A + B) := entryij (A) + entryij (B) for i = 1, 2, . . . , m and j = 1, 2, . . . , n. The zero matrix (of whatever size) is the matrix whose entries are all zero and is denoted by 0. Subtraction is dened by A B := A + (B), B := (1)B. Denition 17. Scalar Multiplication. A matrix can be multiplied by a number (scalar); every entry is multiplied by that number, i.e. if A is an m n matrix and c is a number, then entryij (cA) := c entryij (A) for i = 1, 2, . . . , m and j = 1, 2, . . . , n. 18. The operations of matrix addition and scalar multiplication satisfy the following laws: (A + B) + C = A + (B + C). A + B = B + A. A + 0 = A. A + (A) = 0. c(A + B) = cA + cB, (b + c)A = bA + cA. (bc)A = b cA . 1A = A. 0A = 0, c0 = 0 (Additive Associative Law) (Additive Commutative Law) (Additive Identity) (Additive Inverse) (Distributive Laws) (Scalar Asscociative Law) (Scalar Unit) (Multiplication by Zero).

(A + B) + C = A + (B + C)                 (Additive Associative Law)
A + B = B + A                             (Additive Commutative Law)
A + 0 = A                                 (Additive Identity)
A + (−A) = 0                              (Additive Inverse)
c(A + B) = cA + cB, (b + c)A = bA + cA    (Distributive Laws)
(bc)A = b(cA)                             (Scalar Associative Law)
1A = A                                    (Scalar Unit)
0A = 0, c0 = 0                            (Multiplication by Zero)

In terms of lingo we will meet later in the semester these laws say that the set of all m × n matrices forms a vector space.

Definition 19. The product of the matrix A and the matrix B is defined only if the number of columns in A is the same as the number of rows in B, and in that case the product AB is defined by

entryik(AB) = Σ_{j=1}^{n} entryij(A) entryjk(B)

for i = 1, 2, . . . , m and k = 1, 2, . . . , p, where A is m × n and B is n × p. Note that the ith row of A is a 1 × n matrix, the kth column of B is an n × 1 matrix, and entryik(AB) = rowi(A) columnk(B).

20. With the notations

A = [ a11 a12 · · · a1n ; a21 a22 · · · a2n ; . . . ; am1 am2 · · · amn ],   x = [ x1 ; x2 ; . . . ; xn ],   b = [ b1 ; b2 ; . . . ; bm ],

the linear system (*) of section 4 may be succinctly written Ax = b.

21. A square matrix is one with the same number of rows as columns, i.e. of size n × n. A diagonal matrix is a square matrix D of the form

D = [ d1 0 · · · 0 ; 0 d2 · · · 0 ; . . . ; 0 0 · · · dn ],

i.e. all the nonzero entries are on the diagonal. The identity matrix is the square matrix whose diagonal entries are all 1. We denote the identity matrix (of any size) by I. The jth column of the identity matrix is denoted by ej and is called the jth basic unit vector. Thus

I = [ e1  e2  · · ·  en ].

22. The matrix operations satisfy the following laws:

(AB)C = A(BC), (aB)C = a(BC)    (Associative Laws)
C(A + B) = CA + CB              (Left Distributive Law)
(A + B)C = AC + BC              (Right Distributive Law)
IA = A, AI = A                  (Multiplicative Identity)
0A = 0, A0 = 0                  (Multiplication by Zero)

23. The commutative law for matrix multiplication is in general false. Two matrices A and B are said to commute if AB = BA. This can only happen when both A and B are square and when they have the same size, but even then it can be false. For example,

[ 0 1 ; 0 0 ][ 0 0 ; 1 0 ] = [ 1 0 ; 0 0 ]   but   [ 0 0 ; 1 0 ][ 0 1 ; 0 0 ] = [ 0 0 ; 0 1 ].

Monday February 23

Definition 24. An elementary matrix is a matrix which results from the identity matrix by performing a single elementary row operation.

Theorem 25 (Elementary Matrices and Row Operations). Let A be an m × n matrix and E be an m × m elementary matrix. Then the product EA is equal to the matrix which results from applying to A the same elementary row operation as was used to produce E from I.

26. Suppose a matrix A is transformed to a matrix R by elementary row operations, i.e.

R = Ek · · · E2 E1 A

where each Ej is elementary. Thus R = MA where M = Ek · · · E2 E1. Because of the general rule

E [ A  B ] = [ EA  EB ]

(we might say that matrix multiplication distributes over concatenation) we can find M via the formula

M [ A  I ] = [ MA  MI ] = [ R  M ].

The Matlab program shown in Figure 2 implements this algorithm.

Figure 2: Reduced Echelon Form and Multiplier

function [M, R] = gjm(A)
[m,n] = size(A);
RaM = gj([A eye(m)]);
R = RaM(:,1:n);
M = RaM(:,n+1:n+m);
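A quick check of the identity MA = R using this function (a sketch assuming gj.m from Figure 1 and gjm.m above are both on the path; the matrix is arbitrary test data):

A = [1 2; 3 4; 5 6];
[M, R] = gjm(A);
norm(M*A - R)       % should be (numerically) zero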

Definition 27. A matrix B is called a right inverse to the matrix A iff AB = I. A matrix C is called a left inverse to the matrix A iff CA = I. The matrix A is called invertible iff it has a left inverse and a right inverse.

Theorem 28 (Uniqueness of the Inverse). If a matrix has both a left inverse and a right inverse then they are equal. Hence if A is invertible there is exactly one matrix denoted A^{-1} such that AA^{-1} = A^{-1}A = I.

Proof. If CA = I and AB = I then C = CI = C(AB) = (CA)B = IB = B.

Definition 29. The matrix A^{-1} is called the (not an) inverse of A.

Remark 30. The example

[ 1 0 0 ; 0 1 0 ] [ 1 0 ; 0 1 ; 0 0 ] = [ 1 0 ; 0 1 ]

shows that a nonsquare matrix can have a one-sided inverse. Since

[ 1 0 c13 ; 0 1 c23 ] [ 1 0 ; 0 1 ; 0 0 ] = [ 1 0 ; 0 1 ]

we see that left inverses are not unique. Since

[ 1 0 0 ; 0 1 0 ] [ 1 0 ; 0 1 ; b31 b32 ] = [ 1 0 ; 0 1 ]

we see that right inverses are not unique. Theorem 28 says two sided inverses are unique. Below we will prove that an invertible matrix must be square.

Wednesday February 25

31. Here is what usually (but not always) happens when we transform an mn matrix A to a matrix R in reduced echelon form. (The phrase usually but not always means that this is what happens if the matrix is chosen at random using (say) the Matlab command A=rand(m,n).) Case 1: (More columns than rows). If m < n then (usually but not always) matrix R has no zero rows on the bottom and has the form R= I F

where I denotes the m m identity matrix. In this case the homogeoneous system Ax = 0 has nontrivial (i.e. nonzero) solutions, the inhomogeneous system Ax = b is consistent for every b, and both the matrix A and the matrix R have innitely many right inverses. In this case the last n m variables are the free variables. The homogeneous system Ax = 0 takes the form x1 xm+1 x2 xm+2 x 0 , x = . , x = . Rx = I F = x 0 . . . . xm xn and the general solution of the homogeneous system is given by x = Fx . (The free variables x determines the other variables x .) Case 2: (More rows than columns). If n < m then (usually but not always) the matrix R has m n zero rows on the bottom and has the form R= I 0

where I denotes the n n identity matrix. In this case the homogeoneous system Ax = 0 has no nontrivial (i.e. nonzero) solutions, the inhomogeneous system Ax = b is inconsistent for innitely b, and both the matrix A and


the matrix R have innitely many left inverses. The system Ax = b may be written as Rx = MAx = Mb or bn+1 b1 b2 bn+2 I x b x= = Mb = , b = . , b = . . . 0 0 b . . bm bn which is inconsistent unless b = 0. Case 3: (Square matrix). If n = m then (usually but not always) the matrix R is the identity and the matrix A is invertible. In this case the inverse matrix A1 is the multiplier M found by the algorithm in paragraph 26 and gure 4. In this case the homogeneous system Ax = 0 has only the trivial solution x = 0 and the inhomogeneous system Ax = b has the unique solution x = A1 b. Theorem 32 (Invertible Matrices and Elementary Matrices). Elementary matrices are invertible. A matrix is invertible if and only if it is row equivalent to the identity, i.e. if and only if if it is a product of elementary matrices. Proof. See Case(3) of paragraph 31 and paragraph 35 below. Theorem 33 (Algebra of Inverse Matrices). The invertible matrices satisfy the following three properties: 1. The identity matrix I is invertible and I1 = I. 2. The inverse of an invertible matrix A is invertible and A1
1

= A.

3. The product AB of two invertible matrices is invertible, and AB


1

= B1 A1 .
1

=A Proof. That I1 = I follows from IC = CI = I if C = I. That A1 1 1 1 1 follows from CA = AC = I if C = A . To prove AB = B A let 1 1 1 1 1 C = B A . Then C(AB) = B A AB = B IB = B1 B = I and (AB)C = ABB1 A1 = AIA1 = AA1 = I. 12

Remark 34. Note the structure of the last proof. We are proving "If P, then Q" where P is the statement "A and B are invertible" and Q is the statement "AB is invertible". The first step is "Assume that A and B are invertible." The second step is "Let C = B^{-1}A^{-1}." Then there is some calculation. The penultimate step is "Therefore C(AB) = (AB)C = I" and the last step is "Therefore AB is invertible." Each step is either a hypothesis (like the first step) or introduces notation (like the second step) or follows from earlier steps. The last step follows from the penultimate step by the definition of what it means for a matrix to be invertible.

35. The algorithm in paragraph 26 can be used to compute the inverse A^{-1} of an invertible matrix A as follows. We form the n × 2n matrix [ A  I ]. Performing elementary row operations produces a sequence

[ A  I ] = [ A0  B0 ],  [ A1  B1 ],  . . . ,  [ Am  Bm ] = [ I  M ],

where each matrix in the sequence is obtained from the previous one by multiplication by an elementary matrix:

[ A_{k+1}  B_{k+1} ] = E_k [ A_k  B_k ] = [ E_k A_k  E_k B_k ].

Hence there is an invariant relation

A_{k+1}^{-1} B_{k+1} = (E_k A_k)^{-1} (E_k B_k) = A_k^{-1} E_k^{-1} E_k B_k = A_k^{-1} B_k,

i.e. the matrix A_k^{-1} B_k doesn't change during the algorithm. Hence

A^{-1} = A^{-1} I = A_0^{-1} B_0 = A_m^{-1} B_m = I^{-1} M = M.

This proves that the algorithm computes the inverse A^{-1} when A is invertible. See Case 3 of paragraph 31.

36. If n is an integer and A is a square matrix, we define

A^n := A A · · · A  (n factors)   for n ≥ 0,   A^0 = I,   A^{-n} := (A^{-1})^n.

The power laws

A^{m+n} = A^m A^n

follow from these definitions and the associative law.
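The algorithm of paragraph 35 can be carried out in Octave/Matlab with rref (a sketch; the matrix is arbitrary test data):

A  = [2 1; 5 3];
RM = rref([A eye(2)]);     % row reduce [A I]
M  = RM(:, 3:4)            % the multiplier, here equal to inv(A) = [3 -1; -5 2]
norm(M*A - eye(2))         % should be zero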

Friday February 27 and Monday March 2

37. In this section A denotes an n n matrix and aj denotes the jth column of A. We indicate this by writing A= a1 a2 an .

We also use the notation ej for the jth column of the identity matrix I: I= e1 e2 en .

Theorem 38. There is a unique function called the determinant which assigns a number det(A) to each square matrix A and has the following properties. (The text uses the notation |A| where I have written det(A).)

(1) The determinant of the identity matrix is one: det(I) = 1.

(2) The determinant is additive in each column:

det( · · · aj′ + aj″ · · · ) = det( · · · aj′ · · · ) + det( · · · aj″ · · · ).

(3) Rescaling a column multiplies the determinant by the same factor:

det( a1 · · · caj · · · an ) = c det( a1 · · · aj · · · an ).

(4) The determinant is skew symmetric in the columns: interchanging two columns reverses the sign:

det( · · · ai · · · aj · · · ) = −det( · · · aj · · · ai · · · ).

Lemma 39. The following properties of the determinant function follow from properties (2)-(4):

(5) Adding a multiple of one column to a different column leaves the determinant unchanged:

det( · · · ai + caj · · · ) = det( · · · ai · · · )   (i ≠ j).

(6) If a matrix has two identical columns its determinant is zero:

i ≠ j, ai = aj  ⟹  det( · · · ai · · · aj · · · ) = 0.

Proof. Item (6) is easy: interchanging the two columns leaves the matrix unchanged (because the columns are identical) and reverses the sign by item (4). To prove (5):

det( · · · ai + caj · · · ) = det( · · · ai · · · ) + det( · · · caj · · · ) = det( · · · ai · · · ) + c det( · · · aj · · · ) = det( · · · ai · · · )

by (2), (3), and (6) respectively.

Remark 40. The theorem defines the determinant implicitly by saying that there is only one function satisfying the properties (1)-(4). The text gives an inductive definition of the determinant on page 201. Inductive means that the determinant of an n × n matrix is defined in terms of other determinants (called minors) of certain (n − 1) × (n − 1) matrices. Other definitions are given in other textbooks. We won't prove Theorem 38 but will instead show how it gives an algorithm for computing the determinant. (This essentially proves the uniqueness part of Theorem 38.)

Example 41. The determinant of a 2 × 2 matrix is given by

det [ a11 a12 ; a21 a22 ] = a11 a22 − a12 a21.

The determinant of a 3 × 3 matrix is given by

det [ a11 a12 a13 ; a21 a22 a23 ; a31 a32 a33 ] = a11 a22 a33 + a12 a23 a31 + a13 a21 a32 − a12 a21 a33 − a11 a23 a32 − a13 a22 a31.
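One can check these two formulas numerically against the built-in det function in Octave/Matlab (a sketch; the entries are arbitrary test data):

A2 = [1 2; 3 4];
d2 = A2(1,1)*A2(2,2) - A2(1,2)*A2(2,1);
[d2, det(A2)]              % both equal -2

A3 = [2 0 1; 3 5 2; 1 4 6];
d3 = A3(1,1)*A3(2,2)*A3(3,3) + A3(1,2)*A3(2,3)*A3(3,1) + A3(1,3)*A3(2,1)*A3(3,2) ...
   - A3(1,2)*A3(2,1)*A3(3,3) - A3(1,1)*A3(2,3)*A3(3,2) - A3(1,3)*A3(2,2)*A3(3,1);
[d3, det(A3)]              % both equal 51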

The student should check that (with these denitions) the properties of the determinant listed in Theorem 38 hold. Theorem 42 (Elementary Matrices and Column Operations). Let A be an an m n matrix and E be an n n elementary matrix. Then the product 15

AE is equal to the matrix which results from applying to A the same elementary column operation as was used to produce E from I. (The elementary column operations are swapping two columns, rescaling a column by a nonzero factor, and adding a multiple of one column to another.) Proof. This is just like Theorem 25. (The student should write out the proof for 2 2 matrices.) Theorem 43. If E is an elementary matrix, and A is a square matrix of the same size, then the determinant of the product AE is given by (Swap) det(AE) = det(A) if (right multiplication by) E swaps two columns; (Scale) det(AE) = c det(A) if E rescales a column by c; (Shear) det(AE) = det(A) if E adds a multiple of one column to another. Proof. These are properties (2-4) in Theorem 38. Theorem 44. The determinant of a product is the product of the determinants: det(AB) = det(A) det(B). Hence a matrix is invertible if and only if its determinant is nonzero and det(A1 ) = det(A)1 . Proof. An invertible matrix is a product of elementary matrices, so this follows from Theorem 43 if A and B are invertible. Just as a noninvertible square matrix can be transformed to a matrix with a row of zeros by elementary row operations so also a noninvertible square matrix can be transformed to a matrix with a column of zeros by elementary column operations. A matrix with a column of zeros has determinant zero because of part (3) of Theorem 38: multiplying the zero column by 2 leaves the matrix unchanged and mutiplies the determinanat by 2 so the determinant must be zero. Hence the determinant is zero if either A or B (and hence also AB) in not invertible. The formula det(A1 ) = det(A)1 follows as det(A1 ) det(A) = det(A1 A) = det(I) = 1. The fact that a matrix is invertible if and only if its determinant is nonzero follows form the facts that an invertible matrix is a product of elementary matrices (Theorem 32), the determinant of an elementary matrix is not zero (by Theorem 43 with A = I), and the determinant of a matrix with a zero column is zero. 16

Remark 45. The text contains a formula for the inverse of a matrix in terms of determinants (the transposed matrix of cofactors) and a related formula (Cramers Rule) for the solution of the inhomogeneous system Ax = b where A is invertible. We will skip this, except that the student should memorize the formula A1 = 1 det(A) a22 a12 a21 a11

for the inverse of the 2 2 matrix A= a11 a12 a21 a22 .

Corollary 46. If E is an elementary matrix, and A is a square matrix of the same size, then the determinant of the product EA is given by (Swap) det(EA) = det(A) if (left multiplication by) E swaps two rows; (Scale) det(EA) = c det(A) if E rescales a row by c; (Shear) det(EA) = det(A) if E adds a multiple of one row to another. 47. Figure 6 shows a Matlab program which uses this corollary to compute the determinant at the same time as it computes the reduced echelon form. The algorithm can be understood as follows. If E is an elementary matrix then det(EA) = c det(A) where c is the scale factor if multiplication by E rescales a row, c = 1 if multiplication by E swaps two rows, and c = 1 if multiplication by E subtracts a multiple of one row from another. We initialize a variable d to 1 and as we transform A we update d so that the relation d det(A) = k always holds with k constant, i.e. k = det(A). (This is called an invariant relation in computer science lingo.) Thus when we rescale a row by c1 we replace d by dc, when we swap two rows we replace d by d, and when when we subtract one row from another we leave d unchanged. (The matrix A changes but k does not.) At the end we have replaced A by I so d det(I) = k so d = k = det(A).


Figure 3: Computing the Determinant and Row Operations

function d = det(A)
% invariant relation d*det(A) = constant
[m n] = size(A); d=1;
for k=1:n
  [y,h] = max(abs(A(k:m, k))); h=k-1+h;
  if y < 1.0E-9                  % (i.e. if y == 0)
    d=0; return
  else
    if (k~=h)
      A([k h],:) = A([h k],:);   % swap
      d=-d;
    end
    c = A(k,k);
    A(k,:) = A(k,:)/c;           % scale
    d=c*d;
    for i = k+1:m                % shear
      A(i,:) = A(i,:) - A(i,k)*A(k,:);
    end
  end % if
end % for


Monday March 2

Denition 48. The transpose of an m n matrix A is the n m matrix AT dened by entryij (AT ) = entryji (A) for i = 1, . . . , n, j = 1, . . . , m. 49. The following properties of the transpose operation (see page 206 of the text) are easy to prove: (i) (AT )T = A; (ii) (A + B)T = AT + BT ; (iii) (cAT ) = cAT ; (iv) (AB)T = BT AT . For example, to prove (iv) entryij ((AB)T ) = entryji (AB) =
k

entryjk (A)entryki (B) entryik (BT )entrykj (AT )


k

=
k

entryki (B)entryjk (A) =


T T

= entryij (B A ). Note also that the transpose of an elementary matrix is again an elementary matrix. For example. 1 c 0 1
T

1 0 c 1

c 0 0 1

c 0 0 1

0 1 1 0

0 1 1 0

Finally a matrix A is invertible if and only if its transpose AT is invertible (because BA = AB = I = AT BT = BT AT = IT = I) and the inverse of the transpose is the transpose of the inverse: AT
1

= A1

19

Remark 50. The text (see page 235) does not distinguish Rn and Rn1 and sometimes uses parentheses in place of square brackets for typographical reasons. It also uses the transpose notation for the same purpose so x1 x2 T x = (x1 , x2 , . . . , xn ) = x1 x2 xn = . . . . xn Theorem 51. A matrix and its transpose have the same determinant: det(AT ) = det(A). Proof. The theorem is true for elementary matrices and every invertible matrix is a product of elementary matrices. Hence it holds for invertible matrices by Theorem 44. If A is not invertible then det(AT ) = 0 and det(A) = 0 so again det(AT ) = det(A).

Friday March 6

Definition 52. A vector space is a set V whose elements are called vectors and equipped with (i) an element 0 called the zero vector, (ii) a binary operation called vector addition which assigns to each pair (u, v) of vectors another vector u + v, and (iii) an operation called scalar multiplication which assigns to each number c and each vector v another vector cv, such that the following properties hold:

(u + v) + w = u + (v + w)                 (Additive Associative Law)
u + v = v + u                             (Additive Commutative Law)
u + 0 = u                                 (Additive Identity)
u + (−1)u = 0                             (Additive Inverse)
c(u + v) = cu + cv, (b + c)u = bu + cu    (Distributive Laws)
(bc)u = b(cu)                             (Scalar Associative Law)
1u = u                                    (Scalar Unit)
0u = 0, c0 = 0                            (Multiplication by Zero)

Example 53. As noted above in paragraph 18 the set of all m n matrices with the operations dened there is a vector space. This vector space is denoted by Mmn in the text (see page 272); other textbooks denote it by Rmn . (R denotes the set of real numbers.) Note that the text (see page 235 and Remark 50 above) uses Rn as a synonym for Rn1 and has three notations for the elements of Rn : x1 x2 T x = (x1 , x2 , . . . , xn ) = x1 x2 xn = . . . . xn Examples 54. Here are some examples of vector spaces. (i) The set F of all real valued functions of a real variable. (ii) The set P of all polynomials with real coecients. (iii) The set Pn of all polynomials of degree n. (iv) The set of all solutions of the homogeneous linear dierential equation d2 x + x = 0. dt2 (v) The set of all solutions of any homogeneous linear dierential equation. The zero vector in F is the constant function whose value is zero and the operations of addition and scalar multiplication are dened pointwise, i.e. by (f + g)(x) := f (x) + g(x), (cf )(x) = cf (x).

The set P is a subspace of F (a polynomial is a function); in fact, all these vector spaces are subspaces of F. The zero polynomial has zero coecients, adding two polynomials of degree n is the same as adding the coecients: (a0 + a1 x + + an xn ) + (b0 + b1 x + + bn xn ) = = (a0 + b0 ) + (a1 + b1 )x + + (an + bn )xn , and multiplying a polynomial by number c is the same as multiplying each coecient by c: c(a0 + a1 x + an xn ) = ca0 + ca1 x + can xn . 21

Denition 55. A subset W V of a vector space V is called a subspace i it is closed under the vector space operations, i.e. i (i) 0 W , (ii) u, v W = u + w W , and (iii) c R, u W = cu W . Remark 56. The denition of subspace on page 237 of the text appears not to require the condition (i) that 0 W . However that denition does specify that W is non empty; this implies that 0 W as follows. There is an element u W since W is nonempty. Hence (1)u W by (iii) and 0 = u + (1)u W by (ii). Conversely, if 0 W , then the set W is nonempty as it contains the element 0. The student is cautioned not to confuse the vector 0 with the empty set. The latter is usually denoted by . The empty set is characterized by the fact that it has no elements, i.e. the statement x is always false. In particular, 0 . The student should also take care to distinguish the words / subset and subspace. A subspace of a vector space V is a subset of V with certain properties, and not every subset of V is a subspace. 57. A subspace of a vector space is itself a vector space. To decide if a subset W of a vector space V is a subspace you must check that the three properties in Denition 55 hold. Example 58. The set Pn of ploynomials of degree n is a subset of the vector space P (its elements are polynomials) and the set Pn is a subspace of Pm if n m (if n m then a p polynomial of degree n has degree m). These are also subspace because they are closed under the vector space operations.

Monday March 9 Wednesday March 11

Definition 59. Let v1, v2, . . . , vk be vectors in a vector space V. The vector w in V is said to be a linear combination of the vectors v1, v2, . . . , vk iff there exist numbers x1, x2, . . . , xk such that w = x1 v1 + x2 v2 + · · · + xk vk.

The set of all linear combinations of v1 , v2 , . . . , vk is called the span of v1 , v2 , . . . , vk . The vectors v1 , v2 , . . . , vk are said to span V i V is the span of v1 , v2 , . . . , vk , i.e. i every vector in V is a linear combination of v1 , v2 , . . . , vk . Theorem 60. Let v1 , v2 , . . . , vk be vectors in a vector space V . Then the span v1 , v2 , . . . , vk is a subspace of V . Proof. (See Theorem 1 page 243 of the text.) The theorem says that a linear combination of linear combinations is a linear combination. Here are the details of the proof. (i) 0 is in the span since 0 = 0v1 + 0v2 + + 0vk . (ii) If v and w are in the span, there are numbers a1 , . . . , bk such that v = a1 v1 + a2 v2 + + ak vk and w = b1 v1 + b2 v2 + + bk vk so v + w = (a1 + b1 )v1 + (a2 + b2 )v2 + + (ak + bk )vk so v + w is in the span. (iii) If c is a number and v is in the span then there are numbers a1 , . . . , ak such that v = a1 v1 +a2 v2 + +ak vk so cv = ca1 v1 +ca2 v2 + +cak vk so cv is in the span. Thus we have proved that the span satises the three conditions in the denition of subspace so the span is a subspace. Example 61. If A = a1 a2 an is an m n matrix and b Rm , then b is a linear combination of a1 , a2 , . . . , an if and only if the linear system b = Ax is consistent, i.e. has a solution x. This is because of the formula Ax = x1 a2 + x2 a2 + + xn an for x = (x1 , x2 , . . . , xn ). The span of the columns a1 , a2 , , an is called the column space of A. Denition 62. Let v1 , v2 , . . . , vk be vectors in a vector space V . The vectors v1 , v2 , . . . , vk are said to be independent5 i the only solution of the equation x1 v2 + x2 v2 + + xk vk = 0 ()
The more precise term linearly independent is usually used. We will use the shorter term since this is the only kind of independence we will study in this course.
5

23

is the trivial solution x1 = x2 = = xk = 0. The vectors v1 , v2 , . . . , vk are said to be dependent i they are not independent, i.e. i there are numbers x1 , x2 , . . . , xk not all zero which satisfy (). Theorem 63. The vectors v1 , v2 , . . . , vk are dependent if and only if one of them is in the span of the others. Proof. Assume that v1 , v2 , . . . , vk are dependent. Then there are numbers x1 , x2 . . . , xk not all zero such that x1 v1 + x2 v2 + + xk vk = 0. Since the numbers x1 , x2 , . . . , xk are not all zero, one of them, say xi is not zero so xi1 xi+1 xk x1 vi1 vi+1 vk , vi = v1 xi xi xi xi i.e. vi is a linear combination of v1 , . . . .vi1 , vi+1 . . . , vk . Suppose conversely that vi is a linear combination of v1 , . . . .vi1 , vi+1 . . . , vk . Then there are numbers c1 , . . . , ci1 , ci+1 , . . . , ck such that vi = c1 v1 + + ci1 vi1 + ci+1 vi+1 + + ck vk . Then x1 v1 + x2 v2 + + xk vk = 0 where xj = cj for j = i and xi = 1. Since 1 = 0 the numbers x1 , x2 , . . . , xk are not all zero and so the vectors v1 , v2 , . . . , vk are dependent. Remark 64. (A pedantic quibble.) The text says things like the set of vectors v1 , v2 , . . . , vk is independent but it is better to use the word sequence instead of set. The sets {v, v} and {v} are the same (both consist of the single element v) but if v = 0 the sequence whose one and only element is v is independent (since cv = 0 only if c = 0) whereas the two element sequence v, v (same vector repeated) is always dependent since c1 v +c2 v = 0 if c1 = 1 and c2 = 1. Denition 65. A basis for a vector space V is a sequence v1 , v2 , . . . , vn of vectors in V which both spans V and is independent. Theorem 66. If v1 , v2 , . . . , vn is a basis for V and w1 , w2 , . . . , wm is a basis for V , then m = n. This is Theorem 2 on page 251 of the text. We will prove it next time. It justies the following Denition 67. The dimension of a vector space is the number of elements in some (and hence every) basis. Remark 68. It can happen that there are arbitrarily long independent sequences in V . For example, this is the case if V = P, the space of all polynomials: for every n the vectors 1, x, x2 , . . . , xn are independent. In this case we say that V is innite dimensional. 24

10

Friday March 13

Proof of Theorem 66. Let w1 , w2 , . . . , wm and v1 , v2 , . . . , vn be two sequences of vectors in a vector space V . It is enough to prove () If w1 , w2 , . . . , wm span V , and v1 , v2 , . . . , vn are independent, then n m.

To deduce Theorem 66 from this we argue as follows: If both sequences w1 , w2 , . . . , wm and v1 , v2 , . . . , vn are bases then the former spans and the latter is independent so n m. Reversing the roles gives m n. If n m and m n, then m = n. To prove the assertion () is enough to prove the contrapositive: If w1 , w2 , . . . , wm span V and n > m, then v1 , v2 , . . . , vn are dependent. To prove the contrapositive note that because w1 , w2 , . . . , wm span there are (for each j = 1, . . . , n) constants a1j , . . . , amj such that
m

vj =
i=1

aij wi .

This implies that for any numbers x1 , x2 , . . . , xn we have


n n m m n

xj v j =
j=1 j=1

xj
i=1

aij wi

=
i=1 j=1

aij xj

wi .

(#)

Since n > m the homogeneous linear system


n

aij xj = 0,
j=1

i = 1, 2, . . . , m

()

has more unknowns than equations so there is a nontrivial solution x = (x1 , x2 , . . . , xn ). The left hand side of ( ) is the coecient of wi in (#) so ( ) implies that n xj vj = 0, i.e. that v1 , v2 , . . . , vn are dependent. j=1 Denition 69. For an m n matrix A (i) The row space is the span of the rows of A. 25

(ii) The column space is the span of the columns of A. (iii) The null space is the set of all solutions x Rn of the homogeneous system Ax = 0. The dimension of the row space of A is called the rank of A. The text calls the dimension of the row space the row rank and the dimension of the columns space the column rank but Theorem 72 below says that these are equal. Theorem 70 (Equivalent matrices). Suppose that A and B are equivalent m n matrices. Then (i) A and B have the same null space. (ii) A and B have the same row space. Proof. Assume that A and B are equivalent. Then B = MA where M = E1 E2 Ek is a product of elementary matrices. If Ax = 0 then Bx = MAx = M0 = 0. Similarly if Bx = 0 then Ax = M1 Bx = M1 0 = 0. Hence Ax = 0 Bx = 0 which shows that A and B have the same null space. Another way to look at it is that performing an elementary row operation doesnt change the space of solutions of the corresponding homogeneous linear system. This proves (i) Similarly performing an elementary row operation doesnt change the row space. This is because if E is an elementary matrix then each row of EA is either a row of A or is a linear combination of two rows of A so a linear combination of rows of EA is also a linear combination of rows of A (and vice versa since E1 is also an elementary matrix). This proves (ii). Theorem 71. The rank of a matrix A is the number r of non zero rows in the reduced echelon form of A. Proof. By part (ii) of Theorem 70 it is enough to prove this for a matrix which is in reduced echelon form. The non zero rows clearly span the row space (by the denition of the row space) and they are independent since the identity matrix appears as an r r submatrix. Theorem 72. The null space of an m n matrix has dimension n r where r is the rank of the matrix. 26

Proof. The algorithm on page 254 of the text finds a basis for the null space. You put the matrix in reduced echelon form. The number of leading variables is r so there are n − r free variables. A basis consists of the solutions of the system obtained by setting one of the free variables to one and the others to zero.

Spring recess. Mar 14-22 (S-N)
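Theorems 71-72 (rank r from the reduced echelon form, null space of dimension n − r) can be illustrated numerically in Octave/Matlab (a sketch; the matrix is arbitrary test data):

A = [1 2 3 4; 2 4 6 8; 1 1 1 1];   % a 3 x 4 matrix of rank 2
r = rank(A)                        % 2
N = null(A);                       % columns form a basis of the null space
size(N, 2)                         % n - r = 4 - 2 = 2
norm(A*N)                          % should be (numerically) zero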

11

Monday March 23

73. Theorem 70 says that equivalent matrices have the same row space, but they need not have the same column space. The matrices

A = [ 1 0 ; 1 0 ]   and   B = [ 1 0 ; 0 0 ]

are equivalent and the row space of each is the set of multiples of the row 1 0 , but the column spaces are dierent: the column space of A consists of all multiples of the column (1, 1) while the column space of B consists of all multiples of column (1, 0). However Theorem 74. The row rank equals the column rank, i.e. the column space and row space of an m n matrix A have the same dimension. Proof. Theorem 63 says that if a sequence v1 , . . . , vn of vectors is dependent then one of them is a linear combination of the others. This vector can be deleted without changing the span. In particular, if the columns of a matrix are dependent we can delete one of them without changing the column space. This process can be repeated until the vectors that remain are independent. The remaining vectors then form a basis. Thus a basis for the column space of A can be selected from the columns of A. The algorithm in the text on page 259 tells us that these can be the pivot columns of A: these are the columns corresponding to the leading variables in the reduced echelon form. Let ak be the kth column of A and rk be the kth column of the redced echelon form R of A. Then A= a1 a2 an , R= r1 r2 rn ,


and MA = R where M is the invertible matrix which is the product of the elementary matrices used to transform A to its reduced echelon form R. Now matrix multiplication distributes over concatenation: MA = so Mak = rk , and rk = M1 ak for k = 1, 2, . . . , n. After rearranging the columns of R and rearranging the columns of A the same way we may assume that the rst r columns of R are the rst r columns e1 , e2 , . . . , er of the identity matrix and the last n r rows of R are zero. Then each of the last n r columns of R is a linear combination of the rst r columns so (multiplying by M) each of the last n r columns of A is a linear combination of the rst r columns (with the same coecients). Hence the rst columns of A span the column space of A. If some linear combination of the rst r columns of A is zero, then (multiplying by M1 ) the same linear combination of the rst r columns is zero. But the rst r columns of R are the rst r columns of the identity matrix so the coecients must be zero. Hence the rst r columns of A are independent. Example 75. The following matrices were computer generated. 1 3 19 23 1 0 4 5 1 2 14 17 0 1 5 6 , A= R= 2 6 38 46 0 0 0 0 , 2 7 43 52 0 0 0 0 2 1 3 2 1 3 10 29 6 1 2 7 0 2 1 21 , . M= M1 = 5 2 6 19 3 2 3 55 1 1 1 1 2 7 22 64 The matrix R is the reduced echelon form of A and MA = R. The pivot columns are the rst two columns. The third column of R is 4e1 + 5e2 and the third column of A is a3 = 4a1 + 5a2 . The fourth column of R is 5e1 + 6e2 and the fourth column of A is a4 = 5a1 + 6a2 . The rst two columns of A are the same as the rst two columns of M1 . Ma1 Ma2 Man =R= r1 r2 rn ,


11.1

Wednesday March 22

The following material is treated in Section 4.6 of the text. We may not have time to cover it in class so you should learn it on your own.

Definition 76. The inner product of two vectors u = (u1, u2, . . . , un) and v = (v1, v2, . . . , vn) in R^n is denoted ⟨u, v⟩ and defined by

⟨u, v⟩ = u1 v1 + u2 v2 + · · · + un vn.

It was called the dot product in Math 222. It can also be expressed in terms of the transpose operation as ⟨u, v⟩ = u^T v. The length |u| of the vector u is defined as

|u| := √⟨u, u⟩.

Two vectors are called orthogonal iff their inner product is zero.

77. The inner product satisfies the following.

(i) ⟨u, v⟩ = ⟨v, u⟩.
(ii) ⟨u, v + w⟩ = ⟨u, v⟩ + ⟨u, w⟩.
(iii) ⟨cu, v⟩ = c⟨u, v⟩.
(iv) ⟨u, u⟩ ≥ 0, and ⟨u, u⟩ = 0 if and only if u = 0.
(v) |⟨u, v⟩| ≤ |u| |v|.
(vi) |u + v| ≤ |u| + |v|.

The inequality (v) is called the Cauchy-Schwarz Inequality. It justifies defining the angle θ between two nonzero vectors u and v by the formula

⟨u, v⟩ = |u| |v| cos θ.

Thus two vectors are orthogonal iff the angle between them is π/2. The inequality (vi) is called the triangle inequality.

Theorem 78. Suppose that the vectors v1, v2, . . . , vk are nonzero and pairwise orthogonal, i.e. ⟨vi, vj⟩ = 0 for i ≠ j. Then the sequence v1, v2, . . . , vk is independent.

Definition 79. Let V be a subspace of R^n. The orthogonal complement of V is the set V⊥ of vectors which are orthogonal to all the vectors in V; in other words

w ∈ V⊥  if and only if  ⟨v, w⟩ = 0 for all v ∈ V.

Theorem 80. The column space of A^T is the orthogonal complement of the null space of A.

Exam II. Friday Mar 27
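Theorem 80 can be checked numerically: every row of A (i.e. every column of A^T) is orthogonal to every vector in the null space of A. A sketch in Octave/Matlab with an arbitrary test matrix:

A = [1 2 0 -1; 3 1 1 0];
N = null(A);          % basis of the null space of A
A * N                 % inner products of the rows of A with the null space basis: ~ 0
rank([A' N])          % 2 + 2 = 4, so col(A') and null(A) together span all of R^4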


Math 320 Spring 2009 Part I Differential Equations

JWR March 9, 2009

The text is Differential Equations & Linear Algebra (Second Edition) by Edwards & Penney.

Wednesday Jan 21
1. In first year calculus you learned to solve a linear differential equation like

dy/dt = 2y + 3,   y(0) = 5.      (1)

This semester you will learn to solve a system of linear differential equations like:

dx/dt = 3x + y + 7,   dy/dt = x + 5y − 2,   (x(0), y(0)) = (4, 8).      (2)

Note that if you can solve systems of equations like (2) you can also solve higher order equations like

d²y/dt² = 3 dy/dt + y + 7,   y(0) = 4,   dy/dt|_{t=0} = 8.      (3)

You can change (3) into a system:

dy/dt = v,   dv/dt = 3v + y + 7,   y(0) = 4,   v(0) = 8.      (4)

2. An ODE (ordinary differential equation) of order n looks like

F(t, y, y′, y″, . . . , y^(n)) = 0.      (5)

The unknown is a function y = y(t) of the independent variable t, and

y′ := dy/dt,   y″ := d²y/dt²,   . . . ,   y^(n) := dⁿy/dtⁿ.

When the equation looks like

y^(n) = G(t, y, y′, y″, . . . , y^(n−1))      (6)

we say it is in normal form. It may be impossible to rewrite equation (5) as equation (6). A system of differential equations is the same thing with the single unknown y replaced by the vector y := (y1, y2, . . . , ym).

Remark 3. As our first examples will show, the independent variable often has the interpretation of time, which is why the letter t is used. In this case the ODE represents the time evolution of a dynamical system. For example the 2nd order system

m d²r/dt² = −GMm r / r³

describes the motion of a planet of mass m moving about a sun of mass M . The sun is at the origin, r is the posiition vector of the planet, and r = |r| is the length of r, i.e. the distance from the planet to the sun. Sometimes the ODE has a geometric interpretation in which case the letter x is often used for the independent variable. Example 4. Swimmer crossing a river (Text page 15.) Let the banks of a river be the vertical lines x = a in the (x, y) plane and suppose that the river ows up so that the velocity vR of the river at the point (x, y) is vR = v0 1 x2 a2 .

The formula says that vR = 0 on the banks where x = a and vR = v0 in the center of the river where x = 0 (the y-axis). The swimmer swims with

constant velocity vS towards the closest point on the opposite shore. The system

dx/dt = vS,   dy/dt = vR = v0(1 − x²/a²)

is a dynamical system describing the position of the swimmer. Dividing the two equations and using dy/dx = (dy/dt)/(dx/dt) gives the geometric equation

dy/dx = (v0/vS)(1 − x²/a²),

which describes the trajectory of the swimmer.

Example 5. Newton's law of cooling. This says that the rate of change of the temperature T of a body (e.g. a cup of coffee) is proportional to the difference A − T between the ambient temperature (i.e. room temperature) and the temperature of the body. The ODE is

dT/dt = k(A − T).

In a tiny time from t to t + h of duration Δt = (t + h) − t = h the change in the temperature is ΔT = T(t + h) − T(t), so the rate of change is ΔT/Δt. By Newton's law of cooling we have (approximately)

ΔT/Δt ≈ k(A − T).

It doesn't matter much if we use T = T(t) or T = T(t + h) on the right hand side because T is continuous and h is small. By the definition of the derivative

dT/dt = lim_{Δt→0} ΔT/Δt = lim_{h→0} (T(t + h) − T(t))/h,

so we get the exact form of Newton's law of cooling as the limit as h → 0 in the approximate form.
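A minimal numerical sketch of Newton's law of cooling in Octave/Matlab; the constants k, Aamb, T0 are made-up test values, and ode45 (available in Matlab and recent Octave) is compared against the exact solution T(t) = A + (T0 − A)e^{−kt}:

k = 0.3;  Aamb = 20;  T0 = 90;              % made-up cooling rate, room temp, initial temp
Texact = @(t) Aamb + (T0 - Aamb)*exp(-k*t); % solves dT/dt = k*(Aamb - T), T(0) = T0
[t, T] = ode45(@(t,T) k*(Aamb - T), [0 10], T0);
max(abs(T - Texact(t)))                     % small: the two solutions agree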

Friday Jan 23

Theorem 6 (Existence and Uniqueness Theorem). Suppose that f(t, y) is a continuous function of two variables defined in a region R in the (t, y) plane and

that the partial derivative ∂f/∂y exists and is continuous everywhere in R. Let (t0, y0) be a point in R. Then there is a solution y = y(t) to the initial value problem

dy/dt = f(t, y),   y(t0) = y0

dened on some interval I about t0 . The solution is unique in the sense that any two such solutions of the initial value problem are equal where both are dened. Remark 7. The theorem is stated on page 23 of the text and proved in an appendix. The same theorem holds for systems and hence higher order equations. We usually solve an ODE by doing an integration. Then an arbitrary constant C arises and we choose it to satisfy the initial condition y(t0 ) = y0 . The Existence and Uniqueness Theorem tells us that this is the only answer. 8. The rst order ODE dx = f (t, x) dt has an important special case, where the function f (t, x) factors as a product f (t, x) = g(x)h(t) of a functiong(x) of x and a function h(t) of t. Then we can write the ODE dx/dt = g(x)h(t) as dx/g(x) = h(t) dt, integrate to get dx = g(x) h(t) dt,

and solve for x in terms of t. When g(x) is identically one, the equation is dx/dt = h(t) so the answer is x = h(t) dt. When h(t) is is identically one, the system is autonomous, i.e. dx/dt = g(x). In this case can nd out a lot about the solutions from the phase diagram. 1 Example 9. Braking a car. A car going at speed v0 skids to a stop at a constant deceleration k in time T leaving skid marks of length L. We nd each of the four quantities in terms of the other three. Let the brakes be applied at time t = 0, so the car stops at time t = T , and let v = v(t) denote
1

Well study this later in Section 2.2 of the text. See Figure 2.2.7 on page 93.

the velocity at time t, and x = x(t) denote the distance travelled over the time interval [0, t]. Then the statement of the problem translates into the equations

dv/dt = −k,   v = dx/dt,   v(0) = v0,   v(T) = 0,   x(0) = 0,   x(T) = L.

Integrating the first differential equation gives

v = ∫ (dv/dt) dt = −∫ k dt = −kt + C,

so C = v(0) = v0 and v(t) = v0 − kt, so 0 = v(T) = v0 − kT, so v0 = kT, k = v0/T, and T = v0/k. Integrating the second differential equation gives

L = x(T) − x(0) = ∫₀ᵀ (dx/dt) dt = ∫₀ᵀ v(t) dt = ∫₀ᵀ (v0 − kt) dt = v0 T − (1/2) k T².

From T = v0/k we get L = v0²/k − (1/2) v0²/k = (1/2) v0²/k. (See problems 30-32 on page 17 of the text.)
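For example (made-up numbers), if a car travelling at v0 = 88 ft/sec leaves skid marks of length L = 176 ft, the formula L = v0²/(2k) gives the deceleration and stopping time; a one-line Octave/Matlab computation:

v0 = 88;  L = 176;        % ft/sec and ft (made-up data)
k  = v0^2 / (2*L)         % deceleration: 22 ft/sec^2
T  = v0 / k               % stopping time: 4 sec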

Remark 10. Mathematically this is the same problem as the problem of a falling body on earth: If y is the height of the body, v = dy/dt is its speed, a = dv/dt = d²y/dt² is the acceleration, then Newton's second law is F = ma = −mg, where g = 32 ft/sec² = 9.8 m/sec² is the acceleration due to gravity, so

v = dy/dt = −gt + v0,   y = −gt²/2 + v0 t + y0.

Example 11. Population equation (exponential growth and decay). The differential equation

dP/dt = kP

says that the rate of growth (or decay if k < 0) of a quantity P is proportional to its size. We solve by separation of variables: dP/P = k dt, so

ln P = ∫ dP/P = ∫ k dt = kt + C = kt + ln P0,

so P = P0 e^{kt}.

12. A single linear homogeneous equation. The more general equation

dy/dt = R(t)y

is solved the same way: dy/y = R(t) dt, so

ln y = ∫ dy/y = ∫ R(t) dt,

and exponentiating this equation gives

y = e^{∫ R(t) dt}.

Note that the additive constant in ∫ R(t) dt becomes a multiplicative constant after exponentiating. For example, integrating the equation

dy/dt = ty

gives

ln y = ∫ dy/y = ∫ t dt = t²/2 + C,

so exponentiating gives

y = exp(t²/2 + C) = exp(t²/2) exp(C) = y0 exp(t²/2)

where y0 = e^C. (For typographical reasons the exponential function is often denoted as exp(x) := e^x.)

Example 13. Consider the function f(y) = |y|^p. On the region where y ≠ 0 the derivative f′(y) is continuous, so the Existence and Uniqueness Theorem applies to solutions which stay in this region. We solve dy/dt = |y|^p by separation of variables: where y > 0,

y^{1−p}/(1 − p) = ∫ dy/y^p = t − c,

so (as long as t − c > 0) y = [(1 − p)(t − c)]^{1/(1−p)}.

When y < 0 we have |y| = −y and y = −[(1 − p)(c − t)]^{1/(1−p)}. Funny things happen when y = 0. If p > 1 the derivative f′(y) is continuous, so by the Existence and Uniqueness Theorem the only solution with y(t0) = 0 is y ≡ 0. (This is reflected in the fact that the above formula for y becomes infinite when t = c.) If 0 < p < 1, however, a solution can remain at zero for a finite amount of time and follow one of the above solutions to the left and right. For example, for p = 1/2 and any choice of c1 < 0 and c2 > 0 the function

y(t) = −(1/4)(c1 − t)²   for t < c1,
y(t) = 0                 for c1 ≤ t ≤ c2,
y(t) = (1/4)(t − c2)²    for c2 < t

solves the ODE and the initial condition y(0) = 0, so the solution is not unique. This is essentially the example of Remark 2 on page 23 of the text.

Monday Jan 26

14. Slope fields and phase diagrams. To draw the slope field of an ODE dy/dx = f(x, y), draw a little line segment of slope f(x, y) at many points (x, y) in the (x, y)-plane. The curves tangent to these little line segments are the graphs of the solution curves. This is a lot of work unless you have a computer, and it is often not very helpful. In the case of an autonomous ODE dy/dt = f(y) the phase diagram (see e.g. Figures 2.2.8 and 2.2.9 on page 94 of the text) is more helpful. This is a line representing the y-axis with the zeros of f indicated and the intervals in between the zeros marked with an arrow indicating the sign of f(y) for y in that interval.

Example 15. Swimmer crossing river. Recall from last Wednesday the dynamical system

dx/dt = v_S,    dy/dt = v_R = v_0(1 - x^2/a^2)

describing the position of the swimmer. Dividing the two equations and using dy/dx = (dy/dt)/(dx/dt) gives the geometric equation

dy/dx = k(1 - x^2/a^2),    k := v_0/v_S,

which describes the trajectory of the swimmer. Take k = 1, a = 1. The solution curves are

y = x - x^3/3 + C.

They are vertical translates of one another. The solution with C = 0 starts at the point (x, y) = (-1, -2/3) and ends at (x, y) = (1, 2/3).

Example 16. The slope field for dy/dx = x - y. The slope is horizontal on the line y = x, negative to the left and positive to the right. The picture in the text (page 20) suggests that the solutions are asymptotic as x → ∞. We'll check this in the next lecture.

17. The phase diagram for dy/dt = (y - a)(y - b). Assume that a < b, so dy/dt > 0 for y < a and for b < y, while dy/dt < 0 for a < y < b. The phase diagram is
    --->---  • a  ---<---  • b  --->---

(the arrows indicate the sign of dy/dt on each interval).
From the diagram we can see that

lim_{t→∞} y(t) = a       if y(0) < a,
y(t) = a for all t        if y(0) = a,
lim_{t→∞} y(t) = a       if a < y(0) < b,
y(t) = b for all t        if y(0) = b,
lim_{t→T_1-} y(t) = +∞   if b < y(0),
lim_{t→T_2+} y(t) = -∞   if y(0) < a,

so in particular lim_{t→∞} y(t) = a whenever y(0) < b. The diagram does not tell us whether T_1 and T_2 are finite. For this we will solve the equation by separation of variables and partial fractions:

1/((y - a)(y - b)) = (1/(b - a)) (1/(y - b) - 1/(y - a)),

so

∫ dy/((y - a)(y - b)) = ∫ dt

gives

ln(|y - b|/|y - a|) = (b - a)t + c,

i.e.

|y - b|/|y - a| = Ce^{(b-a)t},    C := e^c.

What to do about the absolute values? Well certainly

(y - b)/(y - a) = C'e^{(b-a)t}

with C' = ±C; since y = y_0 when t = 0 and the exponential is positive we must have

C' = (y_0 - b)/(y_0 - a),    y_0 := y(0).

Now we can solve for y. We introduce the abbreviation u := C'e^{(b-a)t} to save writing:

(y - b)/(y - a) = u  ⟹  y - b = (y - a)u  ⟹  y(1 - u) = b - au  ⟹  y = (b - au)/(1 - u).

Now plug back in the values of u and C' and multiply top and bottom of the resulting fraction by y_0 - a to simplify:

y = [(y_0 - a)b - a(y_0 - b)e^{(b-a)t}] / [(y_0 - a) - (y_0 - b)e^{(b-a)t}].

As a check we plug in t = 0. We get

y = [(y_0 - a)b - a(y_0 - b)] / [(y_0 - a) - (y_0 - b)] = (b - a)y_0/(b - a) = y_0,

as expected. Now

lim_{t→∞} y(t) = -a(y_0 - b)/(-(y_0 - b)) = a,    lim_{t→-∞} y(t) = (y_0 - a)b/(y_0 - a) = b,

but if y_0 < a there is a negative value of t (namely t = T_2 above) where the denominator vanishes, and similarly if y_0 > b there is a positive value of t (namely t = T_1 above) where the denominator vanishes.
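A quick numerical check (not part of the notes) of the explicit formula, using the arbitrary values a = 1, b = 2, y_0 = 1.5: the formula should match a numerical solution of dy/dt = (y-1)(y-2), and for y_0 = 3 the denominator should vanish at a finite positive time T_1.

    # Compare the explicit solution of dy/dt = (y-a)(y-b) with a numerical one.
    import numpy as np
    from scipy.integrate import solve_ivp

    a, b = 1.0, 2.0   # arbitrary values with a < b

    def explicit(t, y0):
        e = np.exp((b - a) * t)
        return ((y0 - a) * b - a * (y0 - b) * e) / ((y0 - a) - (y0 - b) * e)

    y0 = 1.5
    sol = solve_ivp(lambda t, y: (y - a) * (y - b), (0, 10), [y0],
                    t_eval=[1.0, 5.0, 10.0], rtol=1e-10, atol=1e-12)
    print(sol.y[0])                                   # numerical values
    print(explicit(np.array([1.0, 5.0, 10.0]), y0))   # same values, tending to a = 1

    # For y0 = 3 > b the denominator (y0-a) - (y0-b)*exp((b-a)t) vanishes when
    # exp(t) = (y0-a)/(y0-b) = 2, i.e. at T1 = ln 2, so the solution blows up there.
    print(np.log((3.0 - a) / (3.0 - b)))   # about 0.693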

Wednesday Jan 28

18. Three ways to solve dy/dt + 2y = 3. A linear first order ODE is one of the form

dy/dt + P(t)y = Q(t).                                   (1)

If P and Q are constants we can solve by separation of variables. For example, to solve dy/dt + 2y = 3 we write

ln(2y - 3) = ∫ 2 dy/(2y - 3) = ∫ (-2) dt = -2t + c,

so 2y - 3 = e^{-2t}C (where C = e^c) and hence y = (3 + e^{-2t}C)/2. This doesn't work if either P or Q is not a constant.

In the method of integrating factors we multiply the ODE (1) by a function μ(t) to get

μ(t) dy/dt + μ(t)P(t)y = μ(t)Q(t)

and then choose μ so that

dμ/dt = μ(t)P(t).                                       (2)

The ODE (1) then takes the form

d(μy)/dt = μQ,                                          (3)

which can be solved by integration.

In the method of variation of parameters we look for a solution of the form y = φ(t)u(t), so the ODE (1) takes the form

φ du/dt + (dφ/dt + Pφ)u = Q.

Then once we solve

dφ/dt + Pφ = 0                                          (4)

the ODE (1) simplifies to

du/dt = φ^{-1}Q.                                        (5)

In either method we first reduce to a homogeneous linear ODE (either (2) or (4)) and then do an integration problem (either (3) or (5)).
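The three methods all give y = 3/2 + Ce^{-2t}; a short check (not in the notes), with the arbitrary initial value y(0) = 0:

    # Check y(t) = 3/2 + C*exp(-2t) against a numerical solution of dy/dt + 2y = 3.
    import numpy as np
    from scipy.integrate import solve_ivp

    y0 = 0.0                      # arbitrary initial value
    C = y0 - 1.5                  # from y(0) = 3/2 + C
    ts = np.array([0.5, 1.0, 3.0])
    sol = solve_ivp(lambda t, y: 3 - 2 * y, (0, 3), [y0], t_eval=ts, rtol=1e-10, atol=1e-12)
    print(sol.y[0])
    print(1.5 + C * np.exp(-2 * ts))   # agrees, and both tend to the equilibrium 3/2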

Remark 19. Because equation (4) is the homogeneous linear ODE corresponding to the inhomogeneous linear ODE (1), the general solution of (4) is of the form φ(t)C where C is an arbitrary constant. Having solved this problem by separating variables, we solve (1) by trying to find a solution where the constant C is replaced by a variable u. For this reason the method of variation of parameters is also called the method of variation of constants. The text uses the method of integrating factors for a single ODE in section 1.5 page 50 and the method of variation of constants for systems in section 8.2 page 493.

Example 20. To solve dy/dx = x - y rewrite it as dy/dx + y = x. Multiply by μ(x) = e^x to get

e^x dy/dx + ye^x = xe^x.

Then

d(ye^x)/dx = e^x dy/dx + ye^x = xe^x,

so, integrating by parts,

ye^x = ∫ xe^x dx = xe^x - e^x + C,

so y = x - 1 + Ce^{-x}. Note that the general solution is asymptotic to the particular (C = 0) solution y = x - 1.

21. The Superposition Principle. Important! If

dy_1/dt + P(t)y_1 = Q_1(t)    and    dy_2/dt + P(t)y_2 = Q_2(t),

and if y = y_1 + y_2 and Q = Q_1 + Q_2, then

dy/dt + P(t)y = Q(t).

In particular (take Q_2 = 0 and Q = Q_1) this shows that the general solution of an inhomogeneous linear equation dy/dt + P(t)y = Q(t) is the general solution of the corresponding homogeneous equation du/dt + P(t)u = 0 plus a particular solution of the inhomogeneous linear equation.

Example 22. When we discussed the slope field of dy/dx = x - y (text figure 1.3.6 page 20) we observed that it looks like all the solutions are asymptotic. Indeed, if dy_1/dx = x - y_1 and dy_2/dx = x - y_2, then

d(y_1 - y_2)/dx = -(y_1 - y_2),

so y_1 - y_2 = Ce^{-x}, so lim_{x→∞}(y_1 - y_2) = 0. This proves that all the solutions are asymptotic without solving the equation. The argument works more generally if x is replaced by Q(x), i.e. for the equation dy/dx = Q(x) - y.
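A numerical illustration (not from the notes): solve dy/dx = x - y from two different initial values and watch the difference decay like e^{-x} toward the particular solution y = x - 1; the initial values 0 and 5 are arbitrary.

    # Solutions of dy/dx = x - y with different initial values approach y = x - 1.
    import numpy as np
    from scipy.integrate import solve_ivp

    xs = np.array([2.0, 5.0, 10.0])
    f = lambda x, y: x - y
    y_a = solve_ivp(f, (0, 10), [0.0], t_eval=xs, rtol=1e-10, atol=1e-12).y[0]
    y_b = solve_ivp(f, (0, 10), [5.0], t_eval=xs, rtol=1e-10, atol=1e-12).y[0]
    print(y_a - (xs - 1))          # tends to 0
    print(y_b - y_a)               # equals (5 - 0)*exp(-x), tending to 0
    print(5 * np.exp(-xs))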

Friday January 30

23. Mixture problems. Let x denote the amount of solute in a volume of size V and c denote its concentration. Then c = x/V. In a mixture problem, any of these may vary in time. Thus if a fluid with concentration c_in (units = mass/volume) flows into a tank at a rate of r_in (units = volume/time), the amount of solute added in time dt is c_in r_in dt. Similarly, if a fluid with concentration c_out (units = mass/volume) flows out of the tank at a rate of r_out (units = volume/time), the amount of solute removed in time dt is c_out r_out dt. (The book uses the subscript i as an abbreviation for in and the subscript o as an abbreviation for out.) Hence the differential equation

dx/dt = c_in r_in - c_out r_out.

In such problems one generally assumes that c_in, r_in, and r_out are constant but x, c_out, and possibly also the volume V of the tank vary.

Example 24. A tank contains V liters of pure water. A solution that contains c_in kg of sugar per liter enters the tank at the rate r_in liters/min. The solution is mixed and drains from the tank at the same rate. (a) How much sugar is in the tank initially? (b) Find the amount of sugar x in the tank after t minutes. (c) Find the concentration of sugar in the solution in the tank after 78 minutes.

In this problem r_in = r_out, so the volume V of the tank is constant. In a time interval dt, c_in r_in dt kg of sugar enters the tank and (x(t)/V) r_out dt kg of sugar leaves the tank, so we have an inhomogeneous linear ODE

dx/dt = c_in r_in - (x/V) r_out

with initial value x(0) = 0. To save writing we abbreviate c := c_in, r := r_in = r_out, so the ODE is

dx/dt = cr - (x/V) r.

Solve by separation of variables:

-V ln(Vc - x) = ∫ V dx/(Vc - x) = ∫ r dt = rt + K.

Since the tank initially holds pure water we have x = 0 when t = 0, hence K = -V ln(Vc), so K/V = -ln(Vc). Solving for x gives

ln(Vc - x) = -rt/V + ln(Vc)   ⟹   x = Vc(1 - exp(-rt/V)).

Remark 25. When x is small, the term x/V is even smaller, so the equation is roughly dx/dt = c_in r_in and the answer for small values of t is roughly x = (c_in r_in)t. For small values of t the amount of sugar x is also small and the approximation x = (c_in r_in)t is very accurate, so accurate that it may fool WeBWorK, but it is obviously wrong for large values of t. The reason is that lim_{t→∞}(c_in r_in)t = ∞ whereas lim_{t→∞} x = c_in V, so that the limiting concentration of the sugar in the tank is the same as the concentration of the solution flowing in.

Remark 26. One student was assigned this problem in WeBWorK with values of V = 2780, c = 0.06 and r = 3 and complained to me that WeBWorK rejected the answer. I typed

2780*0.06[1-exp(-3t/2780)]

and WeBWorK accepted the answer. The student had typed the value

(-2780/3)(exp((-3(t+1589))/2780)-.18)

and WeBWorK rejected that answer. The two answers would agree if

exp(-3*1589/2780) = 0.18, but this isn't exactly true. I typed exp(-3*1589/2780) into the answer box for part 1 of the question to see what WeBWorK thinks is the value, and WeBWorK said the value is 0.180009041024602. (The answer to part 1 is 0, but when I hit the Preview button WeBWorK did the computation.) I replaced 0.18 by this value in the student's answer and WeBWorK accepted it.
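The closed form x(t) = Vc(1 - e^{-rt/V}) is easy to check numerically; below is a small sketch (not in the notes) using the WeBWorK values V = 2780, c = 0.06, r = 3 quoted above.

    # Check x(t) = V*c*(1 - exp(-r*t/V)) against a numerical solution of dx/dt = c*r - r*x/V.
    import numpy as np
    from scipy.integrate import solve_ivp

    V, c, r = 2780.0, 0.06, 3.0          # values from Remark 26
    ts = np.array([10.0, 100.0, 1000.0])
    sol = solve_ivp(lambda t, x: c * r - r * x / V, (0, 1000), [0.0],
                    t_eval=ts, rtol=1e-10, atol=1e-12)
    print(sol.y[0])
    print(V * c * (1 - np.exp(-r * ts / V)))   # agrees; both approach V*c = 166.8
    print(np.exp(-3 * 1589 / 2780))            # about 0.180009, as WeBWorK reported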

Monday February 2

Here are some tricks for solving special equations. The real trick is to find a trick for remembering the trick.

27. Linear substitutions. To solve

dy/dx = (ax + by + c)^p

try v = ax + by + c, so

dv/dx = a + b dy/dx = a + bv^p.

28. Homogeneous equations. A linear equation is called homogeneous if a scalar multiple of a solution is again a solution. A function h(x, y) is called homogeneous of degree n if h(λx, λy) = λ^n h(x, y). In particular, f is homogeneous of degree 0 iff f(λx, λy) = f(x, y). Then

f(x, y) = F(y/x),    F(u) := f(1, u).

To solve

dy/dx = F(y/x)

try the substitution v = y/x.
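As a concrete illustration (not from the notes), take the hypothetical example dy/dx = y/x + 1, which has the form F(y/x) with F(u) = u + 1. The substitution v = y/x gives x dv/dx = 1, so v = ln|x| + C and y = x ln|x| + Cx. A quick numerical check:

    # Verify that y = x*ln(x) + C*x solves dy/dx = y/x + 1 (an equation of the form F(y/x)).
    import numpy as np

    C = 2.0                                    # arbitrary constant
    y = lambda x: x * np.log(x) + C * x
    for x in [0.5, 1.0, 3.0]:
        h = 1e-6
        dydx = (y(x + h) - y(x - h)) / (2 * h)   # centered difference
        print(dydx, y(x) / x + 1)                # the two columns agree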

29. Bernoulli equations. This is an equation of the form

dy/dx + P(x)y = Q(x)y^n.

Try y = v^p and solve for a value of p which makes the equation simpler.

30. Exact equations. The equation M(x, y) dx + N(x, y) dy = 0 is exact if there is a function F(x, y) such that

∂F/∂x = M,    ∂F/∂y = N.                                 (*)

If such an F exists then

∂M/∂y = ∂N/∂x                                            (**)

because

∂^2F/∂y∂x = ∂^2F/∂x∂y.

In Math 234 you learn that the converse is true (if M and N are defined for all (x, y)). Exactness implies that the solutions to the ODE M(x, y) + N(x, y) dy/dx = 0 are the curves F(x, y) = c for various values of c. To find F(x, y) satisfying (*), choose (x_0, y_0) and integrate from (x_0, y_0) along any path joining (x_0, y_0) to (x, y). Condition (**) guarantees that the integral is independent of the choice of the path.

Example 31. Write the ODE

3x^2 y^5 + 5x^3 y^4 dy/dx = 0

as

3x^2 y^5 dx + 5x^3 y^4 dy = 0.

The exactness condition (**) holds as

∂(3x^2 y^5)/∂y = 15x^2 y^4 = ∂(5x^3 y^4)/∂x.

Let (x_0, y_0) = (0, 0) and compute F(x, y) by integrating first along the y-axis (where dx = 0) from (0, 0) to (0, y) and then along the horizontal line from (0, y) to (x, y) (where dy = 0). We get

F(x, y) = ∫_{t=0}^{t=y} N(0, t) dt + ∫_{t=0}^{t=x} M(t, y) dt
        = ∫_{t=0}^{t=y} 5(0^3)t^4 dt + ∫_{t=0}^{t=x} 3t^2 y^5 dt
        = 0 + x^3 y^5 = x^3 y^5,

so the solutions of the ODE are the curves x^3 y^5 = C. Because the exactness condition holds, it doesn't matter which path we use to compute F(x, y) so long as it goes from (0, 0) to (x, y). For example, integrating first along the x-axis (where dy = 0) from (0, 0) to (x, 0) and then along the vertical line from (x, 0) to (x, y) (where dx = 0) gives

F(x, y) = ∫_{t=0}^{t=x} M(t, 0) dt + ∫_{t=0}^{t=y} N(x, t) dt
        = ∫_{t=0}^{t=x} 3t^2 (0^5) dt + ∫_{t=0}^{t=y} 5x^3 t^4 dt
        = 0 + x^3 y^5 = x^3 y^5.

Along the diagonal line from (0, 0) to (x, y) we have dx = x dt and dy = y dt with t running from 0 to 1, so

F(x, y) = ∫_{t=0}^{t=1} M(tx, ty) x dt + ∫_{t=0}^{t=1} N(tx, ty) y dt
        = ∫_{t=0}^{t=1} 3t^7 x^2 y^5 x dt + ∫_{t=0}^{t=1} 5t^7 x^3 y^4 y dt
        = (3/8) x^3 y^5 + (5/8) x^3 y^5 = x^3 y^5.
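A short symbolic check (not from the notes) that F(x, y) = x^3 y^5 has the required partial derivatives and that the exactness condition holds:

    # Symbolic check of Example 31 with sympy.
    import sympy as sp

    x, y = sp.symbols('x y')
    M = 3 * x**2 * y**5
    N = 5 * x**3 * y**4
    F = x**3 * y**5

    print(sp.simplify(sp.diff(F, x) - M))               # 0, so F_x = M
    print(sp.simplify(sp.diff(F, y) - N))               # 0, so F_y = N
    print(sp.simplify(sp.diff(M, y) - sp.diff(N, x)))   # 0, the exactness condition (**)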

Wednesday February 4

32. Reducible second order equations. A second order ODE where either the unknown x or its derivative dx/dt is missing can be reduced to a first order equation. If x is missing the equation is already first order in dx/dt. The case where both t and dx/dt are missing is like a conservative force field in physics, i.e. a force field which is the negative gradient of a potential energy function U, so Newton's second law takes the form

m d^2x/dt^2 = -∇U.

In this case the energy

E := m|v|^2/2 + U,    v := dx/dt,

is conserved (constant along solutions). When the number of dimensions is one (but not in higher dimensions) every force field is a gradient and we can use this fact to reduce the order. To solve

d^2x/dt^2 = f(x)

take U = -∫ f(x) dx and v = dx/dt, so the equation becomes

(1/2)(dx/dt)^2 + U(x) = E,
which can be solved by separation of variables.

Example 33. Consider the equation

m d^2x/dt^2 = -kx.

Define the velocity v and the total energy E by

v := dx/dt,    E := mv^2/2 + kx^2/2.

(The total energy is the sum of the kinetic energy mv^2/2 and the potential energy U(x) := kx^2/2.) Now

dE/dt = mv dv/dt + kx dx/dt = (m dv/dt + kx) v = 0,

so the total energy E is constant along solutions. Then

dx/dt = v = ±√((2E - kx^2)/m).

We solve the initial value problem v(0) = 0 and x(0) = x_0. Then 2E = kx_0^2, so

dx/dt = ±ω √(x_0^2 - x^2),    ω := √(k/m),

so dx/√(x_0^2 - x^2) = ±ω dt, so

cos^{-1}(x/x_0) = -∫ dx/√(x_0^2 - x^2) = ∓∫ ω dt = ∓ωt + C.

When t = 0 we have x = x_0, so x/x_0 = 1, so (since cos(0) = 1) C = 0, and hence x/x_0 = cos(∓ωt) = cos(ωt), so

x = x_0 cos(ωt).

Remark 34. On page 70 of the text, the problem is treated a little differently. The unknown x is viewed as the independent variable and the substitution

v = dx/dt,    d^2x/dt^2 = dv/dt = (dv/dx)(dx/dt) = v dv/dx

is used to transform the equation

m d^2x/dt^2 + kx = 0

into the equation

m v dv/dx + kx = 0.

Solving this by separation of variables gives

m ∫ v dv + k ∫ x dx = constant,

which is the conservation law (1/2)mv^2 + (1/2)kx^2 = E from before. The book uses the letters x, y, p where I have used t, x, v. (I deviated from the book's notation to emphasize the connection with physics.)
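A numerical check (not part of the notes) that x = x_0 cos(ωt) solves m x'' = -kx with x(0) = x_0, x'(0) = 0, for the arbitrary values m = 2, k = 8, x_0 = 1 (so ω = 2):

    # Check x(t) = x0*cos(w*t) against a numerical solution of m*x'' = -k*x.
    import numpy as np
    from scipy.integrate import solve_ivp

    m, k, x0 = 2.0, 8.0, 1.0
    w = np.sqrt(k / m)                       # omega = 2 here
    ts = np.linspace(0, 5, 6)
    sol = solve_ivp(lambda t, z: [z[1], -k * z[0] / m], (0, 5), [x0, 0.0],
                    t_eval=ts, rtol=1e-10, atol=1e-12)
    print(sol.y[0])
    print(x0 * np.cos(w * ts))               # agrees with the numerical x(t)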

Monday February 9

35. Peak Oil. In 1957 a geologist named M. K. Hubbert plotted the annual percentage rate of increase in US oil production against total cumulative US oil production and discovered that the data points fell (more or less) on a straight line. Specifically,

(dQ/dt)/(aQ) + Q/b = 1,

where Q = Q(t) is the total amount of oil (in billions of barrels) produced by year t, a = 0.055, b = 220, with the initial condition Q(1958) = 60. (I got these figures from page 155, see also page 201, of the very entertaining book: Kenneth S. Deffeyes, Hubbert's Peak, Princeton University Press, 2001. I estimated the initial condition from the graph, so it may not be exactly right.) The ODE for Q can be written as

dQ/dt = aQ - kQ^2,    k = a/b.

This equation is called the Logistic Equation. (We solved a similar equation dy/dt = (y - a)(y - b) above.) By solving this equation Hubbert predicted that annual US oil production would peak (i.e. dQ/dt would begin to decrease) in the year 1975. The peak actually occurred in 1970, but this went unnoticed because by this time the US had begun to import much of its oil. A similar calculation for world oil production produced a prediction of a peak in the year 2005.
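The prediction can be reproduced approximately from the logistic equation itself; the sketch below (not from the notes) solves dQ/dt = aQ(1 - Q/b) numerically with the figures quoted above and finds the year at which dQ/dt is largest, which is the year when Q reaches b/2.

    # Reproduce Hubbert's peak-year estimate from dQ/dt = a*Q*(1 - Q/b), Q(1958) = 60.
    import numpy as np
    from scipy.integrate import solve_ivp

    a, b, Q0, t0 = 0.055, 220.0, 60.0, 1958.0
    ts = np.linspace(t0, t0 + 60, 6001)
    sol = solve_ivp(lambda t, Q: a * Q * (1 - Q / b), (t0, t0 + 60), [Q0],
                    t_eval=ts, rtol=1e-10, atol=1e-10)
    rates = a * sol.y[0] * (1 - sol.y[0] / b)       # annual production dQ/dt
    print(ts[np.argmax(rates)])                     # close to the mid-1970s

    # The same answer from the closed form: the peak is where Q = b/2,
    # i.e. at t0 + ln((b - Q0)/Q0)/a.
    print(t0 + np.log((b - Q0) / Q0) / a)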

36. First order autonomous quadratic equations. Consider the equation

dx/dt = Ax^2 + Bx + C.

The right hand side will have either two zeros, one (double) zero, or no (real) zeros depending on whether B^2 - 4AC is positive, zero, or negative. If there are two zeros, say

p := (-B + √(B^2 - 4AC))/(2A),    q := (-B - √(B^2 - 4AC))/(2A),

then the equation may be written as

dx/dt = A(x - p)(x - q)

and the limiting behavior can be determined from the phase diagram as we did last week. If there are no zeros, all solutions reach ±∞ in finite time. After completing the square and rescaling x and t the equation has one of the following three forms:

Example 37. Example with no zeros. We solve the ODE

dx/dt = 1 + x^2,    x(0) = x_0.

We separate variables and integrate:

tan^{-1}(x) = ∫ dx/(1 + x^2) = ∫ dt = t + c,    c := tan^{-1}(x_0),

so for -π/2 < t + c < π/2 we have

x = tan(t + c) = (tan t + tan c)/(1 - tan t tan c) = (tan t + x_0)/(1 - x_0 tan t).

The solution becomes infinite when t = tan^{-1}(1/x_0).

Example 38. Example with two zeros. We solve the ODE

dx/dt = 1 - x^2,    x(0) = x_0.

We separate variables and integrate:

tanh^{-1}(x) = ∫ dx/(1 - x^2) = ∫ dt = t + c,    c := tanh^{-1}(x_0),

so

x = tanh(t + c) = (tanh t + tanh c)/(1 + tanh t tanh c) = (tanh t + x_0)/(1 + x_0 tanh t).

For -1 < x_0 < 1 we have lim_{t→∞} x = 1 and lim_{t→-∞} x = -1.

Example 39. Example with a double zero. The solution of the ODE

dx/dt = x^2,    x(0) = x_0

is x = x_0/(1 - x_0 t). If x_0 ≠ 0 it becomes infinite when t = 1/x_0.
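A numerical illustration (not in the notes) of Example 37 with x_0 = 1: the explicit solution (tan t + 1)/(1 - tan t) should match a numerical one, and both blow up at t = tan^{-1}(1) = π/4 ≈ 0.785.

    # dx/dt = 1 + x**2 with x(0) = 1: compare with the explicit solution and locate the blow-up.
    import numpy as np
    from scipy.integrate import solve_ivp

    x0 = 1.0
    ts = np.array([0.2, 0.5, 0.7])
    sol = solve_ivp(lambda t, x: 1 + x**2, (0, 0.7), [x0], t_eval=ts, rtol=1e-12, atol=1e-12)
    print(sol.y[0])
    print((np.tan(ts) + x0) / (1 - x0 * np.tan(ts)))   # agrees
    print(np.arctan(1 / x0))                           # blow-up time, about 0.7854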

40. After a change of variables, every quadratic ODE dy/ds = Ay^2 + By + C takes one of these three forms. Divide by A and complete the square:

(1/A) dy/ds = (y + B/(2A))^2 - (B^2 - 4AC)/(4A^2).

Let u = y + B/(2A) and k^2 = |(B^2 - 4AC)/(4A^2)|:

(1/A) du/ds = u^2 ∓ k^2.

Finally (if k ≠ 0) divide by k^2 and let x = u/k and t = Ak^2 s to arrive at dx/dt = x^2 ∓ 1, i.e. (after replacing t by -t if necessary) dx/dt = 1 + x^2 or dx/dt = 1 - x^2. (If k = 0, take x = u and t = As to arrive at dx/dt = x^2.)

41. Trig functions and hyperbolic functions.

sin(t) = (e^{it} - e^{-it})/(2i)                      sinh(t) = (e^t - e^{-t})/2
cos(t) = (e^{it} + e^{-it})/2                         cosh(t) = (e^t + e^{-t})/2
tan(t) = (e^{it} - e^{-it})/(i(e^{it} + e^{-it}))     tanh(t) = (e^t - e^{-t})/(e^t + e^{-t})
cos^2(t) + sin^2(t) = 1                               cosh^2(t) - sinh^2(t) = 1
d/dt sin(t) = cos(t)                                  d/dt sinh(t) = cosh(t)
d/dt cos(t) = -sin(t)                                 d/dt cosh(t) = sinh(t)
tan(t + s) = (tan t + tan s)/(1 - tan t tan s)        tanh(t + s) = (tanh t + tanh s)/(1 + tanh t tanh s)

42. Bifurcation and dependence on parameters. The differential equation

dx/dt = x(4 - x) - h

models a logistic population equation with harvesting rate h. The equilibrium points are

H = 2 - √(4 - h),    N = 2 + √(4 - h),    if h < 4.

There is a bifurcation at h = 4. This means that the qualitative behavior of the system changes as h increases past 4. When h = 4 there is a double root (H = N), and for h > 4 there is no real root and all solutions reach -∞ in finite time.
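A small numerical sketch (not in the notes) of the equilibria as h varies, showing the two equilibria merging at h = 4 and disappearing for h > 4:

    # Equilibria of dx/dt = x*(4 - x) - h as the harvesting rate h varies.
    import numpy as np

    for h in [0.0, 2.0, 3.9, 4.0, 4.1]:
        disc = 4.0 - h
        if disc >= 0:
            H, N = 2 - np.sqrt(disc), 2 + np.sqrt(disc)
            print(f"h = {h}: equilibria H = {H:.3f}, N = {N:.3f}")
        else:
            print(f"h = {h}: no equilibria (every solution reaches -infinity in finite time)")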

43. Air resistance proportional to v. The equation of motion is F = ma where the force is F = F_G + F_R with

a = dv/dt,    v = dy/dt,    F_G = -mg,    F_R = -kv.

This can be solved by separation of variables and there is a terminal velocity

v_∞ := lim_{t→∞} v = -mg/k,

which is independent of the initial velocity.
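A quick numerical check (not part of the notes) that solutions of m dv/dt = -mg - kv approach -mg/k regardless of the initial velocity; m = 1, g = 9.8, k = 0.5 are arbitrary illustration values.

    # Terminal velocity for m*dv/dt = -m*g - k*v: v(t) -> -m*g/k for any v(0).
    import numpy as np
    from scipy.integrate import solve_ivp

    m, g, k = 1.0, 9.8, 0.5
    for v0 in [-30.0, 0.0, 50.0]:
        sol = solve_ivp(lambda t, v: -g - k * v / m, (0, 40), [v0], t_eval=[40.0])
        print(v0, sol.y[0, -1])      # all close to -m*g/k = -19.6
    print(-m * g / k)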

44. Air resistance proportional to v^2. The equation of motion is F = ma where the force is F = F_G + F_R with

a = dv/dt,    v = dy/dt,    F_G = -mg,    F_R = -kv|v|.

Thus

m dv/dt = -mg - kv^2    when v > 0,
m dv/dt = -mg + kv^2    when v < 0.

After rescaling (i.e. a change of variable) we can suppose m = g = k = 1 and use the above. To find the height y we need to choose the constants of integration correctly.
45. Escape velocity. A spaceship of mass m is attracted to a planet of mass M by a gravitational force of magnitude GMm/r^2, so that (after cancelling m) the equation of motion (if gravity is the only force acting on the spaceship) is

dv/dt = d^2r/dt^2 = -GM/r^2,

where r is the distance of the spaceship to the center of the planet and v = dr/dt is the velocity of the spaceship. As above, the energy

E := mv^2/2 - GMm/r

is a constant of the motion, so if r(0) = r_0 and v(0) = v_0 we have (after dividing by m/2)

v^2 - 2GM/r = v_0^2 - 2GM/r_0,

from which follows

v^2 > v_0^2 - 2GM/r_0.

The quantity √(2GM/r_0) is called the escape velocity. If v_0 is greater than the escape velocity then (note that v stays positive, since v^2 ≥ v_0^2 - 2GM/r_0 > 0 and v(0) = v_0 > 0)

r(t) - r_0 = ∫_0^t (dr/dt) dt = ∫_0^t v dt > ∫_0^t √(v_0^2 - 2GM/r_0) dt = t √(v_0^2 - 2GM/r_0),

so r(t) → ∞ as t → ∞, i.e. the spaceship escapes to infinity.
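For concreteness (not in the notes), the escape velocity √(2GM/r_0) from the surface of the Earth, using the standard values G ≈ 6.674e-11 m^3 kg^-1 s^-2, M ≈ 5.972e24 kg, r_0 ≈ 6.371e6 m:

    # Escape velocity sqrt(2*G*M/r0) from the Earth's surface.
    import math

    G = 6.674e-11      # gravitational constant, m^3 / (kg s^2)
    M = 5.972e24       # mass of the Earth, kg
    r0 = 6.371e6       # radius of the Earth, m
    print(math.sqrt(2 * G * M / r0))   # about 1.12e4 m/s, i.e. roughly 11.2 km/s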

Wednesday February 11

46. Monthly Investing. Mary starts a savings account. She plans to invest 100 + t dollars t months after opening the account. The account pays 6% annual interest. How much is in the account after t months? Denote by S(t) the amount in the account after t months. Then S(0) = 100 and S(t + 1) = S(t) + interest + deposit, i.e.

S(t + 1) = S(t) + (0.06/12) S(t) + (100 + t).

This equation can be written in the form S(t + h) = S(t) + f(t, S(t))h where h = 1 and f(t, S) = (0.06/12) S + (100 + t). It can also be written

ΔS = f(t, S(t)) Δt,

where ΔS = S(t + h) - S(t) and Δt = h = 1.

47. Daily Investing. Donald starts a savings account. He plans to invest daily at a rate of 100 + t dollars per month after opening the account. The account pays 6% annual interest. How much is in the account after t months? Denote by S(t) the amount in the account after t months. This is n = 30t days. One day is h months where h = 1/30. Then S(0) = 100 and S(t + h) = S(t) + one day's interest + one day's deposit, i.e.

S(t + h) = S(t) + (0.06/12) S(t)h + (100 + t)h,    h = 1/30,    t = nh.

This equation can be written in the form S(t + h) = S(t) + f(t, S(t))h where h = 1/30 and f(t, S) = (0.06/12) S + (100 + t). It can also be written

ΔS = f(t, S(t)) Δt,

where ΔS = S(t + h) - S(t) and Δt = h = 1/30.

48. Hourly Investing. Harold starts a savings account. He plans to invest hourly at a rate of 100 + t dollars per month after opening the account. The account pays 6% annual interest. How much is in the account after t months? Denote by S(t) the amount in the account after t months. This is n = 720t hours. One hour is h months where h = 1/720. Then S(0) = 100 and S(t + h) = S(t) + one hour's interest + one hour's deposit, i.e.

S(t + h) = S(t) + (0.06/12) S(t)h + (100 + t)h,    h = 1/720,    t = nh.

This equation can be written in the form S(t + h) = S(t) + f(t, S(t))h where h = 1/720 and f(t, S) = (0.06/12) S + (100 + t). It can also be written

ΔS = f(t, S(t)) Δt,

where ΔS = S(t + h) - S(t) and Δt = h = 1/720.

49. Continuous Investing. Cynthia starts a savings account. She plans to invest continuously at a rate of 100 + t dollars per month after opening the account. The account pays 6% annual interest. How much is in the account after t months? Denote by S(t) the amount in the account after t months. Then S(0) = 100 and the change dS in the account in an infinitesimal time interval of size dt at time t is

dS = (0.06/12) S(t) dt + (100 + t) dt.

This equation can be written in the form dS = f(t, S(t)) dt where f(t, S) = (0.06/12) S + (100 + t).

Remark 50. Mary is getting an annual interest rate of 6% compounded monthly. Donald is getting an annual interest rate of 6% compounded daily and is investing a little more each month than is Mary. Harold is getting an annual interest rate of 6% compounded hourly and is investing a little more each month than is Donald. Cynthia is getting an annual interest rate of 6% compounded continuously and is investing a little more each month than is Harold. The point is that all the answers are about the same. Here's why:

Theorem 51 (The Error in Euler's Method). Assume that f(t, y) is continuously differentiable. Let y = y(t) be the solution to the initial value problem

dy/dt = f(t, y),    y(0) = y_0,

and let y_n be the solution to the difference equation y_{n+1} = y_n + f(nh, y_n)h. Then there is a constant C = C(f, T) (dependent on T and f but independent of h) such that

|y(t) - y_n| ≤ Ch

for t = nh and 0 ≤ t ≤ T.

Remark 52. This theorem is stated on page 122 of the text. When I get a chance, I will put a formula for C in these notes and provide a proof. (Only motivated students should try to learn the proof.)
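The four investing schemes are exactly Euler's method with step sizes h = 1, 1/30, 1/720 applied to dS/dt = (0.06/12)S + (100 + t), S(0) = 100, so by Theorem 51 their answers differ from Cynthia's by O(h). A small sketch (not from the notes) comparing the balances after 12 months:

    # Mary, Donald, and Harold are Euler's method with h = 1, 1/30, 1/720;
    # Cynthia is the exact solution of dS/dt = 0.005*S + (100 + t), S(0) = 100.
    import numpy as np
    from scipy.integrate import solve_ivp

    f = lambda t, S: (0.06 / 12) * S + (100 + t)

    def euler(h, T=12.0, S0=100.0):
        n = round(T / h)
        t, S = 0.0, S0
        for _ in range(n):
            S += f(t, S) * h
            t += h
        return S

    for name, h in [("Mary", 1.0), ("Donald", 1 / 30), ("Harold", 1 / 720)]:
        print(name, euler(h))
    sol = solve_ivp(f, (0, 12), [100.0], rtol=1e-10, atol=1e-10)
    print("Cynthia", sol.y[0, -1])   # all four values are close, as Remark 50 says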

