Ordinary Differential Equations
with
Linear Algebra
(Lecture Notes)
Todd Kapitula
Contents
Contents
List of Figures
1 Essentials of Linear Algebra
    1.1 Introduction
    1.2 Linear equations and systems
        1.2.1 Notation and terminology
        1.2.2 Solutions of linear systems
        1.2.3 Solving by Gaussian elimination
    1.3 Linear combinations of vectors and matrix/vector multiplication
        1.3.1 Linear combinations of vectors
        1.3.2 Matrix/vector multiplication
    1.4 Span of vectors and the null space
        1.4.1 Span
        1.4.2 Homogeneous linear systems
    1.5 Linear systems equivalence results
    1.6 Linear independence of vectors
    1.7 Subspaces and basis
        1.7.1 Subspaces
        1.7.2 Basis of a subspace
    1.8 Matrix algebra
    1.9 The inverse of a square matrix
    1.10 The determinant of a square matrix
    1.11 Eigenvalues and eigenvectors
        1.11.1 Example: Markov processes
        1.11.2 Eigenvalues and eigenvectors
    1.12 Application: ordinary differential equations
2 Scalar first-order linear differential equations
    2.1 Motivating problems
    2.2 General theory
        2.2.1 Existence and uniqueness theory
        2.2.2 Numerical solutions: Euler's method
    2.3 Analytical solutions
        2.3.1 The homogeneous solution
        2.3.2 The particular solution
            2.3.2.1 Variation of parameters
            2.3.2.2 Undetermined coefficients
    2.4 Example: one-tank mixing problem
3 Systems of first-order linear differential equations
    3.1 Motivating problems
        3.1.1 Two-tank mixing problem
        3.1.2 Mass-spring system
        3.1.3 LRC circuit
    3.2 Existence and uniqueness theory
    3.3 Analytical solutions: constant coefficient matrices
            3.3.2.1 Variation of parameters: examples
            3.3.2.2 Undetermined coefficients
    3.4 Examples
        3.4.1 Mass-spring system
        3.4.2 Coupled oscillators
        3.4.3 Two-tank mixing problem
4 Scalar higher-order linear differential equations
    4.1 Connection with first-order systems
    4.2 Example: the forced-damped oscillator
5 Discontinuous forcing and the Laplace transform
    5.1 Discontinuous forcing
        5.1.1 The Heaviside unit step function
        5.1.2 The Dirac delta function
    5.2 The particular solution: variation of parameters
    5.3 The particular solution: the Laplace transform
        5.3.1 Definition and properties of the Laplace transform
        5.3.2 Application: second-order scalar ODEs
        5.3.3 Example: the forced oscillator
        5.3.4 Example: one-tank mixing problem
        5.3.5 The transfer function and convolutions
List of Figures
1.1 (color online) A graphical depiction of the three possibilities for linear systems of two equations in two unknowns. The left figure shows the case when the corresponding lines are not parallel, and the other two figures show the cases when the lines are parallel.
1.2 (color online) The output from the applet Matrix Calculator for the matrix $\mathbf{A}$ in (1.11.7).
2.1 (color online) A cartoon which depicts the mixing of a brine solution in a reservoir.
2.2 (color online) The direction field (slope field) associated with the ODE $x' = x^2/4 + \cos(3t)$. It is calculated using the Java applet DFIELD developed by J. Polking. A solution curve is drawn for three different initial conditions.
2.3 (color online) A cartoon which illustrates the existence/uniqueness theory of Theorem 2.2.1. The thick (blue) curve denotes the solution curve, and the dashed (green) box is the domain on which the functions $g(t, x)$ and $\partial_x g(t, x)$ are both continuous.
2.4 (color online) A cartoon which depicts the solution curve $x = \phi(t)$, as well as a piecewise linear approximation.
2.5 (color online) The solution curve for the ODE $x' = x^2/4 + \cos(3t)$ with the initial condition $x(0) = 0$. The purple curve denotes the true solution, while the red curve is the Euler approximation with a step size of $h = 0.2$.
2.6 (color online) The solution curve for the ODE $x' = x^2/4 + \cos(3t)$ with the initial condition $x(0) = 0$. The purple curve denotes the true solution, while the red curve is the Euler approximation with a step size of $h = 0.05$. Note that the approximation is much better than that given in Figure 2.5, even though it is still not very good.
2.7 (color online) The solution curve for the ODE $x' = x^2/4 + \cos(3t)$ with the initial condition $x(0) = 0$. The purple curve denotes the true solution, while the green curve is the Runge-Kutta approximation with a step size of $h = 0.2$. Note that the approximation is much better than that given in Figure 2.6, even though the step size is four times larger.
3.1 (color online) A cartoon which depicts the two-tank mixing problem.
3.2 (color online) A cartoon which depicts the mass-spring problem.
3.3 (color online) The phase plane associated with the system (3.3.4). It is calculated using the Java applet PPLANE developed by J. Polking. Solution curves are drawn for several different initial conditions.
3.4 (color online) The phase plane associated with the system (3.3.5). Solution curves are drawn for several different initial conditions.
3.7 (color online) The phase plane associated with the system (3.3.8) when $a = 1$.
3.8 (color online) The phase plane associated with the system (3.3.8) when $a = -1$.
3.9 (color online) The phase plane associated with the system (3.3.8) when $a = 0$.
3.10 (color online) A cartoon of the solution to the sinusoidally forced mass-spring for the initial condition $y(0) = y'(0) = 0$. The thick (red) curve is the amplitude of the solution (see Figure 3.11), and the thin (blue) curve is the actual solution curve.
3.11 (color online) A cartoon of the plot of the amplitude $A(\omega)$ is given by the thick (blue) curve in the left figure, and a cartoon of the plot of the period of the amplitude function $T(\omega)$ is given in the right figure.
3.12 (color online) A cartoon which depicts the coupled mass-spring problem.
3.13 (color online) A cartoon which depicts the amplitudes associated with the motion for each mass. The amplitude for the top mass is denoted by the (red) solid curve, and that for the bottom mass is given by the (blue) dashed curve.
3.14 (color online) A cartoon which depicts a two-tank mixing problem in which the incoming concentration $c(t)$ is not necessarily constant.
3.15 (color online) A cartoon which as $t \to +\infty$ depicts the concentration in tank A, $c_A(t)$, and tank B, $c_B(t)$, when the incoming concentration is given by $c(t) = c_0(1 + 0.5\sin(t))$. The curves are not to scale. The horizontal dashed line associated with each curve is a plot of its mean.
4.1 (color online) The type of the zeros of the characteristic polynomial for scalar second-order ODEs in the $a_0a_1$-plane. The thick (red) curve denotes the transition between real-valued zeros and complex-valued zeros.
4.2 (color online) A cartoon of the dynamics in the $x_1x_2$-plane, where $x_1 = y$, $x_2 = y'$, for the second-order ODE in each portion of the $a_0a_1$-plane.
4.3 (color online) A cartoon of the dynamics in the $x_1x_2$-plane for the homogeneous problem associated with (4.2.1) (compare with Figure 4.2).
4.4 (color online) A cartoon of the plot of $\mathrm{Re}(\lambda_+)$, which is defined in (4.2.3).
4.5 (color online) A cartoon of the plots of the amplitude $A(\omega)$ (left figure) and the phase $\phi(\omega)$ (right figure). Note that when $\omega$ is small the phase is roughly zero, so that the mass roughly oscillates in-phase with the forcing. On the other hand, if $\omega$ is large, then the phase is roughly $\pi$, so that the mass is oscillating out-of-phase with the forcing.
5.1 (color online) The plot of the step function $u(t - a)$ is given in the left figure, and the plot of the function $\delta_h(t - a)$ is given in the right figure.
5.2 (color online) The plot of the solution to the ODE $y'' + 4y' + 3y = f(t)$ with the initial condition $y(0) = y'(0) = 0$. The thick dashed (blue) curve is the solution for $f(t) = 5u(t - 2)$, while the thick solid (red) curve is the solution for $f(t) = 5\delta(t - 2)$.
5.3 (color online) Two solution plots of the forced oscillator when the forcing function is modeled by a sum of delta functions (see (5.3.5)). Here we choose $f_0 = 1$ and $\tau_1 = 3.7$. The solid (red) curve is the solution for $f_1 = 0.9$, and the dashed (blue) curve is the solution for $f_1 = -0.9$. The two solutions coincide for $0 \le t \le 3.7$.
5.4 (color online) Cartoon plots of the solution curve for $f_1 = -f_0$ (left figure) and $f_1 = f_0$ (right figure). The time $\tau_1$ is chosen in each figure so that there is no motion for $t > \tau_1$. In the left figure $\tau_1 = 1 + 4\pi$, while in the right figure $\tau_1 = 1 + 3\pi$. The (red) arrows depict the direction of the forcing at $t = 1$ and $t = \tau_1$.
5.5 (color online) Plots of the solution curve of (5.3.9) under the condition that $f_0 = a = 1$. The thick solid (blue) curve is the solution when $\omega = 0.5$, and the thick dashed (red) curve is the solution when $\omega = 2.5$.
5.6 (color online) The input $F(s)$ is the Laplace transform of the forcing function $f(t)$, and the output $Y(s)$ is the Laplace transform of the solution $y(t)$. The physical system being modeled is described by the transfer function $H(s)$.
5.7 (color online) The poles of the transfer function are denoted by (blue) circles. Recall that if $s = a + ib$, then $\mathrm{Re}(s) = a$ and $\mathrm{Im}(s) = b$. In the left figure all of the poles have negative real part, so that the homogeneous solution is a transient solution. In the right figure some of the poles have positive real part, so that the zero solution is unstable.
1 Essentials of Linear Algebra
1.1 Introduction
The average student knows how to solve two equations in two unknowns in a most elementary way: the method of substitution. For example, consider the system of equations
\[
2x + y = 6, \qquad 2x + 4y = 5.
\]
Solving the first equation for $y$, i.e., $y = 6 - 2x$, and substituting this expression into the second equation gives
\[
2x + 4(6 - 2x) = 5 \quad\Longrightarrow\quad x = \frac{19}{6}.
\]
Substitution into either of the equations gives the value of $y$; namely, $y = -1/3$.
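The substitution above can be checked with exact rational arithmetic. The following is a minimal sketch, assuming only Python's standard `fractions` module; the variable names are illustrative and not part of the notes.

```python
from fractions import Fraction

# System from the text: 2x + y = 6 and 2x + 4y = 5.
# Substituting y = 6 - 2x into the second equation gives
# 2x + 4(6 - 2x) = 5, i.e. (2 - 8)x = 5 - 24.
x = Fraction(5 - 24, 2 - 8)
y = 6 - 2 * x

print(x, y)  # both values are exact fractions
```

Working with `Fraction` rather than floating point keeps the check exact, so both original equations can be tested with `==`.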
For systems of three or more equations this algorithm is algebraically unwieldy. Furthermore, it is inefficient, as it is often unclear which variable(s) should be substituted into which equation(s). Thus, at the very least, we should develop an efficient algorithm for solving large systems of equations. Perhaps more troubling (at least to the mathematician!) is the fact that the method of substitution does not yield any insight into the structure of the solution set. An analysis and understanding of this structure is the topic of linear algebra. As we will see, not only will we gain a much better understanding of how to solve linear algebraic systems, but by considering the problem more abstractly we will better understand how to solve linear systems of Ordinary Differential Equations (ODEs).
This chapter is organized in the following manner. We begin by developing an efficient algorithm for solving linear systems of equations: Gaussian elimination. We then consider the problem using matrices and vectors, and spend considerable time and energy trying to understand the solution structure via these objects. We conclude the chapter by looking at special vectors associated with square matrices: the eigenvectors. These vectors have the special algebraic property that the matrix multiplied by the eigenvector is simply a scalar multiple of the eigenvector (this scalar is known as the associated eigenvalue). As we will see later in the course, the eigenvalues and eigenvectors are the key objects associated with a matrix that allow us to easily and explicitly write down the solution to an ODE.
1.2 Linear equations and systems
1.2.1 Notation and terminology
A linear equation in $n$ variables is an algebraic equation of the form
\[
a_1 x_1 + a_2 x_2 + \cdots + a_n x_n = b. \tag{1.2.1}
\]
The (possibly complex-valued) numbers $a_1, a_2, \dots, a_n$ are the coefficients, and the quantities to be solved for are the variables $x_1, \dots, x_n$, which are also sometimes called unknowns. An example in two variables is
\[
2x_1 - 5x_2 = 7,
\]
and an example in three variables is
\[
x_1 - 3x_2 + 9x_3 = 2.
\]
A system of linear equations is a collection of $m$ linear equations (1.2.1), and can be written as
\[
\begin{aligned}
a_{11}x_1 + a_{12}x_2 + \cdots + a_{1n}x_n &= b_1 \\
a_{21}x_1 + a_{22}x_2 + \cdots + a_{2n}x_n &= b_2 \\
&\;\;\vdots \\
a_{m1}x_1 + a_{m2}x_2 + \cdots + a_{mn}x_n &= b_m.
\end{aligned}
\tag{1.2.2}
\]
Note that the coefficient $a_{jk}$ is associated with the variable $x_k$ in the $j$th equation. An example of two equations in three variables is given by
\[
\begin{aligned}
x_1 - 4x_2 &= 6 \\
3x_1 + 2x_2 - 5x_3 &= 2.
\end{aligned}
\tag{1.2.3}
\]
When there is a large number of equations and/or variables, it is awkward to write down a linear system in the form of (1.2.2). It is more convenient instead to use a matrix formulation. A matrix is a rectangular array of numbers with $m$ rows and $n$ columns, and such a matrix is said to be an $m \times n$ (read "$m$ by $n$") matrix. If $m = n$, the matrix is said to be a square matrix. A matrix will be denoted as $\mathbf{A}$. If all of the numbers are real-valued, then we say that $\mathbf{A} \in \mathbb{R}^{m \times n}$; otherwise, we say that $\mathbf{A} \in \mathbb{C}^{m \times n}$ (recall that $z \in \mathbb{C}$ means that $z = x + iy$, where $x, y \in \mathbb{R}$ and $i^2 = -1$). The coefficient matrix for the linear system (1.2.2) is given by
\[
\mathbf{A} =
\begin{pmatrix}
a_{11} & a_{12} & \cdots & a_{1n} \\
a_{21} & a_{22} & \cdots & a_{2n} \\
\vdots & \vdots & \ddots & \vdots \\
a_{m1} & a_{m2} & \cdots & a_{mn}
\end{pmatrix},
\tag{1.2.4}
\]
and the coefficient $a_{jk}$ is now in the $j$th row and $k$th column. For example, the coefficient matrix for the system (1.2.3) is given by
\[
\mathbf{A} =
\begin{pmatrix}
1 & -4 & 0 \\
3 & 2 & -5
\end{pmatrix}
\in \mathbb{R}^{2 \times 3},
\]
and
\[
a_{11} = 1, \quad a_{12} = -4, \quad a_{13} = 0, \quad a_{21} = 3, \quad a_{22} = 2, \quad a_{23} = -5.
\]
A vector, say $\mathbf{v}$, is an $m \times 1$ matrix, i.e., a matrix with only one column. A vector is sometimes called a column vector or $m$-vector, and we write $\mathbf{v} \in \mathbb{R}^{m \times 1} \cong \mathbb{R}^m$. The $n$ variables in the system (1.2.2) will be written as the vector
\[
\mathbf{x} =
\begin{pmatrix}
x_1 \\ x_2 \\ \vdots \\ x_n
\end{pmatrix},
\]
and the constants on the right-hand side will be written as the vector
\[
\mathbf{b} =
\begin{pmatrix}
b_1 \\ b_2 \\ \vdots \\ b_m
\end{pmatrix}.
\]
In conclusion, for the system (1.2.2) there are three matrix quantities: $\mathbf{A}$, $\mathbf{x}$, $\mathbf{b}$. Furthermore, we will represent the linear system using these matrices as
\[
\mathbf{A}\mathbf{x} = \mathbf{b}. \tag{1.2.5}
\]
We will later see what it means to multiply a matrix and a vector. The linear system is said to be homogeneous if $\mathbf{b} = \mathbf{0}$, i.e., $b_j = 0$ for $j = 1, \dots, m$; otherwise, the system is said to be nonhomogeneous.
1.2.2 Solutions of linear systems
A solution to the linear system (1.2.2) is a vector $\mathbf{x}$ which satisfies all $m$ equations simultaneously. For example, consider the linear system of three equations in three unknowns for which
\[
\mathbf{A} =
\begin{pmatrix}
1 & 0 & -1 \\
3 & 1 & 0 \\
1 & -1 & -1
\end{pmatrix},
\qquad
\mathbf{b} =
\begin{pmatrix}
0 \\ 1 \\ -4
\end{pmatrix},
\tag{1.2.6}
\]
i.e.,
\[
x_1 - x_3 = 0, \qquad 3x_1 + x_2 = 1, \qquad x_1 - x_2 - x_3 = -4.
\]
It is not difficult to check that a solution is given by
\[
\mathbf{x} =
\begin{pmatrix}
-1 \\ 4 \\ -1
\end{pmatrix}
\quad\Longleftrightarrow\quad
x_1 = -1, \; x_2 = 4, \; x_3 = -1.
\]
A system of linear equations with at least one solution is said to be consistent; otherwise, it is inconsistent.
How many solutions does a linear system have? Consider the system given by
\[
2x_1 - x_2 = -2, \qquad -x_1 + 3x_2 = 11.
\]
The first equation represents a line in the $x_1x_2$-plane with slope $2$, and the second equation represents a line with slope $1/3$. Since lines with different slopes intersect at a unique point, there is a unique solution to this system, and it is consistent. It is not difficult to check that the solution is given by $(x_1, x_2) = (1, 4)$. Next consider the system given by
\[
2x_1 - x_2 = -2, \qquad -4x_1 + 2x_2 = 4.
\]
Each equation represents a line with slope $2$, so that the lines are parallel. Consequently, the lines are either identically the same, so that there are an infinite number of solutions, or they intersect at no point, so that the system is inconsistent. It is not difficult to check that this system is consistent, since the second equation is $-2$ times the first; however, the system
\[
2x_1 - x_2 = -2, \qquad -4x_1 + 2x_2 = 7
\]
is inconsistent.
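The three cases for two lines in the plane can be told apart mechanically. The sketch below is an illustration only: the function name `classify_2x2` is hypothetical, it assumes each equation has at least one nonzero coefficient, and it takes the three systems above with signs chosen so that the stated slopes and the solution $(1, 4)$ hold.

```python
from fractions import Fraction

def classify_2x2(a11, a12, b1, a21, a22, b2):
    """Classify a11*x1 + a12*x2 = b1, a21*x1 + a22*x2 = b2.

    Assumes each equation really describes a line, i.e. has at least
    one nonzero coefficient.
    """
    det = a11 * a22 - a12 * a21
    if det != 0:
        # Lines with different slopes: exactly one intersection point.
        x1 = Fraction(b1 * a22 - a12 * b2, det)
        x2 = Fraction(a11 * b2 - b1 * a21, det)
        return "unique", (x1, x2)
    # Parallel lines: either the same line or no common point.
    if a11 * b2 - a21 * b1 == 0 and a12 * b2 - a22 * b1 == 0:
        return "infinite", None
    return "inconsistent", None

first = classify_2x2(2, -1, -2, -1, 3, 11)
second = classify_2x2(2, -1, -2, -4, 2, 4)
third = classify_2x2(2, -1, -2, -4, 2, 7)
print(first, second, third)
```

The quantity `det` vanishes exactly when the two lines have the same slope, which is why it separates the first case from the other two.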
So far, we see that a linear system with two equations and two unknowns is either consistent with one or an infinite number of solutions, or is inconsistent. It is not difficult to show that this fact holds for linear systems with three unknowns. Each linear equation in the system represents a plane in $x_1x_2x_3$-space. Given any two planes, we know that they are either parallel, or intersect along a line. Thus, if the system has two equations, then it will either be consistent with an infinite number of solutions, or inconsistent. Suppose that the system with two equations is consistent, and add a third linear equation. Suppose that the original two planes intersect along a line. This new plane is either parallel to the line, or intersects it at precisely one point. If the original two planes are the same, then the new plane is either parallel to both, or intersects them along a line. In conclusion, for a system of equations with three variables there is either a unique solution, an infinite number of solutions, or no solution.
This argument can be generalized to show the following result:
[Figure 1.1: three panels with axes $x_1$ and $x_2$; panel labels "Inconsistent", "1 solution", "$\infty$ solutions".]
Figure 1.1: (color online) A graphical depiction of the three possibilities for linear systems of two equations in two unknowns. The left figure shows the case when the corresponding lines are not parallel, and the other two figures show the cases when the lines are parallel.
Theorem 1.2.1. If the linear system (1.2.2) is consistent, then there is either a unique solution, or an infinite number of solutions.
Remark 1.2.2. Theorem 1.2.1 does not hold for nonlinear systems. For example, the nonlinear system
\[
x_1^2 + x_2^2 = 2, \qquad x_1 + x_2 = 0
\]
is consistent, and has the two solutions $(-1, 1)$, $(1, -1)$.
It is often the case that if a linear system is consistent, then more cannot be said about the number of solutions without directly solving the system. However, in the argument leading up to Theorem 1.2.1 we did see that if a system of two equations in three unknowns is consistent, then there are necessarily an infinite number of solutions. This result holds in general:
Corollary 1.2.3. Suppose that the linear system is such that $m < n$, i.e., there are fewer equations than unknowns. If the system is consistent, then there are an infinite number of solutions.
1.2.3 Solving by Gaussian elimination
We now need to understand how to systematically solve a given linear system. While the method of substitution works fine for two equations in two unknowns, it quickly breaks down as a practical method when there are three or more variables involved in the system. We need to come up with something else.
The simplest linear system for $n$ equations in $n$ unknowns to solve is of the form
\[
x_1 = b_1, \quad x_2 = b_2, \quad x_3 = b_3, \quad \dots, \quad x_n = b_n.
\]
The coefficient matrix associated with this system is
\[
\mathbf{A} =
\begin{pmatrix}
1 & 0 & 0 & \cdots & 0 & 0 \\
0 & 1 & 0 & \cdots & 0 & 0 \\
0 & 0 & 1 & \cdots & 0 & 0 \\
\vdots & \vdots & \vdots & \ddots & \vdots & \vdots \\
0 & 0 & 0 & \cdots & 1 & 0 \\
0 & 0 & 0 & \cdots & 0 & 1
\end{pmatrix},
\]
and the solution is given by $\mathbf{x} = \mathbf{b}$. The above matrix will henceforth be called the identity matrix. The identity matrix has ones on the diagonal, and is zero everywhere else. A square identity matrix with $n$ columns will be denoted $\mathbf{I}_n$; for example,
\[
\mathbf{I}_2 =
\begin{pmatrix}
1 & 0 \\
0 & 1
\end{pmatrix},
\qquad
\mathbf{I}_3 =
\begin{pmatrix}
1 & 0 & 0 \\
0 & 1 & 0 \\
0 & 0 & 1
\end{pmatrix}.
\]
If the number of equations is not equal to the number of unknowns, then a particularly simple system to solve is, for example, given by
\[
x_1 - 3x_3 + 4x_4 = 2, \qquad x_2 + x_3 - 6x_4 = 5. \tag{1.2.7}
\]
The coefficient matrix for this system is given by
\[
\mathbf{A} =
\begin{pmatrix}
1 & 0 & -3 & 4 \\
0 & 1 & 1 & -6
\end{pmatrix}
\in \mathbb{R}^{2 \times 4};
\]
however, note that another equally valid coefficient matrix is given by
\[
\mathbf{A} =
\begin{pmatrix}
1 & 0 & -3 & 4 \\
0 & 1 & 1 & -6 \\
0 & 0 & 0 & 0
\end{pmatrix}
\in \mathbb{R}^{3 \times 4}.
\]
This second coefficient matrix corresponds to adding the (seemingly useless) equation
\[
0x_1 + 0x_2 + 0x_3 + 0x_4 = 0
\]
to (1.2.7). In either case, the solution to this system is given by
\[
x_1 = 2 + 3s - 4t, \; x_2 = 5 - s + 6t, \; x_3 = s, \; x_4 = t
\quad\Longleftrightarrow\quad
\mathbf{x} =
\begin{pmatrix}
2 + 3s - 4t \\
5 - s + 6t \\
s \\
t
\end{pmatrix},
\quad s, t \in \mathbb{R};
\]
note that there are an infinite number of solutions.
These coefficient matrices share a common feature, which is detailed below:
Definition 1.2.4. A matrix is said to be in reduced row echelon form (RREF) if
(a) all nonzero rows are above any zero row
(b) the first nonzero entry in a row (the leading entry) is a one
(c) every other entry in a column with a leading one is zero.
Those columns with a leading entry are known as pivot columns, and the leading entries are called pivot positions.
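The conditions of the definition translate directly into a check on a matrix stored as a list of rows. The following sketch is illustrative only (the name `is_rref` is hypothetical). It also enforces one extra condition that is an assumption here, not something stated in the definition: the standard "staircase" requirement that each leading one lies strictly to the right of the leading ones above it.

```python
def is_rref(M):
    """Return True if the matrix M (a list of rows) is in RREF."""
    last_lead = -1
    seen_zero_row = False
    for i, row in enumerate(M):
        nonzero = [k for k, v in enumerate(row) if v != 0]
        if not nonzero:
            seen_zero_row = True
            continue
        if seen_zero_row:
            return False          # (a) a nonzero row sits below a zero row
        lead = nonzero[0]
        if row[lead] != 1:
            return False          # (b) the leading entry must be a one
        if lead <= last_lead:
            return False          # staircase condition (assumed, see lead-in)
        if any(M[r][lead] != 0 for r in range(len(M)) if r != i):
            return False          # (c) rest of the pivot column must be zero
        last_lead = lead
    return True

print(is_rref([[1, 0, -3, 4], [0, 1, 1, -6]]))
print(is_rref([[1, 2], [0, 1]]))
```

The first matrix is the coefficient matrix of a system already in solved form; the second fails condition (c) because the entry above the second leading one is nonzero.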
If a coefficient matrix is in RREF, then the linear system is particularly easy to solve. Thus, our goal is to take a given linear system with its attendant coefficient matrix, and then perform allowable algebraic operations so that the new system has a coefficient matrix which is in RREF. What are the allowable operations?
(a) multiply any equation by a nonzero constant
(b) add/subtract equations
(c) switch the ordering of equations.
In order to do these operations most efficiently, it is best to work with the augmented matrix associated with the linear system $\mathbf{A}\mathbf{x} = \mathbf{b}$; namely, the matrix $(\mathbf{A}|\mathbf{b})$. The augmented matrix is formed by adding a column to the coefficient matrix, and in that column inserting the vector $\mathbf{b}$. For example, for the linear system associated with (1.2.6) the augmented matrix is given by
\[
(\mathbf{A}|\mathbf{b}) =
\left(\begin{array}{rrr|r}
1 & 0 & -1 & 0 \\
3 & 1 & 0 & 1 \\
1 & -1 & -1 & -4
\end{array}\right).
\tag{1.2.8}
\]
The allowable operations on the individual equations in the linear system correspond to operations on the rows of the augmented matrix. In particular, when doing Gaussian elimination on an augmented matrix in order to put it into RREF, we are allowed to:
(a) multiply any row by a nonzero constant
(b) add/subtract rows
(c) switch the ordering of the rows.
Once we have performed Gaussian elimination on an augmented matrix in order to put it into RREF, we can easily solve the resultant system.
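The three allowable row operations can be sketched as small functions acting on an augmented matrix stored as a list of rows. This is only an illustration: the names `scale`, `combine`, `swap`, and `satisfies` are hypothetical, rows are indexed from 0 as is usual in Python, and the test system $2x_1 - x_2 = -2$, $-x_1 + 3x_2 = 11$ with solution $(1, 4)$ is chosen for convenience.

```python
from fractions import Fraction

def scale(M, c, j):
    """Return a copy of M with row j multiplied by the nonzero constant c."""
    if c == 0:
        raise ValueError("scaling by zero is not an allowable operation")
    return [[c * v for v in row] if i == j else list(row)
            for i, row in enumerate(M)]

def combine(M, a, j, b, k):
    """Return a copy of M with row k replaced by a*(row j) + b*(row k)."""
    if b == 0:
        raise ValueError("b must be nonzero, otherwise row k is lost")
    new_row = [a * vj + b * vk for vj, vk in zip(M[j], M[k])]
    return [new_row if i == k else list(row) for i, row in enumerate(M)]

def swap(M, j, k):
    """Return a copy of M with rows j and k exchanged."""
    N = [list(row) for row in M]
    N[j], N[k] = N[k], N[j]
    return N

def satisfies(M, x):
    """Check that x solves every equation encoded in the augmented matrix M."""
    return all(sum(c * xi for c, xi in zip(row[:-1], x)) == row[-1]
               for row in M)

M = [[2, -1, -2], [-1, 3, 11]]   # augmented matrix of the test system
x = (1, 4)
N = swap(M, 0, 1)
N = combine(N, 2, 0, 1, 1)       # 2*(row 0) + 1*(row 1) replaces row 1
N = scale(N, Fraction(1, 5), 1)
print(N, satisfies(M, x), satisfies(N, x))
```

Each operation returns a new matrix, and the solution of the original system still satisfies the transformed one, which is the reason these operations are allowable.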
Remark 1.2.5. As a rule-of-thumb, when putting an augmented matrix into RREF, the idea is to place
1s on the diagonal (as much as possible), and 0s everywhere else (again, as much as possible).
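Example. Consider the linear system associated with the augmented matrix in (1.2.8). We will henceforth let $\rho_j$ denote the $j$th row of a matrix. The operation "$a\rho_j + b\rho_k$" will be taken to mean: multiply the $j$th row by $a$, multiply the $k$th row by $b$, add the two resultant rows together, and finally replace the $k$th row with this sum. With this notation in mind, performing Gaussian elimination yields
\[
\begin{aligned}
(\mathbf{A}|\mathbf{b})
&\xrightarrow{-3\rho_1 + \rho_2}
\left(\begin{array}{rrr|r}
1 & 0 & -1 & 0 \\
0 & 1 & 3 & 1 \\
1 & -1 & -1 & -4
\end{array}\right)
\xrightarrow{-\rho_1 + \rho_3}
\left(\begin{array}{rrr|r}
1 & 0 & -1 & 0 \\
0 & 1 & 3 & 1 \\
0 & -1 & 0 & -4
\end{array}\right)
\xrightarrow{\rho_2 + \rho_3}
\left(\begin{array}{rrr|r}
1 & 0 & -1 & 0 \\
0 & 1 & 3 & 1 \\
0 & 0 & 3 & -3
\end{array}\right) \\
&\xrightarrow{(1/3)\rho_3}
\left(\begin{array}{rrr|r}
1 & 0 & -1 & 0 \\
0 & 1 & 3 & 1 \\
0 & 0 & 1 & -1
\end{array}\right)
\xrightarrow{-3\rho_3 + \rho_2}
\left(\begin{array}{rrr|r}
1 & 0 & -1 & 0 \\
0 & 1 & 0 & 4 \\
0 & 0 & 1 & -1
\end{array}\right)
\xrightarrow{\rho_3 + \rho_1}
\left(\begin{array}{rrr|r}
1 & 0 & 0 & -1 \\
0 & 1 & 0 & 4 \\
0 & 0 & 1 & -1
\end{array}\right).
\end{aligned}
\]
The new linear system is
\[
x_1 = -1, \; x_2 = 4, \; x_3 = -1
\quad\Longleftrightarrow\quad
\mathbf{x} =
\begin{pmatrix}
-1 \\ 4 \\ -1
\end{pmatrix},
\]
which is also immediately seen to be the solution.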
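Example. Consider the linear system
\[
\begin{aligned}
x_1 - 2x_2 - x_3 &= 0 \\
3x_1 + x_2 + 4x_3 &= 7 \\
2x_1 + 3x_2 + 5x_3 &= 7.
\end{aligned}
\]
Performing Gaussian elimination on the augmented matrix yields
\[
\begin{aligned}
\left(\begin{array}{rrr|r}
1 & -2 & -1 & 0 \\
3 & 1 & 4 & 7 \\
2 & 3 & 5 & 7
\end{array}\right)
&\xrightarrow{-3\rho_1 + \rho_2}
\left(\begin{array}{rrr|r}
1 & -2 & -1 & 0 \\
0 & 7 & 7 & 7 \\
2 & 3 & 5 & 7
\end{array}\right)
\xrightarrow{-2\rho_1 + \rho_3}
\left(\begin{array}{rrr|r}
1 & -2 & -1 & 0 \\
0 & 7 & 7 & 7 \\
0 & 7 & 7 & 7
\end{array}\right) \\
&\xrightarrow{-\rho_2 + \rho_3}
\left(\begin{array}{rrr|r}
1 & -2 & -1 & 0 \\
0 & 7 & 7 & 7 \\
0 & 0 & 0 & 0
\end{array}\right)
\xrightarrow{(1/7)\rho_2}
\left(\begin{array}{rrr|r}
1 & -2 & -1 & 0 \\
0 & 1 & 1 & 1 \\
0 & 0 & 0 & 0
\end{array}\right)
\xrightarrow{2\rho_2 + \rho_1}
\left(\begin{array}{rrr|r}
1 & 0 & 1 & 2 \\
0 & 1 & 1 & 1 \\
0 & 0 & 0 & 0
\end{array}\right).
\end{aligned}
\]
The new linear system to be solved is given by
\[
x_1 + x_3 = 2, \qquad x_2 + x_3 = 1, \qquad 0x_1 + 0x_2 + 0x_3 = 0.
\]
Ignoring the last equation, this is a system of two equations with three unknowns; consequently, if it is consistent (and it is!), then it must be the case that there are an infinite number of solutions. The variables $x_1$ and $x_2$ are associated with leading entries in the RREF form of the augmented matrix. We will say that any variable of a RREF matrix which is not associated with a leading entry is a free variable: free variables are associated with those columns in the RREF matrix which are not pivot columns. In this case, $x_3$ is a free variable. Upon setting $x_3 = t$, where $t \in \mathbb{R}$, the other variables are given by
\[
x_1 = 2 - t, \qquad x_2 = 1 - t.
\]
The solution is then given by
\[
\mathbf{x} =
\begin{pmatrix}
2 - t \\ 1 - t \\ t
\end{pmatrix},
\quad t \in \mathbb{R}.
\]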
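The reduction in this example can be reproduced programmatically. The following is a minimal sketch of Gauss-Jordan elimination with exact rational arithmetic. The function name `rref` and its return convention are assumptions made for illustration, and the augmented matrix is the one from the example above with the minus signs of the system $x_1 - 2x_2 - x_3 = 0$ written out.

```python
from fractions import Fraction

def rref(M):
    """Return (R, pivots): the RREF of M and the list of pivot columns."""
    R = [[Fraction(v) for v in row] for row in M]
    rows, cols = len(R), len(R[0])
    pivots = []
    r = 0
    for c in range(cols):
        # Look for a row at or below r with a nonzero entry in column c.
        p = next((i for i in range(r, rows) if R[i][c] != 0), None)
        if p is None:
            continue
        R[r], R[p] = R[p], R[r]                 # switch the ordering of rows
        lead = R[r][c]
        R[r] = [v / lead for v in R[r]]         # scale to get a leading one
        for i in range(rows):
            if i != r and R[i][c] != 0:
                factor = R[i][c]
                # Subtract a multiple of the pivot row to clear column c.
                R[i] = [vi - factor * vr for vi, vr in zip(R[i], R[r])]
        pivots.append(c)
        r += 1
        if r == rows:
            break
    return R, pivots

aug = [[1, -2, -1, 0],
       [3,  1,  4, 7],
       [2,  3,  5, 7]]
R, pivots = rref(aug)
# Columns of the coefficient part that are not pivot columns are free.
free = [c for c in range(len(aug[0]) - 1) if c not in pivots]
print(R, pivots, free)
```

With 0-based column indices, the pivot columns correspond to $x_1$ and $x_2$, and the remaining column corresponds to the free variable $x_3$.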
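Example. Consider a linear system which is a variant of the one given above; namely,
\[
\begin{aligned}
x_1 - 2x_2 - x_3 &= 0 \\
3x_1 + x_2 + 4x_3 &= 7 \\
2x_1 + 3x_2 + 5x_3 &= 8.
\end{aligned}
\]
Upon doing Gaussian elimination of the augmented matrix we see that
\[
\left(\begin{array}{rrr|r}
1 & -2 & -1 & 0 \\
3 & 1 & 4 & 7 \\
2 & 3 & 5 & 8
\end{array}\right)
\longrightarrow
\left(\begin{array}{rrr|r}
1 & 0 & 1 & 2 \\
0 & 1 & 1 & 1 \\
0 & 0 & 0 & 1
\end{array}\right).
\]
The new linear system to be solved is
\[
x_1 + x_3 = 2, \qquad x_2 + x_3 = 1, \qquad 0x_1 + 0x_2 + 0x_3 = 1.
\]
Since the last equation clearly does not have a solution, the system is inconsistent.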
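The inconsistency can also be seen without carrying out the full reduction: a suitable combination of the three original equations already produces $0 = 1$. The sketch below is only an illustration; the weights $1, -1, 1$ were found by inspection and do not appear in the notes.

```python
# Rows of the augmented matrix for
#   x1 - 2*x2 -   x3 = 0
# 3*x1 +   x2 + 4*x3 = 7
# 2*x1 + 3*x2 + 5*x3 = 8
rows = [[1, -2, -1, 0],
        [3,  1,  4, 7],
        [2,  3,  5, 8]]

# Form (row 1) - (row 2) + (row 3), entry by entry.
weights = [1, -1, 1]
combo = [sum(w * row[k] for w, row in zip(weights, rows)) for k in range(4)]
print(combo)
```

Every coefficient in the combined equation vanishes while the right-hand side does not, so no choice of $x_1, x_2, x_3$ can satisfy all three equations at once.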
1.3 Linear combinations of vectors and matrix/vector multiplication
1.3.1 Linear combinations of vectors
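The addition/subtraction of two $n$-vectors is exactly what is expected, as is the multiplication of a vector by a scalar; namely,
\[
\mathbf{x} \pm \mathbf{y} =
\begin{pmatrix}
x_1 \pm y_1 \\ x_2 \pm y_2 \\ \vdots \\ x_n \pm y_n
\end{pmatrix},
\qquad
c\mathbf{x} =
\begin{pmatrix}
cx_1 \\ cx_2 \\ \vdots \\ cx_n
\end{pmatrix}.
\]
Definition 1.3.1. A linear combination of the $n$-vectors $\mathbf{a}_1, \dots, \mathbf{a}_k$ is given by the vector $\mathbf{b}$, where
\[
\mathbf{b} = x_1\mathbf{a}_1 + x_2\mathbf{a}_2 + \cdots + x_k\mathbf{a}_k = \sum_{j=1}^{k} x_j\mathbf{a}_j.
\]
The scalars $x_1, \dots, x_k$ are known as weights.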
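With this notion of linear combinations of vectors, we can rewrite linear systems of equations in vector notation. For example, consider the linear system
\[
\begin{aligned}
x_1 - x_2 + x_3 &= -1 \\
3x_1 + 2x_2 + 8x_3 &= 7 \\
-2x_1 - 4x_2 - 8x_3 &= -10.
\end{aligned}
\tag{1.3.1}
\]
Upon using the fact that two vectors are equal if and only if all of their coefficients are equal, we can write (1.3.1) in vector form as
\[
\begin{pmatrix}
x_1 - x_2 + x_3 \\
3x_1 + 2x_2 + 8x_3 \\
-2x_1 - 4x_2 - 8x_3
\end{pmatrix}
=
\begin{pmatrix}
-1 \\ 7 \\ -10
\end{pmatrix}.
\]
After setting
\[
\mathbf{a}_1 =
\begin{pmatrix} 1 \\ 3 \\ -2 \end{pmatrix},
\quad
\mathbf{a}_2 =
\begin{pmatrix} -1 \\ 2 \\ -4 \end{pmatrix},
\quad
\mathbf{a}_3 =
\begin{pmatrix} 1 \\ 8 \\ -8 \end{pmatrix},
\quad
\mathbf{b} =
\begin{pmatrix} -1 \\ 7 \\ -10 \end{pmatrix},
\]
the linear system can then be rewritten as the linear combination of vectors
\[
x_1\mathbf{a}_1 + x_2\mathbf{a}_2 + x_3\mathbf{a}_3 = \mathbf{b}. \tag{1.3.2}
\]
In other words, asking for solutions to the linear system (1.3.1) can instead be thought of as asking if the vector $\mathbf{b}$ is a linear combination of the vectors $\mathbf{a}_1, \dots, \mathbf{a}_3$. It can be checked that after Gaussian elimination
\[
\left(\begin{array}{rrr|r}
1 & -1 & 1 & -1 \\
3 & 2 & 8 & 7 \\
-2 & -4 & -8 & -10
\end{array}\right)
\longrightarrow
\left(\begin{array}{rrr|r}
1 & 0 & 2 & 1 \\
0 & 1 & 1 & 2 \\
0 & 0 & 0 & 0
\end{array}\right),
\]
so that the solution to the linear system (1.3.1) is given by
\[
x_1 = 1 - 2t, \quad x_2 = 2 - t, \quad x_3 = t; \quad t \in \mathbb{R}. \tag{1.3.3}
\]
In other words, the vector $\mathbf{b}$ is a linear combination of the vectors $\mathbf{a}_1, \dots, \mathbf{a}_3$, and the weights are given in (1.3.3).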
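The claim that every choice of $t$ in (1.3.3) produces valid weights can be spot-checked numerically. In the sketch below the helper `linear_combination` is a hypothetical name, and the vectors are those of (1.3.1) with the minus signs written out, namely $\mathbf{a}_1 = (1, 3, -2)$, $\mathbf{a}_2 = (-1, 2, -4)$, $\mathbf{a}_3 = (1, 8, -8)$ and $\mathbf{b} = (-1, 7, -10)$.

```python
def linear_combination(weights, vectors):
    """Return sum_j weights[j] * vectors[j], computed entry by entry."""
    n = len(vectors[0])
    return tuple(sum(w * v[i] for w, v in zip(weights, vectors))
                 for i in range(n))

a1, a2, a3 = (1, 3, -2), (-1, 2, -4), (1, 8, -8)
b = (-1, 7, -10)

# Weights from (1.3.3): x1 = 1 - 2t, x2 = 2 - t, x3 = t.
results = [linear_combination((1 - 2 * t, 2 - t, t), (a1, a2, a3))
           for t in range(-3, 4)]
all_match = all(r == b for r in results)
print(all_match)
```

A finite set of integer values of $t$ is of course not a proof, but since each entry of the combination is linear in $t$, agreement at two distinct values of $t$ already forces agreement for all $t$.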
1.3.2 Matrix/vector multiplication
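With this observation in mind, we can now define the multiplication of a matrix and a vector so that the resultant corresponds to a linear system. For the linear system of (1.3.1) again let $\mathbf{A}$ be the coefficient matrix, i.e.,
\[
\mathbf{A} = (\mathbf{a}_1 \; \mathbf{a}_2 \; \mathbf{a}_3) \in \mathbb{R}^{3 \times 3}.
\]
Here each column of $\mathbf{A}$ is thought of as a vector. If for
\[
\mathbf{x} =
\begin{pmatrix}
x_1 \\ x_2 \\ x_3
\end{pmatrix}
\]
we define
\[
\mathbf{A}\mathbf{x} := x_1\mathbf{a}_1 + x_2\mathbf{a}_2 + x_3\mathbf{a}_3,
\]
then by using (1.3.2) we have that the linear system is given by
\[
\mathbf{A}\mathbf{x} = \mathbf{b} \tag{1.3.4}
\]
(compare with (1.2.5)). In other words, by writing the linear system in the form of (1.3.4) we really mean (1.3.2), which in turn really means (1.3.1).
Definition 1.3.2. Suppose that $\mathbf{A} = (\mathbf{a}_1 \; \mathbf{a}_2 \; \cdots \; \mathbf{a}_n)$, where each vector $\mathbf{a}_j \in \mathbb{R}^m$ is an $m$-vector. Then for $\mathbf{x} \in \mathbb{R}^n$ we have that
\[
\mathbf{A}\mathbf{x} = x_1\mathbf{a}_1 + x_2\mathbf{a}_2 + \cdots + x_n\mathbf{a}_n = \sum_{j=1}^{n} x_j\mathbf{a}_j.
\]
Remark 1.3.3. Note that $\mathbf{A} \in \mathbb{R}^{m \times n}$ and $\mathbf{x} \in \mathbb{R}^n = \mathbb{R}^{n \times 1}$, so by definition
\[
\underbrace{\mathbf{A}}_{\mathbb{R}^{m \times n}} \; \underbrace{\mathbf{x}}_{\mathbb{R}^{n \times 1}} = \underbrace{\mathbf{b}}_{\mathbb{R}^{m \times 1}}.
\]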
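Example. We have that
\[
\begin{pmatrix}
1 & 2 \\
3 & 4
\end{pmatrix}
\begin{pmatrix}
-3 \\ 5
\end{pmatrix}
= -3
\begin{pmatrix}
1 \\ 3
\end{pmatrix}
+ 5
\begin{pmatrix}
2 \\ 4
\end{pmatrix}
=
\begin{pmatrix}
7 \\ 11
\end{pmatrix},
\]
and
\[
\begin{pmatrix}
1 & 2 & 5 \\
3 & 4 & 6
\end{pmatrix}
\begin{pmatrix}
2 \\ -1 \\ 3
\end{pmatrix}
= 2
\begin{pmatrix}
1 \\ 3
\end{pmatrix}
-
\begin{pmatrix}
2 \\ 4
\end{pmatrix}
+ 3
\begin{pmatrix}
5 \\ 6
\end{pmatrix}
=
\begin{pmatrix}
15 \\ 20
\end{pmatrix}.
\]
Note that in the first example a $2 \times 2$ matrix multiplied a $2 \times 1$ matrix in order to get a $2 \times 1$ matrix, whereas in the second example a $2 \times 3$ matrix multiplied a $3 \times 1$ matrix in order to get a $2 \times 1$ matrix.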
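The definition of the product as a linear combination of columns can be written out directly. The sketch below is illustrative only: the function name `matvec` is hypothetical, the matrix is stored as a list of rows, and the two calls reproduce the examples above with the vectors $(-3, 5)$ and $(2, -1, 3)$.

```python
def matvec(A, x):
    """Compute A x as x_1 a_1 + ... + x_n a_n, where a_j is column j of A."""
    m, n = len(A), len(A[0])
    if len(x) != n:
        raise ValueError("x must have as many entries as A has columns")
    result = [0] * m
    for j in range(n):
        column = [A[i][j] for i in range(m)]
        # Add the weighted column x_j * a_j to the running sum.
        result = [r + x[j] * c for r, c in zip(result, column)]
    return result

first = matvec([[1, 2], [3, 4]], [-3, 5])
second = matvec([[1, 2, 5], [3, 4, 6]], [2, -1, 3])
print(first, second)
```

The size check mirrors the dimension count in the text: an $m \times n$ matrix can only multiply an $n \times 1$ vector, and the result has $m$ entries.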
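The multiplication of a matrix and a vector is a linear operation:
Lemma 1.3.4. Let $A \in \mathbb{R}^{m \times n}$, and $\mathbf{x}, \mathbf{y} \in \mathbb{R}^n$. It is then true that for any scalars $c, d$,
\[ A(c\mathbf{x} + d\mathbf{y}) = cA\mathbf{x} + dA\mathbf{y}. \]
Proof. Writing $A = (\mathbf{a}_1 \ \mathbf{a}_2 \ \cdots \ \mathbf{a}_n)$, and using the fact that
\[ c\mathbf{x} + d\mathbf{y} = \begin{pmatrix} cx_1 + dy_1 \\ cx_2 + dy_2 \\ \vdots \\ cx_n + dy_n \end{pmatrix}, \]
we have that
\begin{align*}
A(c\mathbf{x} + d\mathbf{y}) &= (cx_1 + dy_1)\mathbf{a}_1 + (cx_2 + dy_2)\mathbf{a}_2 + \cdots + (cx_n + dy_n)\mathbf{a}_n \\
&= [cx_1\mathbf{a}_1 + cx_2\mathbf{a}_2 + \cdots + cx_n\mathbf{a}_n] + [dy_1\mathbf{a}_1 + dy_2\mathbf{a}_2 + \cdots + dy_n\mathbf{a}_n] \\
&= c\,[x_1\mathbf{a}_1 + x_2\mathbf{a}_2 + \cdots + x_n\mathbf{a}_n] + d\,[y_1\mathbf{a}_1 + y_2\mathbf{a}_2 + \cdots + y_n\mathbf{a}_n] \\
&= cA\mathbf{x} + dA\mathbf{y}.
\end{align*}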
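A single numerical instance does not prove Lemma 1.3.4, but it can catch sign errors when the identity is used in practice. The sketch below is an illustration only: the matrix, vectors, and scalars are arbitrary choices, not data from the notes, and `matvec_rows` is a hypothetical helper.

```python
def matvec_rows(A, x):
    """Compute A x with A given as a list of rows."""
    return [sum(a * b for a, b in zip(row, x)) for row in A]

A = [[1, 2, 5], [3, 4, 6]]      # arbitrary 2x3 matrix (illustrative)
x, y = [2, -1, 3], [1, 0, -2]   # arbitrary vectors in R^3
c, d = 3, -2                    # arbitrary scalars

# Left-hand side: A(cx + dy).
lhs = matvec_rows(A, [c * xi + d * yi for xi, yi in zip(x, y)])
# Right-hand side: c Ax + d Ay.
rhs = [c * u + d * v for u, v in zip(matvec_rows(A, x), matvec_rows(A, y))]
print(lhs, rhs)
```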
Remark 1.3.5. We are already familiar with linear operators, which are simply operators which satisfy the linearity property of Lemma 1.3.4, in other contexts. For example, if $D$ represents differentiation, i.e., $D[f(t)] = f'(t)$, then we know from Calculus I that
\[ D[af(t) + bg(t)] = af'(t) + bg'(t) = aD[f(t)] + bD[g(t)]. \]
Similarly, if $I$ represents indefinite integration, i.e., $I[f(t)] = \int f(t)\,dt$, then we again know from Calculus I that
\[ I[af(t) + bg(t)] = a\int f(t)\,dt + b\int g(t)\,dt = aI[f(t)] + bI[g(t)]. \]
While we will not explore this issue too deeply in this class, the implication of this fact is that much of what we study about the actions of matrices on vectors also applies to operations such as differentiation and integration on functions.
1.4 Span of vectors and the null space
1.4.1 Span
Consider a collection of $n$-vectors $\mathbf{a}_1, \mathbf{a}_2, \dots, \mathbf{a}_k$. A particular linear combination of these vectors is given by $x_1\mathbf{a}_1 + \cdots + x_k\mathbf{a}_k$. The collection of all possible linear combinations of these vectors is known as the span of the vectors.
Definition 1.4.1. Let $S = \{\mathbf{a}_1, \mathbf{a}_2, \dots, \mathbf{a}_k\}$ be a set of $n$-vectors. The span of $S$, i.e.,
\[ \operatorname{span}(S) = \operatorname{span}\{\mathbf{a}_1, \mathbf{a}_2, \dots, \mathbf{a}_k\}, \]
is the collection of all linear combinations. In other words, $\mathbf{b} \in \operatorname{span}(S)$ if and only if for some $\mathbf{x} \in \mathbb{C}^k$,
\[ \mathbf{b} = x_1\mathbf{a}_1 + x_2\mathbf{a}_2 + \cdots + x_k\mathbf{a}_k. \]
Remark 1.4.2. Upon setting $A = (\mathbf{a}_1 \ \mathbf{a}_2 \ \cdots \ \mathbf{a}_k)$, we have that $\mathbf{b} \in \operatorname{span}(S)$ if and only if the linear system $A\mathbf{x} = \mathbf{b}$ is consistent.
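The span of a collection of vectors does have geometric meaning. First suppose that $\mathbf{a}_1 \in \mathbb{R}^3$. Recalling that lines in $\mathbb{R}^3$ are defined parametrically by
\[ \mathbf{r}(t) = \mathbf{r}_0 + t\mathbf{a}, \]
where $\mathbf{a}$ is a vector parallel to the line and $\mathbf{r}_0$ corresponds to a point on the line, we have that $\operatorname{span}\{\mathbf{a}_1\}$ is the line through the origin which is parallel to $\mathbf{a}_1$. This follows directly from
\[ \operatorname{span}\{\mathbf{a}_1\} = \{t\mathbf{a}_1 : t \in \mathbb{R}\}. \]
Now suppose that $\mathbf{a}_1, \mathbf{a}_2 \in \mathbb{R}^3$ are not parallel, i.e., $\mathbf{a}_2 \neq c\mathbf{a}_1$ for any $c \in \mathbb{R}$. Set $\mathbf{r} = \mathbf{a}_1 \times \mathbf{a}_2$, i.e., $\mathbf{r}$ is a 3-vector which is perpendicular to both $\mathbf{a}_1$ and $\mathbf{a}_2$. The linearity of the dot product yields that
\[ \mathbf{r} \cdot (s\mathbf{a}_1 + t\mathbf{a}_2) = s\,\mathbf{r} \cdot \mathbf{a}_1 + t\,\mathbf{r} \cdot \mathbf{a}_2 = 0, \]
so that
\[ \operatorname{span}\{\mathbf{a}_1, \mathbf{a}_2\} = \{s\mathbf{a}_1 + t\mathbf{a}_2 : s, t \in \mathbb{R}\} \]
is the collection of all vectors which are perpendicular to $\mathbf{r}$. In other words, $\operatorname{span}\{\mathbf{a}_1, \mathbf{a}_2\}$ is the plane through the origin which is perpendicular to $\mathbf{r}$. There are higher dimensional analogues, but unfortunately they are difficult to visualize.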
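Example. Letting
\[ \mathbf{a}_1 = \begin{pmatrix} 1 \\ 2 \end{pmatrix}, \quad \mathbf{a}_2 = \begin{pmatrix} 1 \\ 1 \end{pmatrix}, \quad \mathbf{b} = \begin{pmatrix} -1 \\ 2 \end{pmatrix}, \]
let us determine if $\mathbf{b} \in \operatorname{span}\{\mathbf{a}_1, \mathbf{a}_2\}$. As we have seen in Remark 1.4.2, this question is equivalent to determining if the linear system $A\mathbf{x} = \mathbf{b}$ is consistent. Since after Gaussian elimination
\[ (A|\mathbf{b}) \longrightarrow \left(\begin{array}{rr|r} 1 & 0 & 3 \\ 0 & 1 & -4 \end{array}\right), \]
the linear system $A\mathbf{x} = \mathbf{b}$ is equivalent to
\[ x_1 = 3, \quad x_2 = -4, \]
which is easily solved. Thus, not only is $\mathbf{b} \in \operatorname{span}\{\mathbf{a}_1, \mathbf{a}_2\}$, but it is the case that $\mathbf{b} = 3\mathbf{a}_1 - 4\mathbf{a}_2$.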
1.4.2 Homogeneous linear systems
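An interesting class of linear systems which are important to solve arises when $\mathbf{b} = \mathbf{0}$:
Definition 1.4.3. A homogeneous linear system is given by $A\mathbf{x} = \mathbf{0}$. The null space of $A$, denoted by $\operatorname{Nul}(A)$, is the set of all solutions to a homogeneous linear system, i.e.,
\[ \operatorname{Nul}(A) := \{\mathbf{x} : A\mathbf{x} = \mathbf{0}\}. \]
Remark 1.4.4. Note that $\operatorname{Nul}(A)$ is a nonempty set, as $A \cdot \mathbf{0} = \mathbf{0}$ implies that $\mathbf{0} \in \operatorname{Nul}(A)$.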
Homogeneous linear systems have the important property that linear combinations of solutions are solutions; namely:
Lemma 1.4.5. Suppose that $\mathbf{x}_1, \mathbf{x}_2 \in \operatorname{Nul}(A)$, i.e., they are two solutions to the homogeneous linear system $A\mathbf{x} = \mathbf{0}$. It is then true that $\mathbf{x} = c_1\mathbf{x}_1 + c_2\mathbf{x}_2 \in \operatorname{Nul}(A)$ for any $c_1, c_2 \in \mathbb{R}$; in other words, $\operatorname{span}\{\mathbf{x}_1, \mathbf{x}_2\} \subset \operatorname{Nul}(A)$.
Proof. The result follows immediately from the linearity of matrix/vector multiplication (see Lemma 1.3.4). In particular, we have that
\[ A(c_1\mathbf{x}_1 + c_2\mathbf{x}_2) = c_1A\mathbf{x}_1 + c_2A\mathbf{x}_2 = c_1\mathbf{0} + c_2\mathbf{0} = \mathbf{0}. \]
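Example. Let us find $\operatorname{Nul}(A)$ for the matrix
\[ A = \begin{pmatrix} 1 & -1 & 1 & 0 \\ 2 & -1 & 5 & 1 \\ 3 & -3 & 3 & 0 \end{pmatrix}. \]
Forming the augmented matrix for the linear system $A\mathbf{x} = \mathbf{0}$ and using Gaussian elimination yields
\[ (A|\mathbf{0}) \longrightarrow \left(\begin{array}{rrrr|r} 1 & 0 & 4 & 1 & 0 \\ 0 & 1 & 3 & 1 & 0 \\ 0 & 0 & 0 & 0 & 0 \end{array}\right), \]
which yields the linear system
\[ x_1 + 4x_3 + x_4 = 0, \quad x_2 + 3x_3 + x_4 = 0. \]
Setting $x_3 = s$, $x_4 = t$ yields that the solution vector is given by
\[ \mathbf{x} = \begin{pmatrix} -4s - t \\ -3s - t \\ s \\ t \end{pmatrix} = s \begin{pmatrix} -4 \\ -3 \\ 1 \\ 0 \end{pmatrix} + t \begin{pmatrix} -1 \\ -1 \\ 0 \\ 1 \end{pmatrix}. \]
We can conclude that
\[ \operatorname{Nul}(A) = \operatorname{span}\left\{ \begin{pmatrix} -4 \\ -3 \\ 1 \\ 0 \end{pmatrix}, \begin{pmatrix} -1 \\ -1 \\ 0 \\ 1 \end{pmatrix} \right\}. \]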
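Example. Let us find $\operatorname{Nul}(A)$ for the matrix
\[ A = \begin{pmatrix} 1 & 2 & 3 & -3 \\ 2 & 1 & 3 & 0 \\ 1 & -1 & 0 & 3 \\ 3 & -2 & 1 & 7 \end{pmatrix}. \]
Forming the augmented matrix for the linear system $A\mathbf{x} = \mathbf{0}$ and using Gaussian elimination yields
\[ (A|\mathbf{0}) \longrightarrow \left(\begin{array}{rrrr|r} 1 & 0 & 1 & 1 & 0 \\ 0 & 1 & 1 & -2 & 0 \\ 0 & 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 & 0 \end{array}\right), \]
which yields the linear system
\[ x_1 + x_3 + x_4 = 0, \quad x_2 + x_3 - 2x_4 = 0. \]
Again setting $x_3 = s$, $x_4 = t$ yields that the solution vector is given by
\[ \mathbf{x} = \begin{pmatrix} -s - t \\ -s + 2t \\ s \\ t \end{pmatrix} = s \begin{pmatrix} -1 \\ -1 \\ 1 \\ 0 \end{pmatrix} + t \begin{pmatrix} -1 \\ 2 \\ 0 \\ 1 \end{pmatrix}. \]
We can conclude that
\[ \operatorname{Nul}(A) = \operatorname{span}\left\{ \begin{pmatrix} -1 \\ -1 \\ 1 \\ 0 \end{pmatrix}, \begin{pmatrix} -1 \\ 2 \\ 0 \\ 1 \end{pmatrix} \right\}. \]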
Remark 1.4.6. When solving the homogeneous system $A\mathbf{x} = \mathbf{0}$ by Gaussian elimination, it is enough to row reduce the matrix $A$. In this case the augmented matrix $(A|\mathbf{0})$ yields no additional information upon appending $\mathbf{0}$ to $A$.
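The recipe used in the two examples (row reduce $A$, set each free variable to 1 in turn, and read off the pivot variables) can be sketched in Python. This is an illustrative sketch rather than the notes' own method of record: the names `rref` and `null_space_basis` are hypothetical, exact arithmetic uses the standard `fractions` module, and the signs in the test matrix are an assumption consistent with the stated RREF of the first example.

```python
from fractions import Fraction

def rref(rows):
    """Return (R, pivots): the RREF of the matrix and its pivot column indices."""
    m = [[Fraction(v) for v in row] for row in rows]
    pivots = []
    for col in range(len(m[0])):
        k = len(pivots)              # row where the next pivot should go
        if k == len(m):
            break
        # Look for a nonzero entry in this column at or below row k.
        r = next((i for i in range(k, len(m)) if m[i][col] != 0), None)
        if r is None:
            continue                 # no pivot: this column gives a free variable
        m[k], m[r] = m[r], m[k]
        p = m[k][col]
        m[k] = [v / p for v in m[k]]
        for i in range(len(m)):
            if i != k:
                f = m[i][col]
                m[i] = [a - f * b for a, b in zip(m[i], m[k])]
        pivots.append(col)
    return m, pivots

def null_space_basis(rows):
    """One spanning vector of Nul(A) per free variable, as in the examples."""
    reduced, pivots = rref(rows)
    n = len(rows[0])
    basis = []
    for free in (j for j in range(n) if j not in pivots):
        v = [Fraction(0)] * n
        v[free] = Fraction(1)
        for r, p in enumerate(pivots):
            v[p] = -reduced[r][free]   # pivot variable in terms of the free one
        basis.append(v)
    return basis

A = [[1, -1, 1, 0], [2, -1, 5, 1], [3, -3, 3, 0]]  # signs assumed
basis = null_space_basis(A)
print(basis)
```

Note that only $A$ is row reduced, in line with Remark 1.4.6.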
1.5 Linear systems equivalence results
The purpose of this section is to formulate criteria which guarantee that a linear system is consistent. In defining matrix/vector multiplication in such a way that the linear system makes sense as $A\mathbf{x} = \mathbf{b}$, we showed that the linear system is consistent if and only if
\[ \mathbf{b} = x_1\mathbf{a}_1 + x_2\mathbf{a}_2 + \cdots + x_n\mathbf{a}_n, \quad A = (\mathbf{a}_1 \ \mathbf{a}_2 \ \cdots \ \mathbf{a}_n). \]
Using Definition 1.4.1 for the span of a collection of vectors it is then the case that the system is consistent if and only if
\[ \mathbf{b} \in \operatorname{span}\{\mathbf{a}_1, \mathbf{a}_2, \dots, \mathbf{a}_n\}. \]
On the other hand, we solve the system by using Gaussian elimination to put the augmented matrix $(A|\mathbf{b})$ into RREF. We know that the system is inconsistent if the RREF form of the augmented matrix has a row of the form $(0 \ 0 \ 0 \ \cdots \ 0 \,|\, 1)$; otherwise, it is consistent. These observations lead to the following equivalence result:
Theorem 1.5.1. Regarding the linear system $A\mathbf{x} = \mathbf{b}$, where $A = (\mathbf{a}_1 \ \mathbf{a}_2 \ \cdots \ \mathbf{a}_n)$, the following are equivalent statements:
(a) the system is consistent
(b) $\mathbf{b}$ is a linear combination of the columns of $A$
(c) $\mathbf{b} \in \operatorname{span}\{\mathbf{a}_1, \mathbf{a}_2, \dots, \mathbf{a}_n\}$
(d) the RREF of the augmented matrix $(A|\mathbf{b})$ has no rows of the form $(0 \ 0 \ 0 \ \cdots \ 0 \,|\, 1)$.
We now wish to refine Theorem 1.5.1 in order to determine criteria which guarantee that the linear system is consistent for any vector $\mathbf{b}$. First, points (b)-(c) must be refined to say that for any $\mathbf{b} \in \mathbb{R}^m$, $\mathbf{b} \in \operatorname{span}\{\mathbf{a}_1, \dots, \mathbf{a}_n\}$; in other words, $\operatorname{span}\{\mathbf{a}_1, \dots, \mathbf{a}_n\} = \mathbb{R}^m$. Additionally, it must be the case that for a given $\mathbf{b}$ the RREF of the augmented matrix $(A|\mathbf{b})$ has no rows of the form $(0 \ 0 \ 0 \ \cdots \ 0 \,|\, 0)$. If such a row is present, then another vector $\mathbf{b}$ can be found such that the system will be inconsistent. Consequently, the system will always be consistent if and only if the RREF of $A$ has no zero rows.
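Corollary 1.5.2. Regarding the linear system $A\mathbf{x} = \mathbf{b}$, where $A = (\mathbf{a}_1 \ \mathbf{a}_2 \ \cdots \ \mathbf{a}_n)$ and $\mathbf{b} \in \mathbb{R}^m$, the following are equivalent statements:
(a) the system is consistent for any $\mathbf{b}$
(b) $\mathbb{R}^m = \operatorname{span}\{\mathbf{a}_1, \mathbf{a}_2, \dots, \mathbf{a}_n\}$
(c) the RREF of $A$ has no zero rows.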
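Example. Suppose that the coefficient matrix is given by
\[ A = \begin{pmatrix} 1 & 4 \\ 3 & 6 \end{pmatrix}. \]
Since Gaussian elimination yields that the RREF of $A$ is $I_2$, by Corollary 1.5.2 the linear system $A\mathbf{x} = \mathbf{b}$ is consistent for any $\mathbf{b} \in \mathbb{R}^2$.
Example. Suppose that the coefficient matrix is given by
\[ A = \begin{pmatrix} 1 & 2 \\ 3 & 6 \end{pmatrix}. \]
Gaussian elimination yields that the RREF of $A$ is
\[ A \longrightarrow \begin{pmatrix} 1 & 2 \\ 0 & 0 \end{pmatrix}. \]
Since the RREF of $A$ has a zero row, the system $A\mathbf{x} = \mathbf{b}$ is not consistent for all $\mathbf{b}$. Since
\[ A\mathbf{x} = x_1 \begin{pmatrix} 1 \\ 3 \end{pmatrix} + x_2 \begin{pmatrix} 2 \\ 6 \end{pmatrix} = (x_1 + 2x_2) \begin{pmatrix} 1 \\ 3 \end{pmatrix}, \]
we have that
\[ \operatorname{span}\left\{ \begin{pmatrix} 1 \\ 3 \end{pmatrix}, \begin{pmatrix} 2 \\ 6 \end{pmatrix} \right\} = \operatorname{span}\left\{ \begin{pmatrix} 1 \\ 3 \end{pmatrix} \right\}; \]
thus, $A\mathbf{x} = \mathbf{b}$ is by Theorem 1.5.1 consistent if and only if
\[ \mathbf{b} \in \operatorname{span}\left\{ \begin{pmatrix} 1 \\ 3 \end{pmatrix} \right\} \iff \mathbf{b} = c \begin{pmatrix} 1 \\ 3 \end{pmatrix}, \quad c \in \mathbb{R}. \]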
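Criterion (c) of Corollary 1.5.2 is easy to test mechanically: the RREF of $A$ has no zero rows exactly when the number of pivots equals the number of rows. The Python sketch below illustrates this on the two matrices above (entries taken as printed); `rref` and `consistent_for_every_b` are hypothetical helper names, not notation from the notes.

```python
from fractions import Fraction

def rref(rows):
    """Return (R, pivots): the RREF of the matrix and its pivot column indices."""
    m = [[Fraction(v) for v in row] for row in rows]
    pivots = []
    for col in range(len(m[0])):
        k = len(pivots)
        if k == len(m):
            break
        r = next((i for i in range(k, len(m)) if m[i][col] != 0), None)
        if r is None:
            continue
        m[k], m[r] = m[r], m[k]
        p = m[k][col]
        m[k] = [v / p for v in m[k]]
        for i in range(len(m)):
            if i != k:
                f = m[i][col]
                m[i] = [a - f * b for a, b in zip(m[i], m[k])]
        pivots.append(col)
    return m, pivots

def consistent_for_every_b(rows):
    """True when the RREF of A has no zero rows (one pivot per row)."""
    _, pivots = rref(rows)
    return len(pivots) == len(rows)

print(consistent_for_every_b([[1, 4], [3, 6]]))  # RREF is I_2
print(consistent_for_every_b([[1, 2], [3, 6]]))  # RREF has a zero row
```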
1.6 Linear independence of vectors
We start this section by breaking up the (supposed) solution(s) to the linear system
\[ A\mathbf{x} = \mathbf{b} \tag{1.6.1} \]
into two distinct pieces. First consider the homogeneous equation $A\mathbf{x} = \mathbf{0}$. We saw in Definition 1.4.3 that all solutions, designated by say $\mathbf{x}_h$ (the homogeneous solution), reside in the null space of $A$. Let a solution to the nonhomogeneous problem be designated as $\mathbf{x}_p$ (the particular solution). We then have that
\[ A(\mathbf{x}_h + \mathbf{x}_p) = A\mathbf{x}_h + A\mathbf{x}_p = \mathbf{0} + \mathbf{b} = \mathbf{b}, \]
so that $\mathbf{x}_h + \mathbf{x}_p$ is a solution to the linear system (1.6.1). Indeed, any solution can be written in such a manner, simply by writing a solution $\mathbf{x}$ as $\mathbf{x} = (\mathbf{x} - \mathbf{x}_h) + \mathbf{x}_h$ and designating $\mathbf{x}_p = \mathbf{x} - \mathbf{x}_h$.
Theorem 1.6.1. All solutions to the linear system (1.6.1) are of the form
\[ \mathbf{x} = \mathbf{x}_h + \mathbf{x}_p, \]
where the homogeneous solution $\mathbf{x}_h \in \operatorname{Nul}(A)$, and the particular solution $\mathbf{x}_p$ solves the system.
The result of Theorem 1.6.1 will be the foundation of solving not only linear systems, but also linear ordinary differential equations. It should be noted that there is a bit of ambiguity associated with the homogeneous solution. As we saw in Lemma 1.4.5, if $\mathbf{x}_1, \mathbf{x}_2 \in \operatorname{Nul}(A)$, then it will be the case that there is a family of homogeneous solutions given by the linear combination of these solutions, i.e., $\mathbf{x}_h = c_1\mathbf{x}_1 + c_2\mathbf{x}_2$ for any constants $c_1, c_2 \in \mathbb{R}$. On the other hand, there really is no such ambiguity for the particular solution. Indeed, since
\[ A(c\mathbf{x}_p) = cA\mathbf{x}_p = c\mathbf{b}, \]
we have (for $\mathbf{b} \neq \mathbf{0}$) that $c\mathbf{x}_p$ is a particular solution if and only if $c = 1$.
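Example. Consider a linear system for which
\[ A = \begin{pmatrix} 1 & 3 & 4 & -1 \\ 1 & -4 & -3 & 6 \\ 2 & -6 & -4 & 10 \\ 0 & 5 & 5 & -5 \end{pmatrix}, \quad \mathbf{b} = \begin{pmatrix} -2 \\ 5 \\ 8 \\ -5 \end{pmatrix}. \]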
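Upon performing Gaussian elimination the RREF of the augmented matrix is given by
\[ (A|\mathbf{b}) \longrightarrow \left(\begin{array}{rrrr|r} 1 & 0 & 1 & 2 & 1 \\ 0 & 1 & 1 & -1 & -1 \\ 0 & 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 & 0 \end{array}\right). \]
The original linear system is then equivalent to the system
\[ x_1 + x_3 + 2x_4 = 1, \quad x_2 + x_3 - x_4 = -1. \tag{1.6.2} \]
The free variables for this system are $x_3, x_4$, so by setting $x_3 = s$ and $x_4 = t$ we get the solution to be
\[ \mathbf{x} = \begin{pmatrix} -s - 2t + 1 \\ -s + t - 1 \\ s \\ t \end{pmatrix} = s \begin{pmatrix} -1 \\ -1 \\ 1 \\ 0 \end{pmatrix} + t \begin{pmatrix} -2 \\ 1 \\ 0 \\ 1 \end{pmatrix} + \begin{pmatrix} 1 \\ -1 \\ 0 \\ 0 \end{pmatrix}. \]
The claim is that for the solution written in this form,
\[ \mathbf{x}_h = s \begin{pmatrix} -1 \\ -1 \\ 1 \\ 0 \end{pmatrix} + t \begin{pmatrix} -2 \\ 1 \\ 0 \\ 1 \end{pmatrix}, \quad \mathbf{x}_p = \begin{pmatrix} 1 \\ -1 \\ 0 \\ 0 \end{pmatrix}. \]
It is easy to check that $\mathbf{x}_p$ is a particular solution: verify that $A\mathbf{x}_p = \mathbf{b}$. Similarly, in order to see that $\mathbf{x}_h$ is a homogeneous solution, simply check that
\[ A\mathbf{x}_h = A\left( s \begin{pmatrix} -1 \\ -1 \\ 1 \\ 0 \end{pmatrix} + t \begin{pmatrix} -2 \\ 1 \\ 0 \\ 1 \end{pmatrix} \right) = sA \begin{pmatrix} -1 \\ -1 \\ 1 \\ 0 \end{pmatrix} + tA \begin{pmatrix} -2 \\ 1 \\ 0 \\ 1 \end{pmatrix} = \mathbf{0}. \]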
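The two checks suggested above ($A\mathbf{x}_p = \mathbf{b}$ and $A\mathbf{x}_h = \mathbf{0}$) can be carried out with a few lines of Python. This is only a sketch under a stated assumption: the minus signs in $A$, $\mathbf{b}$, and the solution vectors are a reading of the example that makes it self-consistent (in particular, the sign of the last row of $A$ and of $b_4$ is a guess), and `matvec_rows` is a hypothetical helper.

```python
def matvec_rows(A, x):
    """Compute A x with A given as a list of rows."""
    return [sum(a * b for a, b in zip(row, x)) for row in A]

# Signs below are assumed; see the lead-in.
A = [[1, 3, 4, -1], [1, -4, -3, 6], [2, -6, -4, 10], [0, 5, 5, -5]]
b = [-2, 5, 8, -5]
x_p = [1, -1, 0, 0]                       # particular solution
v1, v2 = [-1, -1, 1, 0], [-2, 1, 0, 1]    # spanning vectors of Nul(A)

def solution(s, t):
    """x = x_h + x_p with x_h = s*v1 + t*v2."""
    return [s * a + t * c + p for a, c, p in zip(v1, v2, x_p)]

# Every choice of the free parameters should give a solution of Ax = b.
all_solve = all(matvec_rows(A, solution(s, t)) == b
                for s in range(-2, 3) for t in range(-2, 3))
print(all_solve)
```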
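By Lemma 1.4.5 we know that $\operatorname{Nul}(A) = \operatorname{span}\{\mathbf{x}_1, \dots, \mathbf{x}_k\}$ for some collection of vectors $\mathbf{x}_1, \dots, \mathbf{x}_k$, each of which is a solution to the homogeneous problem. It may be the case, however, that the original set of vectors used to describe $\operatorname{Nul}(A)$ is too large, in the sense that one or more of the given vectors gives extraneous information. For example, suppose that
\[ \operatorname{Nul}(A) = \operatorname{span}\{\mathbf{x}_1, \mathbf{x}_2, \mathbf{x}_3\}, \quad \mathbf{x}_1 = \begin{pmatrix} 1 \\ 1 \\ 0 \end{pmatrix}, \quad \mathbf{x}_2 = \begin{pmatrix} 1 \\ 0 \\ 1 \end{pmatrix}, \quad \mathbf{x}_3 = \begin{pmatrix} 5 \\ 2 \\ 3 \end{pmatrix}. \]
This implies that the homogeneous solution is given by
\[ \mathbf{x}_h = c_1\mathbf{x}_1 + c_2\mathbf{x}_2 + c_3\mathbf{x}_3 = (c_1 + 2c_3)\mathbf{x}_1 + (c_2 + 3c_3)\mathbf{x}_2. \]
In the second equality we used the fact that
\[ \mathbf{x}_3 = 2\mathbf{x}_1 + 3\mathbf{x}_2 \iff 2\mathbf{x}_1 + 3\mathbf{x}_2 - \mathbf{x}_3 = \mathbf{0}. \]
Thus, it is the case that the addition of $\mathbf{x}_3$ in the definition of the null space is superfluous, so that we can write
\[ \operatorname{Nul}(A) = \operatorname{span}\{\mathbf{x}_1, \mathbf{x}_2\}. \]
Since $\mathbf{x}_2 \neq c\mathbf{x}_1$ for any $c \in \mathbb{R}$, we cannot reduce the collection of vectors comprising the spanning set any further.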
Definition 1.6.2. The set of $n$-vectors $S = \{\mathbf{a}_1, \mathbf{a}_2, \dots, \mathbf{a}_k\}$ is linearly dependent if there is a nontrivial vector $\mathbf{x} \neq \mathbf{0} \in \mathbb{C}^k$ such that
\[ x_1\mathbf{a}_1 + x_2\mathbf{a}_2 + \cdots + x_k\mathbf{a}_k = \mathbf{0}. \tag{1.6.3} \]
Otherwise, the set of vectors is linearly independent.
Remark 1.6.3. In our previous example the set of vectors $\{\mathbf{x}_1, \mathbf{x}_2, \mathbf{x}_3\}$ is a linearly dependent set, whereas the set $\{\mathbf{x}_1, \mathbf{x}_2\}$ is a linearly independent set.
A careful examination of (1.6.3) reveals that we determine if a set of vectors is linearly dependent or independent by solving the homogeneous linear system
\[ A\mathbf{x} = \mathbf{0}, \quad A = (\mathbf{a}_1 \ \mathbf{a}_2 \ \cdots \ \mathbf{a}_k). \]
If there is a nontrivial solution, i.e., if $\operatorname{Nul}(A)$ contains something more than the zero vector, then the vectors will be linearly dependent; otherwise, they will be independent.
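Example. Let
\[ \mathbf{a}_1 = \begin{pmatrix} 1 \\ 0 \\ 1 \end{pmatrix}, \quad \mathbf{a}_2 = \begin{pmatrix} 3 \\ 1 \\ 4 \end{pmatrix}, \quad \mathbf{a}_3 = \begin{pmatrix} -1 \\ -1 \\ -2 \end{pmatrix}, \quad \mathbf{a}_4 = \begin{pmatrix} 3 \\ 3 \\ 2 \end{pmatrix}, \]
and consider the sets
\[ S_1 = \{\mathbf{a}_1, \mathbf{a}_2\}, \quad S_2 = \{\mathbf{a}_1, \mathbf{a}_2, \mathbf{a}_3\}, \quad S_3 = \{\mathbf{a}_1, \mathbf{a}_2, \mathbf{a}_3, \mathbf{a}_4\}. \]
Upon performing Gaussian elimination we have that the RREF of each given matrix is
\[ A_1 = (\mathbf{a}_1 \ \mathbf{a}_2) \longrightarrow \begin{pmatrix} 1 & 0 \\ 0 & 1 \\ 0 & 0 \end{pmatrix}, \quad A_2 = (\mathbf{a}_1 \ \mathbf{a}_2 \ \mathbf{a}_3) \longrightarrow \begin{pmatrix} 1 & 0 & 2 \\ 0 & 1 & -1 \\ 0 & 0 & 0 \end{pmatrix}, \]
\[ A_3 = (\mathbf{a}_1 \ \mathbf{a}_2 \ \mathbf{a}_3 \ \mathbf{a}_4) \longrightarrow \begin{pmatrix} 1 & 0 & 2 & 0 \\ 0 & 1 & -1 & 0 \\ 0 & 0 & 0 & 1 \end{pmatrix}. \]
We have that
\[ \operatorname{Nul}(A_1) = \{\mathbf{0}\}, \quad \operatorname{Nul}(A_2) = \operatorname{span}\left\{ \begin{pmatrix} -2 \\ 1 \\ 1 \end{pmatrix} \right\}, \quad \operatorname{Nul}(A_3) = \operatorname{span}\left\{ \begin{pmatrix} -2 \\ 1 \\ 1 \\ 0 \end{pmatrix} \right\}. \]
Thus, the only linearly independent set is $S_1$, and for the sets $S_2$ and $S_3$ we have the respective relations
\[ -2\mathbf{a}_1 + \mathbf{a}_2 + \mathbf{a}_3 = \mathbf{0}, \quad -2\mathbf{a}_1 + \mathbf{a}_2 + \mathbf{a}_3 + 0\mathbf{a}_4 = \mathbf{0}. \]
Thus, we have
\[ \operatorname{span}\{\mathbf{a}_1, \mathbf{a}_2, \mathbf{a}_3\} = \operatorname{span}\{\mathbf{a}_1, \mathbf{a}_2\}, \quad \operatorname{span}\{\mathbf{a}_1, \mathbf{a}_2, \mathbf{a}_3, \mathbf{a}_4\} = \operatorname{span}\{\mathbf{a}_1, \mathbf{a}_2, \mathbf{a}_4\}. \]
Furthermore, upon using Remark 1.4.2 and the result of Corollary 1.5.2 we see that the only spanning set, i.e., the only set for which $\mathbb{R}^3 = \operatorname{span}(S)$, is the set $S_3$. The span of each of $S_1$ and $S_2$ is a plane.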
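Example. Suppose that $S = \{\mathbf{a}_1, \mathbf{a}_2, \dots, \mathbf{a}_5\}$, where each $\mathbf{a}_j \in \mathbb{R}^4$. Further suppose that the RREF of $A$ is
\[ A = (\mathbf{a}_1 \ \mathbf{a}_2 \ \mathbf{a}_3 \ \mathbf{a}_4 \ \mathbf{a}_5) \longrightarrow \begin{pmatrix} 1 & 0 & 2 & 0 & 3 \\ 0 & 1 & -1 & 1 & 2 \\ 0 & 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 & 0 \end{pmatrix}. \]
We then have that
\[ \operatorname{Nul}(A) = \operatorname{span}\left\{ \begin{pmatrix} -2 \\ 1 \\ 1 \\ 0 \\ 0 \end{pmatrix}, \begin{pmatrix} 0 \\ -1 \\ 0 \\ 1 \\ 0 \end{pmatrix}, \begin{pmatrix} -3 \\ -2 \\ 0 \\ 0 \\ 1 \end{pmatrix} \right\}, \]
so that
\[ -2\mathbf{a}_1 + \mathbf{a}_2 + \mathbf{a}_3 = \mathbf{0}, \quad -\mathbf{a}_2 + \mathbf{a}_4 = \mathbf{0}, \quad -3\mathbf{a}_1 - 2\mathbf{a}_2 + \mathbf{a}_5 = \mathbf{0}. \]
In this case we have that
\[ \operatorname{span}\{\mathbf{a}_1, \mathbf{a}_2, \dots, \mathbf{a}_5\} = \operatorname{span}\{\mathbf{a}_1, \mathbf{a}_2\}. \]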
As we saw in the above examples, the set of linearly independent vectors which form a spanning set can be found by removing from the original set those vectors which correspond to free variables in the RREF of $A$. For example, in the last example the free variables are $x_3, x_4, x_5$, which means that each of the vectors $\mathbf{a}_3, \mathbf{a}_4, \mathbf{a}_5$ is a linear combination of the first two vectors, and consequently provides no new information about the spanning set.
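The pruning rule just described (keep the vectors in pivot columns, drop those attached to free variables) can be sketched as follows. The helper names `rref` and `independent_subset` are hypothetical, and the four test vectors are the $\mathbb{R}^3$ example read with an assumed choice of signs, namely $\mathbf{a}_3 = 2\mathbf{a}_1 - \mathbf{a}_2$.

```python
from fractions import Fraction

def rref(rows):
    """Return (R, pivots): the RREF of the matrix and its pivot column indices."""
    m = [[Fraction(v) for v in row] for row in rows]
    pivots = []
    for col in range(len(m[0])):
        k = len(pivots)
        if k == len(m):
            break
        r = next((i for i in range(k, len(m)) if m[i][col] != 0), None)
        if r is None:
            continue
        m[k], m[r] = m[r], m[k]
        p = m[k][col]
        m[k] = [v / p for v in m[k]]
        for i in range(len(m)):
            if i != k:
                f = m[i][col]
                m[i] = [a - f * b for a, b in zip(m[i], m[k])]
        pivots.append(col)
    return m, pivots

def independent_subset(vectors):
    """Indices of the vectors to keep: those sitting in pivot columns of A."""
    rows = [list(r) for r in zip(*vectors)]  # the vectors become columns of A
    _, pivots = rref(rows)
    return pivots

# a_1, ..., a_4 with assumed signs (a_3 = 2 a_1 - a_2).
vectors = [[1, 0, 1], [3, 1, 4], [-1, -1, -2], [3, 3, 2]]
keep = independent_subset(vectors)
print(keep)
```

The returned indices are zero-based, so `[0, 1, 3]` corresponds to keeping $\mathbf{a}_1, \mathbf{a}_2, \mathbf{a}_4$.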
From Theorem 1.6.1 we know that all solutions are of the form $\mathbf{x}_h + \mathbf{x}_p$, where $\mathbf{x}_h \in \operatorname{Nul}(A)$. Since $c\mathbf{x}_h \in \operatorname{Nul}(A)$ for any $c \in \mathbb{R}$ (see Lemma 1.4.5), we know that if $\mathbf{x}_h \neq \mathbf{0}$, then the linear system has an infinite number of solutions. Thus, a solution can be unique if and only if the RREF of $A$ has no free variables, i.e., if every column is a pivot column. We can summarize our discussion with the following result:
Theorem 1.6.4. The following statements about a matrix $A \in \mathbb{R}^{m \times n}$ are equivalent:
(a) there is at most one solution to the linear system $A\mathbf{x} = \mathbf{b}$
(b) the RREF of $A$ has no free variables
(c) the RREF of $A$ has a pivot position in every column
(d) the columns of $A$ are linearly independent
(e) $\operatorname{Nul}(A) = \{\mathbf{0}\}$, i.e., the only solution to the homogeneous system $A\mathbf{x} = \mathbf{0}$ is $\mathbf{x} = \mathbf{0}$.
If the matrix is square, then Theorem 1.6.4 can be refined:
Corollary 1.6.5. The following statements about a square matrix $A \in \mathbb{R}^{n \times n}$ are equivalent:
(a) there is only one solution to the linear system $A\mathbf{x} = \mathbf{b}$ for any $\mathbf{b}$
(b) the RREF of $A$ is $I_n$
(c) the columns of $A$ are linearly independent
(d) the columns of $A$ form a spanning set for $\mathbb{R}^n$.
Example. Suppose that $A \in \mathbb{R}^{3 \times 5}$. Since $A$ has more columns than rows, it is impossible for the RREF of $A$ to have a pivot position in every column. Hence, by Theorem 1.6.4 the columns of $A$ cannot be linearly independent. We cannot say that the columns form a spanning set for $\mathbb{R}^3$ without knowing something more about the RREF of $A$. If we are told that the RREF of $A$ has two pivot positions, then it must be the case that the RREF of $A$ has one zero row; hence, by Corollary 1.5.2 the columns cannot form a spanning set. However, if we are told that the RREF of $A$ has three pivot positions (the maximum number possible), then it must be the case that the RREF of $A$ has no zero rows, which by Corollary 1.5.2 means that the columns do indeed form a spanning set.
Example. Suppose that $A \in \mathbb{R}^{6 \times 4}$ has 4 pivot columns. Since that is the maximal number of pivot columns, by Theorem 1.6.4 the columns are a linearly independent set. However, they cannot form a spanning set for $\mathbb{R}^6$, since the RREF of $A$ will have two zero rows.
1.7 Subspaces and basis
1.7.1 Subspaces
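The null space of $A$, $\operatorname{Nul}(A)$, is an important example of a more general set:
Definition 1.7.1. A set $S \subset \mathbb{R}^n$ is a subspace if
\[ \mathbf{x}_1, \mathbf{x}_2 \in S \implies c_1\mathbf{x}_1 + c_2\mathbf{x}_2 \in S, \quad c_1, c_2 \in \mathbb{R}. \]
In particular, $\mathbf{0} \in S$, and if $\mathbf{x}_1 \in S$, then $c_1\mathbf{x}_1 \in S$ for all $c_1 \in \mathbb{R}$.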
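More generally, it is true that the span of a collection of vectors is a subspace:
Lemma 1.7.2. The set $S = \operatorname{span}\{\mathbf{a}_1, \mathbf{a}_2, \dots, \mathbf{a}_k\}$ is a subspace.
Proof. Suppose that $\mathbf{b}_1, \mathbf{b}_2 \in S$. Using Remark 1.4.2 it will then be the case that there exist vectors $\mathbf{x}_1, \mathbf{x}_2$ such that for $A = (\mathbf{a}_1 \ \mathbf{a}_2 \ \cdots \ \mathbf{a}_k)$,
\[ A\mathbf{x}_1 = \mathbf{b}_1, \quad A\mathbf{x}_2 = \mathbf{b}_2. \]
We must now show that for the vector $\mathbf{b} = c_1\mathbf{b}_1 + c_2\mathbf{b}_2$ there is a vector $\mathbf{x}$ such that $A\mathbf{x} = \mathbf{b}$, as it will then be true that $\mathbf{b} \in S$. However, if we choose $\mathbf{x} = c_1\mathbf{x}_1 + c_2\mathbf{x}_2$, then by the linearity of matrix/vector multiplication we have that
\[ A\mathbf{x} = A(c_1\mathbf{x}_1 + c_2\mathbf{x}_2) = c_1A\mathbf{x}_1 + c_2A\mathbf{x}_2 = c_1\mathbf{b}_1 + c_2\mathbf{b}_2 = \mathbf{b}. \]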
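Example. Suppose that
\[ S = \left\{ \begin{pmatrix} x_1 + 2x_2 \\ 3x_2 \\ 4x_1 + x_2 \end{pmatrix} : x_1, x_2 \in \mathbb{R} \right\}. \]
Since
\[ \begin{pmatrix} x_1 + 2x_2 \\ 3x_2 \\ 4x_1 + x_2 \end{pmatrix} = x_1 \begin{pmatrix} 1 \\ 0 \\ 4 \end{pmatrix} + x_2 \begin{pmatrix} 2 \\ 3 \\ 1 \end{pmatrix} \in \operatorname{span}\left\{ \begin{pmatrix} 1 \\ 0 \\ 4 \end{pmatrix}, \begin{pmatrix} 2 \\ 3 \\ 1 \end{pmatrix} \right\}, \]
by Lemma 1.7.2 the set is a subspace.
Example. Suppose that
\[ S = \left\{ \begin{pmatrix} x_1 + 2x_2 \\ 3x_2 + 1 \\ 4x_1 + x_2 \end{pmatrix} : x_1, x_2 \in \mathbb{R} \right\}. \]
We have that $\mathbf{b} \in S$ if and only if
\[ \mathbf{b} = x_1 \begin{pmatrix} 1 \\ 0 \\ 4 \end{pmatrix} + x_2 \begin{pmatrix} 2 \\ 3 \\ 1 \end{pmatrix} + \begin{pmatrix} 0 \\ 1 \\ 0 \end{pmatrix}. \]
Since
\[ 2\mathbf{b} = 2x_1 \begin{pmatrix} 1 \\ 0 \\ 4 \end{pmatrix} + 2x_2 \begin{pmatrix} 2 \\ 3 \\ 1 \end{pmatrix} + \begin{pmatrix} 0 \\ 2 \\ 0 \end{pmatrix}, \]
it is clear that $2\mathbf{b} \notin S$. Consequently, $S$ cannot be a subspace. Alternatively, it is easy to show that $\mathbf{0} \notin S$.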
Another important example of a subspace which is directly associated with a matrix is the column space:
Definition 1.7.3. The column space of a matrix $A$, i.e., $\operatorname{Col}(A)$, is the set of all linear combinations of the columns of $A$.
Lemma 1.7.4. $\operatorname{Col}(A)$ is a subspace.
Proof. Setting $A = (\mathbf{a}_1 \ \mathbf{a}_2 \ \cdots \ \mathbf{a}_k)$, by Definition 1.7.3 we have that
\[ \operatorname{Col}(A) = \{x_1\mathbf{a}_1 + x_2\mathbf{a}_2 + \cdots + x_k\mathbf{a}_k : x_1, x_2, \dots, x_k \in \mathbb{R}\} = \operatorname{span}\{\mathbf{a}_1, \mathbf{a}_2, \dots, \mathbf{a}_k\}. \]
We can now invoke Lemma 1.7.2 in order to conclude that the column space is indeed a subspace.
With these notions in mind, we can revisit the statement of Theorem 1.5.1 in order to make an equivalent statement. In particular, Theorem 1.5.1(c) states that a linear system is consistent if and only if the vector $\mathbf{b}$ is in the span of the column vectors of the matrix $A$. The definition of the column space then yields:
Lemma 1.7.5. The linear system $A\mathbf{x} = \mathbf{b}$ is consistent if and only if $\mathbf{b} \in \operatorname{Col}(A)$.
1.7.2 Basis of a subspace
The next question to consider is the size of a subspace. In this context size will be used to denote the number of linearly independent vectors (see Definition 1.6.2) in a spanning set. We first label these vectors:
Definition 1.7.6. A set $B = \{\mathbf{a}_1, \mathbf{a}_2, \dots, \mathbf{a}_k\}$ is a basis for a subspace $S$ if
(a) the vectors $\mathbf{a}_1, \mathbf{a}_2, \dots, \mathbf{a}_k$ are linearly independent
(b) $S = \operatorname{span}\{\mathbf{a}_1, \mathbf{a}_2, \dots, \mathbf{a}_k\}$.
Regarding bases, we have the following intuitive result:
Lemma 1.7.7. If $A = \{\mathbf{a}_1, \dots, \mathbf{a}_k\}$ and $B = \{\mathbf{b}_1, \dots, \mathbf{b}_m\}$ are two bases for a subspace $S$, then it must be the case that $k = m$. In other words, all bases for a subspace have the same number of vectors.
Proof. For the interested reader.
Because the number of vectors in a basis of a subspace is fixed, this quantity gives a good way to describe the size of a subspace.
Definition 1.7.8. If $B = \{\mathbf{a}_1, \mathbf{a}_2, \dots, \mathbf{a}_k\}$ is a basis for a subspace $S$, then the dimension of $S$, i.e., $\dim[S]$, is the number of basis vectors:
\[ \dim[S] = k. \]
Example. Let $\mathbf{e}_j$ for $j = 1, \dots, n$ denote the $j$th column vector in the identity matrix $I_n$. Since $I_n$ is in RREF, by Corollary 1.6.5 the set $\{\mathbf{e}_1, \mathbf{e}_2, \dots, \mathbf{e}_n\}$ is linearly independent and forms a spanning set for $\mathbb{R}^n$; in other words, it is a basis for $\mathbb{R}^n$. By Definition 1.7.8 we then have the familiar result that $\dim[\mathbb{R}^n] = n$.
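The question to now consider is the determination of a basis for $\operatorname{Nul}(A)$ and $\operatorname{Col}(A)$. The former we have already determined in many calculations: we simply find a spanning set for $\operatorname{Nul}(A)$ by putting $A$ into RREF. The manner in which we choose the spanning set guarantees that the set will also be a basis. Now consider $\operatorname{Col}(A)$. This is best considered via an explicit example. Suppose that
\[ A = \begin{pmatrix} 1 & 3 & -2 \\ 1 & -1 & 2 \\ 3 & 4 & -1 \end{pmatrix} \longrightarrow \begin{pmatrix} 1 & 0 & 1 \\ 0 & 1 & -1 \\ 0 & 0 & 0 \end{pmatrix}. \]
The matrix on the right, of course, is the RREF of $A$. The reduced linear system associated with $A\mathbf{x} = \mathbf{0}$ is then given by
\[ x_1 + x_3 = 0, \quad x_2 - x_3 = 0. \]
We first have that
\[ \operatorname{Nul}(A) = \operatorname{span}\left\{ \begin{pmatrix} -1 \\ 1 \\ 1 \end{pmatrix} \right\}, \]
so that $\dim[\operatorname{Nul}(A)] = 1$. Furthermore,
\[ A \begin{pmatrix} -1 \\ 1 \\ 1 \end{pmatrix} = \mathbf{0} \implies -\mathbf{a}_1 + \mathbf{a}_2 + \mathbf{a}_3 = \mathbf{0} \implies \mathbf{a}_3 = \mathbf{a}_1 - \mathbf{a}_2. \]
Thus, $\mathbf{a}_3$ is a linear combination of $\mathbf{a}_1$ and $\mathbf{a}_2$, so that in general
\[ x_1\mathbf{a}_1 + x_2\mathbf{a}_2 + x_3\mathbf{a}_3 = (x_1 + x_3)\mathbf{a}_1 + (x_2 - x_3)\mathbf{a}_2; \]
in other words,
\[ \operatorname{span}\{\mathbf{a}_1, \mathbf{a}_2, \mathbf{a}_3\} = \operatorname{span}\{\mathbf{a}_1, \mathbf{a}_2\}. \]
It is clear that $\{\mathbf{a}_1, \mathbf{a}_2\}$ is a linearly independent set. Thus, we have that
(a) $\operatorname{Col}(A) = \operatorname{span}\{\mathbf{a}_1, \mathbf{a}_2\}$
(b) $\{\mathbf{a}_1, \mathbf{a}_2\}$ is a linearly independent set;
in other words, $\{\mathbf{a}_1, \mathbf{a}_2\}$ is a basis for $\operatorname{Col}(A)$, and $\dim[\operatorname{Col}(A)] = 2$.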
From this example we can extract some general truths. The columns of $A$ which correspond to the pivot columns of the RREF of $A$ will form a basis for $\operatorname{Col}(A)$, i.e.,
\[ \dim[\operatorname{Col}(A)] = \text{\# of pivot columns}, \]
and the number of free variables will be the dimension of $\operatorname{Nul}(A)$, i.e.,
\[ \dim[\operatorname{Nul}(A)] = \text{\# of free variables}. \]
Since a column of the RREF of $A$ is either a pivot column or corresponds to a free variable, the sum of the number of pivot columns and the number of free variables is the total number of columns, and we get the following:
Lemma 1.7.9. For the matrix $A \in \mathbb{C}^{m \times n}$, a basis for $\operatorname{Col}(A)$ is given by the pivot columns. Furthermore, we have that
\[ \dim[\operatorname{Col}(A)] + \dim[\operatorname{Nul}(A)] = n. \]
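The counting argument behind Lemma 1.7.9 can be illustrated in Python: count pivots for $\dim[\operatorname{Col}(A)]$, count the remaining columns for $\dim[\operatorname{Nul}(A)]$, and take the pivot columns of the original $A$ as a basis for the column space. The sketch below uses hypothetical helper names (`rref`, `column_space_basis`) and an assumed reading of the signs in the $3 \times 3$ example matrix.

```python
from fractions import Fraction

def rref(rows):
    """Return (R, pivots): the RREF of the matrix and its pivot column indices."""
    m = [[Fraction(v) for v in row] for row in rows]
    pivots = []
    for col in range(len(m[0])):
        k = len(pivots)
        if k == len(m):
            break
        r = next((i for i in range(k, len(m)) if m[i][col] != 0), None)
        if r is None:
            continue
        m[k], m[r] = m[r], m[k]
        p = m[k][col]
        m[k] = [v / p for v in m[k]]
        for i in range(len(m)):
            if i != k:
                f = m[i][col]
                m[i] = [a - f * b for a, b in zip(m[i], m[k])]
        pivots.append(col)
    return m, pivots

def column_space_basis(rows):
    """Columns of the ORIGINAL matrix A that sit in pivot positions."""
    _, pivots = rref(rows)
    return [[row[j] for row in rows] for j in pivots]

A = [[1, 3, -2], [1, -1, 2], [3, 4, -1]]   # signs assumed
n = len(A[0])
dim_col = len(rref(A)[1])                  # number of pivot columns
dim_nul = n - dim_col                      # number of free variables
print(column_space_basis(A), dim_col, dim_nul)
```

The basis vectors are taken from $A$ itself, not from its RREF, since row operations change the column space in general.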
The dimension of the column space gives us one more bit of information. Suppose that $A \in \mathbb{R}^{m \times n}$, so that $\operatorname{Col}(A) \subset \mathbb{R}^m$. Upon invoking a paraphrase of Corollary 1.5.2, we know that $A\mathbf{x} = \mathbf{b}$ is consistent for any $\mathbf{b} \in \mathbb{R}^m$ if and only if the RREF of $A$ has precisely $m$ pivot columns. In other words, the system is consistent for any $\mathbf{b}$ if and only if
\[ \dim[\operatorname{Col}(A)] = \dim[\mathbb{R}^m] = m. \]
If $\dim[\operatorname{Col}(A)] \le m - 1$, then it will necessarily be the case that $A\mathbf{x} = \mathbf{b}$ will not be consistent for all $\mathbf{b}$. For example, if $A \in \mathbb{R}^{3 \times 3}$ and $\dim[\operatorname{Col}(A)] = 2$, then it will be the case that the subspace $\operatorname{Col}(A)$ is a plane, and the linear system $A\mathbf{x} = \mathbf{b}$ will be consistent if and only if the vector $\mathbf{b}$ lies on that plane. We finish by restating Theorem 1.6.4:
Theorem 1.7.10. The following statements about a matrix $A \in \mathbb{R}^{m \times n}$ are equivalent:
(a) there is at most one solution to the linear system $A\mathbf{x} = \mathbf{b}$
(b) $\dim[\operatorname{Nul}(A)] = 0$
(c) $\dim[\operatorname{Col}(A)] = n$.
Alternatively, the following are also equivalent:
(a) the linear system $A\mathbf{x} = \mathbf{b}$ is consistent for any $\mathbf{b}$
(b) $\dim[\operatorname{Col}(A)] = m$.
Now suppose that we wish the linear system to be such that not only is the system consistent for any $\mathbf{b}$, but the solution is unique. The uniqueness condition requires that $\dim[\operatorname{Col}(A)] = n$, or equivalently $\dim[\operatorname{Nul}(A)] = 0$, while the consistency condition requires that $\dim[\operatorname{Col}(A)] = m$. Thus, it must be the case that $m = n$, i.e., the matrix is a square matrix. Furthermore, the fact that the RREF of $A$ has no free variables means that the RREF is $I_n$.
Corollary 1.7.11. Consider the linear system $A\mathbf{x} = \mathbf{b}$. In order for there to be a unique solution for any $\mathbf{b}$, it must be the case that $A \in \mathbb{R}^{n \times n}$ is a square matrix, and the RREF of $A$ is $I_n$.
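Example. Suppose that
\[ A = \begin{pmatrix} 1 & 3 & -2 & 1 \\ 1 & -1 & 2 & 1 \\ 3 & 4 & -1 & 1 \end{pmatrix}. \]
The RREF of $A$ is given by
\[ \begin{pmatrix} 1 & 0 & 1 & 0 \\ 0 & 1 & -1 & 0 \\ 0 & 0 & 0 & 1 \end{pmatrix}. \]
The reduced linear system corresponding to $A\mathbf{x} = \mathbf{0}$ is then given by
\[ x_1 + x_3 = 0, \quad x_2 - x_3 = 0, \quad x_4 = 0, \]
so that
\[ \operatorname{Nul}(A) = \operatorname{span}\left\{ \begin{pmatrix} -1 \\ 1 \\ 1 \\ 0 \end{pmatrix} \right\}. \]
Since the pivot columns are the first, second, and fourth columns of the RREF of $A$, a basis for $\operatorname{Col}(A)$ is given by the set
\[ \left\{ \begin{pmatrix} 1 \\ 1 \\ 3 \end{pmatrix}, \begin{pmatrix} 3 \\ -1 \\ 4 \end{pmatrix}, \begin{pmatrix} 1 \\ 1 \\ 1 \end{pmatrix} \right\}. \]
Since $\dim[\operatorname{Col}(A)] = 3 = \dim[\mathbb{R}^3]$, the linear system $A\mathbf{x} = \mathbf{b}$ is consistent for all $\mathbf{b} \in \mathbb{R}^3$.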
1.8 Matrix algebra
Our goal is to now consider the algebra of matrices; in particular, addition/subtraction and multiplication. Addition/subtraction and scalar multiplication are straightforward. If we denote two matrices as $A = (a_{jk}) \in \mathbb{R}^{m \times n}$ and $B = (b_{jk}) \in \mathbb{R}^{m \times n}$, then it is the case that
\[ A \pm B = (a_{jk} \pm b_{jk}), \quad cA = (ca_{jk}). \]
In other words, we add/subtract two matrices of the same size component-by-component, and if we multiply a matrix by a scalar, then we multiply each component by that scalar. This is exactly what we do in the addition/subtraction of vectors, and the multiplication of a vector by a scalar. For example, if
\[ A = \begin{pmatrix} 1 & 2 \\ -1 & -3 \end{pmatrix}, \quad B = \begin{pmatrix} 2 & 1 \\ 4 & 3 \end{pmatrix}, \]
then
\[ A + B = \begin{pmatrix} 3 & 3 \\ 3 & 0 \end{pmatrix}, \quad 3A = \begin{pmatrix} 3 & 6 \\ -3 & -9 \end{pmatrix}. \]
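Regarding the multiplication of two matrices, we simply generalize the matrix/vector multiplication. For a given $A \in \mathbb{R}^{m \times n}$, recall that for $\mathbf{b} \in \mathbb{R}^n$,
\[ A\mathbf{b} = b_1\mathbf{a}_1 + b_2\mathbf{a}_2 + \cdots + b_n\mathbf{a}_n, \quad A = (\mathbf{a}_1 \ \mathbf{a}_2 \ \cdots \ \mathbf{a}_n). \]
If $B = (\mathbf{b}_1 \ \mathbf{b}_2 \ \cdots \ \mathbf{b}_\ell) \in \mathbb{R}^{n \times \ell}$ (note that each column $\mathbf{b}_j \in \mathbb{R}^n$), we then define the multiplication of $A$ and $B$ by
\[ \underbrace{A}_{\mathbb{R}^{m \times n}} \, \underbrace{B}_{\mathbb{R}^{n \times \ell}} = \underbrace{(A\mathbf{b}_1 \ A\mathbf{b}_2 \ \cdots \ A\mathbf{b}_\ell)}_{\mathbb{R}^{m \times \ell}}. \]
Note that the number of columns of $A$ must match the number of rows of $B$ in order for the operation to make sense, and further note that the number of rows of the product is the number of rows of $A$, and the number of columns of the product is the number of columns of $B$. For example, if
\[ A = \begin{pmatrix} 1 & 2 & 3 \\ -1 & -3 & 2 \end{pmatrix}, \quad B = \begin{pmatrix} 2 & 1 \\ 4 & 3 \\ 6 & 4 \end{pmatrix}, \]
then
\[ AB = \begin{pmatrix} 28 & 19 \\ -2 & -2 \end{pmatrix} \in \mathbb{R}^{2 \times 2}, \quad BA = \begin{pmatrix} 1 & 1 & 8 \\ 1 & -1 & 18 \\ 2 & 0 & 26 \end{pmatrix} \in \mathbb{R}^{3 \times 3}. \]
As the above example illustrates, it may not necessarily be the case that $AB = BA$. In this example changing the order of multiplication leads to resultant matrices of different sizes. However, even if the resultant matrices are the same size, they need not be the same. Suppose that
\[ A = \begin{pmatrix} 1 & 2 \\ -1 & -3 \end{pmatrix}, \quad B = \begin{pmatrix} 2 & 1 \\ 4 & 3 \end{pmatrix}, \]
so that
\[ AB = \begin{pmatrix} 10 & 7 \\ -14 & -10 \end{pmatrix} \in \mathbb{R}^{2 \times 2}, \quad BA = \begin{pmatrix} 1 & 1 \\ 1 & -1 \end{pmatrix} \in \mathbb{R}^{2 \times 2}. \]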
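The definition of the product as $(A\mathbf{b}_1 \ A\mathbf{b}_2 \ \cdots)$ translates directly into code: multiply $A$ into each column of $B$ and reassemble the results as columns. The following Python sketch is illustrative only; `matvec_rows` and `matmul` are hypothetical names, and the negative entries of $A$ are an assumed reading that reproduces the stated products.

```python
def matvec_rows(A, x):
    """Compute A x with A given as a list of rows."""
    return [sum(a * b for a, b in zip(row, x)) for row in A]

def matmul(A, B):
    """Form AB column by column: the j-th column of AB is A times the j-th column of B."""
    if len(A[0]) != len(B):
        raise ValueError("columns of A must match rows of B")
    columns = [matvec_rows(A, [row[j] for row in B]) for j in range(len(B[0]))]
    return [list(r) for r in zip(*columns)]   # turn the columns back into rows

A = [[1, 2], [-1, -3]]   # signs assumed
B = [[2, 1], [4, 3]]
AB, BA = matmul(A, B), matmul(B, A)
print(AB, BA)
```

Both products here are $2 \times 2$, yet they differ, which is the point of the second example.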
Finally, there is a special matrix which plays the role of the scalar one in matrix multiplication: the identity matrix $I_n$. If $A \in \mathbb{R}^{m \times n}$, then it is straightforward to check that
\[ AI_n = A, \quad I_mA = A. \]
In particular, if $\mathbf{x} \in \mathbb{R}^n$, then it is true that $I_n\mathbf{x} = \mathbf{x}$.
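Remark 1.8.1. Although we will rarely, if at all, use the notation in these notes, there is another operation that we can perform on a matrix $A$: taking its transpose, which is denoted by $A^T$. Writing $A = (a_{ij}) \in \mathbb{R}^{m \times n}$ for $i = 1, \dots, m$ and $j = 1, \dots, n$, we have $A^T = (a_{ji}) \in \mathbb{R}^{n \times m}$. In other words, each column of $A$ is a row of $A^T$. For example,
\[ \begin{pmatrix} 1 \\ 3 \\ 2 \end{pmatrix}^T = (1 \ 3 \ 2) \iff (1 \ 3 \ 2)^T = \begin{pmatrix} 1 \\ 3 \\ 2 \end{pmatrix}, \]
and
\[ \begin{pmatrix} 1 & 3 & 2 \\ 2 & 5 & 8 \end{pmatrix}^T = \begin{pmatrix} 1 & 2 \\ 3 & 5 \\ 2 & 8 \end{pmatrix} \iff \begin{pmatrix} 1 & 2 \\ 3 & 5 \\ 2 & 8 \end{pmatrix}^T = \begin{pmatrix} 1 & 3 & 2 \\ 2 & 5 & 8 \end{pmatrix}. \]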
1.9 The inverse of a square matrix
We now wish to consider the inverse of a square matrix $A \in \mathbb{R}^{n \times n}$. If it exists, it will be denoted by $A^{-1}$, and it will have the property that
\[ A^{-1}A = AA^{-1} = I_n. \tag{1.9.1} \]
Assuming that the inverse exists, it allows us to solve the linear system $A\mathbf{x} = \mathbf{b}$ via a matrix/vector multiplication. Namely, we have
\[ A\mathbf{x} = \mathbf{b} \implies A^{-1}A\mathbf{x} = A^{-1}\mathbf{b} \implies I_n\mathbf{x} = A^{-1}\mathbf{b} \implies \mathbf{x} = A^{-1}\mathbf{b}. \]
Lemma 1.9.1. Consider the linear system $A\mathbf{x} = \mathbf{b}$, where $A \in \mathbb{R}^{n \times n}$ is invertible, i.e., $A^{-1}$ exists. The solution to the linear system is given by
\[ \mathbf{x} = A^{-1}\mathbf{b}. \]
How do we compute the inverse? Denote $A^{-1} = (\mathbf{a}_1^{-1} \ \mathbf{a}_2^{-1} \ \cdots \ \mathbf{a}_n^{-1})$, and let $\mathbf{e}_j$ denote the $j$th column of $I_n$. Using (1.9.1) it is the case that $A\mathbf{a}_j^{-1} = \mathbf{e}_j$ for $j = 1, \dots, n$, i.e., the $j$th column of $A^{-1}$ is the solution to $A\mathbf{x} = \mathbf{e}_j$. From Corollary 1.7.11 it must be the case that if $A^{-1}$ exists, then the RREF of $A$ is $I_n$. This yields
\[ (A|\mathbf{e}_j) \longrightarrow (I_n|\mathbf{a}_j^{-1}), \quad j = 1, \dots, n. \tag{1.9.2} \]
Now, (1.9.2) is equivalent to solving $n$ linear systems, each of which has the same coefficient matrix. This can be done more efficiently by putting each right-hand side into a column of the augmented matrix, and then row-reducing this larger augmented matrix. This is equivalent to putting $(A|I_n)$ into RREF, and we get
\[ (A|I_n) \longrightarrow (I_n|A^{-1}). \]
Remark 1.9.2. We often think of a linear system of equations as $A\mathbf{x} = \mathbf{b}$, where $\mathbf{x}, \mathbf{b}$ are vectors. It can be the case in applications, however, that the linear system should be written as $AX = B$, where $X, B$ are matrices. In this context, finding $A^{-1}$ is equivalent to solving $AX = I_n$.
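Lemma 1.9.3. The square matrix $A \in \mathbb{R}^{n \times n}$ is invertible if and only if the RREF of $A$ is $I_n$, i.e., $\dim[\operatorname{Nul}(A)] = 0$. The inverse is computed via
\[ (A|I_n) \longrightarrow (I_n|A^{-1}). \]
Example. Suppose that
\[ A = \begin{pmatrix} 1 & 2 \\ 3 & 5 \end{pmatrix}, \quad \mathbf{b} = \begin{pmatrix} -2 \\ 6 \end{pmatrix}. \]
We have
\[ (A|I_2) \longrightarrow \left(\begin{array}{rr|rr} 1 & 0 & -5 & 2 \\ 0 & 1 & 3 & -1 \end{array}\right) \implies A^{-1} = \begin{pmatrix} -5 & 2 \\ 3 & -1 \end{pmatrix}. \]
Consequently, the solution to the linear system $A\mathbf{x} = \mathbf{b}$ is given by
\[ \mathbf{x} = A^{-1}\mathbf{b} = \begin{pmatrix} 22 \\ -12 \end{pmatrix}. \]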
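The procedure $(A|I_n) \longrightarrow (I_n|A^{-1})$ can be sketched in Python by row reducing the augmented matrix and reading the inverse from its right half. This is an illustration with hypothetical helper names (`rref`, `inverse`); it returns `None` when the left block cannot be reduced to $I_n$, and it uses exact fractions rather than floating point.

```python
from fractions import Fraction

def rref(rows):
    """Return (R, pivots): the RREF of the matrix and its pivot column indices."""
    m = [[Fraction(v) for v in row] for row in rows]
    pivots = []
    for col in range(len(m[0])):
        k = len(pivots)
        if k == len(m):
            break
        r = next((i for i in range(k, len(m)) if m[i][col] != 0), None)
        if r is None:
            continue
        m[k], m[r] = m[r], m[k]
        p = m[k][col]
        m[k] = [v / p for v in m[k]]
        for i in range(len(m)):
            if i != k:
                f = m[i][col]
                m[i] = [a - f * b for a, b in zip(m[i], m[k])]
        pivots.append(col)
    return m, pivots

def inverse(A):
    """Row reduce (A | I_n); return A^{-1}, or None if A is not invertible."""
    n = len(A)
    augmented = [list(row) + [1 if i == j else 0 for j in range(n)]
                 for i, row in enumerate(A)]
    reduced, pivots = rref(augmented)
    if pivots != list(range(n)):      # the left block did not reduce to I_n
        return None
    return [row[n:] for row in reduced]

A_inv = inverse([[1, 2], [3, 5]])
print(A_inv)
print(inverse([[1, 2], [3, 6]]))      # singular: no inverse
```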
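Example. Suppose that
\[ A = \begin{pmatrix} 1 & 2 \\ 3 & 6 \end{pmatrix}. \]
We have
\[ (A|I_2) \longrightarrow \left(\begin{array}{rr|rr} 1 & 2 & 1 & 0 \\ 0 & 0 & -3 & 1 \end{array}\right); \]
consequently, since the left-hand side of the augmented matrix cannot be row-reduced to $I_2$, $A^{-1}$ does not exist. Since the RREF of $A$ is
\[ A \longrightarrow \begin{pmatrix} 1 & 2 \\ 0 & 0 \end{pmatrix}, \]
we have that the first column of $A$ is the only pivot column; hence, by Lemma 1.7.5 and the fact that the pivot columns form a basis for $\operatorname{Col}(A)$ (see Lemma 1.7.9) the linear system $A\mathbf{x} = \mathbf{b}$ is consistent if and only if
\[ \mathbf{b} \in \operatorname{Col}(A) = \operatorname{span}\left\{ \begin{pmatrix} 1 \\ 3 \end{pmatrix} \right\}. \]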
1.10 The determinant of a square matrix
We wish to derive a scalar which tells us whether or not a square matrix is invertible. The hope is that the scalar will be easier and quicker to compute than trying to put the augmented matrix $(A|I_n)$ into RREF. First suppose that $A \in \mathbb{R}^{2 \times 2}$ is given by
\[ A = \begin{pmatrix} a & b \\ c & d \end{pmatrix}. \]
If we try to compute $A^{-1}$, we get
\[ (A|I_2) \xrightarrow{-c\rho_1 + a\rho_2} \left(\begin{array}{cc|cc} a & b & 1 & 0 \\ 0 & ad - bc & -c & a \end{array}\right). \]
If $ad - bc \neq 0$, then we can continue with the row reduction, and eventually compute $A^{-1}$; otherwise, $A^{-1}$ does not exist. This fact implies that this quantity has special significance for $2 \times 2$ matrices.
Definition 1.10.1. Let $A \in \mathbb{R}^{2 \times 2}$ be given by
\[ A = \begin{pmatrix} a & b \\ c & d \end{pmatrix}. \]
The determinant of $A$, i.e., $\det(A)$, is given by
\[ \det(A) = ad - bc. \]
From the discussion leading to Definition 1.10.1 we know that:
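Lemma 1.10.2. Suppose that $A \in \mathbb{R}^{2 \times 2}$ is given by
\[ A = \begin{pmatrix} a & b \\ c & d \end{pmatrix}. \]
The matrix is invertible if and only if $\det(A) \neq 0$. Furthermore, if $\det(A) \neq 0$, then the inverse is given by
\[ A^{-1} = \frac{1}{\det(A)} \begin{pmatrix} d & -b \\ -c & a \end{pmatrix}. \]
Proof. A simple calculation shows that $AA^{-1} = I_2$ if $\det(A) \neq 0$.
Example. Suppose that
\[ A = \begin{pmatrix} 4 & 7 \\ -3 & 2 \end{pmatrix}. \]
Since $\det(A) = 29$, the inverse of $A$ exists, and it is given by
\[ A^{-1} = \frac{1}{29} \begin{pmatrix} 2 & -7 \\ 3 & 4 \end{pmatrix}. \]
The unique solution to the linear system $A\mathbf{x} = \mathbf{b}$ is given by $\mathbf{x} = A^{-1}\mathbf{b}$.
Example. Suppose that
\[ A = \begin{pmatrix} 4 & 1 \\ 8 & 2 \end{pmatrix}. \]
Since $\det(A) = 0$, the inverse of $A$ does not exist. If there is a solution to $A\mathbf{x} = \mathbf{b}$, it must be found by putting the augmented matrix $(A|\mathbf{b})$ into RREF, and then solving the resultant system.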
We now wish to define the determinant for $A \in \mathbb{R}^{n \times n}$ for $n \ge 3$. In theory we could derive it in a manner similar to that for the case $n = 2$: start with a matrix of a given size, and then attempt to row-reduce it to the identity. At some point a scalar arises which must be nonzero in order to ensure that the RREF of the matrix is the identity. This scalar will then be denoted as the determinant. Instead of going through this derivation, we instead settle on the final result. For $A \in \mathbb{R}^{n \times n}$, let $A_{ij} \in \mathbb{R}^{(n-1) \times (n-1)}$ denote the submatrix gotten from $A$ after deleting the $i$th row and $j$th column. For example,
\[ A = \begin{pmatrix} 1 & 4 & 7 \\ 2 & 5 & 8 \\ 3 & 6 & 9 \end{pmatrix} \implies A_{12} = \begin{pmatrix} 2 & 8 \\ 3 & 9 \end{pmatrix}, \quad A_{31} = \begin{pmatrix} 4 & 7 \\ 5 & 8 \end{pmatrix}. \]
With this notion of submatrix in mind, we note that for $2 \times 2$ matrices the determinant can be written as
\[ \det(A) = a_{11}\det(A_{11}) - a_{12}\det(A_{12}), \]
where here the determinant of a scalar is simply the scalar.
Definition 1.10.3. If $A \in \mathbb{R}^{n \times n}$, then the determinant of $A$ is given by
\[ \det(A) = a_{11}\det(A_{11}) - a_{12}\det(A_{12}) + a_{13}\det(A_{13}) + \cdots + (-1)^{n+1}a_{1n}\det(A_{1n}). \]
A =
_
_
1 4 7
2 5 8
3 6 9
_
_
,
1.10. THE DETERMINANT OF A SQUARE MATRIX 25
then we have
A
11
=
_
5 8
6 9
_
, A
12
=
_
2 8
3 9
_
, A
13
=
_
2 5
3 6
_
,
so that
det(A) = 1 det(A
11
) 4 det(A
12
) +7 det(A
13
) = 3 +24 21 = 0.
Thus, we know that A
1
does not exist; indeed, the RREF of A is
A
_
_
1 0 1
0 1 2
0 0 0
_
_
,
so that the columns of A are linearly dependent with a
1
2a
2
+a
3
= 0.
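Definition 1.10.3 is recursive, so it maps naturally onto a short recursive function. The Python sketch below expands along the first row exactly as in the definition; the names `minor` and `det` are hypothetical, and the approach is meant for small matrices only, since the number of terms grows like $n!$.

```python
def minor(A, i, j):
    """Submatrix A_ij: delete row i and column j (zero-based indices)."""
    return [row[:j] + row[j + 1:] for k, row in enumerate(A) if k != i]

def det(A):
    """Determinant by cofactor expansion along the first row."""
    if len(A) == 1:
        return A[0][0]            # the determinant of a scalar is the scalar
    return sum((-1) ** j * A[0][j] * det(minor(A, 0, j))
               for j in range(len(A)))

A = [[1, 4, 7], [2, 5, 8], [3, 6, 9]]
cofactor_terms = [(-1) ** j * A[0][j] * det(minor(A, 0, j)) for j in range(3)]
print(cofactor_terms, det(A))
```

With zero-based indexing the sign $(-1)^{1+j}$ of the definition becomes `(-1) ** j`.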
We now restate Theorem 1.7.10 (and generalize Lemma 1.10.2) in terms of determinants:
Theorem 1.10.4. Consider the linear system $A\mathbf{x} = \mathbf{b}$, where $A \in \mathbb{R}^{n \times n}$. The system has a unique solution for any $\mathbf{b}$ given by $\mathbf{x} = A^{-1}\mathbf{b}$ if and only if $\det(A) \neq 0$. On the other hand, if $\det(A) = 0$, then $\dim[\operatorname{Nul}(A)] \ge 1$, so that if the system is consistent, the solution will not be unique.
The determinant has many properties, which are too many to detail in full here. The first, and perhaps most important, is that the expression of Definition 1.10.1 is not the only way to calculate the determinant. In general, the determinant can be calculated by going across any row, or down any column; in particular, we have
\[ \det(A) = \underbrace{\sum_{j=1}^{n} (-1)^{i+j} a_{ij}\det(A_{ij})}_{\text{across the $i$th row}} = \underbrace{\sum_{i=1}^{n} (-1)^{i+j} a_{ij}\det(A_{ij})}_{\text{down the $j$th column}}. \tag{1.10.1} \]
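For example,
\[ \begin{aligned} \det\begin{pmatrix} 4 & 3 & 6 \\ 2 & 0 & 0 \\ 1 & 7 & 5 \end{pmatrix} &= (6)\det\begin{pmatrix} 2 & 0 \\ 1 & 7 \end{pmatrix} - (0)\det\begin{pmatrix} 4 & 3 \\ 1 & 7 \end{pmatrix} + (5)\det\begin{pmatrix} 4 & 3 \\ 2 & 0 \end{pmatrix} \\ &= -(2)\det\begin{pmatrix} 3 & 6 \\ 7 & 5 \end{pmatrix} + (0)\det\begin{pmatrix} 4 & 6 \\ 1 & 5 \end{pmatrix} - (0)\det\begin{pmatrix} 4 & 3 \\ 1 & 7 \end{pmatrix}. \end{aligned} \]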
The first line is down the third column, and the second line is across the second row. As the above example shows, a judicious choice for the expansion of the determinant can greatly simplify the calculation. In particular, it is generally best to calculate the determinant using the row or column which has the most zeros. Note that if a matrix has a zero row or column, then by using the more general definition (1.10.1) and expanding across that zero row or column we get that $\det(A) = 0$.
A couple of other properties which may sometimes be useful are as follows. If a matrix $B$ is formed from $A$ by multiplying a row or column by a constant $c$, e.g., $A = (a_1\ a_2\ \cdots\ a_n)$ and $B = (ca_1\ a_2\ \cdots\ a_n)$, then $\det(B) = c\det(A)$. In particular, after multiplying each column by the same constant, i.e., multiplying the entire matrix by a constant, it is then true that $\det(cA) = c^n\det(A)$. Another useful property is that
\[ \det(AB) = \det(A)\det(B). \]
Since $I_nA = A$, we get from this property that, for any $A$ with $\det(A) \neq 0$,
\[ \det(A) = \det(I_nA) = \det(I_n)\det(A) \quad\Longrightarrow\quad \det(I_n) = 1 \]
(this could also be shown by a direct computation). Since $AA^{-1} = I_n$, this also allows us to state that
\[ 1 = \det(I_n) = \det(AA^{-1}) = \det(A)\det(A^{-1}) \quad\Longrightarrow\quad \det(A^{-1}) = \frac{1}{\det(A)}. \]
1.11 Eigenvalues and eigenvectors
We now wish to factor a matrix in a manner that will be most conducive to understanding how to solve linear systems of Ordinary Differential Equations. We begin by means of an example.
1.11.1 Example: Markov processes
Consider the following table, which represents the manner in which voter registration changes from one year to the next (each column is the current affiliation, and each row is the affiliation one year later):

         D      R      I
    D   .90    .03    .10
    R   .02    .85    .20
    I   .08    .12    .70

Here $R$ represents the number of Republicans, $D$ the number of Democrats, and $I$ the number of Independents in a given population. We assume that the total number of voters is constant, so that $D + R + I = N$, where $N$ is the total number of voters. The table gives us the following information regarding the manner in which voters change their political affiliation from one year to the next. Let $D_j, R_j, I_j$ represent the number of voters in year $j$. The table tells us that from one year to the next,
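\[ \begin{aligned} D_{n+1} &= .90D_n + .03R_n + .10I_n \\ R_{n+1} &= .02D_n + .85R_n + .20I_n \\ I_{n+1} &= .08D_n + .12R_n + .70I_n. \end{aligned} \]
Upon setting
\[ x_n = \begin{pmatrix} D_n \\ R_n \\ I_n \end{pmatrix}, \quad M = \begin{pmatrix} .90 & .03 & .10 \\ .02 & .85 & .20 \\ .08 & .12 & .70 \end{pmatrix}, \]
we can rewrite this as the discrete (vs. continuous) dynamical system
\[ x_{n+1} = Mx_n, \quad x_0 \text{ given}. \tag{1.11.1} \]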
The dynamical system (1.11.1) is known as a Markov process, and it is distinguished by the fact that the
sum of each column of the transition (stochastic, Markov) matrix M is one.
For a given initial distribution of voters $x_0$, we wish to determine the distribution of voters after many years, i.e., we wish to compute $\lim_{n\to+\infty} x_n$. First, we need to solve for $x_n$. Since
\[ x_1 = Mx_0, \quad x_2 = Mx_1 = M(Mx_0), \]
by defining in general $M^k = MM\cdots M$, i.e., $M^k$ is the matrix $M$ multiplied by itself $k$ times, we have that
\[ x_2 = M^2x_0. \]
Continuing in this fashion gives
\[ x_3 = Mx_2 = M(M^2x_0) = M^3x_0, \quad x_4 = Mx_3 = M(M^3x_0) = M^4x_0; \]
we get by an induction argument that the solution to the dynamical system is
\[ x_n = M^nx_0. \tag{1.11.2} \]
Thus, our question is answered by determining $\lim_{n\to+\infty} M^n$, which is not a trivial calculation.
Now, suppose that there are numbers $\lambda_1, \dots, \lambda_3$ and linearly independent vectors $v_1, \dots, v_3$ such that for $j = 1, \dots, 3$,
\[ Mv_j = \lambda_jv_j \quad\Longleftrightarrow\quad (M - \lambda_jI_3)v_j = 0. \tag{1.11.3} \]
Since the vectors are linearly independent, we can write
\[ x_0 = c_1v_1 + c_2v_2 + c_3v_3; \]
indeed, by setting $P = (v_1\ v_2\ v_3)$ we have
\[ x_0 = Pc \quad\Longleftrightarrow\quad c = P^{-1}x_0. \]
With this in mind we then get
\[ x_1 = Mx_0 = c_1Mv_1 + c_2Mv_2 + c_3Mv_3 = c_1\lambda_1v_1 + c_2\lambda_2v_2 + c_3\lambda_3v_3. \tag{1.11.4} \]
For $n = 2$ we get
\[ x_2 = Mx_1 = c_1\lambda_1Mv_1 + c_2\lambda_2Mv_2 + c_3\lambda_3Mv_3 = c_1\lambda_1^2v_1 + c_2\lambda_2^2v_2 + c_3\lambda_3^2v_3, \]
while for $n = 3$,
\[ x_3 = Mx_2 = c_1\lambda_1^2Mv_1 + c_2\lambda_2^2Mv_2 + c_3\lambda_3^2Mv_3 = c_1\lambda_1^3v_1 + c_2\lambda_2^3v_2 + c_3\lambda_3^3v_3. \]
Using an induction argument eventually gives another form of the solution (1.11.2) to be
\[ x_n = c_1\lambda_1^nv_1 + c_2\lambda_2^nv_2 + c_3\lambda_3^nv_3, \quad c = P^{-1}x_0. \tag{1.11.5} \]
Alternatively, upon noting that (1.11.3) is equivalent to
\[ MP = PD, \quad D = \begin{pmatrix} \lambda_1 & 0 & 0 \\ 0 & \lambda_2 & 0 \\ 0 & 0 & \lambda_3 \end{pmatrix}, \]
where $D$ is a diagonal matrix, we have
\[ M = PDP^{-1}. \]
In other words, the Markov matrix can be written as a product of three matrices, two of which are intimately related to each other. Going back to the solution formula (1.11.2) we note that when $n = 2$,
\[ M^2 = PDP^{-1}PDP^{-1} = PDI_3DP^{-1} = PD^2P^{-1}, \]
and when $n = 3$,
\[ M^3 = M^2M = PD^2P^{-1}PDP^{-1} = PD^3P^{-1}. \]
By an induction argument we then get that
\[ M^n = PD^nP^{-1}, \]
so that the solution (1.11.2) can be rewritten as
\[ x_n = PD^nP^{-1}x_0, \quad M = PDP^{-1}. \tag{1.11.6} \]
Of course, this formulation is equivalent to the formulation of (1.11.5). Since
\[ D^n = \begin{pmatrix} \lambda_1^n & 0 & 0 \\ 0 & \lambda_2^n & 0 \\ 0 & 0 & \lambda_3^n \end{pmatrix}, \]
the behavior of $x_n$ for large $n$ is again reduced to understanding the behavior of $\lambda_j^n$ for large $n$.
For the solution formulas (1.11.5) and (1.11.6) we have the same problem: what are $D$ and $P$, i.e., what are the numbers $\lambda_j$ and vectors $v_j$ in (1.11.3)? Now, for the problem at hand it turns out to be the case that
\[ \lambda_1 = 1, \quad \lambda_2 \approx 0.59, \quad \lambda_3 \approx 0.86. \]
Thus, we have
\[ \lim_{n\to+\infty} \lambda_1^n = 1, \quad \lim_{n\to+\infty} \lambda_2^n = \lim_{n\to+\infty} \lambda_3^n = 0, \]
so that for the solution formula (1.11.5) we have
\[ \lim_{n\to+\infty} x_n = c_1v_1. \]
It further turns out to be the case that the limiting vector can be written as
\[ c_1v_1 = \frac{N}{287}\begin{pmatrix} 105 \\ 110 \\ 72 \end{pmatrix} \approx N\begin{pmatrix} .37 \\ .38 \\ .25 \end{pmatrix}. \]
Thus, in the long run 37% of the voters are Democrats, 38% are Republicans, and 25% are Independents. Note that this is independent of the initial distribution of voters.
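The limiting behavior can be checked numerically by simply iterating $x_{n+1} = Mx_n$. The sketch below uses plain Python lists; the helper names `step` and `iterate`, the choice $N = 287$, and the two starting distributions are illustrative assumptions, not part of the notes.

```python
# Transition matrix from the voter registration table (columns sum to one).
M = [[0.90, 0.03, 0.10],
     [0.02, 0.85, 0.20],
     [0.08, 0.12, 0.70]]


def step(M, x):
    """One year of the Markov process: return the product M x."""
    return [sum(m * xj for m, xj in zip(row, x)) for row in M]


def iterate(M, x0, n):
    """Return x_n = M^n x_0 by repeated matrix/vector multiplication."""
    x = list(x0)
    for _ in range(n):
        x = step(M, x)
    return x


N = 287.0  # total number of voters, chosen so the limit has integer entries
for x0 in ([N, 0.0, 0.0], [0.0, 0.0, N]):  # two hypothetical initial distributions
    print([round(v, 3) for v in iterate(M, x0, 200)])
```

Both runs should settle near $(105, 110, 72)$, which illustrates that the limit does not depend on how the $N$ voters are initially distributed.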
1.11.2 Eigenvalues and eigenvectors
Using the previous subsection as a motivation, we wish to consider the following problem (compare with (1.11.3)). For a given matrix $A \in \mathbb{R}^{n\times n}$ find scalar(s) $\lambda$, called eigenvalues, such that
\[ \dim[\mathrm{Nul}(A - \lambda I_n)] \ge 1. \]
If $\lambda$ is an eigenvalue, then we will call $\mathrm{Nul}(A - \lambda I_n)$ the eigenspace, and we will call any basis vector of the eigenspace an eigenvector. Now, if we are given an eigenvalue, then it is straightforward to compute the eigenvectors. Thus, the problem really is in finding an eigenvalue. In this endeavor we can rely upon the result of Theorem 1.10.4, in which it is stated that a square matrix has a nontrivial null space if and only if its determinant is zero. If we set
\[ p_A(\lambda) = \det(A - \lambda I_n), \]
then the eigenvalues will correspond to the zeros of the characteristic polynomial $p_A(\lambda)$. While we will not do it here, it is not difficult to show that the characteristic polynomial is a polynomial of degree $n$, the size of the square matrix. We summarize this discussion with the following result:
Theorem 1.11.1. Let $A \in \mathbb{R}^{n\times n}$. The zeros of the $n$th-order characteristic polynomial $p_A(\lambda) = \det(A - \lambda I_n)$ are the eigenvalues of the matrix $A$. The eigenvectors associated with an eigenvalue are a basis for $\mathrm{Nul}(A - \lambda I_n)$.
The eigenvalues also tell us something about the invertibility of the matrix. Suppose that $\lambda = 0$ is an eigenvalue, i.e.,
\[ p_A(0) = \det(A) = 0. \]
By Theorem 1.10.4 it is then true that $\dim[\mathrm{Nul}(A)] \ge 1$, so that $A$ is not invertible. On the other hand, if $\lambda = 0$ is not an eigenvalue, then $p_A(0) = \det(A) \neq 0$, so that the matrix is invertible.
Corollary 1.11.2. The matrix $A \in \mathbb{R}^{n\times n}$ is invertible if and only if $\lambda = 0$ is not an eigenvalue.
Remark 1.11.3. Since the eigenvalues can be complex-valued, it will also necessarily be the case for $A \in \mathbb{R}^{n\times n}$ that the eigenvector(s) associated with a complex eigenvalue will be complex-valued. Thus, in order to put the matrix $A - \lambda I_n$ into RREF for $\lambda \in \mathbb{C}$, it will be helpful to recall the algebra of complex-valued numbers. We say that $z \in \mathbb{C}$ if $z = a + ib$, where $a, b \in \mathbb{R}$, and $i^2 = -1$. The number $a$ is the real part of the complex number, and is sometimes denoted by $\mathrm{Re}(z)$, i.e., $\mathrm{Re}(z) = a$. The number $b$ is the imaginary part of the complex number, and is sometimes denoted by $\mathrm{Im}(z)$, i.e., $\mathrm{Im}(z) = b$. The addition/subtraction of two complex numbers is exactly the same as for vectors: add/subtract the real parts and imaginary parts. For example,
\[ (2 - i3) + (3 + i2) = (2 + 3) + i(-3 + 2) = 5 - i. \]
As for multiplication, we simply use the fact that $i^2 = -1$; for example,
\[ (2 - i3)(3 + i2) = (2)(3) + (-i3)(3) + (2)(i2) + (-i3)(i2) = 6 - i9 + i4 - i^26 = 12 - i5. \]
Finally, for division we first multiply the complex number by the number one represented as the complex-conjugate of the denominator divided by the complex-conjugate of the denominator. The complex-conjugate of a complex number $z$ is denoted by $\bar{z}$, and is given by taking the negative of the imaginary part, i.e.,
\[ z = a + ib \quad\Longrightarrow\quad \bar{z} = a - ib. \]
It can be checked that $z\bar{z} = a^2 + b^2$, and using this fact we say that the magnitude of a complex number is given by
\[ |z| = \sqrt{z\bar{z}} = \sqrt{a^2 + b^2}. \]
Thus, we have
\[ \frac{z_1}{z_2} = \frac{z_1}{z_2}\,\frac{\bar{z}_2}{\bar{z}_2} = \frac{1}{|z_2|^2}\,z_1\bar{z}_2, \]
which means that the division has been replaced by the appropriate multiplication. For example,
\[ \frac{2 - i3}{3 + i2} = \left(\frac{2 - i3}{3 + i2}\right)\left(\frac{3 - i2}{3 - i2}\right) = \frac{1}{13}\,[(2 - i3)(3 - i2)] = \frac{1}{13}(-i13) = -i. \]
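The arithmetic in Remark 1.11.3 can be reproduced with Python's built-in complex type, which writes the imaginary unit as `j`. This is only a quick check of the worked examples; the variable names are our own.

```python
z1 = 2 - 3j
z2 = 3 + 2j

total = z1 + z2                        # add real and imaginary parts separately
product = z1 * z2                      # expand and use i^2 = -1
conj = z2.conjugate()                  # complex-conjugate of the denominator
magnitude_squared = (z2 * conj).real   # z * conj(z) = a^2 + b^2
quotient = (z1 * conj) / magnitude_squared  # division replaced by multiplication

print(total, product, conj, magnitude_squared, quotient)
```

The quotient computed through the conjugate agrees with Python's own `z1 / z2`, which is a convenient way to confirm that the two routes are the same operation.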
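Example. Let us find the eigenvalues and associated eigenvectors for
\[ A = \begin{pmatrix} 3 & 2 \\ 2 & 3 \end{pmatrix}. \]
We have
\[ A - \lambda I_2 = \begin{pmatrix} 3 - \lambda & 2 \\ 2 & 3 - \lambda \end{pmatrix}, \]
so that the characteristic polynomial is
\[ p_A(\lambda) = (3 - \lambda)^2 - 4. \]
The zeros of the characteristic polynomial are then $\lambda = 1, 5$. As for the associated eigenvectors, we must compute a basis for $\mathrm{Nul}(A - \lambda I_2)$ for each eigenvalue. For the eigenvalue $\lambda_1 = 1$ we have
\[ A - 1I_2 \longrightarrow \begin{pmatrix} 1 & 1 \\ 0 & 0 \end{pmatrix}, \]
which corresponds to the linear equation $v_1 + v_2 = 0$. The eigenvector is then given by
\[ \lambda_1 = 1; \quad \mathbf{v}_1 = \begin{pmatrix} -1 \\ 1 \end{pmatrix}. \]
For the eigenvalue $\lambda_2 = 5$ we have
\[ A - 5I_2 \longrightarrow \begin{pmatrix} 1 & -1 \\ 0 & 0 \end{pmatrix}, \]
which corresponds to the linear equation $v_1 - v_2 = 0$. The eigenvector is then given by
\[ \lambda_2 = 5; \quad \mathbf{v}_2 = \begin{pmatrix} 1 \\ 1 \end{pmatrix}. \]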
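Example. Let us find the eigenvalues and associated eigenvectors for
\[ A = \begin{pmatrix} 3 & -2 \\ 2 & 3 \end{pmatrix}. \]
We have
\[ A - \lambda I_2 = \begin{pmatrix} 3 - \lambda & -2 \\ 2 & 3 - \lambda \end{pmatrix}, \]
so that the characteristic polynomial is
\[ p_A(\lambda) = (3 - \lambda)^2 + 4. \]
The zeros of the characteristic polynomial are then $\lambda = 3 \pm i2$, where $i^2 = -1$. As for the associated eigenvectors, we have for the eigenvalue $\lambda_1 = 3 + i2$
\[ A - (3 + i2)I_2 \longrightarrow \begin{pmatrix} 1 & -i \\ 0 & 0 \end{pmatrix}, \]
which corresponds to the linear equation $v_1 - iv_2 = 0$. The eigenvector is then given by
\[ \lambda_1 = 3 + i2; \quad \mathbf{v}_1 = \begin{pmatrix} i \\ 1 \end{pmatrix} = \begin{pmatrix} 0 \\ 1 \end{pmatrix} + i\begin{pmatrix} 1 \\ 0 \end{pmatrix}. \]
For the eigenvalue $\lambda_2 = 3 - i2$ we have
\[ A - (3 - i2)I_2 \longrightarrow \begin{pmatrix} 1 & i \\ 0 & 0 \end{pmatrix}, \]
which corresponds to the linear equation $v_1 + iv_2 = 0$. The eigenvector is then given by
\[ \lambda_2 = 3 - i2; \quad \mathbf{v}_2 = \begin{pmatrix} -i \\ 1 \end{pmatrix} = \begin{pmatrix} 0 \\ 1 \end{pmatrix} - i\begin{pmatrix} 1 \\ 0 \end{pmatrix}. \]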
The above is actually an example of a more general phenomenon. Suppose that $A \in \mathbb{R}^{n\times n}$. The coefficients of the characteristic polynomial are all real-valued; hence, it is the case that if $\lambda$ is a zero, then so is the complex-conjugate $\bar{\lambda}$. Similarly, if $\mathbf{v}$ is an eigenvector corresponding to the eigenvalue $\lambda$, then the complex-conjugate $\bar{\mathbf{v}}$ will be an eigenvector corresponding to the eigenvalue $\bar{\lambda}$, with
\[ \mathbf{v} = p + iq \quad\Longrightarrow\quad \bar{\mathbf{v}} = p - iq. \]
In the above example we have
\[ p = \begin{pmatrix} 0 \\ 1 \end{pmatrix}, \quad q = \begin{pmatrix} 1 \\ 0 \end{pmatrix}. \]
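Example. Let us find the eigenvalues and associated eigenvectors for
\[ A = \begin{pmatrix} 5 & 0 & 0 \\ 0 & 3 & 2 \\ 0 & 2 & 3 \end{pmatrix}. \]
We have
\[ A - \lambda I_3 = \begin{pmatrix} 5 - \lambda & 0 & 0 \\ 0 & 3 - \lambda & 2 \\ 0 & 2 & 3 - \lambda \end{pmatrix}, \]
so that the characteristic polynomial is
\[ p_A(\lambda) = (5 - \lambda)[(3 - \lambda)^2 - 4]. \]
The zeros of the characteristic polynomial are then $\lambda = 1, 5$, where $\lambda = 5$ is a double root. As for the associated eigenvectors, we have for the eigenvalue $\lambda_1 = 1$
\[ A - 1I_3 \longrightarrow \begin{pmatrix} 1 & 0 & 0 \\ 0 & 1 & 1 \\ 0 & 0 & 0 \end{pmatrix}, \]
which corresponds to the linear system $v_1 = 0$, $v_2 + v_3 = 0$. The eigenvector is then given by
\[ \lambda_1 = 1; \quad \mathbf{v}_1 = \begin{pmatrix} 0 \\ -1 \\ 1 \end{pmatrix}. \]
For the eigenvalue $\lambda_2 = \lambda_3 = 5$ we have
\[ A - 5I_3 \longrightarrow \begin{pmatrix} 0 & 1 & -1 \\ 0 & 0 & 0 \\ 0 & 0 & 0 \end{pmatrix}, \]
which corresponds to the linear equation $v_2 - v_3 = 0$. There are two free variables, $v_1$ and $v_3$, so that there are two eigenvectors, which are given by
\[ \lambda_2 = \lambda_3 = 5; \quad \mathbf{v}_2 = \begin{pmatrix} 1 \\ 0 \\ 0 \end{pmatrix}, \quad \mathbf{v}_3 = \begin{pmatrix} 0 \\ 1 \\ 1 \end{pmatrix}. \]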
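Example. Let us find the eigenvalues and associated eigenvectors for
\[ A = \begin{pmatrix} 3 & 2 \\ 0 & 3 \end{pmatrix}. \]
We have
\[ A - \lambda I_2 = \begin{pmatrix} 3 - \lambda & 2 \\ 0 & 3 - \lambda \end{pmatrix}, \]
so that the characteristic polynomial is
\[ p_A(\lambda) = (3 - \lambda)^2. \]
There is only one zero of the characteristic polynomial, i.e., $\lambda = 3$, but it is a double root. As for the associated eigenvector(s), we have
\[ A - 3I_2 \longrightarrow \begin{pmatrix} 0 & 1 \\ 0 & 0 \end{pmatrix}, \]
which corresponds to the linear equation $v_2 = 0$. The only linearly independent eigenvector is then given by
\[ \lambda_1 = \lambda_2 = 3; \quad \mathbf{v}_1 = \begin{pmatrix} 1 \\ 0 \end{pmatrix}. \]
Thus, in this example there are not as many eigenvectors as there are eigenvalues.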
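For $2\times 2$ matrices the characteristic polynomial is $p_A(\lambda) = \lambda^2 - \mathrm{tr}(A)\lambda + \det(A)$, so the eigenvalues in the $2\times 2$ examples above can be recovered with the quadratic formula. The sketch below is a hand-rolled illustration (the names `eig2` and `eigenspace_dim` are our own, and the approach does not extend beyond the $2\times 2$ case); it also reports the dimension of each eigenspace, which is what separates the last example from the others.

```python
import cmath


def eig2(A):
    """Eigenvalues of a 2x2 matrix from lambda^2 - tr(A) lambda + det(A) = 0."""
    (a, b), (c, d) = A
    tr, det = a + d, a * d - b * c
    root = cmath.sqrt(tr * tr - 4 * det)  # complex square root handles a negative discriminant
    return (tr - root) / 2, (tr + root) / 2


def eigenspace_dim(A, lam, tol=1e-12):
    """dim Nul(A - lam I) for a 2x2 matrix, assuming lam is an eigenvalue."""
    (a, b), (c, d) = A
    entries = [a - lam, b, c, d - lam]
    # A - lam I is singular, so its rank is 0 (the zero matrix) or 1.
    return 2 if all(abs(e) < tol for e in entries) else 1


examples = {
    "real and distinct": [[3, 2], [2, 3]],
    "complex-conjugate pair": [[3, -2], [2, 3]],
    "double root": [[3, 2], [0, 3]],
}
for name, A in examples.items():
    lams = eig2(A)
    print(name, lams, [eigenspace_dim(A, lam) for lam in lams])
```

For the double root the eigenspace has dimension one, matching the observation that there is only one linearly independent eigenvector.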
Example. We finally consider an example for which the eigenvalues and eigenvectors must be computed numerically. In this problem $A \in \mathbb{R}^{4\times 4}$, which means that $p_A(\lambda)$ is a fourth-order polynomial. Unless the problem is very special, it is generally the case that it is not possible to (easily) find the four roots. The calculation will be done by the Java applet Matrix Calculator. Using this applet, for
A =
_
_
1 2 3 4
1 2 3 4
1 2 3 4
5 6 7 8
_
_
(1.11.7)
we have
1
1.73, v
1

_
_
0.48
0.66
0.11
0.57
_
_
;
2
1.03, v
2

_
_
0.75
1.61
0.15
0.69
_
_
;
3
7.35 +i0.22, v
3

_
_
3.75
19.86
20.89
11.62
_
_
+i
_
_
1.26
3.76
4.07
3.40
_
_
.
We know that the fourth eigenvalue is the complex-conjugate of the third eigenvalue, i.e., $\lambda_4 = \bar{\lambda}_3$, and the associated eigenvector is the complex-conjugate of $\mathbf{v}_3$, i.e., $\mathbf{v}_4 = p - iq$. The actual output of the applet is given in Figure 1.2:
Figure 1.2: (color online) The output from the applet Matrix Calculator for the matrix $A$ in (1.11.7).
We conclude with the following fact, which will be especially useful in applications:
Lemma 1.11.4. If all of the eigenvalues of a matrix $A \in \mathbb{R}^{n\times n}$ are distinct, i.e., all of the roots of the characteristic polynomial are simple, then the set of eigenvectors forms a basis for $\mathbb{R}^n$.
1.12 Application: ordinary differential equations
The goal here is to show that many of the ideas used in solving linear systems of the form $Ax = b$ can be used to understand the solutions to linear ordinary differential equations (ODEs). In general, an $n$th-order differential equation is of the form
\[ G\left(t, x, \frac{dx}{dt}, \frac{d^2x}{dt^2}, \dots, \frac{d^nx}{dt^n}\right) = 0, \]
where
\[ G(t, y_1, y_2, \dots, y_{n+1}) : \mathbb{R}\times\mathbb{R}^n\times\mathbb{R}^n\times\cdots\times\mathbb{R}^n \to \mathbb{R}^n \]
is a smooth function. Here we are making the obvious definition that the derivative of a vector is simply the derivative of each of its components, i.e.,
\[ \frac{dx}{dt} = \begin{pmatrix} dx_1/dt \\ dx_2/dt \\ \vdots \\ dx_n/dt \end{pmatrix}. \]
In order for there to be a unique solution (assuming that one exists at all), we will specify the initial condition
\[ x(t_0) = x_0, \quad \frac{dx}{dt}(t_0) = x_1, \quad \frac{d^2x}{dt^2}(t_0) = x_2, \quad \dots, \quad \frac{d^{n-1}x}{dt^{n-1}}(t_0) = x_{n-1}. \]
The ODE combined with the initial condition is known as the initial value problem (IVP). Setting
\[ x' = \frac{dx}{dt}, \quad x'' = \frac{d^2x}{dt^2}, \quad \dots, \]
it turns out to be the case that if
\[ G(t, y_1, y_2, \dots, y_{n+1}) = -G_0(t, y_1, y_2, \dots, y_n) + y_{n+1}, \]
i.e., if the ODE can be written as
\[ x^{(n)} = G_0(t, x, x', \dots, x^{(n-1)}), \]
then the IVP will have a unique solution for $|t - t_0| < \delta$ and some $\delta > 0$. In other words, there will be a smooth vector-valued function $\phi(t) : \mathbb{R} \to \mathbb{R}^n$ such that for $|t - t_0| < \delta$,
\[ \phi^{(n)} = G_0(t, \phi, \phi', \dots, \phi^{(n-1)}), \]
and also at $t = t_0$,
\[ \phi(t_0) = x_0, \quad \phi'(t_0) = x_1, \quad \phi''(t_0) = x_2, \quad \dots, \quad \phi^{(n-1)}(t_0) = x_{n-1}. \]
A linear system of ODEs, which is the only type of ODE that will be studied in any detail in this class, is of the form
\[ x' = A(t)x + f(t), \quad x(t_0) = x_0. \tag{1.12.1} \]
Here $A(t) \in \mathbb{R}^{n\times n}$ is a continuous matrix, i.e., each entry is a continuous function, the forcing vector $f(t) \in \mathbb{R}^n$ is continuous, and the state vector $x(t) \in \mathbb{R}^n$. In words, (1.12.1) is saying that the rate of change of the state vector at a given time is proportional to the state vector at that time (this corresponds to the homogeneous problem $x' = A(t)x$), and that furthermore the state vector is subject to an external forcing which is independent of the state vector. The initial condition of the state vector at $t = t_0$ is given by $x_0$.
In order to better make the connection between the linear ODE (1.12.1) and linear systems of equations, we will rewrite the ODE as
\[ \mathcal{T}x = f(t), \quad \mathcal{T} := \frac{d}{dt} - A(t). \tag{1.12.2} \]
Just as matrix/vector multiplication is a linear operation (see Lemma 1.3.4), the fact that differentiation is also a linear operation implies that $\mathcal{T}$ is also a linear operator, i.e.,
\[ \mathcal{T}(c_1x_1 + c_2x_2) = c_1\mathcal{T}x_1 + c_2\mathcal{T}x_2. \]
Thus, arguing in a manner similar to that leading to Theorem 1.6.1, in which it is stated that all solutions to the linear system $Ax = b$ are the sum of a homogeneous solution and a particular solution, we can say that solutions to the ODE (1.12.2) are of the form $x = x_h + x_p$, where $x_h$ is a solution to the homogeneous problem, i.e.,
\[ \mathcal{T}x_h = 0 \quad\Longleftrightarrow\quad x_h' = A(t)x_h, \tag{1.12.3} \]
and $x_p$ is a particular solution to the nonhomogeneous problem, i.e.,
\[ \mathcal{T}x_p = f(t) \quad\Longleftrightarrow\quad x_p' = A(t)x_p + f(t). \]
Finally, since $\mathcal{T}$ is a linear operator, by arguing in a manner similar to that leading to Lemma 1.4.5, in which it is shown that $\mathrm{Nul}(A)$ is a subspace, we can say that a linear combination of solutions to the homogeneous problem (1.12.3) is also a solution to the homogeneous problem.
Theorem 1.12.1. Consider the linear nonhomogeneous ODE
\[ x' = A(t)x + f(t), \]
and its homogeneous counterpart
\[ x' = A(t)x. \]
If $x_h$ is a solution to the homogeneous problem, and $x_p$ is a particular solution, i.e., a solution to the nonhomogeneous problem, then
\[ x = x_h + x_p \]
is a solution to the nonhomogeneous problem. Furthermore, if $x_1, x_2$ are solutions to the homogeneous problem, then so is the linear combination $c_1x_1 + c_2x_2$.
How do we solve the IVP (1.12.1)? Suppose that we have $n$ solutions to the homogeneous system, $x_1, x_2, \dots, x_n$. Define the matrix-valued solution $\Phi(t)$ by
\[ \Phi(t) = (x_1(t)\ x_2(t)\ \cdots\ x_n(t)), \]
and note that the moniker is well-deserved because
\[ \Phi' = (x_1'\ x_2'\ \cdots\ x_n') = (A(t)x_1\ A(t)x_2\ \cdots\ A(t)x_n) = A(t)(x_1\ x_2\ \cdots\ x_n) = A(t)\Phi. \]
Since
\[ \Phi(t)c = c_1x_1(t) + c_2x_2(t) + \cdots + c_nx_n(t), \]
we have by Theorem 1.12.1 that $\Phi(t)c$ is a solution to the homogeneous problem for any vector $c \in \mathbb{R}^n$. Thus, we can write homogeneous solutions as
\[ x_h(t) = \Phi(t)c, \]
so by Theorem 1.12.1 a solution, hereafter called the general solution, to the nonhomogeneous problem is given by
\[ \begin{aligned} x(t) &= \Phi(t)c + x_p(t) \\ &= c_1x_1(t) + c_2x_2(t) + \cdots + c_nx_n(t) + x_p(t). \end{aligned} \tag{1.12.4} \]
What about the initial condition? Evaluating the general solution at $t = t_0$ gives
\[ x_0 = x(t_0) = \Phi(t_0)c + x_p(t_0) \quad\Longrightarrow\quad \Phi(t_0)c = x_0 - x_p(t_0). \]
If we set
\[ B = \Phi(t_0), \quad b = x_0 - x_p(t_0), \]
then this is simply the linear system $Bc = b$. Since the vector $b \in \mathbb{R}^n$ is arbitrary, in order to guarantee that this linear system has a solution we know by Corollary 1.7.11 that the RREF of $B$ must be $I_n$, which by Lemma 1.7.9 means that the columns of $B$, which is the set $\{x_1(t_0), x_2(t_0), \dots, x_n(t_0)\}$, must be linearly independent. Thus, if the $n$ solutions to the homogeneous problem are chosen so that they are linearly independent at $t = t_0$, then the solution given by (1.12.4) solves the ODE for any initial condition.
Theorem 1.12.2. The general solution to the nonhomogeneous problem
\[ x' = A(t)x + f(t) \]
is given by
\[ x(t) = \underbrace{c_1x_1(t) + c_2x_2(t) + \cdots + c_nx_n(t)}_{\Phi(t)c} + x_p(t). \]
Here $x_p(t)$ is a particular solution, and $x_1, x_2, \dots, x_n$ are solutions to the homogeneous problem which are such that the set of vectors $\{x_1(t_0), x_2(t_0), \dots, x_n(t_0)\}$ is linearly independent (the matrix $\Phi(t_0)$ is invertible).
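Example. Consider the ODE
\[ x' = \underbrace{\begin{pmatrix} 0 & 1 \\ -2 & -3 \end{pmatrix}}_{A} x + \underbrace{\begin{pmatrix} 0 \\ 10\cos(t) \end{pmatrix}}_{f(t)}, \quad x(0) = \underbrace{\begin{pmatrix} 2 \\ 4 \end{pmatrix}}_{x_0}. \]
Since the matrix $A(t)$ is constant coefficient, we simply write $A(t) \equiv A$ above. It can be checked that the homogeneous solution is given by
\[ x_h(t) = c_1e^{-t}\begin{pmatrix} 1 \\ -1 \end{pmatrix} + c_2e^{-2t}\begin{pmatrix} 1 \\ -2 \end{pmatrix} = \underbrace{\begin{pmatrix} e^{-t} & e^{-2t} \\ -e^{-t} & -2e^{-2t} \end{pmatrix}}_{\Phi(t)} c, \]
and that a particular solution is given by
\[ x_p(t) = \cos(t)\begin{pmatrix} 1 \\ 3 \end{pmatrix} + \sin(t)\begin{pmatrix} 3 \\ -1 \end{pmatrix}. \]
Thus, the general solution is
\[ x(t) = x_h(t) + x_p(t) = \Phi(t)c + \cos(t)\begin{pmatrix} 1 \\ 3 \end{pmatrix} + \sin(t)\begin{pmatrix} 3 \\ -1 \end{pmatrix}. \]
Regarding the initial condition, we have
\[ \begin{pmatrix} 2 \\ 4 \end{pmatrix} = x(0) = \Phi(0)c + \begin{pmatrix} 1 \\ 3 \end{pmatrix}, \]
which is equivalent to
\[ \begin{pmatrix} 1 & 1 \\ -1 & -2 \end{pmatrix} c = \begin{pmatrix} 2 \\ 4 \end{pmatrix} - \begin{pmatrix} 1 \\ 3 \end{pmatrix} = \begin{pmatrix} 1 \\ 1 \end{pmatrix}. \]
Solving this linear system yields
\[ c = \begin{pmatrix} 1 & 1 \\ -1 & -2 \end{pmatrix}^{-1}\begin{pmatrix} 1 \\ 1 \end{pmatrix} = \begin{pmatrix} 3 \\ -2 \end{pmatrix}, \]
which in conclusion means that the solution to the IVP is
\[ x(t) = 3e^{-t}\begin{pmatrix} 1 \\ -1 \end{pmatrix} - 2e^{-2t}\begin{pmatrix} 1 \\ -2 \end{pmatrix} + \cos(t)\begin{pmatrix} 1 \\ 3 \end{pmatrix} + \sin(t)\begin{pmatrix} 3 \\ -1 \end{pmatrix}. \]
It is interesting to note that when $t \gg 0$ for the general solution,
\[ x(t) \approx \cos(t)\begin{pmatrix} 1 \\ 3 \end{pmatrix} + \sin(t)\begin{pmatrix} 3 \\ -1 \end{pmatrix}; \]
in other words, for large $t$ the solution behavior is governed solely by the effect of the forcing, and is independent of the initial condition.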
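The phrase "it can be checked" in this example can be made concrete with a short numerical test. The sketch below assumes the signs $A = \begin{pmatrix} 0 & 1 \\ -2 & -3 \end{pmatrix}$ and decaying exponentials $e^{-t}$, $e^{-2t}$; it evaluates the claimed IVP solution, approximates its derivative by a central difference, and compares the result with $Ax + f(t)$. The function names and the step $h = 10^{-5}$ are our own choices.

```python
import math


def x(t):
    """Claimed solution of the IVP, written out component by component."""
    e1, e2 = math.exp(-t), math.exp(-2 * t)
    c, s = math.cos(t), math.sin(t)
    return (3 * e1 - 2 * e2 + c + 3 * s,
            -3 * e1 + 4 * e2 + 3 * c - s)


def rhs(t, v):
    """Right-hand side A v + f(t) with A = [[0, 1], [-2, -3]], f = (0, 10 cos t)."""
    return (v[1], -2 * v[0] - 3 * v[1] + 10 * math.cos(t))


def residual(t, h=1e-5):
    """Largest component of x'(t) - (A x + f), with x' from a central difference."""
    dx = [(p - m) / (2 * h) for p, m in zip(x(t + h), x(t - h))]
    return max(abs(d - r) for d, r in zip(dx, rhs(t, x(t))))


print(x(0.0))                                        # should reproduce the initial condition
print(max(residual(k / 10) for k in range(0, 50)))   # should be small on 0 <= t < 5
```

A residual at the level of the finite-difference error indicates that the formula satisfies both the ODE and the initial condition.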
As this example illustrates, while the theory tells us how to solve the IVP once the homogeneous and particular solutions have been found, it tells us nothing about how to actually find these solutions. This will be the primary topic of the remainder of the course.
2 Scalar first-order linear differential equations
2.1 Motivating problems
Consider a reservoir that is filled with $V_0$ m$^3$ of a brine solution, and suppose that it initially contains $C_0$ g of salt; in other words, the initial concentration of salt in the reservoir is $C_0/V_0$ g/m$^3$. Suppose that the brine solution is drawn from the reservoir at a rate of $0.05V_0$ m$^3$/day, and further suppose that a new brine solution with a concentration of $0.03$ g/m$^3$ flows into the reservoir at the same rate as the outflow. The situation is depicted in Figure 2.1. Consequently, while the total volume of the brine solution remains unchanged, the concentration changes as a function of time. It will be assumed that the reservoir is well-stirred, which implies that the concentration is independent of the location in the reservoir. If this assumption is removed, then our resulting equation will necessarily be a partial differential equation, which is a topic of study for another course.
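Figure 2.1: (color online) A cartoon which depicts the mixing of a brine solution in a reservoir. Brine with a concentration of $0.03$ g/m$^3$ flows in at $0.05V_0$ m$^3$/day, the reservoir holds $V_0$ m$^3$, and brine flows out at $0.05V_0$ m$^3$/day.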
Our initial goal is to write down a mathematical model which describes this physical situation. Let $x(t)$ represent the number of grams of salt in the reservoir at time $t$, so that the concentration is given by $c(t) = x(t)/V_0$. The instantaneous rate of change of salt in the reservoir, $x'(t)$, satisfies the equation
\[ x'(t) = [\text{inflow rate}] - [\text{outflow rate}]. \]
Note that the units for $x'(t)$ are g/day. Now,
\[ [\text{inflow rate}] = \left(0.03\ \frac{\text{g}}{\text{m}^3}\right)\left(0.05V_0\ \frac{\text{m}^3}{\text{day}}\right) = 0.0015V_0\ \frac{\text{g}}{\text{day}}, \]
and
\[ [\text{outflow rate}] = \left(\frac{x(t)}{V_0}\ \frac{\text{g}}{\text{m}^3}\right)\left(0.05V_0\ \frac{\text{m}^3}{\text{day}}\right) = 0.05x(t)\ \frac{\text{g}}{\text{day}}. \]
Thus, $x(t)$ satisfies the initial value problem (IVP)
\[ x' = 0.0015V_0 - 0.05x, \quad x(0) = C_0. \]
Note that the ODE can be written in the more standard form
\[ x' = -0.05x + 0.0015V_0, \tag{2.1.1} \]
in which it is (perhaps) more easily recognizable as a linear first-order ODE (compare with (1.12.1)).
Suppose that the rate at which the brine solution leaves the reservoir is lessened to $0.02V_0$ m$^3$/day. In this case the volume in the reservoir is given by $V_0(1 + 0.03t)$, so that
\[ [\text{outflow rate}] = \left(\frac{x(t)}{V_0(1 + 0.03t)}\ \frac{\text{g}}{\text{m}^3}\right)\left(0.02V_0\ \frac{\text{m}^3}{\text{day}}\right) = \frac{0.02}{1 + 0.03t}\,x(t)\ \frac{\text{g}}{\text{day}}. \]
The governing ODE in this scenario becomes
\[ x' = -\frac{0.02}{1 + 0.03t}\,x + 0.0015V_0. \tag{2.1.2} \]
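For a last scenario, suppose now that the concentration of the incoming brine solution is given by $0.03(1 + 0.8\sin(2t))$ g/m$^3$. In this case
\[ [\text{inflow rate}] = \left(0.03(1 + 0.8\sin(2t))\ \frac{\text{g}}{\text{m}^3}\right)\left(0.05V_0\ \frac{\text{m}^3}{\text{day}}\right) = 0.0015(1 + 0.8\sin(2t))V_0\ \frac{\text{g}}{\text{day}}, \]
and
\[ [\text{outflow rate}] = \left(\frac{x(t)}{V_0(1 + 0.03t)}\ \frac{\text{g}}{\text{m}^3}\right)\left(0.02V_0\ \frac{\text{m}^3}{\text{day}}\right) = \frac{0.02}{1 + 0.03t}\,x(t)\ \frac{\text{g}}{\text{day}}, \]
so that the governing ODE is
\[ x' = -\frac{0.02}{1 + 0.03t}\,x + 0.0015(1 + 0.8\sin(2t))V_0. \tag{2.1.3} \]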
Remark 2.1.1. Derive the equation for which the situation leading to the model (2.1.3) is changed so that the solution leaves the reservoir at the constant rate $0.05V_0$ m$^3$/day.
Remark 2.1.2. Of course, there are many other interesting physical scenarios for which we can derive scalar linear ODEs of the form
\[ x' = a(t)x + f(t). \tag{2.1.4} \]
Some examples include mathematical finance, population growth, radioactive decay of isotopes, and Newton's law of cooling.
2.2 General theory
2.2.1 Existence and uniqueness theory
Before we can study the behavior of solutions to the scalar first-order IVP
\[ x' = g(t, x), \quad x(t_0) = x_0, \tag{2.2.1} \]
we first need to know if there is something to study. In other words, we need to know if there is a solution, and furthermore if there is only one solution. The rigorous proof of the following existence/uniqueness theorem is outside the scope of this class; however, it is not difficult to make a plausibility argument. The ODE, while it does not tell us what the solution curve is, does tell us what the derivative of that curve is. Thus, if we were to plot at every point in the $tx$-plane a little line whose slope is given by $g(t, x)$, then at every point we would know the slope of a given curve. Such a plot is given in Figure 2.2 for $g(t, x) = -x^2/4 + \cos(3t)$, and a solution for the initial condition $x(0) = 0$ is also given. It is reasonable to expect that as long as the vector field $g(t, x)$ is continuous, then we can find a solution to the ODE. If we wish the solution to be unique, then we need an added condition.
Figure 2.2: (color online) The direction field (slope field) associated with the ODE $x' = -x^2/4 + \cos(3t)$, plotted for $-1 \le t \le 10$ (horizontal axis) and $-0.4 \le x \le 0.6$ (vertical axis). It is calculated using the Java applet DFIELD developed by J. Polking. A solution curve is drawn for three different initial conditions.
Theorem 2.2.1. Consider the IVP
\[ x' = g(t, x), \quad x(t_0) = x_0. \]
Suppose that there is a constant $C > 0$ such that in the box in the $tx$-plane given by $|x - x_0|, |t - t_0| < C$ the functions $g(t, x)$ and $\partial_x g(t, x)$ are continuous. There is then a unique solution, i.e., there is a unique function $\phi(t)$ such that
\[ \phi' = g(t, \phi), \quad \phi(t_0) = x_0. \]
Furthermore, the solution exists as long as the curve is contained within the box, i.e., as long as both $|t - t_0| < C$ and $|\phi(t) - x_0| < C$ are true (see Figure 2.3).
Remark 2.2.2. There is a simple example which shows that the continuity assumptions on $g(t, x)$ are necessary in order to guarantee that the solution is unique. Consider the IVP
\[ x' = \sqrt{x}, \quad x(0) = 0. \]
It is clear that $g(t, x) \equiv \sqrt{x}$ is continuous for all values of $(t, x)$, and $\partial_x g(t, x)$ is continuous except at $x = 0$. It can be checked that for any $a \ge 0$ the function
\[ \phi_a(t) = \begin{cases} 0, & 0 \le t \le a \\ (t - a)^2/4, & a < t, \end{cases} \]
is a solution. Thus, in this case there is an infinite number of solutions to the IVP. On the other hand, if the initial condition were $x(0) = x_0 > 0$, so that the existence/uniqueness theorem applies, then there would be a unique solution: in fact, it is given by $\phi(t) = (t + 2\sqrt{x_0})^2/4$, where it is understood that $(\sqrt{x_0})^2 = x_0$.
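The non-uniqueness in Remark 2.2.2 is easy to confirm numerically. The sketch below evaluates $\phi_a$ and its hand-computed derivative on a grid of $t \ge 0$ and measures how far $\phi_a'$ is from $\sqrt{\phi_a}$; the names `phi`, `dphi`, and `max_residual`, the grid, and the sample values of $a$ are illustrative choices.

```python
import math


def phi(t, a):
    """Candidate solution phi_a of x' = sqrt(x), x(0) = 0, for a >= 0 and t >= 0."""
    return 0.0 if t <= a else (t - a) ** 2 / 4


def dphi(t, a):
    """Derivative of phi_a, computed by hand from the piecewise formula."""
    return 0.0 if t <= a else (t - a) / 2


def max_residual(a, t_max=5.0, steps=500):
    """Largest value of |phi_a'(t) - sqrt(phi_a(t))| on a grid in [0, t_max]."""
    ts = [t_max * k / steps for k in range(steps + 1)]
    return max(abs(dphi(t, a) - math.sqrt(phi(t, a))) for t in ts)


for a in (0.0, 1.0, 2.5):
    print(a, phi(0.0, a), max_residual(a))
```

Each choice of $a$ gives a different function that starts at $x(0) = 0$ and satisfies the ODE up to rounding error, which is exactly the failure of uniqueness described in the remark.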
If the IVP (2.2.1) is linear, i.e., if it is of the form given in (2.1.4), then by using the fact that
\[ g(t, x) = a(t)x + f(t) \quad\Longrightarrow\quad \partial_x g(t, x) = a(t), \]
Figure 2.3: (color online) A cartoon which illustrates the existence/uniqueness theory of Theorem 2.2.1. The thick (blue) curve denotes the solution curve $x = \phi(t)$, and the dashed (green) box, with sides at $t = t_0 \pm C$ and $x = x_0 \pm C$, is the domain on which the functions $g(t, x)$ and $\partial_x g(t, x)$ are both continuous.
we can redefine the box in Theorem 2.2.1 to be the strip $|t - t_0| < C$, where the constant $C$ is chosen so that $a(t)$ is continuous on the interval $|t - t_0| < C$. For linear ODEs we then have the more robust result:
Corollary 2.2.3. Consider the IVP
\[ x' = a(t)x + f(t), \quad x(t_0) = x_0. \]
Suppose that $a(t), f(t)$ are continuous for $|t - t_0| < C$. There is then a unique solution for $|t - t_0| < C$.
Remark 2.2.4. It will henceforth be assumed that whenever we talk about an IVP, there will be a unique solution.
2.2.2 Numerical solutions: Euler's method
Now that we know a unique solution exists to the IVP (2.2.1), we would like to know what it is. It turns out that if $g(t, x)$ is not linear, i.e., if it is not of the form in (2.1.4), then in general we cannot explicitly write down the solution curve. However, we can numerically approximate the solution curve. The numerical solution of IVPs is a topic worthy of its own class: we will only briefly touch on it here.
Instead of trying to find the solution curve for every single $t$, we will instead approximate it for discrete values of $t$, and then, if we so desire, linearly interpolate in order to approximate the solution for other values of $t$. For example, if $t_1 < t_2$ are two time values for which we have the approximations $x_1$ and $x_2$, i.e., $x_1 \approx \phi(t_1)$ and $x_2 \approx \phi(t_2)$ for the solution $\phi(t)$, then for $t_1 < t < t_2$ the approximate solution is given by the line in the $tx$-plane connecting these two points, i.e.,
\[ x(t) \approx \frac{x_2 - x_1}{t_2 - t_1}(t - t_1) + x_1 \]
(see Figure 2.4 for a pictorial description).
In order to find the approximate values to the true solution, we will turn the ODE into a difference equation, and then solve that equation. Recall that via a Taylor polynomial approximation we can write a function as
\[ x(t + h) = x(t) + x'(t)h + \frac{1}{2}x''(t)h^2 + \cdots \quad\Longrightarrow\quad x'(t) \approx \frac{x(t + h) - x(t)}{h}. \]
Figure 2.4: (color online) A cartoon which depicts the solution curve $x = \phi(t)$, as well as a piecewise linear approximation through the points $(t_1, x_1), \dots, (t_4, x_4)$.
Supposing that we wish to approximate the solution on the interval $a \le t \le b$, for a given $N \ge 1$ set $h = (b - a)/N$ (the stepsize) and define a sequence of $t$ values by
\[ t_0 = a, \quad t_1 = a + h, \quad t_2 = a + 2h, \quad \dots, \quad t_j = a + jh, \quad \dots, \quad t_{N-1} = a + (N - 1)h, \quad t_N = b. \tag{2.2.2} \]
The difference equation which approximates the ODE (2.2.1) is given by
\[ \frac{x(t_{j+1}) - x(t_j)}{h} = g(t_j, x(t_j)), \quad j = 0, \dots, N - 1, \]
which upon setting $x_j = x(t_j)$ can be rewritten as
\[ x_{j+1} = x_j + hg(t_j, x_j). \tag{2.2.3} \]
Combining (2.2.2) with (2.2.3) gives us Euler's method:
Definition 2.2.5 (Euler's method). Consider the IVP
\[ x' = g(t, x), \quad x(a) = x_0. \]
Let $b > a$ be given, and let $N \ge 1$ be given. Set $h = (b - a)/N$, and define a sequence of $t$ values by
\[ t_j = a + jh, \quad j = 0, \dots, N \quad (a = t_0 < t_1 < t_2 < \cdots < t_{N-1} < t_N = b). \]
The approximate values $x_j \approx \phi(t_j)$, where $\phi(t)$ is the solution to the IVP, are given by
\[ x_{j+1} = x_j + hg(t_j, x_j), \quad j = 0, \dots, N - 1. \]
While we will not prove it here, it can be shown that Euler's method approximates the true solution up to $\mathcal{O}(h)$, i.e., $|\phi(t_j) - x_j| \le Ch$ for some $C > 0$. Thus, as we let $h \to 0^+$, i.e., as we let $N \to +\infty$, the approximate solution will converge to the true solution. Consider the nonlinear IVP
\[ x' = -\frac{1}{4}x^2 + \cos(3t), \quad x(0) = 0. \tag{2.2.4} \]
Figure 2.5: (color online) The solution curve for the ODE $x' = -x^2/4 + \cos(3t)$ with the initial condition $x(0) = 0$, plotted for $-1 \le t \le 10$ and $-0.4 \le x \le 0.6$. The purple curve denotes the true solution, while the red curve is the Euler approximation with a step size of $h = 0.2$.
In Figure 2.5 the problem is solved via Euler's method with $h = 0.2$, while in Figure 2.6 it is solved with a step size of $h = 0.05$. Note that as the step size decreases the approximate solution converges to the true solution; however, it is not until the step size satisfies $h \le 0.01$ that the Euler approximation gives a good approximation of the true solution.
From a practical perspective, Euler's method is not very accurate, and it is rarely used in practice. Instead, one typically uses a Runge-Kutta type method, which is $\mathcal{O}(h^4)$. In Figure 2.7 we see a numerical solution to (2.2.4) using a Runge-Kutta method with a step size of $h = 0.2$. This approximation is clearly much better than the Euler approximation; indeed, it is as good as an Euler approximation with a step size of $h = 0.01$.
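The comparison between the two methods can be reproduced with a short script. The sketch below implements Euler's method as in Definition 2.2.5 alongside the classical fourth-order Runge-Kutta scheme, which is one common member of the Runge-Kutta family (the notes do not say which variant produced the figures). The right-hand side assumes the sign $x' = -x^2/4 + \cos(3t)$, and a very fine Runge-Kutta run stands in for the true solution; all function names are our own.

```python
import math


def g(t, x):
    """Right-hand side of (2.2.4), assuming the sign x' = -x^2/4 + cos(3t)."""
    return -x * x / 4 + math.cos(3 * t)


def euler(g, a, b, x0, N):
    """Euler's method: x_{j+1} = x_j + h g(t_j, x_j) with h = (b - a)/N."""
    h = (b - a) / N
    t, x, xs = a, x0, [x0]
    for j in range(N):
        x = x + h * g(t, x)
        t = a + (j + 1) * h
        xs.append(x)
    return xs


def rk4(g, a, b, x0, N):
    """Classical fourth-order Runge-Kutta method on the same grid."""
    h = (b - a) / N
    t, x, xs = a, x0, [x0]
    for j in range(N):
        k1 = g(t, x)
        k2 = g(t + h / 2, x + h * k1 / 2)
        k3 = g(t + h / 2, x + h * k2 / 2)
        k4 = g(t + h, x + h * k3)
        x = x + h * (k1 + 2 * k2 + 2 * k3 + k4) / 6
        t = a + (j + 1) * h
        xs.append(x)
    return xs


def max_error(xs, ref):
    """Largest gap between a coarse run and the reference at shared grid points."""
    stride = (len(ref) - 1) // (len(xs) - 1)
    return max(abs(x - ref[j * stride]) for j, x in enumerate(xs))


reference = rk4(g, 0.0, 10.0, 0.0, 20000)  # fine-grid stand-in for the true solution
for N in (50, 200, 1000):                  # step sizes h = 0.2, 0.05, 0.01
    e = max_error(euler(g, 0.0, 10.0, 0.0, N), reference)
    r = max_error(rk4(g, 0.0, 10.0, 0.0, N), reference)
    print(f"h = {10.0 / N:.2f}: Euler error {e:.2e}, RK4 error {r:.2e}")
```

Reading down the printed table shows how the Euler error shrinks roughly in proportion to $h$, while the Runge-Kutta error is already small at the coarsest step.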
2.3 Analytical solutions
As we saw in Theorem 1.12.1, the general solution to the linear ODE
\[ x' = a(t)x + f(t) \tag{2.3.1} \]
is given by $x(t) = x_h(t) + x_p(t)$, where $x_h(t)$ is a solution to the homogeneous problem
\[ x' = a(t)x, \tag{2.3.2} \]
and $x_p(t)$ is a particular solution to the nonhomogeneous problem (2.3.1).
2.3.1 The homogeneous solution
Let us first focus on the solution to the homogeneous problem (2.3.2). The form of the equation suggests that we guess a solution of the form
\[ x(t) = e^{A(t)}, \]
where $A(t)$ is an unknown function. Since
\[ x'(t) = A'(t)e^{A(t)} = A'(t)x(t), \]
we have that
\[ A'(t)x(t) = a(t)x(t) \quad\Longrightarrow\quad A'(t) = a(t). \]
In other words, the desired function $A(t)$ is the anti-derivative of the function $a(t)$.
Figure 2.6: (color online) The solution curve for the ODE $x' = -x^2/4 + \cos(3t)$ with the initial condition $x(0) = 0$, plotted for $-1 \le t \le 10$ and $-0.4 \le x \le 0.6$. The purple curve denotes the true solution, while the red curve is the Euler approximation with a step size of $h = 0.05$. Note that the approximation is much better than that given in Figure 2.5, even though it is still not very good.
Lemma 2.3.1. The general solution to the homogeneous problem (2.3.2) is given by
\[ x_h(t) = \Phi(t)c, \qquad \Phi(t) = e^{\int^t a(s)\,ds}. \]
Here $\Phi(t)$ is the (matrix-valued) solution for the linear ODE (see Theorem 1.12.2).
Example. For the constant coefficient problem
\[ \dot{x} = -x \]
we have
\[ \Phi(t) = e^{-\int^t ds} = e^{-t}, \]
so that the general solution is
\[ x_h(t) = ce^{-t}. \]
Example. For the variable coefficient problem
\[ \dot{x} = \frac{1}{a + bt}\,x \]
we have
\[ \Phi(t) = e^{\int^t (a+bs)^{-1}\,ds} = e^{b^{-1}\ln(a+bt)} = (a+bt)^{1/b}, \]
so that the general solution is
\[ x_h(t) = c(a+bt)^{1/b}. \]

Figure 2.7: (color online) The solution curve for the ODE $\dot{x} = -x^2/4 + \cos(3t)$ with the initial condition $x(0) = 0$. The purple curve denotes the true solution, while the green curve is the Runge-Kutta approximation with a step size of $h = 0.2$. Note that the approximation is much better than that given in Figure 2.6, even though the step size is four times larger.
2.3.2 The particular solution
2.3.2.1 Variation of parameters
As we saw, the solution to the homogeneous problem is quite easy to find (at least theoretically). As for the particular solution, it follows from knowing the (matrix-valued) solution $\Phi(t)$. Indeed, if we set
\[ x_p(t) = \Phi(t)\int^t \Phi(s)^{-1} f(s)\,ds, \]
then upon using the fact that $\dot{\Phi} = a(t)\Phi$ and the Fundamental Theorem of Calculus we have that
\[
\begin{aligned}
\dot{x}_p(t) &= \dot{\Phi}(t)\int^t \Phi(s)^{-1} f(s)\,ds + \Phi(t)\left[\Phi(t)^{-1} f(t)\right] \\
&= a(t)\left[\Phi(t)\int^t \Phi(s)^{-1} f(s)\,ds\right] + f(t) \\
&= a(t)x_p(t) + f(t).
\end{aligned}
\]
Lemma 2.3.2. If $\Phi(t)$ is the (matrix-valued) general solution to the homogeneous problem (2.3.2), then a particular solution to (2.3.1) is given by
\[ x_p(t) = \Phi(t)\int^t \Phi(s)^{-1} f(s)\,ds. \]
As a consequence of Theorem 1.12.1, Lemma 2.3.1, and Lemma 2.3.2 we now know the general solution to the linear ODE (2.3.1):
Theorem 2.3.3. The general solution to the linear ODE
\[ \dot{x} = a(t)x + f(t) \]
is given by
\[ x(t) = \underbrace{\Phi(t)c}_{x_h(t)} + \underbrace{\Phi(t)\int^t \Phi(s)^{-1} f(s)\,ds}_{x_p(t)}, \]
where $\Phi(t)$ is the (matrix-valued) solution to the homogeneous linear ODE, i.e.,
\[ \Phi(t) = e^{\int^t a(s)\,ds}. \]
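The formula of Theorem 2.3.3 can also be evaluated numerically when the integrals are awkward to do by hand. The sketch below is only an illustration under stated assumptions: the coefficient $a(t) = -1$ and the forcing $f(t) = t$ are hypothetical choices made here, the lower limit of integration is fixed at $0$, and composite Simpson quadrature stands in for exact integration. For this choice the formula gives $x_p(t) = t - 1 + e^{-t}$, which the script compares against.

```python
import math

def simpson(g, lo, hi, n=200):
    # Composite Simpson rule on [lo, hi] with an even number n of subintervals.
    if n % 2:
        n += 1
    h = (hi - lo) / n
    total = g(lo) + g(hi)
    for k in range(1, n):
        total += (4 if k % 2 else 2) * g(lo + k * h)
    return total * h / 3

def particular_solution(a, f, t, t_start=0.0):
    # Phi(u) = exp( int_{t_start}^{u} a(s) ds ), the homogeneous solution.
    def Phi(u):
        return math.exp(simpson(a, t_start, u))
    # x_p(t) = Phi(t) * int_{t_start}^{t} Phi(s)^{-1} f(s) ds
    integral = simpson(lambda s: f(s) / Phi(s), t_start, t)
    return Phi(t) * integral

def a(s):
    return -1.0   # hypothetical coefficient: x' = -x + f(t)

def f(s):
    return s      # hypothetical forcing

t = 2.0
numeric = particular_solution(a, f, t)
closed_form = t - 1.0 + math.exp(-t)   # from integrating s*e^s by parts
print(numeric, closed_form)
```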
Example (cont.). Consider the linear ODE
\[ \dot{x} = -x + f(t). \tag{2.3.3} \]
We have already seen that $\Phi(t) = e^{-t}$; hence, from Theorem 2.3.3 the general solution is given by
\[ x(t) = ce^{-t} + e^{-t}\int^t e^{s} f(s)\,ds. \]
In particular, if f (t) = 3 +43t, then upon using the fact that
_
t
e
s
[3 +43s] ds = e
t
(3 4t),
we get that the general solution is
x(t) = ce
t
+3 4t.
If we further assume that $x(0) = 5$, so that
\[ 5 = x(0) = c + 3, \]
then the solution to the IVP
x
= x +3 +43t, x(0) = 5,
is given by
x(t) = 2e
t
+3 4t.
Example (cont.). Consider the variable coefficient problem
\[ \dot{x} = \frac{1}{a + bt}\,x + f(t). \]
We have already seen that $\Phi(t) = (a+bt)^{1/b}$; hence, the general solution is
\[ x(t) = c(a+bt)^{1/b} + (a+bt)^{1/b}\int^t (a+bs)^{-1/b} f(s)\,ds. \]
In particular, if we set $a = -2$, $b = 1$, and $f(t) \equiv F_0$, then upon simplifying the general solution becomes
\[ x(t) = c(-2+t) + F_0(-2+t)\ln(-2+t). \]
If we further assume that $x(3) = 7$, so that
\[ 7 = x(3) = c, \]
then the solution to the IVP
\[ \dot{x} = \frac{1}{-2+t}\,x + F_0, \qquad x(3) = 7, \]
is given by
\[ x(t) = 7(-2+t) + F_0(-2+t)\ln(-2+t). \]
2.3.2.2 Undetermined coefficients
The variation of parameters formulation of Lemma 2.3.2 always gives the particular solution; however, it can be technically arduous to carry out the integrations. We now consider a different method for finding the particular solution: the method of undetermined coefficients. When this method is applicable, it is often easier to use. Instead of computing several antiderivatives, we cleverly guess the form of the particular solution, and from this guess convert the linear ODE to a linear system of algebraic equations. As we saw in the last chapter, linear systems are easy to solve via Gaussian elimination.
When is the method applicable? The system must minimally be of the form of (2.3.3), i.e., the homogeneous problem must be constant coefficient. Furthermore, the forcing term must be a member of a class of functions for which the derivative is also a member of the class. Some careful thought reveals that this class consists of linear combinations of functions of the form $p(t)e^{at}\cos(bt)$ and $p(t)e^{at}\sin(bt)$, where $p(t)$ is a polynomial.
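The method will be illustrated via a sequence of examples. In all of these examples the homogeneous problem will be
\[ \dot{x} = -x \quad\Longrightarrow\quad \Phi(t) = e^{-t}. \]
First consider
\[ \dot{x} = -x + t - 4t^3. \]
Since $f(t) = t - 4t^3$, which is a third-order polynomial, we will guess the particular solution to be
\[ x_p(t) = a_0 + a_1 t + a_2 t^2 + a_3 t^3. \]
Plugging this guess into the ODE yields
\[ a_1 + 2a_2 t + 3a_3 t^2 = -(a_0 + a_1 t + a_2 t^2 + a_3 t^3) + t - 4t^3, \]
which can be rewritten as
\[ a_0 + a_1 + (a_1 + 2a_2 - 1)t + (a_2 + 3a_3)t^2 + (a_3 + 4)t^3 = 0. \]
Since a polynomial is identically zero if and only if all of its coefficients are zero, we then get the linear system of equations
\[ a_0 + a_1 = 0, \quad a_1 + 2a_2 - 1 = 0, \quad a_2 + 3a_3 = 0, \quad a_3 + 4 = 0, \]
which can be rewritten in matrix form as
\[
\begin{pmatrix} 1 & 1 & 0 & 0 \\ 0 & 1 & 2 & 0 \\ 0 & 0 & 1 & 3 \\ 0 & 0 & 0 & 1 \end{pmatrix} \mathbf{a} = \begin{pmatrix} 0 \\ 1 \\ 0 \\ -4 \end{pmatrix}, \qquad \mathbf{a} = \begin{pmatrix} a_0 \\ a_1 \\ a_2 \\ a_3 \end{pmatrix}.
\]
Upon putting the augmented matrix into RREF, i.e.,
\[
\left(\begin{array}{cccc|c} 1 & 1 & 0 & 0 & 0 \\ 0 & 1 & 2 & 0 & 1 \\ 0 & 0 & 1 & 3 & 0 \\ 0 & 0 & 0 & 1 & -4 \end{array}\right) \longrightarrow \left(\begin{array}{cccc|c} 1 & 0 & 0 & 0 & 23 \\ 0 & 1 & 0 & 0 & -23 \\ 0 & 0 & 1 & 0 & 12 \\ 0 & 0 & 0 & 1 & -4 \end{array}\right),
\]
we see that the solution to the linear system is
\[ \mathbf{a} = \begin{pmatrix} 23 \\ -23 \\ 12 \\ -4 \end{pmatrix}. \]
Consequently, we can conclude that the particular solution is
\[ x_p(t) = 23 - 23t + 12t^2 - 4t^3, \]
so that the general solution is
\[ x(t) = ce^{-t} + 23 - 23t + 12t^2 - 4t^3. \]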
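The algebra in this first example can be checked mechanically. The sketch below assumes the ODE is $\dot{x} = -x + t - 4t^3$ (with the signs as they must be for the stated coefficient equations to hold), solves the upper-triangular system by back substitution in exact rational arithmetic, and then evaluates the residual $\dot{x}_p + x_p - f$ at a few points. The helper names are illustrative.

```python
from fractions import Fraction

# Coefficient matrix and right-hand side of the system for (a0, a1, a2, a3).
U = [[1, 1, 0, 0],
     [0, 1, 2, 0],
     [0, 0, 1, 3],
     [0, 0, 0, 1]]
rhs = [0, 1, 0, -4]

def back_substitute(U, b):
    # Solve U a = b for upper-triangular U, working from the last row upward.
    n = len(b)
    a = [Fraction(0)] * n
    for i in reversed(range(n)):
        known = sum(U[i][j] * a[j] for j in range(i + 1, n))
        a[i] = (Fraction(b[i]) - known) / U[i][i]
    return a

coeffs = back_substitute(U, rhs)

def residual(t):
    # x_p'(t) + x_p(t) - (t - 4 t^3); zero if x_p solves x' = -x + t - 4t^3.
    a0, a1, a2, a3 = coeffs
    xp = a0 + a1 * t + a2 * t**2 + a3 * t**3
    dxp = a1 + 2 * a2 * t + 3 * a3 * t**2
    return dxp + xp - (t - 4 * t**3)

print(coeffs, [residual(t) for t in (0, 1, 2)])
```

Because the matrix is already upper triangular, no forward elimination is needed; a general system would require the full Gaussian elimination of Chapter 1.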
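For our next example consider
\[ \dot{x} = -x + 8e^{3t}. \]
Since $f(t) = 8e^{3t}$, which is an exponential function, we will guess the particular solution to be
\[ x_p(t) = a_0 e^{3t}. \]
Plugging this guess into the ODE yields
\[ 3a_0 e^{3t} = -a_0 e^{3t} + 8e^{3t}, \]
which can be rewritten as
\[ (-4a_0 + 8)e^{3t} = 0. \]
Since a multiple of an exponential function is identically zero if and only if its coefficient is zero, we see that
\[ -4a_0 + 8 = 0 \quad\Longrightarrow\quad a_0 = 2. \]
Consequently, the particular solution is
\[ x_p(t) = 2e^{3t}, \]
so that the general solution is
\[ x(t) = ce^{-t} + 2e^{3t}. \]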
For our third example we will suppose that $f(t)$ is the sum of the previous two forcing terms, i.e.,
\[ \dot{x} = -x + t - 4t^3 + 8e^{3t}. \]
If we break up the problem into two simpler problems, i.e.,
\[ \dot{x}_1 = -x_1 + t - 4t^3, \qquad \dot{x}_2 = -x_2 + 8e^{3t}, \]
and find the particular solution for each problem, i.e.,
\[ x_{1,p}(t) = 23 - 23t + 12t^2 - 4t^3, \qquad x_{2,p}(t) = 2e^{3t}, \]
then upon using linearity it is not difficult to check that the particular solution for the full problem is given by
\[ x_p(t) = x_{1,p}(t) + x_{2,p}(t) = 23 - 23t + 12t^2 - 4t^3 + 2e^{3t}. \]
We also see the linearity property coming into play if we instead use the variation of parameters formula in order to find the particular solution. In this case we would use the fact that integration is a linear operation. This example illustrates an important principle regarding the method of undetermined coefficients: we can break up complicated forcing functions into smaller, more analytically tractable pieces, and then form the full solution by summing the smaller pieces.
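For our fourth example let us consider
\[ \dot{x} = -x + 5\cos(4t). \]
Since $f(t) = 5\cos(4t)$, which is a trigonometric function, we will guess the particular solution to be
\[ x_p(t) = a_0\cos(4t) + a_1\sin(4t). \]
Plugging this guess into the ODE yields
\[ -4a_0\sin(4t) + 4a_1\cos(4t) = -(a_0\cos(4t) + a_1\sin(4t)) + 5\cos(4t), \]
which can be rewritten as
\[ (a_0 + 4a_1 - 5)\cos(4t) + (-4a_0 + a_1)\sin(4t) = 0. \]
In order for this equation to hold for all $t$, it must again be true that each of the coefficients is zero, i.e.,
\[ a_0 + 4a_1 - 5 = 0, \qquad -4a_0 + a_1 = 0, \]
which can be rewritten in matrix form as
\[ \begin{pmatrix} 1 & 4 \\ -4 & 1 \end{pmatrix}\mathbf{a} = \begin{pmatrix} 5 \\ 0 \end{pmatrix}, \qquad \mathbf{a} = \begin{pmatrix} a_0 \\ a_1 \end{pmatrix}. \]
Since
\[ \begin{pmatrix} 1 & 4 \\ -4 & 1 \end{pmatrix}^{-1} = \frac{1}{17}\begin{pmatrix} 1 & -4 \\ 4 & 1 \end{pmatrix}, \]
the solution to the linear system is given by
\[ \mathbf{a} = \frac{1}{17}\begin{pmatrix} 1 & -4 \\ 4 & 1 \end{pmatrix}\begin{pmatrix} 5 \\ 0 \end{pmatrix} = \frac{5}{17}\begin{pmatrix} 1 \\ 4 \end{pmatrix}. \]
Consequently, the particular solution is
\[ x_p(t) = \frac{5}{17}\cos(4t) + \frac{20}{17}\sin(4t), \]
so that the general solution is
\[ x(t) = ce^{-t} + \frac{5}{17}\cos(4t) + \frac{20}{17}\sin(4t). \]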
For our final example in which we will complete the computation, consider the ODE
\[ \dot{x} = -x + 3e^{-t}. \]
Using the second example as a guide, it appears that we should guess the particular solution to be $x_p(t) = a_0 e^{-t}$. Plugging this guess into the ODE gives
\[ -a_0 e^{-t} = -a_0 e^{-t} + 3e^{-t} \quad\Longrightarrow\quad 3e^{-t} = 0. \]
Since this clearly is never true, our guess was incorrect. What went wrong? The problem is that the guess is precisely the homogeneous solution, $\Phi(t) = e^{-t}$. In the case that the forcing function $f(t)$ is also a homogeneous solution, we must modify our guess. It turns out that the correct modification is to multiply our original guess by $t$; in other words, we will guess that
\[ x_p(t) = a_0 t e^{-t}. \]
Plugging this guess into the ODE gives
\[ a_0 e^{-t} - a_0 t e^{-t} = -a_0 t e^{-t} + 3e^{-t}, \]
which can be rewritten as
\[ (a_0 - 3)e^{-t} = 0 \quad\Longrightarrow\quad a_0 = 3. \]
Thus, we see that the particular solution is
\[ x_p(t) = 3te^{-t}, \]
so that the general solution is
\[ x(t) = ce^{-t} + 3te^{-t}. \]
Now let us consider a sequence of problems, and simply guess the form of the particular solution without going through the work of actually computing the undetermined coefficients (the interested student can complete the calculation). We start with the problem
\[ \dot{x} = 5x + f(t), \]
which has the homogeneous solution $x_h(t) = ce^{5t}$. We have that for various functions $f(t)$:
(a) $f(t) = t^2 e^{3t} \;\Longrightarrow\; x_p(t) = (a_0 + a_1 t + a_2 t^2)e^{3t}$
(b) $f(t) = t\sin(2t) + t^2 \;\Longrightarrow\; x_p(t) = \underbrace{(a_0 + a_1 t)\cos(2t) + (a_2 + a_3 t)\sin(2t)}_{f_1(t) = t\sin(2t)} + \underbrace{a_4 + a_5 t + a_6 t^2}_{f_2(t) = t^2}$
(c) $f(t) = te^{t}\cos(5t) \;\Longrightarrow\; x_p(t) = (a_0 + a_1 t)e^{t}\cos(5t) + (a_2 + a_3 t)e^{t}\sin(5t)$
(d) $f(t) = t^3 e^{5t} \;\Longrightarrow\; x_p(t) = t(a_0 + a_1 t + a_2 t^2 + a_3 t^3)e^{5t}$.
In each guess we used the principle that polynomials in the forcing imply polynomials for the guess, exponentials in the forcing imply exponentials in the guess, and sine and cosine terms in the forcing imply the same for the guess. We also used the rule of thumb that a forcing term having a product of terms implies that the guess should be a product of guesses. For example, in (a) the forcing term is a quadratic polynomial multiplied by an exponential. The quadratic term implies the guess of a quadratic polynomial, and the exponential term implies the guess of an exponential. Consequently, the guess for the product is the quadratic polynomial multiplied by the exponential. For the guess associated with (b) we used the principle that we can break up the forcing function into smaller pieces, and then guess for each piece. Thus, in that example we see one guess associated with $t\sin(2t)$, and another guess associated with the polynomial $t^2$. For the guess associated with (d) we needed to modify the expected guess in order to take into account the fact that part of our guess was also a homogeneous solution. We initially guessed a cubic multiplied by the exponential, and then afterwards multiplied by the smallest integer power of $t$ which guarantees that no part of the guess is a homogeneous solution.
2.4 Example: one-tank mixing problem
Let us again consider the mixing problem discussed in Section 2.1. Let us now assume that the flow rates are given by $aV_0$ m$^3$/day for some $a > 0$, and let us further assume that the incoming concentration is given by $c(t)$ g/m$^3$ for some positive function $c(t)$. The governing equation for the amount of salt in the tank, $x(t)$, is then given by
\[ \dot{x} = -ax + aV_0 c(t), \qquad x(0) = x_0. \]
We will now solve this equation under the assumption that the incoming concentration is sinusoidal, i.e.,
\[ c(t) = c_0(1 - \cos(\omega t)), \qquad \omega > 0. \]
Our goal here is twofold: (a) to understand the manner in which the concentration of brine varies in the tank as $t \to +\infty$, and (b) to understand the manner in which the concentration depends upon the frequency $\omega$ for $t \gg 0$.
The homogeneous problem is
\[ \dot{x} = -ax \quad\Longrightarrow\quad \Phi(t) = e^{-at}; \]
hence, since $a > 0$ the solution for $t \gg 0$ will depend primarily upon the particular solution. Note that this implies that the initial condition is unimportant when trying to understand the solution behavior for large $t$. Using the method of undetermined coefficients we know that the particular solution is of the form
\[ x_p(t) = a_0 + a_1\cos(\omega t) + a_2\sin(\omega t). \]
Solving as in the previous section yields that the coefficients are given by
\[ a_0 = c_0V_0, \qquad a_1 = -c_0V_0\,\frac{a^2}{a^2+\omega^2}, \qquad a_2 = -c_0V_0\,\frac{a\omega}{a^2+\omega^2}, \]
i.e.,
\[ x_p(t) = c_0V_0\left(1 - \frac{a^2}{a^2+\omega^2}\cos(\omega t) - \frac{a\omega}{a^2+\omega^2}\sin(\omega t)\right). \]
A standard trigonometric identity yields that
\[ \frac{a^2}{a^2+\omega^2}\cos(\omega t) + \frac{a\omega}{a^2+\omega^2}\sin(\omega t) = \frac{a}{\sqrt{a^2+\omega^2}}\cos(\omega t - \phi), \qquad \tan\phi = \frac{\omega}{a}; \]
hence, the particular solution can be rewritten in the form
\[ x_p(t) = c_0V_0\left(1 - \frac{a}{\sqrt{a^2+\omega^2}}\cos(\omega t - \phi)\right), \qquad \tan\phi = \frac{\omega}{a}. \]
In conclusion, the concentration in the tank for $t \gg 0$ is approximately given by
\[ c_{\mathrm{tank}}(t) = \frac{x_p(t)}{V_0} = c_0\left(1 - \frac{a}{\sqrt{a^2+\omega^2}}\cos(\omega t - \phi)\right), \qquad \tan\phi = \frac{\omega}{a}. \]
Recall that the mean, i.e., the average, of a periodic function $f(t)$ with period $T$ is given by
\[ \bar{f} = \frac{1}{T}\int_0^T f(s)\,ds. \]
Since the cosine has zero mean, the mean concentration in the tank is given by $c_0$, which is the same as the mean concentration of the incoming brine. On the other hand, the variation of the concentration in the tank about the mean, which is given by the term
\[ -c_0\,\frac{a}{\sqrt{a^2+\omega^2}}\cos(\omega t - \phi), \]
depends upon the frequency. The amplitude, say
\[ A(\omega) \equiv \frac{a}{\sqrt{a^2+\omega^2}}, \]
decreases monotonically as the frequency increases. Furthermore, the frequency introduces a phase shift, say
\[ \phi(\omega) \equiv \tan^{-1}(\omega/a), \]
which depends not only upon the frequency, but also upon the fractional constant $a$. Need a figure here similar to that of Figure 3.15 and Figure 4.5.
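The dependence of the amplitude and phase shift on the forcing frequency is easy to tabulate. The following sketch assumes the model $\dot{x} = -ax + aV_0c_0(1 - \cos(\omega t))$ with forcing frequency $\omega$; the numerical values of $a$, $c_0$, and $V_0$ are hypothetical and chosen only for illustration. It also checks that the amplitude-phase form of $x_p$ satisfies the ODE by evaluating the residual with the exact derivative.

```python
import math

def tank_response(a, omega):
    # Amplitude A(omega) = a / sqrt(a^2 + omega^2) and phase shift phi with tan(phi) = omega / a.
    amplitude = a / math.sqrt(a * a + omega * omega)
    phase = math.atan2(omega, a)
    return amplitude, phase

def x_particular(t, a, omega, c0, V0):
    # x_p(t) = c0 V0 (1 - A cos(omega t - phi))
    A, phi = tank_response(a, omega)
    return c0 * V0 * (1.0 - A * math.cos(omega * t - phi))

def residual(t, a, omega, c0, V0):
    # x_p'(t) - (-a x_p(t) + a V0 c0 (1 - cos(omega t))); zero if x_p solves the ODE.
    A, phi = tank_response(a, omega)
    dx = c0 * V0 * A * omega * math.sin(omega * t - phi)
    x = x_particular(t, a, omega, c0, V0)
    return dx - (-a * x + a * V0 * c0 * (1.0 - math.cos(omega * t)))

# Hypothetical parameter values.
a, c0, V0 = 0.5, 2.0, 100.0
for omega in (0.1, 0.5, 1.0, 5.0):
    A, phi = tank_response(a, omega)
    print(omega, A, phi, residual(1.3, a, omega, c0, V0))
```

For small $\omega$ the amplitude is close to $1$ and the phase shift close to $0$, so the tank tracks the incoming concentration; for large $\omega$ the amplitude is small and the phase shift approaches $\pi/2$.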
3
Systems of first-order linear differential equations
3.1 Motivating problems
3.1.1 Two-tank mixing problem
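Figure 3.1: (color online) A cartoon which depicts the two-tank mixing problem. Tank A holds 100 gal and receives brine at 3 lb/gal and 4 gal/hr; tank B holds 200 gal and receives brine at 5 lb/gal and 7 gal/hr. The tanks exchange solution at 3 gal/hr (B to A) and 1 gal/hr (A to B), and drain at 6 gal/hr (tank A) and 5 gal/hr (tank B).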
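Consider the two-tank mixing problem illustrated in Figure 3.1. Letting $x_1$ represent the pounds of salt in tank A, and $x_2$ the pounds of salt in tank B, we wish to derive an ODE which models the physical situation. First note that the amount of brine solution in each tank is constant: for tank A 7 gallons enter and leave per hour, and for tank B 8 gallons enter and leave per hour. Arguing as in Section 2.1, we see that
\[
\begin{aligned}
\dot{x}_1 &= \left(3\,\frac{\text{lb}}{\text{gal}}\right)\left(4\,\frac{\text{gal}}{\text{hr}}\right) + \left(\frac{x_2\ \text{lb}}{200\ \text{gal}}\right)\left(3\,\frac{\text{gal}}{\text{hr}}\right) - \left(\frac{x_1\ \text{lb}}{100\ \text{gal}}\right)\left(7\,\frac{\text{gal}}{\text{hr}}\right) \\
\dot{x}_2 &= \left(5\,\frac{\text{lb}}{\text{gal}}\right)\left(7\,\frac{\text{gal}}{\text{hr}}\right) + \left(\frac{x_1\ \text{lb}}{100\ \text{gal}}\right)\left(1\,\frac{\text{gal}}{\text{hr}}\right) - \left(\frac{x_2\ \text{lb}}{200\ \text{gal}}\right)\left(8\,\frac{\text{gal}}{\text{hr}}\right).
\end{aligned}
\]
Simplifying yields
\[
\begin{aligned}
\dot{x}_1 &= -\frac{7}{100}x_1 + \frac{3}{200}x_2 + 12 \\
\dot{x}_2 &= \frac{1}{100}x_1 - \frac{8}{200}x_2 + 35,
\end{aligned}
\]
which in matrix form is
\[ \dot{\mathbf{x}} = \begin{pmatrix} -7/100 & 3/200 \\ 1/100 & -8/200 \end{pmatrix}\mathbf{x} + \begin{pmatrix} 12 \\ 35 \end{pmatrix}. \]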
This is precisely the system whose solution structure is discussed in Theorem 1.12.1.
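As a sanity check on the simplification, the rate-in minus rate-out bookkeeping and the matrix form can be coded side by side and compared. This is only an illustrative sketch; the function names are made up here, and the flow rates are those quoted in the derivation (4 and 7 gal/hr of incoming brine at 3 and 5 lb/gal, 3 gal/hr from tank B to tank A, 1 gal/hr from tank A to tank B, and total outflows of 7 and 8 gal/hr).

```python
def two_tank_rates(x1, x2):
    # Rate in minus rate out for each tank, in lb/hr.
    dx1 = 3 * 4 + (x2 / 200) * 3 - (x1 / 100) * 7
    dx2 = 5 * 7 + (x1 / 100) * 1 - (x2 / 200) * 8
    return dx1, dx2

def two_tank_matrix_form(x1, x2):
    # The same right-hand side written as A x + b.
    A = [[-7 / 100, 3 / 200],
         [1 / 100, -8 / 200]]
    b = [12, 35]
    dx1 = A[0][0] * x1 + A[0][1] * x2 + b[0]
    dx2 = A[1][0] * x1 + A[1][1] * x2 + b[1]
    return dx1, dx2

for state in [(0.0, 0.0), (50.0, 120.0), (300.0, 1000.0)]:
    print(state, two_tank_rates(*state), two_tank_matrix_form(*state))
```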
3.1.2 Mass-spring system
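Figure 3.2: (color online) A cartoon which depicts the mass-spring problem. The labels are the spring length $L_0$, the displacement $y$, the spring constant $k$, the mass $m$, and the external forcing $F(t)$.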
Consider a mass-spring system in which a mass is attached to the end of a spring (see Figure 3.2). This mass-spring system is then suspended from a support. The length of the spring with the attached mass is $L_0$; let $y(t)$ denote the distance from the equilibrium position. If $y(t) > 0$, then the spring is being extended; otherwise, it is being compressed. If $m$ denotes the mass, and $g$ is the gravitational constant, then the force on the system due to gravity is given by $F_g = mg$. By Hooke's law the force on the mass due to the spring is given by $F_s = -k(L_0 + y)$, where $k > 0$ is the spring constant. If $k$ is small, then the spring is weak, whereas if $k$ is large then the spring is stiff. When the spring is at equilibrium, then by Newton's second law it is the case that
\[ F_g + F_s = 0 \quad\Longrightarrow\quad mg = kL_0. \]
If the spring is in motion, then by Newton's second law we have that
\[ m\ddot{y} = F_g + F_s + F_d + F(t), \]
where $F_d$ is a damping force, and $F(t)$ represents a time-dependent external forcing on the system. Substituting our expressions for $F_g$ and $F_s$ into the above yields
\[ m\ddot{y} = mg - k(L_0 + y) + F_d + F(t) = -ky + F_d + F(t). \]
If we now assume that the damping force is proportional to velocity, i.e., $F_d = -c\dot{y}$ for some $c > 0$ ($c$ small means weak damping, and $c$ large means strong damping), then the equation of motion is
\[ \ddot{y} + \frac{c}{m}\dot{y} + \frac{k}{m}y = \frac{1}{m}F(t). \tag{3.1.1} \]
We will now write the scalar second-order linear ODE (3.1.1) as a first-order linear system. Set
\[ x_1 = y, \qquad x_2 = \dot{y}, \]
and redefine the various constants as
\[ 2b = \frac{c}{m}, \qquad \omega^2 = \frac{k}{m}, \qquad f(t) = \frac{1}{m}F(t). \]
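We then have that
\[
\begin{aligned}
\dot{x}_1 &= \dot{y} = x_2 \\
\dot{x}_2 &= \ddot{y} = -\omega^2 y - 2b\dot{y} + f(t) = -\omega^2 x_1 - 2bx_2 + f(t).
\end{aligned}
\]
In other words, (3.1.1) is equivalent to the system
\[ \dot{\mathbf{x}} = \begin{pmatrix} 0 & 1 \\ -\omega^2 & -2b \end{pmatrix}\mathbf{x} + \begin{pmatrix} 0 \\ f(t) \end{pmatrix}. \tag{3.1.2} \]
If we are able to solve (3.1.2) for the vector $\mathbf{x}$, then the solution $y$ to (3.1.1) is given by $x_1$.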
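Remark 3.1.1. This type of equivalence between $n$th-order scalar ODEs and first-order systems holds in general. For example, suppose that we wish to convert the third-order linear ODE
\[ \dddot{y} - 2t\ddot{y} + 3e^{t}\dot{y} + \cos(2t)y = 3t^2 - 1 \]
into a system. Upon setting
\[ x_1 = y, \qquad x_2 = \dot{y}, \qquad x_3 = \ddot{y}, \]
we get that
\[
\begin{aligned}
\dot{x}_1 &= \dot{y} = x_2 \\
\dot{x}_2 &= \ddot{y} = x_3 \\
\dot{x}_3 &= \dddot{y} = -\cos(2t)y - 3e^{t}\dot{y} + 2t\ddot{y} + 3t^2 - 1 = -\cos(2t)x_1 - 3e^{t}x_2 + 2tx_3 + 3t^2 - 1,
\end{aligned}
\]
which is the system
\[ \dot{\mathbf{x}} = \begin{pmatrix} 0 & 1 & 0 \\ 0 & 0 & 1 \\ -\cos(2t) & -3e^{t} & 2t \end{pmatrix}\mathbf{x} + \begin{pmatrix} 0 \\ 0 \\ 3t^2 - 1 \end{pmatrix}. \]
In general, an $n$th-order ODE will be converted to a system by setting
\[ x_1 = y, \quad x_2 = \dot{y}, \quad x_3 = \ddot{y}, \quad \ldots, \quad x_n = y^{(n-1)}, \]
noting that
\[ \dot{x}_1 = x_2, \quad \dot{x}_2 = x_3, \quad \dot{x}_3 = x_4, \quad \ldots, \quad \dot{x}_{n-1} = x_n, \]
and the equation for $\dot{x}_n = y^{(n)}$ is found from the ODE.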
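The conversion described in the remark is mechanical enough to write as a small helper. The sketch below is an illustration only: `to_first_order` is a hypothetical name, and it assumes the scalar ODE has been solved for its highest derivative, $y^{(n)} = g(t, y, \dot{y}, \ldots, y^{(n-1)})$. It is applied to the third-order example, taken here with the signs $\dddot{y} - 2t\ddot{y} + 3e^{t}\dot{y} + \cos(2t)y = 3t^2 - 1$.

```python
import math

def to_first_order(highest_derivative):
    """Turn y^(n) = g(t, [y, y', ..., y^(n-1)]) into the system x' = F(t, x)."""
    def rhs(t, x):
        # x_1' = x_2, ..., x_{n-1}' = x_n, and x_n' comes from the ODE itself.
        return list(x[1:]) + [highest_derivative(t, x)]
    return rhs

def third_order_example(t, x):
    # y''' = -cos(2t) y - 3 e^t y' + 2t y'' + 3t^2 - 1, with x = [y, y', y''].
    y, dy, ddy = x
    return -math.cos(2 * t) * y - 3 * math.exp(t) * dy + 2 * t * ddy + 3 * t**2 - 1

rhs = to_first_order(third_order_example)
print(rhs(0.0, [1.0, 2.0, 3.0]))
```

The returned function has the signature expected by most time-stepping schemes, so the scalar ODE can be handed to any solver written for first-order systems.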
3.1.3 LRC circuit
Show the equation, and compare with the forced-damped mass-spring system.
3.2 Existence and uniqueness theory
In this section we will develop the general theory for the IVP
\[ \dot{\mathbf{x}} = A(t)\mathbf{x} + \mathbf{f}(t), \qquad \mathbf{x}(t_0) = \mathbf{x}_0, \]
which is the system version of the scalar linear ODE discussed in Corollary 2.2.3.
Lemma 3.2.1. Consider the IVP
\[ \dot{\mathbf{x}} = A(t)\mathbf{x} + \mathbf{f}(t), \qquad \mathbf{x}(t_0) = \mathbf{x}_0. \]
If $A(t)$ is continuous for $|t - t_0| < C$, i.e., if each entry of the matrix is a continuous function on that $t$-interval, and if $\mathbf{f}(t)$ is also continuous for $|t - t_0| < C$, i.e., if each entry of the vector is a continuous function on that $t$-interval, then there is a unique solution for $|t - t_0| < C$.
In order to find the solution to the IVP, we go back (again) to the theory of Theorem 1.12.1, and write the general solution as $\mathbf{x} = \mathbf{x}_h + \mathbf{x}_p$, where $\mathbf{x}_h$ is a solution to the homogeneous problem and $\mathbf{x}_p$ is a particular solution to the nonhomogeneous problem. Let us first consider the homogeneous solution. Let $\mathbf{v}_1, \mathbf{v}_2, \ldots, \mathbf{v}_n$ be a basis for $\mathbb{R}^n$, and for $j = 1, \ldots, n$ consider the IVP
\[ \dot{\mathbf{x}} = A(t)\mathbf{x}, \qquad \mathbf{x}(t_0) = \mathbf{v}_j. \]
By Lemma 3.2.1 we know that there is a unique solution given by $\mathbf{x}_j(t)$. Following the discussion after Theorem 1.12.1, define the matrix-valued solution $\Phi(t)$ by
\[ \Phi(t) = (\mathbf{x}_1(t)\ \mathbf{x}_2(t)\ \cdots\ \mathbf{x}_n(t)). \]
We have already seen that $\Phi(t)$ solves the homogeneous ODE, i.e.,
\[ \dot{\Phi} = A(t)\Phi, \]
and that the general solution to the homogeneous problem is given by
\[ \mathbf{x}_h(t) = \Phi(t)\mathbf{c} = c_1\mathbf{x}_1(t) + c_2\mathbf{x}_2(t) + \cdots + c_n\mathbf{x}_n(t). \]
Now, the particular solution for the scalar problem is given in Lemma 2.3.2. It is tempting to use this formulation for the system problem. However, before doing so we need to know if the matrix-valued solution is invertible. We have
\[ \Phi(t_0) = (\mathbf{v}_1\ \mathbf{v}_2\ \cdots\ \mathbf{v}_n), \]
which is invertible, since the vectors $\mathbf{v}_1, \mathbf{v}_2, \ldots, \mathbf{v}_n$ form a basis. While we will not prove it here, it turns out that the determinant of the matrix solution satisfies Abel's formula; namely,
\[ \det(\Phi(t)) = e^{\int_{t_0}^{t} \operatorname{tr} A(s)\,ds}\,\det(\Phi(t_0)). \]
Here $\operatorname{tr} A(s)$ is the trace of the matrix $A(s)$, and it is computed by summing the diagonal entries of the matrix. The important consequence of Abel's formula is that if $\det(\Phi(t_0)) \neq 0$, then $\det(\Phi(t)) \neq 0$ for as long as the solution exists. Since $\Phi(t_0)$ being invertible implies that $\det(\Phi(t_0)) \neq 0$ (e.g., see Theorem 1.10.4), it is then the case that $\det(\Phi(t)) \neq 0$, which in turn implies that $\Phi(t)$ is invertible for $|t - t_0| < C$.
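Abel's formula can be checked by hand on a small case. The sketch below uses a sample constant matrix chosen here purely for illustration, $A = \begin{pmatrix} 2 & 3 \\ 3 & 2 \end{pmatrix}$, together with a matrix-valued solution whose columns are $e^{-t}(1, -1)^T$ and $e^{5t}(1, 1)^T$ (each column can be verified to satisfy $\dot{\mathbf{x}} = A\mathbf{x}$ directly). It takes $t_0 = 0$ and compares $\det(\Phi(t))$ with $e^{\operatorname{tr}(A)\,t}\det(\Phi(0))$.

```python
import math

A = [[2.0, 3.0],
     [3.0, 2.0]]   # sample constant matrix; tr A = 4

def Phi(t):
    # Columns are the solutions e^{-t} (1, -1)^T and e^{5t} (1, 1)^T of x' = A x.
    return [[math.exp(-t), math.exp(5 * t)],
            [-math.exp(-t), math.exp(5 * t)]]

def det2(M):
    # Determinant of a 2x2 matrix.
    return M[0][0] * M[1][1] - M[0][1] * M[1][0]

def abel_rhs(t, t0=0.0):
    # For constant A the integral of tr A(s) from t0 to t is tr(A) * (t - t0).
    trace = A[0][0] + A[1][1]
    return math.exp(trace * (t - t0)) * det2(Phi(t0))

for t in (0.0, 0.3, 1.0):
    print(t, det2(Phi(t)), abel_rhs(t))
```

Both columns of the printed table equal $2e^{4t}$, which never vanishes, in agreement with the invertibility conclusion.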
Lemma 3.2.2 (Variation of parameters). If $\Phi(t)$ is a matrix-valued solution to the homogeneous problem $\dot{\mathbf{x}} = A(t)\mathbf{x}$ which is invertible at $t = t_0$, then a particular solution to the nonhomogeneous problem
\[ \dot{\mathbf{x}} = A(t)\mathbf{x} + \mathbf{f}(t) \]
is given by
\[ \mathbf{x}_p(t) = \Phi(t)\int^t \Phi(s)^{-1}\mathbf{f}(s)\,ds. \]
The analogue to Theorem 2.3.3 is then:
Theorem 3.2.3. If $\Phi(t)$ is a matrix-valued solution to the homogeneous problem $\dot{\mathbf{x}} = A(t)\mathbf{x}$ which is invertible at $t = t_0$, then the general solution to the nonhomogeneous problem
\[ \dot{\mathbf{x}} = A(t)\mathbf{x} + \mathbf{f}(t) \]
is given by
\[ \mathbf{x}(t) = \underbrace{\Phi(t)\mathbf{c}}_{\mathbf{x}_h(t)} + \underbrace{\Phi(t)\int^t \Phi(s)^{-1}\mathbf{f}(s)\,ds}_{\mathbf{x}_p(t)}. \]
3.3 Analytical solutions: constant coefficient matrices
From Theorem 3.2.3 we know that the nonhomogeneous problem can be solved as soon as an invertible matrix-valued solution is found for the homogeneous problem. For the scalar problem we learned in Lemma 2.3.1 that the (matrix-valued) solution to the homogeneous problem can always be found:
\[ \dot{x} = a(t)x \quad\Longrightarrow\quad \Phi(t) = e^{\int^t a(s)\,ds}. \]
We might be tempted to generalize this formula for the homogeneous system, i.e.,
\[ \dot{\mathbf{x}} = A(t)\mathbf{x} \quad\Longrightarrow\quad \Phi(t) = e^{\int^t A(s)\,ds}. \]
There are (at least) two potential problems with this formulation. The first is that it is not clear what is meant by
\[ e^{B(t)}, \qquad B(t) = \int^t A(s)\,ds. \]
This problem can be circumvented by considering the Taylor series expansion for the exponential; namely,
\[ e^{x} = \sum_{j=0}^{\infty} \frac{1}{j!}x^j \quad\Longrightarrow\quad e^{B(t)} = \sum_{j=0}^{\infty} \frac{1}{j!}B(t)^j. \]
Indeed, one can construct all types of functions of matrices via the use of the associated power series; for example,
\[ (I_n - A)^{-1} = \sum_{j=0}^{\infty} A^j, \qquad \cos(A) = \sum_{j=0}^{\infty} \frac{(-1)^j}{(2j)!}A^{2j}, \qquad \sin(A) = \sum_{j=0}^{\infty} \frac{(-1)^j}{(2j+1)!}A^{2j+1}. \]
The second, more serious problem, is that $e^{B(t)}$ is not a solution to the homogeneous problem unless $B(t)A(t) = A(t)B(t)$. As we know, matrix multiplication often does not commute; hence, this (hoped for) formulation of the homogeneous solution is often incorrect. However, in the case that $A(t) \equiv A$, i.e., the matrix is constant coefficient, it will be true that $B(t) = tA$, so that
\[ B(t)A(t) = (tA)A = A(tA) = A(t)B(t). \]
Thus, in this case $e^{At}$ is a matrix-valued solution to the homogeneous problem. We will not wish to calculate the matrix-valued solution in this manner, but at least it tells us that it should be possible to compute it. For the rest of this chapter it will be assumed that $A(t) \equiv A$, i.e., the linear ODE to be solved is of the form
\[ \dot{\mathbf{x}} = A\mathbf{x} + \mathbf{f}(t), \qquad \mathbf{x}(t_0) = \mathbf{x}_0. \tag{3.3.1} \]
We will now proceed to solve the constant coefficient homogeneous problem
\[ \dot{\mathbf{x}} = A\mathbf{x}. \tag{3.3.2} \]
We know that the matrix-valued solution is given by
\[ \Phi(t) = e^{At} = I_n + tA + \frac{t^2}{2!}A^2 + \frac{t^3}{3!}A^3 + \cdots. \]
Since $\Phi(0) = I_n$ is invertible, we know that this matrix-valued solution is invertible for all $t$ (see Lemma 3.2.2). In order to compute $e^{At}$ we need to efficiently compute $A^j$ for each $j = 1, 2, \ldots$.
Recall that in our discussion of Markov processes in Section 1.11.1 we showed for a particular example that
\[ A = PDP^{-1} \quad\Longrightarrow\quad A^j = PD^jP^{-1}. \tag{3.3.3} \]
Here the matrix $P$ had each column as an eigenvector, and $D$ was a diagonal matrix with each entry on the diagonal being an eigenvalue; in other words,
\[ A\mathbf{v}_j = \lambda_j\mathbf{v}_j, \qquad j = 1, \ldots, n, \]
and
\[ P = (\mathbf{v}_1\ \mathbf{v}_2\ \cdots\ \mathbf{v}_n), \qquad D = \operatorname{diag}(\lambda_1, \lambda_2, \ldots, \lambda_n). \]
The term diag refers to a diagonal matrix, so that off the diagonal $D_{ij} = 0$ for $i \neq j$, and on the diagonal $D_{ii} = \lambda_i$ for $i = 1, \ldots, n$. From Lemma 1.11.4 we have that the product formulation for the matrix $A$ given in (3.3.3) is always possible as long as all of the eigenvalues are distinct.
Assume that there are enough eigenvectors so that the matrix $P$ is invertible: this is guaranteed if all of the eigenvalues are distinct. The decomposition of (3.3.3) implies that
\[
\begin{aligned}
e^{At} &= I_n + tA + \frac{t^2}{2!}A^2 + \frac{t^3}{3!}A^3 + \cdots \\
&= PI_nP^{-1} + tPDP^{-1} + \frac{t^2}{2!}PD^2P^{-1} + \frac{t^3}{3!}PD^3P^{-1} + \cdots \\
&= P\left(I_n + tD + \frac{t^2}{2!}D^2 + \frac{t^3}{3!}D^3 + \cdots\right)P^{-1} \\
&= Pe^{Dt}P^{-1}.
\end{aligned}
\]
Thus, all that is left is to compute $e^{Dt}$. As we saw in Section 1.11.1, for diagonal matrices it is the case that
\[ D^j = \operatorname{diag}(\lambda_1^j, \lambda_2^j, \ldots, \lambda_n^j); \]
thus, it is the case that
\[ e^{Dt} = \operatorname{diag}\left(\sum_{j=0}^{\infty}\frac{(\lambda_1 t)^j}{j!}, \sum_{j=0}^{\infty}\frac{(\lambda_2 t)^j}{j!}, \ldots, \sum_{j=0}^{\infty}\frac{(\lambda_n t)^j}{j!}\right). \]
In other words, for diagonal matrices we have the (remarkable) fact that
\[ e^{Dt} = \operatorname{diag}\left(e^{\lambda_1 t}, e^{\lambda_2 t}, \ldots, e^{\lambda_n t}\right). \]
We conclude that
\[ e^{At} = P\operatorname{diag}\left(e^{\lambda_1 t}, e^{\lambda_2 t}, \ldots, e^{\lambda_n t}\right)P^{-1}. \]
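The identity $e^{At} = Pe^{Dt}P^{-1}$ can be compared against a truncated Taylor series on a small example. The sketch below is an illustration under stated assumptions: the matrix $A = \begin{pmatrix} 2 & 3 \\ 3 & 2 \end{pmatrix}$ is a sample chosen here, with eigenvalues $-1$ and $5$ and eigenvectors $(1, -1)^T$ and $(1, 1)^T$; the series is cut off after a fixed number of terms, which is adequate only for moderate $|t|$; and plain nested lists are used instead of a linear algebra library.

```python
import math

def matmul(X, Y):
    # Product of two square matrices stored as nested lists.
    n = len(X)
    return [[sum(X[i][k] * Y[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def expm_taylor(A, t, terms=60):
    # Partial sum I + tA + (tA)^2/2! + ... of the exponential series.
    n = len(A)
    result = [[float(i == j) for j in range(n)] for i in range(n)]
    term = [row[:] for row in result]
    for j in range(1, terms):
        term = matmul(term, A)
        term = [[entry * t / j for entry in row] for row in term]   # now (tA)^j / j!
        result = [[result[r][c] + term[r][c] for c in range(n)] for r in range(n)]
    return result

def expm_eig(t):
    # P diag(e^{-t}, e^{5t}) P^{-1} for the sample matrix below.
    P = [[1.0, 1.0], [-1.0, 1.0]]
    P_inv = [[0.5, -0.5], [0.5, 0.5]]
    D_exp = [[math.exp(-t), 0.0], [0.0, math.exp(5 * t)]]
    return matmul(matmul(P, D_exp), P_inv)

A = [[2.0, 3.0], [3.0, 2.0]]
print(expm_taylor(A, 0.5))
print(expm_eig(0.5))
```

The eigenvalue route needs only two scalar exponentials, whereas the series needs repeated matrix products, which is the practical reason for preferring the decomposition.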
Our discussion is now almost concluded. We have that
\[ P\operatorname{diag}\left(e^{\lambda_1 t}, e^{\lambda_2 t}, \ldots, e^{\lambda_n t}\right) = \left(e^{\lambda_1 t}\mathbf{v}_1\ \ e^{\lambda_2 t}\mathbf{v}_2\ \cdots\ e^{\lambda_n t}\mathbf{v}_n\right); \]
in other words, each column of the matrix $Pe^{Dt}$ is of the form $e^{\lambda t}\mathbf{v}$, where $\lambda$ is an eigenvalue of the matrix $A$ and $\mathbf{v}$ is an associated eigenvector. Recall that in matrix/vector multiplication the resultant is a linear combination of the columns of the matrix. It is then the case that each column of the matrix $(Pe^{Dt})P^{-1}$ is a linear combination of the columns of the matrix $Pe^{Dt}$. In other words, each column of the invertible matrix-valued solution $e^{At}$ is a linear combination of the columns of the matrix $Pe^{Dt}$. It is not difficult to check that each column of $Pe^{Dt}$ is itself a solution to the homogeneous system (3.3.2); indeed, the substitution of the vector-valued function $e^{\lambda t}\mathbf{v}$ into the problem and the fact that
\[ \frac{d}{dt}\left(e^{\lambda t}\mathbf{v}\right) = \lambda e^{\lambda t}\mathbf{v} \]
yields
\[ \lambda e^{\lambda t}\mathbf{v} = A(e^{\lambda t}\mathbf{v}) \quad\Longrightarrow\quad \lambda\mathbf{v} = A\mathbf{v}. \]
In other words, each column of $Pe^{Dt}$ is a solution precisely because $\lambda$ is an eigenvalue and $\mathbf{v}$ is an associated eigenvector. In conclusion, we have an invertible matrix-valued solution simply by using $Pe^{Dt}$, and ignoring the right-multiplication by $P^{-1}$.
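Theorem 3.3.1. Consider the homogeneous system with constant coefficients,
\[ \dot{\mathbf{x}} = A\mathbf{x}. \]
If $\lambda$ is an eigenvalue with associated eigenvector $\mathbf{v}$, then a solution is given by $\mathbf{x}(t) = e^{\lambda t}\mathbf{v}$. If the eigenvectors form a basis for $\mathbb{R}^n$ (which is ensured if the eigenvalues are distinct), then an invertible matrix-valued solution is given by
\[ \Phi(t) = \left(e^{\lambda_1 t}\mathbf{v}_1\ \ e^{\lambda_2 t}\mathbf{v}_2\ \cdots\ e^{\lambda_n t}\mathbf{v}_n\right). \]
The homogeneous solution is then given by
\[ \mathbf{x}_h(t) = \Phi(t)\mathbf{c} = c_1e^{\lambda_1 t}\mathbf{v}_1 + c_2e^{\lambda_2 t}\mathbf{v}_2 + \cdots + c_ne^{\lambda_n t}\mathbf{v}_n. \]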
Figure 3.3: (color online) The phase plane associated with the system (3.3.4), i.e., $\dot{x} = 2x + 3y$, $\dot{y} = 3x + 2y$. It is calculated using the Java applet PPLANE developed by J. Polking. Solution curves are drawn for several different initial conditions.
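For our first example, consider
\[ \dot{\mathbf{x}} = A\mathbf{x}, \qquad A = \begin{pmatrix} 2 & 3 \\ 3 & 2 \end{pmatrix}. \tag{3.3.4} \]
Since the characteristic equation is
\[ p_A(\lambda) = (2 - \lambda)^2 - 9, \]
the eigenvalues are $\lambda_1 = -1$, $\lambda_2 = 5$. Regarding the associated eigenvectors, we have
\[ A - (-1)I_2 \sim \begin{pmatrix} 1 & 1 \\ 0 & 0 \end{pmatrix} \quad\Longrightarrow\quad \mathbf{v}_1 = \begin{pmatrix} 1 \\ -1 \end{pmatrix}, \]
and
\[ A - 5I_2 \sim \begin{pmatrix} 1 & -1 \\ 0 & 0 \end{pmatrix} \quad\Longrightarrow\quad \mathbf{v}_2 = \begin{pmatrix} 1 \\ 1 \end{pmatrix}. \]
Thus, from Theorem 3.3.1 the general solution is given by
\[ \mathbf{x}(t) = c_1e^{-t}\begin{pmatrix} 1 \\ -1 \end{pmatrix} + c_2e^{5t}\begin{pmatrix} 1 \\ 1 \end{pmatrix}, \]
and the matrix-valued solution is
\[ \Phi(t) = \begin{pmatrix} e^{-t} & e^{5t} \\ -e^{-t} & e^{5t} \end{pmatrix}. \]
We will now plot the solutions on the $x_1x_2$-plane, i.e., the phase plane, in Figure 3.3. The plotting will be done by using the Java applet PPLANE. We will label the trivial solution $\mathbf{x} = \mathbf{0}$ in this case as an (unstable) saddle point. In general, $\mathbf{x} = \mathbf{0}$ is a saddle point when the eigenvalues are real and have opposite sign. For this example now suppose that the initial condition is given by
\[ \mathbf{x}(0) = \begin{pmatrix} 5 \\ -7 \end{pmatrix}. \]
Since $\mathbf{x}(t) = \Phi(t)\mathbf{c}$, this means that
\[ \begin{pmatrix} 5 \\ -7 \end{pmatrix} = \mathbf{x}(0) = \Phi(0)\mathbf{c} \quad\Longrightarrow\quad \begin{pmatrix} 1 & 1 \\ -1 & 1 \end{pmatrix}\mathbf{c} = \begin{pmatrix} 5 \\ -7 \end{pmatrix}. \]
The solution to this linear system is given by
\[ \mathbf{c} = \begin{pmatrix} 1 & 1 \\ -1 & 1 \end{pmatrix}^{-1}\begin{pmatrix} 5 \\ -7 \end{pmatrix} = \begin{pmatrix} 6 \\ -1 \end{pmatrix}, \]
so that the solution to the initial value problem is given by
\[ \mathbf{x}(t) = 6e^{-t}\begin{pmatrix} 1 \\ -1 \end{pmatrix} - e^{5t}\begin{pmatrix} 1 \\ 1 \end{pmatrix}. \]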
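For our second example, suppose that the eigenvalues and associated eigenvectors are given by
\[ \lambda_1 = -2,\ \mathbf{v}_1 = \begin{pmatrix} 1 \\ -1 \end{pmatrix}; \qquad \lambda_2 = -4,\ \mathbf{v}_2 = \begin{pmatrix} 1 \\ 1 \end{pmatrix}. \tag{3.3.5} \]
The general solution is given by
\[ \mathbf{x}(t) = c_1e^{-2t}\begin{pmatrix} 1 \\ -1 \end{pmatrix} + c_2e^{-4t}\begin{pmatrix} 1 \\ 1 \end{pmatrix}, \]
and the matrix-valued solution is
\[ \Phi(t) = \begin{pmatrix} e^{-2t} & e^{-4t} \\ -e^{-2t} & e^{-4t} \end{pmatrix}. \]
Various solutions on the phase plane are plotted in Figure 3.4. In this case the trivial solution $\mathbf{x} = \mathbf{0}$ is a stable node (attractor). In general, $\mathbf{x} = \mathbf{0}$ is a stable node when both eigenvalues are real and negative.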
For our third example, suppose that the eigenvalues and associated eigenvectors are given by
\[ \lambda_1 = 2,\ \mathbf{v}_1 = \begin{pmatrix} 1 \\ -1 \end{pmatrix}; \qquad \lambda_2 = 4,\ \mathbf{v}_2 = \begin{pmatrix} 1 \\ 1 \end{pmatrix}. \tag{3.3.6} \]

Figure 3.4: (color online) The phase plane associated with the system (3.3.5), i.e., $\dot{x} = -3x - y$, $\dot{y} = -x - 3y$. Solution curves are drawn for several different initial conditions.

The general solution is given by
\[ \mathbf{x}(t) = c_1e^{2t}\begin{pmatrix} 1 \\ -1 \end{pmatrix} + c_2e^{4t}\begin{pmatrix} 1 \\ 1 \end{pmatrix}, \]
and the matrix-valued solution is
\[ \Phi(t) = \begin{pmatrix} e^{2t} & e^{4t} \\ -e^{2t} & e^{4t} \end{pmatrix}. \]
Various solutions on the phase plane are plotted in Figure 3.5. In this case the trivial solution $\mathbf{x} = \mathbf{0}$ is an unstable node (repeller). In general, $\mathbf{x} = \mathbf{0}$ is an unstable node when both eigenvalues are real and positive.
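For our final example, consider the second-order scalar ODE
\[ \ddot{y} + 3\dot{y} + 2y = 0. \]
Upon setting $x_1 = y$, $x_2 = \dot{y}$ this second-order ODE becomes the first-order system
\[ \dot{\mathbf{x}} = A\mathbf{x}, \qquad A = \begin{pmatrix} 0 & 1 \\ -2 & -3 \end{pmatrix}. \tag{3.3.7} \]
The characteristic equation is given by
\[ p_A(\lambda) = \lambda^2 + 3\lambda + 2. \]
The eigenvalues and associated eigenvectors are
\[ \lambda_1 = -2,\ \mathbf{v}_1 = \begin{pmatrix} 1 \\ -2 \end{pmatrix}; \qquad \lambda_2 = -1,\ \mathbf{v}_2 = \begin{pmatrix} 1 \\ -1 \end{pmatrix}, \]
so that the general solution is
\[ \mathbf{x}(t) = c_1e^{-2t}\begin{pmatrix} 1 \\ -2 \end{pmatrix} + c_2e^{-t}\begin{pmatrix} 1 \\ -1 \end{pmatrix}, \]
and the matrix-valued solution is
\[ \Phi(t) = \begin{pmatrix} e^{-2t} & e^{-t} \\ -2e^{-2t} & -e^{-t} \end{pmatrix}. \]

[Figure 3.5: phase plane for the system (3.3.6), i.e., $\dot{x} = 3x + y$, $\dot{y} = x + 3y$.]

In this case the trivial solution is a stable node, as both of the eigenvalues are real and negative. Various solutions on the phase plane are plotted in Figure 3.6. Regarding the solution to the original problem we have from $y = x_1$ that
\[ y(t) = c_1e^{-2t} + c_2e^{-t} \quad (= c_1e^{\lambda_1 t} + c_2e^{\lambda_2 t}). \]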
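We must now consider the case that the eigenvalues are complex-valued. Suppose that there is a complex-conjugate pair of eigenvalues and associated eigenvectors given by
\[ \lambda_1 = a + ib,\ \mathbf{v}_1 = \mathbf{p} + i\mathbf{q}; \qquad \lambda_2 = a - ib,\ \mathbf{v}_2 = \mathbf{p} - i\mathbf{q}. \]
Formally, one of the solutions is given by
\[ \mathbf{x}(t) = e^{\lambda_1 t}\mathbf{v}_1 = e^{(a+ib)t}(\mathbf{p} + i\mathbf{q}). \]
This is a complex-valued solution, and it is not the one that we want, for we know that all of the columns of the matrix-valued solution $e^{At}$ are real-valued. Thus, we must rewrite the solution in a different form. Assuming that $e^{(a+ib)t} = e^{at}e^{ibt}$, we need to determine what exactly is $e^{ibt}$. This will be accomplished (yet again) via the use of the Taylor series for the exponential function. Upon using the fact that
\[ i^2 = -1, \qquad i^3 = i^2 i = -i, \qquad i^4 = i^2 i^2 = 1, \]
we can write for $\theta \in \mathbb{R}$,
\[
\begin{aligned}
e^{i\theta} &= \sum_{j=0}^{\infty}\frac{(i\theta)^j}{j!} = 1 + i\theta - \frac{\theta^2}{2!} - i\frac{\theta^3}{3!} + \frac{\theta^4}{4!} + i\frac{\theta^5}{5!} + \cdots \\
&= \left(1 - \frac{\theta^2}{2!} + \frac{\theta^4}{4!} + \cdots + (-1)^j\frac{\theta^{2j}}{(2j)!} + \cdots\right) + i\left(\theta - \frac{\theta^3}{3!} + \frac{\theta^5}{5!} + \cdots + (-1)^j\frac{\theta^{2j+1}}{(2j+1)!} + \cdots\right).
\end{aligned}
\]

[Figure 3.6: phase plane for the system (3.3.7), i.e., $\dot{x} = y$, $\dot{y} = -2x - 3y$.]

In other words, we have the relationship
\[ e^{i\theta} = \cos\theta + i\sin\theta, \]
which is known as Euler's formula. Note that Euler's formula yields the intriguing identity $e^{i\pi} = -1$, which brings into one simple formula some of the most important constants and concepts in all of mathematics.
Since
\[ e^{(a+ib)t} = e^{at}\left(\cos(bt) + i\sin(bt)\right), \]
we now have that the solution can be expressed as
\[ \mathbf{x}(t) = e^{at}\left(\cos(bt) + i\sin(bt)\right)(\mathbf{p} + i\mathbf{q}) = \underbrace{e^{at}\left(\cos(bt)\mathbf{p} - \sin(bt)\mathbf{q}\right)}_{\mathbf{x}_1(t)} + i\,\underbrace{e^{at}\left(\sin(bt)\mathbf{p} + \cos(bt)\mathbf{q}\right)}_{\mathbf{x}_2(t)}. \]
Since
\[ \dot{\mathbf{x}} = A\mathbf{x} \quad\Longrightarrow\quad \dot{\mathbf{x}}_1 + i\dot{\mathbf{x}}_2 = A(\mathbf{x}_1 + i\mathbf{x}_2) = A\mathbf{x}_1 + iA\mathbf{x}_2, \]
by equating real and imaginary parts we have that $\mathbf{x}_1$ and $\mathbf{x}_2$ both solve the ODE. Thus, the complex-valued solution yields two linearly independent solutions, which is what was desired.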
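Lemma 3.3.2. Consider the linear ODE $\dot{\mathbf{x}} = A\mathbf{x}$ for $A \in \mathbb{R}^{n\times n}$. Suppose that there is a complex-conjugate pair of eigenvalues and associated eigenvectors given by
\[ \lambda_1 = a + ib,\ \mathbf{v}_1 = \mathbf{p} + i\mathbf{q}; \qquad \lambda_2 = a - ib,\ \mathbf{v}_2 = \mathbf{p} - i\mathbf{q}. \]
A pair of linearly independent solutions is given by
\[ \mathbf{x}_1(t) = e^{at}\left(\cos(bt)\mathbf{p} - \sin(bt)\mathbf{q}\right), \qquad \mathbf{x}_2(t) = e^{at}\left(\sin(bt)\mathbf{p} + \cos(bt)\mathbf{q}\right). \]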
Figure 3.7: (color online) The phase plane associated with the system (3.3.8), i.e., $\dot{x} = ax + 4y$, $\dot{y} = -4x + ay$, when $a = 1$.
For our rst example, consider for any a R,
x
= Ax, A =
_
a 4
4 a
_
. (3.3.8)
The characteristic equation is p
A
() = (a)
2
+16, so that the complex-valued eigenvalues are = a|i4.
Regarding the complex-valued eigenvector associated with the eigenvalue a +i4 we have
A(a +i4)I
2

_
i 1
0 0
_
v =
_
i
1
_
=
_
0
1
_
+i
_
1
0
_
.
Upon using Lemma 3.3.2 the two linearly independent solutions are
x
1
(t) = e
at
_
cos(4t)
_
0
1
_
sin(4t)
_
1
0
__
= e
at
_
sin(4t)
cos(4t)
_
,
and
x
2
(t) = e
at
_
sin(4t)
_
0
1
_
+cos(4t)
_
1
0
__
= e
at
_
cos(4t)
sin(4t)
_
.
The general solution is then
x(t) = c
1
e
at
_
sin(4t)
cos(4t)
_
+c
2
e
at
_
cos(4t)
sin(4t)
_
,
(t) =
_
e
at
sin(4t) e
at
cos(4t)
e
at
cos(4t) e
at
sin(4t)
_
.
Now let us consider the phase plane associated with (3.3.8). Consider the solution $c_1\mathbf{x}_1(t)$. The parametric plot of the vector-valued function
$$c_1\begin{pmatrix} \sin(4t) \\ \cos(4t) \end{pmatrix}$$
is a circle of radius $|c_1|$; furthermore, the circle is traversed in a clockwise direction for increasing $t$ (for $c_1 > 0$ the point starts at $(0, c_1)$ and initially moves to the right). Multiplying this vector by the function $e^{at}$ means that the radius of the circle is increasing ($a > 0$ - see Figure 3.7), decreasing ($a < 0$ - see Figure 3.8), or is constant ($a = 0$ - see Figure 3.9). Thus, for $a \neq 0$ the curve becomes a spiral. The case $a > 0$ is known as an unstable spiral (source), the case $a < 0$ is called a stable spiral (sink), and when $a = 0$ the solution $\mathbf{x} = \mathbf{0}$ is called a (linear) center.
[Figure: phase plane of the system $x' = ax + 4y$, $y' = -4x + ay$; the horizontal axis is $x$ and the vertical axis is $y$.]
Remark 3.3.3. When $A \in \mathbb{R}^{2 \times 2}$ has eigenvalues $\lambda = a \pm ib$, then it will always be the case that if $a > 0$, then the origin is an unstable spiral, if $a < 0$, then the origin is a stable spiral, and if $a = 0$ the origin is a center. The difference is that the spiralling will no longer necessarily be associated with circles. Instead, the level curves ($a = 0$) will be rotated ellipses, and the amount of rotation, as well as the size of the major and minor axes, will be a function of the real and imaginary parts of the eigenvector, $\mathbf{p}$ and $\mathbf{q}$.
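For another example consider the second-order scalar ODE
$$y'' - 4y' + 13y = 0.$$
Upon setting $x_1 = y$, $x_2 = y'$ this ODE becomes the first-order system
$$\mathbf{x}' = A\mathbf{x}, \quad A = \begin{pmatrix} 0 & 1 \\ -13 & 4 \end{pmatrix}.$$
The characteristic equation is $p_A(\lambda) = (\lambda - 2)^2 + 9$, so that the complex-valued eigenvalues are $\lambda = 2 \pm i3$. The origin is an unstable spiral. Regarding the complex-valued eigenvector associated with the eigenvalue $2 + i3$ we have
$$A - (2 + i3)I_2 \longrightarrow \begin{pmatrix} -2 - i3 & 1 \\ 0 & 0 \end{pmatrix} \quad\Longrightarrow\quad \mathbf{v} = \begin{pmatrix} 1 \\ 2 + i3 \end{pmatrix} = \begin{pmatrix} 1 \\ 2 \end{pmatrix} + i\begin{pmatrix} 0 \\ 3 \end{pmatrix}.$$
Upon using Lemma 3.3.2 the two linearly independent solutions are
$$\mathbf{x}_1(t) = e^{2t}\left(\cos(3t)\begin{pmatrix} 1 \\ 2 \end{pmatrix} - \sin(3t)\begin{pmatrix} 0 \\ 3 \end{pmatrix}\right) = e^{2t}\begin{pmatrix} \cos(3t) \\ 2\cos(3t) - 3\sin(3t) \end{pmatrix},$$
and
$$\mathbf{x}_2(t) = e^{2t}\left(\sin(3t)\begin{pmatrix} 1 \\ 2 \end{pmatrix} + \cos(3t)\begin{pmatrix} 0 \\ 3 \end{pmatrix}\right) = e^{2t}\begin{pmatrix} \sin(3t) \\ 2\sin(3t) + 3\cos(3t) \end{pmatrix}.$$
The general solution to the ODE system is then
$$\mathbf{x}(t) = c_1 e^{2t}\begin{pmatrix} \cos(3t) \\ 2\cos(3t) - 3\sin(3t) \end{pmatrix} + c_2 e^{2t}\begin{pmatrix} \sin(3t) \\ 2\sin(3t) + 3\cos(3t) \end{pmatrix},$$
and the solution to the original scalar second-order ODE is
$$y = c_1 e^{2t}\cos(3t) + c_2 e^{2t}\sin(3t).$$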
We finally consider the case that there are not $n$ linearly independent eigenvectors. Here we will focus solely on the case that $n = 2$: the general case requires that we consider the Jordan canonical form of a matrix, which is beyond the scope of this course. For an (illuminating) example, suppose that
$$A = \begin{pmatrix} 0 & 1 \\ 0 & 0 \end{pmatrix}.$$
Here we have that the characteristic polynomial is $p_A(\lambda) = \lambda^2$, so that $\lambda_1 = \lambda_2 = 0$ is a double eigenvalue. Since $A$ is already in RREF, we know that there is only one linearly independent eigenvector, which is
$$\mathbf{v}_1 = \begin{pmatrix} 1 \\ 0 \end{pmatrix}. \tag{3.3.9}$$
Thus, one solution is
$$\mathbf{x}_1(t) = e^{\lambda_1 t}\mathbf{v}_1 = \begin{pmatrix} 1 \\ 0 \end{pmatrix}:$$
what is the second solution $\mathbf{x}_2(t)$ which satisfies
$$\mathbf{x}_2(0) = \begin{pmatrix} 0 \\ 1 \end{pmatrix}$$
(an initial condition which is linearly independent from $\mathbf{x}_1(0)$)? We go back to the matrix-valued solution $e^{At}$, and see if we can directly compute it. Using the notation that $0_n \in \mathbb{R}^{n \times n}$ is the matrix with all zero entries, it is not difficult to compute that
$$A^2 = 0_2 \quad\Longrightarrow\quad A^j = 0_2,\ j = 3, 4, \dots;$$
consequently,
$$e^{At} = I_2 + tA = \begin{pmatrix} 1 & t \\ 0 & 1 \end{pmatrix}.$$
A second linearly independent solution is then given by
$$\mathbf{x}_2(t) = \begin{pmatrix} t \\ 1 \end{pmatrix} = \begin{pmatrix} 0 \\ 1 \end{pmatrix} + t\begin{pmatrix} 1 \\ 0 \end{pmatrix}.$$
Note that this second solution is of the form
$$\mathbf{x}_2(t) = \begin{pmatrix} 0 \\ 1 \end{pmatrix} + t\mathbf{v}_1,$$
so that the eigenvector does play a (non-obvious) role in the form of the solution.
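While we will not do the calculation here, it is not difficult to check that for the (slightly) more general case of the matrix
$$A = \begin{pmatrix} a & 1 \\ 0 & a \end{pmatrix},$$
which has the eigenvalues $\lambda_1 = \lambda_2 = a$ and sole associated eigenvector $\mathbf{v}_1$ as given in (3.3.9), a matrix-valued solution is
$$e^{At} = \begin{pmatrix} e^{at} & te^{at} \\ 0 & e^{at} \end{pmatrix}.$$
Thus, a second linearly independent solution is given by
$$\mathbf{x}_2(t) = e^{at}\begin{pmatrix} 0 \\ 1 \end{pmatrix} + te^{at}\begin{pmatrix} 1 \\ 0 \end{pmatrix} = e^{at}\begin{pmatrix} 0 \\ 1 \end{pmatrix} + te^{at}\mathbf{v}_1.$$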
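The closed form for $e^{At}$ quoted above can be compared against the defining power series $\sum_j (At)^j/j!$. The sketch below is only an illustration under stated assumptions: the helper `mat_exp_series` is hypothetical, the series is truncated after 40 terms, and the values $a = 0.7$, $t = 1.3$ are arbitrary.

```python
import math

def mat_exp_series(A, t, terms=40):
    """Truncated power series for e^{At}: sum over j < terms of (At)^j / j!."""
    n = len(A)
    result = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]
    term = [row[:] for row in result]  # current term (At)^j / j!, starting at j = 0
    for j in range(1, terms):
        # next term = previous term * A * t / j
        term = [[sum(term[i][k] * A[k][l] for k in range(n)) * t / j
                 for l in range(n)] for i in range(n)]
        result = [[result[i][l] + term[i][l] for l in range(n)] for i in range(n)]
    return result

a, t = 0.7, 1.3  # illustrative values
A = [[a, 1.0], [0.0, a]]
series = mat_exp_series(A, t)
closed = [[math.exp(a * t), t * math.exp(a * t)],
          [0.0, math.exp(a * t)]]
err = max(abs(series[i][j] - closed[i][j]) for i in range(2) for j in range(2))
print(err)  # should be at the level of rounding error
```

The second column of the matrix is exactly the solution $\mathbf{x}_2(t)$ with $\mathbf{x}_2(0) = (0, 1)^T$.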
With these two examples serving as a guide, we now know what to do in the case that there is a double eigenvalue $\lambda_1 = \lambda_2$ with only one eigenvector $\mathbf{v}_1$; namely, we will guess the second solution to be of the form
$$\mathbf{x}_2(t) = e^{\lambda_1 t}\mathbf{w} + te^{\lambda_1 t}\mathbf{v}_1,$$
where $\mathbf{w}$ is an unknown vector. Note that this guess is reminiscent of the one that would be used when using the method of undetermined coefficients to find the particular solution for the scalar linear ODE
$$x' = ax + e^{at}.$$
Upon noting that
$$\mathbf{x}_2' = e^{\lambda_1 t}(\lambda_1\mathbf{w} + \mathbf{v}_1) + \lambda_1 te^{\lambda_1 t}\mathbf{v}_1,$$
the requirement that $\mathbf{x}_2' = A\mathbf{x}_2$, and the linearity of matrix multiplication, yields the algebraic system
$$\lambda_1\mathbf{w} + \mathbf{v}_1 = A\mathbf{w}, \quad \lambda_1\mathbf{v}_1 = A\mathbf{v}_1.$$
The second equation is automatically satisfied because $\lambda_1$ is an eigenvalue and $\mathbf{v}_1$ is an associated eigenvector. The first equation is the linear system
$$(A - \lambda_1 I_2)\mathbf{w} = \mathbf{v}_1. \tag{3.3.10}$$
Since $\dim[\mathrm{Nul}(A - \lambda_1 I_2)] = 1$ (this is a consequence of the fact that there is only one eigenvector associated with the eigenvalue $\lambda_1$), the RREF of $A - \lambda_1 I_2$ is not the identity matrix $I_2$; hence, it is not clear that the linear system (3.3.10) has a solution. It turns out to be the case, however, that it does (the reason for this fact is beyond the scope of this class); furthermore, the set $\{\mathbf{v}_1, \mathbf{w}\}$ forms a basis for $\mathbb{R}^2$. Of course, the solution will not be unique, because $\dim[\mathrm{Nul}(A - \lambda_1 I_2)] = 1$ implies that the RREF of $A - \lambda_1 I_2$ has one free variable.
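Lemma 3.3.4. Consider the linear ODE $\mathbf{x}' = A\mathbf{x}$ for $A \in \mathbb{R}^{2 \times 2}$. Suppose that $p_A(\lambda) = (\lambda - \lambda_1)^2$, and that there is only one linearly independent associated eigenvector. The two linearly independent solutions are given by
$$\mathbf{x}_1(t) = e^{\lambda_1 t}\mathbf{v}_1, \quad \mathbf{x}_2(t) = e^{\lambda_1 t}\mathbf{w} + te^{\lambda_1 t}\mathbf{v}_1,$$
where the vector $\mathbf{w}$ is found by solving the linear system
$$(A - \lambda_1 I_2)\mathbf{w} = \mathbf{v}_1.$$
The general solution is then given by
$$\mathbf{x}(t) = c_1 e^{\lambda_1 t}\mathbf{v}_1 + c_2\left(e^{\lambda_1 t}\mathbf{w} + te^{\lambda_1 t}\mathbf{v}_1\right) = e^{\lambda_1 t}(c_1\mathbf{v}_1 + c_2\mathbf{w}) + c_2 te^{\lambda_1 t}\mathbf{v}_1.$$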
Remark 3.3.5. The situation leading to Lemma 3.3.4 is not generic, as a small perturbation of the matrix
generally leads to the characteristic polynomial having two simple zeros. In particular, a small pertur-
bation of the matrix leads to either the case of two real eigenvalues, or the case of a complex-conjugate
pair of eigenvalues with nonzero imaginary part. Both of these cases have been previously studied.
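For an example, consider
$$\mathbf{x}' = A\mathbf{x}, \quad A = \begin{pmatrix} 1 & -2 \\ 2 & 5 \end{pmatrix}.$$
The characteristic equation is $p_A(\lambda) = (\lambda - 3)^2$, so that $\lambda = 3$ is a double eigenvalue. Regarding the associated eigenvector(s) we have
$$A - 3I_2 = \begin{pmatrix} -2 & -2 \\ 2 & 2 \end{pmatrix} \longrightarrow \begin{pmatrix} 1 & 1 \\ 0 & 0 \end{pmatrix} \quad\Longrightarrow\quad \mathbf{v}_1 = \begin{pmatrix} -1 \\ 1 \end{pmatrix}.$$
One solution is given by
$$\mathbf{x}_1(t) = e^{3t}\begin{pmatrix} -1 \\ 1 \end{pmatrix}.$$
Since there is only one linearly independent eigenvector associated with the double eigenvalue, we must use Lemma 3.3.4 in order to find the second linearly independent solution. The RREF of the augmented matrix associated with the linear system $(A - 3I_2)\mathbf{w} = \mathbf{v}_1$ is
$$(A - 3I_2 \,|\, \mathbf{v}_1) \longrightarrow \left(\begin{array}{cc|c} 1 & 1 & 1/2 \\ 0 & 0 & 0 \end{array}\right),$$
which is equivalent to the linear equation
$$w_1 + w_2 = \frac{1}{2} \quad\Longrightarrow\quad \mathbf{w} = \begin{pmatrix} 1/2 \\ 0 \end{pmatrix} + c\begin{pmatrix} -1 \\ 1 \end{pmatrix}, \quad c \in \mathbb{R}.$$
We need only one of these solutions: picking $c = 0$ yields the vector
$$\mathbf{w} = \begin{pmatrix} 1/2 \\ 0 \end{pmatrix} \quad\Longrightarrow\quad \mathbf{x}_2(t) = e^{3t}\begin{pmatrix} 1/2 \\ 0 \end{pmatrix} + te^{3t}\begin{pmatrix} -1 \\ 1 \end{pmatrix} = e^{3t}\begin{pmatrix} 1/2 - t \\ t \end{pmatrix}.$$
The general solution to the ODE is then
$$\mathbf{x}(t) = c_1 e^{3t}\begin{pmatrix} -1 \\ 1 \end{pmatrix} + c_2 e^{3t}\begin{pmatrix} 1/2 - t \\ t \end{pmatrix}, \quad \Phi(t) = \begin{pmatrix} -e^{3t} & (1/2 - t)e^{3t} \\ e^{3t} & te^{3t} \end{pmatrix}.$$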
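The two ingredients of Lemma 3.3.4 for this example can be checked mechanically: that $\mathbf{w}$ solves $(A - 3I_2)\mathbf{w} = \mathbf{v}_1$, and that $\mathbf{x}_2(t)$ satisfies the ODE. The sketch below assumes the matrix $A$ with entries $1, -2, 2, 5$ and the vectors $\mathbf{v}_1 = (-1, 1)^T$, $\mathbf{w} = (1/2, 0)^T$; the helper names are hypothetical, and the derivative of $\mathbf{x}_2$ is coded from the hand computation rather than approximated.

```python
import math

A = [[1.0, -2.0], [2.0, 5.0]]
lam = 3.0            # double eigenvalue
v1 = [-1.0, 1.0]     # the single eigenvector
w = [0.5, 0.0]       # solution of (A - lam I) w = v1 with free parameter c = 0

def matvec(M, x):
    return [sum(M[i][j] * x[j] for j in range(len(x))) for i in range(len(M))]

# (A - lam I) w should reproduce v1
shifted = [[A[i][j] - (lam if i == j else 0.0) for j in range(2)] for i in range(2)]
chain = matvec(shifted, w)

def x2(t):
    g = math.exp(lam * t)
    return [g * (w[i] + t * v1[i]) for i in range(2)]

def x2_prime(t):
    # derivative by hand: e^{lam t} (lam w + v1) + lam t e^{lam t} v1
    g = math.exp(lam * t)
    return [g * (lam * w[i] + v1[i] + lam * t * v1[i]) for i in range(2)]

# largest mismatch between x2'(t) and A x2(t) at a few sample times
gap = max(abs(d - y)
          for t in (0.0, 0.5, 1.0)
          for d, y in zip(x2_prime(t), matvec(A, x2(t))))
print(chain, gap)
```

Any other choice of the free parameter $c$ adds a multiple of $\mathbf{x}_1(t)$ to $\mathbf{x}_2(t)$, so the check passes for every $c$.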
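For another example consider the second-order scalar ODE
$$y'' + 2y' + y = 0.$$
Upon setting $x_1 = y$, $x_2 = y'$ this ODE becomes the first-order system
$$\mathbf{x}' = A\mathbf{x}, \quad A = \begin{pmatrix} 0 & 1 \\ -1 & -2 \end{pmatrix}.$$
The characteristic equation is $p_A(\lambda) = (\lambda + 1)^2$, so that $\lambda = -1$ is a double eigenvalue. Regarding the associated eigenvector(s) we have
$$A - (-1)I_2 = \begin{pmatrix} 1 & 1 \\ -1 & -1 \end{pmatrix} \longrightarrow \begin{pmatrix} 1 & 1 \\ 0 & 0 \end{pmatrix} \quad\Longrightarrow\quad \mathbf{v}_1 = \begin{pmatrix} 1 \\ -1 \end{pmatrix}.$$
One solution is then given by
$$\mathbf{x}_1(t) = e^{-t}\begin{pmatrix} 1 \\ -1 \end{pmatrix}.$$
Since there is only one linearly independent eigenvector associated with the double eigenvalue, we must use Lemma 3.3.4 in order to find the second linearly independent solution. The RREF of the augmented matrix associated with the linear system $(A + I_2)\mathbf{w} = \mathbf{v}_1$ is
$$(A + I_2 \,|\, \mathbf{v}_1) \longrightarrow \left(\begin{array}{cc|c} 1 & 1 & 1 \\ 0 & 0 & 0 \end{array}\right).$$
Solving as in the previous example yields the second solution to be
$$\mathbf{x}_2(t) = e^{-t}\begin{pmatrix} 1 \\ 0 \end{pmatrix} + te^{-t}\begin{pmatrix} 1 \\ -1 \end{pmatrix} = e^{-t}\begin{pmatrix} 1 + t \\ -t \end{pmatrix}.$$
The general solution to the ODE system is then
$$\mathbf{x}(t) = c_1 e^{-t}\begin{pmatrix} 1 \\ -1 \end{pmatrix} + c_2 e^{-t}\begin{pmatrix} 1 + t \\ -t \end{pmatrix},$$
and, after relabeling the arbitrary constants, the solution to the original scalar second-order ODE is
$$y = c_1 e^{-t} + c_2 te^{-t}.$$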
We conclude with a classification of the trivial solution $\mathbf{x} = \mathbf{0}$, which is based upon our understanding of the solution behavior:
Theorem 3.3.6. Consider the homogeneous system with constant coefficients,
$$\mathbf{x}' = A\mathbf{x}, \quad A \in \mathbb{R}^{2 \times 2}.$$
Let $\lambda_1, \lambda_2$ denote the eigenvalues of $A$. The trivial solution $\mathbf{x} = \mathbf{0}$ is classified for real eigenvalues as follows:
(a) $\lambda_1 < \lambda_2 < 0$: stable node
(b) $0 < \lambda_1 < \lambda_2$: unstable node
(c) $\lambda_1 < 0 < \lambda_2$: unstable saddle point.
If the eigenvalues are complex-valued with $\lambda_1 = a + ib$ and $\lambda_2 = a - ib$, then $\mathbf{x} = \mathbf{0}$ is said to be a:
(a) $a < 0$: stable spiral
(b) $a > 0$: unstable spiral
(c) $a = 0$: (linear) center.
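The classification in Theorem 3.3.6 can be written as a short decision procedure. The sketch below is one possible encoding and not the author's: the function name `classify` is hypothetical, and it uses the standard fact that for a $2 \times 2$ matrix the eigenvalues are the roots of $\lambda^2 - \operatorname{tr}(A)\lambda + \det(A)$. Cases the theorem does not list (a repeated or a zero eigenvalue) are reported as such rather than guessed.

```python
import math

def classify(A):
    """Classify x = 0 for x' = Ax, A a real 2x2 matrix, following Theorem 3.3.6."""
    tr = A[0][0] + A[1][1]
    det = A[0][0] * A[1][1] - A[0][1] * A[1][0]
    disc = tr * tr - 4.0 * det          # discriminant of lambda^2 - tr*lambda + det
    if disc < 0:                        # complex pair a +/- ib with a = tr/2
        a = tr / 2.0
        if a < 0:
            return "stable spiral"
        if a > 0:
            return "unstable spiral"
        return "center"
    r = math.sqrt(disc)
    lam1, lam2 = (tr - r) / 2.0, (tr + r) / 2.0   # real eigenvalues, lam1 <= lam2
    if lam1 < lam2 < 0:
        return "stable node"
    if 0 < lam1 < lam2:
        return "unstable node"
    if lam1 < 0 < lam2:
        return "unstable saddle point"
    return "not covered by the theorem"

print(classify([[1.0, 4.0], [-4.0, 1.0]]))    # system (3.3.8) with a = 1
print(classify([[2.0, -1.0], [3.0, -2.0]]))   # eigenvalues -1 and 1
print(classify([[1.0, -2.0], [2.0, 5.0]]))    # double eigenvalue 3
```

The last call falls outside the theorem because the eigenvalue is repeated; that situation is handled by Lemma 3.3.4 instead.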
3.3.2.1 Variation of parameters: examples
We have already seen in Theorem 3.2.3 that the general solution to the nonhomogeneous problem
$$\mathbf{x}' = A\mathbf{x} + \mathbf{f}(t)$$
is given by the variation of parameters formula
$$\mathbf{x}(t) = \Phi(t)\mathbf{c} + \Phi(t)\int^t \Phi(s)^{-1}\mathbf{f}(s)\,ds, \tag{3.3.11}$$
where $\Phi(t)$ is a matrix-valued solution to the homogeneous problem, i.e., $\Phi' = A\Phi$. The first term in the sum is the homogeneous solution, and the second term in the sum is the particular solution. We have just seen how to solve the homogeneous problem via finding the eigenvalues and associated eigenvectors for the matrix $A$. Thus, all that is left to do is a couple of examples which illustrate the theory.
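For our first example, consider the IVP
$$\mathbf{x}' = \begin{pmatrix} 2 & -1 \\ 3 & -2 \end{pmatrix}\mathbf{x} + \begin{pmatrix} 2/(1 + e^{t}) \\ 0 \end{pmatrix}, \quad \mathbf{x}(0) = \begin{pmatrix} 5 \\ -3 \end{pmatrix}.$$
The eigenvalues and associated eigenvectors for the matrix $A$ are given by
$$\lambda_1 = -1,\ \mathbf{v}_1 = \begin{pmatrix} 1 \\ 3 \end{pmatrix}; \qquad \lambda_2 = 1,\ \mathbf{v}_2 = \begin{pmatrix} 1 \\ 1 \end{pmatrix},$$
so that a matrix-valued solution for the homogeneous problem is
$$\Phi(t) = \begin{pmatrix} e^{-t} & e^{t} \\ 3e^{-t} & e^{t} \end{pmatrix}.$$
Let us now construct the particular solution. Since
$$\Phi(s)^{-1} = \frac{1}{2}\begin{pmatrix} -e^{s} & e^{s} \\ 3e^{-s} & -e^{-s} \end{pmatrix},$$
the integrand associated with the particular solution becomes
$$\Phi(s)^{-1}\mathbf{f}(s) = \frac{1}{2}\begin{pmatrix} -e^{s} & e^{s} \\ 3e^{-s} & -e^{-s} \end{pmatrix}\begin{pmatrix} 2/(1 + e^{s}) \\ 0 \end{pmatrix} = \begin{pmatrix} -e^{s}/(1 + e^{s}) \\ 3e^{-s}/(1 + e^{s}) \end{pmatrix}.$$
It can be checked (perhaps by using Wolfram Mathematica Online Integrator) that
$$\int^t \Phi(s)^{-1}\mathbf{f}(s)\,ds = \begin{pmatrix} -\ln(1 + e^{t}) \\ 3\ln(1 + e^{-t}) - 3e^{-t} \end{pmatrix},$$
so that the particular solution is now seen to be
$$\mathbf{x}_p(t) = \Phi(t)\int^t \Phi(s)^{-1}\mathbf{f}(s)\,ds = \begin{pmatrix} -e^{-t}\ln(1 + e^{t}) + 3e^{t}\ln(1 + e^{-t}) - 3 \\ -3e^{-t}\ln(1 + e^{t}) + 3e^{t}\ln(1 + e^{-t}) - 3 \end{pmatrix}.$$
From (3.3.11) the general solution is
$$\mathbf{x}(t) = \begin{pmatrix} e^{-t} & e^{t} \\ 3e^{-t} & e^{t} \end{pmatrix}\mathbf{c} + \begin{pmatrix} -e^{-t}\ln(1 + e^{t}) + 3e^{t}\ln(1 + e^{-t}) - 3 \\ -3e^{-t}\ln(1 + e^{t}) + 3e^{t}\ln(1 + e^{-t}) - 3 \end{pmatrix}.$$
In order to satisfy the initial condition we have
$$\begin{pmatrix} 5 \\ -3 \end{pmatrix} = \mathbf{x}(0) = \begin{pmatrix} 1 & 1 \\ 3 & 1 \end{pmatrix}\mathbf{c} + \begin{pmatrix} 2\ln(2) - 3 \\ -3 \end{pmatrix},$$
which is equivalent to the linear system
$$\begin{pmatrix} 1 & 1 \\ 3 & 1 \end{pmatrix}\mathbf{c} = \begin{pmatrix} 8 - 2\ln(2) \\ 0 \end{pmatrix}.$$
The solution to this linear system is
$$\mathbf{c} = \begin{pmatrix} 1 & 1 \\ 3 & 1 \end{pmatrix}^{-1}\begin{pmatrix} 8 - 2\ln(2) \\ 0 \end{pmatrix} = \begin{pmatrix} -4 + \ln(2) \\ 12 - 3\ln(2) \end{pmatrix}.$$
Putting this all together, the solution to the IVP is given by
$$\mathbf{x}(t) = (-4 + \ln(2))e^{-t}\begin{pmatrix} 1 \\ 3 \end{pmatrix} + (12 - 3\ln(2))e^{t}\begin{pmatrix} 1 \\ 1 \end{pmatrix} + \begin{pmatrix} -e^{-t}\ln(1 + e^{t}) + 3e^{t}\ln(1 + e^{-t}) - 3 \\ -3e^{-t}\ln(1 + e^{t}) + 3e^{t}\ln(1 + e^{-t}) - 3 \end{pmatrix}.$$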
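A closed-form answer of this size is easy to get wrong by a sign, so it is worth comparing against a direct numerical integration. The sketch below is an independent check under stated assumptions: it takes $A$ with rows $(2, -1)$ and $(3, -2)$, forcing $(2/(1 + e^{t}), 0)^T$ and $\mathbf{x}(0) = (5, -3)^T$, and uses a hand-written classical fourth-order Runge-Kutta stepper (`rk4`, a hypothetical helper) rather than any library solver.

```python
import math

def rhs(t, x):
    """Right-hand side A x + f(t) of the IVP."""
    g = 2.0 / (1.0 + math.exp(t))
    return [2.0 * x[0] - x[1] + g, 3.0 * x[0] - 2.0 * x[1]]

def exact(t):
    """Closed-form solution obtained from variation of parameters."""
    ln = math.log
    common = 3 * math.exp(t) * ln(1 + math.exp(-t)) - 3
    xp1 = -math.exp(-t) * ln(1 + math.exp(t)) + common
    xp2 = -3 * math.exp(-t) * ln(1 + math.exp(t)) + common
    c1, c2 = -4 + ln(2), 12 - 3 * ln(2)
    return [c1 * math.exp(-t) + c2 * math.exp(t) + xp1,
            3 * c1 * math.exp(-t) + c2 * math.exp(t) + xp2]

def rk4(f, x, t0, t1, n):
    """Classical RK4 with n equal steps from t0 to t1."""
    h = (t1 - t0) / n
    t = t0
    for _ in range(n):
        k1 = f(t, x)
        k2 = f(t + h / 2, [xi + h / 2 * ki for xi, ki in zip(x, k1)])
        k3 = f(t + h / 2, [xi + h / 2 * ki for xi, ki in zip(x, k2)])
        k4 = f(t + h, [xi + h * ki for xi, ki in zip(x, k3)])
        x = [xi + h / 6 * (a + 2 * b + 2 * c + d)
             for xi, a, b, c, d in zip(x, k1, k2, k3, k4)]
        t += h
    return x

numeric = rk4(rhs, [5.0, -3.0], 0.0, 1.0, 1000)
error = max(abs(u - v) for u, v in zip(numeric, exact(1.0)))
print(error)  # should be far below the size of the solution itself
```

Agreement at $t = 1$ together with `exact(0.0)` returning the initial condition gives reasonable confidence in the signs of every term.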
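For our second example, consider finding the general solution to the second-order scalar ODE
$$y'' + y = \sec(t).$$
Upon setting $x_1 = y$, $x_2 = y'$ this ODE becomes the first-order system
$$\mathbf{x}' = \begin{pmatrix} 0 & 1 \\ -1 & 0 \end{pmatrix}\mathbf{x} + \begin{pmatrix} 0 \\ \sec(t) \end{pmatrix}.$$
The complex-valued eigenvalues and associated eigenvectors for the matrix $A$ are given by
$$\lambda_1 = i,\ \mathbf{v}_1 = \begin{pmatrix} 1 \\ i \end{pmatrix} = \begin{pmatrix} 1 \\ 0 \end{pmatrix} + i\begin{pmatrix} 0 \\ 1 \end{pmatrix},$$
so that a matrix-valued solution for the homogeneous problem is
$$\Phi(t) = \begin{pmatrix} \cos(t) & \sin(t) \\ -\sin(t) & \cos(t) \end{pmatrix}.$$
Let us now construct the particular solution. Since
$$\Phi(s)^{-1} = \begin{pmatrix} \cos(s) & -\sin(s) \\ \sin(s) & \cos(s) \end{pmatrix},$$
the integrand associated with the particular solution becomes
$$\Phi(s)^{-1}\mathbf{f}(s) = \begin{pmatrix} \cos(s) & -\sin(s) \\ \sin(s) & \cos(s) \end{pmatrix}\begin{pmatrix} 0 \\ \sec(s) \end{pmatrix} = \begin{pmatrix} -\tan(s) \\ 1 \end{pmatrix}.$$
It can be checked (perhaps by using Wolfram Mathematica Online Integrator) that
$$\int^t \Phi(s)^{-1}\mathbf{f}(s)\,ds = \begin{pmatrix} \ln(\cos(t)) \\ t \end{pmatrix},$$
so that the particular solution is
$$\mathbf{x}_p(t) = \Phi(t)\int^t \Phi(s)^{-1}\mathbf{f}(s)\,ds = \begin{pmatrix} \cos(t)\ln(\cos(t)) + t\sin(t) \\ -\sin(t)\ln(\cos(t)) + t\cos(t) \end{pmatrix}.$$
From (3.3.11) the general solution for the vectorized problem is
$$\mathbf{x}(t) = \begin{pmatrix} \cos(t) & \sin(t) \\ -\sin(t) & \cos(t) \end{pmatrix}\mathbf{c} + \begin{pmatrix} \cos(t)\ln(\cos(t)) + t\sin(t) \\ -\sin(t)\ln(\cos(t)) + t\cos(t) \end{pmatrix}.$$
Consequently, the general solution for the original second-order ODE is
$$y(t) = c_1\cos(t) + c_2\sin(t) + \cos(t)\ln(\cos(t)) + t\sin(t).$$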
3.3.2.2 Undetermined coefficients
Recall that in Section 2.3.2.2 we discussed how to find the particular solution for scalar problems of the form
$$x' = ax + f(t),$$
where $f(t)$ was composed of products and sums of exponential functions, polynomials, and sines and cosines. We can do the same thing for linear systems of the form
$$\mathbf{x}' = A\mathbf{x} + \mathbf{f}(t),$$
where each component of $\mathbf{f}(t)$ is composed of products and sums of exponential functions, polynomials, and sines and cosines. The only difference will be that instead of using undetermined coefficients as part of our guess, we will instead use undetermined vectors. We will illustrate the method for only a few simple examples for which the forcing function is of a type that often occurs in applications. For anything relatively complicated it is generally best to use the variation of parameters formula, and use some type of computer algebra system to negotiate all of the indefinite integrals that arise. The homogeneous problem for each example is
$$\mathbf{x}' = A\mathbf{x}, \quad A = \begin{pmatrix} 2 & -1 \\ 3 & -2 \end{pmatrix}.$$
As we have already seen, the general solution for this problem is
$$\mathbf{x}_h(t) = c_1 e^{-t}\begin{pmatrix} 1 \\ 3 \end{pmatrix} + c_2 e^{t}\begin{pmatrix} 1 \\ 1 \end{pmatrix}.$$
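For our first example, suppose that
$$\mathbf{f}(t) = \begin{pmatrix} 4 \\ 5 \end{pmatrix}.$$
A natural guess for the particular solution is
$$\mathbf{x}_p(t) = \mathbf{a},$$
where $\mathbf{a}$ is an unknown vector. Plugging this guess into the ODE yields
$$\mathbf{0} = A\mathbf{a} + \begin{pmatrix} 4 \\ 5 \end{pmatrix} \quad\Longrightarrow\quad A\mathbf{a} = -\begin{pmatrix} 4 \\ 5 \end{pmatrix}.$$
The solution to this linear system is
$$\mathbf{a} = -A^{-1}\begin{pmatrix} 4 \\ 5 \end{pmatrix} = -\begin{pmatrix} 3 \\ 2 \end{pmatrix}.$$
In conclusion, the general solution is
$$\mathbf{x}(t) = c_1 e^{-t}\begin{pmatrix} 1 \\ 3 \end{pmatrix} + c_2 e^{t}\begin{pmatrix} 1 \\ 1 \end{pmatrix} - \begin{pmatrix} 3 \\ 2 \end{pmatrix}.$$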
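For the second example, suppose that
$$\mathbf{f}(t) = e^{3t}\begin{pmatrix} 2 \\ 1 \end{pmatrix}.$$
A natural guess for the particular solution is
$$\mathbf{x}_p(t) = e^{3t}\mathbf{a},$$
where $\mathbf{a}$ is an unknown vector. Plugging this guess into the ODE yields
$$3e^{3t}\mathbf{a} = e^{3t}A\mathbf{a} + e^{3t}\begin{pmatrix} 2 \\ 1 \end{pmatrix} \quad\Longrightarrow\quad (A - 3I_2)\mathbf{a} = -\begin{pmatrix} 2 \\ 1 \end{pmatrix}.$$
Since $\lambda = 3$ is not an eigenvalue of $A$, the matrix $A - 3I_2$ is invertible. The solution to this linear system is
$$\mathbf{a} = -(A - 3I_2)^{-1}\begin{pmatrix} 2 \\ 1 \end{pmatrix} = \frac{1}{8}\begin{pmatrix} 9 \\ 7 \end{pmatrix}.$$
In conclusion, the general solution is
$$\mathbf{x}(t) = c_1 e^{-t}\begin{pmatrix} 1 \\ 3 \end{pmatrix} + c_2 e^{t}\begin{pmatrix} 1 \\ 1 \end{pmatrix} + \frac{1}{8}e^{3t}\begin{pmatrix} 9 \\ 7 \end{pmatrix}.$$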
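For our final example, suppose that
$$\mathbf{f}(t) = \cos(2t)\begin{pmatrix} 1 \\ 4 \end{pmatrix}.$$
A natural guess for the particular solution is
$$\mathbf{x}_p(t) = \cos(2t)\mathbf{a} + \sin(2t)\mathbf{b},$$
where $\mathbf{a}, \mathbf{b}$ are unknown vectors. Plugging this guess into the ODE yields
$$-2\sin(2t)\mathbf{a} + 2\cos(2t)\mathbf{b} = \cos(2t)A\mathbf{a} + \sin(2t)A\mathbf{b} + \cos(2t)\begin{pmatrix} 1 \\ 4 \end{pmatrix},$$
which after equating the cosine and sine terms is equivalent to the pair of linear systems
$$A\mathbf{a} - 2\mathbf{b} = -\begin{pmatrix} 1 \\ 4 \end{pmatrix}, \quad 2\mathbf{a} + A\mathbf{b} = \mathbf{0}.$$
In block-matrix form this linear system is equivalent to
$$\begin{pmatrix} A & -2I_2 \\ 2I_2 & A \end{pmatrix}\begin{pmatrix} \mathbf{a} \\ \mathbf{b} \end{pmatrix} = \begin{pmatrix} -1 \\ -4 \\ 0 \\ 0 \end{pmatrix} \quad\Longrightarrow\quad \begin{pmatrix} 2 & -1 & -2 & 0 \\ 3 & -2 & 0 & -2 \\ 2 & 0 & 2 & -1 \\ 0 & 2 & 3 & -2 \end{pmatrix}\begin{pmatrix} a_1 \\ a_2 \\ b_1 \\ b_2 \end{pmatrix} = \begin{pmatrix} -1 \\ -4 \\ 0 \\ 0 \end{pmatrix}.$$
Solving this linear system yields
$$\mathbf{a} = \frac{1}{5}\begin{pmatrix} 2 \\ 5 \end{pmatrix}, \quad \mathbf{b} = \frac{2}{5}\begin{pmatrix} 1 \\ 4 \end{pmatrix}.$$
In conclusion, the general solution is
$$\mathbf{x}(t) = c_1 e^{-t}\begin{pmatrix} 1 \\ 3 \end{pmatrix} + c_2 e^{t}\begin{pmatrix} 1 \\ 1 \end{pmatrix} + \frac{1}{5}\cos(2t)\begin{pmatrix} 2 \\ 5 \end{pmatrix} + \frac{2}{5}\sin(2t)\begin{pmatrix} 1 \\ 4 \end{pmatrix}.$$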
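The $4 \times 4$ block system for the undetermined vectors is a natural candidate for a mechanical solve. The sketch below is illustrative only: it assumes $A$ has rows $(2, -1)$ and $(3, -2)$ and forcing $\cos(2t)(1, 4)^T$, and it uses a small hand-written Gauss-Jordan routine (`solve`, a hypothetical helper) over exact rationals so that the answer can be compared with the fractions in the text.

```python
from fractions import Fraction as Fr

def solve(M, rhs):
    """Gauss-Jordan elimination with exact rational arithmetic (M assumed invertible)."""
    n = len(M)
    aug = [[Fr(v) for v in row] + [Fr(r)] for row, r in zip(M, rhs)]
    for col in range(n):
        # find a row with a nonzero pivot and move it into place
        piv = next(r for r in range(col, n) if aug[r][col] != 0)
        aug[col], aug[piv] = aug[piv], aug[col]
        pivot = aug[col][col]
        aug[col] = [v / pivot for v in aug[col]]
        for r in range(n):
            if r != col:
                factor = aug[r][col]
                aug[r] = [v - factor * p for v, p in zip(aug[r], aug[col])]
    return [row[-1] for row in aug]

A = [[2, -1], [3, -2]]
omega = 2            # forcing frequency in cos(omega t)
f0 = [1, 4]          # forcing vector

# Block matrix [[A, -omega I], [omega I, A]] acting on (a1, a2, b1, b2)
M = [
    [A[0][0], A[0][1], -omega, 0],
    [A[1][0], A[1][1], 0, -omega],
    [omega, 0, A[0][0], A[0][1]],
    [0, omega, A[1][0], A[1][1]],
]
sol = solve(M, [-f0[0], -f0[1], 0, 0])
a, b = sol[:2], sol[2:]
print(a, b)
```

The same routine handles other frequencies or forcing vectors by changing `omega` and `f0`, as long as $\pm i\omega$ is not an eigenvalue of $A$.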
Remark 3.3.7. The last example, while relatively simple, clearly illustrates the limitations of the method of undetermined coefficients for systems of ODEs. While the guess is straightforward, it is often the case that the resulting set of linear equations to be solved gets large fairly quickly. For example, if
$$\mathbf{f}(t) = \mathbf{v}_0 + t\mathbf{v}_1 + t^2\mathbf{v}_2,$$
then a natural guess for the particular solution is
$$\mathbf{x}_p(t) = \mathbf{a}_0 + t\mathbf{a}_1 + t^2\mathbf{a}_2.$$
Plugging this guess into the ODE and equating like terms yields the system of equations
$$-A\mathbf{a}_0 + \mathbf{a}_1 = \mathbf{v}_0, \quad -A\mathbf{a}_1 + 2\mathbf{a}_2 = \mathbf{v}_1, \quad -A\mathbf{a}_2 = \mathbf{v}_2.$$
This is a system of six equations in six unknowns! In this case it is probably easier to simply use the variation of parameters formula in order to get the particular solution.
3.4 Examples
3.4.1 Mass-spring system
Recall from Section 3.1.2 that the equation for an undamped mass-spring system subject to external forcing is given by
$$my'' + ky = F(t),$$
where $m$ is the mass attached to the end of the spring, $k$ is the spring constant, and $F(t)$ models the external forcing. Upon setting
$$\omega_0^2 = \frac{k}{m}, \quad f(t) = \frac{1}{m}F(t),$$
this system is equivalent to
$$y'' + \omega_0^2 y = f(t).$$
The constant $\omega_0$ is known as the natural frequency for the undamped system. We will solve this system under the assumption that the forcing is sinusoidal, i.e.,
$$f(t) = f_0\cos(\omega t),$$
where $f_0$ is the amplitude of the forcing and $\omega$ is the frequency. In conclusion, we will study the ODE
$$y'' + \omega_0^2 y = f_0\cos(\omega t) \quad\Longleftrightarrow\quad \mathbf{x}' = A\mathbf{x} + \begin{pmatrix} 0 \\ f_0\cos(\omega t) \end{pmatrix}, \quad A = \begin{pmatrix} 0 & 1 \\ -\omega_0^2 & 0 \end{pmatrix}. \tag{3.4.1}$$
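First consider the homogeneous system. The eigenvalues and associated eigenvectors for the matrix $A$ are given by
$$\lambda_1 = i\omega_0,\ \mathbf{v}_1 = \begin{pmatrix} 1 \\ i\omega_0 \end{pmatrix}; \qquad \lambda_2 = -i\omega_0,\ \mathbf{v}_2 = \begin{pmatrix} 1 \\ -i\omega_0 \end{pmatrix}.$$
Consequently, the two linearly independent solutions to the homogeneous system are
$$\mathbf{x}_1(t) = \begin{pmatrix} \cos(\omega_0 t) \\ -\omega_0\sin(\omega_0 t) \end{pmatrix}, \quad \mathbf{x}_2(t) = \begin{pmatrix} \sin(\omega_0 t) \\ \omega_0\cos(\omega_0 t) \end{pmatrix},$$
and the homogeneous solution is given by
$$\mathbf{x}_h(t) = c_1\begin{pmatrix} \cos(\omega_0 t) \\ -\omega_0\sin(\omega_0 t) \end{pmatrix} + c_2\begin{pmatrix} \sin(\omega_0 t) \\ \omega_0\cos(\omega_0 t) \end{pmatrix}. \tag{3.4.2}$$
All solutions to the homogeneous problem are $2\pi/\omega_0$-periodic; hence, the moniker natural frequency. Note that the homogeneous solution to the original problem is
$$y_h(t) = c_1\cos(\omega_0 t) + c_2\sin(\omega_0 t) = \sqrt{c_1^2 + c_2^2}\,\cos(\omega_0 t - \phi), \quad \tan\phi = \frac{c_2}{c_1}.$$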
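Now consider the particular solution. If $\omega \neq \omega_0$ (we will later see what happens when $\omega = \omega_0$), then by the method of undetermined coefficients the particular solution is of the form
$$\mathbf{x}_p(t) = \cos(\omega t)\mathbf{a} + \sin(\omega t)\mathbf{b}.$$
Plugging this guess into the system (3.4.1) and equating the cosine and sine terms yields the pair of linear systems
$$A\mathbf{a} - \omega\mathbf{b} = -\begin{pmatrix} 0 \\ f_0 \end{pmatrix}, \quad \omega\mathbf{a} + A\mathbf{b} = \mathbf{0} \quad\Longrightarrow\quad \begin{pmatrix} A & -\omega I_2 \\ \omega I_2 & A \end{pmatrix}\begin{pmatrix} \mathbf{a} \\ \mathbf{b} \end{pmatrix} = \begin{pmatrix} 0 \\ -f_0 \\ 0 \\ 0 \end{pmatrix}.$$
The solution to this linear system is given by
$$\mathbf{a} = \frac{f_0}{\omega_0^2 - \omega^2}\begin{pmatrix} 1 \\ 0 \end{pmatrix}, \quad \mathbf{b} = \frac{f_0}{\omega_0^2 - \omega^2}\begin{pmatrix} 0 \\ -\omega \end{pmatrix}.$$
It is at this point that we see the assumption $\omega \neq \omega_0$ is necessary; otherwise, our guess for the particular solution would not be correct. We conclude by saying that the particular solution is given by
$$\mathbf{x}_p(t) = \frac{f_0}{\omega_0^2 - \omega^2}\cos(\omega t)\begin{pmatrix} 1 \\ 0 \end{pmatrix} + \frac{f_0}{\omega_0^2 - \omega^2}\sin(\omega t)\begin{pmatrix} 0 \\ -\omega \end{pmatrix} = \frac{f_0}{\omega_0^2 - \omega^2}\begin{pmatrix} \cos(\omega t) \\ -\omega\sin(\omega t) \end{pmatrix}.$$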
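Recalling the homogeneous solution given in (3.4.2), we have that the general solution to (3.4.1) is given by
$$\mathbf{x}(t) = c_1\begin{pmatrix} \cos(\omega_0 t) \\ -\omega_0\sin(\omega_0 t) \end{pmatrix} + c_2\begin{pmatrix} \sin(\omega_0 t) \\ \omega_0\cos(\omega_0 t) \end{pmatrix} + \frac{f_0}{\omega_0^2 - \omega^2}\begin{pmatrix} \cos(\omega t) \\ -\omega\sin(\omega t) \end{pmatrix}.$$
We will now focus solely on the effect of forcing on the solution behavior, which means that we will consider the IVP with initial condition $\mathbf{x}(0) = \mathbf{0}$, i.e., the mass is initially at its equilibrium position and is not moving when the forcing is applied ($y(0) = y'(0) = 0$). It is not difficult to check that it must then be true that
$$c_1 = -\frac{f_0}{\omega_0^2 - \omega^2}, \quad c_2 = 0,$$
so that the solution to the IVP is
$$\mathbf{x}(t) = \frac{f_0}{\omega_0^2 - \omega^2}\begin{pmatrix} \cos(\omega t) - \cos(\omega_0 t) \\ \omega_0\sin(\omega_0 t) - \omega\sin(\omega t) \end{pmatrix}.$$
In terms of the original variables the position of the mass follows
$$y(t) = \frac{f_0}{\omega_0^2 - \omega^2}\left[\cos(\omega t) - \cos(\omega_0 t)\right]. \tag{3.4.3}$$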
Figure 3.10: (color online) A cartoon of the solution to the sinusoidally forced mass-spring for the initial condition $y(0) = y'(0) = 0$. The thick (red) curve is the amplitude of the solution (see Figure 3.11), and the thin (blue) curve is the actual solution curve.
In conclusion, the position of the mass at time $t$ is given by the formula in (3.4.3). Unfortunately, in its current form the solution formula does not give a clear explanation as to the motion of the mass. In particular, what is the amplitude of the oscillation at a particular point in time, and is there periodic motion associated with the oscillations about $y = 0$? Using the trigonometric identity
$$\cos(\theta_1) - \cos(\theta_2) = 2\sin\left(\frac{\theta_2 - \theta_1}{2}\right)\sin\left(\frac{\theta_1 + \theta_2}{2}\right),$$
we can rewrite the solution as
$$y(t) = \frac{2f_0}{\omega_0^2 - \omega^2}\sin\left(\frac{\omega_0 - \omega}{2}t\right)\sin\left(\frac{\omega_0 + \omega}{2}t\right). \tag{3.4.4}$$
The solution is a modulated sinusoid (see Figure 3.10). The amplitude is given by the sinusoidal function
$$A(t) = \frac{2f_0}{\omega_0^2 - \omega^2}\sin\left(\frac{\omega_0 - \omega}{2}t\right).$$
The amplitude function has amplitude
$$A^*(\omega) := \frac{2f_0}{|\omega_0^2 - \omega^2|},$$
and period
$$T^*(\omega) := \frac{4\pi}{|\omega_0 - \omega|}.$$
The plots of these two functions are given in Figure 3.11: the plot of the amplitude is often referred to as an amplitude diagram. Note that both the amplitude and period become arbitrarily large as the forcing frequency approaches the natural frequency, and further note that both become arbitrarily small as the forcing frequency becomes large. In particular, the response of the mass to the forcing becomes negligible when the forcing frequency becomes large. On the other hand, the internal oscillations have period $4\pi/(\omega_0 + \omega)$, which approaches that of the natural frequency as the forcing frequency approaches the natural frequency.
Figure 3.11: (color online) A cartoon of the plot of the amplitude $A^*(\omega)$ is given by the thick (blue) curve in the left figure, and a cartoon of the plot of the period of the amplitude function $T^*(\omega)$ is given in the right figure. The value $\omega = \omega_0$ is marked on the horizontal axis of each plot.
Finally, what is the solution when $\omega = \omega_0$, i.e., when there is resonant forcing? Setting
$$h = \frac{\omega_0 - \omega}{2}t,$$
we can rewrite the amplitude function as
$$A(t) = \frac{f_0 t}{\omega_0 + \omega}\,\frac{\sin(h)}{h}.$$
Since $\omega \to \omega_0$ implies that $h \to 0$, we have
$$\lim_{\omega \to \omega_0} A(t) = \frac{f_0 t}{2\omega_0}\lim_{h \to 0}\frac{\sin(h)}{h} = \frac{f_0 t}{2\omega_0}.$$
Thus, for the solution as written in (3.4.4) we have
$$\lim_{\omega \to \omega_0} y(t) = \lim_{\omega \to \omega_0} A(t) \cdot \lim_{\omega \to \omega_0}\sin\left(\frac{\omega_0 + \omega}{2}t\right) = \frac{f_0 t}{2\omega_0}\sin(\omega_0 t).$$
The amplitude of the sinusoidal wave, $f_0 t/(2\omega_0)$, becomes large as $t$ increases, which leads to an eventual failure of the spring. Of course, in an actual system the forcing need not be exactly at resonance in order for the spring to fail. If $\omega$ is chosen so that $A^*(\omega)$ is larger than the spring can stretch and still recover, then the spring will fail. Referring again to Figure 3.11, we then see that for a given spring there is a number $\delta$ such that if the forcing frequency satisfies $|\omega - \omega_0| < \delta$, then the spring will fail.
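The resonance limit can be observed numerically by evaluating (3.4.3) at a forcing frequency very close to $\omega_0$ and comparing it with $f_0 t\sin(\omega_0 t)/(2\omega_0)$. This is a minimal sketch; the function names and the sample values $\omega_0 = 2$, $f_0 = 1$ and detuning $10^{-5}$ are illustrative assumptions.

```python
import math

def forced_response(t, omega, omega0, f0):
    """Formula (3.4.3); only valid for omega != omega0."""
    return f0 / (omega0**2 - omega**2) * (math.cos(omega * t) - math.cos(omega0 * t))

def resonant_response(t, omega0, f0):
    """Limit of (3.4.3) as omega -> omega0."""
    return f0 * t * math.sin(omega0 * t) / (2.0 * omega0)

omega0, f0 = 2.0, 1.0                         # illustrative parameters
times = [0.5 * k for k in range(1, 11)]       # t = 0.5, 1.0, ..., 5.0
gap = max(abs(forced_response(t, omega0 + 1e-5, omega0, f0)
              - resonant_response(t, omega0, f0)) for t in times)
print(gap)  # small, and it shrinks further as the detuning is reduced
```

On a fixed time window the two formulas agree to within a multiple of the detuning, which is the numerical counterpart of the limit computed above.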
Figure 3.12: (color online) A cartoon which depicts the coupled mass-spring problem. Two masses $m$, each attached to a spring with constant $k$, are joined by a spring with constant $k_1$; the displacements are $y_1$ and $y_2$.
3.4.2 Coupled oscillators
Consider the coupled mass-spring system depicted in Figure 3.12. Here we are taking two identical mass-spring systems (spring constant $k$), and coupling them via a spring with a different spring constant (spring constant $k_1$). While we will not go through the derivation of the equations of motion here, it turns out that the model ODE is given by
$$my_1'' = -ky_1 + k_1(y_2 - y_1), \quad my_2'' = -ky_2 - k_1(y_2 - y_1). \tag{3.4.5}$$
Upon making the typical substitutions of
$$x_1 = y_1, \quad x_2 = y_2, \quad x_3 = y_1', \quad x_4 = y_2',$$
this system of second-order coupled ODEs is equivalent to the first-order system
$$\mathbf{x}' = \begin{pmatrix} 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 1 \\ -\omega_+^2 & \omega_-^2 & 0 & 0 \\ \omega_-^2 & -\omega_+^2 & 0 & 0 \end{pmatrix}\mathbf{x}; \quad \omega_+^2 = \frac{k + k_1}{m}, \quad \omega_-^2 = \frac{k_1}{m}.$$
We could solve this system in the usual manner by finding the eigenvalues and associated eigenvectors for the matrix, and then using the solution formulas. However, we can exploit the fact that the middle spring is coupling two identical mass-spring systems to simplify the equations of motion, and consequently the solution formula.
If we set
$$x_1 = y_1 + y_2, \quad x_2 = y_1 - y_2,$$
then the coupled system (3.4.5) becomes the uncoupled system
$$\begin{aligned} mx_1'' &= -kx_1 \\ mx_2'' &= -(k + 2k_1)x_2 \end{aligned} \quad\Longrightarrow\quad \begin{aligned} x_1'' + \omega_0^2 x_1 &= 0, & \omega_0^2 &= \frac{k}{m} \\ x_2'' + \omega_1^2 x_2 &= 0, & \omega_1^2 &= \frac{k + 2k_1}{m} \end{aligned}$$
(this type of transformation is sometimes called finding the normal coordinates for the system, and it is equivalent to making a change of variables so that the original matrix has been diagonalized). Note that $k_1 > 0$ implies that $\omega_1 > \omega_0$, and further note that $\omega_0$ is the natural frequency associated with the uncoupled mass-spring system. These two problems were solved in the previous example, and the general solution is given by
$$x_1(t) = c_1\cos(\omega_0 t) + c_2\sin(\omega_0 t), \quad x_2(t) = c_3\cos(\omega_1 t) + c_4\sin(\omega_1 t).$$
It is not difficult to check that the solution to a generic IVP is
$$x_1(t) = x_1(0)\cos(\omega_0 t) + \frac{x_1'(0)}{\omega_0}\sin(\omega_0 t), \quad x_2(t) = x_2(0)\cos(\omega_1 t) + \frac{x_2'(0)}{\omega_1}\sin(\omega_1 t).$$
In order to better understand the solution behavior, suppose that we choose the initial conditions
$$y_1(0) = y_1'(0) = 0; \quad y_2(0) = F,\ y_2'(0) = 0.$$
In other words, suppose that the top mass is started at its equilibrium position, and the bottom mass is moved $F$ units, but is given no initial velocity. In the variables $x_1, x_2$ this is the initial condition
$$x_1(0) = F, \quad x_2(0) = -F, \quad x_1'(0) = x_2'(0) = 0,$$
and the solution to the IVP becomes
$$x_1(t) = F\cos(\omega_0 t), \quad x_2(t) = -F\cos(\omega_1 t).$$
Since
$$y_1 = \frac{1}{2}(x_1 + x_2), \quad y_2 = \frac{1}{2}(x_1 - x_2),$$
the solution in the original variables is
$$\begin{aligned} y_1(t) &= \frac{1}{2}F\left[\cos(\omega_0 t) - \cos(\omega_1 t)\right] = F\sin\left(\frac{\omega_1 - \omega_0}{2}t\right)\sin\left(\frac{\omega_1 + \omega_0}{2}t\right) \\ y_2(t) &= \frac{1}{2}F\left[\cos(\omega_0 t) + \cos(\omega_1 t)\right] = F\cos\left(\frac{\omega_1 - \omega_0}{2}t\right)\cos\left(\frac{\omega_1 + \omega_0}{2}t\right). \end{aligned}$$
The second equality follows from the identity
$$\cos(\theta_1) + \cos(\theta_2) = 2\cos\left(\frac{\theta_1 - \theta_2}{2}\right)\cos\left(\frac{\theta_1 + \theta_2}{2}\right).$$
The solutions for $y_1(t)$ and $y_2(t)$ are now in the form described by the forced mass-spring problem, and the plot for each one will be similar to that described in Figure 3.10. The time-dependent amplitude for each mass is given by
$$A_1(t) = F\sin\left(\frac{\omega_1 - \omega_0}{2}t\right), \quad A_2(t) = F\cos\left(\frac{\omega_1 - \omega_0}{2}t\right).$$
Since for any $\omega$
$$\cos(\omega t) = \sin(\omega t + \pi/2) = \sin\left(\omega\left(t + \frac{\pi}{2\omega}\right)\right),$$
we have that
$$A_1\left(t + \frac{\pi}{\omega_1 - \omega_0}\right) = A_2(t);$$
in other words, the amplitude for $y_1(t)$ is $\pi/(\omega_1 - \omega_0)$ out-of-phase with that for $y_2(t)$. Since this phase difference is precisely one-fourth the period of the amplitude functions, which is given by
$$T = \frac{4\pi}{\omega_1 - \omega_0},$$
we now see that when one amplitude is at a maximum or minimum, the other must be at a zero (see Figure 3.13). This explains the alternating of the beating phenomena seen in the computer simulations (e.g., see Applet: Coupled Oscillators).
Figure 3.13: (color online) A cartoon which depicts the amplitudes $A_1(t)$ and $A_2(t)$ associated with the motion for each mass. The amplitude for the top mass is denoted by the (red) solid curve, and that for the bottom mass is given by the (blue) dashed curve.
Finally, this alternating beating phenomena is most pronounced when the period of the amplitude functions is large. The period will clearly be large when $\omega_1 - \omega_0$ is small. Using the Taylor approximation
$$\sqrt{1 + x} = 1 + \frac{1}{2}x + \cdots$$
yields that
$$\omega_1 = \sqrt{\frac{k + 2k_1}{m}} = \sqrt{\frac{k}{m}}\sqrt{1 + \frac{2k_1}{k}} = \omega_0\left(1 + \frac{k_1}{k} + \cdots\right),$$
which in turn implies that
$$\omega_1 - \omega_0 = \omega_0\frac{k_1}{k} + \cdots.$$
Thus, we see that $\omega_1 - \omega_0$ will be small when $k_1$ is small. In conclusion, the beating phenomena will be most pronounced when the spring coupling the two mass-spring systems is weak. Again, this is seen in the simulations.
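The product formulas for $y_1$ and $y_2$ can be tested directly against the coupled equations (3.4.5). The sketch below is illustrative only: the parameter values ($k = m = F = 1$, $k_1 = 0.05$) are arbitrary choices for a weak coupling spring, and second derivatives are approximated by central differences, so the residual is small but not zero.

```python
import math

k, k1, m, F = 1.0, 0.05, 1.0, 1.0            # illustrative values, weak coupling
w0 = math.sqrt(k / m)
w1 = math.sqrt((k + 2 * k1) / m)

def y1(t):
    return F * math.sin((w1 - w0) * t / 2) * math.sin((w1 + w0) * t / 2)

def y2(t):
    return F * math.cos((w1 - w0) * t / 2) * math.cos((w1 + w0) * t / 2)

def second_derivative(y, t, h=1e-4):
    """Central-difference approximation of y''(t)."""
    return (y(t + h) - 2 * y(t) + y(t - h)) / h**2

def residual(t):
    """Largest mismatch in the two equations of (3.4.5) at time t."""
    r1 = m * second_derivative(y1, t) - (-k * y1(t) + k1 * (y2(t) - y1(t)))
    r2 = m * second_derivative(y2, t) - (-k * y2(t) - k1 * (y2(t) - y1(t)))
    return max(abs(r1), abs(r2))

worst = max(residual(t) for t in (0.0, 1.0, 7.5, 40.0))
beat_period = 4 * math.pi / (w1 - w0)        # period of the amplitude functions
approx_gap = w0 * k1 / k                     # leading Taylor term for w1 - w0
print(worst, beat_period, w1 - w0, approx_gap)
```

With these values $\omega_1 - \omega_0 \approx 0.0488$ while the Taylor estimate gives $0.05$, so the amplitude period is roughly 250 time units, much longer than the period $2\pi/\omega_0$ of the fast oscillation.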
3.4.3 Two-tank mixing problem
Consider the two-tank mixing problem illustrated in Figure 3.14. Arguing as in Section 3.1.1, if we let $x_1$ denote the pounds of salt in tank A and $x_2$ the pounds of salt in tank B, then the governing equation is given by the IVP
$$\mathbf{x}' = A\mathbf{x} + \begin{pmatrix} 3c(t) \\ 0 \end{pmatrix}, \quad \mathbf{x}(0) = \begin{pmatrix} x_1(0) \\ x_2(0) \end{pmatrix}; \quad A = \begin{pmatrix} -5/200 & 2/300 \\ 5/200 & -5/300 \end{pmatrix}. \tag{3.4.6}$$
Figure 3.14: (color online) A cartoon which depicts a two-tank mixing problem in which the incoming concentration $c(t)$ is not necessarily constant. Tank A holds 200 gal and tank B holds 300 gal; solution with concentration $c(t)$ lb/gal enters tank A at 3 gal/min, flows from tank A to tank B at 5 gal/min, returns from tank B to tank A at 2 gal/min, and leaves tank B at 3 gal/min.
We will be interested in the concentrations in each tank for large time.
First consider the homogeneous system (the calculations are done using the applet Matrix Calculator). The eigenvalues and associated eigenvectors for the matrix $A$ are given by
$$\lambda_1 = -0.0344,\ \mathbf{v}_1 = \begin{pmatrix} -0.5785 \\ 0.8157 \end{pmatrix}; \qquad \lambda_2 = -0.0073,\ \mathbf{v}_2 = \begin{pmatrix} 0.4247 \\ 1.1297 \end{pmatrix}.$$
Since both eigenvalues are negative, the homogeneous solution $\mathbf{x}_h(t)$ satisfies $\lim_{t \to +\infty}\mathbf{x}_h(t) = \mathbf{0}$. Consequently, since the initial condition is taken into account via the arbitrary constants $c_1$ and $c_2$ associated with the homogeneous solution, the concentration in each tank for large times will be described by the particular solution, and will not depend on the initial concentration in each tank.
First suppose that $c(t) = c_0$, i.e., the incoming concentration is constant. Using the method of undetermined coefficients we know that the particular solution will be of the form
$$\mathbf{x}_p(t) = \mathbf{a} \quad\Longrightarrow\quad A\mathbf{a} + \begin{pmatrix} 3c_0 \\ 0 \end{pmatrix} = \mathbf{0}.$$
Solving the linear system yields
$$\mathbf{x}_p(t) = c_0\begin{pmatrix} 200 \\ 300 \end{pmatrix}.$$
In other words, the concentration in each tank after a long period of time is $c_0$, which is the expected result.
Now suppose that the incoming concentration is sinusoidal with average $c_0$, i.e., $c(t) = c_0(1 + 0.5\sin(t))$. Using the above result we know that the particular solution will be of the form
$$\mathbf{x}_p(t) = c_0\begin{pmatrix} 200 \\ 300 \end{pmatrix} + \cos(t)\mathbf{a} + \sin(t)\mathbf{b}.$$
Figure 3.15: (color online) A cartoon which as $t \to +\infty$ depicts the concentration in tank A, $c_A(t)$, and tank B, $c_B(t)$, when the incoming concentration is given by $c(t) = c_0(1 + 0.5\sin(t))$. The curves are not to scale. The horizontal dashed line associated with each curve is a plot of its mean.
Arguing in the usual way, and noting that the sinusoidal part of the forcing term $3c(t)$ is $1.5c_0\sin(t)$, the undetermined vectors are found by solving the system
$$\begin{pmatrix} A & -I_2 \\ I_2 & A \end{pmatrix}\begin{pmatrix} \mathbf{a} \\ \mathbf{b} \end{pmatrix} = \begin{pmatrix} 0 \\ 0 \\ -1.5c_0 \\ 0 \end{pmatrix} \quad\Longrightarrow\quad \mathbf{a} = c_0\begin{pmatrix} -1.4988 \\ -0.0016 \end{pmatrix}, \quad \mathbf{b} = c_0\begin{pmatrix} 0.0375 \\ -0.0374 \end{pmatrix}.$$
Collecting terms and simplifying, we see then that the particular solution is given by
$$\mathbf{x}_p(t) = c_0\begin{pmatrix} 200 - 1.4988\cos(t) + 0.0375\sin(t) \\ 300 - 0.0016\cos(t) - 0.0374\sin(t) \end{pmatrix}.$$
In order to better interpret the solution, we will now use the identity
$$a\cos(t) + b\sin(t) = \sqrt{a^2 + b^2}\,\cos(t - \phi), \quad \tan\phi = \frac{b}{a}.$$
After some simplification this identity yields
$$\begin{aligned} -1.4988\cos(t) + 0.0375\sin(t) &= 1.4993\cos(t - 0.9920\pi) \\ -0.0016\cos(t) - 0.0374\sin(t) &= 0.0375\cos(t - 1.4867\pi); \end{aligned}$$
thus, we can rewrite the particular solution as
$$\mathbf{x}_p(t) = c_0\begin{pmatrix} 200 + 1.4993\cos(t - 0.9920\pi) \\ 300 + 0.0375\cos(t - 1.4867\pi) \end{pmatrix}.$$
After a long time the concentration in each tank is then
$$c_A(t) = c_0\left(1 + 0.0075\cos(t - 0.9920\pi)\right), \quad c_B(t) = c_0\left(1 + 0.000125\cos(t - 1.4867\pi)\right).$$
Thus, we see that tank A, which is closer to the source of the incoming concentration, experiences a roughly 60 times greater fluctuation in its concentration than does tank B. Furthermore, there is roughly a $\pi/2$, i.e., one-fourth the period of the incoming concentration, phase difference between the concentrations in the two tanks. This implies, for example, that when the concentration in tank A is at a minimum or maximum, i.e., $c_A(t) = c_0(1 \pm 0.0075)$, then the concentration in tank B will be at its mean level $c_0$ (see Figure 3.15).
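The undetermined vectors for the sinusoidal inflow can be recomputed with a few lines of arithmetic. The sketch below rests on stated assumptions: $c_0 = 1$, the forcing vector is $(3c(t), 0)^T$ so that its $\sin(t)$ part is $(1.5, 0)^T$, and the block system is reduced by hand to $\mathbf{b} = A\mathbf{a}$ and $(I_2 + A^2)\mathbf{a} = -(1.5, 0)^T$ before being solved with Cramer's rule. All helper names are hypothetical.

```python
import math

A = [[-5 / 200, 2 / 300], [5 / 200, -5 / 300]]
g = [1.5, 0.0]   # sin(t) part of the forcing (3 c(t), 0) with c0 = 1

def matmul(M, N):
    return [[sum(M[i][k] * N[k][j] for k in range(2)) for j in range(2)] for i in range(2)]

def matvec(M, x):
    return [sum(M[i][j] * x[j] for j in range(2)) for i in range(2)]

# From A a - b = 0 and a + A b = -g:  b = A a  and  (I + A^2) a = -g
A2 = matmul(A, A)
M = [[1 + A2[0][0], A2[0][1]], [A2[1][0], 1 + A2[1][1]]]
r = [-g[0], -g[1]]
det = M[0][0] * M[1][1] - M[0][1] * M[1][0]
a = [(r[0] * M[1][1] - M[0][1] * r[1]) / det,      # Cramer's rule
     (M[0][0] * r[1] - M[1][0] * r[0]) / det]
b = matvec(A, a)

# Amplitude and phase of a_i cos(t) + b_i sin(t) for each tank
amp_A = math.hypot(a[0], b[0]) / 200    # relative fluctuation of c_A
amp_B = math.hypot(a[1], b[1]) / 300    # relative fluctuation of c_B
phase_A = math.atan2(b[0], a[0]) % (2 * math.pi)
phase_B = math.atan2(b[1], a[1]) % (2 * math.pi)
ratio = amp_A / amp_B
print(a, b, amp_A, amp_B, ratio, (phase_B - phase_A) / math.pi)
```

The printed ratio is close to 60 and the phase difference is close to $\pi/2$, which are the two qualitative conclusions drawn in the text.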
4 Scalar higher-order linear differential equations
4.1 Connection with first-order systems
In the previous chapter we learned how to find the general solution to systems of ODEs of the form $\mathbf{x}' = A\mathbf{x} + \mathbf{f}(t)$. In particular, we used the eigenvalues and eigenvectors of $A$ in order to construct the homogeneous solution, and then we constructed the particular solution using either the method of undetermined coefficients, or variation of parameters. We further learned how to convert scalar higher-order ODEs of the form
$$y^{(n)} + a_{n-1}y^{(n-1)} + \cdots + a_1 y' + a_0 y = f(t) \tag{4.1.1}$$
into a first-order system: upon setting
$$x_1 = y, \quad x_2 = y', \quad x_3 = y'', \quad \dots, \quad x_n = y^{(n-1)}, \tag{4.1.2}$$
we get
$$\mathbf{x}' = \underbrace{\begin{pmatrix} 0 & 1 & 0 & 0 & \cdots & 0 \\ 0 & 0 & 1 & 0 & \cdots & 0 \\ 0 & 0 & 0 & 1 & \cdots & 0 \\ \vdots & \vdots & \vdots & \vdots & \ddots & \vdots \\ 0 & 0 & 0 & 0 & \cdots & 1 \\ -a_0 & -a_1 & -a_2 & -a_3 & \cdots & -a_{n-1} \end{pmatrix}}_{A}\mathbf{x} + \underbrace{\begin{pmatrix} 0 \\ 0 \\ 0 \\ \vdots \\ 0 \\ f(t) \end{pmatrix}}_{\mathbf{f}(t)}. \tag{4.1.3}$$
It should be noted that (4.1.1) arises in many physical contexts. For example, if $n = 2$ it is a model used in both circuit theory and the damped-forced mass spring, if $n = 3$ it is a model used in the theory of water waves, and if $n = 4$ it is a model used to study the deflection of a beam from its equilibrium state.
We now proceed to solve the system (4.1.3). First consider the homogeneous problem, i.e.,
$$\mathbf{x}' = A\mathbf{x} \quad\Longleftrightarrow\quad y^{(n)} + a_{n-1}y^{(n-1)} + \cdots + a_1 y' + a_0 y = 0. \tag{4.1.4}$$
While we will not prove it here, the characteristic polynomial is given by
$$p_A(\lambda) = \lambda^n + a_{n-1}\lambda^{n-1} + \cdots + a_1\lambda + a_0; \tag{4.1.5}$$
in other words, it is simply the ODE (4.1.1) with the derivative of $k$th-order being replaced by the polynomial $\lambda^k$. The interested student should verify this statement for the case of $n = 2$. For the case $n = 2$ it is easy to find the zeros of the characteristic polynomial via the quadratic formula; otherwise, we can find the zeros numerically. It further turns out to be the case that an eigenvector associated with an eigenvalue $\lambda = \lambda_0$ is given by
$$\lambda = \lambda_0, \quad \mathbf{v} = \begin{pmatrix} 1 \\ \lambda_0 \\ \lambda_0^2 \\ \vdots \\ \lambda_0^{n-2} \\ \lambda_0^{n-1} \end{pmatrix}. \tag{4.1.6}$$
With this form of the eigenvector, we now have the following results. If $\lambda_0$ is real-valued, then the associated solution to the vector homogeneous system (4.1.4) is given by
$$\mathbf{x}(t) = ce^{\lambda_0 t}\begin{pmatrix} 1 \\ \lambda_0 \\ \lambda_0^2 \\ \vdots \\ \lambda_0^{n-2} \\ \lambda_0^{n-1} \end{pmatrix}.$$
Using the transformation (4.1.2), this means that a solution to the scalar homogeneous system (4.1.4) is given by
$$y(t) = ce^{\lambda_0 t}.$$
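The claims about the matrix in (4.1.3), namely the form of its characteristic polynomial and of its eigenvectors, can be spot-checked on a small case. The sketch below uses an illustrative third-order equation, $y''' - 6y'' + 11y' - 6y = 0$, whose characteristic polynomial factors as $(\lambda - 1)(\lambda - 2)(\lambda - 3)$; the equation and the helper names `companion` and `char_poly` are assumptions made for the example, not taken from the notes.

```python
def companion(coeffs):
    """Matrix A of (4.1.3) for coeffs = [a0, a1, ..., a_{n-1}]."""
    n = len(coeffs)
    A = [[0.0] * n for _ in range(n)]
    for i in range(n - 1):
        A[i][i + 1] = 1.0               # ones on the superdiagonal
    A[n - 1] = [-c for c in coeffs]     # last row: -a0, -a1, ..., -a_{n-1}
    return A

def char_poly(coeffs, lam):
    """p(lam) = lam^n + a_{n-1} lam^{n-1} + ... + a1 lam + a0."""
    n = len(coeffs)
    return lam**n + sum(c * lam**k for k, c in enumerate(coeffs))

coeffs = [-6.0, 11.0, -6.0]   # a0, a1, a2 for y''' - 6y'' + 11y' - 6y = 0
A = companion(coeffs)
n = len(coeffs)

checks = []
for lam in (1.0, 2.0, 3.0):
    v = [lam**k for k in range(n)]                       # (1, lam, lam^2) as in (4.1.6)
    Av = [sum(A[i][j] * v[j] for j in range(n)) for i in range(n)]
    is_eigenvector = all(abs(Av[i] - lam * v[i]) < 1e-12 for i in range(n))
    checks.append((char_poly(coeffs, lam), is_eigenvector))
print(checks)
```

The first $n - 1$ rows of $A\mathbf{v} = \lambda_0\mathbf{v}$ hold automatically because of the shift structure; the last row is exactly the statement $p(\lambda_0) = 0$.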
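Now suppose that the eigenvalue is complex-valued, i.e., $\lambda_0 = a + ib$. The associated eigenvector will then be of the form
$$\mathbf{v} = \begin{pmatrix} 1 \\ \lambda_0 \\ \lambda_0^2 \\ \vdots \\ \lambda_0^{n-2} \\ \lambda_0^{n-1} \end{pmatrix} = \begin{pmatrix} 1 \\ p_1 \\ p_2 \\ \vdots \\ p_{n-2} \\ p_{n-1} \end{pmatrix} + i\begin{pmatrix} 0 \\ q_1 \\ q_2 \\ \vdots \\ q_{n-2} \\ q_{n-1} \end{pmatrix},$$
where we are using the notation
$$\lambda_0^k = p_k + iq_k, \quad k = 1, \dots, n - 1.$$
The two solutions associated with this eigenvalue are then
$$\mathbf{x}_1(t) = c_1 e^{at}\left(\cos(bt)\begin{pmatrix} 1 \\ p_1 \\ p_2 \\ \vdots \\ p_{n-2} \\ p_{n-1} \end{pmatrix} - \sin(bt)\begin{pmatrix} 0 \\ q_1 \\ q_2 \\ \vdots \\ q_{n-2} \\ q_{n-1} \end{pmatrix}\right), \quad \mathbf{x}_2(t) = c_2 e^{at}\left(\sin(bt)\begin{pmatrix} 1 \\ p_1 \\ p_2 \\ \vdots \\ p_{n-2} \\ p_{n-1} \end{pmatrix} + \cos(bt)\begin{pmatrix} 0 \\ q_1 \\ q_2 \\ \vdots \\ q_{n-2} \\ q_{n-1} \end{pmatrix}\right).$$
Again using the transformation (4.1.2) we see that two solutions to the scalar homogeneous system (4.1.4) are given by
$$y_1(t) = c_1 e^{at}\cos(bt), \quad y_2(t) = c_2 e^{at}\sin(bt).$$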
Finally, although we will not prove it here, it turns out to be the case that if $\lambda = \lambda_0$ is a real zero of order $2 \le k \le n$, then in addition to the expected solution $e^{\lambda_0 t}$ there will be the solutions $te^{\lambda_0 t}, t^2 e^{\lambda_0 t}, \dots, t^{k-1}e^{\lambda_0 t}$.
Theorem 4.1.1. Consider the scalar $n$th-order homogeneous ODE
$$y^{(n)} + a_{n-1}y^{(n-1)} + \cdots + a_1 y' + a_0 y = 0.$$
The characteristic polynomial is given by
$$p(\lambda) = \lambda^n + a_{n-1}\lambda^{n-1} + \cdots + a_1\lambda + a_0.$$
If $\lambda_0$ is a real root of order $k$ of the characteristic polynomial, then the associated solution to the ODE is given by
$$y(t) = (c_1 + c_2 t + c_3 t^2 + \cdots + c_k t^{k-1})e^{\lambda_0 t},$$
where the constants $c_1, c_2, \dots, c_k$ are arbitrary. On the other hand, the general solution associated with the (simple) complex-valued roots $\lambda_0 = a \pm ib$ is given by
$$y(t) = c_1 e^{at}\cos(bt) + c_2 e^{at}\sin(bt),$$
where the constants $c_1, c_2$ are arbitrary.
Remark 4.1.2. If there is a complex root $\lambda_0 = a + ib$ of order $k$ to the characteristic polynomial, then the associated general solution is given by
$$y(t) = (c_1 + c_2t + \cdots + c_kt^{k-1})e^{at}\cos(bt) + (c_{k+1} + c_{k+2}t + \cdots + c_{2k}t^{k-1})e^{at}\sin(bt),$$
where all of the constants are arbitrary. These types of solutions will not appear, however, if $n = 2, 3$, since a repeated complex-conjugate pair requires $n \ge 4$.
As we can see from Theorem 4.1.1, the most difficult part in solving the homogeneous problem is factoring the characteristic polynomial. We now introduce some notation which will help us to write an $n$th-order ODE in factored form. Set
$$D := \frac{d}{dt} \implies D^k = \frac{d^k}{dt^k}, \quad k = 1, 2, \dots.$$
The homogeneous problem
$$y^{(n)} + a_{n-1}y^{(n-1)} + \cdots + a_1y' + a_0y = 0$$
can then be rewritten as
$$(D^n + a_{n-1}D^{n-1} + \cdots + a_1D + a_0)y = 0.$$
Note that the characteristic polynomial is found simply via the substitution $D \mapsto \lambda$. Suppose that the characteristic polynomial has been factored, i.e.,
$$p(\lambda) = \lambda^n + a_{n-1}\lambda^{n-1} + \cdots + a_1\lambda + a_0 = (\lambda - \lambda_1)(\lambda - \lambda_2)\cdots(\lambda - \lambda_n),$$
where $\lambda_1, \lambda_2, \dots, \lambda_n$ are the roots. Upon replacing $\lambda$ with $D$ in the factored form of the characteristic polynomial, it is not difficult to check that
$$(D^n + a_{n-1}D^{n-1} + \cdots + a_1D + a_0)y = (D - \lambda_1)(D - \lambda_2)\cdots(D - \lambda_n)y$$
(here we implicitly use the fact that the coefficients are constants). This factored form of the ODE allows us to easily read off the roots of the characteristic polynomial, and thus solve ODEs of large order without doing the difficult work of factoring a polynomial of large order.
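As a quick check of the factorization claim, the case $n = 2$ can be written out in full. This is only a worked sketch; it uses nothing beyond the fact that the roots $\lambda_1, \lambda_2$ are constants, so that $D$ commutes with multiplication by them.

```latex
\begin{align*}
(D - \lambda_1)(D - \lambda_2)y
  &= (D - \lambda_1)\,(y' - \lambda_2 y) \\
  &= y'' - \lambda_2 y' - \lambda_1 y' + \lambda_1\lambda_2\, y \\
  &= y'' - (\lambda_1 + \lambda_2)\,y' + \lambda_1\lambda_2\, y .
\end{align*}
```

Comparing with $\lambda^2 + a_1\lambda + a_0 = (\lambda - \lambda_1)(\lambda - \lambda_2)$ gives $a_1 = -(\lambda_1 + \lambda_2)$ and $a_0 = \lambda_1\lambda_2$, so the right-hand side is exactly $(D^2 + a_1D + a_0)y$.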
For our first example, suppose that the ODE is
$$(D - 4)(D - 5)(D + 3)^2y = 0.$$
This is a 4th-order ODE, and the associated characteristic polynomial is
$$p(\lambda) = (\lambda - 4)(\lambda - 5)(\lambda + 3)^2.$$
Upon using Theorem 4.1.1 we see that the general solution is
$$y(t) = c_1e^{4t} + c_2e^{5t} + e^{-3t}(c_3 + c_4t).$$
For our second example, suppose that the ODE is
$$(D + 1)(D - 6)^3(D^2 + 4D + 13)y = 0.$$
This is a 6th-order ODE, and the associated characteristic polynomial is
$$p(\lambda) = (\lambda + 1)(\lambda - 6)^3(\lambda^2 + 4\lambda + 13).$$
Since
$$\lambda^2 + 4\lambda + 13 = 0 \implies \lambda = -2 \pm 3i,$$
upon using Theorem 4.1.1 we see that the general solution is
$$y(t) = c_1e^{-t} + e^{6t}(c_2 + c_3t + c_4t^2) + c_5e^{-2t}\cos(3t) + c_6e^{-2t}\sin(3t).$$
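The bookkeeping in these two examples (one basis function per root, counted with multiplicity) can be mechanized. The following is a minimal sketch, not part of the notes: the function name `basis_solutions` and its string output format are illustrative assumptions, and the roots must be supplied already factored.

```python
def basis_solutions(real_roots, complex_roots):
    """Return the basis functions of Theorem 4.1.1 / Remark 4.1.2 as strings.

    real_roots:    list of (root, multiplicity) pairs for real roots.
    complex_roots: list of (a, b, multiplicity) triples, one per conjugate
                   pair a +/- ib with b > 0.
    """
    basis = []
    for root, mult in real_roots:
        for j in range(mult):
            basis.append(f"t^{j} e^({root}t)")
    for a, b, mult in complex_roots:
        for j in range(mult):
            basis.append(f"t^{j} e^({a}t) cos({b}t)")
            basis.append(f"t^{j} e^({a}t) sin({b}t)")
    return basis


# Second example: (D + 1)(D - 6)^3 (D^2 + 4D + 13) y = 0
second = basis_solutions([(-1, 1), (6, 3)], [(-2, 3, 1)])
for term in second:
    print(term)
```

The number of strings returned equals the order of the ODE, which is a convenient sanity check on the multiplicities.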
[Figure 4.1: the $a_0a_1$-plane, showing the curve $a_0 = a_1^2/4$ and the regions labeled "Distinct real-valued roots" and "Complex-valued roots".]
Figure 4.1: (color online) The type of the zeros of the characteristic polynomial for scalar second-order ODEs in the $a_0a_1$-plane. The thick (red) curve denotes the transition between real-valued zeros and complex-valued zeros.
Now let us apply the result of Theorem 4.1.1 to the special case that $n = 2$, i.e., let us consider the general solution to the second-order ODE
$$y'' + a_1y' + a_0y = 0.$$
The goal is to classify the behavior of solutions in a very general manner. This classification is not only of interest in its own right, but it is crucial in understanding the dynamics associated with nonlinear ODEs (the topic of Math 331). The characteristic polynomial is given by
$$p(\lambda) = \lambda^2 + a_1\lambda + a_0,$$
which has the roots
$$\lambda_\pm = \frac12\left(-a_1 \pm \sqrt{a_1^2 - 4a_0}\right).$$
The roots are real-valued and distinct if $a_1^2 - 4a_0 > 0$ with $\lambda_- < \lambda_+$, complex-valued if $a_1^2 - 4a_0 < 0$, and are a double root if $a_1^2 - 4a_0 = 0$ with $\lambda_- = \lambda_+ = -a_1/2$ (see Figure 4.1). Now let us further refine this picture. If $a_0 < 0$, then it will be the case that $\lambda_- < 0 < \lambda_+$, i.e., the origin $x = 0$ is a saddle point for the vectorized system $x' = Ax$. Now suppose that $a_0 > 0$. If $a_1^2 - 4a_0 > 0$, it will then be the case that the real-valued zeros $\lambda_-$ and $\lambda_+$ will have the same sign; furthermore, they will have the sign of $-a_1$. Thus, in this case $x = 0$ will be a stable node if $a_1 > 0$, and an unstable node if $a_1 < 0$. Now assume that $a_1^2 - 4a_0 < 0$, so that the zeros are complex-valued. The real part of the zero is given by $-a_1/2$; hence, if $a_1 < 0$ the origin is an unstable spiral, if $a_1 > 0$ the origin is a stable spiral, and if $a_1 = 0$ the origin is a center (see Figure 4.2).
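The case analysis above translates directly into a short decision procedure. The sketch below is illustrative only: the function name `classify_origin` is an assumption, and the labels returned for the borderline cases $a_0 = 0$ and $a_1^2 = 4a_0$, which the discussion does not name, are this sketch's own choice.

```python
def classify_origin(a1, a0):
    """Classify x = 0 for y'' + a1*y' + a0*y = 0 following the text.

    The cases a0 == 0 and a1**2 == 4*a0 are not named in the discussion
    above; the labels returned for them are this sketch's own choice.
    """
    disc = a1 * a1 - 4 * a0
    if a0 < 0:
        return "saddle point"
    if a0 == 0:
        return "degenerate (zero root)"
    if disc > 0:
        # Both roots are real with the sign of -a1 (a1 != 0 here).
        return "stable node" if a1 > 0 else "unstable node"
    if disc == 0:
        return "repeated root, stable" if a1 > 0 else "repeated root, unstable"
    # Complex roots with real part -a1/2.
    if a1 > 0:
        return "stable spiral"
    if a1 < 0:
        return "unstable spiral"
    return "center"


for a1, a0 in [(4, 3), (-4, 3), (1, -2), (2, 5), (-2, 5), (0, 1)]:
    print((a1, a0), classify_origin(a1, a0))
```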
[Figure 4.2: the $a_0a_1$-plane with the curve $a_0 = a_1^2/4$, and a phase portrait sketched in each region.]
Figure 4.2: (color online) A cartoon of the dynamics in the $x_1x_2$-plane, where $x_1 = y$, $x_2 = y'$, for the second-order ODE in each portion of the $a_0a_1$-plane.
For the sake of clarity we will focus here only on the case of $n = 2$, i.e., we will consider the ODE
$$y'' + a_1y' + a_0y = f(t). \tag{4.1.7}$$
We know that the general solution is given by $y(t) = y_h(t) + y_p(t)$, and the construction of the homogeneous solution was discussed in Theorem 4.1.1. We now need to find the particular solution.
Let us first consider the method of variation of parameters. In order to derive the particular solution using this method, it is best if we first consider the vectorized form of (4.1.7), i.e.,
$$x' = \begin{pmatrix} 0 & 1 \\ -a_0 & -a_1 \end{pmatrix}x + \begin{pmatrix} 0 \\ f(t) \end{pmatrix}. \tag{4.1.8}$$
In this manner we can use the result of Lemma 3.2.2, which states that the particular solution is given by
$$x_p(t) = \Phi(t)\int^t \Phi(s)^{-1}\begin{pmatrix} 0 \\ f(s) \end{pmatrix} ds, \tag{4.1.9}$$
where $\Phi(t)$ is the matrix-valued solution to the homogeneous problem. Now, by using Theorem 4.1.1 we can find two solutions, say $y_1(t)$ and $y_2(t)$, to the homogeneous problem associated with (4.1.7). Since the vectorized form (4.1.8) is found via $x_1 = y$, $x_2 = y'$, this means that a matrix-valued solution to the homogeneous problem is given by
$$\Phi(t) = \begin{pmatrix} y_1(t) & y_2(t) \\ y_1'(t) & y_2'(t) \end{pmatrix}.$$
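Since
$$\Phi(s)^{-1} = \frac{1}{\det\Phi(s)}\begin{pmatrix} y_2'(s) & -y_2(s) \\ -y_1'(s) & y_1(s) \end{pmatrix},$$
we can rewrite (4.1.9) in terms of the solutions $y_1$ and $y_2$ as
$$x_p(t) = \Phi(t)\int^t \frac{1}{\det\Phi(s)}\begin{pmatrix} y_2'(s) & -y_2(s) \\ -y_1'(s) & y_1(s) \end{pmatrix}\begin{pmatrix} 0 \\ f(s) \end{pmatrix} ds = \begin{pmatrix} y_1(t) & y_2(t) \\ y_1'(t) & y_2'(t) \end{pmatrix}\int^t \frac{1}{\det\Phi(s)}\begin{pmatrix} -f(s)y_2(s) \\ f(s)y_1(s) \end{pmatrix} ds.$$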
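In other words, after performing one more matrix/vector multiplication we see that
$$x_p(t) = \left(-\int^t \frac{f(s)y_2(s)}{\det\Phi(s)}\,ds\right)\begin{pmatrix} y_1(t) \\ y_1'(t) \end{pmatrix} + \left(\int^t \frac{f(s)y_1(s)}{\det\Phi(s)}\,ds\right)\begin{pmatrix} y_2(t) \\ y_2'(t) \end{pmatrix}.$$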
Taking the first component of this vector-valued solution then yields the following result:
Theorem 4.1.3 (Variation of parameters). A particular solution for the second-order ODE
$$y'' + a_1y' + a_0y = f(t)$$
is given by
$$y_p(t) = \left(-\int^t \frac{f(s)y_2(s)}{\det\Phi(s)}\,ds\right)y_1(t) + \left(\int^t \frac{f(s)y_1(s)}{\det\Phi(s)}\,ds\right)y_2(t).$$
Here $y_1(t)$ and $y_2(t)$ are homogeneous solutions found via Theorem 4.1.1, and $\Phi(t)$ is the matrix
$$\Phi(t) = \begin{pmatrix} y_1(t) & y_2(t) \\ y_1'(t) & y_2'(t) \end{pmatrix}.$$
Remark 4.1.4. The argument leading to Theorem 4.1.3 does not require that the solutions to the homogeneous problem be explicitly given as in Theorem 4.1.1. In other words, it can also be used to solve the problem
$$y'' + a_1(t)y' + a_0(t)y = f(t).$$
The problem, of course, is that in general we do not know how to explicitly solve the homogeneous problem
$$y'' + a_1(t)y' + a_0(t)y = 0,$$
and consequently cannot then explicitly evaluate the integrals associated with the particular solution.
For an example, let us revisit a problem first discussed in Section 3.3.2.1. In particular, let us find the general solution to the second-order ODE
$$y'' + y = \sec(t).$$
Since the characteristic polynomial is $p(\lambda) = \lambda^2 + 1$, by using Theorem 4.1.1 two solutions to the homogeneous problem are given by
$$y_1(t) = \cos(t), \quad y_2(t) = \sin(t).$$
Using Theorem 4.1.3 the matrix $\Phi(t)$ is given by
$$\Phi(t) = \begin{pmatrix} \cos(t) & \sin(t) \\ -\sin(t) & \cos(t) \end{pmatrix} \implies \det\Phi(t) \equiv 1;$$
hence, the particular solution is
$$y_p(t) = \left(-\int^t \sec(s)\sin(s)\,ds\right)\cos(t) + \left(\int^t \sec(s)\cos(s)\,ds\right)\sin(t) = \cos(t)\ln(\cos(t)) + t\sin(t).$$
Consequently, the general solution is
$$y(t) = c_1\cos(t) + c_2\sin(t) + \cos(t)\ln(\cos(t)) + t\sin(t).$$
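A particular solution found by hand is easy to sanity-check numerically. The sketch below (the helper names `y_p` and `residual` are illustrative, and the step size is an arbitrary choice) estimates $y_p'' + y_p - \sec(t)$ with a central difference on the interval $|t| < \pi/2$, where $\cos(t) > 0$; values close to zero are consistent with the formula above.

```python
import math


def y_p(t):
    """Particular solution from the example: cos(t) ln(cos(t)) + t sin(t)."""
    return math.cos(t) * math.log(math.cos(t)) + t * math.sin(t)


def residual(t, h=1e-4):
    """Central-difference estimate of y_p'' + y_p - sec(t); should be near 0."""
    second = (y_p(t + h) - 2.0 * y_p(t) + y_p(t - h)) / (h * h)
    return second + y_p(t) - 1.0 / math.cos(t)


for t in (0.0, 0.5, 1.0):
    print(t, residual(t))
```

The residual is not exactly zero because of the finite-difference truncation and rounding; it only needs to be small compared with the size of the individual terms.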
Let us now find the particular solution using the method of undetermined coefficients. The idea is exactly the same as in Section 2.3.2.2 (first-order scalar ODEs) and Section 3.3.2.2 (first-order systems of ODEs); namely, you guess the solution form based upon the functional form of the forcing function. As in Section 2.3.2.2, in order for the method to work it must be the case that the forcing function is a linear combination of functions of the form $p(t)e^{at}\cos(bt)$ and $p(t)e^{at}\sin(bt)$, where $p(t)$ is a polynomial. We will again illustrate the method via a couple of examples. In all of the examples the homogeneous problem will be
$$y'' + 4y' + 3y = 0 \implies y_h(t) = c_1e^{-t} + c_2e^{-3t}.$$
First consider
$$y'' + 4y' + 3y = 5e^{t}.$$
The guess for the particular solution is
$$y_p(t) = a_0e^{t},$$
which after plugging into the ODE and simplifying yields the algebraic equation
$$8a_0 = 5 \implies a_0 = \frac58;$$
in other words, the particular solution is
$$y_p(t) = \frac58e^{t} \implies y(t) = c_1e^{-t} + c_2e^{-3t} + \frac58e^{t}.$$
For our second example, consider
$$y'' + 4y' + 3y = 7e^{-t}.$$
Since the forcing function is also a homogeneous solution, our guess for the particular solution is
$$y_p(t) = a_0te^{-t}.$$
Plugging this guess into the ODE and simplifying yields the algebraic equation
$$2a_0 = 7 \implies a_0 = \frac72;$$
in other words, the particular solution is
$$y_p(t) = \frac72te^{-t} \implies y(t) = c_1e^{-t} + c_2e^{-3t} + \frac72te^{-t}.$$
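For our last example in which we will actually find the solution, consider
$$y'' + 4y' + 3y = 4\cos(2t).$$
Our guess for the particular solution is
$$y_p(t) = a_0\cos(2t) + a_1\sin(2t).$$
Plugging this guess into the ODE and equating the cosine and sine terms yields the algebraic system
$$-a_0 + 8a_1 = 4, \quad -8a_0 - a_1 = 0 \implies \begin{pmatrix} -1 & 8 \\ -8 & -1 \end{pmatrix}\mathbf{a} = \begin{pmatrix} 4 \\ 0 \end{pmatrix}, \quad \mathbf{a} = \begin{pmatrix} a_0 \\ a_1 \end{pmatrix}.$$
The solution to the linear system is
$$\mathbf{a} = \begin{pmatrix} -1 & 8 \\ -8 & -1 \end{pmatrix}^{-1}\begin{pmatrix} 4 \\ 0 \end{pmatrix} = \frac{1}{65}\begin{pmatrix} -4 \\ 32 \end{pmatrix},$$
so that the particular solution is
$$y_p(t) = -\frac{4}{65}\cos(2t) + \frac{32}{65}\sin(2t).$$
The general solution is then
$$y(t) = c_1e^{-t} + c_2e^{-3t} - \frac{4}{65}\cos(2t) + \frac{32}{65}\sin(2t).$$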
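The small linear systems produced by undetermined coefficients can be solved exactly with rational arithmetic. The sketch below uses Cramer's rule on the system obtained by balancing the $\cos(2t)$ and $\sin(2t)$ terms, namely $-a_0 + 8a_1 = 4$ and $-8a_0 - a_1 = 0$; the helper name `solve_2x2` is an illustrative assumption.

```python
from fractions import Fraction


def solve_2x2(m, rhs):
    """Solve a 2x2 linear system with integer entries exactly by Cramer's rule."""
    (p, q), (r, s) = m
    det = p * s - q * r
    x0 = Fraction(rhs[0] * s - q * rhs[1], det)
    x1 = Fraction(p * rhs[1] - rhs[0] * r, det)
    return x0, x1


# Cosine and sine balance for y'' + 4y' + 3y = 4cos(2t) with the guess
# y_p = a0*cos(2t) + a1*sin(2t):  -a0 + 8*a1 = 4,  -8*a0 - a1 = 0.
a0, a1 = solve_2x2([[-1, 8], [-8, -1]], [4, 0])
print(a0, a1)
```

Using `Fraction` avoids rounding, so the coefficients can be compared directly with the hand computation.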
Let us now consider the problem of simply determining the form of the particular solution for a more complicated homogeneous problem. For all of the given forcing functions it will be assumed that the homogeneous problem is the 9th-order ODE
$$(D^2 + 4)(D - 1)^3(D^2 + 6D + 25)^2y = 0$$
$$\implies y_h(t) = c_1\cos(2t) + c_2\sin(2t) + (c_3 + c_4t + c_5t^2)e^{t} + (c_6 + c_7t)e^{-3t}\cos(4t) + (c_8 + c_9t)e^{-3t}\sin(4t).$$
For each forcing function $f(t)$ we will simply write the form of the corresponding particular solution $y_p(t)$. The key in the end is that the guess must have nothing in common with the homogeneous solution:
(a) $f(t) = 7e^{3t} + t^2 \implies y_p(t) = a_0e^{3t} + a_1 + a_2t + a_3t^2$
(b) $f(t) = 5t\sin(4t) \implies y_p(t) = (a_0 + a_1t)\cos(4t) + (a_2 + a_3t)\sin(4t)$
(c) $f(t) = 17t^2e^{t} \implies y_p(t) = t^3(a_0 + a_1t + a_2t^2)e^{t}$
(d) $f(t) = 22te^{-3t}\sin(4t) \implies y_p(t) = t^2(a_0 + a_1t)e^{-3t}\cos(4t) + t^2(a_2 + a_3t)e^{-3t}\sin(4t)$.
Regarding examples (c) and (d), the power of $t$ that is seen as a multiplicative prefactor was chosen to be the minimal power so that no part of the guess corresponded to any part of a homogeneous solution.
4.2 Example: the forced-damped oscillator
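[Figure 4.3: the parameter plane with axes $b$ and $\omega_0$, the line $\omega_0 = b$ (critically damped), and the regions labeled underdamped and overdamped.]
Figure 4.3: (color online) A cartoon of the dynamics in the $x_1x_2$-plane for the homogeneous problem associated with (4.2.1) (compare with Figure 4.2).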
The solution for the forced and undamped oscillator was derived and analyzed in Section 3.4.1. We now consider the effect of damping. Following the discussion from Section 3.1.2, the equation that is to be analyzed is given by
$$y'' + 2by' + \omega_0^2y = f(t), \tag{4.2.1}$$
where $b > 0$, and the coefficients and functions in the ODE are related to the physical terms via
$$2b = \frac{c}{m}, \quad \omega_0^2 = \frac{k}{m}, \quad f(t) = \frac{1}{m}F(t).$$
Our goal is to solve this ODE, and to analyze the solution behavior as a function of the parameters. First consider the homogeneous problem,
$$y'' + 2by' + \omega_0^2y = 0.$$
Since $b, \omega_0^2 > 0$, upon referring to Figure 4.3 (which is determined from Figure 4.2 upon setting $a_1 = 2b$ and $a_0 = \omega_0^2$) we know that solutions with any initial condition will decay to zero as $t \to +\infty$: the only question is whether the origin in the $x_1x_2$-plane (where $x_1 = y$ and $x_2 = y'$) is a stable node or a stable spiral. The characteristic equation is
$$p(\lambda) = \lambda^2 + 2b\lambda + \omega_0^2 = 0 \implies \lambda_\pm = -b \pm \sqrt{b^2 - \omega_0^2}.$$
There are three cases to consider. If $b > \omega_0$, then both roots are real-valued and unequal with $\lambda_- < \lambda_+ < 0$, so that the origin is a stable node (this is the case of overdamping). If $b < \omega_0$, then the roots are complex-valued with $\mathrm{Re}(\lambda_-) = \mathrm{Re}(\lambda_+) = -b < 0$, so that the origin is a stable spiral (this is the case of underdamping). The transition point, $b = \omega_0$, is known as the case of critical damping, and in this case the homogeneous solution is given by
$$y_h(t) = (c_1 + c_2t)e^{-bt}. \tag{4.2.2}$$
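[Figure 4.4: plot of $\mathrm{Re}(\lambda_+)$ against $b$, with the tick $\omega_0$ on the horizontal axis and $-\omega_0$ on the vertical axis.]
Figure 4.4: (color online) A cartoon of the plot of $\mathrm{Re}(\lambda_+)$, which is defined in (4.2.3).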
It is natural to enquire as to which value of $b$, for a given value of $\omega_0$, leads to solutions which decay to zero most quickly as $t \to +\infty$. This question is equivalent to asking for the value(s) of $b$ for which the real part of $\lambda_+$ is minimized. Since
$$\mathrm{Re}(\lambda_+) = \begin{cases} -b, & b \le \omega_0 \\ -b + \sqrt{b^2 - \omega_0^2}, & b > \omega_0, \end{cases} \tag{4.2.3}$$
the plot of $\mathrm{Re}(\lambda_+)$ in Figure 4.4 reveals that solutions will decay most quickly in the case of critical damping, i.e., when the damping constant $b$ is equal to the natural frequency $\omega_0$ of the undamped system. In this case an examination of the solution given in (4.2.2) reveals that the mass passes through the equilibrium position at most one time before damping to zero. Of course, in practice the decay will be close to optimal as long as the damping constant $b$ is sufficiently close to the natural frequency.
We now consider the full problem. Since the homogeneous solution is transient, i.e., it decays to zero as $t \to +\infty$, we will not worry about the effect of the initial condition on the solution. In addition, we will, as in Section 3.4.1, assume that the forcing is sinusoidal; in other words, we will focus solely on finding and analyzing the particular solution to
$$y'' + 2by' + \omega_0^2y = f_0\cos(\omega t). \tag{4.2.4}$$
Using the method of undetermined coefficients we know that the particular solution is of the form
$$y_p(t) = a_0\cos(\omega t) + a_1\sin(\omega t).$$
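Plugging this guess into (4.2.4) and equating the cosine and sine terms yields the linear system in matrix form
$$\begin{pmatrix} \omega_0^2 - \omega^2 & 2b\omega \\ -2b\omega & \omega_0^2 - \omega^2 \end{pmatrix}\mathbf{a} = \begin{pmatrix} f_0 \\ 0 \end{pmatrix} \implies \mathbf{a} = \frac{f_0}{\Delta}\begin{pmatrix} \omega_0^2 - \omega^2 \\ 2b\omega \end{pmatrix}, \quad \mathbf{a} = \begin{pmatrix} a_0 \\ a_1 \end{pmatrix}.$$
Here the positive constant $\Delta$ is defined by
$$\Delta := (\omega_0^2 - \omega^2)^2 + 4b^2\omega^2.$$
In conclusion, the particular solution is given by
$$y_p(t) = \frac{(\omega_0^2 - \omega^2)f_0}{\Delta}\cos(\omega t) + \frac{2b\omega f_0}{\Delta}\sin(\omega t) = \frac{f_0}{\sqrt{\Delta}}\cos(\omega t - \phi), \quad \tan\phi = \frac{2b\omega}{\omega_0^2 - \omega^2}.$$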
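[Figure 4.5: left panel, amplitude $A(\omega)$ against $\omega$ with the tick $\omega_0$; right panel, phase $\phi(\omega)$ against $\omega$ with the ticks $\omega_0$ and $\pi/2$, and the regions labeled in-phase and out-of-phase.]
Figure 4.5: (color online) A cartoon of the plots of the amplitude $A(\omega)$ (left figure) and the phase $\phi(\omega)$ (right figure). Note that when $\omega$ is small the phase is roughly zero, so that the mass roughly oscillates in-phase with the forcing. On the other hand, if $\omega$ is large, then the phase is roughly $\pi$, so that the mass is oscillating out-of-phase with the forcing.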
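We know that for $t \gg 0$ the solution satisfies $y(t) \approx y_p(t)$. The amplitude of oscillation is given by
$$A(\omega) = \frac{f_0}{\sqrt{\Delta}},$$
and although the frequency of oscillation is exactly that of the forcing frequency, there is the phase shift $\phi(\omega)$ which satisfies
$$\phi(\omega) = \tan^{-1}\left(\frac{2b\omega}{\omega_0^2 - \omega^2}\right),$$
with the branch of the inverse tangent chosen so that $0 \le \phi(\omega) < \pi$.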
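The amplitude is maximized when $\Delta$ is minimized, i.e., when
$$\frac{dA}{d\omega}(\omega_*) = 0 \implies \omega_*^2 = \omega_0^2 - 2b^2 < \omega_0^2$$
(this requires $\omega_0^2 > 2b^2$; for stronger damping the amplitude is largest at $\omega = 0$), and satisfies $A(\omega) \to 0$ as $\omega \to +\infty$. Note also that
$$A(\omega_*) = \frac{f_0}{\sqrt{4b^2(\omega_0^2 - b^2)}},$$
so that the maximal amplitude becomes arbitrarily large as $b \to 0^+$. It is also true that $\omega_* \to \omega_0$ as $b \to 0^+$; hence, as $b \to 0^+$ the plot of $A(\omega)$ closely approximates that given for the undamped problem in Figure 3.11, and the maximal amplitude occurs roughly for the case of resonant forcing. The phase shift is a monotone increasing function that satisfies $\phi(0) = 0$, $\phi(\omega_0) = \pi/2$, and $\phi(\omega) \to \pi$ as $\omega \to +\infty$. The graphs of each of these functions are given in Figure 4.5.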
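The amplitude and phase curves are easy to tabulate directly from $\Delta = (\omega_0^2 - \omega^2)^2 + 4b^2\omega^2$. The sketch below is illustrative only: the function names, the parameter values $b = 0.1$, $\omega_0 = 1$, and the brute-force grid search are all assumptions of this example, not part of the notes. It uses `math.atan2` so that the phase with $\tan\phi = 2b\omega/(\omega_0^2 - \omega^2)$ varies continuously from $0$ toward $\pi$.

```python
import math


def amplitude(omega, b, omega0, f0=1.0):
    """A(omega) = f0 / sqrt(Delta), Delta = (omega0^2 - omega^2)^2 + 4 b^2 omega^2."""
    delta = (omega0**2 - omega**2) ** 2 + 4.0 * b**2 * omega**2
    return f0 / math.sqrt(delta)


def phase(omega, b, omega0):
    """Phase in [0, pi) with tan(phase) = 2 b omega / (omega0^2 - omega^2)."""
    return math.atan2(2.0 * b * omega, omega0**2 - omega**2)


def peak_frequency(b, omega0, n=20001, omega_max=None):
    """Locate the maximizer of A(omega) on a uniform grid (brute force)."""
    omega_max = 2.0 * omega0 if omega_max is None else omega_max
    grid = [omega_max * k / (n - 1) for k in range(n)]
    return max(grid, key=lambda w: amplitude(w, b, omega0))


b, omega0 = 0.1, 1.0  # illustrative values, not taken from the text
print(peak_frequency(b, omega0), phase(omega0, b, omega0))
```

The grid maximizer can be compared against the closed-form critical point obtained by setting $dA/d\omega = 0$, and repeating the computation with smaller $b$ shows the peak sharpening and moving toward $\omega_0$.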
5
Discontinuous forcing and the Laplace transform
5.1 Discontinuous forcing
The goal of this chapter is to understand how to find the general solution for (primarily) second-order ODEs of the form
$$y'' + a_1y' + a_0y = f(t), \tag{5.1.1}$$
where the function $f(t)$ is piecewise continuous. We will not talk about general forcing functions; instead, we will focus our discussion on two functions which often appear in applications: the (Heaviside) unit step function, and the (Dirac) delta function.
5.1.1 The Heaviside unit step function
The unit step function is defined as
$$u(t - a) = \begin{cases} 0, & 0 \le t < a \\ 1, & a \le t. \end{cases} \tag{5.1.2}$$
The graph of the step function is given in the left figure of Figure 5.1. Practically speaking, the step function models an instantaneous on-off switch: the switch is turned off until time $t = a$, and afterwards it is turned on. From a mathematical perspective, the step function can be used to write a piecewise-defined function as a sum.
[Figure 5.1: left panel, the unit step of height 1 switching on at $t = a$; right panel, a pulse of height $1/h$ between $t = a$ and $t = a + h$.]
Figure 5.1: (color online) The plot of the step function $u(t - a)$ is given in the left figure, and the plot of the function $\delta_h(t - a)$ is given in the right figure.
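For our first example, suppose that
$$f(t) = \begin{cases} 2, & 0 \le t < 3 \\ -5, & 3 \le t < 6 \\ 3, & 6 \le t. \end{cases}$$
Upon using the fact that
$$1 - u(t - a) = \begin{cases} 1, & 0 \le t < a \\ 0, & a \le t, \end{cases} \tag{5.1.3}$$
we can rewrite the function as the sum
$$f(t) = 2u(t - 0) + [-5u(t - 3) - 2u(t - 3)] + [3u(t - 6) - (-5)u(t - 6)] = 2u(t - 0) - 7u(t - 3) + 8u(t - 6).$$
In writing the function in the above manner, we are using (5.1.3) in the sense that, e.g.,
$$u(t - 0) - u(t - 3) = \begin{cases} 1, & 0 \le t < 3 \\ 0, & 3 \le t, \end{cases} \qquad u(t - 3) - u(t - 6) = \begin{cases} 0, & 0 \le t < 3 \\ 1, & 3 \le t < 6 \\ 0, & 6 \le t. \end{cases}$$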
For our second example, suppose that
$$f(t) = \begin{cases} 6, & 0 \le t < 3 \\ t^2 + 2, & 3 \le t \le 5 \\ 27 - t, & 5 < t. \end{cases}$$
As above, we can rewrite the function as the sum
$$f(t) = 6u(t - 0) + [(t^2 + 2)u(t - 3) - 6u(t - 3)] + [(27 - t)u(t - 5) - (t^2 + 2)u(t - 5)] = 6u(t - 0) + (t^2 - 4)u(t - 3) - (t^2 + t - 25)u(t - 5).$$
It will sometimes be useful to rewrite the function in the following manner. Using the algebraic identities
$$t^2 - 4 = \underbrace{[(t - 3)^2 + 6t - 9]}_{t^2} - 4 = (t - 3)^2 + 6\underbrace{[(t - 3) + 3]}_{t} - 13 = (t - 3)^2 + 6(t - 3) + 5,$$
and
$$t^2 + t - 25 = (t - 5)^2 + 11(t - 5) + 5,$$
we can rewrite the function as
$$f(t) = 6u(t - 0) + [(t - 3)^2 + 6(t - 3) + 5]u(t - 3) - [(t - 5)^2 + 11(t - 5) + 5]u(t - 5).$$
It is this form of the function which turns out to be more useful in problems of integration.
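A rewriting of this kind is easy to get wrong by a sign, so a pointwise comparison is a cheap safeguard. The sketch below (the helper names are illustrative) evaluates the piecewise definition and the shifted step-function form at a few sample times. The two forms agree everywhere except at the single jump point $t = 5$, where the step-function convention $u(0) = 1$ picks the right-hand value; that point is deliberately left out of the samples.

```python
def u(t):
    """Unit step: u(t - a) is obtained by calling u with the argument t - a."""
    return 1.0 if t >= 0 else 0.0


def f_piecewise(t):
    """The second example, written case by case."""
    if t < 3:
        return 6.0
    if t <= 5:
        return t * t + 2.0
    return 27.0 - t


def f_steps(t):
    """The same function written with shifted polynomials and unit steps."""
    return (6.0 * u(t)
            + ((t - 3) ** 2 + 6 * (t - 3) + 5) * u(t - 3)
            - ((t - 5) ** 2 + 11 * (t - 5) + 5) * u(t - 5))


samples = [0.0, 1.5, 3.0, 4.2, 5.5, 8.0]
print(all(abs(f_piecewise(t) - f_steps(t)) < 1e-9 for t in samples))
```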
5.1.2 The Dirac delta function
We define the delta function via a limiting process. First, set
$$\delta_h(t - a) := \frac{u(t - a) - u(t - a - h)}{h},$$
i.e., a pulse of height $1/h$ on the interval $a \le t < a + h$ (the graph is given in the right figure of Figure 5.1). This function has the property that for any $a > 0$ and $h > 0$,
$$\int_0^{+\infty} \delta_h(t - a)\,dt = 1.$$
Practically speaking, this function models the instantaneous application and removal of a large force: one analogy would be the sharp strike of a hammer on a nail. The delta function is defined as the limit, i.e.,
$$\delta(t - a) = \lim_{h \to 0^+} \delta_h(t - a). \tag{5.1.4}$$
The delta function is not a function in any real sense, because formally the limit gives
$$\delta(t - a) = \begin{cases} 0, & t \ne a \\ +\infty, & t = a. \end{cases}$$
However, as we shall see in the next result, this definition does yield an extremely useful property:
Lemma 5.1.1. Suppose that $f(t)$ is continuous at the point $t = a$, where $a > 0$. Then with the delta function defined as in (5.1.4),
$$\int_0^t \delta(s - a)f(s)\,ds = f(a)u(t - a).$$
Proof. First suppose that $t < a$. By the definition of $\delta_h(s - a)$ it is clear that for any $h > 0$
$$\int_0^t \delta_h(s - a)f(s)\,ds = 0;$$
consequently, it is true in the limit. Now suppose that $t > a$, and let $0 < h < t - a$. Set
$$F(t) = \int_0^t f(s)\,ds,$$
so that
$$\frac{F(a + h) - F(a)}{h} = \frac{1}{h}\int_a^{a+h} f(s)\,ds = \int_0^t f(s)\delta_h(s - a)\,ds.$$
Since $f(t)$ is continuous at $t = a$, by the Fundamental Theorem of Calculus we have that
$$\lim_{h \to 0^+} \frac{F(a + h) - F(a)}{h} = f(a),$$
i.e.,
$$f(a) = \lim_{h \to 0^+} \int_0^t f(s)\delta_h(s - a)\,ds = \int_0^t f(s)\delta(s - a)\,ds.$$
The final result now follows from the definition of the unit step function.
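The limit in the proof can be watched numerically: the average of $f$ over $[a, a + h]$ should approach $f(a)$ as $h \to 0^+$. The sketch below is illustrative; the helper name `sift`, the midpoint rule, and the choice $f = \exp$, $a = 2$ are assumptions of this example.

```python
import math


def sift(f, a, h, n=1000):
    """Midpoint-rule value of (1/h) * integral of f over [a, a + h]."""
    step = h / n
    total = sum(f(a + (k + 0.5) * step) for k in range(n))
    return total * step / h


f = math.exp  # any function continuous at t = a would do
a = 2.0
for h in (1.0, 0.1, 0.01, 0.001):
    print(h, sift(f, a, h), f(a))
```

The printed averages approach $f(a)$ from above for this increasing $f$, with an error that shrinks roughly in proportion to $h$.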
Remark 5.1.2. Upon setting $f(t) \equiv 1$, the statement of Lemma 5.1.1 can be restated to say that the unit step function is the anti-derivative of the delta function.
5.2 The particular solution: variation of parameters
As we recall from the variation of parameters formula given in Theorem 4.1.3, the particular solution for (5.1.1) can be found by integrating the forcing function $f(t)$ against certain algebraic combinations of solutions to the homogeneous problem. However, as currently defined, the particular solution is somewhat ambiguous, since the lower limit of the integration has not been defined. We take care of this ambiguity in the following restatement of the result. In particular, we define the particular solution to have the property that $y_p(0) = y_p'(0) = 0$ (a verification of this fact is an exercise for the interested student).
Theorem 5.2.1 (Variation of parameters). The particular solution for the second-order ODE
$$y'' + a_1y' + a_0y = f(t)$$
which satisfies $y_p(0) = y_p'(0) = 0$ is given by
$$y_p(t) = \left(-\int_0^t \frac{f(s)y_2(s)}{\det\Phi(s)}\,ds\right)y_1(t) + \left(\int_0^t \frac{f(s)y_1(s)}{\det\Phi(s)}\,ds\right)y_2(t).$$
Here $y_1(t)$ and $y_2(t)$ are homogeneous solutions found via Theorem 4.1.1, and $\Phi(t)$ is the matrix
$$\Phi(t) = \begin{pmatrix} y_1(t) & y_2(t) \\ y_1'(t) & y_2'(t) \end{pmatrix}.$$
Remark 5.2.2. This form of the particular solution is useful because the constants determined by the initial data of an IVP then depend only on the homogeneous solution. In particular, if the initial condition is $y(0) = y'(0) = 0$, then it will be the case that the coefficients $c_1, c_2$ in the general solution are given by $c_1 = c_2 = 0$.
For our first example, let us find the general solution to
$$y'' + 4y' + 3y = 5u(t - 2).$$
From an applications perspective, this ODE could be used to model an overdamped mass-spring system which undergoes constant forcing which begins at $t = 2$. The homogeneous solution is given by
$$y_h(t) = c_1e^{-t} + c_2e^{-3t}.$$
Before solving for the particular solution, let us make a couple of observations. The first is that since there is no forcing for $t < 2$, it must be the case that $y_p(t) \equiv 0$ for $t < 2$. The second is that for $t > 2$ the forcing is constant; hence, the particular solution must contain a constant.
Since
$$\Phi(t) = \begin{pmatrix} e^{-t} & e^{-3t} \\ -e^{-t} & -3e^{-3t} \end{pmatrix} \implies \det\Phi(t) = -2e^{-4t},$$
upon using the variation of parameters formula in Theorem 5.2.1 and simplifying we see that the desired particular solution is given by
$$y_p(t) = \frac52\left(\int_0^t e^{s}u(s - 2)\,ds\right)e^{-t} - \frac52\left(\int_0^t e^{3s}u(s - 2)\,ds\right)e^{-3t}.$$
The integrals are easiest to evaluate after a little massaging. First, we have that
$$\int_0^t e^{s}u(s - 2)\,ds = 0, \quad 0 \le t < 2,$$
while for $t \ge 2$,
$$\int_0^t e^{s}u(s - 2)\,ds = \int_2^t e^{s}\,ds = e^{t} - e^{2} = e^{2}\left(e^{t-2} - 1\right).$$
Consequently, for $t \ge 2$ we have
$$\left(\int_0^t e^{s}u(s - 2)\,ds\right)e^{-t} = e^{2}\left(e^{t-2} - 1\right)e^{-t} = 1 - e^{-(t-2)},$$
which in conclusion means that the first term in the particular solution can be written as
$$\frac52\left(\int_0^t e^{s}u(s - 2)\,ds\right)e^{-t} = \frac52\left[1 - e^{-(t-2)}\right]u(t - 2).$$
Similarly, for the second term we have
$$\int_0^t e^{3s}u(s - 2)\,ds = 0, \quad 0 \le t < 2,$$
while for $t \ge 2$,
$$\int_0^t e^{3s}u(s - 2)\,ds = \int_2^t e^{3s}\,ds = \frac13\left(e^{3t} - e^{6}\right) = \frac{e^{6}}{3}\left(e^{3(t-2)} - 1\right).$$
Consequently, for $t \ge 2$ we have
$$\left(\int_0^t e^{3s}u(s - 2)\,ds\right)e^{-3t} = \frac{e^{6}}{3}\left(e^{3(t-2)} - 1\right)e^{-3t} = \frac13\left[1 - e^{-3(t-2)}\right],$$
which means that the second term in the particular solution can be written as
$$\frac52\left(\int_0^t e^{3s}u(s - 2)\,ds\right)e^{-3t} = \frac56\left[1 - e^{-3(t-2)}\right]u(t - 2).$$
In conclusion, the particular solution is given by
$$y_p(t) = \frac52\left[1 - e^{-(t-2)}\right]u(t - 2) - \frac56\left[1 - e^{-3(t-2)}\right]u(t - 2) = \left(\frac53 - \frac52e^{-(t-2)} + \frac56e^{-3(t-2)}\right)u(t - 2),$$
so that the general solution is
$$y(t) = c_1e^{-t} + c_2e^{-3t} + \left(\frac53 - \frac52e^{-(t-2)} + \frac56e^{-3(t-2)}\right)u(t - 2).$$
A plot of this solution is given in Figure 5.2 for the initial condition $y(0) = y'(0) = 0$. Note the presence of the constant term $5/3$ in the solution for $t \ge 2$: this is precisely what would be found by the method of undetermined coefficients if one were to find the general solution only for $t > 2$. Further note that the solution, as did the forcing function, contains the unit step function $u(t - 2)$. This second feature will always be the case when considering forcing functions comprised of sums of unit step functions. Indeed, it should be expected, for there cannot be a response to the forcing until the forcing has been applied. Finally, note that the solution is continuous at $t = 2$, even though the forcing function itself is discontinuous at that point. For this example the first derivative is also continuous there, since $y_p'(2^+) = \frac52 - \frac52 = 0$; the effect of the discontinuous forcing shows up in the fact that the second derivative of the solution is discontinuous at $t = 2$.
For our second example, let us find the general solution to the ODE
$$y'' + 4y' + 3y = 5\delta(t - 2).$$
Since the homogeneous problem is the same as for the previous example, we can immediately write the particular solution as
$$y_p(t) = \frac52\left(\int_0^t e^{s}\delta(s - 2)\,ds\right)e^{-t} - \frac52\left(\int_0^t e^{3s}\delta(s - 2)\,ds\right)e^{-3t}.$$
Using the result of Lemma 5.1.1 we can simplify to say that
$$y_p(t) = \frac52e^{2}e^{-t}u(t - 2) - \frac52e^{6}e^{-3t}u(t - 2) = \frac52\left(e^{-(t-2)} - e^{-3(t-2)}\right)u(t - 2);$$
consequently, the general solution is
$$y(t) = c_1e^{-t} + c_2e^{-3t} + \frac52\left(e^{-(t-2)} - e^{-3(t-2)}\right)u(t - 2).$$
A plot of this solution is given in Figure 5.2 for the initial condition $y(0) = y'(0) = 0$.
[Figure 5.2: plot of $y(t)$ against $t$, with $t$ ranging from 0 to 8 and $y(t)$ from 0 to 2.]
Figure 5.2: (color online) The plot of the solution to the ODE $y'' + 4y' + 3y = f(t)$ with the initial condition $y(0) = y'(0) = 0$. The thick dashed (blue) curve is the solution for $f(t) = 5u(t - 2)$, while the thick solid (red) curve is the solution for $f(t) = 5\delta(t - 2)$.
Now let us continue this problem by supposing that the initial condition is given by
$$y(0) = 2, \quad y'(0) = -4.$$
By Theorem 5.2.1 the particular solution plays no role with respect to the initial data; hence, the coefficients satisfy the linear system in matrix form
$$\begin{pmatrix} 1 & 1 \\ -1 & -3 \end{pmatrix}\mathbf{c} = \begin{pmatrix} 2 \\ -4 \end{pmatrix} \implies \mathbf{c} = \begin{pmatrix} 1 \\ 1 \end{pmatrix}.$$
The solution to the IVP is then
$$y(t) = e^{-t} + e^{-3t} + \frac52\left(e^{-(t-2)} - e^{-3(t-2)}\right)u(t - 2).$$
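The closed form for the delta-forced IVP, with initial data $y(0) = 2$, $y'(0) = -4$, can be probed numerically. The sketch below (function names are illustrative, and the derivative is coded by hand rather than taken from the notes) evaluates the initial data and the jump of $y'$ across $t = 2$; integrating the ODE across the impulse suggests that this jump should equal the strength of the delta forcing.

```python
import math


def y(t):
    """IVP solution for y'' + 4y' + 3y = 5*delta(t - 2), y(0) = 2, y'(0) = -4."""
    value = math.exp(-t) + math.exp(-3.0 * t)
    if t >= 2.0:
        value += 2.5 * (math.exp(-(t - 2.0)) - math.exp(-3.0 * (t - 2.0)))
    return value


def dy(t):
    """Hand-computed derivative of y, valid for t != 2."""
    value = -math.exp(-t) - 3.0 * math.exp(-3.0 * t)
    if t > 2.0:
        value += 2.5 * (-math.exp(-(t - 2.0)) + 3.0 * math.exp(-3.0 * (t - 2.0)))
    return value


eps = 1e-9
print(y(0.0), dy(0.0))                # initial data
print(y(2.0 + eps) - y(2.0 - eps))    # change in y across t = 2
print(dy(2.0 + eps) - dy(2.0 - eps))  # change in y' across t = 2
```

In contrast with the step-forced example, here it is the first derivative that jumps, while the solution itself stays continuous.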
5.3 The particular solution: the Laplace transform
5.3.1 Definition and properties of the Laplace transform
Definition 5.3.1. Let $f(t)$ be piecewise continuous and satisfy $|f(t)| \le Ce^{bt}$ for some $C, b > 0$. The Laplace transform of $f(t)$, denoted $\mathcal{L}[f](s)$, is given by
$$\mathcal{L}[f](s) := \int_0^{+\infty} e^{-st}f(t)\,dt.$$
Remark 5.3.2. In practice we often write $\mathcal{L}[f](s) = F(s)$.
Remark 5.3.3. The exponential growth bound on $f(t)$ implies that $F(s)$ is well-defined for $s > b$.
From this point forward it will be assumed that any function for which the Laplace transform is being computed satisfies the exponential growth bound. As a consequence of the fact that the Laplace transform is defined via an improper integral, it is a linear function.
Proposition 5.3.4. The Laplace transform is linear, i.e.,
$$\mathcal{L}[c_1f + c_2g](s) = c_1\mathcal{L}[f](s) + c_2\mathcal{L}[g](s).$$
Since the Laplace transform is defined via an integration against an exponential function, it is not terribly difficult to compute when $f(t)$ is some combination of polynomials, exponential functions, and sines and cosines. We present the results of these calculations for the functions which appear most commonly in applications in the following table:
$$\begin{array}{l|l}
f(t) & \mathcal{L}[f](s) \\ \hline
1 & 1/s, \quad s > 0 \\
t^n & n!/s^{n+1}, \quad s > 0 \\
e^{at} & 1/(s - a), \quad s > a \\
t^ne^{at} & n!/(s - a)^{n+1}, \quad s > a \\
\cos(bt) & s/(s^2 + b^2), \quad s > 0 \\
\sin(bt) & b/(s^2 + b^2), \quad s > 0 \\
e^{at}\cos(bt) & (s - a)/((s - a)^2 + b^2), \quad s > a \\
e^{at}\sin(bt) & b/((s - a)^2 + b^2), \quad s > a \\
u(t - a) & e^{-as}/s, \quad s > 0 \\
\delta(t - a) & e^{-as}, \quad s > 0 \\
f(t - a)u(t - a) & e^{-as}\mathcal{L}[f](s)
\end{array} \tag{5.3.1}$$
All but the last entry in the table follow from the definition of the Laplace transform and direct integration. The last entry in the table of (5.3.1) requires just a bit more work:
Proposition 5.3.5. $\mathcal{L}[f(t - a)u(t - a)](s) = e^{-as}\mathcal{L}[f](s)$.
Proof. By definition
$$\mathcal{L}[f(t - a)u(t - a)](s) = \int_0^{+\infty} e^{-st}f(t - a)u(t - a)\,dt = \int_a^{+\infty} e^{-st}f(t - a)\,dt = e^{-as}\int_0^{+\infty} e^{-st}f(t)\,dt,$$
where the last equality follows from the change of variables $t \mapsto t + a$. The result now follows from the fact that the last integral on the right is precisely $\mathcal{L}[f](s)$.
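The shift property can also be checked by brute-force quadrature. The sketch below is a rough numerical illustration, not a method from the notes: the helper name `laplace`, the midpoint rule, the truncation of the improper integral at `upper = 60`, and the choices $f(t) = t^2$, $a = 3$, $s = 1$ are all assumptions of this example.

```python
import math


def laplace(f, s, upper=60.0, n=60000):
    """Approximate L[f](s) by the midpoint rule on [0, upper].

    Truncating at `upper` assumes e^{-s t} f(t) is negligible beyond it.
    """
    step = upper / n
    return step * sum(
        math.exp(-s * (k + 0.5) * step) * f((k + 0.5) * step) for k in range(n)
    )


a, s = 3.0, 1.0


def f(t):
    """Test function t^2, whose transform in the table is 2/s^3."""
    return t * t


def shifted(t):
    """f(t - a) u(t - a): the delayed copy of f."""
    return f(t - a) if t >= a else 0.0


print(laplace(shifted, s), math.exp(-a * s) * laplace(f, s))
```

The two printed numbers should agree to several digits, and `laplace(f, s)` itself can be compared with the table entry $2/s^3$.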
Now let us do a couple of examples. Recall that we showed that the function $f(t)$ given by
$$f(t) = \begin{cases} 6, & 0 \le t < 3 \\ t^2 + 2, & 3 \le t \le 5 \\ 27 - t, & 5 < t, \end{cases}$$
can be rewritten as
$$f(t) = 6u(t - 0) + [(t - 3)^2 + 6(t - 3) + 5]u(t - 3) - [(t - 5)^2 + 11(t - 5) + 5]u(t - 5).$$
As a consequence of linearity of the Laplace transform (Proposition 5.3.4), the table in (5.3.1), and Proposition 5.3.5 we have that
$$\mathcal{L}[f](s) = 6\cdot\frac1s + e^{-3s}\left(\frac{2}{s^3} + 6\cdot\frac{1}{s^2} + 5\cdot\frac1s\right) - e^{-5s}\left(\frac{2}{s^3} + 11\cdot\frac{1}{s^2} + 5\cdot\frac1s\right).$$
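For another example, we have that
$$f(t) = 3 - 6t + 5\delta(t - 2) + 7e^{2t}\cos(t - \pi)u(t - \pi) \implies \mathcal{L}[f](s) = \frac3s - \frac{6}{s^2} + 5e^{-2s} + 7e^{2\pi}e^{-\pi s}\frac{s - 2}{(s - 2)^2 + 1}.$$
The last term follows from the identity $e^{2t} = e^{2\pi}e^{2(t - \pi)}$.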
When using the Laplace transform to solve ODEs, we will need to invert the Laplace transform in order to find the original function. This leads to the following question: given a transform $\mathcal{L}[f](s)$, what is the generating function $f(t)$? For the problems of interest the table, as well as the result of Proposition 5.3.5, will be sufficient in order to answer this question. In particular, what we will need to do is algebraically massage the original $\mathcal{L}[f](s)$ so that we see expressions in the second column of the table: once we see these expressions, the generating function can be found simply by appealing to the corresponding function $f(t)$ in the first column. The algebraic massaging takes the form of the method of partial fractions. We will not review that material here; instead, when necessary, we will use the WolframAlpha Partial Fraction Calculator in order to properly decompose the original transform.
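For our first example suppose that
$$\mathcal{L}[f](s) = \frac{2s - 5}{(s^2 + 4)(s + 2)}.$$
By the method of partial fractions we can rewrite this as
$$\mathcal{L}[f](s) = \frac18\cdot\frac{9s - 2}{s^2 + 4} - \frac98\cdot\frac{1}{s + 2} = \frac98\cdot\frac{s}{s^2 + 4} - \frac18\cdot\frac{2}{s^2 + 4} - \frac98\cdot\frac{1}{s + 2}.$$
Comparing with the table in (5.3.1) we see that
$$f(t) = \frac98\cos(2t) - \frac18\sin(2t) - \frac98e^{-2t}.$$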
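A partial fraction decomposition can be double-checked without a computer algebra system by evaluating both sides at a few rational points with exact arithmetic. The sketch below does this for $(2s - 5)/((s^2 + 4)(s + 2))$ and its decomposition into the terms $\frac98\frac{s}{s^2+4}$, $-\frac18\frac{2}{s^2+4}$, $-\frac98\frac{1}{s+2}$; the function names and sample points are illustrative choices.

```python
from fractions import Fraction as F


def original(s):
    """(2s - 5) / ((s^2 + 4)(s + 2)) evaluated exactly."""
    return (2 * s - 5) / ((s * s + 4) * (s + 2))


def decomposed(s):
    """The partial fraction form, term by term."""
    return (F(9, 8) * s / (s * s + 4)
            - F(1, 8) * 2 / (s * s + 4)
            - F(9, 8) / (s + 2))


points = [F(0), F(1), F(3, 2), F(7), F(-5)]
print(all(original(s) == decomposed(s) for s in points))
```

Both sides share the same cubic denominator and their numerators have degree at most two, so exact agreement at three or more distinct points already implies that the two expressions are identical.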
For our second example suppose that
$$\mathcal{L}[f](s) = \frac{s^4 + 6s^3 - 15s^2 - 10s + 50}{s^3(s^2 - 2s + 10)}.$$
By the method of partial fractions we can rewrite this as
$$\mathcal{L}[f](s) = \frac{5}{s^3} - \frac2s + \frac{3s + 2}{s^2 - 2s + 10}.$$
In order to use the table we must rewrite the last term in terms of $s - 1$. Doing so and simplifying yields
$$\mathcal{L}[f](s) = \frac{5}{s^3} - \frac2s + \frac{3(s - 1) + 5}{(s - 1)^2 + 9} = \frac52\cdot\frac{2}{s^3} - 2\cdot\frac1s + 3\cdot\frac{s - 1}{(s - 1)^2 + 9} + \frac53\cdot\frac{3}{(s - 1)^2 + 9}.$$
Comparing with the table in (5.3.1) we see that
$$f(t) = \frac52t^2 - 2 + 3e^{t}\cos(3t) + \frac53e^{t}\sin(3t).$$
For our last example suppose that
$$\mathcal{L}[f](s) = e^{-5s}\frac{2s - 5}{(s^2 + 4)(s + 2)}.$$
Upon using the result of our first example and Proposition 5.3.5 we have that
$$f(t) = \left(\frac98\cos(2(t - 5)) - \frac18\sin(2(t - 5)) - \frac98e^{-2(t - 5)}\right)u(t - 5).$$
5.3.2 Application: second-order scalar ODEs
Before continuing, we need the following important result, which tells us the manner in which the Laplace transform of a derivative of a function is related to the Laplace transform of a function.
Theorem 5.3.6. Assume that $|y(t)|, |y'(t)| \le Ce^{bt}$ for some $b > 0$. The Laplace transform then satisfies for $s > b$,
$$\mathcal{L}[y'](s) = s\mathcal{L}[y](s) - y(0), \quad \mathcal{L}[y''](s) = s^2\mathcal{L}[y](s) - sy(0) - y'(0).$$
Proof. The result follows from an integration by parts. By definition we have
$$\mathcal{L}[y'](s) = \int_0^{+\infty} e^{-st}y'(t)\,dt = \lim_{A \to +\infty}\left(e^{-st}y(t)\Big|_{t=0}^{t=A}\right) + s\int_0^{+\infty} e^{-st}y(t)\,dt.$$
The second term is $s\mathcal{L}[y](s)$, and assuming that $s > b$ we have that
$$\lim_{A \to +\infty}\left(e^{-st}y(t)\Big|_{t=0}^{t=A}\right) = -y(0).$$
The first result now follows. As for the second result we have that
$$\mathcal{L}[y''](s) = s\mathcal{L}[y'](s) - y'(0) = s\left[s\mathcal{L}[y](s) - y(0)\right] - y'(0),$$
which simplifies to the desired result.
Remark 5.3.7. In general, if we assume that a function has zero initial data, i.e., $y(0) = y'(0) = \cdots = y^{(n-1)}(0) = 0$, then it will be true that $\mathcal{L}[y^{(n)}](s) = s^n\mathcal{L}[y](s)$. If the initial data is not all zero, then there is a more complicated formula which will not be of interest here. The formula can be easily derived, however, by applying the idea leading to the second result in Theorem 5.3.6.
We now wish to use the Laplace transform to solve second-order IVPs of the form
$$y'' + a_1y' + a_0y = f(t); \quad y(0) = y_0, \; y'(0) = y_1. \tag{5.3.2}$$
A careful examination of the result of Theorem 5.3.6 reveals that it will first be convenient to rewrite the ODE so that there is no initial data, for in this case it will be true that
$$\mathcal{L}[y'](s) = s\mathcal{L}[y](s), \quad \mathcal{L}[y''](s) = s^2\mathcal{L}[y](s).$$
This is relatively easy to do; however, in doing so we pay the price of adding additional terms to the
forcing function. Set
z(t) = y(t) y
0
y
1
t z(0) = z
(0) = 0.
Upon noting that
z
(t) = y
(t) y
1
, z
(t) = y
(t),
the IVP (5.3.2) can be rewritten as the IVP
z
+a
1
z
+a
0
z = f (t) [a
0
y
0
+a
1
y
1
] a
0
y
1
t; z(0) = z
(0) = 0. (5.3.3)
The eect of the transformation is to remove the initial conditions at t = 0, and instead put them into
the forcing via the mapping
f (t) f (t) [a
0
y
0
+a
1
y
1
] a
0
y
1
t
.
internal forcing
.
In this sense we can think of the initial conditions as being an internal forcing for the system, and the
function $f(t)$ as being an external forcing. We will focus our efforts on solving this new problem (5.3.3)
for $z(t)$. After solving for $z(t)$ we can recover the solution to the original problem (5.3.2) by inverting,
i.e.,
\[
y(t) = z(t) + y_0 + y_1 t.
\]
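The change of variables can be sanity-checked on a concrete problem. The sketch below is only an illustration: the IVP $y'' + 3y' + 2y = 0$, $y(0) = y'(0) = 1$ (with solution $y(t) = 3e^{-t} - 2e^{-2t}$) is an assumed example that does not appear in the notes, and the function names are hypothetical.

```python
import math

# Assumed example: y'' + 3y' + 2y = 0, y(0) = 1, y'(0) = 1,
# whose solution is y(t) = 3 e^{-t} - 2 e^{-2t}.
a1, a0 = 3.0, 2.0
y0, y1 = 1.0, 1.0

def f(t):
    """External forcing of the assumed example (identically zero)."""
    return 0.0

def internal_forcing(t):
    """The term -[a0*y0 + a1*y1] - a0*y1*t added to f(t) in (5.3.3)."""
    return -(a0 * y0 + a1 * y1) - a0 * y1 * t

def z(t):
    """z(t) = y(t) - y0 - y1*t for the assumed example."""
    return 3.0 * math.exp(-t) - 2.0 * math.exp(-2.0 * t) - y0 - y1 * t

def dz(t):
    return -3.0 * math.exp(-t) + 4.0 * math.exp(-2.0 * t) - y1

def d2z(t):
    return 3.0 * math.exp(-t) - 8.0 * math.exp(-2.0 * t)

def residual(t):
    """Left side minus right side of (5.3.3); zero if z solves the new IVP."""
    return d2z(t) + a1 * dz(t) + a0 * z(t) - (f(t) + internal_forcing(t))

for t in (0.0, 0.5, 1.0, 3.0):
    print(t, residual(t))
```

Here the internal forcing is $-5 - 2t$, the shifted function satisfies $z(0) = z'(0) = 0$, and adding $y_0 + y_1 t$ back to $z(t)$ returns the original solution.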
As a consequence of this discussion, we can now without loss of generality make the following assumption:
Assumption 5.3.8. The initial condition for any $n$th-order scalar ODE will be zero initial data, i.e.,
$y(0) = y'(0) = \cdots = y^{(n-1)}(0) = 0$. Consequently, $\mathcal{L}[y^{(n)}](s) = s^n \mathcal{L}[y](s)$.
Remark 5.3.9. This type of transformation in which the initial data becomes an internal forcing can
easily be generalized to $n$th-order ODEs. In particular, setting
\[
z(t) = y(t) - y(0) - y'(0)t - \frac{1}{2!} y''(0)t^2 - \cdots - \frac{1}{(n-1)!} y^{(n-1)}(0)t^{n-1}
\]
will do the trick.
For our first example, let us find the solution to the IVP
\[
y'' + 4y' + 3y = 5u(t-2); \quad y(0) = y'(0) = 0.
\]
We previously found the general solution via the method of variation of parameters, and the solution
curve is given in Figure 5.2. Applying the Laplace transform to both sides and using the result of
Theorem 5.3.6 and the table in (5.3.1) gives
\[
(s^2 + 4s + 3)\mathcal{L}[y](s) = 5e^{-2s}\,\frac{1}{s}
\quad\Longrightarrow\quad
\mathcal{L}[y](s) = e^{-2s}\,\frac{5}{s(s^2 + 4s + 3)}
= e^{-2s}\left[ \frac{5}{3}\,\frac{1}{s} - \frac{5}{2}\,\frac{1}{s+1} + \frac{5}{6}\,\frac{1}{s+3} \right].
\]
After comparing with the table in (5.3.1) and using the result of Proposition 5.3.5 we get the expected
result; namely,
\[
y(t) = \left[ \frac{5}{3} - \frac{5}{2} e^{-(t-2)} + \frac{5}{6} e^{-3(t-2)} \right] u(t-2).
\]
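The partial fraction decomposition and the resulting solution for the step forcing $5u(t-2)$ can be double-checked numerically. This is a small sketch, not part of the notes; the function names are hypothetical, and the derivatives are entered by hand from the solution formula.

```python
import math

def Y_hat(s):
    """5 / (s (s^2 + 4s + 3)), the factor multiplying e^{-2s}."""
    return 5.0 / (s * (s**2 + 4.0 * s + 3.0))

def Y_hat_partial(s):
    """The same factor written in partial fractions."""
    return 5.0 / (3.0 * s) - 2.5 / (s + 1.0) + (5.0 / 6.0) / (s + 3.0)

def y(t):
    if t < 2.0:
        return 0.0
    r = t - 2.0
    return 5.0 / 3.0 - 2.5 * math.exp(-r) + (5.0 / 6.0) * math.exp(-3.0 * r)

def dy(t):
    if t < 2.0:
        return 0.0
    r = t - 2.0
    return 2.5 * math.exp(-r) - 2.5 * math.exp(-3.0 * r)

def d2y(t):
    if t < 2.0:
        return 0.0
    r = t - 2.0
    return -2.5 * math.exp(-r) + 7.5 * math.exp(-3.0 * r)

def ode_residual(t):
    """y'' + 4y' + 3y - 5u(t-2); zero if y solves the ODE."""
    forcing = 5.0 if t >= 2.0 else 0.0
    return d2y(t) + 4.0 * dy(t) + 3.0 * y(t) - forcing

for s in (0.5, 1.0, 2.7):
    print(s, Y_hat(s) - Y_hat_partial(s))
for t in (0.5, 2.0, 3.0, 7.5):
    print(t, ode_residual(t))
```

Note that both $y$ and $y'$ vanish at $t = 2$, so the solution leaves the rest state smoothly when the step forcing switches on.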
For our second example, let us find the solution to the IVP
\[
y'' + 4y' + 3y = 5\delta(t-2); \quad y(0) = y'(0) = 0.
\]
We previously found the general solution via the method of variation of parameters, and the solution
curve is given in Figure 5.2. Applying the Laplace transform to both sides and using the result of
Theorem 5.3.6 and the table in (5.3.1) gives
\[
(s^2 + 4s + 3)\mathcal{L}[y](s) = 5e^{-2s}
\quad\Longrightarrow\quad
\mathcal{L}[y](s) = e^{-2s}\,\frac{5}{s^2 + 4s + 3}
= e^{-2s}\left[ \frac{5}{2}\,\frac{1}{s+1} - \frac{5}{2}\,\frac{1}{s+3} \right].
\]
After comparing with the table in (5.3.1) and using the result of Proposition 5.3.5 we get the expected
result; namely,
\[
y(t) = \left[ \frac{5}{2} e^{-(t-2)} - \frac{5}{2} e^{-3(t-2)} \right] u(t-2).
\]
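For the impulsive forcing $5\delta(t-2)$ a quick numerical check is also possible. The sketch below relies on a standard fact that is not stated at this point in the notes: integrating the ODE across $t = 2$ shows that $y$ stays continuous while $y'$ jumps by the impulse strength, here 5. The function names are hypothetical.

```python
import math

def y(t):
    if t < 2.0:
        return 0.0
    r = t - 2.0
    return 2.5 * (math.exp(-r) - math.exp(-3.0 * r))

def dy(t):
    """Derivative of y; at t = 2 this returns the right-hand limit."""
    if t < 2.0:
        return 0.0
    r = t - 2.0
    return 2.5 * (-math.exp(-r) + 3.0 * math.exp(-3.0 * r))

def d2y(t):
    if t < 2.0:
        return 0.0
    r = t - 2.0
    return 2.5 * (math.exp(-r) - 9.0 * math.exp(-3.0 * r))

def homogeneous_residual(t):
    """y'' + 4y' + 3y away from the impulse; should vanish for t != 2."""
    return d2y(t) + 4.0 * dy(t) + 3.0 * y(t)

# Jump in y' across the impulse (right limit minus left limit).
jump_in_dy = dy(2.0) - dy(2.0 - 1e-9)

print(y(2.0), jump_in_dy)
for t in (1.0, 2.5, 4.0):
    print(t, homogeneous_residual(t))
```

In contrast with the step-forced example, the velocity here is discontinuous at $t = 2$, which is the signature of a delta function forcing.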
5.3.3 Example: the forced oscillator
Here we will revisit the problem discussed in Section 3.4.1: the forced, but undamped, mass-spring
system which is modeled by
\[
y'' + \omega_0^2 y = f(t).
\]
Since we are only interested in the motion due to the forcing, it will be assumed that the initial condition
for the system is given by $y(0) = y'(0) = 0$. In order to simplify the calculations, it will henceforth be
assumed that the natural frequency satisfies $\omega_0 = 1$. Applying the Laplace transform to both sides yields
that the Laplace transform of the solution is given by
\[
\mathcal{L}[y](s) = \frac{\mathcal{L}[f](s)}{s^2 + 1}. \tag{5.3.4}
\]

[Figure 5.3 plot: $y(t)$ against $t$ for $0 \le t \le 15$, vertical range $-2 \le y \le 2$.]
Figure 5.3: (color online) Two solution plots of the forced oscillator when the forcing function is modeled
by a sum of delta functions (see (5.3.5)). Here we choose $f_0 = 1$ and $\tau_1 = 3.7$. The two curves, one
solid (red) and one dashed (blue), are the solutions for $f_1 = 0.9$ and $f_1 = -0.9$. The two solutions
coincide for $0 \le t \le 3.7$.
Suppose that the forcing function is given by the function
\[
f(t) = f_0\,\delta(t-1) + f_1\,\delta(t-\tau_1), \quad \tau_1 > 1. \tag{5.3.5}
\]
This forcing function is meant to model the striking of the system with, say, a hammer once at $t = 1$ with
force $f_0$, and yet again at $t = \tau_1 > 1$ with force $f_1$. Given an $f_0$, our goal is to determine $f_1$ and $\tau_1$ such
that there is no motion for $t \ge \tau_1$. Using the form of the solution given in (5.3.4) and the fact that
\[
\mathcal{L}[f](s) = f_0 e^{-s} + f_1 e^{-\tau_1 s},
\]
we have that
\[
\mathcal{L}[y](s) = e^{-s} f_0\,\frac{1}{s^2+1} + e^{-\tau_1 s} f_1\,\frac{1}{s^2+1}.
\]
Inverting yields that the solution is given by
\[
y(t) = f_0 \sin(t-1)u(t-1) + f_1 \sin(t-\tau_1)u(t-\tau_1)
= \begin{cases}
0, & 0 \le t < 1 \\
f_0 \sin(t-1), & 1 \le t < \tau_1 \\
f_0 \sin(t-1) + f_1 \sin(t-\tau_1), & \tau_1 \le t.
\end{cases}
\]
A plot of two solution curves is given in Figure 5.3.
In order to determine the parameter values for which $y(t) \equiv 0$ for $t > \tau_1$, we need to rewrite the
solution. Doing so yields
\[
\begin{aligned}
y(t) &= -[f_0 \sin(1) + f_1 \sin(\tau_1)] \cos(t) + [f_0 \cos(1) + f_1 \cos(\tau_1)] \sin(t) \\
&= \sqrt{f_0^2 + f_1^2 + 2 f_0 f_1 \cos(\tau_1 - 1)}\, \cos(t + \phi),
\end{aligned}
\]
where the exact expression for the phase shift $\phi$ is irrelevant here. The second line follows after some
algebraic and trigonometric simplifications. Now, it is clearly the case that $y(t) \equiv 0$ for $t > \tau_1$ if and only
if
\[
0 = f_0^2 + f_1^2 + 2 f_0 f_1 \cos(\tau_1 - 1) = (f_0 + f_1)^2 - 2 f_0 f_1 [1 - \cos(\tau_1 - 1)].
\]

[Figure 5.4 plots: two panels, each showing $y$ against $t$.]
Figure 5.4: (color online) Cartoon plots of the solution curve for $f_1 = -f_0$ (left figure) and $f_1 = f_0$ (right
figure). The time $\tau_1$ is chosen in each figure so that there is no motion for $t > \tau_1$. In the left figure
$\tau_1 = 1 + 4\pi$, while in the right figure $\tau_1 = 1 + 3\pi$. The (red) arrows depict the direction of the forcing at
$t = 1$ and $t = \tau_1$.
From this we conclude that the motion can be stopped if
\[
f_1 = -f_0; \quad \tau_1 = 1 + 2k\pi, \quad k = 1, 2, \dots.
\]
In other words, in order to stop the motion we must apply at $t = \tau_1$ a force equal in magnitude, but
opposite in direction, to that applied at $t = 1$. Furthermore, the time at which we apply this force must
be $k$ multiples of the natural period (for any $k$) after the time at which the original force was applied
(see the left figure in Figure 5.4). Alternatively, the motion can be stopped if
\[
f_1 = f_0; \quad \tau_1 = 1 + (2k-1)\pi, \quad k = 1, 2, \dots.
\]
In this case the applied force at $t = \tau_1$ has the same magnitude and direction as that applied at $t = 1$,
but the time at which it is applied is $k-1$ multiples of the natural period plus an additional half-period
after the original force (see the right figure in Figure 5.4).
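The stopping conditions can be explored numerically. The following is a minimal sketch with $\omega_0 = 1$; the function names `y` and `residual_amplitude`, the parameter name `tau1` for the time of the second strike, and the sample values are illustrative assumptions rather than anything taken from the notes.

```python
import math

def y(t, f0, f1, tau1):
    """Solution of y'' + y = f0*delta(t-1) + f1*delta(t-tau1), zero initial data."""
    out = 0.0
    if t >= 1.0:
        out += f0 * math.sin(t - 1.0)
    if t >= tau1:
        out += f1 * math.sin(t - tau1)
    return out

def residual_amplitude(f0, f1, tau1):
    """Amplitude of the motion for t > tau1:
    sqrt(f0^2 + f1^2 + 2 f0 f1 cos(tau1 - 1))."""
    value = f0**2 + f1**2 + 2.0 * f0 * f1 * math.cos(tau1 - 1.0)
    return math.sqrt(max(value, 0.0))  # guard against tiny negative round-off

# Opposite strike after two full periods: the motion stops.
print(residual_amplitude(1.0, -1.0, 1.0 + 4.0 * math.pi))
# Identical strike after one and a half periods: the motion also stops.
print(residual_amplitude(1.0, 1.0, 1.0 + 3.0 * math.pi))
# Parameters of Figure 5.3: the motion persists in both cases.
print(residual_amplitude(1.0, 0.9, 3.7), residual_amplitude(1.0, -0.9, 3.7))
```

The first two amplitudes are zero up to round-off, whereas for the parameters of Figure 5.3 neither sign of $f_1$ stops the motion.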
5.3.4 Example: one-tank mixing problem
A one-tank mixing problem in which the initial concentration of brine in the tank is zero is modeled by
the first-order ODE
\[
y' + ay = f(t), \quad y(0) = 0. \tag{5.3.6}
\]
Here $y(t)$ is the amount of salt in the tank, $a > 0$ is the ratio of the inflow rate of briny solution to the
total amount of solution in the tank, and $f(t)$ is proportional to the incoming concentration of briny
solution. Without loss of generality we can suppose that $f(t)$ is actually the incoming concentration.
Here we will assume that the incoming concentration is given by the function
\[
f(t) = f_0 \sum_{j=1}^{\infty} \delta(t - j\tau), \quad \tau > 0.
\]
This forcing function is meant to model the instantaneous injection of a briny solution of concentration
$f_0$ into the tank at $t = \tau, 2\tau, 3\tau, \dots$.
Applying the Laplace transform to both sides yields that the Laplace transform of the solution is
given by
\[
\mathcal{L}[y](s) = \frac{\mathcal{L}[f](s)}{s + a} = f_0 \sum_{j=1}^{\infty} \frac{e^{-j\tau s}}{s + a}.
\]
Inverting yields that the solution is then given by
\[
y(t) = f_0 \sum_{j=1}^{\infty} e^{-a(t - j\tau)} u(t - j\tau).
\]
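In order to better understand the solution, consider the following representation of the solution. By the
definition of the unit step function we can rewrite the solution as
\[
y(t) = f_0
\begin{cases}
0, & 0 \le t < \tau \\
e^{-a(t-\tau)}, & \tau \le t < 2\tau \\
e^{-a(t-\tau)} + e^{-a(t-2\tau)}, & 2\tau \le t < 3\tau \\
e^{-a(t-\tau)} + e^{-a(t-2\tau)} + e^{-a(t-3\tau)}, & 3\tau \le t < 4\tau \\
\quad\vdots & \quad\vdots \\
\sum_{j=1}^{n} e^{-a(t-j\tau)}, & n\tau \le t < (n+1)\tau.
\end{cases}
\tag{5.3.7}
\]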
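Upon using the fact that
\[
\sum_{j=1}^{n} e^{-a(t-j\tau)} = e^{-at} \sum_{j=1}^{n} e^{aj\tau}
= \frac{e^{an\tau} - 1}{e^{a\tau} - 1}\, e^{-a(t-\tau)}
= \frac{e^{a\tau}}{e^{a\tau} - 1}\, e^{-a(t-n\tau)} - \frac{e^{a\tau}}{e^{a\tau} - 1}\, e^{-at}, \tag{5.3.8}
\]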
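we can then say that the solution is given by
\[
y(t) = f_0
\begin{cases}
0, & 0 \le t < \tau \\
\dfrac{e^{an\tau} - 1}{e^{a\tau} - 1}\, e^{-a(t-\tau)}, & n\tau \le t < (n+1)\tau, \quad n = 1, 2, \dots,
\end{cases}
\tag{5.3.9}
\]
or
\[
y(t) = f_0
\begin{cases}
0, & 0 \le t < \tau \\
\dfrac{e^{a\tau}}{e^{a\tau} - 1}\, e^{-a(t-n\tau)} - \dfrac{e^{a\tau}}{e^{a\tau} - 1}\, e^{-at}, & n\tau \le t < (n+1)\tau, \quad n = 1, 2, \dots.
\end{cases}
\tag{5.3.10}
\]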
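The equivalence of the finite sum in (5.3.7) with the two closed forms (5.3.9) and (5.3.10) can be confirmed numerically. This is a small sketch; the function names and the parameter name `tau` for the time between injections are illustrative, and the sample times are chosen away from the injection times to avoid floating-point ambiguity in locating the interval.

```python
import math

def y_direct(t, a, tau, f0=1.0):
    """Finite sum from (5.3.7): one decaying exponential per past injection."""
    n = int(t // tau)  # number of injections that have occurred by time t
    return f0 * sum(math.exp(-a * (t - j * tau)) for j in range(1, n + 1))

def y_closed_539(t, a, tau, f0=1.0):
    """Closed form (5.3.9)."""
    n = int(t // tau)
    if n == 0:
        return 0.0
    ratio = (math.exp(a * n * tau) - 1.0) / (math.exp(a * tau) - 1.0)
    return f0 * ratio * math.exp(-a * (t - tau))

def y_closed_5310(t, a, tau, f0=1.0):
    """Closed form (5.3.10)."""
    n = int(t // tau)
    if n == 0:
        return 0.0
    c = math.exp(a * tau) / (math.exp(a * tau) - 1.0)
    return f0 * (c * math.exp(-a * (t - n * tau)) - c * math.exp(-a * t))

for tau in (0.5, 2.5):
    for t in (0.3, 1.3, 4.7, 7.9):
        print(tau, t, y_direct(t, 1.0, tau), y_closed_539(t, 1.0, tau),
              y_closed_5310(t, 1.0, tau))
```

The three columns agree to round-off. The form (5.3.10) is the convenient one for large $t$, since its second term decays like $e^{-at}$.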
Two plots of the solution for different values of $\tau$ are given in Figure 5.5.
An observation of the solution plots given in Figure 5.5 reveals that there are (at least) two interesting
cases to understand regarding the solution behavior. First suppose that $\tau \to 0^+$ (the solid (blue) solution
curve), which corresponds to the rapid injection of briny solution into the tank. Consider the form of
the solution given in (5.3.9). Let $N \ge 1$ be such that $n\tau$ is small for $n \le N + 1$, and let us suppose that
$0 \le t < (N+1)\tau$. By L'Hôpital's rule we have that
\[
\lim_{\tau \to 0^+} \frac{e^{an\tau} - 1}{e^{a\tau} - 1} = n;
\]
hence, the solution for small $\tau$ and $1 \le n \le N + 1$ is approximately given by
\[
y(t) \approx f_0\, n e^{-a(t-\tau)} = f_0\, n e^{-a(n-1)\tau} e^{-a(t-n\tau)}, \quad n\tau \le t < (n+1)\tau.
\]
It is then the case that on the interval $n\tau \le t < (n+1)\tau$,
\[
f_0\, n e^{-an\tau} \le y(t) \le f_0\, n e^{-a(n-1)\tau},
\]
i.e., both the minimum and maximum values are increasing as a function of $n$. Furthermore, the difference
between the minimum and maximum values, which is given by
\[
f_0\, n e^{-a(n-1)\tau} - f_0\, n e^{-an\tau} = f_0 (e^{a\tau} - 1)\, n e^{-an\tau},
\]
is an increasing function of $n$. This analysis explains the nature of the plot of the solid (blue) curve in
Figure 5.5 for (at least) $0 \le t \le 2$ ($N = 3$ with $\tau = 0.5$).

[Figure 5.5 plot: $y(t)$ against $t$ for $0 \le t \le 8$, vertical range $0 \le y \le 3$.]
Figure 5.5: (color online) Plots of the solution curve of (5.3.9) under the condition that $f_0 = a = 1$. The
thick solid (blue) curve is the solution when $\tau = 0.5$, and the thick dashed (red) curve is the solution
when $\tau = 2.5$.

On the other hand, now suppose that $M \ge 1$ is large enough so that $e^{-at} \ll 1$ for $t > M\tau$. Then upon
using the form of the solution given in (5.3.10) we have that for $n > M$,
\[
y(t) \approx f_0\, \frac{e^{a\tau}}{e^{a\tau} - 1}\, e^{-a(t-n\tau)}, \quad n\tau \le t < (n+1)\tau.
\]
For sufficiently large $t$ we can then say that the solution is periodic with period $\tau$ and satisfies
\[
f_0\, \frac{1}{e^{a\tau} - 1} \le y(t) \le f_0\, \frac{e^{a\tau}}{e^{a\tau} - 1}.
\]
Note that the difference between the minimum and maximum values is precisely $f_0$. For the parameters
associated with Figure 5.5 we have that
\[
f_0\, \frac{1}{e^{a\tau} - 1} \approx 1.54, \quad f_0\, \frac{e^{a\tau}}{e^{a\tau} - 1} \approx 2.54,
\]
which is in very good agreement with the plotted solution. Further note that the minimal and maximal
amplitudes become arbitrarily large as $\tau \to 0^+$.
Now suppose that $\tau \to +\infty$, which corresponds to the dashed (red) solution curve in Figure 5.5.
Consider the form of the solution given in (5.3.10). Since for large $\tau$
\[
\frac{e^{a\tau}}{e^{a\tau} - 1} \approx 1,
\]
and since $e^{-at} \ll 1$ for $t > \tau \gg 0$, it will be the case that the solution is approximately given by
\[
y(t) \approx f_0
\begin{cases}
0, & 0 \le t < \tau \\
e^{-a(t-n\tau)}, & n\tau \le t < (n+1)\tau, \quad n = 1, 2, \dots.
\end{cases}
\]
In other words, the sum in (5.3.7) is dominated by the last term. As is the case for small $\tau$ the solution
is periodic with period $\tau$; however, the upper and lower bounds are now different with
\[
f_0 e^{-a\tau} \le y(t) \le f_0.
\]
The maximal amplitude is independent of $\tau$, whereas the minimal amplitude vanishes as $\tau \to +\infty$.
For the parameters associated with Figure 5.5 we have that $f_0 e^{-a\tau} \approx 0.08$, which again is in very good
agreement with the plotted solution. Physically speaking, the rapid injection of the briny solution
causes the asymptotic minimal and maximal concentration in the tank to be (potentially much) larger
than that of the incoming concentration via a build-up of solute, whereas a very slow injection of the
briny solution does not allow for this possibility. Furthermore, as the time between injections increases
the solution inside the tank becomes almost distilled water before the next injection.
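The numbers quoted for Figure 5.5 are easy to reproduce. The sketch below uses $f_0 = a = 1$ as in the figure; the function names, the parameter name `tau` for the injection period, and the choice of 60 injections to represent "large $t$" are assumptions made here for illustration.

```python
import math

def long_time_bounds(a, tau, f0=1.0):
    """(min, max) of the periodic limit obtained from (5.3.10)."""
    growth = math.exp(a * tau)
    return f0 / (growth - 1.0), f0 * growth / (growth - 1.0)

def large_tau_bounds(a, tau, f0=1.0):
    """Approximate (min, max) when tau is large: f0*e^{-a*tau} and f0."""
    return f0 * math.exp(-a * tau), f0

def peak_after(n, a, tau, f0=1.0):
    """Exact value of y just after the n-th injection, i.e. y(n*tau),
    computed from the finite sum in (5.3.7)."""
    return f0 * sum(math.exp(-a * (n - j) * tau) for j in range(1, n + 1))

low, high = long_time_bounds(1.0, 0.5)
print(round(low, 2), round(high, 2))       # rapid injection, tau = 0.5
print(peak_after(60, 1.0, 0.5))            # exact peak after many injections
print(large_tau_bounds(1.0, 2.5))          # slow injection, tau = 2.5
print(long_time_bounds(1.0, 2.5))          # periodic limit for comparison
```

For $\tau = 0.5$ the exact peak after many injections matches the upper bound of the periodic limit, and for $\tau = 2.5$ the large-$\tau$ approximation differs only slightly from the periodic limit.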
5.3.5 The transfer function and convolutions
[Figure 5.6 diagram: the input $F(s)$ enters a block labeled $H(s)$, and the output is $Y(s) = H(s)F(s)$.]
Figure 5.6: (color online) The input $F(s)$ is the Laplace transform of the forcing function $f(t)$, and the
output $Y(s)$ is the Laplace transform of the solution $y(t)$. The physical system being modeled is described
by the transfer function $H(s)$.
We will conclude our study of the Laplace transform by looking at the form of the solution through
a different lens. Consider the $n$th-order ODE with zero initial data given by
\[
y^{(n)} + a_{n-1} y^{(n-1)} + \cdots + a_1 y' + a_0 y = f(t). \tag{5.3.11}
\]
Upon setting
\[
Y(s) := \mathcal{L}[y](s), \quad F(s) := \mathcal{L}[f](s),
\]
we know from Assumption 5.3.8 and the linearity of the Laplace transform that
\[
(s^n + a_{n-1} s^{n-1} + \cdots + a_1 s + a_0) Y(s) = F(s) \quad\Longrightarrow\quad Y(s) = H(s)F(s),
\]
where
\[
H(s) := \frac{1}{s^n + a_{n-1} s^{n-1} + \cdots + a_1 s + a_0}
\]
is known as the transfer function for the ODE (5.3.11). In the engineering vernacular the transfer
function represents the physical system which is being modeled by an ODE. See Figure 5.6 for a typical
transform representation of a system response (the solution $y(t)$) to a given input (the forcing function
$f(t)$).
Regarding the transfer function there are (at least) two important things to note:
(a) it is the inverse (that is, the reciprocal) of the characteristic polynomial for the ODE (5.3.11);
(b) its poles are the zeros of the characteristic polynomial.
Denote by $h(t)$ the inverse Laplace transform of the transfer function, i.e., $\mathcal{L}[h](s) = H(s)$. Regarding the
homogeneous solution to (5.3.11), we know that the locations of the zeros of the characteristic polynomial
determine the solution behavior. If all of the zeros have negative real part, then the homogeneous
solution will be transient, i.e., it will decay to zero exponentially fast as $t \to +\infty$: the zero solution
is (asymptotically) stable. If (at least) one of the zeros has positive real part, then there will be some
solutions which grow exponentially fast as $t \to +\infty$: the zero solution will be unstable. If some of the
zeros are purely imaginary (but simple) and none has positive real part, then there will be solutions
which are uniformly bounded, but which do not decay: the zero solution is stable. Now, since the transfer
function is the inverse of the characteristic polynomial, then by our method of inverting the Laplace
transform we know that the inverse transform $h(t)$ will be a homogeneous solution. Thus, the poles of the
transfer function, i.e., those $s$-values for which the function $1/H(s)$ (which is simply the characteristic
polynomial) has a zero, determine the solution behavior. Figure 5.7 gives two cartoons in which the
potential poles of a transfer function are plotted.
[Figure 5.7 plots: two panels labeled "stable" (left) and "unstable" (right), each showing poles in the complex plane with axes $\mathrm{Re}(s)$ and $\mathrm{Im}(s)$.]
Figure 5.7: (color online) The poles of the transfer function are denoted by (blue) circles. Recall that if
$s = a + ib$, then $\mathrm{Re}(s) = a$ and $\mathrm{Im}(s) = b$. In the left figure all of the poles have negative real part, so that
the homogeneous solution is a transient solution. In the right figure some of the poles have positive real
part, so that the zero solution is unstable.
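The classification by pole location can be illustrated for second-order problems. The following is a minimal sketch, not a general-purpose tool: the function names are hypothetical, only the characteristic polynomial $s^2 + a_1 s + a_0$ is treated, and the handling of a repeated pole on the imaginary axis (reported as unstable because of polynomial growth) is a standard fact added here that goes slightly beyond the cases listed in the text.

```python
import cmath

def poles_second_order(a1, a0):
    """Zeros of s^2 + a1*s + a0, i.e. the poles of H(s) = 1/(s^2 + a1*s + a0)."""
    root = cmath.sqrt(a1 * a1 - 4.0 * a0)
    return ((-a1 + root) / 2.0, (-a1 - root) / 2.0)

def classify(poles, tol=1e-12):
    """Stability of the zero solution from the pole locations."""
    reals = [p.real for p in poles]
    if all(r < -tol for r in reals):
        return "asymptotically stable"
    if any(r > tol for r in reals):
        return "unstable"
    # The remaining case has poles on the imaginary axis; bounded
    # solutions require these poles to be simple.
    on_axis = [p for p in poles if abs(p.real) <= tol]
    for i, p in enumerate(on_axis):
        for q in on_axis[i + 1:]:
            if abs(p - q) <= tol:
                return "unstable"
    return "stable"

for a1, a0 in ((4.0, 3.0), (0.0, 1.0), (0.0, -1.0), (2.0, 5.0)):
    poles = poles_second_order(a1, a0)
    print((a1, a0), poles, classify(poles))
```

The first pair of coefficients is the polynomial $s^2 + 4s + 3$ from the examples above, with poles at $-1$ and $-3$; the second is the undamped oscillator with $\omega_0 = 1$, whose poles $\pm i$ sit on the imaginary axis.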
Finally, given the transfer function it turns out to be the case that we can immediately write down
the solution to the ODE (5.3.11) without first finding the Laplace transform $Y(s)$. The convolution of
$h(t)$ and $f(t)$ is given by
\[
h * f(t) := \int_0^t h(t-u) f(u)\,du.
\]
We have the following:
Lemma 5.3.10. Consider the ODE (5.3.11) with zero initial data. Letting $h(t)$ denote the inverse Laplace
transform of the transfer function $H(s)$, we have that the solution is given by
\[
y(t) = h * f(t) \quad \left( = \int_0^t h(t-u) f(u)\,du \right).
\]
Proof. We have already seen that the Laplace transform of the solution is given by
\[
Y(s) = H(s)F(s).
\]
Given that we are proposing that the solution is the convolution of $h(t)$ and $f(t)$, in order to prove the
result we must then show that the Laplace transform of a convolution of two functions is precisely
the product of the Laplace transform of each function. By definition the Laplace transform of the
convolution is
\[
\mathcal{L}[h * f](s) = \int_0^{+\infty} e^{-st}\, h * f(t)\,dt
= \int_0^{+\infty} \int_0^t e^{-st} h(t-u) f(u)\,du\,dt.
\]
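Reversing the order of integration yields
\[
\int_0^{+\infty} \int_0^t e^{-st} h(t-u) f(u)\,du\,dt
= \int_0^{+\infty} \int_u^{+\infty} e^{-st} h(t-u) f(u)\,dt\,du,
\]
and applying the change of variables $t = u + r$ means that
\[
\int_0^{+\infty} \int_u^{+\infty} e^{-st} h(t-u) f(u)\,dt\,du
= \int_0^{+\infty} \int_0^{+\infty} e^{-s(u+r)} h(r) f(u)\,dr\,du.
\]
We conclude by noting that
\[
\begin{aligned}
\int_0^{+\infty} \int_0^{+\infty} e^{-s(u+r)} h(r) f(u)\,dr\,du
&= \int_0^{+\infty} \int_0^{+\infty} e^{-sr} h(r)\, e^{-su} f(u)\,dr\,du \\
&= \left( \int_0^{+\infty} e^{-sr} h(r)\,dr \right) \left( \int_0^{+\infty} e^{-su} f(u)\,du \right),
\end{aligned}
\]
which is simply the product of $H(s)$ and $F(s)$.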
As a consequence of Lemma 5.3.10, the solution to an ODE with zero initial data can easily be
written down once the transfer function has been inverted. For example, again consider the forced, but
undamped, mass-spring system modeled by
\[
y'' + \omega_0^2 y = f(t), \quad y(0) = y'(0) = 0.
\]
The transfer function for this system is given by
\[
H(s) = \frac{1}{s^2 + \omega_0^2} = \frac{1}{\omega_0}\, \frac{\omega_0}{s^2 + \omega_0^2},
\]
so that the inverse Laplace transform $h(t)$ is
\[
h(t) = \frac{1}{\omega_0} \sin(\omega_0 t).
\]
As a consequence of Lemma 5.3.10, the solution to the IVP is then
\[
y(t) = \frac{1}{\omega_0} \int_0^t \sin(\omega_0 (t-u)) f(u)\,du.
\]
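The convolution formula can be evaluated numerically and compared with a case that is solvable by hand. In the sketch below the constant forcing $f(t) = 1$, the value $\omega_0 = 2$, and the helper `convolve_at` (a composite Simpson rule) are assumptions chosen for illustration; for this forcing the IVP has the elementary solution $y(t) = (1 - \cos(\omega_0 t))/\omega_0^2$.

```python
import math

def convolve_at(h, f, t, steps=2000):
    """Composite Simpson approximation of the integral of h(t-u)*f(u)
    over 0 <= u <= t. `steps` must be even."""
    if t == 0.0:
        return 0.0
    step = t / steps
    total = h(t) * f(0.0) + h(0.0) * f(t)
    for k in range(1, steps):
        u = k * step
        weight = 4.0 if k % 2 else 2.0
        total += weight * h(t - u) * f(u)
    return total * step / 3.0

omega0 = 2.0  # assumed natural frequency

def h(t):
    """Inverse transform of the transfer function 1/(s^2 + omega0^2)."""
    return math.sin(omega0 * t) / omega0

def f(t):
    """Assumed forcing: a constant unit force."""
    return 1.0

def exact(t):
    """Solution of y'' + omega0^2 y = 1 with zero initial data."""
    return (1.0 - math.cos(omega0 * t)) / omega0**2

for t in (0.0, 1.0, 5.0):
    print(t, convolve_at(h, f, t), exact(t))
```

For a forcing with no elementary antiderivative the same quadrature still yields values of $y(t)$, although, as with the formula itself, the qualitative behavior of the solution is not visible without evaluating the integral.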
Of course, even though the solution has this relatively compact form, it is not at all clear from the
formula what the solution behavior is. It is for this reason that the convolution form of the solution
is rarely used in practice.
