DIFFERENTIAL CALCULUS
WORKBOOK
Leo Jonker
Queen’s University
Fall 2007-2008
Foreword
duction to each topic is concise, but complete. You should be able to read
and understand it. At certain points the flow of thought is interrupted by
a question for which the answer is not provided. If the notes are used in
a class, this is where the instructor will ask you to become involved in the
completion of the argument. If the notes are used as self-study, then this is
where you try to make sense of the question. When you get stuck you can
consult James Stewart, Calculus: Early Transcendentals, the text for the
course, or any Calculus book that may be available to you. At other times,
the gaps in the notes follow problems in which the theory you have learned
can be applied. These, too, are for you to fill in. Some of these problems are
indicated as “Concept Questions”. These are multiple-choice questions that
signal points at which key concepts are often misunderstood or misapplied.
They have short answers, but do require some thought.
To find these learning objects, go to the course website and click on “Com-
puter assistance” in the column on the left.
Contents
WHAT IS A DERIVATIVE REALLY? 1
This means that if you try successive input values, equally spaced and very
close to each other, the output values seem to go up at a constant rate. Here
are two examples:
Consider the function $f(x) = x^2$. We know this function is not linear (its
graph is not a straight line), and yet, look at the following table of (a few)
values of this function:
    x        x^2
    3.000    9.000
    3.001    9.006
    3.002    9.012
    3.003    9.018
It sure looks linear on this small scale! So does the following table of some
values of the function $f(x) = e^x$:
    x        e^x
    1.000    2.71828
    1.001    2.72100
    1.002    2.72372
    1.003    2.72644
In each case the output goes up by 0.00272 when the input goes up by 0.001;
so once again, this function looks linear when you get close.
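If you have a computer handy, you can reproduce the pattern in the two tables above for yourself. The following Python sketch is not part of the course materials; the function name `successive_differences` and the choice of three steps are our own assumptions, made only for illustration.

```python
import math

def successive_differences(f, start, step, count):
    """Return f(x + step) - f(x) for `count` equally spaced inputs beginning at `start`."""
    xs = [start + i * step for i in range(count + 1)]
    return [f(xs[i + 1]) - f(xs[i]) for i in range(count)]

# Differences for x^2 near x = 3 and for e^x near x = 1, with inputs 0.001 apart.
square_diffs = successive_differences(lambda x: x * x, 3.0, 0.001, 3)
exp_diffs = successive_differences(math.exp, 1.0, 0.001, 3)

print("x^2:", square_diffs)
print("e^x:", exp_diffs)
```

Within each list the differences are nearly, but not exactly, equal. That small discrepancy is what remains of the curvature of the graph at this scale.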
This property of looking linear when you “zoom in to a point” becomes
visual when you apply it to the graph of a function. Here are three pictures,
produced by Maple, of the graph of the exponential function $e^x$ at smaller
and smaller scales around the point x = 1:
[Figure: three Maple plots of the graph of $y = e^x$ around the point $x = 1$, on the x-intervals 0 to 2, 0.8 to 1.2, and 0.9 to 1.1.]
This property that makes graphs look linear when observed on a small scale,
is closely related to the fact that when we look around us, the earth looks
more or less flat. We can, perhaps, forgive our distant ancestors for thinking
that it really was flat.
Of course, as with all simplifications, there is a caveat. The secret is not
quite true of all functions. If the graph of your function has a corner at some
point, as in the case of f (x) = |x|, no amount of zooming in is going to make
that corner go away. A function that looks linear when you zoom in to a
point on its graph is called a differentiable function. Fortunately most
of the functions we deal with are differentiable, or else are differentiable at
most points.
Eventually we should turn our observation about the local linearity of func-
tions into something we can do mathematics with. In particular, the insight
should help us do certain kinds of calculations.
Before we get to that, however, there is a second secret at the heart of
Calculus, which has a lot to do with the one we revealed at the start of this
chapter, even though it may seem quite unlike it:
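1. $\dfrac{5+\delta}{3} \approx \dfrac{5}{3}$
2. $\dfrac{\delta+\delta^2}{\delta^2} \approx \delta$
3. $5 + \delta \approx 5$
4. $\dfrac{7-\delta}{\delta+\delta^2} \approx \dfrac{7}{\delta}$
5. $\dfrac{13}{\delta} \approx 13$

A. Equation 3.
B. Equation 5.
C. Equations 1 and 3.
D. Equations 1, 3, and 4.
E. All of the equations.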
explain why this function will seem linear if we stay very close to any given
initial point. Say we start at x = a, and examine what happens to f (x) when
x is replaced by a + ∆x, where ∆x represents a very small number. That is,
we want to see how the value of this function changes as ∆x changes, but is
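kept very small. Then
$$f(a + \Delta x) = (a + \Delta x)^2 = a^2 + 2a\Delta x + (\Delta x)^2 .$$
By the secret on page 2, we can ignore $(\Delta x)^2$ for all practical purposes, as
long as $\Delta x$ stays very small. Thus
$$f(a + \Delta x) \approx a^2 + 2a\Delta x .$$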
$$g(x) = a^2 + 2a(x - a) ,$$
where, as we said already, $f(x)$ and $g(x)$ are extremely close to each other,
as long as $x$ is very close to $a$. Notice that if we let $x = a$ the two functions
are precisely equal to each other.
Notice, from the equation of the graph of g, that the tangent line to the
graph of f at (a, f (a)) has slope 2a. This slope is defined as the derivative
of $f(x) = x^2$ at $a$.
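To see how good the tangent line approximation is, you can compare $f(x) = x^2$ with the linear function $g(x) = a^2 + 2a(x - a)$ numerically. This is only a sketch: the name `tangent_approximation` and the choice $a = 3$ are our own, hypothetical choices.

```python
def f(x):
    return x * x

def tangent_approximation(a, x):
    """The linear function g(x) = a^2 + 2a(x - a) that matches f at x = a."""
    return a * a + 2 * a * (x - a)

a = 3.0
for dx in (0.1, 0.01, 0.001):
    x = a + dx
    error = f(x) - tangent_approximation(a, x)
    # Algebraically the error is exactly (dx)^2; the printed value differs only by rounding.
    print(dx, error)
```

Each time the step shrinks by a factor of 10, the error shrinks by a factor of about 100. This is the sense in which $(\Delta x)^2$ can be ignored when $\Delta x$ is very small.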
$$f(x) = c_0 + c_1 x + c_2 x^2 + \cdots + c_n x^n .$$
Then to find the derivative $f'$ we could apply the same idea to all of the
terms at once: replace $x$ by $a + \Delta x$. Then
$$f(x) = c_0 + c_1 x + c_2 x^2 + \cdots + c_n x^n$$
is
$$f'(x) = c_1 + 2c_2 x + 3c_3 x^2 + \cdots + n c_n x^{n-1} .$$
Notice that this says that to calculate the derivative of a polynomial you
should calculate the derivatives of the power functions in the terms, and then
combine the results in the way they were combined in the original polynomial.
Notice also that the derivative of a constant function is always 0 (see what
you get if all the coefficients except c0 are equal to zero). The reason is that
the graph of a constant function is a horizontal line, and therefore the slope
is 0 at each point.
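The rule for polynomials is mechanical enough to hand to a computer. The sketch below stores a polynomial as its list of coefficients $[c_0, c_1, \ldots, c_n]$; this representation and the function names are our own assumptions, not something used elsewhere in these notes.

```python
def derivative_coefficients(coeffs):
    """Given [c0, c1, ..., cn], return [c1, 2*c2, ..., n*cn]."""
    return [k * c for k, c in enumerate(coeffs)][1:]

def evaluate(coeffs, x):
    """Evaluate c0 + c1*x + ... + cn*x^n at x."""
    return sum(c * x ** k for k, c in enumerate(coeffs))

p = [1, -3, 0, 2]                      # the polynomial 1 - 3x + 2x^3
dp = derivative_coefficients(p)        # coefficients of -3 + 6x^2
print(dp)
print(evaluate(dp, 2))                 # slope of the graph of p at x = 2
print(derivative_coefficients([7]))    # a constant function has no terms left
```

For a constant polynomial the list of derivative coefficients is empty, and `evaluate` of an empty list is 0, in agreement with the remark that the derivative of a constant function is always 0.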
Example 4. For what values of $x$ does the graph of $f(x) = x^4 - 8x^3 + 22x^2 - 24x + 3$
have positive slope and for which values does it have negative slope?
Rates of Change
(Sections 2.7 and 2.9 in Stewart)
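[Figure: graph of $f$ with the line segment joining $(a, f(a))$ and $(a + \Delta x, f(a + \Delta x))$; the horizontal change is labelled $\Delta x$ and the vertical change $\Delta y$.]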
In the diagram, this average rate is equal to the slope of the line segment
joining the points (a, f (a)) and (a + ∆x, f (a + ∆x)). The line extending this
segment is sometimes referred to as a “secant line”. Now suppose we let ∆x
get smaller and smaller. Then the point (a + ∆x, f (a + ∆x)) will gradually
slide (along the graph) towards (a, f (a)), and in the process the secant line
will turn (a little) and gradually converge to the tangent line, as in the next
picture:
[Figure: secant lines through $(a, f(a))$ turning towards the tangent line at $a$.]
On the one hand, the slope of this tangent line is what we mean by the
derivative $f'(a)$. On the other hand, this slope is what the slopes of the
secant lines get closer and closer to as we let $\Delta x$ get smaller and smaller.
This idea is expressed as a limit:
$$f'(a) = \lim_{\Delta x \to 0} \frac{f(a + \Delta x) - f(a)}{\Delta x} .$$
This means that if, in the fractional expression on the right, you let $\Delta x$ get
closer and closer to 0, then the value of that expression gets closer and closer
to $f'(a)$. In general, whenever we have any kind of expression $E(u)$ that
depends on a variable $u$, then by “the limit of $E(u)$ as $u \to c$” we mean the
quantity $L$ that $E(u)$ gets closer and closer to when $u$ is allowed to get closer
and closer to $c$. The notation for this is
$$L = \lim_{u \to c} E(u) .$$
$$f'(a) = \lim_{x \to a} \frac{f(x) - f(a)}{x - a} .$$
$$f'(x) = \frac{dy}{dx} = \frac{d}{dx} f(x) .$$
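Let us see how this limit approach might work for the function $f(x) = x^2$:
The slope of the secant between $(a, a^2)$ and $(a + \Delta x, (a + \Delta x)^2)$ is
$$\frac{(a + \Delta x)^2 - a^2}{\Delta x} = \frac{2a\Delta x + (\Delta x)^2}{\Delta x} ,$$
so
$$f'(a) = \lim_{\Delta x \to 0} \frac{2a\Delta x + (\Delta x)^2}{\Delta x} = \lim_{\Delta x \to 0} (2a + \Delta x) = 2a .$$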
Either way, we have confirmed the derivative calculation we did much earlier.
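You can also watch this limit take shape numerically. The sketch below computes the slope of the secant line of $f(x) = x^2$ for smaller and smaller $\Delta x$; the name `secant_slope` and the sample point $a = 3$ are our own choices for illustration.

```python
def secant_slope(f, a, dx):
    """Average rate of change of f between a and a + dx."""
    return (f(a + dx) - f(a)) / dx

a = 3.0
for dx in (0.1, 0.01, 0.001, 0.0001):
    # For f(x) = x^2 each slope is 2a + dx, up to rounding.
    print(dx, secant_slope(lambda x: x * x, a, dx))
```

The printed slopes settle towards $2a = 6$, though no single one of them equals it. The limit is the number they approach, not a value any one secant attains.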
We saw earlier that the derivative of $x^n$ is $nx^{n-1}$, for any positive integer $n$.
In fact this formula is valid even when n is not a positive integer, but any
fixed real number. We do not have time to do all the proofs necessary to
show this, but we do have time to show it for two instances.
What does the expression on the right look like for this function?
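A. $\displaystyle \lim_{\Delta x \to 0} \frac{\frac{1}{a} + \Delta x - \frac{1}{a}}{\Delta x}$
B. $\displaystyle \lim_{\Delta x \to 0} \frac{\frac{1}{a + \Delta x} - \frac{1}{a}}{\Delta x}$
C. $\displaystyle \lim_{\Delta x \to 0} \frac{\Delta x}{(a + \Delta x) - a}$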
Example 5. Calculate this limit and thus find the derivative of $x^{-1}$.
For a second example, we will find the derivative of $\sqrt{x} = x^{1/2}$.
Concept Question 4. If we want to calculate the derivative of $\sqrt{x}$, say at
a point $a$, we should begin with the expression for the average rate of change
between $a$ and $a + \Delta x$. Which of the following is the correct expression for
that?
A. $\dfrac{\sqrt{a + \Delta x} - \sqrt{a}}{\Delta x}$
B. $\dfrac{(a + \Delta x) - a}{\Delta x}$
C. $\dfrac{(\sqrt{a} + \sqrt{\Delta x}) - \sqrt{a}}{\Delta x}$
We must take the limit of this expression as ∆x → 0. This does not look
simple, and may require some algebra to transform the expression:
Example 6. Find the limit of the expression for the average rate of change
at $a$ for the function $f(x) = \sqrt{x}$.
A combination of similar techniques will show that the theorem for derivatives
of power functions,
$$\frac{d}{dx} x^n = n x^{n-1} ,$$
is true for any fixed rational power $n$. In fact, as already pointed out, the
theorem is true for any fixed real power whatsoever; but proving that requires
other methods.
[Figure: graphs of the exponential functions $2^x$, $e^x$, $(0.5)^x$, $(1.5)^x$ and $1^x$.]
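A. $\displaystyle \lim_{\Delta x \to 0} \frac{e^{a + \Delta x} - e^a}{\Delta x} = 1$
B. $\displaystyle \lim_{\Delta x \to 0} \frac{(e^0 + e^{\Delta x}) - e^0}{\Delta x} = 1$
C. $\displaystyle \lim_{\Delta x \to 0} \frac{e^{\Delta x} - 1}{\Delta x} = 1$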
Theorem:
$$\frac{d}{dx}(e^x) = e^x$$
In other words, at every point, the value of the function $e^x$ is precisely
equal to its rate of change.
If you have had some study of exponential growth in high school, this will
sound very familiar. When something grows exponentially, it means that the
rate at which a function increases is proportional to its value.
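A quick numerical experiment makes the theorem plausible. The following sketch, which is not part of the course materials, compares the average rate of change of $e^x$ over a very short interval with the value of $e^x$ itself; the function name and the sample point $a = 2$ are hypothetical choices.

```python
import math

def exp_difference_quotient(a, dx):
    """Average rate of change of e^x between a and a + dx."""
    return (math.exp(a + dx) - math.exp(a)) / dx

# At a = 0 the quotient approaches 1 as dx shrinks.
for dx in (0.1, 0.01, 0.001):
    print(dx, exp_difference_quotient(0.0, dx))

# At any other point the quotient is close to the value of e^x there.
a = 2.0
print(exp_difference_quotient(a, 1e-6), math.exp(a))
```

The last line prints two numbers that agree to several decimal places: a rate of change on the left and a function value on the right.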
Note that when we speak of the function $cf$ we mean the function whose value
at an input $x$ is equal to the product $cf(x)$. If we evaluate this function at
$a$ we get $cf(a)$; if we evaluate it at $a + \Delta x$ we get $cf(a + \Delta x)$. Thus the
average rate of change of $cf$ between $a$ and $a + \Delta x$ is
$$\frac{cf(a + \Delta x) - cf(a)}{\Delta x} .$$
This factors immediately to become
$$c\,\frac{f(a + \Delta x) - f(a)}{\Delta x} .$$
To find the derivative of $cf$ we have to take the limit of this as $\Delta x \to 0$. But
since $c$ is constant we can take the limit of the fraction and multiply by $c$
afterwards; it will come to the same thing (think about this!). Therefore
$$(cf)'(a) = c \lim_{\Delta x \to 0} \frac{f(a + \Delta x) - f(a)}{\Delta x} = c f'(a) .$$
If now we replace $a$ by $x$ we have a proof of the theorem.
The next theorem tells us how to differentiate the sum of two functions whose
derivatives we know already. Notice that it tells us to go ahead, differentiate
each function, and add the results afterward.
The Sum Rule: If the functions $f$ and $g$ are both differentiable, then so
is $f + g$, and
$$\frac{d}{dx}[f(x) + g(x)] = \frac{d}{dx} f(x) + \frac{d}{dx} g(x)$$
or, equivalently,
$$(f + g)' = f' + g'$$
Notice that we have already seen instances illustrating the sum rule, for when
we differentiated polynomials we found that it amounted to differentiating
each of the polynomial’s terms separately and then adding (or subtracting)
them.
Concept Question 6. What is the correct expression for the average rate
of increase of the function f (x) + g(x) between a and a + ∆x?
We are now ready to prove the Sum Rule, by taking the limit of this expres-
sion as ∆x → 0:
Now that we know how to differentiate the sum and the difference of two
functions whose derivatives we already know, it is time to turn to their prod-
uct. Here the product of f and g is thought of as a new function whose value,
at input x, is given by f (x)g(x). In other words,
$$(fg)(x) = f(x)g(x) .$$
DERIVATIVES OF COMBINATIONS OF FUNCTIONS 23
You are probably wondering why the formula is so complicated. This will
become clear when we look at the proof:
Notice that the coefficient of the linear term in this expression is $f(a)g'(a) +
g(a)f'(a)$. That is,
The domain of this new function does not include those input values at which
the denominator g gives the value 0.
or, equivalently,
$$\left(\frac{f}{g}\right)' = \frac{f'g - g'f}{g^2}$$
Again, when we prove this theorem we will see why the Quotient Rule has
this strange form.
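Before working through the proof, you may want to convince yourself numerically that the formula $(f'g - g'f)/g^2$ is right. The sketch below is an illustration only: the functions $f(x) = x^3$ and $g(x) = x^2 + 1$ are hypothetical choices, and `numerical_derivative` uses a symmetric difference quotient, an approximation technique that is not discussed in these notes.

```python
def numerical_derivative(h, x, dx=1e-6):
    """Approximate h'(x) by the symmetric quotient (h(x + dx) - h(x - dx)) / (2 dx)."""
    return (h(x + dx) - h(x - dx)) / (2 * dx)

def quotient_rule(f, df, g, dg, x):
    """Evaluate (f'g - g'f) / g^2 at x; only meaningful where g(x) is not 0."""
    return (df(x) * g(x) - dg(x) * f(x)) / g(x) ** 2

f, df = (lambda x: x ** 3), (lambda x: 3 * x ** 2)
g, dg = (lambda x: x ** 2 + 1), (lambda x: 2 * x)

x = 2.0
print(quotient_rule(f, df, g, dg, x))                  # value given by the Quotient Rule
print(numerical_derivative(lambda t: f(t) / g(t), x))  # direct numerical estimate
```

The two printed values agree to many decimal places. Try replacing the formula by the tempting but wrong guess $f'/g'$ and see how badly it fails.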
Example 12. Calculate the derivative of $\dfrac{e^x}{x^2}$.
Example 13. Find an equation of the tangent line to the graph $y = \dfrac{\sqrt{x}}{x + 1}$
at the point $(1, 0.5)$.
Example 14. Determine, from the expression we found for its derivative,
whether the graph has a low point, high point, or inflection point at x = 1.
Here is a picture of this graph, generated using Maple by entering the com-
mand
> plot(sqrt(x)/(x+1), 0..2);
[Figure: Maple plot of $y = \sqrt{x}/(x + 1)$ for $x$ from 0 to 2.]
Example 15. (From Section 3.7 in Stewart) If a tank holds 5000 liters of
water, which drains from the bottom of the tank in 40 minutes, then Torricelli’s
law gives the volume $V$ of the water remaining after $t$ minutes as
$$V = 5000\left(1 - \frac{t}{40}\right)^2 , \qquad 0 \le t \le 40 .$$
Find the rate at which water is draining from the tank after 5 minutes, and
after 30 minutes.
If you go to the course web site and click “computer-assisted learning” in the
column on the left, it will take you to “MathQ’s”, a set of interactive learning
devices. One of these is on the topic of the chain rule. Try it out!
We have studied derivatives of constant multiples, sums, differences, products
and quotients of functions. Another important way to combine functions is
through composition. If f and g are functions, the composite f ◦ g is the
function that for the input x gives the output f (g(x)). In other words,
(f ◦ g)(x) = f (g(x))
Concept Question 7. Suppose $f(x) = \sqrt{x}$ and $g(x) = x^2 + 1$. Then the
composition $f \circ g$ is
A. $x + 1$
B. $\sqrt{x}\,(x^2 + 1)$
C. $\sqrt{x^2 + 1}$
D. $x\sqrt{x^2 + 1}$
Concept Question 8. We want to express the function $k(x) = e^{x^3 + 5}$ as a
composition of two functions $f$ and $g$; that is, we want $k = f \circ g$. What should
$f$ and $g$ be?
Another way to put this: If $y = f(u)$ and $u = g(x)$ are both differentiable,
then
$$\frac{dy}{dx} = \frac{dy}{du}\,\frac{du}{dx}$$
Notice, however, that the two factors on the right are not fractions! Each
represents a derivative.
The idea behind the chain rule is very simple. The following diagram illustrates
this. Imagine starting at some initial input x, as illustrated on the left
of the diagram. The function g turns x into u, which f then turns into y.
Composing the two functions f and g is a little like building a “box” around
the two components f and g to make a new “input-output machine” called
F (dashed line in the diagram). Suppose we change x by a small amount ∆x.
This produces a small change ∆u in the output of g, where the relationship
between the two small changes is given (approximately) by the derivative:
$$\frac{\Delta u}{\Delta x} \approx g'(x)$$
Since u is also the input to f , the small change ∆u in its input in turn
produces a small change ∆y in the output of f , again related to ∆u by the
derivative of f :
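$$\frac{\Delta y}{\Delta u} \approx f'(u)$$
The derivative of $F$ at $x$ is (approximately) the ratio of the two small changes:
$$F'(x) \approx \frac{\Delta y}{\Delta x} = \frac{\Delta y}{\Delta u}\,\frac{\Delta u}{\Delta x}$$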
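[Diagram: the input $x$ enters $g$, which outputs $u = g(x)$; $u$ enters $f$, which outputs $y$. A change $\Delta x$ produces $\Delta u$, which produces $\Delta y$, with $\Delta u / \Delta x \approx g'(x)$ and $\Delta y / \Delta u \approx f'(u)$.]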
so,
THE CHAIN RULE 33
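$$\frac{\Delta y}{\Delta x} \approx \underbrace{f'(g(x))}_{\approx\, \Delta y / \Delta u} \cdot \underbrace{g'(x)}_{\approx\, \Delta u / \Delta x} .$$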
While the equations in this discussion are all in the form of approximate
equalities, in that the ratios $\frac{\Delta y}{\Delta u}$, $\frac{\Delta u}{\Delta x}$ and $\frac{\Delta y}{\Delta x}$ all represent average rates
of change, and therefore only approximate the instantaneous rates, these
approximations get better and better as $\Delta x$ (and therefore also $\Delta u$) tend to
zero. In the limit, therefore, the approximate equality becomes an exact
equality combining derivatives.
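The chain of small changes described above can be traced with actual numbers. In the sketch below the inner function $g(x) = x^2 + 1$ and the outer function $f(u) = u^3$ are hypothetical examples chosen by us, as is the name `chain_ratios`; the code simply follows a small change $\Delta x$ through $g$ and then through $f$.

```python
def g(x):
    return x ** 2 + 1      # inner function, u = g(x)

def f(u):
    return u ** 3          # outer function, y = f(u)

def chain_ratios(x, dx):
    """Return the ratios (dy/du, du/dx, dy/dx) of the small changes produced by dx."""
    u = g(x)
    du = g(x + dx) - u          # change in the output of g
    dy = f(u + du) - f(u)       # resulting change in the output of f
    return dy / du, du / dx, dy / dx

x = 1.0
exact = 3 * g(x) ** 2 * 2 * x   # f'(g(x)) * g'(x), using f'(u) = 3u^2 and g'(x) = 2x
for dx in (0.1, 0.001, 0.00001):
    dy_du, du_dx, dy_dx = chain_ratios(x, dx)
    print(dx, dy_du * du_dx, dy_dx, exact)
```

For every step size the product of the two ratios equals the overall ratio, because $\Delta u$ cancels. What improves as $\Delta x$ shrinks is how close that common value is to the exact product of derivatives.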
Notice that $\lim_{\Delta u \to 0} \frac{\Delta y}{\Delta u}$ represents the derivative of $f$ at $u = g(x)$; that is, it
is equal to $f'(u) = f'(g(x))$. This accounts for the form of the first factor in
the formula for the chain rule:
Example 16. Calculate the derivative of $\sqrt{x^2 + 1}$.
Example 17. Find the derivative of $4e^{x^3 + 5}$.
Example 18. Find the derivative of $F(x) = e^{(x^2 + 3x + 8)^7}$.
Inverse Functions
(Section 1.6 in Stewart)
Our next and final goal in this short introduction to derivatives is to
present the formulas for the derivatives of logarithmic functions. In order
to do this we have included a brief review of the theory of inverse functions
(this section) followed by the definition of logarithmic functions (the next
section), before we discuss their derivatives in the last section.
Note the following indication that there is computer help for this topic on the
course web site. This computer help is interactive and designed not only to
test whether you remember the facts about inverse functions, but also to help
you understand them.
Computer help for this topic is available on the course website
[Figure: graph of $f(x)$, which is not one-to-one; two points $x_1$ and $x_2$ on the horizontal axis give the same output.]
The graph shows that the points $x_1$ and $x_2$ produce the same output
(i.e. $f(x_1) = f(x_2)$); therefore the function $f(x)$ is not one-to-one.
Geometrically, a function is one-to-one if no horizontal line
intersects the graph of the function at more than one point.
[Figure: graph of $g(x)$, which is one-to-one.]
The graph of $g(x)$ illustrates that there are no two distinct points in its
domain that produce the same output (i.e. $g(x_1) \ne g(x_2)$ whenever $x_1 \ne x_2$).
Suppose we have a function f with domain D and range R. We want to look
for a function g that undoes what f does. If such a function g exists for
f , then it is called the inverse of f . In symbols this can be written as
g(y) = x ⇐⇒ f (x) = y.
This can be illustrated with the following diagram:
[Diagram: $f$ sends $x$ in the domain $D$ to $y$ in the range $R$; $g$ sends $y$ back to $x$.]
Then
g(f (x)) = x,
and
f (g(x)) = x.
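The two identities above say that $g$ undoes $f$ and $f$ undoes $g$. You can test a proposed inverse by checking both compositions at a few inputs, as in the following sketch. The pair $f(x) = 2x + 3$ and $g(y) = (y - 3)/2$ is a hypothetical example of ours, and passing a handful of spot checks is evidence, not a proof.

```python
def f(x):
    return 2 * x + 3

def g(y):
    """Proposed inverse of f: solve y = 2x + 3 for x."""
    return (y - 3) / 2

for x in (-2.0, 0.0, 1.5, 10.0):
    # Both compositions should return the number we started with.
    print(x, g(f(x)), f(g(x)))
```

If either printed column differs from the first, the proposed $g$ is not the inverse of $f$.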
A. g(y) = y + 5
B. g(x) = 1/(x + 5)
C. g(x) = x − 5
Example 19. Suppose $f(x) = 1/x$. What is its inverse $g(x)$?
$$\mathrm{N_2} + 3\,\mathrm{H_2} \longrightarrow 2\,\mathrm{NH_3} .$$
If the reaction is started with the concentration of hydrogen (in moles per
litre) three times as high as the concentration of nitrogen, then the amount
of ammonia (in moles) produced after t minutes is given by the formula
$$A(t) = 2\alpha - \frac{1}{\sqrt[3]{\dfrac{1}{8\alpha^3} + kt}} ,$$
[Figure: coordinate axes with $x_1$ marked on the horizontal axis and $y_1$ marked on the vertical axis.]
Example 21. Sketch the graph of the inverse of f on the axes given above.
Usually we write $f^{-1}$, rather than $g$, for the inverse. Note that this means
that $f^{-1}(x)$ is not the same as $(f(x))^{-1}$! The former is the inverse function
applied to $x$, while the latter is the reciprocal of $f(x)$.
To understand the effect of the inverse function $g$ of a function $f$ whose
graph is before you, you do not have to reflect the graph of $f$ to produce the
graph of $g$! The effect of the function $f$ is shown by going from a point $x_1$
on the horizontal axis via a vertical line to the graph of $f$, and then moving
horizontally towards the vertical axis, reaching it at $y_1 = f(x_1)$. The effect of
$g$ is seen on the same picture (without drawing the graph of $g$) by reversing
this procedure: start at a point $y_1$, then travel horizontally to the graph of
$f$ and then vertically to reach the horizontal axis at $x_1 = g(y_1)$.
Logarithmic Functions
(Section 1.6 in Stewart)
$\log_a(x)$.
g(y) = x ⇐⇒ f (x) = y ,
g(f (x)) = x ,
f (g(x)) = x .
Concept Question 11. Let $f(x) = a^x$ and $g(x) = \log_a(x)$. Which of the
following statements are true? ($\iff$ means “if and only if”)
i. $\log_a(y) = x \iff a^x = y$
ii. $a^{(\log_a) x} = x$
iii. $\log_a(a^x) = x$
iv. $a^{\log_a(x)} = x$
v. $\log_a(y) = a^x \iff x = y$

A. i, ii, and iii
B. ii, iii, and v
C. i, iii, and iv
D. i, ii, and iv
E. iii, iv, and v
[Figure: graph of $\log_a x$ against $x$, with the point 1 marked on the horizontal axis.]
You might say that logarithms turn products into sums, quotients into dif-
ferences and powers into products. These laws are a direct consequence of
the laws of exponents on page 57 in the textbook. MAKE SURE YOU
KNOW THESE LAWS VERY WELL.
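One way to get comfortable with these laws is to test them on numbers. The sketch below checks that a logarithm turns a product into a sum, a quotient into a difference and a power into a product. The helper `log_a` and the sample values are our own assumptions; because of rounding, the comparisons use `math.isclose` rather than exact equality.

```python
import math

def log_a(x, a):
    """Logarithm of x to the base a, for x > 0 and a > 0 with a != 1."""
    return math.log(x, a)

a, x, y, r = 2.0, 8.0, 32.0, 3.0

product_ok = math.isclose(log_a(x * y, a), log_a(x, a) + log_a(y, a))
quotient_ok = math.isclose(log_a(x / y, a), log_a(x, a) - log_a(y, a))
power_ok = math.isclose(log_a(x ** r, a), r * log_a(x, a))

print(product_ok, quotient_ok, power_ok)
```

Change the base and the sample values (keeping them positive) and run it again. The three checks should continue to succeed.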
$\ln x = y \iff e^y = x$;
$\therefore\ \ln e^x = x$;
$\therefore\ e^{\ln x} = x$ (if $x > 0$);
$\therefore\ \ln(e) = 1$.
Example 24. Solve for $x$ in the equation $\ln\dfrac{x - 2}{x - 1} = 1 + \ln\dfrac{x - 3}{x - 1}$.
Example 26. We know that $x = a^{\log_a(x)}$. Using the Chain Rule for the
right hand side, differentiate both sides and then use the result to calculate
the derivative of $\log_a x$.