
What Exactly Is dy/dx?

Author(s): Hugh Thurston


Source: Educational Studies in Mathematics, Vol. 4, No. 3 (Apr., 1972), pp. 358-367
Published by: Springer
Stable URL: http://www.jstor.org/stable/3482173
Accessed: 24/02/2015 13:49

This content downloaded from 130.235.66.10 on Tue, 24 Feb 2015 13:49:08 PM


All use subject to JSTOR Terms and Conditions

HUGH THURSTON

WHAT EXACTLY IS dy/dx?

I. INTRODUCTION

Mathematics is, by its very nature, precise and logical - for the most part. But there is one topic that falls short of the ideal, namely the definition and basic properties of dy/dx. Mathematics is also a concise and efficient language in which well-chosen symbols ease difficult computations; a mathematical symbol may be packed full of meaning and depend on a number of previously-introduced formulae, but at least it refers to a sharply-distinguished, well-defined concept. Again, there is an exception to this: dy/dx.
My aim in this essay is to pin-point these logical deficiencies, to show that they are consequences of a faulty definition, and to suggest a remedy. Fortunately a very simple satisfactory definition is possible.
The reader himself may have felt uneasy about dy/dx, possibly for reasons like the following.
(1) Most definitions of dy/dx do not enable us to answer such questions as:
In dy/dx, does the y denote a number or a function?
Does the x denote a number or a function?
Does dy/dx itself denote a number or a function?
(In the technical language of mathematical logic, the first question would be "... is the symbol y a numerical variable or a function-variable?". In the language of the New Mathematics it would be "... is the y a placeholder for a numeral or for the name of a function?")
For f', such questions present no problem at all. Given the definition

\[ f'(\xi) = \lim_{\delta \to 0} \frac{f(\xi + \delta) - f(\xi)}{\delta} \]

we can say firmly that f and f' denote functions and that ξ, f(ξ) and f'(ξ) denote numbers. Indeed, a well-written definition of f'(ξ) will start "Given a function f and a number ξ ...".
(2) Everyone knows that the relation y = f(x) yields dy/dx = f'(x). And if f is differentiable at, say, 0, then f'(0) makes sense. But dy/d0 does not. This is probably the commonest way in which the ambiguities in (1) plague the slightly-more-thoughtful-than-average student.
Educational Studies in Mathematics 4 (1972) 358-367. All Rights Reserved
Copyright © 1972 by D. Reidel Publishing Company, Dordrecht-Holland


(3) Let v and s denote the velocity and displacement of a particle moving along a straight line. If the particle at any time reverses direction then v is not a function of s, because the value of s does not then uniquely determine the corresponding value of v. In particular, if the particle oscillates in simple harmonic motion (for more than half a cycle), then v is not a function of s. In this context dv/ds is a valid and useful concept, as any physicist will confirm. If we refer to any definition, however, we see that dv/ds is not even defined unless v is a function of s.
(4) Even within pure mathematics itself, we need a definition of dy/dx that can be applied when y is not a function of x. 'Implicit differentiation', for instance, needs such a definition. I take an example from a recent book (Robert Bonic et al., Freshman calculus, page 89, problem 1a) to emphasize that the difficulty is still with us; many similar examples are to be found in older books. This particular problem goes as follows.
Given
(i) y⁴ = x² + 5
show that dy/dx = x/(2y³).
Given only (i), we cannot say that y is a function of x: the value of x does not uniquely determine the corresponding value of y. Thus dy/dx is not even defined.
(5) The situation is very similar when x and y are related parametrically. If x and y are both functions of u, then a standard result from elementary calculus is

\[ \frac{dy}{dx} = \frac{dy/du}{dx/du}. \]

Yet the parametric relationship does not guarantee that y is a function of x. For instance, if x = u² and y = 2au (the familiar parametric equations of a parabola) then y is not a function of x. Nevertheless the standard result dy/dx = a/u does serve to give the slope of the tangent correctly - or would do if dy/dx were in fact defined.
(6) I have kept until last the strongest objection to the traditional definition of dy/dx: it is ambiguous. The definition comes in several variants, but they are all equivalent to the following:
If y = f(x), then dy/dx = f'(x).
What happens if both y = f(x) and y = g(x)? Naturally, dy/dx will then be equal both to f'(x) and to g'(x). What, then, if f'(x) ≠ g'(x)? This can easily happen: if f, g, and x satisfy the condition f(x) = g(x), it does not necessarily follow that f'(x) = g'(x). Because the definition is ambiguous we should not perhaps be too surprised that it leads to trouble.
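A concrete instance may make objection (6) easier to see. The sketch below reads x as a number, which is one of the readings the traditional definition permits. The functions f(ξ) = ξ² and g(ξ) = ξ are hypothetical choices made only for this illustration, and `central_difference` is an illustrative helper rather than anything taken from the essay.

```python
def central_difference(func, point, step=1e-6):
    """Approximate func'(point) with a symmetric difference quotient."""
    return (func(point + step) - func(point - step)) / (2 * step)


def f(xi):
    # Hypothetical example function: f(xi) = xi squared.
    return xi ** 2


def g(xi):
    # Hypothetical example function: g(xi) = xi.
    return xi


point = 1.0
values_agree = f(point) == g(point)      # f(1) and g(1) are both 1
f_prime = central_difference(f, point)   # close to 2
g_prime = central_difference(g, point)   # close to 1

print(values_agree, round(f_prime, 6), round(g_prime, 6))
```

The two functions take the same value at 1, yet their derivatives there differ, so knowing only that y = f(x) and y = g(x) at that number settles nothing about "dy/dx".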


We can avoid some of the difficulties if we forego the use of dy/dx and use only the 'prime' notation. Indeed, in the Mathematical Gazette (No. 385) M. Bruckheimer and R.E. Scraton state flatly "there is no such thing as differentiation with respect to x." In a course of pure analysis we might well do without dy/dx and, indeed, few writers, if any, state the mean-value theorem in Leibniz's notation. But for calculus in the wide sense the sacrifice would be too great. We need ds/dt, dv/dt and v dv/ds in kinematics; we need dP/dV and dV/dT in the thermodynamics of gases; we need dα/ds and ds/dx in the differential geometry of curves; and we need dy/dx when x and y are related implicitly or parametrically. If u, v, w, x, y and z are six coordinates of parts of a machine (angular coordinates of wheels, linear coordinates of rods, and so on) each of which is a function of any other, then we can differentiate any one of these coordinates with respect to any other in Leibniz's notation without further ado; whereas the 'prime' notation would require thirty symbols for functions like the f for which u = f(v). Problems with a large number of related variables are not uncommon in the exact sciences.
Finally, even if the teacher does not feel that

\[ \frac{dw}{dx} = \frac{dw}{dz}\,\frac{dz}{dy}\,\frac{dy}{dx} \]

is immensely superior to

\[ (f \circ g \circ h)'(\alpha) = (f' \circ g \circ h)(\alpha)\,(g' \circ h)(\alpha)\,h'(\alpha) \]

as a formula for the chain-rule with two links, his students surely will.


Anyone who prefers the notation Df to f' for the derivative of f will find that everything that I have to say about f' will apply to Df, and he can mentally substitute it throughout the essay if he likes. Similarly, everything that I have to say about dy/dx will apply to Cauchy's notation D_x y. (Whether or not it is a pedagogical mistake to use a symbol that 'looks like a quotient but is not one' is a separate question that I am not concerned with here.)
By tinkering with the definition of dy/dx we can overcome the various difficulties. For (1) we have to decide whether x, y and dy/dx are to denote numbers or functions. A little thought shows that they must denote functions. This clears up point (2); if x denotes a function there is no temptation to replace it by a numeral. We must, of course, be clear what we mean by f(x) where f and x are both functions: it is the composite function

\[ \xi \mapsto f(x(\xi)). \]

We can overcome difficulty (3) by generalizing the definition of dy/dx so that it applies when y is only locally a function of x. This clears up (4) and (5) at the same time. For (6) we can restrict the definition to the case where x is locally one-to-one because then x and y between them define f uniquely (locally). We end up with something like the following:
If x and y are real-valued functions of a real variable (or "If x and y are functions in R into R") then dy/dx is the function defined as follows. For each real number ξ in the domain of x which has a neighbourhood N such that the restriction x₁ of x to N is one-to-one, and such that f is differentiable at x(ξ), where f = y(x₁⁻¹), the value of dy/dx at ξ is f'(x(ξ)).
Fortunately, this complication is unnecessary. If, instead of tinkering with the traditional treatment, we start from scratch, we can devise a very simple and natural definition. It will agree with the common-sense idea of what the derivative of y with respect to x ought to be, because it will arise naturally from the idea of rate-of-change; and dy/dx will be defined directly in terms of x and y without any extraneous f. Moreover, the well-known formula

\[ \frac{dz}{dx} = \frac{dz}{dy}\,\frac{dy}{dx} \]

will follow as a direct result of the 'limit of a product' rule. It will be a theorem that (under certain reasonable conditions) if y = f(x) then dy/dx = f'(x). Further, if y = f(x) locally, then dy/dx = f'(x) locally.
II. THE DEFINITION OF DERIVATIVE IN LEIBNIZ'S NOTATION

The derivative is essentially a tool for dealing with rates of change, so let us start by considering a typical rate of change.
Example
Let v denote the velocity of a particle moving along a line: that is to say, let v(τ) be the velocity at time τ, for each relevant τ. Let s denote the displacement of the particle. Between times τ and τ + δ the velocity will change from v(τ) to v(τ + δ); thus it will increase by v(τ + δ) - v(τ). Between the same times, the displacement will increase by s(τ + δ) - s(τ). Therefore the average rate of increase of velocity with respect to displacement, which is, of course, the average increase in velocity per unit increase in displacement, will be

\[ \frac{v(\tau + \delta) - v(\tau)}{s(\tau + \delta) - s(\tau)} \]

provided that s(τ + δ) - s(τ) ≠ 0.
The (instantaneous) rate of increase of velocity with respect to displacement at time τ is the limit as δ approaches zero of this average increase.
This suggests the following definition.


Definition
If x and y are functions, then dy/dx is the function defined by

\[ \frac{dy}{dx}(\tau) = \lim_{\delta \to 0} \frac{y(\tau + \delta) - y(\tau)}{x(\tau + \delta) - x(\tau)} \]

for every τ for which the limit exists.
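The definition can be tried out numerically. The following is only a rough sketch: it replaces the limit by difference quotients at a few shrinking values of δ, and the particular functions x(τ) = τ³ and y(τ) = τ⁶, as well as the helper name `difference_quotient`, are assumptions made for illustration.

```python
def difference_quotient(y, x, tau, delta):
    """Average rate of change of y with respect to x between tau and tau + delta."""
    return (y(tau + delta) - y(tau)) / (x(tau + delta) - x(tau))


def x(tau):
    # Hypothetical choice of x; deliberately not the identity function.
    return tau ** 3


def y(tau):
    # Hypothetical choice of y; here y happens to equal x squared.
    return tau ** 6


tau = 1.0
quotients = [difference_quotient(y, x, tau, delta)
             for delta in (1e-2, 1e-4, 1e-6)]
print(quotients)  # the values settle towards 2
```

Note that nothing in the computation mentions a function f with y = f(x); only x and y themselves are used, which is the point of the definition.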


Theorem 1. If dy/dx(τ) and dz/dy(τ) exist, then

\[ \frac{dz}{dx}(\tau) = \frac{dz}{dy}(\tau)\,\frac{dy}{dx}(\tau). \]

Proof: from the theorem on the limit of a product.
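The step behind this one-line proof can be written out. What follows is a sketch under a simplifying assumption that is not stated in the essay: that both increments x(τ + δ) - x(τ) and y(τ + δ) - y(τ) are non-zero for all sufficiently small non-zero δ.

```latex
\[
\frac{z(\tau+\delta)-z(\tau)}{x(\tau+\delta)-x(\tau)}
  = \frac{z(\tau+\delta)-z(\tau)}{y(\tau+\delta)-y(\tau)}
    \cdot
    \frac{y(\tau+\delta)-y(\tau)}{x(\tau+\delta)-x(\tau)}
\]
% As \delta \to 0 the first factor on the right tends to (dz/dy)(\tau)
% and the second to (dy/dx)(\tau); the limit of the product is the
% product of the limits, which gives (dz/dx)(\tau).
```

The cancellation of y(τ + δ) - y(τ) here is an ordinary cancellation of numbers, which is why the Leibniz form of the chain rule looks like a statement about fractions.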


Corollary. If dy/dx(τ) and dz/dy(τ) exist for every τ in the domains of x, y and z, then

\[ \frac{dz}{dx} = \frac{dz}{dy}\,\frac{dy}{dx}. \]

Theorem 2. If y'(τ) exists and x'(τ) exists and is non-zero, then

\[ \frac{dy}{dx}(\tau) = \frac{y'(\tau)}{x'(\tau)}. \]

Proof: from the theorem on the limit of a quotient.


Corollary. If x'(τ) and y'(τ) exist for every τ in the domains of x and y, and if x'(τ) is never zero, then

\[ \frac{dy}{dx} = \frac{y'}{x'}. \]
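Theorem 2 lends itself to a quick numerical check. In this sketch the choices x = exp and y = sin are hypothetical, picked because x' is never zero; the difference quotient with a small fixed δ stands in for the limit in the definition.

```python
import math


def difference_quotient(y, x, tau, delta):
    """Difference quotient of y against x between tau and tau + delta."""
    return (y(tau + delta) - y(tau)) / (x(tau + delta) - x(tau))


x = math.exp   # hypothetical x, with x'(tau) = exp(tau), never zero
y = math.sin   # hypothetical y, with y'(tau) = cos(tau)

tau = 0.5
from_definition = difference_quotient(y, x, tau, 1e-6)
from_theorem_2 = math.cos(tau) / math.exp(tau)   # y'(tau) / x'(tau)
print(from_definition, from_theorem_2)
```

The two printed numbers should agree to several decimal places, the small discrepancy coming from the finite δ.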

Theorem 3. If y = f(x) and x is differentiable at τ and f at x(τ), and if x'(τ) ≠ 0, then

\[ \frac{dy}{dx}(\tau) = f'(x(\tau)). \]

Proof: y'(τ) = f'(x(τ)) · x'(τ) by the theorem on the derivative of a composite function. Theorem 2 then yields our result.
Corollary. If x is differentiable and if f is differentiable at x(τ) for every τ in the domain of x, it follows that:
if y = f(x), then dy/dx = f'(x).

Theorem 4. If there is a neighbourhood U of a number τ such that

\[ y(\xi) = f(x(\xi)) \quad \text{for every } \xi \text{ in } U \]

and if x is differentiable at τ and f at x(τ) and if x'(τ) ≠ 0, then

\[ \frac{dy}{dx}(\tau) = f'(x(\tau)). \]

Proof: apply theorem 3 to the restrictions of x and y to U.


Note: we could phrase the first two lines of the statement of the theorem as "If y = f(x) locally at τ ..."
We could replace the conditions "x is differentiable at τ and x'(τ) ≠ 0" by "x is continuous at τ and x is not constant on any neighbourhood of τ", but the proof would then be rather different.
Example (implicit differentiation)
If x and y are two differentiable functions for which

\[ x^2 + y^2 = 1 \]

(that is to say, the function x² + y² is constant and has value 1), then, on the set of points at which the value of x' is non-zero,

\[ 2x + 2y\,\frac{dy}{dx} = 0. \]

Proof: let τ be such that x'(τ) ≠ 0. By the theorems on the derivative of a sum of two functions, of a composite function, and of a constant,

\[ 2x(\tau)\,x'(\tau) + 2y(\tau)\,y'(\tau) = 0. \]

Then theorem 2 yields

\[ 2x(\tau) + 2y(\tau)\,\frac{dy}{dx}(\tau) = 0 \]

as required.
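As an illustration, one pair of functions satisfying the hypothesis is x = cos and y = sin. That choice is an assumption made for this sketch, as is the helper `difference_quotient`, which approximates the limit in the definition with a small fixed δ.

```python
import math


def difference_quotient(y, x, tau, delta):
    """Difference quotient of y against x between tau and tau + delta."""
    return (y(tau + delta) - y(tau)) / (x(tau + delta) - x(tau))


x = math.cos   # hypothetical x
y = math.sin   # hypothetical y, so that x**2 + y**2 is constantly 1

tau = 1.0      # x'(1) = -sin(1) is non-zero
dy_dx = difference_quotient(y, x, tau, 1e-6)
residual = 2 * x(tau) + 2 * y(tau) * dy_dx
print(dy_dx, residual)  # residual is close to 0
```

On the whole circle y is not a function of x, yet dy/dx is computed directly from x and y at every τ where x'(τ) is non-zero.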

Example (parametric differentiation).
Let t be the identity-function and x = t² and y = 2at; then

\[ \frac{dy}{dx} = \frac{a}{t}. \]
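The parabola can be checked the same way. In the sketch below the value a = 3 is a hypothetical choice, and the two parameter values 2 and -2 are picked because they give the same x but different points of the curve.

```python
def difference_quotient(y, x, tau, delta):
    """Difference quotient of y against x between tau and tau + delta."""
    return (y(tau + delta) - y(tau)) / (x(tau + delta) - x(tau))


A = 3.0  # hypothetical value for the constant a


def x(tau):
    return tau ** 2


def y(tau):
    return 2 * A * tau


# tau = 2 and tau = -2 give the same x-value, 4, but different y-values.
upper = difference_quotient(y, x, 2.0, 1e-6)    # close to a/t = 1.5
lower = difference_quotient(y, x, -2.0, 1e-6)   # close to a/t = -1.5
print(upper, lower)
```

The two slopes differ although x is the same in both cases, which shows concretely why dy/dx has to be a function of the parameter rather than of the value of x.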
III. CONCLUSIONS

Now that we have presented our definition, let us turn back to the difficulties mentioned in the introduction. The first two are resolved because the definition makes it crystal-clear that dy/dx, x and y are functions. Difficulty number 3 is overcome, because we have defined dv/ds without requiring v to be a function of s. If s'(τ) ≠ 0 then, by theorem 2, the value of dv/ds at τ is v'(τ)/s'(τ). Therefore dv/ds is defined for all times except those at which the particle is (momentarily) stationary; which is precisely what a physicist would expect on common-sense grounds.
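For simple harmonic motion this can be made concrete. The sketch assumes unit amplitude and unit angular frequency, so that s = sin and v = cos; these normalisations, and the two sample times, are illustrative assumptions rather than anything fixed by the essay.

```python
import math


def difference_quotient(y, x, tau, delta):
    """Difference quotient of y against x between tau and tau + delta."""
    return (y(tau + delta) - y(tau)) / (x(tau + delta) - x(tau))


s = math.sin   # hypothetical displacement: unit amplitude, unit angular frequency
v = math.cos   # corresponding velocity, v = s'

outward = 0.3            # particle moving away from the centre
inward = math.pi - 0.3   # same displacement, but moving back

dv_ds_outward = difference_quotient(v, s, outward, 1e-6)
dv_ds_inward = difference_quotient(v, s, inward, 1e-6)
print(dv_ds_outward, dv_ds_inward)  # close to -tan(0.3) and +tan(0.3)
```

The displacement is the same at both times, but dv/ds takes opposite signs, so v is not a function of s while dv/ds is still perfectly well defined. At times where s' vanishes (for instance π/2 in this model) the difference quotients do not settle, in line with the exception noted above.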
In (4), the problem cited is not properly stated: it does not make clear what (if anything) x and y are. If stated properly it would not, of course, be
Given numbers x and y such that
(i) y⁴ = x² + 5
show that dy/dx = x/(2y³)
because this, on any definition of dy/dx, is nonsense; (√3)⁴ = 2² + 5, but no-one in his senses would maintain that d√3/d2 = 1/√27. If the problem is
Given functions x and y such that
(i) y⁴ = x² + 5
show that dy/dx = x/(2y³)
then (assuming that x and y are reasonably well-behaved) we have a situation like the one in (3); and so difficulty number 4 is resolved in the same way as number 3. Similar remarks apply to (5).
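The properly stated version of the problem can be tested on particular functions. In this sketch x is taken to be the identity function and y one of the two continuous branches of the relation; both choices are assumptions made for illustration, and other well-behaved pairs x, y would serve equally.

```python
import math


def difference_quotient(y, x, tau, delta):
    """Difference quotient of y against x between tau and tau + delta."""
    return (y(tau + delta) - y(tau)) / (x(tau + delta) - x(tau))


def x(tau):
    # Hypothetical choice: x is the identity function.
    return tau


def y_upper(tau):
    # Positive branch of y**4 = x**2 + 5.
    return (tau ** 2 + 5) ** 0.25


def y_lower(tau):
    # Negative branch of the same relation.
    return -((tau ** 2 + 5) ** 0.25)


tau = 2.0  # here x = 2 and y is +sqrt(3) or -sqrt(3)
results = []
for y in (y_upper, y_lower):
    from_definition = difference_quotient(y, x, tau, 1e-6)
    from_formula = x(tau) / (2 * y(tau) ** 3)
    results.append((from_definition, from_formula))
print(results)
```

For each branch the two numbers agree closely, and the formula gives 1/√27 on one branch and -1/√27 on the other. The quantity 1/√27 thus does have a meaning, but as the value of a function dy/dx at a point, not as a "derivative of the number √3 with respect to the number 2".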
Finally the really important difficulty, (6), is overcome because our definition of dy/dx is directly in terms of x and y.
One useful by-product of our definition is that it removes all temptation to use what I call the pseudo-Leibnizian notation, df/dx.
This notation is largely confined to text-books and examination-papers, and is not much used in actual problem-solving. It was definitely never used by Leibniz himself. Indeed, it seems quite modern; it is not used in the classic texts by Goursat, de la Vallée Poussin, Hardy or Courant.
It occurs in two slight variants. If y = f(x) then sometimes df/dx is used to mean dy/dx (that is, f'(x)) and sometimes to mean f'; and frequently the definition is so unclear that we cannot tell which of the two is meant. It matters little because neither is satisfactory. Neither, for example, yields the formula

\[ \frac{dz}{dx} = \frac{dz}{dy}\,\frac{dy}{dx}. \]

IV. EXTENSION TO ONE-DIMENSIONAL MANIFOLDS

So far we have taken it for granted that, in dy/dx, the y and the x denote real-valued functions of a real variable; and theorem 3 in particular used the standard chain-rule for such functions. For certain applications, however, we need to consider more general functions.
It is well-known that the state of a specimen of gas is determined when its pressure, volume and temperature are given. Let us consider changes in a given specimen of gas at a fixed temperature (isothermal changes), a type of change of some importance in physics. Then the state of the gas is determined when two numbers, namely the values of the pressure and of the volume, are given. However, in practice, these are not independent; there is a relation between them, and for an 'ideal' gas the relation is Boyle's law: the volume is inversely proportional to the pressure. More generally, if the gas is not 'ideal', we assume that some such law exists: there is a function F such that our isothermal specimen can have pressure ξ at the same time as volume η if and only if
(i)
\[ F(\xi, \eta) = 0. \]

Because (ξ, η) determines the state of the gas, and because the concept of 'state' is not yet precisely defined, let us agree to call each pair (ξ, η) satisfying (i) a state of the specimen. A 'function of state' is then any function whose domain is the set of all states - which, in the isothermal case we are considering, is a subset of R². For example, the entropy is a function of state, and so is the internal energy (or rather, to be precise, the difference between the internal energy and the internal energy in some fixed standard state).
For any reasonable function F (and for the function given by Boyle's law in particular) the set of all states will be a reasonably well-behaved curve in R². We can say precisely what 'reasonably well-behaved' means in any given context (roughly it means "well enough behaved for us to be able to use the techniques of calculus") and we have a technical name for the set of points so described: it is a 'one-dimensional manifold'.
Definition
A subset M of R² is a one-dimensional manifold if, for each point p of M, there is a neighbourhood U of p in R² such that M ∩ U is either

\[ \{(\xi, f(\xi)) : \xi \in \mathrm{dom}\, f\} \]

for some function f, or

\[ \{(f(\xi), \xi) : \xi \in \mathrm{dom}\, f\} \]

for some function f.
In other words if we select a point p of M and direct a magnifying-glass at p (whose field of view is U) then the part of M that we see (which is M ∩ U) is either a curve with equation of the form

\[ y = f(x) \]

or a curve with equation of the form

\[ x = f(y). \]

More succinctly we say that each point of M is either locally of the form y = f(x) or locally of the form x = f(y).
If each function f is, say, continuous, then we say that M is continuous, and so on. It turns out that M is well-enough behaved for our purposes if every f is continuously differentiable.
Let us return to our specimen of gas. Let U be a function of state. We shall naturally want to define dU/dP and dU/dV, where P and V denote pressure and volume. The functions U, P and V are defined as follows.
U(ξ, η) is the value of U when the gas is in state (ξ, η).
P(ξ, η) is the pressure when the gas is in state (ξ, η).
V(ξ, η) is the volume when the gas is in state (ξ, η).
Thus P(ξ, η) = ξ and V(ξ, η) = η. That is to say, P and V are the two coordinate-functions.
In general, let M be a one-dimensional manifold in R² and x and y be the coordinate-functions with domain M, and u any real-valued function with domain M. Our definition of du/dx will naturally be
(ii)
\[ \frac{du}{dx}(\alpha) = \lim_{\delta \to 0} \frac{u(\alpha + \delta) - u(\alpha)}{x(\alpha + \delta) - x(\alpha)} \]

for each point α of M for which the limit exists, and the definition of du/dy will be similar.
We must, of course, be quite clear what we mean by 'limit' in this context. Two definitions of limit are to be found in the literature, a strong (older) form and a weak (newer) form; we must use the newer form.
Let us look at the older form. If f is a function in R² into R, and α is a point of R², then the number λ is a limit of f at α if for each neighbourhood N of λ there is a neighbourhood U of α such that f(ξ) ∈ N whenever ξ ∈ U but ξ ≠ α. Under this definition f cannot have a limit at α unless there is at least one neighbourhood of α all of which (except possibly for α itself) is contained in the domain of f.
Under the newer definition, λ is a limit of f at α if α is a limit-point of dom f and for each neighbourhood N of λ there is a neighbourhood U of α such that f(ξ) ∈ N whenever ξ ∈ U ∩ dom f.
In our case, dom f is M, and so we can describe the limit in (ii) as 'the limit as α + δ approaches α in M'. The reason why the older definition will not do is that there will be no neighbourhood of α contained in M: a curve is too thin to contain a disc. (Peano's space-filling curve is not a one-dimensional manifold.)
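A small numerical sketch may show what 'the limit as α + δ approaches α in M' amounts to in the gas example. It assumes Boyle's law in the form pressure × volume = K with the hypothetical constant K = 2, and it takes the function of state u to be the volume V itself; the helper names are illustrative. The neighbouring states are always chosen on the curve M, never in the surrounding plane.

```python
K = 2.0  # hypothetical constant in Boyle's law, pressure * volume = K


def state_at(pressure):
    """The state (pressure, volume) on the isotherm with the given pressure."""
    return (pressure, K / pressure)


def P(state):
    return state[0]   # first coordinate-function


def V(state):
    return state[1]   # second coordinate-function


def quotient_on_manifold(u, x, alpha, beta):
    """Difference quotient of u against x between two states alpha and beta."""
    return (u(beta) - u(alpha)) / (x(beta) - x(alpha))


alpha = state_at(1.0)
estimates = [quotient_on_manifold(V, P, alpha, state_at(1.0 + h))
             for h in (1e-2, 1e-4, 1e-6)]
print(estimates)  # the values settle towards -2, i.e. -K / pressure**2
```

Because every neighbouring state is produced by `state_at`, the quotient is only ever evaluated on M, which mirrors the newer definition of limit; under the older definition there would be nothing to compute, since no disc around α lies inside the curve.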
We can make a similar definition of manifold in R³ or spaces of higher dimension. To go into any further detail requires partial differentiation (or Fréchet derivatives).
University of British Columbia,
Vancouver
