Calculo LSE

J.M. Ward

MT1174, 2790174

2011

Undergraduate study in

Economics, Management,

Finance and the Social Sciences

This subject guide is for a 100 course offered as part of the University of London

International Programmes in Economics, Management, Finance and the Social Sciences.

This is equivalent to Level 4 within the Framework for Higher Education Qualifications in

England, Wales and Northern Ireland (FHEQ).

For more information about the University of London International Programmes

undergraduate study in Economics, Management, Finance and the Social Sciences, see:

www.londoninternational.ac.uk

This guide was prepared for the University of London International Programmes by:

J.M. Ward, Department of Mathematics, London School of Economics and Political Science.

This is one of a series of subject guides published by the University. We regret that due to

pressure of work the author is unable to enter into any correspondence relating to, or arising

from, the guide. If you have any comments on this subject guide, favourable or unfavourable,

please use the form at the back of this guide.

Publications Office

Stewart House

32 Russell Square

London WC1B 5DN

United Kingdom

Website: www.londoninternational.ac.uk

Published by: University of London

© University of London 2011

The University of London asserts copyright over all material in this subject guide except where

otherwise indicated. All rights reserved. No part of this work may be reproduced in any form,

or by any means, without permission in writing from the publisher.

We make every effort to contact copyright holders. If you think we have inadvertently used

your copyright material, please let us know.

Contents

Preface

1 Introduction
  1.1 This subject
  1.2 Reading
  1.3
    1.3.1 The VLE
    1.3.2
  1.4
  1.5 Examination advice
  1.6

2 Functions
  2.1
    2.1.1
    2.1.2 Combinations of functions
    2.1.3 Inverse functions
    2.1.4 Identities
    2.1.5 Applications of functions
  2.2 Conic sections
    2.2.1 Parabolae
    2.2.2 Circles
    2.2.3 Ellipses
    2.2.4 Hyperbolae
  Learning outcomes
  Solutions to activities
  Exercises
  Solutions to exercises

3 Differentiation
  3.1
  3.2
    3.2.1 Standard derivatives
    3.2.2
    3.2.3 Higher-order derivatives
  3.3 Using derivatives
    3.3.1
    3.3.2
    3.3.3 Applications of derivatives
    3.3.4 Existence of derivatives
  3.4
    3.4.1 Maclaurin series
    3.4.2 Taylor series
  Learning outcomes
  Solutions to activities
  Exercises
  Solutions to exercises

4 One-variable optimisation
  4.1
  4.2
    4.2.1
    4.2.2 Stationary points
    4.2.3
  4.3
    4.3.1
    4.3.2
    4.3.3 Points of inflection
  4.4 Curve sketching
    4.4.1
    4.4.2
    4.4.3
  4.5 Optimisation
    4.5.1 Constrained optimisation
    4.5.2
    4.5.3 Applications of optimisation
  Learning outcomes
  Solutions to activities
  Exercises
  Solutions to exercises

5 Integration
  5.1
  5.2
    5.2.1 Standard integrals
    5.2.2
    5.2.3 Integration by substitution
    5.2.4 Integration by parts
    5.2.5
    5.2.6
  5.3
    5.3.1
    5.3.2
  5.4 Applications of integrals
    5.4.1
    5.4.2
  Learning outcomes
  Solutions to activities
  Exercises
  Solutions to exercises

6
  6.1 Introduction
  6.2 Surfaces
    6.2.1 Planes
    6.2.2
  6.3 Partial differentiation
    6.3.1
    6.3.2
    6.3.3
    6.3.4
    6.3.5
  6.4
    6.4.1 Tangent planes
    6.4.2 Gradient vectors
    6.4.3 Directional derivatives
    6.4.4
    6.4.5 Taylor series
  Learning outcomes
  Solutions to activities
  Exercises
  Solutions to exercises

7 Two-variable optimisation
  7.1 Introduction
  7.2 Unconstrained optimisation
    7.2.1 Stationary points
    7.2.2
    7.2.3
    7.2.4 Applications
  7.3 Constrained optimisation
    7.3.1
    7.3.2
    7.3.3
    7.3.4 Applications
  Learning outcomes
  Solutions to activities
  Exercises
  Solutions to exercises

8 Differential equations
  8.1
  8.2 First-order ODEs
    8.2.1
    8.2.2
    8.2.3
  8.3 Second-order ODEs
    8.3.1
    8.3.2
  8.4
    8.4.1
    8.4.2
  8.5 Applications of ODEs
    8.5.1
    8.5.2
    8.5.3
    8.5.4 Market trends
  Learning outcomes
  Solutions to activities
  Exercises
  Solutions to exercises

Preface

This subject guide is not a course text. It sets out a logical sequence in which to study

the topics in this subject. Where coverage in the main texts is weak, it provides some

additional background material.

I am grateful to Mark Baltovic for his careful reading of a draft of this guide and for his

many helpful comments.

Chapter 1

Introduction

In this very brief introduction, we aim to give you an idea of the nature of this subject

and to advise you on how best to approach it. We give general information about the

contents and use of this subject guide, and on recommended reading and how to use the

textbooks.

1.1 This subject

Calculus, as studied in this Level 1 course, is primarily the study of derivatives and

integrals of functions of one variable and partial derivatives of functions of several

variables.

Our approach here is not just to help you acquire proficiency in techniques and

methods, but also to help you understand some of the theoretical ideas behind these.

For example, after completing this course, you will hopefully understand why the

derivatives of a function allow you to determine where a function of one variable is

optimised. In addition to this, we try to indicate the uses of some of the methods in

applications to economics, finance and related disciplines.

Aims of the course

The broad aims of this course are as follows:

to enable students to acquire skills in the methods of calculus (including

multivariate calculus), as required for their use in further mathematics subjects and

economics-based subjects;

to prepare students for further courses in mathematics and/or related disciplines.

As emphasised above, however, we do also want you to understand why certain

methods work: this is one of the skills that you should acquire. Indeed, the

examination will not simply test your ability to perform routine calculations; it will also

probe your knowledge and understanding of the principles that underlie the material.

Learning outcomes

We now state the broad learning outcomes of this course, as a whole. At the end of this

course and having completed the essential reading and activities, you should be able to:

use the concepts, terminology, methods and conventions covered in the course to

solve mathematical problems in this subject;

solve unseen mathematical problems involving understanding of these concepts and

application of these methods;

see how calculus can be used to solve problems in economics and related subjects;

demonstrate knowledge and understanding of the underlying principles of calculus.

There are a couple of things that we should stress at this point. Firstly, note the

intention that you will be able to solve unseen problems. This means simply that you

will be expected to be able to use your knowledge and understanding of the material to

solve problems that are not completely standard. This is not something you should

worry unduly about: all topics in mathematics expect this, and you will never be

expected to do anything that cannot be done using the material of this course.

Secondly, we expect you to be able to demonstrate knowledge and understanding and

you might well wonder how you would demonstrate this in the examination. Well, it is

precisely by being able to grapple successfully with unseen, non-routine, questions that

you will indicate that you have a proper understanding of the topic.

Topics covered

Descriptions of the topics to be covered appear in the relevant chapters. However, it is

useful to give a brief overview at this stage.

We start by revising some of the basic ideas that are needed for the study of this course

and, in particular, the idea of a function of one variable. We then introduce derivatives

of such functions and how to find them using the techniques of differentiation. This

enables us to see how such functions are behaving and, in particular, enables us to see

where such functions are optimised. We then introduce integrals of such functions and

how to find them using the techniques of integration. In particular, this will enable us

to see how to relate functions to areas. We then introduce functions of several variables

and develop techniques for finding their partial derivatives. In particular, we will see

how we can use these ideas to see where these slightly more complicated functions are

optimised. Lastly, we introduce the idea of a differential equation and examine methods

for solving them.

Throughout this subject guide, the emphasis will be on the theory as much as on the

methods. That is to say, our aim in this subject is not only to provide you with some

useful techniques and methods from calculus, but to also enable you to understand why

these techniques work.

1.2 Reading

There are many books that would be useful for this subject. We recommend two in

particular, and a couple of others for additional, further reading. (You should note,

however, that there are very many books suitable for this course. Indeed, almost any

text on first-year university calculus will cover the majority of the material.)

Textbook reading is essential as textbooks will provide you with more in-depth

explanations than you will find in this subject guide, and they will also provide many

more examples to study and exercises to work through. The books listed are the ones

we have referred to in this guide.

Essential reading

Detailed reading references in this subject guide refer to the editions of the set

textbooks listed below. New editions of one or more of these textbooks may have been

published by the time you study this course. You can use a more recent edition of any

of the books; use the detailed chapter and section headings and the index to identify

relevant readings. Also check the virtual learning environment (VLE) regularly for

updated guidance on readings.

Binmore, K. and J. Davies Calculus: Concepts and Methods. (Cambridge:

Cambridge University Press, 2002, second revised edition) [ISBN 9780521775410].

Anthony, M. and N. Biggs Mathematics for economics and finance: methods and

modelling. (Cambridge: Cambridge University Press, 1996) [ISBN 9780521559133].

By and large we will be following Binmore and Davies but, sometimes, we will follow

the simpler treatment found in Anthony and Biggs. Both texts, when used wisely, will

provide you with a large number of examples for you to study and exercises for you to

attempt. It is recommended that you purchase both of these. Another thing you might

like to bear in mind is that some of the material from Binmore and Davies that we omit

here will be useful if you go on to study 176 Further Calculus.

Further reading

Once you have covered the essential reading you are then free to read around the

subject area in any text, paper or online resource. You will need to support your

learning by reading as widely as possible and by thinking about how these principles

apply in the real world. To help you read extensively, you have free access to the VLE

and University of London Online Library (see Section 1.3.2). However, two useful

textbooks that we have referred to in this subject guide are the following.

Simon, C.P. and L. Blume Mathematics for economists. (New York and London:

W.W. Norton and Company, 1994) [ISBN 9780393957334].

Adams, R.A. and C. Essex Calculus: a complete course. (Toronto: Pearson, 2010,

seventh edition) [ISBN 9780321549280].

Simon and Blume is a useful supplementary text with an emphasis on applications of

the material to economics; whereas Adams and Essex (which is merely an example from

a large range of very similar calculus textbooks) is a detailed calculus textbook which

contains much material which is beyond the scope of this course. Both of these texts are

suitable as sources of additional explanation, examples and exercises, but they are

probably not worth purchasing.

1.3

In addition to the subject guide and the essential reading, it is crucial that you take

advantage of the study resources that are available online for this course, including the

virtual learning environment (VLE) and the Online Library.

You can access the VLE, the Online Library and your University of London email

account via the Student Portal at

http://my.londoninternational.ac.uk

You should receive your login details in your study pack. If you have not, or you have

forgotten your login details, please email uolia.support@london.ac.uk quoting your

student number.

1.3.1 The VLE

The VLE, which complements this subject guide, has been designed to enhance your

learning experience, providing additional support and a sense of community. It forms an

important part of your study experience with the University of London and you should

access it regularly.

The VLE provides a range of resources for EMFSS courses:

Self-testing activities: Doing these allows you to test your own understanding of

subject material.

Electronic study materials: The printed materials that you receive from the

University of London are available to download, including updated reading lists

and references.

Past examination papers and Examiners' commentaries: These provide advice on

how each examination question might best be answered.

A student discussion forum: This is an open space for you to discuss interests and

experiences, seek support from your peers, work collaboratively to solve problems

and discuss subject material.

Videos: There are recorded academic introductions to the subject, interviews and

debates and, for some courses, audio-visual tutorials and conclusions.

Recorded lectures: For some courses, where appropriate, the sessions from previous

years' Study Weekends have been recorded and made available.

Study skills: Expert advice on preparing for examinations and developing your

digital literacy skills.

Feedback forms.

Some of these resources are available for certain courses only, but we are expanding our

provision all the time and you should check the VLE regularly for updates.

1.3.2

The Online Library contains a huge array of journal articles and other resources to help

you read widely and extensively.

To access the majority of resources via the Online Library at

http://tinyurl.com/ollathens

you will either need to use your University of London Student Portal login details, or

you will be required to register and use an Athens login.

The easiest way to locate relevant content and journal articles in the Online Library is

to use the Summon search engine.

If you are having trouble finding an article listed in a reading list, try removing any

punctuation from the title, such as single quotation marks, question marks and colons.

For further advice, please see the online help pages at

www.external.shl.lon.ac.uk/summon/about.php

1.4

We have already mentioned that this guide is not a textbook. It is important that you

read textbooks in conjunction with the guide and that you try problems from the

textbooks. The exercises at the end of the main chapters of this subject guide are a very

useful resource and you should try them once you think you have mastered the material

from the chapter. You should really try these exercises before consulting the solutions,

as simply reading the solutions provided will not help you at all. Sometimes, the

solutions we provide will just be an overview of what is required, i.e. an indication of

how you should answer the questions, but in the examination, you must always show all

of your calculations. It is vital that you develop and enhance your problem-solving skills

and the only way to do this is to try lots of exercises.

1.5 Examination advice

Important: the information and advice given here are based on the examination

structure used at the time this guide was written. Please note that subject guides may

be used for several years. Because of this we strongly advise you to always check both

the current Regulations for relevant information about the examination, and the virtual

learning environment (VLE) where you should be advised of any forthcoming changes.

You should also carefully check the rubric/instructions on the paper you actually sit

and follow those instructions.

Remember, it is important to check the VLE for:

Up-to-date information on examination and assessment arrangements for this

course.

Where available, past examination papers and Examiners' commentaries for the

course which give advice on how each question might best be answered.

This course is assessed by a three-hour unseen written examination. There are no

optional topics in this subject: you should study them all and this is reflected in the

structure of the examination paper. There are five questions (each worth 20 marks) and

all questions are compulsory. A sample examination paper may be found in an appendix

to this subject guide.

Please do not think that the questions in your real examination will necessarily be very

similar to the exercises in this subject guide or those in the sample examination paper.

The examination is designed to test you. You will get examination questions unlike the

questions in this subject guide. The whole point of examining is to see whether you can

apply your knowledge in familiar and unfamiliar settings. The Examiners (nice people

though they are) have an obligation to surprise you! For this reason, it is important

that you try as many examples as possible, from the subject guide and from the

textbooks. This is not so that you can cover any possible type of question the

Examiners can think of! It is so that you get used to confronting unfamiliar questions,

grappling with them, and finally coming up with the solution.

Do not panic if you cannot completely solve an examination question. There are many

marks to be awarded for using the correct approach or method.

1.6

You will not be permitted to use calculators of any type in the examination. This is not

something that you should worry about: the Examiners are interested in assessing that

you understand the key concepts, ideas, methods and techniques, and will set questions

which do not require the use of a calculator.

Chapter 2

Functions

Essential reading

(For full publication details, see Chapter 1.)

Binmore and Davies (2002) Sections 2.1–2.6, 2.14 and part of 7.1.2.

Anthony and Biggs (1996) Chapters 1, 2 and parts of 7.

Further reading

Simon and Blume (1994) Sections 2.1, part of 2.2, 5.1, 5.3, and 5.4, Appendices

A1.1, parts of A1.2 and A2.16.

Adams and Essex (2010) Preliminaries parts of P.1–P.7, parts of Sections 3.1–3.3

and 3.5.

Aims and objectives

The objectives of this chapter are as follows.

To introduce functions in general and the elementary functions and their graphs in

particular.

To see how to find combinations of functions and the inverse of a function (if it

exists).

To see how functions can be used in economics-based subjects.

To introduce conic sections and see how to draw them.

Specific learning outcomes can be found near the end of this chapter.

2.1

NOTE: Before you start this chapter, you should make sure that you have

covered the background material in Chapter 1 of 173 Algebra.

Given two sets $A$ and $B$, a function, $f$, from $A$ to $B$ is a rule which takes each element of $A$ and gives us a unique (or exactly one) element of $B$. We often express the fact that the function $f$ takes elements from $A$ and gives us elements of $B$ by writing $f : A \to B$. In such cases, we call the sets $A$ and $B$ the domain and co-domain of the function respectively.

One way of thinking about a function $f : A \to B$ is as a 'black box' that takes any $x \in A$, the domain, and applies the rule given by $f$ to it to get the unique output $f(x) \in B$, the co-domain, i.e.

$$x \in A \longrightarrow \boxed{\,f\,} \longrightarrow f(x) \in B.$$

Here, we have used $x$ to denote the independent variable as we are free to choose any element, $x \in A$, from the domain. But, of course, the choice of $x$ here is not essential as it is just a dummy variable: we could have used any other letter instead and said that the function $f : A \to B$ is a black box that takes any $p \in A$ and applies the rule given by $f$ to it to get the unique output $f(p) \in B$.

It is often convenient to use another variable, say $y$, to stand for the elements of $B$ that the function $f : A \to B$ gives us. For instance, we could say that this function takes any $x \in A$, the domain, and applies the rule given by $f$ to it to get the unique output $y \in B$, the co-domain, where $y = f(x)$. Of course, here the independent variable, $x$, via the rule given by $f$, will determine the value of $y$ and this is why we think of $y$ as the dependent variable.

For now, we will only be interested in functions whose domain and co-domain are certain sets of real numbers. In particular, they will either be $\mathbb{R}$ itself or certain subsets of $\mathbb{R}$ called intervals. Typically, we think of $\mathbb{R}$ as the points on a line so that intervals are described by line segments. Indeed, for $a, b \in \mathbb{R}$, we will have finite intervals like

$$(a, b) = \{x \in \mathbb{R} \mid a < x < b\} \quad\text{and}\quad [a, b] = \{x \in \mathbb{R} \mid a \le x \le b\},$$

which only differ according to whether the end-points, i.e. the elements $a$ and $b$, are in the set. Of course, we can also have finite intervals where one end-point, but not the other, is in the set and we denote these by

$$(a, b] = \{x \in \mathbb{R} \mid a < x \le b\} \quad\text{and}\quad [a, b) = \{x \in \mathbb{R} \mid a \le x < b\}.$$

There are also infinite intervals which will have one finite end-point, say $a \in \mathbb{R}$, and we denote these by

$$(-\infty, a] = \{x \in \mathbb{R} \mid x \le a\} \quad\text{and}\quad [a, \infty) = \{x \in \mathbb{R} \mid a \le x\}$$

if the end-point is in the set, and

$$(-\infty, a) = \{x \in \mathbb{R} \mid x < a\} \quad\text{and}\quad (a, \infty) = \{x \in \mathbb{R} \mid a < x\}$$

if it isn't. Of course, as we can see by looking at the sets involved when writing these infinite intervals, the symbols $\infty$ and $-\infty$ are not end-points as they are not real numbers; they are just a notational convenience.
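The interval notation can be made concrete with a small membership test. The following is only an illustrative sketch: the function name `in_interval` and its flags are hypothetical choices, not notation from this guide, and `math.inf` is used purely as a stand-in for the symbol $\infty$.

```python
import math


def in_interval(x, a, b, closed_left, closed_right):
    """Return True if x lies in the interval with end-points a and b.

    closed_left / closed_right say whether each end-point belongs to the set,
    i.e. whether we use a square bracket or a round bracket on that side.
    """
    left_ok = (a <= x) if closed_left else (a < x)
    right_ok = (x <= b) if closed_right else (x < b)
    return left_ok and right_ok


# (0, 1): neither end-point is in the set.
print(in_interval(0, 0, 1, closed_left=False, closed_right=False))
# [0, 1]: both end-points are in the set.
print(in_interval(0, 0, 1, closed_left=True, closed_right=True))
# [2, infinity): infinity is never an element, so that side is left open.
print(in_interval(5, 2, math.inf, closed_left=True, closed_right=False))
```

The only difference between the four kinds of finite interval is which of the two comparisons is strict, which is exactly what the two flags control.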

Putting these ideas together, we find that another way of visualising a function $f : A \to B$ is its graph, which is the set of all points $(x, y) \in \mathbb{R}^2$ such that $y = f(x)$. Indeed, as a function $f : A \to B$ must give a unique output $y \in B$ for each $x \in A$, its graph could look like the one illustrated in Figure 2.1(a) but not like the one in Figure 2.1(b).

Figure 2.1: In (a) we have the graph of a function $f : [0, a] \to [b, c]$ as each input, $x \in [0, a]$, gives a unique output $y \in [b, c]$. In (b), we do not have the graph of a function from $[0, a]$ to $[b, c]$ as each input, $x \in [0, a]$, gives two outputs $y \in [b, c]$.
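The requirement that each input gives exactly one output can be checked mechanically for a finite set of points. This is a minimal sketch under that finiteness assumption; the helper `is_function` and the two sample point sets are hypothetical illustrations and do not come from the guide.

```python
def is_function(pairs, domain):
    """Check whether a finite set of (x, y) points is the graph of a function.

    The points form the graph of a function on `domain` exactly when every
    x in the domain is paired with one, and only one, value of y.
    """
    outputs = {}
    for x, y in pairs:
        outputs.setdefault(x, set()).add(y)
    return all(len(outputs.get(x, ())) == 1 for x in domain)


# Points on y = x^2: every x has a single y.
square = {(x, x * x) for x in range(-2, 3)}
# Points on x = y^2: for example, x = 1 is paired with both y = -1 and y = 1.
sideways = {(y * y, y) for y in range(-2, 3)}

print(is_function(square, range(-2, 3)))   # True
print(is_function(sideways, [0, 1, 4]))    # False
```

The second set of points fails for the same reason as the curve in Figure 2.1(b): some inputs are paired with two outputs.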

2.1.1

We now revise some elementary functions that will be useful in this course and look at their graphs.

Power functions

A power function is a function $f : \mathbb{R} \to \mathbb{R}$ given by

$$f(x) = x^n,$$

where $n \in \mathbb{N}$. Depending on the value of $n$, the graphs of these functions look very much like the ones illustrated in Figure 2.2. In addition to this, we also include the power function $f(x) = x^0 = 1$ as the function whose graph is a horizontal straight line that goes through the point $(0, 1)$.

Figure 2.2: (a) When $n = 1$, the graph of the function $f(x) = x^n$ is just the straight line $y = x$. (b) The graph of the function $f(x) = x^n$ when $n$ is even. (c) The graph of the function $f(x) = x^n$ when $n \ge 3$ is odd. Of course, in (b) and (c) we are only looking at the shape of the graph for different values of $n$ without any regard to the scales on the axes.

In particular, if we let $x \to \infty$ mean that $x$ is positive and getting arbitrarily large (i.e. we are considering what happens as $x$ takes values far to the right on the $x$-axis) and $x \to -\infty$ mean that $x$ is negative but getting arbitrarily large in magnitude (i.e. we are considering what happens as $x$ takes values far to the left on the $x$-axis), we see that:

If $n$ is even, $x^n \to \infty$ as $x \to \infty$ and as $x \to -\infty$.

If $n$ is odd, $x^n \to \infty$ as $x \to \infty$ whereas $x^n \to -\infty$ as $x \to -\infty$.

This insight will be important in Section 4.4 when we consider how to sketch the graphs of more complicated functions.
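A quick numerical experiment can make this behaviour plausible. The sketch below is illustrative only: the helper `power` and the sample inputs are arbitrary choices, and evaluating at a few large values suggests the trend but does not prove it.

```python
def power(x, n):
    """Evaluate the power function f(x) = x^n."""
    return x ** n


# Sample inputs far to the left and far to the right on the x-axis.
samples = (-1000, -10, 10, 1000)

for n in (2, 3):
    values = [power(x, n) for x in samples]
    print(f"n = {n}: {values}")
```

For the even power every value is positive and grows with the magnitude of the input, whereas for the odd power the sign of the output follows the sign of the input.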

Exponential functions

An exponential function with base $a$ is a function $f : \mathbb{R} \to (0, \infty)$ given by

$$f(x) = a^x,$$

where $a \ne 1$ is a positive real number. Depending on the value of $a$, the graphs of these functions look very much like the ones illustrated in Figure 2.3.

Figure 2.3: (a) The graph of the function $f(x) = a^x$ when $0 < a < 1$. (b) The graph of the function $f(x) = a^x$ when $a > 1$. Of course, in both of these graphs we are only looking at the shape of the graph for different values of $a$ without any regard to the scales on the axes.

Indeed, looking at these graphs we see that:

If $0 < a < 1$, $a^x \to 0$ as $x \to \infty$ and $a^x \to \infty$ as $x \to -\infty$.

If $a > 1$, $a^x \to \infty$ as $x \to \infty$ and $a^x \to 0$ as $x \to -\infty$.

And, as $a^0 = 1$ for any positive $a \ne 1$, the graphs of these functions always go through the point $(0, 1)$.
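The two cases can be compared numerically. This is a rough sketch rather than anything from the guide: the helper `exponential` and the bases $1/2$ and $2$ are hypothetical choices made purely for illustration.

```python
def exponential(a, x):
    """Evaluate the exponential function f(x) = a^x for a valid base a."""
    if a <= 0 or a == 1:
        raise ValueError("the base must be positive and different from 1")
    return a ** x


for a in (0.5, 2):
    # One input far to the left, the input 0, and one input far to the right.
    values = [exponential(a, x) for x in (-20, 0, 20)]
    print(f"a = {a}: {values}")
```

With base $1/2$ the values shrink towards zero as the input increases, with base $2$ they grow, and in both cases the value at $0$ is $1$, matching the point $(0, 1)$ on each graph.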

Trigonometric functions

The two elementary trigonometric functions that we will be using are the sine and cosine functions but, unlike what you may have seen before, we will always be using them for angles that are given in radians instead of degrees. As you may know, we can easily convert between these two units by using the formula

$$\text{angle in radians} = \frac{2\pi}{360} \times \text{angle in degrees}.$$

Measuring the angle $\theta$ in radians, we can define the sine and cosine functions for $0 \le \theta \le \pi/2$ by using the right-angled triangle in Figure 2.4 to get

$$\sin\theta = \frac{\text{opposite}}{\text{hypotenuse}} \quad\text{and}\quad \cos\theta = \frac{\text{adjacent}}{\text{hypotenuse}}.$$

Figure 2.4: Defining the sine and cosine functions, $\sin\theta$ and $\cos\theta$, for $0 \le \theta \le \pi/2$.

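In particular, by considering the two special triangles in Figure 2.5, we can see that the values of these functions for some common angles (in radians) are

$$\begin{array}{c|ccc}
\theta & \pi/6 & \pi/4 & \pi/3 \\ \hline
\sin\theta & 1/2 & 1/\sqrt{2} & \sqrt{3}/2 \\
\cos\theta & \sqrt{3}/2 & 1/\sqrt{2} & 1/2
\end{array}$$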
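As a sanity check, the conversion formula and the tabulated values can be compared with a numerical library. The sketch below uses Python's standard `math` module; the helper name `to_radians` and the dictionary `table` are hypothetical, and the comparisons are approximate because the computer works with floating-point numbers.

```python
import math


def to_radians(degrees):
    """Convert an angle from degrees to radians using 2*pi/360."""
    return 2 * math.pi / 360 * degrees


# Exact values from the table, keyed by the angle in radians: (sin, cos).
table = {
    math.pi / 6: (1 / 2, math.sqrt(3) / 2),
    math.pi / 4: (1 / math.sqrt(2), 1 / math.sqrt(2)),
    math.pi / 3: (math.sqrt(3) / 2, 1 / 2),
}

for theta, (sine, cosine) in table.items():
    # isclose allows for tiny floating-point rounding differences.
    assert math.isclose(math.sin(theta), sine)
    assert math.isclose(math.cos(theta), cosine)

# A half turn of 180 degrees corresponds to pi radians.
print(math.isclose(to_radians(180), math.pi))  # True
```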

Activity 2.1 Recall that we also have the tangent function which, for $0 \le \theta < \pi/2$, can be defined by using the right-angled triangle in Figure 2.4 to get

$$\tan\theta = \frac{\text{opposite}}{\text{adjacent}}.$$

Use the triangles in Figure 2.5 to find the values of $\tan\theta$ when $\theta$ is $\pi/6$, $\pi/4$ and $\pi/3$ radians. Incidentally, what are these three angles in degrees?

Figure 2.5: Finding $\sin\theta$ and $\cos\theta$ when (a) $\theta = \pi/4$ radians and (b) $\theta = \pi/6$ or $\theta = \pi/3$ radians.

At this point, we'll stop saying that an angle is in radians as, unless explicitly stated

otherwise, this will always be the case.


To find the sine and cosine of an angle $\theta$ in the range $0 \le \theta \le 2\pi$, we can think of a unit circle and a triangle with a hypotenuse of 1 as illustrated in Figure 2.6(a) which, for $0 \le \theta \le \pi/2$, gives us a point $(x, y)$ with

$$x = \cos\theta \quad\text{and}\quad y = \sin\theta,$$

which can be found as before. But, if we now have $\pi/2 \le \theta \le 2\pi$, we get the situation illustrated in Figure 2.6(b), where we can find the magnitude of $x$ and $y$ using our original triangle and their sign by considering where the point lies in the $(x, y)$-plane. For instance, in Figure 2.6(b), the angle $\theta$ could be $5\pi/4$ and so the angle in the triangle would be $\pi/4$ (i.e. $\frac{5\pi}{4} - \pi$, as the angle between the positive and negative parts of the $x$-axis is $\pi$). This gives $x$ and $y$ a magnitude of $1/\sqrt{2}$ and their signs would be negative as $x, y < 0$ so we see that

$$\sin\frac{5\pi}{4} = -\frac{1}{\sqrt{2}} \quad\text{and}\quad \cos\frac{5\pi}{4} = -\frac{1}{\sqrt{2}}.$$

Figure 2.6: Finding $\sin\theta$ and $\cos\theta$ when $0 \le \theta \le 2\pi$ by considering a unit circle.

Activity 2.2 Use the unit circle method to find $\sin\theta$ and $\cos\theta$ if $\theta = 2\pi/3$.

Activity 2.3 Use the unit circle method to find the values of $\sin\theta$ and $\cos\theta$ when $\theta = 0$ and $\theta = \pi/2$.

Hence deduce the values of these functions when $\theta = \pi$, $\theta = 3\pi/2$ and $\theta = 2\pi$.

If we want to extend the definition of the sine and cosine functions to all $\theta \in \mathbb{R}$, we can see from the unit circle method that both of these functions are periodic with a period of $2\pi$, i.e.

$$\sin(\theta + 2\pi) = \sin\theta \quad\text{and}\quad \cos(\theta + 2\pi) = \cos\theta,$$

and their graphs are illustrated in Figure 2.7. In particular, we observe that $\cos\theta = \sin(\theta + \frac{\pi}{2})$, i.e. the graph of the cosine function is what we get when we shift the sine function to the left by $\pi/2$.

Figure 2.7: The graphs of the sine and cosine functions, $\sin\theta$ (solid line) and $\cos\theta$ (dashed line).
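The unit circle picture and the two observations above (periodicity and the shift relating cosine to sine) can be checked numerically. This is a small sketch using Python's standard `math` module; the function name `unit_circle_point` and the sample angles are our own.

```python
import math

def unit_circle_point(theta):
    """Return the point (x, y) = (cos(theta), sin(theta)) on the unit circle."""
    return math.cos(theta), math.sin(theta)

# For theta = 5*pi/4 the point lies in the quadrant where x, y < 0,
# and both coordinates have magnitude 1/sqrt(2).
x, y = unit_circle_point(5 * math.pi / 4)
assert math.isclose(x, -1 / math.sqrt(2))
assert math.isclose(y, -1 / math.sqrt(2))

for theta in [0.3, 1.0, 2.5, -4.0]:
    # Period 2*pi for both functions.
    assert math.isclose(math.sin(theta + 2 * math.pi), math.sin(theta))
    assert math.isclose(math.cos(theta + 2 * math.pi), math.cos(theta))
    # Cosine is sine shifted to the left by pi/2.
    assert math.isclose(math.cos(theta), math.sin(theta + math.pi / 2))
```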

2.1.2 Combinations of functions

The elementary functions we have seen can be combined in various ways to make more

complicated functions. Generally, this is straightforward and works in the way you

would expect, but sometimes there are slight complications and so we revise these

different types of combination here.

Linear combinations of functions

If we have two functions with the same domain and co-domain, say $f : A \to B$ and $g : A \to B$, we can define a new function which is a linear combination of these two functions. For instance, if $k$ and $l$ are constants, we would have the new function $kf + lg : A \to B$ defined by

$$(kf + lg)(x) = kf(x) + lg(x),$$

for all $x \in A$. In particular, this gives us polynomials, i.e. functions $p_n : \mathbb{R} \to \mathbb{R}$ which are a linear combination of power functions of the form

$$p_n(x) = a_n x^n + a_{n-1} x^{n-1} + \cdots + a_1 x + a_0,$$

where the $a_i$ for $0 \le i \le n$ are real constants. Indeed, if $a_n \neq 0$, we say that this is a polynomial of degree $n$.
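To make the idea of a linear combination concrete, the sketch below builds $kf + lg$ from two given functions and evaluates a polynomial from its coefficients. The names `linear_combination` and `poly_eval`, and the sample polynomial $2x^2 - 3x + 1$, are assumptions made for this illustration.

```python
def poly_eval(coeffs, x):
    """Evaluate a_0 + a_1*x + ... + a_n*x**n for coeffs = [a_0, a_1, ..., a_n]."""
    return sum(a * x**i for i, a in enumerate(coeffs))

def linear_combination(k, f, l, g):
    """Return the function kf + lg, i.e. x -> k*f(x) + l*g(x)."""
    return lambda x: k * f(x) + l * g(x)

# The polynomial 2x^2 - 3x + 1 built from the power functions x^2 and x.
square = lambda x: x**2
identity = lambda x: x
h = linear_combination(2, square, -3, identity)   # x -> 2x^2 - 3x
print(h(2) + 1, poly_eval([1, -3, 2], 2))          # both give 3
```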

Of course, you have seen polynomials before as, in Chapter 1 of 173 Algebra, you saw

how to solve polynomial equations of the form pn (x) = 0 where n = 1 (a linear

equation), n = 2 (a quadratic equation) and n = 3 (a cubic equation). The information

we get from solving these equations is useful when we come to draw the graphs of

polynomial functions as the next example shows.


Example 2.1 Sketch the graphs of the functions $f(x) = 5$ and $g(x) = x + 2$ on the same axes. At what point(s) do these graphs intersect?

When we draw graphs, we will often do this by doing a sketch. Indeed, for a sketch

of the simple functions given here, it suffices to indicate their shape (they are both

straight lines) and where they are relative to the x and y-axes (by indicating where

they intersect these axes). So, as we saw in Section 2.1.1, we should expect the graph of $f(x)$ to be a horizontal line that goes through the point $(0, 5)$ as $f(x) = 5$ for all $x \in \mathbb{R}$ whereas for $g(x)$, we would expect a straight line that has an

$x$-intercept that occurs when $g(x) = 0$, i.e. when $x = -2$, and a

$y$-intercept that occurs when $x = 0$, i.e. when $g(0) = 2$.

This information allows us to obtain the sketch illustrated in Figure 2.8.

To find the point(s) at which these two graphs intersect, we are looking for the value(s) of $x$ that make $f(x) = g(x)$, i.e. where $5 = x + 2$. This gives $x = 3$ and we know that the values of the functions here must satisfy $f(3) = g(3) = 5$ which gives $(3, 5)$ as the required point of intersection.¹

Figure 2.8: The graphs of the functions $f(x) = 5$ and $g(x) = x + 2$.

We will see how to draw the graphs of polynomial functions where $n = 2$ in Section 2.2.1 and we will develop a more general method for dealing with the case where $n \ge 3$ in Section 4.4.

¹ Of course, thinking about the graphs of these functions as the points, $(x, y)$, satisfying the equations $y = f(x)$ and $y = g(x)$, all we have done here is solve the equations $y = 5$ and $y = x + 2$ simultaneously.

Products and quotients of functions

If we have two functions with the same domain but possibly different co-domains, say $f : A \to B$ and $g : A \to C$, we can define a new function which is the product of these two functions. For instance, here we would have the new function $fg : A \to D$ where $D$ is the possibly new co-domain, defined by

$$(fg)(x) = f(x)g(x),$$

for all $x \in A$. Of course, we have seen how this works the other way in Chapter 1 of 173 Algebra as the process of factorisation involves writing a polynomial of degree $n$ as the product of polynomials of lower degree.

However, the quotient of these two functions is slightly more tricky to deal with as the function $f/g$ defined by

$$(f/g)(x) = \frac{f(x)}{g(x)},$$

only makes sense for those $x \in A$ where $g(x) \neq 0$ as, of course, we can never divide by zero. As such, when finding the quotient of two functions, we get a function $f/g : A' \to B$ where $A'$ is a new domain given by

$$A' = \{x \in A \mid g(x) \neq 0\}.$$

The points at which a quotient is undefined may have interesting consequences for its graph since they can give rise to vertical asymptotes. But this needn't be the case, as the next example shows.

Example 2.2 Consider the behaviour of the functions

$$f(x) = \frac{x + 1}{x - 1} \quad\text{and}\quad g(x) = \frac{x^2 + x - 2}{x - 1},$$

at the point $x = 1$.

For $f(x)$, the polynomials in the numerator and denominator of the quotient are defined for all $x \in \mathbb{R}$, but $f$ itself is not defined at $x = 1$ because that would entail division by zero. As such, $f$ must be a function from $\{x \in \mathbb{R} \mid x \neq 1\}$ to $\mathbb{R}$. Indeed, if we are considering values of $x$ close to one, i.e. $x \approx 1$, we could say that

$$f(x) \approx \frac{1 + 1}{x - 1} = \frac{2}{x - 1},$$

and so:

If we let $x$ go to one from values of $x$ that are larger than one (here we say $x$ goes to 1 from above and write $x \to 1^+$), we see that $x - 1$ is positive and getting very small, which means that $f(x)$ itself is positive and getting very large. That is, $f(x)$ is getting arbitrarily large as $x$ goes to 1 from above and we write this as $f(x) \to \infty$ as $x \to 1^+$.

If we let $x$ go to one from values of $x$ that are smaller than one (here we say $x$ goes to 1 from below and write $x \to 1^-$), we see that $x - 1$ is negative and getting very small, which means that $f(x)$ itself is negative and getting very large in magnitude. That is, $f(x)$ is negative but getting arbitrarily large in magnitude as $x$ goes to 1 from below and we write this as $f(x) \to -\infty$ as $x \to 1^-$.

As such, we see that f (x) has a vertical asymptote at the point x = 1 where it is

undefined. The graph of this function is illustrated in Figure 2.9(a) so that you can

see this asymptote and you will understand why its graph looks like this away from

the asymptote after you have covered the material in Section 2.2.4.

For $g(x)$, the polynomials in the numerator and the denominator of the quotient are defined for all $x \in \mathbb{R}$, but $g$ itself is not defined at $x = 1$ because, again, that would entail division by zero. However, in this case, we notice that $x = 1$ makes both the numerator and the denominator equal to zero and so, in particular, $x = 1$ must be a root of the numerator. This means that, if we factorise the numerator, we find that

$$g(x) = \frac{x^2 + x - 2}{x - 1} = \frac{(x + 2)(x - 1)}{x - 1},$$

and so, for $x \neq 1$, we have

$$g(x) = x + 2.$$

As such, where it is defined (i.e. for $x \neq 1$) the graph of $g$ is a straight line like the one sketched in Figure 2.9(b) although, of course, we must exclude the point $(1, 3)$ from this line as $g(x)$ is not defined there. In particular, note that in this case the function does not have a vertical asymptote at $x = 1$ even though it is undefined there.
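The contrast between the two functions in Example 2.2 can also be seen by evaluating them at points close to $x = 1$. The following sketch is only a numerical illustration; the step sizes `h` are arbitrary choices.

```python
def f(x):
    """f(x) = (x + 1)/(x - 1), which is undefined at x = 1."""
    return (x + 1) / (x - 1)

def g(x):
    """g(x) = (x**2 + x - 2)/(x - 1), which is also undefined at x = 1."""
    return (x**2 + x - 2) / (x - 1)

for h in [1e-1, 1e-3, 1e-5]:
    # f grows without bound (with opposite signs on each side of 1),
    # whereas g stays close to 3 on both sides.
    print(h, f(1 + h), f(1 - h), g(1 + h), g(1 - h))
```

Evaluating either function at exactly `x = 1` raises a `ZeroDivisionError`, which reflects the fact that neither function is defined there.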

We will look at asymptotes in more detail when we see them again in Section 2.2.4 and

Section 4.4.

Figure 2.9: The graphs of the functions $f(x)$ and $g(x)$ from Example 2.2. In (a), the vertical asymptote at $x = 1$ is indicated by a dashed line. In (b), the point where the function is undefined is indicated by an open circle.

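We can also form quotients using trigonometric functions and, in particular, we can use the triangle in Figure 2.4 to see that

$$\tan\theta = \frac{\text{opposite}}{\text{adjacent}} = \frac{\text{opposite}/\text{hypotenuse}}{\text{adjacent}/\text{hypotenuse}} = \frac{\sin\theta}{\cos\theta},$$

i.e. we have the identity

$$\tan\theta = \frac{\sin\theta}{\cos\theta}, \qquad (2.1)$$

which will be defined for $\theta \in \mathbb{R}$ as long as $\cos\theta \neq 0$, i.e. as long as $\theta \neq (2n + 1)\frac{\pi}{2}$ for $n \in \mathbb{Z}$. At the points where it is undefined this function has vertical asymptotes and its graph is sketched in Figure 2.10.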

Figure 2.10: The graph of the tangent function, $\tan\theta$. Note the vertical asymptotes at the points where the function is undefined.

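We can also find the reciprocals of our three trigonometric functions and these are defined as follows.

The secant function, $\sec\theta = \dfrac{1}{\cos\theta}$, which is defined when $\theta \neq (2n + 1)\frac{\pi}{2}$ for $n \in \mathbb{Z}$.

The cosecant function, $\operatorname{cosec}\theta = \dfrac{1}{\sin\theta}$, which is defined when $\theta \neq n\pi$ for $n \in \mathbb{Z}$.

The cotangent function, $\cot\theta = \dfrac{1}{\tan\theta}$, which is defined when $\theta \neq n\pi$ for $n \in \mathbb{Z}$.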

Activity 2.4 Show that $\cot\theta = \dfrac{\cos\theta}{\sin\theta}$ as long as $\theta \neq n\pi$ for $n \in \mathbb{Z}$.

sin

Compositions of functions

If we have two functions, say $f : A \to B$ and $g : B \to C$, then we can define the composition $g \circ f : A \to C$ to be the function

$$(g \circ f)(x) = g(f(x)),$$

and here we say that we are applying $g$ after $f$. That is, thinking of this in terms of black boxes we have

$$x \in A \longrightarrow \boxed{f} \longrightarrow f(x) \in B \longrightarrow \boxed{g} \longrightarrow g(f(x)) \in C,$$

i.e. we take an $x \in A$ and apply $f$ to get the output $f(x) \in B$ which we then use as the input for $g$ yielding the final output $g(f(x)) \in C$ which is the value of $(g \circ f)(x)$.


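Example 2.3 Let $f : \mathbb{R} \to \mathbb{R}$ and $g : \mathbb{R} \to \mathbb{R}$ be the functions $f(x) = x^2$ and $g(x) = 2x - 1$. What are the functions $g \circ f$ and $f \circ g$?

Here, as the functions both go from $\mathbb{R}$ to $\mathbb{R}$, we can find both of these compositions. In particular,

$g \circ f$ is the function where

$$(g \circ f)(x) = g(f(x)) = g(x^2) = 2x^2 - 1,$$

where $(g \circ f) : \mathbb{R} \to \mathbb{R}$.

$f \circ g$ is the function where

$$(f \circ g)(x) = f(g(x)) = f(2x - 1) = (2x - 1)^2,$$

and $(f \circ g) : \mathbb{R} \to \mathbb{R}$.

Indeed, observe that as $(2x - 1)^2 = 4x^2 - 4x + 1$, these are certainly not the same function.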
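The order in which functions are composed matters, as the example shows. A minimal sketch of this in Python, where the helper `compose` is a hypothetical name and the two functions are $x^2$ and $2x - 1$ from the example:

```python
def compose(outer, inner):
    """Return the function x -> outer(inner(x)), i.e. 'outer after inner'."""
    return lambda x: outer(inner(x))

f = lambda x: x**2
g = lambda x: 2 * x - 1

g_after_f = compose(g, f)   # x -> 2x^2 - 1
f_after_g = compose(f, g)   # x -> (2x - 1)^2

print(g_after_f(3), f_after_g(3))  # 17 25
```

The two outputs differ, which is another way of seeing that the two compositions are not the same function.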

Activity 2.5 Let $f : \mathbb{R} \to \mathbb{R}$ and $g : \mathbb{R} \to \mathbb{R}$ be the functions $f(x) = x^2 + 1$ and $g(x) = 2^x$. What are the functions $g \circ f$ and $f \circ g$?

In particular, we will also need to be able to identify compositions the other way when we cover the chain rule in Section 3.2.2. For instance, it should be clear that the function $(x^2 + 5)^3$ is the composition of the function $x^3$ after the function $x^2 + 5$.

Activity 2.6 Explain why the function $(x^2 + 5)^3$ is the composition of the function $x^3$ after the function $x^2 + 5$.

2.1.3 Inverse functions

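If $A$ and $B$ are sets and we have a function $f : A \to B$, we know that this means that for every $x \in A$ there is a unique $y \in B$ such that $y = f(x)$. Now, if we can define another function $g : B \to A$, i.e. for every $y \in B$ there is a unique $x \in A$, such that

$$y = f(x) \quad\text{if and only if}\quad x = g(y),$$

then we call the function, $g$, the inverse of $f$ and denote it by $f^{-1}$. In terms of black boxes, this means that we have

$$x \in A \longrightarrow \boxed{f} \longrightarrow f(x) \in B,$$

for $f$ and, if it exists, we have

$$y \in B \longrightarrow \boxed{f^{-1}} \longrightarrow f^{-1}(y) \in A,$$

for $f^{-1}$, or more usefully,

$$f(x) \in B \longrightarrow \boxed{f^{-1}} \longrightarrow x \in A.$$

In particular, this means that if the inverse, $f^{-1}$, of $f$ exists, we see that the composition $f^{-1}$ after $f$ gives us

$$x \in A \longrightarrow \boxed{f} \longrightarrow f(x) \in B \longrightarrow \boxed{f^{-1}} \longrightarrow x \in A,$$

and so $(f^{-1} \circ f)(x) = f^{-1}(f(x)) = x$ whereas the composition $f$ after $f^{-1}$ gives us

$$y \in B \longrightarrow \boxed{f^{-1}} \longrightarrow f^{-1}(y) \in A \longrightarrow \boxed{f} \longrightarrow y \in B,$$

and so $(f \circ f^{-1})(y) = f(f^{-1}(y)) = y$. That is, the inverse of a function (if it exists) undoes what the function does and vice versa.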

The question, then, is how can we tell whether an inverse function exists? And, if it does exist, how can we find it? Well, given the function $f : A \to B$, the inverse will exist if we are able to take $y = f(x)$ and solve it to obtain a unique solution, $x$, in terms of $y$ for every $y \in B$. And, if we can do this, these solutions will tell us what the inverse function is, i.e. they will allow us to identify the function, $f^{-1}(y)$, by comparison with $x = f^{-1}(y)$. To make this clear, let's look at an example.

Example 2.4 Consider the function $f : \mathbb{R} \to \mathbb{R}$ given by $f(x) = x + 2$. Explain why this function has an inverse and find it.

Using the graph or common sense, we see that the function $f(x) = x + 2$ has an inverse, since every $y \in \mathbb{R}$ where $y = f(x)$ gives rise to a unique $x \in \mathbb{R}$ given by $x = y - 2$. As such, we can conclude that the inverse of this function exists and we have $x = f^{-1}(y) = y - 2$. Of course, we can now write this inverse as $f^{-1}(x) = x - 2$ if we want it in terms of $x$.
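The statement that an inverse undoes what the function does can be checked directly for the function in Example 2.4. This is a minimal sketch; the name `f_inv` is ours.

```python
def f(x):
    """The function f(x) = x + 2 from Example 2.4."""
    return x + 2

def f_inv(y):
    """Its inverse, obtained by solving y = x + 2 for x."""
    return y - 2

for value in [-3, 0, 1.5, 10]:
    assert f_inv(f(value)) == value   # applying f and then its inverse
    assert f(f_inv(value)) == value   # applying the inverse and then f
```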

Indeed, notice that, if we have the function $f(x)$ and its inverse function $f^{-1}(x)$, the graph of $f^{-1}$ is the reflection of the graph of $f$ about the line $y = x$. This happens because any point $(x, y)$ on the curve $y = f(x)$ becomes, under a reflection about the line $y = x$, a point $(y, x)$ on the curve $x = f(y)$ which is the same as saying that $y = f^{-1}(x)$!

Activity 2.7 Verify that the curve $y = f^{-1}(x)$ is the reflection about the line $y = x$ of the curve $y = f(x)$ using the function we saw in Example 2.4.

Of course, not every function has an inverse as the next example shows.

Example 2.5 Consider the function $f : \mathbb{R} \to \mathbb{R}$ given by $f(x) = x^2$. Explain why this function does not have an inverse.

If we take any $y \in \mathbb{R}$ where $y = f(x)$ this gives us the equation $y = x^2$ and, if we are considering $x \in \mathbb{R}$, this gives rise to a problem as far as the inverse of $f$ is concerned because:

If $y < 0$, we get no solution for $x$ as we know that $x \in \mathbb{R}$ means that $y = x^2 \ge 0$.

If $y > 0$, we get two solutions for $x$, namely $x = \sqrt{y}$ and $x = -\sqrt{y}$, and so the solution is not unique.

That is, we can find no inverse in this case since we cannot guarantee a unique solution for $x \in \mathbb{R}$ from the equation $y = x^2$ for all $y \in \mathbb{R}$.


Of course, we can usually get around such problems if we are prepared to restrict the domain and the co-domain of the function. But, in that case, we would be finding the appropriate local inverses as opposed to its inverse (which, remember, doesn't exist!).

Activity 2.8 By considering the domains $(-\infty, 0]$ and $[0, \infty)$ and suitably restricting the co-domain of the function in Example 2.5, find its local inverses.

Let's now look at the inverses of the elementary functions we considered in Section 2.1.1.

Power functions: root functions

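If we have the power function $f(x) = x^n$ where $n \in \mathbb{N}$ and $f : [0, \infty) \to [0, \infty)$ we can see that the inverse is given by

$$x = f^{-1}(y) = y^{1/n},$$

and this is called a root function. Thus, we have

$$x = y^{1/n} \quad\text{if and only if}\quad y = x^n,$$

where $x, y \ge 0$. In particular, when $n = 2$, we have $y^{1/2} = \sqrt{y}$.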

Activity 2.9 Draw the graph of the power function $f : [0, \infty) \to [0, \infty)$ where $f(x) = x^2$ and its inverse.

This also works for $f(x) = x^n$ where $f : \mathbb{R} \to \mathbb{R}$ if $n$ is odd. But, if $n$ is even, the function $f(x) = x^n$ where $f : \mathbb{R} \to \mathbb{R}$ does not have an inverse as we saw, for $n = 2$, in Example 2.5.

Activity 2.10 Explain why we can find an inverse of the function $f : \mathbb{R} \to \mathbb{R}$ where $f(x) = x^n$ if $n$ is odd. Why doesn't this work if $n$ is even?

Exponential functions: logarithmic functions

If we have the exponential function $f(x) = a^x$ where $f : \mathbb{R} \to (0, \infty)$ and $a \neq 1$ is a positive real number, the inverse is the function $f^{-1} : (0, \infty) \to \mathbb{R}$ given by

$$x = f^{-1}(y) = \log_a y,$$

which is the logarithm to base $a$. Thus, we have

$$x = \log_a y \quad\text{if and only if}\quad y = a^x,$$

provided that $y > 0$.

Activity 2.11 Draw the graph of the exponential function $f : \mathbb{R} \to (0, \infty)$ where $f(x) = 2^x$ and its inverse, $f^{-1}(x) = \log_2 x$ where $f^{-1} : (0, \infty) \to \mathbb{R}$.

In particular, we see from this that as

$$(f \circ f^{-1})(x) = f(f^{-1}(x)) = x \quad\text{we have}\quad a^{\log_a x} = x,$$

and as

$$(f^{-1} \circ f)(x) = f^{-1}(f(x)) = x \quad\text{we have}\quad \log_a a^x = x.$$

These results will be useful in Section 2.1.4 when we consider the laws of logarithms.

Trigonometric functions: inverse trigonometric functions

If we want to discuss the inverses of the trigonometric functions sine and cosine, it is first necessary to restrict their domain due to their oscillatory nature. To do this, we consider a certain interval of values of $\theta$, called the principal range, so that each value of the function corresponds to a unique value of $\theta$. Indeed, for the:

sine function, we take the principal range to be the interval $[-\frac{\pi}{2}, \frac{\pi}{2}]$ so that the function $\sin : [-\frac{\pi}{2}, \frac{\pi}{2}] \to [-1, 1]$ where $y = \sin\theta$ has an inverse. This inverse is denoted by $\sin^{-1}$ (or arcsin) where $\sin^{-1} : [-1, 1] \to [-\frac{\pi}{2}, \frac{\pi}{2}]$. Thus, we have

$$\theta = \sin^{-1} y \quad\text{if and only if}\quad y = \sin\theta,$$

provided that $-\frac{\pi}{2} \le \theta \le \frac{\pi}{2}$ and $-1 \le y \le 1$.

cosine function, we take the principal range to be the interval $[0, \pi]$ so that the function $\cos : [0, \pi] \to [-1, 1]$ where $y = \cos\theta$ has an inverse. This inverse is denoted by $\cos^{-1}$ (or arccos) where $\cos^{-1} : [-1, 1] \to [0, \pi]$. Thus, we have

$$\theta = \cos^{-1} y \quad\text{if and only if}\quad y = \cos\theta,$$

provided that $0 \le \theta \le \pi$ and $-1 \le y \le 1$.

It will also be convenient for us to consider the inverse of the tangent function where, as well as the oscillations, we need to take care to avoid the asymptotes that occur when this function is undefined. As such, for the

tangent function, we take the principal range to be the interval $(-\frac{\pi}{2}, \frac{\pi}{2})$ so that the function $\tan : (-\frac{\pi}{2}, \frac{\pi}{2}) \to \mathbb{R}$ where $y = \tan\theta$ has an inverse. This inverse is denoted by $\tan^{-1}$ (or arctan) where $\tan^{-1} : \mathbb{R} \to (-\frac{\pi}{2}, \frac{\pi}{2})$. Thus, we have

$$\theta = \tan^{-1} y \quad\text{if and only if}\quad y = \tan\theta,$$

provided that $-\frac{\pi}{2} < \theta < \frac{\pi}{2}$ and $y \in \mathbb{R}$.

In particular, observe that $\sin^{-1}$, $\cos^{-1}$ and $\tan^{-1}$ are the inverses of the functions sin, cos and tan respectively and not their reciprocals which we denoted by cosec, sec and cot respectively in Section 2.1.2!
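The role of the principal range, and the difference between an inverse and a reciprocal, can be illustrated with Python's `math.asin`, `math.acos` and `math.atan`, which return values in the principal ranges described above. The sample angle $2\pi/3$ is our own choice.

```python
import math

# An angle outside the principal range of sine.
theta = 2 * math.pi / 3
y = math.sin(theta)

# asin returns the angle in [-pi/2, pi/2] with the same sine, not theta itself.
print(math.asin(y), theta)
assert math.isclose(math.asin(y), math.pi / 3)

# Outputs always lie in the principal ranges.
assert -math.pi / 2 <= math.asin(-1) <= math.pi / 2
assert 0 <= math.acos(-1) <= math.pi
assert -math.pi / 2 < math.atan(1e12) < math.pi / 2

# The inverse sine is not the reciprocal (cosec) of sine.
assert not math.isclose(math.asin(0.3), 1 / math.sin(0.3))
```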


Activity 2.12 Find the acute angles $\theta_1$, $\theta_2$ and $\theta_3$ where $\theta_1 = \sin^{-1}\frac{1}{2}$, $\theta_2 = \cos^{-1}\frac{1}{\sqrt{2}}$ and $\theta_3 = \tan^{-1} 1$.

2.1.4 Identities

An expression such as

$$(x + 1)^2 = x^2 + 2x + 1,$$

which is true for all $x$ is called an identity and, as you know, these are useful when we need to simplify expressions. In particular, in Chapter 1 of 173 Algebra, you saw that the power laws dictate that

$$a^m a^n = a^{m+n}, \qquad \frac{a^m}{a^n} = a^{m-n} \qquad\text{and}\qquad (a^m)^n = a^{mn},$$

and these are identities that work for any values of $a$, $m$ and $n$ for which both sides are defined. Indeed, these laws allow us to simplify expressions that may result from appropriate products, quotients and compositions of power functions or exponential functions.

Activity 2.13 If $f(x) = x^3$, $g(x) = x^4$ and $h(x) = 2^x$, find the functions $(fg)(x)$, $(f/g)(x)$ and $(g \circ h)(x)$ simplifying your answers as far as possible.

We now look at some other identities that will be useful in this course.

The laws of logarithms

For any positive real number $a \neq 1$, the laws of logarithms state that

$$\log_a x + \log_a y = \log_a(xy) \qquad\text{and}\qquad \log_a x - \log_a y = \log_a\left(\frac{x}{y}\right),$$

provided that all of the terms involved are defined. As you may know, these laws are easily derived from the power laws we saw above and the fact that

$$a^{\log_a x} = x,$$

which we saw earlier in Section 2.1.3.

Activity 2.14

It is also useful to note that if $a, b \neq 1$ are positive real numbers, then we have the change of base formula which states that

$$\log_a x = \frac{\log_b x}{\log_b a}.$$
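The laws of logarithms and the change of base formula can be checked numerically with `math.log(x, base)` from Python's standard library. The particular values of `x`, `y`, `a` and `b` below are arbitrary choices made for this sketch.

```python
import math

x, y, a, b = 8.0, 2.0, 2.0, 10.0

# Laws of logarithms, checked in base a.
assert math.isclose(math.log(x, a) + math.log(y, a), math.log(x * y, a))
assert math.isclose(math.log(x, a) - math.log(y, a), math.log(x / y, a))

# The logarithm undoes the exponential and vice versa.
assert math.isclose(a ** math.log(x, a), x)
assert math.isclose(math.log(a ** 3, a), 3)

# Change of base: log_a(x) = log_b(x) / log_b(a).
assert math.isclose(math.log(x, a), math.log(x, b) / math.log(a, b))
```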


Activity 2.15

Trigonometric identities

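There are also identities that allow us to simplify various expressions involving the trigonometric functions. For instance, using the triangle in Figure 2.4, Pythagoras' theorem allows us to see that

$$\sin^2\theta + \cos^2\theta = \left(\frac{\text{opposite}}{\text{hypotenuse}}\right)^2 + \left(\frac{\text{adjacent}}{\text{hypotenuse}}\right)^2 = \frac{\text{opposite}^2 + \text{adjacent}^2}{\text{hypotenuse}^2} = \frac{\text{hypotenuse}^2}{\text{hypotenuse}^2} = 1,$$

i.e. we have the identity²

$$\sin^2\theta + \cos^2\theta = 1. \qquad (2.2)$$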

In particular, for natural numbers $n \ge 2$, note that we commonly abbreviate things like $(\sin\theta)^n$ by writing them as $\sin^n\theta$. Further, dividing both sides of (2.2) by $\sin^2\theta$ we get

$$1 + \cot^2\theta = \operatorname{cosec}^2\theta, \qquad (2.3)$$

and this works as long as $\theta \neq n\pi$ for $n \in \mathbb{Z}$ whereas dividing both sides of (2.2) by $\cos^2\theta$ we get

$$\tan^2\theta + 1 = \sec^2\theta, \qquad (2.4)$$

and this works as long as $\theta \neq (2n + 1)\frac{\pi}{2}$ for $n \in \mathbb{Z}$. We call these three identities the Pythagorean identities as they are simple consequences of Pythagoras' theorem.
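A numerical spot check of the three Pythagorean identities is sketched below. It is not a proof, and the sample angles are arbitrary values at which all of the functions involved are defined.

```python
import math

for theta in [0.4, 1.1, 2.0, -3.0]:
    s, c, t = math.sin(theta), math.cos(theta), math.tan(theta)
    cosec, sec, cot = 1 / s, 1 / c, 1 / t
    assert math.isclose(s**2 + c**2, 1)          # sin^2 + cos^2 = 1
    assert math.isclose(1 + cot**2, cosec**2)    # 1 + cot^2 = cosec^2
    assert math.isclose(t**2 + 1, sec**2)        # tan^2 + 1 = sec^2
```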

Activity 2.16

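Another useful pair of trigonometric identities is given by the compound-angle formulae

$$\sin(\theta + \phi) = \sin\theta\cos\phi + \cos\theta\sin\phi \quad\text{and}\quad \cos(\theta + \phi) = \cos\theta\cos\phi - \sin\theta\sin\phi, \qquad (2.5)$$

for $\theta, \phi \in \mathbb{R}$. Indeed, they are especially useful since, setting $\phi = \theta$, we can use them to obtain the double-angle formulae

$$\sin(2\theta) = 2\sin\theta\cos\theta \quad\text{and}\quad \cos(2\theta) = \cos^2\theta - \sin^2\theta. \qquad (2.6)$$

Activity 2.17 Observe from the graphs of the sine and cosine functions in Figure 2.7 that sine is an odd function, i.e. $\sin(-\theta) = -\sin\theta$, and cosine is an even function, i.e. $\cos(-\theta) = \cos\theta$. Use these facts and the compound-angle formulae to show that we also have

$$\sin(\theta - \phi) = \sin\theta\cos\phi - \cos\theta\sin\phi \quad\text{and}\quad \cos(\theta - \phi) = \cos\theta\cos\phi + \sin\theta\sin\phi,$$

for $\theta, \phi \in \mathbb{R}$.

² Of course, if we consider how we extend the definitions of the sine and cosine functions to all $\theta \in \mathbb{R}$, it should be clear that this identity is actually true for all $\theta \in \mathbb{R}$.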

Activity 2.18 Use the compound-angle formulae to derive the double-angle formulae given above.

Use the Pythagorean identity $\sin^2\theta + \cos^2\theta = 1$ to show that we also have

$$\cos(2\theta) = 1 - 2\sin^2\theta \quad\text{and}\quad \cos(2\theta) = 2\cos^2\theta - 1,$$

for all $\theta \in \mathbb{R}$.

2.1.5 Applications of functions

In economics and related subjects, functions can be used to represent how one quantity depends on another. For instance, as the profit that a company makes, $\pi$, would depend on the quantity of goods sold, $q$, it makes sense to suppose that there is some function of $q$, say $f$, that tells us the corresponding profit, $\pi$. In this case, we would use an equation of the form $\pi = f(q)$ to express this dependency and we would have found a profit function. Moreover, if $f$ is invertible, we could find its inverse function, $f^{-1}$, and we would use this to find the value of $q$ that corresponds to a given value of $\pi$. In which case, the dependency would now be given by an equation of the form $q = f^{-1}(\pi)$. We will look at profit functions properly in Section 4.5.3, but for now, we consider another application of functions in economics, namely how they can be used to represent information about supply and demand in a market.

Supply and demand functions

In any given market, there is a good which is supplied by the producers (and demanded

by the consumers) and the general idea is that, for both supply (and demand), if

producers are charging (or consumers are buying) at a price of p per-unit, then the level

of supply (or demand) for that good, q, will depend on p. Indeed, since each value of p

will lead the producers to supply (and the consumers to demand) exactly one quantity

q, it makes sense to think of the quantity, q, supplied (or demanded) as a function of

the price, p. This leads us to a description of the market in terms of two kinds of

function, namely:

If the quantity supplied, $q$, can be written in terms of $p$ then we can identify the supply function, $q^S$, from the fact that we have $q = q^S(p)$. This tells us the quantity, $q$, that the producers will supply if the prevailing market price is $p$.

If the quantity demanded, $q$, can be written in terms of $p$ then we can identify the demand function, $q^D$, from the fact that we have $q = q^D(p)$. This tells us the quantity, $q$, that the consumers will demand if the prevailing market price is $p$.

In particular, note that, although we have $q$ as a function of $p$ in both of these cases we follow the practice common in economics and use the vertical axis for $p$ and the horizontal axis for $q$ when drawing the graphs of these functions. As such, any point on the graph of these functions is of the form $(q, p)$ where $q = q^S(p)$ for supply and $q = q^D(p)$ for demand. Also, these functions and their graphs only make economic sense when $p \ge 0$ and the quantities they yield, $q$, are also non-negative.³

Once we have these functions, we are often interested in the equilibrium point for the market as this is the point where the supply and demand functions are equal. In theory, this is the point, $(q^*, p^*)$, where the market stabilises since, at this point, the per-unit price, $p^*$, is such that the levels of supply and demand are equal, i.e. we have

$$q^S(p^*) = q^D(p^*).$$

As such, we can find the equilibrium price, $p^*$, by solving the resulting equation and the corresponding equilibrium quantity, $q^*$, can then be found by, say, using the demand function as $q^* = q^D(p^*)$. Let's look at a simple example.

Example 2.6 Suppose that the supply and demand functions for a market are given by

$$q^S(p) = p + 1 \quad\text{and}\quad q^D(p) = 3 - p,$$

respectively. Sketch the graphs of these functions and find the equilibrium point.

Here the supply and demand functions are straight lines which can easily be sketched using the method outlined in Example 2.1 and the results of doing this are illustrated in Figure 2.11. To find the equilibrium price, $p^*$, we have

$$q^S(p^*) = q^D(p^*) \implies p^* + 1 = 3 - p^* \implies 2p^* = 2,$$

which gives $p^* = 1$. The equilibrium quantity is then given by

$$q^* = q^D(p^*) = 3 - p^*,$$

and so the equilibrium quantity is $q^* = 3 - 1 = 2$. Consequently, the equilibrium point is $(q^*, p^*) = (2, 1)$ which, as indicated in Figure 2.11, is the point at which the two straight lines intersect.
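For straight-line supply and demand functions, finding the equilibrium amounts to solving one linear equation. The sketch below does this with exact fractions; the function name `linear_equilibrium` and its parameter names are hypothetical and introduced only for this illustration.

```python
from fractions import Fraction

def linear_equilibrium(s_slope, s_const, d_slope, d_const):
    """Equilibrium of supply q = s_slope*p + s_const and demand q = d_slope*p + d_const.

    Solves s_slope*p + s_const = d_slope*p + d_const for p and then uses the
    demand function to find q. Returns the equilibrium point as (q, p).
    """
    p = Fraction(d_const - s_const, s_slope - d_slope)
    q = d_slope * p + d_const
    return q, p

# Supply q = p + 1 and demand q = 3 - p, as in Example 2.6.
q_star, p_star = linear_equilibrium(1, 1, -1, 3)
print(q_star, p_star)  # 2 1
```

The sketch assumes the two lines are not parallel, since otherwise the division by `s_slope - d_slope` fails.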

Usually, the supply and demand functions are invertible and so we can also find the

inverses of these functions. In particular, if they are invertible, we note that:

If the price, $p$, can be written in terms of $q$ then we can identify the inverse supply function, $p^S$, from the fact that we have $p = p^S(q)$. This tells us the price, $p$, that the producers will charge if the quantity being supplied is $q$.

If the price, $p$, can be written in terms of $q$ then we can identify the inverse demand function, $p^D$, from the fact that we have $p = p^D(q)$. This tells us the price, $p$, that the consumers will pay if the quantity being demanded is $q$.

Figure 2.11: A sketch of the graphs of the supply and demand functions in Example 2.6 indicating the equilibrium point for this market. (Note that this sketch only makes economic sense when $p \ge 0$.)

³ Although, when drawing their graphs, it is often useful to consider all possible values of $p$ and $q$ before restricting your attention to the economically meaningful ones where $p, q \ge 0$!

Activity 2.19 Decide whether the supply and demand functions in the example

above are invertible. If they are, find the inverse supply and demand functions.

The effects of taxation

Sometimes, in order to control a market, a government will impose an excise tax of $T$ per unit sold. We model such situations by assuming that the tax is paid to the government by the supplier and so, if the price paid by the consumers in the presence of this tax is $p$ per unit, the suppliers effectively receive $p - T$ for each unit sold as they must pay $T$ of each $p$ received to the government. As such, the supply and demand functions in the presence of the tax, let's call them $q_T^S(p)$ and $q_T^D(p)$ respectively, will be given by

$$q_T^S(p) = q^S(p - T) \quad\text{and}\quad q_T^D(p) = q^D(p).$$

That is, the consumers still pay a price of $p$ per unit and so the demand function is unchanged, but the suppliers now only receive an amount $p - T$ per unit and so the supply function is modified by the introduction of an excise tax. Of course, the introduction of an excise tax will affect the equilibrium price and quantity for a market, i.e. in the presence of such a tax, the new equilibrium point, let's call it $(q_T^*, p_T^*)$, will be the point where

$$q_T^S(p_T^*) = q_T^D(p_T^*) \quad\text{or, equivalently,}\quad q^S(p_T^* - T) = q^D(p_T^*),$$

and, using the unchanged demand function, the equilibrium quantity is $q_T^* = q_T^D(p_T^*)$ or, equivalently, $q_T^* = q^D(p_T^*)$. Let's look at how such a tax would affect the market we considered in Example 2.6.

Example 2.7 Suppose that an excise tax of $T$ per unit is imposed on the market in Example 2.6. Find the new equilibrium point and, by sketching the graph of the new supply function on your earlier sketch, comment on how the equilibrium point for the market has changed. How much of the tax has been passed on to the consumers? What is the maximum tax, $T_m$, that can be imposed if this market is to continue functioning?

If an excise tax of $T$ per unit is imposed, the demand function is still

$$q_T^D(p) = q^D(p) = 3 - p,$$

but the supply function becomes

$$q_T^S(p) = q^S(p - T) = p - T + 1,$$

as the suppliers now see an effective price of $p - T$. This means that the equilibrium price in the presence of the tax, $p_T^*$, is given by

$$q_T^S(p_T^*) = q_T^D(p_T^*) \implies p_T^* - T + 1 = 3 - p_T^* \implies 2p_T^* = 2 + T \implies p_T^* = 1 + \frac{T}{2},$$

and the corresponding equilibrium quantity is

$$q_T^* = q_T^D(p_T^*) = 3 - \left(1 + \frac{T}{2}\right) = 2 - \frac{T}{2},$$

if we use the demand function, $q_T^D(p)$.⁴ Thus, the new equilibrium point is $(2 - T/2, 1 + T/2)$. Sketching the graph of the new supply function, as in Figure 2.12, we see that it is parallel to the old one and the $p$-intercept has increased by $T$. Indeed, as the equilibrium price has increased from 1 to $1 + T/2$ due to the presence of the tax, half the tax has been passed on to the consumer. Of course, the equilibrium quantity in the presence of the tax must be positive and so, for the market to function, we require that

$$q_T^* > 0 \implies 2 - \frac{T}{2} > 0 \implies T < 4,$$

i.e. the tax must be less than $T_m = 4$.
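The effect of the tax on the equilibrium point can be tabulated for a few values of $T$. The following is a small sketch under the assumptions of this example (supply $q = (p - T) + 1$, demand $q = 3 - p$); the function name `taxed_equilibrium` is ours.

```python
from fractions import Fraction

def taxed_equilibrium(T):
    """Equilibrium (q, p) for supply q = (p - T) + 1 and demand q = 3 - p."""
    T = Fraction(T)
    p = 1 + T / 2          # solves p - T + 1 = 3 - p
    q = 3 - p              # the demand function is unchanged by the tax
    return q, p

for T in [0, 1, 2, 4]:
    q, p = taxed_equilibrium(T)
    # The price rises by T/2, i.e. half of the tax, and q falls to 0 at T = 4.
    print(T, q, p)
```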

Figure 2.12: Following on from the sketch in Figure 2.11, if an excise tax of $T$ per unit is imposed, the supply set changes as shown and the demand set stays the same. Observe how the introduction of this tax affects the equilibrium point for this market. (Note that this sketch only makes economic sense when $p \ge 0$.)

⁴ Alternatively, we could use the new supply function, $q_T^S(p) = q^S(p - T) = p - T + 1$, to find $q_T^*$. However, we cannot use $q^S(p) = p + 1$ as this no longer holds in the presence of the tax!

Alternatively, the government may decide to impose a percentage of the price tax of $100r\%$ (so, for instance, a tax of 5% of the price would correspond to $r = 0.05$) instead of the per unit tax that we have considered so far. So, again assuming that the tax is paid to the government by the supplier, if the price paid by the consumers in the presence of this tax is $p$ per unit, the suppliers effectively receive $p - rp$ for each unit sold as they must pay $rp$ of each $p$ received to the government. As such, the supply and demand functions in the presence of this tax, let's call them $q_r^S(p)$ and $q_r^D(p)$ respectively, will be given by

$$q_r^S(p) = q^S(p - rp) \quad\text{and}\quad q_r^D(p) = q^D(p).$$

That is, once again, the consumers still pay a price of $p$ per unit and so the demand function is unchanged, but the suppliers now only receive an amount $p - rp$ per unit and so the supply function is modified by the introduction of a percentage of the price tax. Of course, the introduction of this tax will also affect the equilibrium price and quantity for the market, i.e. in the presence of such a tax, the new equilibrium point, let's call it $(q_r^*, p_r^*)$, will be the point where

$$q_r^S(p_r^*) = q_r^D(p_r^*) \quad\text{or, equivalently,}\quad q^S(p_r^* - rp_r^*) = q^D(p_r^*),$$

and, using the unchanged demand function, the equilibrium quantity is $q_r^* = q_r^D(p_r^*)$ or, equivalently, $q_r^* = q^D(p_r^*)$. See, for example, Exercise 2.3 at the end of this chapter.

2.2 Conic sections

So far, we have been dealing with functions that are explicitly defined in terms of an independent variable but, sometimes, we may have an equation relating two variables, say $x$ and $y$, which implicitly defines $y$ as one or more functions of $x$. As it will be useful in various places, we now investigate some important instances of functions defined in this way and their graphs, the so-called conic sections.⁵

2.2.1 Parabolae

A parabola is a curve whose equation has the form

$$y = ax^2 + bx + c,$$

where $a \neq 0$, $b$ and $c$ are constants. Indeed, if we complete the square, we can write this in the form

$$y = a(x - p)^2 + q,$$

for some constants $p$ and $q$. This curve will have a $y$-intercept which we can find by setting $x = 0$ and it may have $x$-intercepts which, if they exist, we can find by setting $y = 0$. It will also have a turning point with coordinates $(p, q)$ which will be a minimum if $a > 0$ and a maximum if $a < 0$. Once we have this information, the parabola should be easy to draw as the next example shows.

⁵ See, for example, Binmore and Davies (2002) Section 2.14 for a full discussion of the geometric aspects of conic sections and where they come from. Although this is interesting, we will not be delving into these overly geometric aspects of conic sections in this course.

Example 2.8 Sketch the parabolae

(a) $y = x^2 - 4x + 3$, and

(b) $y = -x^2 + 2x + 3$.

For (a), we are told that $y = x^2 - 4x + 3$ and so we find that:

For the $y$-intercept: Setting $x = 0$ we get $y = 3$.

For the $x$-intercepts: Setting $y = 0$ we get

$$x^2 - 4x + 3 = 0 \implies (x - 1)(x - 3) = 0,$$

and so the $x$-intercepts are $x = 1$ and $x = 3$.

The turning point of the parabola can be found by writing the equation of the parabola in completed square form and, doing this, we get

$$y = (x - 2)^2 - 1.$$

Here, $a = 1 > 0$ and so we get a minimum at the point $(2, -1)$.

Putting this information together, we then get the sketch in Figure 2.13(a).

For (b), we are told that $y = -x^2 + 2x + 3$ and so we find that:

For the y-intercept: Setting $x = 0$ we get $y = 3$.

For the x-intercepts: Setting $y = 0$ we get

$$-x^2 + 2x + 3 = 0 \implies x^2 - 2x - 3 = 0 \implies (x + 1)(x - 3) = 0,$$

and so $x = -1$ and $x = 3$.

The turning point of the parabola can be found by writing the equation of the parabola in completed square form and, doing this, we get

$$y = -x^2 + 2x + 3 = -\left[x^2 - 2x\right] + 3 = -\left[(x - 1)^2 - 1\right] + 3 = -(x - 1)^2 + 1 + 3 = -(x - 1)^2 + 4.$$

Here, $a = -1 < 0$ and so we get a maximum at the point $(1, 4)$.

Putting this information together, we then get the sketch in Figure 2.13(b).
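The steps used in Example 2.8 can be sketched in code. The following is only an illustrative sketch, not part of the guide: the function name `parabola_features` and the dictionary keys are made up here, and it simply applies the completed-square form $y = a(x - p)^2 + q$ with $p = -b/(2a)$ and $q = c - b^2/(4a)$.

```python
import math


def parabola_features(a, b, c):
    """Return the sketching features of y = a*x**2 + b*x + c (a != 0).

    Hypothetical helper for illustration only.
    """
    if a == 0:
        raise ValueError("a must be non-zero for a parabola")
    # Completed-square form y = a*(x - p)**2 + q
    p = -b / (2 * a)
    q = c - b**2 / (4 * a)
    # x-intercepts exist only if the discriminant is non-negative
    disc = b**2 - 4 * a * c
    if disc < 0:
        roots = []
    elif disc == 0:
        roots = [p]
    else:
        r = math.sqrt(disc)
        roots = sorted([(-b - r) / (2 * a), (-b + r) / (2 * a)])
    return {
        "y_intercept": c,
        "x_intercepts": roots,
        "turning_point": (p, q),
        "kind": "minimum" if a > 0 else "maximum",
    }


# Example 2.8(a): y = x^2 - 4x + 3
print(parabola_features(1, -4, 3))
# Example 2.8(b): y = -x^2 + 2x + 3
print(parabola_features(-1, 2, 3))
```

Running this on the two parabolae from Example 2.8 should reproduce the intercepts and turning points found by hand above.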

[Figure 2.13: In (a) we have a sketch of the parabola $y = x^2 - 4x + 3$ from Example 2.8(a). In (b) we have a sketch of the parabola $y = -x^2 + 2x + 3$ from Example 2.8(b).]

Activity 2.20 Given the equation of a parabola in completed square form, i.e.

$$y = a(x - p)^2 + q,$$

use the fact that $(x - p)^2 \geq 0$ for all $x \in \mathbb{R}$ to explain why the turning point of this parabola will be a minimum if $a > 0$ and a maximum if $a < 0$.

2.2.2 Circles

A circle of radius $r$ centred on the point $(a, b)$ has an equation of the form

$$(x - a)^2 + (y - b)^2 = r^2.$$

Of course, such a circle is easy to draw and its x and y-intercepts can be found by

seeing where y = 0 and where x = 0 respectively. Once we have this information, the

circle should be easy to draw as the next example shows.

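Example 2.9 Find the radius and centre of the circle whose equation is given by

$$x^2 - 6x + y^2 - 8y = 0.$$

Sketch the circle.

We are told that

$$x^2 - 6x + y^2 - 8y = 0,$$

is the equation of a circle and so, completing the square in $x$ and $y$, we find that

$$(x - 3)^2 - 9 + (y - 4)^2 - 16 = 0 \implies (x - 3)^2 + (y - 4)^2 = 25,$$

and so, comparing this with $(x - a)^2 + (y - b)^2 = r^2$ we see that we have a circle of radius 5 centred on the point $(3, 4)$. We also find that:

For the x-intercepts: Setting $y = 0$ we get

$$(x - 3)^2 + 16 = 25 \implies (x - 3)^2 = 9 \implies x - 3 = \pm 3,$$

and so $x = 0$ and $x = 6$.

For the y-intercepts: Setting $x = 0$ we get

$$9 + (y - 4)^2 = 25 \implies (y - 4)^2 = 16 \implies y - 4 = \pm 4,$$

and so $y = 0$ and $y = 8$.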

Putting this information together, we then get the sketch in Figure 2.14(a).
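Completing the square in Example 2.9 can be checked numerically. The sketch below is illustrative only: it assumes the circle is written as $x^2 + y^2 + dx + ey + f = 0$, and the name `circle_from_general` and the coefficient names `d`, `e`, `f` are not from the guide.

```python
import math


def circle_from_general(d, e, f):
    """Centre and radius of x^2 + y^2 + d*x + e*y + f = 0.

    Completing the square gives (x + d/2)^2 + (y + e/2)^2 = d^2/4 + e^2/4 - f.
    Hypothetical helper for illustration only.
    """
    a, b = -d / 2, -e / 2
    r_squared = a * a + b * b - f
    if r_squared <= 0:
        raise ValueError("equation does not describe a circle")
    return (a, b), math.sqrt(r_squared)


# Example 2.9: x^2 - 6x + y^2 - 8y = 0
centre, radius = circle_from_general(-6, -8, 0)
print(centre, radius)
```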

[Figure 2.14: In (a) we have a sketch of the circle from Example 2.9. In (b) we have a sketch of the ellipse from Example 2.10.]

2.2.3 Ellipses

An ellipse centred on the origin has an equation of the form

$$\frac{x^2}{a^2} + \frac{y^2}{b^2} = 1.$$

In particular, an ellipse of this form is effectively a circle centred on the origin that has

been squashed and it is easy to draw once we have found its x and y-intercepts by

seeing where y = 0 and where x = 0 respectively.

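Example 2.10 Sketch the ellipse whose equation is given by

$$\frac{x^2}{4} + \frac{y^2}{9} = 1.$$

Given that

$$\frac{x^2}{4} + \frac{y^2}{9} = 1,$$

we see that the x-intercepts, which occur when $y = 0$, are given by

$$\frac{x^2}{4} = 1 \implies x^2 = 4 \implies x = \pm 2,$$

whereas the y-intercepts, which occur when $x = 0$, are given by

$$\frac{y^2}{9} = 1 \implies y^2 = 9 \implies y = \pm 3.$$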

Putting this information together, and bearing in mind that this should look like a

circle centred on the origin that has been squashed, we then get the sketch in

Figure 2.14(b).

2.2.4 Hyperbolae

A hyperbola centred on the origin has an equation of the form

$$\frac{x^2}{a^2} - \frac{y^2}{b^2} = 1.$$

This curve will have x-intercepts which can be found by setting y = 0, but no

y-intercepts. It will also have slant (or oblique) asymptotes which can be found by

writing the equation as

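$$\frac{y^2}{x^2} = b^2\left(\frac{1}{a^2} - \frac{1}{x^2}\right),$$

so that, as $x \to \pm\infty$, we have $1/x^2 \to 0$ and this leaves us with

$$\frac{y^2}{x^2} = \frac{b^2}{a^2} \implies y = \pm\frac{b}{a}x,$$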

as the equations of the asymptotes. Once we have this information, the hyperbola

should be easy to draw as the next example shows.

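Example 2.11 Sketch the hyperbola whose equation is given by

$$\frac{x^2}{4} - \frac{y^2}{9} = 1.$$

Given that

$$\frac{x^2}{4} - \frac{y^2}{9} = 1,$$

we see that the x-intercepts, which occur when $y = 0$, are given by

$$\frac{x^2}{4} = 1 \implies x^2 = 4 \implies x = \pm 2,$$

whereas there are no y-intercepts since, setting $x = 0$, we get

$$-\frac{y^2}{9} = 1 \implies y^2 = -9,$$

which has no real solutions. To find the asymptotes, we write the equation as

$$\frac{y^2}{x^2} = 9\left(\frac{1}{4} - \frac{1}{x^2}\right),$$

so that, as $x \to \pm\infty$, we have $1/x^2 \to 0$ and this leaves us with

$$\frac{y^2}{x^2} = \frac{9}{4} \implies y = \pm\frac{3}{2}x,$$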

as the equations of the asymptotes. Putting this information together, we then get

the sketch in Figure 2.15(a).

[Figure 2.15: In (a) we have a sketch of the hyperbola from Example 2.11. In (b) we have a sketch of the rectangular hyperbola from Example 2.12.]

Of course, similar remarks apply to a hyperbola which has an equation of the form

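$$\frac{y^2}{b^2} - \frac{x^2}{a^2} = 1,$$

and, in particular, this curve will have y-intercepts but no x-intercepts.

Activity 2.21 Sketch the hyperbola whose equation is given by

$$\frac{y^2}{9} - \frac{x^2}{4} = 1.$$

Another kind of hyperbola that we will encounter is the rectangular hyperbola, which has an equation of the form

$$(x - a)(y - b) = c,$$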

and this arises when the asymptotes turn out to be the horizontal line y = b and the

vertical line x = a as the next example illustrates.

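Example 2.12 Sketch the rectangular hyperbola whose equation is given by

$$(x - 1)(y - 1) = 2.$$

Given that $(x - 1)(y - 1) = 2$, we can see that:

For the x-intercept: Setting $y = 0$ we get $(x - 1)(-1) = 2$ or $x - 1 = -2$, i.e. the x-intercept is given by $x = -1$.

For the y-intercept: Setting $x = 0$ we get $(-1)(y - 1) = 2$ or $y - 1 = -2$, i.e. the y-intercept is given by $y = -1$.

For the asymptotes: Writing the equation as

$$y = 1 + \frac{2}{x - 1},$$

we see that there is a vertical asymptote at $x = 1$ since, as $x \to 1^+$, we have $y \to \infty$ and, as $x \to 1^-$, we have $y \to -\infty$. There is also a horizontal asymptote at $y = 1$ since, as $x \to \infty$, we have $y \to 1$ from above and, as $x \to -\infty$, we have $y \to 1$ from below.

Putting this information together, we then get the sketch in Figure 2.15(b). In particular, observe that here we have

$$y = 1 + \frac{2}{x - 1} = \frac{(x - 1) + 2}{x - 1} = \frac{x + 1}{x - 1},$$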

and so this gives us y = f (x) where f (x) is the first function in Example 2.2 which

was illustrated in Figure 2.9(a).

Learning outcomes

At the end of this chapter and having completed the relevant reading and activities, you

should be able to:

identify elementary functions and sketch their graphs;

find combinations of elementary functions and inverses (if they exist);

use identities to rewrite expressions involving powers, logarithms and trigonometric

functions;

solve problems from economics-based subjects that involve functions;

identify and sketch conic sections.

Solutions to activities

Solution to activity 2.1

Using the triangles in Figure 2.5 and the definition of the tangent function, it should be

clear that

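$$\tan\frac{\pi}{6} = \frac{1}{\sqrt{3}}, \quad \tan\frac{\pi}{4} = 1 \quad\text{and}\quad \tan\frac{\pi}{3} = \sqrt{3}.$$

Indeed, using the fact that

$$\text{angle in radians} = \frac{2\pi}{360} \times \text{angle in degrees},$$

these angles are 30, 45 and 60 degrees respectively.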

Solution to activity 2.2

In this case, the unit circle method gives us the situation illustrated in Figure 2.16 and so the angle in the triangle would be $\pi/3$ (i.e. $\pi - \frac{2\pi}{3} = \frac{\pi}{3}$ as the angle subtended by a straight line, in this case the x-axis, is $\pi$) giving $x$ a magnitude of $1/2$ and $y$ a magnitude of $\sqrt{3}/2$ whereas their signs would be negative for $x$ (as $x < 0$) and positive for $y$ (as $y > 0$). Thus we see that

$$\sin\frac{2\pi}{3} = \frac{\sqrt{3}}{2} \quad\text{and}\quad \cos\frac{2\pi}{3} = -\frac{1}{2},$$

using the unit circle method.

[Figure 2.16: For Activity 2.2, we find $\sin\theta$ and $\cos\theta$ when $\theta = 2\pi/3$ by considering a unit circle.]

Solution to activity 2.3

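Using the unit circle in Figure 2.17(a), it should be clear that

$$\sin 0 = 0 \quad\text{and}\quad \cos 0 = 1,$$

whereas using the unit circle in Figure 2.17(b), it should be clear that

$$\sin\frac{\pi}{2} = 1 \quad\text{and}\quad \cos\frac{\pi}{2} = 0.$$

Then, using similar reasoning, we should be able to deduce that

$$\sin\frac{3\pi}{2} = -1, \quad \cos\frac{3\pi}{2} = 0, \quad \sin 2\pi = 0 \quad\text{and}\quad \cos 2\pi = 1,$$

are the other values of the functions $\sin\theta$ and $\cos\theta$ that we seek.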

Solution to activity 2.4

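From the definition of $\cot\theta$, we have

$$\cot\theta = \frac{1}{\tan\theta} = \frac{1}{\sin\theta/\cos\theta} = \frac{\cos\theta}{\sin\theta},$$

as we know that $\tan\theta = \sin\theta/\cos\theta$. This function is defined as long as $\theta \neq n\pi$ for $n \in \mathbb{Z}$ since, at these values of $\theta$, we have $\tan\theta = 0$ or, equivalently, $\sin\theta = 0$.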

[Figure 2.17: For Activity 2.3, we find $\sin\theta$ and $\cos\theta$ by considering a unit circle when (a) $\theta = 0$ and (b) $\theta = \pi/2$.]

Given the functions $f : \mathbb{R} \to \mathbb{R}$ and $g : \mathbb{R} \to \mathbb{R}$ where $f(x) = x^2 + 1$ and $g(x) = 2^x$, we see that:

$g \circ f$ is the function where

$$(g \circ f)(x) = g(f(x)) = g(x^2 + 1) = 2^{x^2 + 1},$$

and $(g \circ f) : \mathbb{R} \to \mathbb{R}$.

$f \circ g$ is the function where

$$(f \circ g)(x) = f(g(x)) = f(2^x) = (2^x)^2 + 1 = 2^{2x} + 1,$$

and $(f \circ g) : \mathbb{R} \to \mathbb{R}$.

If we have $f(x) = x^3$ and $g(x) = x^2 + 5$, then the function $(x^2 + 5)^3$ can be written as

$$(x^2 + 5)^3 = f(x^2 + 5) = f(g(x)) = (f \circ g)(x),$$

i.e. it is the composition we get from applying $f$ after $g$ or, in terms of $x$, it is the composition of the function $x^3$ after the function $x^2 + 5$.
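The order of composition matters, and a short sketch can make this concrete. The helper name `compose` below is made up for illustration; it simply builds the function "outer after inner" for the two functions $f(x) = x^2 + 1$ and $g(x) = 2^x$ used above.

```python
def compose(outer, inner):
    """Return the composition 'outer after inner', i.e. x -> outer(inner(x))."""
    return lambda x: outer(inner(x))


def f(x):
    return x**2 + 1


def g(x):
    return 2**x


g_after_f = compose(g, f)  # x -> 2**(x**2 + 1)
f_after_g = compose(f, g)  # x -> (2**x)**2 + 1 = 2**(2*x) + 1

for x in (0, 1, 2):
    print(x, g_after_f(x), f_after_g(x))
```

The two compositions generally give different values at the same $x$, which is why $g \circ f$ and $f \circ g$ have to be treated as different functions.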

Solution to activity 2.7

By considering the graphs of the functions $f(x) = x + 2$ and $f^{-1}(x) = x - 2$ as illustrated in Figure 2.18, we see that the latter is indeed the reflection in the line $y = x$ of the former. Alternatively, we can see that if $y = x + 2$, a reflection in the line $y = x$ just means replacing all points $(x, y)$ that satisfy this equation with points given by $(y, x)$ to get the new equation $x = y + 2$. But, of course, this gives $y = x - 2$ which is what we wanted.

Solution to activity 2.8

When we considered the function $f : \mathbb{R} \to \mathbb{R}$ in Example 2.5, there were two problems that prevented us from finding an inverse. To counteract these so that we can find the local inverses of this function, we note that:

If we take the co-domain to be the interval $[0, \infty)$ so that we have $y \geq 0$, then we remove the problem that occurs because $y = x^2$ has no solution for $y < 0$.

If we take the two domains given by the intervals $(-\infty, 0]$ and $[0, \infty)$ so that we have $x \leq 0$ and $x \geq 0$ respectively, then we remove the problem that occurs because $y = x^2$ has two solutions for $x \in \mathbb{R}$.

[Figure 2.18: For Activity 2.7, we see that the graph of $f^{-1}(x) = x - 2$ is the reflection of the graph of $f(x) = x + 2$ in the line $y = x$.]

For the function with domain $[0, \infty)$, we have

$$y = f(x) \implies y = x^2 \implies x = \sqrt{y},$$

as $x \geq 0$ because $x \in [0, \infty)$. Thus, using $x = f^{-1}(y)$, the inverse of this function is $f^{-1}(y) = \sqrt{y}$ or $f^{-1}(x) = \sqrt{x}$ if we want it in terms of $x$.

For the function with domain $(-\infty, 0]$, we have

$$y = f(x) \implies y = x^2 \implies x = -\sqrt{y},$$

as $x \leq 0$ because $x \in (-\infty, 0]$. Thus, using $x = f^{-1}(y)$, the inverse of this function is $f^{-1}(y) = -\sqrt{y}$ or $f^{-1}(x) = -\sqrt{x}$ if we want it in terms of $x$.

In particular, this means that the local inverses of $f : \mathbb{R} \to [0, \infty)$ where $f(x) = x^2$ are $\sqrt{x}$ and $-\sqrt{x}$.

Solution to activity 2.9

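The function $f : [0, \infty) \to [0, \infty)$ where $f(x) = x^2$ has an inverse, $f^{-1} : [0, \infty) \to [0, \infty)$ where $f^{-1}(x) = \sqrt{x}$, and the graphs of these functions are illustrated in Figure 2.19. In particular, observe that the curve $y = \sqrt{x}$ is the reflection about the line $y = x$ of the curve $y = x^2$ and that all three of these curves intersect at the points $(0, 0)$ and $(1, 1)$.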

Solution to activity 2.10

From the graphs of the function $f(x) = x^n$ where $f : \mathbb{R} \to \mathbb{R}$ when $n$ is odd, which we saw in Figure 2.2(a) and (c), it should be clear that the equation $y = f(x)$ has a unique solution, $x$, for all $y \in \mathbb{R}$ and so the inverse of this function exists. In particular, we see that

$$y = x^n \implies x = y^{1/n} = \sqrt[n]{y},$$

gives us this unique solution for any $y \in \mathbb{R}$ provided that $n$ is odd and so we have $f^{-1}(y) = \sqrt[n]{y}$ as the inverse function or, indeed, $f^{-1}(x) = \sqrt[n]{x}$ if we want it in terms of $x$.

[Figure 2.19: For Activity 2.9, we see that the graph of $f^{-1}(x) = \sqrt{x}$ is the reflection of the graph of $f(x) = x^2$ in the line $y = x$.]

However, from the graph of the function $f(x) = x^n$ where $f : \mathbb{R} \to \mathbb{R}$ when $n$ is even, which we saw in Figure 2.2(b), it should be clear that when:

$y < 0$, the equation $y = f(x)$ has no solution for $x$ as we know that $x \in \mathbb{R}$ means that $y = x^n \geq 0$ when $n$ is even.

$y > 0$, the equation $y = f(x)$ has two solutions for $x$ as we know that we can get $x = \pm\sqrt[n]{y} \in \mathbb{R}$ when $n$ is even.

As such, we cannot find a unique solution, $x$, for all $y \in \mathbb{R}$ and so the inverse of this function cannot exist.

Solution to activity 2.11

We saw the graph of a function like $f : \mathbb{R} \to (0, \infty)$ where $f(x) = 2^x$ in Figure 2.3(b) since we have $a = 2 > 1$ here. As such, we find that the graphs of the function $f : \mathbb{R} \to (0, \infty)$ where $f(x) = 2^x$ and its inverse, $f^{-1}(x) = \log_2 x$ where $f^{-1} : (0, \infty) \to \mathbb{R}$, are as illustrated in Figure 2.20. In particular, observe that the curve $y = \log_2 x$ is the reflection about the line $y = x$ of the curve $y = 2^x$.

[Figure 2.20: For Activity 2.11, we see that the graph of $f^{-1}(x) = \log_2 x$ is the reflection of the graph of $f(x) = 2^x$ in the line $y = x$.]

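To find the acute angles $\theta_1$ and $\theta_2$ where $\theta_1 = \sin^{-1}\frac{1}{2}$ and $\theta_2 = \cos^{-1}\frac{1}{2}$, we use the table of values in Section 2.1.1 to see that

$$\sin\theta_1 = \frac{1}{2} = \sin\frac{\pi}{6} \quad\text{gives us}\quad \theta_1 = \sin^{-1}\frac{1}{2} = \frac{\pi}{6},$$

and

$$\cos\theta_2 = \frac{1}{2} = \cos\frac{\pi}{3} \quad\text{gives us}\quad \theta_2 = \cos^{-1}\frac{1}{2} = \frac{\pi}{3},$$

whereas to find the acute angle $\theta_3$ where $\theta_3 = \tan^{-1} 1$, we use the table we found in Activity 2.1 to see that

$$\tan\theta_3 = 1 = \tan\frac{\pi}{4} \quad\text{gives us}\quad \theta_3 = \tan^{-1} 1 = \frac{\pi}{4}.$$

We also have

$$\operatorname{cosec}\theta_1 = \frac{1}{\sin\theta_1} = \frac{1}{1/2} = 2, \quad \sec\theta_2 = \frac{1}{\cos\theta_2} = \frac{1}{1/2} = 2 \quad\text{and}\quad \cot\theta_3 = \frac{1}{\tan\theta_3} = \frac{1}{1} = 1,$$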

using the definitions of the reciprocals of our three trigonometric functions, which we

saw in Section 2.1.2.

Solution to activity 2.13

Given that $f(x) = x^3$, $g(x) = x^4$ and $h(x) = 2^x$, we use the definitions of the combinations of functions we need from Section 2.1.2, to get

$$(f \cdot g)(x) = f(x)g(x) = (x^3)(x^4) = x^7,$$

$$(f/g)(x) = \frac{f(x)}{g(x)} = \frac{x^3}{x^4} = \frac{1}{x}, \quad\text{and}$$

$$(g \circ h)(x) = g(h(x)) = g(2^x) = (2^x)^4 = 2^{4x},$$

where we have used the power laws to simplify our answers. Indeed, observe that for the last function, we can also write $2^{4x} = (2^4)^x = 16^x$.

Solution to activity 2.14

To derive the laws of logarithms, we note that for the first one, we use the power laws and the given fact to get

$$a^{\log_a x + \log_a y} = a^{\log_a x}\, a^{\log_a y} = xy = a^{\log_a (xy)},$$

which means that $\log_a x + \log_a y = \log_a (xy)$, for the second one, we similarly get

$$a^{\log_a x - \log_a y} = \frac{a^{\log_a x}}{a^{\log_a y}} = \frac{x}{y} = a^{\log_a (x/y)},$$

which means that $\log_a x - \log_a y = \log_a (x/y)$ and for the third one, we get

$$a^{y \log_a x} = \left(a^{\log_a x}\right)^y = x^y = a^{\log_a (x^y)},$$

which means that $y \log_a x = \log_a (x^y)$.

We take logarithms to the base $b$ on both sides of the given fact to see that

$$a^{\log_a x} = x \implies \log_b\left(a^{\log_a x}\right) = \log_b x \implies (\log_a x)(\log_b a) = \log_b x,$$

where we have used the third law of logarithms in the last step. Then, dividing through on both sides by $\log_b a$ (which is non-zero as $a \neq 1$), we get

$$\log_a x = \frac{\log_b x}{\log_b a},$$

as required.
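The change of base formula just derived is easy to check numerically. The sketch below is only an illustration: the name `log_via_base` is made up here, and it relies on the two-argument form of `math.log(x, base)` from the Python standard library.

```python
import math


def log_via_base(x, a, b):
    """Compute log_a(x) as log_b(x) / log_b(a), for a, b > 0 with a, b != 1."""
    return math.log(x, b) / math.log(a, b)


# log_2(8) = 3, whichever intermediate base b we use
for b in (10, math.e, 5):
    print(b, log_via_base(8, 2, b))
```

Because of floating-point rounding, the printed values should agree with 3 only to within a small tolerance.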

Solution to activity 2.16

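Starting with $\sin^2\theta + \cos^2\theta = 1$, we divide both sides by $\sin^2\theta$ to get

$$\frac{\sin^2\theta + \cos^2\theta}{\sin^2\theta} = \frac{1}{\sin^2\theta} \implies \frac{\sin^2\theta}{\sin^2\theta} + \frac{\cos^2\theta}{\sin^2\theta} = \frac{1}{\sin^2\theta} \implies 1 + \left(\frac{\cos\theta}{\sin\theta}\right)^2 = \left(\frac{1}{\sin\theta}\right)^2,$$

so that $1 + \cot^2\theta = \operatorname{cosec}^2\theta$ if we use the definition of cosec from Section 2.1.2 and the result from Activity 2.4. Then, again starting with $\sin^2\theta + \cos^2\theta = 1$, we divide both sides by $\cos^2\theta$ to get

$$\frac{\sin^2\theta + \cos^2\theta}{\cos^2\theta} = \frac{1}{\cos^2\theta} \implies \frac{\sin^2\theta}{\cos^2\theta} + \frac{\cos^2\theta}{\cos^2\theta} = \frac{1}{\cos^2\theta} \implies \left(\frac{\sin\theta}{\cos\theta}\right)^2 + 1 = \left(\frac{1}{\cos\theta}\right)^2,$$

so that $\tan^2\theta + 1 = \sec^2\theta$ if we use the definition of sec and (2.1) from Section 2.1.2.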

Solution to activity 2.17

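With the given facts, we can use the compound-angle formula for $\sin(\theta + \phi)$ to see that

$$\sin(\theta - \phi) = \sin(\theta + (-\phi)) = \sin\theta\cos(-\phi) + \cos\theta\sin(-\phi) = \sin\theta\cos\phi - \cos\theta\sin\phi,$$

and the compound-angle formula for $\cos(\theta + \phi)$ to see that

$$\cos(\theta - \phi) = \cos(\theta + (-\phi)) = \cos\theta\cos(-\phi) - \sin\theta\sin(-\phi) = \cos\theta\cos\phi + \sin\theta\sin\phi,$$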

as required.

Solution to activity 2.18

Using the compound-angle formula

$$\sin(\theta + \phi) = \sin\theta\cos\phi + \cos\theta\sin\phi,$$

with $\phi = \theta$ we get

$$\sin(\theta + \theta) = \sin\theta\cos\theta + \cos\theta\sin\theta \implies \sin(2\theta) = 2\sin\theta\cos\theta,$$

and using the compound-angle formula

$$\cos(\theta + \phi) = \cos\theta\cos\phi - \sin\theta\sin\phi,$$

with $\phi = \theta$ we get

$$\cos(\theta + \theta) = \cos\theta\cos\theta - \sin\theta\sin\theta \implies \cos(2\theta) = \cos^2\theta - \sin^2\theta,$$

as required. Indeed, since we also have the Pythagorean identity $\sin^2\theta + \cos^2\theta = 1$, we can write this last double-angle formula as

Solution to activity 2.19

From the graph in Figure 2.11, we can see that the economically meaningful part of the supply function is $q^S : [0, \infty) \to [1, \infty)$ where $q^S(p) = p + 1$ and the economically meaningful part of the demand function is $q^D : [0, 3] \to [0, 3]$ where $q^D(p) = 3 - p$. Clearly, both of these functions are invertible as each $q$ in the co-domain gives rise to a unique $p$ in the domain and we find that

$$q = p + 1 \implies p = p^S(q) = q - 1 \quad\text{and}\quad q = 3 - p \implies p = p^D(q) = 3 - q.$$

Solution to activity 2.20

Given that

$$y = a(x - p)^2 + q,$$

we see that:

If $a > 0$, then for any $x \in \mathbb{R}$,

$$(x - p)^2 \geq 0 \implies a(x - p)^2 \geq 0 \implies a(x - p)^2 + q \geq q,$$

i.e. for all $x \in \mathbb{R}$, $y \geq q$ and so the smallest value of $y$ occurs when $y = q$ which, in turn, means that we must have $x = p$. Thus, the turning point of the parabola is a minimum and this occurs at the point $(p, q)$.

If $a < 0$, then for any $x \in \mathbb{R}$,

$$(x - p)^2 \geq 0 \implies a(x - p)^2 \leq 0 \implies a(x - p)^2 + q \leq q,$$

i.e. for all $x \in \mathbb{R}$, $y \leq q$ and so the largest value of $y$ occurs when $y = q$ which, in

turn, means that we must have x = p. Thus, the turning point of the parabola is a

maximum and this occurs at the point (p, q).

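For the hyperbola with equation

$$\frac{y^2}{9} - \frac{x^2}{4} = 1,$$

we see that there are no x-intercepts since, setting $y = 0$, we get

$$-\frac{x^2}{4} = 1 \implies x^2 = -4,$$

which has no real solutions, whereas we see that the y-intercepts, which occur when $x = 0$, are given by

$$\frac{y^2}{9} = 1 \implies y^2 = 9 \implies y = \pm 3.$$

To find the asymptotes, we write the equation as

$$\frac{y^2}{x^2} = 9\left(\frac{1}{4} + \frac{1}{x^2}\right),$$

so that, as $x \to \pm\infty$, we have $1/x^2 \to 0$ and this leaves us with

$$\frac{y^2}{x^2} = \frac{9}{4} \implies y = \pm\frac{3}{2}x,$$

as the equations of the asymptotes. Putting this information together, we then get the sketch in Figure 2.21.

[Figure 2.21: A sketch of the hyperbola $\frac{y^2}{9} - \frac{x^2}{4} = 1$.]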


Exercises

Exercise 2.1

Sketch the graph of the function $f : \{x \in \mathbb{R} \mid x \neq 1, -1\} \to \mathbb{R}$ given by

$$f(x) = \frac{x^4 - 1}{x^2 - 1}.$$

Exercise 2.2

Use the compound-angle formulae to show that

$$\tan(\theta + \phi) = \frac{\tan\theta + \tan\phi}{1 - \tan\theta\tan\phi},$$

and hence deduce a formula for $\tan(2\theta)$.

Exercise 2.3

The supply and demand functions for a good are

$$q^S(p) = p - 4 \quad\text{and}\quad q^D(p) = 8 - p,$$

respectively. Sketch the graphs of these functions and find the equilibrium point.

A percentage [of the price] tax of $100r\%$ is imposed. Find the new equilibrium point and, by sketching the graph of the new supply function on your earlier sketch, comment on how the equilibrium point for the market has changed. How much of the tax has been passed on to the consumers? What is the maximum tax, $r_m$, that can be imposed if this market is to continue functioning?

Exercise 2.4

When selling a quantity, $q$, a firm makes a profit given by

$$\pi(q) = q^2 + 2q + 2,$$

and the largest quantity it can produce is 10. Sketch the graph of this profit function

and deduce the value of q that will yield the greatest profit for this firm.

Explain why the inverse profit function exists and find it.

Exercise 2.5

Sketch the circle and the rectangular hyperbola with equations

$$x^2 + y^2 = 1 \quad\text{and}\quad 2xy = 1,$$

respectively, and find the points at which they intersect.

Solutions to exercises

Solution to exercise 2.1

The function $f : \{x \in \mathbb{R} \mid x \neq 1, -1\} \to \mathbb{R}$ given by

$$f(x) = \frac{x^4 - 1}{x^2 - 1},$$

is clearly undefined at $x = 1$ and $x = -1$ as these values of $x$ would entail division by zero. However, we notice that factorising the numerator we get

$$f(x) = \frac{(x^2 - 1)(x^2 + 1)}{x^2 - 1},$$

and so, for $x \neq \pm 1$, we have

$$f(x) = x^2 + 1.$$

This means that, to sketch the graph of $f(x)$, we start by sketching the graph of the parabola

$$y = x^2 + 1,$$

which has

a y-intercept when $x = 0$, i.e. when $y = 1$,

no x-intercepts as $y = 0$ gives $x^2 + 1 = 0$ which has no real solutions, and

a turning point which is a minimum at the point $(0, 1)$.

We then exclude the points $(1, 2)$ and $(-1, 2)$ on the parabola at which $f(x)$ itself is undefined to get the sketch in Figure 2.22.

[Figure 2.22: For Exercise 2.1, a sketch of the graph of $f(x)$.]

Solution to exercise 2.2

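Using (2.1), we have

$$\tan(\theta + \phi) = \frac{\sin(\theta + \phi)}{\cos(\theta + \phi)},$$

and so, using the compound-angle formulae, we get

$$\tan(\theta + \phi) = \frac{\sin\theta\cos\phi + \cos\theta\sin\phi}{\cos\theta\cos\phi - \sin\theta\sin\phi} = \frac{\dfrac{\sin\theta}{\cos\theta} + \dfrac{\sin\phi}{\cos\phi}}{1 - \dfrac{\sin\theta\sin\phi}{\cos\theta\cos\phi}},$$

if we divide the numerator and denominator by $\cos\theta\cos\phi$ and cancel where appropriate. Thus, using (2.1) again, we have

$$\tan(\theta + \phi) = \frac{\tan\theta + \tan\phi}{1 - \tan\theta\tan\phi},$$

as required. Indeed, observe that this only makes sense if $\theta, \phi \neq (2n + 1)\frac{\pi}{2}$ for $n \in \mathbb{Z}$ as, if this isn't true, we can't divide through by $\cos\theta\cos\phi$ or, equivalently, one of $\tan\theta$ or $\tan\phi$ won't exist.

To deduce a formula for $\tan(2\theta)$, we set $\phi = \theta$ in the formula for $\tan(\theta + \phi)$ to get

$$\tan(2\theta) = \tan(\theta + \theta) = \frac{\tan\theta + \tan\theta}{1 - \tan\theta\tan\theta} = \frac{2\tan\theta}{1 - \tan^2\theta}.$$

Again, we observe that this only makes sense if $\theta \neq (2n + 1)\frac{\pi}{2}$ for $n \in \mathbb{Z}$ as, if this isn't true, $\tan\theta$ won't exist.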

Solution to exercise 2.3

Here the supply and demand functions are straight lines which can be easily sketched using the method outlined in Example 2.1 and the results of doing this are illustrated in Figure 2.23(a). To find the equilibrium price, $p^*$, we have

$$q^S(p^*) = q^D(p^*) \implies p^* - 4 = 8 - p^* \implies 2p^* = 12,$$

and so $p^* = 6$. The equilibrium quantity, $q^*$, is then given by

$$q^* = q^D(p^*) = 8 - p^*,$$

and so the equilibrium quantity is $q^* = 8 - 6 = 2$. Consequently, the equilibrium point is $(q^*, p^*) = (2, 6)$ which, as indicated in Figure 2.23(a), is the point at which the two straight lines intersect.

If a percentage [of the price] tax of $100r\%$ is imposed,⁶ the demand function is still

$$q_r^D(p) = q^D(p) = 8 - p,$$

but the supply function becomes

$$q_r^S(p) = q^S(p - rp) = p - rp - 4,$$

as the suppliers now see an effective price of $p - rp$.

⁶ Here we will start by restricting our attention to the case where $0 \leq r \leq 1$ as, prima facie, these are the values that would appear to be economically sensible. Although, as we will soon see, the economically meaningful values of $r$ will turn out to be $0 \leq r < 1/2$!

This means that the equilibrium price in the presence of tax, $p_r^*$, is given by

$$q_r^S(p_r^*) = q_r^D(p_r^*) \implies p_r^* - rp_r^* - 4 = 8 - p_r^* \implies p_r^*(2 - r) = 12 \implies p_r^* = \frac{12}{2 - r},$$

and so the equilibrium quantity in the presence of tax, $q_r^*$, is

$$q_r^* = q_r^D(p_r^*) = 8 - \frac{12}{2 - r} = \frac{16 - 8r - 12}{2 - r} = \frac{4 - 8r}{2 - r},$$

if we use the demand function, $q_r^D(p)$.⁷ Sketching the graph of the new supply function, as in Figure 2.23(b), we see that by writing its equation as

$$p = \frac{q}{1 - r} + \frac{4}{1 - r},$$

it has a gradient and a p-intercept that satisfy

$$\frac{1}{1 - r} \geq 1 \quad\text{and}\quad \frac{4}{1 - r} \geq 4,$$

when considering $0 \leq r < 1$. This means that it is steeper than the old one and that the p-intercept, which is now

$$\frac{4}{1 - r} = \frac{4(1 - r) + 4r}{1 - r} = 4 + \frac{4r}{1 - r},$$

has increased by $4r/(1 - r)$. In this case, as the equilibrium price has increased from 6 to

$$\frac{12}{2 - r} = \frac{6(2 - r) + 6r}{2 - r} = 6 + \frac{6r}{2 - r},$$

we see that the consumer pays $6r/(2 - r)$ more. But, as the total tax to be paid by the supplier is given by

$$rp_r^* = r\,\frac{12}{2 - r} = \frac{12r}{2 - r},$$

this means that only half of the tax has been passed on to the consumer in this case. Of course, the equilibrium quantity in the presence of the tax must be positive and so, for the market to function, we require that

$$q_r^* > 0 \implies \frac{4 - 8r}{2 - r} > 0 \implies 4 > 8r \implies r < \frac{1}{2},$$

(bearing in mind that $2 - r > 0$ if $0 \leq r \leq 1$), i.e. the maximum tax, $r_m$, that can be imposed is given by $r_m = 1/2$.
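The formulae for the taxed equilibrium can be checked with exact arithmetic. The following is an illustrative sketch only: the function name `equilibrium_with_tax` is made up, and it hard-codes the supply and demand functions of Exercise 2.3, using `fractions.Fraction` so that no rounding occurs.

```python
from fractions import Fraction


def equilibrium_with_tax(r):
    """Equilibrium (quantity, price) for q^S(p) = p - 4, q^D(p) = 8 - p
    when a percentage tax of 100r% is imposed, for 0 <= r < 1/2.

    Hypothetical helper for illustration only.
    """
    r = Fraction(r)
    if not (0 <= r < Fraction(1, 2)):
        raise ValueError("the market only functions for 0 <= r < 1/2")
    price = Fraction(12) / (2 - r)   # solves p - r*p - 4 = 8 - p
    quantity = 8 - price             # from the unchanged demand function
    return quantity, price


q0, p0 = equilibrium_with_tax(0)
q1, p1 = equilibrium_with_tax(Fraction(1, 4))
print(q0, p0)   # no tax
print(q1, p1)   # 25% tax

# Share of the tax borne by the consumer: price rise divided by tax paid
tax_paid = Fraction(1, 4) * p1
print((p1 - p0) / tax_paid)
```

With no tax this recovers the equilibrium point $(2, 6)$, and for any admissible $r$ the final ratio should come out as one half, matching the conclusion above.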

⁷ Alternatively, we could use the new supply function,

$$q_r^S(p) = q^S(p - rp) = p - rp - 4,$$

to find $q_r^*$. However, we cannot use $q^S(p) = p - 4$ as this no longer holds in the presence of the tax!

[Figure 2.23: For Exercise 2.3, a sketch of the graphs of the supply and demand functions indicating the equilibrium point of the market when (a) there is no tax and (b) a percentage of the price tax of $100r\%$ is imposed. (Note that these sketches only make economic sense when $q \geq 0$.)]

Solution to exercise 2.4

The firm's profit function is

$$\pi(q) = q^2 + 2q + 2,$$

and its domain is the interval $[0, 10]$ as $q \geq 0$ since it is a quantity and $q \leq 10$ since the largest quantity it can produce is 10. So, to sketch the graph of this profit function, we start by sketching the parabola

$$y = q^2 + 2q + 2 = (q + 1)^2 + 1,$$

in completed square form. This has

a y-intercept when $q = 0$, i.e. $y = 2$,

no q-intercepts as $y = 0$ gives $(q + 1)^2 + 1 = 0$ which has no real solutions, and

a turning point which is a minimum at the point $(-1, 1)$.

We then restrict our attention to the relevant values of $q$, i.e. those that satisfy $0 \leq q \leq 10$, to get a sketch of the graph of the profit function itself as illustrated in Figure 2.24.

Looking at the graph of the profit function, we see that as it is a function $\pi : [0, 10] \to [2, 122]$, its inverse exists since there is a unique $q \in [0, 10]$ such that $y = \pi(q)$ for all $y \in [2, 122]$. Indeed, solving this equation we find that, using the completed square form above, we have

$$y = (q + 1)^2 + 1 \implies (q + 1)^2 = y - 1 \implies q + 1 = \pm\sqrt{y - 1} \implies q = -1 \pm \sqrt{y - 1},$$

which gives us two values of $q$ for each value of $y > 1$. However, as we must be getting values of $q \in [0, 10]$ from our inverse function, we take the $+$ sign here (i.e. we discard the $-$ sign) so that we can get the solutions where $q \geq -1$ (instead of getting the solutions where $q \leq -1$ which we don't want). That is, we have found that

$$q = \pi^{-1}(y) = -1 + \sqrt{y - 1},$$

is the inverse profit function.

[Figure 2.24: For Exercise 2.4, a sketch of the graph of the profit function, $\pi(q)$. (Note that dashed parts of the curve are on the parabola but are not part of the graph of the profit function.)]

Solution to exercise 2.5

To sketch the circle and the rectangular hyperbola, we note that:

The circle with equation

$$x^2 + y^2 = 1,$$

is centred on the origin and has a radius of 1. Indeed, setting $x = 0$, we find that its y-intercepts are $y = \pm 1$ and, setting $y = 0$, we find that its x-intercepts are $x = \pm 1$.

The rectangular hyperbola with equation

$$2xy = 1 \implies y = \frac{1}{2x},$$

has the x and y-axes, i.e. the lines $y = 0$ and $x = 0$, respectively, as its asymptotes since:

For the vertical asymptote: As $x \to 0^+$ we have $y \to \infty$ and as $x \to 0^-$ we have $y \to -\infty$.

For the horizontal asymptote: As $x \to \infty$ we have $y \to 0$ from above and as $x \to -\infty$ we have $y \to 0$ from below.

It also has no x-intercepts (as no value of $x$ makes $y = 0$) and no y-intercepts (as no value of $y$ makes $x = 0$),

and these curves are illustrated in Figure 2.25.

To find the points of intersection of these two curves we have to solve the equations

$$x^2 + y^2 = 1 \quad\text{and}\quad 2xy = 1,$$

simultaneously. This can easily be done in two different ways which we give here for completeness.

Method I: The equation $2xy = 1$ tells us that, say, $y = 1/(2x)$ and substituting this into the other equation we get

$$x^2 + y^2 = 1 \implies x^2 + \frac{1}{4x^2} = 1 \implies 4x^4 - 4x^2 + 1 = 0,$$

which can be factorised to give

$$(2x^2 - 1)^2 = 0 \implies x^2 = \frac{1}{2} \implies x = \pm\frac{1}{\sqrt{2}}.$$

This then gives us

$$y = \frac{1}{2x} = \pm\frac{\sqrt{2}}{2} = \pm\frac{1}{\sqrt{2}},$$

as the corresponding values of $y$.

Method II: We note that, using our equations, we have

$$(x - y)^2 = x^2 - 2xy + y^2 = (x^2 + y^2) - (2xy) = 1 - 1 = 0,$$

and so any solutions we seek must satisfy $(x - y)^2 = 0$ or, equivalently, $x = y$. If we substitute this into one of our equations, say $2xy = 1$, we get

$$2xy = 1 \implies 2y^2 = 1 \implies y^2 = \frac{1}{2} \implies y = \pm\frac{1}{\sqrt{2}},$$

and, as $x = y$, these are also the corresponding values of $x$.

So, whichever method you choose, we find that the points of intersection of these two curves are $(1/\sqrt{2}, 1/\sqrt{2})$ and $(-1/\sqrt{2}, -1/\sqrt{2})$, both of which are indicated in Figure 2.25.
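As a quick sanity check on either method, we can substitute the two points back into both equations. This is only a numerical sketch (the helper name `on_both_curves` is made up), so the comparisons use `math.isclose` rather than exact equality.

```python
import math


def on_both_curves(x, y):
    """True if (x, y) satisfies x^2 + y^2 = 1 and 2xy = 1, up to rounding."""
    return math.isclose(x * x + y * y, 1.0) and math.isclose(2 * x * y, 1.0)


s = 1 / math.sqrt(2)
for point in [(s, s), (-s, -s), (s, -s)]:
    print(point, on_both_curves(*point))
```

The third point, $(1/\sqrt{2}, -1/\sqrt{2})$, lies on the circle but not on the hyperbola, which is why only two points of intersection arise.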

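[Figure 2.25: For Exercise 2.5, a sketch of the circle $x^2 + y^2 = 1$ and the rectangular hyperbola $2xy = 1$, which intersect at the points $(1/\sqrt{2}, 1/\sqrt{2})$ and $(-1/\sqrt{2}, -1/\sqrt{2})$.]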

Chapter 3

Differentiation

Essential reading

(For full publication details, see Chapter 1.)

Binmore and Davies (2002) Sections 2.7–2.13.

Anthony and Biggs (1996) Chapter 6 and parts of Chapter 7.

Further reading

Simon and Blume (1994) Sections 2.3–2.7 and 3.6, Chapter 4 and Section 5.5.

Adams and Essex (2010) Sections 2.1–2.7, parts of Sections 3.1 and 3.3, parts of Sections 4.9 and 4.10.

Aims and objectives

The objectives of this chapter are as follows.

To introduce the idea of a derivative and see how it can be found using various

techniques.

To use derivatives to find tangent lines and approximate functions using various

techniques.

To see how derivatives can be used in economics-based subjects.

Specific learning outcomes can be found near the end of this chapter.

3.1

Having revised the idea of a function in the previous chapter, we now turn to differentiation, the process by which we find the derivative of a function. Given a function, $f$, its derivative at the point $a$, which we denote by $f'(a)$, is given by the formula

$$f'(a) = \lim_{h \to 0} \frac{f(a + h) - f(a)}{h},$$

provided that the limit exists. Indeed, when the limit exists, i.e. when we can find a value for $f'(a)$, we say that the function is differentiable at $a$. Observe that here, we have introduced the notation

$$\lim_{h \to 0} g(h),$$

to denote the value¹ of the function $g(h)$ as $h \to 0$ (provided, of course, that there is such a value) and we call this value the limit of $g(h)$ as $h \to 0$ whereas if there is no such value, we say that this limit does not exist.² To see how this works in practice, we can consider a simple example.

Example 3.1 Use the definition to find the derivative of the function $f(x) = x^2$ at the point $x = 3$.

We need to find $f'(3)$ and, using the formula above with $a = 3$, we start by looking at

$$\frac{f(3 + h) - f(3)}{h} = \frac{(3 + h)^2 - 3^2}{h},$$

which, looking at the numerator, is easily simplified to give

$$\frac{f(3 + h) - f(3)}{h} = \frac{(9 + 6h + h^2) - 9}{h} = \frac{6h + h^2}{h} = 6 + h.$$

This in turn means that

$$f'(3) = \lim_{h \to 0} \frac{f(3 + h) - f(3)}{h} = \lim_{h \to 0} (6 + h) = 6,$$

so that the derivative of $f(x)$ at the point $x = 3$, i.e. $f'(3)$, is 6. Indeed, we can say that the function $f(x) = x^2$ is differentiable at $x = 3$.
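The limit in Example 3.1 can also be explored numerically by evaluating the difference quotient for smaller and smaller values of $h$. This is only an informal sketch (the name `difference_quotient` is made up) and it illustrates, rather than proves, that the quotient approaches 6.

```python
def difference_quotient(f, a, h):
    """The quotient (f(a + h) - f(a)) / h from the definition of f'(a)."""
    return (f(a + h) - f(a)) / h


def square(x):
    return x**2


# For f(x) = x^2 at a = 3 the quotient simplifies to 6 + h,
# so the values should approach 6 as h shrinks.
for h in (0.1, 0.01, 0.001, -0.001):
    print(h, difference_quotient(square, 3, h))
```

Note that negative values of $h$ are allowed too: the limit as $h \to 0$ must be the same from both sides.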

Activity 3.1 Use the definition to find the derivative of the function f (x) = x2 at

the point x = 1.

More generally, instead of finding the derivative of $f$ at individual points, we can find its derivative at a general point, $x$, by finding $f'(x)$. Of course, according to our formula, this would involve finding

$$f'(x) = \lim_{h \to 0} \frac{f(x + h) - f(x)}{h},$$

and, provided the limit exists, this will give us another function of $x$. This can then be used to find the derivative, say $f'(a)$, at an individual point, $a$, by setting $x = a$ in our result. Let's see how this works.

Example 3.2 Use the definition to find the derivative of the function $f(x) = x^2$ at the general point $x$ and use this to verify that $f'(3) = 6$ as we found in Example 3.1.

We need to find $f'(x)$ and, using the formula above, we start by looking at

$$\frac{f(x + h) - f(x)}{h} = \frac{(x + h)^2 - x^2}{h},$$

which, looking at the numerator, is easily simplified to give

$$\frac{f(x + h) - f(x)}{h} = \frac{(x^2 + 2xh + h^2) - x^2}{h} = \frac{2xh + h^2}{h} = 2x + h.$$

This in turn means that

$$f'(x) = \lim_{h \to 0} \frac{f(x + h) - f(x)}{h} = \lim_{h \to 0} (2x + h) = 2x,$$

and so we see that the derivative of $f(x)$ at the general point $x$, i.e. $f'(x)$, is $2x$ which is also a function of $x$ as we should expect.³

Having found this, we can substitute $x = 3$ into our result to see that $f'(3) = 2(3) = 6$ as we found in Example 3.1.

¹ In 176 Further Calculus, you will do limits properly, but this simple explanation of what is going on should suffice for our purposes here. In particular, we briefly introduced the notation in Example 2.2 and, with the examples below, you should be able to see what is happening for now.

² In Section 3.3.4, we will see some examples where a limit fails to exist and we will explain what that means there.

Activity 3.2

Use the result in Example 3.2 to verify your answer to Activity 3.1.

We have seen that a function, $f(x)$, has a derivative, $f'(x)$, which is also a function of $x$. The process by which we go from a function to its derivative is called differentiation. That is, when we have a function, $f(x)$, we differentiate it with respect to $x$ and we sometimes denote this operation by

$$\frac{d}{dx} f(x) \quad\text{which is read as `differentiate } f(x) \text{ with respect to } x\text{',}$$

and the outcome of this operation will be the sought after derivative which we can write as

$$\frac{df}{dx} \quad\text{or}\quad f'(x).$$

If we then want to evaluate the derivative of $f$ at the point $a$, we write

$$\left.\frac{df}{dx}\right|_{x = a} \quad\text{or}\quad f'(a).$$

We will see what derivatives tell us about functions in Section 3.3 and, in particular, we will see that some functions do not have derivatives at certain points as the limit in the definition may not exist. But, before we do that, we turn our attention to how we can find the derivative of a function when we don't want to explicitly use the definition.

³ Indeed, as this limit exists for all $x \in \mathbb{R}$, we can say that the function $f(x) = x^2$ is differentiable for all $x \in \mathbb{R}$.

3.2

The previous section told us how to find derivatives from first principles, but now we want to explore a more convenient way of finding them. The key idea is that we introduce standard derivatives which tell us how to differentiate the basic functions that we saw in the previous chapter. Once we know how to differentiate these, the rules of differentiation will allow us to differentiate combinations of these functions.

3.2.1 Standard derivatives

In Example 3.2, we used the definition of the derivative to show that the function $f(x) = x^2$ has a derivative given by $f'(x) = 2x$. We now state some results that will allow us to differentiate other elementary functions.

Power and root functions

If $n \in \mathbb{Z}$, we can use the definition of the derivative to show that

$$f(x) = x^n \implies f'(x) = nx^{n-1}.$$

In particular:

If $n = 0$, we have $f(x) = x^0 = 1 \implies f'(x) = 0x^{-1} = 0$,

If $n = 1$, we have $f(x) = x^1 = x \implies f'(x) = 1x^0 = 1$,

If $n = 2$, we have $f(x) = x^2 \implies f'(x) = 2x^1 = 2x$.

Indeed, we will also use this rule when $n \in \mathbb{Q}$, and so we also have things like

$$f(x) = x^{1/2} = \sqrt{x} \implies f'(x) = \frac{1}{2}x^{-1/2} = \frac{1}{2\sqrt{x}}.$$

If we are using $e$ and $\ln$, the derivatives are very simple, i.e.

$$f(x) = e^x \implies f'(x) = e^x,$$

and

$$f(x) = \ln x \implies f'(x) = \frac{1}{x},$$

which, as we will see in Activity 3.12, follows from the fact that the function $\ln x$ is the inverse of $e^x$.

If we have another base, $a$, the derivatives are not so simple. We shall see in Activity 3.9 that

$$f(x) = a^x \implies f'(x) = a^x \ln a,$$

and, using the change of base formula for logarithms, we will see that

$$f(x) = \log_a x \implies f'(x) = \frac{1}{x \ln a},$$

in Section 3.2.2.

Sine and cosine functions

For the sine function we find that

$$f(x) = \sin x \implies f'(x) = \cos x,$$

and for the cosine function we have

$$f(x) = \cos x \implies f'(x) = -\sin x.$$

Alternatively, we could have used the fact that the sine and cosine functions are interdefinable, i.e.

$$\cos x = \sin\left(x + \frac{\pi}{2}\right) \quad\text{and}\quad \sin x = -\cos\left(x + \frac{\pi}{2}\right),$$

to derive the latter from the former once we have the chain rule (see Exercise 3.2). Indeed, using these standard derivatives, we can then derive the derivatives of the other trigonometric functions using their definitions in terms of sine and cosine together with the rules of differentiation in Section 3.2.2; see, for example, Activity 3.6(c).

3.2.2

In Section 2.1.2, we saw that there are several standard ways of making new functions

from old ones. Here, we see how we can use the standard derivatives, i.e. the derivatives

of our basic functions, and rules of differentiation to differentiate new functions that are

created from these basic ones in these standard ways. We start with the most

straightforward of these which allows us to differentiate linear combinations of functions.

The linear combination rule

If $k$ and $l$ are constants, this allows us to differentiate the linear combination, $kf(x) + lg(x)$, of two functions $f(x)$ and $g(x)$. It states that

$$\frac{d}{dx}\big[kf(x) + lg(x)\big] = k\frac{df}{dx} + l\frac{dg}{dx},$$

or, using our shorthand, $(kf + lg)'(x) = kf'(x) + lg'(x)$. Indeed, this gives us three more basic rules straightaway, i.e. the

constant multiple rule: If $f(x)$ is a function and $k$ is a constant, then

$$\frac{d}{dx}\big[kf(x)\big] = k\frac{df}{dx},$$

or, using our shorthand, $(kf)'(x) = kf'(x)$.

sum rule: If $f(x)$ and $g(x)$ are functions, then

$$\frac{d}{dx}\big[f(x) + g(x)\big] = \frac{df}{dx} + \frac{dg}{dx},$$

or, using our shorthand, $(f + g)'(x) = f'(x) + g'(x)$.

difference rule: If $f(x)$ and $g(x)$ are functions, then

$$\frac{d}{dx}\big[f(x) - g(x)\big] = \frac{df}{dx} - \frac{dg}{dx},$$

or, using our shorthand, $(f - g)'(x) = f'(x) - g'(x)$.

Activity 3.3 Derive the constant multiple, sum and difference rules from the linear

combination rule.

Example 3.3

3

rule;

1

if $f(x) = \cos x - \sin x$, then $f'(x) = -\sin x - \cos x$ by the difference rule;

if $f(x) = 3\ln x - 4e^x$, then $f'(x) = \frac{3}{x} - 4e^x$ by the linear combination rule.

So, in the case of simple combinations of functions such as these, we see that the

derivative of the linear combination is given by the linear combination of the derivatives.

Activity 3.4 Use the rules above to differentiate the following functions with

respect to x.

(a) $3\cos x$, (b) $e^x + \cos x$, (c) $3\sin x - 3\ln x$.

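Indeed, we can see that using the change of base formula for logarithms from Section 2.1.4, we have

$$\log_a x = \frac{\ln x}{\ln a},$$

and so, using the constant multiple rule,

$$\frac{d}{dx}\log_a x = \frac{d}{dx}\left(\frac{\ln x}{\ln a}\right) = \frac{1}{\ln a}\frac{d}{dx}\ln x = \frac{1}{\ln a}\cdot\frac{1}{x} = \frac{1}{x\ln a},$$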

as mentioned in Section 3.2.1. We now look at the other rules of differentiation, i.e. the

ones that will allow us to differentiate products, quotients and compositions of functions.

The product rule

This allows us to differentiate the product of two functions $f(x)$ and $g(x)$. It states that

$$\frac{d}{dx}\big[f(x)g(x)\big] = \frac{df}{dx}g(x) + f(x)\frac{dg}{dx},$$

or, using our shorthand, $[f(x)g(x)]' = f'(x)g(x) + f(x)g'(x)$. Let's have a look at some examples of how it works.

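Example 3.4 Consider the product $h(x) = xe^x$ of the functions

$$f(x) = x \quad\text{and}\quad g(x) = e^x,$$

for which

$$f'(x) = 1 \quad\text{and}\quad g'(x) = e^x.$$

As such, the product rule tells us that

$$h'(x) = (1)(e^x) + (x)(e^x) = (1 + x)e^x,$$

is the derivative of the function $h(x) = xe^x$ with respect to $x$.

Example 3.5 Consider the product $h(x) = x\ln x$ of the functions

$$f(x) = x \quad\text{and}\quad g(x) = \ln x,$$

for which

$$f'(x) = 1 \quad\text{and}\quad g'(x) = \frac{1}{x}.$$

As such, the product rule tells us that

$$h'(x) = (1)(\ln x) + (x)\left(\frac{1}{x}\right) = \ln x + 1.$$

Example 3.6 Consider the product $h(x) = e^x\ln x$ of the functions

$$f(x) = e^x \quad\text{and}\quad g(x) = \ln x,$$

for which

$$f'(x) = e^x \quad\text{and}\quad g'(x) = \frac{1}{x}.$$

As such, the product rule tells us that

$$h'(x) = (e^x)(\ln x) + (e^x)\left(\frac{1}{x}\right) = e^x\left(\ln x + \frac{1}{x}\right).$$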
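The three product-rule results above can be checked numerically. The Python sketch below evaluates $f'(x)g(x) + f(x)g'(x)$ for each example and compares it with a difference quotient of the product; the names `product_rule`, `central_difference` and `EXAMPLES`, and the sample point $x = 1.5$, are assumptions made for illustration only.

```python
import math


def product_rule(f, df, g, dg, x):
    """Value of (fg)'(x) = f'(x) g(x) + f(x) g'(x)."""
    return df(x) * g(x) + f(x) * dg(x)


def central_difference(f, x, h=1e-6):
    """Estimate f'(x) with the symmetric quotient (f(x+h) - f(x-h)) / (2h)."""
    return (f(x + h) - f(x - h)) / (2 * h)


# (f, f', g, g') for Examples 3.4, 3.5 and 3.6
EXAMPLES = {
    "x e^x": (lambda x: x, lambda x: 1.0, math.exp, math.exp),
    "x ln x": (lambda x: x, lambda x: 1.0, math.log, lambda x: 1 / x),
    "e^x ln x": (math.exp, math.exp, math.log, lambda x: 1 / x),
}

if __name__ == "__main__":
    x0 = 1.5  # assumed sample point (must be positive because of ln x)
    for name, (f, df, g, dg) in EXAMPLES.items():
        exact = product_rule(f, df, g, dg, x0)
        estimate = central_difference(lambda x: f(x) * g(x), x0)
        print(f"{name}: product rule {exact:.6f}, numerical {estimate:.6f}")
```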

Activity 3.5 Use the product rule to differentiate the following functions with

respect to x.

(a) $x\sin x$, (b) $e^x\cos x$, (c) $\sin x\cos x$.

What can you deduce about the derivative of sin(2x) from your answer to (c)?

The quotient rule

This allows us to differentiate the quotient of two functions $f(x)$ and $g(x)$. It states that

$$\frac{d}{dx}\left[\frac{f(x)}{g(x)}\right] = \frac{\dfrac{df}{dx}g(x) - f(x)\dfrac{dg}{dx}}{[g(x)]^2},$$

or, using our shorthand,

$$\left[\frac{f(x)}{g(x)}\right]' = \frac{f'(x)g(x) - f(x)g'(x)}{[g(x)]^2}.$$

Of course, as we saw in Section 2.1.2, this all assumes that the quotient of the two functions is defined for the values of $x$ that we are working with, i.e. it only works for values of $x$ in the domain where $g(x) \neq 0$. Let's have a look at some examples of how it works.

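Example 3.7 Differentiate $h(x) = \dfrac{e^x}{x}$ with respect to $x$.

Here we have

$$f(x) = e^x \quad\text{and}\quad g(x) = x,$$

so that

$$f'(x) = e^x \quad\text{and}\quad g'(x) = 1.$$

As such, the quotient rule tells us that

$$h'(x) = \frac{(e^x)(x) - (e^x)(1)}{x^2} = \frac{x - 1}{x^2}e^x.$$

Example 3.8 Differentiate $h(x) = \dfrac{x^3}{\ln x}$ with respect to $x$.$^{4}$

Here we have

$$f(x) = x^3 \quad\text{and}\quad g(x) = \ln x,$$

so that

$$f'(x) = 3x^2 \quad\text{and}\quad g'(x) = \frac{1}{x}.$$

As such, the quotient rule tells us that

$$h'(x) = \frac{(3x^2)(\ln x) - (x^3)\left(\dfrac{1}{x}\right)}{[\ln x]^2} = \frac{x^2(3\ln x - 1)}{[\ln x]^2}.$$

Example 3.9 Differentiate $h(x) = \dfrac{\ln x}{e^x}$ with respect to $x$.$^{5}$

Here we have

$$f(x) = \ln x \quad\text{and}\quad g(x) = e^x,$$

so that

$$f'(x) = \frac{1}{x} \quad\text{and}\quad g'(x) = e^x.$$

As such, the quotient rule tells us that

$$h'(x) = \frac{\left(\dfrac{1}{x}\right)(e^x) - (\ln x)(e^x)}{[e^x]^2} = \frac{(1 - x\ln x)e^x}{xe^{2x}} = \frac{1 - x\ln x}{xe^x}.$$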

Activity 3.6 Use the quotient rule to differentiate the following functions with

respect to x and find the values of x for which the derivatives exist.

(a) $\dfrac{\sin x}{x}$, (b) $\dfrac{e^x}{\cos x}$, (c) $\dfrac{\sin x}{\cos x}$.

What can you deduce about the derivative of tan x from your answer to (c)?

4

5 Observe that as $e^x > 0$ for all $x \in \mathbb{R}$, we don't have to worry about dividing by zero here.

The chain rule

This allows us to differentiate the composition of two functions $f(x)$ and $g(x)$. It states that

$$\frac{d}{dx}\big[f(g(x))\big] = \frac{df}{dg}\frac{dg}{dx},$$

or, using our shorthand, $[f(g(x))]' = f'(g)g'(x)$. Let's have a look at some examples of how it works.

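Example 3.10 Differentiate $h(x) = (2x + 1)^3$ with respect to $x$.

This is the composition $h(x) = f(g(x))$ with

$$f(g) = g^3 \quad\text{and}\quad g(x) = 2x + 1.$$

As such we have

$$f'(g) = 3g^2 \quad\text{and}\quad g'(x) = 2,$$

and so the chain rule tells us that

$$h'(x) = (3g^2)(2) = 6g^2 = 6(2x + 1)^2,$$

is the derivative of $h(x)$ with respect to $x$.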

Activity 3.7 Verify that this is correct by multiplying out the brackets and

differentiating your new expression for h(x) with respect to x.

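Example 3.11 Differentiate $h(x) = \sqrt{2x + 1}$ with respect to $x$.

This is the composition $h(x) = f(g(x))$ with

$$f(g) = \sqrt{g} = g^{1/2} \quad\text{and}\quad g(x) = 2x + 1.$$

As such we have

$$f'(g) = \frac{1}{2}g^{-1/2} \quad\text{and}\quad g'(x) = 2,$$

and so the chain rule tells us that

$$h'(x) = \left(\frac{1}{2}g^{-1/2}\right)(2) = g^{-1/2} = \frac{1}{\sqrt{2x + 1}}.^{6}$$

6 In particular, observe that here the original function is only defined if $x \geq -1/2$ whereas the derivative is only defined if $x > -1/2$ (as, in the derivative, $x = -1/2$ would entail division by zero).

Example 3.12 Differentiate $h(x) = e^{x^3 + 2}$ with respect to $x$.

This is the composition $h(x) = f(g(x))$ with

$$f(g) = e^g \quad\text{and}\quad g(x) = x^3 + 2.$$

As such we have

$$f'(g) = e^g \quad\text{and}\quad g'(x) = 3x^2,$$

and so the chain rule tells us that

$$h'(x) = (e^g)(3x^2) = 3x^2e^{x^3 + 2}.$$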
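The decomposition used in Examples 3.10 to 3.12 can be mirrored in code: evaluate $f'$ at $g(x)$ and multiply by $g'(x)$. The following Python sketch does this and compares the result with a difference quotient of the composition. The function names, the `EXAMPLES` table and the sample point are illustrative assumptions rather than anything defined in this guide.

```python
import math


def chain_rule(df_dg, g, dg_dx, x):
    """Value of [f(g(x))]' = f'(g) g'(x), with f' evaluated at g = g(x)."""
    return df_dg(g(x)) * dg_dx(x)


def central_difference(f, x, h=1e-6):
    """Estimate f'(x) with the symmetric quotient (f(x+h) - f(x-h)) / (2h)."""
    return (f(x + h) - f(x - h)) / (2 * h)


# (f, f', g, g') for Examples 3.10, 3.11 and 3.12
EXAMPLES = {
    "(2x+1)^3": (lambda g: g ** 3, lambda g: 3 * g ** 2,
                 lambda x: 2 * x + 1, lambda x: 2.0),
    "sqrt(2x+1)": (math.sqrt, lambda g: 0.5 / math.sqrt(g),
                   lambda x: 2 * x + 1, lambda x: 2.0),
    "exp(x^3+2)": (math.exp, math.exp,
                   lambda x: x ** 3 + 2, lambda x: 3 * x ** 2),
}

if __name__ == "__main__":
    x0 = 0.5  # assumed sample point, inside the domain of every example
    for name, (f, df, g, dg) in EXAMPLES.items():
        exact = chain_rule(df, g, dg, x0)
        estimate = central_difference(lambda x: f(g(x)), x0)
        print(f"{name}: chain rule {exact:.6f}, numerical {estimate:.6f}")
```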

Activity 3.8 Use the chain rule to differentiate the following functions with respect to $x$.

(a) $\sin(2x)$,

(c) $\ln(e^x)$.

The chain rule can also be used to derive some useful results.

Activity 3.9 (A useful result)

Using the fact that

$$a^x = e^{x\ln a},$$

which we saw in Section 2.1.3, show that

$$\frac{da^x}{dx} = a^x\ln a.$$

This was mentioned in Section 3.2.1, but there is no need to remember it as you

should be able to derive this result if it is needed.

Activity 3.10 (Deriving the quotient rule)

Derive the quotient rule by writing the quotient

$$\frac{f(x)}{g(x)}$$

as the product $f(x)[g(x)]^{-1}$ and using the product and chain rules to differentiate it with respect to $x$.

Activity 3.11 (Derivatives of inverse functions)

If the function, $f$, has an inverse, $f^{-1}$, then we can let $y = f(x)$ so that $x = f^{-1}(y)$. Use the chain rule to show that

$$\frac{d}{dy}f^{-1}(y) = 1 \bigg/ \frac{d}{dx}f(x).$$

Activity 3.12 Use Activity 3.11 and the fact that $(e^x)' = e^x$ to show that the derivative of $\ln y$ with respect to $y$ is $1/y$.

Using the rules together

Often, we will need to use more than one of the rules of differentiation in order to find a derivative. This is easily done as long as care is taken to recognise what you are differentiating at each step. Here are two examples that should make the procedure clear.

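Example 3.13 Differentiate $l(x) = (x^3 + 1)\ln(x^2 + 4)$ with respect to $x$.

This is a product of the functions

$$f(x) = x^3 + 1 \quad\text{and}\quad g(x) = \ln(x^2 + 4),$$

and clearly, $f'(x) = 3x^2$. But to differentiate $g(x)$ we need to use the chain rule because it is a composition. In this case, we have

$$g(h) = \ln h \quad\text{and}\quad h(x) = x^2 + 4,$$

which gives us

$$g'(h) = \frac{1}{h} \quad\text{and}\quad h'(x) = 2x,$$

so that

$$g'(x) = \left(\frac{1}{h}\right)(2x) = \frac{2x}{h} = \frac{2x}{x^2 + 4},$$

by the chain rule. Now, putting all of this into the product rule gives us

$$l'(x) = (3x^2)\ln(x^2 + 4) + (x^3 + 1)\left(\frac{2x}{x^2 + 4}\right) = 3x^2\ln(x^2 + 4) + \frac{2x(x^3 + 1)}{x^2 + 4}.$$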

Example 3.14 Differentiate $l(x) = e^{x^2 + x}\ln(x^3 + 1)$ with respect to $x$.

This is a product of the functions

$$f(x) = e^{x^2 + x} \quad\text{and}\quad g(x) = \ln(x^3 + 1),$$

and to differentiate $f(x)$ we need to use the chain rule because it is a composition. In this case, we have

$$f(h) = e^h \quad\text{and}\quad h(x) = x^2 + x,$$

which gives us

$$f'(h) = e^h \quad\text{and}\quad h'(x) = 2x + 1,$$

so that

$$f'(x) = (e^h)(2x + 1) = (2x + 1)e^h = (2x + 1)e^{x^2 + x},$$

by the chain rule. Then, to differentiate $g(x)$, we need to use the chain rule again because it is also a composition. In this case, we have

$$g(h) = \ln h \quad\text{and}\quad h(x) = x^3 + 1,$$

which gives us

$$g'(h) = \frac{1}{h} \quad\text{and}\quad h'(x) = 3x^2,$$

so that

$$g'(x) = \left(\frac{1}{h}\right)(3x^2) = \frac{3x^2}{h} = \frac{3x^2}{x^3 + 1},$$

by the chain rule. Now, putting all of this into the product rule gives us

$$l'(x) = (2x + 1)e^{x^2 + x}\ln(x^3 + 1) + e^{x^2 + x}\left(\frac{3x^2}{x^3 + 1}\right) = \left[(2x + 1)\ln(x^3 + 1) + \frac{3x^2}{x^3 + 1}\right]e^{x^2 + x}.$$

Of course, once you can reliably apply the rules, there is no need to show all of the

intermediate working.

Activity 3.13 Use the rules of differentiation to differentiate the following

functions with respect to x.

2

(b) $\dfrac{\sin(\cos x)}{e^{\sin x}}$,

3.2.3 Higher-order derivatives

As we have seen above, when we differentiate a function, $f(x)$, we find that its derivative, $f'(x)$, is also a function of $x$. In this context, we call $f'(x)$ the first-order derivative of $f(x)$ and we can differentiate it again to get the second-order derivative, i.e. we find

$$\frac{d}{dx}\left(\frac{df}{dx}\right) \quad\text{and we denote this by}\quad \frac{d^2f}{dx^2} \text{ or } f''(x).$$

Of course, the second-order derivative will also be a function of $x$ and so we can differentiate it again to get the third-order derivative, i.e. we find

$$\frac{d}{dx}\left(\frac{d^2f}{dx^2}\right) \quad\text{and we denote this by}\quad \frac{d^3f}{dx^3} \text{ or } f'''(x).$$

We can, of course, do this again and again but the shorthand notation we use can become a bit unwieldy once we pass the third-order derivative. As such, for $n \geq 4$, we write

$$\frac{d^nf}{dx^n} \quad\text{as}\quad f^{(n)}(x).$$

Example 3.15 Find the first four derivatives of $f(x) = \sin x$. What is the relationship between these derivatives of $\sin x$?

We have $f(x) = \sin x$, and so the first-order derivative of $f$ is given by

$$f'(x) = \frac{d}{dx}\sin x = \cos x.$$

The second-order derivative is then

$$f''(x) = \frac{d}{dx}\left(\frac{df}{dx}\right) = \frac{d}{dx}\cos x = -\sin x,$$

the third-order derivative is

$$f'''(x) = \frac{d}{dx}\left(\frac{d^2f}{dx^2}\right) = \frac{d}{dx}(-\sin x) = -\cos x,$$

and the fourth-order derivative is

$$f^{(4)}(x) = \frac{d}{dx}\left(\frac{d^3f}{dx^3}\right) = \frac{d}{dx}(-\cos x) = \sin x.$$

So, in particular, we see that $f''(x) = -f(x)$, $f'''(x) = -f'(x)$ and $f^{(4)}(x) = f(x)$.

Activity 3.14 Using the pattern inherent in Example 3.15, what is $f^{(n)}(x)$ for $n \geq 1$?

Activity 3.15 Find the first four derivatives of $f(x) = xe^x$. Hence deduce an expression for $f^{(n)}(x)$ for $n \geq 1$.

3.3 Using derivatives

Derivatives can be very useful in mathematics and economics, but before we see how,

we need to understand what derivatives represent.

3.3.1

If we draw the graph of a function, f , we get the curve with equation y = f (x). At any

point on this curve, say the point (a, f (a)), we can draw a chord (or secant line) that

connects the given point to another point on the curve. For instance, in Figure 3.1, the line segment $C$ is the chord joining the points $(a, f(a))$ and $(b, f(b))$ on the curve $y = f(x)$. In particular, we see that the gradient of this chord, let's call it $m_C$, can be found using the formula

$$m_C = \frac{f(b) - f(a)}{b - a},$$

which you should know.

Figure 3.1: The line segment $C$ is the chord joining the points $(a, f(a))$ and $(b, f(b))$ on the curve $y = f(x)$. This is extended using the dotted lines at both ends so that we can see what line the chord is a line segment of.

To relate this to the derivative, we take some number, $h \neq 0$, and let $b = a + h$ so that we now have a chord, $C$, which is joining the points $(a, f(a))$ and $(a + h, f(a + h))$. The gradient of this chord is then given by

$$m_C(h) = \frac{f(a + h) - f(a)}{(a + h) - a} = \frac{f(a + h) - f(a)}{h},$$

and, for $h \neq 0$, this is a function of $h$ since the value of $m_C$ will depend on the value of $h$ that we choose. In particular, recalling what we saw in Section 3.1, we can see that

$$f'(a) = \lim_{h \to 0} m_C(h).$$

We now consider the chords that join the point $(a, f(a))$ to the points $(a + h_1, f(a + h_1))$, $(a + h_2, f(a + h_2))$ and $(a + h_3, f(a + h_3))$ where $h_3 > h_2 > h_1 > 0$. These are the line segments on the lines $C_1$, $C_2$ and $C_3$ which can be seen in Figure 3.2. That is, we have three points that are getting successively closer to the point $(a, f(a))$ so that we can see the effect of letting $h \to 0$. In particular, as we let $h$ get smaller, we see that the gradients of these particular chords are decreasing. The question is, do the gradients of these chords tend to some finite limit as $h \to 0$? That is, does the limit in our expression for $f'(a)$ above exist?

Figure 3.2: $C_1$, $C_2$ and $C_3$ are three chords of the curve $y = f(x)$ originating from the point $(a, f(a))$. Observe that as the other end of a chord approaches this point, the chords pivot about it and their gradients get closer to the gradient of the line, $T$.

Hopefully, in Figure 3.2, you can see that as $h$ gets smaller (i.e. as we consider $C_3$, then $C_2$ and then $C_1$), the lines are pivoting through the point $(a, f(a))$ and their gradients are getting closer to the gradient of the line $T$. Indeed, in the limit as $h \to 0$, the lines we get from extending an arbitrary chord joining the points $(a, f(a))$ and $(a + h, f(a + h))$ should become the line $T$. In particular, this means that the limit of $m_C(h)$ as $h \to 0$ exists because it should be equal to the gradient of $T$. This means that the line $T$, called the tangent to $f$ at the point $(a, f(a))$,

goes through the point $(a, f(a))$, and

its gradient is the limit, as $h \to 0$, of $m_C(h)$, i.e. $f'(a)$.

For this reason, we define the gradient of a curve $y = f(x)$ at the point $(a, f(a))$ to be the gradient of its tangent line at that point and this, as we have seen, is simply the value of $f'(a)$.
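The way chord gradients settle down to the gradient of the tangent can be seen numerically. The Python sketch below computes $m_C(h)$ for a shrinking sequence of $h$ values; the choice of $f(x) = x^2$ at $a = 3$ and the name `chord_gradient` are assumptions made purely for illustration.

```python
def chord_gradient(f, a, h):
    """Gradient m_C(h) of the chord joining (a, f(a)) and (a + h, f(a + h))."""
    if h == 0:
        raise ValueError("h must be non-zero, otherwise the chord is undefined")
    return (f(a + h) - f(a)) / h


if __name__ == "__main__":
    f = lambda x: x ** 2  # assumed example function, for which f'(3) = 6
    for h in (1.0, 0.1, 0.01, 0.001):
        print(f"h = {h}: m_C(h) = {chord_gradient(f, 3.0, h):.6f}")
```

For this particular function $m_C(h) = 6 + h$, so the printed gradients decrease towards 6 as $h$ shrinks, which is the behaviour described for $C_3$, $C_2$ and $C_1$.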

3.3.2

Now that we know how the tangent lines to a curve are related to derivatives, we can

use derivatives to find the equation of the tangent line to a curve at a given point. This,

in turn, will introduce us to a useful way of performing approximations.

Given that $f'(a)$ is the gradient of the curve $y = f(x)$ at the point $x = a$, we can use this to find the equation of the tangent line at this point. In particular, the formula for the gradient of a straight line, i.e.

$$f'(a) = \frac{y - f(a)}{x - a}, \tag{3.1}$$

gives us the equation of the tangent line as it goes through the point $(a, f(a))$ and its gradient is given by $f'(a)$. Let's look at a quick example.

Example 3.16 Find the equation of the tangent line to the curve $y = x^2$ when $x = 3$.

When $x = 3$, the point on the curve $y = x^2$ is $(3, 9)$ and we know that $f'(3) = 6$ as $f'(x) = 2x$. Consequently, using (3.1), the equation of the tangent line is given by

$$6 = \frac{y - 9}{x - 3} \implies y - 9 = 6x - 18 \implies y = 6x - 9.$$

In particular, when written in this form, we see that the gradient of the line is indeed 6 and the point $(3, 9)$ does indeed lie on it as $6(3) - 9 = 9$.
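Rearranging (3.1) into the form $y = mx + c$ is mechanical, so it can be sketched in a few lines of Python. The helper name `tangent_line` is hypothetical; it simply returns the gradient $f'(a)$ and the intercept $f(a) - af'(a)$.

```python
def tangent_line(f, df, a):
    """Return (gradient, intercept) of the tangent y = f(a) + (x - a) f'(a)."""
    gradient = df(a)
    intercept = f(a) - a * gradient
    return gradient, intercept


if __name__ == "__main__":
    # The curve y = x^2 at x = 3, as in Example 3.16
    m, c = tangent_line(lambda x: x ** 2, lambda x: 2 * x, 3)
    print(f"y = {m}x + ({c})")
```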

Activity 3.16 Find the equation of the tangent line to the function $f(x) = e^x$ when $x = 1$.

Linear approximations

One use of tangent lines is that they provide us with a simple way of approximating the value of a function. For instance, if we have the tangent line to the function $f(x)$ at the point $x = a$, the equation of its tangent line, i.e.

$$f'(a) = \frac{y - f(a)}{x - a},$$

can be rearranged to give

$$y = f(a) + (x - a)f'(a).$$

Now, if we pick a value of $x$ that is close to $a$, say $x^*$, the value of $y$ when $x = x^*$ will be

$$y^* = f(a) + (x^* - a)f'(a),$$

and this will be close to the value of $f(x^*)$ as illustrated in Figure 3.3. Of course, if we pick a value of $x^*$ which is closer to $a$, the value of $y^*$ will be closer to the value of $f(x^*)$ and we will have a better approximation to the value of $f(x)$ at this point.

Figure 3.3: When $x^*$ is close to $a$ we can use the tangent line at $a$ to find $y^*$ which gives an approximation to $f(x^*)$. There is an error in this approximation but this can be made smaller if we use values of $x^*$ which are closer to $a$.

As we are approximating the function $f$ by a straight line, we call this a linear approximation to $f$ around $a$. In particular, we have

$$f(x) \approx f(a) + (x - a)f'(a),$$

if $x$ is close to $a$. In Section 3.4, we will discuss Taylor series and these will allow us to find better approximations to $f$ around $a$, but before we do that, let's consider a useful application of linear approximations.

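Example 3.17 Use a linear approximation to $f(x) = 3e^{-x}$ around $x = 0$ to approximate $3e^{-0.1}$.

We have

$$f(x) = 3e^{-x} \implies f(0) = 3e^0 = 3,$$

and

$$f'(x) = -3e^{-x} \implies f'(0) = -3e^0 = -3,$$

so that the linear approximation gives

$$f(0.1) \approx f(0) + (0.1)f'(0) = 3 - 0.3 = 2.7.$$

Using a calculator, we can see that the exact value of $3e^{-0.1}$ is 2.71 to 2dp and so this is a pretty good approximation.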

Using linear approximations to find changes

As well as allowing us to find approximations to $f$ around $a$, linear approximations can give us useful information about how the value of $f$ is changing as we move from $a$ to, say, $a + h$. Using our linear approximation, we see that

$$f(a + h) \approx f(a) + hf'(a) \implies f'(a) \approx \frac{f(a + h) - f(a)}{h},$$

and so, if we denote the change in $f$ by $\Delta f$ and the change in $x$ by $\Delta x = h$, we see that

$$\frac{\Delta f}{\Delta x} \approx f'(a) \quad\text{or}\quad \Delta f \approx f'(a)\Delta x.$$

That is, we can find the approximate value of the change in $f$ if we change $x$ from $a$ to $a + h$. Of course, the smaller $\Delta x = h$ is, the better our approximation. This is illustrated in Figure 3.4.

Figure 3.4: The approximate $\Delta f$, the exact $\Delta f$ and the error between them when $x$ changes from $a$ to $a + h$ on the curve $y = f(x)$. Obviously, the smaller the value of the change $\Delta x = h$, the better the approximation for $\Delta f$ will be.

Example 3.18 Without using a calculator, find the approximate change in $3e^{-x}$ if $x$ is increased from zero to 0.1. Hence deduce the approximate value of $3e^{-0.1}$.

Given that $f(x) = 3e^{-x}$, we have

$$f'(x) = -3e^{-x} \implies f'(0) = -3e^0 = -3,$$

and so, with $\Delta x = 0.1$, we get

$$\Delta f \approx f'(0)\Delta x = (-3)(0.1) = -0.3,$$

i.e. the change in $f$ is approximately $-0.3$. Observe that the minus sign is telling us that when $x$ increases from 0 to 0.1, $f(x)$ is decreasing by approximately 0.3.

This means that using

$$f(0.1) \approx f(0) + \Delta f = 3 - 0.3 = 2.7,$$

we see that the approximate value of $3e^{-0.1}$ is 2.7 as we would expect from the linear approximation in Example 3.17.
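The arithmetic in Examples 3.17 and 3.18 can be reproduced with a short Python sketch. The helper name `linear_approximation` and the variable names are assumptions for illustration; the function and the point of expansion are those of the examples.

```python
import math


def linear_approximation(f, df, a, x):
    """Tangent-line estimate f(a) + (x - a) f'(a) of f(x) for x near a."""
    return f(a) + (x - a) * df(a)


f = lambda x: 3 * math.exp(-x)    # f(x) = 3 e^(-x)
df = lambda x: -3 * math.exp(-x)  # f'(x) = -3 e^(-x)

approx = linear_approximation(f, df, 0.0, 0.1)  # estimate of 3 e^(-0.1)
change = df(0.0) * 0.1                          # approximate change, f'(0) * delta x
exact = f(0.1)                                  # value a calculator would give

if __name__ == "__main__":
    print(f"approximate change: {change:.4f}")
    print(f"linear approximation: {approx:.4f}, exact value: {exact:.4f}")
```

Trying a smaller step, such as 0.01 in place of 0.1, should shrink the gap between the two printed values, in line with the remark that a smaller $\Delta x$ gives a better approximation.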


Further, as the derivative of a function gives us information about how f (x) is changing

due to changes in x, we often refer to f (a) as the rate of change of f (x) with respect to

x when x = a.

3.3.3 Applications of derivatives

Derivatives are useful in economics and we now introduce two ways in which they can

arise in that subject. The first is their use when discussing marginal functions and the

second is when they are used in the context of elasticities. At this point, we will just

introduce these ideas and see how they might be useful, but they will also be used when

we consider some applications of the material contained in other chapters of this subject

guide.

Marginal functions

In economics, the term "marginal" denotes the rate of change of a quantity with respect to a variable on which it depends. For instance, if a firm has a cost function, $C(q)$, this tells us the cost of producing $q$ units of their product. The marginal cost of the firm, which we denote by $MC(q)$, would then be given by

$$MC(q) = \frac{dC}{dq}.$$

This is useful since, using what we saw above, we can see that the marginal cost is telling us (approximately) about how changes in the level of production, $q$, will incur changes in the costs, $C$. That is, if the level of production is increased by $\Delta q$, i.e. our production increases from $q$ to $q + \Delta q$, we find that

$$MC(q) = \frac{dC}{dq} \implies MC(q) \approx \frac{\Delta C}{\Delta q} \implies \Delta C \approx MC(q)\Delta q.$$

In particular, if $q$ is so large that a change in production of one unit (i.e. $\Delta q = 1$) is small compared to $q$, we see that

$$\Delta C = C(q + 1) - C(q) \approx MC(q).$$

That is, in these circumstances, the marginal cost tells us (approximately) the extra cost incurred if the firm wishes to produce one more unit of their good given that they are already producing $q$ units.

Example 3.19 Suppose that a firm has a cost function given by

$$C(q) = 1000 + 5q + q^2,$$

in dollars. Find the marginal cost function for this firm and use it to determine the approximate cost of producing one more unit if the original level of production is 100 units.

The marginal cost function, $MC(q)$, is given by

$$MC(q) = C'(q) = 5 + 2q,$$

and so using the fact that the change in cost, $\Delta C$, is related to the change in production, $\Delta q$, by

$$\Delta C \approx C'(q)\Delta q,$$

we see that an increase in production of one unit, i.e. $\Delta q = 1$, gives rise to an increase in costs given by

$$\Delta C \approx C'(100)(1) = 5 + 2(100) = 205.$$

That is, if the firm is producing 100 units and they increase their production by one unit, they will incur additional costs of approximately 205 dollars.
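To see why the marginal cost is only a good estimate of the cost of one extra unit when $q$ is large, the Python sketch below compares $MC(q)$ with $C(q + 1) - C(q)$ at two production levels. The cost function is the one from Example 3.19; the function names and the chosen levels (10 and 1000 units) are illustrative assumptions.

```python
def cost(q):
    """Cost function C(q) = 1000 + 5q + q^2 from Example 3.19, in dollars."""
    return 1000 + 5 * q + q ** 2


def marginal_cost(q):
    """Marginal cost MC(q) = C'(q) = 5 + 2q."""
    return 5 + 2 * q


def extra_cost_of_one_unit(q):
    """Exact cost of producing one more unit, C(q + 1) - C(q)."""
    return cost(q + 1) - cost(q)


if __name__ == "__main__":
    for q in (10, 1000):  # assumed production levels
        approx, exact = marginal_cost(q), extra_cost_of_one_unit(q)
        print(f"q = {q}: MC = {approx}, exact = {exact}, "
              f"relative error = {(exact - approx) / exact:.4%}")
```

For this quadratic cost function the absolute gap between the two quantities is the same at every $q$, so the relative error falls as production grows.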

Activity 3.17 By using $C(q + 1) - C(q)$ directly when $q = 100$, determine how good the approximation found in Example 3.19 is.

Generally then, if $f$ is some economically meaningful function, its derivative is referred to as the marginal of $f$ and we denote this by $Mf$. For instance, if $R(q)$ is the revenue function for a firm, the marginal revenue, $MR(q)$, is just $R'(q)$.

Elasticities

Suppose that, as in Section 2.1.5, we have a market where consumers purchase a good according to the demand function, $q^D(p)$. If the price of this good was to increase from $p$ to $p + \Delta p$, then there will be a change in the quantity demanded by the consumers from $q$ to $q + \Delta q$. Indeed, since a rise in price will usually lead to a fall in demand, we would expect $\Delta q$ to be negative here. In these circumstances, we can see how these changes are related by noting that

$$\Delta q = q^D(p + \Delta p) - q^D(p) \approx q'(p)\Delta p \implies \frac{\Delta q}{\Delta p} \approx q'(p),$$

where we have used $q$ to denote the quantity demanded, i.e. $q(p) = q^D(p)$.

Now, suppose that we are interested in the relative change in quantity, $\Delta q/q$, and the relative change in price, $\Delta p/p$. We can see that the ratio of these two terms is then given by

$$\frac{\Delta q/q}{\Delta p/p} = \frac{p}{q}\frac{\Delta q}{\Delta p} \approx \frac{p}{q}q'(p).$$

Indeed, as $\Delta q$ is usually negative (whereas the other terms on the left-hand side, i.e. $\Delta p$, $q$ and $p$, are all positive) we would usually expect the right-hand side to be negative as well. With this in mind, we define the [price] elasticity of demand, $\varepsilon(p)$, to be

$$\varepsilon(p) = -\frac{p}{q}q'(p),$$

where $q = q^D(p)$ and the minus sign is introduced so that, in the usual case where $\Delta q$ is negative, we can be sure that $\varepsilon(p)$ itself will be positive.$^{7}$ Then, using

$$\frac{\Delta q}{q} \approx -\varepsilon(p)\frac{\Delta p}{p},$$

we can see how the relative change in quantity is simply related to the relative change in price via the elasticity of demand.

7 Some books omit the minus sign in their definition of the elasticity of demand, but it will be useful for us to include it as it is easier to deal with positive quantities.

Example 3.20 Suppose that the demand function for some good is given by $q^D(p) = 10p^{-r}$ where $r$ is a constant. Find the elasticity of demand. What does this tell us about the effect of relative changes in price on relative changes in quantity?

We have

$$q'(p) = -10rp^{-r-1},$$

which means that the elasticity of demand is given by

$$\varepsilon(p) = -\frac{p}{q}q'(p) = -\frac{p}{10p^{-r}}\left(-10rp^{-r-1}\right) = r,$$

and so, using

$$\frac{\Delta q}{q} \approx -\varepsilon(p)\frac{\Delta p}{p},$$

we see that a relative increase in price of, say, $x\%$ will lead to a relative decrease in quantity purchased of (approximately) $rx\%$.
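The constant elasticity found in Example 3.20 can be checked numerically. In the Python sketch below, the value $r = 1.5$, the price $p = 2$ and the helper names are all assumptions chosen for illustration; $q'(p)$ is estimated with a symmetric difference quotient rather than computed by hand.

```python
def demand(p, r):
    """Demand function q^D(p) = 10 p^(-r) from Example 3.20."""
    return 10 * p ** (-r)


def elasticity(q, p, h=1e-6):
    """Estimate -(p / q(p)) q'(p), using a symmetric difference quotient for q'(p)."""
    dq_dp = (q(p + h) - q(p - h)) / (2 * h)
    return -(p / q(p)) * dq_dp


def relative_change(q, p, relative_price_change):
    """Exact relative change in quantity when the price rises by the given fraction."""
    new_p = p * (1 + relative_price_change)
    return (q(new_p) - q(p)) / q(p)


if __name__ == "__main__":
    r = 1.5  # assumed value of the constant r
    q = lambda p: demand(p, r)
    print(f"elasticity at p = 2: {elasticity(q, 2.0):.6f}")
    print(f"relative change in q for a 1% price rise: {relative_change(q, 2.0, 0.01):.4%}")
```

The estimated elasticity should be close to $r$ at any price, and a 1% price rise should reduce the quantity by roughly $r\%$.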

Indeed, we will see, in Section 4.2.3, that elasticities can also give us useful information

about how the revenue, R = pq, generated from selling a quantity, q, at a price of p per

unit will be affected by increases in the price.

3.3.4 Existence of derivatives

Although we will usually be dealing with situations where a function has a derivative at

every point where it is defined, we will occasionally encounter situations where there is

at least one point at which the derivative of a function does not exist. Just so that we

are aware of what this means and the kinds of situation in which it can arise, we

consider some of the most common ways in which a derivative can fail to exist at a

certain point.8

Discontinuous functions

If a function is discontinuous at a point, i.e. there is a point at which the function is not

continuous, then the derivative will not exist at that point as the next example

illustrates.

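Example 3.21 Show that the derivative of the function

$$f(x) = \begin{cases} 1 & x \geq 0 \\ -1 & x < 0 \end{cases}$$

does not exist when $x = 0$.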

8 See, for example, Section 2.8 of Binmore and Davies (2002) for a discussion of some similar cases.

This function is illustrated in Figure 3.5(a) and, clearly, as the function is a continuous horizontal line when $x \neq 0$, its derivative is defined and equal to zero as long as $x \neq 0$. However, when $x = 0$, the function is discontinuous and its derivative, if it exists, would be given by

$$f'(0) = \lim_{h \to 0}\frac{f(h) - f(0)}{h}.$$

However, here we cannot just find

$$\frac{f(h) - f(0)}{h},$$

and let $h \to 0$ as we did in Section 3.1 since the value of $f(h)$ is different depending on whether $h$ is positive or negative. In such cases, we say that the limit we seek, i.e.

$$\lim_{h \to 0}\frac{f(h) - f(0)}{h},$$

exists if, firstly, both of the limits

$$\lim_{h \to 0^-}\frac{f(h) - f(0)}{h} \quad\text{and}\quad \lim_{h \to 0^+}\frac{f(h) - f(0)}{h},$$

exist$^{9}$ and, secondly, if they exist, they must be equal. But, using the given function, we see that

$$\lim_{h \to 0^-}\frac{f(h) - f(0)}{h} = \lim_{h \to 0^-}\frac{(-1) - 1}{h} = \lim_{h \to 0^-}\frac{-2}{h} = \infty,$$

and

$$\lim_{h \to 0^+}\frac{f(h) - f(0)}{h} = \lim_{h \to 0^+}\frac{(1) - 1}{h} = \lim_{h \to 0^+} 0 = 0,$$

i.e. the first of these limits does not exist as $\infty$ is not a value$^{10}$ but more of a notational convenience which tells us that a function is getting arbitrarily large in the limit. Consequently, we see that

$$f'(0) = \lim_{h \to 0}\frac{f(h) - f(0)}{h},$$

fails to exist too and so the derivative of this function does not exist at $x = 0$.

Of course, the graph of a function can also have a discontinuity due to the presence of a vertical asymptote. In such cases, the function is not actually defined at the value of $x$ where the asymptote occurs and so, because of this, the derivative cannot exist at this point either.$^{11}$ In both of these cases, as we can't ascribe a gradient to the function at these points, the function can't have a tangent line at these points.

9 Notice that the former limit allows us to deal with negative $h$ and the latter allows us to deal with positive $h$. Also recall that the notation $h \to 0^-$ and $h \to 0^+$ was explained in Example 2.2.

10 That is, it is not a real number.

11 We'll come across this again in Section 4.4.3.

Figure 3.5: The graphs of three functions that have no derivative at $x = 0$ as explained in (a) Example 3.21, (b) Example 3.22 and (c) Example 3.23. Panel (a) shows the step function of Example 3.21, panel (b) shows $y = |x|$ and panel (c) shows $y = x^{1/3}$. We note however that, unlike the functions in (a) and (b), the function in (c) does have a tangent line at $x = 0$ given by the vertical line with equation $x = 0$.

Continuous functions with corners

But, even if a function is continuous at every point, the derivative will not exist at

points where the curve changes too sharply, i.e. when the curve has a corner, as the

next example illustrates.

Example 3.22 Show that the derivative of the function $f(x) = |x|$ does not exist when $x = 0$.

This function is illustrated in Figure 3.5(b) and, clearly, as the function is the continuous straight line $f(x) = -x$ when $x < 0$ and $f(x) = x$ when $x > 0$, its derivative is defined and equal to $-1$ when $x < 0$ and $1$ when $x > 0$. However, when $x = 0$, the function has a corner and its derivative, if it exists, would be given by

$$f'(0) = \lim_{h \to 0}\frac{f(h) - f(0)}{h}.$$

However, here we cannot just find

$$\frac{f(h) - f(0)}{h},$$

and let $h \to 0$ as we did in Section 3.1 since the value of $f(h)$ is different depending on whether $h$ is positive or negative. In such cases, we again say that the limit we seek, i.e.

$$\lim_{h \to 0}\frac{f(h) - f(0)}{h},$$

exists if, firstly, both of the limits

$$\lim_{h \to 0^-}\frac{f(h) - f(0)}{h} \quad\text{and}\quad \lim_{h \to 0^+}\frac{f(h) - f(0)}{h},$$

exist$^{12}$ and, secondly, if they exist, they must be equal. But, using the given function, we see that

$$\lim_{h \to 0^-}\frac{f(h) - f(0)}{h} = \lim_{h \to 0^-}\frac{(-h) - 0}{h} = \lim_{h \to 0^-}(-1) = -1,$$

and

$$\lim_{h \to 0^+}\frac{f(h) - f(0)}{h} = \lim_{h \to 0^+}\frac{h - 0}{h} = \lim_{h \to 0^+} 1 = 1,$$

i.e. both of these limits exist, but they are clearly not equal. Consequently, we see that

$$f'(0) = \lim_{h \to 0}\frac{f(h) - f(0)}{h},$$

fails to exist and so the derivative of this function does not exist at $x = 0$.

Observe that, in this case, the limits as $h \to 0^+$ and as $h \to 0^-$ both exist, but the problem occurs because they are not equal and so we cannot ascribe a value to the derivative (i.e. the limit as $h \to 0$) in such situations. In particular, as this means that we can't ascribe a gradient to $f$ at this point, the function can't have a tangent line here either.
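The disagreement between the two one-sided limits at a corner shows up immediately in a numerical experiment. The Python sketch below evaluates the difference quotient with a small negative step and a small positive step; the helper name `one_sided_quotients` and the step sizes are assumptions for illustration.

```python
def one_sided_quotients(f, a, h):
    """Difference quotients of f at a with steps -h (left) and +h (right), for h > 0."""
    left = (f(a - h) - f(a)) / (-h)
    right = (f(a + h) - f(a)) / h
    return left, right


if __name__ == "__main__":
    for h in (0.1, 0.01, 0.001):
        left, right = one_sided_quotients(abs, 0.0, h)
        print(f"|x| at 0, h = {h}: left {left}, right {right}")
    # For comparison, a function that is differentiable at 0
    left, right = one_sided_quotients(lambda x: x ** 2, 0.0, 0.001)
    print(f"x^2 at 0, h = 0.001: left {left}, right {right}")
```

For $|x|$ the two quotients stay apart however small $h$ becomes, whereas for $x^2$ they approach a common value, which is what the existence of $f'(0)$ requires.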

Continuous functions with vertical tangent lines

Also, even if a function is continuous at every point, the derivative will not exist at points where the gradient of the curve becomes infinite, i.e. when the curve has a vertical tangent line, as the next example illustrates.

Example 3.23 Show that the derivative of the function $f(x) = x^{1/3}$ does not exist when $x = 0$.

This function is illustrated in Figure 3.5(c) and, clearly, we can see that its derivative is given by

$$f'(x) = \frac{1}{3}x^{-2/3} = \frac{1}{3x^{2/3}},$$

which exists as long as $x \neq 0$. Of course, when $x = 0$, the derivative cannot exist since, if we were to use this formula, we would have to divide by zero and this is never allowed. However, we can see from Figure 3.5(c) that the graph of the function has a vertical tangent line at $x = 0$ which is given by the vertical line with equation $x = 0$.$^{13}$ Thus, we have a situation where the derivative of the function does not exist at $x = 0$, but it does have a tangent line at that point.

Observe that, in cases where the tangent line to $f$ at a point is a vertical line, we cannot use (3.1) to find its equation as its derivative is not defined.$^{14}$

12 Again, as in Example 3.21, the former limit allows us to deal with negative $h$ and the latter allows us to deal with positive $h$.

13 Notice that the tangent lines of the function are getting steeper as we move towards $x = 0$ on the left and shallower as we move away from $x = 0$ on the right.

14 We'll come across this again in Section 4.4.3.

3.4

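We have seen that the first derivative of a function, $f$, can allow us to find a linear approximation to $f$ around $a$ by using the formula

$$f(x) \approx f(a) + (x - a)f'(a).$$

More generally, the higher-order derivatives of $f$ allow us to describe $f$ around $a$ by using the formula

$$f(x) = f(a) + (x - a)f'(a) + \frac{(x - a)^2}{2!}f''(a) + \cdots + \frac{(x - a)^n}{n!}f^{(n)}(a) + \cdots, \tag{3.2}$$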

which is called the Taylor series for f (x) about x = a.15 You will notice that the

right-hand-side of this formula is an infinite series and, for reasons beyond the scope of

this course, there will generally be conditions that depend on f and a that determine

whether this infinite series does indeed give us the value of f (x) that we expect to get

on the left-hand-side. For now, we just note that these conditions can be used to find a

set of values of x, that includes the point x = a, for which the formula works. Of course,

if the value of x in question does not lie in this set, the formula does not work!

In this course, we will often just use the first few terms from the Taylor series to get an approximate value of $f(x)$.$^{16}$ And, as long as we are considering what this formula tells us about $f(x)$ when $x$ is close to $a$, these approximations will generally be more than adequate. For instance, if we take $n = 1$ in this formula, i.e. if we take the first two terms of the Taylor series, we recover our linear approximation to $f$ around $a$ and, if we take $n = 2$, we get

$$f(x) \approx f(a) + (x - a)f'(a) + \frac{(x - a)^2}{2!}f''(a),$$

which is now a quadratic approximation to f around a. Indeed, we have seen how useful

the linear approximation is in Section 3.3.2 and the quadratic approximation will be

useful in the next chapter.

3.4.1 Maclaurin series

Let's start with the relatively simple case of a Maclaurin series, which is what we call a Taylor series about $x = 0$. That is, the Maclaurin series of the function $f(x)$ is found by setting $a = 0$ in (3.2) to get

$$f(x) = f(0) + xf'(0) + \frac{x^2}{2!}f''(0) + \cdots + \frac{x^n}{n!}f^{(n)}(0) + \cdots. \tag{3.3}$$

To see how this works, let's start by finding a simple Maclaurin series.

15

See, for instance, Section 2.13 of Binmore and Davies (2002) for an explanation of where this formula

comes from.

16

It will be an approximation since, if we only keep the first few terms from the beginning of the series,

we lose all the information about the value of f (x) that is contained in the terms we are neglecting.


Example 3.24 Find the Maclaurin series of $e^x$.

Here we have $f(x) = e^x$ so that $f(0) = 1$. We also note that the first three derivatives of this function are

$$f'(x) = e^x, \quad f''(x) = e^x \quad\text{and}\quad f'''(x) = e^x.$$

Indeed, it should be clear that $f^{(n)}(x) = e^x$ for all $n \geq 1$. Then, to use these in (3.3), we need to evaluate these derivatives at $x = 0$, i.e. we find that

$$f'(0) = e^0 = 1, \quad f''(0) = e^0 = 1 \quad\text{and}\quad f'''(0) = e^0 = 1.$$

Indeed, it should be clear that $f^{(n)}(0) = e^0 = 1$ for all $n \geq 1$. Consequently, putting this into (3.3), we get

$$e^x = 1 + x + \frac{x^2}{2!} + \frac{x^3}{3!} + \cdots + \frac{x^n}{n!} + \cdots.$$

In particular, observe that ex is only equal to the series on the right-hand-side if we

keep all of the terms in this infinite series. Of course, it is not always so easy to find a

Maclaurin series and so lets look at another example.

Example 3.25

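Here we have $f(x) = (1 + x)^r$ so that $f(0) = 1$. We also note that the first three
derivatives of this function are

\[ f'(x) = r(1 + x)^{r-1}, \quad f''(x) = r(r - 1)(1 + x)^{r-2} \quad\text{and}\quad f'''(x) = r(r - 1)(r - 2)(1 + x)^{r-3}. \]

Indeed, it should be clear that

\[ f^{(n)}(x) = r(r - 1) \cdots (r - [n - 1])(1 + x)^{r-n}, \]

for all $n \geq 1$. Then, to use these in (3.3), we need to evaluate these derivatives at
$x = 0$, i.e. we find that

\[ f'(0) = r, \quad f''(0) = r(r - 1), \quad f'''(0) = r(r - 1)(r - 2) \quad\text{and}\quad f^{(n)}(0) = r(r - 1) \cdots (r - [n - 1]), \]

for all $n \geq 1$. Consequently, putting this into (3.3), we get

\[ (1 + x)^r = 1 + rx + \frac{r(r - 1)}{2!} x^2 + \frac{r(r - 1)(r - 2)}{3!} x^3 + \cdots + \frac{r(r - 1) \cdots (r - [n - 1])}{n!} x^n + \cdots. \]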


In particular, notice that if $r \in \mathbb{Q}$ but $r \notin \mathbb{N}$, this is always an infinite series as, for any
$n \in \mathbb{N}$, we will find that $r - [n - 1] \neq 0$. However, if $r \in \mathbb{N}$, we will find a value of $n$,
namely $n = r + 1$, that makes $r - [n - 1] = 0$ and this will mean that all of the terms
with $n \geq r + 1$ will be zero, i.e. the Maclaurin series will be finite and will terminate at
the term where $n = r$. This is a very special Maclaurin series that you may have
encountered before as the binomial theorem and we look at some examples of this
special case in Activity 3.18.

Activity 3.18 Use the Maclaurin series for $(1 + x)^r$ which we found in
Example 3.25 to find $(1 + x)^2$ and $(1 + x)^3$.

As well as the two Maclaurin series derived in Examples 3.24 and 3.25, you should also

remember the following

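\[ \sin x = x - \frac{x^3}{3!} + \frac{x^5}{5!} - \cdots + (-1)^n \frac{x^{2n+1}}{(2n + 1)!} + \cdots \quad\text{for } x \in \mathbb{R}, \]

\[ \cos x = 1 - \frac{x^2}{2!} + \frac{x^4}{4!} - \cdots + (-1)^n \frac{x^{2n}}{(2n)!} + \cdots \quad\text{for } x \in \mathbb{R}, \]

\[ \ln(1 + x) = x - \frac{x^2}{2} + \frac{x^3}{3} - \cdots + (-1)^{n+1} \frac{x^n}{n} + \cdots \quad\text{for } |x| < 1. \]

In particular, observe how these series differ in their first term, the presence of terms of
odd and even degree and the absence of factorials in the series for $\ln(1 + x)$.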

As we have seen, a Maclaurin series is an infinite series in powers of x and, by taking a

certain number of terms, we can use it to approximate a function. In particular, we say

that we have the nth-order Maclaurin series of a function if we keep all of the terms up

to and including the one in xn and discard the rest.

Example 3.26

Write down the second and fourth-order Maclaurin series for cos x.

As we saw above, the Maclaurin series for $\cos x$ is given by the infinite series

\[ \cos x = 1 - \frac{x^2}{2!} + \frac{x^4}{4!} - \cdots + (-1)^n \frac{x^{2n}}{(2n)!} + \cdots. \]

As such, the second-order Maclaurin series for $\cos x$ is

\[ 1 - \frac{x^2}{2!}, \]

which, since there is no $x^3$ term in the Maclaurin series for $\cos x$, is also the
third-order Maclaurin series for $\cos x$. Similarly, the fourth-order Maclaurin series for
$\cos x$ is

\[ 1 - \frac{x^2}{2!} + \frac{x^4}{4!}, \]

which, since there is no $x^5$ term in the Maclaurin series for $\cos x$, is also the
fifth-order Maclaurin series for $\cos x$.


These nth-order Maclaurin series can be used to approximate a function, f (x), for

values of x close to x = 0. In general, there are two factors that determine how accurate

this approximation will be, namely

the value of x we are considering: the closer this value of x is to x = 0, the better

the approximation will be, and

the order of the Maclaurin series we use: the more terms we keep, the better the

approximation will be.

The precise way of determining the accuracy of such approximations in terms of these

two factors will be dealt with in 176 Further Calculus where you will encounter Taylor's

theorem. But, we can see how it works and begin to see how these factors affect the

accuracy of our approximations by considering some examples.

Example 3.27 Use the fourth-order Maclaurin series for cos x to find an

approximate value for cos 1 and cos 2.

The fourth-order Maclaurin series for $\cos x$ is

\[ 1 - \frac{x^2}{2!} + \frac{x^4}{4!}. \]

So, taking $x = 1$, we see that

\[ \cos 1 \approx 1 - \frac{1^2}{2} + \frac{1^4}{24} = \frac{13}{24}, \]

which is 0.5417 to 4dp. Using a calculator we see that the true value of $\cos 1$ is 0.5403
to 4dp and so this is a good approximation as, to 2dp, it gives us 0.54 either way.

Similarly, taking $x = 2$, we see that

\[ \cos 2 \approx 1 - \frac{2^2}{2} + \frac{2^4}{24} = 1 - 2 + \frac{2}{3} = -\frac{1}{3}, \]

which is $-0.3333$ to 4dp. Using a calculator we see that the true value of $\cos 2$ is
$-0.4161$ to 4dp and so this is a poor approximation as it isn't even accurate to 1dp.
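The two calculations in Example 3.27 can be checked numerically. What follows is a minimal Python sketch that uses only the standard library; the function name `maclaurin_cos` is our own label for illustration and is not part of the guide.

```python
import math

def maclaurin_cos(x, order):
    """Sum the Maclaurin series for cos x, keeping terms up to x**order."""
    total = 0.0
    for n in range(order // 2 + 1):
        # Term n of the series is (-1)^n x^(2n) / (2n)!
        total += (-1) ** n * x ** (2 * n) / math.factorial(2 * n)
    return total

for x in (1.0, 2.0):
    approx = maclaurin_cos(x, 4)
    print(f"x = {x}: fourth-order series = {approx:.4f}, cos x = {math.cos(x):.4f}")
```

Changing the second argument from 4 to 2 gives the second-order series, which makes it easy to compare the two orders at the same point.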

But, of course, we should expect our approximations to be poor if we move too far away

from x = 0 as, by definition, the Maclaurin series represents how the function is

behaving around x = 0. To see this, consider the curves in Figure 3.6 which illustrate

how the fourth-order Maclaurin series for cos x becomes less accurate at approximating

the function as we move away from x = 0.

The other way in which the accuracy of our approximation to a function can be affected

is the number of terms we take in the Maclaurin series. For instance, the second-order

Maclaurin series for cos x contains less information about the function than the

fourth-order one and so we would expect this to give us a worse approximation. This

can be seen in Figure 3.7, which illustrates how the second-order Maclaurin series is

even less accurate than the fourth-order one as we move away from x = 0.


Figure 3.6: The solid curve is the graph of the function cos x and the dashed curve is the

graph of the fourth-order Maclaurin series for this function. Observe how the Maclaurin

series moves away from the function as we take values of x further away from x = 0.

Using Maclaurin series to approximate other functions

We now look at some ways of finding Maclaurin series for more complicated functions

and see how we can use these to find approximations.

Example 3.28

There are two ways to do this. We could use (3.3) to see that as $f(x) = x e^x$ we have
$f(0) = 0$ and then, using what we found in Activity 3.15 above, i.e.

\[ f'(x) = (1 + x) e^x, \quad f''(x) = (2 + x) e^x, \quad f'''(x) = (3 + x) e^x, \quad f^{(4)}(x) = (4 + x) e^x, \]

we see that

\[ f'(0) = 1, \quad f''(0) = 2, \quad f'''(0) = 3, \quad f^{(4)}(0) = 4, \]

so that, keeping terms up to $x^4$, we get

\[ 0 + x(1) + \frac{x^2}{2!}(2) + \frac{x^3}{3!}(3) + \frac{x^4}{4!}(4) = x + x^2 + \frac{x^3}{2} + \frac{x^4}{6}. \]

Alternatively, since we know that the Maclaurin series for $e^x$ is given by

\[ e^x = 1 + x + \frac{x^2}{2!} + \frac{x^3}{3!} + \frac{x^4}{4!} + \cdots, \]

we can see that

\[ x e^x = x \left( 1 + x + \frac{x^2}{2!} + \frac{x^3}{3!} + \frac{x^4}{4!} + \cdots \right) = x + x^2 + \frac{x^3}{2} + \frac{x^4}{6} + \cdots, \]

which agrees with what we found using the first method.

Figure 3.7: The solid curve is the graph of the function cos x, the dotted curve is the graph
of its second-order Maclaurin series and the dashed curve is the graph of its fourth-order
Maclaurin series. Observe how the former less accurately tracks the function than the
latter as we take values of x further away from x = 0.
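The second method in Example 3.28 is just polynomial multiplication with the higher powers thrown away, and that can be sketched in a few lines. The helper `multiply_truncated` and the list-of-coefficients representation (index = power of x) are assumptions made for this illustration only.

```python
from fractions import Fraction
from math import factorial

def multiply_truncated(p, q, order):
    """Multiply two coefficient lists (index = power of x), discarding powers above `order`."""
    result = [Fraction(0)] * (order + 1)
    for i, a in enumerate(p):
        for j, b in enumerate(q):
            if i + j <= order:
                result[i + j] += a * b
    return result

ORDER = 4
# Coefficients of 1 + x + x^2/2! + x^3/3! + x^4/4!
exp_series = [Fraction(1, factorial(n)) for n in range(ORDER + 1)]
# Coefficients of the polynomial x
x_series = [Fraction(0), Fraction(1)]

x_exp_series = multiply_truncated(x_series, exp_series, ORDER)
print([str(c) for c in x_exp_series])
```

Exact fractions are used rather than floats so that the coefficients can be compared directly with the ones found by hand.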

This example illustrates a general point: when asked to find a Maclaurin series of a

certain order, we can always use the definition and differentiation. But, if the

derivatives start to become difficult to calculate, it is always easier to use the Maclaurin

series for the elementary functions (which we saw above) and a little algebra to find

what we are looking for. Let's consider another example to see how we can do this in a

slightly harder situation.

Example 3.29

Here we have $f(x) = \cos(\ln(1 + x))$ which is a composition where $f(x) = \cos y$ with
$y = \ln(1 + x)$. So we need to look at the Maclaurin series for $\cos y$ which is given by

\[ \cos y = 1 - \frac{y^2}{2!} + \frac{y^4}{4!} - \cdots, \]

and $y$, in turn, will be given by the Maclaurin series for $\ln(1 + x)$, i.e.

\[ y = \ln(1 + x) = x - \frac{x^2}{2} + \frac{x^3}{3} - \frac{x^4}{4} + \cdots. \]

So, substituting our series for $y$ into our series for $\cos y$, we can see that

\[ f(x) = 1 - \frac{1}{2!} \underbrace{\left( x - \frac{x^2}{2} + \frac{x^3}{3} - \frac{x^4}{4} + \cdots \right)^2}_{A} + \frac{1}{4!} \underbrace{\left( x - \frac{x^2}{2} + \frac{x^3}{3} - \frac{x^4}{4} + \cdots \right)^4}_{B} - \cdots, \]

and we start by looking at how the terms $A$ and $B$ contribute to the series if we are
only interested in terms up to $x^4$. For $A$, we have

\[ A = \left( x - \frac{x^2}{2} + \frac{x^3}{3} - \frac{x^4}{4} + \cdots \right)^2 = \left( x - \frac{x^2}{2} + \frac{x^3}{3} - \frac{x^4}{4} + \cdots \right) \left( x - \frac{x^2}{2} + \frac{x^3}{3} - \frac{x^4}{4} + \cdots \right), \]

so we can multiply each term in the second bracket by the appropriate terms in the
first bracket (taking care to include cross-terms) to get

\[ A = (x)(x) - 2 \left( \frac{x^2}{2} \right)(x) + 2 \left( \frac{x^3}{3} \right)(x) + \left( \frac{x^2}{2} \right) \left( \frac{x^2}{2} \right) + \cdots = x^2 - x^3 + \frac{11}{12} x^4 + \cdots, \]

where '$\cdots$' indicates terms we can ignore because their degree is greater than four.
Similarly, for $B$, we have

\[ B = \left( x - \frac{x^2}{2} + \frac{x^3}{3} - \frac{x^4}{4} + \cdots \right)^4, \]

which is the bracketed expression multiplied by itself four times. The terms which arise
from this product are obtained by multiplying together four objects, one from each
occurrence of the bracketed expression. Since the term with lowest power of $x$ in each
bracket is $x$, it is only by taking the $x$ from each bracket that we obtain a term whose
degree is at most four and so we get

\[ B = x^4 + \cdots, \]

where '$\cdots$' indicates terms we can ignore because their degree is greater than four.
Of course, using similar reasoning, we can see that there will be no further terms for
our series as the next term in the $\cos y$ series (i.e. the first one we omitted above) is
$-y^6/6!$ and the term of lowest degree this can yield looks like $x^6$ whose degree is greater
than four.

Therefore, putting this all together, we have

\[
\begin{aligned}
f(x) &= 1 - \frac{A}{2!} + \frac{B}{4!} - \cdots \\
&= 1 - \frac{1}{2} \left( x^2 - x^3 + \frac{11}{12} x^4 + \cdots \right) + \frac{1}{24} \left( x^4 + \cdots \right) - \cdots \\
&= 1 - \frac{x^2}{2} + \frac{x^3}{2} - \frac{5}{12} x^4 + \cdots,
\end{aligned}
\]

and this gives us the fourth-order Maclaurin series for $\cos(\ln(1 + x))$ as we have kept
all of the terms up to $x^4$.
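The algebra in Example 3.29 can be double-checked by doing the same truncated multiplications mechanically. This is only a sketch: the helper `multiply_truncated` and the coefficient-list representation are assumptions introduced here, and `A` and `B` are named after the two bracketed terms in the example.

```python
from fractions import Fraction

ORDER = 4

def multiply_truncated(p, q, order=ORDER):
    """Multiply two coefficient lists (index = power of x), discarding powers above `order`."""
    result = [Fraction(0)] * (order + 1)
    for i, a in enumerate(p):
        for j, b in enumerate(q):
            if i + j <= order:
                result[i + j] += a * b
    return result

# y = ln(1 + x) = x - x^2/2 + x^3/3 - x^4/4, stored as coefficients of x^0, ..., x^4
y = [Fraction(0)] + [Fraction((-1) ** (n + 1), n) for n in range(1, ORDER + 1)]

A = multiply_truncated(y, y)  # y^2, truncated at x^4
B = multiply_truncated(A, A)  # y^4, truncated at x^4

# cos y = 1 - y^2/2! + y^4/4!, again keeping terms up to x^4
one = [Fraction(1)] + [Fraction(0)] * ORDER
series = [c - a / 2 + b / 24 for c, a, b in zip(one, A, B)]
print([str(c) for c in series])
```

The list `A` holds the coefficients of the squared bracket and `series` holds the coefficients of the final fourth-order series, so both can be compared term by term with the working above.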


Activity 3.19 Find the fourth-order Maclaurin series for cos(ln(1 + x)) by using

the definition and differentiation to verify the answer we found in Example 3.29.

(Notice that it is harder to work it out using this method!)

Once we have the Maclaurin series of a function, f (x), we can use it to estimate the

value of the function at some value of x close to zero as we did above.

Example 3.30 Use the Maclaurin series we found in Example 3.29 to find an

approximate value for cos(ln 1.1) and cos(ln 1.9).

To find an approximate value for $\cos(\ln 1.1)$, we use the Maclaurin series above to
get the approximation

\[ \cos(\ln(1 + x)) \approx 1 - \frac{x^2}{2} + \frac{x^3}{2} - \frac{5}{12} x^4, \]

and then take $x = 0.1$ to get

\[ \cos(\ln 1.1) = \cos(\ln(1 + 0.1)) \approx 1 - \frac{0.1^2}{2} + \frac{0.1^3}{2} - \frac{5}{12} (0.1)^4 = 1 - 0.005 + 0.0005 - 0.000042, \]

which is 0.995458 to 6dp. In passing we note that, using a calculator, the true value
is 0.995461 to 6dp and so this is a good approximation as, to 5dp, it gives us 0.99546
either way.

To find an approximate value for $\cos(\ln 1.9)$, we use the approximation above with
$x = 0.9$ to get

\[ \cos(\ln 1.9) = \cos(\ln(1 + 0.9)) \approx 1 - \frac{0.9^2}{2} + \frac{0.9^3}{2} - \frac{5}{12} (0.9)^4 = 1 - 0.405 + 0.3645 - 0.273375, \]

which is 0.686125 to 6dp. In passing we note that, using a calculator, the true value
is 0.800987 to 6dp and so this is a poor approximation as it isn't even accurate to
1dp.
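The comparison in Example 3.30 is easy to reproduce. The sketch below assumes only Python's standard `math` module; the function name `series_cos_ln` is ours and simply encodes the fourth-order series found in Example 3.29.

```python
import math

def series_cos_ln(x):
    """Fourth-order Maclaurin series for cos(ln(1 + x)) from Example 3.29."""
    return 1 - x**2 / 2 + x**3 / 2 - 5 * x**4 / 12

for x in (0.1, 0.9):
    exact = math.cos(math.log(1 + x))
    print(f"x = {x}: series = {series_cos_ln(x):.6f}, exact = {exact:.6f}")
```

Trying a few values of x between 0.1 and 0.9 shows how the gap between the two columns widens as x moves away from zero.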

Observe that this approximation has deteriorated much more quickly than the one we

used when considering approximate values of cos x in Example 3.27. We won't pursue

the nature of this sensitivity here, but we do reiterate that we should expect our

approximations to be poor if we move too far away from x = 0 for, as we have seen, the

Maclaurin series is there to represent how the function is behaving around x = 0.

3.4.2 Taylor series

We now briefly consider what happens when we are looking for the Taylor series for
$f(x)$ around $x = a$ when $a \neq 0$. In this case, we follow the general method outlined
above, but now we have to use (3.2), i.e.

\[ f(x) = f(a) + (x - a) f'(a) + \frac{(x - a)^2}{2!} f''(a) + \cdots + \frac{(x - a)^n}{n!} f^{(n)}(a) + \cdots. \]


Example 3.31

Here we have $f(x) = e^x$ so that $f(1) = e$. We also note, as in Example 3.24, that
$f^{(n)}(x) = e^x$ for $n \geq 1$. Then, to use these derivatives in (3.2), we need to evaluate
them at $x = 1$, i.e. we find that $f^{(n)}(1) = e$ for $n \geq 1$. Consequently, putting this into
(3.2), we get

\[ e^x = e + (x - 1) e + \frac{(x - 1)^2}{2!} e + \frac{(x - 1)^3}{3!} e + \cdots + \frac{(x - 1)^n}{n!} e + \cdots. \]

Activity 3.20 We can write $e^x$ as $e^1 e^{x-1}$ so that values of $x$ around $x = 1$
correspond to values of $x - 1$ around zero. Use this fact and the Maclaurin series for
$e^x$ which we found in Example 3.24 to derive the result we found in Example 3.31.

Activity 3.21 Find the Taylor series for $e^x$ around $x = 2$.

We can use the Taylor series of a function around x = a to get approximations to the

value of the function for values of x close to x = a in the same way as we used the

Maclaurin series of a function to get approximations to the value of the function for

values of x close to x = 0 in Section 3.4.1. As the ideas are so similar, we will just take a

brief look at how they work.

Example 3.32 Find an approximation to $e^{1.1}$ using (a) the second-order Maclaurin
series for $e^x$ and (b) the second-order Taylor series for $e^x$ around $x = 1$. How do
these approximations compare?

For (a), we know from Example 3.24 that the second-order Maclaurin series for $e^x$ is
given by

\[ 1 + x + \frac{x^2}{2!}, \]

and, using this, we find that

\[ e^{1.1} \approx 1 + 1.1 + \frac{1.1^2}{2!} = 2.705. \]

Incidentally, the value of $e^{1.1}$ is 3.0042 (to 4dp) and so this approximation
doesn't even agree with this to 1dp.

For (b), we know from Example 3.31 that the second-order Taylor series for $e^x$
around $x = 1$ is given by

\[ e + (x - 1) e + \frac{(x - 1)^2}{2!} e, \]

and, using this, we find that

\[ e^{1.1} \approx e + (1.1 - 1) e + \frac{(1.1 - 1)^2}{2!} e = 1.105 e, \]

which, if we know the value of $e$, gives us 3.0037 (to 4dp). This agrees with the
value of $e^{1.1}$ to 3dp.

As we should expect, the answer to (b) gives us a better approximation to $e^{1.1}$ than
the one we found in (a) since $x = 1.1$ is closer to $x = 1$ than it is to $x = 0$. But, on
the other hand, the answer to (a) didn't require us to have any accurate knowledge
of the value of $e$ itself!
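Both approximations in Example 3.32 come from the same formula with a different choice of $a$, which the following sketch makes explicit. It assumes only the standard `math` module, and `taylor_exp` is a name chosen here for illustration.

```python
import math

def taylor_exp(x, a, order):
    """Taylor series for e^x about x = a, keeping terms up to (x - a)**order."""
    return sum(math.exp(a) * (x - a) ** n / math.factorial(n) for n in range(order + 1))

x = 1.1
print(f"second-order Maclaurin series (a = 0): {taylor_exp(x, 0, 2):.4f}")
print(f"second-order Taylor series    (a = 1): {taylor_exp(x, 1, 2):.4f}")
print(f"calculator value of e^1.1            : {math.exp(x):.4f}")
```

Note that the sketch calls `math.exp(a)` to obtain $e^a$, which mirrors the remark in the example: the series about $x = 1$ is only usable if we already know the value of $e$.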

Following on from this example, as we can see in Figure 3.8, we observe that the
Maclaurin series for $e^x$ is most accurate when $x$ is close to $x = 0$ whereas the Taylor
series for $e^x$ about $x = 1$ is most accurate when $x$ is close to $x = 1$. This is, of course,
exactly what we should expect!

Figure 3.8: The solid curve is the graph of the function $e^x$, the dashed curve is the graph
of its second-order Maclaurin series and the dotted curve is the graph of its second-order
Taylor series about $x = 1$. Observe how, as we might expect, the Maclaurin series is more
accurate around $x = 0$ and this Taylor series is more accurate around $x = 1$.

Learning outcomes

At the end of this chapter and having completed the relevant reading and activities, you

should be able to:

find simple derivatives using the definition of the derivative;

find derivatives using standard derivatives and the rules of differentiation;

use the derivative to find tangent lines and use these to approximate functions;

solve problems from economics-based subjects that involve derivatives;

find Maclaurin and Taylor series and use these to approximate functions.


Solutions to activities

Solution to activity 3.1

We need to find the derivative of the function $f(x) = x^2$ at the point $x = 1$, i.e.
$f'(1)$. So, using the definition of the derivative with $a = 1$, we start by looking at

\[ \frac{f(1 + h) - f(1)}{h} = \frac{(1 + h)^2 - (1)^2}{h}, \]

which, looking at the numerator, is easily simplified to give

\[ \frac{f(1 + h) - f(1)}{h} = \frac{(1 + 2h + h^2) - 1}{h} = \frac{2h + h^2}{h} = 2 + h. \]

This in turn means that

\[ f'(1) = \lim_{h \to 0} \frac{f(1 + h) - f(1)}{h} = \lim_{h \to 0} (2 + h) = 2. \]

Solution to activity 3.2

In Example 3.2, we showed that if $f(x) = x^2$, then $f'(x) = 2x$. This means that
$f'(1) = 2$ in agreement with what we saw in Activity 3.1.

To find the point at which the derivative of $f(x) = x^2$, i.e. $f'(x) = 2x$, is equal to (a) 16,
we see that

\[ f'(x) = 2x = 16 \text{ when } x = 8, \]

and (b) 4, we see that

\[ f'(x) = 2x = 4 \text{ when } x = 2. \]

Solution to activity 3.3

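Given the linear combination rule, i.e.

\[ \frac{d}{dx} \bigl[ k f(x) + l g(x) \bigr] = k \frac{df}{dx} + l \frac{dg}{dx}, \]

we can derive the constant multiple rule by setting $l = 0$ so that

\[ \frac{d}{dx} \bigl[ k f(x) \bigr] = \frac{d}{dx} \bigl[ k f(x) + 0 g(x) \bigr] = k \frac{df}{dx} + 0 \frac{dg}{dx} = k \frac{df}{dx}, \]

the sum rule by setting $k = 1$ and $l = 1$ so that

\[ \frac{d}{dx} \bigl[ f(x) + g(x) \bigr] = \frac{d}{dx} \bigl[ 1 f(x) + 1 g(x) \bigr] = 1 \frac{df}{dx} + 1 \frac{dg}{dx} = \frac{df}{dx} + \frac{dg}{dx}, \]

and the difference rule by setting $k = 1$ and $l = -1$ so that

\[ \frac{d}{dx} \bigl[ f(x) - g(x) \bigr] = \frac{d}{dx} \bigl[ 1 f(x) + (-1) g(x) \bigr] = 1 \frac{df}{dx} + (-1) \frac{dg}{dx} = \frac{df}{dx} - \frac{dg}{dx}. \]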

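Solution to activity 3.4

For (a), we use the constant multiple rule to see that

\[ \frac{d}{dx} \bigl[ 3 \cos x \bigr] = 3 \frac{d}{dx} \bigl[ \cos x \bigr] = 3 \bigl[ -\sin x \bigr] = -3 \sin x. \]

For (b), we use the sum rule to see that

\[ \frac{d}{dx} \bigl[ e^x + \cos x \bigr] = \frac{d}{dx} \bigl[ e^x \bigr] + \frac{d}{dx} \bigl[ \cos x \bigr] = e^x + \bigl[ -\sin x \bigr] = e^x - \sin x. \]

For (c), we use the linear combination rule to see that

\[ \frac{d}{dx} \bigl[ 3 \sin x - 3 \ln x \bigr] = 3 \frac{d}{dx} \bigl[ \sin x \bigr] - 3 \frac{d}{dx} \bigl[ \ln x \bigr] = 3 \cos x - 3 \left[ \frac{1}{x} \right] = 3 \cos x - \frac{3}{x}. \]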

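Solution to activity 3.5

For (a), $h(x) = x \sin x$ is the product of the two functions

\[ f(x) = x \quad\text{and}\quad g(x) = \sin x, \]

whose derivatives are

\[ f'(x) = 1 \quad\text{and}\quad g'(x) = \cos x. \]

As such, the product rule tells us that

\[ h'(x) = (1)(\sin x) + (x)(\cos x) = \sin x + x \cos x. \]

For (b), $h(x) = e^x \cos x$ is the product of the two functions

\[ f(x) = e^x \quad\text{and}\quad g(x) = \cos x, \]

whose derivatives are

\[ f'(x) = e^x \quad\text{and}\quad g'(x) = -\sin x. \]

As such, the product rule tells us that

\[ h'(x) = (e^x)(\cos x) + (e^x)(-\sin x) = e^x (\cos x - \sin x). \]

For (c), $h(x) = \sin x \cos x$ is the product of the two functions

\[ f(x) = \sin x \quad\text{and}\quad g(x) = \cos x, \]

whose derivatives are

\[ f'(x) = \cos x \quad\text{and}\quad g'(x) = -\sin x. \]

As such, the product rule tells us that

\[ h'(x) = (\cos x)(\cos x) + (\sin x)(-\sin x) = \cos^2 x - \sin^2 x. \]

Then, using the double angle formulae

\[ \sin x \cos x = \tfrac{1}{2} \sin(2x) \quad\text{and}\quad \cos^2 x - \sin^2 x = \cos(2x), \]

we see that

\[ \frac{d}{dx} \left[ \tfrac{1}{2} \sin(2x) \right] = \cos(2x), \quad\text{i.e.}\quad \frac{d}{dx} \bigl[ \sin(2x) \bigr] = 2 \cos(2x). \]

This result will make sense once we have seen the chain rule and, in particular,
Activity 3.8(a).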

Solution to activity 3.6

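For (a), $h(x) = \dfrac{\sin x}{x}$ is the quotient of the two functions

\[ f(x) = \sin x \quad\text{and}\quad g(x) = x, \]

whose derivatives are

\[ f'(x) = \cos x \quad\text{and}\quad g'(x) = 1. \]

As such, the quotient rule tells us that

\[ h'(x) = \frac{(\cos x)(x) - (\sin x)(1)}{x^2} = \frac{x \cos x - \sin x}{x^2}. \]

In this case, the original function and the derivative are only defined if $x \neq 0$.

For (b), $h(x) = \dfrac{e^x}{\cos x}$ is the quotient of the two functions

\[ f(x) = e^x \quad\text{and}\quad g(x) = \cos x, \]

whose derivatives are

\[ f'(x) = e^x \quad\text{and}\quad g'(x) = -\sin x. \]

As such, the quotient rule tells us that

\[ h'(x) = \frac{(e^x)(\cos x) - (e^x)(-\sin x)}{[\cos x]^2} = \frac{\cos x + \sin x}{\cos^2 x} e^x. \]

In this case, the original function and the derivative are only defined if $\cos x \neq 0$, i.e. if
$x \neq (2n + 1) \frac{\pi}{2}$ for $n \in \mathbb{Z}$.

For (c), $h(x) = \dfrac{\sin x}{\cos x}$ is the quotient of the two functions

\[ f(x) = \sin x \quad\text{and}\quad g(x) = \cos x, \]

whose derivatives are

\[ f'(x) = \cos x \quad\text{and}\quad g'(x) = -\sin x. \]

As such, the quotient rule tells us that

\[ h'(x) = \frac{(\cos x)(\cos x) - (\sin x)(-\sin x)}{[\cos x]^2} = \frac{\cos^2 x + \sin^2 x}{\cos^2 x}. \]

In this case, the original function and the derivative are only defined if $\cos x \neq 0$, i.e. if
$x \neq (2n + 1) \frac{\pi}{2}$ for $n \in \mathbb{Z}$.

Indeed, using the Pythagorean identity $\sin^2 x + \cos^2 x = 1$ and the definitions

\[ \tan x = \frac{\sin x}{\cos x} \quad\text{and}\quad \sec x = \frac{1}{\cos x}, \]

we see that

\[ \frac{d}{dx} \bigl[ \tan x \bigr] = \frac{1}{\cos^2 x} = \sec^2 x. \]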

Solution to activity 3.7

Given that $h(x) = (2x + 1)^3$, we can multiply out the brackets to get

\[ h(x) = 8x^3 + 12x^2 + 6x + 1, \]

which means that

\[ h'(x) = 24x^2 + 24x + 6 = 6(4x^2 + 4x + 1) = 6(2x + 1)^2, \]

in agreement with what we saw in Example 3.10.

Solution to activity 3.8

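For (a), $h(x) = \sin(2x)$ is the composition of the functions

\[ f(g) = \sin g \quad\text{and}\quad g(x) = 2x. \]

As such we have

\[ f'(g) = \cos g \quad\text{and}\quad g'(x) = 2, \]

and so the chain rule tells us that

\[ h'(x) = (\cos g)(2) = 2 \cos(2x), \]

which agrees with what we found in Activity 3.5(c).

For (b), $h(x) = \ln(\cos x)$ is the composition of the functions

\[ f(g) = \ln g \quad\text{and}\quad g(x) = \cos x. \]

As such we have

\[ f'(g) = \frac{1}{g} \quad\text{and}\quad g'(x) = -\sin x, \]

and so the chain rule tells us that

\[ h'(x) = \left( \frac{1}{g} \right)(-\sin x) = -\frac{\sin x}{\cos x} = -\tan x. \]

For (c), $h(x) = \ln(e^x)$ is the composition of the functions

\[ f(g) = \ln g \quad\text{and}\quad g(x) = e^x. \]

As such we have

\[ f'(g) = \frac{1}{g} \quad\text{and}\quad g'(x) = e^x, \]

and so the chain rule tells us that

\[ h'(x) = \left( \frac{1}{g} \right)(e^x) = \frac{e^x}{e^x} = 1. \]

Of course, this is what we should expect since $\ln(e^x) = x$, whose derivative is
therefore 1.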

Solution to activity 3.9

Given that $a^x = e^{x \ln a}$, we use the chain rule with $h(g) = e^g$ and $g(x) = x \ln a$ to get

\[ \frac{da^x}{dx} = \frac{d}{dx} \left[ e^{x \ln a} \right] = e^{x \ln a} \ln a = a^x \ln a, \]

as required.

Solution to activity 3.10

Writing the quotient $f(x)/g(x)$ as the product $f(x)[g(x)]^{-1}$, the product rule gives us

\[ \frac{d}{dx} \left[ f(x) [g(x)]^{-1} \right] = \frac{df}{dx} [g(x)]^{-1} + f(x) \left( -[g(x)]^{-2} \frac{dg}{dx} \right), \]

where we have used the chain rule to differentiate $[g(x)]^{-1}$ with respect to $x$. Rewriting
this, we then have

\[ \frac{d}{dx} \left[ \frac{f(x)}{g(x)} \right] = \frac{\dfrac{df}{dx} g(x) - f(x) \dfrac{dg}{dx}}{[g(x)]^2}, \]

which is the quotient rule, as required.

Solution to activity 3.11

We have $y = f(x)$ so that $x = f^{-1}(y)$. Thus, differentiating both sides of the latter with
respect to $x$, we get

\[ \frac{dx}{dx} = \frac{df^{-1}}{dy} \frac{dy}{dx}, \]

where we have used the chain rule on the right-hand side as $y$ itself is a function of $x$
since $y = f(x)$. This gives us

\[ 1 = \frac{df^{-1}}{dy} \frac{dy}{dx} \implies \frac{df^{-1}}{dy} = 1 \bigg/ \frac{df}{dx}, \]

as required.17 In particular, observe that this formula makes no sense at points where
$f'(x) = 0$.

17

See Section 2.9 of Binmore and Davies (2002) for a geometric view of this result.


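Solution to activity 3.12

Here we have $y = f(x) = e^x$ and $x = f^{-1}(y) = \ln y$ so, using the result from
Activity 3.11, we see that

\[ \frac{d}{dy} \bigl[ \ln y \bigr] = 1 \bigg/ \frac{d}{dx} \bigl[ e^x \bigr] = \frac{1}{e^x} = \frac{1}{y}, \]

as $(e^x)' = e^x = y$.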

Solution to activity 3.13

There is, generally, no need to apply the rules of differentiation in as much detail as we

have been using. So, let's do the three examples in this activity quickly.

For (a), we have $h(x) = e^{x^2} \ln(\sin x)$ which is the product of two compositions and so
using the product and chain rules we get

\[ h'(x) = 2x e^{x^2} \ln(\sin x) + e^{x^2} \frac{\cos x}{\sin x} = \frac{e^{x^2}}{\sin x} \bigl[ 2x \sin x \ln(\sin x) + \cos x \bigr]. \]

For (b), we have

\[ h(x) = \frac{\sin(\cos x)}{e^{\sin x}}, \]

which is the quotient of two compositions and so using the quotient and chain rules we
get

\[ h'(x) = \frac{\cos(\cos x)(-\sin x) e^{\sin x} - \sin(\cos x) e^{\sin x} \cos x}{[e^{\sin x}]^2} = -\frac{\sin x \cos(\cos x) + \cos x \sin(\cos x)}{e^{\sin x}}. \]

For (c), we have $h(x) = \sin^2(3x) + \cos^2(3x)$ which is the sum of two compositions and so
we can easily use the chain rule to see that

\[ h'(x) = 2 \sin(3x) \cos(3x)(3) + 2 \cos(3x) [-\sin(3x)](3) = 0. \]

Of course, this is obvious as sin2 (3x) + cos2 (3x) = 1 using (2.2) and so its derivative

with respect to x is zero.

Solution to activity 3.14

We have seen that the first four derivatives are given by

\[ f'(x), \quad f''(x) = -f(x), \quad f'''(x) = -f'(x), \quad f^{(4)}(x) = f(x), \]

which returns us to our original function. Indeed, we can then see that the next four
derivatives will be given by

\[ f^{(5)}(x) = f'(x), \quad f^{(6)}(x) = -f(x), \quad f^{(7)}(x) = -f'(x), \quad f^{(8)}(x) = f(x), \]

which, again, returns us to our original function. This means that, spotting the pattern,
we can see that

\[ f^{(n)}(x) = \begin{cases} f(x) & n = 4, 8, \ldots \\ f'(x) & n = 1, 5, 9, \ldots \\ -f(x) & n = 2, 6, 10, \ldots \\ -f'(x) & n = 3, 7, 11, \ldots \end{cases} \]

for $n \geq 1$.

Solution to activity 3.15

To find the first four derivatives of $f(x) = x e^x$, we use the product rule to see that

\[
\begin{aligned}
f'(x) &= (1)(e^x) + (x)(e^x) = (1 + x) e^x, \\
f''(x) &= (1)(e^x) + (1 + x)(e^x) = (2 + x) e^x, \\
f'''(x) &= (1)(e^x) + (2 + x)(e^x) = (3 + x) e^x, \text{ and} \\
f^{(4)}(x) &= (1)(e^x) + (3 + x)(e^x) = (4 + x) e^x.
\end{aligned}
\]

Indeed, spotting the pattern, we can deduce that $f^{(n)}(x) = (n + x) e^x$ for $n \geq 1$.

Solution to activity 3.16

Here we have $f(x) = e^x$ so that $f'(x) = e^x$. Then using (3.1), we see that when $x = 1$ we
have

\[ f'(1) = \frac{y - f(1)}{x - 1} \implies y - e^1 = e^1 (x - 1) \implies y = e\,x, \]

as the equation of the tangent line to the function $f(x) = e^x$ at $x = 1$.

Solution to activity 3.17

Here we have $C(q) = 1000 + 5q + q^2$ and so, when operating at $q = 100$, the increase in
cost given an increase in quantity of one is given by

\[ \Delta C = C(101) - C(100) = (1000 + 5(101) + 101^2) - (1000 + 5(100) + 100^2) = 206. \]

This is pretty close to the approximate answer of 205 that we found in Example 3.19,

especially if we consider this as a relative error of 1/206 = 0.49% (to 2dp) instead of an

absolute error of one.
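The comparison in this solution can be reproduced directly. The sketch below assumes that the figure of 205 from Example 3.19 is the marginal cost $C'(100) = 5 + 2(100)$; the function names are ours and are used for illustration only.

```python
def cost(q):
    """Cost function C(q) = 1000 + 5q + q^2 from the activity."""
    return 1000 + 5 * q + q ** 2

def marginal_cost(q):
    """Derivative C'(q) = 5 + 2q (assumed to be the approximation used in Example 3.19)."""
    return 5 + 2 * q

exact_increase = cost(101) - cost(100)
approx_increase = marginal_cost(100)
relative_error = (exact_increase - approx_increase) / exact_increase

print(exact_increase, approx_increase, f"{relative_error:.2%}")
```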

Solution to activity 3.18

The Maclaurin series for $(1 + x)^2$ is given by

\[ (1 + x)^2 = 1 + 2x + \frac{(2)(1)}{2!} x^2 = 1 + 2x + x^2, \]

as all terms involving $x^n$ with $n \geq 3$ will have a coefficient of zero. Similarly, the
Maclaurin series for $(1 + x)^3$ is given by

\[ (1 + x)^3 = 1 + 3x + \frac{(3)(2)}{2!} x^2 + \frac{(3)(2)(1)}{3!} x^3 = 1 + 3x + 3x^2 + x^3, \]

as all terms involving $x^n$ with $n \geq 4$ will have a coefficient of zero. Of course, this is

exactly what we would get if we just multiplied out the brackets in the usual way!

Solution to activity 3.19

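To use (3.3), we see that $f(x) = \cos(\ln(1 + x))$ gives

\[ f(0) = \cos(\ln 1) = \cos 0 = 1, \]

and then, finding the first four derivatives of $f(x)$, we get

\[ f'(x) = -\frac{\sin(\ln(1 + x))}{1 + x}, \quad f''(x) = \frac{\sin(\ln(1 + x)) - \cos(\ln(1 + x))}{(1 + x)^2}, \]

\[ f'''(x) = \frac{3 \cos(\ln(1 + x)) - \sin(\ln(1 + x))}{(1 + x)^3}, \quad f^{(4)}(x) = -10 \frac{\cos(\ln(1 + x))}{(1 + x)^4}, \]

so that

\[ f'(0) = 0, \quad f''(0) = -1, \quad f'''(0) = 3, \quad f^{(4)}(0) = -10. \]

Putting these into (3.3), we get

\[
\begin{aligned}
\cos(\ln(1 + x)) &= 1 + x(0) + \frac{x^2}{2!}(-1) + \frac{x^3}{3!}(3) + \frac{x^4}{4!}(-10) + \cdots \\
&= 1 - \frac{x^2}{2} + \frac{x^3}{2} - \frac{5}{12} x^4 + \cdots,
\end{aligned}
\]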

and this gives us the fourth-order Maclaurin series for cos(ln(1 + x)) in agreement with

what we saw before in Example 3.29. Notice, however, that this method involved some

fairly complicated differentiation whereas the method in Example 3.29 only involved

some simple algebra!

Solution to activity 3.20

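For values of $y$ around $y = 0$ we have the Maclaurin series

\[ e^y = 1 + y + \frac{y^2}{2!} + \frac{y^3}{3!} + \cdots + \frac{y^n}{n!} + \cdots, \]

and so, setting $y = x - 1$, values of $x$ around $x = 1$ correspond to values of $y$ around
$y = 0$, i.e. we can write

\[ e^{x-1} = 1 + (x - 1) + \frac{(x - 1)^2}{2!} + \frac{(x - 1)^3}{3!} + \cdots + \frac{(x - 1)^n}{n!} + \cdots, \]

which gives us the Taylor series for $e^{x-1}$ for values of $x$ around $x = 1$. So, as
$e^x = e^1 e^{x-1}$, this means that

\[ e^x = e + (x - 1) e + \frac{(x - 1)^2}{2!} e + \frac{(x - 1)^3}{3!} e + \cdots + \frac{(x - 1)^n}{n!} e + \cdots, \]

is the Taylor series for $e^x$ for values of $x$ around $x = 1$ in agreement with what we found
in Example 3.31.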

Solution to activity 3.21

To find the Taylor series for $e^x$ around $x = 2$, we can either use (3.2) or the method we
saw in Activity 3.20.

Method I: Using (3.2), we have $f(x) = e^x$ so that $f(2) = e^2$. We also note, as in
Example 3.24, that $f^{(n)}(x) = e^x$ for $n \geq 1$. Then, to use these derivatives in (3.2), we
need to evaluate them at $x = 2$, i.e. we find that $f^{(n)}(2) = e^2$ for $n \geq 1$. Consequently,
putting these into (3.2), we get

\[ e^x = e^2 + (x - 2) e^2 + \frac{(x - 2)^2}{2!} e^2 + \frac{(x - 2)^3}{3!} e^2 + \cdots + \frac{(x - 2)^n}{n!} e^2 + \cdots. \]

Method II: Using the method of Activity 3.20, we know that for values of $y$ around
$y = 0$ we have the Maclaurin series

\[ e^y = 1 + y + \frac{y^2}{2!} + \frac{y^3}{3!} + \cdots + \frac{y^n}{n!} + \cdots, \]

and so, setting $y = x - 2$, values of $x$ around $x = 2$ correspond to values of $y$ around
$y = 0$, i.e. we can write

\[ e^{x-2} = 1 + (x - 2) + \frac{(x - 2)^2}{2!} + \frac{(x - 2)^3}{3!} + \cdots + \frac{(x - 2)^n}{n!} + \cdots, \]

which gives us the Taylor series for $e^{x-2}$ for values of $x$ around $x = 2$. So, as
$e^x = e^2 e^{x-2}$, this means that

\[ e^x = e^2 + (x - 2) e^2 + \frac{(x - 2)^2}{2!} e^2 + \frac{(x - 2)^3}{3!} e^2 + \cdots + \frac{(x - 2)^n}{n!} e^2 + \cdots, \]

is the Taylor series for $e^x$ for values of $x$ around $x = 2$ in agreement with what we have
just found using the other method.

Exercises

Exercise 3.1

Find the derivatives of the following functions.

(a) $e^{\sin x} \cos x$, (b) $\dfrac{\tan x}{e^{x^2}}$, (c) $\sin(x e^x)$.

Exercise 3.2

Use the compound-angle formulae to show that

\[ \cos x = \sin \left( x + \frac{\pi}{2} \right) \quad\text{and}\quad -\sin x = \cos \left( x + \frac{\pi}{2} \right). \]

Hence use the chain rule to derive the derivative of $\cos x$ from the derivative of $\sin x$.

Exercise 3.3

Verify that the point $(e, e)$ is on the curve with equation

\[ y = x \ln x, \]

and find the equation of the tangent line to the curve at this point.

Consider, for some constants $a$ and $b$, the curve with equation

\[ y = ax^2 + b. \]

For what values of $a$ and $b$ does this curve pass through the point $(e, e)$ with the same
tangent line as the one you found above?

Exercise 3.4

Suppose the demand function for a good is

\[ q^D(p) = \frac{1}{\sqrt{1 + p^4}}. \]

Find the elasticity of demand in terms of $p$ and verify that it is positive if $p > 0$.

Exercise 3.5

Find the fourth-order Maclaurin series for $\ln \left( \dfrac{1 + \sin x}{1 + x} \right)$.

Solutions to exercises

Solution to exercise 3.1

We apply the rules of differentiation quickly as we did in Activity 3.13.

(a) The function $h(x) = e^{\sin x} \cos x$ is a product that has the composition $e^{\sin x}$ as one
of its terms. As such, applying the product rule we get

\[ h'(x) = \left( \cos x \, e^{\sin x} \right)(\cos x) + \left( e^{\sin x} \right)(-\sin x) = (\cos^2 x - \sin x) e^{\sin x}, \]

where we have used the chain rule to differentiate the composition.

(b) The function $h(x) = (\tan x)/e^{x^2}$ is a quotient whose denominator is the
composition $e^{x^2}$. As such, applying the quotient rule we get

\[ h'(x) = \frac{(\sec^2 x)(e^{x^2}) - (\tan x)(2x e^{x^2})}{[e^{x^2}]^2} = \frac{\sec^2 x - 2x \tan x}{e^{x^2}}, \]

where we have used the fact, from Activity 3.6(c), that the derivative of $\tan x$ is
$\sec^2 x$ and the chain rule to differentiate the composition.

Also note that this derivative can be found by writing the function as
$h(x) = (\tan x) e^{-x^2}$ and, if we do this, we would use the product rule instead of the
quotient rule.

(c) The function $h(x) = \sin(x e^x)$ is the composition $\sin x$ after $x e^x$ where the latter
function is a product. As such, applying the chain rule we get

\[ h'(x) = \cos(x e^x) \bigl[ (1) e^x + x (e^x) \bigr] = (1 + x) e^x \cos(x e^x), \]

where we have used the product rule to differentiate $x e^x$.

Solution to exercise 3.2

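Using the compound-angle formulae from (2.5), we have

\[ \sin \left( x + \frac{\pi}{2} \right) = \sin x \cos \frac{\pi}{2} + \cos x \sin \frac{\pi}{2} = \cos x \]

and

\[ \cos \left( x + \frac{\pi}{2} \right) = \cos x \cos \frac{\pi}{2} - \sin x \sin \frac{\pi}{2} = -\sin x, \]

as required. Indeed, notice that we have used the facts, from Activity 2.3, that
$\sin(\pi/2) = 1$ and $\cos(\pi/2) = 0$.

Now, using the chain rule and the derivative of $\sin x$, we see that

\[ \frac{d}{dx} \left[ \sin \left( x + \frac{\pi}{2} \right) \right] = \cos \left( x + \frac{\pi}{2} \right)(1) = \cos \left( x + \frac{\pi}{2} \right), \]

and so, using the two results above, this gives us

\[ \frac{d}{dx} \bigl[ \cos x \bigr] = -\sin x, \]

as required.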

Solution to exercise 3.3

Substituting x = e into y = x ln x we get

y = e ln e = e(1) = e,

and so the point (e, e) lies on this curve. The gradient of the curve at any point is given

by the derivative of f (x) = x ln x and so, using the product rule, we get

\[ f'(x) = (1) \ln(x) + x \left( \frac{1}{x} \right) = \ln(x) + 1. \]

In particular, at $x = e$, the gradient is

\[ f'(e) = \ln(e) + 1 = 1 + 1 = 2, \]

which means that, using (3.1), we get

\[ 2 = \frac{y - e}{x - e} \implies y - e = 2(x - e) \implies y = 2x - e, \]

as the equation of the tangent line to the curve $y = x \ln x$ at the point $(e, e)$.

The curve $y = ax^2 + b$ will have a tangent line at $(e, e)$ which is the same as the one we
have just found if, firstly, the curve goes through the point $(e, e)$, i.e. $a$ and $b$ must satisfy

\[ e = a e^2 + b, \]

and, secondly, it has the same gradient at $e$, i.e. if the derivative of $g(x) = ax^2 + b$ at
$x = e$ is two. That is, as

\[ g'(x) = 2ax \implies g'(e) = 2ae, \]

we need

\[ 2ae = 2 \implies a = \frac{1}{e}, \]

and, substituting this into the first condition, we get

\[ e = \left( \frac{1}{e} \right) e^2 + b \implies b = e - e = 0. \]

Consequently, we see that when a = 1/ e and b = 0 the curve y = ax2 + b passes through

the point (e, e) with the same tangent line as the one we found above.

Solution to exercise 3.4

We have the demand function

\[ q^D(p) = \frac{1}{\sqrt{1 + p^4}} = (1 + p^4)^{-\frac{1}{2}}, \]

and so, setting $q = q^D(p)$, we can use the chain rule to get the derivative

\[ q'(p) = -\frac{1}{2} (1 + p^4)^{-\frac{3}{2}} (4p^3) = -\frac{2p^3}{(1 + p^4)^{\frac{3}{2}}}. \]

Then, using the definition of the elasticity of demand from Section 3.3.3, we have

\[ \varepsilon(p) = -\frac{p}{q} q'(p) = -\frac{p}{(1 + p^4)^{-\frac{1}{2}}} \left( -\frac{2p^3}{(1 + p^4)^{\frac{3}{2}}} \right) = \frac{2p^4}{1 + p^4}, \]

in terms of $p$. Indeed, when $p > 0$, we have $p^4 > 0$ and $1 + p^4 > 0$, which means that
$\varepsilon(p) > 0$ too.
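As a sanity check on this answer, the elasticity can be estimated numerically and compared with the closed form. This sketch assumes the sign convention $\varepsilon(p) = -p\,q'(p)/q(p)$, which is what makes the stated result positive; the function names and the step size `h` are choices made here for illustration.

```python
def demand(p):
    """Demand function q(p) = 1 / sqrt(1 + p^4)."""
    return (1 + p ** 4) ** -0.5

def elasticity_formula(p):
    """Closed form 2p^4 / (1 + p^4) found in the solution."""
    return 2 * p ** 4 / (1 + p ** 4)

def elasticity_numeric(p, h=1e-6):
    """Estimate -p q'(p) / q(p), using a central difference for q'(p)."""
    slope = (demand(p + h) - demand(p - h)) / (2 * h)
    return -p * slope / demand(p)

for p in (0.5, 1.0, 2.0):
    print(f"p = {p}: formula = {elasticity_formula(p):.6f}, numeric = {elasticity_numeric(p):.6f}")
```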

Solution to exercise 3.5

We start by noticing that it really is much easier to make use of the standard Maclaurin

series rather than trying to use (3.3) directly on the given function. Especially as, in

order to apply (3.3), we would need to find the first four derivatives of the function to

answer this question and this would get very messy very quickly! Indeed, if we decide to

use the standard Maclaurin series, two methods present themselves.

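Method I: We start by simplifying the function by using the laws of logarithms from
Section 2.1.4. This gives us

\[ \ln \left( \frac{1 + \sin x}{1 + x} \right) = \ln(1 + \sin x) - \ln(1 + x), \]

and so, we can easily use the Maclaurin series for $\ln(1 + x)$ from Section 3.4.1, i.e.

\[ \ln(1 + x) = x - \frac{x^2}{2} + \frac{x^3}{3} - \frac{x^4}{4} + \cdots, \]

to get the second term in this difference. Then, using the Maclaurin series for $\sin x$, also
from Section 3.4.1, we have

\[ \sin x = x - \frac{x^3}{3!} + \cdots, \]

which means that the first term in this difference is

\[
\begin{aligned}
\ln(1 + \sin x) &= \ln \left( 1 + \left[ x - \frac{x^3}{3!} + \cdots \right] \right) \\
&= \left( x - \frac{x^3}{3!} + \cdots \right) - \frac{1}{2} \left( x - \frac{x^3}{3!} + \cdots \right)^2 + \frac{1}{3} \left( x - \cdots \right)^3 - \frac{1}{4} \left( x - \cdots \right)^4 + \cdots,
\end{aligned}
\]

where we have used the Maclaurin series for $\ln(1 + x)$ again in the second line. Now, as
we want to keep terms up to $x^4$, we can see that the brackets in the second term give us

\[ \left( x - \frac{x^3}{3!} + \cdots \right) \left( x - \frac{x^3}{3!} + \cdots \right) = x^2 - 2 (x) \left( \frac{x^3}{3!} \right) + \cdots = x^2 - \frac{x^4}{3} + \cdots, \]

where, here, we're trying to make it clear that each term that arises from this product is
obtained by multiplying out the relevant brackets. Further, we see that the brackets in
the last two terms will give us $x^3$ and $x^4$ respectively. Overall, then, we have

\[
\begin{aligned}
\ln(1 + \sin x) &= \left( x - \frac{x^3}{3!} \right) - \frac{1}{2} \left( x^2 - \frac{x^4}{3} \right) + \frac{1}{3} x^3 - \frac{1}{4} x^4 + \cdots \\
&= x - \frac{x^2}{2} + \frac{x^3}{6} - \frac{x^4}{12} + \cdots,
\end{aligned}
\]

for the first part of our difference. Putting these together in our expression for the
function, we then have

\[
\begin{aligned}
\ln \left( \frac{1 + \sin x}{1 + x} \right) &= \left( x - \frac{x^2}{2} + \frac{x^3}{6} - \frac{x^4}{12} + \cdots \right) - \left( x - \frac{x^2}{2} + \frac{x^3}{3} - \frac{x^4}{4} + \cdots \right) \\
&= -\frac{x^3}{6} + \frac{x^4}{6} + \cdots.
\end{aligned}
\]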

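Method II: We could also use the Maclaurin series for $\sin x$ which we saw above to get

\[ 1 + \sin x = 1 + x - \frac{x^3}{3!} + \cdots, \]

together with the series

\[ \frac{1}{1 + x} = (1 + x)^{-1} = 1 - x + x^2 - x^3 + x^4 - \cdots, \]

which, with $r = -1$, follows from a simple application of the Maclaurin series for
$(1 + x)^r$ that we saw in Example 3.25. This means that we have

\[
\begin{aligned}
\frac{1 + \sin x}{1 + x} &= \left( 1 + x - \frac{x^3}{3!} + \cdots \right) \left( 1 - x + x^2 - x^3 + x^4 - \cdots \right) \\
&= 1 \left( 1 - x + x^2 - x^3 + x^4 - \cdots \right) + x \left( 1 - x + x^2 - x^3 + \cdots \right) - \frac{x^3}{3!} \left( 1 - x + \cdots \right) + \cdots \\
&= 1 - \frac{x^3}{3!} + \frac{x^4}{3!} + \cdots,
\end{aligned}
\]

if we want to keep terms up to $x^4$. Then, using the Maclaurin series for $\ln(1 + x)$ which
we saw above, we get

\[ \ln \left( \frac{1 + \sin x}{1 + x} \right) = \ln \left( 1 + \left[ -\frac{x^3}{3!} + \frac{x^4}{3!} + \cdots \right] \right) = -\frac{x^3}{3!} + \frac{x^4}{3!} + \cdots, \]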

and this gives us the same fourth-order Maclaurin series as the one we found using the

other method.
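As a rough cross-check of the two methods, the following Python sketch carries out the same truncated-series algebra with exact fractions. The helper names `mul` and `log1p_series` are illustrative choices made here and are not part of the guide; the code simply discards every power above $x^4$, as we did by hand.

```python
from fractions import Fraction

ORDER = 4  # keep terms up to x^4, as in the worked example


def mul(p, q):
    """Multiply two coefficient lists, discarding powers above ORDER."""
    out = [Fraction(0)] * (ORDER + 1)
    for i, a in enumerate(p):
        for j, b in enumerate(q):
            if i + j <= ORDER:
                out[i + j] += a * b
    return out


def log1p_series(u):
    """Substitute a series u (zero constant term) into ln(1 + u)."""
    total = [Fraction(0)] * (ORDER + 1)
    power = [Fraction(1)] + [Fraction(0)] * ORDER  # starts as u^0
    for k in range(1, ORDER + 1):
        power = mul(power, u)  # now u^k
        sign = 1 if k % 2 == 1 else -1
        total = [t + Fraction(sign, k) * c for t, c in zip(total, power)]
    return total


# Coefficients of x^0, x^1, ..., x^4
x = [Fraction(c) for c in (0, 1, 0, 0, 0)]
sin_x = [Fraction(0), Fraction(1), Fraction(0), Fraction(-1, 6), Fraction(0)]

# Method I: ln(1 + sin x) - ln(1 + x)
method_1 = [a - b for a, b in zip(log1p_series(sin_x), log1p_series(x))]

# Method II: ln of the product (1 + sin x)(1 - x + x^2 - x^3 + x^4)
one_plus_sin = [Fraction(1)] + sin_x[1:]
inverse = [Fraction(c) for c in (1, -1, 1, -1, 1)]
ratio = mul(one_plus_sin, inverse)
method_2 = log1p_series([Fraction(0)] + ratio[1:])

print(method_1)
print(method_2)
```

Both lists hold the coefficients of $1, x, x^2, x^3, x^4$, so agreement between them mirrors the agreement between the two hand calculations.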


Chapter 4

One-variable optimisation

Essential reading

Binmore and Davies (2002) Sections 4.1–4.3.

Anthony and Biggs (1996) Chapters 8 and 9.

Further reading

Simon and Blume (1994) Chapter 3.

Adams and Essex (2010) Sections 4.4–4.6.

Aims and objectives

The objectives of this chapter are as follows.

To see what first and second-order derivatives tell us about functions.

To see how derivatives and other information about a function can be used to

sketch curves.

To use derivatives to solve problems where a function needs to be optimised.

Specific learning outcomes can be found near the end of this chapter.

4.1

Having seen how to find derivatives in the previous chapter, we now consider what they

tell us about a function. In particular, we will see that the first-order derivatives of a

function tell us where the function is increasing, stationary or decreasing; and its

second-order derivatives tell us where the function is convex or concave. Indeed, once we

have access to this information about a function we will be able to do two things.

Firstly, we will be able to sketch the curve that represents the graph of a function; and

secondly, we will be able to see where a function is optimised, i.e. we will be able to find

the points where the function takes its largest and smallest values.


4. One-variable optimisation

4.2

decreasing. If it is neither increasing nor decreasing, we say that the function is

stationary. As we shall see in Section 4.5, stationary points are important when we are

finding the points where a function is optimised.

4.2.1

increasing when the values of f (x) get larger as x gets larger, and

decreasing when the values of f (x) get smaller as x gets larger.

Or, more precisely, if a and b are any two points in an interval, I, such that a < b, then

f is increasing on I if f (a) < f (b), and

f is decreasing on I if f (a) > f (b).

Indeed, we can see that this makes sense by considering the two functions illustrated in

Figure 4.1.

Figure 4.1: As x increases, (a) f is increasing as its values get larger and (b) f is decreasing as its values get smaller. This can also be seen by taking two values of x, say a and b, such that a < b. In (a), the function is increasing because we have f(a) < f(b) and in (b) the function is decreasing because we have f(a) > f(b).

However, of more interest here is the fact that we can use derivatives to determine whether a function is increasing or decreasing over some interval, I. To see how this works, consider that the first-order Taylor approximation to f(x) around x = a is given by

\[ f(x) \simeq f(a) + (x - a) f'(a), \]

and to make this a good approximation, we want $x - a$ to be small. So, if we now consider another value of x, say x = b, where b > a and $b - a$ is small, we see that this approximation gives us

\[ f(b) \simeq f(a) + (b - a) f'(a). \]

Now, $b - a > 0$, so we just need to know the sign of $f'(a)$ to determine whether f(b) is greater or less than f(a), i.e. whether f is increasing or decreasing as we move from a to b. Indeed, we see that

if $f'(a) > 0$, then f is increasing at a because f(b) > f(a), and

if $f'(a) < 0$, then f is decreasing at a because f(b) < f(a).

Indeed, by letting a be any value of x, we can generalise this to obtain the following useful result. Let I be an interval, then

if $f'(x) > 0$ for $x \in I$, then f is increasing on I, and

if $f'(x) < 0$ for $x \in I$, then f is decreasing on I.

Example 4.1 Determine the intervals on which the function $f(x) = x^3 - 2x^2 - 15x$ is (a) increasing and (b) decreasing.

Differentiating the function with respect to x, we find that

\[ f'(x) = 3x^2 - 4x - 15. \]

This factorises to give us

\[ f'(x) = (3x + 5)(x - 3), \]

and so, by looking at what is happening away from the points x = −5/3 and x = 3 where $f'(x) = 0$, we see that the sign of this derivative can be found by considering the signs of its two factors, i.e.

\[
\begin{array}{c|ccc}
 & x < -\frac{5}{3} & -\frac{5}{3} < x < 3 & 3 < x \\ \hline
3x + 5 & - & + & + \\
x - 3 & - & - & + \\ \hline
f'(x) & + & - & +
\end{array}
\]

This means that the function is (a) increasing on the intervals x < −5/3 and x > 3 where $f'(x) > 0$ and (b) decreasing on the interval −5/3 < x < 3 where $f'(x) < 0$ as illustrated in Figure 4.2(a).
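The sign table can be checked numerically by sampling one point from each interval. The sketch below does this in Python; the function names `f_prime` and `behaviour` and the choice of sample points are illustrative assumptions, not part of the guide.

```python
from fractions import Fraction


def f_prime(x):
    """Derivative of f(x) = x^3 - 2x^2 - 15x."""
    return 3 * x**2 - 4 * x - 15


def behaviour(x):
    """Describe f at x using the sign of f'(x)."""
    slope = f_prime(x)
    if slope > 0:
        return "increasing"
    if slope < 0:
        return "decreasing"
    return "stationary"


# One sample point from each interval, plus the two points where f'(x) = 0
for x in (-2, Fraction(-5, 3), 0, 3, 4):
    print(x, behaviour(x))
```

Sampling a single point per interval is enough here only because $f'(x)$ is continuous and cannot change sign without passing through one of its two zeros.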

A useful consequence of this is that it tells us something about the tangent lines to the function f(x) at points where it is increasing or decreasing. Recall, from Section 3.3.2, that the tangent line to f(x) at the point x = a has an equation given by

\[ y = f(a) + (x - a) f'(a), \]

and, in particular, the gradient of the tangent line is given by $f'(a)$. This means that, if f(x) is increasing (or decreasing) at x = a, then $f'(a)$ will be positive (or negative) and this, in turn, means that the tangent line at this point will also be an increasing (or decreasing) function of x. This will be useful in a moment, but for now, we can see how this works by looking at Figure 4.3.


Figure 4.2: The graph of $f(x) = x^3 - 2x^2 - 15x$ indicating the points relevant to (a) Examples 4.1, 4.2 and 4.4; (b) Examples 4.5 and 4.6.

Figure 4.3: (a) When f(x) is increasing at x = a, its tangent line at the point (a, f(a)) will also be increasing as the gradient of the curve (and hence the gradient of the tangent line) at this point is positive. (b) When f(x) is decreasing, its tangent line at the point (a, f(a)) will also be decreasing as the gradient of the curve (and hence the gradient of the tangent line) at this point is negative.

At this point, we know what a positive or negative derivative tells us about a function

but you may be wondering what happens when the derivative is neither positive nor

negative. That is, what happens when the derivative is zero? This is very important and

we now turn our attention to that.

4.2.2

Stationary points

When we find a point, say x = a, that makes $f'(x) = 0$, the tangent line at that point is horizontal and its Cartesian equation is given by

\[ y = f(a). \]

This means that we will have a function which may look like the one illustrated in Figure 4.4. We call such points, i.e. points where $f'(x) = 0$, stationary points.

Figure 4.4: The point x = a is a stationary point of the function f(x) as $f'(a) = 0$. Observe that this means that the tangent line to f(x) at the point (a, f(a)) is a horizontal line.

There are, essentially, four different kinds of stationary point that we will encounter and

these depend on how the function is changing as we move through the stationary point

in the direction of increasing x. In particular, as x is increasing through a stationary

point at x = a, we have a

local maximum if f changes from being increasing to being decreasing at the stationary point, and a

local minimum if f changes from being decreasing to being increasing at the stationary point.

Of course, f could also be increasing (or decreasing) on both sides of the stationary

point and in these cases we have a point of inflection. These four possibilities are

illustrated in Figure 4.5 and, in particular, we see that the stationary point we saw

earlier in Figure 4.4 is a local minimum.

This provides us with a way of classifying any stationary points we find by looking at

the sign of the first-order derivative of the function as we move through a stationary

point. This is called the first-order derivative test and it runs as follows. As we move

through the stationary point in the direction of increasing x, if we find that:

$f'(x)$ changes from positive to negative, i.e. the function goes from being increasing to being decreasing as we pass through the stationary point, then the stationary point is a local maximum.

$f'(x)$ changes from negative to positive, i.e. the function goes from being decreasing to being increasing as we pass through the stationary point, then the stationary point is a local minimum.

And, if the sign of $f'(x)$ does not change, i.e. if the function is increasing (or decreasing) on both sides of the stationary point, then the stationary point is a point of inflection.

Figure 4.5: The four kinds of stationary point: (a) a point of inflection, (b) a local maximum, (c) a local minimum and (d) a point of inflection.

Example 4.2 Find the stationary points of the function given in Example 4.1 and classify them by using the first-order derivative test.

We saw in Example 4.1 that the derivative of the function can be written as

\[ f'(x) = (3x + 5)(x - 3), \]

and so the stationary points of this function, i.e. the points that make $f'(x) = 0$, occur when x = −5/3 and x = 3 as you can see in Figure 4.2(a).

We can also use what we saw in Example 4.1 to see that, according to the first-order derivative test, the stationary point that occurs when:

x = −5/3 is a local maximum as f changes from being increasing to being decreasing (i.e. $f'$ changes from positive to negative) at the stationary point.

x = 3 is a local minimum as f changes from being decreasing to being increasing (i.e. $f'$ changes from negative to positive) at the stationary point.

This, of course, can be clearly seen in Figure 4.2(a).
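The first-order derivative test can be sketched as a small Python function that compares the sign of the derivative just to the left and just to the right of a stationary point. The name `first_derivative_test` and the step size `h` are assumptions made for this illustration; the step must be small enough that no other stationary point lies within `h` of the one being classified.

```python
from fractions import Fraction


def f_prime(x):
    """Derivative of f(x) = x^3 - 2x^2 - 15x from Example 4.1."""
    return 3 * x**2 - 4 * x - 15


def first_derivative_test(df, a, h=Fraction(1, 100)):
    """Classify the stationary point a from the sign of df either side of it."""
    left, right = df(a - h), df(a + h)
    if left > 0 and right < 0:
        return "local maximum"
    if left < 0 and right > 0:
        return "local minimum"
    return "point of inflection"


for a in (Fraction(-5, 3), 3):
    print(a, first_derivative_test(f_prime, a))
```

Exact fractions are used so that the stationary point x = −5/3 is represented without rounding error.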


4.2.3

\[ \varepsilon(p) = -\frac{p}{q}\, q'(p), \]

where $q = q^D(p)$ is the demand function and this told us how changes in price will cause

changes in the quantity purchased by the consumers. Now, of course, from the point of

view of a supplier, you are not interested in such changes per se, but in how such

changes affect your revenue. That is, will a change in price, together with the

corresponding change in quantity purchased, lead to an increase or decrease in your

revenue?

To answer this question, we assume that the supplier is a monopoly, i.e. it is the only

supplier of a given product to the market. In such cases, the revenue generated by

selling at a price per unit, p, is given by

R(p) = pq,

where $q = q^D(p)$ is the quantity that will be purchased by the consumers at this price. Indeed, using the product rule to differentiate this with respect to p, we find that

\[ R'(p) = q + p\,q'(p) = q\left[1 + \frac{p}{q}\, q'(p)\right] = q\,[1 - \varepsilon(p)], \]

using the definition of $\varepsilon(p)$. So, as q > 0 for this to be economically meaningful, we have:

If $\varepsilon(p) > 1$, we see that $R'(p) < 0$ and so a small increase in price leads to a decrease in revenue. In such cases we say that demand for the product is elastic.

If $\varepsilon(p) < 1$, we see that $R'(p) > 0$ and so a small increase in price leads to an increase in revenue. In such cases we say that demand for the product is inelastic.

Thus, even though an increase in price will usually lead to a decrease in the quantity

that the consumers will demand, the value of the elasticity (i.e. whether it is greater

than or less than one) determines how such changes will affect the revenue (i.e. whether

it will decrease or increase).

Example 4.3 Suppose that the demand function for a good is given by $q^D(p) = 20 - 2p$. Determine the values of p that make the demand (a) elastic and (b) inelastic.

In this case, we have $q = q^D(p) = 20 - 2p$ and so the elasticity of demand is given by

\[ \varepsilon(p) = -\frac{p}{q}\, q'(p) = -\frac{p}{20 - 2p}\,(-2) = \frac{p}{10 - p}, \]

as long as $p \neq 10$. And, of course, we need values of p where $0 \le p \le 10$ in order for the demand function to be economically meaningful.

So, for (a), where we want the values of p that make demand elastic, we see that

\[ \varepsilon(p) > 1 \implies \frac{p}{10 - p} > 1 \implies p > 10 - p, \]

as $10 - p > 0$ since $0 \le p < 10$. This means that demand is elastic if p > 5 and, in particular, if we have $5 < p < 10$ a small increase in price will lead to a decrease in revenue.

For (b), similar reasoning shows us that demand is inelastic if p < 5 and, in particular, if we have $0 \le p < 5$ a small increase in price will lead to an increase in revenue.
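To see the link between elasticity and revenue in numbers, the following Python sketch evaluates the elasticity and the derivative of the revenue for this demand function. The function names and the constant `DEMAND_SLOPE` are illustrative assumptions; the formulas are the ones used in the example, with the elasticity taken to be $-(p/q)\,q'(p)$ and the revenue derivative taken to be $q\,[1 - \text{elasticity}]$.

```python
from fractions import Fraction

DEMAND_SLOPE = -2  # q'(p) for the demand function q = 20 - 2p


def demand(p):
    """Quantity demanded at price p."""
    return 20 - 2 * p


def elasticity(p):
    """Elasticity of demand, -(p/q) q'(p); undefined at p = 10 where q = 0."""
    p = Fraction(p)
    return -(p / demand(p)) * DEMAND_SLOPE


def revenue_derivative(p):
    """R'(p) written as q [1 - elasticity]."""
    return demand(p) * (1 - elasticity(p))


for p in (4, 5, 6):
    print(p, elasticity(p), revenue_derivative(p))
```

At p = 4 the elasticity is below one and the revenue derivative is positive, at p = 6 the elasticity is above one and the revenue derivative is negative, and p = 5 is the boundary case where the elasticity equals one.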

4.3

The second-order derivative of a function can allow us to infer useful information about

the shape of a function. For instance, it can allow us to infer whether a stationary

point is a local maximum or a local minimum and, more generally, whether the function

is convex or concave. Indeed, once we understand convexity and concavity, we will be in

a position to extend our understanding of what we mean by a point of inflection.

4.3.1

The key to understanding the link between the shape of a function and its second-order derivative is the second-order Taylor approximation to f(x) around x = a, i.e.

\[ f(x) \simeq f(a) + (x - a) f'(a) + \frac{(x - a)^2}{2} f''(a), \]

and we know that this is a good approximation as long as $x - a$ is small. Now, to start with, let's suppose that f(x) has a stationary point at x = a, i.e. $f'(a) = 0$, so that our second-order Taylor approximation becomes

\[ f(x) \simeq f(a) + \frac{(x - a)^2}{2} f''(a) \implies f(x) - f(a) \simeq \frac{(x - a)^2}{2} f''(a). \]

Here, for all x near the stationary point, the sign of $f(x) - f(a)$ on the left-hand-side, i.e. the relative magnitude of f(x) and f(a), is determined by the sign of $f''(a)$ on the right-hand-side. That is, the sign of $f(x) - f(a)$ for x near the stationary point is determined by the value of the second-order derivative at the stationary point. Indeed, we see that:

If $f''(a) > 0$, then f(x) > f(a) for all x near to a and so the function always lies above the horizontal tangent line at x = a. This means that the stationary point is a local minimum as in Figure 4.5(c).

If $f''(a) < 0$, then f(x) < f(a) for all x near to a and so the function always lies below the horizontal tangent line at x = a. This means that the stationary point is a local maximum as in Figure 4.5(b).

Thus, the sign of the second-order derivative at a stationary point allows us to infer whether the stationary point is a local maximum or a local minimum. When we classify stationary points in this way, we call it the second-order derivative test. However, observe that if $f''(a) = 0$, then the second-order Taylor approximation tells us nothing useful about the shape of the function as it reduces to $f(x) \simeq f(a)$.


Example 4.4 Use the second-order derivative test to classify the stationary points of the function in Example 4.1.

We saw in Example 4.1 that the first-order derivative of f is

\[ f'(x) = 3x^2 - 4x - 15, \]

and, in Example 4.2, we saw that its stationary points occur when x = −5/3 and x = 3. To use the second-order derivative test, we note that

\[ f''(x) = 6x - 4, \]

and so:

when x = −5/3, $f''(x) = -14 < 0$ and so this is a local maximum,

when x = 3, $f''(x) = 14 > 0$ and so this is a local minimum,

in agreement with what we found in Example 4.2.
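The second-order derivative test is easy to express in code. The Python sketch below is illustrative only (the name `second_derivative_test` is an assumption made here); note that it reports "inconclusive" when the second-order derivative vanishes, since the test tells us nothing in that case.

```python
from fractions import Fraction


def f_second(x):
    """Second-order derivative of f(x) = x^3 - 2x^2 - 15x."""
    return 6 * x - 4


def second_derivative_test(d2f, a):
    """Classify a stationary point a using the sign of d2f(a)."""
    curvature = d2f(a)
    if curvature > 0:
        return "local minimum"
    if curvature < 0:
        return "local maximum"
    return "inconclusive"


for a in (Fraction(-5, 3), 3):
    print(a, f_second(a), second_derivative_test(f_second, a))
```

The function assumes that `a` is already known to be a stationary point; it does not check that the first-order derivative is zero there.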

4.3.2

More generally, the sign of the second-order derivative of a function tells us whether a function is convex or concave. Indeed, we find that:

If $f''(x) > 0$ on some interval, we say that f is convex on that interval.

If $f''(x) < 0$ on some interval, we say that f is concave on that interval.

To get an idea of what this means, consider that a convex function on an interval, I, has $f''(x) > 0$ for all $x \in I$. So, if we take any particular point, say $a \in I$, the tangent line to f at x = a has an equation given by

\[ y = f(a) + (x - a) f'(a), \]

and so, our second-order Taylor approximation can be written as

\[ f(x) \simeq y + \frac{(x - a)^2}{2} f''(a). \]

Now, as $f''(a) > 0$ (recall that $a \in I$ too), we see that f(x) > y for all $x \in I$ where $x \neq a$, i.e. these values of f always lie above the values from the tangent line to f at x = a, as illustrated in Figure 4.6(a). But, of course, we can use any $a \in I$ when we run this argument and so a convex function is one which lies above all of its tangent lines, as illustrated in Figure 4.6(b). In particular, a function must be convex in the neighbourhood of a local minimum.

A similar argument can be given to show that a concave function always lies below all

of its tangent lines so that, in particular, a function must be concave in the

neighbourhood of a local maximum.

Figure 4.6: The relationship between a convex function and its tangent lines. (a) When changing the value of x, we can see that the values of f(x) are greater than the corresponding values of y from the tangent line to f at a, i.e. f lies above this tangent line. (b) By changing the value of a, we can see that f lies above all of its tangent lines.

Activity 4.1 Using an argument similar to the one above, explain why a concave

function always lies below all of its tangent lines.

This gives us another, more visual, way of deciding whether a function is convex or

concave, namely:

A function is convex on some interval if it lies above all of its tangent lines in that

interval.

A function is concave on some interval if it lies below all of its tangent lines in that

interval.

And, we can see how this all works by continuing with our example.

Example 4.5 Determine the intervals on which the function in Example 4.1 is (a) convex and (b) concave.

In Example 4.4 we saw that the second-order derivative of the function from Example 4.1 is given by

\[ f''(x) = 6x - 4, \]

so we find that

$f''(x) > 0$ when $6x - 4 > 0$ which means that x > 2/3, and

$f''(x) < 0$ when $6x - 4 < 0$ which means that x < 2/3.

This means that the function is convex on the interval x > 2/3 where $f''(x) > 0$ and concave on the interval x < 2/3 where $f''(x) < 0$ as illustrated in Figure 4.2(b). Indeed, when looking at this figure, observe that when x > 2/3 the function lies above all of its tangent lines in that interval and that when x < 2/3 the function lies below all of its tangent lines in that interval.


4.3.3

Points of inflection

Not all points of inflection are stationary points like the ones we saw in Section 4.2.2.

More generally, a point of inflection is a point where a function changes from being

convex to concave (or vice versa) in a certain well-defined way. Technically, we say that:

If $f''(a) = 0$ and $f''(x)$ changes sign at x = a, then f has a point of inflection at a.

As such, we can see that the points indicated in Figure 4.7 as well as the ones we saw

earlier in Figure 4.5(a) and (d) are points of inflection although, of course, only the ones

in Figure 4.5(a) and (d) are stationary points as well.

Figure 4.7: A point of inflection where f changes from (a) convex to concave at a and (b) concave to convex at a. Neither of these is a stationary point because neither of them has a horizontal tangent line, i.e. $f'(a) \neq 0$ in both cases.

Example 4.6

We saw in Example 4.5 that the second-order derivative changes sign when x = 2/3 and, furthermore, we can see that $f''(2/3) = 0$. This means that the function in Example 4.1 has a point of inflection when x = 2/3.

Indeed, looking at Figure 4.2(b), we can see that when x = 2/3, the function changes from being concave to convex as we should expect from a point of inflection. However, this point of inflection is not a stationary point because $f'(x) \neq 0$ when x = 2/3.
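The two-part condition for a point of inflection (the second-order derivative is zero and changes sign) can be checked with a short Python sketch. The name `is_point_of_inflection` and the step size `h` are assumptions made for this illustration; `h` must be small enough that the second-order derivative has no other zero within `h` of the point being tested.

```python
from fractions import Fraction


def f_prime(x):
    """First-order derivative of f(x) = x^3 - 2x^2 - 15x."""
    return 3 * x**2 - 4 * x - 15


def f_second(x):
    """Second-order derivative of f(x) = x^3 - 2x^2 - 15x."""
    return 6 * x - 4


def is_point_of_inflection(d2f, a, h=Fraction(1, 100)):
    """True when d2f(a) = 0 and d2f has opposite signs at a - h and a + h."""
    return d2f(a) == 0 and d2f(a - h) * d2f(a + h) < 0


a = Fraction(2, 3)
print(is_point_of_inflection(f_second, a))  # sign change in f'' at x = 2/3?
print(f_prime(a))                           # non-zero, so not a stationary point
```

Applying the same check to a function such as $x^4 - 1$, whose second-order derivative $12x^2$ is zero at x = 0 but does not change sign there, returns False.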

It is, perhaps, worth stressing that the condition $f''(a) = 0$ on its own is not enough to guarantee that we have a point of inflection. For instance, the two functions illustrated in Figure 4.8 both have $f''(0) = 0$, but in neither case does the second-order derivative change sign and so we do not have a point of inflection.

Activity 4.2 Show that $f''(0) = 0$ for both of the functions illustrated in Figure 4.8. How can we infer that they have those shapes by looking at (a) the first-order derivative and (b) the second-order derivative of the function?

Figure 4.8: The functions (a) $f(x) = x^4 - 1$ and (b) $f(x) = 1 - x^4$. Both of these functions have $f''(0) = 0$ but neither of them has a point of inflection. (a) This is convex on both sides of x = 0 and the function has a local minimum at that point. (b) This is concave on both sides of x = 0 and the function has a local maximum at that point. (The dashed curves in these figures represent the curves $y = x^2 - 1$ in (a) and $y = 1 - x^2$ in (b) for comparison.)

It is also worth noting that the condition that $f''(x)$ changes sign at x = a on its own is not enough to guarantee that we have a point of inflection either. Of course, if $f''(x)$ is changing sign at x = a and $f''(a)$ exists, we must have $f''(a) = 0$. But, although we do not dwell on it here, sometimes we may encounter functions where $f''(a)$ does not exist even though $f''(x)$ changes sign at x = a. We will briefly consider what happens in these cases when we look at cusps and asymptotes in Section 4.4.3.

4.4

Curve sketching

One useful application of this material on derivatives and what they tell us about the

shape of a function is curve sketching. The aim here is to illustrate the behaviour of

the curve described by the equation y = f (x) by picking out its main features and

where these features occur by means of a sketch. For most functions we will deal with,

these features include any points where the curve may cross the axes and the location

and nature of any stationary points. But, it may also be necessary to assess how the curve behaves as $x \to \pm\infty$ and, in particular, to assess whether the function has any asymptotes. A general method for sketching the curve y = f(x) would therefore involve

us thinking about the following:

x-intercepts: The x-axis is given by the equation y = 0 and so the curve y = f (x)

crosses the x-axis at any point (x, 0) for which f (x) = 0. Solving this equation will

therefore give us the x-intercepts of the curve if there are any.

y-intercept: The y-axis is given by the equation x = 0 and so the curve crosses the

y-axis at the point (0, y) for which y = f (0). As f is a function, there can be only

one such point and this is the y-intercept.

Finding stationary points: We can find the stationary points, as we saw above, by solving the equation $f'(x) = 0$.

Classifying stationary points: We can also determine whether each of the stationary points is a local maximum, local minimum or point of inflection by using the methods outlined above.

Limiting behaviour in the x-direction: We can determine how f(x) is behaving as $x \to \infty$ and as $x \to -\infty$.

Of course, in certain cases, it may also be advantageous to think carefully about the

intervals in which the function is increasing (or decreasing) or whether the function is

convex (or concave). But, generally, the method above should suffice when we sketch

most functions.

In particular, observe that a sketch is very different from a plot. A plot involves plotting

certain points and joining them up with little regard to any interesting behaviour the

curve may be exhibiting elsewhere. A sketch, on the other hand, isolates any interesting

behaviour the curve may be exhibiting (such as the ones listed above) and concentrates

on these. Please be aware that there is a difference and in this course, we will always

want to see sketches and not plots!

To see how we can implement the method above, we will start by sketching the

relatively simple curves that arise when f is a polynomial. We will then consider how

we would proceed when the functions are differentiable, but involve other elementary

functions. Then, just so that we are aware of some possible complications, we look at

what happens when our function fails to be differentiable at some points.

4.4.1

Given what we have seen so far, the only real obstacle to sketching a polynomial is an

understanding of the limiting behaviour of this kind of function. The key result here is

that, if f(x) is a polynomial, its behaviour as x gets arbitrarily large in magnitude (that is to say, as $x \to \infty$ or $x \to -\infty$) is determined solely by its leading term, i.e. the one with the highest power of x. Then, with this in mind, we can look at the term with the highest power of x, let's say that this is $x^n$, and note that:

if n is even, then $x^n \to \infty$ as $x \to \infty$ and as $x \to -\infty$; whereas

if n is odd, then $x^n \to \infty$ as $x \to \infty$ and $x^n \to -\infty$ as $x \to -\infty$.

Using these facts and noting how the sign of the coefficient of the term with the highest

power of x can influence the sign of the limit, we can determine the limiting behaviour

of any polynomial.

Activity 4.3 Suppose that f(x) is a polynomial and that, for some constants $a \neq 0$ and $n \in \mathbb{N}$, the term in this polynomial with the highest power of x is $ax^n$. Determine the behaviour of f(x) as $x \to \infty$ and as $x \to -\infty$ in the cases which arise according to whether a is positive or negative and whether n is even or odd.

We can now see how to sketch some polynomials and we start by seeing how to sketch

the function that we have been considering throughout this chapter.


Example 4.7 Sketch the curve y = f(x) where f(x) is the function in Example 4.1.

From the earlier examples in this section, we know quite a lot about this function and, in particular, we have found and classified its stationary points. But, to sketch this curve, we need to find a bit more information, namely its

x-intercepts: These occur when y = 0 and so we solve the equation given by f(x) = 0, i.e.

\[ x^3 - 2x^2 - 15x = 0, \]

which, on taking out the common factor of x and factorising the remaining quadratic, gives us

\[ x(x^2 - 2x - 15) = 0 \implies x(x - 5)(x + 3) = 0. \]

Thus, the x-intercepts occur when x = −3, x = 0 and x = 5.

y-intercept: This occurs when x = 0 and so using y = f(0) we see that the y-intercept occurs when y = 0. Note, in particular, that this means that the curve goes through the origin (as we should have expected since one of the x-intercepts occurs when x = 0).

stationary points: We have found the x-coordinates of the stationary points and classified them above (see, for instance, Example 4.2). So, all we need to do here, is use y = f(x) to find the values of y at these points so that we can locate them on our sketch. Doing this, we find that f(x) has a

local maximum when x = −5/3 and y = f(−5/3) = 400/27, and

local minimum when x = 3 and y = f(3) = −36.

limiting behaviour: The term with the highest power of x in f(x) is $x^3$ and so $f(x) \to \infty$ as $x \to \infty$ and $f(x) \to -\infty$ as $x \to -\infty$.

So, using this information, we begin to sketch this curve by roughly indicating these key features on some axes as in Figure 4.9(a) and then, joining them up with a nice smooth curve, we get the sketch itself as in Figure 4.9(b).

In particular, it is worth noting that in this sketch:

all of the key features are labelled;

the curve has the right kind of limiting behaviour, i.e. $f(x) \to \infty$ as $x \to \infty$ and $f(x) \to -\infty$ as $x \to -\infty$; and

points of inflection which are not stationary points (recall that, in Example 4.6, we saw that this curve has one when x = 2/3) are not usually indicated.

Of course, what we see here is similar to what we saw in Figure 4.2, but a sketch must include information about all of the relevant key features.
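The numbers that go on this sketch can be double-checked with a few lines of Python. This is only a sketch under stated assumptions: the search for x-intercepts looks at integers between −10 and 10, which happens to be enough for this cubic but is not a general root-finding method, and the large inputs ±1000 merely suggest the limiting behaviour rather than prove it.

```python
from fractions import Fraction


def f(x):
    """The function from Example 4.1."""
    return x**3 - 2 * x**2 - 15 * x


# x-intercepts: integer solutions of f(x) = 0 in a small search window
x_intercepts = [x for x in range(-10, 11) if f(x) == 0]

# y-intercept and the heights of the two stationary points
y_intercept = f(0)
local_max = (Fraction(-5, 3), f(Fraction(-5, 3)))
local_min = (3, f(3))

# A rough indication of the limiting behaviour
far_left, far_right = f(-1000), f(1000)

print(x_intercepts, y_intercept, local_max, local_min)
print(far_left < 0, far_right > 0)
```

Using `Fraction` keeps the height of the local maximum as an exact fraction, matching the value found in the example.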

Figure 4.9: Sketching the curve $y = x^3 - 2x^2 - 15x$ in Example 4.7. (a) Using what we have discovered about the key features of the curve, we can begin to see what it must look like. (b) By joining up these key features with a nice smooth curve, we get the sketch itself.

Indeed, it can be seen that, unlike plotting a function, sketching it is a bit of an art and

it can only be done well by learning to appreciate what your calculations are telling you

about its appearance. With this in mind, let's sketch a function that we haven't

encountered before.

Example 4.8 Sketch the curve y = f(x) where $f(x) = 2x^4 - 4x^3 + 2x^2$.

We find the key features of this curve according to the list given above, namely

x-intercepts: These occur when y = 0 and so we solve the equation given by f(x) = 0, i.e.

\[ 2x^4 - 4x^3 + 2x^2 = 0, \]

which, on taking out the common factor of $2x^2$ and factorising the remaining quadratic, gives us

\[ 2x^2(x^2 - 2x + 1) = 0 \implies 2x^2(x - 1)^2 = 0. \]

Thus, the x-intercepts occur when x = 0 and x = 1.

y-intercept: This occurs when x = 0 and so using y = f(0) we see that the y-intercept occurs when y = 0. Note, in particular, that this means that the curve goes through the origin (as we should have expected since one of the x-intercepts occurs when x = 0).

finding the stationary points: These occur when $f'(x) = 0$ and so, noting that

\[ f'(x) = 8x^3 - 12x^2 + 4x, \]

we solve the equation

\[ 8x^3 - 12x^2 + 4x = 0, \]

which, on taking out the common factor of 4x and factorising the remaining quadratic, gives us

\[ 4x(2x^2 - 3x + 1) = 0 \implies 4x(2x - 1)(x - 1) = 0, \]

so that the stationary points occur when x = 0, x = 1/2 and x = 1. Then, we use y = f(x) to find the values of y at these points so that we can locate them on the sketch. Doing this, we find that

x = 0 gives y = f(0) = 0,

x = 1/2 gives y = f(1/2) = 1/8,

x = 1 gives y = f(1) = 0.

So, the stationary points have coordinates given by (0, 0), (1/2, 1/8) and (1, 0).

classifying the stationary points: Let's use the second-order derivative test here. We can see that

\[ f''(x) = 24x^2 - 24x + 4, \]

and so, looking at the stationary points, we have

$f''(0) = 4 > 0$ and so (0, 0) is a local minimum;

$f''(1/2) = -2 < 0$ and so (1/2, 1/8) is a local maximum;

$f''(1) = 4 > 0$ and so (1, 0) is a local minimum.

limiting behaviour: The term with the highest power of x in f(x) is $2x^4$ and so $f(x) \to \infty$ as $x \to \infty$ and as $x \to -\infty$.

So, using this information, we begin to sketch this curve by roughly indicating these key features on some axes as in Figure 4.10(a) and then, joining them up with a nice smooth curve, we get the sketch itself as in Figure 4.10(b).
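The stationary points of this quartic and their classification can be tabulated with a short Python sketch. The three candidate values of x are taken from the factorisation in the example rather than found by the code, and the function names are illustrative assumptions.

```python
from fractions import Fraction


def f(x):
    return 2 * x**4 - 4 * x**3 + 2 * x**2


def f_prime(x):
    return 8 * x**3 - 12 * x**2 + 4 * x


def f_second(x):
    return 24 * x**2 - 24 * x + 4


def classify(a):
    """Second-order derivative test at a stationary point a."""
    curvature = f_second(a)
    if curvature > 0:
        return "local minimum"
    if curvature < 0:
        return "local maximum"
    return "inconclusive"


candidates = [0, Fraction(1, 2), 1]  # roots of 4x(2x - 1)(x - 1) = 0

# Each row is (x, y, classification); the derivative is confirmed to vanish
table = [(a, f(a), classify(a)) for a in candidates if f_prime(a) == 0]
for row in table:
    print(row)
```

The filter `f_prime(a) == 0` acts as a sanity check that each candidate really is a stationary point before it is classified.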

Figure 4.10: Sketching the curve $y = 2x^4 - 4x^3 + 2x^2$ in Example 4.8. (a) Using what we have discovered about the key features of the curve, we can begin to see what it must look like. (b) By joining up these key features with a nice smooth curve, we get the sketch itself.

Activity 4.4 Find the points of inflection of the function in Example 4.8.


4.4.2

When sketching curves defined using other elementary functions the only real obstacle is, again, an understanding of the limiting behaviour of such functions. For instance, as we saw in Section 2.1.1, exponential functions like $e^x$ and $e^{-x}$ have very simple limiting behaviours, i.e.

$e^x \to \infty$ as $x \to \infty$ and $e^x \to 0$ as $x \to -\infty$; whereas

$e^{-x} \to 0$ as $x \to \infty$ and $e^{-x} \to \infty$ as $x \to -\infty$.

But, when functions such as these are multiplied by polynomials (say), it is not clear how this will affect their limiting behaviour. For now, we just state the following fact:¹

When an exponential is multiplied by a polynomial, the exponential dominates.

Thus, for example, the function $x^3 e^{-x} \to 0$ as $x \to \infty$ because the exponential $e^{-x} \to 0$ as $x \to \infty$ and this dominates the behaviour of the polynomial, $x^3$, even though $x^3 \to \infty$ as $x \to \infty$. Let's sketch this curve to see why this is reasonable.

Example 4.9 Sketch the curve y = f(x) where $f(x) = x^3 e^{-x}$.

We find the key features of this curve according to the list given above, namely

x-intercepts: These occur when y = 0 and so we solve the equation given by f(x) = 0, i.e.

\[ x^3 e^{-x} = 0. \]

But, as $e^{-x} \neq 0$ for all $x \in \mathbb{R}$, we find that the only x-intercept occurs when x = 0.

y-intercept: This occurs when x = 0 and so using y = f(0) we see that the y-intercept occurs when y = 0. Note, in particular, that this means that the curve goes through the origin (as we should have expected since the x-intercept we found occurs when x = 0).

finding the stationary points: These occur when $f'(x) = 0$ and so, using the product rule, we get

\[ f'(x) = (3x^2)(e^{-x}) + (x^3)(-e^{-x}) = x^2(3 - x)\,e^{-x}, \]

and so we solve the equation

\[ x^2(3 - x)\,e^{-x} = 0. \]

But, as $e^{-x} \neq 0$ for all $x \in \mathbb{R}$, we find that the stationary points occur when x = 0 and x = 3. Then, we use y = f(x) to find the values of y at these points so that we can locate them on the sketch. Doing this, we find that

x = 0 gives y = f(0) = 0,

x = 3 gives y = f(3) = $27e^{-3}$.

So, the stationary points have coordinates given by (0, 0) and $(3, 27e^{-3})$.

classifying the stationary points: Let's use the second-order derivative test here. We can use the product rule again to see that

\[ f''(x) = (6x - 3x^2)(e^{-x}) + (3x^2 - x^3)(-e^{-x}) = (6x - 6x^2 + x^3)\,e^{-x}, \]

and so, looking at the stationary points, we have

$f''(0) = (0)\,e^0 = 0$ and so the second-order derivative test fails! However, we can see that as

\[ f'(x) = x^2(3 - x)\,e^{-x}, \]

is positive when x < 0 and positive when 0 < x < 3, we can see that this function is increasing on both sides of the stationary point at x = 0. Thus, the first-order derivative test tells us that (0, 0) is a point of inflection.

$f''(3) = (-9)\,e^{-3} < 0$ and so $(3, 27e^{-3})$ is a local maximum.

limiting behaviour: Using the fact above we would expect the $e^{-x}$ to dominate and this would mean that $f(x) \to 0$ as $x \to \infty$ whereas, as $x \to -\infty$, we would expect $f(x) \to -\infty$ as $x^3 \to -\infty$ and $e^{-x} \to \infty$.

Then, using this information, we begin to sketch this curve by roughly indicating these key features on some axes as in Figure 4.11(a) and then, joining them up with a nice smooth curve, we get the sketch itself as in Figure 4.11(b).

¹ In 176 Further Calculus we will encounter techniques for finding limits which are much more sophisticated than the ones that we have seen so far. Once we have these, we will be able to see exactly why this fact is true and be in a better position to assess the limiting behaviour of curves which are defined using other elementary functions.

Figure 4.11: Sketching the curve $y = x^3 e^{-x}$ in Example 4.9. (a) Using what we have discovered about the key features of the curve, we can begin to see what it must look like. (b) By joining up these key features with a nice smooth curve, we get the sketch itself.
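The key features found in Example 4.9 can be spot-checked numerically. The following is only a sketch: the function names `f`, `f1` and `f2` are our own labels for $f$, $f'$ and $f''$ as derived in the example, and the sample points used to test signs are arbitrary choices.

```python
import math

def f(x):
    """f(x) = x^3 e^{-x}, the function sketched in Example 4.9."""
    return x**3 * math.exp(-x)

def f1(x):
    """First derivative from the example: x^2 (3 - x) e^{-x}."""
    return x**2 * (3 - x) * math.exp(-x)

def f2(x):
    """Second derivative from the example: (6x - 6x^2 + x^3) e^{-x}."""
    return (6*x - 6*x**2 + x**3) * math.exp(-x)

# Both stationary points make f' vanish.
for x in (0.0, 3.0):
    print(x, f(x), f1(x), f2(x))

# f' is positive on both sides of x = 0, so (0, 0) is a point of inflection ...
same_sign_at_0 = f1(-0.1) > 0 and f1(0.1) > 0
# ... but f' goes from positive to negative at x = 3, so this is a local maximum.
max_at_3 = f1(2.9) > 0 and f1(3.1) < 0

# Limiting behaviour: the exponential dominates, so f(x) shrinks towards zero.
tail = [f(x) for x in (10, 50, 100)]
print(same_sign_at_0, max_at_3, tail)
```

The values in `tail` shrink rapidly even though $x^3$ is growing, which is the sense in which the exponential dominates the polynomial.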

Activity 4.5 Does the function in Example 4.9 have any other points of inflection?

If so, find them.

Activity 4.6 Sketch the curve $y = f(x)$ where $f(x) = x^2 e^{x}$ and find all of its points of inflection.

4.4.3

The method above for sketching y = f (x) assumes, as we generally have throughout

this chapter, that the function, $f(x)$, and its derivatives are well-defined for all $x \in \mathbb{R}$.

But, more generally, there may be points at which the function or some of its

derivatives are not defined. When this happens we start to encounter asymptotes and

cusps. We will not dwell on this a great deal here, but we can use the following

examples to see how this may affect our sketches.

Example 4.10 Consider the function

$$f(x) = \frac{1}{x - 1},$$

which is not defined at $x = 1$. Differentiating, we find that

$$f'(x) = -\frac{1}{(x - 1)^2} \quad\text{and}\quad f''(x) = \frac{2}{(x - 1)^3},$$

and so these derivatives aren't defined at $x = 1$ either.² Using these, we can see that when

$x < 1$ we have $f(x) < 0$, $f'(x) < 0$ and $f''(x) < 0$, meaning that for these values of $x$ the function is negative, decreasing and concave; whereas when

$x > 1$ we have $f(x) > 0$, $f'(x) < 0$ and $f''(x) > 0$, meaning that for these values of $x$ the function is positive, decreasing and convex.

We can also see that the y-intercept of this curve occurs when $y = -1$ and that $f(x) \to 0$ as $x \to \pm\infty$ which means that this function has a horizontal asymptote given by $y = 0$. However, the main feature that concerns us here is the vertical asymptote at $x = 1$ which comes about because

$$\lim_{x \to 1^-} f(x) = -\infty \quad\text{and}\quad \lim_{x \to 1^+} f(x) = \infty,$$

as we should expect to see from our discussion of hyperbolae in Section 2.2.4. The sketch of this curve is illustrated in Figure 4.12(a).

In particular, observe that in Example 4.10, we have a case like the one mentioned at

the end of Section 4.3.3. That is, the function changes from being concave to convex at

a point, but there is no point of inflection. This happens because the second derivative

of this function does not exist at the point.

² That is, the function and its derivatives are undefined when $x = 1$ as that would require us to divide by zero and that is never allowed.

Example 4.11 Consider the function

$$f(x) = \frac{1}{(x - 1)^2},$$

which is not defined at $x = 1$. Differentiating, we find that

$$f'(x) = -\frac{2}{(x - 1)^3} \quad\text{and}\quad f''(x) = \frac{6}{(x - 1)^4},$$

and so these derivatives aren't defined at $x = 1$ either.³ Using these, we can see that when

$x < 1$ we have $f(x) > 0$, $f'(x) > 0$ and $f''(x) > 0$, meaning that for these values of $x$ the function is positive, increasing and convex; whereas when

$x > 1$ we have $f(x) > 0$, $f'(x) < 0$ and $f''(x) > 0$, meaning that for these values of $x$ the function is positive, decreasing and convex.

We can also see that the y-intercept of this curve occurs when $y = 1$ and that $f(x) \to 0$ as $x \to \pm\infty$ which means that this function has a horizontal asymptote given by $y = 0$. However, the main feature that concerns us here is the vertical asymptote at $x = 1$ which comes about because

$$\lim_{x \to 1^-} f(x) = \infty \quad\text{and}\quad \lim_{x \to 1^+} f(x) = \infty.$$

The sketch of this curve is illustrated in Figure 4.12(b).

³ Again, the function and its derivatives are undefined when $x = 1$ as that would require us to divide by zero and that is never allowed.

Example 4.12 Consider the function

$$f(x) = (x - 1)^{2/3},$$

which is defined for all $x \in \mathbb{R}$. However, this means that we have

$$f'(x) = \frac{2}{3(x - 1)^{1/3}} \quad\text{and}\quad f''(x) = -\frac{2}{9(x - 1)^{4/3}},$$

and so these derivatives aren't defined at $x = 1$.⁴ Using these, we can see that when

$x < 1$ we have $f(x) > 0$, $f'(x) < 0$ and $f''(x) < 0$, meaning that for these values of $x$ the function is positive, decreasing and concave; whereas when

$x > 1$ we have $f(x) > 0$, $f'(x) > 0$ and $f''(x) < 0$, meaning that for these values of $x$ the function is positive, increasing and concave.

We can also see that the y-intercept of this curve occurs when $y = 1$. The sketch of this curve is illustrated in Figure 4.12(c) and we say that this curve has a cusp at $x = 1$.

Figure 4.12: Sketches of the curves in (a) Example 4.10, $y = \frac{1}{x - 1}$, (b) Example 4.11, $y = \frac{1}{(x - 1)^2}$, and (c) Example 4.12, $y = (x - 1)^{2/3}$. Observe the behaviour of all three of these curves at $x = 1$: in (a) and (b) we have a vertical asymptote at $x = 1$ and in (c) we have a cusp at $x = 1$.
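To get a feel for the difference between a vertical asymptote and a cusp, we can evaluate the three functions from Examples 4.10 to 4.12 just to the left and right of $x = 1$. This is a rough numerical sketch: the names `f_a`, `f_b`, `f_c` and `slope` are our own, the gradient is only estimated by a central difference, and the step sizes are arbitrary.

```python
def f_a(x):
    """Example 4.10: 1/(x - 1), undefined at x = 1."""
    return 1 / (x - 1)

def f_b(x):
    """Example 4.11: 1/(x - 1)^2, undefined at x = 1."""
    return 1 / (x - 1) ** 2

def f_c(x):
    """Example 4.12: (x - 1)^(2/3), written as the real cube root of (x - 1)^2."""
    return ((x - 1) ** 2) ** (1 / 3)

def slope(f, x, h=1e-6):
    """Central-difference estimate of f'(x)."""
    return (f(x + h) - f(x - h)) / (2 * h)

for eps in (1e-1, 1e-2, 1e-3):
    left, right = 1 - eps, 1 + eps
    print(eps,
          f_a(left), f_a(right),                 # opposite infinities
          f_b(left), f_b(right),                 # both large and positive
          slope(f_c, left), slope(f_c, right))   # steep, with opposite signs
```

In (a) and (b) the function values themselves blow up near $x = 1$, whereas in (c) the values stay finite and only the gradient blows up, with opposite signs on either side.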

4.5 Optimisation

We have seen how to use derivatives to find and classify the stationary points of a

function and we have seen that a local maximum (or local minimum) is a point where

the function is larger (or smaller) than it is at other nearby points. However, we now

want to find the points, called a global maximum (or global minimum), where the

function is larger (or smaller) than it is at all other points. In such cases, we often say

that we are looking for the points where the function is optimised. We will see that

some functions do not have a global maximum (or a global minimum) even though they

may have a local maximum (or a local minimum).

In order to determine whether a function, f (x), has a global maximum or a global

minimum, it is always useful to ask the following questions.

Which local maximum gives the largest value of f (x) and which local minimum

gives the smallest value of f (x)?

What is the behaviour of $f(x)$ as $x \to \infty$ and as $x \to -\infty$?

Then, having answered these questions one should be in a position to identify the global

maximum with the largest value of f and the global minimum with the smallest value

of f assuming, of course, that these exist. Indeed, one way of making sense of these

questions and their answers is to sketch the relevant features of the curve y = f (x) and

then, using this sketch, one can then easily identify any global maximum or global

minimum that the function may have.

⁴ We can see that these derivatives are undefined when $x = 1$ as that would require us to divide by zero and that is never allowed. Moreover, observe that this function does not have a vertical tangent line at $x = 1$ because to the left of $x = 1$ the gradient is tending to $-\infty$ and to the right of $x = 1$ the gradient is tending to $\infty$.

For instance, consider the function whose graph is sketched in Figure 4.13(a) which has

two local maxima and two local minima. If we ask our questions about this function, we

see that:

Comparing the relevant values, we see that the largest local maximum occurs when

x = a and the smallest local minimum occurs when x = b.

The function tends to zero as $x \to \pm\infty$.

So, in this case, it should be clear that the global maximum occurs when $x = a$ and the global minimum occurs when $x = b$ as illustrated in Figure 4.13(b).

Figure 4.13: (a) A sketch of a function with two local maxima and two local minima which tends to zero as $x \to \pm\infty$. (b) This function has a global maximum and a global minimum as indicated.

However, if we have the function sketched in Figure 4.14(a) and ask our questions about that, we see that:

Comparing the relevant values, we see that the largest local maximum occurs when $x = a$ and the smallest local minimum occurs when $x = b$.

The function tends to zero in one direction along the x-axis but tends to $-\infty$ in the other.

In this case, as illustrated in Figure 4.14(b), it should be clear that the global maximum still occurs when $x = a$ but now there is no global minimum since, where the function tends to $-\infty$, we can get far smaller values of the function than we do from the smallest local minimum.

Activity 4.7 Use the sketches in Figures 4.9(b), 4.10(b) and 4.11(b) to determine

whether the functions in Examples 4.7, 4.8 and 4.9 have any global maxima or

global minima.

So, in general, we can see that if $f : \mathbb{R} \to \mathbb{R}$ is a function that is differentiable for all $x \in \mathbb{R}$, then

its global maximum (or global minimum) can exist if the function is suitably well-behaved as $x \to \infty$ and $x \to -\infty$; and

if they exist, its global maximum (or global minimum) must occur at a local

maximum (or a local minimum).

But, having said this, a sketch is still the easiest way to see what is happening. We now

turn to some cases of optimisation where things work slightly differently.

Figure 4.14: (a) A sketch of a function with two local maxima and two local minima. (b) This function has a global maximum but no global minimum as indicated.

4.5.1 Constrained optimisation

Sometimes, it may be necessary to find the maximum (or minimum) value of $f(x)$ when the values of $x$ are constrained (or restricted). In such cases, there will be some interval, such as $x \geq a$ or $a \leq x \leq b$, and we need to find the maximum (or minimum) value of $f(x)$ when $x$ can only take these values.

For instance, consider the function whose graph is sketched in Figure 4.15(a) which has a local minimum and a local maximum in the interval $a \leq x \leq b$. In this case, we can see that the maximum and minimum values of $f(x)$ for $x$ in this interval must occur at one of the points indicated by a '•'. And, by comparing the values of $f(x)$ at these points we can see that the maximum occurs at the local maximum and the minimum occurs at the local minimum as illustrated in Figure 4.15(b).

Figure 4.15: (a) A sketch of a function in the interval $a \leq x \leq b$ with a local maximum and a local minimum. (b) This function has a maximum and a minimum as indicated.

However, suppose we have the function whose graph is sketched in Figure 4.16(a) which, again, has a local minimum and a local maximum in the interval $a \leq x \leq b$. In this case, we can again see that the maximum and minimum values of $f(x)$ for $x$ in this interval must occur at one of the points indicated by a '•'. And, by comparing the values of $f(x)$ at these points we can now see that the maximum occurs at the end-point $x = a$ and the minimum occurs at the end-point $x = b$ as illustrated in Figure 4.16(b).

Figure 4.16: (a) A sketch of a function in the interval $a \leq x \leq b$ with a local maximum and a local minimum. (b) This function has a maximum and a minimum as indicated.

Activity 4.8 Use the sketches in Figures 4.9(b), 4.10(b) and 4.11(b) to find the maximum and minimum values of the functions in Example 4.7 when $-3 \leq x \leq 5$, Example 4.8 when $0 \leq x \leq 1$ and Example 4.9 when $0 \leq x \leq 3$.

So, in general, suppose that we have the interval $a \leq x \leq b$ and $f$ is a differentiable

function on this interval. In this case, the maximum (or minimum) value of f (x) will

occur

either at the local maximum (or local minimum) inside the interval that gives the

largest (or smallest) value of f (x)

or at one of the end-points of the interval, i.e. at x = a or x = b, if these give the

largest (or smallest) value of f (x).

This means that we should find the value of f (x) at any local maximum (or local

minimum) inside the interval and its value at the end-points of the interval, i.e. f (a)

and f (b). Having done this, the maximum (or minimum) will be the largest (or

smallest) of these values of f (x). But, of course, a sketch is still the easiest way to see

what is happening.
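The comparison described above is mechanical enough to write down as a short routine. The following is a minimal sketch, assuming that the stationary points have already been found by hand; the helper `extremes_on_interval` and the test function `g` are hypothetical names introduced purely for illustration.

```python
def extremes_on_interval(f, a, b, stationary_points):
    """Return ((x_max, f_max), (x_min, f_min)) for f on a <= x <= b.

    `stationary_points` are the solutions of f'(x) = 0 found by hand;
    any that lie outside the interval are ignored.
    """
    candidates = [a, b] + [x for x in stationary_points if a < x < b]
    values = [(x, f(x)) for x in candidates]
    best_max = max(values, key=lambda pair: pair[1])
    best_min = min(values, key=lambda pair: pair[1])
    return best_max, best_min

# Illustrative function (not from the guide): g(x) = x^3 - 3x,
# for which g'(x) = 3x^2 - 3 = 0 when x = -1 or x = 1.
def g(x):
    return x**3 - 3*x

print(extremes_on_interval(g, -3, 3, [-1, 1]))      # end-points give the extremes
print(extremes_on_interval(g, -1.5, 1.5, [-1, 1]))  # stationary points give the extremes
```

The two calls mirror the two situations in Figures 4.15 and 4.16: on the narrower interval the extremes occur at the stationary points, whereas on the wider interval they occur at the end-points.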

4.5.2

So far, we have assumed that the function we are optimising is differentiable for all relevant values of $x$, whether that means $x \in \mathbb{R}$ or values of $x$ inside some interval. However, it is important to note that even if the function is not differentiable at some relevant value(s) of $x$, we may still find that the maximum (or minimum) value of the function occurs at such a point.

For instance, in Sections 3.3.4 and 4.4.3, we considered some ways in which a function could fail to be differentiable at a point. Using these as a guide, we can consider the three functions illustrated in Figure 4.17 which all fail to be differentiable at $x = 1$. However, despite this, we see that in all three cases the global maximum of the function occurs at $x = 1$ even though none of these points is a local maximum of the kind found by solving $f'(x) = 0$.⁵

⁵ That is, in all three cases, as $f'(1)$ does not exist it certainly can't be equal to zero!

Figure 4.17: Three functions which are not differentiable at $x = 1$ because (a) the function is discontinuous at $x = 1$, (b) the function has a corner at $x = 1$ and (c) the function has a cusp at $x = 1$.

Also, thinking about what we saw in Section 4.4.3, the presence of a vertical asymptote may also mean that a global maximum or global minimum does not exist. Of course, as we saw above, a sketch should enable us to see what is happening in any of these cases.

Activity 4.9 Consider the curves sketched in Figures 4.12(a) and (b). Do either of these curves have a global minimum or a global maximum?

Now suppose that we are only interested in these curves for values of $x$ in the interval $0 \leq x < 1$. Do either of these curves have a maximum or a minimum?

4.5.3 Applications of optimisation

Optimisation problems are very common in economics and we now introduce two ways in which they can arise in that subject. The first is their use when a firm wants to find the level of production which maximises its profit; and the second is when a government wants to find the level of taxation which maximises the revenue generated by a tax that has been imposed on a market.

Profit maximisation

When a firm sells an amount, $q$, it makes a profit given by

$$\pi(q) = R(q) - C(q),$$

where $R(q)$ is the revenue generated by selling this amount and $C(q)$ is the cost of producing this amount. Obviously, when doing this, the firm will want to sell an amount $q$ that will maximise its profit. Indeed, whereas the costs involved are determined by factors intrinsic to the firm, the revenue generated is given by

$$R(q) = pq,$$

where $p$, the price per unit, is determined by the market the firm is selling in.

As an example, consider the case where the firm is a monopoly, i.e. it is the only supplier of this product to the market. Indeed, as they are the only suppliers and the amount they are supplying is $q$, the price that the consumers will be willing to pay for this is given by $p = p^D(q)$ where $p^D(q)$ is, as in Section 2.1.5, the inverse demand function of the market. As such, in this case, the revenue generated by the sale of an amount $q$ is given by

$$R(q) = q\,p^D(q),$$

and this will yield a profit of

$$\pi(q) = q\,p^D(q) - C(q).$$

Thus, in the case of a monopoly, given the firm's cost function and the inverse demand function for the market, we should be able to determine the amount, $q$, that the firm should be selling by finding the value of $q$ that maximises the firm's profit. Let's look at an example.

Example 4.13 Suppose that a monopoly has the cost function

$$C(q) = q^3 - 10q^2 + 25q + 10,$$

and that the inverse demand function for its market is

$$p^D(q) = 10 - q.$$

Find the value of $q$ that will maximise the firm's profit.

This is a constrained optimisation problem as we must have

$q \geq 0$ as $q$ denotes the amount of good being sold, and

$q \leq 10$ as, otherwise, the price that the consumers will pay will be negative.

So, we need to maximise the profit function, i.e.

$$\pi(q) = q\,p^D(q) - C(q) = q(10 - q) - (q^3 - 10q^2 + 25q + 10) = -q^3 + 9q^2 - 15q - 10,$$

given that $q$ is in the interval given by $0 \leq q \leq 10$.

To do this, we note that $\pi'(q)$ is given by

$$\pi'(q) = -3q^2 + 18q - 15,$$

and so, as the stationary points occur when $\pi'(q) = 0$, we solve the equation

$$-3q^2 + 18q - 15 = 0 \implies q^2 - 6q + 5 = 0 \implies (q - 1)(q - 5) = 0,$$

to see that the stationary points occur when $q = 1$ and $q = 5$. We can then see that

$$\pi''(q) = -6q + 18,$$

which, using the second-derivative test, tells us that when:

$q = 1$, we have $\pi''(1) = 12 > 0$ and so this is a local minimum.

$q = 5$, we have $\pi''(5) = -12 < 0$ and so this is a local maximum.

This means that the point we seek, i.e. the maximum of the profit function, must occur at $q = 5$ or at one of the two end-points of our interval. But, using the profit function, we see that

$$\pi(0) = -10, \quad \pi(5) = 15 \quad\text{and}\quad \pi(10) = -260,$$

which means that the maximum occurs at $q = 5$ because it yields the largest profit. Thus, $q = 5$ will maximise the firm's profit.
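The arithmetic in Example 4.13 is easy to check by evaluating the profit at the stationary points and the end-points. This is just a sketch of that check: the function names `cost`, `inverse_demand` and `profit` are our own labels for $C(q)$, $p^D(q)$ and $\pi(q)$.

```python
def cost(q):
    """C(q) = q^3 - 10q^2 + 25q + 10."""
    return q**3 - 10*q**2 + 25*q + 10

def inverse_demand(q):
    """p^D(q) = 10 - q."""
    return 10 - q

def profit(q):
    """pi(q) = q p^D(q) - C(q)."""
    return q * inverse_demand(q) - cost(q)

# Candidates: the end-points of 0 <= q <= 10 and the stationary points q = 1, 5.
candidates = [0, 1, 5, 10]
table = {q: profit(q) for q in candidates}
best_q = max(table, key=table.get)
print(table)   # profit at each candidate
print(best_q)  # the candidate giving the largest profit
```

Including the local minimum at $q = 1$ among the candidates does no harm, since we only pick out the largest value.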

Activity 4.10 Sketch the profit function from Example 4.13 to verify that q = 5

does indeed give a maximum. (Do not try to find the q-intercepts here.)

Maximising tax revenue

In Section 2.1.5, we saw how the supply and demand functions for a market are modified if a tax is imposed. We are now in a position to see what level of tax should be imposed if the government wants to maximise its tax revenue. For instance, if an excise tax of $T$ per unit is imposed, then the government's tax revenue, $R(T)$, is given by the tax per unit multiplied by the number of units sold at equilibrium, i.e.

$$R(T) = q_T\,T,$$

where $q_T$ is the equilibrium quantity in the presence of the tax. Of course, we can then use this to find the value of $T$, say $T^*$, that maximises this tax revenue. Let's look at an example.

Example 4.14 In Example 2.7, we saw how the introduction of an excise tax affected the market in Example 2.6 and that the maximum tax that can be imposed is given by $T_m = 4$. What excise tax, $T^*$, should be imposed if the government wants to maximise its tax revenue, $R(T)$, from this market? Sketch a graph of the tax revenue, $R(T)$, against $T$ and comment on the relationship between the values of $T_m$ and $T^*$.

This is a constrained optimisation problem as we must have

$T \geq 0$ as $T$ is the tax per unit, and

$T \leq T_m$ as, otherwise, the market will cease to function.

So, we need to maximise the tax revenue generated by the tax, $R(T)$, i.e.

$$R(T) = q_T\,T = \left(2 - \frac{T}{2}\right) T = -\frac{T^2}{2} + 2T,$$

given that $T$ is in the interval given by $0 \leq T \leq 4$.

To do this, we note that $R'(T)$ is given by

$$R'(T) = -T + 2,$$

and so, as the stationary point occurs when $R'(T) = 0$, we see that we have a stationary point when $T = 2$. We can then see that $R''(T) = -1 < 0$ which, using the second-derivative test, tells us that this stationary point is a maximum. This means that the point we seek, i.e. the maximum of the tax revenue function, must occur at $T = 2$ or at one of the two end-points of our interval. But, using the tax revenue function, we see that

$$R(0) = 0, \quad R(2) = 2 \quad\text{and}\quad R(4) = 0,$$

which means that the maximum occurs at $T = 2$ because it yields the largest tax revenue. Thus, we take $T^* = 2$ and, as in the sketch in Figure 4.18, we find that $T^*$ is half-way between no tax (i.e. $T = 0$) and the maximum tax, $T_m = 4$.

Figure 4.18: A sketch of the tax revenue, $R(T)$, generated by an excise tax of $T$ for Example 4.14.

Notice how, in the presence of an excise tax, the tax revenue is maximised at a value of $T$ half-way between no tax (i.e. $T = 0$) and the maximum tax that can be imposed (i.e. $T_m = 4$).
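As a rough check on Example 4.14, we can scan the tax revenue over the allowed range of $T$. This is only a sketch: `equilibrium_quantity` assumes $q_T = 2 - T/2$, which is the quantity implied by the revenue formula $R(T) = -T^2/2 + 2T$ in the example, and the grid of 401 points is an arbitrary choice.

```python
def equilibrium_quantity(T):
    """Assumed equilibrium quantity q_T = 2 - T/2 (implied by R(T) in Example 4.14)."""
    return 2 - T / 2

def revenue(T):
    """Tax revenue R(T) = q_T * T."""
    return equilibrium_quantity(T) * T

T_max = 4  # the maximum tax T_m from the example

# Evaluate R(T) on an evenly spaced grid over 0 <= T <= T_m and keep the best point.
grid = [i * T_max / 400 for i in range(401)]
T_star = max(grid, key=revenue)
print(T_star, revenue(T_star), revenue(0), revenue(T_max))
```

A grid search like this is no substitute for the calculus, but it agrees with the stationary point found in the example and shows the revenue falling back to zero at both ends of the interval.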

Of course, if a percentage of the price tax of $100r\%$ is imposed, then the government's tax revenue, $R(r)$, would be given by the tax per unit, $r p_r$, multiplied by the number of units sold at equilibrium, i.e.

$$R(r) = r\,p_r\,q_r,$$

where $p_r$ and $q_r$ are the equilibrium price and quantity in the presence of the tax. Of course, we can also use this to find the value of $r$, say $r^*$, that maximises this tax revenue. See, for example, Exercise 4.5.

Learning outcomes

At the end of this chapter and having completed the relevant reading and activities, you

should be able to:


use first and second-order derivatives to identify the relevant features of a function;

sketch curves by identifying their key features;

optimise functions of one variable;

solve problems from economics-based subjects that involve optimisation.

Solutions to activities

Solution to activity 4.1

A concave function on an interval, $I$, has $f''(x) < 0$ for all $x \in I$. So, if we take any particular point, say $a \in I$, the tangent line to $f$ at $x = a$ has an equation given by

$$y = f(a) + (x - a)f'(a),$$

and so, our second-order Taylor approximation can be written as

$$f(x) \approx y + \frac{(x - a)^2}{2} f''(a).$$

Now, as $f''(a) < 0$ (recall that $a \in I$ too), we see that $f(x) < y$ for all $x \in I$ where $x \neq a$, i.e. these values of $f$ always lie below the values from the tangent line to $f$ at $x = a$, as illustrated in Figure 4.19(a). But, of course, we can use any $a \in I$ when we run this argument and so a concave function is one which lies below all its tangent lines, as illustrated in Figure 4.19(b). In particular, a function must be concave in the neighbourhood of a local maximum.

Figure 4.19: The relationship between a concave function and its tangent lines. (a) When changing the value of $x$, we can see that the values of $f(x)$ are less than the corresponding values of $y$ from the tangent line to $f$ at $a$, i.e. $f$ lies below this tangent line. (b) By changing the value of $a$, we can see that $f$ lies below all of its tangent lines.
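The claim that a concave function lies below all of its tangent lines can be spot-checked numerically. The sketch below assumes a particular concave function, $\ln x$ for $x > 0$, which is our own choice of illustration and not one used in the guide; the helper `tangent_line` and the grid of sample points are likewise hypothetical.

```python
import math

def tangent_line(f, df, a):
    """Return the tangent line y(x) = f(a) + (x - a) f'(a) as a function of x."""
    return lambda x: f(a) + (x - a) * df(a)

# Illustrative concave function: ln(x), whose second derivative -1/x^2 is negative.
f = math.log

def df(x):
    return 1 / x

# Sample points in the interval 0.5 <= x <= 4.
xs = [0.5 + 0.25 * k for k in range(15)]

# For every tangent point a and every sample x, compare f(x) with the tangent value.
# The small tolerance only guards against rounding when x equals a.
below_all = all(
    f(x) <= tangent_line(f, df, a)(x) + 1e-12
    for a in xs
    for x in xs
)
print(below_all)
```

A finite sample like this cannot prove the result, of course, but it is a quick way to catch a sign error when working through the Taylor argument.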

Solution to activity 4.2

For $f(x) = x^4 - 1$, we see that

$$f'(x) = 4x^3 \quad\text{and}\quad f''(x) = 12x^2,$$

which means, in particular, that $f''(0) = 0$. Then, looking at the first-order derivative, we see that $f'(x) < 0$ for $x < 0$ and $f'(x) > 0$ for $x > 0$ which means that the function is decreasing for $x < 0$ and then increasing for $x > 0$ as shown in Figure 4.8(a). Or, looking at the second-order derivative, we see that $f''(x) > 0$ for all $x \neq 0$ and so the function is convex as shown in Figure 4.8(a).

For $f(x) = 1 - x^4$, we see that

$$f'(x) = -4x^3 \quad\text{and}\quad f''(x) = -12x^2,$$

which means, in particular, that $f''(0) = 0$. Then, looking at the first-order derivative, we see that $f'(x) > 0$ for $x < 0$ and $f'(x) < 0$ for $x > 0$ which means that the function is increasing for $x < 0$ and then decreasing for $x > 0$ as shown in Figure 4.8(b). Or, looking at the second-order derivative, we see that $f''(x) < 0$ for all $x \neq 0$ and so the function is concave as shown in Figure 4.8(b).

Solution to activity 4.3

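Suppose that $f(x)$ is a polynomial and that, for some constants $a \neq 0$ and $n \in \mathbb{N}$, the term in this polynomial with the highest power of $x$ is $ax^n$. We see that, as $x \to \infty$, we have

$f(x) \to \infty$ if $a > 0$ as $ax^n \to \infty$, and

$f(x) \to -\infty$ if $a < 0$ as $ax^n \to -\infty$,

whereas, as $x \to -\infty$, we have

$f(x) \to \infty$ if $a > 0$ and $n$ is even as $ax^n \to \infty$,

$f(x) \to -\infty$ if $a > 0$ and $n$ is odd as $ax^n \to -\infty$,

$f(x) \to -\infty$ if $a < 0$ and $n$ is even as $ax^n \to -\infty$, and

$f(x) \to \infty$ if $a < 0$ and $n$ is odd as $ax^n \to \infty$.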
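The case analysis for the limiting behaviour of a polynomial depends only on the sign of $a$ and the parity of $n$, so it can be summarised in a few lines. This is a small sketch; the function name `end_behaviour` is our own.

```python
def end_behaviour(a, n):
    """Limits of a polynomial with leading term a*x**n.

    Returns a pair (limit as x -> +infinity, limit as x -> -infinity).
    """
    if a == 0 or n < 1:
        raise ValueError("need a != 0 and n a positive integer")
    inf = float("inf")
    # As x -> +infinity, x**n -> +infinity, so only the sign of a matters.
    right = inf if a > 0 else -inf
    # As x -> -infinity, x**n is positive when n is even and negative when n is odd.
    left = right if n % 2 == 0 else -right
    return right, left

print(end_behaviour(1, 3))   # a > 0, n odd
print(end_behaviour(-1, 4))  # a < 0, n even
```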

Solution to activity 4.4

In Example 4.8, we found that

$$f''(x) = 24x^2 - 24x + 4,$$

and so we begin our search for points of inflection by seeing where $f''(x) = 0$. That is, we solve the equation

$$24x^2 - 24x + 4 = 0 \implies 6x^2 - 6x + 1 = 0 \implies x = \frac{6 \pm \sqrt{12}}{12} = \frac{1}{2} \pm \frac{1}{\sqrt{12}},$$

if we use the quadratic formula. Now, if $f''(x)$ also changes sign at these values of $x$ we have a point of inflection. To see whether this is the case, consider that we now have

$$f''(x) = 24x^2 - 24x + 4 = 24\left(x - \frac{1}{2} + \frac{1}{\sqrt{12}}\right)\left(x - \frac{1}{2} - \frac{1}{\sqrt{12}}\right) = 24(x - a)(x - b),$$

if we let $x = a$ and $x = b$ denote the smaller and the larger values of $x$ we are interested in respectively. This means that, considering the signs of the two factors, we have

            x < a    x = a    a < x < b    x = b    x > b
x - a         -        0          +          +        +
x - b         -        -          -          0        +
f''(x)        +        0          -          0        +

so we see that $f''(x)$ does indeed change sign at $x = a$ and $x = b$. Consequently, both $x = a$ and $x = b$ where

$$a = \frac{1}{2} - \frac{1}{\sqrt{12}} \quad\text{and}\quad b = \frac{1}{2} + \frac{1}{\sqrt{12}},$$

are points of inflection of the function in Example 4.8.
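The roots quoted in the solution to Activity 4.4, and the sign pattern of $f''(x)$ around them, can be confirmed with the quadratic formula. This is a minimal sketch: the helper `quadratic_roots` is our own and simply returns the real roots in increasing order, and the offsets of 0.1 used to sample the sign are arbitrary.

```python
import math

def quadratic_roots(a, b, c):
    """Real roots of a*x**2 + b*x + c = 0 in increasing order (assumes a != 0)."""
    disc = b * b - 4 * a * c
    if disc < 0:
        return []  # no real roots
    root = math.sqrt(disc)
    return sorted([(-b - root) / (2 * a), (-b + root) / (2 * a)])

def second_derivative(x):
    """f''(x) = 24x^2 - 24x + 4, as quoted from Example 4.8."""
    return 24 * x**2 - 24 * x + 4

a, b = quadratic_roots(24, -24, 4)

# Is f'' positive to the left of a, between a and b, and to the right of b?
signs = [second_derivative(x) > 0 for x in (a - 0.1, (a + b) / 2, b + 0.1)]
print(a, b, signs)
```

The same helper can be reused for the other quadratics in this chapter, such as $q^2 - 6q + 5 = 0$ from Example 4.13.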

Solution to activity 4.5

In Example 4.9, we found that

$$f''(x) = (6x - 6x^2 + x^3)\,e^{-x},$$

and so we begin our search for points of inflection by seeing where $f''(x) = 0$. That is, we solve the equation

$$(6x - 6x^2 + x^3)\,e^{-x} = 0 \implies x(6 - 6x + x^2)\,e^{-x} = 0,$$

which, as $e^{-x} \neq 0$ for all $x \in \mathbb{R}$, gives us $x = 0$, the point of inflection that we found in the example, and

$$x = \frac{6 \pm \sqrt{12}}{2} = 3 \pm \sqrt{3},$$

if we use the quadratic formula. Now, if $f''(x)$ also changes sign at these values of $x$ we have a point of inflection. To see whether this is the case, consider that we now have

$$f''(x) = (6x - 6x^2 + x^3)\,e^{-x} = x\left(x - 3 + \sqrt{3}\right)\left(x - 3 - \sqrt{3}\right) e^{-x} = x(x - a)(x - b)\,e^{-x},$$

if we let $x = a$ and $x = b$ denote the smaller and the larger values of $x$ we are interested in respectively. This means that, considering the signs of the four factors, we have

            0 < x < a    x = a    a < x < b    x = b    x > b
x               +          +          +          +        +
x - a           -          0          +          +        +
x - b           -          -          -          0        +
e^{-x}          +          +          +          +        +
f''(x)          +          0          -          0        +

so we see that $f''(x)$ does indeed change sign at $x = a$ and $x = b$. Consequently, both $x = a$ and $x = b$ where

$$a = 3 - \sqrt{3} \quad\text{and}\quad b = 3 + \sqrt{3},$$

are also points of inflection of the function in Example 4.9.
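The sign changes of $f''(x)$ for the function in Example 4.9 can also be checked by sampling either side of each candidate point. The following is only a sketch: the helper `changes_sign` and its step size `h` are our own choices, and a sampling test like this can miss a sign change if `h` is too large for the function in question.

```python
import math

def f2(x):
    """f''(x) = (6x - 6x^2 + x^3) e^{-x}, as found in Example 4.9."""
    return (6*x - 6*x**2 + x**3) * math.exp(-x)

def changes_sign(g, x, h=1e-3):
    """True if g takes opposite signs just to the left and right of x."""
    return g(x - h) * g(x + h) < 0

a = 3 - math.sqrt(3)
b = 3 + math.sqrt(3)

# Candidates for points of inflection: the three solutions of f''(x) = 0.
results = {x: changes_sign(f2, x) for x in (0.0, a, b)}
print(results)
```

All three candidates pass the test, which agrees with the table of signs: the function has points of inflection at $x = 0$, $x = 3 - \sqrt{3}$ and $x = 3 + \sqrt{3}$.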

Solution to activity 4.6

Here $f(x) = x^2 e^{x}$ and we find the key features of the curve $y = f(x)$, namely

x-intercepts: These occur when $y = 0$ and so we solve the equation given by $f(x) = 0$, i.e.

$$x^2 e^{x} = 0.$$

But, as $e^{x} \neq 0$ for all $x \in \mathbb{R}$, we find that the only x-intercept occurs when $x = 0$.

y-intercept: This occurs when $x = 0$ and so using $y = f(0)$ we see that the y-intercept occurs when $y = 0$. Note, in particular, that this means that the curve goes through the origin (as we should have expected since the x-intercept we found occurs when $x = 0$).

finding the stationary points: These occur when $f'(x) = 0$ and so, using the product rule, we get

$$f'(x) = (2x)(e^{x}) + (x^2)(e^{x}) = x(2 + x)\,e^{x},$$

and so we solve the equation

$$x(2 + x)\,e^{x} = 0.$$

But, as $e^{x} \neq 0$ for all $x \in \mathbb{R}$, we find that the stationary points occur when $x = 0$ and $x = -2$. Then, we use $y = f(x)$ to find the values of $y$ at these points so that we can locate them on the sketch. Doing this, we find that

$x = 0$ gives $y = f(0) = (0)^2 e^{0} = (0)(1) = 0$, and

$x = -2$ gives $y = f(-2) = (-2)^2 e^{-2} = 4e^{-2}$.

So, the stationary points have coordinates given by $(0, 0)$ and $(-2, 4e^{-2})$.

classifying the stationary points: Let's use the second-order derivative test here. We can use the product rule again to see that

$$f''(x) = (2 + 2x)(e^{x}) + (2x + x^2)(e^{x}) = (2 + 4x + x^2)\,e^{x},$$

and so, looking at the stationary points, we have

$f''(0) = (2)\,e^{0} > 0$ and so $(0, 0)$ is a local minimum, and

$f''(-2) = (-2)\,e^{-2} < 0$ and so $(-2, 4e^{-2})$ is a local maximum.

limiting behaviour: Using the fact in Section 4.4.2, we would expect the $e^{x}$ to dominate and this would mean that $f(x) \to \infty$ as $x \to \infty$ whereas, as $x \to -\infty$, we would expect $f(x) \to 0$ as $e^{x} \to 0$.

Then, using this information, we begin to sketch this curve by roughly indicating these key features on some axes as in Figure 4.20(a) and then, joining them up with a nice smooth curve, we get the sketch itself as in Figure 4.20(b).

Figure 4.20: Sketching the curve $y = x^2 e^{x}$. (a) Using what we have discovered about the key features of the curve, we can begin to see what it must look like. (b) By joining up these key features with a nice smooth curve, we get the sketch itself.

To find the points of inflection of this function, we start by seeing where $f''(x) = 0$. That is, we solve the equation

$$(2 + 4x + x^2)\,e^{x} = 0,$$

which, as $e^{x} \neq 0$ for all $x \in \mathbb{R}$, gives us

$$x = \frac{-4 \pm \sqrt{8}}{2} = -2 \pm \sqrt{2},$$

if we use the quadratic formula. Now, if $f''(x)$ also changes sign at these values of $x$ we have a point of inflection. To see whether this is the case, consider that we now have

$$f''(x) = (2 + 4x + x^2)\,e^{x} = \left(x + 2 + \sqrt{2}\right)\left(x + 2 - \sqrt{2}\right) e^{x} = (x - a)(x - b)\,e^{x},$$

if we let $x = a$ and $x = b$ denote the smaller and the larger values of $x$ we are interested in respectively. This means that, considering the signs of the three factors, we have

            x < a    x = a    a < x < b    x = b    x > b
x - a         -        0          +          +        +
x - b         -        -          -          0        +
e^x           +        +          +          +        +
f''(x)        +        0          -          0        +

so we see that $f''(x)$ does indeed change sign at $x = a$ and $x = b$. Consequently, both $x = a$ and $x = b$ where

$$a = -2 - \sqrt{2} \quad\text{and}\quad b = -2 + \sqrt{2},$$

are points of inflection of the function $f(x) = x^2 e^{x}$.

Solution to activity 4.7

Looking at the figures in question, we have:

Using Figure 4.9(b) we see that the function in Example 4.7 has neither a global maximum (as $f(x) \to \infty$ as $x \to \infty$) nor a global minimum (as $f(x) \to -\infty$ as $x \to -\infty$).

Using Figure 4.10(b) we see that the function in Example 4.8 has a global minimum of zero when $x = 0$ and $x = 1$ but no global maximum as $f(x) \to \infty$ as $x \to \infty$ or $x \to -\infty$.

Using Figure 4.11(b) we see that the function in Example 4.9 has a global maximum of $27e^{-3}$ when $x = 3$ but no global minimum as $f(x) \to -\infty$ as $x \to -\infty$.

Solution to activity 4.8

Looking at the figures in question, we have:

Using Figure 4.9(b) when $-3 \leq x \leq 5$ we see that the function in Example 4.7 has a maximum value of $400/27$ when $x = -5/3$ and a minimum value of $-36$ when $x = 3$.

Using Figure 4.10(b) when $0 \leq x \leq 1$ we see that the function in Example 4.8 has a maximum value of $1/8$ when $x = 1/2$ and a minimum value of zero when $x = 0$ and $x = 1$.

Using Figure 4.11(b) when $0 \leq x \leq 3$ we see that the function in Example 4.9 has a maximum value of $27e^{-3}$ when $x = 3$ and a minimum value of zero when $x = 0$.

Solution to activity 4.9

Looking at the figures in question, we have:

Using Figure 4.12(a) we see that the function has neither a global maximum (as $f(x) \to \infty$ as $x \to 1^+$) nor a global minimum (as $f(x) \to -\infty$ as $x \to 1^-$).

Using Figure 4.12(b) we see that the function has neither a global maximum (as $f(x) \to \infty$ as $x \to 1$) nor a global minimum (as, even though $f(x) \to 0$ as $x \to \infty$ or $x \to -\infty$, it never gets there).

Now, restricting our attention to $0 \leq x < 1$, we can see from the figures that:

Using Figure 4.12(a) we see that the function has a maximum value of $-1$ when $x = 0$ but no minimum value as $f(x) \to -\infty$ as $x \to 1^-$.

Using Figure 4.12(b) we see that the function has a minimum value of $1$ when $x = 0$ but no maximum value as $f(x) \to \infty$ as $x \to 1^-$.

Solution to activity 4.10

Using the information in Example 4.13 and noting that $\pi(1) = -17$, we get the sketch in Figure 4.21 if we are allowed to omit the q-intercepts. From this, we can clearly see that $q = 5$ gives us the maximum value of $\pi(q)$ for $0 \leq q \leq 10$.

Figure 4.21: A sketch of the profit function from Example 4.13 for Activity 4.10. (Note that, as instructed, we have not found the q-intercepts of this profit function.)

Exercises

Exercise 4.1

For what values of $x$ is the function

$$f(x) = x e^{x}$$

increasing or decreasing? Use this information to find and classify any stationary points of this function.

For what values of $x$ is this function convex or concave? Use this information to determine whether this function has any points of inflection.

Exercise 4.2

Consider the function

$$f(x) = -12 \ln x - x^2 + 10x,$$

where $x > 0$. Find the x-coordinates of the stationary points of $f(x)$ and classify them.

Exercise 4.3

Sketch the curve $y = f(x)$ where

$$f(x) = x^3 + \frac{1}{x^3}.$$

Exercise 4.4

Consider the function given by

$$f(x) = 3x^5 - 25x^3 + 60x.$$

(a) Show that the curve $y = f(x)$ has only one x-intercept and find it.

(b) Find the stationary points of this function and classify them.

(c) Sketch the curve $y = f(x)$.

(d) If the domain of $f$ is restricted to values of $x$ such that $-2 \leq x \leq 2$, identify the global maximum and the global minimum of the function $f(x)$.

What are the global maximum and the global minimum of the function if the domain of $f$ is restricted to values of $x$ such that $-3 \leq x \leq 3$?

Exercise 4.5

In Exercise 2.3 we saw how the introduction of a percentage [of the price] tax of $100r\%$ affected a market and we found that the maximum tax that can be imposed is given by $r_m = 1/2$.

What tax, $r^*$, should be imposed if the government wants to maximise its tax revenue, $R(r)$, from this market? Sketch the graph of the tax revenue function, $R(r)$, for values of $r$ that make economic sense.

Solutions to exercises

Solution to exercise 4.1

Using the product rule, we see that the derivative of the function $f(x) = x e^x$ is given by
$$f'(x) = (1)e^x + x(e^x) = (1 + x)e^x,$$
and so, as $e^x > 0$ for all $x \in \mathbb{R}$, we see that the function is

decreasing when $x < -1$ as $f'(x) < 0$, and

increasing when $x > -1$ as $f'(x) > 0$.

Consequently, there is a stationary point when $x = -1$ as $f'(-1) = 0$ and, furthermore, this stationary point is a local minimum as $f(x)$ is decreasing before it and increasing after it.

Using the product rule again, we see that the second-order derivative of the function is given by
$$f''(x) = (1)e^x + (1 + x)(e^x) = (2 + x)e^x,$$
and so, as $e^x > 0$ for all $x \in \mathbb{R}$, we see that the function is

concave when $x < -2$ as $f''(x) < 0$, and

convex when $x > -2$ as $f''(x) > 0$.

Consequently, there is a point of inflection when $x = -2$ as $f''(x)$ changes sign at this point (i.e. the function changes from being concave to being convex at this point).
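As a quick numerical sanity check on these sign claims, the following sketch estimates the first and second derivatives of $x e^x$ by central differences and compares them with $(1 + x)e^x$ and $(2 + x)e^x$. The helper names and step sizes are illustrative assumptions and are not part of the guide.

```python
import math

def f(x):
    # The function from Exercise 4.1.
    return x * math.exp(x)

def first_derivative(func, x, h=1e-5):
    # Central-difference estimate of func'(x); h is an assumed step size.
    return (func(x + h) - func(x - h)) / (2 * h)

def second_derivative(func, x, h=1e-4):
    # Central-difference estimate of func''(x); h is an assumed step size.
    return (func(x + h) - 2 * func(x) + func(x - h)) / (h * h)

for x in (-3.0, -2.0, -1.0, 0.0):
    # Columns: x, estimated f', exact (1+x)e^x, estimated f'', exact (2+x)e^x.
    print(x,
          round(first_derivative(f, x), 6), round((1 + x) * math.exp(x), 6),
          round(second_derivative(f, x), 6), round((2 + x) * math.exp(x), 6))
```

The estimated first derivative changes sign at $x = -1$ and the estimated second derivative changes sign at $x = -2$, in line with the classification above.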

Solution to exercise 4.2

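For $x > 0$, we have
$$f(x) = -12\ln(x) - x^2 + 10x \implies f'(x) = -\frac{12}{x} - 2x + 10.$$
The stationary points of $f(x)$ occur when $f'(x) = 0$ and so we have to solve the equation
$$-\frac{12}{x} - 2x + 10 = 0 \implies -\frac{12 + 2x^2 - 10x}{x} = 0 \implies x^2 - 5x + 6 = 0 \implies (x - 2)(x - 3) = 0.$$
Thus, the stationary points occur when $x = 2$ and $x = 3$.

To classify them, we note that
$$f'(x) = -12x^{-1} - 2x + 10 \implies f''(x) = 12x^{-2} - 2 = \frac{12}{x^2} - 2,$$
and so, when

$x = 2$, we have $f''(2) = 3 - 2 = 1 > 0$ and so this is a local minimum.

$x = 3$, we have $f''(3) = \frac{4}{3} - 2 = -\frac{2}{3} < 0$ and so this is a local maximum.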

Thus, the function, f (x), has a local minimum when x = 2 and a local maximum when

x = 3.
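The working above can be checked with a short sketch. It assumes the reading $f(x) = -12\ln x - x^2 + 10x$ (the signs consistent with the factorisation $(x-2)(x-3) = 0$), solves $x^2 - 5x + 6 = 0$ with the quadratic formula and applies the second-derivative test. The function names are illustrative only.

```python
import math

def f_first(x):
    # f'(x) for f(x) = -12 ln(x) - x^2 + 10x with x > 0 (assumed reading).
    return -12 / x - 2 * x + 10

def f_second(x):
    # f''(x) = 12/x^2 - 2.
    return 12 / x**2 - 2

def classify(x):
    # Second-derivative test at a stationary point.
    s = f_second(x)
    if s > 0:
        return "local minimum"
    if s < 0:
        return "local maximum"
    return "inconclusive"

# Roots of x^2 - 5x + 6 = 0 from the quadratic formula.
disc = (-5) ** 2 - 4 * 1 * 6
stationary = [(5 - math.sqrt(disc)) / 2, (5 + math.sqrt(disc)) / 2]

for x in stationary:
    print(x, f_first(x), classify(x))
```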

Solution to exercise 4.3

To sketch the curve $y = f(x)$ where
$$f(x) = x^3 + \frac{1}{x^3} = x^3 + x^{-3},$$
we need to find the following.

$x$-intercepts: These occur when $y = 0$ and so we solve the equation given by
$$x^3 + \frac{1}{x^3} = 0 \implies \frac{x^6 + 1}{x^3} = 0 \implies x^6 + 1 = 0.$$
But, as $x^6 + 1 > 0$ for all $x \in \mathbb{R}$, we find that this equation has no solutions and so the curve has no $x$-intercepts.

$y$-intercept: This occurs when $x = 0$ and so, as the function is not defined when $x = 0$, we find that the curve has no $y$-intercepts.

finding the stationary points: These occur when $f'(x) = 0$ and so, as
$$f'(x) = 3x^2 - 3x^{-4},$$
we have to solve the equation
$$3x^2 - 3x^{-4} = 0 \implies x^2 = \frac{1}{x^4} \implies x^6 = 1,$$
i.e. the stationary points of $f(x)$ occur when $x = \pm 1$. Then, we use $y = f(x)$ to find the values of $y$ at these points so that we can locate them on the sketch. Doing this, we find that

$x = 1$ gives $y = f(1) = 1 + 1 = 2$, and

$x = -1$ gives $y = f(-1) = -1 - 1 = -2$.

So, the stationary points have coordinates given by $(1, 2)$ and $(-1, -2)$.

classifying the stationary points: The second-order derivative of the function is
$$f''(x) = 6x + 12x^{-5} = 6x + \frac{12}{x^5},$$
and so $f''(1) = 18 > 0$, which means that $(1, 2)$ is a local minimum, whereas $f''(-1) = -18 < 0$, which means that $(-1, -2)$ is a local maximum.


limiting behaviour: We see that $f(x) \to \infty$ as $x \to \infty$ and $f(x) \to -\infty$ as $x \to -\infty$. (Note, in particular, that $1/x^3 \to 0$ as $x \to \pm\infty$ and so the limiting behaviour is determined by the $x^3$ term in $f(x)$.)

In this case, we must also look at what the function is doing near $x = 0$ as it is undefined there. Indeed, here, because of the $1/x^3$ term in $f(x)$, we have

$f(x) \to \infty$ as $x \to 0^+$, and

$f(x) \to -\infty$ as $x \to 0^-$,

i.e. the curve y = f (x) has a vertical asymptote when x = 0. Consequently, using this

information, we can get the sketch in Figure 4.22.

[Figure 4.22: A sketch of the curve $y = x^3 + \frac{1}{x^3}$.]

Indeed, using this sketch, we can clearly see that this function has neither a global

minimum nor a global maximum. In particular, notice that the local minimum is not

global because our local maximum gives us a smaller value of f (x) and the local

maximum is not global since the local minimum gives us a larger value of f (x)!

Solution to exercise 4.4

(a) To find the $x$-intercepts of the curve $y = f(x)$ we set $y = 0$ and solve the equation
$$3x^5 - 25x^3 + 60x = 0 \implies x(3x^4 - 25x^2 + 60) = 0,$$
so that either $x = 0$ or $3x^4 - 25x^2 + 60 = 0$. To deal with this second possibility, we notice that we have a quadratic equation in $x^2$ and so, if we were to use the quadratic formula (say), we get
$$x^2 = \frac{25 \pm \sqrt{25^2 - 4(3)(60)}}{2(3)}.$$
But, the quantity under the square root is negative as
$$25^2 - 4(3)(60) = 625 - 720 = -95,$$
and so this equation gives us no solutions for $x^2$ and, hence, no solutions for $x$. Thus, the only solution to $y = 0$ is $x = 0$ and this is, therefore, the only $x$-intercept of the curve $y = f(x)$.

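(b) The stationary points occur when $f'(x) = 0$ and so, as
$$f'(x) = 15x^4 - 75x^2 + 60,$$
we have to solve the equation
$$15x^4 - 75x^2 + 60 = 0 \implies x^4 - 5x^2 + 4 = 0.$$
This is a quadratic equation in $x^2$ which factorises to give
$$(x^2 - 4)(x^2 - 1) = 0 \implies x^2 = 4, 1,$$
i.e. $x = \pm 2$ and $x = \pm 1$ are the $x$-coordinates of the stationary points.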

To classify these stationary points, we find the second derivative of $f(x)$, i.e.
$$f''(x) = 60x^3 - 150x = 30x(2x^2 - 5),$$
and we can see that

If $x = -2$, we have
$$f''(-2) = 30(-2)(2(-2)^2 - 5) = -60(8 - 5) = -180 < 0,$$
and so this is a local maximum. At this point we also have
$$y = f(-2) = 3(-2)^5 - 25(-2)^3 + 60(-2) = -96 + 200 - 120 = -16,$$
and so the coordinates of this point are $(-2, -16)$.

If $x = -1$, we have
$$f''(-1) = 30(-1)(2(-1)^2 - 5) = -30(2 - 5) = 90 > 0,$$
and so this is a local minimum. At this point we also have
$$y = f(-1) = 3(-1)^5 - 25(-1)^3 + 60(-1) = -3 + 25 - 60 = -38,$$
and so the coordinates of this point are $(-1, -38)$.

If $x = 1$, we have
$$f''(1) = 30(1)(2(1)^2 - 5) = 30(2 - 5) = -90 < 0,$$
and so this is a local maximum. At this point we also have
$$y = f(1) = 3(1)^5 - 25(1)^3 + 60(1) = 3 - 25 + 60 = 38,$$
and so the coordinates of this point are $(1, 38)$.

If $x = 2$, we have
$$f''(2) = 30(2)(2(2)^2 - 5) = 60(8 - 5) = 180 > 0,$$
and so this is a local minimum. At this point we also have
$$y = f(2) = 3(2)^5 - 25(2)^3 + 60(2) = 96 - 200 + 120 = 16,$$
and so the coordinates of this point are $(2, 16)$.

(c) We can use the information that we have found so far together with the observation

that the y-intercept occurs when x = 0, i.e. when y = f (0) = 0, to get the sketch in

Figure 4.23(a).

[Figure 4.23: (a) A sketch of the curve $y = f(x)$ from Exercise 4.4(c). (b) For Exercise 4.4(d), picking out the interval $-2 \le x \le 2$ using vertical dotted lines and the interval $-3 \le x \le 3$ using vertical dashed lines. The labelled points are $(-3, -234)$, $(-2, -16)$, $(-1, -38)$, $(1, 38)$, $(2, 16)$ and $(3, 234)$.]

(d) Given that $-2 \le x \le 2$ and looking at the sketch in Figure 4.23(a), it should be clear that the global maximum and the global minimum of $f(x)$ are at the points $(1, 38)$ and $(-1, -38)$ respectively. If you're unclear about this, this interval is picked out by the vertical dotted lines in Figure 4.23(b).

If we now have $-3 \le x \le 3$, looking at the sketch in Figure 4.23(a), it should be clear that the global maximum and the global minimum of $f(x)$ are at the points $(3, 234)$ as $f(3) = 234$ and $(-3, -234)$ as $f(-3) = -234$ respectively. If you're unclear about this, this interval is picked out by the vertical dashed lines in Figure 4.23(b).
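The reasoning in part (d) amounts to comparing the values of $f(x)$ at the end-points of the interval with its values at the stationary points inside the interval. A minimal sketch of that comparison follows; the function `global_extrema` and the list `STATIONARY` are illustrative names, and the stationary points are taken from part (b) rather than recomputed.

```python
def f(x):
    # The function from Exercise 4.4.
    return 3 * x**5 - 25 * x**3 + 60 * x

# x-coordinates of the stationary points found in part (b).
STATIONARY = [-2, -1, 1, 2]

def global_extrema(lo, hi):
    # Candidates: both end-points plus any stationary point strictly inside.
    candidates = [lo, hi] + [x for x in STATIONARY if lo < x < hi]
    values = {x: f(x) for x in candidates}
    x_max = max(values, key=values.get)
    x_min = min(values, key=values.get)
    return (x_max, values[x_max]), (x_min, values[x_min])

print(global_extrema(-2, 2))
print(global_extrema(-3, 3))
```

On $-2 \le x \le 2$ the extremes come from interior stationary points, whereas on $-3 \le x \le 3$ they come from the end-points, which is why both kinds of candidate have to be checked.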

Solution to exercise 4.5

As we mentioned at the end of Section 4.5.3, the tax revenue, $R(r)$, generated by this tax is given by
$$R(r) = (r p_r) q_r,$$
as it is the tax paid per unit sold, i.e. $r p_r$, multiplied by the quantity sold, i.e. $q_r$, if the market is in equilibrium. If we refer back to Exercise 2.3 for $p_r$ and $q_r$, this then gives us
$$R(r) = r \left(\frac{4 - 8r}{2 - r}\right) \left(\frac{12}{2 - r}\right) = 48\,\frac{r - 2r^2}{(2 - r)^2}.$$

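Now, to find the value of $r$, i.e. $r^*$, that maximises $R(r)$, we differentiate it with respect to $r$ using the quotient and chain rules to get
$$R'(r) = 48\,\frac{(1 - 4r)(2 - r)^2 + 2(r - 2r^2)(2 - r)}{(2 - r)^4} = 48\,\frac{(1 - 4r)(2 - r) + 2(r - 2r^2)}{(2 - r)^3},$$
which simplifies to
$$R'(r) = 48\,\frac{2 - 7r}{(2 - r)^3}.$$
This has a stationary point when $R'(r) = 0$, i.e. when $r = 2/7$, and as $R'(r)$ changes from positive to negative as $r$ goes through this value, we can see that this stationary point is a local maximum.⁶ Now, in this case, we must have $0 \le r \le r_m$ for the market to function and so this is a constrained optimisation problem. That is, the maximum we seek is either the value of $R(r)$ at our local maximum, i.e.
$$R\left(\tfrac{2}{7}\right) = \frac{2}{7}\left(\frac{4 - 8\left(\frac{2}{7}\right)}{2 - \frac{2}{7}}\right)\left(\frac{12}{2 - \frac{2}{7}}\right) = \frac{2}{7}\left(\frac{12/7}{12/7}\right)\left(\frac{12}{12/7}\right) = 2,$$
or the value of $R(r)$ at one of the end-points of this interval. But, at the end-points, we have
$$R(0) = 0 < 2 \quad\text{and}\quad R\left(\tfrac{1}{2}\right) = \frac{1}{2}\left(\frac{4 - 8\left(\frac{1}{2}\right)}{2 - \frac{1}{2}}\right)\left(\frac{12}{2 - \frac{1}{2}}\right) = 0 < 2,$$
and so the maximum value of $R(r)$ is 2 and this occurs when $r = 2/7$, i.e. at the local maximum. Thus, $r^* = 2/7$ and, using the information we have so far, we can get the sketch in Figure 4.24(a) for values of $r$ that make economic sense, i.e. those where $0 \le r \le 1/2$.⁷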
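The arithmetic in this solution can be confirmed exactly with rational numbers. The sketch below assumes the revenue function $R(r) = 48(r - 2r^2)/(2 - r)^2$ and the derivative $R'(r) = 48(2 - 7r)/(2 - r)^3$ as reconstructed from the working; the function names are illustrative.

```python
from fractions import Fraction

def revenue(r):
    # R(r) = 48 (r - 2r^2) / (2 - r)^2, as read from the working above.
    return 48 * (r - 2 * r**2) / (2 - r) ** 2

def revenue_slope(r):
    # R'(r) = 48 (2 - 7r) / (2 - r)^3.
    return 48 * (2 - 7 * r) / (2 - r) ** 3

r_star = Fraction(2, 7)

print(revenue_slope(r_star))      # slope at the candidate maximiser
print(revenue(r_star))            # revenue at the candidate maximiser
print(revenue(Fraction(0)), revenue(Fraction(1, 2)))  # end-point values
```

Using `Fraction` avoids rounding error, so the stationary point and the end-point values can be compared exactly rather than approximately.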

Aside: As shown in Figure 4.24(b), observe that once we move away from the economically meaningful values of $r$ (i.e. where $0 \le r \le 1/2$) the graph of $R(r)$ gets quite complicated. Indeed, note that as
$$R(r) = 48\,\frac{r - 2r^2}{(2 - r)^2},$$
we can see that it has a vertical asymptote when $r = 2$ and, because we can write
$$R(r) = 48\,\frac{r - 2r^2}{(2 - r)^2} = 48\,\frac{-2(r^2 - 4r + 4) - 7r + 8}{4 - 4r + r^2} = -96 - 48\,\frac{7r - 8}{(2 - r)^2},$$
we can see that $R(r) \to -96$ as $r \to \pm\infty$, i.e. we also have a horizontal asymptote here.

⁶ Alternatively, you can show that this stationary point is a local maximum by showing that $R''(r) < 0$ when $r = 2/7$, but this isn't quite so easy.

⁷ Note, in particular, that $r^*$ is clearly not half-way between no tax (i.e. $r = 0$) and the maximum tax (i.e. $r_m = 1/2$) as it was in Example 4.14 when we looked at an excise tax.

[Figure 4.24: For Exercise 4.5: (a) A sketch of the graph of $R(r)$ for the economically meaningful values of $r$, i.e. those between zero (i.e. no tax) and $1/2$ (i.e. the maximum tax). (b) As an aside, we could have sketched the graph of $R(r)$ for some economically meaningless values of $r$ (specifically $r < 0$ and $r > 1/2$). Observe, in particular, the vertical asymptote when $r = 2$ and the horizontal asymptote where $R(r) \to -96$ as $r \to \pm\infty$. (Note that the details of what is happening in the positive quadrant, which we saw in (a), have been omitted from (b) for clarity.)]


Chapter 5

Integration

Essential reading

(For full publication details, see Chapter 1.)

Binmore and Davies (2002) Sections 10.2, parts of 10.3–10.4, 10.5–10.9.

Further reading

Simon and Blume (1994) Appendix A4.

Adams and Essex (2010) Sections 5.5–5.7, 6.1–6.2 and parts of 6.3.

Aims and objectives

The objectives of this chapter are as follows.

To introduce the idea of an integral and see how it can be found using various

techniques.

To use integrals to find areas.

To see how integrals can be used in economics-based subjects.

Specific learning outcomes can be found near the end of this chapter.

5.1

In Chapter 3, we saw how a function, $f(x)$, could be differentiated with respect to $x$ to yield its derivative, which we denoted by
$$\frac{df}{dx} \quad\text{or}\quad f'(x).$$
And, in particular, we saw how to find such derivatives by using the rules of differentiation and some standard derivatives. Now, given a function, $f(x)$, we want to make sense of what it means to find the indefinite integral of this function with respect to $x$, which is denoted by
$$\int f(x)\,dx.$$


In such cases, as we are integrating the function f (x) with respect to x, we call it the

integrand. And, similarly to what we saw before, we will see how to find such integrals

by using the rules of integration and some standard integrals. In particular, the standard

integrals will be closely related to our standard derivatives since the key idea behind our

method for finding integrals will be the idea that integration is the process that

undoes (or reverses) the process of differentiation, i.e. the process of indefinite

integration can be thought of as antidifferentiation and the resulting indefinite integral

can be thought of as an antiderivative.

Consider the functions $F(x)$ and $f(x)$ where we know that $f(x)$ is the derivative¹ of $F(x)$, i.e.
$$\frac{dF}{dx} = f(x).$$
Now, using the idea that integration undoes differentiation, i.e. if we integrate $f(x)$ with respect to $x$ we are looking for a function, $F(x)$, whose derivative is $f(x)$, we can see that
$$\int f(x)\,dx \text{ must be, more or less, given by } F(x).$$
In such cases, we say that $F(x)$ is an antiderivative of $f(x)$ as opposed to, say, the indefinite integral.

However, you may wonder why we say that the function, $F(x)$, that we found above is an, as opposed to the, antiderivative of $f(x)$. The reason for this is that if, instead of the function $F(x)$, we had the function $F(x) + c$ where $c$ is a constant, then its derivative would still be $f(x)$, i.e.
$$\frac{d}{dx}\left[F(x) + c\right] = f(x),$$
and so
$$\int f(x)\,dx \text{ can also, more or less, be given by } F(x) + c,$$
where $c$ is a constant. That is, $F(x) + c$ is also an antiderivative of $f(x)$ for this constant $c$.

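Example 5.1

The function $4x^2$ is an antiderivative of $8x$ as we can differentiate $4x^2$ to get $8x$. But, we can see that $4x^2 + 1$ is also an antiderivative of $8x$ as we can differentiate $4x^2 + 1$ to get $8x$.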

As such, because this works for any constant $c$ we add to $F(x)$, we say that the indefinite integral gives us a whole family of antiderivatives which only differ by a constant, i.e. the choice of $c$. In this way, we say that indefinite integration, i.e. the process of finding
$$\int f(x)\,dx,$$
is antidifferentiation, i.e. it seeks all the functions $F(x) + c$ that can be differentiated to yield $f(x)$ and, as such, every one of these functions will be an antiderivative of $f(x)$.

¹ We say that it is the derivative because differentiation always yields exactly one answer.

Example 5.2

What is $\int 8x\,dx$?

We saw in Example 5.1 that $4x^2$ is an antiderivative of $8x$. This means that
$$\int 8x\,dx = 4x^2 + c,$$
where $c$ is an arbitrary (i.e. any) constant. Notice that this works because differentiating $4x^2 + c$ we get $8x$.
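To see the idea of a whole family of antiderivatives in action, the following sketch differentiates $4x^2 + c$ numerically for several values of $c$ and confirms that each member of the family has derivative $8x$. The helper `derivative`, its step size and the sample values of $c$ are assumptions made for illustration.

```python
def derivative(func, x, h=1e-6):
    # Central-difference estimate of func'(x); h is an assumed step size.
    return (func(x + h) - func(x - h)) / (2 * h)

def antiderivative(c):
    # One member of the family 4x^2 + c.
    return lambda x: 4 * x**2 + c

for c in (-3.0, 0.0, 1.0, 10.0):
    F = antiderivative(c)
    # Each row should be close to 8x at x = -1, 0.5 and 2, whatever c is.
    print(c, [round(derivative(F, x), 4) for x in (-1.0, 0.5, 2.0)])
```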

Generally speaking then, we have the following.

If $F(x)$ is a function whose derivative is the function $f(x)$, then we have
$$\int f(x)\,dx = F(x) + c,$$
where $c$ is an arbitrary constant. In particular, we call the

function, $f(x)$, the integrand as it is what we are integrating,

function, $F(x)$, an antiderivative as its derivative is $f(x)$,

constant, $c$, a constant of integration which is completely arbitrary,² and

integral, $\int f(x)\,dx$, the indefinite integral.

Now that we have the idea, let's see how we're going to actually find the indefinite integrals of the functions that commonly occur in this course.

5.2

The previous section told us how to find indefinite integrals using the antiderivatives,

but now we want to explore a more convenient way of finding them. The key idea is

that we introduce standard integrals which tell us how to integrate the basic functions

that we saw in Chapter 2. Once we know how to integrate these, the rules of integration

will allow us to integrate combinations of these functions.

5.2.1 Standard integrals

We have already seen, in Example 5.2, how an antiderivative can be used to show that the function $f(x) = 8x$ has an indefinite integral given by
$$\int 8x\,dx = 4x^2 + c,$$
where $c$ is an arbitrary constant. We now state some results that will allow us to find the indefinite integrals of our other basic functions.

² As we can add any constant to $F(x)$ to account for the fact that $F(x) + c$, for any constant $c \in \mathbb{R}$, is also an antiderivative.

Power functions

If $n \ne -1$, we have
$$\int x^n\,dx = \frac{x^{n+1}}{n+1} + c,$$
where $c$ is an arbitrary constant and this works because
$$\frac{d}{dx}\left[\frac{x^{n+1}}{n+1} + c\right] = \frac{(n+1)x^n}{n+1} + 0 = x^n.$$
In particular, if $n = 0$, we have
$$\int 1\,dx = \int x^0\,dx = x + c.$$
However, if we have $n = -1$, we have
$$\int \frac{1}{x}\,dx = \int x^{-1}\,dx = \ln|x| + c,$$
where we need the modulus sign in $\ln|x|$ as $x$ may be negative but the logarithm function is only defined for $x > 0$. This works because, if $x > 0$, we have $|x| = x$ and so
$$\frac{d\ln|x|}{dx} = \frac{d\ln(x)}{dx} = \frac{1}{x},$$
whereas if $x < 0$, we have $|x| = -x$ and so
$$\frac{d\ln|x|}{dx} = \frac{d\ln(-x)}{dx} = \frac{-1}{-x} = \frac{1}{x},$$
i.e. the derivative of $\ln|x|$ is $1/x$ in both cases.

If we are using $e$, we have
$$\int e^x\,dx = e^x + c,$$
where $c$ is an arbitrary constant and this works because
$$\frac{d}{dx}\left[e^x + c\right] = e^x.$$
However, there is no nice standard integral for $\ln x$ and so we'll see how to find
$$\int \ln x\,dx,$$
when we encounter integration by parts in Example 5.20.

If we have another base, $a$, the standard integrals are not so simple. But, we can see that
$$\int a^x\,dx = \frac{a^x}{\ln a} + c,$$
where $c$ is an arbitrary constant since, using the result from Activity 3.9, we have
$$\frac{d}{dx}\left[\frac{a^x}{\ln a} + c\right] = \frac{a^x \ln a}{\ln a} + 0 = a^x.$$
However, there is also no nice standard integral for $\log_a x$ and so we'll see how to find
$$\int \log_a x\,dx,$$
in Activity 5.12 where we will use the change of base formula once we can integrate $\ln x$.

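Sine and cosine functions

For the sine and cosine function we find that
$$\int \sin x\,dx = -\cos x + c \quad\text{and}\quad \int \cos x\,dx = \sin x + c,$$
where $c$ is an arbitrary constant and these work because
$$\frac{d}{dx}\left[-\cos x + c\right] = -(-\sin x) + 0 = \sin x \quad\text{and}\quad \frac{d}{dx}\left[\sin x + c\right] = \cos x + 0 = \cos x.$$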
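Each standard integral in this section is justified by differentiating the claimed answer, and the same check can be run numerically. The sketch below pairs each integrand with its stated antiderivative (taking $c = 0$) and compares a central-difference derivative with the integrand at a few sample points, including negative ones so that $\ln|x|$ is exercised. The table name `STANDARD`, the base $a = 2$ and the power $n = 3$ are illustrative choices.

```python
import math

def derivative(func, x, h=1e-6):
    # Central-difference estimate of func'(x); h is an assumed step size.
    return (func(x + h) - func(x - h)) / (2 * h)

a = 2.0  # illustrative base for a^x
n = 3    # illustrative power, with n != -1

# (label, integrand, claimed antiderivative with c = 0)
STANDARD = [
    ("x^n",   lambda x: x**n,  lambda x: x**(n + 1) / (n + 1)),
    ("1/x",   lambda x: 1 / x, lambda x: math.log(abs(x))),
    ("e^x",   math.exp,        math.exp),
    ("a^x",   lambda x: a**x,  lambda x: a**x / math.log(a)),
    ("sin x", math.sin,        lambda x: -math.cos(x)),
    ("cos x", math.cos,        math.sin),
]

def max_error(integrand, antiderivative, points=(-2.0, -0.5, 0.7, 1.5)):
    # Largest gap between the numerical derivative and the integrand.
    return max(abs(derivative(antiderivative, x) - integrand(x)) for x in points)

for label, f, F in STANDARD:
    print(label, max_error(f, F))
```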

5.2.2

In Section 2.1.2, we saw that there are several standard ways of making new functions

from old ones and, in Section 3.2.2, we saw how the rules of differentiation could be

used to differentiate these new functions. Here we will see how we can use standard

integrals, i.e. the integrals of our basic functions, and rules of integration to integrate

the new functions that are created from these basic ones in these standard ways. We

start with the most straightforward of these which allows us to integrate linear

combinations of functions.

The linear combination rule

If $k$ and $l$ are constants, this allows us to integrate the linear combination, $kf(x) + lg(x)$, of two functions $f(x)$ and $g(x)$. It states that
$$\int \left[kf(x) + lg(x)\right]dx = k\int f(x)\,dx + l\int g(x)\,dx.$$
Indeed, this gives us three more basic rules straightaway, i.e. the

constant multiple rule: If $k$ is a constant and $f(x)$ is a function, then
$$\int kf(x)\,dx = k\int f(x)\,dx.$$

sum rule: If $f(x)$ and $g(x)$ are functions, then
$$\int \left[f(x) + g(x)\right]dx = \int f(x)\,dx + \int g(x)\,dx.$$

difference rule: If $f(x)$ and $g(x)$ are functions, then
$$\int \left[f(x) - g(x)\right]dx = \int f(x)\,dx - \int g(x)\,dx.$$

Activity 5.1 Derive the constant multiple, sum and difference rules from the linear

combination rule.

Activity 5.2

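Example 5.3

Using these rules we see that:
$$\int 3x^2\,dx = 3\int x^2\,dx = 3\left(\frac{x^3}{3}\right) + c = x^3 + c \text{ by the constant multiple rule,}$$
$$\int \left(x^2 + x^{1/2}\right)dx = \frac{x^3}{3} + \frac{x^{3/2}}{3/2} + c = \frac{1}{3}x^3 + \frac{2}{3}x^{3/2} + c \text{ by the sum rule,}$$
$$\int \left(\frac{3}{x} - 4e^x\right)dx = 3\ln|x| - 4e^x + c \text{ by the linear combination rule,}$$
where $c$ is an arbitrary constant.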

So, in the case of linear combinations of functions such as these, we see that the integral

of the linear combination is given by the linear combination of the integrals.
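The linear combination rule can be mirrored directly in code: given antiderivatives of the pieces, the same linear combination of them is an antiderivative of the whole. The sketch below does this for $\frac{3}{x} - 4e^x$ and checks the result by numerical differentiation. The helper names are illustrative, and the constant of integration is taken to be zero.

```python
import math

def derivative(func, x, h=1e-6):
    # Central-difference estimate of func'(x); h is an assumed step size.
    return (func(x + h) - func(x - h)) / (2 * h)

def linear_combination(k, F, l, G):
    # If F' = f and G' = g, then (kF + lG)' = kf + lg.
    return lambda x: k * F(x) + l * G(x)

# Pieces: 1/x has antiderivative ln|x|, and e^x has antiderivative e^x.
antiderivative = linear_combination(3, lambda x: math.log(abs(x)), -4, math.exp)

def integrand(x):
    return 3 / x - 4 * math.exp(x)

for x in (-2.0, -0.5, 0.7, 1.5):
    print(x, derivative(antiderivative, x), integrand(x))
```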

Activity 5.3 Use the rules above to integrate the following functions with respect to $x$.

(a) $3\cos x$,

(b) $e^x + \cos x$,

(c) $3\sin x - \frac{3}{x}$.

We now look at the other rules of integration, i.e. the ones that will allow us to

integrate other combinations of functions. But, unlike what we saw with the rules of

differentiation in Section 3.2.2, we shall see that these are harder to apply.

5.2.3 Integration by substitution

Integration by substitution is a way of dealing with integrands which involve the composition of two functions and, as such, it is closely related to the chain rule of differentiation. To see how it works, we will start by seeing how integration by substitution is related to the chain rule and then we will describe how to apply this rule. We will then apply this rule in some simple examples and then some harder ones.

We start by noting that the chain rule for differentiation tells us that if
$$h(x) = (f \circ g)(x) = f(g(x)),$$
then we write $h$ as $f(g)$ so that, on differentiating, we get
$$\frac{dh}{dx} = \frac{df}{dg}\,\frac{dg}{dx}.$$
But, because of this we can see that
$$h(x) = (f \circ g)(x) = f(g(x)) \text{ is an antiderivative of } \frac{dh}{dx} = \frac{df}{dg}\,\frac{dg}{dx},$$
i.e. we have
$$\int \frac{df}{dg}\,\frac{dg}{dx}\,dx = f(g(x)) + c,$$
which is the basis of integration by substitution. However, this is quite hard to apply and so, as a useful way of applying this rule, we think of
$$dg \text{ as } \frac{dg}{dx}\,dx,$$
so that we have
$$\int \frac{df}{dg}\,dg = f(g) + c,$$
and this is the key to the method that we shall be using here.

How to integrate by substitution

We can now see how to apply integration by substitution. The basic idea is that, if you

are given an integrand that involves a composition of two functions, this rule of

integration sometimes allows you to turn it into an easier integral by making a

substitution. That is:

The integral involves the derivative of a composition and has the form
$$\int f(g(x))\,g'(x)\,dx.$$
Write $f(g(x))$ as $f(g)$ and $g'(x)\,dx$ as $dg$. This should give you the easier integral
$$\int f(g)\,dg.$$
Find this integral and replace all occurrences of $g$ with $g(x)$ to get your final answer.

Now, to make this clearer, let's look at some examples.

Some simple applications of integration by substitution

Easy integrations by substitution involve an integrand which is nothing more than a

simple composition of two functions and so there can be no doubt about which function

should be g. To see this, let's consider what happens when we want to integrate a

simple composition which involves the function 3x + 1.


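Example 5.4

Find $\int (3x + 1)^2\,dx$.

Taking $g = 3x + 1$ we have $\frac{dg}{dx} = 3$ and so $dg = 3\,dx$, i.e. $dx = \frac{1}{3}\,dg$. Hence, substitution gives
$$\int (3x + 1)^2\,dx = \int g^2\,\frac{1}{3}\,dg = \frac{1}{3}\int g^2\,dg = \frac{1}{3}\,\frac{g^3}{3} + c = \frac{(3x + 1)^3}{9} + c,$$
where $c$ is an arbitrary constant.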

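Example 5.5

Find $\int \frac{1}{3x + 1}\,dx$.

Taking $g = 3x + 1$ we have $\frac{dg}{dx} = 3$ and so $dg = 3\,dx$, i.e. $dx = \frac{1}{3}\,dg$. Hence, substitution gives
$$\int \frac{1}{3x + 1}\,dx = \int \frac{1}{g}\,\frac{1}{3}\,dg = \frac{1}{3}\int g^{-1}\,dg = \frac{1}{3}\ln|g| + c = \frac{1}{3}\ln|3x + 1| + c,$$
where $c$ is an arbitrary constant.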

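Example 5.6

Find $\int e^{3x+1}\,dx$.

Taking $g = 3x + 1$ we have $\frac{dg}{dx} = 3$ and so $dg = 3\,dx$, i.e. $dx = \frac{1}{3}\,dg$. Hence, substitution gives
$$\int e^{3x+1}\,dx = \int e^g\,\frac{1}{3}\,dg = \frac{1}{3}\int e^g\,dg = \frac{1}{3}e^g + c = \frac{1}{3}e^{3x+1} + c,$$
where $c$ is an arbitrary constant.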

In particular, observe what changes in these examples and what stays the same. Indeed,

just for comparison, we can see what would happen if we had a composition which is

like the one in Example 5.4 but it now involves the function 4x + 7 instead of 3x + 1.

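Example 5.7

Find $\int (4x + 7)^2\,dx$.

Taking $g = 4x + 7$ we have $\frac{dg}{dx} = 4$ and so $dg = 4\,dx$, i.e. $dx = \frac{1}{4}\,dg$. Hence, substitution gives
$$\int (4x + 7)^2\,dx = \int g^2\,\frac{1}{4}\,dg = \frac{1}{4}\int g^2\,dg = \frac{1}{4}\,\frac{g^3}{3} + c = \frac{(4x + 7)^3}{12} + c,$$
where $c$ is an arbitrary constant.

Activity 5.4 Find $\int \frac{1}{4x + 7}\,dx$ and $\int e^{4x+7}\,dx$.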

Note that in all of these examples, the substitution works because we have $g(x) = ax + b$ and hence
$$\frac{dg}{dx} = a \implies dg = a\,dx \implies \frac{1}{a}\,dg = dx,$$
so that, as $\frac{1}{a}$ is a constant, it can be moved out of the integral using the constant multiple rule of integration. So, if our integrand is a composition, i.e. $f(g(x))$, and $g(x)$ is a linear function, i.e. it has the form $ax + b$ where $a \ne 0$ and $b$ are constants, this kind of substitution will always work and this leads to the general result that
$$\int f(ax + b)\,dx = \frac{1}{a}F(ax + b) + c,$$
where $F$ is an antiderivative of $f$ and $c$ is an arbitrary constant.
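The general result for a linear inner function can be expressed as a small helper that turns an antiderivative $F$ of $f$ into an antiderivative of $f(ax + b)$. The sketch below applies it to the integrands of Examples 5.4 to 5.6 with $a = 3$ and $b = 1$ and checks each answer by numerical differentiation. The name `substitute_linear` is hypothetical, and the constant of integration is taken to be zero.

```python
import math

def derivative(func, x, h=1e-6):
    # Central-difference estimate of func'(x); h is an assumed step size.
    return (func(x + h) - func(x - h)) / (2 * h)

def substitute_linear(F, a, b):
    # Given F with F' = f, return x -> F(ax + b)/a, which should have
    # derivative f(ax + b) provided that a is non-zero.
    if a == 0:
        raise ValueError("a must be non-zero")
    return lambda x: F(a * x + b) / a

# (label, f, an antiderivative F of f)
cases = [
    ("(3x+1)^2", lambda u: u**2, lambda u: u**3 / 3),
    ("1/(3x+1)", lambda u: 1 / u, lambda u: math.log(abs(u))),
    ("e^(3x+1)", math.exp, math.exp),
]

a, b = 3, 1
for label, f, F in cases:
    G = substitute_linear(F, a, b)
    error = max(abs(derivative(G, x) - f(a * x + b)) for x in (-1.0, 0.2, 0.5))
    print(label, error)
```

The factor $\frac{1}{a}$ appears once, inside `substitute_linear`, which is exactly the step of moving the constant out of the integral.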

Activity 5.5 Suppose that $a \ne 0$ and $b$ are constants. Use this result to find an expression for
$$\int (ax + b)^n\,dx,$$
when $n$ is a constant. Also find expressions for
$$\int e^{ax+b}\,dx, \quad \int \sin(ax + b)\,dx \quad\text{and}\quad \int \cos(ax + b)\,dx.$$
What happens if $a = 0$?

Activity 5.6 Using the expressions you found in Activity 5.5, verify your answers

to Activity 5.4.

Some less simple applications of integration by substitution

We will also see slightly harder integrations by substitution where the integrand

involves a composition of two functions multiplied by another function. Although, even

in these cases, there can be little doubt about which function should be g. To see this,

let's consider what happens when we want to integrate a simple composition which involves the function $x^2 + 1$.

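Example 5.8

Find $\int (x^2 + 1)^2\,x\,dx$.

Taking $g = x^2 + 1$ we have $\frac{dg}{dx} = 2x$ and so $dg = 2x\,dx$, i.e. $x\,dx = \frac{1}{2}\,dg$. Hence, substitution gives
$$\int (x^2 + 1)^2\,x\,dx = \int g^2\,\frac{1}{2}\,dg = \frac{1}{2}\int g^2\,dg = \frac{1}{2}\,\frac{g^3}{3} + c = \frac{(x^2 + 1)^3}{6} + c,$$
i.e. the extra $x$ in the integrand was actually needed for the substitution $g = x^2 + 1$ to work.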

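Example 5.9

Find $\int \frac{x}{x^2 + 1}\,dx$.

Taking $g = x^2 + 1$ we have $\frac{dg}{dx} = 2x$ and so $dg = 2x\,dx$, i.e. $x\,dx = \frac{1}{2}\,dg$. Hence, substitution gives
$$\int \frac{x}{x^2 + 1}\,dx = \int \frac{1}{g}\,\frac{1}{2}\,dg = \frac{1}{2}\int g^{-1}\,dg = \frac{1}{2}\ln|g| + c = \frac{1}{2}\ln|x^2 + 1| + c,$$
i.e. the extra $x$ in the integrand was, again, needed for the substitution $g = x^2 + 1$ to work.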

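Example 5.10

Find $\int x e^{x^2+1}\,dx$.

Taking $g = x^2 + 1$ we have $\frac{dg}{dx} = 2x$ and so $dg = 2x\,dx$, i.e. $x\,dx = \frac{1}{2}\,dg$. Hence, substitution gives
$$\int x e^{x^2+1}\,dx = \int e^{x^2+1}\,x\,dx = \int e^g\,\frac{1}{2}\,dg = \frac{1}{2}\int e^g\,dg = \frac{1}{2}e^g + c = \frac{1}{2}e^{x^2+1} + c,$$
i.e. the extra $x$ in the integrand was, again, needed for the substitution $g = x^2 + 1$ to work.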

In particular, observe what changes in these examples and what stays the same. Indeed,

just for comparison, we can see what would happen if we had a composition which is

like the one in Example 5.8 but it now involves the function $3x^2 + 7$ instead of $x^2 + 1$.

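Example 5.11

Find $\int (3x^2 + 7)^2\,x\,dx$.

Taking $g = 3x^2 + 7$ we have $\frac{dg}{dx} = 6x$ and so $dg = 6x\,dx$, i.e. $x\,dx = \frac{1}{6}\,dg$. Hence, substitution gives
$$\int (3x^2 + 7)^2\,x\,dx = \int g^2\,\frac{1}{6}\,dg = \frac{1}{6}\int g^2\,dg = \frac{1}{6}\,\frac{g^3}{3} + c = \frac{(3x^2 + 7)^3}{18} + c,$$
i.e. the extra $x$ in the integrand was actually needed for the substitution $g = 3x^2 + 7$ to work.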

Activity 5.7 Find $\int \frac{x}{3x^2 + 7}\,dx$ and $\int x e^{3x^2+7}\,dx$.

To summarise, it is worth noting that in all of these examples, the substitution works because we have $g(x) = ax^2 + b$ and hence
$$\frac{dg}{dx} = 2ax \implies dg = 2ax\,dx,$$
where $a \ne 0$ and $b$ are constants. But, $2ax$ is not a constant and so we cannot deal with this by taking it out of the integral as we did in the last set of examples. However, in these cases, the substitution still works because we have
$$\frac{dg}{dx} = 2ax \implies dg = 2ax\,dx \implies \frac{1}{2a}\,dg = x\,dx,$$
and there is also an $x$ in the integrand to facilitate the transition from $dx$ to $dg$. Indeed, in the absence of this extra $x$, the substitution would produce a more complicated integral and we would not be able to proceed!

Integration by substitution more generally

The general lesson that we should be drawing from the last two sets of examples is that integration by substitution works when we have an integrand which is the product of

the composition of two functions $f(g(x))$, and

a constant multiple of $g'(x)$.

The first of these enables us to replace $f(g(x))$ with $f(g)$ and the second enables us to replace $dx$ with some constant multiple of $dg$. Having done this, the substitution has turned a hard integral into an easier one and we can proceed. Let's now consider some more complicated examples.

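Example 5.12

Find $\int (x^3 + x^2)^7 (3x^2 + 2x)\,dx$.

Here the composition is $(x^3 + x^2)^7$ and so we take $g = x^3 + x^2$. As such, we have
$$\frac{dg}{dx} = 3x^2 + 2x,$$
which is the other part of the product in the integrand, i.e. this substitution will work. Thus, we see that
$$dg = (3x^2 + 2x)\,dx,$$
and so the substitution gives
$$\int (x^3 + x^2)^7 (3x^2 + 2x)\,dx = \int g^7\,dg = \frac{g^8}{8} + c = \frac{(x^3 + x^2)^8}{8} + c.$$
Here, the extra $3x^2 + 2x$ in the integrand was needed for the substitution $g = x^3 + x^2$ to work.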

Example 5.13

Find $\int \frac{2x + 2}{x^2 + 2x + 2}\,dx$.

Here the composition is $(x^2 + 2x + 2)^{-1}$ and so we take $g = x^2 + 2x + 2$. As such, we have
$$\frac{dg}{dx} = 2x + 2,$$
which is the other part of the product in the integrand, i.e. this substitution will work. Thus, we see that
$$dg = (2x + 2)\,dx,$$
and so the substitution gives
$$\int \frac{2x + 2}{x^2 + 2x + 2}\,dx = \int \frac{1}{g}\,dg = \ln|g| + c = \ln|x^2 + 2x + 2| + c.$$
Here, the extra $2x + 2$ in the integrand was needed for the substitution $g = x^2 + 2x + 2$ to work.

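Example 5.14

Find $\int (x^2 + 1)\,e^{x^3+3x+7}\,dx$.

Here the composition is $e^{x^3+3x+7}$ and so we take $g = x^3 + 3x + 7$. As such, we have
$$\frac{dg}{dx} = 3x^2 + 3 = 3(x^2 + 1),$$
which is a constant multiple of the other part of the product in the integrand, i.e. this substitution will work. Thus, we see that
$$dg = 3(x^2 + 1)\,dx \implies \frac{1}{3}\,dg = (x^2 + 1)\,dx,$$
and so the substitution gives
$$\int (x^2 + 1)\,e^{x^3+3x+7}\,dx = \int e^g\,\frac{1}{3}\,dg = \frac{1}{3}\int e^g\,dg = \frac{1}{3}e^g + c = \frac{1}{3}e^{x^3+3x+7} + c.$$
Here, the extra $x^2 + 1$ in the integrand was needed for the substitution $g = x^3 + 3x + 7$ to work.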
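The answers to Examples 5.12 to 5.14 can be checked in the same way as any indefinite integral, namely by differentiating them. The sketch below does this numerically, using a relative error because the integrand in Example 5.14 takes large values. The integrands are as read from the worked solutions, the helper names are illustrative, and the constant of integration is taken to be zero.

```python
import math

def derivative(func, x, h=1e-6):
    # Central-difference estimate of func'(x); h is an assumed step size.
    return (func(x + h) - func(x - h)) / (2 * h)

# (example label, integrand, claimed antiderivative with c = 0)
EXAMPLES = [
    ("5.12",
     lambda x: (x**3 + x**2) ** 7 * (3 * x**2 + 2 * x),
     lambda x: (x**3 + x**2) ** 8 / 8),
    ("5.13",
     lambda x: (2 * x + 2) / (x**2 + 2 * x + 2),
     lambda x: math.log(abs(x**2 + 2 * x + 2))),
    ("5.14",
     lambda x: (x**2 + 1) * math.exp(x**3 + 3 * x + 7),
     lambda x: math.exp(x**3 + 3 * x + 7) / 3),
]

def relative_error(integrand, antiderivative, x):
    # Gap between the numerical derivative and the integrand, scaled by size.
    exact = integrand(x)
    return abs(derivative(antiderivative, x) - exact) / max(1.0, abs(exact))

for label, f, F in EXAMPLES:
    print(label, max(relative_error(f, F, x) for x in (-1.2, -0.3, 0.4)))
```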

Activity 5.8 Find $\int x \sin(x^2)\,dx$.

Sometimes we can straightforwardly apply what we have just seen to find integrals that

involve compositions of trigonometric functions as the following examples show.

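Example 5.15

Find $\int \sin^2 x \cos x\,dx$.

Here the composition is $(\sin x)^2$ and so we take $g = \sin x$. As such, we have
$$\frac{dg}{dx} = \cos x,$$
which is the other part of the product in the integrand, i.e. this substitution will work. Thus, we see that
$$dg = \cos x\,dx,$$
and so the substitution gives
$$\int \sin^2 x \cos x\,dx = \int g^2\,dg = \frac{g^3}{3} + c = \frac{1}{3}\sin^3 x + c.$$
Here, of course, the extra $\cos x$ in the integrand was needed for the substitution $g = \sin x$ to work.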

Activity 5.9

Find

Indeed, as the next example shows, this kind of substitution allows us to find another

useful result.

Example 5.16

Find $\int \tan x\,dx$.

Here we note that
$$\tan x = \frac{\sin x}{\cos x},$$
which means that the composition is $(\cos x)^{-1}$ and so we take $g = \cos x$. As such, we have
$$\frac{dg}{dx} = -\sin x,$$
which, up to a minus, is the other part of the product in the integrand, i.e. this substitution will work. Thus, we see that
$$dg = -\sin x\,dx,$$
and so the substitution gives
$$\int \tan x\,dx = \int \frac{\sin x}{\cos x}\,dx = -\int \frac{dg}{g} = -\ln|g| + c = -\ln|\cos x| + c.$$
Here, of course, the extra $\sin x$ in the integrand was needed for the substitution $g = \cos x$ to work.

Activity 5.10 Find $\int \cot x\,dx$.

However, not every trigonometric substitution is so easy to spot as the next example

shows.

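Example 5.17

Find $\int \frac{dx}{(x + a)^2 + b^2}$.

Here, for reasons that will soon become apparent, we make the substitution $x + a = b\tan\theta$. As such, differentiating both sides of this expression with respect to $\theta$, we have
$$\frac{dx}{d\theta} = b\sec^2\theta \implies dx = b\sec^2\theta\,d\theta.$$
This means that our integral becomes
$$\int \frac{dx}{(x + a)^2 + b^2} = \int \frac{b\sec^2\theta}{b^2\tan^2\theta + b^2}\,d\theta = \int \frac{\sec^2\theta}{b\sec^2\theta}\,d\theta = \int \frac{d\theta}{b},$$
if we use the trigonometric identity $\tan^2\theta + 1 = \sec^2\theta$ from (2.4). This then gives us
$$\int \frac{d\theta}{b} = \frac{\theta}{b} + c = \frac{1}{b}\tan^{-1}\left(\frac{x + a}{b}\right) + c,$$
since $x + a = b\tan\theta$ and where $c$ is an arbitrary constant. Thus, we have found that
$$\int \frac{dx}{(x + a)^2 + b^2} = \frac{1}{b}\tan^{-1}\left(\frac{x + a}{b}\right) + c.$$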
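Since the trigonometric substitution is less transparent than the earlier ones, it is reassuring to check the final formula by differentiating it. The sketch below does this numerically for one illustrative choice of the constants, $a = 2$ and $b = 3$; these values, the helper names and the step size are assumptions made for the check, and $b$ must be non-zero.

```python
import math

def derivative(func, x, h=1e-6):
    # Central-difference estimate of func'(x); h is an assumed step size.
    return (func(x + h) - func(x - h)) / (2 * h)

def integrand(a, b):
    # 1 / ((x + a)^2 + b^2)
    return lambda x: 1 / ((x + a) ** 2 + b ** 2)

def antiderivative(a, b):
    # (1/b) * arctan((x + a)/b), with the constant c taken as 0 and b != 0.
    return lambda x: math.atan((x + a) / b) / b

a, b = 2.0, 3.0  # illustrative constants
for x in (-4.0, -2.0, 0.0, 1.5):
    print(x, derivative(antiderivative(a, b), x), integrand(a, b)(x))
```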

Activity 5.11 Find $\int \frac{dx}{x^2 + 2x + 2}$. (Hint: Complete the square in the denominator.)

We will see other examples of how trigonometric identities can be used when finding

integrals in Section 5.2.6.

5.2.4 Integration by parts

Integration by parts is a way of dealing with integrands which involve the product of

two functions and, as such, it is closely related to the product rule of differentiation. To

see how it works, we will start by seeing how integration by parts is related to the

product rule and then we will describe how to apply this rule. We will then see some

examples of how it can be applied.

Why integration by parts works

We start by noting that the product rule for differentiation tells us that
$$\frac{d}{dx}\left[f(x)g(x)\right] = f'(x)g(x) + f(x)g'(x).$$
So, integrating both sides with respect to $x$, we get
$$\int \frac{d}{dx}\left[f(x)g(x)\right]dx = \int f'(x)g(x)\,dx + \int f(x)g'(x)\,dx,$$
which, on noting that integration undoes differentiation, yields
$$f(x)g(x) = \int f'(x)g(x)\,dx + \int f(x)g'(x)\,dx \implies \int f(x)g'(x)\,dx = f(x)g(x) - \int f'(x)g(x)\,dx,$$
which is the rule that we call integration by parts.

How to integrate by parts

Observe that integration by parts allows us to write one integral in terms of another and so a successful application of this rule requires a good choice of $f(x)$ and $g'(x)$, i.e. one where it is straightforward to integrate $g'(x)$ and the new integral is easier to find than the old one. That is:

The integral involves a product of two functions and has the form $\int f(x)g'(x)\,dx$.

Choose $f(x)$ and $g'(x)$ so that we can differentiate $f(x)$ to get $f'(x)$ and straightforwardly integrate $g'(x)$ to get $g(x)$.

Apply the formula and make sure that the new integral, $\int f'(x)g(x)\,dx$, is easier to integrate.

If it is, proceed. If it is not, then you have been unwise in your choice of $f(x)$ and $g'(x)$.

Let's look at some simple examples of how it works.

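Example 5.18

Find $\int x e^x\,dx$.

To apply integration by parts, we choose
$$f(x) = x \quad\text{and}\quad g'(x) = e^x,$$
so that differentiating $f(x)$ and integrating $g'(x)$ we get
$$f'(x) = 1 \quad\text{and}\quad g(x) = e^x,$$
where we have suppressed the arbitrary constant from the integration. Applying the rule then gives,
$$\int x e^x\,dx = (x)(e^x) - \int (1)(e^x)\,dx = x e^x - \int e^x\,dx,$$
and, clearly, the new integral is easier to find. Thus, finding this integral, we get
$$\int x e^x\,dx = x e^x - \int e^x\,dx = x e^x - e^x + c = (x - 1)e^x + c,$$
as the answer.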


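Warning! Observe that if we had chosen $f(x)$ and $g'(x)$ differently, we would have got
$$f(x) = e^x \quad\text{and}\quad g'(x) = x,$$
so that differentiating $f(x)$ and integrating $g'(x)$ we would have got
$$f'(x) = e^x \quad\text{and}\quad g(x) = \frac{x^2}{2},$$
where we have suppressed the arbitrary constant from the integration. Applying the rule then gives,
$$\int x e^x\,dx = (e^x)\left(\frac{x^2}{2}\right) - \int (e^x)\left(\frac{x^2}{2}\right)dx = \frac{x^2 e^x}{2} - \frac{1}{2}\int x^2 e^x\,dx,$$
and the new integral is harder to find than the one we started with.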

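Example 5.19

Find $\int x \ln x\,dx$.

To apply integration by parts, we choose
$$f(x) = \ln x \quad\text{and}\quad g'(x) = x,$$
so that differentiating $f(x)$ and integrating $g'(x)$ we get
$$f'(x) = \frac{1}{x} \quad\text{and}\quad g(x) = \frac{x^2}{2},$$
where we have suppressed the arbitrary constant from the integration. Applying the rule then gives,
$$\int x \ln x\,dx = (\ln x)\left(\frac{x^2}{2}\right) - \int \left(\frac{1}{x}\right)\left(\frac{x^2}{2}\right)dx = \frac{x^2}{2}\ln x - \int \frac{x}{2}\,dx,$$
and, clearly, the new integral is easier to find. Thus, finding this integral, we get
$$\int x \ln x\,dx = \frac{x^2}{2}\ln x - \int \frac{x}{2}\,dx = \frac{x^2}{2}\ln x - \frac{x^2}{4} + c,$$
as the answer.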

Warning! Observe that if we had chosen $f(x)$ and $g'(x)$ differently, we would have got
$$f(x) = x \quad\text{and}\quad g'(x) = \ln x.$$
This would have been bad because we can't integrate $g'(x) = \ln x$ to get $g(x)$ at the moment.

However, having said that, now that we can integrate by parts, we can finally see how

to integrate ln x.


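Example 5.20

Find $\int \ln(x)\,dx$.

Although the integrand is not obviously a product, we can write it so that we have a product, i.e. we want to find,
$$\int \ln(x)\,dx = \int 1 \cdot \ln(x)\,dx.$$
To apply integration by parts, we choose
$$f(x) = \ln(x) \quad\text{and}\quad g'(x) = 1,$$
so that differentiating $f(x)$ and integrating $g'(x)$ we get
$$f'(x) = \frac{1}{x} \quad\text{and}\quad g(x) = x,$$
where we have suppressed the arbitrary constant from the integration. Applying the rule then gives,
$$\int 1 \cdot \ln(x)\,dx = (x)(\ln(x)) - \int (x)\left(\frac{1}{x}\right)dx = x\ln(x) - \int 1\,dx,$$
and, clearly, the new integral is easier to find. Thus, finding this integral, we get
$$\int \ln(x)\,dx = x\ln(x) - \int 1\,dx = x\ln(x) - x + c,$$
where $c$ is an arbitrary constant.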
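The structure of integration by parts, $f(x)g(x)$ minus an antiderivative of $f'(x)g(x)$, can be written as a small helper and applied to Examples 5.18 to 5.20. The sketch below builds each answer that way and checks it by numerical differentiation at a few points with $x > 0$. The name `by_parts` and the way the remaining integral is supplied by hand are illustrative assumptions, and constants of integration are taken to be zero.

```python
import math

def derivative(func, x, h=1e-6):
    # Central-difference estimate of func'(x); h is an assumed step size.
    return (func(x + h) - func(x - h)) / (2 * h)

def by_parts(f, g, remaining_integral):
    # Build f(x)g(x) - H(x), where H is an antiderivative of f'(x)g(x).
    return lambda x: f(x) * g(x) - remaining_integral(x)

EXAMPLES = [
    # (label, integrand f(x)g'(x), f, g, antiderivative of f'(x)g(x))
    ("5.18", lambda x: x * math.exp(x), lambda x: x, math.exp, math.exp),
    ("5.19", lambda x: x * math.log(x), math.log,
     lambda x: x**2 / 2, lambda x: x**2 / 4),
    ("5.20", math.log, math.log, lambda x: x, lambda x: x),
]

def max_error(integrand, answer, points=(0.5, 1.0, 2.5)):
    # Largest gap between the derivative of the answer and the integrand.
    return max(abs(derivative(answer, x) - integrand(x)) for x in points)

for label, integrand, f, g, rest in EXAMPLES:
    print(label, max_error(integrand, by_parts(f, g, rest)))
```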

Activity 5.12 Use the result in Example 5.20 and the change of base formula for logarithms to find
$$\int \log_a x\,dx,$$
which was also promised in Section 5.2.1.

Activity 5.13 Use the result in the previous example to find the integral in Example 5.19 the other way, i.e. by choosing
$$f(x) = x \quad\text{and}\quad g'(x) = \ln x.$$

We observe that integration by parts is not useful for all products since, as we saw above, integrals like
$$\int (x^2 + 1)^2\,x\,dx,$$
in Example 5.8 contain a product and yet they are best dealt with by substitution as the extra $x$ in the product is a constant multiple of the derivative of $g = x^2 + 1$. However, an integral like
$$\int (x^2 + 1)^2\,x^2\,dx,$$
would require integration by parts since, now, the extra $x^2$ in the product is not a constant multiple of the derivative of $g = x^2 + 1$. Indeed, the main skill involved in finding integrals using these rules is choosing the appropriate method.³ To illustrate this, let's see how we would find this last integral.

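Example 5.21

Find $\int (x^2 + 1)^2 x^2 \, dx$.

To apply integration by parts, we choose

$$f(x) = (x^2 + 1)^2 \quad\text{and}\quad g'(x) = x^2,$$

so that, differentiating $f(x)$ and integrating $g'(x)$, we get

$$f'(x) = 2(x^2 + 1)(2x) \quad\text{and}\quad g(x) = \frac{x^3}{3},$$

where we have used the chain rule to perform the differentiation and suppressed the arbitrary constant from the integration. Applying the rule then gives

$$\int (x^2 + 1)^2 x^2 \, dx = (x^2 + 1)^2\,\frac{x^3}{3} - \int 2(x^2 + 1)(2x)\,\frac{x^3}{3} \, dx,$$

and, clearly, the new integral is easier to find because we can easily multiply out the brackets and integrate term-by-term. Thus, finding this integral, we get

$$\int (x^2 + 1)^2 x^2 \, dx = \frac{x^3}{3}(x^2 + 1)^2 - \frac{4}{3}\int (x^6 + x^4) \, dx = \frac{x^3}{3}(x^2 + 1)^2 - \frac{4}{3}\left(\frac{x^7}{7} + \frac{x^5}{5}\right) + c,$$

as the answer.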

Activity 5.14 Verify that this answer is correct by multiplying out the brackets in

the integrand and integrating term-by-term.

The last two ways of making progress with an integral that we will consider are not rules

of integration, but handy techniques that allow us to rewrite integrands so that we can

see how to integrate them. The first of these uses a particular kind of algebraic identity

known as partial fractions and the second involves the use of trigonometric identities.

5.2.5

Suppose that we have an integrand which is a rational function, i.e. a quotient of two polynomials, say

$$R(x) = \frac{P(x)}{Q(x)}.$$

$^3$ This is unlike the situation with differentiation where it is always pretty obvious which rule we should be applying!

In order to apply the method of partial fractions, it must be the case that the degree of

the numerator, i.e. P (x), is less than the degree of the denominator, i.e. Q(x). If this is

the case, we start by looking at how the denominator factorises and then proceed

according to which of the following cases we are in.

Case 1: The denominator has distinct [real] linear factors

If the denominator, $Q(x)$, is of degree $n$ and has $n$ real and distinct roots $a_1, a_2, \ldots, a_n$ then we can write

$$Q(x) = (x - a_1)(x - a_2) \cdots (x - a_n),$$

i.e. $Q(x)$ has distinct [real] linear factors. In this case, the method of partial fractions dictates that we can write

$$R(x) = \frac{P(x)}{(x - a_1)(x - a_2) \cdots (x - a_n)} = \frac{A_1}{x - a_1} + \frac{A_2}{x - a_2} + \cdots + \frac{A_n}{x - a_n},$$

for some constants $A_1, A_2, \ldots, A_n$ which we find by cross-multiplying on the right-hand-side, comparing the numerators and letting $x = a_1$, $x = a_2$, \ldots, $x = a_n$ respectively. Let's look at a simple example.

Example 5.22

Find $\int \dfrac{x}{x^2 - x - 2} \, dx$.

Here the integrand is a rational function of two polynomials and the degree of the numerator is less than the degree of the denominator. As such, we can use the method of partial fractions and, looking at the denominator, we see that

$$x^2 - x - 2 = (x - 2)(x + 1),$$

so we are in the case where we have distinct linear factors. This means that we can write

$$\frac{x}{x^2 - x - 2} = \frac{x}{(x - 2)(x + 1)} = \frac{A_1}{x - 2} + \frac{A_2}{x + 1},$$

for some constants $A_1$ and $A_2$. To find these constants, we cross-multiply on the right-hand-side to see that

$$\frac{x}{(x - 2)(x + 1)} = \frac{A_1(x + 1) + A_2(x - 2)}{(x - 2)(x + 1)},$$

and so, comparing the numerators, we need

$$x = A_1(x + 1) + A_2(x - 2).$$

Indeed, setting $x = 2$ on both sides, we see that $2 = 3A_1$ whereas setting $x = -1$ on both sides, we see that $-1 = -3A_2$. Thus, we have

$$\frac{x}{x^2 - x - 2} = \frac{x}{(x - 2)(x + 1)} = \frac{2/3}{x - 2} + \frac{1/3}{x + 1},$$

using the values of $A_1$ and $A_2$ that we have found. Consequently, we find that

$$\int \frac{x}{x^2 - x - 2} \, dx = \int \left[\frac{2/3}{x - 2} + \frac{1/3}{x + 1}\right] dx = \frac{2}{3}\ln|x - 2| + \frac{1}{3}\ln|x + 1| + c,$$

where $c$ is an arbitrary constant.
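The step of 'letting $x = a_k$' in Case 1 can be automated. Setting $x = a_k$ in the comparison of numerators leaves $P(a_k) = A_k \prod_{j \ne k}(a_k - a_j)$, and the Python sketch below uses this to compute the constants exactly with rational arithmetic. The function names are hypothetical, and the sketch assumes, as in Case 1, that $Q(x) = (x - a_1) \cdots (x - a_n)$ has distinct real roots and leading coefficient one.

```python
from fractions import Fraction

def poly_eval(coeffs, x):
    """Evaluate a polynomial whose coefficients run from the highest power down."""
    result = Fraction(0)
    for c in coeffs:
        result = result * x + c
    return result

def distinct_linear_constants(numerator, roots):
    """Return [A_1, ..., A_n] with P(x)/prod(x - a_k) = sum of A_k/(x - a_k)."""
    constants = []
    for k, a_k in enumerate(roots):
        denom = Fraction(1)
        for j, a_j in enumerate(roots):
            if j != k:
                denom *= (a_k - a_j)  # product of (a_k - a_j) over the other roots
        constants.append(poly_eval(numerator, Fraction(a_k)) / denom)
    return constants

# Example 5.22: P(x) = x and Q(x) = (x - 2)(x + 1), i.e. roots 2 and -1.
print(distinct_linear_constants([1, 0], [2, -1]))  # [Fraction(2, 3), Fraction(1, 3)]
```

This only covers Case 1; repeated or irreducible factors need the extra terms described in the cases that follow.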


We observe, in particular, that the degree of the denominator determines how many

constants we have to find.

Case 2: The denominator has a repeated [real] linear factor

If we find that one of the roots, say $a_k$, of the denominator, $Q(x)$, is real and repeated $m$ times then we replace the term

$$\frac{A_k}{x - a_k},$$

in the expansion from Case 1 with the terms

$$\frac{B_1}{x - a_k} + \frac{B_2}{(x - a_k)^2} + \cdots + \frac{B_m}{(x - a_k)^m}.$$

We then have to find the numbers $B_1, B_2, \ldots, B_m$ as well as any other numbers that remain from Case 1. Let's look at a simple example.

Example 5.23

Find $\int \dfrac{x + 3}{(x + 2)(x - 1)^2} \, dx$.

Here the integrand is a rational function of two polynomials and the degree of the numerator is less than the degree of the denominator. As such, we can use the method of partial fractions and, looking at the denominator, we have

$$(x + 2)(x - 1)^2,$$

and so we are in the case where we have a repeated linear factor. This means that we can write

$$\frac{x + 3}{(x + 2)(x - 1)^2} = \frac{A_1}{x + 2} + \frac{B_1}{x - 1} + \frac{B_2}{(x - 1)^2},$$

for some constants $A_1$, $B_1$ and $B_2$. To find these constants, we cross-multiply on the right-hand-side to see that

$$\frac{x + 3}{(x + 2)(x - 1)^2} = \frac{A_1(x - 1)^2 + B_1(x - 1)(x + 2) + B_2(x + 2)}{(x + 2)(x - 1)^2},$$

and so, comparing the numerators, we need

$$x + 3 = A_1(x - 1)^2 + B_1(x - 1)(x + 2) + B_2(x + 2).$$

Indeed, setting $x = -2$ on both sides, we see that $1 = 9A_1$ and setting $x = 1$ on both sides, we see that $4 = 3B_2$. However, to find $B_1$, we now note that comparing (say) the coefficient of the $x^2$ term on both sides of this expression we get $0 = A_1 + B_1$ and so $B_1 = -A_1 = -1/9$. Thus, we have

$$\frac{x + 3}{(x + 2)(x - 1)^2} = \frac{1/9}{x + 2} - \frac{1/9}{x - 1} + \frac{4/3}{(x - 1)^2},$$

using the values of $A_1$, $B_1$ and $B_2$ that we have found. Consequently, we find that

$$\int \frac{x + 3}{(x + 2)(x - 1)^2} \, dx = \int \left[\frac{1/9}{x + 2} - \frac{1/9}{x - 1} + \frac{4/3}{(x - 1)^2}\right] dx = \frac{1}{9}\ln|x + 2| - \frac{1}{9}\ln|x - 1| - \frac{4}{3(x - 1)} + c,$$

where $c$ is an arbitrary constant.

We observe, again, that the degree of the denominator determines how many constants we have to find.

Case 3: The denominator has an irreducible [real] factor

If we find that the denominator, $Q(x)$, has an irreducible [real] factor like $ax^2 + bx + c$,$^4$ then we replace the corresponding term in the expansion from Case 1 with the term

$$\frac{C_1 x + C_2}{ax^2 + bx + c}.$$

We then have to find the numbers $C_1$ and $C_2$ as well as any other numbers that remain from Case 1. Let's look at a simple example.

Example 5.24

Find $\int \dfrac{x}{(x - 1)(x^2 + 2x + 2)} \, dx$.

Here the integrand is a rational function of two polynomials and the degree of the numerator is less than the degree of the denominator. As such, we can use the method of partial fractions and, looking at the denominator, we have

$$(x - 1)(x^2 + 2x + 2),$$

and so we are in the case where we have an irreducible factor as $x^2 + 2x + 2$ has no real roots as, for instance, $b^2 - 4ac$ gives us $2^2 - 4(1)(2) = 4 - 8 = -4 < 0$. This means that we can write

$$\frac{x}{(x - 1)(x^2 + 2x + 2)} = \frac{A_1}{x - 1} + \frac{C_1 x + C_2}{x^2 + 2x + 2},$$

for some constants $A_1$, $C_1$ and $C_2$. To find these constants, we cross-multiply on the right-hand-side to see that

$$\frac{x}{(x - 1)(x^2 + 2x + 2)} = \frac{A_1(x^2 + 2x + 2) + (C_1 x + C_2)(x - 1)}{(x - 1)(x^2 + 2x + 2)},$$

and so, comparing the numerators, we need

$$x = A_1(x^2 + 2x + 2) + (C_1 x + C_2)(x - 1).$$

Indeed, setting $x = 1$ on both sides, we see that $1 = 5A_1$ and, to find $C_1$, we now note that comparing the coefficient of the $x^2$ term on both sides of this expression we get $0 = A_1 + C_1$ and so $C_1 = -A_1 = -1/5$ and comparing the coefficient of the constant term on both sides we get $0 = 2A_1 - C_2$ and so $C_2 = 2A_1 = 2/5$. Thus, we have

$$\frac{x}{(x - 1)(x^2 + 2x + 2)} = \frac{1/5}{x - 1} + \frac{(1/5)(-x + 2)}{x^2 + 2x + 2},$$

using the values of $A_1$, $C_1$ and $C_2$ that we have found. Consequently, we find that

$$\int \frac{x}{(x - 1)(x^2 + 2x + 2)} \, dx = \int \left[\frac{1/5}{x - 1} - \frac{(1/5)(x - 2)}{x^2 + 2x + 2}\right] dx = \frac{1}{5}\int \left[\frac{1}{x - 1} - \frac{x - 2}{x^2 + 2x + 2}\right] dx.$$

Now, the integral of the first term is easy but, to deal with the integral of the second term, we note that the derivative of $x^2 + 2x + 2$ is $2x + 2$ (i.e. we are thinking about the substitution $g = x^2 + 2x + 2$ which we saw in Example 5.13). This means that, writing

$$\frac{x - 2}{x^2 + 2x + 2} = \frac{1}{2}\,\frac{2x - 4}{x^2 + 2x + 2} = \frac{1}{2}\,\frac{2x + 2 - 6}{x^2 + 2x + 2} = \frac{1}{2}\,\frac{2x + 2}{x^2 + 2x + 2} - \frac{3}{x^2 + 2x + 2},$$

we can see that, completing the square in the denominator of the last term, we have

$$\int \frac{x}{(x - 1)(x^2 + 2x + 2)} \, dx = \frac{1}{5}\int \left[\frac{1}{x - 1} - \frac{1}{2}\,\frac{2x + 2}{x^2 + 2x + 2} + \frac{3}{(x + 1)^2 + 1}\right] dx = \frac{1}{5}\left[\ln|x - 1| - \frac{1}{2}\ln|x^2 + 2x + 2| + 3\tan^{-1}(x + 1)\right] + c,$$

where we have used the substitution $g = x^2 + 2x + 2$ in the middle term (as we saw in Example 5.13) and we saw how to integrate the last term in Activity 5.11.

$^4$ That is, we have a quadratic like $ax^2 + bx + c$ with $b^2 - 4ac < 0$ so we cannot find real roots. This means that we cannot factorise it using real factors and so we cannot use Case 1 or Case 2 on it.

Of course, the key here is that, in this new term, the linear expression in the numerator that we have to find has a degree which is one less than the degree of the irreducible quadratic expression in the denominator.$^5$ This means that, if we had a repeated irreducible factor in the denominator, we would have to compensate in a way which is reminiscent of Case 2 as the next example shows.

Example 5.25

Find $\int \dfrac{x^4 + x^3 + 2x^2}{(x - 1)(1 + x^2)^2} \, dx$.

Here the integrand is a rational function of two polynomials and the degree of the numerator is less than the degree of the denominator. As such, we can use the method of partial fractions and, looking at the denominator, we have

$$(x - 1)(1 + x^2)^2,$$

and so we are in the case where we have a repeated irreducible factor as $x^2 + 1$ has no real roots as, for instance, $b^2 - 4ac$ gives us $0^2 - 4(1)(1) = -4 < 0$. This means that we can write

$$\frac{x^4 + x^3 + 2x^2}{(x - 1)(1 + x^2)^2} = \frac{A_1}{x - 1} + \frac{C_1 x + C_2}{1 + x^2} + \frac{D_1 x + D_2}{(1 + x^2)^2},$$

for some constants $A_1$, $C_1$, $C_2$, $D_1$ and $D_2$. To find these constants, we cross-multiply on the right-hand-side to see that

$$\frac{x^4 + x^3 + 2x^2}{(x - 1)(1 + x^2)^2} = \frac{A_1(1 + x^2)^2 + (C_1 x + C_2)(x - 1)(1 + x^2) + (D_1 x + D_2)(x - 1)}{(x - 1)(1 + x^2)^2},$$

and so, comparing the numerators, we need

$$x^4 + x^3 + 2x^2 = A_1(1 + x^2)^2 + (C_1 x + C_2)(x - 1)(1 + x^2) + (D_1 x + D_2)(x - 1).$$

Indeed, setting $x = 1$ on both sides, we see that $4 = 4A_1$ and, to find $C_1$, we now note that comparing the coefficient of the $x^4$ term on both sides of this expression we get $1 = A_1 + C_1$ and so $C_1 = 1 - A_1 = 0$ and comparing the coefficient of the $x^3$ term on both sides of this expression we get $1 = -C_1 + C_2$ and so $C_2 = 1 + C_1 = 1$. To find $D_1$ and $D_2$ we note that, using what we have found so far, we have

$$x^4 + x^3 + 2x^2 = (1 + x^2)^2 + (x - 1)(1 + x^2) + (D_1 x + D_2)(x - 1),$$

which means that, comparing the coefficient of $x^2$ on both sides of this expression we get $2 = 2 - 1 + D_1$ and so $D_1 = 1$ whereas comparing the coefficient of $x$ on both sides we get $0 = 0 + 1 - D_1 + D_2$ and so $D_2 = 0$. Thus, we have

$$\frac{x^4 + x^3 + 2x^2}{(x - 1)(1 + x^2)^2} = \frac{1}{x - 1} + \frac{1}{1 + x^2} + \frac{x}{(1 + x^2)^2},$$

using the values of the constants $A_1$, $C_1$, $C_2$, $D_1$ and $D_2$ that we have found. Consequently, we find that

$$\int \frac{x^4 + x^3 + 2x^2}{(x - 1)(1 + x^2)^2} \, dx = \int \left[\frac{1}{x - 1} + \frac{1}{1 + x^2} + \frac{x}{(1 + x^2)^2}\right] dx = \ln|x - 1| + \tan^{-1} x - \frac{1}{2(1 + x^2)} + c,$$

where we have used the substitution $u = 1 + x^2$ to work out the integral of the last term.

$^5$ That is, the number of constants we have to find is equal to the degree of the denominator in the term we are dealing with.
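With five constants to find, it is easy to make a slip, so it can be worth checking a partial fractions expansion before integrating it. The Python sketch below compares the two sides of the expansion found in Example 5.25 at a few rational points using exact arithmetic; the names `lhs`, `rhs` and `sample_points` are illustrative only.

```python
from fractions import Fraction

def lhs(x):
    """The original integrand of Example 5.25."""
    return (x**4 + x**3 + 2 * x**2) / ((x - 1) * (1 + x**2) ** 2)

def rhs(x):
    """The partial fractions expansion with A1 = 1, C1 = 0, C2 = 1, D1 = 1, D2 = 0."""
    return 1 / (x - 1) + 1 / (1 + x**2) + x / (1 + x**2) ** 2

# Six rational sample points, none equal to the root x = 1 of the denominator.
sample_points = [Fraction(n, 3) for n in (-7, -2, 0, 1, 5, 11)]
all_agree = all(lhs(x) == rhs(x) for x in sample_points)
print(all_agree)  # True
```

Once both sides are put over the common denominator, the numerators are polynomials of degree at most four, so exact agreement at five or more distinct points means the expansion holds for all $x \neq 1$.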

So, once again, we observe that the degree of the denominator determines how many constants we have to find in all of these examples. Generally speaking, as we are using partial fractions to help us find integrals, we shouldn't expect to see anything more complicated than this.

5.2.6

An integral like

$$\int \sin^2 x \cos x \, dx,$$

can be found by using the substitution $g = \sin x$, but what if you were asked to find

$$\int \sin^2 x \, dx?$$

In this case, the substitution would not work since we do not have the extra factor of

cos x in the integrand. However, as we shall see in the next example, we can easily find

this new integral if we use one of the trigonometric identities that we saw in

Section 2.1.4.

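Example 5.26

Find $\int \sin^2 x \, dx$.

Here we use the trigonometric identity

$$\cos(2x) = 1 - 2\sin^2 x,$$

which allows us to write the problematic integrand $\sin^2 x$ in terms of the function $\cos(2x)$ which is far easier to integrate. That is, rearranging this trigonometric identity, we have

$$\sin^2 x = \frac{1}{2}\left[1 - \cos(2x)\right],$$

and so we find that

$$\int \sin^2 x \, dx = \frac{1}{2}\int \left[1 - \cos(2x)\right] dx = \frac{1}{2}\left[x - \frac{1}{2}\sin(2x)\right] + c,$$

where $c$ is an arbitrary constant.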

Activity 5.15

Find $\int \cos^2 x \, dx$.

Earlier, a substitution was combined with the trigonometric identity $\tan^2\theta + 1 = \sec^2\theta$ to obtain a useful result. Here's another one that is very similar.

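Example 5.27

Use the substitution $x + a = b\sin\theta$ to find $\int \dfrac{dx}{\sqrt{b^2 - (x + a)^2}}$.

Here, for reasons that will soon become apparent, we make the suggested substitution. As such, differentiating both sides of $x + a = b\sin\theta$ with respect to $\theta$, we have

$$\frac{dx}{d\theta} = b\cos\theta \implies dx = b\cos\theta \, d\theta.$$

This means that our integral becomes

$$\int \frac{dx}{\sqrt{b^2 - (x + a)^2}} = \int \frac{b\cos\theta}{\sqrt{b^2 - b^2\sin^2\theta}} \, d\theta = \int \frac{\cos\theta}{\cos\theta} \, d\theta,$$

if we use the trigonometric identity $1 - \sin^2\theta = \cos^2\theta$ from (2.2). This then gives us

$$\int d\theta = \theta + c = \sin^{-1}\left(\frac{x + a}{b}\right) + c,$$

since $x + a = b\sin\theta$ and where $c$ is an arbitrary constant. Thus, we have found that

$$\int \frac{dx}{\sqrt{b^2 - (x + a)^2}} = \sin^{-1}\left(\frac{x + a}{b}\right) + c.$$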

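As a last example, let's see another way in which trigonometric identities can be used to find an integral.

Example 5.28

Use the substitution $t = \tan\theta$ to find $\int \dfrac{d\theta}{1 + \cos(2\theta)}$.

The substitution $t = \tan\theta$ is very useful and so we start by seeing how it can be applied. Firstly, we note that, differentiating both sides with respect to $\theta$, we get

$$\frac{dt}{d\theta} = \sec^2\theta,$$

and so, using the trigonometric identity $\sec^2\theta = 1 + \tan^2\theta$ from (2.4), this gives us

$$d\theta = \frac{dt}{1 + t^2}.$$

Secondly, we note that

$$1 + \cos(2\theta) = 1 + \cos^2\theta - \sin^2\theta,$$

using a double-angle formula from (2.6) and so we will need to be able to write $\sin\theta$ and $\cos\theta$ in terms of $t$. An easy way to do this is to consider the right-angled triangle in Figure 5.1 as this immediately tells us that

$$\sin\theta = \frac{t}{\sqrt{1 + t^2}} \quad\text{and}\quad \cos\theta = \frac{1}{\sqrt{1 + t^2}},$$

so that

$$1 + \cos(2\theta) = 1 + \cos^2\theta - \sin^2\theta = 1 + \frac{1}{1 + t^2} - \frac{t^2}{1 + t^2} = \frac{2}{1 + t^2}.$$

Consequently, we find that

$$\int \frac{d\theta}{1 + \cos(2\theta)} = \int \frac{1 + t^2}{2}\,\frac{dt}{1 + t^2} = \frac{1}{2}\int dt = \frac{t}{2} + c = \frac{1}{2}\tan\theta + c,$$

where $c$ is an arbitrary constant.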

Generally, as in the last two examples, when an unusual substitution is required in this unit, it will be given in the question. Indeed, we'll see a little bit more of this kind of thing in Examples 5.36 and 5.37.

Figure 5.1: A right-angled triangle with $t = \tan\theta$ can have $t$ on the opposite side and $1$ on the adjacent side which means that, using Pythagoras' theorem, the hypotenuse must be $\sqrt{1 + t^2}$. With this triangle, we can then quickly deduce the expressions for $\sin\theta$ and $\cos\theta$ in terms of $t$ which are needed for Example 5.28.

5.3

So far, we have been looking at indefinite integrals and we have been finding them by

using the idea of an antiderivative to deduce standard integrals and rules of integration.

We now turn to the geometric interpretation of an integral and this involves introducing

the idea of a definite integral and seeing what it represents.

5.3.1

In Section 3.3.1 we saw that the derivative of a function, f (x), gave us the gradient of

the curve y = f (x). We now consider what the integral of a function, f (x), tells us about

the curve y = f (x) and see how this comes about through the idea of a definite integral.

What is a definite integral?

Recall that an indefinite integral is so-called since, given a function, $f(x)$, and one of its antiderivatives, $F(x)$, i.e. two functions related by the fact that

$$\frac{dF}{dx} = f(x),$$

we have

$$\int f(x) \, dx = F(x) + c,$$

where $c$ is an arbitrary constant. And, indeed, it is this arbitrary constant that makes this integral indefinite as we do not know what $c$ is. In a similar vein, a definite integral is one that is written with limits, i.e.

$$\int_a^b f(x) \, dx,$$

where $a$ and $b$ are the limits of integration. In order to work out integrals that look like this we need to know what to do with these limits and the procedure is:

Firstly: deal with the integral. Integrating $f(x)$, we take one of its antiderivatives, $F(x)$, and then write

$$\int_a^b f(x) \, dx = \Big[F(x)\Big]_a^b.$$

Secondly: deal with the limits. By definition, we let

$$\Big[F(x)\Big]_a^b = F(b) - F(a).$$

Notice that this means that, if $F(x)$ is an antiderivative of $f(x)$, we have

$$\int_a^b f(x) \, dx = F(b) - F(a),$$

i.e. the value of the integral depends only on the value of the antiderivative at the points $x = a$ and $x = b$. Thus, this is now a definite integral as it no longer involves an arbitrary constant, $c$.

Activity 5.16 Show that

$$\int_a^b f(x) \, dx = \Big[F(x) + c\Big]_a^b = F(b) - F(a),$$

if $c$ is a constant. Hence explain why we can omit the constant of integration when evaluating definite integrals.

Another consequence of this discussion is that it allows us to see how to use our basic rules of integration to evaluate definite integrals. For instance, if $k$ and $l$ are constants and $f(x)$ and $g(x)$ are functions, then we can see that the linear combination rule gives us

$$\int_a^b \left[k f(x) + l g(x)\right] dx = k\int_a^b f(x) \, dx + l\int_a^b g(x) \, dx.$$

Activity 5.17 Following what we saw in Section 5.2.2, write down the constant

multiple rule, the sum rule and the difference rule for definite integrals.

Activity 5.18 Using what we have seen so far, derive the linear combination rule

for definite integrals.

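Now that we have the basic idea, let's see how we can work out a definite integral.

Example 5.29 Evaluate $\int_1^3 (x + 4) \, dx$.

If we follow the two step procedure above, i.e. integrating to find an antiderivative and then dealing with the limits, we get

$$\int_1^3 (x + 4) \, dx = \left[\frac{x^2}{2} + 4x\right]_1^3 = \left(\frac{3^2}{2} + 4(3)\right) - \left(\frac{1^2}{2} + 4(1)\right) = \left(\frac{9}{2} + 12\right) - \left(\frac{1}{2} + 4\right) = 12,$$

as the answer.

Alternatively, we could use the linear combination rule to get

$$\int_1^3 (x + 4) \, dx = \int_1^3 x \, dx + \int_1^3 4 \, dx = \left[\frac{x^2}{2}\right]_1^3 + \Big[4x\Big]_1^3 = \left(\frac{3^2}{2} - \frac{1^2}{2}\right) + \big(4(3) - 4(1)\big) = \left(\frac{9}{2} - \frac{1}{2}\right) + (12 - 4) = 12,$$

as before.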
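The two-step procedure is short enough to write as a function. The Python sketch below (with hypothetical names) evaluates an antiderivative at the limits for Example 5.29, and repeats the calculation with a different arbitrary constant to show that the constant has no effect on the value.

```python
def evaluate_between_limits(F, a, b):
    """Given an antiderivative F, return F(b) - F(a)."""
    return F(b) - F(a)

def antiderivative(x, c=0.0):
    """An antiderivative of x + 4; c is the arbitrary constant."""
    return x**2 / 2 + 4 * x + c

print(evaluate_between_limits(antiderivative, 1, 3))                      # 12.0
print(evaluate_between_limits(lambda x: antiderivative(x, c=7.0), 1, 3))  # 12.0
```

Whatever value of `c` is used, it appears in both `F(b)` and `F(a)` and so cancels in the difference.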

What definite integrals with non-negative integrands represent

Definite integrals are useful because they tell us about the area under a curve. Specifically, if we have the definite integral

$$\int_a^b f(x) \, dx, \tag{5.1}$$

where $f(x) \ge 0$ for all $x$ such that $a \le x \le b$,$^6$ we say that we have a non-negative integrand and find that the value of the integral is the area of the region between the curve $y = f(x)$, the $x$-axis and the vertical lines $x = a$ and $x = b$ as illustrated in Figure 5.2.

Figure 5.2: The hatched region is between the curve $y = f(x)$, the $x$-axis and the vertical lines $x = a$ and $x = b$. In cases like this we have a non-negative integrand, i.e. $f(x) \ge 0$ for $a \le x \le b$, and so the definite integral in (5.1) gives us the area of this hatched region.

Example 5.30 Find the area of the region between the line $y = 4 - 2x$, the $x$-axis and the vertical lines $x = 0$ and $x = 2$ which is illustrated in Figure 5.3(a).

There are two ways to find this area:

As this is just a right-angled triangle, the area is just half times base times height, i.e.

$$\text{area of triangle} = \frac{1}{2} \times 2 \times 4 = 4.$$

Thus, the area of the region is four.

As we have $y = f(x)$ with $f(x) = 4 - 2x$, we can see from Figure 5.3(a) that $f(x) \ge 0$ between $x = 0$ and $x = 2$. So, as noted above, the area should be given by

$$\int_0^2 (4 - 2x) \, dx = \Big[4x - x^2\Big]_0^2 = (4 \cdot 2 - 2^2) - (4 \cdot 0 - 0^2) = (8 - 4) - 0 = 4,$$

which is, again, four.

Consequently, this confirms that the definite integral does give us the area of the region between the line $y = 4 - 2x$, the $x$-axis and the vertical lines $x = 0$ and $x = 2$, at least, when $f(x) \ge 0$ between the vertical lines.

$^6$ At the moment we will just accept this caveat. The reason why we need $f(x)$ to be non-negative for values of $x$ between the limits of integration will become clear very soon.

Figure 5.3: Non-negative integrands. (a) For Example 5.30, the region between the line $y = 4 - 2x$, the $x$-axis and the vertical lines $x = 0$ and $x = 2$. (b) For Example 5.31, the region between the parabola $y = 4 - x^2$, the $x$-axis and the vertical lines $x = -1$ and $x = 1$.

However, generally, we won't have a simple geometric way of finding the area under a curve and so we will have to use integration.

Example 5.31 Find the area of the region between the parabola $y = 4 - x^2$, the $x$-axis and the vertical lines $x = -1$ and $x = 1$ which is illustrated in Figure 5.3(b).

As we have $y = f(x)$ with $f(x) = 4 - x^2$, we can see from Figure 5.3(b) that $f(x) \ge 0$ between $x = -1$ and $x = 1$. So, as noted above, the area should be given by

$$\int_{-1}^{1} (4 - x^2) \, dx = \left[4x - \frac{x^3}{3}\right]_{-1}^{1} = \left(4(1) - \frac{(1)^3}{3}\right) - \left(4(-1) - \frac{(-1)^3}{3}\right) = \frac{11}{3} - \left(-\frac{11}{3}\right) = \frac{22}{3},$$

as the answer.

Activity 5.19 Observe that the region in the previous example is symmetric about the $y$-axis. Use this observation to explain why the area of this region is two times the area represented by the definite integral,

$$\int_0^1 (4 - x^2) \, dx,$$

and verify that this does indeed give the correct area.

What definite integrals with non-positive integrands represent

We now start to consider what happens to the definite integral in (5.1) when we can't guarantee that the integrand is non-negative, i.e. what happens if we do not have $f(x) \ge 0$ for all $x$ such that $a \le x \le b$? To simplify matters, we will start by asking: What happens when this condition always fails? That is, what happens when the integrand is non-positive as $f(x) \le 0$ for all $x$ such that $a \le x \le b$?

So what does the definite integral in (5.1) tell us about the area of the region bounded by the curve $y = f(x)$, the $x$-axis and the vertical lines $x = a$ and $x = b$ when we have a non-positive integrand, i.e. when $f(x) \le 0$ for $a \le x \le b$, as illustrated in Figure 5.4? One way of looking at this is to note that,

If $f(x) \le 0$ for all $a \le x \le b$, then $-f(x) \ge 0$ for all $a \le x \le b$.

But, this means that $-f(x)$ gives us a non-negative integrand and the area, $A$, of the region in question is given by

$$A = \int_a^b -f(x) \, dx = -\int_a^b f(x) \, dx \implies \int_a^b f(x) \, dx = -A,$$

i.e. for non-positive integrands, the definite integral gives us minus the area. Thus, in the case of non-positive integrands, the area is given by the magnitude of the definite integral. Let's have a look at an example.

Figure 5.4: The hatched region is between the curve $y = f(x)$, the $x$-axis and the vertical lines $x = a$ and $x = b$. In cases like this we have a non-positive integrand, i.e. $f(x) \le 0$ for $a \le x \le b$, and so the definite integral in (5.1) gives us minus the area of this hatched region.

Example 5.32 Find the area of the region between the line $y = 4 - 2x$, the $x$-axis and the vertical lines $x = 2$ and $x = 4$ which is illustrated in Figure 5.5(a).

There are two ways to find this area:

As this is just a right-angled triangle, the area is just half times base times height, i.e.

$$\text{area of triangle} = \frac{1}{2} \times 2 \times 4 = 4.$$

Thus, the area of the region is four.

As we have $y = f(x)$ with $f(x) = 4 - 2x$, we can see from Figure 5.5(a) that $f(x) \le 0$ between $x = 2$ and $x = 4$. So, looking at the definite integral we get,

$$\int_2^4 (4 - 2x) \, dx = \Big[4x - x^2\Big]_2^4 = (4 \cdot 4 - 4^2) - (4 \cdot 2 - 2^2) = (16 - 16) - (8 - 4) = -4,$$

which is minus the answer we would expect. As such, we take the magnitude of this answer and so the area is, again, four.

Consequently, if $f(x) \le 0$ between the vertical lines, the definite integral gives us minus the area and so we take the magnitude of the definite integral to find the area.

Figure 5.5: Negative integrands and their relation to area. The region between the line $y = 4 - 2x$, the $x$-axis and the vertical lines (a) $x = 2$ and $x = 4$ for Example 5.32, and (b) $x = 0$ and $x = 4$ for Example 5.33.

We now consider what happens to the definite integral in (5.1) when we can't guarantee that the integrand is non-positive or non-negative, i.e. what happens if $f(x) \ge 0$ for some $x$ such that $a \le x \le b$ but not others? Let's start by considering the simple case where we have an integrand which is neither non-positive nor non-negative because there is some number $c$ such that $a \le c \le b$ where

$f(x) \ge 0$ for all $x$ such that $a \le x \le c$, and

$f(x) \le 0$ for all $x$ such that $c \le x \le b$,

as illustrated in Figure 5.6. One way of looking at this is to note that the definite integral

$\int_a^c f(x) \, dx$ gives us the hatched area, $A_1$, between the vertical lines $x = a$ and $x = c$,

$\int_c^b f(x) \, dx$ gives us minus the hatched area, $A_2$, between the vertical lines $x = c$ and $x = b$.

As such, the hatched area, $A$, between the lines $x = a$ and $x = b$ is given by

$$A = A_1 + A_2 = \int_a^c f(x) \, dx + \left|\int_c^b f(x) \, dx\right|,$$

and not by the definite integral in (5.1) on its own.

Figure 5.6: The hatched region is between the curve $y = f(x)$, the $x$-axis and the vertical lines $x = a$ and $x = b$. In cases like this we have a non-negative integrand for $a \le x \le c$ and a non-positive integrand for $c \le x \le b$, i.e. the definite integral in (5.1) cannot be used to find the area of the region.

Thus, for general integrands, the procedure for finding the area of the region bounded

by the curve y = f (x), the x-axis and the vertical lines x = a and x = b is as follows:

Firstly, determine all the points where the curve crosses the x-axis with

x-coordinates between x = a and x = b.

Secondly, use these points to determine (possibly via a sketch) where the curve is

positive and where the curve is negative.


Thirdly, use this information to determine the areas by finding the appropriate

definite integrals (bearing in mind that the integrands will now be either

non-negative or non-positive).

Fourthly, add up all the areas to find the total area.
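The four steps above can be sketched in Python. This is only an illustration under stated assumptions: the crossing points from the first step are supplied by hand, a simple midpoint rule stands in for finding each definite integral exactly, and the function names are hypothetical. The line $y = 4 - 2x$ between $x = 0$ and $x = 4$ is used as the test case.

```python
def midpoint_rule(f, a, b, n=10_000):
    """Approximate the definite integral of f from a to b using n midpoint strips."""
    width = (b - a) / n
    return width * sum(f(a + (i + 0.5) * width) for i in range(n))

def area_between_curve_and_axis(f, a, b, crossings):
    """Split [a, b] at the x-axis crossings and add up the magnitudes of the pieces."""
    points = [a] + sorted(c for c in crossings if a < c < b) + [b]
    return sum(abs(midpoint_rule(f, lo, hi)) for lo, hi in zip(points, points[1:]))

f = lambda x: 4 - 2 * x  # crosses the x-axis at x = 2

print(area_between_curve_and_axis(f, 0, 4, crossings=[2]))  # close to 8
print(midpoint_rule(f, 0, 4))                               # close to 0
```

The second printed value shows why the splitting matters: integrating straight across the crossing lets the positive and negative parts cancel.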

To see how this works let's consider a couple of examples.

Example 5.33 Find the area of the region between the line $y = 4 - 2x$, the $x$-axis and the vertical lines $x = 0$ and $x = 4$ which is illustrated in Figure 5.5(b).

As indicated in Figure 5.5(b), the line $y = 4 - 2x$ crosses the $x$-axis when $x = 2$ and this lies between $x = 0$ and $x = 4$. We can also see that the function is non-negative for $0 \le x \le 2$ and non-positive for $2 \le x \le 4$. As such, using our earlier workings in Examples 5.30 and 5.32, we split the total region into two sub-regions to see that:

Between $x = 0$ and $x = 2$ we evaluate the definite integral,

$$\int_0^2 (4 - 2x) \, dx = 4,$$

as we saw in Example 5.30. Thus, the area is four here as we have a non-negative integrand.

Between $x = 2$ and $x = 4$ we evaluate the definite integral,

$$\int_2^4 (4 - 2x) \, dx = -4,$$

as we saw in Example 5.32. Thus, the area is four here as we have a non-positive integrand.

Consequently, the total area is eight.

We also note, in passing, that the definite integral

$$\int_0^4 (4 - 2x) \, dx = \Big[4x - x^2\Big]_0^4 = (4 \cdot 4 - 4^2) - (4 \cdot 0 - 0^2) = (16 - 16) - 0 = 0,$$

and, as this is zero, it most definitely is not giving us the area we seek!

Activity 5.20 Verify that the answer to the previous example is correct by finding

the areas of the triangles involved.

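Example 5.34 Find the area of the region between the parabola $y = 1 - x^2$, the $x$-axis and the vertical lines $x = -2$ and $x = 2$ which is illustrated in Figure 5.7.

As indicated in Figure 5.7, the parabola $y = 1 - x^2$ crosses the $x$-axis when $x = \pm 1$ and these points lie between $x = -2$ and $x = 2$. We can also see that the function is non-negative for $-1 \le x \le 1$ and non-positive for $-2 \le x \le -1$ and $1 \le x \le 2$. As such, we split the total region into three sub-regions to see that:

Between $x = -2$ and $x = -1$ we evaluate the definite integral,

$$\int_{-2}^{-1} (1 - x^2) \, dx = \left[x - \frac{x^3}{3}\right]_{-2}^{-1} = \left(-1 - \frac{(-1)^3}{3}\right) - \left(-2 - \frac{(-2)^3}{3}\right) = \left(-1 + \frac{1}{3}\right) - \left(-2 + \frac{8}{3}\right) = -\frac{4}{3}.$$

Thus, the area is $\frac{4}{3}$ here as we have a non-positive integrand.

Between $x = -1$ and $x = 1$ we evaluate the definite integral,

$$\int_{-1}^{1} (1 - x^2) \, dx = \left[x - \frac{x^3}{3}\right]_{-1}^{1} = \left(1 - \frac{1^3}{3}\right) - \left(-1 - \frac{(-1)^3}{3}\right) = \left(1 - \frac{1}{3}\right) - \left(-1 + \frac{1}{3}\right) = \frac{4}{3}.$$

Thus, the area is $\frac{4}{3}$ here as we have a non-negative integrand.

Between $x = 1$ and $x = 2$ we evaluate the definite integral,

$$\int_{1}^{2} (1 - x^2) \, dx = \left[x - \frac{x^3}{3}\right]_{1}^{2} = \left(2 - \frac{2^3}{3}\right) - \left(1 - \frac{1^3}{3}\right) = \left(2 - \frac{8}{3}\right) - \left(1 - \frac{1}{3}\right) = -\frac{4}{3}.$$

Thus, the area is $\frac{4}{3}$ here as we have a non-positive integrand.

Consequently, the total area is $\frac{4}{3} + \frac{4}{3} + \frac{4}{3}$ which is four.

We also note, in passing, that the definite integral

$$\int_{-2}^{2} (1 - x^2) \, dx = \left[x - \frac{x^3}{3}\right]_{-2}^{2} = \left(2 - \frac{2^3}{3}\right) - \left(-2 - \frac{(-2)^3}{3}\right) = \left(2 - \frac{8}{3}\right) - \left(-2 + \frac{8}{3}\right) = -\frac{4}{3},$$

and so, once again, this is not giving us the area we seek.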

5.3.2

We have seen how to use the basic rules of integration when dealing with definite

integrals and so we now look at how we can use the other two rules of integration,

namely integration by substitution and integration by parts, in this context.

Integration by substitution

When evaluating a definite integral using integration by substitution we follow the same

procedure as before but now, we also change the limits of integration so that they are

values of g rather than values of x. That is, if we are making the substitution g = g(x)

and we have a definite integral with limits x = a and x = b, after the substitution, the

limits will be g = g(a) and g = g(b) respectively. This is best illustrated by an example.

Figure 5.7: Negative integrands and their relation to area (continued). For Example 5.34, the region between the parabola $y = 1 - x^2$, the $x$-axis and the vertical lines $x = -2$ and $x = 2$.

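Example 5.35

Find $\int_0^1 x \, e^{x^2 + 1} \, dx$.

Using the substitution $g = x^2 + 1$, we have

$$\frac{dg}{dx} = 2x \implies dg = 2x \, dx \implies x \, dx = \frac{1}{2} \, dg.$$

In this case, as we have a definite integral, we also change the limits of integration, i.e.

lower limit: $x = 0$ gives $g = g(0) = 0^2 + 1 = 1$, and

upper limit: $x = 1$ gives $g = g(1) = 1^2 + 1 = 2$.

Hence, the substitution gives

$$\int_0^1 x \, e^{x^2 + 1} \, dx = \int_1^2 e^{g}\,\frac{1}{2} \, dg = \frac{1}{2}\int_1^2 e^{g} \, dg = \frac{1}{2}\Big[e^{g}\Big]_1^2 = \frac{1}{2}\left(e^2 - e^1\right) = \frac{e}{2}(e - 1),$$

as the answer.

Alternatively, using our indefinite integral from Example 5.10, we saw that integration by substitution gave us

$$\int x \, e^{x^2 + 1} \, dx = \frac{1}{2} e^{x^2 + 1} + c,$$

and so this means that, if we suppress the constant of integration, we get

$$\int_0^1 x \, e^{x^2 + 1} \, dx = \left[\frac{1}{2} e^{x^2 + 1}\right]_0^1 = \frac{1}{2}\left(e^{1^2 + 1} - e^{0^2 + 1}\right) = \frac{1}{2}\left(e^2 - e^1\right) = \frac{e}{2}(e - 1),$$

as before.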
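To see that changing the limits really does give the same number, we can estimate both forms of the integral in Example 5.35 numerically. The sketch below uses composite Simpson's rule as the numerical method; this is an outside technique chosen for illustration rather than one from the guide, and the names are hypothetical.

```python
import math

def simpson(f, a, b, n=1000):
    """Composite Simpson's rule with an even number n of strips."""
    if n % 2:
        raise ValueError("n must be even")
    h = (b - a) / n
    total = f(a) + f(b)
    total += 4 * sum(f(a + i * h) for i in range(1, n, 2))
    total += 2 * sum(f(a + i * h) for i in range(2, n, 2))
    return total * h / 3

in_x = simpson(lambda x: x * math.exp(x**2 + 1), 0, 1)  # limits are values of x
in_g = simpson(lambda g: 0.5 * math.exp(g), 1, 2)       # limits are values of g
exact = (math.e / 2) * (math.e - 1)

print(in_x, in_g, exact)  # all three should agree closely
```

Forgetting to change the limits, i.e. integrating the second form from 0 to 1, would give a visibly different number.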


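For a harder example, let's see what happens when we have to make a substitution that works because of our double-angle formulae from Section 2.1.4.

Example 5.36

Use the substitution $x = \sin^2\theta$ to find $\int_0^1 \sqrt{x}\,\sqrt{1 - x} \, dx$.

Making the suggested substitution, we have

$$\frac{dx}{d\theta} = 2\sin\theta\cos\theta \implies dx = 2\sin\theta\cos\theta \, d\theta,$$

and, changing the limits of integration, we have

lower limit: $x = 0$ gives $\sin^2\theta = 0$ and so $\theta = 0$, and

upper limit: $x = 1$ gives $\sin^2\theta = 1$ and so $\theta = \pi/2$.

Hence, the substitution gives

$$\int_0^1 \sqrt{x}\,\sqrt{1 - x} \, dx = \int_0^{\pi/2} (\sin\theta)(\cos\theta)(2\sin\theta\cos\theta) \, d\theta = 2\int_0^{\pi/2} \sin^2\theta\cos^2\theta \, d\theta,$$

where we have used the trigonometric identity $\cos^2\theta = 1 - \sin^2\theta$ from (2.2) to get $\cos\theta$ from the $\sqrt{1 - x}$ in the integrand. Then, using the double-angle formula $\sin(2\theta) = 2\sin\theta\cos\theta$ from (2.6), we see that this gives us

$$\int_0^1 \sqrt{x}\,\sqrt{1 - x} \, dx = \frac{1}{2}\int_0^{\pi/2} \sin^2(2\theta) \, d\theta,$$

which we find using a variation on the method given in Example 5.26, i.e. we note that $\cos(4\theta) = 1 - 2\sin^2(2\theta)$ from Activity 2.18, so that

$$\int_0^1 \sqrt{x}\,\sqrt{1 - x} \, dx = \frac{1}{4}\int_0^{\pi/2} \left[1 - \cos(4\theta)\right] d\theta = \frac{1}{4}\left[\theta - \frac{1}{4}\sin(4\theta)\right]_0^{\pi/2} = \frac{\pi}{8},$$

as the answer.

Lastly, let's see another application of the $t = \tan\theta$ substitution that we saw in Example 5.28.

Example 5.37

Use the substitution $t = \tan\theta$ to find $\int_0^{\pi/2} \dfrac{d\theta}{4 - 2\cos^2\theta}$.

As we saw in Example 5.28, this substitution gives us

$$d\theta = \frac{dt}{1 + t^2} \quad\text{and}\quad \cos^2\theta = \frac{1}{1 + t^2},$$

so that

$$4 - 2\cos^2\theta = 4 - \frac{2}{1 + t^2} = \frac{2 + 4t^2}{1 + t^2}.$$

Changing the limits of integration, we have

lower limit: $\theta = 0$ gives $t = \tan 0 = 0$, and

upper limit: $\theta = \pi/2$ gives $t = \tan(\pi/2) = \infty$,

and so

$$\int_0^{\pi/2} \frac{d\theta}{4 - 2\cos^2\theta} = \int_0^{\infty} \frac{1 + t^2}{2 + 4t^2}\,\frac{dt}{1 + t^2} = \frac{1}{4}\int_0^{\infty} \frac{dt}{\frac{1}{2} + t^2} = \frac{1}{4}\Big[\sqrt{2}\tan^{-1}(\sqrt{2}\,t)\Big]_0^{\infty} = \frac{\sqrt{2}\,\pi}{8}.$$

Note that, when we write $\tan(\pi/2) = \infty$ and $\tan^{-1}(\infty) = \pi/2$, what we really mean is $\tan\theta \to \infty$ as $\theta \to \pi/2$ and $\tan^{-1} t \to \pi/2$ as $t \to \infty$. This shorthand will be fine for this course, but in 176 Further Calculus, we will see how to do things like this properly.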
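The remark about what an infinite limit 'really means' can be illustrated numerically. The following Python sketch (with hypothetical names) evaluates the antiderivative from Example 5.37 at larger and larger values of $t$ and compares the result with $\sqrt{2}\,\pi/8$.

```python
import math

def antiderivative_in_t(t):
    """(1/4) * sqrt(2) * arctan(sqrt(2) * t), the antiderivative found in Example 5.37."""
    return math.sqrt(2) / 4 * math.atan(math.sqrt(2) * t)

limit_value = math.sqrt(2) * math.pi / 8

# As the upper limit t grows, the value of the integral creeps up towards limit_value.
for t in (1, 10, 100, 1e4, 1e8):
    print(t, antiderivative_in_t(t) - antiderivative_in_t(0), limit_value)
```

No finite value of $t$ ever reaches the limiting value, which is why the guide describes writing $\tan^{-1}(\infty) = \pi/2$ as a shorthand.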

Integration by parts

When evaluating a definite integral using integration by parts we use

$$\int_a^b f(x) g'(x) \, dx = \Big[f(x) g(x)\Big]_a^b - \int_a^b f'(x) g(x) \, dx,$$

i.e. we have to evaluate the $f(x)g(x)$ term using the limits of integration as well as evaluating the new [easier] definite integral.

Example 5.38

Find $\int_0^1 x \, e^x \, dx$.

To apply integration by parts, we choose

$$f(x) = x \quad\text{and}\quad g'(x) = e^x,$$

so that differentiating $f(x)$ and integrating $g'(x)$ we get

$$f'(x) = 1 \quad\text{and}\quad g(x) = e^x,$$

where we have suppressed the arbitrary constant from the integration. Applying the rule in the case of a definite integral then gives,

$$\int_0^1 x \, e^x \, dx = \Big[(x)(e^x)\Big]_0^1 - \int_0^1 (1)(e^x) \, dx = \Big[x \, e^x\Big]_0^1 - \int_0^1 e^x \, dx,$$

which leads to

$$\int_0^1 x \, e^x \, dx = \Big[(1)(e^1) - (0)(e^0)\Big] - \Big[e^x\Big]_0^1 = \left(e^1 - 0\right) - \left(e^1 - e^0\right) = 1,$$

as the answer.

Alternatively, using our indefinite integral from Example 5.18, we saw that integration by parts gave us

$$\int x \, e^x \, dx = (x - 1)\,e^x + c,$$

and so this means that, if we suppress the constant of integration, we get

$$\int_0^1 x \, e^x \, dx = \Big[(x - 1)\,e^x\Big]_0^1 = (1 - 1)\,e^1 - (0 - 1)\,e^0 = 0 - (-e^0) = 1,$$

as before.

5.4

Applications of integrals

Integrals can be used in economics and we now introduce two ways in which they can

arise in that subject. The first is what happens when we want to find a cost function

but we only know the marginal cost; and the second introduces the idea of consumer

and producer surpluses.

5.4.1

Suppose that the cost of producing a quantity, q, of goods is given by the cost function,

C(q). In Section 3.3.3, we met the idea of the marginal cost, MC(q), of producing q

units which was given by

dC

MC(q) =

,

dq

and this was useful since the approximation

C

MC(q)q,

i.e. where the quantity produced is increased from q to q + q. We now consider the

problem of finding the cost function, C(q), when we are given the marginal cost

function, MC(q). Indeed, as the marginal cost function is the derivative of the cost

function, we can see that

C(q) is an antiderivative of MC(q),

and so,

C(q) =

MC(q) dq.

(5.2)

However, this presents us with a problem as finding the indefinite integral on the

right-hand-side of (5.2) will yield all the antiderivatives of MC(q) i.e. a function C(q)

that contains an arbitrary constant whereas we want to find the particular

antiderivative that is actually the cost function i.e. we want to find a particular value

of this constant. So, the question is: Which value of the arbitrary constant will give us

the cost function? In order to answer this question, we need to be given more

information, say the fixed costs associated with this production, so that we can find the

right value for this constant. Let's consider an example.


Example 5.39 Suppose that a company has a marginal cost function given by

$$MC(q) = 2q + 100 e^{q},$$

and its fixed costs are 10,000. What is the cost function, C(q), for this company?

Using (5.2) above, we see that the cost function is given by the integral of the marginal cost, i.e.

$$C(q) = \int \left(2q + 100 e^{q}\right) dq = q^2 + 100 e^{q} + c,$$

where c is an arbitrary constant. This tells us, depending on the value of c, all of the possible cost functions for this company. But which one should we take? We want the one which also gives us fixed costs of 10,000, i.e. we want

$$C(0) = 10{,}000 \implies 10{,}000 = 0^2 + 100 e^{0} + c \implies 10{,}000 = 100 + c \implies c = 9{,}900,$$

as the fixed costs are the cost of producing nothing. Thus, the cost function for this company is given by

$$C(q) = q^2 + 100 e^{q} + 9{,}900,$$

as this function agrees with the question on both the marginal and the fixed costs of

production.
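The two steps used in this example (integrate the marginal cost, then fix the constant from the fixed costs) can be sketched in a few lines. This is only an illustration under the assumption that the marginal cost is $2q + 100e^q$ as in the example; the function names and the finite-difference check are not from the guide.

```python
import math

FIXED_COSTS = 10_000.0

def marginal_cost(q):
    # MC(q) = 2q + 100 e^q
    return 2 * q + 100 * math.exp(q)

def antiderivative(q):
    # One antiderivative of MC(q), with the constant of integration set to zero
    return q**2 + 100 * math.exp(q)

# Choose the constant so that C(0) equals the fixed costs
c = FIXED_COSTS - antiderivative(0.0)

def cost(q):
    return antiderivative(q) + c

# Check that the slope of C(q) at a sample point matches MC(q) there
h = 1e-6
q0 = 2.0
slope = (cost(q0 + h) - cost(q0 - h)) / (2 * h)

print(c, cost(0.0), slope, marginal_cost(q0))
```

The constant comes out as 9,900, and the central-difference slope should agree closely with the marginal cost at the sample quantity.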

5.4.2

Suppose that a market has linear supply and demand functions as illustrated in Figure 5.8. As we know from Section 2.1.5, the equilibrium price, $p^*$, and the equilibrium quantity, $q^*$, occur at the point where the graphs of these functions intersect. Indeed, at equilibrium, as the consumers buy $q^*$ units of the good at a price of $p^*$ per unit, they pay an amount $p^* q^*$ to the suppliers and we can think of this as the area of the hatched region in Figure 5.9(b).

However, if the consumers are willing to buy $q^*$ units of the good, it can be argued that the consumers would be willing to pay an amount given by

$$\int_0^{q^*} p_D(q) \, dq,$$

which is the area of the hatched region in Figure 5.9(a). The difference between the area that represents what they would pay and the area that represents what they actually pay, i.e. the area of the hatched region in Figure 5.9(d), is called the consumer surplus. Indeed, this consumer surplus, CS, can be found using the formula

$$CS = \int_0^{q^*} p_D(q) \, dq - p^* q^*,$$

and this is the amount that the consumers save by paying what they actually paid instead of what they would have paid.


Figure 5.8: Linear supply and demand functions for a market. Note that the equilibrium price, $p^*$, and the equilibrium quantity, $q^*$, occur at the point where the graphs of these functions intersect.

Similarly, if the suppliers are willing to supply $q^*$ units of the good, it can be argued that they need to be paid an amount given by

$$\int_0^{q^*} p_S(q) \, dq,$$

which is the area of the hatched region in Figure 5.9(c). The difference between the area that represents what they are actually paid and the area that represents what they need to be paid, i.e. the area of the hatched region in Figure 5.9(e), is called the producer surplus. Indeed, this producer surplus, PS, can be found using the formula

$$PS = p^* q^* - \int_0^{q^*} p_S(q) \, dq,$$

and this is the amount that the suppliers gain by being paid what they actually receive instead of what they need to receive. Let's look at a simple example.

Example 5.40 Suppose that a market has an inverse demand function given by

$$p_D(q) = 70 - \frac{1}{3} q,$$

and an inverse supply function given by

$$p_S(q) = 20 + \frac{1}{2} q.$$

Find the equilibrium price and quantity. What are the consumer and producer surpluses for this market?

The equilibrium quantity, $q^*$, makes the prices obtained from the inverse demand and supply functions equal, i.e.

$$70 - \frac{1}{3} q^* = 20 + \frac{1}{2} q^* \implies 50 = \frac{5}{6} q^* \implies q^* = 60,$$

[Figure 5.9, panels (a) to (e): hatched regions in the (q, p)-plane. Panel (d) is labelled "Consumer surplus: area (a) minus area (b)" and panel (e) is labelled "Producer surplus: area (b) minus area (c)".]

Figure 5.9: What people pay or need to be paid. (a) What the consumers would pay for a quantity $q^*$. (b) What the consumers pay for a quantity $q^*$ if the market is at equilibrium. (c) What the suppliers need to be paid for a quantity $q^*$. (d) What the consumers save if they pay for a quantity $q^*$ in a market that is at equilibrium; this is the consumer surplus. (e) What the producers gain if they sell a quantity $q^*$ in a market that is at equilibrium; this is the producer surplus.

and this means that the equilibrium price, $p^*$, is given by

$$p^* = 70 - \frac{1}{3}(60) = 70 - 20 = 50,$$

if we use the inverse demand function.

Hence, to find the consumer surplus, CS, we have

$$CS = \int_0^{q^*} p_D(q) \, dq - p^* q^*,$$

and so, as

$$\int_0^{60} \left(70 - \frac{1}{3} q\right) dq = \left[70q - \frac{q^2}{6}\right]_0^{60} = 70(60) - \frac{1}{6}(60)^2 - 0 = 4{,}200 - 600 = 3{,}600,$$

we find that

$$CS = 3{,}600 - (50)(60) = 3{,}600 - 3{,}000 = 600,$$

is the consumer surplus. And, to find the producer surplus, PS, we have

$$PS = p^* q^* - \int_0^{q^*} p_S(q) \, dq,$$

and so, as

$$\int_0^{60} \left(20 + \frac{1}{2} q\right) dq = \left[20q + \frac{q^2}{4}\right]_0^{60} = 20(60) + \frac{1}{4}(60)^2 - 0 = 1{,}200 + 900 = 2{,}100,$$

we find that

$$PS = (50)(60) - 2{,}100 = 3{,}000 - 2{,}100 = 900,$$

is the producer surplus.
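The calculation in Example 5.40 can be retraced with exact rational arithmetic. The sketch below is only an illustration: the function names are assumptions, and the areas under the two linear functions are taken from their antiderivatives $70q - q^2/6$ and $20q + q^2/4$.

```python
from fractions import Fraction

def inverse_demand(q):
    return 70 - Fraction(1, 3) * q

def inverse_supply(q):
    return 20 + Fraction(1, 2) * q

# Equilibrium: 70 - q/3 = 20 + q/2, i.e. 50 = (1/3 + 1/2) q
q_star = Fraction(70 - 20) / (Fraction(1, 3) + Fraction(1, 2))
p_star = inverse_demand(q_star)

def demand_area(q):
    # Integral of 70 - q/3 from 0 to q
    return 70 * q - Fraction(1, 6) * q**2

def supply_area(q):
    # Integral of 20 + q/2 from 0 to q
    return 20 * q + Fraction(1, 4) * q**2

cs = demand_area(q_star) - p_star * q_star
ps = p_star * q_star - supply_area(q_star)

print(q_star, p_star, cs, ps)  # 60 50 600 900
```

The same price is obtained from either function at equilibrium, so `inverse_supply(q_star)` gives 50 as well.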


Although, as both the demand and supply functions are linear in this example, there is

an easier way to find the consumer and producer surpluses as the next Activity shows.

Activity 5.21 Sketch the inverse demand and supply functions in the previous

example and shade in the regions which represent the consumer and producer

surplus. What are the areas of these regions?

Of course, the demand and supply functions that we are given may not be linear and, in

such cases, we would have to use integration to find the consumer and producer

surpluses.

Activity 5.22 Suppose that the demand equation for a good is

$$p(q + 1) = 231.$$

If the equilibrium quantity is 10, find the equilibrium price and hence determine the

consumer surplus.

Learning outcomes

At the end of this chapter and having completed the relevant reading and activities, you

should be able to:

find integrals using standard integrals and the rules of integration;

find integrals by simplifying the integrand using partial fractions and trigonometric

identities;

use integrals to find areas;

solve problems from economics-based subjects that involve integrals.


Solutions to activities

Solution to activity 5.1

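Given the linear combination rule, i.e.

$$\int [k f(x) + l g(x)] \, dx = k \int f(x) \, dx + l \int g(x) \, dx,$$

we can take $l = 0$ to get the constant multiple rule, i.e.

$$\int k f(x) \, dx = k \int f(x) \, dx + 0 \int g(x) \, dx = k \int f(x) \, dx,$$

we can take $k = l = 1$ to get the sum rule, i.e.

$$\int [f(x) + g(x)] \, dx = 1 \int f(x) \, dx + 1 \int g(x) \, dx = \int f(x) \, dx + \int g(x) \, dx,$$

and we can take $k = 1$ and $l = -1$ to get the difference rule, i.e.

$$\int [f(x) - g(x)] \, dx = 1 \int f(x) \, dx + (-1) \int g(x) \, dx = \int f(x) \, dx - \int g(x) \, dx.$$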

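Solution to activity 5.2

Suppose that F(x) and G(x) are antiderivatives of f(x) and g(x) respectively, i.e.

$$\frac{dF}{dx} = f(x) \quad \text{and} \quad \frac{dG}{dx} = g(x).$$

This means that

$$k \int f(x) \, dx + l \int g(x) \, dx = k F(x) + l G(x) + c,$$

where c is an arbitrary constant. But, by the linear combination rule for differentiation, we also have

$$\frac{d}{dx}\big[k F(x) + l G(x) + c\big] = k \frac{dF}{dx} + l \frac{dG}{dx} + 0 = k f(x) + l g(x),$$

which means that $k F(x) + l G(x) + c$ is also an antiderivative of $k f(x) + l g(x)$, i.e.

$$\int [k f(x) + l g(x)] \, dx = k F(x) + l G(x) + c.$$

Consequently, we have

$$\int [k f(x) + l g(x)] \, dx = k \int f(x) \, dx + l \int g(x) \, dx,$$

which is the linear combination rule.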


Solution to activity 5.3

For (a), use the constant multiple rule to see that

$$\int 3 \cos x \, dx = 3 \int \cos x \, dx = 3 \sin x + c,$$

where c is an arbitrary constant. For (b), we use the sum rule to see that

$$\int (e^x + \cos x) \, dx = \int e^x \, dx + \int \cos x \, dx = e^x + \sin x + c,$$

where c is an arbitrary constant. For (c), we use the linear combination rule to see that

$$\int \left(3 \sin x - \frac{3}{x}\right) dx = 3 \int \sin x \, dx - 3 \int \frac{1}{x} \, dx = 3(-\cos x) - 3 \ln |x| + c = -3 \cos x - 3 \ln |x| + c,$$

where c is an arbitrary constant.

Solution to activity 5.4

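For both of these integrals we use the substitution $g = 4x + 7$ so that we have

$$\frac{dg}{dx} = 4 \implies dg = 4 \, dx \implies dx = \frac{1}{4} \, dg.$$

Thus, the first integral is

$$\int \frac{1}{4x + 7} \, dx = \int \frac{1}{g} \left(\frac{1}{4} \, dg\right) = \frac{1}{4} \int \frac{1}{g} \, dg = \frac{1}{4} \ln |g| + c = \frac{1}{4} \ln |4x + 7| + c,$$

and the second integral is

$$\int e^{4x+7} \, dx = \int e^{g} \left(\frac{1}{4} \, dg\right) = \frac{1}{4} \int e^{g} \, dg = \frac{1}{4} e^{g} + c = \frac{1}{4} e^{4x+7} + c,$$

where c is an arbitrary constant.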

Solution to activity 5.5

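Using the standard integrals as a source of antiderivatives, we see that, if $n \neq -1$,

$$\int (ax + b)^n \, dx = \frac{1}{a} \, \frac{(ax + b)^{n+1}}{n+1} + c,$$

whereas, if $n = -1$, we have

$$\int (ax + b)^{-1} \, dx = \int \frac{1}{ax + b} \, dx = \frac{1}{a} \ln |ax + b| + c.$$

Similarly, we have

$$\int e^{ax+b} \, dx = \frac{1}{a} e^{ax+b} + c, \quad \int \sin(ax + b) \, dx = -\frac{1}{a} \cos(ax + b) + c, \quad \text{and} \quad \int \cos(ax + b) \, dx = \frac{1}{a} \sin(ax + b) + c,$$

where c is an arbitrary constant.

These results need $a \neq 0$. If $a = 0$, we are just integrating a constant, i.e. we have

$$\int b^n \, dx = x b^n + c,$$

for any n, as well as

$$\int e^{b} \, dx = x e^{b} + c \quad \text{and} \quad \int \cos b \, dx = x \cos b + c,$$

where c is an arbitrary constant.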

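Solution to activity 5.6

Using what we saw in Activity 5.5 we see that the integrals from Activity 5.4 are, simply,

$$\int \frac{1}{4x + 7} \, dx = \frac{1}{4} \ln |4x + 7| + c \quad \text{and} \quad \int e^{4x+7} \, dx = \frac{1}{4} e^{4x+7} + c,$$

where c is an arbitrary constant. These agree with the answers we found in Activity 5.4.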

Solution to activity 5.7

Taking $g = 3x^2 + 7$ we have $g'(x) = 6x$ and so $dg = 6x \, dx$, i.e. $x \, dx = \frac{1}{6} \, dg$. Hence, in the first integral, this substitution gives

$$\int \frac{x}{3x^2 + 7} \, dx = \int \frac{1}{g} \left(\frac{1}{6} \, dg\right) = \frac{1}{6} \int \frac{1}{g} \, dg = \frac{1}{6} \ln |g| + c = \frac{1}{6} \ln |3x^2 + 7| + c,$$

where c is an arbitrary constant whereas, in the second integral, this substitution gives

$$\int x e^{3x^2+7} \, dx = \int e^{g} \left(\frac{1}{6} \, dg\right) = \frac{1}{6} \int e^{g} \, dg = \frac{1}{6} e^{g} + c = \frac{1}{6} e^{3x^2+7} + c,$$

where c is an arbitrary constant. In both cases, note that the extra x in the integrand was actually needed for the substitution $g = 3x^2 + 7$ to work.

Solution to activity 5.8

Here the composition is $\sin(x^2)$ and so we take $g = x^2$. As such, we have

$$\frac{dg}{dx} = 2x \implies x \, dx = \frac{1}{2} \, dg,$$

which is a constant multiple of the other part of the product in the integrand, i.e. this substitution will work. Thus, the substitution gives

$$\int x \sin(x^2) \, dx = \int \sin(g) \left(\frac{1}{2} \, dg\right) = \frac{1}{2} \int \sin(g) \, dg = -\frac{1}{2} \cos(g) + c = -\frac{1}{2} \cos(x^2) + c,$$

where c is an arbitrary constant. Here, of course, the extra x in the integrand was needed for the substitution $g = x^2$ to work.

Solution to activity 5.9

Here the composition is $\cos^2 x$ and so we take $g = \cos x$. As such, we have

$$\frac{dg}{dx} = -\sin x,$$

which, up to a minus, is the other part of the product in the integrand, i.e. this substitution will work. Thus, we see that

$$dg = -\sin x \, dx,$$

and so the substitution gives

$$\int \cos^2 x \, \sin x \, dx = \int g^2 (-dg) = -\int g^2 \, dg = -\frac{g^3}{3} + c = -\frac{1}{3} \cos^3 x + c,$$

where c is an arbitrary constant. Here, of course, the extra $\sin x$ in the integrand was needed for the substitution $g = \cos x$ to work.

Solution to activity 5.10

In Activity 2.4, we saw that

$$\cot x = \frac{\cos x}{\sin x},$$

which means that the composition is $(\sin x)^{-1}$ and so we take $g = \sin x$. As such, we have

$$\frac{dg}{dx} = \cos x,$$

which is the other part of the product in the integrand, i.e. this substitution will work. Thus, we see that

$$dg = \cos x \, dx,$$

and so the substitution gives

$$\int \cot x \, dx = \int \frac{\cos x}{\sin x} \, dx = \int \frac{dg}{g} = \ln |g| + c = \ln |\sin x| + c,$$

where c is an arbitrary constant. Here, of course, the extra $\cos x$ in the integrand was needed for the substitution $g = \sin x$ to work.

Solution to activity 5.11

We note that the quadratic expression in the denominator can be written as

$$x^2 + 2x + 2 = (x + 1)^2 + 1,$$

if we complete the square. As such, we have

$$\int \frac{dx}{x^2 + 2x + 2} = \int \frac{dx}{(x + 1)^2 + 1} = \tan^{-1}(x + 1) + c,$$

using the result we derived in Example 5.17. (A useful exercise at this point is to try and get this answer by actually making the substitution $x + 1 = \tan\theta$ as we did in that example.)

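Solution to activity 5.12

Using the change of base formula for logarithms from Section 2.1.4, i.e.

$$\log_a(x) = \frac{\ln x}{\ln a},$$

we have

$$\int \log_a x \, dx = \frac{1}{\ln a} \int \ln x \, dx = \frac{1}{\ln a} \big[x \ln(x) - x\big] + c = x \log_a(x) - \frac{x}{\ln a} + c,$$

where c is an arbitrary constant.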

Solution to activity 5.13

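To find $\int x \ln x \, dx$ using integration by parts with

$$f(x) = x \quad \text{and} \quad g'(x) = \ln x,$$

we differentiate f(x) and integrate g'(x) using the result in Example 5.20 to get

$$f'(x) = 1 \quad \text{and} \quad g(x) = x \ln x - x,$$

where we have suppressed the arbitrary constant from the integration. Applying the rule then gives

$$\int x \ln x \, dx = x(x \ln x - x) - \int (1)(x \ln x - x) \, dx$$

$$= x^2 \ln x - x^2 - \int x \ln x \, dx + \frac{x^2}{2} + c$$

$$= x^2 \ln x - \frac{x^2}{2} - \int x \ln x \, dx + c,$$

so that, taking the integral on the right-hand-side over to the left-hand-side, we have

$$2 \int x \ln x \, dx = x^2 \ln x - \frac{x^2}{2} + c \implies \int x \ln x \, dx = \frac{x^2}{2} \ln x - \frac{x^2}{4} + c,$$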

where c is an arbitrary constant. Notice that this is the same as the answer we found in

Example 5.19 but it is slightly trickier to get and we need to know the answer to

Example 5.20.

Solution to activity 5.14

Unlike what we saw in Example 5.21, it would actually make more sense to find

$$\int (x^2 + 1)^2 x^2 \, dx,$$

by multiplying out the brackets and integrating term-by-term rather than integrating it by parts. Doing this, we get

$$\int (x^2 + 1)^2 x^2 \, dx = \int (x^6 + 2x^4 + x^2) \, dx = \frac{x^7}{7} + \frac{2}{5} x^5 + \frac{x^3}{3} + c,$$

where c is an arbitrary constant. Indeed, to verify that this is the same answer as the one we saw in the example, it is easiest to take the earlier answer and note that

$$\frac{x^3}{3}(x^2 + 1)^2 - \frac{4}{3}\left(\frac{x^7}{7} + \frac{x^5}{5}\right) + c = \frac{x^3}{3}(x^4 + 2x^2 + 1) - \frac{4}{3}\left(\frac{x^7}{7} + \frac{x^5}{5}\right) + c$$

$$= \frac{x^7}{3} + \frac{2}{3} x^5 + \frac{x^3}{3} - \frac{4}{21} x^7 - \frac{4}{15} x^5 + c$$

$$= \frac{x^7}{7} + \frac{2}{5} x^5 + \frac{x^3}{3} + c,$$

which agrees with the answer found above.

Solution to activity 5.15

To find this integral we also use the other double-angle formula from Activity 2.18, namely

$$\cos(2x) = 2 \cos^2 x - 1 \implies \cos^2 x = \frac{1}{2}\big[1 + \cos(2x)\big],$$

as this allows us to write the problematic integrand $\cos^2 x$ in terms of the function $\cos(2x)$ which is far easier to integrate. This means that we have

$$\int \cos^2 x \, dx = \frac{1}{2} \int \big[1 + \cos(2x)\big] \, dx = \frac{1}{2}\left[x + \frac{1}{2} \sin(2x)\right] + c,$$

where c is an arbitrary constant.

Solution to activity 5.16

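Using the first step, we can see that

$$\int_a^b f(x) \, dx = \Big[F(x) + c\Big]_a^b,$$

and then, using the second step, we get (see footnote 8)

$$\Big[F(x) + c\Big]_a^b = \big[F(b) + c\big] - \big[F(a) + c\big] = F(b) - F(a),$$

which is exactly what we wanted. That is, including a constant of integration does not affect the value of a definite integral and so we can omit it.

Footnote 8: In what follows, bear in mind that a constant such as c, when evaluated at either x = a or x = b, is just c.

Solution to activity 5.17

For definite integrals, it should be easy to see that we have the

constant multiple rule: If k is a constant and f(x) is a function, then

$$\int_a^b k f(x) \, dx = k \int_a^b f(x) \, dx.$$

sum rule: If f(x) and g(x) are functions, then

$$\int_a^b [f(x) + g(x)] \, dx = \int_a^b f(x) \, dx + \int_a^b g(x) \, dx.$$

difference rule: If f(x) and g(x) are functions, then

$$\int_a^b [f(x) - g(x)] \, dx = \int_a^b f(x) \, dx - \int_a^b g(x) \, dx.$$

Suppose that F(x) and G(x) are antiderivatives of f(x) and g(x) respectively, i.e.

$$\frac{dF}{dx} = f(x) \quad \text{and} \quad \frac{dG}{dx} = g(x).$$

This means that, for constants k and l,

$$k \int_a^b f(x) \, dx + l \int_a^b g(x) \, dx = k \Big[F(x)\Big]_a^b + l \Big[G(x)\Big]_a^b = \Big[k F(x) + l G(x)\Big]_a^b.$$

But, by the linear combination rule for differentiation, we also have

$$\frac{d}{dx}\big[k F(x) + l G(x)\big] = k \frac{dF}{dx} + l \frac{dG}{dx} = k f(x) + l g(x),$$

which means that $k F(x) + l G(x)$ is also an antiderivative of $k f(x) + l g(x)$, i.e.

$$\int_a^b [k f(x) + l g(x)] \, dx = \Big[k F(x) + l G(x)\Big]_a^b.$$

Consequently, we have

$$\int_a^b [k f(x) + l g(x)] \, dx = k \int_a^b f(x) \, dx + l \int_a^b g(x) \, dx,$$

which is the linear combination rule for definite integrals.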

Solution to activity 5.19

As the region in Example 5.31 is symmetric about the y-axis it should be clear that we have an area given by

$$\int_{-1}^{1} (4 - x^2) \, dx = \int_{-1}^{0} (4 - x^2) \, dx + \int_{0}^{1} (4 - x^2) \, dx,$$

where the values of the two integrals on the right-hand-side, i.e. the areas they represent, are equal. As such, we can write

$$\int_{-1}^{1} (4 - x^2) \, dx = 2 \int_{0}^{1} (4 - x^2) \, dx,$$

if we decide to find the second of these integrals. Then, looking at the integral on the right-hand-side, we get

$$\int_{0}^{1} (4 - x^2) \, dx = \left[4x - \frac{x^3}{3}\right]_0^1 = \left[4(1) - \frac{(1)^3}{3}\right] - \left[4(0) - \frac{(0)^3}{3}\right] = \frac{11}{3},$$

so that the area is $2 \times \frac{11}{3} = \frac{22}{3}$.

Solution to activity 5.20

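Looking at the triangles in Figure 5.5(b), we use half times base times height to see that the area of the triangle on the left is

$$\frac{1}{2} \times 2 \times 4 = 4,$$

whereas the area of the triangle on the right is

$$\frac{1}{2} \times 2 \times 4 = 4.$$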

As such, the total area is eight as we found in Example 5.33.

Solution to activity 5.21

A sketch of the inverse supply and demand functions from Example 5.40 is given in

Figure 5.10 and the shaded regions are the consumer and producer surpluses as

indicated. Notice that we have also labelled the equilibrium price and quantity, which

we found in the example, on the sketch. Indeed, from this sketch it should be clear that:

The consumer surplus, CS, is the area of a triangle of base 60 and height 20, i.e. we can use half times base times height to see that

$$CS = \frac{1}{2} \times 60 \times 20 = 600.$$

The producer surplus, PS, is the area of a triangle of base 60 and height 30, i.e. we can use half times base times height to see that

$$PS = \frac{1}{2} \times 60 \times 30 = 900.$$

These agree with the values found in Example 5.40.

[Figure 5.10: a sketch in the (q, p)-plane with the shaded regions labelled CS and PS; the values 20, 50, 60 and 70 appear as axis labels.]

Figure 5.10: A sketch of the consumer and producer surpluses for Activity 5.21.

Solution to activity 5.22

As the demand equation is $p(q + 1) = 231$, we see that the inverse demand function is

$$p_D(q) = \frac{231}{q + 1},$$

and so, as the equilibrium quantity is $q^* = 10$, the equilibrium price is $p^* = p_D(q^*) = 21$. This means that, using

$$CS = \int_0^{q^*} p_D(q) \, dq - p^* q^*,$$

we need to find

$$\int_0^{10} \frac{231}{q + 1} \, dq = \Big[231 \ln |q + 1|\Big]_0^{10} = 231 \ln 11 - 231 \ln 1 = 231 \ln 11,$$

so that

$$CS = 231 \ln 11 - (21)(10) = 231 \ln 11 - 210,$$

for the consumer surplus.
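When the demand function is not linear, as here, a numerical estimate is a useful sanity check on the integration. The sketch below assumes the inverse demand function $231/(q+1)$ and the equilibrium quantity 10 from this activity; the midpoint-rule helper and its name are illustrative assumptions.

```python
import math

def inverse_demand(q):
    # p_D(q) = 231 / (q + 1), from the demand equation p(q + 1) = 231
    return 231 / (q + 1)

def midpoint_integral(f, a, b, n=100_000):
    """Approximate the definite integral of f over [a, b] with the midpoint rule."""
    h = (b - a) / n
    return h * sum(f(a + (i + 0.5) * h) for i in range(n))

q_star = 10
p_star = inverse_demand(q_star)

# What consumers would be willing to pay, minus what they actually pay
cs_numeric = midpoint_integral(inverse_demand, 0.0, q_star) - p_star * q_star

# Closed form from the solution: 231 ln 11 - 210
cs_exact = 231 * math.log(11) - 210

print(p_star, cs_numeric, cs_exact)
```

The two values of the consumer surplus should agree to several decimal places.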

Exercises

Exercise 5.1

Find the following indefinite integrals.

(a) $\int \sin^3 x \, \cos x \, dx$,

(b) $\int \sin^3 x \, dx$,

(c) $\int (x + 2) \ln x \, dx$.

Exercise 5.2

Find

1 + ex

dx.

ex

Exercise 5.3

Find $\int \frac{x^2}{x^2 - 1} \, dx$.

Exercise 5.4

Use the substitution $t = \tan\frac{x}{2}$ to evaluate $\int_0^{\pi/2} \frac{dx}{2 + \cos x}$.

Exercise 5.5

Find the area of the region between the curve $y = x^3$, the x-axis and the vertical lines $x = -1$ and $x = 2$.


Solutions to exercises

Solution to exercise 5.1

For (a), we have to find

$$\int \sin^3 x \, \cos x \, dx,$$

and we notice that the integrand involves the composition $\sin^3 x$. This suggests that we should make the substitution $g = \sin x$ and, as this gives us

$$\frac{dg}{dx} = \cos x \implies dg = \cos x \, dx,$$

which is the other part of the product in the integrand, we can be sure that this will work. So, using this substitution we get

$$\int \sin^3 x \, \cos x \, dx = \int g^3 \, dg = \frac{g^4}{4} + c = \frac{1}{4} \sin^4 x + c,$$

where c is an arbitrary constant.

For (b), we have to find

$$\int \sin^3 x \, dx,$$

and we note that, using the trigonometric identity $\sin^2 x = 1 - \cos^2 x$ from (2.2), this can be written as

$$\int \sin^3 x \, dx = \int (1 - \cos^2 x) \sin x \, dx = \int \sin x \, dx - \int \cos^2 x \, \sin x \, dx.$$

Of course, the first of these integrals on the right-hand-side is trivial and the other was found in Activity 5.9. So, using this, we find that

$$\int \sin^3 x \, dx = \int \sin x \, dx - \int \cos^2 x \, \sin x \, dx = -\cos x - \left(-\frac{1}{3} \cos^3 x\right) + c = -\cos x + \frac{1}{3} \cos^3 x + c,$$

where c is an arbitrary constant.

For (c), we have to find

$$\int (x + 2) \ln x \, dx,$$

and we note that the integrand is a product. This suggests that we should use integration by parts with

$$f(x) = \ln x \quad \text{and} \quad g'(x) = x + 2,$$

like we did in Example 5.19. So, differentiating f(x) and integrating g'(x) we get

$$f'(x) = \frac{1}{x} \quad \text{and} \quad g(x) = \frac{x^2}{2} + 2x,$$

where we have suppressed the arbitrary constant from the integration. Applying the rule then gives

$$\int (x + 2) \ln x \, dx = (\ln x)\left(\frac{x^2}{2} + 2x\right) - \int \left(\frac{x^2}{2} + 2x\right) \frac{1}{x} \, dx = \frac{x}{2}(x + 4) \ln x - \int \left(\frac{x}{2} + 2\right) dx,$$

and, clearly, the new integral is easier to find. Thus, finding this integral, we get

$$\int (x + 2) \ln x \, dx = \frac{x}{2}(x + 4) \ln x - \int \left(\frac{x}{2} + 2\right) dx = \frac{x}{2}(x + 4) \ln x - \left(\frac{x^2}{4} + 2x\right) + c,$$

where c is an arbitrary constant.

Solution to exercise 5.2

It makes sense to start by rewriting the integral so that the integrand is

$$\left(1 + e^{-x/2}\right)^{1/2} e^{-x/2},$$

since, in this form, we can see that we have the composition $(1 + e^{-x/2})^{1/2}$ in the integrand. This suggests that we should make the substitution $g = 1 + e^{-x/2}$ and, as this gives us

$$\frac{dg}{dx} = -\frac{1}{2} e^{-x/2} \implies -2 \, dg = e^{-x/2} \, dx,$$

which is the other part of the product in the integrand, we can be sure that this will work. So, using this substitution we get

$$\int \left(1 + e^{-x/2}\right)^{1/2} e^{-x/2} \, dx = \int g^{1/2} (-2 \, dg) = -2 \int g^{1/2} \, dg = -2 \, \frac{g^{3/2}}{3/2} + c = -\frac{4}{3}\left(1 + e^{-x/2}\right)^{3/2} + c,$$

where c is an arbitrary constant.

Solution to exercise 5.3

The integral

$$\int \frac{x^2}{x^2 - 1} \, dx,$$

has an integrand which is the quotient of two polynomials. But, as these have the same degree, we cannot use the method of partial fractions on it as it stands. Instead, we start by rewriting the integrand as

$$\frac{x^2}{x^2 - 1} = \frac{x^2 - 1 + 1}{x^2 - 1} = 1 + \frac{1}{x^2 - 1},$$

and then use the method of partial fractions on

$$\frac{1}{x^2 - 1},$$

as the degree of its numerator is less than the degree of its denominator. That is, since $x^2 - 1 = (x - 1)(x + 1)$, we have distinct linear factors and so we can write

$$\frac{1}{x^2 - 1} = \frac{1}{(x - 1)(x + 1)} = \frac{A_1}{x - 1} + \frac{A_2}{x + 1},$$

for some constants $A_1$ and $A_2$. To find these constants, we put the terms on the right-hand-side over a common denominator to see that

$$\frac{1}{(x - 1)(x + 1)} = \frac{A_1 (x + 1) + A_2 (x - 1)}{(x - 1)(x + 1)},$$

and so, comparing the numerators, we need

$$1 = A_1 (x + 1) + A_2 (x - 1).$$

Indeed, setting $x = 1$ on both sides, we see that $1 = 2A_1$ whereas setting $x = -1$ on both sides, we see that $1 = -2A_2$. Thus, we have

$$\frac{1}{x^2 - 1} = \frac{1/2}{x - 1} + \frac{-1/2}{x + 1},$$

using the values of $A_1$ and $A_2$ that we have found. Consequently, putting this all together, we find that

$$\int \frac{x^2}{x^2 - 1} \, dx = \int \left(1 + \frac{1/2}{x - 1} + \frac{-1/2}{x + 1}\right) dx = x + \frac{1}{2} \ln |x - 1| - \frac{1}{2} \ln |x + 1| + c,$$

where c is an arbitrary constant.
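A quick way to gain confidence in a partial fractions calculation is to compare both forms of the integrand at a few points, and to check that the slope of the proposed antiderivative matches the integrand. The sketch below does this for the result of Exercise 5.3; the sample points and helper names are illustrative assumptions.

```python
import math

def integrand(x):
    return x**2 / (x**2 - 1)

def partial_fraction_form(x):
    # 1 + (1/2)/(x - 1) + (-1/2)/(x + 1)
    return 1 + 0.5 / (x - 1) - 0.5 / (x + 1)

def antiderivative(x):
    # x + (1/2) ln|x - 1| - (1/2) ln|x + 1|, with the constant suppressed
    return x + 0.5 * math.log(abs(x - 1)) - 0.5 * math.log(abs(x + 1))

def central_difference(f, x, h=1e-5):
    """Estimate f'(x) numerically."""
    return (f(x + h) - f(x - h)) / (2 * h)

# Points chosen away from x = 1 and x = -1, where the integrand is undefined
sample_points = [-3.0, -0.5, 0.25, 2.0, 5.0]

max_gap = max(abs(integrand(x) - partial_fraction_form(x)) for x in sample_points)
max_slope_gap = max(
    abs(central_difference(antiderivative, x) - integrand(x)) for x in sample_points
)

print(max_gap, max_slope_gap)
```

Both gaps should be tiny, reflecting only rounding and finite-difference error.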

Solution to exercise 5.4

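We are asked to evaluate the definite integral

$$\int_0^{\pi/2} \frac{dx}{2 + \cos x},$$

using the substitution $t = \tan(x/2)$. This substitution, like the substitution $t = \tan\theta$ that we saw in Example 5.28, is very useful and so we start by seeing how it can be applied. Firstly, we note that we can easily write $\sin(x/2)$ and $\cos(x/2)$ in terms of t by using a right-angled triangle like the one in Figure 5.1 as this immediately tells us that

$$\sin\frac{x}{2} = \frac{t}{\sqrt{1 + t^2}} \quad \text{and} \quad \cos\frac{x}{2} = \frac{1}{\sqrt{1 + t^2}}.$$

So, using the double-angle formula $\cos(2x) = \cos^2 x - \sin^2 x$ from (2.6), we see that the denominator of our integrand can be written as

$$2 + \cos x = 2 + \cos^2\frac{x}{2} - \sin^2\frac{x}{2} = 2 + \frac{1}{1 + t^2} - \frac{t^2}{1 + t^2} = \frac{3 + t^2}{1 + t^2}.$$

Secondly, we note that

$$\frac{dt}{dx} = \frac{1}{2} \sec^2\frac{x}{2},$$

and so, since $\sec(x/2)$ is the reciprocal of $\cos(x/2)$ as we saw in Section 2.1.2, we have

$$dx = \frac{2 \, dt}{1 + t^2},$$

in terms of t. Thirdly, as this is a definite integral, we also have to change the limits of integration, i.e.

lower limit: $x = 0$ gives $t = \tan(0) = 0$.

upper limit: $x = \pi/2$ gives $t = \tan(\pi/4) = 1$.

Thus, returning to the integral, we have

$$\int_0^{\pi/2} \frac{dx}{2 + \cos x} = \int_0^1 \frac{1 + t^2}{3 + t^2} \cdot \frac{2 \, dt}{1 + t^2} = 2 \int_0^1 \frac{dt}{3 + t^2},$$

and so we find that

$$\int_0^{\pi/2} \frac{dx}{2 + \cos x} = 2 \left[\frac{1}{\sqrt{3}} \tan^{-1}\frac{t}{\sqrt{3}}\right]_0^1 = \frac{2}{\sqrt{3}}\left[\tan^{-1}\frac{1}{\sqrt{3}} - \tan^{-1} 0\right] = \frac{2}{\sqrt{3}}\left[\frac{\pi}{6} - 0\right] = \frac{\pi}{3\sqrt{3}}.$$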
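Since a substitution in a definite integral changes the integrand, the differential and the limits all at once, it is worth checking numerically that nothing was lost along the way. The sketch below compares the original integral in x, the transformed integral in t and the closed form $\pi/(3\sqrt{3})$; the midpoint-rule helper is an illustrative assumption.

```python
import math

def midpoint_integral(f, a, b, n=100_000):
    """Approximate the definite integral of f over [a, b] with the midpoint rule."""
    h = (b - a) / n
    return h * sum(f(a + (i + 0.5) * h) for i in range(n))

# Original integral: dx / (2 + cos x) from 0 to pi/2
in_x = midpoint_integral(lambda x: 1 / (2 + math.cos(x)), 0.0, math.pi / 2)

# After t = tan(x/2): 2 dt / (3 + t^2) from 0 to 1
in_t = midpoint_integral(lambda t: 2 / (3 + t**2), 0.0, 1.0)

# Closed form obtained from the inverse tangent
closed_form = math.pi / (3 * math.sqrt(3))

print(in_x, in_t, closed_form)
```

All three numbers should agree closely, at roughly 0.6046.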

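Solution to exercise 5.5

To find the area of the region between the curve $y = x^3$, the x-axis and the vertical lines $x = -1$ and $x = 2$, we note that the curve will be similar to what we saw in Figure 2.2(c) and so the region we are looking at is the one illustrated in Figure 5.11.

[Figure 5.11: the curve $y = x^3$ plotted against the x and y axes, with the region of interest hatched.]

Figure 5.11: The hatching indicates the region of interest in Exercise 5.5.

In particular, we see that the curve crosses the x-axis when $x = 0$ and that the function is non-positive when $-1 \le x \le 0$ and non-negative when $0 \le x \le 2$. As such, we split the total region into two sub-regions to see that:

Between $x = -1$ and $x = 0$ we evaluate the definite integral,

$$\int_{-1}^{0} x^3 \, dx = \left[\frac{x^4}{4}\right]_{-1}^{0} = \frac{0^4}{4} - \frac{(-1)^4}{4} = -\frac{1}{4},$$

and so, as this region lies below the x-axis, its area is $\frac{1}{4}$.

Between $x = 0$ and $x = 2$ we evaluate the definite integral,

$$\int_{0}^{2} x^3 \, dx = \left[\frac{x^4}{4}\right]_{0}^{2} = \frac{2^4}{4} - \frac{0^4}{4} = \frac{16}{4} = 4,$$

and so the area of this region is 4.

Consequently, the total area of this region is $\frac{1}{4} + 4 = \frac{17}{4}$.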
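The reason for splitting the region at $x = 0$ can be seen by comparing the area with the signed integral over the whole interval. The sketch below uses exact fractions and the antiderivative $x^4/4$; the variable names are illustrative assumptions.

```python
from fractions import Fraction

def antiderivative(x):
    # x^4 / 4, an antiderivative of x^3
    return Fraction(x) ** 4 / 4

# Signed integrals over the two sub-regions
signed_left = antiderivative(0) - antiderivative(-1)   # region below the x-axis
signed_right = antiderivative(2) - antiderivative(0)   # region above the x-axis

# The area counts both pieces as positive
area = abs(signed_left) + abs(signed_right)

# Integrating straight across lets the two pieces partly cancel
signed_total = antiderivative(2) - antiderivative(-1)

print(area, signed_total)  # 17/4 15/4
```

Integrating from $-1$ to 2 in one go gives 15/4, which is not the area, because the part below the axis counts negatively.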


Chapter 6

Functions of several variables

Essential reading

(For full publication details, see Chapter 1.)

Binmore and Davies (2002) Sections 3.1–3.9.

Anthony and Biggs (1996) Chapters 11 and 12.

Further reading

Simon and Blume (1994) parts of 13.1–13.2, parts of 14.1–14.6 and 14.8, parts of 15.1–15.2.

Adams and Essex (2010) parts of Chapter 12.

Aims and objectives

The objectives of this chapter are as follows.

To understand that functions of two variables represent surfaces and see how to

visualise these surfaces using sections and contours.

To introduce partial derivatives and use them in various contexts.

To introduce tangent planes, gradient vectors, directional derivatives and Taylor

series for functions of two variables.

Specific learning outcomes can be found near the end of this chapter.

6.1 Introduction

In Section 2.1, we saw that a function $f : \mathbb{R} \to \mathbb{R}$ was a rule which takes an input, $x \in \mathbb{R}$, and gives us a unique output, $f(x) \in \mathbb{R}$. We now turn our attention to functions of two variables, i.e. functions where the input consists of a pair of numbers, $(x, y) \in \mathbb{R}^2$, and whose output is a unique number $f(x, y) \in \mathbb{R}$ (see footnote 1). In particular, we will mainly be concerned with functions of two variables where the variables are independent, i.e. the value of x can be chosen independently of the value of y and vice versa.

Footnote 1: The theory we consider extends to the general case where the input consists of n numbers $(x_1, x_2, \ldots, x_n)$. This extension to functions of n variables (with $n \ge 3$) should be obvious and so we do not spend much time on it here. However, although we will mainly be dealing with the two-variable case, we will occasionally consider functions of more than two variables.

As we shall see, functions of two variables often occur in economics and other fields where we might wish to apply mathematical techniques. Two important examples of such functions from economics are:

The production function of a firm, q(k, l), gives the amount it produces when using

k units of capital and l units of labour.

The utility function of a consumer, u(x1 , x2 ), describes how much utility a

consumer derives from a bundle (x1 , x2 ) of two goods. As such it enables us to

compare the preferences of the consumer when he is confronted with different

combinations of these two goods.

These applications will be discussed later because, before we consider what we may

want to use them for, we want to know how we can visualise what is going on when we

have a function of two variables.

6.2 Surfaces

Given a function of two variables, f(x, y), we can think of any input (a, b) as a point in the (x, y)-plane and the output will be the corresponding value of f, i.e. f(a, b), which we can take to be the number c. That is, generally speaking, each point (x, y) in the (x, y)-plane will have an output given by the corresponding value of f, i.e. f(x, y), which we can take to be the value of another variable z. As such, to visualise a function of two variables we need three axes, two to represent the inputs, i.e. x and y, and one to represent the output, i.e. z. Drawing these as in Figure 6.1, we take the (x, y)-plane of the inputs to correspond to points where z = 0, i.e. the input (a, b) is represented on our axes by the point (a, b, 0), and then the output of z = f(x, y) is represented on our axes by the point (a, b, c) which is a vertical distance c above the point (a, b, 0) in the (x, y)-plane.

If we do this for all possible inputs $(x, y) \in \mathbb{R}^2$ we obtain a surface in three-dimensional space whose equation is given by z = f(x, y). For instance, the surfaces obtained from three different functions of two variables, namely

$$f(x, y) = x^2 + y^2, \quad g(x, y) = x^2 - y^2 \quad \text{and} \quad h(x, y) = -x^2 - y^2,$$

are illustrated in Figure 6.2.

Of course, it would be difficult for us to sketch such surfaces by hand and, indeed, it is hard enough to even contemplate how and why they look like they do without a computer. But, as we shall soon see, it is possible to get some feel for what these surfaces look like by thinking about how we can represent them in a two-dimensional way. However, before we do that, let's take a moment to look at some far simpler surfaces than the ones in Figure 6.2, namely those that can arise from linear functions of two variables, as these turn out to be planes.

Figure 6.1: Representing the point (a, b, c) using the x, y and z-axes in $\mathbb{R}^3$.

6.2.1 Planes

The simplest kind of two-variable function is one which is linear in x and y, i.e. where

z = f (x, y) = ax + by,

for some constants a and b. Such functions represent planes and, generally speaking,

any surface which has an equation of the form

ax + by + cz = d,

where at least one of the constants a, b and c is non-zero will represent a plane. For

what follows, the important kinds of plane are, basically, those that fall into the

following categories:

The (x, y), (y, z) and (x, z)-planes which have equations z = 0, x = 0 and y = 0

respectively. (These are the planes in the middle of the three planes illustrated in

Figures 6.3(a), (b) and (c) respectively.)

Planes parallel to the (x, y), (y, z) and (x, z)-planes which, for some constant c, will

have equations z = c, x = c and y = c respectively. (These are the other planes

illustrated in Figures 6.3(a), (b) and (c) respectively.)

Planes which don't fall into either of the above categories, i.e. those with equations of

the form

ax + by + cz = d,

for some constants a, b, c and d (where at least two of the constants a, b and c are

non-zero) will not overly concern us here even though you will come across them in

Section 2.11 of 173 Algebra.

Figure 6.2: [panels (a), (b) and (c); the start of the caption is missing] ... reflection of (a) in the (x, y)-plane as $h(x, y) = -f(x, y)$.

Figure 6.3: Planes parallel to the (x, y), (y, z) and (x, z)-planes: (a) From bottom, $z = -10, 0, 10$; (b) From left $x = -10, 0, 10$ and (c) From right $y = -10, 0, 10$. (Note, in particular, how the axes are labelled in these pictures.)

6.2.2

Although curve sketching (which is sketching the graph of a function of one variable) is

important in this course, you will not be asked to sketch surfaces (such as the ones

illustrated above in Figure 6.2) for functions of two variables. However, there are useful

ways of visualising such surfaces which do not involve sketching them in three dimensions.

One of these is to use planes, such as the ones we saw in Figure 6.3, to carve up a

three-dimensional illustration of a surface into two-dimensional representations in terms

of contours and sections. In particular, these ideas may be familiar to you from your

experiences with maps (for contours) and other technical diagrams (for sections).

Horizontal planes and the contours of a surface

One way of visualising a surface is to look at its contours, which are the curves of

intersection that arise when we look at the points of intersection of a surface with

planes that are parallel to the (x, y)-plane. To find the contours, we take a plane


parallel to the (x, y)-plane, say the plane z = c, and find the curve of intersection

between it and the surface z = f (x, y), i.e. the curve with equation c = f (x, y). This

curve is the z = c contour, i.e. the set of points (x, y) which give z = c when we put

them into the equation z = f (x, y).

Example 6.1 Find the z = 2 contour of the surface $z = x - y + 4$. Repeat for z = 4 and z = 6.

To find the z = 2 contour of the surface $z = x - y + 4$ we need to find the curve of intersection, which in this case, is given by

$$2 = x - y + 4.$$

Rearranging this gives the equation $y = x + 2$ which is the equation of a straight line. Similarly, we find that:

For z = 4, the curve of intersection is given by $4 = x - y + 4$ which gives us $y = x$.

For z = 6, the curve of intersection is given by $6 = x - y + 4$ which gives us $y = x - 2$.

Thus, we see from these equations that these two contours are straight lines as well. The surface and its contours are illustrated in Figure 6.4.
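The recipe used here (set z = c and rearrange for y) can be sketched as a small function. This is only an illustration for the surface $z = x - y + 4$; the names `surface` and `contour_intercept` are assumptions made for the sketch.

```python
def surface(x, y):
    # z = x - y + 4
    return x - y + 4

def contour_intercept(c):
    """y-intercept of the z = c contour, which is the line y = x + (4 - c)."""
    return 4 - c

intercepts = {}
for c in (2, 4, 6):
    b = contour_intercept(c)
    intercepts[c] = b
    # Every point (x, x + b) on the line should map to z = c
    assert all(surface(x, x + b) == c for x in range(-5, 6))
    print(f"z = {c}: y = x + ({b})")
```

Each contour has gradient 1 and only the intercept changes with c, which is why the contours are parallel lines.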

Figure 6.4: For Example 6.1. (a) The surface $z = x - y + 4$ and, from the bottom, the planes z = 2, 4, 6. (b) The curves of intersection of the surface and the planes in (a) with their corresponding values of z. (c) The contours: Each line represents a contour (i.e. the points with coordinates (x, y) that map to a particular value of z). In this case, the further to the right the line is, the larger the corresponding value of z is, as we have z = 2, 4, 6 as we move from left to right. Notice that, here, the contours are parallel straight lines (i.e. they have the same gradient but different y-intercepts).

Activity 6.1 Find the equations of the z = −10, z = 0 and z = 10 contours of the

surface z = 4x + 2y 2 and sketch these in the (x, y)-plane clearly labelling the

value of z which is associated with each contour.


Example 6.2 Find the z = 16 contour of the surface $z = x^2 + y^2$. What are the z = c contours of the surface $z = x^2 + y^2$ when (i) c > 0, (ii) c = 0 and (iii) c < 0?

To find the z = 16 contour of the surface $z = x^2 + y^2$ we need to find the curve of intersection which, in this case, is simply

$$x^2 + y^2 = 16.$$

This is the equation of a circle, centred on the origin, with a radius of four.

To find the z = c contours in the three cases indicated we just need to find out what the curve

$$x^2 + y^2 = c,$$

looks like in the three cases. So, we have:

If c > 0, the contour is a circle, centred on the origin, with a radius of $\sqrt{c}$.

If c = 0, the contour is the point (0, 0) as this is the only solution to the equation $x^2 + y^2 = 0$.

If c < 0, there is no contour as the equation $x^2 + y^2 = c$ has no solutions for real x and y.

In particular, notice that z = 0 is the smallest value of z that arises from a point on this surface. The surface and three of its contours for c > 0 are illustrated in Figure 6.5.

Figure 6.5: For Example 6.2. (a) The surface $z = x^2 + y^2$, which we saw in Figure 6.2(a), and the planes $z = 4, 16, 25$. (b) The curves of intersection of the surface and the planes in (a) with their corresponding values of $z$. (c) The contours: each circle represents a contour (i.e. the points with coordinates $(x, y)$ that map to a particular value of $z$). In this case, the larger the radius of the contour, the larger the corresponding value of $z$, as we have $z = 4, 16, 25$. Notice that, here, the contours are concentric circles (i.e. they have the same centre but different radii).
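The three cases in Example 6.2 can be summarised as a small function. This is only a sketch for the particular surface $z = x^2 + y^2$; the name `contour_of_paraboloid` and the tuple it returns are assumptions made for illustration.

```python
import math


def contour_of_paraboloid(c):
    """Describe the z = c contour of the surface z = x^2 + y^2."""
    if c > 0:
        # A circle centred on the origin with radius sqrt(c).
        return ("circle", math.sqrt(c))
    if c == 0:
        # Only (0, 0) satisfies x^2 + y^2 = 0.
        return ("point", 0.0)
    # No real (x, y) can give a negative value of x^2 + y^2.
    return ("empty", None)


for c in (25, 16, 4, 0, -1):
    print(c, contour_of_paraboloid(c))
```

The radii returned for $c = 4, 16, 25$ are 2, 4 and 5, matching the concentric circles described in the caption.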


Activity 6.2 Find the $z = 25$ contour of the surface $z = x^2 - y^2$. What are the $z = c$ contours of this surface when (i) $c > 0$, (ii) $c = 0$ and (iii) $c < 0$?

Vertical planes and the sections of a surface

Another way of visualising a surface is to look at its sections, which are the curves of

intersection that arise when we look at the points of intersection of a surface with

planes that are perpendicular to the (x, y)-plane. To find the sections, we take a plane

perpendicular to the (x, y)-plane and find the curve of intersection between it and the

surface z = f (x, y). In particular, in this course, we shall only need to consider sections

that arise from planes that are parallel to the (x, z)-plane (i.e. y = c for some constant

c) or parallel to the (y, z)-plane (i.e. x = c for some constant c).

As such, the easiest sections to sketch are the ones we get when we consider the (x, z)

and (y, z)-planes which are both perpendicular to the (x, y)-plane. In particular, we find

that the section which we get from the:

(x, z)-plane, which has the equation y = 0, is the curve of intersection between it

and the surface z = f (x, y), i.e. the curve with equation z = f (x, 0).

(y, z)-plane, which has the equation x = 0, is the curve of intersection between it

and the surface z = f (x, y), i.e. the curve with equation z = f (0, y).

Let's look at what these sections look like in the case of the two surfaces we considered above when we were looking for contours.

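Example 6.3 Find the $(x, z)$ and $(y, z)$-sections of the surface $z = x - y + 4$.

To find these sections we need to find the curves of intersection which, in this case, are given by:

For the $(x, z)$-section, we have $y = 0$ and so the curve of intersection is given by $z = x + 4$ and this is a straight line in the $(x, z)$-plane.

For the $(y, z)$-section, we have $x = 0$ and so the curve of intersection is given by $z = -y + 4$ and this is a straight line in the $(y, z)$-plane.

The surface and these sections are illustrated in Figure 6.6.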

Activity 6.3 Find the $(x, z)$ and $(y, z)$-sections of the surface $z = 4x + 2y - 2$ and sketch these in the appropriate planes.

Example 6.4 Find the $(x, z)$ and $(y, z)$-sections of the surface $z = x^2 + y^2$.

To find these sections we need to find the curves of intersection which, in this case, are given by:

For the $(x, z)$-section, we have $y = 0$ and so the curve of intersection is given by $z = x^2$ and this is a parabola in the $(x, z)$-plane.

For the $(y, z)$-section, we have $x = 0$ and so the curve of intersection is given by $z = y^2$ and this is a parabola in the $(y, z)$-plane.

The surface and these sections are illustrated in Figure 6.7.

Figure 6.6: For Example 6.3. (a) The surface $z = x - y + 4$ and the planes $x = 0$ (which goes diagonally from bottom left to top right) and $y = 0$ (which goes diagonally from top left to bottom right). (b) The $(x, z)$-section is the line $z = x + 4$. (c) The $(y, z)$-section is the line $z = -y + 4$.

Figure 6.7: For Example 6.4. (a) The surface $z = x^2 + y^2$ and the planes $x = 0$ (which goes diagonally from bottom left to top right) and $y = 0$ (which goes diagonally from top left to bottom right). (b) The $(x, z)$-section is the parabola $z = x^2$. (c) The $(y, z)$-section is the parabola $z = y^2$.

Activity 6.4 Find the $(x, z)$ and $(y, z)$-sections of the surface $z = x^2 - y^2$ and sketch these in the appropriate planes.

More generally, we may want to look at the sections we get when we consider planes

that are parallel to the (x, z) and (y, z)-planes which we considered above. In

particular, we find that the sections we get from the planes that are parallel to the:


(x, z)-plane, which have equations of the form y = c where c is a constant, are the

curves of intersection between it and the surface z = f (x, y), i.e. the curve with

equation z = f (x, c).

(y, z)-plane, which have equations of the form x = c where c is a constant, are the

curves of intersection between it and the surface z = f (x, y), i.e. the curve with

equation z = f (c, y).

Let's see what these sections look like in the case of the two surfaces we considered above.

Example 6.5 Find the $y = 0$, $y = 2$ and $y = 4$ sections of the surface $z = x - y + 4$.

To find these sections we need to find the curves of intersection which, in this case, are given by:

For the $y = 0$ section, we have $y = 0$ and so the curve of intersection is given by $z = x + 4$ and this is a straight line in the $(x, z)$-plane. Of course, this is just the $(x, z)$-section we found in Example 6.3!

For the $y = 2$ section, we have $y = 2$ and so the curve of intersection is given by $z = x - 2 + 4 = x + 2$ and this is a straight line.

For the $y = 4$ section, we have $y = 4$ and so the curve of intersection is given by $z = x - 4 + 4 = x$ and this is a straight line.

Observe that only the first of these sections lives in the $(x, z)$-plane, but we can sketch the other two in this plane to get a feel for how the surface is changing when we look at the sections $y = c$ for different values of $c$. The surface and these sections, when drawn in the $(x, z)$-plane, are illustrated in Figure 6.8.

Figure 6.8: For Example 6.5. (a) The surface $z = x - y + 4$ and the planes $y = 0$, $y = 2$ and $y = 4$ as we move from right to left. (b) The $y = 0$, $y = 2$ and $y = 4$ sections (as we move from top to bottom) all drawn in the $(x, z)$-plane. Note that the $y = 0$ section is the $(x, z)$-section and, of the three sections illustrated, this is the only one that really lives in the $(x, z)$-plane. Also notice that, as the value of $c$ increases when we look at the plane $y = c$, the value of the $z$-intercept decreases when we look at the section.


them in the (y, z)-plane. Of these three sections, which one have we found before

and what did we call it? Of these three sections, which is the only one that really

lives in the (y, z)-plane?

Activity 6.6

Find the $y = -2, 0, 2$ sections of this surface and sketch them in the $(x, z)$-plane.

Find the $x = -2, 0, 2$ sections of this surface and sketch them in the $(y, z)$-plane.

Example 6.6 Find the $x = 0$, $x = 1$ and $x = 2$ sections of the surface $z = x^2 + y^2$.

To find these sections we need to find the curves of intersection which, in this case, are given by:

For the $x = 0$ section, we have $x = 0$ and so the curve of intersection is given by $z = y^2$ and this is a parabola in the $(y, z)$-plane. Of course, this is just the $(y, z)$-section we found in Example 6.4!

For the $x = 1$ section, we have $x = 1$ and so the curve of intersection is given by $z = 1 + y^2$ and this is a parabola.

For the $x = 2$ section, we have $x = 2$ and so the curve of intersection is given by $z = 4 + y^2$ and this is a parabola.

Observe that only the first of these sections lives in the $(y, z)$-plane, but we can sketch the other two in this plane to get a feel for how the surface is changing when we look at the sections $x = c$ for different values of $c$. The surface and these sections, when drawn in the $(y, z)$-plane, are illustrated in Figure 6.9.
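Since the $y = c$ section of a surface is just $z = f(x, c)$ and the $x = c$ section is $z = f(c, y)$, taking a section amounts to fixing one argument of $f$. The sketch below illustrates this for the two surfaces used in the examples; the helper names `section_y_equals` and `section_x_equals` are assumptions introduced here, not notation from the guide.

```python
def section_y_equals(f, c):
    """Return the y = c section of z = f(x, y) as a function of x alone."""
    return lambda x: f(x, c)


def section_x_equals(f, c):
    """Return the x = c section of z = f(x, y) as a function of y alone."""
    return lambda y: f(c, y)


def plane(x, y):
    return x - y + 4


def paraboloid(x, y):
    return x**2 + y**2


# y = c sections of the plane: straight lines whose z-intercept falls as c rises.
for c in (0, 2, 4):
    section = section_y_equals(plane, c)
    print(f"y = {c}: z-intercept {section(0)}, gradient {section(1) - section(0)}")

# x = c sections of the paraboloid: parabolas whose z-intercept rises as c rises.
for c in (0, 1, 2):
    section = section_x_equals(paraboloid, c)
    print(f"x = {c}: z-intercept {section(0)}")
```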

Activity 6.7 Find the $y = 0, 1, 2$ sections of the surface $z = x^2 + y^2$ and sketch

them in the (x, z)-plane. Of these three sections, which one have we found before

and what did we call it? Of these three sections, which is the only one that really

lives in the (x, z)-plane?

Activity 6.8

Find the y = 0, 1, 2 sections of this surface and sketch them in the (x, z)-plane.

Find the x = 0, 1, 2 sections of this surface and sketch them in the (y, z)-plane.

Figure 6.9: For Example 6.6. (a) The surface $z = x^2 + y^2$ and the planes $x = 0$, $x = 1$ and $x = 2$ as we move from left to right. (b) The $x = 0$, $x = 1$ and $x = 2$ sections all drawn in the $(y, z)$-plane. Note that the $x = 0$ section is the $(y, z)$-section and, of the three sections illustrated, this is the only one that really lives in the $(y, z)$-plane. Notice that, as the value of $c$ increases when we look at the plane $x = c$, the value of the $z$-intercept increases when we look at the section.

6.3 Partial differentiation

We can also differentiate functions of two variables, using partial differentiation to yield partial derivatives.² In some ways, this will be similar to what we saw when we differentiated functions of one variable to get their derivatives but, as we now have two variables to deal with, things get a little trickier.

6.3.1

Consider $f(x, y)$, a function of two independent variables. For a fixed value of $y$, say $y = y_0$, we can look at the function $g(x) = f(x, y_0)$ which is now a function of $x$ only. Clearly, the rate of change of $g(x)$ with respect to $x$ is just the derivative of this function with respect to $x$. But, what happens when we want to calculate the rate of change of $f(x, y)$ with respect to $x$ for any fixed value of $y$? To do this we avoid specifying a particular value of $y$ by just assuming that $y$ is a constant and differentiating with respect to $x$. So, given a function $f(x, y)$ we denote the operation of differentiating $f$ with respect to $x$ whilst holding $y$ constant by

\[ \frac{\partial f}{\partial x} \quad \text{or, more compactly,} \quad f_x(x, y), \tag{6.1} \]

and call this the partial derivative of $f(x, y)$ with respect to $x$.³ In a similar manner, we can define the partial derivative of $f(x, y)$ with respect to $y$, denoted by

\[ \frac{\partial f}{\partial y} \quad \text{or, more compactly,} \quad f_y(x, y), \tag{6.2} \]

which is what we obtain from differentiating $f(x, y)$ with respect to $y$ whilst holding $x$ constant.

² Most of the material in these notes can be generalised to functions with more than two variables. But, in this course, almost without exception, we will be considering functions of two variables.

³ Note that we use the curly-d, i.e. $\partial$, for partial derivatives rather than the normal straight-d, i.e. $d$, which one encounters in the notation $dg/dx$ for the derivative of a function $g(x)$ of one variable. We shall see why it is important to keep these two notions of differentiation separate later. Similarly, we use $f_x(x, y)$ as shorthand for the partial derivative of $f(x, y)$ with respect to $x$ rather than the $g'(x)$ which one encounters as the shorthand for the derivative of a function $g(x)$ of one variable.

Clearly, the partial derivative of f (x, y) with respect to x, i.e. the result of

differentiating f (x, y) with respect to x whilst holding y constant, is going to be another

function of x and y. This function of x and y is what is denoted by the symbols in (6.1).

But, what does this partial derivative mean? In effect, what we have done when we consider the function $f(x, y)$ for some fixed value of $y$, say $y_0$, is to look at the section of the surface $z = f(x, y)$ we get when $y = y_0$, i.e. the section given by the equation $z = f(x, y_0)$ which lies in the plane $y = y_0$ that is parallel to the $(x, z)$-plane. Then, when we differentiate $f(x, y_0)$ with respect to $x$, we are finding the gradient of this section, i.e. it tells us how $z = f(x, y_0)$ is varying with $x$. Consequently, this partial derivative is telling us something about the gradient of the surface when we are at the point $(x, y_0)$ and we are looking in the $x$-direction. This will become clearer when we look at tangent planes in Section 6.4.1.
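The idea that a partial derivative is the gradient of a section can be illustrated numerically. The sketch below estimates $f_x$ and $f_y$ with a central difference, moving one variable while the other is held fixed; the helper names, the step size `h` and the choice of test surface $z = x^2 + y^2$ are assumptions made for illustration, and the result is an approximation rather than an exact derivative.

```python
def partial_x(f, x, y, h=1e-6):
    """Estimate f_x at (x, y): vary x slightly while y is held constant."""
    return (f(x + h, y) - f(x - h, y)) / (2 * h)


def partial_y(f, x, y, h=1e-6):
    """Estimate f_y at (x, y): vary y slightly while x is held constant."""
    return (f(x, y + h) - f(x, y - h)) / (2 * h)


def f(x, y):
    return x**2 + y**2


# The y = 3 section is z = x^2 + 9, whose gradient at x = 1 is 2.
print(partial_x(f, 1.0, 3.0))
# The x = 1 section is z = 1 + y^2, whose gradient at y = 3 is 6.
print(partial_y(f, 1.0, 3.0))
```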

Activity 6.9 Describe what the partial derivative of f (x, y) with respect to y

evaluated at the point (x0 , y) tells us about the gradient of the surface at the point

(x0 , y).

6.3.2

Calculating the partial derivatives of f (x, y) is only slightly more difficult than finding

the derivative of a function of one variable. Recalling that the partial derivative of a

function f (x, y) with respect to x, i.e. fx (x, y), is just the derivative of f (x, y) with

respect to x whilst holding y constant, to calculate fx (x, y) we just treat any occurrence

of y in f (x, y) as if it were a constant and differentiate f (x, y) with respect to x. And, in

a similar way, we can find the partial derivative of a function f (x, y) with respect to y,

i.e. $f_y(x, y)$. Let's look at an example.

Example 6.7 Given that $f(x, y) = x^2 y + 5xy^3 + y^2$, find $f_x(x, y)$ and $f_y(x, y)$.

Let's do this slowly so that we get the idea. To find $f_x(x, y)$, we treat $y$ as if it were a constant and let's say that this constant is $c$. So, we have a function of one variable given by

\[ g(x) = f(x, c) = cx^2 + 5c^3 x + c^2, \]

and differentiating this with respect to $x$ gives

\[ \frac{dg}{dx} = 2cx + 5c^3. \]

But, $c$ is the constant we're using to represent $y$ and so, replacing all the $c$s with $y$s, we have

\[ \frac{\partial f}{\partial x} = 2xy + 5y^3, \]

which is the partial derivative of $f(x, y)$ with respect to $x$.

Similarly, to find $f_y(x, y)$, we treat $x$ as if it were a constant and (again) let's say that this constant is $c$. So, we have a function of one variable given by

\[ g(y) = f(c, y) = c^2 y + 5cy^3 + y^2, \]

and differentiating this with respect to $y$ gives

\[ \frac{dg}{dy} = c^2 + 15cy^2 + 2y. \]

But, $c$ is the constant we're using to represent $x$ and so, replacing all the $c$s with $x$s, we have

\[ \frac{\partial f}{\partial y} = x^2 + 15xy^2 + 2y, \]

which is the partial derivative of $f(x, y)$ with respect to $y$.

Obviously, there is no need to go through all this detail whenever we calculate a partial derivative: all you have to do is remember what you are keeping constant and then differentiate whatever is left. Let's look at another example.

Example 6.8 Given that $f(x, y) = 3x^3 + 7xy^{-1} + 2y^9$, find $f_x(x, y)$ and $f_y(x, y)$.

Let's do this quickly. To find $f_x(x, y)$, we treat $y$ as a constant and differentiate with respect to $x$ to get

\[ \frac{\partial f}{\partial x} = 9x^2 + 7y^{-1}. \]

Similarly, to find $f_y(x, y)$, we treat $x$ as a constant and differentiate with respect to $y$ to get

\[ \frac{\partial f}{\partial y} = -7xy^{-2} + 18y^8. \]

And, we're done!
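The rule "hold one variable constant and differentiate whatever is left" is mechanical enough to code for sums of power terms. The sketch below is an illustration only: it assumes a function is stored as a dictionary mapping exponent pairs $(i, j)$ to the coefficient of $x^i y^j$, a representation chosen here for convenience, and it applies the one-variable power rule to each term. The names `d_dx`, `d_dy`, `f_67` and `f_68` are hypothetical.

```python
def d_dx(terms):
    """Partial derivative with respect to x of sum(a * x**i * y**j).

    y is treated as a constant, so each term a*x^i*y^j becomes (a*i)*x^(i-1)*y^j.
    Terms with i == 0 do not involve x and differentiate to zero.
    """
    return {(i - 1, j): a * i for (i, j), a in terms.items() if i != 0}


def d_dy(terms):
    """Partial derivative with respect to y, treating x as a constant."""
    return {(i, j - 1): a * j for (i, j), a in terms.items() if j != 0}


# Example 6.7: f = x^2 y + 5 x y^3 + y^2
f_67 = {(2, 1): 1, (1, 3): 5, (0, 2): 1}
# Example 6.8: f = 3 x^3 + 7 x y^(-1) + 2 y^9
f_68 = {(3, 0): 3, (1, -1): 7, (0, 9): 2}

print(d_dx(f_67))  # terms of 2xy + 5y^3
print(d_dy(f_67))  # terms of x^2 + 15xy^2 + 2y
print(d_dx(f_68))  # terms of 9x^2 + 7y^(-1)
print(d_dy(f_68))  # terms of -7xy^(-2) + 18y^8
```

The dictionaries printed correspond term by term to the partial derivatives found in Examples 6.7 and 6.8.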

Activity 6.10 Given that

\[ f(x, y) = 2x + x^3 y - \frac{x}{y} + \frac{y^3}{2}, \]

find $f_x(x, y)$ and $f_y(x, y)$.

So far, we have calculated the partial derivatives of very simple functions of x and y.

But, sometimes, we will need to use the chain, product and quotient rules when

calculating partial derivatives. Let's look at an example to see how this is done.

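Example 6.9 Given that

\[ f(x, y) = x e^{x + y^2}, \]

find $f_x(x, y)$ and $f_y(x, y)$.

We first note that we can write this function as

\[ f(x, y) = (x e^x) e^{y^2}, \]

and so, to find $f_x(x, y)$, we treat $e^{y^2}$ as a constant and we differentiate the function $x e^x$ using the product rule to get $x e^x + 1 \cdot e^x$. This gives us

\[ \frac{\partial f}{\partial x} = e^{y^2} (x e^x + e^x) = (x + 1) e^{x + y^2}. \]

Similarly, to find $f_y(x, y)$, we treat $x e^x$ as a constant and we differentiate the function $e^{y^2}$ using the chain rule to get $2y e^{y^2}$. This gives us

\[ \frac{\partial f}{\partial y} = x e^x (2y e^{y^2}) = 2xy e^{x + y^2}. \]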
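A quick numerical sanity check of Example 6.9 is sketched below. It compares the formulae $f_x = (x + 1)e^{x + y^2}$ and $f_y = 2xy\,e^{x + y^2}$ with central-difference estimates at one sample point; the point $(0.5, 0.3)$, the step size and the helper `central_difference` are arbitrary choices made for illustration, and agreement at one point is evidence rather than proof.

```python
import math


def f(x, y):
    return x * math.exp(x + y**2)


def f_x(x, y):
    """Partial derivative from the product rule, as in Example 6.9."""
    return (x + 1) * math.exp(x + y**2)


def f_y(x, y):
    """Partial derivative from the chain rule, as in Example 6.9."""
    return 2 * x * y * math.exp(x + y**2)


def central_difference(g, a, h=1e-6):
    """Estimate the derivative of a one-variable function g at a."""
    return (g(a + h) - g(a - h)) / (2 * h)


x0, y0 = 0.5, 0.3
approx_fx = central_difference(lambda x: f(x, y0), x0)  # y held at y0
approx_fy = central_difference(lambda y: f(x0, y), y0)  # x held at x0
print(approx_fx, f_x(x0, y0))
print(approx_fy, f_y(x0, y0))
```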

Activity 6.11

6.3.3

variables. For instance, suppose that the production level, q, of a firm depends on the

amounts k of capital and l of labour used through the function q(k, l). Suppose also

that both k and l change over time in some known way so that we have formulas for

$k(t)$ and $l(t)$ where $t$ is a parameter measuring time.⁴ How, we might ask, can we find

the rate of change of production with time?

Example 6.10 Given that we have the production function $q(k, l) = kl$ where $k$ and $l$ are functions of time, $t$, given by

\[ k(t) = 3 + 2t \quad \text{and} \quad l(t) = 10 - 3t, \]

find the rate of change of production with time.

In this case, we can calculate the production as a function of time by explicitly finding $Q(t) = q(k(t), l(t))$ which, in this case, is

\[ Q(t) = k(t)\,l(t) = (3 + 2t)(10 - 3t) = 30 + 11t - 6t^2. \]

And, in particular, we can now differentiate this to find the rate of change of production with time, i.e. we have

\[ \frac{dQ}{dt} = 11 - 12t, \]

in this case.

More generally, suppose we are given a function f of two variables x and y, both of

which are themselves functions of t. We can think of this as defining a composite

function F (t) = f (x(t), y(t)). In the case of a single variable we have a rule, i.e. the

chain rule, which enables us to work out the derivative of a composite function.

Amazingly, perhaps, there is a similar rule for composite functions of two variables such as the one we have here, which is also known as the chain rule. It states that

\[ \frac{dF}{dt} = \frac{\partial f}{\partial x}\frac{dx}{dt} + \frac{\partial f}{\partial y}\frac{dy}{dt}. \tag{6.3} \]

⁴ Notice that, since $k$ and $l$ both depend on $t$, we can only pick certain pairs of values, $(k, l)$. That is, in this case, the variables $k$ and $l$ are not independent.


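Sometimes, in this context, we call $F'(t)$ the total derivative of $F(t)$ with respect to $t$ (in order to distinguish it from the partial derivatives of $f$ with respect to $x$ and $y$).

To see why the chain rule works, consider that if we change $t$ by a small amount, $\delta t$, the corresponding change in $F(t)$ is given by

\[ \delta F \simeq \frac{dF}{dt}\,\delta t, \]

but here, there are two ways in which $F(t) = f(x(t), y(t))$ can change with $t$.

Firstly, $F$ can change with $t$ because $f$ changes with $x$ and $x$ changes with $t$; let's denote this change in $F$ by $\delta_x F$. In this case, we have

\[ \delta_x F \simeq \frac{\partial f}{\partial x}\,\delta x, \]

as we are holding $y$ constant to see how $F$ changes with $x$ and this means that

\[ \delta_x F \simeq \frac{\partial f}{\partial x}\frac{dx}{dt}\,\delta t, \]

as the change in $x$, $\delta x$, is related to a change in $t$ by $\delta x \simeq x'(t)\,\delta t$.

Secondly, $F$ can change with $t$ because $f$ changes with $y$ and $y$ changes with $t$; let's denote this change in $F$ by $\delta_y F$. In this case, we have

\[ \delta_y F \simeq \frac{\partial f}{\partial y}\,\delta y, \]

as we are holding $x$ constant to see how $F$ changes with $y$ and this means that

\[ \delta_y F \simeq \frac{\partial f}{\partial y}\frac{dy}{dt}\,\delta t, \]

as the change in $y$, $\delta y$, is related to a change in $t$ by $\delta y \simeq y'(t)\,\delta t$.

So, as the total change in $F$ is given by

\[ \delta F = \delta_x F + \delta_y F \simeq \frac{\partial f}{\partial x}\frac{dx}{dt}\,\delta t + \frac{\partial f}{\partial y}\frac{dy}{dt}\,\delta t, \]

we can now equate our two expressions for $\delta F$ and divide through by $\delta t$ to get the chain rule which we saw above in (6.3). Let's see how we could have used it to answer the question we saw in Example 6.10.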

Example 6.11 Consider the functions in Example 6.10. Use the chain rule to find the rate of change of production with time.

Here $q(k, l) = kl$, $k(t) = 3 + 2t$ and $l(t) = 10 - 3t$. In this case, if we again let $Q(t) = q(k(t), l(t))$, the chain rule states that

\[ \frac{dQ}{dt} = \frac{\partial q}{\partial k}\frac{dk}{dt} + \frac{\partial q}{\partial l}\frac{dl}{dt}. \]

As such, using this, we can see that

\[ \frac{dQ}{dt} = (l)(2) + (k)(-3) = 2(10 - 3t) - 3(3 + 2t) = 11 - 12t, \]

which agrees with our earlier answer.
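The agreement between the two methods can also be confirmed by direct computation. The sketch below evaluates the chain-rule expression from Example 6.11 and the directly differentiated expression from Example 6.10 at several values of $t$; the function names are illustrative only.

```python
def dQ_dt_chain_rule(t):
    """dQ/dt from the chain rule: q_k * dk/dt + q_l * dl/dt with q = k*l."""
    k = 3 + 2 * t
    l = 10 - 3 * t
    q_k, q_l = l, k          # partial derivatives of q(k, l) = k*l
    dk_dt, dl_dt = 2, -3     # derivatives of k(t) and l(t)
    return q_k * dk_dt + q_l * dl_dt


def dQ_dt_direct(t):
    """dQ/dt from differentiating Q(t) = 30 + 11t - 6t^2 directly."""
    return 11 - 12 * t


for t in range(-3, 4):
    print(t, dQ_dt_chain_rule(t), dQ_dt_direct(t))
```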


Activity 6.12 Suppose that f (x, y) = x2 y and that x(t) = 2 + 3t and y(t) = t2 + 1.

If F (t) = f (x(t), y(t)), use the chain rule to find the total derivative of F with

respect to t and check your answer by explicitly finding F (t) and differentiating it

with respect to t.

We now consider one of the many useful applications of the chain rule.

The derivative of an implicit function

An equation $g(x, y) = c$ where $c$ is a constant can, in some cases, be rearranged (or solved) to give $y$ as an explicit function of $x$. Once we have done this, we can then differentiate our expression for $y$ with respect to $x$ to find its derivative, $y'(x)$.

Example 6.12 Suppose that $y$ is a function of $x$ defined by the equation $x^2 - y = 7$. Find $y$ as an explicit function of $x$ and hence find $y'(x)$.

Rearranging the equation, we get

\[ y(x) = x^2 - 7, \]

if we want $y$ as an explicit function of $x$. In particular, this means that

\[ \frac{dy}{dx} = 2x, \]

in this case.

In general, we say that an equation $g(x, y) = c$ defines $y$ implicitly as a function of $x$ if there is a function $y(x)$ which satisfies the equation for a range of values of $x$. But, in general, it may be difficult or impossible to solve the equation $g(x, y) = c$ to find an explicit formula for $y(x)$ as we did in Example 6.12. However, we can [often] still find the derivative $y'(x)$, even if we don't have an explicit expression for $y$ in terms of $x$.

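To see how we can do this, consider that if we knew the function, $y(x)$, that satisfied the equation $g(x, y) = c$, we could find a new function, $G(x)$, of $x$ only which would be given by $G(x) = g(x, y(x))$. Then, using the chain rule, we would have

\[ \frac{dG}{dx} = \frac{\partial g}{\partial x}\frac{dx}{dx} + \frac{\partial g}{\partial y}\frac{dy}{dx}. \]

But, $G(x) = c$ where $c$ is a constant and so we also have

\[ \frac{dG}{dx} = 0 \quad \text{as well as} \quad \frac{dx}{dx} = 1, \]

which means that

\[ 0 = \frac{\partial g}{\partial x} + \frac{\partial g}{\partial y}\frac{dy}{dx}. \]

Rearranging this, we then get

\[ \frac{dy}{dx} = -\frac{\partial g/\partial x}{\partial g/\partial y}, \]

as long as $g_y(x, y) \neq 0$. That is, $y'(x)$ can easily be found by using the partial derivatives of $g$. (But, don't forget the minus sign!)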

Example 6.13 In Example 6.12, $y$ was a function of $x$ defined implicitly by the equation $x^2 - y = 7$. Find $y'(x)$ using the result above.

As we have the equation $x^2 - y = 7$ we can write this as $g(x, y) = c$ with $g(x, y) = x^2 - y$ and $c = 7$. Using the above result we can then see that

\[ \frac{\partial g}{\partial x} = 2x \quad \text{and} \quad \frac{\partial g}{\partial y} = -1, \]

which means that

\[ \frac{dy}{dx} = -\frac{\partial g/\partial x}{\partial g/\partial y} = -\frac{2x}{-1} = 2x, \]

as before.

Example 6.14 Suppose that $y$ is defined implicitly as a function of $x$ by the equation

\[ x^2 y^3 - 6x^3 y^2 + 2xy = 1. \]

Verify that the point $(x, y) = (1/2, 2)$ satisfies this equation and find the value of the derivative, $y'(x)$, at this point.

The point $(x, y) = (1/2, 2)$ satisfies the equation since, putting $x = 1/2$ and $y = 2$ into the left-hand side, we get

\[ \left(\frac{1}{2}\right)^2 (2)^3 - 6\left(\frac{1}{2}\right)^3 (2)^2 + 2\left(\frac{1}{2}\right)(2) = 2 - 3 + 2 = 1, \]

which is what we have on the right-hand side of the equation. We then see that the equation defining $y$ implicitly as a function of $x$ is of the form $g(x, y) = 1$ where $g(x, y) = x^2 y^3 - 6x^3 y^2 + 2xy$. So, according to the formula given above, we have

\[ \frac{dy}{dx} = -\frac{\partial g/\partial x}{\partial g/\partial y}, \]

and so, since

\[ \frac{\partial g}{\partial x} = 2xy^3 - 18x^2 y^2 + 2y \quad \text{and} \quad \frac{\partial g}{\partial y} = 3x^2 y^2 - 12x^3 y + 2x, \]

we have

\[ \frac{dy}{dx} = -\frac{2xy^3 - 18x^2 y^2 + 2y}{3x^2 y^2 - 12x^3 y + 2x}, \]

as long as $3x^2 y^2 - 12x^3 y + 2x \neq 0$. Thus, given the point $(1/2, 2)$, we can substitute these values into our expression for $y'(x)$: the numerator is $8 - 18 + 4 = -6$ and the denominator is $3 - 3 + 1 = 1$, so the value of the derivative at this point is 6.
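The arithmetic in Example 6.14 can be reproduced exactly using rational numbers. The sketch below uses Python's `fractions.Fraction` so that no rounding occurs; the function names `g`, `g_x`, `g_y` and `implicit_slope` are illustrative, and the partial derivatives are those computed in the example.

```python
from fractions import Fraction


def g(x, y):
    return x**2 * y**3 - 6 * x**3 * y**2 + 2 * x * y


def g_x(x, y):
    return 2 * x * y**3 - 18 * x**2 * y**2 + 2 * y


def g_y(x, y):
    return 3 * x**2 * y**2 - 12 * x**3 * y + 2 * x


def implicit_slope(x, y):
    """dy/dx = -g_x / g_y, valid only where g_y is non-zero."""
    denominator = g_y(x, y)
    if denominator == 0:
        raise ZeroDivisionError("g_y vanishes, so the formula does not apply here")
    return -g_x(x, y) / denominator


x0, y0 = Fraction(1, 2), Fraction(2)
print(g(x0, y0))               # left-hand side of the equation at the point
print(implicit_slope(x0, y0))  # value of y'(x) at the point
```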


Activity 6.13 Suppose that $y$ is defined implicitly as a function of $x$ by the equation

\[ x^2 + 2xy = 6 - 3y^3. \]

Verify that the point $(x, y) = (1, 1)$ satisfies this equation and find the value of the derivative, $y'(x)$, at this point.

Extensions of the chain rule

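What we have seen above can be extended. Suppose, for instance, that $g$ is a function of two variables $x$ and $y$, both of which are themselves functions of two variables $k$ and $l$. We can think of this as defining a composite function $G(k, l) = g(x(k, l), y(k, l))$ and an extension of the chain rule then assures us that

\[ \frac{\partial G}{\partial k} = \frac{\partial g}{\partial x}\frac{\partial x}{\partial k} + \frac{\partial g}{\partial y}\frac{\partial y}{\partial k} \quad \text{and} \quad \frac{\partial G}{\partial l} = \frac{\partial g}{\partial x}\frac{\partial x}{\partial l} + \frac{\partial g}{\partial y}\frac{\partial y}{\partial l}. \]

To see why the first of these formulae works, consider that if we change $k$ by a small amount, $\delta k$, whilst holding $l$ constant, the corresponding change in $G(k, l)$ is given by

\[ \delta G \simeq \frac{\partial G}{\partial k}\,\delta k, \]

but here, there are two ways in which $G(k, l) = g(x(k, l), y(k, l))$ can change with $k$.

Firstly, $G$ can change with $k$ because $g$ changes with $x$ and $x$ changes with $k$; let's denote this change in $G$ by $\delta_x G$. In this case, we have

\[ \delta_x G \simeq \frac{\partial g}{\partial x}\,\delta x, \]

as we are holding $y$ constant to see how $G$ changes with $x$ and this means that

\[ \delta_x G \simeq \frac{\partial g}{\partial x}\frac{\partial x}{\partial k}\,\delta k, \]

as the change in $x$, $\delta x$, is related to a change in $k$ by $\delta x \simeq x_k(k, l)\,\delta k$.

Secondly, $G$ can change with $k$ because $g$ changes with $y$ and $y$ changes with $k$; let's denote this change in $G$ by $\delta_y G$. In this case, we have

\[ \delta_y G \simeq \frac{\partial g}{\partial y}\,\delta y, \]

as we are holding $x$ constant to see how $G$ changes with $y$ and this means that

\[ \delta_y G \simeq \frac{\partial g}{\partial y}\frac{\partial y}{\partial k}\,\delta k, \]

as the change in $y$, $\delta y$, is related to a change in $k$ by $\delta y \simeq y_k(k, l)\,\delta k$.

So, as the total change in $G$ is given by

\[ \delta G = \delta_x G + \delta_y G \simeq \frac{\partial g}{\partial x}\frac{\partial x}{\partial k}\,\delta k + \frac{\partial g}{\partial y}\frac{\partial y}{\partial k}\,\delta k, \]

we can now equate our two expressions for $\delta G$ and divide through by $\delta k$ to get the chain rule for $G_k(k, l)$ which we saw above.

Activity 6.14 Use a similar argument to the one above to explain why the chain rule formula for $G_l(k, l)$ works.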

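And, in a similar manner, if we suppose that $g(x, y, z) = c$ defines $z$ implicitly as a function of $x$ and $y$, we can use this form of the chain rule to derive the formulae

\[ \frac{\partial z}{\partial x} = -\frac{\partial g/\partial x}{\partial g/\partial z} \quad \text{and} \quad \frac{\partial z}{\partial y} = -\frac{\partial g/\partial y}{\partial g/\partial z}, \]

which will allow us to calculate the partial derivatives of $z$ with respect to $x$ and $y$.

Indeed, to see why the first of these formulae works, we consider that if we knew the function, $z(x, y)$, that satisfied the equation $g(x, y, z) = c$, we could find a new function, $G(x, y)$, of $x$ and $y$ only which is given by $G(x, y) = g(x, y, z(x, y))$. Then, using the chain rule (and noting that the term involving $\partial g/\partial y$ drops out because $y$ is held constant when we differentiate with respect to $x$), we have

\[ \frac{\partial G}{\partial x} = \frac{\partial g}{\partial x}\frac{dx}{dx} + \frac{\partial g}{\partial z}\frac{\partial z}{\partial x}. \]

But, $G(x, y) = c$ where $c$ is a constant and so we also have

\[ \frac{\partial G}{\partial x} = 0 \quad \text{as well as} \quad \frac{dx}{dx} = 1, \]

which means that

\[ 0 = \frac{\partial g}{\partial x} + \frac{\partial g}{\partial z}\frac{\partial z}{\partial x}. \]

Rearranging this, we then get

\[ \frac{\partial z}{\partial x} = -\frac{\partial g/\partial x}{\partial g/\partial z}, \]

as long as $g_z(x, y, z) \neq 0$. That is, $z_x(x, y)$ can easily be found by using the partial derivatives of $g$. (But, don't forget the minus sign!)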
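To see these formulae in action, the sketch below applies them to a hypothetical equation that does not appear in the guide: $x^2 + y^2 + z^2 = 14$, which is satisfied at the point $(1, 2, 3)$. This equation is chosen purely because it can also be solved explicitly, $z = \sqrt{14 - x^2 - y^2}$, giving $z_x = -x/z$ and $z_y = -y/z$ as an independent check. All names in the code are illustrative.

```python
from fractions import Fraction


# Hypothetical example, not from the guide: g(x, y, z) = x^2 + y^2 + z^2.
def g(x, y, z):
    return x**2 + y**2 + z**2


def g_x(x, y, z):
    return 2 * x


def g_y(x, y, z):
    return 2 * y


def g_z(x, y, z):
    return 2 * z


def implicit_partials(x, y, z):
    """Return (z_x, z_y) from z_x = -g_x/g_z and z_y = -g_y/g_z."""
    denominator = g_z(x, y, z)
    if denominator == 0:
        raise ZeroDivisionError("g_z vanishes, so the formulae do not apply here")
    return -g_x(x, y, z) / denominator, -g_y(x, y, z) / denominator


point = (Fraction(1), Fraction(2), Fraction(3))
print(g(*point))                  # 14, so the point lies on the surface
print(implicit_partials(*point))  # compare with (-x/z, -y/z)
```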

Activity 6.15 Use a similar argument to the one above to explain why the formula

for zy (x, y) works.

Activity 6.16 Suppose that $q$ is defined implicitly as a function of $k$ and $l$ by the equation

\[ q^3 k + k^3 l + q k^2 l = 3. \]

Find the partial derivatives $q_k(k, l)$ and $q_l(k, l)$. What are the values of these partial derivatives at the point where $k = 1$ and $l = 1$?

[Hint: The identity $q^3 + q - 2 = (q - 1)(q^2 + q + 2)$ will be useful.]

6.3.4

Homogeneous functions are important in economics since they allow us to capture the

idea of returns to scale. In this section we will see what it means for a function to be

homogeneous and consider an important theorem about homogeneous functions. The

former will enable us to give an economic interpretation of homogeneous production

functions in terms of returns to scale and the latter will enable us to consider the

economic significance of the marginal products that can be derived from such

production functions.

Homogeneity and returns to scale

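We say that a function, $f(x, y)$, is homogeneous of degree $r$ if

\[ f(\lambda x, \lambda y) = \lambda^r f(x, y), \]

for any $\lambda \in \mathbb{R}$. Let's start by looking at some examples of homogeneous functions.

Example 6.15 Show that the function $f(x, y) = x^{1/2} y^{1/2}$ is homogeneous of degree one.

Here we have

\[ f(\lambda x, \lambda y) = (\lambda x)^{1/2} (\lambda y)^{1/2} = (\lambda^{1/2} x^{1/2})(\lambda^{1/2} y^{1/2}) = \lambda^1 x^{1/2} y^{1/2} = \lambda^1 f(x, y). \]

Comparing this with the definition of homogeneity, i.e.

\[ f(\lambda x, \lambda y) = \lambda^r f(x, y), \]

we see that $r = 1$ and so this function is homogeneous of degree one.

Example 6.16 Show that the function $f(x, y) = \sqrt{x} + \sqrt{y}$ is homogeneous. What is its degree of homogeneity?

Here we have

\[ f(\lambda x, \lambda y) = \sqrt{\lambda x} + \sqrt{\lambda y} = \lambda^{1/2} (\sqrt{x} + \sqrt{y}) = \lambda^{1/2} f(x, y). \]

Comparing this with the definition of homogeneity, i.e.

\[ f(\lambda x, \lambda y) = \lambda^r f(x, y), \]

we see that $r = 1/2$ and so this function is homogeneous of degree one half.

Example 6.17 Is the function $f(x, y) = x + y^2$ homogeneous?

Here we have

\[ f(\lambda x, \lambda y) = (\lambda x) + (\lambda y)^2 = \lambda x + \lambda^2 y^2. \]

Comparing this with the definition of homogeneity, i.e.

\[ f(\lambda x, \lambda y) = \lambda^r f(x, y), \]

we see that there is no way of writing $\lambda x + \lambda^2 y^2$ in the form $\lambda^r (x + y^2)$ for any $r$ and so this function is not homogeneous.

In particular, this means that not all functions are homogeneous.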
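The definition of homogeneity lends itself to a numerical spot check: pick a candidate degree $r$ and test whether scaling the inputs by $\lambda$ scales the output by $\lambda^r$ at a few sample points. The sketch below does this for the three functions in Examples 6.15 to 6.17. The helper `looks_homogeneous`, the sample points and the scale factors are all assumptions made for illustration, and passing such a check at finitely many points is suggestive only, not a proof.

```python
import math


def looks_homogeneous(f, r, points=((1.0, 2.0), (3.0, 0.5)), scales=(0.5, 2.0, 3.0)):
    """Spot-check f(lam*x, lam*y) == lam**r * f(x, y) at sample points."""
    return all(
        math.isclose(f(lam * x, lam * y), lam**r * f(x, y), rel_tol=1e-9)
        for x, y in points
        for lam in scales
    )


def f_615(x, y):
    return x**0.5 * y**0.5          # Example 6.15


def f_616(x, y):
    return math.sqrt(x) + math.sqrt(y)  # Example 6.16


def f_617(x, y):
    return x + y**2                 # Example 6.17


print(looks_homogeneous(f_615, 1))    # consistent with degree one
print(looks_homogeneous(f_616, 0.5))  # consistent with degree one half
print(looks_homogeneous(f_617, 1), looks_homogeneous(f_617, 2))  # neither degree fits
```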

Economically, we can think of homogeneous functions as telling us about how outputs change if we scale up our inputs. To see why, consider what happens if we scale up our inputs by a factor of $\lambda$, i.e. if we increase our bundle of inputs, $(x, y)$, by a factor of $\lambda > 1$ we get the new bundle of inputs $(\lambda x, \lambda y)$. Now if our outputs are determined by a homogeneous function, $f(x, y)$, of degree $r$ we can see that the output from our new bundle, $(\lambda x, \lambda y)$, is given by

\[ f(\lambda x, \lambda y) = \lambda^r f(x, y), \]

i.e. we will get $\lambda^r$ times as much as we did from our old bundle, $(x, y)$. That is, scaling inputs by $\lambda$ leads to a scaling of output by $\lambda^r$ if our output is determined by a function which is homogeneous of degree $r$.

In particular, given a function which is homogeneous of degree one, we can see that scaling our inputs by $\lambda > 1$ (i.e. going from the bundle of inputs $(x, y)$ to the bundle of inputs $(\lambda x, \lambda y)$) will scale our output by $\lambda$ (i.e. going from an output of $f(x, y)$ to an output of $\lambda f(x, y)$). That is, we get constant returns to scale: a proportional increase in inputs leads to the same proportional increase in output. Clearly, given functions of degree $r > 0$, this idea can be extended to cover functions with degrees $r \neq 1$ as follows:

If $r > 1$, we get increasing returns to scale as $\lambda > 1$ implies that $\lambda^r > \lambda$.⁵

If $r = 1$, we get constant returns to scale as we saw above.

If $r < 1$, we get decreasing returns to scale as $\lambda > 1$ implies that $\lambda^r < \lambda$.⁶

To see how this works, consider the following example.

Example 6.18 A firm invests an amount of capital, $k$, and labour, $l$, in its production process and this yields a production level of $q(k, l)$. What will be the effect on the level of production of quadrupling the amount of capital and labour invested if the production function is homogeneous of degree (a) 1/2, (b) 1 and (c) 3/2?

Quadrupling the amount of capital and labour invested means increasing the investment bundle from $(k, l)$ to $(4k, 4l)$. So, if the production function is homogeneous of degree $r$, the production level will go from $q(k, l)$ to $q(4k, 4l) = 4^r q(k, l)$, i.e. the production level will change by a factor of $4^r$. In particular, this means that if the production function is homogeneous of degree

(a) 1/2, the change will be by a factor of $4^{1/2} = 2$ (i.e. quadrupling inputs doubles production),

(b) 1, the change will be by a factor of $4^1 = 4$ (i.e. quadrupling inputs quadruples production),

(c) 3/2, the change will be by a factor of $4^{3/2} = 8$ (i.e. quadrupling inputs octuples production),

yielding decreasing, constant and increasing returns to scale respectively.

⁵ That is, a proportional increase in inputs leads to a larger proportional increase in output.

⁶ That is, a proportional increase in inputs leads to a smaller proportional increase in output.
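The scaling argument in Example 6.18 is short enough to tabulate in code. The sketch below computes the output scale factor $\lambda^r$ and names the corresponding kind of returns to scale; the function names are illustrative and, as in the text, the degree $r$ is assumed to be positive.

```python
def output_scale_factor(input_scale, degree):
    """Factor by which output changes when all inputs are scaled by input_scale."""
    return input_scale**degree


def returns_to_scale(degree):
    """Classify a homogeneous production function of positive degree."""
    if degree > 1:
        return "increasing"
    if degree == 1:
        return "constant"
    return "decreasing"


# Quadrupling capital and labour for the three degrees in Example 6.18.
for degree in (0.5, 1, 1.5):
    factor = output_scale_factor(4, degree)
    print(f"degree {degree}: output scales by {factor:g} ({returns_to_scale(degree)} returns)")
```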

We now turn to a useful result about homogeneous functions.

Euler's theorem and marginal products

Euler's theorem states that if $f(x, y)$ is a homogeneous function of degree $r$, then

\[ x\frac{\partial f}{\partial x} + y\frac{\partial f}{\partial y} = r f(x, y). \]

This follows from a simple application of the chain rule since, using the definition of a function that is homogeneous of degree $r$, we have

\[ f(\lambda x, \lambda y) = \lambda^r f(x, y), \]

for any $\lambda \in \mathbb{R}$. As such, differentiating both sides with respect to $\lambda$ and using the chain rule from (6.3) on the left-hand side, we have

\[ \frac{\partial f}{\partial u}\frac{du}{d\lambda} + \frac{\partial f}{\partial v}\frac{dv}{d\lambda} = r\lambda^{r-1} f(x, y), \]

if we think of $f(\lambda x, \lambda y)$ as $f(u, v)$ with $u = \lambda x$ and $v = \lambda y$. This then gives us

\[ x\frac{\partial f}{\partial u} + y\frac{\partial f}{\partial v} = r\lambda^{r-1} f(x, y), \]

and, if we now set $\lambda = 1$, we get the desired result as we have $u = x$, $v = y$ and $\lambda^{r-1} = 1$.

In this course, a question may involve verifying that Euler's theorem holds for some given homogeneous function. As an example, let's verify that it is true for the two homogeneous functions we considered in Examples 6.15 and 6.16.

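Example 6.19 In Example 6.15, we saw that the function $f(x, y) = x^{1/2} y^{1/2}$ is homogeneous of degree one. Verify that Euler's theorem holds for this function.

In this case we can see that

\[ \frac{\partial f}{\partial x} = \frac{1}{2} x^{-1/2} y^{1/2} \quad \text{and} \quad \frac{\partial f}{\partial y} = \frac{1}{2} x^{1/2} y^{-1/2}. \]

So, the left-hand side of Euler's theorem gives us

\[ x\frac{\partial f}{\partial x} + y\frac{\partial f}{\partial y} = x\left(\frac{1}{2} x^{-1/2} y^{1/2}\right) + y\left(\frac{1}{2} x^{1/2} y^{-1/2}\right) = \frac{1}{2} x^{1/2} y^{1/2} + \frac{1}{2} x^{1/2} y^{1/2} = x^{1/2} y^{1/2} = f(x, y), \]

and since the degree of homogeneity of this function is one, we have $f(x, y)$ on the right-hand side of Euler's theorem. Thus, as these two expressions are the same, Euler's theorem holds.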

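Example 6.20 In Example 6.16, we saw that the function $f(x, y) = \sqrt{x} + \sqrt{y}$ is homogeneous of degree 1/2. Verify that Euler's theorem holds for this function.

In this case we can see that $f(x, y)$ can be written as $f(x, y) = x^{1/2} + y^{1/2}$ and so,

\[ \frac{\partial f}{\partial x} = \frac{1}{2} x^{-1/2} \quad \text{and} \quad \frac{\partial f}{\partial y} = \frac{1}{2} y^{-1/2}. \]

So, the left-hand side of Euler's theorem gives us

\[ x\frac{\partial f}{\partial x} + y\frac{\partial f}{\partial y} = x\left(\frac{1}{2} x^{-1/2}\right) + y\left(\frac{1}{2} y^{-1/2}\right) = \frac{1}{2} x^{1/2} + \frac{1}{2} y^{1/2} = \frac{1}{2}\left(x^{1/2} + y^{1/2}\right) = \frac{1}{2} f(x, y), \]

and since the degree of homogeneity of this function is a half, we have $\frac{1}{2} f(x, y)$ on the right-hand side of Euler's theorem. Thus, as these two expressions are the same, Euler's theorem holds.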
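The verifications in the two preceding examples can be repeated numerically at a sample point. The sketch below evaluates $x f_x + y f_y$ using the partial derivatives found in those examples and compares it with $r f(x, y)$; the point $(4, 9)$ is an arbitrary choice that makes the square roots exact, and the function names are illustrative.

```python
import math


def euler_left_hand_side(f_x, f_y, x, y):
    """Evaluate x * f_x(x, y) + y * f_y(x, y)."""
    return x * f_x(x, y) + y * f_y(x, y)


# f = x^(1/2) y^(1/2), homogeneous of degree one.
def f1(x, y):
    return x**0.5 * y**0.5


def f1_x(x, y):
    return 0.5 * x**-0.5 * y**0.5


def f1_y(x, y):
    return 0.5 * x**0.5 * y**-0.5


# f = x^(1/2) + y^(1/2), homogeneous of degree one half.
def f2(x, y):
    return x**0.5 + y**0.5


def f2_x(x, y):
    return 0.5 * x**-0.5


def f2_y(x, y):
    return 0.5 * y**-0.5


x0, y0 = 4.0, 9.0
print(euler_left_hand_side(f1_x, f1_y, x0, y0), 1 * f1(x0, y0))
print(euler_left_hand_side(f2_x, f2_y, x0, y0), 0.5 * f2(x0, y0))
```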

We now turn to the economic significance of Euler's theorem. Consider a firm that invests an amount of capital, $k$, and labour, $l$, in its production process and this yields a production level of $q(k, l)$. Further, assume that this production function is homogeneous of degree one, i.e. that we have constant returns to scale. Euler's theorem then asserts that

$$k\frac{\partial q}{\partial k} + l\frac{\partial q}{\partial l} = q.$$

Now, $q_l$ gives us the marginal product of labour, i.e. it measures the change in production if we change the amount of labour. In particular, if we invest one more unit of labour, say by employing one more worker, $q_l$ tells us the resulting change in production.$^{7}$ As such, it makes sense to say that this extra worker is responsible for this change in production and so, if we assume that we reward workers by giving them goods equal to the quantity they produce, it makes sense to reward this worker with a quantity of goods given by $q_l$. Thus, if all workers produce the same amount, i.e. $q_l$, and there are $l$ (i.e. the amount of labour invested) workers, it makes sense that they should all be rewarded with a quantity of goods equal to $q_l$. As such, the quantity $l q_l$ represents the total quantity of goods that should be given as rewards to the workers (i.e. the labour). A similar argument applies to the quantity $k q_k$, i.e. this should be the total quantity of goods that should be given as rewards to the providers of the capital. Consequently, Euler's theorem tells us that these rewards should add up to the total quantity of goods produced, i.e. all the goods being produced should be distributed amongst the suppliers of capital and the providers of labour. In summary, this says:

$^{7}$ But, strictly, this is only approximate since if $\Delta q$ is the change in production and $\Delta l$ is the change in labour, the relationship

$$\frac{\Delta q}{\Delta l} \simeq \frac{\partial q}{\partial l} \quad\text{or}\quad \Delta q \simeq \frac{\partial q}{\partial l}\,\Delta l,$$

is only an approximation. As such, taking on one more worker (i.e. changing the amount of labour by one) gives $\Delta l = 1$ and hence the change in production, $\Delta q$, is given [approximately] by $\Delta q = q_l$. However, the argument given in these notes can be made precise if we consider the change in production due to an arbitrarily small change in the amount of labour instead of, say, the intuitively more obvious change of one worker.

If a firm has constant returns to scale and rewards each factor of production (e.g. capital and labour) at a level equal to its marginal product, then the total reward to the factors of production will be the amount produced.

6.3.5

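If we have a function $f(x, y)$, we can use partial differentiation to find the new functions $f_x(x, y)$ and $f_y(x, y)$. These new functions are called the first-order partial derivatives of $f$. However, it is also possible to partially differentiate these new functions with respect to $x$ and $y$ to get the second-order partial derivatives of $f$. For a function of two variables, there are four second-order partial derivatives, i.e. those that are unmixed:

$$\frac{\partial^2 f}{\partial x^2} = \frac{\partial}{\partial x}\left(\frac{\partial f}{\partial x}\right) \quad\text{and}\quad \frac{\partial^2 f}{\partial y^2} = \frac{\partial}{\partial y}\left(\frac{\partial f}{\partial y}\right),$$

and those that are mixed:

$$\frac{\partial^2 f}{\partial y\,\partial x} = \frac{\partial}{\partial y}\left(\frac{\partial f}{\partial x}\right) \quad\text{and}\quad \frac{\partial^2 f}{\partial x\,\partial y} = \frac{\partial}{\partial x}\left(\frac{\partial f}{\partial y}\right).$$

In subscript notation, these are written as

$$f_{xx} = (f_x)_x, \quad f_{yy} = (f_y)_y, \quad f_{xy} = (f_x)_y \quad\text{and}\quad f_{yx} = (f_y)_x$$

respectively. In this course, we will find that the order of partial differentiation in the mixed second-order partial derivatives is unimportant since we will always have $f_{xy} = f_{yx}$. In particular, this fact can serve as a useful check when we are working out second-order partial derivatives.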

Example 6.21 The first-order partial derivatives of the function

$$f(x, y) = x^2 y + 5xy^3 + y^2,$$

were given by

$$f_x(x, y) = 2xy + 5y^3 \quad\text{and}\quad f_y(x, y) = x^2 + 15xy^2 + 2y.$$

Partially differentiating $f_x(x, y) = 2xy + 5y^3$ with respect to $x$ and $y$ respectively, we can see that

$$f_{xx}(x, y) = 2y \quad\text{and}\quad f_{xy}(x, y) = 2x + 15y^2,$$

whereas, partially differentiating $f_y(x, y) = x^2 + 15xy^2 + 2y$ with respect to $x$ and $y$ respectively, we can see that

$$f_{yx}(x, y) = 2x + 15y^2 \quad\text{and}\quad f_{yy}(x, y) = 30xy + 2.$$

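Example 6.22 The first-order partial derivatives of the function

$$f(x, y) = 3x^3 + 7xy^{-1} + 2y^9,$$

were given by

$$f_x(x, y) = 9x^2 + 7y^{-1} \quad\text{and}\quad f_y(x, y) = -7xy^{-2} + 18y^8.$$

Partially differentiating $f_x(x, y) = 9x^2 + 7y^{-1}$ with respect to $x$ and $y$ respectively, we can see that

$$f_{xx}(x, y) = 18x \quad\text{and}\quad f_{xy}(x, y) = -7y^{-2},$$

whereas, partially differentiating $f_y(x, y) = -7xy^{-2} + 18y^8$ with respect to $x$ and $y$ respectively, we can see that

$$f_{yx}(x, y) = -7y^{-2} \quad\text{and}\quad f_{yy}(x, y) = 14xy^{-3} + 144y^7.$$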

Activity 6.17 Find the second-order partial derivatives of the function in Activity 6.10.

$$f(x, y) = x^{3/4} y^{1/4}.$$

And, of course, when finding second-order partial derivatives we may need to use the

chain, product and quotient rules.

Example 6.23 The first-order partial derivatives of the function

$$f(x, y) = x\,e^{x+y^2},$$

were given by

$$f_x(x, y) = (x + 1)\,e^{x+y^2} \quad\text{and}\quad f_y(x, y) = 2xy\,e^{x+y^2}.$$

To find the second-order derivatives that arise from $f_x(x, y)$, we first note that we can write it as

$$\frac{\partial f}{\partial x} = \left[(x + 1)\,e^x\right] e^{y^2}.$$

So, to find $f_{xx}(x, y)$, we treat $e^{y^2}$ as a constant and we differentiate the function $(x + 1)\,e^x$ using the product rule to get $(x + 1)\,e^x + 1 \cdot e^x$. This gives us

$$\frac{\partial^2 f}{\partial x^2} = e^{y^2}\left[(x + 1)\,e^x + e^x\right] = (x + 2)\,e^{x+y^2}.$$

To find $f_{xy}(x, y)$, we treat $(x + 1)\,e^x$ as a constant and we differentiate the function $e^{y^2}$ using the chain rule to get $2y\,e^{y^2}$. This gives us

$$\frac{\partial^2 f}{\partial y\,\partial x} = (x + 1)\,e^x\left(2y\,e^{y^2}\right) = 2(x + 1)y\,e^{x+y^2}.$$

To find the second-order derivatives that arise from $f_y(x, y)$, we first note that we can write it as

$$\frac{\partial f}{\partial y} = 2\left(x\,e^x\right)\left(y\,e^{y^2}\right).$$

So, to find $f_{yx}(x, y)$, we treat $2y\,e^{y^2}$ as a constant and we differentiate the function $x\,e^x$ using the product rule to get $x\,e^x + 1 \cdot e^x$. This gives us

$$\frac{\partial^2 f}{\partial x\,\partial y} = 2y\,e^{y^2}\left(x\,e^x + e^x\right) = 2(x + 1)y\,e^{x+y^2}.$$

To find $f_{yy}(x, y)$, we treat $2x\,e^x$ as a constant and we differentiate the function $y\,e^{y^2}$ using the chain and product rules to get $y\left(2y\,e^{y^2}\right) + e^{y^2}$. This gives us

$$\frac{\partial^2 f}{\partial y^2} = 2x\,e^x\left(2y^2\,e^{y^2} + e^{y^2}\right) = 2x(2y^2 + 1)\,e^{x+y^2}.$$

Notice that $f_{xy} = f_{yx}$ as we should expect in this course.
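As a sanity check on Example 6.23, we can differentiate the two first-order partial derivatives numerically and compare the results with the mixed derivative found by hand. This is only a sketch under our own assumptions: the helpers `d_dx` and `d_dy`, the test point $(0.5, 0.3)$ and the step size are illustrative choices, not part of the guide.

```python
import math

def fx(x, y):
    # First-order partial derivative f_x from Example 6.23.
    return (x + 1) * math.exp(x + y**2)

def fy(x, y):
    # First-order partial derivative f_y from Example 6.23.
    return 2 * x * y * math.exp(x + y**2)

def d_dx(g, x, y, h=1e-6):
    """Central-difference estimate of the partial derivative of g in x."""
    return (g(x + h, y) - g(x - h, y)) / (2 * h)

def d_dy(g, x, y, h=1e-6):
    """Central-difference estimate of the partial derivative of g in y."""
    return (g(x, y + h) - g(x, y - h)) / (2 * h)

x0, y0 = 0.5, 0.3
fxy = d_dy(fx, x0, y0)  # differentiate f_x with respect to y
fyx = d_dx(fy, x0, y0)  # differentiate f_y with respect to x
by_hand = 2 * (x0 + 1) * y0 * math.exp(x0 + y0**2)
print(fxy, fyx, by_hand)  # the three values should agree closely
```

Agreement at one point does not prove that $f_{xy} = f_{yx}$ everywhere, but a disagreement would flag an algebra slip.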

Activity 6.19

Activity 6.11.

they will not be used in this course.

6.4

We now look at some of the useful things that partial derivatives tell us about functions

of two variables. Before you start this section, you should note that this material makes

use of some ideas from Chapter 2 of 173 Algebra, namely

the dot product of two vectors (see Section 2.8),

displacement and direction vectors (see Section 2.9),

the equation of a plane (see Section 2.11), and

the equation of a hyperplane (see Section 2.12).

Make sure that you understand these before you proceed.

6.4.1

Tangent planes

Suppose that we have a surface whose equation is given by $z = f(x, y)$. If $c = f(a, b)$, then the point $(a, b, c)$ is on this surface and, if we look at the sections given by $x = a$ and $y = b$, which are parallel to the $(y, z)$-plane and $(x, z)$-plane respectively, we can find tangent lines in these planes by using the partial derivatives as these tell us how $z$ is changing with $y$ and $x$ respectively at this point. In particular, if $x = a$, the section is given by $z = f(a, y)$ and the tangent line is given by

$$z = c + f_y(a, b)(y - b),$$

and this lives in the plane $x = a$ which is parallel to the $(y, z)$-plane whereas if $y = b$, the section is given by $z = f(x, b)$ and the tangent line is given by

$$z = c + f_x(a, b)(x - a),$$

and this lives in the plane $y = b$ which is parallel to the $(x, z)$-plane.

Example 6.24 Show that the point $(1, 1, 2)$ lies on the surface whose equation is $z = x^2 + y^2$. What are the equations of the tangent lines to the $x = 1$ and $y = 1$ sections at this point?

The point $(1, 1, 2)$ lies on the surface $z = x^2 + y^2$ as $2 = 1^2 + 1^2$. Here we have $z = f(x, y)$ with $f(x, y) = x^2 + y^2$ and so, looking at the:

$x = 1$ section, we have

$$f_y(x, y) = 2y \implies f_y(1, 1) = 2,$$

and so the tangent line, which lives in the plane $x = 1$, has an equation given by

$$z = 2 + 2(y - 1) = 2y,$$

as we should expect since this section has an equation given by $z = 1 + y^2$. This section and the tangent line are illustrated in Figure 6.10(a).

$y = 1$ section, we have

$$f_x(x, y) = 2x \implies f_x(1, 1) = 2,$$

and so the tangent line, which lives in the plane $y = 1$, has an equation given by

$$z = 2 + 2(x - 1) = 2x,$$

as we should expect since this section has an equation given by $z = 1 + x^2$. This section and the tangent line are illustrated in Figure 6.10(b).

In particular, note that these tangent lines live in the planes that define the relevant sections.

Indeed, as we can find two tangent lines that tell us about how the surface z = f (x, y)

is changing in the x and y-directions at the point (a, b, c) by considering the y = b and

x = a sections respectively, we can use these two lines to define the tangent plane to the

surface at this point. The question is: How do we find the equation of this tangent

plane?

Figure 6.10: Tangent lines to the (a) $x = 1$ and (b) $y = 1$ sections of the surface $z = x^2 + y^2$. In (a) the tangent line is $z = 2y$ in the plane $x = 1$; in (b) it is $z = 2x$ in the plane $y = 1$.

Let's assume that both of the partial derivatives, $f_x(x, y)$ and $f_y(x, y)$, are defined at the point $(a, b, c)$. We know, from Section 2.11 of 173 Algebra, that the vector equation of a plane through this point is given by

$$\begin{pmatrix} u \\ v \\ w \end{pmatrix} \cdot \begin{pmatrix} x - a \\ y - b \\ z - c \end{pmatrix} = 0,$$

where the vector $(u, v, w)^T$ is the normal vector to the plane. Indeed, working out this dot product, we find that

$$u(x - a) + v(y - b) + w(z - c) = 0,$$

is the Cartesian equation of the plane. But, what are $u$, $v$ and $w$? Well, if we assume that we have $w \neq 0$, i.e. the plane we are considering is not vertical, then we can write this as

$$z = c - \frac{u}{w}(x - a) - \frac{v}{w}(y - b),$$

and, to be a tangent plane, we require that the two tangent lines we found above lie in the plane. In particular, we find that when $x = a$, we must have

$$z = c - \frac{v}{w}(y - b) \quad\text{giving us}\quad z = c + f_y(a, b)(y - b),$$

whereas when $y = b$, we must have

$$z = c - \frac{u}{w}(x - a) \quad\text{giving us}\quad z = c + f_x(a, b)(x - a),$$

so that

$$-\frac{v}{w} = f_y(a, b) \quad\text{and}\quad -\frac{u}{w} = f_x(a, b).$$

This means that the Cartesian equation of the tangent plane is given by

$$z - c = f_x(a, b)(x - a) + f_y(a, b)(y - b), \tag{6.4}$$

and writing this as

$$f_x(a, b)(x - a) + f_y(a, b)(y - b) - (z - c) = 0,$$

we see that its vector equation is

$$\begin{pmatrix} f_x(a, b) \\ f_y(a, b) \\ -1 \end{pmatrix} \cdot \begin{pmatrix} x - a \\ y - b \\ z - c \end{pmatrix} = 0. \tag{6.5}$$

In particular, the vector

$$\begin{pmatrix} u \\ v \\ w \end{pmatrix} = \begin{pmatrix} f_x(a, b) \\ f_y(a, b) \\ -1 \end{pmatrix},$$

is a normal vector to this tangent plane.

Example 6.25 Following on from Example 6.24, find the Cartesian and vector equations of the tangent plane to the surface $z = x^2 + y^2$ at the point $(1, 1, 2)$. Verify that the tangent lines to the $x = 1$ and $y = 1$ sections at this point (found in Example 6.24) lie in this tangent plane.

Using what we found in Example 6.24 and (6.4), it should be clear that the Cartesian equation of the tangent plane to the surface $z = x^2 + y^2$ at the point $(1, 1, 2)$ is given by

$$z - 2 = 2(x - 1) + 2(y - 1) \implies z = 2x + 2y - 2,$$

whereas, using (6.5), its vector equation is

$$\begin{pmatrix} 2 \\ 2 \\ -1 \end{pmatrix} \cdot \begin{pmatrix} x - 1 \\ y - 1 \\ z - 2 \end{pmatrix} = 0.$$

Of course, if you work out the dot product in the latter, you should get the former!

If we now find the $x = 1$ section of this tangent plane we get

$$z = 2(1) + 2y - 2 = 2y,$$

which is the tangent line to the $x = 1$ section of the surface and so this must lie in the tangent plane and, similarly, if we find the $y = 1$ section of this tangent plane we get

$$z = 2x + 2(1) - 2 = 2x,$$

which is the tangent line to the $y = 1$ section of the surface and so this must lie in the tangent plane too. This is illustrated in Figure 6.11.
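The calculation in Example 6.25 is easy to reproduce in code. Below is a minimal sketch that builds the tangent plane from (6.4); the function name `tangent_plane` is our own, and the partial derivatives are supplied by hand rather than computed.

```python
def tangent_plane(f, fx, fy, a, b):
    """Return z(x, y) for the tangent plane to z = f(x, y) at (a, b, f(a, b)),
    i.e. z = c + f_x(a, b)(x - a) + f_y(a, b)(y - b) as in (6.4)."""
    c = f(a, b)
    slope_x, slope_y = fx(a, b), fy(a, b)

    def plane(x, y):
        return c + slope_x * (x - a) + slope_y * (y - b)

    return plane

def f(x, y):
    return x**2 + y**2

def fx(x, y):
    return 2 * x

def fy(x, y):
    return 2 * y

plane = tangent_plane(f, fx, fy, 1.0, 1.0)
print(plane(1.0, 3.0))               # on the x = 1 section, where z = 2y
print(plane(4.0, 1.0))               # on the y = 1 section, where z = 2x
print(plane(1.1, 1.1), f(1.1, 1.1))  # plane versus surface near (1, 1)
```

The last line compares the plane with the surface at a nearby point, which is the linear approximation idea taken up next.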

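We note in passing that, if $f$ is differentiable, then the tangent plane to $f(x, y)$ at the point $(a, b)$ gives us a linear approximation to $f(x, y)$ at nearby points, i.e.

$$f(x, y) \simeq f(a, b) + f_x(a, b)(x - a) + f_y(a, b)(y - b) \implies f(x, y) \simeq f(a, b) + \begin{pmatrix} f_x(a, b), & f_y(a, b) \end{pmatrix} \begin{pmatrix} x - a \\ y - b \end{pmatrix}.$$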

Figure 6.11: The tangent plane to the surface $z = x^2 + y^2$ at the point $(1, 1, 2)$ as discussed in Example 6.25. The lines in this tangent plane, which lie in the $x = 1$ and $y = 1$ planes, are the tangent lines to the $x = 1$ and $y = 1$ sections of the surface respectively.

This prompts us to define the derivative of $f(x, y)$ with respect to the vector $\mathbf{x} = (x, y)$ to be the vector

$$\frac{df}{d\mathbf{x}} = \begin{pmatrix} f_x(x, y), & f_y(x, y) \end{pmatrix},$$

so that we can write

$$f(x, y) \simeq f(a, b) + \left.\frac{df}{d\mathbf{x}}\right|_{(a,b)} \begin{pmatrix} x - a \\ y - b \end{pmatrix}.$$

This then gives us something which looks like a Taylor series and we will see more of this in Section 6.4.5. But, before we do this, let's consider another important use of what we have just seen.

6.4.2

Gradient vectors

The tangent plane to the surface $z = f(x, y)$ at the point $(a, b, c)$, where $c = f(a, b)$, has a Cartesian equation given by

$$z - c = f_x(a, b)(x - a) + f_y(a, b)(y - b).$$

Now, if we look at the intersection of the surface and its tangent plane with the horizontal plane $z = c$, we find that the surface gives us the contour $c = f(x, y)$ and the tangent plane gives us the line

$$f_x(a, b)(x - a) + f_y(a, b)(y - b) = 0.$$

Now, this line passes through the point $(a, b)$ and, given that this line is in the tangent plane of the surface at the point $(a, b, c)$, it should be clear that it is the tangent line of this contour at $(a, b)$. In particular, as we can write the equation of this line as

$$\begin{pmatrix} f_x(a, b) \\ f_y(a, b) \end{pmatrix} \cdot \begin{pmatrix} x - a \\ y - b \end{pmatrix} = 0, \tag{6.6}$$

we can see that the vector

$$\nabla f(a, b) = \begin{pmatrix} f_x(a, b) \\ f_y(a, b) \end{pmatrix},$$

is perpendicular to this tangent line and hence to the contour at the point $(a, b)$.

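Example 6.26 Given that $z = f(x, y)$ where $f(x, y) = x^2 + y^2$, find $\nabla f(1, 1)$. Show that this vector is perpendicular to the tangent line to the $z = 2$ contour of this surface at the point $(1, 1)$ and hence deduce that it is perpendicular to this contour at this point.

Here $f(x, y) = x^2 + y^2$ and so we have

$$\nabla f(x, y) = \begin{pmatrix} f_x(x, y) \\ f_y(x, y) \end{pmatrix} = \begin{pmatrix} 2x \\ 2y \end{pmatrix} \implies \nabla f(1, 1) = \begin{pmatrix} 2 \\ 2 \end{pmatrix}.$$

Then, using (6.6), we see that the Cartesian equation of the tangent line to the $z = 2$ contour at this point$^{9}$ is given by

$$\begin{pmatrix} 2 \\ 2 \end{pmatrix} \cdot \begin{pmatrix} x - 1 \\ y - 1 \end{pmatrix} = 0 \implies 2(x - 1) + 2(y - 1) = 0 \implies y = 2 - x.$$

Now, for $x \in \mathbb{R}$, we have points $(x, y)$ on this tangent line given by

$$\begin{pmatrix} x \\ y \end{pmatrix} = \begin{pmatrix} x \\ 2 - x \end{pmatrix} = \begin{pmatrix} 0 \\ 2 \end{pmatrix} + x \begin{pmatrix} 1 \\ -1 \end{pmatrix},$$

and so this line lies in the direction given by the vector $(1, -1)^T$. But, of course,

$$\nabla f(1, 1) \cdot \begin{pmatrix} 1 \\ -1 \end{pmatrix} = \begin{pmatrix} 2 \\ 2 \end{pmatrix} \cdot \begin{pmatrix} 1 \\ -1 \end{pmatrix} = 2 + (-2) = 0,$$

which means that $\nabla f(1, 1)$ is indeed perpendicular to this tangent line and, in particular, it will be perpendicular to the contour at this point too. This is illustrated in Figure 6.12.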

In general, given a function $f(x, y)$, we call the vector

$$\nabla f(x, y) = \begin{pmatrix} f_x(x, y) \\ f_y(x, y) \end{pmatrix}, \tag{6.7}$$

the gradient of $f$. Indeed, we have seen that $f_x(a, b)$ and $f_y(a, b)$ allow us to see how rapidly $f$ is changing if we move away from the point $(a, b)$ in the $x$ or $y$-direction respectively. Now, we will look at how $\nabla f(a, b)$ allows us to see how rapidly $f$ is changing if we move away from the point $(a, b)$ in any direction.

$^{9}$ Note that $(x, y) = (1, 1)$ gives $z = f(1, 1) = 2$ and so this point is on the $z = 2$ contour of this surface.

Figure 6.12: The $z = 2$ contour of the surface $z = x^2 + y^2$ and its tangent line, $y = 2 - x$, at the point $(1, 1)$ as discussed in Example 6.26. Observe how the tangent line to the contour at this point is perpendicular to the vector $\nabla f(1, 1)$. (The $x$ and $y$-intercepts of the contour have been omitted for clarity.)

6.4.3

Directional derivatives

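Given the function $f(x, y)$, we want to find its derivative, $f_{\hat{\mathbf{u}}}(a, b)$, in the direction of the unit vector $\hat{\mathbf{u}} = (u_1, u_2)^T$.$^{10}$ Of course, if $\hat{\mathbf{u}}$ is a unit vector in the $x$-direction, i.e.

$$\hat{\mathbf{u}} = \begin{pmatrix} 1 \\ 0 \end{pmatrix},$$

this derivative is just $f_x(a, b)$ whereas if $\hat{\mathbf{u}}$ is a unit vector in the $y$-direction, i.e.

$$\hat{\mathbf{u}} = \begin{pmatrix} 0 \\ 1 \end{pmatrix},$$

this derivative is just $f_y(a, b)$, but the question is: What if we are not using either of these two directions?

Consider the point on the surface $z = f(x, y)$ at the point $(a, b, c)$ where $c = f(a, b)$. At this point, we can find the section in the direction $\hat{\mathbf{u}}$, i.e. the curve of intersection of the surface and a plane that contains the point $(a, b, c)$ and the vector $\hat{\mathbf{u}}$. Then, geometrically, we would want to interpret $f_{\hat{\mathbf{u}}}(a, b)$ as the gradient of the tangent line to this section. Now, as $\hat{\mathbf{u}}$ is a unit vector, this means that we have a vector $\mathbf{v}$ given by

$$\mathbf{v} = \begin{pmatrix} u_1 \\ u_2 \\ f_{\hat{\mathbf{u}}}(a, b) \end{pmatrix},$$

which lies in the plane and points in the direction of the tangent line. As such, this vector is perpendicular to the normal vector to the surface at this point and so we have

$$\begin{pmatrix} u_1 \\ u_2 \\ f_{\hat{\mathbf{u}}}(a, b) \end{pmatrix} \cdot \begin{pmatrix} f_x(a, b) \\ f_y(a, b) \\ -1 \end{pmatrix} = 0.$$

That is, working out this dot product, we have

$$u_1 f_x(a, b) + u_2 f_y(a, b) - f_{\hat{\mathbf{u}}}(a, b) = 0,$$

or, rearranging,

$$f_{\hat{\mathbf{u}}}(a, b) = u_1 f_x(a, b) + u_2 f_y(a, b) = \begin{pmatrix} u_1 \\ u_2 \end{pmatrix} \cdot \begin{pmatrix} f_x(a, b) \\ f_y(a, b) \end{pmatrix},$$

if we rewrite this in terms of inner products. Thus, we can see that the derivative of $f$ at the point $(a, b)$ in the direction of the unit vector $\hat{\mathbf{u}}$ is given by

$$f_{\hat{\mathbf{u}}}(a, b) = \hat{\mathbf{u}} \cdot \nabla f(a, b),$$

in terms of the gradient of $f$.

$^{10}$ That is, we have a direction $\mathbf{u}$ and we work with a unit vector in that direction, i.e. we use $\hat{\mathbf{u}} = (u_1, u_2)^T$ where $|\hat{\mathbf{u}}| = \sqrt{u_1^2 + u_2^2} = 1$.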

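Example 6.27 Given that $z = f(x, y)$ with $f(x, y) = x^2 + y^2$, find the derivative of $f(x, y)$ in the direction $(1, 2)^T$ at the point $(1, 1)$. What is the derivative of $f$ in the direction $\nabla f(1, 1)$?

We saw in Example 6.26 that the gradient of $f$ at the point $(1, 1)$ is given by

$$\nabla f(1, 1) = \begin{pmatrix} 2 \\ 2 \end{pmatrix}.$$

Taking the direction

$$\mathbf{u} = \begin{pmatrix} 1 \\ 2 \end{pmatrix}, \quad\text{we get the unit vector}\quad \hat{\mathbf{u}} = \frac{1}{\sqrt{5}} \begin{pmatrix} 1 \\ 2 \end{pmatrix},$$

as $|\mathbf{u}|^2 = 1^2 + 2^2 = 5$ and this means that the derivative of $f$ in the direction of this unit vector is given by

$$f_{\hat{\mathbf{u}}}(1, 1) = \hat{\mathbf{u}} \cdot \nabla f(1, 1) = \frac{1}{\sqrt{5}} \begin{pmatrix} 1 \\ 2 \end{pmatrix} \cdot \begin{pmatrix} 2 \\ 2 \end{pmatrix} = \frac{1}{\sqrt{5}}(2 + 4) = \frac{6}{\sqrt{5}}.$$

Similarly, taking the direction

$$\mathbf{v} = \nabla f(1, 1) = \begin{pmatrix} 2 \\ 2 \end{pmatrix}, \quad\text{we have } |\mathbf{v}|^2 = 2^2 + 2^2 = 8 \text{ so we get the unit vector}\quad \hat{\mathbf{v}} = \frac{1}{\sqrt{8}} \begin{pmatrix} 2 \\ 2 \end{pmatrix},$$

and this means that the derivative of $f$ in the direction of this unit vector is given by

$$f_{\hat{\mathbf{v}}}(1, 1) = \hat{\mathbf{v}} \cdot \nabla f(1, 1) = \frac{1}{\sqrt{8}} \begin{pmatrix} 2 \\ 2 \end{pmatrix} \cdot \begin{pmatrix} 2 \\ 2 \end{pmatrix} = \frac{1}{\sqrt{8}}(4 + 4) = \frac{8}{\sqrt{8}}.$$

In particular, observe that the latter is approximately 2.83 (to 2dp) which is larger than the former which is approximately 2.68 (to 2dp).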
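The two directional derivatives in Example 6.27 can be reproduced with a few lines of code. This is a sketch only: the function name `directional_derivative` is our own, and it normalises whatever direction it is given before taking the dot product with the gradient.

```python
import math

def directional_derivative(grad, direction):
    """Dot product of the gradient with the unit vector along `direction`."""
    norm = math.sqrt(sum(d * d for d in direction))
    if norm == 0:
        raise ValueError("direction must be a non-zero vector")
    return sum(g * d for g, d in zip(grad, direction)) / norm

grad_f = (2.0, 2.0)  # gradient of x^2 + y^2 at the point (1, 1)
along_u = directional_derivative(grad_f, (1.0, 2.0))
along_grad = directional_derivative(grad_f, grad_f)
print(round(along_u, 2), round(along_grad, 2))  # 2.68 2.83
```

Because the direction is normalised inside the function, passing $(1, 2)^T$ or any positive multiple of it gives the same answer.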

Indeed, this leads on to a useful observation about the rate at which $f$ is changing in different directions. We know, from Section 2.9 of 173 Algebra, that if $\theta$ is the angle between the vectors $\hat{\mathbf{u}}$ and $\nabla f(a, b)$, we have

$$\hat{\mathbf{u}} \cdot \nabla f(a, b) = |\hat{\mathbf{u}}|\,|\nabla f(a, b)| \cos\theta = |\nabla f(a, b)| \cos\theta,$$

as $|\hat{\mathbf{u}}| = 1$ since $\hat{\mathbf{u}}$ is a unit vector. In particular, we can use the fact that $-1 \le \cos\theta \le 1$ to see that

$$-|\nabla f(a, b)| \le f_{\hat{\mathbf{u}}}(a, b) \le |\nabla f(a, b)|.$$

That is, if $|\nabla f(a, b)| \neq 0$, we can deduce that:

The maximum rate of change of $f$ at the point $(a, b, c)$ is $|\nabla f(a, b)|$ and this occurs when $\theta = 0$, i.e. when the direction is $\mathbf{u} = \nabla f(a, b)$. This is the direction and rate at which $f$ increases most rapidly.

The minimum rate of change of $f$ at the point $(a, b, c)$ is $-|\nabla f(a, b)|$ and this occurs when $\theta = \pi$, i.e. when the direction is $\mathbf{u} = -\nabla f(a, b)$. This is the direction and rate at which $f$ decreases most rapidly.

Indeed, this allows us to see that, at the point $(a, b)$, $f$ is steepest in the direction $\nabla f(a, b)$.$^{11}$

Example 6.28 Illustrate that the maximum rate of change of $f$ occurs in the direction $\nabla f$ using what we found in Example 6.27.

In Example 6.27, we saw that the rate of change in the direction $\mathbf{v} = \nabla f(1, 1)$ was greater than the rate of change in the direction $\mathbf{u} = (1, 2)^T$ as

$$f_{\hat{\mathbf{v}}}(1, 1) > f_{\hat{\mathbf{u}}}(1, 1),$$

and we can illustrate this using Figure 6.13. In particular, observe that if we want to move to the $z = 4$ contour from the point $(1, 1)$ on the $z = 2$ contour, it is quickest to go in the direction given by $\nabla f(1, 1)$ as, if we were to go in the direction $\mathbf{u} = (1, 2)^T$, we would have to travel further. Consequently, the rate of change of $z = f(x, y)$ is maximised when we go in the direction given by $\nabla f(1, 1)$ and if we go in another direction, say $\mathbf{u} = (1, 2)^T$, it will be smaller.
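To see the maximum and minimum rates of change numerically, we can scan over unit vectors $(\cos\theta, \sin\theta)$ and record where the directional derivative is largest and smallest. The sketch below uses the gradient $(2, 2)^T$ from Example 6.27; the grid of 3600 angles and the names `rate`, `best` and `worst` are our own illustrative choices.

```python
import math

grad = (2.0, 2.0)  # gradient of x^2 + y^2 at the point (1, 1)

def rate(theta):
    """Directional derivative along the unit vector (cos(theta), sin(theta))."""
    return math.cos(theta) * grad[0] + math.sin(theta) * grad[1]

# Sample directions in steps of a tenth of a degree around the circle.
angles = [2 * math.pi * i / 3600 for i in range(3600)]
best = max(angles, key=rate)   # direction of fastest increase
worst = min(angles, key=rate)  # direction of fastest decrease

print(best, rate(best))    # angle of the gradient, and its length
print(worst, rate(worst))  # the opposite direction, and minus that length
```

On this grid the best angle is the one pointing along the gradient and the worst is the opposite direction, in line with the two bullet points above.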

6.4.4

Suppose that we have a surface whose equation is given by $z = f(x, y)$. We could, of course, write this equation as $f(x, y) - z = 0$ and, in this form, the equation is now $g(x, y, z) = 0$ if we take $g$ to be the function of three variables given by

$$g(x, y, z) = f(x, y) - z.$$

Indeed, more generally, we can see that a surface can be given by an equation of the form $g(x, y, z) = c$ where $g$, a function of three variables, is constrained to take the constant value, $c$. Sometimes, in such cases, we will be able to rearrange what we are given to explicitly find the equation of the surface in the form $z = f(x, y)$. But, what if we can't? That is, what if we can only implicitly define the function $f(x, y)$ through the equation $g(x, y, z) = c$? As we shall see, with minor modifications, we will be able to discuss certain aspects of such a surface using $g$ even if we can't find $f$.

$^{11}$ Of course, if $|\nabla f(a, b)| = 0$ we find that $f_{\hat{\mathbf{u}}}(a, b) = 0$ in all directions, $\hat{\mathbf{u}}$! See Section 7.2.1.

Figure 6.13: The $z = 2$ and $z = 4$ contours of the surface $z = x^2 + y^2$ and the directions $\nabla f(1, 1)$ and $(1, 2)^T$ at the point $(1, 1)$ as discussed in Example 6.27. Observe how the quickest way to get to the $z = 4$ contour from the point $(1, 1)$ on the $z = 2$ contour is to go in the direction $\nabla f(1, 1)$. (The $x$ and $y$-intercepts of the $z = 2$ contour have been omitted for clarity.)

Tangent planes

Technically, a function $g : \mathbb{R}^3 \to \mathbb{R}$ defines a hypersurface in $\mathbb{R}^4$ whose equation is given by $u = g(x, y, z)$. And, although we can't 'visualise' such hypersurfaces because they live in a four-dimensional space, we can easily extend the theory of this chapter to say things about them. For instance, if we have the point $(a, b, c, d)$ where $d = g(a, b, c)$, it should be clear that the Cartesian equation of the tangent hyperplane to the surface at this point is given by

$$u - d = g_x(a, b, c)(x - a) + g_y(a, b, c)(y - b) + g_z(a, b, c)(z - c),$$

which is the analogue of what we saw in (6.4).$^{12}$ Indeed, rewriting this as

$$g_x(a, b, c)(x - a) + g_y(a, b, c)(y - b) + g_z(a, b, c)(z - c) - (u - d) = 0,$$

we can see that the vector equation of this tangent hyperplane is

$$\begin{pmatrix} g_x(a, b, c) \\ g_y(a, b, c) \\ g_z(a, b, c) \\ -1 \end{pmatrix} \cdot \begin{pmatrix} x - a \\ y - b \\ z - c \\ u - d \end{pmatrix} = 0,$$

which is the analogue of (6.5) and the vector

$$\begin{pmatrix} g_x(a, b, c) \\ g_y(a, b, c) \\ g_z(a, b, c) \\ -1 \end{pmatrix},$$

is therefore one of its normal vectors as we might expect given what we saw before.

Here, however, we are interested in a surface in $\mathbb{R}^3$ whose equation, for some constant $d$, is given by $g(x, y, z) = d$ and this is the $u = d$ contour of the corresponding hypersurface in $\mathbb{R}^4$. In particular, we want to be able to find the tangent plane to this surface at a point $(a, b, c)$ where $g(a, b, c) = d$. So, setting $u = d$ in the Cartesian equation of the tangent hyperplane above, we get

$$g_x(a, b, c)(x - a) + g_y(a, b, c)(y - b) + g_z(a, b, c)(z - c) = 0, \tag{6.8}$$

and this is the Cartesian equation of the tangent plane we seek. Let's see how this works in practice.

$^{12}$ We could, of course, re-run the argument given in Section 6.4.1 in this new context but we refrain from doing that here.

Example 6.29 Following on from Example 6.25, find the Cartesian equation of the tangent plane to the surface $z = x^2 + y^2$ at the point $(1, 1, 2)$ by using the function $g(x, y, z) = x^2 + y^2 - z$.

The surface whose equation is $z = x^2 + y^2$ can be represented by the equation $g(x, y, z) = 0$ with $g(x, y, z) = x^2 + y^2 - z$ and, as such, we have

$$g_x(x, y, z) = 2x, \quad g_y(x, y, z) = 2y \quad\text{and}\quad g_z(x, y, z) = -1.$$

Thus, using the Cartesian equation for the tangent plane at the point $(a, b, c)$ on the surface $g(x, y, z) = d$ in (6.8), i.e.

$$g_x(a, b, c)(x - a) + g_y(a, b, c)(y - b) + g_z(a, b, c)(z - c) = 0,$$

we verify that the point $(1, 1, 2)$ is on the surface as $g(1, 1, 2) = 1^2 + 1^2 - 2 = 0$ and see that

$$2(1)(x - 1) + 2(1)(y - 1) + (-1)(z - 2) = 0 \implies 2x + 2y - z = 2,$$

is the Cartesian equation of the tangent plane to the surface at this point in agreement with what we saw in Example 6.25.

But, of course, our real objective here is to see how to find a tangent plane when the

function of two variables which gives the surface is only implicitly defined through an

equation that involves a function of three variables as in the next example.

Example 6.30 Verify that the point $(1, 0, \pi)$ is on the surface whose equation is $x^3 + zy^3 + \sin z = 1$ and find the tangent plane to the surface at that point.

The point $(1, 0, \pi)$ is on the surface as $1^3 + (\pi)(0^3) + \sin\pi = 1 + 0 + 0 = 1$ and we can write the equation of the surface as $g(x, y, z) = 1$ with

$$g(x, y, z) = x^3 + zy^3 + \sin z.$$

As such, we have

$$g_x(x, y, z) = 3x^2, \quad g_y(x, y, z) = 3zy^2 \quad\text{and}\quad g_z(x, y, z) = y^3 + \cos z,$$

and so, using (6.8), we get

$$3(1^2)(x - 1) + 3(\pi)(0^2)(y - 0) + (0^3 + \cos\pi)(z - \pi) = 0 \implies 3x - z = 3 - \pi,$$

as the Cartesian equation of the tangent plane to the surface at this point.
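The steps of Example 6.30 can be mirrored in code: check that the point lies on the surface, evaluate the three partial derivatives there, and read off the plane $n_1 x + n_2 y + n_3 z = \text{rhs}$ from (6.8). This is a sketch under our own naming (`grad_g`, `normal`, `rhs`); the partial derivatives are entered by hand from the example.

```python
import math

def g(x, y, z):
    return x**3 + z * y**3 + math.sin(z)

def grad_g(x, y, z):
    # Partial derivatives g_x, g_y, g_z as found in Example 6.30.
    return (3 * x**2, 3 * z * y**2, y**3 + math.cos(z))

point = (1.0, 0.0, math.pi)
normal = grad_g(*point)

# Expanding (6.8) gives normal . (x, y, z) = normal . point.
rhs = sum(n * p for n, p in zip(normal, point))

print(g(*point))    # should be 1 up to rounding, so the point is on g = 1
print(normal, rhs)  # coefficients of x, y, z and the right-hand side
```

The printed coefficients correspond to the plane $3x - z = 3 - \pi$ found above.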

Gradient vectors

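If we now write (6.8) in vector form, we get

$$\begin{pmatrix} g_x(a, b, c) \\ g_y(a, b, c) \\ g_z(a, b, c) \end{pmatrix} \cdot \begin{pmatrix} x - a \\ y - b \\ z - c \end{pmatrix} = 0, \tag{6.9}$$

and so we can see that the vector

$$\nabla g(a, b, c) = \begin{pmatrix} g_x(a, b, c) \\ g_y(a, b, c) \\ g_z(a, b, c) \end{pmatrix},$$

is perpendicular to this tangent plane and hence to the surface at the point $(a, b, c)$.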

Example 6.31 Following on from Example 6.29, find the vector $\nabla g(1, 1, 2)$ where $g(x, y, z) = x^2 + y^2 - z$. Show that this vector is perpendicular to the tangent plane to the surface $g(x, y, z) = 0$ at the point $(1, 1, 2)$ and hence deduce that it is perpendicular to the surface at this point.

Here $g(x, y, z) = x^2 + y^2 - z$ and so we have

$$\nabla g(x, y, z) = \begin{pmatrix} g_x(x, y, z) \\ g_y(x, y, z) \\ g_z(x, y, z) \end{pmatrix} = \begin{pmatrix} 2x \\ 2y \\ -1 \end{pmatrix},$$

and, evaluating this at the point $(1, 1, 2)$, we get

$$\nabla g(1, 1, 2) = \begin{pmatrix} 2 \\ 2 \\ -1 \end{pmatrix}.$$

Then, using (6.9), we see that the Cartesian equation of the tangent plane to the surface $g(x, y, z) = 0$ at this point$^{14}$ is given by

$$\begin{pmatrix} 2 \\ 2 \\ -1 \end{pmatrix} \cdot \begin{pmatrix} x - 1 \\ y - 1 \\ z - 2 \end{pmatrix} = 0 \implies 2(x - 1) + 2(y - 1) - (z - 2) = 0 \implies 2x + 2y - z = 2.$$

Now, for $x, y \in \mathbb{R}$, we have points $(x, y, z)$ on this tangent plane given by

$$\begin{pmatrix} x \\ y \\ z \end{pmatrix} = \begin{pmatrix} x \\ y \\ -2 + 2x + 2y \end{pmatrix} = \begin{pmatrix} 0 \\ 0 \\ -2 \end{pmatrix} + x \begin{pmatrix} 1 \\ 0 \\ 2 \end{pmatrix} + y \begin{pmatrix} 0 \\ 1 \\ 2 \end{pmatrix},$$

and so this plane lies in the directions given by the vectors $(1, 0, 2)^T$ and $(0, 1, 2)^T$. But, of course,

$$\nabla g(1, 1, 2) \cdot \begin{pmatrix} 1 \\ 0 \\ 2 \end{pmatrix} = \begin{pmatrix} 2 \\ 2 \\ -1 \end{pmatrix} \cdot \begin{pmatrix} 1 \\ 0 \\ 2 \end{pmatrix} = 2 + 0 + (-2) = 0,$$

and

$$\nabla g(1, 1, 2) \cdot \begin{pmatrix} 0 \\ 1 \\ 2 \end{pmatrix} = \begin{pmatrix} 2 \\ 2 \\ -1 \end{pmatrix} \cdot \begin{pmatrix} 0 \\ 1 \\ 2 \end{pmatrix} = 0 + 2 + (-2) = 0,$$

which means that $\nabla g(1, 1, 2)$ is indeed perpendicular to this tangent plane and, in particular, it will be perpendicular to the surface at this point too.

In general, given a function $g(x, y, z)$, we call the vector

$$\nabla g(x, y, z) = \begin{pmatrix} g_x(x, y, z) \\ g_y(x, y, z) \\ g_z(x, y, z) \end{pmatrix},$$

the gradient of $g$ and, for a function of three variables, this is the analogue of what we saw in (6.7). Of course, we could then extend what we saw in Section 6.4.3, and use this to find the directional derivatives of a function of three variables. This, in turn, would allow us to see how rapidly this function is changing if we move away from a point in a certain direction and, in particular, it would allow us to find the maximum (or minimum) rate of change of such a function and the direction in which it occurs.

6.4.5

Taylor series

We saw in Section 3.4 that a function, $F(t)$, of one variable has a second-order Taylor series given by

$$F(t) = F(a) + (t - a)F'(a) + \frac{(t - a)^2}{2!}F''(a) + \cdots,$$

around $t = a$. Now, we want to derive the corresponding result for a function, $f(x, y)$, of two variables around the point $(a, b)$ and, from what we saw when we considered tangent planes in Section 6.4.1, we should anticipate that the first two terms of this Taylor series will be given by

$$f(a, b) + \left.\frac{df}{d\mathbf{x}}\right|_{(a,b)} \begin{pmatrix} x - a \\ y - b \end{pmatrix},$$

where

$$\frac{df}{d\mathbf{x}} = \begin{pmatrix} f_x(x, y), & f_y(x, y) \end{pmatrix},$$

is the derivative of $f(x, y)$ with respect to $\mathbf{x} = (x, y)$. So, our main concern here is what the next term will look like.

$^{14}$ Note that $(x, y, z) = (1, 1, 2)$ gives $g(1, 1, 2) = 1^2 + 1^2 - 2 = 0$ and so this point is on this surface.

If we want to find the Taylor series for a function, $f(x, y)$, around the point $(a, b)$ we need to see what is happening at some nearby point $(x, y)$. Let's say that, in terms of a new variable $t$, these points are related by the equations

$$x = a + ht \quad\text{and}\quad y = b + kt,$$

for some appropriately small values of the numbers $ht$ and $kt$ since these points are supposed to be close to one another. Indeed, this means that we can define a new function, $F(t)$, of the single variable, $t$, given by

$$F(t) = f(x(t), y(t)) \quad\text{where}\quad x(t) = a + ht \text{ and } y(t) = b + kt,$$

where the idea is that $F(t)$ and its derivatives will allow us to use the Maclaurin series for $F(t)$, i.e.

$$F(t) = F(0) + tF'(0) + \frac{t^2}{2!}F''(0) + \cdots,$$

to deduce the corresponding Taylor series for $f(x, y)$. In particular, we can see straightaway that

$$F(0) = f(x(0), y(0)) = f(a, b),$$

which is the first of our anticipated terms. Now we need to find the derivatives $F'(t)$ and $F''(t)$ to see what the other two terms are.

To find $F'(t)$, we use the chain rule from Section 6.3.3 to see that

$$F'(t) = \frac{\partial f}{\partial x}\frac{dx}{dt} + \frac{\partial f}{\partial y}\frac{dy}{dt} = h f_x(x(t), y(t)) + k f_y(x(t), y(t)).$$

In particular, this means that

$$F'(0) = h f_x(x(0), y(0)) + k f_y(x(0), y(0)) = h f_x(a, b) + k f_y(a, b),$$

so we can see that the next term in our Taylor series will be

$$tF'(0) = ht f_x(a, b) + kt f_y(a, b) = (x - a) f_x(a, b) + (y - b) f_y(a, b) = \left.\frac{df}{d\mathbf{x}}\right|_{(a,b)} \begin{pmatrix} x - a \\ y - b \end{pmatrix},$$

which is the second of our anticipated terms.

To find the remaining term, we need to find $F''(t)$ by differentiating our expression for $F'(t)$ with respect to $t$ using the chain rule. This gives us

$$F''(t) = h\left[\frac{\partial f_x}{\partial x}\frac{dx}{dt} + \frac{\partial f_x}{\partial y}\frac{dy}{dt}\right] + k\left[\frac{\partial f_y}{\partial x}\frac{dx}{dt} + \frac{\partial f_y}{\partial y}\frac{dy}{dt}\right]$$

$$= h\left[h f_{xx}(x(t), y(t)) + k f_{xy}(x(t), y(t))\right] + k\left[h f_{yx}(x(t), y(t)) + k f_{yy}(x(t), y(t))\right]$$

$$\implies F''(t) = h^2 f_{xx}(x(t), y(t)) + hk f_{xy}(x(t), y(t)) + kh f_{yx}(x(t), y(t)) + k^2 f_{yy}(x(t), y(t))$$

and, in particular, this means that

$$F''(0) = h^2 f_{xx}(a, b) + hk f_{xy}(a, b) + kh f_{yx}(a, b) + k^2 f_{yy}(a, b),$$

so we can see that the next term in our Taylor series will be

$$\frac{t^2}{2!}F''(0) = \frac{1}{2!}\left[(x - a)^2 f_{xx}(a, b) + (x - a)(y - b) f_{xy}(a, b) + (y - b)(x - a) f_{yx}(a, b) + (y - b)^2 f_{yy}(a, b)\right].$$

Indeed, if we now define the second derivative of $f(x, y)$ with respect to $\mathbf{x} = (x, y)$ to be the matrix

$$\frac{d^2 f}{d\mathbf{x}^2} = \begin{pmatrix} f_{xx}(x, y) & f_{xy}(x, y) \\ f_{yx}(x, y) & f_{yy}(x, y) \end{pmatrix},$$

it is easily verified that we have

$$\frac{t^2}{2!}F''(0) = \frac{1}{2!} \begin{pmatrix} x - a, & y - b \end{pmatrix} \left.\frac{d^2 f}{d\mathbf{x}^2}\right|_{(a,b)} \begin{pmatrix} x - a \\ y - b \end{pmatrix}.$$

Consequently, putting this all together we see that the second-order Taylor series for a function, $f(x, y)$, of two variables around the point $(a, b)$ is given by

$$f(x, y) = f(a, b) + \left.\frac{df}{d\mathbf{x}}\right|_{(a,b)} \begin{pmatrix} x - a \\ y - b \end{pmatrix} + \frac{1}{2!} \begin{pmatrix} x - a, & y - b \end{pmatrix} \left.\frac{d^2 f}{d\mathbf{x}^2}\right|_{(a,b)} \begin{pmatrix} x - a \\ y - b \end{pmatrix} + \cdots,$$

and these terms will be sufficient for our purposes in this course. We will see how this

can be used in the next chapter, but for now, we will just use it to find an

approximation to a function of two variables around a certain point.

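Example 6.32 Find the second-order Taylor series of the function $f(x, y) = e^x \cos y$ around the point $(1, 0)$.

The first term of our second-order Taylor series is simply $f(1, 0) = e^1 \cos 0 = e$. We also see that

$$\frac{df}{d\mathbf{x}} = \begin{pmatrix} f_x(x, y), & f_y(x, y) \end{pmatrix} = \begin{pmatrix} e^x \cos y, & -e^x \sin y \end{pmatrix},$$

which means that

$$\left.\frac{df}{d\mathbf{x}}\right|_{(1,0)} = \begin{pmatrix} e^1 \cos 0, & -e^1 \sin 0 \end{pmatrix} = \begin{pmatrix} e, & 0 \end{pmatrix},$$

and so the second term of our second-order Taylor series is

$$\left.\frac{df}{d\mathbf{x}}\right|_{(1,0)} \begin{pmatrix} x - 1 \\ y - 0 \end{pmatrix} = \begin{pmatrix} e, & 0 \end{pmatrix} \begin{pmatrix} x - 1 \\ y \end{pmatrix} = e(x - 1).$$

Lastly, we see that

$$\frac{d^2 f}{d\mathbf{x}^2} = \begin{pmatrix} f_{xx}(x, y) & f_{xy}(x, y) \\ f_{yx}(x, y) & f_{yy}(x, y) \end{pmatrix} = \begin{pmatrix} e^x \cos y & -e^x \sin y \\ -e^x \sin y & -e^x \cos y \end{pmatrix},$$

which means that

$$\left.\frac{d^2 f}{d\mathbf{x}^2}\right|_{(1,0)} = \begin{pmatrix} e^1 \cos 0 & -e^1 \sin 0 \\ -e^1 \sin 0 & -e^1 \cos 0 \end{pmatrix} = \begin{pmatrix} e & 0 \\ 0 & -e \end{pmatrix},$$

and so the third term of our second-order Taylor series is

$$\frac{1}{2!} \begin{pmatrix} x - 1, & y - 0 \end{pmatrix} \left.\frac{d^2 f}{d\mathbf{x}^2}\right|_{(1,0)} \begin{pmatrix} x - 1 \\ y - 0 \end{pmatrix} = \frac{1}{2!} \begin{pmatrix} x - 1, & y \end{pmatrix} \begin{pmatrix} e & 0 \\ 0 & -e \end{pmatrix} \begin{pmatrix} x - 1 \\ y \end{pmatrix} = \frac{1}{2!} \begin{pmatrix} x - 1, & y \end{pmatrix} \begin{pmatrix} e(x - 1) \\ -ey \end{pmatrix} = \frac{1}{2!}\left[e(x - 1)^2 - e\,y^2\right].$$

Consequently, putting these together, we see that

$$f(x, y) \simeq e + e(x - 1) + \frac{1}{2!}\left[e(x - 1)^2 - e\,y^2\right],$$

is the second-order Taylor series of $f(x, y) = e^x \cos y$ around the point $(1, 0)$.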
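To get a feel for how good this approximation is, we can code the general second-order formula and compare it with the function itself at a nearby point. This is a minimal sketch: the function name `taylor2` and the test point $(1.05, 0.1)$ are our own choices, and the derivative values at $(1, 0)$ are typed in from Example 6.32.

```python
import math

def taylor2(f0, grad, hess, a, b):
    """Second-order Taylor approximation around (a, b), built from the value
    f0, the first derivative (f_x, f_y) and the matrix of second derivatives
    ((f_xx, f_xy), (f_yx, f_yy)), all evaluated at (a, b)."""
    fx, fy = grad
    (fxx, fxy), (fyx, fyy) = hess

    def approx(x, y):
        dx, dy = x - a, y - b
        linear = fx * dx + fy * dy
        quadratic = fxx * dx * dx + fxy * dx * dy + fyx * dy * dx + fyy * dy * dy
        return f0 + linear + quadratic / 2

    return approx

def f(x, y):
    return math.exp(x) * math.cos(y)

e = math.e
approx = taylor2(e, (e, 0.0), ((e, 0.0), (0.0, -e)), 1.0, 0.0)

x, y = 1.05, 0.1
print(approx(x, y), f(x, y))  # the two values should be close
```

The gap between the two printed values shrinks as the test point moves closer to $(1, 0)$, since the terms we have dropped are of third order and higher.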

Activity 6.20 Find an approximation to $e^{1.1} \cos 0.2$ by using the second-order Taylor series that we found in Example 6.32.

Activity 6.21 Find the second-order Taylor series in the previous example by using the Taylor series for $e^x$ about $x = 1$ (see Example 3.31) and the Maclaurin series for $\cos y$ (see Section 3.4.1).

Learning outcomes

At the end of this chapter and having completed the relevant reading and activities, you

should be able to:

visualise a surface by using sections and contours;

find partial derivatives;

use the chain rule to find derivatives of various kinds;

show that a function is homogeneous and verify Euler's theorem;

solve problems from economics-based subjects that involve partial derivatives;

find tangent planes and gradient vectors;

find directional derivatives and interpret what you have found;

find Taylor series and use these to approximate functions of two variables.


Solutions to activities

Solution to activity 6.1

To find the contours of the surface $z = 4x + 2y - 2$ when we have the given values of $z$, we note that:

For $z = -10$, the curve of intersection is given by $-10 = 4x + 2y - 2$ which gives us $y = -2x - 4$.

For $z = 0$, the curve of intersection is given by $0 = 4x + 2y - 2$ which gives us $y = -2x + 1$.

For $z = 10$, the curve of intersection is given by $10 = 4x + 2y - 2$ which gives us $y = -2x + 6$.

Thus, we see from these equations that all three of the contours are straight lines. The sketch of these contours in the $(x, y)$-plane is illustrated in Figure 6.14.

Figure 6.14: A sketch of the $z = -10$, $z = 0$ and $z = 10$ contours of the surface $z = 4x + 2y - 2$ in the $(x, y)$-plane for Activity 6.1.

Solution to activity 6.2

To find the $z = -25$ contour of the surface $z = -x^2 - y^2$ we need to find the curve of intersection which, in this case, is simply

$$-x^2 - y^2 = -25 \implies x^2 + y^2 = 25.$$

This is the equation of a circle, centred on the origin, with a radius of five.

To find the $z = c$ contours in the three cases indicated we just need to find out what the curve

$$-x^2 - y^2 = c \implies x^2 + y^2 = -c,$$

looks like in the three cases. So, we have:

If $c > 0$, there are no contours as we have $-c < 0$ and we know that $x^2 + y^2 \ge 0$ for all values of $x$ and $y$.

If $c = 0$, the contour is the point $(0, 0)$ as this is the only solution to the equation $x^2 + y^2 = 0$.

If $c < 0$, the contour is a circle, centred on the origin, with a radius of $\sqrt{-c}$ as we have $-c > 0$.

In particular, notice that $z = 0$ is the largest value of $z$ that arises from a point on this surface.

Solution to activity 6.3

To find these sections of the surface $z = 4x + 2y - 2$ we need to find the curves of intersection, which in this case, are given by:

For the $(x, z)$-section, we have $y = 0$ and so the curve of intersection is given by $z = 4x - 2$ and this is a straight line in the $(x, z)$-plane.

For the $(y, z)$-section, we have $x = 0$ and so the curve of intersection is given by $z = 2y - 2$ and this is a straight line in the $(y, z)$-plane.

Figure 6.15: A sketch of the (a) $(x, z)$-section, $z = 4x - 2$, and (b) the $(y, z)$-section, $z = 2y - 2$, of the surface $z = 4x + 2y - 2$.

Solution to activity 6.4

To find these sections of the surface $z = -x^2 - y^2$ we need to find the curves of intersection which, in this case, are given by:

For the $(x, z)$-section, we have $y = 0$ and so the curve of intersection is given by $z = -x^2$ and this is a parabola in the $(x, z)$-plane.

For the $(y, z)$-section, we have $x = 0$ and so the curve of intersection is given by $z = -y^2$ and this is a parabola in the $(y, z)$-plane.

Figure 6.16: A sketch of (a) the $(x, z)$-section and (b) the $(y, z)$-section of the surface $z = -x^2 - y^2$ for Activity 6.4.

Solution to activity 6.5

To find these sections of the surface $z = x - y + 4$ we need to find the curves of intersection which, in this case, are given by:

For the $x = 0$ section, we have $x = 0$ and so the curve of intersection is given by $z = -y + 4$ and this is a straight line in the $(y, z)$-plane. Of course, this is just the $(y, z)$-section we found in Example 6.3!

For the $x = 2$ section, we have $x = 2$ and so the curve of intersection is given by $z = 2 - y + 4 = -y + 6$ and this is a straight line.

For the $x = 4$ section, we have $x = 4$ and so the curve of intersection is given by $z = 4 - y + 4 = -y + 8$ and this is a straight line.

Observe that only the first of these sections lives in the (y, z)-plane but, as illustrated

in Figure 6.17, we can also sketch the other two in this plane to get a feel for how the

surface is changing when we look at the sections x = c for different values of c.

Figure 6.17: The $x = 0$, $x = 2$ and $x = 4$ sections of the surface $z = x - y + 4$ for Activity 6.5.

Solution to activity 6.6

To find the $y = -2, 0, 2$ sections of the surface $z = 4x + 2y - 2$ we need to find the curves of intersection which, in this case, are given by:

For the $y = -2$ section, we have $y = -2$ and so the curve of intersection is given by $z = 4x - 4 - 2 = 4x - 6$ and this is a straight line.

For the $y = 0$ section, we have $y = 0$ and so the curve of intersection is given by $z = 4x - 2$ and this is a straight line in the $(x, z)$-plane. Of course, this is just the $(x, z)$-section we found in Activity 6.3 and it is the only one that lives in the $(x, z)$-plane!

For the $y = 2$ section, we have $y = 2$ and so the curve of intersection is given by $z = 4x + 4 - 2 = 4x + 2$ and this is a straight line.

To find the $x = -2, 0, 2$ sections of the surface $z = 4x + 2y - 2$ we need to find the curves of intersection which, in this case, are given by:

For the $x = -2$ section, we have $x = -2$ and so the curve of intersection is given by $z = -8 + 2y - 2 = 2y - 10$ and this is a straight line.

For the $x = 0$ section, we have $x = 0$ and so the curve of intersection is given by $z = 2y - 2$ and this is a straight line in the $(y, z)$-plane. Of course, this is just the $(y, z)$-section we found in Activity 6.3 and it is the only one that lives in the $(y, z)$-plane!

For the $x = 2$ section, we have $x = 2$ and so the curve of intersection is given by $z = 8 + 2y - 2 = 2y + 6$ and this is a straight line.

Figure 6.18: A sketch of (a) the $y = -2, 0, 2$ sections and (b) the $x = -2, 0, 2$ sections of the surface $z = 4x + 2y - 2$ for Activity 6.6.

Solution to activity 6.7

To find these sections of the surface $z = x^2 + y^2$ we need to find the curves of intersection which, in this case, are given by:

For the $y = 0$ section, we have $y = 0$ and so the curve of intersection is given by $z = x^2$ and this is a parabola in the $(x, z)$-plane. Of course, this is just the $(x, z)$-section we found in Example 6.4!

For the $y = 1$ section, we have $y = 1$ and so the curve of intersection is given by $z = x^2 + 1$ and this is a parabola.

For the $y = 2$ section, we have $y = 2$ and so the curve of intersection is given by $z = x^2 + 4$ and this is a parabola.

Observe that only the first of these sections lives in the (x, z)-plane but, as illustrated

in Figure 6.9, we can also sketch the other two in this plane to get a feel for how the

surface is changing when we look at the sections y = c for different values of c.

Figure 6.19: The $y = 0$, $y = 1$ and $y = 2$ sections of the surface $z = x^2 + y^2$ for Activity 6.7.

Solution to activity 6.8

To find the $y = 0, 1, 2$ sections of the surface $z = -x^2 - y^2$ we need to find the curves of intersection which, in this case, are given by:

For the $y = 0$ section, we have $y = 0$ and so the curve of intersection is given by $z = -x^2$ and this is a parabola in the $(x, z)$-plane. Of course, this is just the $(x, z)$-section we found in Activity 6.4 and it is the only one that lives in the $(x, z)$-plane!

For the $y = 1$ section, we have $y = 1$ and so the curve of intersection is given by $z = -x^2 - 1$ and this is a parabola.

For the $y = 2$ section, we have $y = 2$ and so the curve of intersection is given by $z = -x^2 - 4$ and this is a parabola.

To find the $x = 0, 1, 2$ sections of the surface $z = -x^2 - y^2$ we need to find the curves of intersection which, in this case, are given by:

For the $x = 0$ section, we have $x = 0$ and so the curve of intersection is given by $z = -y^2$ and this is a parabola in the $(y, z)$-plane. Of course, this is just the $(y, z)$-section we found in Activity 6.4 and it is the only one that lives in the $(y, z)$-plane!

For the $x = 1$ section, we have $x = 1$ and so the curve of intersection is given by $z = -1 - y^2$ and this is a parabola.

For the $x = 2$ section, we have $x = 2$ and so the curve of intersection is given by $z = -4 - y^2$ and this is a parabola.

Figure 6.20: A sketch of (a) the $y = 0, 1, 2$ sections and (b) the $x = 0, 1, 2$ sections of the surface $z = -x^2 - y^2$ for Activity 6.8.

Solution to activity 6.9

The partial derivative of $f(x, y)$ with respect to $y$, i.e. the result of differentiating $f(x, y)$ with respect to $y$ whilst holding $x$ constant, is going to be another function of $x$ and $y$. This function of $x$ and $y$ is what is denoted by the symbols in (6.2). What does this partial derivative mean? In effect, what we have done when we consider the function $f(x, y)$ for some fixed value of $x$, say $x_0$, is to look at the section of the surface $z = f(x, y)$ we get when $x = x_0$, i.e. the section given by the equation $z = f(x_0, y)$ which lies in the plane $x = x_0$ that is parallel to the $(y, z)$-plane. Then, when we differentiate $f(x_0, y)$ with respect to $y$, we are finding the gradient of this section, i.e. it tells us how $z = f(x_0, y)$ is varying with $y$. Consequently, this partial derivative is telling us something about the gradient of the surface when we are at the point $(x_0, y)$ and we are looking in the $y$-direction.

Solution to activity 6.10

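Given the function

$$f(x, y) = 2x + x^3 y - \frac{x}{y} + \frac{y^3}{2} = 2x + x^3 y - x y^{-1} + \frac{y^3}{2},$$

we find that

$$\frac{\partial f}{\partial x} = 2 + 3x^2 y - y^{-1} = 2 + 3x^2 y - \frac{1}{y},$$

and

$$\frac{\partial f}{\partial y} = x^3 + x y^{-2} + \frac{3}{2} y^2 = x^3 + \frac{x}{y^2} + \frac{3}{2} y^2.$$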

These are the sought after partial derivatives fx (x, y) and fy (x, y) respectively.

Solution to activity 6.11

Given the function

$$f(x, y) = \sqrt{x^2 + y^2} = (x^2 + y^2)^{1/2},$$

we hold $y$ constant and differentiate with respect to $x$ using the chain rule to get

$$\frac{\partial f}{\partial x} = \frac{1}{2}(x^2 + y^2)^{-1/2}(2x) = \frac{x}{\sqrt{x^2 + y^2}},$$

and we hold $x$ constant and differentiate with respect to $y$ using the chain rule to get

$$\frac{\partial f}{\partial y} = \frac{1}{2}(x^2 + y^2)^{-1/2}(2y) = \frac{y}{\sqrt{x^2 + y^2}}.$$

These are the sought after partial derivatives fx (x, y) and fy (x, y) respectively.
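As a quick sanity check on these formulae, the following minimal Python sketch (not part of the original guide) compares them with central-difference estimates. The helper name `central_difference`, the test point (3, 4) and the step size `h` are illustrative assumptions.

```python
import math


def f(x, y):
    """The function f(x, y) = sqrt(x^2 + y^2) from Activity 6.11."""
    return math.sqrt(x * x + y * y)


def f_x(x, y):
    """Partial derivative with respect to x, as derived in the solution."""
    return x / math.sqrt(x * x + y * y)


def f_y(x, y):
    """Partial derivative with respect to y, as derived in the solution."""
    return y / math.sqrt(x * x + y * y)


def central_difference(g, x, y, wrt, h=1e-6):
    """Estimate a partial derivative of g at (x, y) with a central difference."""
    if wrt == "x":
        return (g(x + h, y) - g(x - h, y)) / (2 * h)
    return (g(x, y + h) - g(x, y - h)) / (2 * h)


x0, y0 = 3.0, 4.0
print("f_x:", f_x(x0, y0), "estimate:", central_difference(f, x0, y0, "x"))
print("f_y:", f_y(x0, y0), "estimate:", central_difference(f, x0, y0, "y"))
```

The two columns should agree to several decimal places at any point other than the origin, where the partial derivatives are not defined.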

Solution to activity 6.12

Here $f(x, y) = x^2 y$, $x(t) = 2 + 3t$ and $y(t) = t^2 + 1$. In this case, if we again let $F(t) = f(x(t), y(t))$, the chain rule states that

$$\frac{dF}{dt} = \frac{\partial f}{\partial x}\frac{dx}{dt} + \frac{\partial f}{\partial y}\frac{dy}{dt}.$$

As such, using this, we can see that

$$\frac{dF}{dt} = (2xy)(3) + (x^2)(2t) = 2x(3y + xt),$$

and so, substituting our expressions for $x(t)$ and $y(t)$, we get

$$\frac{dF}{dt} = 2(2 + 3t)[3(t^2 + 1) + (2 + 3t)t] = 2(2 + 3t)(6t^2 + 2t + 3).$$

To check this, we note that

$$F(t) = f(x(t), y(t)) = (2 + 3t)^2 (t^2 + 1),$$

which, using the product and chain rules, gives us

$$\frac{dF}{dt} = [2(2 + 3t)(3)](t^2 + 1) + (2 + 3t)^2 (2t) = 2(2 + 3t)[3(t^2 + 1) + t(2 + 3t)],$$

and this agrees with our earlier answer.
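The same check can also be run numerically. The sketch below is an illustration only: it evaluates the chain-rule answer $2(2 + 3t)(6t^2 + 2t + 3)$ and compares it with a central-difference estimate of the derivative of $F(t)$. The function names and the sample values of $t$ are assumptions made for this example.

```python
def F(t):
    """F(t) = f(x(t), y(t)) with f = x^2 y, x = 2 + 3t and y = t^2 + 1."""
    x = 2 + 3 * t
    y = t * t + 1
    return x * x * y


def dF_dt(t):
    """The derivative obtained from the chain rule in the solution."""
    return 2 * (2 + 3 * t) * (6 * t * t + 2 * t + 3)


def numerical_derivative(g, t, h=1e-6):
    """Central-difference estimate of g'(t)."""
    return (g(t + h) - g(t - h)) / (2 * h)


for t in (0.0, 1.0, -0.5):
    print(t, dF_dt(t), numerical_derivative(F, t))
```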

Solution to activity 6.13

We have a function, $y(x)$, which is defined implicitly by the equation

$$x^2 + 2xy + 3y^3 = 6,$$

and we notice that, at the point $(x, y) = (1, 1)$ we have

$$(1)^2 + 2(1)(1) + 3(1)^3 = 6,$$

and so this point does indeed satisfy the equation. To find its derivative at this point we note that we have $g(x, y) = c$ where

$$g(x, y) = x^2 + 2xy + 3y^3 \quad \text{and} \quad c = 6,$$

and we use the fact that

$$\frac{dy}{dx} = -\frac{\partial g/\partial x}{\partial g/\partial y},$$

to get

$$\frac{dy}{dx} = -\frac{2x + 2y}{2x + 9y^2} = -\frac{2(x + y)}{2x + 9y^2},$$

so that, at the point $(1, 1)$, we have

$$\left.\frac{dy}{dx}\right|_{(1,1)} = -\frac{2(1 + 1)}{2 + 9} = -\frac{4}{11}.$$

Solution to activity 6.14

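We have $G(k, l) = g(x(k, l), y(k, l))$ and we want to explain why the chain rule formula for $G_l(k, l)$ works. To do this, consider that if we change $l$ by a small amount, $\Delta l$, whilst holding $k$ constant, the corresponding change in $G(k, l)$ is given by

$$\Delta G \simeq \frac{\partial G}{\partial l}\Delta l,$$

but here, there are two ways in which $G(k, l) = g(x(k, l), y(k, l))$ can change with $l$.

Firstly, $G$ can change with $l$ because $g$ changes with $x$ and $x$ changes with $l$; let's denote this change in $G$ by $\Delta_x G$. In this case, we have

$$\Delta_x G \simeq \frac{\partial g}{\partial x}\Delta x,$$

as we are holding $y$ constant to see how $g$ changes with $x$ and this means that

$$\Delta_x G \simeq \frac{\partial g}{\partial x}\frac{\partial x}{\partial l}\Delta l,$$

as $\Delta x \simeq x_l(k, l)\Delta l$.

Secondly, $G$ can change with $l$ because $g$ changes with $y$ and $y$ changes with $l$; let's denote this change in $G$ by $\Delta_y G$. In this case, we have

$$\Delta_y G \simeq \frac{\partial g}{\partial y}\Delta y,$$

as we are holding $x$ constant to see how $g$ changes with $y$ and this means that

$$\Delta_y G \simeq \frac{\partial g}{\partial y}\frac{\partial y}{\partial l}\Delta l,$$

as $\Delta y \simeq y_l(k, l)\Delta l$.

Thus, as the total change in $G$ due to these two changes is given by

$$\Delta G = \Delta_x G + \Delta_y G \simeq \frac{\partial g}{\partial x}\frac{\partial x}{\partial l}\Delta l + \frac{\partial g}{\partial y}\frac{\partial y}{\partial l}\Delta l,$$

we can now equate our two expressions for $\Delta G$ and divide through by $\Delta l$ to get the chain rule for $G_l(k, l)$ which we wanted.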

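Solution to activity 6.15

To see why the formula for $z_y(x, y)$ works, we consider that if we knew the function, $z(x, y)$, that satisfied the equation $g(x, y, z) = c$, we could find a new function, $G(x, y)$, of $x$ and $y$ only which is given by $G(x, y) = g(x, y, z(x, y))$. Then using the chain rule, we have

$$\frac{\partial G}{\partial y} = \frac{\partial g}{\partial y}\frac{dy}{dy} + \frac{\partial g}{\partial z}\frac{\partial z}{\partial y}.$$

But, $G(x, y) = c$ where $c$ is a constant and so we also have

$$\frac{\partial G}{\partial y} = 0 \quad \text{as well as} \quad \frac{dy}{dy} = 1,$$

which means that

$$0 = \frac{\partial g}{\partial y} + \frac{\partial g}{\partial z}\frac{\partial z}{\partial y}.$$

Rearranging this, we get

$$\frac{\partial z}{\partial y} = -\frac{\partial g/\partial y}{\partial g/\partial z},$$

as long as $g_z(x, y, z) \neq 0$.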

Solution to activity 6.16

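We have a function, $q(k, l)$, which is defined implicitly by the equation

$$q^3 k + k^3 l + q k^2 l = 3,$$

and we want to find its partial derivatives with respect to $k$ and $l$. To do this, we rewrite the equation as $g(q, k, l) = c$ so that we have, say,

$$g(q, k, l) = q^3 k + k^3 l + q k^2 l \quad \text{and} \quad c = 3,$$

and then we use the formulae

$$\frac{\partial q}{\partial k} = -\frac{\partial g/\partial k}{\partial g/\partial q} \quad \text{and} \quad \frac{\partial q}{\partial l} = -\frac{\partial g/\partial l}{\partial g/\partial q},$$

to get (see footnote 15)

$$\frac{\partial q}{\partial k} = -\frac{q^3 + 3k^2 l + 2qkl}{3q^2 k + k^2 l} \quad \text{and} \quad \frac{\partial q}{\partial l} = -\frac{k^3 + q k^2}{3q^2 k + k^2 l}.$$

Now, to evaluate these partial derivatives at the point where $(k, l) = (1, 1)$, we need to find the corresponding value of $q$. This can be done by noting that, when we have $k = 1$ and $l = 1$, the equation becomes

$$q^3 + q - 2 = 0,$$

and, using the hint, we see that this equation can be written as

$$(q - 1)(q^2 + q + 2) = 0.$$

Indeed, since

$$q^2 + q + 2 = \left(q + \frac{1}{2}\right)^2 + \frac{7}{4} > 0,$$

for all $q \in \mathbb{R}$, we see that $q = 1$ is the only solution to this equation. Thus, the point we are interested in has coordinates $(k, l, q) = (1, 1, 1)$ and, at this point, we have

$$\frac{\partial q}{\partial k} = -\frac{1 + 3 + 2}{3 + 1} = -\frac{6}{4} = -\frac{3}{2} \quad \text{and} \quad \frac{\partial q}{\partial l} = -\frac{1 + 1}{3 + 1} = -\frac{2}{4} = -\frac{1}{2}.$$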
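To see these values emerge numerically, the following Python sketch (an illustration under stated assumptions, not part of the guide) solves the implicit equation for $q$ by Newton's method and estimates the partial derivatives with central differences. The helper `solve_q`, its starting guess and the step size `h` are hypothetical choices.

```python
def g(q, k, l):
    """Left-hand side of the implicit equation g(q, k, l) = 3."""
    return q**3 * k + k**3 * l + q * k**2 * l


def solve_q(k, l, q=1.0):
    """Newton's method in q for g(q, k, l) = 3, starting from the guess q."""
    for _ in range(50):
        step = (g(q, k, l) - 3.0) / (3 * q**2 * k + k**2 * l)
        q -= step
        if abs(step) < 1e-14:
            break
    return q


def dq_dk(q, k, l):
    """Formula for the partial derivative of q with respect to k."""
    return -(q**3 + 3 * k**2 * l + 2 * q * k * l) / (3 * q**2 * k + k**2 * l)


def dq_dl(q, k, l):
    """Formula for the partial derivative of q with respect to l."""
    return -(k**3 + q * k**2) / (3 * q**2 * k + k**2 * l)


h = 1e-5
q0 = solve_q(1.0, 1.0)
num_k = (solve_q(1.0 + h, 1.0) - solve_q(1.0 - h, 1.0)) / (2 * h)
num_l = (solve_q(1.0, 1.0 + h) - solve_q(1.0, 1.0 - h)) / (2 * h)
print("q at (1, 1):", q0)
print("dq/dk:", dq_dk(q0, 1.0, 1.0), "estimate:", num_k)
print("dq/dl:", dq_dl(q0, 1.0, 1.0), "estimate:", num_l)
```

Newton's method is used here only because the cubic in $q$ has a single real root near the starting guess; any other root-finder would serve equally well.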

Solution to activity 6.17

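In Activity 6.10, we saw that the function

$$f(x, y) = 2x + x^3 y - \frac{x}{y} + \frac{y^3}{2} = 2x + x^3 y - x y^{-1} + \frac{y^3}{2},$$

has the first-order partial derivatives

$$\frac{\partial f}{\partial x} = 2 + 3x^2 y - y^{-1} \quad \text{and} \quad \frac{\partial f}{\partial y} = x^3 + x y^{-2} + \frac{3}{2} y^2.$$

So, the second-order partial derivatives are

$$f_{xx}(x, y) = 6xy, \qquad f_{xy}(x, y) = 3x^2 + y^{-2} = 3x^2 + \frac{1}{y^2},$$

$$f_{yx}(x, y) = 3x^2 + y^{-2} = 3x^2 + \frac{1}{y^2}, \qquad f_{yy}(x, y) = -2x y^{-3} + 3y = -\frac{2x}{y^3} + 3y.$$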

Footnote 15: Notice that, in particular, we can never have $k = 0$ here as this does not satisfy the equation $q^3 k + k^3 l + q k^2 l = 3$.

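Solution to activity 6.18

Given the function $f(x, y) = x^{3/4} y^{1/4}$, we partially differentiate with respect to $x$ and $y$ respectively to get

$$f_x(x, y) = \frac{3}{4} x^{-1/4} y^{1/4} \quad \text{and} \quad f_y(x, y) = \frac{1}{4} x^{3/4} y^{-3/4},$$

as the first-order partial derivatives. Then, for the second-order partial derivatives, we note that partially differentiating $f_x(x, y)$ with respect to $x$ and $y$ respectively, we get

$$f_{xx}(x, y) = -\frac{3}{16} x^{-5/4} y^{1/4} \quad \text{and} \quad f_{xy}(x, y) = \frac{3}{16} x^{-1/4} y^{-3/4},$$

whereas partially differentiating $f_y(x, y)$ with respect to $x$ and $y$ respectively, we get

$$f_{yx}(x, y) = \frac{3}{16} x^{-1/4} y^{-3/4} \quad \text{and} \quad f_{yy}(x, y) = -\frac{3}{16} x^{3/4} y^{-7/4}.$$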

Solution to activity 6.19

In Activity 6.11, we saw that the function

$$f(x, y) = \sqrt{x^2 + y^2} = (x^2 + y^2)^{1/2},$$

has the first-order partial derivatives

$$\frac{\partial f}{\partial x} = x(x^2 + y^2)^{-1/2} \quad \text{and} \quad \frac{\partial f}{\partial y} = y(x^2 + y^2)^{-1/2}.$$

So, partially differentiating $f_x(x, y)$ with respect to $x$ using the product and chain rules we get

$$f_{xx}(x, y) = (1)(x^2 + y^2)^{-1/2} + (x)\left(-\frac{1}{2}\right)(x^2 + y^2)^{-3/2}(2x) = \frac{(x^2 + y^2) - x^2}{(x^2 + y^2)^{3/2}} = \frac{y^2}{(x^2 + y^2)^{3/2}},$$

and partially differentiating $f_x(x, y)$ with respect to $y$ using the chain rule we get

$$f_{xy}(x, y) = x\left(-\frac{1}{2}\right)(x^2 + y^2)^{-3/2}(2y) = -\frac{xy}{(x^2 + y^2)^{3/2}}.$$

Similarly, partially differentiating $f_y(x, y)$ with respect to $x$ using the chain rule we get

$$f_{yx}(x, y) = y\left(-\frac{1}{2}\right)(x^2 + y^2)^{-3/2}(2x) = -\frac{xy}{(x^2 + y^2)^{3/2}},$$

and partially differentiating $f_y(x, y)$ with respect to $y$ using the product and chain rules we get

$$f_{yy}(x, y) = (1)(x^2 + y^2)^{-1/2} + (y)\left(-\frac{1}{2}\right)(x^2 + y^2)^{-3/2}(2y) = \frac{(x^2 + y^2) - y^2}{(x^2 + y^2)^{3/2}} = \frac{x^2}{(x^2 + y^2)^{3/2}}.$$

Notice that fxy = fyx as we should expect in this course.


Solution to activity 6.20

To find an approximation to $e^{1.1} \cos 0.2$ using the second-order Taylor series in Example 6.32, we have

$$e^{1.1} \cos 0.2 \simeq e + e(1.1 - 1) + \frac{1}{2!}\left[e(1.1 - 1)^2 - e(0.2)^2\right] = 1.085\,e,$$

and, using the value of $e$, we find that $e^{1.1} \cos 0.2 \simeq 2.949$ to 3dp. Indeed, as the point $(1.1, 0.2)$ is close to the point $(1, 0)$ we expect this to be a good approximation. Of course, the exact value of $e^{1.1} \cos 0.2$ is 2.944 to 3dp and so we can see that our approximation agrees with this to 1dp.
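The arithmetic here is easy to reproduce. The short Python sketch below is illustrative only; the function name `taylor2` is an assumption. It evaluates the second-order Taylor approximation around $(1, 0)$ and the exact value side by side.

```python
import math


def taylor2(x, y):
    """Second-order Taylor approximation of e^x cos y around (1, 0)."""
    e = math.e
    return e + e * (x - 1) + 0.5 * (e * (x - 1) ** 2 - e * y ** 2)


approx = taylor2(1.1, 0.2)
exact = math.exp(1.1) * math.cos(0.2)
print("approximation:", round(approx, 3))
print("exact value:  ", round(exact, 3))
print("absolute error:", abs(approx - exact))
```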

Solution to activity 6.21

As we saw in Example 3.31, the second-order Taylor series for $e^x$ around $x = 1$ is

$$e^x \simeq e + (x - 1)\,e + \frac{(x - 1)^2}{2!}\,e,$$

and as we saw in Section 3.4.1, the second-order Maclaurin series (i.e. the Taylor series around $y = 0$) of $\cos y$ is

$$\cos y \simeq 1 - \frac{y^2}{2!}.$$

This means that, around the point $(1, 0)$, we would have

$$e^x \cos y \simeq \left[e + (x - 1)\,e + \frac{(x - 1)^2}{2!}\,e\right]\left[1 - \frac{y^2}{2!}\right],$$

and, multiplying out the brackets and discarding terms which are more than second-order in $(x - 1)$ and $y$ since these are small around the point $(1, 0)$, we get

$$e^x \cos y \simeq e + (x - 1)\,e + \frac{(x - 1)^2}{2!}\,e - \frac{y^2}{2!}\,e.$$

Exercises

Exercise 6.1

Find the first and second-order partial derivatives of the function

$$f(x, y) = 2xy + x^{2a} y^a,$$

where $a$ is a constant.

If this function satisfies the equation

$$x^2 \frac{\partial^2 f}{\partial x^2} - 2y^2 \frac{\partial^2 f}{\partial y^2} - 18 f(x, y) + 36xy = 0,$$

find the possible values of $a$.

Exercise 6.2

For some numbers $\alpha$, $\beta$ and $\gamma$, a function, $f$, takes the form

$$f(x, y) = \frac{x^{2\alpha} + y^{\beta}}{x^2 + y^{\gamma}}.$$

If $f$ is homogeneous of degree four, find the values of $\alpha$, $\beta$ and $\gamma$. Having found these values, verify that the function satisfies Euler's theorem.

Exercise 6.3

Suppose that $R(q, p) = e^{q+p}$ and that $p$ is a positive function of $q$ defined implicitly by the equation

$$q^2 p + p^2 q + qp = 3.$$

Given that $r(q) = R(q, p(q))$, use the chain rule to find its derivative, $r'(q)$, when $q = 1$.

Exercise 6.4

A function $f : \mathbb{R}^2 \to \mathbb{R}$ is defined by

$$f(x, y) = x^2 - 2y^2,$$

and the point $P$ has coordinates $(1, -1)$.

(a) Find the direction and rate at which $f$ increases most rapidly at $P$.

(b) Find the rate of change of $f$ at $P$ in the direction $(1, 1)^T$.

(c) Verify that the point $P$ is on the curve

$$x^2 - 2y^2 = -1,$$

and find the Cartesian equation of the tangent line to this curve at this point.

Exercise 6.5

A function $f : \mathbb{R}^3 \to \mathbb{R}$ is defined by

$$f(x, y, z) = \ln(xy + z).$$

(a) Find the gradient of $f$ at the point $(a, b, c)$.

(b) Verify that the point $(1, 1, 0)$ is on the surface

$$\ln(xy + z) = 0,$$

and find the normal vector and the tangent plane to the surface at this point.

(c) Consider the points, $(x, y, z)$, at which the rate of increase of $f$ in the direction $(x/2, y/2, z)^T$ is equal to two. Show that all of these points lie on the surface with equation

$$x^2 + y^2 + 4z^2 = 1.$$

Solutions to exercises

Solution to exercise 6.1

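Given that $f(x, y) = 2xy + x^{2a} y^a$ where $a$ is a constant, its first and second-order partial derivatives are given by

$$\frac{\partial f}{\partial x} = 2y + 2a x^{2a-1} y^a \quad \text{and} \quad \frac{\partial f}{\partial y} = 2x + a x^{2a} y^{a-1},$$

$$\frac{\partial^2 f}{\partial x^2} = 2a(2a - 1) x^{2a-2} y^a \quad \text{and} \quad \frac{\partial^2 f}{\partial y^2} = a(a - 1) x^{2a} y^{a-2},$$

$$\frac{\partial^2 f}{\partial y \partial x} = 2 + 2a^2 x^{2a-1} y^{a-1} \quad \text{and} \quad \frac{\partial^2 f}{\partial x \partial y} = 2 + 2a^2 x^{2a-1} y^{a-1}.$$

Observe, in particular, that $f_{xy}(x, y) = f_{yx}(x, y)$ as we should expect in this course.

If this function satisfies the equation

$$x^2 \frac{\partial^2 f}{\partial x^2} - 2y^2 \frac{\partial^2 f}{\partial y^2} - 18 f(x, y) + 36xy = 0,$$

we must have

$$x^2\left[2a(2a - 1) x^{2a-2} y^a\right] - 2y^2\left[a(a - 1) x^{2a} y^{a-2}\right] - 18\left[2xy + x^{2a} y^a\right] + 36xy = 0,$$

which can be tidied up to give us

$$2a(2a - 1) x^{2a} y^a - 2a(a - 1) x^{2a} y^a - 36xy - 18 x^{2a} y^a + 36xy = 0,$$

and, after further simplification, we get

$$2(a^2 - 9) x^{2a} y^a = 0.$$

Consequently, as $x, y \in \mathbb{R}$, we must have $a^2 = 9$ which means that $a = \pm 3$ are the possible values of $a$ if $f$ has to satisfy the given equation.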

Solution to exercise 6.2

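For the function

$$f(x, y) = \frac{x^{2\alpha} + y^{\beta}}{x^2 + y^{\gamma}},$$

to be homogeneous of degree four for some numbers $\alpha$, $\beta$ and $\gamma$, we require that

$$f(\lambda x, \lambda y) = \frac{(\lambda x)^{2\alpha} + (\lambda y)^{\beta}}{(\lambda x)^2 + (\lambda y)^{\gamma}},$$

is equal to $\lambda^4 f(x, y)$. But, in order for this to happen, we must find that the

numerator is homogeneous, i.e. we have $2\alpha = \beta$ so that

$$(\lambda x)^{2\alpha} + (\lambda y)^{\beta} = (\lambda x)^{\beta} + (\lambda y)^{\beta} = \lambda^{\beta}(x^{\beta} + y^{\beta}),$$

giving us a numerator whose degree of homogeneity is $\beta = 2\alpha$.

denominator is homogeneous, i.e. we have $\gamma = 2$ so that

$$(\lambda x)^2 + (\lambda y)^{\gamma} = (\lambda x)^2 + (\lambda y)^2 = \lambda^2 (x^2 + y^2),$$

giving us a denominator whose degree of homogeneity is $\gamma = 2$.

overall degree of homogeneity is four, i.e. we must find that

$$\frac{(\lambda x)^{2\alpha} + (\lambda y)^{\beta}}{(\lambda x)^2 + (\lambda y)^{\gamma}} = \frac{\lambda^{\beta}(x^{\beta} + y^{\beta})}{\lambda^2 (x^2 + y^2)} = \lambda^{\beta - 2}\,\frac{x^{\beta} + y^{\beta}}{x^2 + y^2} = \lambda^{\beta - 2} f(x, y),$$

is equal to $\lambda^4 f(x, y)$, i.e. we have $\beta - 2 = 4$.

Consequently, we have $\beta = 6$, $\alpha = 3$ and $\gamma = 2$, and the sought after homogeneous function is

$$f(x, y) = \frac{x^6 + y^6}{x^2 + y^2}.$$

To verify that Euler's theorem holds for this function, we need to show that

$$x \frac{\partial f}{\partial x} + y \frac{\partial f}{\partial y} = 4 f(x, y).$$

Using the quotient rule, we have

$$\frac{\partial f}{\partial x} = \frac{(6x^5)(x^2 + y^2) - (x^6 + y^6)(2x)}{(x^2 + y^2)^2} \quad \text{and} \quad \frac{\partial f}{\partial y} = \frac{(6y^5)(x^2 + y^2) - (x^6 + y^6)(2y)}{(x^2 + y^2)^2},$$

and so we find that

$$x \frac{\partial f}{\partial x} + y \frac{\partial f}{\partial y} = x\,\frac{(6x^5)(x^2 + y^2) - (x^6 + y^6)(2x)}{(x^2 + y^2)^2} + y\,\frac{(6y^5)(x^2 + y^2) - (x^6 + y^6)(2y)}{(x^2 + y^2)^2}$$

$$= \frac{6(x^6 + y^6)(x^2 + y^2) - 2(x^6 + y^6)(x^2 + y^2)}{(x^2 + y^2)^2} = \frac{4(x^6 + y^6)}{x^2 + y^2} = 4 f(x, y),$$

as required.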
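For readers who like to confirm such identities by machine, here is a small Python sketch using exact rational arithmetic. It assumes the function found above, $f(x, y) = (x^6 + y^6)/(x^2 + y^2)$; the helper name `euler_gap` and the sample points are illustrative choices, not part of the guide.

```python
from fractions import Fraction


def f(x, y):
    """The homogeneous function (x^6 + y^6) / (x^2 + y^2)."""
    return (x**6 + y**6) / (x**2 + y**2)


def f_x(x, y):
    """Partial derivative with respect to x (quotient rule)."""
    return (6 * x**5 * (x**2 + y**2) - (x**6 + y**6) * 2 * x) / (x**2 + y**2) ** 2


def f_y(x, y):
    """Partial derivative with respect to y (quotient rule)."""
    return (6 * y**5 * (x**2 + y**2) - (x**6 + y**6) * 2 * y) / (x**2 + y**2) ** 2


def euler_gap(x, y):
    """x f_x + y f_y - 4 f, which Euler's theorem says should be zero."""
    return x * f_x(x, y) + y * f_y(x, y) - 4 * f(x, y)


# Fractions keep every step exact, so the gap is exactly zero rather than tiny.
x, y = Fraction(3, 2), Fraction(-5, 7)
print("Euler gap:", euler_gap(x, y))
print("f(2x, 2y) / f(x, y):", f(2 * x, 2 * y) / f(x, y))
```

The second line printed is the ratio $f(2x, 2y)/f(x, y)$, which should equal $2^4$ for a function that is homogeneous of degree four.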

Solution to exercise 6.3

Given that $r(q) = R(q, p(q))$, the chain rule tells us that

$$\frac{dr}{dq} = \frac{\partial R}{\partial q}\frac{dq}{dq} + \frac{\partial R}{\partial p}\frac{dp}{dq} = \frac{\partial R}{\partial q} + \frac{\partial R}{\partial p}\frac{dp}{dq},$$

and so, as $R(q, p) = e^{q+p}$, we have

$$\frac{dr}{dq} = e^{q+p} + e^{q+p}\frac{dp}{dq} = e^{q+p}\left(1 + \frac{dp}{dq}\right).$$

Now we need to calculate $p'(q)$ given that $p = p(q)$ is defined through the equation

$$q^2 p + p^2 q + qp = 3.$$

To do this, we let $G(q, p)$ be the function defined by

$$G(q, p) = q^2 p + p^2 q + qp,$$

so that the given equation is now $G(q, p) = 3$. With this, we then have

$$\frac{dp}{dq} = -\frac{\partial G/\partial q}{\partial G/\partial p},$$

where

$$\frac{\partial G}{\partial q} = 2qp + p^2 + p \quad \text{and} \quad \frac{\partial G}{\partial p} = q^2 + 2pq + q,$$

which gives us

$$\frac{dp}{dq} = -\frac{2qp + p^2 + p}{q^2 + 2pq + q}.$$

To take stock, so far, we have found that

$$\frac{dr}{dq} = e^{q+p}\left(1 + \frac{dp}{dq}\right) \quad \text{and} \quad \frac{dp}{dq} = -\frac{2qp + p^2 + p}{q^2 + 2pq + q},$$

and we need to evaluate this at the point where $q = 1$. In particular, we now need to find the value of $p$ that corresponds to $q = 1$ if $p = p(q)$ is the positive function of $q$ defined implicitly by the equation

$$q^2 p + p^2 q + qp = 3.$$

That is, if we set $q = 1$ in this equation we get

$$p + p^2 + p = 3 \implies p^2 + 2p - 3 = 0 \implies (p + 3)(p - 1) = 0,$$

i.e. the possible values of $p$ are $-3$ and $1$. But, we are told that $p$ is a positive function of $q$ and so we reject $p = -3$ and take the point where $q = 1$ and $p = 1$ to be the one we are interested in. Then, at this point, we find that

$$\frac{dp}{dq} = -\frac{2 + 1 + 1}{1 + 2 + 1} = -1,$$

and so

$$\frac{dr}{dq} = e^{1+1}(1 + [-1]) = 0.$$
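A numerical cross-check of this answer is sketched below in Python. It is an illustration only: it assumes that, for $q > 0$, the positive branch of $p(q)$ can be read off from the quadratic $q p^2 + (q^2 + q)p - 3 = 0$ obtained by rearranging the implicit equation. The names `p_of_q`, `r` and `dp_dq` are hypothetical.

```python
import math


def p_of_q(q):
    """Positive root in p of q p^2 + (q^2 + q) p - 3 = 0, valid for q > 0."""
    b = q * q + q
    return (-b + math.sqrt(b * b + 12 * q)) / (2 * q)


def r(q):
    """r(q) = R(q, p(q)) with R(q, p) = e^(q + p)."""
    return math.exp(q + p_of_q(q))


def dp_dq(q, p):
    """Implicit-differentiation formula for dp/dq from the solution."""
    return -(2 * q * p + p * p + p) / (q * q + 2 * p * q + q)


h = 1e-5
num_dp = (p_of_q(1 + h) - p_of_q(1 - h)) / (2 * h)
num_dr = (r(1 + h) - r(1 - h)) / (2 * h)
print("p(1):", p_of_q(1.0))
print("dp/dq at q = 1:", dp_dq(1.0, 1.0), "estimate:", num_dp)
print("dr/dq at q = 1 (estimate):", num_dr)
```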

Solution to exercise 6.4

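For (a), the function $f(x, y) = x^2 - 2y^2$ has a gradient vector given by

$$\nabla f = \begin{pmatrix} 2x \\ -4y \end{pmatrix},$$

and so, at the point $P$, we have

$$\nabla f(1, -1) = \begin{pmatrix} 2 \\ 4 \end{pmatrix},$$

and this is the direction in which $f$ is increasing most rapidly at $P$. We then find that

$$\|\nabla f(1, -1)\| = \sqrt{2^2 + 4^2} = \sqrt{20} = 2\sqrt{5},$$

is the rate of change of $f$ in this direction and so this is the rate at which $f$ increases most rapidly.

For (b), a unit vector in the direction $\mathbf{v} = (1, 1)^T$ is $\hat{\mathbf{v}} = (1/\sqrt{2}, 1/\sqrt{2})^T$ and so

$$f_{\hat{\mathbf{v}}}(1, -1) = \hat{\mathbf{v}} \cdot \nabla f(1, -1) = \frac{1}{\sqrt{2}}\begin{pmatrix} 1 \\ 1 \end{pmatrix} \cdot \begin{pmatrix} 2 \\ 4 \end{pmatrix} = \frac{1}{\sqrt{2}}(2 + 4) = \frac{6}{\sqrt{2}} = 3\sqrt{2},$$

is the rate of change of $f$ at $P$ in this direction.

For (c), the point $P$ is on the curve as $1^2 - 2(-1)^2 = 1 - 2 = -1$. To find the equation of the tangent line to the curve at this point, we use (6.6), to see that

$$\nabla f(1, -1) \cdot \begin{pmatrix} x - 1 \\ y + 1 \end{pmatrix} = 0 \implies \begin{pmatrix} 2 \\ 4 \end{pmatrix} \cdot \begin{pmatrix} x - 1 \\ y + 1 \end{pmatrix} = 0 \implies 2(x - 1) + 4(y + 1) = 0,$$

i.e. $x + 2y = -1$ is the Cartesian equation of the tangent line to the curve at the point $(1, -1)$.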

Solution to exercise 6.5

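For (a), given the function $f(x, y, z) = \ln(xy + z)$ we have

$$\nabla f(x, y, z) = \begin{pmatrix} f_x \\ f_y \\ f_z \end{pmatrix} = \begin{pmatrix} y/(xy + z) \\ x/(xy + z) \\ 1/(xy + z) \end{pmatrix} = \frac{1}{xy + z}\begin{pmatrix} y \\ x \\ 1 \end{pmatrix},$$

and so the gradient vector is

$$\nabla f(a, b, c) = \frac{1}{ab + c}\begin{pmatrix} b \\ a \\ 1 \end{pmatrix},$$

at the point $(a, b, c)$.

For (b), we see that the point $(1, 1, 0)$ is on the surface as $\ln([1][1] + 0) = \ln 1 = 0$ and the normal vector to the surface at this point is

$$\nabla f(1, 1, 0) = \frac{1}{(1)(1) + 0}\begin{pmatrix} 1 \\ 1 \\ 1 \end{pmatrix} = \begin{pmatrix} 1 \\ 1 \\ 1 \end{pmatrix}.$$

The tangent plane to the surface at this point is then given by

$$\nabla f(1, 1, 0) \cdot \begin{pmatrix} x - 1 \\ y - 1 \\ z - 0 \end{pmatrix} = 0 \implies \begin{pmatrix} 1 \\ 1 \\ 1 \end{pmatrix} \cdot \begin{pmatrix} x - 1 \\ y - 1 \\ z - 0 \end{pmatrix} = 0 \implies 1(x - 1) + 1(y - 1) + 1(z - 0) = 0,$$

i.e. $x + y + z = 2$ is the Cartesian equation of the tangent plane to the surface at the point $(1, 1, 0)$.

For (c), we note that at all points, $(x, y, z)$, we have

$$\nabla f(x, y, z) = \frac{1}{xy + z}\begin{pmatrix} y \\ x \\ 1 \end{pmatrix},$$

and the direction we are interested in is given by the vector

$$\mathbf{v} = \frac{1}{2}\begin{pmatrix} x \\ y \\ 2z \end{pmatrix},$$

so that a unit vector in this direction is

$$\hat{\mathbf{v}} = \frac{1}{\sqrt{x^2 + y^2 + 4z^2}}\begin{pmatrix} x \\ y \\ 2z \end{pmatrix}.$$

The rate of increase of $f$ in the direction of the unit vector $\hat{\mathbf{v}}$ at a point $(x, y, z)$ is then given by $f_{\hat{\mathbf{v}}}(x, y, z)$, i.e. we have

$$\hat{\mathbf{v}} \cdot \nabla f(x, y, z) = \frac{xy + xy + 2z}{(xy + z)\sqrt{x^2 + y^2 + 4z^2}} = \frac{2(xy + z)}{(xy + z)\sqrt{x^2 + y^2 + 4z^2}} = \frac{2}{\sqrt{x^2 + y^2 + 4z^2}},$$

where we have just found the dot product of the two vectors $\hat{\mathbf{v}}$ and $\nabla f(x, y, z)$. Consequently, when $f_{\hat{\mathbf{v}}}(x, y, z) = 2$, we have points $(x, y, z)$ that satisfy the equation,

$$2 = \frac{2}{\sqrt{x^2 + y^2 + 4z^2}} \implies x^2 + y^2 + 4z^2 = 1,$$

as required.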
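Part (c) can be illustrated with a few lines of Python. This is a sketch under stated assumptions: the sample point $(0.6, 0, 0.4)$ is chosen by hand so that $x^2 + y^2 + 4z^2 = 1$ and $xy + z > 0$ (needed for the logarithm to be defined), and the function names are hypothetical.

```python
import math


def grad_f(x, y, z):
    """Gradient of f(x, y, z) = ln(xy + z), valid where xy + z > 0."""
    s = x * y + z
    return (y / s, x / s, 1 / s)


def rate_of_increase(x, y, z):
    """Rate of increase of f at (x, y, z) in the direction (x/2, y/2, z)."""
    v = (x / 2, y / 2, z)
    norm = math.sqrt(sum(c * c for c in v))
    unit = tuple(c / norm for c in v)
    return sum(u * g for u, g in zip(unit, grad_f(x, y, z)))


# A point on the surface x^2 + y^2 + 4z^2 = 1: the rate should be two.
print(rate_of_increase(0.6, 0.0, 0.4))
# A point off that surface: the rate is 2 / sqrt(x^2 + y^2 + 4z^2) instead.
print(rate_of_increase(1.0, 1.0, 0.0))
```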


Chapter 7

Two-variable optimisation

Essential reading

(For full publication details, see Chapter 1.)

Binmore and Davies (2002) Sections 4.6, 4.7, 6.3–6.8.

Anthony and Biggs (1996) Chapter 13, parts of Chapters 14 and 21.

Further reading

Simon and Blume (1994) parts of Chapters 17, 18 and 19.

Adams and Essex (2010) parts of Sections 13.1–13.3.

The objectives of this chapter are as follows.

To use partial derivatives to solve problems where a function needs to be optimised.

To solve problems where a function needs to be optimised subject to a constraint.

Specific learning outcomes can be found near the end of this chapter.

7.1 Introduction

Having seen how to find partial derivatives and gained some insight into what they tell

us about a function of two variables in the last chapter, we now see how they can be

used to optimise such a function. In particular, we will see how the first-order partial

derivatives allow us to find the stationary points of a function and its second-order

partial derivatives allow us to see whether such a point is a maximum or a minimum. We

will also see how to optimise a function of two variables in cases where the variables are

constrained, i.e. they are required to satisfy some extra condition known as a constraint.

7.2 Unconstrained optimisation

We start by considering unconstrained optimisation, i.e. we are looking for the places

where a function of two variables, f (x, y), attains its maximum or minimum values

when $x$ and $y$ are independent and $(x, y)$ is free to take any value in $\mathbb{R}^2$.


7.2.1 Stationary points

Suppose we have a surface $z = f(x, y)$ whose tangent plane at the point $(a, b, c)$ where $c = f(a, b)$ is given by (6.4), i.e.

$$z - c = f_x(a, b)(x - a) + f_y(a, b)(y - b).$$

We define a stationary point of this function to be any point where the tangent plane to the function is horizontal and so, in this case, the tangent plane would have to be $z = c$. But, if this is the case, it means that we must have

$$f_x(a, b)(x - a) + f_y(a, b)(y - b) = 0,$$

for all $x, y \in \mathbb{R}$ which, in turn, means that we must have

$$f_x(a, b) = 0 \quad \text{and} \quad f_y(a, b) = 0.$$

Thus, we find that the point $(x, y) = (a, b)$ is a stationary point of the function $f(x, y)$ if both first-order partial derivatives of the function are zero at that point. Consequently, in order to find the stationary points of a function, $f(x, y)$, we must find all points $(x, y)$ that satisfy the equations

$$f_x(x, y) = 0 \quad \text{and} \quad f_y(x, y) = 0,$$

simultaneously.

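Example 7.1 Find the stationary points of the function

$$f(x, y) = x^4 + 2x^2 y + 2y^2 + y.$$

The first-order partial derivatives of this function are

$$f_x(x, y) = 4x^3 + 4xy \quad \text{and} \quad f_y(x, y) = 2x^2 + 4y + 1.$$

At a stationary point, both of the first-order partial derivatives are zero, i.e. we must have $f_x(x, y) = 0$ and $f_y(x, y) = 0$. Thus, to find the stationary points we have to solve the simultaneous equations

$$4x^3 + 4xy = 0 \quad \text{and} \quad 2x^2 + 4y + 1 = 0.$$

The first equation gives us

$$4x^3 + 4xy = 0 \implies 4x(x^2 + y) = 0 \implies x = 0 \text{ or } y = -x^2.$$

So, using the second equation, we see that if

$x = 0$ we must have

$$2(0)^2 + 4y + 1 = 0 \implies y = -\frac{1}{4},$$

i.e. $(0, -1/4)$ is a stationary point.

$y = -x^2$ we must have

$$2x^2 + 4(-x^2) + 1 = 0 \implies 2x^2 = 1 \implies x^2 = \frac{1}{2} \implies x = \pm\frac{1}{\sqrt{2}},$$

and, in both cases, $y = -x^2 = -\frac{1}{2}$, i.e. $(1/\sqrt{2}, -1/2)$ and $(-1/\sqrt{2}, -1/2)$ are stationary points.

Thus, this function has three stationary points, namely the points

$$\left(0, -\frac{1}{4}\right), \quad \left(\frac{1}{\sqrt{2}}, -\frac{1}{2}\right) \quad \text{and} \quad \left(-\frac{1}{\sqrt{2}}, -\frac{1}{2}\right).$$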
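A direct way to confirm stationary points is to substitute them back into both first-order partial derivatives. The Python sketch below does this for the three points $(0, -1/4)$, $(1/\sqrt{2}, -1/2)$ and $(-1/\sqrt{2}, -1/2)$ of Example 7.1; it is an illustration only and the variable names are assumptions.

```python
import math


def f_x(x, y):
    """First-order partial derivative of f with respect to x."""
    return 4 * x**3 + 4 * x * y


def f_y(x, y):
    """First-order partial derivative of f with respect to y."""
    return 2 * x**2 + 4 * y + 1


root = 1 / math.sqrt(2)
candidates = [(0.0, -0.25), (root, -0.5), (-root, -0.5)]

# Both partial derivatives should vanish (up to rounding) at each candidate.
for x, y in candidates:
    print((x, y), f_x(x, y), f_y(x, y))
```

Because $1/\sqrt{2}$ is stored as a floating-point number, the printed values may be tiny non-zero numbers rather than exact zeros.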

Example 7.2 Find the stationary points of the function

$$f(x, y) = 4x^3 - 60xy + 5y^2 + 400y - 35.$$

The first-order partial derivatives of this function are

$$f_x(x, y) = 12x^2 - 60y \quad \text{and} \quad f_y(x, y) = -60x + 10y + 400.$$

At a stationary point, both of the first-order partial derivatives are zero, i.e. we must have $f_x(x, y) = 0$ and $f_y(x, y) = 0$. Thus, to find the stationary points we have to solve the simultaneous equations

$$12x^2 - 60y = 0 \quad \text{and} \quad -60x + 10y + 400 = 0.$$

We can simplify these to get

$$x^2 - 5y = 0 \quad \text{and} \quad -6x + y + 40 = 0,$$

and then notice that the first equation gives us $y = x^2/5$. Substituting this into the second equation then allows us to see that

$$-6x + \frac{x^2}{5} + 40 = 0 \implies x^2 - 30x + 200 = 0 \implies (x - 20)(x - 10) = 0 \implies x = 10 \text{ or } x = 20,$$

and these values of $x$ give us

$$y = \frac{10^2}{5} = 20 \quad \text{or} \quad y = \frac{20^2}{5} = 80,$$

respectively. Thus, this function has two stationary points, namely the points $(10, 20)$ and $(20, 80)$.

Activity 7.1 Find the stationary points of the function

$$f(x, y) = x^2 - 4x + y^2 + 4y + 8.$$

Activity 7.2 Find the stationary points of the function

$$f(x, y) = 3x^3 + 9x^2 - 72x + 2y^3 - 12y^2 - 126y + 19.$$

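In particular, notice that at a stationary point, i.e. at a point, $(a, b)$, where

$$f_x(a, b) = 0 \quad \text{and} \quad f_y(a, b) = 0,$$

the gradient vector is

$$\nabla f(a, b) = \begin{pmatrix} f_x(a, b) \\ f_y(a, b) \end{pmatrix} = \begin{pmatrix} 0 \\ 0 \end{pmatrix} = \mathbf{0}.$$

That is, if we are at a stationary point, we can see that the rate of change of $f$ in any direction given by the unit vector $\hat{\mathbf{u}}$ is zero as

$$f_{\hat{\mathbf{u}}}(a, b) = \hat{\mathbf{u}} \cdot \nabla f(a, b) = \hat{\mathbf{u}} \cdot \mathbf{0} = 0,$$

which means that at a stationary point, the rate of change of $f$ is zero in all directions.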

We have now seen how to find the stationary points of a function, f (x, y), but what do

they look like? Generally speaking, we will find that there are three kinds of stationary

point namely local minima, saddle points and local maxima and these are

illustrated in Figure 7.1(a), (b) and (c) respectively. We now consider what criteria we

can use to determine exactly what kind of stationary point we have found.


Figure 7.1: Each of these surfaces has the indicated kind of stationary point at (0, 0, 0).

7.2.2

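Let's say that we have found that $(a, b)$ is a stationary point of the function, $f(x, y)$. This means that

$$f_x(a, b) = 0 \quad \text{and} \quad f_y(a, b) = 0,$$

and so, in particular, the derivative of $f$ at this point is given by

$$\left.\frac{df}{d\mathbf{x}}\right|_{(a,b)} = \begin{pmatrix} f_x(a, b) & f_y(a, b) \end{pmatrix} = \begin{pmatrix} 0 & 0 \end{pmatrix}.$$

However, we saw in Section 6.4.5, that the second-order Taylor series of the function $f(x, y)$ around the point $(a, b)$ is given by

$$f(x, y) = f(a, b) + \left.\frac{df}{d\mathbf{x}}\right|_{(a,b)}\begin{pmatrix} x - a \\ y - b \end{pmatrix} + \frac{1}{2!}\begin{pmatrix} x - a & y - b \end{pmatrix}\left.\frac{d^2 f}{d\mathbf{x}^2}\right|_{(a,b)}\begin{pmatrix} x - a \\ y - b \end{pmatrix} + \cdots,$$

and so, at a stationary point, we have

$$f(x, y) - f(a, b) = \frac{1}{2!}\begin{pmatrix} x - a & y - b \end{pmatrix}\left.\frac{d^2 f}{d\mathbf{x}^2}\right|_{(a,b)}\begin{pmatrix} x - a \\ y - b \end{pmatrix} + \cdots,$$

provided that the point $(x, y)$ is sufficiently close to the point $(a, b)$. Consequently, if we let $K(x, y)$ be the quantity

$$\begin{pmatrix} x - a & y - b \end{pmatrix}\left.\frac{d^2 f}{d\mathbf{x}^2}\right|_{(a,b)}\begin{pmatrix} x - a \\ y - b \end{pmatrix},$$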

we can see that: If, for all (x, y) close to [but not equal to] (a, b), we have:

K(x, y) > 0, then f (x, y) > f (a, b) for such points and so the function always lies

above the horizontal tangent plane at (a, b). This means that the stationary point

is a local minimum as in Figure 7.1(a).

K(x, y) < 0, then f (x, y) < f (a, b) for such points and so the function always lies

below the horizontal tangent plane at (a, b). This means that the stationary point

is a local maximum as in Figure 7.1(c).

However, if we find that there are some points (x, y) close to [but not equal to] (a, b)

that make K(x, y) > 0 and others that make K(x, y) < 0, we see that at some points we

have f (x, y) > f (a, b) and so the function lies above the horizontal tangent plane and at

other points we have f (x, y) < f (a, b) and so the function lies below the horizontal

tangent plane. Indeed, as we saw in Figure 7.1(b), this is exactly what happens when

we have a saddle point.

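Now, it turns out that (see footnote 1), if we use the definition of the second derivative matrix, we have

$$K(x, y) = \begin{pmatrix} x - a & y - b \end{pmatrix}\begin{pmatrix} f_{xx}(a, b) & f_{xy}(a, b) \\ f_{yx}(a, b) & f_{yy}(a, b) \end{pmatrix}\begin{pmatrix} x - a \\ y - b \end{pmatrix},$$

which, on multiplying out, gives us

$$K(x, y) = (x - a)^2 f_{xx}(a, b) + 2(x - a)(y - b) f_{xy}(a, b) + (y - b)^2 f_{yy}(a, b),$$

if we assume, as usual, that $f_{xy}(a, b) = f_{yx}(a, b)$.

Footnote 1: This is most easily done if we show that the second derivative of $f(x, y)$ at the point $(a, b)$, i.e. the matrix

$$\left.\frac{d^2 f}{d\mathbf{x}^2}\right|_{(a,b)} = \begin{pmatrix} f_{xx}(a, b) & f_{xy}(a, b) \\ f_{yx}(a, b) & f_{yy}(a, b) \end{pmatrix},$$

is positive definite or negative definite as in Binmore and Davies (2002) Section 6.3. But you won't encounter these concepts until you study 175 Further Linear Algebra and so we merely motivate the result that follows here.

Then, taking out a factor of $f_{xx}(a, b)$ and completing the square, we get (see footnote 2)

$$K(x, y) = f_{xx}(a, b)\left\{\left[(x - a) + \frac{f_{xy}(a, b)}{f_{xx}(a, b)}(y - b)\right]^2 + \frac{f_{xx}(a, b) f_{yy}(a, b) - [f_{xy}(a, b)]^2}{[f_{xx}(a, b)]^2}(y - b)^2\right\}.$$

We now define the Hessian of $f$ to be the function

$$H(x, y) = f_{xx}(x, y) f_{yy}(x, y) - [f_{xy}(x, y)]^2,$$

so that, finally, we have

$$K(x, y) = f_{xx}(a, b)\left\{\left[(x - a) + \frac{f_{xy}(a, b)}{f_{xx}(a, b)}(y - b)\right]^2 + \frac{H(a, b)}{[f_{xx}(a, b)]^2}(y - b)^2\right\}.$$

Since the first term in the braces is a square, and so is never negative, the sign of $K(x, y)$ is governed by the signs of $f_{xx}(a, b)$ and $H(a, b)$. In particular: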

If H(a, b) > 0 and fxx (a, b) > 0, then K(x, y) > 0 for all (x, y) close to [but not

equal to] (a, b) and so, as we saw above, this means that the stationary point (a, b)

is a local minimum.

If H(a, b) > 0 and fxx (a, b) < 0, then K(x, y) < 0 for all (x, y) close to [but not

equal to] (a, b) and so, as we saw above, this means that the stationary point (a, b)

is a local maximum.

Indeed, if we find that H(a, b) < 0, we can see that there will be some points (x, y) close

to [but not equal to] (a, b) that make K(x, y) > 0 and others that make K(x, y) < 0. In

this case, as we saw above, this means that the stationary point (a, b) is a saddle point.

In summary, we have now motivated the following method for classifying our stationary

points:

If (a, b) is a stationary point of the function, f (x, y), and the Hessian is defined

to be the function

$$H(x, y) = f_{xx}(x, y) f_{yy}(x, y) - [f_{xy}(x, y)]^2,$$

then

If H(a, b) > 0 and fxx (a, b) > 0, then this stationary point is a local

minimum.

If H(a, b) > 0 and fxx (a, b) < 0, then this stationary point is a local

maximum.

If H(a, b) < 0, then this stationary point is a saddle point.

In particular, if H(a, b) = 0, we can draw no conclusions about the nature of

the stationary point by using this method.
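The method in the box translates directly into a short routine. The Python sketch below is an illustration, not a definitive implementation: the function name `classify` and the returned strings are assumptions, and the sample values are the second-order partial derivatives $f_{xx} = 24x$, $f_{yy} = 10$ and $f_{xy} = -60$ of the function in Example 7.2, evaluated at its two stationary points.

```python
def classify(fxx, fyy, fxy):
    """Classify a stationary point from its second-order partial derivatives.

    The arguments are the values of f_xx, f_yy and f_xy at the stationary point.
    """
    hessian = fxx * fyy - fxy**2
    if hessian > 0 and fxx > 0:
        return "local minimum"
    if hessian > 0 and fxx < 0:
        return "local maximum"
    if hessian < 0:
        return "saddle point"
    # hessian == 0: the method gives no conclusion.
    return "inconclusive"


# Stationary points (10, 20) and (20, 80) of the function in Example 7.2.
for x, y in [(10, 20), (20, 80)]:
    print((x, y), classify(fxx=24 * x, fyy=10, fxy=-60))
```

Note that the case $H(a, b) > 0$ with $f_{xx}(a, b) = 0$ cannot occur, since $H(a, b) > 0$ forces $f_{xx}(a, b) f_{yy}(a, b) > 0$.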

Let's look at some examples of how this works in practice.

Footnote 2: Technically, we have assumed that $f_{xx}(a, b) \neq 0$ here, but if this was not the case we could present a slightly different argument to deal with this problem. However, as we are just trying to motivate what follows instead of providing a rigorous argument for it, we will skip these technicalities here.

Example 7.3

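Using the first-order partial derivatives we found in Example 7.1, we find that the second-order partial derivatives are

$$f_{xx}(x, y) = 12x^2 + 4y, \quad f_{xy}(x, y) = 4x = f_{yx}(x, y) \quad \text{and} \quad f_{yy}(x, y) = 4,$$

so that the Hessian is

$$H(x, y) = (12x^2 + 4y)(4) - (4x)^2 = 48x^2 + 16y - 16x^2 = 16(2x^2 + y).$$

Evaluating this at each of the stationary points we then find that:

At $(0, -1/4)$, the Hessian is

$$H(0, -1/4) = 16(-1/4) < 0,$$

and so this is a saddle point.

At $(1/\sqrt{2}, -1/2)$, the Hessian is

$$H(1/\sqrt{2}, -1/2) = 16(1 - 1/2) > 0 \quad \text{and} \quad f_{xx}(1/\sqrt{2}, -1/2) = 6 - 2 > 0,$$

so this is a local minimum.

At $(-1/\sqrt{2}, -1/2)$, the Hessian is

$$H(-1/\sqrt{2}, -1/2) = 16(1 - 1/2) > 0 \quad \text{and} \quad f_{xx}(-1/\sqrt{2}, -1/2) = 6 - 2 > 0,$$

so this is a local minimum.

Thus, the stationary points we found in Example 7.1, i.e.

$$\left(0, -\frac{1}{4}\right), \quad \left(\frac{1}{\sqrt{2}}, -\frac{1}{2}\right) \quad \text{and} \quad \left(-\frac{1}{\sqrt{2}}, -\frac{1}{2}\right),$$

are a saddle point, a local minimum and a local minimum respectively.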

Example 7.4

Using the first-order partial derivatives we found in Example 7.2, we find that the second-order partial derivatives are

$$f_{xx}(x, y) = 24x, \quad f_{xy}(x, y) = -60 = f_{yx}(x, y) \quad \text{and} \quad f_{yy}(x, y) = 10,$$

so that the Hessian is

$$H(x, y) = (24x)(10) - (-60)^2 = 240x - 3600 = 240(x - 15).$$

Evaluating this at each of the stationary points we then find that:

At $(10, 20)$, the Hessian is

$$H(10, 20) = 240(-5) < 0,$$

and so this is a saddle point.

At $(20, 80)$, the Hessian is

$$H(20, 80) = 240(5) > 0 \quad \text{and} \quad f_{xx}(20, 80) = 480 > 0,$$

so this is a local minimum.

Thus, the stationary points $(10, 20)$ and $(20, 80)$ are a saddle point and a local minimum respectively.

Activity 7.3

Activity 7.4

Lastly, we have remarked above that in cases where the Hessian is zero at a stationary

point, the method that we have used so far fails. Indeed, in such cases, the stationary

point could be a local minimum, a local maximum or a saddle point and, to determine

which, we would have to think more carefully about what is happening. Let's consider

an example of a function where this kind of problem occurs.

Example 7.5 Find the stationary point of the function $f(x, y) = x^3 - y^3$ and show that we can't determine its nature using the method above. What kind of stationary point do we have here?

The first-order partial derivatives of this function are

$$f_x(x, y) = 3x^2 \quad \text{and} \quad f_y(x, y) = -3y^2.$$

So, clearly, the only stationary point is at $(0, 0)$. The second-order partial derivatives of this function are given by

$$f_{xx}(x, y) = 6x, \quad f_{xy}(x, y) = 0 = f_{yx}(x, y) \quad \text{and} \quad f_{yy}(x, y) = -6y,$$

so that the Hessian is

$$H(x, y) = (6x)(-6y) - 0^2 = -36xy.$$

Indeed, evaluating this at the stationary point gives $H(0, 0) = 0$ and so the method we used above fails.

However, if we consider the surface $z = f(x, y)$, notice that the $y = 0$ section of our function gives $z = f(x, 0) = x^3$. As such, if we look at this section around the stationary point $(0, 0)$ where $z = f(0, 0) = 0$, we can see that

if $x > 0$, we have $f(x, 0) > f(0, 0)$ and so this stationary point can't be a local maximum, whereas

if $x < 0$, we have $f(x, 0) < f(0, 0)$ and so this stationary point can't be a local minimum.

Similarly, looking at the $x = 0$ section, $z = f(0, y) = -y^3$, leads us to a similar conclusion. In fact, looking at the sections, we can see that this is a kind of saddle point, albeit one which looks different to the one that we saw before in Figure 7.1(b), and it is illustrated in Figure 7.2.

Figure 7.2: Some useful pictures for Example 7.5. (a) The $y = 0$ section, $z = f(x, 0) = x^3$. (b) The $x = 0$ section. (c) The surface $z = f(x, y)$, which has a different kind of saddle point at (0, 0, 0).

and show that we can't determine its nature using the method above. What kind of stationary point do we have here?

7.2.3

concave. In particular, we will see that, if a function, $f(x, y)$, is convex (or concave) for all $(x, y) \in \mathbb{R}^2$, then a local minimum (or local maximum) is actually a global minimum (or global maximum), i.e. we can find the smallest (or largest) value that the function can attain.

To see how this works, consider that, in the case of a function of one variable, f (x), we

saw in Section 4.3.2 that

$f(x)$ is convex on $\mathbb{R}$ if it lies above all of its tangent lines, and

$f(x)$ is concave on $\mathbb{R}$ if it lies below all of its tangent lines.

So, analogously, we say that a function of two variables, $f(x, y)$, is

convex on $\mathbb{R}^2$ if it lies above all of its tangent planes, and

concave on $\mathbb{R}^2$ if it lies below all of its tangent planes.

As an example of what this means, it should be clear from what we can see of the surfaces illustrated in Figure 7.1, that in:

(a) where we have a local minimum, the function is convex because it lies above all of its tangent planes.

(b) where we have a saddle point, the function is neither convex nor concave as, considering the horizontal tangent plane at (0, 0, 0), some of the function lies above this tangent plane and the rest of it lies below this tangent plane.

(c) where we have a local maximum, the function is concave because it lies below all of its tangent planes.

We now want to develop a way of determining whether a function is convex or concave on $\mathbb{R}^2$.

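Suppose that we have a function $f(x, y)$ that is convex. As we saw in Section 6.4.1, at any point $(a, b)$, the tangent plane to this function has a Cartesian equation given by

$$z = f(a, b) + \left.\frac{\mathrm{d}f}{\mathrm{d}\mathbf{x}}\right|_{(a,b)} \begin{pmatrix} x - a \\ y - b \end{pmatrix},$$

and, as this function is convex, it must be the case that for all $(x, y) \in \mathbb{R}^2$, the function lies above this tangent plane, i.e. we must have

$$f(x, y) \geq f(a, b) + \left.\frac{\mathrm{d}f}{\mathrm{d}\mathbf{x}}\right|_{(a,b)} \begin{pmatrix} x - a \\ y - b \end{pmatrix}.$$

However, using the second-order Taylor series for $f(x, y)$ around the point $(a, b)$, this means that we have

$$f(a, b) + \left.\frac{\mathrm{d}f}{\mathrm{d}\mathbf{x}}\right|_{(a,b)} \begin{pmatrix} x - a \\ y - b \end{pmatrix} + \frac{1}{2!} \begin{pmatrix} x - a, & y - b \end{pmatrix} \left.\frac{\mathrm{d}^2 f}{\mathrm{d}\mathbf{x}^2}\right|_{(a,b)} \begin{pmatrix} x - a \\ y - b \end{pmatrix} \geq f(a, b) + \left.\frac{\mathrm{d}f}{\mathrm{d}\mathbf{x}}\right|_{(a,b)} \begin{pmatrix} x - a \\ y - b \end{pmatrix},$$

which simplifies to

$$\begin{pmatrix} x - a, & y - b \end{pmatrix} \left.\frac{\mathrm{d}^2 f}{\mathrm{d}\mathbf{x}^2}\right|_{(a,b)} \begin{pmatrix} x - a \\ y - b \end{pmatrix} \geq 0,$$

and this just asserts that $K(x, y) \geq 0$ using our notation from Section 7.2.2. However, using what we saw before, this means that we require

$$H(x, y) \geq 0 \quad\text{and}\quad f_{xx}(x, y) \geq 0.³$$

Activity 7.6 Using an argument similar to the one above, explain why a concave function requires that $H(x, y) \geq 0$ and $f_{xx}(x, y) \leq 0$.

The upshot of this is that we can now see that a function, $f(x, y)$, is

convex on $\mathbb{R}^2$ if, for all $(x, y) \in \mathbb{R}^2$, $H(x, y) \geq 0$ and $f_{xx}(x, y) \geq 0$, and

concave on $\mathbb{R}^2$ if, for all $(x, y) \in \mathbb{R}^2$, $H(x, y) \geq 0$ and $f_{xx}(x, y) \leq 0$.

³ Again, we have glossed over any complications in our derivation that would occur if $f_{xx}(x, y) = 0$ for some point, $(x, y)$.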

Note, in particular, that when testing for convexity or concavity, we can have

$H(x, y) = 0$ even though we must have $H(x, y) \neq 0$ when we are classifying stationary

points using the method of the previous section. But, it should be clear that if a

function, f (x, y), has a stationary point and it is

convex, then that stationary point is a global minimum.

concave, then that stationary point is a global maximum.

That is, we now have a way of determining whether a local minimum (or a local

maximum) is a global minimum (or a global maximum).

Example 7.6 Show that the function $f(x, y) = x^2 + y^2$ has a global minimum at the point (0, 0, 0).

The first-order partial derivatives of this function are

$$f_x(x, y) = 2x \quad\text{and}\quad f_y(x, y) = 2y.$$

At a stationary point, we must have $f_x(x, y) = 0$ and $f_y(x, y) = 0$, i.e. we must have $x = 0$ and $y = 0$. Indeed, as $z = f(0, 0) = 0$, this means that we have a stationary point at (0, 0, 0).

The second-order partial derivatives of this function are

$$f_{xx}(x, y) = 2, \quad f_{xy}(x, y) = 0 \quad\text{and}\quad f_{yy}(x, y) = 2,$$

so that the Hessian is

$$H(x, y) = (2)(2) - 0^2 = 4.$$

So, at the stationary point, we have $H(0, 0) = 4 > 0$ and $f_{xx}(0, 0) = 2 > 0$ which means that this is a local minimum. But, in fact, we have $H(x, y) = 4 \geq 0$ and $f_{xx}(x, y) = 2 \geq 0$ for all $(x, y) \in \mathbb{R}^2$ here and so this function is actually convex on $\mathbb{R}^2$, i.e. the local minimum we have found here is actually a global minimum.

In particular, notice that this should have been obvious since we have $z = f(0, 0) = 0$ at the stationary point and for all other $x, y \in \mathbb{R}$, we have

$$z = f(x, y) = x^2 + y^2 > 0,$$

i.e. $f(x, y) \geq f(0, 0)$ for all $x, y \in \mathbb{R}$. Consequently, it should be clear that this function has a global minimum at (0, 0) and this minimum value is zero.

Lastly, we note that these conditions can also be used to determine the regions in the

(x, y)-plane where a function is convex, concave or neither as the next example shows.


Example 7.7 Determine the regions in the (x, y)-plane where the function, $f(x, y) = x^2 - y^3$, is convex, concave or neither.

The first-order partial derivatives of this function are

$$f_x(x, y) = 2x \quad\text{and}\quad f_y(x, y) = -3y^2,$$

and so the second-order partial derivatives are

$$f_{xx}(x, y) = 2, \quad f_{xy}(x, y) = 0 \quad\text{and}\quad f_{yy}(x, y) = -6y,$$

which gives the Hessian

$$H(x, y) = (2)(-6y) - 0^2 = -12y.$$

As such, we see that:

When $y > 0$, $H(x, y) < 0$ and so the function is neither convex nor concave.

When $y \leq 0$, $H(x, y) \geq 0$ and $f_{xx}(x, y) = 2 \geq 0$ and so the function is convex.

In particular, observe that this function has a stationary point at (0, 0, 0) and that, even though our method for classifying this point fails here (as $H(0, 0) = 0$), it is clearly a saddle point.
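The sign conditions for convexity and concavity are easy to check mechanically at individual points. The sketch below is illustrative only: `hessian_test` and `region_label` are names we have made up, and the case $f_{xx} = 0$ is reported as "undetermined" because the derivation above glosses over it.

```python
def hessian_test(fxx, fxy, fyy):
    """Label a point using the sign conditions on H and f_xx."""
    h = fxx * fyy - fxy ** 2
    if h < 0:
        return "neither"
    if fxx > 0:
        return "convex"    # H >= 0 and f_xx > 0
    if fxx < 0:
        return "concave"   # H >= 0 and f_xx < 0
    return "undetermined"  # f_xx = 0 is not covered by this simple check


def region_label(y):
    # For f(x, y) = x^2 - y^3: f_xx = 2, f_xy = 0, f_yy = -6y (no x-dependence).
    return hessian_test(2, 0, -6 * y)


for y in (-2, -1, 0, 1, 2):
    print(y, region_label(y))
```

Sampling a few values of $y$ in this way reproduces the split at $y = 0$ found in Example 7.7, although a sample of points is of course not a proof.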

Figure 7.3: The surface $z = f(x, y)$ where $f(x, y) = x^2 - y^3$ from Example 7.7. Observe that this function is convex when $y \leq 0$ but that it is neither convex nor concave when $y > 0$.

Let's now look at some applications of this material.

7.2.4 Applications

Optimisation problems are very common in economics and we now introduce two ways in which they can arise in that subject. The first is their use in cost minimisation and the second will be another instance of profit maximisation.

Cost minimisation

Suppose a firm is using quantities x and y of two commodities and this incurs a cost

given by the cost function, C(x, y). One might reasonably ask: What quantities should

they be using if they want to minimise their costs?

Example 7.8 A data processing company employs both senior and junior programmers. A particularly large project will cost

$$C(x, y) = 2000 + 2x^3 - 12xy + y^2$$

pounds, where x and y represent the number of junior and senior programmers used respectively. How many employees of each kind should be assigned to the project in order to minimise its cost? What is this minimum cost?

To minimise the cost, we need to find the stationary points of $C(x, y)$ and determine which of them gives us a minimum. So, as before, we start by finding the first-order partial derivatives of $C(x, y)$, i.e.

$$C_x(x, y) = 6x^2 - 12y \quad\text{and}\quad C_y(x, y) = -12x + 2y.$$

At a stationary point, both of these first-order partial derivatives are zero, i.e. we must have $C_x(x, y) = 0$ and $C_y(x, y) = 0$. Thus, to find the stationary points, we have to solve the simultaneous equations

$$6x^2 - 12y = 0 \quad\text{and}\quad -12x + 2y = 0.$$

To do this, we simplify them to get

$$x^2 - 2y = 0 \quad\text{and}\quad -6x + y = 0,$$

and then notice that the second equation gives us $y = 6x$. Substituting this into the first equation then allows us to see that

$$x^2 - 2(6x) = 0 \implies x^2 - 12x = 0 \implies x(x - 12) = 0 \implies x = 0 \text{ or } x = 12,$$

and, using $y = 6x$, these give

$$y = 6(0) = 0 \quad\text{or}\quad y = 6(12) = 72,$$

respectively. Thus, the cost function, $C(x, y)$, has two stationary points, namely the points (0, 0) and (12, 72).

To classify these stationary points, we look at the second-order partial derivatives of $C(x, y)$, which are

$$C_{xx}(x, y) = 12x, \quad C_{xy}(x, y) = -12 \quad\text{and}\quad C_{yy}(x, y) = 2,$$

so that the Hessian is

$$H(x, y) = (12x)(2) - (-12)^2 = 24x - 144 = 24(x - 6).$$

Evaluating this at each of the stationary points we then find that:

At (0, 0), the Hessian is

$$H(0, 0) = 24(-6) < 0,$$

and so this is a saddle point.

At (12, 72), the Hessian is

$$H(12, 72) = 24(+6) > 0 \quad\text{and}\quad C_{xx}(12, 72) = 144 > 0,$$

and so this is a local minimum.

Consequently, to minimise the cost we want to use 12 junior and 72 senior programmers. If we do this we find that the minimum cost is given by

$$C(12, 72) = 2000 + 3456 - 10368 + 5184 = 272,$$

i.e. the minimum cost is £272.⁴
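As a check on the arithmetic in Example 7.8, we can evaluate the cost function, its first-order partial derivatives and its Hessian at the two stationary points. This is a minimal sketch with function names of our own choosing; the brute-force search over whole numbers of programmers is an extra sanity check that we have added, not part of the method above.

```python
def cost(x, y):
    return 2000 + 2 * x**3 - 12 * x * y + y**2


def gradient(x, y):
    # (C_x, C_y) for the cost function above.
    return 6 * x**2 - 12 * y, -12 * x + 2 * y


def hessian(x, y):
    # H = C_xx * C_yy - C_xy^2 with C_xx = 12x, C_yy = 2 and C_xy = -12.
    return (12 * x) * 2 - (-12) ** 2


for point in [(0, 0), (12, 72)]:
    print(point, gradient(*point), hessian(*point), cost(*point))

# Hypothetical extra check: cheapest whole-number staffing in a finite range.
best = min((cost(x, y), (x, y)) for x in range(31) for y in range(151))
print(best)
```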

⁴ Which, thinking about it, is far less than the value of $C(x, y)$ at the other stationary point since $C(0, 0) = 2000$.

Profit maximisation

We now describe the problem of maximising the profit of a firm which makes two products, X and Y. Generally, if $p_X$ and $p_Y$ are the selling prices of one unit of X and one unit of Y respectively, then the total revenue, $TR(x, y)$, obtained from producing amounts x of product X and y of product Y is

$$TR(x, y) = x p_X + y p_Y.$$

Of course, there are a number of ways in which the prices $p_X$ and $p_Y$ may be related to the quantities x and y. For instance:

If the goods were related, $p_X$ and $p_Y$ could both depend on x and y (e.g. if we were considering a music company producing an album on both CD and cassette).

If the goods were unrelated, $p_X$ and $p_Y$ could depend only on x and y respectively (e.g. a pharmaceuticals company producing paracetamol and insulin).

The firm will also have a joint total cost function, $TC(x, y)$, which tells us how much it costs to produce x units of X and y units of Y. Clearly, given $TR(x, y)$ and $TC(x, y)$, we can consider the profit function of the firm, $\pi(x, y)$, which is given by

$$\pi(x, y) = TR(x, y) - TC(x, y) = x p_X + y p_Y - TC(x, y),$$

and we can maximise this function of x and y using the techniques described above. Let's look at an example.

Example 7.9 Suppose that a firm is the sole supplier of X and Y (in other words, it has a monopoly on these goods) and that the demands for X and Y, in tonnes, are given by

$$x = 2 - 2p_X + p_Y \quad\text{and}\quad y = 13 + p_X - 2p_Y,$$

respectively. If the joint total cost function of the firm is $TC(x, y) = 5 + x^2 - xy + y^2$, find the quantities of X and Y the firm should produce in order to maximise its profit. What are the corresponding prices? What is the maximum profit?

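We start by rearranging the equations to find expressions for $p_X$ and $p_Y$.⁵ The first equation tells us that $p_Y = x - 2 + 2p_X$ and so substituting this into the second equation yields

$$y = 13 + p_X - 2(x - 2 + 2p_X) \implies y = 13 + p_X - 2x + 4 - 4p_X \implies 3p_X = 17 - 2x - y.$$

As such, we have

$$p_X = \frac{17 - 2x - y}{3},$$

and so substituting this into $p_Y = x - 2 + 2p_X$, we find that

$$p_Y = x - 2 + 2\left(\frac{17 - 2x - y}{3}\right) \implies p_Y = \frac{3x - 6 + 34 - 4x - 2y}{3} \implies p_Y = \frac{28 - x - 2y}{3}.$$

Thus, the profit function is given by

$$\pi(x, y) = x\left(\frac{17 - 2x - y}{3}\right) + y\left(\frac{28 - x - 2y}{3}\right) - (5 + x^2 - xy + y^2)$$

$$= \frac{1}{3}\left[(17x - 2x^2 - xy) + (28y - xy - 2y^2) - (15 + 3x^2 - 3xy + 3y^2)\right],$$

so that

$$\pi(x, y) = \frac{1}{3}\left(-15 + 17x + 28y - 5x^2 - 5y^2 + xy\right),$$

and we can now maximise this profit function using the method above.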
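Since the algebra in Example 7.9 involves several substitutions, it is worth checking that the simplified profit function agrees with revenue minus cost. The sketch below does this with exact fractions at a few whole-number quantities; the function names are our own and the sample quantities are arbitrary. It does not carry out the maximisation itself.

```python
from fractions import Fraction


def prices(x, y):
    """Prices obtained by inverting the demand equations of Example 7.9."""
    p_x = Fraction(17 - 2 * x - y, 3)
    p_y = Fraction(28 - x - 2 * y, 3)
    return p_x, p_y


def profit_direct(x, y):
    # Revenue minus joint total cost, TC = 5 + x^2 - xy + y^2.
    p_x, p_y = prices(x, y)
    return x * p_x + y * p_y - (5 + x**2 - x * y + y**2)


def profit_simplified(x, y):
    return Fraction(-15 + 17 * x + 28 * y - 5 * x**2 - 5 * y**2 + x * y, 3)


# Arbitrary sample quantities (integers, so the Fractions are exact).
for x, y in [(0, 0), (1, 2), (3, 1), (4, 5)]:
    p_x, p_y = prices(x, y)
    # The prices should reproduce the original demand equations.
    assert x == 2 - 2 * p_x + p_y and y == 13 + p_x - 2 * p_y
    print((x, y), profit_direct(x, y), profit_simplified(x, y))
```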

Activity 7.7 Finish the problem started in Example 7.9. That is, find the values of x and y that maximise the profit function $\pi(x, y)$ found in the example, the corresponding prices $p_X$ and $p_Y$, and the maximum profit.

⁵ Note that if the price of X was fixed and the price of Y was increased, then the demand for X would rise and the demand for Y would fall. This is the behaviour one might expect if X and Y were two related commodities, e.g. if they were two different types of chocolate bar.

7.3 Constrained optimisation

We now turn our attention to the problem of constrained optimisation, i.e. the problem of optimising a function, $f(x, y)$, in the case where the values of x and y we are considering are constrained by the requirement that they must lie in some region, R, of $\mathbb{R}^2$. In particular, we will see that the optimal point we seek will

either be a point inside the region, in which case it will be a stationary point of $f(x, y)$ that happens to be in the region,

or it will be a point on the boundary of the region, in which case it need not be a stationary point of $f(x, y)$ even though it optimises this function over points in the region.

Of course, in the former case, we can find and classify the stationary point in the region using the method in the previous section and then, checking that this point is more optimal than any point on the boundary of the region, we will have our answer. Let's look at a quick example.

Example 7.10 Minimise the function $f(x, y) = (x - 1)^2 + (y - 1)^2$ given that (x, y) must lie in the region defined by the inequalities $x \geq 0$, $y \geq 0$ and $x + y \leq 3$.

The first-order partial derivatives of this function are

$$f_x(x, y) = 2(x - 1) \quad\text{and}\quad f_y(x, y) = 2(y - 1),$$

and so, setting these equal to zero, we see that (1, 1) is the only stationary point of this function. The second-order partial derivatives of this function are

$$f_{xx}(x, y) = 2, \quad f_{xy}(x, y) = 0 \quad\text{and}\quad f_{yy}(x, y) = 2,$$

so that the Hessian is

$$H(x, y) = (2)(2) - 0^2 = 4,$$

and so we see that $H(1, 1) = 4 > 0$ and $f_{xx}(1, 1) = 2 > 0$ which means that this point is a local minimum. Indeed, as this point satisfies the inequalities given above,⁶ this point is in the specified region and so $f(1, 1) = 0$ is a candidate for the minimum value of $f(x, y)$ for (x, y) that lie in the region. However, we must check that nothing odd is happening due to the points on the boundary of the region and to do this we note that:

If we are on the $x = 0$ boundary of the region (so, technically, $0 \leq y \leq 3$) we have $f(0, y) = 1 + (y - 1)^2 \geq 1 > 0$.

If we are on the $y = 0$ boundary of the region (so, technically, $0 \leq x \leq 3$) we have $f(x, 0) = (x - 1)^2 + 1 \geq 1 > 0$.

If we are on the $x + y = 3$ boundary of the region we have $x = 3 - y$ (and, technically, $0 \leq y \leq 3$) which means that

$$f(3 - y, y) = (2 - y)^2 + (y - 1)^2 = 2y^2 - 6y + 5 = 2\left(y - \frac{3}{2}\right)^2 + \frac{1}{2},$$

if we complete the square, but this means that $f(3 - y, y) \geq \frac{1}{2} > 0$.

Thus, we can't find values of $f(x, y)$ as small as $f(1, 1) = 0$ on any of the boundaries of the region and so the minimum value of $f(x, y)$ for points in this region is zero and this occurs at the point (1, 1).
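The comparison between the interior stationary point and the boundary in Example 7.10 can be illustrated numerically. The following is only a rough sketch: it searches a grid of quarter-unit steps (our own choice, picked because such values are exact in floating point), so it illustrates the conclusion rather than proving it.

```python
def f(x, y):
    return (x - 1) ** 2 + (y - 1) ** 2


def in_region(x, y):
    return x >= 0 and y >= 0 and x + y <= 3


step = 0.25  # assumed grid spacing; multiples of 0.25 are exact floats
grid = [(i * step, j * step) for i in range(13) for j in range(13)]
feasible = [p for p in grid if in_region(*p)]

# Smallest value over all feasible grid points.
best = min(feasible, key=lambda p: f(*p))

# Smallest value over grid points lying on one of the three boundaries.
boundary = [p for p in feasible if p[0] == 0 or p[1] == 0 or p[0] + p[1] == 3]
boundary_min = min(f(*p) for p in boundary)

print(best, f(*best), boundary_min)
```

On this grid the smallest boundary value is the one found by completing the square on the $x + y = 3$ edge, and it is still larger than the value at (1, 1).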


Activity 7.8

problems where the optimal point occurs on the boundary of the region since the

methods we have developed so far will not help us in that case.

7.3.1

Generally speaking, when the optimal point occurs on the boundary of a region, we will

be able to find it by considering the contours of the function we are optimising in

relation to the region we are optimising the function over. Indeed, when doing this, we

will find that we are in one of the two cases below.

The optimal point is at a corner of the boundary

The following example should clarify what we should do in this case.

Example 7.11 Maximise the function $f(x, y) = x^2 + y^2$ given that (x, y) must lie in the region defined by the inequalities $x \geq 0$, $y \geq 0$ and $x + 2y \leq 4$.

We start by sketching the region which is the shaded triangle in Figure 7.4(a) and some typical contours of the surface $z = f(x, y)$. Indeed, notice that here, the contour $z = c$ has equation

$$x^2 + y^2 = c,$$

and so it will be a circle of radius $\sqrt{c}$ centred on the origin. In the figure, we have sketched the $z = 4$ and $z = 16$ contours and, in particular, we notice that as the contours move away from the origin, the value of z increases as indicated by the arrow.

Now, to find the maximum value of $f(x, y)$ in this region we need a point which both

lies in the region, and

gives us the largest value of z.

That is, in this case, we want the point (4, 0) which is a corner of the boundary. In particular, notice that with this point on the $z = 16$ contour:

we get a higher value of z than we do from any point on a contour with $z < 16$ (like, say, the $z = 2$ contour), and

we can't have any point on a contour with $z > 16$ as none of these contours will give us a point in the region.

That is, the point (4, 0) which gives us $z = 16$ must indeed maximise the function $f(x, y)$ given that (x, y) must lie in the specified region.

⁶ That is, the point (1, 1) clearly satisfies the inequalities $x \geq 0$ and $y \geq 0$ as well as the inequality $x + y \leq 3$ since $1 + 1 = 2 < 3$.

Figure 7.4: (a) The region for Example 7.11 is the shaded triangle and the $z = 4$ and $z = 16$ contours are indicated. (b) The region for Example 7.12 is the same shaded triangle and the $z = 1$ and $z = 2$ contours are indicated. Note, in both cases, the direction in which z increases.

The optimal point is on the boundary but it isn't a corner

This is the case that is going to concern us the most and so, for the moment, we just

look at an example to see what is happening before we come to the recommended

method for solving such problems.

Example 7.12 Maximise the function $f(x, y) = xy$ given that (x, y) must lie in the region defined by the inequalities $x \geq 0$, $y \geq 0$ and $x + 2y \leq 4$.

We start by sketching the region which is the shaded triangle in Figure 7.4(b) and some typical contours of the surface $z = f(x, y)$. Indeed, notice that here, the contour $z = c$ has equation

$$xy = c,$$

and so it will be a rectangular hyperbola with the x and y-axes as its asymptotes. In the figure, we have sketched the $z = 1$ and $z = 2$ contours and, in particular, we notice that as the contours move away from the origin, the value of z increases as indicated by the arrow.

Now, to find the maximum value of $f(x, y)$ in this region we need a point which both

lies in the region, and

gives us the largest value of z.

That is, in this case, we want the point $(x^*, y^*)$ which is not a corner of the boundary. In particular, notice that with this point on the $z = 2$ contour:

we get a higher value of z than we do from any point on a contour with $z < 2$ (like, say, the $z = 1$ contour), and

we can't have any point on a contour with $z > 2$ as none of these contours will give us a point in the region.

That is, the point $(x^*, y^*)$ which gives us $z = 2$ must indeed maximise the function $f(x, y)$ given that (x, y) must lie in the specified region. But, how do we find this point?

One way to find this point is to see that it is a point where, for some constant c, we have a contour $f(x, y) = c$ which is both

tangential to the line $x + 2y = 4$, and

touching the line $x + 2y = 4$.

Indeed, as the gradient of $f(x, y) = c$ is given by

$$\frac{\mathrm{d}y}{\mathrm{d}x} = -\frac{\partial f/\partial x}{\partial f/\partial y} = -\frac{y}{x},$$

as we saw in Section 6.3.3 and the gradient of the line $x + 2y = 4$ is given by

$$y = 2 - \frac{x}{2} \implies \frac{\mathrm{d}y}{\mathrm{d}x} = -\frac{1}{2},$$

the first condition means that we must have a point which satisfies the equation

$$-\frac{y}{x} = -\frac{1}{2} \implies y = \frac{x}{2},$$

whereas the second condition means that we must have a point which satisfies the equation $x + 2y = 4$. Solving these equations simultaneously, we find that this gives us the point $(x^*, y^*) = (2, 1)$.⁷

Now, in such cases, we could always proceed in this way but, as we shall see in a moment, there is a way of turning this idea into a much more general method. And, it is this new method that we will generally use in such cases.

⁷ And, at this point, $z = f(2, 1) = 2$ as expected from above. But, in general, we would not know the optimal value of $z = f(x, y)$ beforehand. We have just used it here to help illustrate what is going on.

7.3.2

Suppose that we have been asked to optimise the function, $f(x, y)$, given that (x, y) must lie in some region and, by looking at the contours as above, we have determined that the optimal point occurs on the boundary given by some equation $g(x, y) = 0$. In particular, we are concerned with the case where the optimal point is not a corner of the boundary, i.e. we want a point where, for some constant c, the contour $f(x, y) = c$ is both

tangential to the boundary given by $g(x, y) = 0$, and

touching the boundary given by $g(x, y) = 0$.

Now, for tangency, we require that the gradient of the contour $f(x, y) = c$, i.e.

$$\frac{\mathrm{d}y}{\mathrm{d}x} = -\frac{f_x(x, y)}{f_y(x, y)},$$

is equal to the gradient of the boundary given by $g(x, y) = 0$, i.e.

$$\frac{\mathrm{d}y}{\mathrm{d}x} = -\frac{g_x(x, y)}{g_y(x, y)},$$

where we have used what we saw in Section 6.3.3 twice. But, if these are equal, we have

$$-\frac{f_x(x, y)}{f_y(x, y)} = -\frac{g_x(x, y)}{g_y(x, y)} \implies \frac{f_x(x, y)}{g_x(x, y)} = \frac{f_y(x, y)}{g_y(x, y)},$$

so that, writing $\lambda$ for this common value, we have

$$\lambda = \frac{f_x(x, y)}{g_x(x, y)} = \frac{f_y(x, y)}{g_y(x, y)},$$

which gives us the equations

$$f_x(x, y) - \lambda g_x(x, y) = 0 \quad\text{and}\quad f_y(x, y) - \lambda g_y(x, y) = 0,$$

or, more simply,

$$\frac{\partial}{\partial x}\left[f(x, y) - \lambda g(x, y)\right] = 0 \quad\text{and}\quad \frac{\partial}{\partial y}\left[f(x, y) - \lambda g(x, y)\right] = 0.$$

So, any point which satisfies these two equations is a point where the contour $f(x, y) = c$ is tangential to the boundary $g(x, y) = 0$. We also note that the equation

$$\frac{\partial}{\partial \lambda}\left[f(x, y) - \lambda g(x, y)\right] = 0 \quad\text{gives}\quad g(x, y) = 0,$$

and so, any point which satisfies this equation lies on the boundary. Consequently, we define the Lagrangean to be the function

$$L(x, y, \lambda) = f(x, y) - \lambda g(x, y),$$

and we call $\lambda$ the Lagrange multiplier. In particular, the point we seek will be amongst the stationary points of the Lagrangean since it must satisfy the equations

$$\frac{\partial L}{\partial x} = 0, \quad \frac{\partial L}{\partial y} = 0 \quad\text{and}\quad \frac{\partial L}{\partial \lambda} = 0,$$

which we have derived above. In such cases, we call the function we are optimising, $f(x, y)$, the objective function and we call the equation of the boundary, which must be written in the form $g(x, y) = 0$, the constraint. Let's see how we can use this method to solve the constrained optimisation problem we saw in Example 7.12.

Example 7.13 Solve the constrained optimisation problem in Example 7.12 using the method of Lagrange multipliers.

We have already seen that the optimal point we seek occurs when a contour of the function $f(x, y) = xy$ is tangential to the boundary given by the line $x + 2y = 4$. Writing the equation of the line in the form $g(x, y) = x + 2y - 4 = 0$ we see that the Lagrangean is

$$L(x, y, \lambda) = xy - \lambda(x + 2y - 4),$$

where $\lambda$ is the Lagrange multiplier. We now find the stationary points of the Lagrangean by finding its first-order partial derivatives, i.e.

$$L_x(x, y, \lambda) = y - \lambda, \quad L_y(x, y, \lambda) = x - 2\lambda \quad\text{and}\quad L_\lambda(x, y, \lambda) = -(x + 2y - 4),$$

and setting them equal to zero to get the equations

$$y - \lambda = 0, \quad x - 2\lambda = 0 \quad\text{and}\quad x + 2y - 4 = 0.$$

Eliminating $\lambda$ from the first two equations gives

$$\lambda = y = \frac{x}{2} \implies y = \frac{x}{2},$$

and this, as you should expect, is our tangency condition from Example 7.12. On the other hand, the third equation is just

$$x + 2y = 4,$$

which, as you should expect, is our constraint. Solving these two equations simultaneously, we then get the point (2, 1) as the only solution and so this must be the optimal point we seek in agreement with what we found in Example 7.12. Obviously, at this point, we find that $f(2, 1) = 2$ is the maximum value of f subject to the constraint.
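We can confirm the stationary point of the Lagrangean in Example 7.13 numerically. In the sketch below, the value $\lambda = 1$ follows from $\lambda = y$ at the point (2, 1); the function names and the sampling of the boundary at steps of 0.01 are our own illustrative choices.

```python
def lagrangean_gradient(x, y, lam):
    """Partial derivatives (L_x, L_y, L_lam) of L = xy - lam*(x + 2y - 4)."""
    return (y - lam, x - 2 * lam, -(x + 2 * y - 4))


def f_on_boundary(x):
    # On the line x + 2y = 4 we have y = (4 - x)/2, so f = x*y depends on x only.
    y = (4 - x) / 2
    return x * y


# All three partial derivatives vanish at (x, y, lam) = (2, 1, 1).
print(lagrangean_gradient(2, 1, 1))

# Rough check: sample the boundary segment 0 <= x <= 4 and pick the best x.
samples = [k / 100 for k in range(401)]
best_x = max(samples, key=f_on_boundary)
print(best_x, f_on_boundary(best_x))
```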

Sometimes we will see questions where we are just asked to use this method to solve a constrained optimisation problem. In such cases, we will be given the objective function, $f(x, y)$, and the constraint, $g(x, y) = 0$, which we should be using. In particular, unless we are explicitly asked to look at contours, we will just apply the method and assume that the answer we find is the appropriate kind of optimal point.⁸ Let's look at an example of such a problem.

⁸ Although, sometimes, the Lagrangean may have several stationary points and, if that happens, it should be fairly straightforward to see which of these is the one we want.

Example 7.14 Maximise the function

$$f(x, y) = 160x - 3x^2 - 2xy - 2y^2 + 120y - 18,$$

subject to the constraint $x + y = 34$.

We write the constraint $x + y = 34$ as $x + y - 34 = 0$ so that it is in the form $g(x, y) = 0$ with $g(x, y) = x + y - 34$. This allows us to write the Lagrangean as

$$L(x, y, \lambda) = 160x - 3x^2 - 2xy - 2y^2 + 120y - 18 - \lambda(x + y - 34),$$

where $\lambda$ is the Lagrange multiplier. To find the stationary points of the Lagrangean we find its first-order partial derivatives, i.e.

$$L_x(x, y, \lambda) = 160 - 6x - 2y - \lambda, \quad L_y(x, y, \lambda) = -2x - 4y + 120 - \lambda \quad\text{and}\quad L_\lambda(x, y, \lambda) = -(x + y - 34),$$

and set them equal to zero to get the equations

$$160 - 6x - 2y - \lambda = 0, \quad -2x - 4y + 120 - \lambda = 0 \quad\text{and}\quad x + y - 34 = 0.$$

The first two equations give us

$$\lambda = 160 - 6x - 2y \quad\text{and}\quad \lambda = -2x - 4y + 120,$$

so that, eliminating $\lambda$, we get

$$160 - 6x - 2y = -2x - 4y + 120 \implies 2y = 4x - 40 \implies y = 2x - 20,$$

whereas the third equation gives us $x + y = 34$ which is, of course, just our constraint. So, as this gives $y = 34 - x$, we can use it and the $y = 2x - 20$ that we have just found to eliminate y and get

$$34 - x = 2x - 20 \implies 3x = 54 \implies x = 18.$$

Using the constraint, this gives $y = 34 - 18 = 16$ and so the point (18, 16) is the only stationary point of the Lagrangean and so it must be the optimal point we seek. Thus, the maximum of $f(x, y)$ subject to the constraint $g(x, y) = 0$ is $f(18, 16) = 2{,}722$.
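Because the objective function in Example 7.14 is quadratic and the constraint is linear, the three equations obtained from the Lagrangean are linear in $x$, $y$ and $\lambda$, so they can also be solved mechanically. The sketch below uses a small Gauss-Jordan elimination with exact fractions; `solve_linear` is a helper we have written for illustration (it assumes the system has a unique solution) rather than a method described in this guide.

```python
from fractions import Fraction


def solve_linear(a, b):
    """Solve the square system a.v = b by Gauss-Jordan elimination."""
    n = len(b)
    rows = [[Fraction(v) for v in row] + [Fraction(rhs)]
            for row, rhs in zip(a, b)]
    for col in range(n):
        # Find a row with a non-zero entry in this column and move it up.
        pivot = next(r for r in range(col, n) if rows[r][col] != 0)
        rows[col], rows[pivot] = rows[pivot], rows[col]
        rows[col] = [v / rows[col][col] for v in rows[col]]
        for r in range(n):
            if r != col:
                factor = rows[r][col]
                rows[r] = [v - factor * p for v, p in zip(rows[r], rows[col])]
    return [row[-1] for row in rows]


def f(x, y):
    return 160 * x - 3 * x**2 - 2 * x * y - 2 * y**2 + 120 * y - 18


# Unknowns ordered (x, y, lam), with the equations rearranged as
#   6x + 2y + lam = 160,   2x + 4y + lam = 120,   x + y = 34.
x, y, lam = solve_linear([[6, 2, 1], [2, 4, 1], [1, 1, 0]], [160, 120, 34])
print(x, y, lam, f(x, y))
```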

Note that, although we have only used this method to find maxima in the examples

above, it will find minima as well and we will see an example of this when we consider

cost minimisation problems in Section 7.3.4.

7.3.3

The method of Lagrange multipliers has another use which will be important when we come to consider its applications in Section 7.3.4. To see this, consider that, when we are asked to optimise $f(x, y)$ subject to the constraint $g(x, y) = c$ where c is a constant we would proceed as follows.

Writing the constraint in the form $g(x, y) - c = 0$, we have the Lagrangean

$$L(x, y, \lambda) = f(x, y) - \lambda(g(x, y) - c),$$

where $\lambda$ is the Lagrange multiplier. Its first-order partial derivatives are given by

$$L_x(x, y, \lambda) = f_x(x, y) - \lambda g_x(x, y), \quad L_y(x, y, \lambda) = f_y(x, y) - \lambda g_y(x, y) \quad\text{and}\quad L_\lambda(x, y, \lambda) = -(g(x, y) - c),$$

and we find that the stationary points occur when we set these equal to zero to get the equations

$$f_x(x, y) - \lambda g_x(x, y) = 0, \quad f_y(x, y) - \lambda g_y(x, y) = 0 \quad\text{and}\quad g(x, y) - c = 0.$$

The first two of these give us

$$f_x(x, y) = \lambda g_x(x, y) \quad\text{and}\quad f_y(x, y) = \lambda g_y(x, y),$$

and, clearly, neither of these depends on c. However, when we solve these equations in the standard way and use the constraint, $g(x, y) = c$, we find the point $(x^*, y^*)$ which optimises $f(x, y)$ subject to the constraint. Of course, since we have used the constraint to find the point $(x^*, y^*)$, the values of $x^*$ and $y^*$ we found will depend on c, i.e. we have the functions $x^* = x^*(c)$ and $y^* = y^*(c)$ of c. In particular, this means that the optimal value of $f(x, y)$ subject to the constraint that we have found also depends on c, let's call this $F(c)$, i.e. we have

$$F(c) = f(x^*, y^*) = f(x^*(c), y^*(c)).$$

Now, if we differentiate this with respect to c using the chain rule (see Section 6.3.3), we have

$$\frac{\mathrm{d}F}{\mathrm{d}c} = \frac{\partial f}{\partial x}\frac{\mathrm{d}x^*}{\mathrm{d}c} + \frac{\partial f}{\partial y}\frac{\mathrm{d}y^*}{\mathrm{d}c},$$

so that, using our expressions for $f_x(x, y)$ and $f_y(x, y)$ above, we get

$$\frac{\mathrm{d}F}{\mathrm{d}c} = \lambda\frac{\partial g}{\partial x}\frac{\mathrm{d}x^*}{\mathrm{d}c} + \lambda\frac{\partial g}{\partial y}\frac{\mathrm{d}y^*}{\mathrm{d}c} = \lambda\left(\frac{\partial g}{\partial x}\frac{\mathrm{d}x^*}{\mathrm{d}c} + \frac{\partial g}{\partial y}\frac{\mathrm{d}y^*}{\mathrm{d}c}\right).$$

However, given the constraint $g(x^*, y^*) = c$, we see that differentiating both sides with respect to c we get

$$\frac{\partial g}{\partial x}\frac{\mathrm{d}x^*}{\mathrm{d}c} + \frac{\partial g}{\partial y}\frac{\mathrm{d}y^*}{\mathrm{d}c} = 1,$$

where we have used the chain rule again on the left-hand-side. Putting these last two equations together, we find that

$$\frac{\mathrm{d}F}{\mathrm{d}c} = \lambda,$$

i.e. the Lagrange multiplier is the rate of change of the optimal value of $f(x, y)$ subject to the constraint $g(x, y) = c$ with respect to c. In particular, if we allowed our constraint to change from $g(x, y) = c$ to $g(x, y) = c + \Delta c$ we would find that the change in the optimal value of $f(x, y)$ subject to this constraint, i.e. $\Delta F$, is given by

$$\Delta F \simeq \lambda\,\Delta c,$$

provided that $\Delta c$ is suitably small. Let's see how this works in the context of Example 7.14.

Example 7.15 Using what we found in Example 7.14, find $\lambda$ and hence find the approximate change in the maximum value of $f(x, y)$ subject to the constraint $x + y = 34$ if the constraint is changed to $x + y = 35$.

We have found that the maximum value of $f(x, y)$ subject to the constraint $x + y = 34$ is $f(18, 16) = 2{,}722$. As this occurs at the point (18, 16) we can use either of the first two equations we found in Example 7.14 to find $\lambda$ so, using the first, we have

$$160 - 6x - 2y - \lambda = 0 \implies \lambda = 160 - 6(18) - 2(16) = 20.$$

Consequently, using the theory above, we have a change in the constraint from $x + y = 34$ to $x + y = 35$ which gives $\Delta c = 1$ and so the change in the maximum value of $f(x, y)$ subject to this constraint is approximately $\lambda\,\Delta c = 20$.
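To see how good this approximation is, we can compare it with the exact change in the constrained maximum. The sketch below relies on the relation $y = 2x - 20$ from Example 7.14, which does not depend on c, so that $x + y = c$ gives $x = (c + 20)/3$; the function name `optimum` is our own.

```python
from fractions import Fraction


def f(x, y):
    return 160 * x - 3 * x**2 - 2 * x * y - 2 * y**2 + 120 * y - 18


def optimum(c):
    """Constrained optimum for x + y = c, using y = 2x - 20."""
    x = Fraction(c + 20, 3)
    y = c - x
    lam = 160 - 6 * x - 2 * y  # from the first Lagrangean equation
    return x, y, lam, f(x, y)


x34, y34, lam34, best34 = optimum(34)
x35, y35, lam35, best35 = optimum(35)

# Exact change in the maximum versus the estimate lam * (change in c).
print(lam34, best35 - best34, float(best35 - best34))
```

The exact change is a little smaller than the estimate of 20, as we would expect from an approximation that is only valid for suitably small changes in c.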

We now turn to some applications of constrained optimisation in economics.

7.3.4 Applications

Constrained optimisation problems are also common in economics and we now introduce two ways in which they can arise in that subject. The first is their use when a consumer wants to maximise their utility subject to a constraint imposed by their budget and the second is when a firm wants to minimise its costs subject to a constraint on its level of production.

Utility maximisation subject to a budget constraint

Suppose that a consumer is interested in buying some combination of two goods. Let's say the price of the first good is $p_1$ per unit, the price of the second good is $p_2$ per unit and the consumer has an amount M to spend on them. Indeed, if he wants to purchase the bundle, $(x_1, x_2)$, which contains quantities $x_1$ and $x_2$ of the first and second good respectively, it will cost him

$$p_1 x_1 + p_2 x_2,$$

and he can afford this bundle if he satisfies the budget constraint given by

$$p_1 x_1 + p_2 x_2 \leq M,$$

where $x_1, x_2 \geq 0$ as they represent quantities. This gives us a budget set, i.e. the set of all bundles that the consumer can afford given the prices of the goods and his budget. Indeed, geometrically, the bundles he can afford are contained in the triangular region illustrated in Figure 7.5(a).

Figure 7.5: (a) The budget set for our consumer. (b) Adding three contours, $u(x_1, x_2) = c$, and the direction of increasing utility; we are interested in the point which is indicated in the figure. [In both panels the axes are $x_1$ and $x_2$, and the line $p_1 x_1 + p_2 x_2 = M$ meets them at $M/p_1$ and $M/p_2$ respectively.]

Now, if his utility function is $u(x_1, x_2)$, the consumer wants to maximise this subject to the constraint that he must be able to afford the bundle. That is, he must maximise $u(x_1, x_2)$ subject to the constraint that the bundle he chooses is in the budget set. Let's assume that, in this case, the utility function has contours $u(x_1, x_2) = c$, where c is a constant,⁹ that look like the ones illustrated in Figure 7.5(b) and that the direction of increasing utility is as indicated. Indeed, we observe in this case that the maximum value of $u(x_1, x_2)$ subject to the constraint imposed by the budget set occurs at the point indicated, i.e. a point where we have a contour of $u(x_1, x_2)$ which is both

tangential to the line $p_1 x_1 + p_2 x_2 = M$, and

touching the line $p_1 x_1 + p_2 x_2 = M$.

⁹ These contours are called indifference curves as each point on such a contour gives our consumer the same utility, i.e. he will be indifferent between the bundles represented by points on the same contour.

As such, we could use the method of Lagrange multipliers to solve this problem, i.e. we would write the constraint as $p_1 x_1 + p_2 x_2 - M = 0$ and use the Lagrangean

$$L(x_1, x_2, \lambda) = u(x_1, x_2) - \lambda(p_1 x_1 + p_2 x_2 - M),$$

to find the point $(x_1^*, x_2^*)$ which maximises the consumer's utility subject to the constraint. Indeed, having done this, we can define the function

$$U(M) = u(x_1^*, x_2^*),$$

which tells us the maximum utility of the consumer given his budget, M. In particular, using the theory in Section 7.3.3, we see that the value of the Lagrange multiplier we get from solving the equations will satisfy

$$\frac{\mathrm{d}U}{\mathrm{d}M} = \lambda,$$

i.e. it gives us the consumer's marginal utility of [budgetary] money if he is purchasing in a way that maximises his utility subject to his budget set. Let's look at an example.

Example 7.16 Suppose cats cost 2 each and dogs cost 1 each. If a consumer

has a utility function given by

u(x1 , x2 ) = x21 x22 ,

when he buys x1 cats and x2 dogs, how many cats and dogs should he buy if he

wants to maximise his utility given that he has M to spend? Find, U (M ), the

maximum utility he can attain if he has a budget of M and verify that U (M ) =

where is the Lagrange multiplier.

In this case, the budget set will be the region defined by the inequalities

\[ 2x_1 + x_2 \le M, \]

and $x_1, x_2 \ge 0$, which looks like the one in Figure 7.5(a), whereas the contours $u(x_1, x_2) = c$ where $u(x_1, x_2) = x_1^2 x_2^2$ look like the ones sketched in Figure 7.5(b). As such, we are in the situation described above and so we need to maximise $u(x_1, x_2)$ subject to the constraint that

\[ 2x_1 + x_2 = M \implies 2x_1 + x_2 - M = 0, \]

if we want the constraint in the right form. Thus, we have the Lagrangean

\[ L(x_1, x_2, \lambda) = x_1^2 x_2^2 - \lambda(2x_1 + x_2 - M), \]


and we seek the points which simultaneously satisfy the equations $L_{x_1}(x_1, x_2, \lambda) = 0$, $L_{x_2}(x_1, x_2, \lambda) = 0$ and $L_\lambda(x_1, x_2, \lambda) = 0$. The first-order partial derivatives of $L(x_1, x_2, \lambda)$ are

\[ L_{x_1}(x_1, x_2, \lambda) = 2x_1 x_2^2 - 2\lambda, \quad L_{x_2}(x_1, x_2, \lambda) = 2x_1^2 x_2 - \lambda \quad\text{and}\quad L_\lambda(x_1, x_2, \lambda) = -(2x_1 + x_2 - M), \]

and we set these equal to zero to yield the equations

\[ 2x_1 x_2^2 - 2\lambda = 0, \quad 2x_1^2 x_2 - \lambda = 0 \quad\text{and}\quad 2x_1 + x_2 - M = 0. \]

We now solve these by eliminating $\lambda$ from the first two equations, i.e. we get

\[ \lambda = x_1 x_2^2 = 2x_1^2 x_2 \implies x_1 x_2 (x_2 - 2x_1) = 0 \implies x_2 = 2x_1, \]

where we reject the solutions where $x_1 = 0$ or $x_2 = 0$ as these give a utility of zero which, clearly, won't give us the maximum we seek. We then use this new relationship between $x_1$ and $x_2$ in the third equation, which is just the constraint $2x_1 + x_2 = M$, to get

\[ 2x_1 + 2x_1 = M \implies 4x_1 = M \implies x_1 = \frac{M}{4}, \]

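and then, using this in the equation $x_2 = 2x_1$, we get $x_2 = M/2$. Thus, these values of $x_1$ and $x_2$ maximise our consumer's utility if he has a budget of £M and his maximum utility is then given by

\[ U(M) = u\left(\frac{M}{4}, \frac{M}{2}\right) = \left(\frac{M}{4}\right)^2 \left(\frac{M}{2}\right)^2 = \frac{M^4}{64}, \]

which gives us

\[ U'(M) = \frac{4M^3}{64} = \frac{M^3}{16}. \]

Of course, we can also find the value of $\lambda^*$ using, say, the equation

\[ \lambda^* = x_1 x_2^2 = \left(\frac{M}{4}\right)\left(\frac{M}{2}\right)^2 = \frac{M^3}{16}, \]

and so we can see that $U'(M) = \lambda^*$, as expected.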
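As a quick numerical sanity check on Example 7.16, the following Python sketch may be helpful. It uses only the standard library; the function names and the test budget M = 8 are illustrative choices made here, not part of the guide. It checks that the bundle (M/4, M/2) spends the whole budget, that no sampled bundle on the budget line does better, and that a finite-difference estimate of U'(M) agrees with the multiplier.

```python
from fractions import Fraction

def optimal_bundle(M):
    """Bundle found in Example 7.16: x1 = M/4 cats and x2 = M/2 dogs."""
    M = Fraction(M)
    return M / 4, M / 2

def utility(x1, x2):
    """Utility function u(x1, x2) = x1^2 * x2^2."""
    return x1**2 * x2**2

def max_utility(M):
    """U(M): utility at the optimal bundle for a budget M."""
    return utility(*optimal_bundle(M))

def multiplier(M):
    """lambda* = x1 * x2^2, taken from the first first-order condition."""
    x1, x2 = optimal_bundle(M)
    return x1 * x2**2

M = Fraction(8)  # illustrative budget (an assumption for this check)
x1, x2 = optimal_bundle(M)

# The optimal bundle spends the whole budget: 2*x1 + x2 = M.
budget_ok = (2 * x1 + x2 == M)

# Sample other bundles on the budget line, with x1 running from 0 to M/2.
grid = [Fraction(i, 100) for i in range(0, 401)]
best_on_grid = max(utility(t, M - 2 * t) for t in grid)

# Central-difference estimate of U'(M), to compare with lambda*.
h = Fraction(1, 1000)
dU_dM = (max_utility(M + h) - max_utility(M - h)) / (2 * h)

print(budget_ok, best_on_grid == max_utility(M))
print(float(dU_dM), float(multiplier(M)))
```

Exact fractions are used so that the comparison along the budget line is not affected by rounding; only the derivative estimate is approximate.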

Activity 7.9 Another consumer has a budget of £4 to buy cats and dogs at the prices in Example 7.16 and her utility function is $u(x_1, x_2) = 3x_1 + x_2$ when she buys $x_1$ cats and $x_2$ dogs. Sketch the budget set and some contours $u(x_1, x_2) = c$, where $c$ is a constant, for this consumer. How many cats and dogs should she buy if she wants to maximise her utility given her budget?


Suppose that capital costs £v per unit and labour costs £w per unit. This means that a firm which uses an amount $k$ of capital and $l$ of labour will incur costs given by the cost function

\[ C(k, l) = vk + wl. \]

Also suppose that these inputs allow the firm to produce an amount given by the production function, $q(k, l)$. We want to ask: How much capital and labour should the firm use if it needs to produce an amount $Q$ of its product? That is, we want to solve the constrained optimisation problem

\[ \text{minimise } C(k, l) \text{ subject to the constraint } q(k, l) = Q, \]

where $k, l \ge 0$ as they are quantities. Let's assume that, in this case, the constraint $q(k, l) = Q$ looks like the curve in Figure 7.6(a) for $k, l \ge 0$. If we also sketch some contours of the cost function,10 we can identify the direction in which costs are decreasing as indicated in Figure 7.6(b).

Figure 7.6: (a) The constraint $q(k, l) = Q$. (b) Adding three contours, $C(k, l) = c$, where the direction of decreasing cost is as indicated and we are interested in the point which is indicated in the figure.

Indeed, we observe in this case that the minimum value of $C(k, l)$ subject to the constraint $q(k, l) = Q$ occurs at the point indicated, i.e. a point where we have a contour of $C(k, l)$ which is both

tangential to the constraint $q(k, l) = Q$, and

touching the constraint $q(k, l) = Q$.

As such, we could use the method of Lagrange multipliers to solve this problem, i.e. we would write the constraint as $q(k, l) - Q = 0$ and use the Lagrangean

\[ L(k, l, \lambda) = C(k, l) - \lambda(q(k, l) - Q), \]

to find the point $(k^*, l^*)$ which minimises the costs subject to the constraint. Indeed, having done this, we can define the function

\[ C(Q) = C(k^*, l^*), \]

10

These contours are called isocosts as each point on such a contour costs the firm the same amount

of money.


which tells us the minimum cost of producing an amount, $Q$. In particular, using the theory in Section 7.3.3, we see that the value of the Lagrange multiplier, $\lambda^*$, that we get from solving the equations will satisfy

\[ \frac{dC}{dQ} = \lambda^*, \]

i.e. it gives us the marginal cost of the firm if it is producing in a way that minimises its costs subject to the constraint that it is producing an amount, $Q$. Let's look at an example.

Example 7.17 Suppose capital, $k$, costs £16 per unit and labour, $l$, costs £1 per unit. If a firm can produce an amount given by the production function

\[ q(k, l) = 10 k^{1/4} l^{1/4}, \]

what values of $k$ and $l$ will minimise the cost of producing $Q$ units? Find $C(Q)$, the minimum cost of producing $Q$, and verify that $C'(Q) = \lambda^*$ where $\lambda^*$ is the Lagrange multiplier.

In this case, the constraint $q(k, l) = Q$ will look like the curve in Figure 7.6(a) for $k, l \ge 0$ and so we are in the situation described above. Indeed, here the cost function is

\[ C(k, l) = 16k + l, \]

and, writing the constraint in the form $q(k, l) - Q = 0$, we get the Lagrangean

\[ L(k, l, \lambda) = 16k + l - \lambda(q(k, l) - Q). \]

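We seek the points which simultaneously satisfy the equations $L_k(k, l, \lambda) = 0$, $L_l(k, l, \lambda) = 0$ and $L_\lambda(k, l, \lambda) = 0$ so we find the first-order partial derivatives of $L(k, l, \lambda)$, i.e.

\[ L_k(k, l, \lambda) = 16 - \frac{10}{4}\lambda k^{-3/4} l^{1/4}, \quad L_l(k, l, \lambda) = 1 - \frac{10}{4}\lambda k^{1/4} l^{-3/4} \quad\text{and}\quad L_\lambda(k, l, \lambda) = -\left(10 k^{1/4} l^{1/4} - Q\right), \]

and set these equal to zero to yield the equations

\[ 16 - \frac{5}{2}\lambda k^{-3/4} l^{1/4} = 0, \quad 1 - \frac{5}{2}\lambda k^{1/4} l^{-3/4} = 0 \quad\text{and}\quad 10 k^{1/4} l^{1/4} - Q = 0. \]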

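We now solve these by eliminating $\lambda$ from the first two equations, i.e. we get

\[ 16 - \frac{5}{2}\lambda \frac{l^{1/4}}{k^{3/4}} = 0 \implies \lambda = 16 \cdot \frac{2}{5} \cdot \frac{k^{3/4}}{l^{1/4}}, \]

from the first equation and

\[ 1 - \frac{5}{2}\lambda \frac{k^{1/4}}{l^{3/4}} = 0 \implies \lambda = \frac{2}{5} \cdot \frac{l^{3/4}}{k^{1/4}}, \]

from the second equation. As such, we can equate these expressions for $\lambda$ to get

\[ 16 \cdot \frac{2}{5} \cdot \frac{k^{3/4}}{l^{1/4}} = \frac{2}{5} \cdot \frac{l^{3/4}}{k^{1/4}} \implies 16k = l. \]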

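We then use this new relationship between $k$ and $l$ in the third equation, which is just the constraint $10 k^{1/4} l^{1/4} = Q$, to get

\[ Q = 10 k^{1/4} (16k)^{1/4} \implies Q = 20 k^{1/2} \implies k^{1/2} = \frac{Q}{20} \implies k = \frac{Q^2}{400}, \]

and then, using this in the equation $l = 16k$, we get

\[ l = 16 \cdot \frac{Q^2}{400} = \frac{Q^2}{25}. \]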

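Thus, these values of $k$ and $l$ minimise the cost of producing $Q$ units. The minimum cost is then given by

\[ C(Q) = C\left(\frac{Q^2}{400}, \frac{Q^2}{25}\right) = 16 \cdot \frac{Q^2}{400} + \frac{Q^2}{25} = \frac{2Q^2}{25}, \]

which gives us

\[ C'(Q) = \frac{4Q}{25}. \]

Of course, we can also find the value of $\lambda^*$ using, say, the equation

\[ \lambda^* = \frac{2}{5} \cdot \frac{l^{3/4}}{k^{1/4}} = \frac{2}{5} \cdot \frac{(Q^2/25)^{3/4}}{(Q^2/400)^{1/4}} = \frac{4Q}{25}, \]

and so we can see that $C'(Q) = \lambda^*$, as expected.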
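The arithmetic in Example 7.17 can be checked numerically. The sketch below is only an illustration: the function names, the target output Q = 10 and the grid of alternative input mixes are assumptions made here. It confirms that the inputs k = Q²/400 and l = Q²/25 produce Q units, that other input mixes producing Q units cost at least as much, and that a finite-difference estimate of C'(Q) agrees with the multiplier 4Q/25.

```python
def production(k, l):
    """Production function q(k, l) = 10 * k^(1/4) * l^(1/4)."""
    return 10 * k**0.25 * l**0.25

def cost(k, l):
    """Cost function C(k, l) = 16k + l."""
    return 16 * k + l

def optimal_inputs(Q):
    """Inputs found in Example 7.17: k = Q^2/400 and l = Q^2/25."""
    return Q**2 / 400, Q**2 / 25

def min_cost(Q):
    """C(Q): cost at the optimal inputs."""
    return cost(*optimal_inputs(Q))

def multiplier(Q):
    """lambda* = (2/5) * l^(3/4) / k^(1/4), from the second condition."""
    k, l = optimal_inputs(Q)
    return 0.4 * l**0.75 / k**0.25

Q = 10.0  # illustrative output level (an assumption for this check)
k_star, l_star = optimal_inputs(Q)

# Other input mixes that still produce Q: pick k, then l = (Q/10)^4 / k.
alternatives = [cost(k, (Q / 10) ** 4 / k)
                for k in (0.05 * i for i in range(1, 41))]

# Central-difference estimate of C'(Q), to compare with lambda*.
h = 1e-4
dC_dQ = (min_cost(Q + h) - min_cost(Q - h)) / (2 * h)

print(production(k_star, l_star), min_cost(Q), min(alternatives))
print(dC_dQ, multiplier(Q))
```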

Learning outcomes

At the end of this chapter and having completed the relevant reading and activities, you

should be able to:

find and classify the stationary points of a function of two variables;

solve problems from economics-based subjects that involve unconstrained

optimisation;

optimise a function in the presence of constraints;

solve problems from economics-based subjects that involve constrained

optimisation.


Solutions to activities

Solution to activity 7.1

The first-order partial derivatives of the function are

\[ f_x(x, y) = 2x - 4 \quad\text{and}\quad f_y(x, y) = 2y + 4. \]

At a stationary point, both of the first-order partial derivatives are zero, i.e. we must have $f_x(x, y) = 0$ and $f_y(x, y) = 0$. Thus, to find the stationary points we have to solve the simultaneous equations

\[ 2x - 4 = 0 \quad\text{and}\quad 2y + 4 = 0. \]

But, clearly, the first of these equations gives $x = 2$ and the second gives $y = -2$. Thus, $(2, -2)$ is the only stationary point of $f(x, y)$.

Solution to activity 7.2

The first-order partial derivatives of the function are

\[ f_x(x, y) = 9x^2 + 18x - 72 \quad\text{and}\quad f_y(x, y) = 6y^2 - 24y - 126. \]

At a stationary point, both of the first-order partial derivatives are zero, i.e. we must have $f_x(x, y) = 0$ and $f_y(x, y) = 0$. Thus, to find the stationary points we have to solve the simultaneous equations

\[ 9x^2 + 18x - 72 = 0 \quad\text{and}\quad 6y^2 - 24y - 126 = 0. \]

Now, notice that the first equation contains no $y$s and the second equation contains no $x$s. As such, the first equation tells us everything there is to know about $x$, i.e.

\[ 9x^2 + 18x - 72 = 0 \implies x^2 + 2x - 8 = 0 \implies (x + 4)(x - 2) = 0 \implies x = -4 \text{ or } x = 2, \]

whereas the second equation tells us everything we need to know about $y$, i.e.

\[ 6y^2 - 24y - 126 = 0 \implies y^2 - 4y - 21 = 0 \implies (y + 3)(y - 7) = 0 \implies y = -3 \text{ or } y = 7. \]

As such, since we can take any of the $x$ values with any of the $y$ values we can see that this function has four stationary points, namely $(-4, -3)$, $(-4, 7)$, $(2, -3)$ and $(2, 7)$.

Solution to activity 7.3

Using the first-order partial derivatives we found in Activity 7.1, we find that the second-order partial derivatives are

\[ f_{xx}(x, y) = 2, \quad f_{xy}(x, y) = 0 = f_{yx}(x, y) \quad\text{and}\quad f_{yy}(x, y) = 2. \]

As these are constants, they take these values at the stationary point (and, indeed, at all other points). Thus, we can see that the Hessian at the stationary point is given by

\[ H(2, -2) = (2)(2) - (0)^2 = 4 > 0, \]

and, as $f_{xx}(2, -2) = 2 > 0$, this is a local minimum.

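Solution to activity 7.4

Using the first-order partial derivatives we found in Activity 7.2, we find that the second-order partial derivatives are

\[ f_{xx}(x, y) = 18x + 18, \quad f_{xy}(x, y) = 0 = f_{yx}(x, y) \quad\text{and}\quad f_{yy}(x, y) = 12y - 24, \]

and so the Hessian is

\[ H(x, y) = (18x + 18)(12y - 24) - 0^2 = 216(x + 1)(y - 2). \]

Evaluating this at each of the stationary points we find that:

At $(-4, -3)$, the Hessian is $H(-4, -3) = 216(-3)(-5) > 0$ and $f_{xx}(-4, -3) = -54 < 0$, and so this is a local maximum.

At $(-4, 7)$, the Hessian is $H(-4, 7) = 216(-3)(+5) < 0$, and so this is a saddle point.

At $(2, -3)$, the Hessian is $H(2, -3) = 216(+3)(-5) < 0$, and so this is a saddle point.

At $(2, 7)$, the Hessian is $H(2, 7) = 216(+3)(+5) > 0$ and $f_{xx}(2, 7) = 54 > 0$, and so this is a local minimum.

Thus, the stationary point $(-4, -3)$ is a local maximum, $(-4, 7)$ and $(2, -3)$ are saddle points and $(2, 7)$ is a local minimum.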
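The classification above can be reproduced mechanically. The following Python sketch is illustrative only (the function names are chosen here, and the minus signs of the stationary points are written out explicitly): it checks that both first-order partial derivatives vanish at each of the four points and then applies the Hessian test using the second-order partial derivatives 18x + 18, 12y - 24 and 0.

```python
def f_x(x, y):
    """First-order partial derivative with respect to x (Activity 7.2)."""
    return 9 * x**2 + 18 * x - 72

def f_y(x, y):
    """First-order partial derivative with respect to y (Activity 7.2)."""
    return 6 * y**2 - 24 * y - 126

def classify(x, y):
    """Apply the Hessian test at the point (x, y)."""
    f_xx = 18 * x + 18
    f_yy = 12 * y - 24
    f_xy = 0
    H = f_xx * f_yy - f_xy**2
    if H < 0:
        return "saddle point"
    if H > 0:
        return "local minimum" if f_xx > 0 else "local maximum"
    return "test inconclusive"

stationary_points = [(-4, -3), (-4, 7), (2, -3), (2, 7)]
results = {}
for x, y in stationary_points:
    # Both first-order partial derivatives must vanish at a stationary point.
    assert f_x(x, y) == 0 and f_y(x, y) == 0
    results[(x, y)] = classify(x, y)
    print((x, y), results[(x, y)])
```

The "test inconclusive" branch corresponds to a zero Hessian, where this test gives no information and another argument is needed.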

Solution to activity 7.5

The first-order partial derivatives of this function are

\[ f_x(x, y) = 4(x - 1)^3 \quad\text{and}\quad f_y(x, y) = 4(y - 1)^3. \]

So, clearly, the only stationary point is at $(1, 1)$ as this is the only point that makes $f_x(x, y) = 0$ and $f_y(x, y) = 0$. The second-order partial derivatives of this function are given by

\[ f_{xx}(x, y) = 12(x - 1)^2, \quad f_{xy}(x, y) = 0 = f_{yx}(x, y) \quad\text{and}\quad f_{yy}(x, y) = 12(y - 1)^2, \]

and so the Hessian is

\[ H(x, y) = [12(x - 1)^2][12(y - 1)^2] - 0^2 = 144(x - 1)^2 (y - 1)^2. \]

Indeed, evaluating this at the stationary point gives $H(1, 1) = 0$ and so the method we used above fails.

However, if we consider the surface $z = f(x, y)$, notice that we have $z = f(1, 1) = 0$ at the stationary point and for all other $x, y \in \mathbb{R}$, we have

\[ z = f(x, y) = (x - 1)^4 + (y - 1)^4 > 0, \]

i.e. $f(x, y) \ge f(1, 1)$ for all $x, y \in \mathbb{R}$. Consequently, it should be clear that this function has a local minimum at $(1, 1)$ and this minimum value is zero.11

Solution to activity 7.6

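Suppose that we have a function $f(x, y)$ that is concave. As we saw in Section 6.4.1, at any point $(a, b)$, the tangent plane to this function has a Cartesian equation given by

\[ z = f(a, b) + \left.\frac{df}{d\mathbf{x}}\right|_{(a,b)} \begin{pmatrix} x - a \\ y - b \end{pmatrix}, \]

and, as this function is concave, it must be the case that for all $(x, y) \in \mathbb{R}^2$, the function lies below this tangent plane, i.e. we must have

\[ f(x, y) \le f(a, b) + \left.\frac{df}{d\mathbf{x}}\right|_{(a,b)} \begin{pmatrix} x - a \\ y - b \end{pmatrix}. \]

However, using the second-order Taylor series for $f(x, y)$ around the point $(a, b)$, this means that we have

\[ f(a, b) + \left.\frac{df}{d\mathbf{x}}\right|_{(a,b)} \begin{pmatrix} x - a \\ y - b \end{pmatrix} + \frac{1}{2!} \begin{pmatrix} x - a & y - b \end{pmatrix} \left.\frac{d^2 f}{d\mathbf{x}^2}\right|_{(a,b)} \begin{pmatrix} x - a \\ y - b \end{pmatrix} \le f(a, b) + \left.\frac{df}{d\mathbf{x}}\right|_{(a,b)} \begin{pmatrix} x - a \\ y - b \end{pmatrix}, \]

which gives us

\[ \begin{pmatrix} x - a & y - b \end{pmatrix} \left.\frac{d^2 f}{d\mathbf{x}^2}\right|_{(a,b)} \begin{pmatrix} x - a \\ y - b \end{pmatrix} \le 0, \]

and this just asserts that $K(x, y) \le 0$ using our notation from Section 7.2.2. However, using what we saw before, this means that we require

\[ H(x, y) \ge 0 \quad\text{and}\quad f_{xx}(x, y) \le 0, \]

for $f(x, y)$ to be concave.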

11 Actually, this is not only a local minimum, it is a global minimum as this is truly the smallest value the function can take for $x, y \in \mathbb{R}$.

12 Again, we have glossed over any complications in our derivation that would occur if $f_{xx}(x, y) = 0$ for some point, $(x, y)$.

Solution to activity 7.7

We have found that the profit function is given by

\[ \pi(x, y) = \frac{1}{3}\left(-15 + 17x + 28y - 5x^2 - 5y^2 + xy\right), \]

and, to maximise this, we need to find its stationary points and determine which of them gives us a maximum. So, we start by finding the first-order partial derivatives of $\pi(x, y)$, i.e.

\[ \pi_x(x, y) = \frac{1}{3}(17 - 10x + y) \quad\text{and}\quad \pi_y(x, y) = \frac{1}{3}(28 - 10y + x). \]

At a stationary point, both of these first-order partial derivatives are zero, i.e. we must have $\pi_x(x, y) = 0$ and $\pi_y(x, y) = 0$. Thus, to find the stationary points, we have to solve the simultaneous equations

\[ 10x - y = 17 \quad\text{and}\quad x - 10y = -28. \]

We start by noticing that the first equation gives us $y = 10x - 17$ and so, substituting this into the second equation, we get

\[ x - 10(10x - 17) = -28 \implies -99x = -198 \implies x = 2, \]

and then, using $y = 10x - 17$ again, we get $y = 3$. Thus, the profit function, $\pi(x, y)$, has $(2, 3)$ as its only stationary point.

To classify this stationary point, we look at the second-order partial derivatives of $\pi(x, y)$, which are

\[ \pi_{xx}(x, y) = -\frac{10}{3}, \quad \pi_{xy}(x, y) = \frac{1}{3} = \pi_{yx}(x, y) \quad\text{and}\quad \pi_{yy}(x, y) = -\frac{10}{3}, \]

and so the Hessian is

\[ H(x, y) = \left(-\frac{10}{3}\right)\left(-\frac{10}{3}\right) - \left(\frac{1}{3}\right)^2 = \frac{100}{9} - \frac{1}{9} = 11. \]

Clearly, at $(2, 3)$, we have $H(2, 3) > 0$ and $\pi_{xx}(2, 3) < 0$, which means that the stationary point we have found is indeed a local maximum. Consequently, to maximise its profit, the firm should produce 2 tonnes of X and 3 tonnes of Y so that it can sell them at prices, in pounds, of

\[ p_X = \frac{17 - 2(2) - 3}{3} = \frac{10}{3} \approx 3.33 \quad\text{and}\quad p_Y = \frac{28 - 2 - 2(3)}{3} = \frac{20}{3} \approx 6.67, \]

respectively and, in doing so, the firm will make a maximum profit of

\[ \pi(2, 3) = \frac{1}{3}\left(-15 + 17(2) + 28(3) - 5(2)^2 - 5(3)^2 + (2)(3)\right) = \frac{44}{3} \approx 14.67, \]

pounds.

Solution to activity 7.8

Of course, this should have been obvious

either by noting that

\[ f(x, y) = (x - 1)^2 + (y - 1)^2 \ge 0, \]

for all points $(x, y) \in \mathbb{R}^2$ with a minimum of zero at $(1, 1)$;

or by observing that as $H(x, y) = 4 > 0$ and $f_{xx}(x, y) = 2 > 0$ for all points $(x, y) \in \mathbb{R}^2$, we see that this function is convex and so the stationary point $(1, 1)$ we found above is a global minimum.

Then, using either of these facts, we see that we have found the minimum of $f(x, y)$ for all $(x, y) \in \mathbb{R}^2$ and so it must be the minimum in the given region too since it is in that region.

Solution to activity 7.9

Given the prices in Example 7.16 and the consumer's budget of £4, we see that the budget set is given by

\[ 2x_1 + x_2 \le 4, \]

where $x_1, x_2 \ge 0$ as they are quantities. This is sketched in Figure 7.7(a).

We are now asked to sketch some contours $u(x_1, x_2) = c$ where $c$ is a constant and

\[ u(x_1, x_2) = 3x_1 + x_2, \]

for this consumer. Indeed, looking at the budget set, it makes sense to choose the contours where $c = 4$ and $c = 6$ and these are illustrated in Figure 7.7(b). This allows us to see the direction of increasing utility, which is indicated in the figure, and allows us to see that the point $(2, 0)$ is the one where we get the highest utility if we are constrained to stay within the budget set. Consequently, this consumer should buy two cats and no dogs if she wants to maximise her utility subject to her budget constraint.

Figure 7.7: The sketches for Activity 7.9. (a) The budget set for our consumer, bounded by the line $2x_1 + x_2 = 4$. (b) Adding the contours $u(x_1, x_2) = c$ for $c = 4$ and $c = 6$, where the direction of increasing utility is as indicated and we are interested in the point which is indicated in the figure.
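Because the utility function in Activity 7.9 is linear, its maximum over the triangular budget set must occur at a corner, so the answer can be checked by comparing the three corners. The Python sketch below does this and also runs a brute-force search over a grid of affordable bundles; the function names and the grid spacing of 0.1 are assumptions made for illustration.

```python
def utility(x1, x2):
    """Utility function u(x1, x2) = 3*x1 + x2 from Activity 7.9."""
    return 3 * x1 + x2

def affordable(x1, x2, budget=4):
    """Budget set: 2*x1 + x2 <= budget with non-negative quantities."""
    return x1 >= 0 and x2 >= 0 and 2 * x1 + x2 <= budget

# Corners of the budget set: the origin and the two axis intercepts.
vertices = [(0, 0), (2, 0), (0, 4)]
best_vertex = max(vertices, key=lambda p: utility(*p))

# Brute-force check over a grid of affordable bundles (spacing 0.1).
steps = [i / 10 for i in range(0, 41)]
grid_best = max(
    ((x1, x2) for x1 in steps for x2 in steps if affordable(x1, x2)),
    key=lambda p: utility(*p),
)

print(best_vertex, utility(*best_vertex), grid_best)
```

Note that the tangency condition used in Example 7.16 does not apply here: the contours are straight lines with a different slope from the budget line, which is why the solution sits at a corner.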

Exercises

Exercise 7.1

The function

\[ f(x, y) = x^2 \ln y - y \ln y, \]

is defined for $y > 0$ and all $x \in \mathbb{R}$. Find its stationary points and classify them.


Exercise 7.2

Consider the function

\[ f(x, y) = x^{\alpha + 1} y^{\beta - 1}, \]

for $x, y > 0$ and some constants $\alpha$ and $\beta$. For what values of $\alpha$ and $\beta$ is this function convex? Sketch the region(s) in the $(\alpha, \beta)$-plane that correspond to these values of $\alpha$ and $\beta$.

Exercise 7.3

Suppose that a firm can sell its product in a domestic and a foreign market and that the inverse demand functions for these two markets are

\[ p_1 = 30 - 4q_1 \quad\text{and}\quad p_2 = 50 - 5q_2, \]

where $p_1$ and $p_2$ are the prices (in pounds) if it sells quantities $q_1$ and $q_2$ (in tonnes) in the domestic and foreign markets respectively. Given that the total cost function of the firm (in pounds) is

\[ TC(q) = 10 + 10q, \]

where $q$ is the quantity produced (in tonnes) and that the firm has a monopoly in both markets, find the quantities it should sell in these markets if it wants to maximise its profit. What are the corresponding prices? What is the maximum profit?

Exercise 7.4

Use the method of Lagrange multipliers to optimise the function

\[ f(x, y) = x^{3/8} y^{2/3}, \]

subject to the constraint $x^2 + y^2 = 25$ where $x, y > 0$.

By sketching the constraint and some contours of f , justify your use of the method of

Lagrange multipliers and determine whether the point you have found maximises or

minimises f subject to the constraint.

Exercise 7.5

Given an amount of capital, $k$, and labour, $l$, a firm produces a quantity of goods, $q(k, l)$, where

\[ q(k, l) = \ln k + \ln l, \]

for $k, l > 0$. Suppose that each unit of capital costs £2 and each unit of labour costs £3. Use the method of Lagrange multipliers to find the values of $k$ and $l$ that maximise the firm's production given that its total budget for capital and labour is £M.

Hence show that the maximum production the firm can achieve given a budget of £M is given by

\[ Q(M) = 2 \ln\left(\frac{M}{2\sqrt{6}}\right), \]

and verify that $Q'(M) = \lambda^*$ where $\lambda^*$ is the Lagrange multiplier.


Solutions to exercises

Solution to exercise 7.1

Given that

\[ f(x, y) = x^2 \ln y - y \ln y, \]

for $y > 0$ and all $x \in \mathbb{R}$, we see that the first-order partial derivatives of this function are

\[ f_x(x, y) = 2x \ln y \quad\text{and}\quad f_y(x, y) = \frac{x^2}{y} - (\ln y + 1), \]

where we have used the product rule when finding $f_y(x, y)$. At a stationary point, both of the first-order partial derivatives are zero, i.e. we must have $f_x(x, y) = 0$ and $f_y(x, y) = 0$. Thus, to find the stationary points we have to solve the simultaneous equations

\[ 2x \ln y = 0 \quad\text{and}\quad \frac{x^2}{y} - \ln y - 1 = 0. \]

If we start by looking at the first equation, this gives us

\[ x \ln y = 0 \implies x = 0 \text{ or } \ln y = 0 \implies x = 0 \text{ or } y = 1. \]

So, using the second equation, we see that if

$x = 0$ we must have

\[ 0 - \ln y - 1 = 0 \implies \ln y = -1 \implies y = e^{-1}, \]

$y = 1$ we must have

\[ \frac{x^2}{1} - \ln 1 - 1 = 0 \implies x^2 = 1 \implies x = \pm 1. \]

Consequently, the points $(0, e^{-1})$, $(1, 1)$ and $(-1, 1)$ are stationary points of this function.

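To classify these stationary points, we note that the second-order partial derivatives are

\[ f_{xx}(x, y) = 2 \ln y, \quad f_{xy}(x, y) = \frac{2x}{y} = f_{yx}(x, y) \quad\text{and}\quad f_{yy}(x, y) = -\frac{x^2}{y^2} - \frac{1}{y}, \]

and so the Hessian is

\[ H(x, y) = (2 \ln y)\left(-\frac{x^2}{y^2} - \frac{1}{y}\right) - \left(\frac{2x}{y}\right)^2 = -\frac{2(x^2 + y)\ln y + 4x^2}{y^2}. \]

Evaluating this at each of the stationary points we find that:

At $(0, e^{-1})$, we have

\[ H(0, e^{-1}) = -\frac{2 e^{-1} \ln(e^{-1})}{e^{-2}} = 2e > 0 \quad\text{and}\quad f_{xx}(0, e^{-1}) = 2 \ln(e^{-1}) = -2 < 0, \]

and so this is a local maximum.

At $(1, 1)$, the Hessian is

\[ H(1, 1) = -\frac{4}{1} < 0, \]

as $\ln 1 = 0$ and so this is a saddle point.

At $(-1, 1)$, the Hessian is

\[ H(-1, 1) = -\frac{4}{1} < 0, \]

as $\ln 1 = 0$ and so this is a saddle point.

Thus, the stationary points $(0, e^{-1})$, $(1, 1)$ and $(-1, 1)$ are a local maximum and two saddle points respectively.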

Solution to exercise 7.2

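We have, for $x, y > 0$, the function $f(x, y) = x^{\alpha + 1} y^{\beta - 1}$ whose first-order partial derivatives are

\[ f_x(x, y) = (\alpha + 1) x^{\alpha} y^{\beta - 1} \quad\text{and}\quad f_y(x, y) = (\beta - 1) x^{\alpha + 1} y^{\beta - 2}, \]

and so the second-order partial derivatives are

\[ f_{xx}(x, y) = \alpha(\alpha + 1) x^{\alpha - 1} y^{\beta - 1}, \quad f_{xy}(x, y) = (\alpha + 1)(\beta - 1) x^{\alpha} y^{\beta - 2} = f_{yx}(x, y) \quad\text{and}\quad f_{yy}(x, y) = (\beta - 1)(\beta - 2) x^{\alpha + 1} y^{\beta - 3}. \]

This means that the Hessian is

\[ H(x, y) = \left[\alpha(\alpha + 1)(\beta - 1)(\beta - 2) - (\alpha + 1)^2 (\beta - 1)^2\right] x^{2\alpha} y^{2(\beta - 2)}, \]

and, for $f(x, y)$ to be convex, we need $H(x, y) \ge 0$ and $f_{xx}(x, y) \ge 0$, i.e. we need

\[ \alpha(\alpha + 1) x^{\alpha - 1} y^{\beta - 1} \ge 0 \implies \alpha(\alpha + 1) \ge 0, \qquad (*) \]

as $x, y > 0$, and

\[ \left[\alpha(\alpha + 1)(\beta - 1)(\beta - 2) - (\alpha + 1)^2 (\beta - 1)^2\right] x^{2\alpha} y^{2(\beta - 2)} \ge 0, \]

which gives

\[ (\alpha + 1)(\beta - 1)\left[\alpha(\beta - 2) - (\alpha + 1)(\beta - 1)\right] \ge 0 \implies (\alpha + 1)(\beta - 1)(1 - \alpha - \beta) \ge 0, \qquad (**) \]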

either 1: so that 0 from ( ) which means that we have 0 and, from

(),


(),

1 and + 1 (see region B in Figure 7.8), or

1 and + 1 (see region C in Figure 7.8).


Solution to exercise 7.3

Here the firm is a monopoly and so, as it is the sole supplier of its product in both markets, when it supplies quantities $q_1$ and $q_2$ to the domestic and foreign markets respectively, the prices will be given by the inverse demand functions

\[ p_1 = 30 - 4q_1 \quad\text{and}\quad p_2 = 50 - 5q_2. \]

This means that its total revenue is

\[ TR(q_1, q_2) = p_1 q_1 + p_2 q_2 = (30 - 4q_1) q_1 + (50 - 5q_2) q_2, \]

and its total costs are given by

\[ TC(q) = 10 + 10q \implies TC(q_1, q_2) = 10 + 10(q_1 + q_2), \]

as $q = q_1 + q_2$. Thus, the profit function is

\[ \pi(q_1, q_2) = TR(q_1, q_2) - TC(q_1, q_2) = 20q_1 + 40q_2 - 4q_1^2 - 5q_2^2 - 10, \]

which is the function we need to maximise.

13 Note that the situation described here, where a producer charges different prices in different markets, is sometimes known as price discrimination.

To do this, we see that the first-order partial derivatives of $\pi(q_1, q_2)$ are

\[ \pi_{q_1}(q_1, q_2) = 20 - 8q_1 \quad\text{and}\quad \pi_{q_2}(q_1, q_2) = 40 - 10q_2, \]

and so, as a stationary point occurs when $\pi_{q_1}(q_1, q_2) = 0$ and $\pi_{q_2}(q_1, q_2) = 0$, we need to solve the simultaneous equations

\[ 20 - 8q_1 = 0 \quad\text{and}\quad 40 - 10q_2 = 0. \]

But, of course, the first equation gives $q_1 = 5/2$ and the second equation gives $q_2 = 4$ which means that $(5/2, 4)$ is the only stationary point of $\pi(q_1, q_2)$.

To check that this is a maximum, we look at the second-order partial derivatives of $\pi(q_1, q_2)$, which are

\[ \pi_{q_1 q_1}(q_1, q_2) = -8, \quad \pi_{q_1 q_2}(q_1, q_2) = 0 = \pi_{q_2 q_1}(q_1, q_2) \quad\text{and}\quad \pi_{q_2 q_2}(q_1, q_2) = -10, \]

and so the Hessian is

\[ H(q_1, q_2) = (-8)(-10) - 0^2 = 80. \]

Clearly, at $(5/2, 4)$, we have $H(5/2, 4) > 0$ and $\pi_{q_1 q_1}(5/2, 4) < 0$ which means that the stationary point we have found is indeed a local maximum. Consequently, to maximise its profit, the firm should supply 5/2 tonnes of its product to the domestic market and 4 tonnes of its product to the foreign market so that it can sell them at prices, in pounds, of

\[ p_1 = 30 - 4\left(\frac{5}{2}\right) = 20 \quad\text{and}\quad p_2 = 50 - 5(4) = 30, \]

respectively and, in doing so, the firm will make a maximum profit of

\[ \pi(5/2, 4) = 20\left(\frac{5}{2}\right) + 40(4) - 4\left(\frac{5}{2}\right)^2 - 5(4)^2 - 10 = 95, \]

pounds.
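The numbers in this solution are easy to confirm with exact arithmetic. In the Python sketch below, the function name and the grid of alternative quantities are illustrative assumptions; the revenue and cost expressions are the ones given in the exercise.

```python
from fractions import Fraction

def profit(q1, q2):
    """Profit = total revenue - total cost for the two-market monopolist."""
    revenue = (30 - 4 * q1) * q1 + (50 - 5 * q2) * q2
    total_cost = 10 + 10 * (q1 + q2)
    return revenue - total_cost

# Stationary point found in the solution.
q1_star, q2_star = Fraction(5, 2), Fraction(4)

# Prices from the inverse demand functions.
p1 = 30 - 4 * q1_star
p2 = 50 - 5 * q2_star
max_profit = profit(q1_star, q2_star)

# Compare with a grid of other quantity pairs (0 to 10 tonnes, step 1/4).
grid = [Fraction(i, 4) for i in range(0, 41)]
grid_best = max(profit(a, b) for a in grid for b in grid)

print(p1, p2, max_profit, grid_best)
```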

Solution to exercise 7.4

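Writing the constraint in the form $x^2 + y^2 - 25 = 0$, we get the Lagrangean

\[ L(x, y, \lambda) = x^{3/8} y^{2/3} - \lambda(x^2 + y^2 - 25), \]

and we seek the points which simultaneously satisfy the equations $L_x(x, y, \lambda) = 0$, $L_y(x, y, \lambda) = 0$ and $L_\lambda(x, y, \lambda) = 0$. So we find the first-order partial derivatives of $L(x, y, \lambda)$, i.e.

\[ L_x(x, y, \lambda) = \frac{3}{8} x^{-5/8} y^{2/3} - 2\lambda x, \quad L_y(x, y, \lambda) = \frac{2}{3} x^{3/8} y^{-1/3} - 2\lambda y \quad\text{and}\quad L_\lambda(x, y, \lambda) = -(x^2 + y^2 - 25), \]

and set these equal to zero to yield the equations

\[ \frac{3}{8} x^{-5/8} y^{2/3} - 2\lambda x = 0, \quad \frac{2}{3} x^{3/8} y^{-1/3} - 2\lambda y = 0 \quad\text{and}\quad x^2 + y^2 - 25 = 0. \]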

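We now solve these by eliminating $\lambda$ from the first two equations, i.e. we get

\[ \frac{3}{8} x^{-5/8} y^{2/3} - 2\lambda x = 0 \implies \lambda = \frac{3}{16} \cdot \frac{y^{2/3}}{x^{13/8}}, \]

from the first equation and

\[ \frac{2}{3} x^{3/8} y^{-1/3} - 2\lambda y = 0 \implies \lambda = \frac{1}{3} \cdot \frac{x^{3/8}}{y^{4/3}}, \]

from the second equation. As such, we can equate these expressions for $\lambda$ to get

\[ \frac{3}{16} \cdot \frac{y^{2/3}}{x^{13/8}} = \frac{1}{3} \cdot \frac{x^{3/8}}{y^{4/3}} \implies y^2 = \frac{16}{9} x^2. \]

We then use this new relationship between $x$ and $y$ in the third equation, which is just the constraint $x^2 + y^2 = 25$, to get

\[ x^2 + \frac{16}{9} x^2 = 25 \implies \frac{25}{9} x^2 = 25 \implies x^2 = 9 \implies x = 3, \]

as $x > 0$, and then

\[ y^2 = \frac{16}{9}(3^2) = 16 \implies y = 4, \]

as $y > 0$. Thus, the point we seek is $(3, 4)$.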

The constraint is $x^2 + y^2 = 25$ and this is a circle of radius five centred on the origin which, for $x, y > 0$, is illustrated in Figure 7.9(a). The objective function, $f(x, y) = x^{3/8} y^{2/3}$, has contours $f(x, y) = c$, where $c$ is a constant, that look a bit like rectangular hyperbolae as illustrated in Figure 7.9(b). The direction in which $f(x, y)$ is increasing is indicated in this figure along with the point we found above using the Lagrange multiplier method, i.e. a point where we have a contour of $f(x, y)$ which is both tangential to the constraint and touching the constraint. Having seen this, it should be clear that this point will maximise $f$ subject to the constraint.
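The claim that the point (3, 4) maximises f on the constraint can also be checked numerically by walking around the quarter circle. The Python sketch below is an illustration only: the parametrisation x = 5 cos t, y = 5 sin t and the number of sample angles are choices made here, not part of the exercise.

```python
import math

def f(x, y):
    """Objective function f(x, y) = x^(3/8) * y^(2/3)."""
    return x ** (3 / 8) * y ** (2 / 3)

def on_circle(t, radius=5.0):
    """Point on the constraint x^2 + y^2 = 25 at angle t."""
    return radius * math.cos(t), radius * math.sin(t)

# Sample angles strictly between 0 and pi/2, so that x, y > 0.
n = 20_000
angles = [(math.pi / 2) * i / n for i in range(1, n)]

# Pick the sampled point on the constraint with the largest value of f.
best_t = max(angles, key=lambda t: f(*on_circle(t)))
best_x, best_y = on_circle(best_t)

print(round(best_x, 3), round(best_y, 3), f(3, 4))
```

The best sampled point should agree with (3, 4) up to the grid resolution, and no sampled point should give a value of f above f(3, 4).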

Figure 7.9: The sketches for Exercise 7.4. (a) The constraint $x^2 + y^2 = 25$ for $x, y > 0$. (b) Adding three contours, $f(x, y) = c$, where the direction in which $f(x, y)$ is increasing is as indicated. Clearly, we are interested in the point $(3, 4)$ which is indicated in the figure.

Solution to exercise 7.5

The firm has £M to spend on capital and labour where each unit of capital costs £2 and each unit of labour costs £3. As such, the cost of using $k$ units of capital and $l$ units of labour is $2k + 3l$ and this gives us the constraint $2k + 3l = M$.14 So, to maximise the quantity

\[ q(k, l) = \ln k + \ln l, \]

that the firm can produce subject to the constraint $2k + 3l = M$ where $k, l > 0$ we use the Lagrangean

\[ L(k, l, \lambda) = \ln(k) + \ln(l) - \lambda(2k + 3l - M). \]

14 Strictly, the constraint is $2k + 3l \le M$ where $k, l > 0$, but we can see that if we chose a point where $2k + 3l < M$, we could not maximise the quantity produced since, spending more on capital and labour to get a point where $2k + 3l = M$, we would get a larger quantity. This should make sense if you consider the discussion of budget constraints in Section 7.3.4.

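We seek the points which simultaneously satisfy the equations $L_k(k, l, \lambda) = 0$, $L_l(k, l, \lambda) = 0$ and $L_\lambda(k, l, \lambda) = 0$. The first-order derivatives of $L(k, l, \lambda)$ are

\[ L_k(k, l, \lambda) = \frac{1}{k} - 2\lambda, \quad L_l(k, l, \lambda) = \frac{1}{l} - 3\lambda \quad\text{and}\quad L_\lambda(k, l, \lambda) = -(2k + 3l - M), \]

and we set these equal to zero to yield the equations

\[ \frac{1}{k} - 2\lambda = 0, \quad \frac{1}{l} - 3\lambda = 0 \quad\text{and}\quad 2k + 3l - M = 0. \]

We now solve these by eliminating $\lambda$ from the first two equations, i.e. we get

\[ \lambda = \frac{1}{2k} = \frac{1}{3l} \implies 3l = 2k \implies k = \frac{3}{2} l. \]

We then use this new relationship between $k$ and $l$ in the third equation, which is just the constraint $2k + 3l = M$, to get

\[ 2\left(\frac{3}{2} l\right) + 3l = M \implies 6l = M \implies l = \frac{M}{6}, \]

and so

\[ k = \frac{3}{2} \cdot \frac{M}{6} = \frac{M}{4}. \]

Thus the values of $k$ and $l$ that maximise $q(k, l)$ subject to the constraint are $k = M/4$ and $l = M/6$.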

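In this case, the maximum production achievable, given a budget of £M, is

\[ Q(M) = q\left(\frac{M}{4}, \frac{M}{6}\right) = \ln\left(\frac{M}{4}\right) + \ln\left(\frac{M}{6}\right) = \ln\left(\frac{M^2}{24}\right) = 2 \ln\left(\frac{M}{2\sqrt{6}}\right), \]

as required. Further, we can find the value of $\lambda^*$ using, say, the equation

\[ \lambda^* = \frac{1}{2k} = \frac{1}{2} \cdot \frac{4}{M} = \frac{2}{M}, \]

and, as

\[ Q(M) = 2 \ln\left(\frac{M}{2\sqrt{6}}\right), \]

can be written as

\[ Q(M) = 2 \ln M - 2 \ln\left(2\sqrt{6}\right) \implies Q'(M) = \frac{2}{M}, \]

we can see that $Q'(M) = \lambda^*$, as expected.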

Note: Although this question is similar to what we saw in Example 7.17, notice that

here we are maximising production subject to a budget constraint whereas in

Example 7.17 we were minimising costs subject to a production constraint. In

particular, this means that you should always read the question carefully to ensure that

you are using the correct objective function and constraint! Further, we were not asked

to justify the assertion that the optimal point we found was a maximum here and so we

haven't, but sometimes, as in Exercise 7.4, we will be asked to provide such a

justification.
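For readers who want to confirm the result of Exercise 7.5 numerically, here is a minimal Python sketch. The function names and the test budget M = 12 are assumptions made for illustration. It checks that k = M/4 and l = M/6 spend the whole budget, that the production at this point matches the closed form 2 ln(M/(2√6)), and that a finite-difference estimate of Q'(M) agrees with the multiplier 2/M.

```python
import math

def optimal_inputs(M):
    """Inputs found in the solution: k = M/4 and l = M/6."""
    return M / 4, M / 6

def production(k, l):
    """Production function q(k, l) = ln k + ln l."""
    return math.log(k) + math.log(l)

def max_production(M):
    """Q(M): production at the optimal inputs."""
    return production(*optimal_inputs(M))

def multiplier(M):
    """lambda* = 1/(2k), from the first first-order condition."""
    k, _ = optimal_inputs(M)
    return 1 / (2 * k)

M = 12.0  # illustrative budget (an assumption for this check)
k_star, l_star = optimal_inputs(M)

# Closed form quoted in the exercise.
closed_form = 2 * math.log(M / (2 * math.sqrt(6)))

# Central-difference estimate of Q'(M), to compare with lambda*.
h = 1e-5
dQ_dM = (max_production(M + h) - max_production(M - h)) / (2 * h)

print(2 * k_star + 3 * l_star, max_production(M), closed_form)
print(dQ_dM, multiplier(M))
```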


Chapter 8

Differential equations

Essential reading

(For full publication details, see Chapter 1.)

Binmore and Davies (2002) Sections 12.1–12.4 and 12.7–12.8.

Anthony and Biggs (1996) Chapters 27 and 28.

Further reading

Simon and Blume (1994) Sections 24.1–24.3 and Section 25.3.

Adams and Essex (2010) Sections 3.7 and 7.9, parts of Sections 17.1–17.2, 17.4–17.6.

Aims and objectives

To see different types of differential equation and solve them using the given

methods.

To use differential equations to solve problems from economics-based subjects.

Specific learning outcomes can be found near the end of this chapter.

8.1

unknown function. In this course, we will be concerned with ordinary differential

equations (or ODEs), i.e. those which involve functions of only one independent

variable.1 It is often convenient to classify ODEs according to how the highest-order derivative they contain appears in them. That is, we say that the

order of an ODE is given by the order of the highest-order derivative it contains.

degree of an ODE is given by the power of the highest-order derivative it contains.

On the whole, we will be concerned with ODEs which are first or second-order and of the first degree.

1 If the differential equation involves a function with more than one independent variable, then it would contain at least one partial derivative of the function and we would have a partial differential equation.

Activity 8.1 Determine the order and degree of the following ODEs involving the

unknown function, y(x).

2

(a)

dy

dx

(b)

dy

=x

dx

(c)

d3 y

=x

dx3

=x

d2 y

.

dx2

d2 y

dx2

d2 y

dx2

.

2

Given an ODE, we usually want to solve it. That is, we want to find the unknown

function in a form which does not involve any derivatives, and when we have found the

function in this form we call it a solution to the ODE. In general, we will find that any

given ODE has many solutions and so we get a general solution, i.e. we find the

unknown function up to some arbitrary constants that are not determined by the ODE

itself. Let's look at a very simple example of an ODE (i.e. one that can be solved by direct integration) to see how things work.


Example 8.1 Consider the ODE

\[ \frac{dy}{dx} = 2x + 1. \]

This is a first-order ODE of degree one and it is very easy to solve because we can just integrate both sides to see that

\[ \int \frac{dy}{dx}\, dx = \int (2x + 1)\, dx \implies y = x^2 + x + c, \]

where $c$ is an arbitrary constant. Then, as the independent variable is $x$ and the dependent variable is $y$, the unknown function here is $y(x)$, so this gives us

\[ y(x) = x^2 + x + c. \]

We call this the general solution to the ODE as any solution to the ODE will have this form and each of these solutions arises from a different value of the arbitrary constant, $c$.

In addition to an ODE, we may also be given conditions which give us extra

information about the function we are interested in. Given this information, we can find

a particular solution, i.e. a solution to the ODE that also satisfies the given conditions.


Example 8.2 Find the solution to the ODE in Example 8.1 that also satisfies the condition $y(0) = 1$.

We know that all solutions to the ODE in the previous example have the form

\[ y(x) = x^2 + x + c. \]

If, in addition, we want a solution that satisfies the condition $y(0) = 1$, we can set $x = 0$ in both sides of this expression and use the condition to get

\[ y(0) = 0^2 + 0 + c \implies 1 = c. \]

That is, if we want to satisfy the condition $y(0) = 1$ as well, we must take $c = 1$ in the general solution. Consequently,

\[ y(x) = x^2 + x + 1, \]

is the particular solution to the ODE given that $y(0) = 1$.

Of course, it should be clear from this example that, when we apply different conditions

to the general solution, we can get different values of c and hence different particular

solutions.

Activity 8.2 Find the particular solutions to the ODE in Example 8.1 that also

satisfy the conditions (a) y(0) = 0, (b) y(0) = 1 and (c) y(2) = 7.

Indeed, we solved simple ODEs that looked like this when we considered marginal

functions in Section 5.4.1. Further, as the following example shows, we can also solve

simple higher-order ODEs by direct integration.

Example 8.3 Consider the ODE

\[ \frac{d^2 y}{dx^2} = 6x + 2. \]

This is a second-order ODE of degree one and, once again, we can begin to solve it by integrating both sides to see that

\[ \int \frac{d^2 y}{dx^2}\, dx = \int (6x + 2)\, dx \implies \frac{dy}{dx} = 3x^2 + 2x + c, \]

but this does not give us a solution as we still have a derivative in our expression. However, if we integrate both sides again, we get

\[ \int \frac{dy}{dx}\, dx = \int (3x^2 + 2x + c)\, dx \implies y = x^3 + x^2 + cx + d, \]

where $c$ and $d$ are arbitrary constants. Then, as the independent variable is $x$ and the dependent variable is $y$, the unknown function here is $y(x)$, so this gives us

\[ y(x) = x^3 + x^2 + cx + d. \]

This is the general solution to the ODE as any solution to the ODE will have this form and each of these solutions arises from different values of the arbitrary constants, $c$ and $d$.


Of course, if we find that there are several arbitrary constants in the general solution of

an ODE, such as c and d in the general solution to the second-order ODE in

Example 8.3, we will need more conditions in order to determine these constants and

hence find a particular solution.

Example 8.4 Find the solution to the ODE in Example 8.3 that also satisfies the conditions $y(0) = 1$ and $y'(0) = 2$.

We know that all solutions to the ODE in the previous example have the form

\[ y(x) = x^3 + x^2 + cx + d. \]

If, in addition, we want a solution that satisfies the condition $y(0) = 1$, we can set $x = 0$ in both sides of this expression and use the condition to get

\[ y(0) = 0^3 + 0^2 + c(0) + d \implies 1 = d. \]

Also, differentiating the general solution, we have

\[ y'(x) = 3x^2 + 2x + c, \]

and so, if we want a solution that satisfies the condition $y'(0) = 2$, we can set $x = 0$ in both sides of this expression and use the condition to get

\[ y'(0) = 3(0^2) + 2(0) + c \implies 2 = c. \]

Consequently,

\[ y(x) = x^3 + x^2 + 2x + 1, \]

is the particular solution to the ODE given that $y(0) = 1$ and $y'(0) = 2$.
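A particular solution can always be checked by substituting it back into the ODE and the conditions. The Python sketch below does this for the cubic found in Example 8.4; representing a polynomial as a list of coefficients, and the helper names `derivative` and `evaluate`, are illustrative choices made here.

```python
def derivative(coeffs):
    """Differentiate a polynomial given as [a0, a1, a2, ...]."""
    return [n * a for n, a in enumerate(coeffs)][1:]

def evaluate(coeffs, x):
    """Evaluate a polynomial given as [a0, a1, a2, ...] at x."""
    return sum(a * x**n for n, a in enumerate(coeffs))

# Particular solution y(x) = 1 + 2x + x^2 + x^3, constant term first.
y = [1, 2, 1, 1]
y1 = derivative(y)   # coefficients of y'(x)
y2 = derivative(y1)  # coefficients of y''(x)

# The ODE requires y''(x) = 2 + 6x; the conditions are y(0) = 1, y'(0) = 2.
print(y2, evaluate(y, 0), evaluate(y1, 0))
```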

More generally, we won't be able to solve ODEs by direct integration and so the procedure for solving an ODE will usually involve identifying its type and applying the relevant method. In what follows, we shall see how the form of an ODE allows us to choose the method that will enable us to solve it in cases where direct integration can't be used.

8.2 First-order ODEs

In this section we will consider some methods that will allow us to solve certain first-order ODEs of degree one. That is, certain ODEs that have the form

\[ \frac{dy}{dx} = f(x, y), \]

where $f(x, y)$ is some given function of the independent variable, $x$, and the dependent variable, $y$.


8.2.1

An ODE that can be written in the form

\[ M(x) = N(y) \frac{dy}{dx}, \]

is called a separable ODE. This is because, in such cases, we have been able to separate the variables so that all occurrences of $x$ occur on the left-hand-side and all occurrences of $y$ occur on the right-hand-side. ODEs of this type can be solved by integrating both sides to get

\[ \int M(x)\, dx = \int N(y) \frac{dy}{dx}\, dx \implies \int M(x)\, dx = \int N(y)\, dy, \]

using the integration by substitution formula from Section 5.2.3. If we now determine these integrals, we will find the general solution to the ODE.

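Example 8.5 Find the general solution of the ODE

\[ \frac{dy}{dx} = 2x(y - 1). \]

This is a separable ODE since it can be written as

\[ 2x = \frac{1}{y - 1}\frac{dy}{dx}, \]

with $M(x) = 2x$ and $N(y) = (y - 1)^{-1}$. Using the method described above, we write this as

\[ \int 2x\,dx = \int \frac{dy}{y - 1} \implies x^2 + c = \ln|y - 1|, \]

where $c$ is an arbitrary constant. Taking exponentials of both sides, this gives us

\[ |y - 1| = e^{x^2 + c} = e^c e^{x^2}. \]

Now, both sides of this expression are non-negative because of the modulus on the left-hand-side and the exponentials on the right-hand-side. This means that, if we want to remove the modulus, we must allow the possibility that the right-hand-side can give us a negative quantity, i.e. we have

\[ y - 1 = \pm e^c e^{x^2} \implies y = 1 \pm e^c e^{x^2}. \]

Then, as the independent variable is $x$ and the dependent variable is $y$, the unknown function here is $y(x)$, so this gives us the general solution

\[ y(x) = 1 + A e^{x^2}, \]

where $A \in \mathbb{R}$ is an arbitrary constant.[2]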
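As a quick sanity check on Example 8.5, the following Python sketch estimates $dy/dx$ with a central difference and confirms that $y(x) = 1 + A e^{x^2}$ satisfies $dy/dx = 2x(y - 1)$ for several values of $A$, including negative values and zero. The function names, the step size `h` and the sample points are illustrative choices and are not part of the guide.

```python
import math

def y(x, A):
    """General solution from Example 8.5: y(x) = 1 + A*exp(x^2)."""
    return 1.0 + A * math.exp(x * x)

def residual(x, A, h=1e-6):
    """Return dy/dx - 2x(y - 1), with dy/dx estimated by a central difference."""
    dydx = (y(x + h, A) - y(x - h, A)) / (2.0 * h)
    return dydx - 2.0 * x * (y(x, A) - 1.0)

# The residual should vanish (up to rounding) whatever the sign of A.
worst = max(abs(residual(x, A))
            for A in (-2.0, 0.0, 0.5, 3.0)
            for x in (-1.0, 0.0, 0.7, 1.5))
print("largest residual:", worst)
```

A residual that is small for every sampled $A$ is consistent with $A$ being a genuinely arbitrary real constant, which is the point made when the modulus is removed.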

Of course, having found the general solution to the ODE in this example, we can also

find particular solutions if we are given some conditions.

[2] Here we have replaced $\pm e^c$ with a new constant $A \in \mathbb{R}$ which can take any value.


Activity 8.3 Find the particular solutions to the ODE in Example 8.5 given the

conditions (a) y(0) = 2 and (b) y(0) = 0.

What value of y(1) will give the same particular solution as the one you found in

(a)?

8.2.2

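A first-order ODE that can be written in the form

\[ \frac{dy}{dx} + P(x)y = Q(x), \]

is called a linear ODE. The procedure for solving such an ODE involves finding an integrating factor, $\mu(x)$, given by

\[ \mu(x) = e^{\int P(x)\,dx}, \]

where, here, $\int P(x)\,dx$ is just any antiderivative of $P(x)$. Once we have this, we multiply both sides of the ODE by the integrating factor to get

\[ \mu(x)\frac{dy}{dx} + \mu(x)P(x)y = \mu(x)Q(x). \tag{8.1} \]

Now, observe that

\[ \frac{d\mu}{dx} = \frac{d}{dx}\left(e^{\int P(x)\,dx}\right) = e^{\int P(x)\,dx}\,\frac{d}{dx}\left(\int P(x)\,dx\right) = \mu(x)P(x), \]

if we use the chain rule[3] and so, using the product rule, we have

\[ \frac{d}{dx}\bigl[\mu(x)y(x)\bigr] = \mu(x)\frac{dy}{dx} + \frac{d\mu}{dx}y(x) = \mu(x)\frac{dy}{dx} + \mu(x)P(x)y(x), \]

which is the left-hand-side of (8.1). This means that (8.1) can be written as

\[ \frac{d}{dx}\bigl[\mu(x)y(x)\bigr] = \mu(x)Q(x) \implies \mu(x)y(x) = \int \mu(x)Q(x)\,dx, \]

and if we determine the integral on the right-hand-side, we can then find the general solution to the ODE.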

[3] Here we have used the fact that

\[ \frac{d}{dx}\int P(x)\,dx = P(x). \]

To see why, note that if $c$ is an arbitrary constant and $F(x)$ is an antiderivative of $P(x)$, i.e. $F'(x) = P(x)$, we have

\[ \int P(x)\,dx = F(x) + c \implies \frac{d}{dx}\int P(x)\,dx = \frac{d}{dx}\bigl[F(x) + c\bigr] = F'(x) = P(x), \]

as expected.

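Example 8.6 Find the general solution of the ODE

\[ x\frac{dy}{dx} - 2y = 6. \]

Dividing through by $x$, this can be written as

\[ \frac{dy}{dx} - \frac{2}{x}y = \frac{6}{x}, \]

which is linear with $P(x) = -2/x$ and $Q(x) = 6/x$. Using the method above, we start by finding the integrating factor, $\mu(x)$, by determining the integral

\[ \int P(x)\,dx = -\int \frac{2}{x}\,dx = -2\ln|x| + c, \]

and, as we only need one antiderivative, we can take $c = 0$ so that the integrating factor is

\[ \mu(x) = e^{-2\ln|x|} = e^{\ln x^{-2}} = x^{-2}, \]

and so we have

\[ \mu(x)y(x) = \int \mu(x)Q(x)\,dx \implies x^{-2}y(x) = \int 6x^{-3}\,dx \implies x^{-2}y(x) = -3x^{-2} + c. \]

Thus, multiplying both sides by $x^2$, we see that

\[ y(x) = -3 + cx^2, \]

is the general solution to our linear ODE.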
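The result of Example 8.6 can be checked by direct substitution. The sketch below takes the ODE to be $x\,dy/dx - 2y = 6$, with integrating factor $\mu(x) = x^{-2}$ and candidate solution $y(x) = -3 + cx^2$, and confirms both the ODE and the intermediate identity $\mu(x)y(x) = -3x^{-2} + c$ at a few sample points. The function names and sample values are illustrative assumptions.

```python
def mu(x):
    """Integrating factor for dy/dx - (2/x) y = 6/x, i.e. mu(x) = x^(-2)."""
    return x ** -2

def y(x, c):
    """Candidate general solution y(x) = -3 + c x^2."""
    return -3.0 + c * x * x

def dydx(x, c):
    """Exact derivative of the candidate solution."""
    return 2.0 * c * x

def ode_lhs(x, c):
    """Left-hand side x*dy/dx - 2y, which should equal 6."""
    return x * dydx(x, c) - 2.0 * y(x, c)

for c in (-1.0, 0.0, 2.5):
    for x in (0.5, 1.0, 3.0):
        assert abs(ode_lhs(x, c) - 6.0) < 1e-12
        # mu(x) * y(x) should agree with the integral -3 x^(-2) + c.
        assert abs(mu(x) * y(x, c) - (-3.0 * x ** -2 + c)) < 1e-12
print("Example 8.6 solution verified at the sample points")
```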

Activity 8.4 The ODE in Example 8.6 can also be written in the separable form

\[ \frac{2}{x} = \frac{1}{y + 3}\frac{dy}{dx}. \]

Verify that the answer we found in that example is correct by solving this separable ODE using the method in Section 8.2.1.

Let's now consider another example where the ODE is linear, but not separable.

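Example 8.7 Find the general solution of the ODE

\[ \frac{dy}{dx} = y + e^x. \]

This can be written as

\[ \frac{dy}{dx} - y = e^x, \]

which is linear with $P(x) = -1$ and $Q(x) = e^x$. Using the method above, we start by finding the integrating factor, $\mu(x)$, by determining the integral

\[ \int P(x)\,dx = -\int 1\,dx = -x + c, \]

and, as we only need one antiderivative, we can take $c = 0$ so that the integrating factor is

\[ \mu(x) = e^{-x}, \]

and so we have

\[ \mu(x)y(x) = \int \mu(x)Q(x)\,dx \implies e^{-x}y(x) = \int dx \implies e^{-x}y(x) = x + c. \]

Thus, multiplying both sides by $e^x$, we see that

\[ y(x) = (x + c)e^x, \]

is the general solution to our linear ODE.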
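A different kind of check on Example 8.7 is to integrate $dy/dx = y + e^x$ numerically and compare with the closed form $y(x) = (x + c)e^x$, where $y(0) = c$. The sketch below uses the classical fourth-order Runge-Kutta scheme, which is not a method from this guide; it is swapped in purely as an independent numerical cross-check, and the names `rk4`, `f` and `exact` are illustrative.

```python
import math

def f(x, y):
    """Right-hand side of the ODE dy/dx = y + e^x."""
    return y + math.exp(x)

def rk4(f, x0, y0, x1, n=1000):
    """Integrate dy/dx = f(x, y) from x0 to x1 in n classical RK4 steps."""
    h = (x1 - x0) / n
    x, y = x0, y0
    for _ in range(n):
        k1 = f(x, y)
        k2 = f(x + h / 2, y + h * k1 / 2)
        k3 = f(x + h / 2, y + h * k2 / 2)
        k4 = f(x + h, y + h * k3)
        y += h * (k1 + 2 * k2 + 2 * k3 + k4) / 6
        x += h
    return y

def exact(x, c):
    """General solution y(x) = (x + c) e^x, for which y(0) = c."""
    return (x + c) * math.exp(x)

c = 2.0
print(rk4(f, 0.0, c, 1.0), exact(1.0, c))
```

The two printed values should agree to many decimal places, which supports the general solution without relying on the integrating-factor algebra.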

Activity 8.5

8.2.3

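Recall, from Section 6.3.4, that a function, $f(x, y)$, is homogeneous of degree $r$ if

\[ f(\lambda x, \lambda y) = \lambda^r f(x, y). \]

Using this, we say that a first-order ODE of the form

\[ M(x, y) + N(x, y)\frac{dy}{dx} = 0, \]

is homogeneous if $M(x, y)$ and $N(x, y)$ are both homogeneous functions of the same degree, $n$. The procedure for solving such an ODE involves making the substitution $y = xv(x)$ to separate the variables $v$ and $x$ so that we can solve it using the method in Section 8.2.1.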

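Example 8.8 Find the general solution of the ODE

\[ xy + y^2 - xy\frac{dy}{dx} = 0. \]

This is a homogeneous ODE since we have

\[ M(x, y) = xy + y^2 \quad\text{and}\quad N(x, y) = -xy, \]

and, clearly, they are both homogeneous of degree 2. As such, we introduce a new function, $v(x)$, such that

\[ y(x) = xv(x) \implies \frac{dy}{dx} = v(x) + x\frac{dv}{dx}, \]

using the product rule, and substituting these into the ODE gives

\[ x^2 v + v^2 x^2 - x^2 v\left(v + x\frac{dv}{dx}\right) = 0. \]

Cancelling common factors and simplifying this then becomes the separable ODE

\[ \frac{dv}{dx} = \frac{1}{x}, \]

which we solve using the method in Section 8.2.1, i.e.

\[ \int dv = \int \frac{dx}{x} \implies v(x) = \ln|x| + c, \]

where $c$ is an arbitrary constant. Consequently, using $y(x) = xv(x)$, we find that

\[ y(x) = x(\ln|x| + c), \]

is the general solution to our homogeneous ODE.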

Activity 8.6 The ODE in Example 8.8 can also be written in the linear form

\[ \frac{dy}{dx} - \frac{y}{x} = 1. \]

Verify that the answer we found in that example is correct by solving this linear first-order ODE using the method in Section 8.2.2.

Let's now consider another example where the ODE is homogeneous, but not linear.

Example 8.9 Find the general solution of the ODE

\[ x^4 + 5y^4 - 4xy^3\frac{dy}{dx} = 0. \]

This is a homogeneous ODE since we have

\[ M(x, y) = x^4 + 5y^4 \quad\text{and}\quad N(x, y) = -4xy^3, \]

and, clearly, they are both homogeneous of degree 4. As such, we introduce a new function, $v(x)$, such that

\[ y(x) = xv(x) \implies \frac{dy}{dx} = v(x) + x\frac{dv}{dx}, \]

using the product rule, and substituting these into the ODE gives

\[ x^4 + 5x^4 v^4 - 4x^4 v^3\left(v + x\frac{dv}{dx}\right) = 0. \]

Cancelling common factors and simplifying this then becomes the separable ODE

\[ \frac{dv}{dx} = \frac{1 + v^4}{4xv^3}, \]

which we solve using the method in Section 8.2.1, i.e.

\[ \int \frac{4v^3}{1 + v^4}\,dv = \int \frac{dx}{x} \implies \ln|1 + v^4| = \ln|x| + c, \]

where $c$ is an arbitrary constant. So, taking exponentials of both sides, this gives us

\[ |1 + v^4| = e^{\ln|x| + c} = e^c e^{\ln|x|} = e^c|x|, \]

so that removing the modulus signs and replacing the arbitrary constant $e^c > 0$ with $A \in \mathbb{R}$, we get

\[ v^4 + 1 = Ax \implies v = (Ax - 1)^{1/4}, \]

for some arbitrary constant, $A$. Consequently, using $y(x) = xv(x)$, we find that

\[ y(x) = x(Ax - 1)^{1/4}, \]

is the general solution to our homogeneous ODE.
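The solutions found in Examples 8.8 and 8.9 can be checked by substituting them back into the original homogeneous ODEs. The sketch below assumes the ODEs are $xy + y^2 - xy\,y' = 0$ and $x^4 + 5y^4 - 4xy^3\,y' = 0$, estimates $y'$ with a central difference, and evaluates each left-hand side. The helper names and sample points are illustrative, and the second check only uses points where $Ax > 1$ so that the fourth root is real.

```python
import math

def ddx(func, x, h=1e-6):
    """Central-difference estimate of the derivative of func at x."""
    return (func(x + h) - func(x - h)) / (2.0 * h)

def residual_8_8(x, c):
    """xy + y^2 - xy y' for y(x) = x (ln|x| + c)."""
    y = lambda t: t * (math.log(abs(t)) + c)
    return x * y(x) + y(x) ** 2 - x * y(x) * ddx(y, x)

def residual_8_9(x, A):
    """x^4 + 5y^4 - 4xy^3 y' for y(x) = x (Ax - 1)^(1/4), valid when Ax > 1."""
    y = lambda t: t * (A * t - 1.0) ** 0.25
    return x ** 4 + 5.0 * y(x) ** 4 - 4.0 * x * y(x) ** 3 * ddx(y, x)

print(residual_8_8(2.0, 1.0), residual_8_9(1.5, 2.0))
```

Both printed values should be zero up to finite-difference error.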

Activity 8.7

Homogeneous ODEs are not the only examples of ODEs that can be solved using the

methods above after some judicious substitution. In this course, if a novel substitution

is needed to make a given ODE solvable, it will usually be given. See, for example,

Exercise 8.2.

8.3 Second-order ODEs

In this section we will consider some methods that will allow us to solve certain

second-order ODEs where all occurrences of y and its derivatives are of degree one. In

particular, we will be concerned with such ODEs that have the form

\[ a\frac{d^2y}{dx^2} + b\frac{dy}{dx} + cy = f(x), \]

where $a$, $b$ and $c$ are constants and $f(x)$ is some given function of the independent variable, $x$. ODEs of this form are often said to have constant coefficients, referring to the constants multiplying $y$ and its derivatives on the left-hand-side. The method for solving such second-order ODEs is as follows.

8.3.1

If the function, $f(x)$, on the right-hand-side of our second-order ODE with constant coefficients is zero, i.e. if our ODE has the form

\[ a\frac{d^2y}{dx^2} + b\frac{dy}{dx} + cy = 0, \]

we say that it is homogeneous.[4] To solve such an ODE, let's suppose that any solution must have the form

\[ y(x) = A e^{kx}, \tag{8.2} \]

where $A$ is an arbitrary constant and $k$ is a constant that has to be determined. Differentiating this twice, we find that

\[ \frac{dy}{dx} = Ak e^{kx} \quad\text{and}\quad \frac{d^2y}{dx^2} = Ak^2 e^{kx}, \]

and so, substituting these into the ODE, we get

\[ a(Ak^2 e^{kx}) + b(Ak e^{kx}) + c(A e^{kx}) = 0. \]

Now, we can cancel the $A$ as it is arbitrary and the $e^{kx}$ as it is always non-zero, which leaves us with the auxiliary equation

\[ ak^2 + bk + c = 0. \]

[4] Note that this is a different use of the word 'homogeneous' to the one in Sections 6.3.4 and 8.2.3. That is, this is an homogeneous equation whereas in Section 6.3.4 we had homogeneous functions and in Section 8.2.3 we had an ODE which was made up from two such functions in a certain way.

If we solve the auxiliary equation, we can determine the values of $k$ in (8.2) that yield solutions. Of course, when solving a quadratic equation such as this, there are three different cases that can arise, i.e. we can get:

Two real solutions: If the solutions are $k = \alpha$ and $k = \beta$, then we get solutions of the form

\[ y(x) = A e^{\alpha x} \quad\text{and}\quad y(x) = B e^{\beta x}, \]

where $A$ and $B$ are arbitrary constants. As such, we find that

\[ y(x) = A e^{\alpha x} + B e^{\beta x}, \]

is the general solution of the second-order ODE.

One real solution: If the solution is $k = \alpha$ (twice), then we get solutions of the form

\[ y(x) = A e^{\alpha x} \quad\text{and}\quad y(x) = Bx e^{\alpha x}, \]

where $A$ and $B$ are arbitrary constants. As such, we find that

\[ y(x) = (A + Bx) e^{\alpha x}, \]

is the general solution of the second-order ODE.

No real solutions: If the solutions are $k = \alpha \pm \beta\sqrt{-1}$, then, using some theory that is beyond the scope of this course,[5] we find that

\[ y(x) = e^{\alpha x}\bigl[A\cos(\beta x) + B\sin(\beta x)\bigr], \]

is the general solution of the second-order ODE.

Let's illustrate these three cases by looking at some examples.

[5] If you are interested, this case involves complex numbers which are discussed in Chapter 13 of Binmore and Davies (2002). If you read this, you will then be able to understand the discussion of this type of solution in Section 14.5 of Binmore and Davies (2002). However, as we are not dealing with such things here, you are advised to wait until you tackle complex numbers properly in 175 Further Linear Algebra.


Example 8.10 The second-order ODE

\[ \frac{d^2y}{dx^2} - \frac{dy}{dx} - 2y = 0, \]

is homogeneous. Its auxiliary equation is given by

\[ k^2 - k - 2 = 0 \implies (k - 2)(k + 1) = 0, \]

and so we have two real solutions given by $k = 2$ and $k = -1$. As such, the theory above dictates that

\[ y(x) = A e^{2x} + B e^{-x}, \]

where $A$ and $B$ are arbitrary constants, is the general solution to this homogeneous second-order ODE.

Example 8.11 The second-order ODE

\[ \frac{d^2y}{dx^2} + 4\frac{dy}{dx} + 4y = 0, \]

is homogeneous. Its auxiliary equation is given by

\[ k^2 + 4k + 4 = 0 \implies (k + 2)^2 = 0, \]

and so we have one real solution given by $k = -2$. As such, the theory above dictates that

\[ y(x) = (A + Bx) e^{-2x}, \]

where $A$ and $B$ are arbitrary constants, is the general solution to this homogeneous second-order ODE.

Example 8.12 The second-order ODE

\[ \frac{d^2y}{dx^2} - 2\frac{dy}{dx} + 2y = 0, \]

is homogeneous. Its auxiliary equation is given by

\[ k^2 - 2k + 2 = 0 \implies (k - 1)^2 + 1 = 0 \implies k - 1 = \pm\sqrt{-1} \implies k = 1 \pm \sqrt{-1}, \]

and so we get no real solutions for $k$. As such, the theory above dictates that we take $\alpha = 1$ and $\beta = 1$, so that

\[ y(x) = e^{x}\bigl[A\cos(x) + B\sin(x)\bigr], \]

where $A$ and $B$ are arbitrary constants, is the general solution to this homogeneous second-order ODE.
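The three cases for the auxiliary equation $ak^2 + bk + c = 0$ can be summarised in a small Python sketch that classifies a given ODE by the sign of the discriminant and returns the constants needed to write down the general solution. The function name and the shape of the return value are illustrative assumptions; in the third case the pair returned is $(\alpha, \beta)$ for solutions of the form $k = \alpha \pm \beta\sqrt{-1}$.

```python
import math

def auxiliary_roots(a, b, c):
    """Classify a*y'' + b*y' + c*y = 0 (a != 0) via its auxiliary equation.

    Returns (case, values):
      "two real" -> (k1, k2), general solution A e^(k1 x) + B e^(k2 x)
      "one real" -> (k,),     general solution (A + B x) e^(k x)
      "no real"  -> (alpha, beta), general solution
                    e^(alpha x) [A cos(beta x) + B sin(beta x)]
    """
    disc = b * b - 4.0 * a * c
    if disc > 0:
        root = math.sqrt(disc)
        return "two real", ((-b + root) / (2 * a), (-b - root) / (2 * a))
    if disc == 0:  # exact comparison is fine for small integer coefficients
        return "one real", (-b / (2 * a),)
    return "no real", (-b / (2 * a), math.sqrt(-disc) / (2 * abs(a)))

# Examples 8.10, 8.11 and 8.12 respectively.
for coeffs in ((1, -1, -2), (1, 4, 4), (1, -2, 2)):
    print(coeffs, auxiliary_roots(*coeffs))
```

With non-integer coefficients the exact test `disc == 0` may be too strict, so a tolerance would be a sensible adjustment in that setting.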

8.3.2

If the function, $f(x)$, on the right-hand-side of our second-order ODE with constant coefficients is non-zero, i.e. it has the form

\[ a\frac{d^2y}{dx^2} + b\frac{dy}{dx} + cy = f(x), \]

with $f(x) \neq 0$, then we say that it is non-homogeneous. To solve such an ODE, we use the following method.

We solve the corresponding homogeneous ODE, to find the function, $y_c(x)$, which is often called the complementary function. That is, we solve

\[ a\frac{d^2y_c}{dx^2} + b\frac{dy_c}{dx} + cy_c = 0. \]

We then seek a function, $y_p(x)$, which is often called the particular integral, that satisfies the non-homogeneous ODE. That is, we want to find a function, $y_p(x)$, that satisfies

\[ a\frac{d^2y_p}{dx^2} + b\frac{dy_p}{dx} + cy_p = f(x), \]

and we will see how to do this in a moment.

Then, having found the complementary function and a particular integral, the

general solution to our non-homogeneous ODE is given by

y(x) = yc (x) + yp (x).

That is, the general solution we seek, y(x), is the sum of the two functions we have

found.

In particular, observe that the complementary function will contain the two arbitrary

constants that make y(x) a general solution whereas the particular integral guarantees

that y(x) will give us the correct right-hand-side, i.e. f (x), when we substitute it into

the ODE.

Finding particular integrals

To find the particular integral for a given second-order ODE, we look at $f(x)$ and start by taking $y_p(x)$ to be a general function of the same form. For instance, if we find that

$f(x) = a$ for some constant $a$ we take $y_p(x) = \alpha$.

$f(x) = a + bx$ for some constants $a$ and $b$ we take $y_p(x) = \alpha + \beta x$.

$f(x) = a + bx + cx^2$ for some $a$, $b$ and $c$ we take $y_p(x) = \alpha + \beta x + \gamma x^2$.

et cetera.

$f(x) = a e^{rx}$ for some constant $a$ we take $y_p(x) = \alpha e^{rx}$.

$f(x) = (a + bx) e^{rx}$ for some constants $a$, $b$ and $r$ we take $y_p(x) = (\alpha + \beta x) e^{rx}$.

et cetera.

$f(x) = a\sin(rx)$ for some constants $a$ and $r$ we take $y_p(x) = \alpha\sin(rx) + \beta\cos(rx)$.

$f(x) = a\cos(rx)$ for some constants $a$ and $r$ we take $y_p(x) = \alpha\sin(rx) + \beta\cos(rx)$.

et cetera.

Then, by substituting our choice of $y_p(x)$ into the non-homogeneous second-order ODE, we can find the values of the relevant Greek letters and this will then give us the specific function, $y_p(x)$, that will play the role of the particular integral in our solution.

Applying the method

Let's consider an example to see how we would go about determining the particular

integral in some of the cases listed above and how we would use this to find the general

solution of a non-homogeneous second-order ODE.

Example 8.13 In Example 8.10 we saw that the general solution to the homogeneous second-order ODE $y'' - y' - 2y = 0$ was given by

\[ y(x) = A e^{2x} + B e^{-x}, \]

where $A$ and $B$ are arbitrary constants. Find the general solution to the non-homogeneous second-order ODE

\[ y'' - y' - 2y = f(x), \]

when (i) $f(x) = 8$, (ii) $f(x) = 6x$ and (iii) $f(x) = 20e^{3x}$.

We know that the complementary function, $y_c(x)$, for this non-homogeneous second-order ODE is given by the general solution to the homogeneous second-order ODE. As such, we know that

\[ y_c(x) = A e^{2x} + B e^{-x}, \]

where $A$ and $B$ are arbitrary constants. Our first task is to find the particular integral, $y_p(x)$, for each choice of $f(x)$. Once we have this, we can then find the general solution, $y(x)$, of the relevant non-homogeneous second-order ODE by simply taking $y(x) = y_c(x) + y_p(x)$.

For (i), we have $f(x) = 8$ and so we take $y_p(x) = \alpha$ where $\alpha$ is a constant that has to be determined. To find $\alpha$, we note that $y_p'(x)$ and $y_p''(x)$ are both zero which means that substituting them into the non-homogeneous second-order ODE, we get

\[ 0 - 0 - 2\alpha = 8 \implies \alpha = -4. \]

Thus, $y_p(x) = -4$ is the sought after particular integral and the general solution to our non-homogeneous second-order ODE is

\[ y(x) = A e^{2x} + B e^{-x} - 4, \]

using $y(x) = y_c(x) + y_p(x)$.

For (ii), we have $f(x) = 6x$ and so we take $y_p(x) = \alpha + \beta x$ where $\alpha$ and $\beta$ are constants that have to be determined. To find $\alpha$ and $\beta$, we note that $y_p'(x) = \beta$ and $y_p''(x) = 0$ which means that substituting them into the non-homogeneous second-order ODE yields

\[ 0 - \beta - 2(\alpha + \beta x) = 6x \implies -2\beta x - (2\alpha + \beta) = 6x. \]

Now these two expressions must be the same and so, looking at the coefficient of $x$ on both sides, we see that $\beta$ must be $-3$. Similarly, looking at the constant term on both sides we see that $2\alpha + \beta$ must be zero, so as $\beta = -3$, this means that $\alpha$ must be $3/2$. Thus, $y_p(x) = \frac{3}{2} - 3x$ is the sought after particular integral and the general solution to our non-homogeneous second-order ODE is

\[ y(x) = A e^{2x} + B e^{-x} + \frac{3}{2} - 3x, \]

using $y(x) = y_c(x) + y_p(x)$.

For (iii), we have $f(x) = 20e^{3x}$ and so we take $y_p(x) = \alpha e^{3x}$ where $\alpha$ is a constant that has to be determined. To find $\alpha$, we note that $y_p'(x) = 3\alpha e^{3x}$ and $y_p''(x) = 9\alpha e^{3x}$ which means that substituting them into the non-homogeneous second-order ODE yields

\[ 9\alpha e^{3x} - 3\alpha e^{3x} - 2(\alpha e^{3x}) = 20e^{3x} \implies 4\alpha e^{3x} = 20e^{3x} \implies \alpha = 5. \]

Thus, $y_p(x) = 5e^{3x}$ is the sought after particular integral and the general solution to our non-homogeneous second-order ODE is

\[ y(x) = A e^{2x} + B e^{-x} + 5e^{3x}, \]

using $y(x) = y_c(x) + y_p(x)$.
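The three particular integrals in Example 8.13 can be checked by substituting each one, together with its first and second derivatives, into $y'' - y' - 2y$ and comparing with the corresponding $f(x)$. The sketch below does this at a few sample points; the dictionary layout and function names are illustrative assumptions, and the derivatives are entered by hand rather than computed.

```python
import math

def lhs(yp, dyp, d2yp, x):
    """Evaluate y'' - y' - 2y for a trial function given with its derivatives."""
    return d2yp(x) - dyp(x) - 2.0 * yp(x)

# Each entry: (f, y_p, y_p', y_p'') for parts (i), (ii) and (iii).
cases = {
    "(i)": (lambda x: 8.0,
            lambda x: -4.0, lambda x: 0.0, lambda x: 0.0),
    "(ii)": (lambda x: 6.0 * x,
             lambda x: 1.5 - 3.0 * x, lambda x: -3.0, lambda x: 0.0),
    "(iii)": (lambda x: 20.0 * math.exp(3.0 * x),
              lambda x: 5.0 * math.exp(3.0 * x),
              lambda x: 15.0 * math.exp(3.0 * x),
              lambda x: 45.0 * math.exp(3.0 * x)),
}

def max_error(label, points=(-1.0, 0.0, 0.5, 2.0)):
    """Largest gap between y_p'' - y_p' - 2y_p and f over the sample points."""
    f, yp, dyp, d2yp = cases[label]
    return max(abs(lhs(yp, dyp, d2yp, x) - f(x)) for x in points)

for label in cases:
    print(label, max_error(label))
```

Adding any multiple of $e^{2x}$ or $e^{-x}$ to a particular integral leaves these checks unchanged, which is why the complementary function carries the arbitrary constants.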

A complication

Although we won't spend much time on such things, observe that if the function, $f(x)$, in our non-homogeneous second-order ODE prompts us to try a particular integral, $y_p(x)$, that is part of the complementary function (i.e. we can find values of the arbitrary constants in $y_c(x)$ that make $y_c(x) = y_p(x)$), we have to be more subtle when we choose our particular integral. However, this subtlety usually involves doing nothing more than multiplying what we'd normally choose to be our particular integral by $x$. Let's return to our previous example to see how this works.

Example 8.14 Following on from Example 8.13, find the general solution to the non-homogeneous second-order ODE

\[ y'' - y' - 2y = f(x), \]

when $f(x) = 18e^{2x}$.

We know that the complementary function, $y_c(x)$, for this non-homogeneous second-order ODE is given by

\[ y_c(x) = A e^{2x} + B e^{-x}, \]

where $A$ and $B$ are arbitrary constants. Our task is to find the particular integral, $y_p(x)$, in the case where $f(x) = 18e^{2x}$ so that we can deduce the relevant general solution.

Note: Here we would normally try $y_p(x) = \alpha e^{2x}$ but this is part of the complementary function since, taking $A = \alpha$ and $B = 0$, we have $y_p(x) = y_c(x)$!

Our first reaction in this case would be to take $y_p(x) = \alpha e^{2x}$ where $\alpha$ is a constant that has to be determined. To find $\alpha$, we note that $y_p'(x) = 2\alpha e^{2x}$ and $y_p''(x) = 4\alpha e^{2x}$ which means that substituting them into the non-homogeneous second-order ODE, we get

\[ 4\alpha e^{2x} - 2\alpha e^{2x} - 2(\alpha e^{2x}) = 18e^{2x}. \]

But now, the left-hand-side turns out to be zero,[6] meaning that this equation for $\alpha$ has no solutions! That is, we can't determine $\alpha$ if we use this general form for $y_p(x)$! Thus, the particular integral in this case can't have the general form $y_p(x) = \alpha e^{2x}$ as we can't find an $\alpha$ that will make it work.

So, following the advice above, we try the next best thing which is our original choice multiplied by $x$. That is, we try $y_p(x) = \alpha x e^{2x}$ where $\alpha$ is a constant that has to be determined. To find $\alpha$, we note that writing $y_p(x)$ as $(\alpha x)(e^{2x})$ we can use the product rule to get

\[ y_p'(x) = (\alpha)(e^{2x}) + (\alpha x)(2e^{2x}) = (\alpha + 2\alpha x)(e^{2x}), \]

and

\[ y_p''(x) = (2\alpha)(e^{2x}) + (\alpha + 2\alpha x)(2e^{2x}) = (4\alpha + 4\alpha x)(e^{2x}). \]

So, substituting these into the non-homogeneous second-order ODE, we get

\[ (4\alpha + 4\alpha x)(e^{2x}) - (\alpha + 2\alpha x)(e^{2x}) - 2(\alpha x)(e^{2x}) = 18e^{2x} \implies 3\alpha e^{2x} = 18e^{2x}, \]

which means that $\alpha$ can now be determined and is actually equal to 6. Thus, $y_p(x) = 6x e^{2x}$ is the sought after particular integral and so the general solution to our non-homogeneous second-order ODE is

\[ y(x) = A e^{2x} + B e^{-x} + 6x e^{2x}, \]

using $y(x) = y_c(x) + y_p(x)$.
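The complication in Example 8.14 can be seen numerically. The sketch below evaluates $y'' - y' - 2y$ for the naive trial function $\alpha e^{2x}$, for which the left-hand side is zero whatever value $\alpha$ takes, and for the modified trial function $\alpha x e^{2x}$, for which it equals $3\alpha e^{2x}$. The function names are illustrative, and the derivatives are the ones obtained above by the product rule.

```python
import math

def lhs_naive(alpha, x):
    """y'' - y' - 2y for the trial function y = alpha e^(2x)."""
    e = math.exp(2.0 * x)
    y, dy, d2y = alpha * e, 2.0 * alpha * e, 4.0 * alpha * e
    return d2y - dy - 2.0 * y

def lhs_modified(alpha, x):
    """y'' - y' - 2y for the trial function y = alpha x e^(2x)."""
    e = math.exp(2.0 * x)
    y = alpha * x * e
    dy = (alpha + 2.0 * alpha * x) * e
    d2y = (4.0 * alpha + 4.0 * alpha * x) * e
    return d2y - dy - 2.0 * y

x = 0.5
print(lhs_naive(7.0, x))                             # zero for every alpha
print(lhs_modified(6.0, x), 18.0 * math.exp(2 * x))  # these two should agree
```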

Another example of this complication arises in Question 3(b) of the sample examination

paper in Appendix A.

8.4

We now turn our attention to systems of first-order ODEs. For instance, we may be asked to find the functions $y_1(x)$ and $y_2(x)$ that simultaneously satisfy the ODEs

\[ \frac{dy_1}{dx} = f_1(y_1, y_2, x) \quad\text{and}\quad \frac{dy_2}{dx} = f_2(y_1, y_2, x), \]

where we are given the functions $f_1$ and $f_2$. Generally, $y_1$ and $y_2$ will appear on the right-hand-sides of both these first-order ODEs and, in such cases, we say that they are coupled as we can't solve one of them without using information contained in the other. The procedure that we shall use to solve these involves rewriting the system of first-order ODEs as a second-order ODE which can then be solved using the method outlined in the previous section.

[6] Actually, this shouldn't be a surprise since, taking $A = \alpha$ and $B = 0$ in our complementary function, we still have a solution to the homogeneous second-order ODE and so putting this into the left-hand-side must yield zero!

8.4.1

A simple system of coupled first-order ODEs will only involve linear combinations of $y_1(x)$ and $y_2(x)$ on the right-hand-side, i.e. it will have the form

\[ \frac{dy_1}{dx} = ay_1(x) + by_2(x) \quad\text{and}\quad \frac{dy_2}{dx} = cy_1(x) + dy_2(x), \]

for some constants $a$, $b$, $c$ and $d$. The procedure for solving this involves differentiating the first equation (say) with respect to $x$ so that we get

\[ \frac{d^2y_1}{dx^2} = a\frac{dy_1}{dx} + b\frac{dy_2}{dx}, \]

and then, using the second equation, we find that

\[ \frac{d^2y_1}{dx^2} = a\frac{dy_1}{dx} + b\bigl(cy_1(x) + dy_2(x)\bigr), \]

which means that we have

\[ \frac{d^2y_1}{dx^2} - a\frac{dy_1}{dx} - bcy_1(x) - bdy_2(x) = 0. \]

But, rearranging the first equation, we also have

\[ by_2(x) = \frac{dy_1}{dx} - ay_1(x), \]

and so, substituting this in, we get

\[ \frac{d^2y_1}{dx^2} - (a + d)\frac{dy_1}{dx} - (bc - ad)y_1(x) = 0, \]

which is an homogeneous second-order ODE with constant coefficients which we can solve using the method in Section 8.3.1 to find $y_1(x)$. Of course, having done this, we can then use the first of the original equations (say) to find $y_2(x)$. Let's look at an example to see how this works.

Example 8.15 Find the functions $y_1(x)$ and $y_2(x)$ that satisfy the system of first-order ODEs given by

\[ \frac{dy_1}{dx} = 2y_1 + 4y_2 \quad\text{and}\quad \frac{dy_2}{dx} = 3y_1 + 3y_2, \]

with the conditions $y_1(0) = 5$ and $y_2(0) = -2$.

We will solve this by rewriting this system as a second-order ODE in $y_1(x)$. To do this we note that, rearranging the first ODE gives us

\[ y_2 = \frac{1}{4}\left(\frac{dy_1}{dx} - 2y_1\right), \tag{8.3} \]

and, differentiating this with respect to $x$, we get

\[ \frac{dy_2}{dx} = \frac{1}{4}\left(\frac{d^2y_1}{dx^2} - 2\frac{dy_1}{dx}\right). \]

Consequently, if we substitute these two expressions into the second ODE, we get

\[ \frac{1}{4}\left(\frac{d^2y_1}{dx^2} - 2\frac{dy_1}{dx}\right) = 3y_1 + \frac{3}{4}\left(\frac{dy_1}{dx} - 2y_1\right), \]

which simplifies to

\[ \frac{d^2y_1}{dx^2} - 5\frac{dy_1}{dx} - 6y_1 = 0, \]

which is our sought after second-order ODE in $y_1(x)$. As it is an homogeneous second-order ODE with constant coefficients, this can be solved using the method in Section 8.3.1. The auxiliary equation is

\[ k^2 - 5k - 6 = 0 \implies (k + 1)(k - 6) = 0, \]

which has two real solutions given by $k = -1$ and $k = 6$ which means that the general solution for $y_1(x)$ is

\[ y_1(x) = A e^{-x} + B e^{6x}, \]

for arbitrary constants $A$ and $B$. To find the general solution for $y_2(x)$, we note that using (8.3) and the fact that

\[ \frac{dy_1}{dx} = -A e^{-x} + 6B e^{6x}, \]

we get

\[ y_2(x) = \frac{1}{4}\Bigl\{\bigl[-A e^{-x} + 6B e^{6x}\bigr] - 2\bigl[A e^{-x} + B e^{6x}\bigr]\Bigr\} = \frac{1}{4}\bigl\{-3A e^{-x} + 4B e^{6x}\bigr\}, \]

in terms of the same arbitrary constants $A$ and $B$ as before. Thus, the general solution to this system of first-order ODEs is

\[ y_1(x) = A e^{-x} + B e^{6x} \quad\text{and}\quad y_2(x) = -\frac{3}{4}A e^{-x} + B e^{6x}, \]

for arbitrary constants $A$ and $B$.

However, we are also given the conditions $y_1(0) = 5$ and $y_2(0) = -2$ which imply that

\[ 5 = A + B \quad\text{and}\quad -2 = -\frac{3}{4}A + B. \]

Solving these two equations simultaneously, say by subtracting one from the other, we see that $7 = 7A/4$ which gives $A = 4$ and then, the first equation gives $B = 1$. Consequently, we find that

\[ y_1(x) = 4e^{-x} + e^{6x} \quad\text{and}\quad y_2(x) = -3e^{-x} + e^{6x}, \]

is the particular solution of this system of first-order ODEs given the conditions $y_1(0) = 5$ and $y_2(0) = -2$.
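The solution of Example 8.15 can be checked against both equations of the system. The sketch below assumes the system is $y_1' = 2y_1 + 4y_2$, $y_2' = 3y_1 + 3y_2$ with general solution $y_1 = Ae^{-x} + Be^{6x}$, $y_2 = -\frac{3}{4}Ae^{-x} + Be^{6x}$, and then checks the conditions $y_1(0) = 5$ and $y_2(0) = -2$ for $A = 4$, $B = 1$. The function names are illustrative and the derivatives are written out by hand.

```python
import math

def y1(x, A, B):
    return A * math.exp(-x) + B * math.exp(6.0 * x)

def y2(x, A, B):
    return -0.75 * A * math.exp(-x) + B * math.exp(6.0 * x)

def dy1(x, A, B):
    """Exact derivative of y1."""
    return -A * math.exp(-x) + 6.0 * B * math.exp(6.0 * x)

def dy2(x, A, B):
    """Exact derivative of y2."""
    return 0.75 * A * math.exp(-x) + 6.0 * B * math.exp(6.0 * x)

def residuals(x, A, B):
    """How far each ODE of the system is from holding at x."""
    r1 = dy1(x, A, B) - (2.0 * y1(x, A, B) + 4.0 * y2(x, A, B))
    r2 = dy2(x, A, B) - (3.0 * y1(x, A, B) + 3.0 * y2(x, A, B))
    return r1, r2

print(residuals(0.3, 4.0, 1.0))          # both entries should be ~0
print(y1(0.0, 4.0, 1.0), y2(0.0, 4.0, 1.0))
```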


It is worth noting that systems of equations of the form encountered here can also be

solved using diagonalisation in much the same way as systems of difference equations

are solved in Section 11.2 of 173 Algebra.

8.4.2

Systems of first-order ODEs become more complicated when they involve more

complicated functions on the right-hand-side. The method for solving them remains the

same, but a little more care must be taken as the following example illustrates.

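Example 8.16 Find the functions $y_1(x)$ and $y_2(x)$ that satisfy the system of first-order ODEs given by

\[ \frac{dy_1}{dx} = -4y_1 + 2y_2 \quad\text{and}\quad \frac{dy_2}{dx} = -2y_1 + 4x^2 + 4, \]

with the conditions $y_1(0) = 1$ and $y_2(0) = 7/2$.

We will solve this by rewriting this system as a second-order ODE in $y_1(x)$. To do this we note that, rearranging the first ODE gives us

\[ y_2 = \frac{1}{2}\left(\frac{dy_1}{dx} + 4y_1\right), \tag{8.4} \]

and, differentiating this with respect to $x$, we get

\[ \frac{dy_2}{dx} = \frac{1}{2}\left(\frac{d^2y_1}{dx^2} + 4\frac{dy_1}{dx}\right). \]

Consequently, if we substitute this into the second ODE, we get

\[ \frac{1}{2}\left(\frac{d^2y_1}{dx^2} + 4\frac{dy_1}{dx}\right) = -2y_1 + 4x^2 + 4, \]

which can be rearranged to give

\[ \frac{d^2y_1}{dx^2} + 4\frac{dy_1}{dx} + 4y_1 = 8x^2 + 8, \tag{8.5} \]

which is our sought after second-order ODE in $y_1(x)$. As it is a non-homogeneous second-order ODE with constant coefficients, this can be easily solved using the method of Section 8.3.2. In particular:

The homogeneous second-order ODE that corresponds to (8.5) is

\[ \frac{d^2y_1}{dx^2} + 4\frac{dy_1}{dx} + 4y_1 = 0, \]

and so the auxiliary equation is

\[ k^2 + 4k + 4 = 0 \implies (k + 2)^2 = 0, \]

which has one real solution given by $k = -2$. This means that the complementary function for $y_1(x)$ is

\[ y_1(x) = (A + Bx) e^{-2x}, \]

where $A$ and $B$ are arbitrary constants.

As the right-hand-side of (8.5) is $8x^2 + 8$, we look for a particular integral of the form

\[ y_1(x) = \alpha x^2 + \beta x + \gamma. \]

We differentiate this twice to get

\[ y_1'(x) = 2\alpha x + \beta \quad\text{and}\quad y_1''(x) = 2\alpha, \]

so that substituting these into (8.5) gives

\[ 2\alpha + 4(2\alpha x + \beta) + 4(\alpha x^2 + \beta x + \gamma) = 8x^2 + 8. \]

Then, equating the coefficients of the terms on both sides we see that, from the $x^2$ term, we get

\[ 4\alpha = 8 \implies \alpha = 2, \]

which means that, from the $x$ term, we get

\[ 8\alpha + 4\beta = 0 \implies \beta = -4, \]

and, from the constant term, we get

\[ 2\alpha + 4\beta + 4\gamma = 8 \implies \gamma = 5. \]

Thus,

\[ y_1(x) = 2x^2 - 4x + 5, \]

is the particular integral.

The general solution to (8.5) is then given by the sum of its complementary function and its particular integral, i.e. we have

\[ y_1(x) = (A + Bx) e^{-2x} + 2x^2 - 4x + 5, \]

where $A$ and $B$ are arbitrary constants.

We can now use this to find the general solution for $y_2(x)$ since, using (8.4) and the fact that

\[ \frac{dy_1}{dx} = B e^{-2x} - 2(A + Bx) e^{-2x} + 4x - 4 = (B - 2A - 2Bx) e^{-2x} + 4x - 4, \]

we get

\[ y_2(x) = \frac{1}{2}\Bigl\{\bigl[(B - 2A - 2Bx) e^{-2x} + 4x - 4\bigr] + 4\bigl[(A + Bx) e^{-2x} + 2x^2 - 4x + 5\bigr]\Bigr\}. \]

Simplifying this, we find that

\[ y_2(x) = \left(\tfrac{1}{2}B + A + Bx\right) e^{-2x} + 4x^2 - 6x + 8, \]

is the corresponding general solution for $y_2(x)$ in terms of the same arbitrary constants $A$ and $B$ as before.

Once we have these general solutions, we can use the initial conditions $y_1(0) = 1$ and $y_2(0) = 7/2$ to get the equations

\[ 1 = A + 5 \quad\text{and}\quad \frac{7}{2} = \frac{1}{2}B + A + 8, \]

which give $A = -4$ and $B = -1$. This means that

\[ y_1(x) = -(4 + x) e^{-2x} + 2x^2 - 4x + 5 \quad\text{and}\quad y_2(x) = -\left(\tfrac{9}{2} + x\right) e^{-2x} + 4x^2 - 6x + 8, \]

are the sought after particular solutions.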
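The particular solutions of Example 8.16 can be checked against the original system and its initial conditions. The sketch below assumes the system is $y_1' = -4y_1 + 2y_2$, $y_2' = -2y_1 + 4x^2 + 4$, uses $A = -4$ and $B = -1$ as default arguments, and estimates the derivatives with a central difference. All names and the step size are illustrative choices.

```python
import math

def y1(x, A=-4.0, B=-1.0):
    return (A + B * x) * math.exp(-2.0 * x) + 2.0 * x * x - 4.0 * x + 5.0

def y2(x, A=-4.0, B=-1.0):
    return (0.5 * B + A + B * x) * math.exp(-2.0 * x) + 4.0 * x * x - 6.0 * x + 8.0

def ddx(func, x, h=1e-6):
    """Central-difference estimate of the derivative of func at x."""
    return (func(x + h) - func(x - h)) / (2.0 * h)

def residuals(x):
    """How far each ODE of the system is from holding at x."""
    r1 = ddx(y1, x) - (-4.0 * y1(x) + 2.0 * y2(x))
    r2 = ddx(y2, x) - (-2.0 * y1(x) + 4.0 * x * x + 4.0)
    return r1, r2

print(y1(0.0), y2(0.0))   # initial conditions: 1 and 7/2
print(residuals(0.8))     # both entries should be ~0
```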

8.5 Applications of ODEs

Differential equations are used widely in economics-based subjects and, in Section 5.4.1,

we saw a very simple application when we considered marginal functions. Here, we will

consider a few more examples that are a bit more sophisticated.

8.5.1

The elasticity of demand is given by

\[ \varepsilon(p) = -\frac{p}{q}\frac{dq}{dp}, \]

where $q = q^D(p)$ is the demand function. If we know the elasticity of demand, we can use this and our knowledge of ODEs to determine the demand function.

Example 8.17 Suppose that the elasticity of demand is a constant, i.e. $\varepsilon(p) = r$ for all $p$ and $r$ is a positive constant. Find the demand function if $q^D(1) = 2$.

Using the definition of the elasticity of demand, this gives us

\[ -\frac{p}{q}\frac{dq}{dp} = r \implies \frac{1}{q}\frac{dq}{dp} = -\frac{r}{p}, \]

and so this is a separable first-order ODE. Solving this using the method in Section 8.2.1, we write this as

\[ \int \frac{1}{q}\,dq = -\int \frac{r}{p}\,dp \implies \ln|q| = -r\ln|p| + c, \]

where $c$ is an arbitrary constant. Writing this as

\[ \ln|q| = \ln|p|^{-r} + c, \]

we can take exponentials of both sides, to get

\[ q = e^{\ln|p|^{-r} + c} = e^c p^{-r}, \]

where we can remove the modulus signs since, economically, $q$ and $p$ are both positive. Then, using the fact that $q^D(1) = 2$, we see that $e^c = 2$ and so

\[ q = q^D(p) = \frac{2}{p^r}, \]

is the required demand function.
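The demand function found in Example 8.17 can be checked by computing its elasticity numerically. The sketch below takes the elasticity of demand to be $-(p/q)(dq/dp)$, estimates $dq/dp$ with a central difference, and confirms that $q^D(p) = 2/p^r$ has constant elasticity $r$ and satisfies $q^D(1) = 2$. The function names, the value $r = 1.7$ and the sample prices are illustrative assumptions.

```python
def demand(p, r):
    """Demand function q = 2 / p^r from Example 8.17."""
    return 2.0 / p ** r

def elasticity(p, r, h=1e-6):
    """Elasticity -(p/q) dq/dp, with dq/dp from a central difference."""
    dqdp = (demand(p + h, r) - demand(p - h, r)) / (2.0 * h)
    return -(p / demand(p, r)) * dqdp

r = 1.7
for p in (0.5, 1.0, 2.5, 10.0):
    print(p, elasticity(p, r))   # each value should be close to r
print(demand(1.0, r))            # the condition q(1) = 2
```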

Activity 8.8 How does the demand function found in Example 8.17 behave as $p \to 0^+$ and as $p \to \infty$?

8.5.2

Suppose that the price of some commodity varies continuously with time and that its

initial price is not equal to its equilibrium price. We might expect that, as time

progresses, the price of the commodity will tend to its equilibrium price but to be sure,

we need to have a model of how the price of the commodity is varying with time. One

such model involves looking at how the rate of change of the price of the commodity is

related to the excess of demand over supply.

Suppose that the price of the commodity as a function of time is $p(t)$ and that the market for this commodity is governed by the demand function, $q^D(p)$, and the supply function, $q^S(p)$. This means that, at any time, $t$, as the price is $p(t)$, the quantity being demanded is given by $q^D(p(t))$ and the quantity being supplied is given by $q^S(p(t))$. As such, we can define the excess of demand over supply to be the function of $p(t)$ given by

\[ E(p(t)) = q^D(p(t)) - q^S(p(t)), \]

i.e. the difference between these two quantities. Clearly, this means that if $p(t)$ is such that:

$E(p(t)) > 0$, demand outstrips supply and so the price should rise, i.e. $p'(t) > 0$.

$E(p(t)) = 0$, demand equals supply and we should have equilibrium, i.e. $p'(t) = 0$.

$E(p(t)) < 0$, supply outstrips demand and so the price should fall, i.e. $p'(t) < 0$.

This suggests that the rate of change of the market price with time, i.e. $p'(t)$, should be given by some function $f$ of the excess of demand over supply, $E(p(t))$, i.e. we have a model where

\[ \frac{dp}{dt} = f\bigl(E(p(t))\bigr), \]

with $f(E)$ taking the same sign as $E$. Then, by solving this first-order ODE, we can find out how the market price varies with time and hence assess the stability of the market by considering what it does as $t \to \infty$. To see how this works, let's consider an example.


Example 8.18 Suppose that the demand and supply functions for a commodity are given by

\[ q^D(p) = 5 - 2p \quad\text{and}\quad q^S(p) = 3p - 1, \]

respectively. If the rate of change of the market price is given by three times the excess of demand over supply, find the ODE that describes how $p(t)$ changes with time.

We start by calculating the excess of demand over supply, $E(p(t))$, which is given by

\[ E(p(t)) = q^D(p(t)) - q^S(p(t)) = [5 - 2p(t)] - [3p(t) - 1] = 6 - 5p(t). \]

We then know that the rate of change of the market price is given by three times the excess, i.e.

\[ \frac{dp}{dt} = 3E(p(t)) = 3[6 - 5p(t)] = 15\left(\frac{6}{5} - p(t)\right). \]

This is a separable first-order ODE and we can easily solve it using the method in Section 8.2.1.
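Before solving the ODE in Example 8.18 exactly, its qualitative behaviour can be explored numerically. The sketch below uses the forward Euler method, which is not a method from this guide and is used here only as a rough simulation, to step $dp/dt = 3[q^D(p) - q^S(p)]$ forward in time from two different initial prices. The function names, the step size `h` and the end time are illustrative assumptions.

```python
def excess(p):
    """Excess of demand over supply: (5 - 2p) - (3p - 1) = 6 - 5p."""
    return (5.0 - 2.0 * p) - (3.0 * p - 1.0)

def simulate(p0, t_end=2.0, h=0.001):
    """Forward-Euler estimate of p(t_end) for dp/dt = 3 * excess(p), p(0) = p0."""
    p = p0
    for _ in range(int(round(t_end / h))):
        p += h * 3.0 * excess(p)
    return p

# Starting above or below the price at which the excess vanishes (p = 6/5),
# the simulated price moves towards that value.
print(simulate(3.0), simulate(0.2))
```

The simulated price rises when the excess is positive and falls when it is negative, in line with the three cases listed for the model.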

Activity 8.9 Solve the separable first-order ODE found in Example 8.18 and use it

to determine how the market price changes over time if the initial price is p(0). How

does the market price behave in the long-term?

8.5.3

In Section 6.1.5 of 173 Algebra, you saw how to find the balance, B(t), of a bank

account that utilises continuously compounded interest at an annual equivalent rate of

100r%. Another way of thinking about this is to say that, at any time, $t$, the rate of increase of the balance, $B'(t)$, is given by $rB(t)$. This means that we have

\[ \frac{dB}{dt} = rB(t), \]

and this is a simple separable first-order ODE that can be solved, using the method in Section 8.2.1, to get

\[ B(t) = P e^{rt}, \]

where B(0) = P is the initial balance. As such, we can see that this way of thinking

about continuous compounding gives us an alternative way of deriving the formula you

saw in Section 6.1.5 of 173 Algebra.
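The link between the ODE $dB/dt = rB(t)$ and compounding can be illustrated numerically. In the sketch below, stepping the ODE forward with the forward Euler method using $n$ equal steps multiplies the balance by $(1 + rt/n)$ at each step, which is the same as compounding $n$ times over the period, and the result approaches $Pe^{rt}$ as $n$ grows. The Euler method, the function name and the sample figures are illustrative assumptions rather than anything taken from the guide.

```python
import math

def euler_balance(P, r, t, n):
    """Step dB/dt = r*B from B(0) = P to time t using n forward-Euler steps.

    Each step multiplies the balance by (1 + r*t/n), i.e. interest is
    compounded n times over the period.
    """
    B = P
    h = t / n
    for _ in range(n):
        B += h * r * B
    return B

P, r, t = 100.0, 0.05, 10.0
for n in (10, 120, 3650, 100000):
    print(n, euler_balance(P, r, t, n))
print("continuous:", P * math.exp(r * t))
```

With these figures, taking $n = 10$ corresponds to compounding once a year, and the printed balances increase towards the continuously compounded value as $n$ increases.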

Activity 8.10 Verify that solving this separable first-order ODE will give the

solution above.

However, we can actually use ODEs to find the balance of a bank account which uses continuously compounded interest in the presence of more complicated investment schemes. For instance, if we take the bank account above and suppose that money is added to the account at a rate given by $f(t)$,[7] we see that the balance, $B(t)$, is now given by

\[ \frac{dB}{dt} = rB(t) + f(t) \implies \frac{dB}{dt} - rB(t) = f(t), \]

which is a linear first-order ODE. And, of course, we could also have the situation where money is deducted from the account at a rate given by $f(t)$,[8] and then we see that the balance, $B(t)$, would be given by

\[ \frac{dB}{dt} = rB(t) - f(t) \implies \frac{dB}{dt} - rB(t) = -f(t), \]

which is also a linear first-order ODE.

Example 8.19 Suppose that we have two bank accounts, X and Y, that pay continuously compounded interest at annual equivalent rates of $100r_X\%$ and $100r_Y\%$ respectively. We initially invest an amount $P_X$ in account X and, at each instant, pay the interest accrued into account Y whose initial balance is $P_Y$. Find the ODE that determines the balance in account Y at any time $t \geq 0$.

Let $B_X(t)$ and $B_Y(t)$ denote the balance in accounts X and Y respectively at time $t$. The first thing to notice is that the rate of change of $B_X(t)$ is given by

\[ \frac{dB_X}{dt} = r_X B_X(t) - r_X B_X(t) = 0, \]

as, at every instant, any interest accrued is immediately deducted from account X so that it can be paid into account Y. This means that $B_X(t)$ must be a constant and, in particular, this constant must be the initial balance $P_X$. Thus, we find that $B_X(t) = P_X$ for all $t \geq 0$ and the interest accrued at each time $t$ (which we immediately pay into account Y) is given by $r_X P_X$.

The rate of change of $B_Y(t)$ is then given by the sum of $r_Y B_Y(t)$, which is the continuously compounded interest accrued on the balance in account Y, and $r_X P_X$ which, as we have just seen, is the continuously compounded interest accrued in account X. That is, for $t \geq 0$, we have

\[ \frac{dB_Y}{dt} = r_Y B_Y(t) + r_X P_X \implies \frac{dB_Y}{dt} - r_Y B_Y(t) = r_X P_X, \]

which is a linear first-order ODE and we can easily solve this, subject to the condition that $B_Y(0) = P_Y$, using the method in Section 8.2.2.

Activity 8.11 Solve the linear first-order ODE found in Example 8.19 and use it to determine the balance in account Y at any time $t \geq 0$.

[7] That is, at each time, $t$, the balance increases by $f(t)$.

[8] That is, at each time, $t$, the balance decreases by $f(t)$.

8.5.4 Market trends

In some markets, the equilibrium price will change with time and so it is useful for

consumers to try and anticipate trends. That is, the consumer will keep an eye on the

current equilibrium price, but they will also look at the rate at which the price is rising

or falling and whether this rate of change is speeding up or slowing down. We can represent these three considerations mathematically by using $p(t)$, $p'(t)$ and $p''(t)$ respectively and, by considering how these affect the quantity being supplied or demanded, we can model how the price itself is varying with time by using an ODE. Let's look at an example.

Example 8.20 Suppose that the demand and supply functions for a commodity are given by

\[ q^D(p) = 9 - 2p + 6\frac{dp}{dt} - 2\frac{d^2p}{dt^2} \quad\text{and}\quad q^S(p) = -3 + 4p - \frac{dp}{dt} - \frac{d^2p}{dt^2}, \]

respectively. Find the ODE that determines the equilibrium price at any time $t \geq 0$.

Here we have linear supply and demand functions which have been modified to take a trend into account. To find the equilibrium price at any time $t \geq 0$, we need to determine the function, $p(t)$, that makes the amount supplied equal to the amount demanded, i.e.

\[ -3 + 4p(t) - \frac{dp}{dt} - \frac{d^2p}{dt^2} = 9 - 2p(t) + 6\frac{dp}{dt} - 2\frac{d^2p}{dt^2}. \]

But, rearranging this, we get the non-homogeneous second-order ODE with constant coefficients given by

\[ \frac{d^2p}{dt^2} - 7\frac{dp}{dt} + 6p(t) = 12, \]

which we can solve using the method in Section 8.3.2.
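The rearrangement in Example 8.20 can be checked mechanically. The sketch below takes the demand and supply functions to be $q^D = 9 - 2p + 6p' - 2p''$ and $q^S = -3 + 4p - p' - p''$, treats $p$, $p'$ and $p''$ as three independent numbers, and confirms that supply minus demand always equals $p'' - 7p' + 6p - 12$. The function names and the sample values are illustrative assumptions.

```python
def demand(p, dp, d2p):
    """Demand with trend terms: 9 - 2p + 6p' - 2p''."""
    return 9.0 - 2.0 * p + 6.0 * dp - 2.0 * d2p

def supply(p, dp, d2p):
    """Supply with trend terms: -3 + 4p - p' - p''."""
    return -3.0 + 4.0 * p - dp - d2p

def ode_lhs(p, dp, d2p):
    """Left-hand side p'' - 7p' + 6p of the rearranged ODE."""
    return d2p - 7.0 * dp + 6.0 * p

# Supply equals demand exactly when p'' - 7p' + 6p = 12.
for values in ((1.5, -2.0, 0.25), (0.0, 0.0, 0.0), (7.0, 15.0, -3.0)):
    gap = supply(*values) - demand(*values)
    assert abs(gap - (ode_lhs(*values) - 12.0)) < 1e-12

# A constant price p = 2 (so p' = p'' = 0) balances supply and demand.
print(demand(2.0, 0.0, 0.0), supply(2.0, 0.0, 0.0))
```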

Activity 8.12 Solve the second-order ODE found in Example 8.20 and use it to determine how the equilibrium price changes if $p(0) = 7$ and $p'(0) = 15$. How does this equilibrium price behave in the long-term?

Learning outcomes

At the end of this chapter and having completed the relevant reading and activities, you should be able to:

- identify and solve separable, linear and homogeneous first-order ODEs and other first-order ODEs that can be solved by a given substitution;
- solve second-order ODEs with constant coefficients;
- solve coupled systems of first-order ODEs by rewriting them as a second-order ODE with constant coefficients;
- solve problems from economics-based subjects that involve applications of ODEs.

Solutions to activities

Solution to activity 8.1

Looking at the given ODEs, we see that:

(a) is second-order of first degree,

(b) is second-order of second degree, and

(c) is third-order of first degree.

Here we find the highest order derivative to determine the order and then the algebraic

degree (or power) of this derivative determines the degree.

Solution to activity 8.2

We have the general solution

\[ y(x) = x^2 + x + c, \]

and we want to find the particular solutions corresponding to:

$y(0) = 0$. So, setting $x = 0$ in both sides of this expression and using the condition, we get

\[ y(0) = 0^2 + 0 + c \implies 0 = c, \]

which means that we must take $c = 0$ in the general solution to see that

\[ y(x) = x^2 + x, \]

is the particular solution to the ODE given that $y(0) = 0$.

$y(0) = -1$. So, setting $x = 0$ in both sides of this expression and using the condition, we get

\[ y(0) = 0^2 + 0 + c \implies -1 = c, \]

which means that we must take $c = -1$ in the general solution to see that

\[ y(x) = x^2 + x - 1, \]

is the particular solution to the ODE given that $y(0) = -1$.

$y(2) = 7$. So, setting $x = 2$ in both sides of this expression and using the condition, we get

\[ y(2) = 2^2 + 2 + c \implies 7 = 6 + c, \]

which means that we must take $c = 1$ in the general solution to see that

\[ y(x) = x^2 + x + 1, \]

is the particular solution to the ODE given that $y(2) = 7$. Observe that this is the same particular solution as the one we found with $y(0) = 1$ in Example 8.2 but that it arises from a condition that specifies information about $y(x)$ at a different value of $x$.

Solution to activity 8.3

We have the general solution

\[ y(x) = 1 + A e^{x^2}, \]

and we want to find the particular solutions corresponding to:

(a) $y(0) = 2$. So, setting $x = 0$ in both sides of this expression and using the condition, we get

\[ y(0) = 1 + A e^{0} \implies 2 = 1 + A, \]

which means that we must take $A = 1$ in the general solution to see that

\[ y(x) = 1 + e^{x^2}, \]

is the particular solution to the ODE given that $y(0) = 2$.

(b) $y(0) = 0$. So, setting $x = 0$ in both sides of this expression and using the condition, we get

\[ y(0) = 1 + A e^{0} \implies 0 = 1 + A, \]

which means that we must take $A = -1$ in the general solution to see that

\[ y(x) = 1 - e^{x^2}, \]

is the particular solution to the ODE given that $y(0) = 0$.

If we want a value of $y(1)$ that will give us the same particular solution as the one found in (a), i.e.

\[ y(x) = 1 + e^{x^2}, \]

we put $x = 1$ into both sides of this expression to get

\[ y(1) = 1 + e^{1} = 1 + e. \]

That is, the condition $y(1) = 1 + e$ gives us the same particular solution as the one we found in (a).

Solution to activity 8.4

Here we have to solve the separable first-order ODE

\[ \frac{2}{x} = \frac{1}{y+3}\frac{dy}{dx}, \]

with $M(x) = 2/x$ and $N(y) = (y+3)^{-1}$. Using the method in Section 8.2.1, we write this as

\[ \int \frac{2}{x}\,dx = \int \frac{dy}{y+3} \implies 2\ln|x| + c = \ln|y+3| \implies |y+3| = e^{c}|x|^2. \]

Now, $|x|^2 = x^2$ and removing the modulus on the left-hand-side, we get

\[ y + 3 = \pm e^{c}x^2 \implies y = -3 \pm e^{c}x^2, \]

which we can write as

\[ y(x) = -3 + Ax^2, \]

where $A \in \mathbb{R}$ is an arbitrary constant.

Solution to activity 8.5

The given ODE can be left as it is or rearranged to give

\[ \frac{1}{y + e^{x}}\frac{dy}{dx} = 1, \]

but, either way, it is not separable because we can't separate the variables.

Here we have to solve the linear first-order ODE

\[ \frac{dy}{dx} - \frac{y}{x} = 1, \]

with $P(x) = -1/x$ and $Q(x) = 1$. Using the method in Section 8.2.2, we start by finding the integrating factor, $\mu(x)$, by determining the integral

\[ \int P(x)\,dx = -\int \frac{1}{x}\,dx = -\ln|x| + c, \]

and so we see that $-\ln x$ is an antiderivative of $-1/x$. This means that the integrating factor is

\[ \mu(x) = e^{-\ln x} = e^{\ln x^{-1}} = x^{-1}, \]

and so we have

\[ \mu(x)y(x) = \int \mu(x)Q(x)\,dx \implies x^{-1}y(x) = \int x^{-1}\,dx \implies x^{-1}y(x) = \ln|x| + c \implies y(x) = x(\ln|x| + c), \]

which is the same general solution as before.

If we try to write the ODE

\[ x^4 + 5y^4 - 4xy^3\frac{dy}{dx} = 0 \quad\text{in the form}\quad \frac{dy}{dx} + P(x)y = Q(x), \]

we get

\[ \frac{dy}{dx} - \frac{5}{4x}\,y = \frac{x^3}{4y^3}, \]

and this is not linear due to the presence of the $1/y^3$ on the right-hand-side.

Solution to activity 8.8

In Example 8.17, we found that $q^D(p) = 2/p^r$ where $r$ is a positive constant. As such, we can see that $q^D(p) \to \infty$ as $p \to 0^+$ and $q^D(p) \to 0$ as $p \to \infty$.

Solution to activity 8.9

Using the method in Section 8.2.1, we write the ODE as

\[ \int \frac{dp}{p - \frac{6}{5}} = -\int 15\,dt \implies \ln\left|p - \tfrac{6}{5}\right| = -15t + c \implies \left|p - \tfrac{6}{5}\right| = e^{-15t+c} = e^{c}e^{-15t}. \]

Now, we remove the modulus bars and compensate for this loss by replacing $e^c$ (which must be positive) with the constant $A$ (which can be negative), to get

\[ p(t) = \frac{6}{5} + A e^{-15t}, \]

which is the general solution. Then, given that the initial price is $p(0)$, we see that

\[ p(0) = \frac{6}{5} + A e^{0} \implies A = p(0) - \frac{6}{5}, \]

and so

\[ p(t) = \frac{6}{5} + \left(p(0) - \frac{6}{5}\right)e^{-15t}, \]

which tells us how the market price changes over time if the initial price is $p(0)$. In particular, if we have a $p(0)$ such that:

- $p(0) > 6/5$, since $e^{-15t} \to 0$ as $t \to \infty$, $p(t)$ will decrease to $6/5$.
- $p(0) = 6/5$, we find that $p(t) = 6/5$ for all $t \ge 0$.
- $p(0) < 6/5$, since $e^{-15t} \to 0$ as $t \to \infty$, $p(t)$ will increase to $6/5$.

Indeed, as you should be able to verify, $6/5$ is the equilibrium price for this market and so, in this case, regardless of the choice of $p(0)$, the market is either in equilibrium or tends to equilibrium in the long-term.
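The three cases can be illustrated numerically. The following is a minimal Python sketch that evaluates $p(t) = \frac{6}{5} + (p(0) - \frac{6}{5})e^{-15t}$; the starting prices and sample times are illustrative choices, not values from the guide.

```python
import math

EQUILIBRIUM = 6 / 5  # equilibrium price for this market

def price(t, p0):
    """Market price p(t) = 6/5 + (p(0) - 6/5) e^{-15 t}."""
    return EQUILIBRIUM + (p0 - EQUILIBRIUM) * math.exp(-15 * t)

# One starting price above, one at, and one below the equilibrium.
for p0 in (2.0, 6 / 5, 0.5):
    print(p0, [round(price(t, p0), 4) for t in (0, 0.1, 0.5, 1)])
```

Each printed row moves towards 1.2, from above, not at all, or from below respectively.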

Solution to activity 8.10

To solve the separable first-order ODE

\[ \frac{dB}{dt} = rB(t), \]

we write it as

\[ \int \frac{dB}{B} = \int r\,dt \implies \ln|B| = rt + c \implies B = e^{rt+c} = e^{c}e^{rt}, \]

where we can remove the modulus sign since, economically, $B$ is positive. Then, using the fact that $B(0) = P$, we see that $e^c = P$ and so

\[ B(t) = P e^{rt}, \]

as we would expect.

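Solution to activity 8.11

We have to solve the linear first-order ODE

\[ \frac{dB_Y}{dt} - r_Y B_Y(t) = r_X P_X, \]

subject to the condition that $B_Y(0) = P_Y$. The integrating factor is given by

\[ e^{\int (-r_Y)\,dt} = e^{-r_Y t}, \]

and so we have

\[ e^{-r_Y t}B_Y = \int r_X P_X e^{-r_Y t}\,dt \implies e^{-r_Y t}B_Y = -P_X\frac{r_X}{r_Y}e^{-r_Y t} + c, \]

so that the general solution is

\[ B_Y(t) = -P_X\frac{r_X}{r_Y} + c\,e^{r_Y t}. \]

Using the condition $B_Y(0) = P_Y$, we then get

\[ P_Y = -P_X\frac{r_X}{r_Y} + c \implies c = P_Y + P_X\frac{r_X}{r_Y}, \]

which gives us

\[ B_Y(t) = -P_X\frac{r_X}{r_Y} + \left(P_Y + P_X\frac{r_X}{r_Y}\right)e^{r_Y t} = P_Y e^{r_Y t} + P_X\frac{r_X}{r_Y}\left(e^{r_Y t} - 1\right), \]

as the balance in account Y at any time $t \ge 0$.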
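As a sanity check on the balance formula, the sketch below evaluates $B_Y(t) = P_Y e^{r_Y t} + P_X \frac{r_X}{r_Y}(e^{r_Y t} - 1)$ and tests it against the ODE $\frac{dB_Y}{dt} - r_Y B_Y = r_X P_X$ using a central-difference derivative. The principals and interest rates are hypothetical values chosen only for illustration.

```python
import math

def balance_y(t, p_x, p_y, r_x, r_y):
    """B_Y(t) = P_Y e^{r_Y t} + P_X (r_X / r_Y)(e^{r_Y t} - 1)."""
    growth = math.exp(r_y * t)
    return p_y * growth + p_x * (r_x / r_y) * (growth - 1)

# Hypothetical principals and interest rates, for illustration only.
P_X, P_Y, R_X, R_Y = 1000.0, 500.0, 0.05, 0.03

def residual(t, h=1e-5):
    """dB_Y/dt - r_Y B_Y - r_X P_X, with dB_Y/dt from a central difference."""
    ahead = balance_y(t + h, P_X, P_Y, R_X, R_Y)
    behind = balance_y(t - h, P_X, P_Y, R_X, R_Y)
    slope = (ahead - behind) / (2 * h)
    return slope - R_Y * balance_y(t, P_X, P_Y, R_X, R_Y) - R_X * P_X

print(balance_y(0, P_X, P_Y, R_X, R_Y))  # the initial balance P_Y
print(max(abs(residual(t)) for t in (0.0, 1.0, 5.0, 10.0)))  # close to zero
```

A residual near zero at each sample time indicates that the formula satisfies both the ODE and the initial condition for these sample values.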

Solution to activity 8.12

To solve the non-homogeneous second-order ODE with constant coefficients given by

\[ \frac{d^2p}{dt^2} - 7\frac{dp}{dt} + 6p(t) = 12, \]

we note that:

- The corresponding homogeneous second-order ODE is

\[ \frac{d^2p}{dt^2} - 7\frac{dp}{dt} + 6p(t) = 0, \]

and so the auxiliary equation is

\[ k^2 - 7k + 6 = 0 \implies (k-1)(k-6) = 0, \]

which has the two real solutions $k = 1$ and $k = 6$. This means that the complementary function for $p(t)$ is

\[ p(t) = A e^{t} + B e^{6t}, \]

where $A$ and $B$ are arbitrary constants.

- The right-hand-side is a constant and this suggests we try a particular integral of the form $p(t) = \alpha$. We differentiate this twice to get $p'(t) = 0$ and $p''(t) = 0$ so that, on substituting these into our equation, we get

\[ 6\alpha = 12 \implies \alpha = 2. \]

- The general solution is then given by the sum of its complementary function and its particular integral, i.e. we have

\[ p(t) = A e^{t} + B e^{6t} + 2, \]

where $A$ and $B$ are arbitrary constants.

Then given the initial condition $p(0) = 7$ we have

\[ 7 = A + B + 2 \implies A + B = 5, \]

and since

\[ p'(t) = A e^{t} + 6B e^{6t}, \]

the other initial condition, $p'(0) = 15$, gives us

\[ 15 = A + 6B. \]

Solving these equations, say by subtracting one from the other, we get $5B = 10$ which gives us $B = 2$ and so, from the first equation, $A = 3$. Consequently, the particular solution we seek is

\[ p(t) = 3e^{t} + 2e^{6t} + 2, \]

and this describes how the equilibrium price changes with time. Indeed, in the long-term, as both $3e^{t}$ and $2e^{6t}$ tend to infinity as $t \to \infty$, we see that $p(t) \to \infty$ too.
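The particular solution can be checked by substituting it back into the ODE. Here is a minimal Python sketch, with the derivatives of $p(t) = 3e^{t} + 2e^{6t} + 2$ written out by hand; the function names are chosen here for illustration.

```python
import math

def p(t):
    return 3 * math.exp(t) + 2 * math.exp(6 * t) + 2

def dp(t):
    # first derivative of p(t)
    return 3 * math.exp(t) + 12 * math.exp(6 * t)

def d2p(t):
    # second derivative of p(t)
    return 3 * math.exp(t) + 72 * math.exp(6 * t)

def residual(t):
    """Left-hand side minus right-hand side of p'' - 7p' + 6p = 12."""
    return d2p(t) - 7 * dp(t) + 6 * p(t) - 12

print(p(0), dp(0))  # initial conditions: 7.0 15.0
print([residual(t) for t in (0.0, 0.5, 1.0)])  # each close to zero
```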

Exercises

Exercise 8.1

Find the general solution of the ODE

\[ \frac{dy}{dx} + \frac{xy}{1+x^2} = \sqrt{1+x^2}. \]

What is the particular solution if $y(0) = 1$?

Exercise 8.2

Use the substitution $w(t) = y'(t)$ to show that the ODE

\[ \frac{d^2y}{dt^2} - \frac{3}{t}\frac{dy}{dt} = 3, \]

can be written as a linear ODE in terms of $w(t)$. Solve this linear ODE for $w(t)$ and hence find the general solution of the original ODE.

Exercise 8.3

Find the particular solution of the ODE

\[ y''(t) - 5y'(t) + 6y(t) = 10\sin t, \]

given that $y(0) = 0$ and $y'(0) = 1$.

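Exercise 8.4

The functions $f(t)$ and $g(t)$ are related by the first-order ODEs

\[ f'(t) = 3f(t) - g(t) \quad\text{and}\quad g'(t) = 3g(t) - f(t). \]

Find $f(t)$ and $g(t)$ given that $f(0) = 2$ and $g(0) = 0$.

Exercise 8.5

The elasticity of demand for a good is given by

\[ \varepsilon(p) = \frac{2p^2}{p^2+1}. \]

Find the demand function, $q^D(p)$, given that $q = 4$ when $p = 1$.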

Solutions to exercises

Solution to exercise 8.1

We solve this linear first-order ODE using the method in Section 8.2.2. Here $P(x) = x/(1+x^2)$ and we start by seeing that the integral

\[ \int P(x)\,dx = \int \frac{x}{1+x^2}\,dx = \tfrac{1}{2}\ln|1+x^2| + c, \]

and so, as $\tfrac{1}{2}\ln(1+x^2)$ is an antiderivative of $x/(1+x^2)$, the integrating factor is

\[ \mu(x) = e^{\frac{1}{2}\ln(1+x^2)} = \sqrt{1+x^2}. \]

Then, as $Q(x) = \sqrt{1+x^2}$, we have

\[ \mu(x)y(x) = \int \mu(x)Q(x)\,dx \implies y(x)\sqrt{1+x^2} = \int (1+x^2)\,dx \implies y(x)\sqrt{1+x^2} = x + \frac{x^3}{3} + c, \]

where $c$ is an arbitrary constant. As such, we find that

\[ y(x) = \frac{x}{\sqrt{1+x^2}} + \frac{x^3}{3\sqrt{1+x^2}} + \frac{c}{\sqrt{1+x^2}}, \]

is the general solution. If $y(0) = 1$, this gives us $c = 1$, and so

\[ y(x) = \frac{3x + x^3 + 3}{3\sqrt{1+x^2}}, \]

is the particular solution.
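A numerical spot check is one way to gain confidence in a particular solution like this. The sketch below assumes the ODE $\frac{dy}{dx} + \frac{xy}{1+x^2} = \sqrt{1+x^2}$ and the particular solution $y(x) = \frac{3x + x^3 + 3}{3\sqrt{1+x^2}}$, and approximates $dy/dx$ with a central difference; the step size and sample points are arbitrary choices.

```python
import math

def y(x):
    """Particular solution with y(0) = 1."""
    return (3 * x + x**3 + 3) / (3 * math.sqrt(1 + x**2))

def residual(x, h=1e-5):
    """dy/dx + x y / (1 + x^2) - sqrt(1 + x^2), derivative by central difference."""
    slope = (y(x + h) - y(x - h)) / (2 * h)
    return slope + x * y(x) / (1 + x**2) - math.sqrt(1 + x**2)

print(y(0))  # the condition y(0) = 1
print([abs(residual(x)) < 1e-6 for x in (-2.0, 0.0, 0.5, 3.0)])
```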

Solution to exercise 8.2

Given that $w(t) = y'(t)$, we have $w'(t) = y''(t)$, and so the given ODE, i.e.

\[ \frac{d^2y}{dt^2} - \frac{3}{t}\frac{dy}{dt} = 3 \quad\text{becomes}\quad \frac{dw}{dt} - \frac{3}{t}w(t) = 3, \]

which is a linear first-order ODE in $w(t)$. We solve this ODE using the method in Section 8.2.2. Here $P(t) = -3/t$ and we start by seeing that the integral

\[ \int P(t)\,dt = -\int \frac{3}{t}\,dt = -3\ln|t| + c, \]

and so $-3\ln t$ is an antiderivative of $-3/t$ which means that the integrating factor, $\mu(t)$, is given by

\[ \mu(t) = e^{-3\ln t} = e^{\ln(t^{-3})} = t^{-3}. \]

Then, as $Q(t) = 3$, we have

\[ \mu(t)w(t) = \int \mu(t)Q(t)\,dt \implies t^{-3}w(t) = \int 3t^{-3}\,dt \implies t^{-3}w(t) = -\frac{3}{2}t^{-2} + c, \]

so that

\[ w(t) = -\frac{3t}{2} + ct^3, \]

where $c$ is an arbitrary constant. Then, as $w(t) = y'(t)$, we see that

\[ y(t) = \int w(t)\,dt = \int \left(-\frac{3t}{2} + ct^3\right)dt = -\frac{3}{4}t^2 + \frac{c}{4}t^4 + d, \]

where $d$ is another arbitrary constant. This is the general solution of the original ODE.

Solution to exercise 8.3

The given ODE is a non-homogeneous second-order ODE with constant coefficients and we solve it using the method of Section 8.3.2. In particular:

- The corresponding homogeneous second-order ODE is

\[ y''(t) - 5y'(t) + 6y(t) = 0, \]

and so the auxiliary equation is

\[ k^2 - 5k + 6 = 0 \implies (k-2)(k-3) = 0, \]

which has the two real solutions $k = 2$ and $k = 3$. This means that the complementary function, $y_c(t)$, is

\[ y_c(t) = A e^{2t} + B e^{3t}, \]

for arbitrary constants $A$ and $B$.

- The right-hand-side of the given ODE is $10\sin t$ and this suggests that we try a particular integral of the form

\[ y_p(t) = \alpha\sin t + \beta\cos t. \]

We differentiate this twice to get

\[ y_p'(t) = \alpha\cos t - \beta\sin t \quad\text{and}\quad y_p''(t) = -\alpha\sin t - \beta\cos t, \]

so that, on substituting these into the ODE, we get

\[ (-\alpha\sin t - \beta\cos t) - 5(\alpha\cos t - \beta\sin t) + 6(\alpha\sin t + \beta\cos t) = 10\sin t. \]

Then, equating the coefficients of the terms on both sides we see that, from the $\sin t$ term, we get

\[ -\alpha + 5\beta + 6\alpha = 10 \implies \alpha + \beta = 2, \]

whereas, from the $\cos t$ term, we get

\[ -\beta - 5\alpha + 6\beta = 0 \implies \alpha = \beta, \]

and so, solving these two equations simultaneously, we find that $\alpha = 1$ and $\beta = 1$. Consequently, we see that

\[ y_p(t) = \sin t + \cos t, \]

is the particular integral.

- The general solution is then given by the sum of its complementary function and its particular integral, i.e. we have

\[ y(t) = A e^{2t} + B e^{3t} + \sin t + \cos t, \]

where $A$ and $B$ are arbitrary constants.

We can now use the initial condition $y(0) = 0$ to see that

\[ 0 = A + B + 0 + 1 \implies A + B = -1, \]

and, as

\[ y'(t) = 2A e^{2t} + 3B e^{3t} + \cos t - \sin t, \]

the initial condition $y'(0) = 1$ gives us

\[ 1 = 2A + 3B + 1 - 0 \implies 2A + 3B = 0. \]

Solving these two equations simultaneously gives $A = -3$ and $B = 2$, so that

\[ y(t) = -3e^{2t} + 2e^{3t} + \sin t + \cos t, \]

is the sought after particular solution.
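Substituting the answer back into the ODE is a useful check on the signs of the constants. A minimal Python sketch, assuming the particular solution $y(t) = -3e^{2t} + 2e^{3t} + \sin t + \cos t$ with its derivatives written out by hand:

```python
import math

def y(t):
    return -3 * math.exp(2 * t) + 2 * math.exp(3 * t) + math.sin(t) + math.cos(t)

def dy(t):
    # first derivative of y(t)
    return -6 * math.exp(2 * t) + 6 * math.exp(3 * t) + math.cos(t) - math.sin(t)

def d2y(t):
    # second derivative of y(t)
    return -12 * math.exp(2 * t) + 18 * math.exp(3 * t) - math.sin(t) - math.cos(t)

def residual(t):
    """Left-hand side minus right-hand side of y'' - 5y' + 6y = 10 sin t."""
    return d2y(t) - 5 * dy(t) + 6 * y(t) - 10 * math.sin(t)

print(y(0), dy(0))  # initial conditions: 0.0 1.0
print([residual(t) for t in (0.0, 0.5, 1.0, 2.0)])  # each close to zero
```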

Solution to exercise 8.4

We will solve the given system of first-order ODEs by rewriting it as a second-order ODE in $f(t)$. To do this we note that, rearranging the first ODE gives us

\[ g = 3f - \frac{df}{dt}, \tag{8.6} \]

and, differentiating this with respect to $t$, we get

\[ \frac{dg}{dt} = 3\frac{df}{dt} - \frac{d^2f}{dt^2}. \]

Consequently, if we substitute these two expressions into the second ODE, we get

\[ 3\frac{df}{dt} - \frac{d^2f}{dt^2} = 3\left(3f - \frac{df}{dt}\right) - f \implies \frac{d^2f}{dt^2} - 6\frac{df}{dt} + 8f = 0, \]

which is our sought after second-order ODE in $f(t)$. As it is a homogeneous second-order ODE with constant coefficients, this can be solved using the method of Section 8.3.1. The auxiliary equation is

\[ k^2 - 6k + 8 = 0 \implies (k-2)(k-4) = 0, \]

which has two real solutions given by $k = 2$ and $k = 4$ which means that the general solution for $f(t)$ is

\[ f(t) = A e^{2t} + B e^{4t}, \]

for arbitrary constants $A$ and $B$. To find the general solution for $g(t)$, we note that using (8.6) and the fact that

\[ f'(t) = 2A e^{2t} + 4B e^{4t}, \]

we get

\[ g(t) = 3[A e^{2t} + B e^{4t}] - [2A e^{2t} + 4B e^{4t}] = A e^{2t} - B e^{4t}, \]

in terms of the same arbitrary constants $A$ and $B$ as before. Thus, the general solution to this system of first-order ODEs is

\[ f(t) = A e^{2t} + B e^{4t} \quad\text{and}\quad g(t) = A e^{2t} - B e^{4t}. \]

However, we are also given the conditions $f(0) = 2$ and $g(0) = 0$ which imply that

\[ 2 = A + B \quad\text{and}\quad 0 = A - B. \]

Solving these two equations simultaneously then gives us $A = 1$ and $B = 1$ which means that

\[ f(t) = e^{2t} + e^{4t} \quad\text{and}\quad g(t) = e^{2t} - e^{4t}, \]

are the sought after functions.
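The pair of functions can be checked against the system directly. The sketch below assumes the system is $f' = 3f - g$ and $g' = 3g - f$; the second equation is inferred here from the substitution step in the working, so treat it as an assumption.

```python
import math

def f(t):
    return math.exp(2 * t) + math.exp(4 * t)

def g(t):
    return math.exp(2 * t) - math.exp(4 * t)

def df(t):
    # derivative of f(t)
    return 2 * math.exp(2 * t) + 4 * math.exp(4 * t)

def dg(t):
    # derivative of g(t)
    return 2 * math.exp(2 * t) - 4 * math.exp(4 * t)

def residuals(t):
    """Residuals of f' = 3f - g and of the assumed g' = 3g - f."""
    return df(t) - (3 * f(t) - g(t)), dg(t) - (3 * g(t) - f(t))

print(f(0), g(0))  # conditions f(0) = 2 and g(0) = 0
print([residuals(t) for t in (0.0, 0.5, 1.0)])  # pairs close to zero
```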

Solution to exercise 8.5

Using the definition of elasticity with $q = q^D(p)$ and the given expression we have

\[ -\frac{p}{q}\frac{dq}{dp} = \frac{2p^2}{p^2+1} \implies \frac{1}{q}\frac{dq}{dp} = -\frac{2p}{p^2+1}, \]

which is a separable ODE. So, using the method of Section 8.2.1, we write this as

\[ \int \frac{dq}{q} = -\int \frac{2p}{p^2+1}\,dp, \]

and determine the integrals to get

\[ \ln|q| = -\ln|p^2+1| + c, \]

so that

\[ q = e^{-\ln(p^2+1)+c} = e^{c}e^{\ln(p^2+1)^{-1}} = e^{c}(p^2+1)^{-1} = \frac{e^{c}}{p^2+1}, \]

where we can remove the modulus signs since, economically, $q$ is positive and $p^2+1$ is always positive too. Then, using the fact that $q = 4$ when $p = 1$, we see that $e^c = 8$ and so

\[ q = q^D(p) = \frac{8}{p^2+1}, \]

is the sought after demand function.

Footnote 9: Here we have implicitly used the substitution $u = 1 + p^2$ to determine the integral on the right-hand-side.
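To confirm that a demand function reproduces a given elasticity, one can evaluate $\varepsilon(p) = -\frac{p}{q}\frac{dq}{dp}$ exactly with rational arithmetic. A minimal Python sketch for $q^D(p) = 8/(p^2+1)$, with the derivative supplied by hand and illustrative sample prices:

```python
from fractions import Fraction

def demand(p):
    """q^D(p) = 8 / (p^2 + 1)."""
    return Fraction(8) / (p**2 + 1)

def demand_slope(p):
    """Derivative of 8 / (p^2 + 1) with respect to p."""
    return Fraction(-16) * p / (p**2 + 1) ** 2

def elasticity(p):
    """Elasticity of demand, -(p / q) dq/dp."""
    return -p / demand(p) * demand_slope(p)

for p in (Fraction(1, 2), Fraction(1), Fraction(3)):
    # the two printed values in each row should agree
    print(p, elasticity(p), 2 * p**2 / (p**2 + 1))
```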

Appendix A

Sample examination paper

Important note: This sample examination paper reflects the intended examination

and assessment arrangements for this course in the academic year 2011-2012. The

format and structure of the examination may have changed since the publication of this

subject guide. You can find the most recent examination papers on the VLE where all

changes to the format of the examination are posted.

Calculus

Time allowed: THREE hours.

Candidates should answer all FIVE questions. All questions carry equal marks (20

marks each).

Calculators may not be used for this paper.

1. (a) (i) Find $\int t\cos t\,dt$.

(ii) Show that the differential equation

\[ \frac{x^3}{\cos(y/x)} + xy^2 - x^2y\frac{dy}{dx} = 0, \]

is homogeneous and find its degree of homogeneity.

(iii) Hence find the general solution of the differential equation in (ii) leaving your answer in terms of $y/x$.

(b) A plane, $P$, in $\mathbb{R}^3$ contains the point $(3, 4, -1)$ and has normal $(-4, 8, -4)^T$. Find the Cartesian equation of this plane.

It is known that the surface, $S$, with equation

\[ x^2 + y^2 + z^2 = c, \]

for some $c \in \mathbb{R}$ has $P$ as a tangent plane. Find the value of $c$ that makes this the case and find the point on this surface which has $P$ as its tangent plane.

Another surface with equation

\[ x^2 + y^2 + \alpha z^2 = \beta, \]

for some $\alpha, \beta \in \mathbb{R}$ intersects $S$ orthogonally at the point $(4, 3, 5)$. Find the values of $\alpha$ and $\beta$ that make this the case.

2. A market has an equilibrium price of 14 and an equilibrium quantity of 6.

(a) If this market's elasticity of demand is given by

\[ \varepsilon(p) = \frac{p}{26 - p}, \]

find its demand function.

(b) The market's inverse supply function has the form

\[ p^S(q) = aq + b, \]

for some numbers $a$ and $b$. Given that the producer surplus is 36, find the values of $a$ and $b$. Hence deduce the supply function, $q^S(p)$, for this market.

(c) An excise (or per-unit) tax of $T$ is imposed on the market. Find the new equilibrium price and quantity of the market.

Hence find the value of $T$ that maximises the tax revenue.

3. (a) A function $f : \mathbb{R}^2 \to \mathbb{R}$ is defined by

\[ f(x, y) = x^2 - 2x - y^3 + y^2 + y. \]

Find the regions, if any, in the $(x, y)$-plane where $f$ is convex, concave or neither.

Does $f$ have a global minimum or a global maximum? Justify your answer.

(b) Find the general solution of the differential equation

\[ y''(t) - 2y'(t) + y(t) = e^{t}. \]

4. If a firm uses amounts $k$ and $l$ of capital and labour respectively, then it can produce an amount $q(k, l) = k^{\alpha}l^{\alpha}$ where $0 < \alpha < 1/2$. Supposing that the firm is producing an amount $Q$, use the method of Lagrange multipliers to show that the minimum amount it can spend on capital and labour is given by

\[ 2\sqrt{vw}\,Q^{\frac{1}{2\alpha}}, \]

where each unit of capital costs $v$ and each unit of labour costs $w$. By sketching the constraint and some appropriate contours, you should justify your use of the method of Lagrange multipliers and explain why your answer is a minimum.

The product manufactured by the firm sells at a fixed price, $p$, and the raw materials required to produce each unit cost an amount, $c$, where $c < p$. If the firm acts in a way which minimises its capital and labour costs, use the result just obtained to determine the production level, $Q$, that will maximise its profit.

5. (a) Find the fifth-order Maclaurin series for $e^{\sin x}$.

(b) Determine the integral

\[ \int \frac{\cos x}{(1 - \sin x)(2 + \sin x)}\,dx. \]

(c) Find and classify the stationary points of the function

\[ f(x) = \frac{x}{12} - \sqrt[3]{x}. \]

Appendix B

Solutions to the sample examination paper

Question 1.

(a) For (i), we use integration by parts to see that, differentiating the $t$ and integrating the $\cos t$, we get

\[ \int t\cos t\,dt = t\sin t - \int \sin t\,dt = t\sin t + \cos t + c, \]

where $c$ is an arbitrary constant.

For (ii), we compare the first-order differential equation with the standard form

\[ M(x, y) + N(x, y)\frac{dy}{dx} = 0, \]

to see that

\[ M(x, y) = \frac{x^3}{\cos(y/x)} + xy^2 \quad\text{and}\quad N(x, y) = -x^2y. \]

Then, for any $\lambda$, we have

\[ M(\lambda x, \lambda y) = \frac{(\lambda x)^3}{\cos(\lambda y/\lambda x)} + (\lambda x)(\lambda y)^2 = \lambda^3 M(x, y), \]

and

\[ N(\lambda x, \lambda y) = -(\lambda x)^2(\lambda y) = \lambda^3 N(x, y), \]

i.e. both $M(x, y)$ and $N(x, y)$ are homogeneous of degree 3. Consequently, this differential equation is homogeneous of degree 3.

For (iii), as the differential equation in (ii) is homogeneous, we make the substitution $y(x) = xv(x)$ so that, using the product rule, we have

\[ \frac{dy}{dx} = v(x) + x\frac{dv}{dx}, \]

and the differential equation becomes

\[ \frac{x^3}{\cos v} + x^3v^2 - x^3v\left(v + x\frac{dv}{dx}\right) = 0, \]

which, when simplified, yields

\[ v\cos v\,\frac{dv}{dx} = \frac{1}{x}, \]

which is a separable differential equation. Rewriting this in the usual way then gives

\[ \int \frac{dx}{x} = \int v\cos v\,dv \implies \ln|x| + c = v\sin v + \cos v, \]

where $c$ is an arbitrary constant and we have used (i) to find the integral on the right-hand-side. So, using $y(x) = xv(x)$ again, we see that

\[ \frac{y}{x}\sin\frac{y}{x} + \cos\frac{y}{x} = \ln|x| + c, \]

is the general solution in terms of $y/x$. (This expression cannot be usefully simplified any further.)

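(b) As the plane, $P$, contains the point $(3, 4, -1)$ and has normal $(-4, 8, -4)^T$, we have

\[ \begin{pmatrix} -4 \\ 8 \\ -4 \end{pmatrix} \cdot \begin{pmatrix} x - 3 \\ y - 4 \\ z + 1 \end{pmatrix} = 0 \implies -4(x-3) + 8(y-4) - 4(z+1) = 0 \implies x - 2y + z = -6, \]

as its Cartesian equation.

The surface, $S$, can be written as $f(x, y, z) = c$ with

\[ f(x, y, z) = x^2 + y^2 + z^2, \]

for constant $c$. At any point, $(x, y, z)$, on the surface, its normal vector is given by

\[ \nabla f = \begin{pmatrix} f_x \\ f_y \\ f_z \end{pmatrix} = \begin{pmatrix} 2x \\ 2y \\ 2z \end{pmatrix}, \]

and in order for this to be in the same direction as the normal to $P$, there must be some $\lambda$ that makes

\[ \nabla f = \lambda\begin{pmatrix} -4 \\ 8 \\ -4 \end{pmatrix} \implies \begin{pmatrix} 2x \\ 2y \\ 2z \end{pmatrix} = \lambda\begin{pmatrix} -4 \\ 8 \\ -4 \end{pmatrix} \implies \begin{pmatrix} x \\ y \\ z \end{pmatrix} = \lambda\begin{pmatrix} -2 \\ 4 \\ -2 \end{pmatrix}. \]

Of course, we also need the point, $(x, y, z)$, to lie on $P$ and so we also have

\[ x - 2y + z = -6 \implies -2\lambda - 8\lambda - 2\lambda = -6 \implies \lambda = \frac{1}{2}, \]

i.e. this is the value of $\lambda$ that we need. Thus, the point on $S$ that we seek is $(-1, 2, -1)$ and, using the equation for $S$, we get

\[ c = (-1)^2 + (2)^2 + (-1)^2 = 6, \]

as the required value of $c$.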

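The new surface can be written as $g(x, y, z) = \beta$ with

\[ g(x, y, z) = x^2 + y^2 + \alpha z^2, \]

for constants $\alpha$ and $\beta$. At any point $(x, y, z)$ on the surface, its normal vector is given by

\[ \nabla g = \begin{pmatrix} g_x \\ g_y \\ g_z \end{pmatrix} = \begin{pmatrix} 2x \\ 2y \\ 2\alpha z \end{pmatrix} \implies \nabla g = \begin{pmatrix} 8 \\ 6 \\ 10\alpha \end{pmatrix}, \]

at the point $(4, 3, 5)$. We also see that the normal to $S$ at the point $(4, 3, 5)$ is

\[ \nabla f = \begin{pmatrix} 8 \\ 6 \\ 10 \end{pmatrix}, \]

and, in order for these two surfaces to intersect orthogonally at this point, we must have

\[ \nabla g \cdot \nabla f = 0 \implies \begin{pmatrix} 8 \\ 6 \\ 10\alpha \end{pmatrix} \cdot \begin{pmatrix} 8 \\ 6 \\ 10 \end{pmatrix} = 0 \implies 64 + 36 + 100\alpha = 0 \implies \alpha = -1, \]

as the value of $\alpha$ that we seek. Then, as the point $(4, 3, 5)$ must lie on the new surface, we also have

\[ x^2 + y^2 - z^2 = \beta \implies 4^2 + 3^2 - (5^2) = \beta \implies \beta = 0, \]

as the value of $\beta$ that we seek.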

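Question 2.

(a) The elasticity of demand is given by the formula

\[ \varepsilon(p) = -\frac{p}{q}\frac{dq}{dp}, \]

where $q = q^D(p)$ is the demand function. In this question, we are told that

\[ \varepsilon = \frac{p}{26 - p}, \]

and so we have

\[ -\frac{p}{q}\frac{dq}{dp} = \frac{p}{26 - p} \implies \frac{1}{q}\frac{dq}{dp} = \frac{1}{p - 26}, \]

which is a separable ODE. We solve it by separating the variables and integrating both sides to get

\[ \int \frac{1}{q}\,dq = \int \frac{1}{p - 26}\,dp \implies \ln|q| = \ln|p - 26| + c \implies q = A(p - 26), \]

for some arbitrary constant, $A$. Then, using the fact that the equilibrium price is 14 and the equilibrium quantity is 6, we can see that $A$ must satisfy the equation

\[ 6 = A(14 - 26) \implies A = -\frac{6}{12} = -\frac{1}{2}. \]

Putting this all together, we then see that we have $q = q^D(p)$ where

\[ q^D(p) = 13 - \frac{p}{2}, \]

is the demand function.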

(b) The producer surplus is given by

\[ PS = p^* q^* - \int_0^{q^*} p^S(q)\,dq, \]

where $p^*$ and $q^*$ are the equilibrium price and quantity, and $p^S(q)$ is the inverse supply function. So, using the information given in the question, we have

\[ 36 = (14)(6) - \int_0^6 (aq + b)\,dq \implies 48 = \left[a\frac{q^2}{2} + bq\right]_0^6 \implies 48 = 18a + 6b, \]

or, indeed, $8 = 3a + b$ as our first equation for $a$ and $b$. Another equation that needs to be satisfied is

\[ 14 = 6a + b, \]

as the equilibrium quantity must give the equilibrium price when we use the inverse supply function. We can easily solve these equations for the constants $a$ and $b$ by subtracting one from the other to get $a = 2$ and then, using the first equation again, we get $b = 2$. Consequently, we have

\[ p^S(q) = 2q + 2 \quad\text{so that}\quad q^S(p) = \frac{p}{2} - 1, \]

is the supply function for this market.

Footnote 1: Of course, an alternative method here would be to observe that the supply function is a straight line and so the producer surplus is the area of a triangular region whose height is $p^* - b$ and whose width is $q^*$. This means that, if we find the area of this triangle, we have

\[ 36 = \tfrac{1}{2}(14 - b)(6) \implies 14 - b = 12 \implies b = 2. \]

Then, again using the fact that equilibrium quantity must give the equilibrium price when we use the inverse supply function, we use $b = 2$ to see that

\[ 14 = 6a + b \implies a = \frac{14 - 2}{6} = 2, \]

which again gives $p^S(q) = 2q + 2$ so that $q^S(p) = \frac{p}{2} - 1$.

(c) In the presence of an excise tax of $T$, the supply function becomes

\[ q^S_T(p) = q^S(p - T) = \frac{1}{2}(p - T) - 1, \]

whereas the demand function is unchanged, i.e. $q^D_T(p) = q^D(p)$.

This means that, in the presence of the excise tax of $T$, the new equilibrium price is given by

\[ q^S_T(p) = q^D_T(p) \implies \frac{1}{2}(p - T) - 1 = 13 - \frac{p}{2} \implies p = 14 + \frac{T}{2}, \]

and, using $q^D_T(p)$ say, we see that the new equilibrium quantity is

\[ q = 13 - \frac{1}{2}\left(14 + \frac{T}{2}\right) = 6 - \frac{T}{4}. \]

We can now find the tax revenue, $R(T)$, which is the tax per unit, $T$, multiplied by $q$, the amount being sold in the presence of the tax, i.e. we have

\[ R(T) = Tq = T\left(6 - \frac{T}{4}\right) = 6T - \frac{T^2}{4}. \]

To see where this is maximised, we start by noting that $R(T)$ has a stationary point when $R'(T) = 0$, i.e. when

\[ 6 - \frac{T}{2} = 0 \implies T = 12, \]

and since $R''(T) = -1/2 < 0$ this turning point is indeed a maximum. Thus, the tax revenue is maximised when $T = 12$.
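The result can be cross-checked by computing the revenue over a range of tax levels. The following minimal Python sketch uses exact rational arithmetic and assumes the demand function $13 - p/2$ and the taxed supply function $\frac{1}{2}(p - T) - 1$; the grid of integer tax levels is an illustrative choice.

```python
from fractions import Fraction

def equilibrium(tax):
    """Price and quantity at which (p - T)/2 - 1 equals 13 - p/2."""
    tax = Fraction(tax)
    price = 14 + tax / 2
    quantity = 13 - price / 2
    return price, quantity

def revenue(tax):
    """Tax revenue R(T) = T times the quantity sold under the tax."""
    return Fraction(tax) * equilibrium(tax)[1]

# Integer tax levels 0..24, chosen only for illustration.
best = max(range(0, 25), key=revenue)
print(best, revenue(best))  # 12 36
```

With no tax the function returns the original equilibrium, a price of 14 and a quantity of 6.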

Question 3.

(a) The first-order partial derivatives of $f(x, y)$ are

\[ f_x(x, y) = 2x - 2 \quad\text{and}\quad f_y(x, y) = -3y^2 + 2y + 1. \]

At a stationary point, both of these first-order partial derivatives are zero, i.e. we must have $f_x(x, y) = 0$ and $f_y(x, y) = 0$. Thus, to find the stationary points we have to solve the simultaneous equations

\[ 2x - 2 = 0 \quad\text{and}\quad -3y^2 + 2y + 1 = 0. \]

But, the first equation gives us $x = 1$ and the second equation gives us

\[ 3y^2 - 2y - 1 = 0 \implies (3y + 1)(y - 1) = 0 \implies y = -\frac{1}{3} \text{ or } 1. \]

Consequently, the points $(1, -1/3)$ and $(1, 1)$ are the stationary points of this function.

The second-order partial derivatives of this function are

\[ f_{xx}(x, y) = 2, \quad f_{xy}(x, y) = 0 \quad\text{and}\quad f_{yy}(x, y) = -6y + 2, \]

and so the Hessian is

\[ H(x, y) = (2)(-6y + 2) - (0)^2 = 4(1 - 3y). \]

Evaluating this at each of the stationary points we then find that:

- At $(1, -1/3)$, the Hessian is $H(1, -1/3) = 4(2) > 0$ and $f_{xx}(1, -1/3) = 2 > 0$, and so this is a local minimum.
- At $(1, 1)$, the Hessian is $H(1, 1) = 4(-2) < 0$, and so this is a saddle point.

Thus, the stationary points $(1, -1/3)$ and $(1, 1)$ are a local minimum and a saddle point respectively.

To see where the function is convex, concave or neither we note that the Hessian is given by

\[ H(x, y) = 4(1 - 3y) \quad\text{and}\quad f_{xx}(x, y) = 2, \]

and so we see that:

- When $y > 1/3$, $H(x, y) < 0$ and so the function is neither convex nor concave.
- When $y \le 1/3$, $H(x, y) \ge 0$ and $f_{xx}(x, y) \ge 0$ and so the function is convex.

Lastly, if we consider the behaviour of the function when $x = 0$ we have

\[ f(0, y) = -y^3 + y^2 + y, \]

and as such, we see that:

- As $y \to \infty$, $f(0, y) \to -\infty$ and so $f(x, y)$ cannot have a global minimum.
- As $y \to -\infty$, $f(0, y) \to \infty$ and so $f(x, y)$ cannot have a global maximum.
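The second-derivative test used above can be written as a small function. This is a minimal Python sketch specialised to $f(x, y) = x^2 - 2x - y^3 + y^2 + y$, for which $f_{xx} = 2$, $f_{xy} = 0$ and $f_{yy} = 2 - 6y$; the function names are illustrative.

```python
from fractions import Fraction

def hessian_det(y):
    """H(x, y) = f_xx * f_yy - f_xy^2 with f_xx = 2, f_xy = 0, f_yy = 2 - 6y."""
    return 2 * (2 - 6 * y) - 0**2

def classify(y, f_xx=2):
    """Classify a stationary point from the Hessian determinant and f_xx."""
    h = hessian_det(y)
    if h > 0:
        return "local minimum" if f_xx > 0 else "local maximum"
    if h < 0:
        return "saddle point"
    return "inconclusive"

print(classify(Fraction(-1, 3)))  # local minimum
print(classify(Fraction(1)))      # saddle point
```

Since $f_{xx}$ is constant and positive here, the sign of $1 - 3y$ alone decides between the two cases.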

(b) To solve the given non-homogeneous second-order differential equation, we follow the method in Section 8.3.2. In particular:

- The corresponding homogeneous second-order ODE is

\[ y''(t) - 2y'(t) + y(t) = 0, \]

and so the auxiliary equation is

\[ k^2 - 2k + 1 = 0 \implies (k - 1)^2 = 0, \]

which has the repeated real solution $k = 1$. This means that the complementary function, $y_c(t)$, is

\[ y_c(t) = (At + B)e^{t}, \]

for arbitrary constants $A$ and $B$.

- The right-hand-side of the given ODE is $e^t$ and our first reaction in this case would be to take $y_p(t) = \alpha e^t$ where $\alpha$ is a constant that has to be determined. But, this won't work as, taking $A = 0$ and $B = \alpha$, we see that this is part of the complementary function. As such, we multiply by $t$ and try $y_p(t) = \alpha t e^t$ which won't work either because, taking $A = \alpha$ and $B = 0$, we see that this is part of the complementary function as well. Consequently, we multiply by $t$ again and try $y_p(t) = \alpha t^2 e^t$ which will work because it is not part of the complementary function. So, differentiating this using the product rule, we have

\[ y_p'(t) = \alpha(2t)e^{t} + \alpha(t^2)e^{t} = \alpha(2t + t^2)e^{t}, \]

and

\[ y_p''(t) = \alpha(2 + 2t)e^{t} + \alpha(2t + t^2)e^{t} = \alpha(2 + 4t + t^2)e^{t}, \]

which means that, substituting these into our ODE, we get

\[ \alpha(2 + 4t + t^2)e^{t} - 2\alpha(2t + t^2)e^{t} + \alpha t^2 e^{t} = e^{t} \implies 2\alpha e^{t} = e^{t} \implies \alpha = \frac{1}{2}. \]

Consequently, we see that

\[ y_p(t) = \frac{t^2}{2}e^{t}, \]

is the particular integral.

- The general solution to our ODE is then given by the sum of its complementary function and its particular integral, i.e. we have

\[ y(t) = (At + B)e^{t} + \frac{t^2}{2}e^{t}, \]

where $A$ and $B$ are arbitrary constants.

Then given the conditions $y(0) = 1$ and $y(1) = 0$, we have the equations

\[ 1 = Be^{0} \quad\text{and}\quad 0 = (A + B)e^{1} + \frac{e^{1}}{2}, \]

respectively. The first of these gives $B = 1$ and then the second gives

\[ 0 = A + B + \frac{1}{2} \implies 0 = A + 1 + \frac{1}{2} \implies A = -\frac{3}{2}. \]

Consequently, we find that

\[ y(t) = \left(-\frac{3}{2}t + 1\right)e^{t} + \frac{t^2}{2}e^{t} = \frac{t^2 - 3t + 2}{2}e^{t}, \]

is the particular solution.

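Question 4.

The amount that the firm spends on capital and labour is

\[ C(k, l) = vk + wl, \]

and we want to minimise this subject to the constraint $q(k, l) = Q$ where $k, l > 0$. So, writing the constraint in the form $q(k, l) - Q = 0$, we get the Lagrangean

\[ L(k, l, \lambda) = vk + wl - \lambda(q(k, l) - Q) = vk + wl - \lambda(k^{\alpha}l^{\alpha} - Q), \]

and we seek the points which simultaneously satisfy the equations $L_k(k, l, \lambda) = 0$, $L_l(k, l, \lambda) = 0$ and $L_\lambda(k, l, \lambda) = 0$. As such, we find the first-order partial derivatives of $L(k, l, \lambda)$, i.e.

\[ L_k(k, l, \lambda) = v - \lambda\alpha k^{\alpha-1}l^{\alpha}, \quad L_l(k, l, \lambda) = w - \lambda\alpha k^{\alpha}l^{\alpha-1} \quad\text{and}\quad L_\lambda(k, l, \lambda) = -(k^{\alpha}l^{\alpha} - Q), \]

and set these equal to zero to yield the equations

\[ v - \lambda\alpha k^{\alpha-1}l^{\alpha} = 0, \quad w - \lambda\alpha k^{\alpha}l^{\alpha-1} = 0 \quad\text{and}\quad k^{\alpha}l^{\alpha} - Q = 0. \]

We now solve these by eliminating $\lambda$ from the first two equations, i.e. we get

\[ v - \lambda\alpha k^{\alpha-1}l^{\alpha} = 0 \implies \lambda = \frac{v}{\alpha k^{\alpha-1}l^{\alpha}} = \frac{vk}{\alpha k^{\alpha}l^{\alpha}}, \]

from the first equation and

\[ w - \lambda\alpha k^{\alpha}l^{\alpha-1} = 0 \implies \lambda = \frac{w}{\alpha k^{\alpha}l^{\alpha-1}} = \frac{wl}{\alpha k^{\alpha}l^{\alpha}}, \]

from the second equation. As such, we can equate these expressions for $\lambda$ to get

\[ \frac{vk}{\alpha k^{\alpha}l^{\alpha}} = \frac{wl}{\alpha k^{\alpha}l^{\alpha}} \implies l = \frac{v}{w}k. \]

We then use this new relationship between $k$ and $l$ in the third equation, which is just the constraint $k^{\alpha}l^{\alpha} = Q$, to get

\[ Q = k^{\alpha}\left(\frac{v}{w}k\right)^{\alpha} \implies Q = \left(\frac{v}{w}\right)^{\alpha}k^{2\alpha} \implies k^{2\alpha} = \left(\frac{w}{v}\right)^{\alpha}Q \implies k = \sqrt{\frac{w}{v}}\,Q^{\frac{1}{2\alpha}}, \]

and so

\[ l = \frac{v}{w}\sqrt{\frac{w}{v}}\,Q^{\frac{1}{2\alpha}} = \sqrt{\frac{v}{w}}\,Q^{\frac{1}{2\alpha}}. \]

Thus, these values of $k$ and $l$ minimise the cost of producing $Q$ units. The minimum cost is then given by

\[ \hat{C}(Q) = C\left(\sqrt{\frac{w}{v}}\,Q^{\frac{1}{2\alpha}}, \sqrt{\frac{v}{w}}\,Q^{\frac{1}{2\alpha}}\right) = v\sqrt{\frac{w}{v}}\,Q^{\frac{1}{2\alpha}} + w\sqrt{\frac{v}{w}}\,Q^{\frac{1}{2\alpha}} = 2\sqrt{vw}\,Q^{\frac{1}{2\alpha}}, \]

as required.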


To justify this, we note that the constraint $k^{\alpha}l^{\alpha} = Q$ looks a bit like a rectangular hyperbola and, for $k, l > 0$, this is illustrated in Figure B.1(a). The objective function, $C(k, l) = vk + wl$, has contours $C(k, l) = c$, where $c$ is a constant, that are straight lines as illustrated in Figure B.1(b). The direction in which $C(k, l)$ is decreasing is indicated in this figure along with the point we found above using the Lagrange multiplier method, i.e. a point where we have a contour of $C(k, l)$ which is both tangential to the constraint and touching the constraint. Having seen this, it should be clear that this point will minimise $C$ subject to the constraint.

(a)

(b)

Figure B.1: (a) The constraint q(k, l) = Q. (b) Adding three contours, C(k, l) = c, where

the point which is indicated in the figure.
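The minimum-cost formula $2\sqrt{vw}\,Q^{1/(2\alpha)}$ can be compared with a brute-force search along the constraint $k^{\alpha}l^{\alpha} = Q$. The sketch below is a rough numerical check only; the values of $v$, $w$, $\alpha$ and $Q$ and the search grid are hypothetical choices made for illustration.

```python
import math

def min_cost_formula(v, w, alpha, big_q):
    """Minimum cost 2 sqrt(vw) Q^(1/(2 alpha))."""
    return 2 * math.sqrt(v * w) * big_q ** (1 / (2 * alpha))

def cost_on_constraint(k, v, w, alpha, big_q):
    """Cost v k + w l, with l chosen so that k^alpha l^alpha = Q."""
    l = big_q ** (1 / alpha) / k
    return v * k + w * l

v, w, alpha, big_q = 4.0, 9.0, 0.25, 2.0  # hypothetical values
grid = [0.01 * i for i in range(1, 5001)]  # k from 0.01 to 50
best_k = min(grid, key=lambda k: cost_on_constraint(k, v, w, alpha, big_q))

print(best_k, math.sqrt(w / v) * big_q ** (1 / (2 * alpha)))  # both near 6
print(cost_on_constraint(best_k, v, w, alpha, big_q), min_cost_formula(v, w, alpha, big_q))  # both near 48
```

For these values the search settles at $k \approx 6$, which agrees with $\sqrt{w/v}\,Q^{1/(2\alpha)}$.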

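Using the given information, we can see that if $Q$ is produced then the revenue generated will be $R(Q) = pQ$ and the costs incurred will be

\[ C(Q) = cQ + \hat{C}(Q) + FC = cQ + 2\sqrt{vw}\,Q^{\frac{1}{2\alpha}} + FC, \]

which is the cost of the raw materials plus the costs of capital and labour plus any fixed costs the firm may have. As such, the profit function for the firm is

\[ \pi(Q) = R(Q) - C(Q) = pQ - cQ - 2\sqrt{vw}\,Q^{\frac{1}{2\alpha}} - FC, \]

and we want to find the value of $Q$ that maximises this. As such, we find that

\[ \pi'(Q) = p - c - 2\sqrt{vw}\,\frac{1}{2\alpha}Q^{\frac{1}{2\alpha}-1} = p - c - \frac{\sqrt{vw}}{\alpha}Q^{\frac{1-2\alpha}{2\alpha}}, \]

as the fixed costs, $FC$, are a constant and, setting this equal to zero, we find that

\[ \pi'(Q) = 0 \implies Q^{\frac{1-2\alpha}{2\alpha}} = \frac{\alpha(p - c)}{\sqrt{vw}} \implies Q = \left(\frac{\alpha(p - c)}{\sqrt{vw}}\right)^{\frac{2\alpha}{1-2\alpha}}, \]

is the only stationary point. Indeed, notice that this value of $Q$ is positive as $p > c$ and $\alpha > 0$. Furthermore, we have

\[ \pi''(Q) = -\frac{(1 - 2\alpha)\sqrt{vw}}{2\alpha^2}\,Q^{\frac{1-2\alpha}{2\alpha}-1}, \]

and as this is negative at the stationary point (since $0 < \alpha < 1/2$ implies that $\alpha > 0$ and $1 - 2\alpha > 0$) we see that our stationary point is a local maximum. Thus,

\[ Q = \left(\frac{\alpha(p - c)}{\sqrt{vw}}\right)^{\frac{2\alpha}{1-2\alpha}}, \]

is the production level that will maximise the profit.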

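Question 5.

(a) Using the facts that

\[ e^{y} = 1 + y + \frac{y^2}{2!} + \frac{y^3}{3!} + \frac{y^4}{4!} + \frac{y^5}{5!} + \cdots \quad\text{and}\quad \sin x = x - \frac{x^3}{3!} + \frac{x^5}{5!} - \cdots, \]

we have

\[ e^{\sin x} = 1 + \left(x - \frac{x^3}{3!} + \frac{x^5}{5!} - \cdots\right) + \frac{1}{2!}\left(x - \frac{x^3}{3!} + \cdots\right)^2 + \frac{1}{3!}\left(x - \frac{x^3}{3!} + \cdots\right)^3 + \frac{1}{4!}(x - \cdots)^4 + \frac{1}{5!}(x - \cdots)^5 + \cdots, \]

if we keep the relevant terms of the $\sin x$ series when we put them into the series for $e^y$. Then, multiplying out the brackets and, again, keeping the relevant terms we get

\[ e^{\sin x} = 1 + \left(x - \frac{x^3}{3!} + \frac{x^5}{5!}\right) + \frac{1}{2!}\left(x^2 + 2(x)\left(-\frac{x^3}{3!}\right) + \cdots\right) + \frac{1}{3!}\left(x^3 + 3(x)(x)\left(-\frac{x^3}{3!}\right) + \cdots\right) + \frac{x^4}{4!} + \frac{x^5}{5!} + \cdots, \]

which simplifies to

\[ e^{\sin x} = 1 + \left(x - \frac{x^3}{6} + \frac{x^5}{120}\right) + \frac{1}{2}\left(x^2 - \frac{x^4}{3}\right) + \frac{1}{6}\left(x^3 - \frac{x^5}{2}\right) + \frac{x^4}{24} + \frac{x^5}{120} + \cdots, \]

and so, collecting like terms, we get

\[ e^{\sin x} = 1 + x + \frac{x^2}{2} + 0x^3 - \frac{x^4}{8} - \frac{x^5}{15} + \cdots, \]

in terms up to $x^5$.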
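The composition of the two series can also be carried out mechanically with truncated polynomial arithmetic. The following minimal Python sketch substitutes the series for $\sin x$ into the series for $e^y$ and discards every term beyond $x^5$; the helper `multiply` and the list-of-coefficients representation are choices made here for illustration.

```python
from fractions import Fraction
from math import factorial

ORDER = 5  # keep terms up to x^5

def multiply(a, b):
    """Product of two coefficient lists, truncated at x^ORDER."""
    out = [Fraction(0)] * (ORDER + 1)
    for i, a_i in enumerate(a):
        for j, b_j in enumerate(b):
            if i + j <= ORDER:
                out[i + j] += a_i * b_j
    return out

# sin x = x - x^3/3! + x^5/5! - ..., stored as coefficients of x^0..x^5
sin_series = [Fraction(0), Fraction(1), Fraction(0),
              Fraction(-1, 6), Fraction(0), Fraction(1, 120)]

# e^y = sum of y^n / n!; powers of sin x above the fifth start at x^6.
result = [Fraction(0)] * (ORDER + 1)
power = [Fraction(1)] + [Fraction(0)] * ORDER  # (sin x)^0
for n in range(ORDER + 1):
    for i in range(ORDER + 1):
        result[i] += power[i] / factorial(n)
    power = multiply(power, sin_series)

print([str(c) for c in result])  # ['1', '1', '1/2', '0', '-1/8', '-1/15']
```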

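(b) Using the substitution $g = \sin x$, we have

\[ \frac{dg}{dx} = \cos x, \]

and so we have

\[ \int \frac{\cos x}{(1 - \sin x)(2 + \sin x)}\,dx = \int \frac{1}{(1 - g)(2 + g)}\,dg. \]

Using partial fractions, we write

\[ \frac{1}{(1 - g)(2 + g)} = \frac{A}{1 - g} + \frac{B}{2 + g}, \]

and find that $A = 1/3$ and $B = 1/3$. Consequently, we have

\[ \int \frac{1}{(1 - g)(2 + g)}\,dg = \int \left(\frac{1/3}{1 - g} + \frac{1/3}{2 + g}\right)dg = \frac{1}{3}\left(-\ln|1 - g| + \ln|2 + g|\right) + c = \frac{1}{3}\ln\left|\frac{2 + \sin x}{1 - \sin x}\right| + c, \]

as the answer.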
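An antiderivative found by substitution and partial fractions can be checked against a numerical integral. The sketch below compares $\frac{1}{3}\ln\left|\frac{2 + \sin x}{1 - \sin x}\right|$ with a composite Simpson's rule estimate on $[0, 1]$; the interval is an arbitrary choice that avoids the points where $\sin x = 1$.

```python
import math

def integrand(x):
    return math.cos(x) / ((1 - math.sin(x)) * (2 + math.sin(x)))

def antiderivative(x):
    return math.log(abs((2 + math.sin(x)) / (1 - math.sin(x)))) / 3

def simpson(f, a, b, n=1000):
    """Composite Simpson's rule with an even number n of subintervals."""
    h = (b - a) / n
    total = f(a) + f(b)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * f(a + i * h)
    return total * h / 3

a, b = 0.0, 1.0  # interval chosen to avoid sin x = 1
numeric = simpson(integrand, a, b)
exact = antiderivative(b) - antiderivative(a)
print(numeric, exact)  # the two values should agree closely
```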

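(c) To find the stationary points of the function $f(x)$ we write it as

\[ f(x) = \frac{x}{12} - x^{1/3}, \]

and so we have

\[ f'(x) = \frac{1}{12} - \frac{1}{3}x^{-2/3}. \]

The stationary points occur when $f'(x) = 0$ and so we need to solve the equation

\[ \frac{1}{12} - \frac{1}{3x^{2/3}} = 0 \implies \frac{x^{2/3} - 4}{x^{2/3}} = 0, \]

which means that

\[ x^{2/3} = 4 \implies x^2 = 64 \implies x = \pm 8. \]

To classify these stationary points, we note that

\[ f''(x) = -\frac{1}{3}\left(-\frac{2}{3}\right)x^{-5/3} = \frac{2}{9x^{5/3}}, \]

and so we see that:

- If $x = 8$, we have $f''(8) > 0$ and so this is a local minimum.
- If $x = -8$, we have $f''(-8) < 0$ and so this is a local maximum.

Thus, the stationary points when $x = 8$ and $x = -8$ are a local minimum and a local maximum respectively.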
