
Engineering Mathematics

Dr Colin Turner
October 15, 2009
Copyright Notice
The contents of this document are protected by a Creative Commons
License, that allows copying and modification with attribution, but not
commercial reuse. Please contact me for more details.
http://creativecommons.org/licenses/by-nc/2.0/uk/
Queries
Should you have any queries about these notes, you should approach your
lecturer or your tutor as soon as possible. Don't be afraid to ask questions; it
is possible you may have found an error, and if you have not, your questions
will help your lecturer / tutor understand the problems you are experiencing.
As mathematics is cumulative, it will be very hard to continue the module
with outstanding problems from the start; a bit of work at this point will
make the rest much easier going.
Practice
Mathematics requires practice. No matter how simple a procedure may
look when demonstrated in a lecture or a tutorial you can have no idea how
well you can perform it until you try. As there is very little opportunity
for practice in the university environment it is vitally important that you
attempt the questions provided in the tutorial, preferably before attending
the relevant tutorial class. Your time with your tutor will be best spent when
you arrive at the class with a list of problems you are unable to tackle, the
more specific the better. If you find the questions too hard before the tutorial,
do not become discouraged; the mere act of thinking about the problem will
have a positive effect on your understanding of the problem once it is explained
to you in the tutorial.
Contact Details
My contact details are as follows
Name Dr Colin Turner
Room 5F10
Phone 68084 (+44-28-9036-8084 externally)
Email c.turner@ulster.ac.uk
WWW http://newton.engj.ulst.ac.uk/crt/
Contents
1 Preliminaries 1
1.1 Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1
1.2 Notation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1
1.3 Arithmetic . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 2
1.3.1 The law of signs . . . . . . . . . . . . . . . . . . . . . . 2
1.3.2 Order of precedence . . . . . . . . . . . . . . . . . . . . 3
1.4 Decimal Places & Significant Figures . . . . . . . . . . . . . . 3
1.4.1 Decimal Places . . . . . . . . . . . . . . . . . . . . . . 3
1.4.2 Significant Figures . . . . . . . . . . . . . . . . . . . . 4
1.5 Standard Form . . . . . . . . . . . . . . . . . . . . . . . . . . 5
1.5.1 Standard prefixes . . . . . . . . . . . . . . . . . . . . . 5
2 Number Systems 8
2.1 Natural numbers . . . . . . . . . . . . . . . . . . . . . . . . . 8
2.2 Prime numbers . . . . . . . . . . . . . . . . . . . . . . . . . . 8
2.3 Integers . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 9
2.4 Real numbers . . . . . . . . . . . . . . . . . . . . . . . . . . . 9
2.5 Rational numbers . . . . . . . . . . . . . . . . . . . . . . . . . 9
2.6 Irrational Numbers . . . . . . . . . . . . . . . . . . . . . . . . 10
3 Basic Algebra 11
3.1 Rearranging Equations . . . . . . . . . . . . . . . . . . . . . . 11
3.1.1 Example . . . . . . . . . . . . . . . . . . . . . . . . . . 11
3.1.2 Order of Rearranging . . . . . . . . . . . . . . . . . . . 12
3.1.3 Example . . . . . . . . . . . . . . . . . . . . . . . . . . 13
3.1.4 Example . . . . . . . . . . . . . . . . . . . . . . . . . . 14
3.1.5 Example . . . . . . . . . . . . . . . . . . . . . . . . . . 15
3.2 Function Notation . . . . . . . . . . . . . . . . . . . . . . . . 15
3.3 Expansion of Brackets . . . . . . . . . . . . . . . . . . . . . . 17
3.3.1 Examples . . . . . . . . . . . . . . . . . . . . . . . . . 18
3.3.2 Brackets upon Brackets . . . . . . . . . . . . . . . . . . 18
3.3.3 Examples . . . . . . . . . . . . . . . . . . . . . . . . . 19
3.4 Factorization . . . . . . . . . . . . . . . . . . . . . . . . . . . 20
3.4.1 Examples . . . . . . . . . . . . . . . . . . . . . . . . . 21
3.5 Laws of Indices . . . . . . . . . . . . . . . . . . . . . . . . . . 21
3.5.1 Example proofs . . . . . . . . . . . . . . . . . . . . . 22
3.5.2 Examples . . . . . . . . . . . . . . . . . . . . . . . . . 23
3.6 Laws of Surds . . . . . . . . . . . . . . . . . . . . . . . . . . . 23
3.6.1 Examples . . . . . . . . . . . . . . . . . . . . . . . . . 24
3.7 Quadratic Equations . . . . . . . . . . . . . . . . . . . . . . . 24
3.7.1 Examples . . . . . . . . . . . . . . . . . . . . . . . . . 24
3.7.2 Graphical interpretation . . . . . . . . . . . . . . . . . 25
3.7.3 Factorization . . . . . . . . . . . . . . . . . . . . . . . 26
3.7.4 Quadratic solution formula . . . . . . . . . . . . . . . . 27
3.7.5 The discriminant . . . . . . . . . . . . . . . . . . . . . 28
3.7.6 Examples . . . . . . . . . . . . . . . . . . . . . . . . . 28
3.7.7 Special cases . . . . . . . . . . . . . . . . . . . . . . . . 29
3.8 Notation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 30
3.8.1 Modulus or absolute value . . . . . . . . . . . . . . . . 30
3.8.2 Sigma notation . . . . . . . . . . . . . . . . . . . . . . 31
3.8.3 Factorials . . . . . . . . . . . . . . . . . . . . . . . . . 32
3.9 Exponential and Logarithmic functions . . . . . . . . . . . . . 32
3.9.1 Exponential functions . . . . . . . . . . . . . . . . . . 32
3.9.2 Logarithmic functions . . . . . . . . . . . . . . . . . . 33
3.9.3 Logarithms to solve equations . . . . . . . . . . . . . . 34
3.9.4 Examples . . . . . . . . . . . . . . . . . . . . . . . . . 35
3.9.5 Anti-logging . . . . . . . . . . . . . . . . . . . . . . . . 36
3.9.6 Examples . . . . . . . . . . . . . . . . . . . . . . . . . 37
3.10 Binomial Expansion . . . . . . . . . . . . . . . . . . . . . . . . 38
3.10.1 Theory . . . . . . . . . . . . . . . . . . . . . . . . . . . 38
3.10.2 Example . . . . . . . . . . . . . . . . . . . . . . . . . . 40
3.10.3 Examples . . . . . . . . . . . . . . . . . . . . . . . . . 40
3.10.4 High values of n . . . . . . . . . . . . . . . . . . . . . . 41
3.11 Arithmetic Progressions . . . . . . . . . . . . . . . . . . . . . 42
3.11.1 Examples . . . . . . . . . . . . . . . . . . . . . . . . . 42
3.11.2 Sum of an arithmetic progression . . . . . . . . . . . . 42
3.11.3 Example . . . . . . . . . . . . . . . . . . . . . . . . . . 43
3.11.4 Example . . . . . . . . . . . . . . . . . . . . . . . . . . 44
3.12 Geometric Progressions . . . . . . . . . . . . . . . . . . . . . . 45
3.12.1 Examples . . . . . . . . . . . . . . . . . . . . . . . . . 45
3.12.2 Sum of a geometric progression . . . . . . . . . . . . . 46
3.12.3 Sum to infinity . . . . . . . . . . . . . . . . . . . . . . 46
3.12.4 Example . . . . . . . . . . . . . . . . . . . . . . . . . . 47
3.12.5 Example . . . . . . . . . . . . . . . . . . . . . . . . . . 48
4 Trigonometry 49
4.1 Right-angled triangles . . . . . . . . . . . . . . . . . . . . . . 49
4.1.1 Labelling . . . . . . . . . . . . . . . . . . . . . . . . . 49
4.1.2 Pythagoras' Theorem . . . . . . . . . . . . . . . . . . . 50
4.1.3 Basic trigonometric functions . . . . . . . . . . . . . . 50
4.1.4 Procedure . . . . . . . . . . . . . . . . . . . . . . . . . 51
4.1.5 Example . . . . . . . . . . . . . . . . . . . . . . . . . . 51
4.2 Notation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 52
4.2.1 Example . . . . . . . . . . . . . . . . . . . . . . . . . . 53
4.3 Table of values . . . . . . . . . . . . . . . . . . . . . . . . . . 53
4.4 Graphs of Functions . . . . . . . . . . . . . . . . . . . . . . . 53
4.5 Multiple Solutions . . . . . . . . . . . . . . . . . . . . . . . . 54
4.5.1 CAST diagram . . . . . . . . . . . . . . . . . . . . . . 55
4.5.2 Procedure . . . . . . . . . . . . . . . . . . . . . . . . . 56
4.5.3 Example . . . . . . . . . . . . . . . . . . . . . . . . . . 56
4.6 Scalene triangles . . . . . . . . . . . . . . . . . . . . . . . . . 58
4.6.1 Labelling . . . . . . . . . . . . . . . . . . . . . . . . . 58
4.6.2 Scalene trigonometry . . . . . . . . . . . . . . . . . . . . 58
4.6.3 Sine Rule . . . . . . . . . . . . . . . . . . . . . . . . . 58
4.6.4 Cosine Rule . . . . . . . . . . . . . . . . . . . . . . . . 59
4.6.5 Example . . . . . . . . . . . . . . . . . . . . . . . . . . 60
4.7 Radian Measure . . . . . . . . . . . . . . . . . . . . . . . . . . 61
4.7.1 Conversion . . . . . . . . . . . . . . . . . . . . . . . . . 61
4.7.2 Length of Arc . . . . . . . . . . . . . . . . . . . . . . . 61
4.7.3 Area of Sector . . . . . . . . . . . . . . . . . . . . . . . 62
4.8 Identities . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 63
4.8.1 Basic identities . . . . . . . . . . . . . . . . . . . . . . 64
4.8.2 Compound angle identities . . . . . . . . . . . . . . . . 64
4.8.3 Double angle identities . . . . . . . . . . . . . . . . . . 64
4.9 Trigonometric equations . . . . . . . . . . . . . . . . . . . . 65
4.9.1 Example . . . . . . . . . . . . . . . . . . . . . . . . . . 65
4.9.2 Example . . . . . . . . . . . . . . . . . . . . . . . . . . 65
4.9.3 Example . . . . . . . . . . . . . . . . . . . . . . . . . . 66
5 Complex Numbers 68
5.1 Basic Principle . . . . . . . . . . . . . . . . . . . . . . . . . . 68
5.1.1 Imaginary and Complex Numbers . . . . . . . . . . . . 69
5.2 Examples . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 69
5.3 Argand Diagram Representation . . . . . . . . . . . . . . . . . 70
5.4 Algebra of Complex Numbers . . . . . . . . . . . . . . . . . . 70
5.4.1 Addition . . . . . . . . . . . . . . . . . . . . . . . . . . 71
5.4.2 Subtraction . . . . . . . . . . . . . . . . . . . . . . . . 72
5.4.3 Multiplication . . . . . . . . . . . . . . . . . . . . . . . 72
5.4.4 Division . . . . . . . . . . . . . . . . . . . . . . . . . . 74
5.4.5 Examples . . . . . . . . . . . . . . . . . . . . . . . . . 74
5.5 Definitions . . . . . . . . . . . . . . . . . . . . . . . . . . 75
5.5.1 Modulus . . . . . . . . . . . . . . . . . . . . . . . . . . 75
5.5.2 Conjugate . . . . . . . . . . . . . . . . . . . . . . . . . 75
5.5.3 Real part . . . . . . . . . . . . . . . . . . . . . . . . . 76
5.5.4 Imaginary part . . . . . . . . . . . . . . . . . . . . . . 76
5.6 Representation . . . . . . . . . . . . . . . . . . . . . . . . . . 76
5.6.1 Cartesian form . . . . . . . . . . . . . . . . . . . . . . 76
5.6.2 Polar form . . . . . . . . . . . . . . . . . . . . . . . . . 77
5.6.3 Exponential form . . . . . . . . . . . . . . . . . . . . . 78
5.6.4 Examples . . . . . . . . . . . . . . . . . . . . . . . . . 79
5.6.5 Examples . . . . . . . . . . . . . . . . . . . . . . . . . 80
5.7 De Moivre's Theorem . . . . . . . . . . . . . . . . . . . . . . 80
5.7.1 Examples . . . . . . . . . . . . . . . . . . . . . . . . . 82
5.7.2 Roots of Unity . . . . . . . . . . . . . . . . . . . . . . 82
5.7.3 Roots of other numbers . . . . . . . . . . . . . . . . . . 83
5.8 Trigonometric functions . . . . . . . . . . . . . . . . . . . . . 84
6 Vectors & Matrices 85
6.1 Vectors . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 85
6.1.1 Modulus . . . . . . . . . . . . . . . . . . . . . . . . . . 85
6.1.2 Unit Vector . . . . . . . . . . . . . . . . . . . . . . . . 86
6.1.3 Cartesian unit vectors . . . . . . . . . . . . . . . . . . 86
6.1.4 Examples . . . . . . . . . . . . . . . . . . . . . . . . . 87
6.1.5 Signs of vectors . . . . . . . . . . . . . . . . . . . . . . 87
6.1.6 Addition . . . . . . . . . . . . . . . . . . . . . . . . . . 88
6.1.7 Subtraction . . . . . . . . . . . . . . . . . . . . . . . . 88
6.1.8 Zero vector . . . . . . . . . . . . . . . . . . . . . . . . 88
6.1.9 Scalar Product . . . . . . . . . . . . . . . . . . . . . . 89
6.1.10 Example . . . . . . . . . . . . . . . . . . . . . . . . . . 90
6.1.11 Example . . . . . . . . . . . . . . . . . . . . . . . . . . 91
6.1.12 Vector Product . . . . . . . . . . . . . . . . . . . . . . 91
6.1.13 Example . . . . . . . . . . . . . . . . . . . . . . . . . . 93
6.2 Matrices . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 93
6.2.1 Square matrices . . . . . . . . . . . . . . . . . . . . . . 93
6.2.2 Row and Column vectors . . . . . . . . . . . . . . . . . 94
6.2.3 Examples . . . . . . . . . . . . . . . . . . . . . . . . . 94
6.2.4 Zero and Identity . . . . . . . . . . . . . . . . . . . . . 94
6.3 Matrix Arithmetic . . . . . . . . . . . . . . . . . . . . . . . . 95
6.3.1 Addition . . . . . . . . . . . . . . . . . . . . . . . . . . 95
6.3.2 Examples . . . . . . . . . . . . . . . . . . . . . . . . . 95
6.3.3 Subtraction . . . . . . . . . . . . . . . . . . . . . . . . 95
6.3.4 Examples . . . . . . . . . . . . . . . . . . . . . . . . . 95
6.3.5 Multiplication by a scalar . . . . . . . . . . . . . . . . 96
6.3.6 Examples . . . . . . . . . . . . . . . . . . . . . . . . . 96
6.3.7 Domino Rule . . . . . . . . . . . . . . . . . . . . . . . 96
6.3.8 Multiplication . . . . . . . . . . . . . . . . . . . . . . . 97
6.3.9 Examples . . . . . . . . . . . . . . . . . . . . . . . . . 97
6.3.10 Exercise . . . . . . . . . . . . . . . . . . . . . . . . . . 99
6.4 Determinant of a matrix . . . . . . . . . . . . . . . . . . . . . 99
6.4.1 Examples . . . . . . . . . . . . . . . . . . . . . . . . . 99
6.4.2 Sign rule for matrices . . . . . . . . . . . . . . . . . . . 99
6.4.3 Order 3 . . . . . . . . . . . . . . . . . . . . . . . . . . 100
6.4.4 Examples . . . . . . . . . . . . . . . . . . . . . . . . . 100
6.4.5 Order 4 . . . . . . . . . . . . . . . . . . . . . . . . . . 101
6.5 Inverse of a matrix . . . . . . . . . . . . . . . . . . . . . . . . 101
6.5.1 Order 2 . . . . . . . . . . . . . . . . . . . . . . . . . . 102
6.5.2 Examples . . . . . . . . . . . . . . . . . . . . . . . . . 102
6.5.3 Other orders . . . . . . . . . . . . . . . . . . . . . . . . 103
6.5.4 Exercise . . . . . . . . . . . . . . . . . . . . . . . . . . 103
6.6 Matrix algebra . . . . . . . . . . . . . . . . . . . . . . . . . . 104
6.6.1 Addition . . . . . . . . . . . . . . . . . . . . . . . . . . 104
6.6.2 Multiplication . . . . . . . . . . . . . . . . . . . . . . . 104
6.6.3 Mixed . . . . . . . . . . . . . . . . . . . . . . . . . . . 104
6.7 Solving equations . . . . . . . . . . . . . . . . . . . . . . . . . 105
6.7.1 Example . . . . . . . . . . . . . . . . . . . . . . . . . . 106
6.7.2 Example . . . . . . . . . . . . . . . . . . . . . . . . . . 107
6.7.3 Row reduction . . . . . . . . . . . . . . . . . . . . . . . 108
6.8 Row Operations . . . . . . . . . . . . . . . . . . . . . . . . . . 108
6.8.1 Determinants . . . . . . . . . . . . . . . . . . . . . . . 109
6.8.2 Example . . . . . . . . . . . . . . . . . . . . . . . . . . 110
6.9 Solving systems of equations . . . . . . . . . . . . . . . . . . . 110
6.9.1 Gaussian Elimination . . . . . . . . . . . . . . . . . . . 111
6.9.2 Example . . . . . . . . . . . . . . . . . . . . . . . . . . 112
6.9.3 Example . . . . . . . . . . . . . . . . . . . . . . . . . . 114
6.9.4 Example . . . . . . . . . . . . . . . . . . . . . . . . . . 115
6.9.5 Number of equations vs. unknowns . . . . . . . . . . . 116
6.10 Inversion by Row operations . . . . . . . . . . . . . . . . . . . 117
6.11 Rank . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 117
6.11.1 Example . . . . . . . . . . . . . . . . . . . . . . . . . . 117
6.11.2 Systems of equations . . . . . . . . . . . . . . . . . . . 119
6.11.3 Example . . . . . . . . . . . . . . . . . . . . . . . . . . 119
6.11.4 Example . . . . . . . . . . . . . . . . . . . . . . . . . . 119
6.11.5 Example . . . . . . . . . . . . . . . . . . . . . . . . . . 120
6.11.6 Summary . . . . . . . . . . . . . . . . . . . . . . . . . 121
6.11.7 Exercise . . . . . . . . . . . . . . . . . . . . . . . . . . 122
6.12 Eigenvalues and Eigenvectors . . . . . . . . . . . . . . . . . . 122
6.12.1 Finding Eigenvalues . . . . . . . . . . . . . . . . . . . 122
6.12.2 Example . . . . . . . . . . . . . . . . . . . . . . . . . . 123
6.12.3 Finding eigenvectors . . . . . . . . . . . . . . . . . . . 123
6.12.4 Example . . . . . . . . . . . . . . . . . . . . . . . . . . 123
6.12.5 Example . . . . . . . . . . . . . . . . . . . . . . . . . . 124
6.12.6 Other orders . . . . . . . . . . . . . . . . . . . . . . . . 125
6.13 Diagonalisation . . . . . . . . . . . . . . . . . . . . . . . . . . 125
6.13.1 Powers of diagonal matrices . . . . . . . . . . . . . . . 126
6.13.2 Example . . . . . . . . . . . . . . . . . . . . . . . . . . 126
6.13.3 Powers of other matrices . . . . . . . . . . . . . . . . . 127
6.13.4 Example . . . . . . . . . . . . . . . . . . . . . . . . . . 127
7 Graphs of Functions 128
7.1 Simple graph plotting . . . . . . . . . . . . . . . . . . . . . . . 128
7.1.1 Example . . . . . . . . . . . . . . . . . . . . . . . . . . 128
7.1.2 Example . . . . . . . . . . . . . . . . . . . . . . . . . . 129
7.1.3 Example . . . . . . . . . . . . . . . . . . . . . . . . . . 130
7.2 Important functions . . . . . . . . . . . . . . . . . . . . . . . . 130
7.2.1 Direct Proportion . . . . . . . . . . . . . . . . . . . . . 130
7.2.2 Inverse Proportion . . . . . . . . . . . . . . . . . . . . 131
7.2.3 Inverse Square Proportion . . . . . . . . . . . . . . . . 132
7.2.4 Exponential Functions . . . . . . . . . . . . . . . . . . 134
7.2.5 Logarithmic Functions . . . . . . . . . . . . . . . . . . 134
7.3 Transformations on graphs . . . . . . . . . . . . . . . . . . . . 135
7.3.1 Addition or Subtraction . . . . . . . . . . . . . . . . . 136
7.3.2 Multiplication or Division . . . . . . . . . . . . . . . . 136
7.3.3 Adding to or Subtracting from x . . . . . . . . . . . . 136
7.3.4 Multiplying or Dividing x . . . . . . . . . . . . . . . . 137
7.4 Even and Odd functions . . . . . . . . . . . . . . . . . . . . . 138
7.4.1 Even functions . . . . . . . . . . . . . . . . . . . . . . 138
7.4.2 Odd functions . . . . . . . . . . . . . . . . . . . . . . . 138
7.4.3 Combinations of functions . . . . . . . . . . . . . . . . 139
7.4.4 Examples . . . . . . . . . . . . . . . . . . . . . . . . . 140
8 Coordinate geometry 142
8.1 Elementary concepts . . . . . . . . . . . . . . . . . . . . . . . 142
8.1.1 Distance between two points . . . . . . . . . . . . . . . 143
8.1.2 Example . . . . . . . . . . . . . . . . . . . . . . . . . . 143
8.1.3 Example . . . . . . . . . . . . . . . . . . . . . . . . . . 143
8.1.4 Midpoint of two points . . . . . . . . . . . . . . . . . . 144
8.1.5 Example . . . . . . . . . . . . . . . . . . . . . . . . . . 144
8.1.6 Example . . . . . . . . . . . . . . . . . . . . . . . . . . 144
8.1.7 Gradient . . . . . . . . . . . . . . . . . . . . . . . . . . 144
8.1.8 Example . . . . . . . . . . . . . . . . . . . . . . . . . . 145
8.1.9 Example . . . . . . . . . . . . . . . . . . . . . . . . . . 145
8.2 Equation of a straight line . . . . . . . . . . . . . . . . . . . . 145
8.2.1 Meaning of equation of line . . . . . . . . . . . . . . . 146
8.2.2 Finding the equation of a line . . . . . . . . . . . . . . 146
8.2.3 Example . . . . . . . . . . . . . . . . . . . . . . . . . . 147
8.2.4 Example . . . . . . . . . . . . . . . . . . . . . . . . . . 147
9 Differential Calculus 149
9.1 Concept . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 149
9.2 Notation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 149
9.3 Rules & Techniques . . . . . . . . . . . . . . . . . . . . . . . . 150
9.3.1 Power Rule . . . . . . . . . . . . . . . . . . . . . . . . 150
9.3.2 Addition and Subtraction . . . . . . . . . . . . . . . . 151
9.3.3 Constants upon functions . . . . . . . . . . . . . . . . 151
9.3.4 Chain Rule . . . . . . . . . . . . . . . . . . . . . . . . 151
9.3.5 Product Rule . . . . . . . . . . . . . . . . . . . . . . . 151
9.3.6 Quotient Rule . . . . . . . . . . . . . . . . . . . . . . . 152
9.3.7 Trigonometric Rules . . . . . . . . . . . . . . . . . . . 152
9.3.8 Exponential Rules . . . . . . . . . . . . . . . . . . . . 152
9.3.9 Logarithmic Rules . . . . . . . . . . . . . . . . . . . . 152
9.4 Examples . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 152
9.4.1 Solutions . . . . . . . . . . . . . . . . . . . . . . . . . . 153
9.5 Tangents . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 156
9.5.1 Example . . . . . . . . . . . . . . . . . . . . . . . . . . 157
9.5.2 Example . . . . . . . . . . . . . . . . . . . . . . . . . . 157
9.6 Turning Points . . . . . . . . . . . . . . . . . . . . . . . . . . 158
9.6.1 Types of turning point . . . . . . . . . . . . . . . . . . 158
9.6.2 Finding turning points . . . . . . . . . . . . . . . . . . 159
9.6.3 Classication of turning points . . . . . . . . . . . . . . 160
9.6.4 Example . . . . . . . . . . . . . . . . . . . . . . . . . . 161
9.6.5 Example . . . . . . . . . . . . . . . . . . . . . . . . . . 162
9.6.6 Example . . . . . . . . . . . . . . . . . . . . . . . . . . 163
9.7 Newton-Raphson . . . . . . . . . . . . . . . . . . . . . . . . 163
9.7.1 Example . . . . . . . . . . . . . . . . . . . . . . . . . . 164
9.7.2 Example . . . . . . . . . . . . . . . . . . . . . . . . . . 165
9.8 Partial Differentiation . . . . . . . . . . . . . . . . . . . . 166
9.8.1 Example . . . . . . . . . . . . . . . . . . . . . . . . . . 167
9.9 Small Changes . . . . . . . . . . . . . . . . . . . . . . . . . . 168
9.9.1 Example . . . . . . . . . . . . . . . . . . . . . . . . . . 168
9.9.2 Example . . . . . . . . . . . . . . . . . . . . . . . . . . 169
10 Integral Calculus 172
10.1 Concept . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 172
10.1.1 Constant of Integration . . . . . . . . . . . . . . . . . . 172
10.2 Rules & Techniques . . . . . . . . . . . . . . . . . . . . . . . . 173
10.2.1 Power Rule . . . . . . . . . . . . . . . . . . . . . . . . 173
10.2.2 Addition & Subtraction . . . . . . . . . . . . . . . . . 173
10.2.3 Multiplication by a constant . . . . . . . . . . . . . . . 174
10.2.4 Substitution . . . . . . . . . . . . . . . . . . . . . . . . 174
10.2.5 Limited Chain Rule . . . . . . . . . . . . . . . . . . . . 174
10.2.6 Logarithm rule . . . . . . . . . . . . . . . . . . . . . . 175
10.2.7 Partial Fractions . . . . . . . . . . . . . . . . . . . . . 176
10.2.8 Integration by Parts . . . . . . . . . . . . . . . . . . . 177
10.2.9 Other rules . . . . . . . . . . . . . . . . . . . . . . . . 178
10.3 Examples . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 179
10.3.1 Examples . . . . . . . . . . . . . . . . . . . . . . . . . 179
10.3.2 Examples . . . . . . . . . . . . . . . . . . . . . . . . . 180
10.3.3 Example . . . . . . . . . . . . . . . . . . . . . . . . . . 182
10.3.4 Example . . . . . . . . . . . . . . . . . . . . . . . . . . 183
10.3.5 Examples . . . . . . . . . . . . . . . . . . . . . . . . . 184
10.4 Definite Integration . . . . . . . . . . . . . . . . . . . . . . 185
10.4.1 Notation . . . . . . . . . . . . . . . . . . . . . . . . . . 186
10.4.2 Concept . . . . . . . . . . . . . . . . . . . . . . . . . . 186
10.4.3 Areas . . . . . . . . . . . . . . . . . . . . . . . . . . . 186
10.4.4 Example . . . . . . . . . . . . . . . . . . . . . . . . . . 186
10.4.5 Example . . . . . . . . . . . . . . . . . . . . . . . . . . 186
10.4.6 Volumes of Revolution . . . . . . . . . . . . . . . . . . 187
10.4.7 Example . . . . . . . . . . . . . . . . . . . . . . . . . . 187
10.4.8 Mean Values . . . . . . . . . . . . . . . . . . . . . . . . 188
10.4.9 Example . . . . . . . . . . . . . . . . . . . . . . . . . . 188
10.4.10Example . . . . . . . . . . . . . . . . . . . . . . . . . . 188
10.4.11RMS Values . . . . . . . . . . . . . . . . . . . . . . . . 189
10.4.12Example . . . . . . . . . . . . . . . . . . . . . . . . . . 189
10.4.13Example . . . . . . . . . . . . . . . . . . . . . . . . . . 190
10.5 Numerical Integration . . . . . . . . . . . . . . . . . . . . . . 191
10.5.1 Simpson's rule . . . . . . . . . . . . . . . . . . . . . . . 192
10.5.2 Example . . . . . . . . . . . . . . . . . . . . . . . . . . 192
11 Power Series 194
11.1 Definition . . . . . . . . . . . . . . . . . . . . . . . . . . . 194
11.1.1 Convergence . . . . . . . . . . . . . . . . . . . . . . . . 194
11.2 Maclaurin's Expansion . . . . . . . . . . . . . . . . . . . . . 195
11.2.1 Odd and Even . . . . . . . . . . . . . . . . . . . . . . . 195
11.2.2 Example . . . . . . . . . . . . . . . . . . . . . . . . . . 196
11.2.3 Exercise . . . . . . . . . . . . . . . . . . . . . . . . . . 196
11.2.4 Example . . . . . . . . . . . . . . . . . . . . . . . . . . 196
11.2.5 Exercise . . . . . . . . . . . . . . . . . . . . . . . . . . 197
11.2.6 Example . . . . . . . . . . . . . . . . . . . . . . . . . . 197
11.3 Taylor's Expansion . . . . . . . . . . . . . . . . . . . . . . 197
11.3.1 Example . . . . . . . . . . . . . . . . . . . . . . . . . . 198
11.3.2 Identification of Turning Points . . . . . . . . . . . . . 198
12 Differential Equations 200
12.1 Concept . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 200
12.2 Exact D.E.s . . . . . . . . . . . . . . . . . . . . . . . . . . . . 200
12.2.1 Example . . . . . . . . . . . . . . . . . . . . . . . . . . 201
12.2.2 Example . . . . . . . . . . . . . . . . . . . . . . . . . . 201
12.2.3 Example . . . . . . . . . . . . . . . . . . . . . . . . . . 202
12.3 Variables separable D.E.s . . . . . . . . . . . . . . . . . . . . . 203
12.3.1 Example . . . . . . . . . . . . . . . . . . . . . . . . . . 203
12.3.2 Example . . . . . . . . . . . . . . . . . . . . . . . . . . 204
12.4 First order linear D.E.s . . . . . . . . . . . . . . . . . . . . . . 206
12.4.1 Example . . . . . . . . . . . . . . . . . . . . . . . . . . 207
12.4.2 Example . . . . . . . . . . . . . . . . . . . . . . . . . . 208
12.5 Second order D.E.s . . . . . . . . . . . . . . . . . . . . . . . . 209
12.5.1 Homogeneous D.E. with constant coefficients . . . . . . 209
12.5.2 Example . . . . . . . . . . . . . . . . . . . . . . . . . . 211
12.5.3 Example . . . . . . . . . . . . . . . . . . . . . . . . . . 211
12.5.4 Example . . . . . . . . . . . . . . . . . . . . . . . . . . 212
12.5.5 Example . . . . . . . . . . . . . . . . . . . . . . . . . . 212
12.5.6 Example . . . . . . . . . . . . . . . . . . . . . . . . . . 213
13 Differentiation in several variables 214
13.1 Partial Differentiation . . . . . . . . . . . . . . . . . . . . 214
13.1.1 Procedure . . . . . . . . . . . . . . . . . . . . . . . . . 214
13.1.2 Examples . . . . . . . . . . . . . . . . . . . . . . . . . 215
13.1.3 Notation . . . . . . . . . . . . . . . . . . . . . . . . . . 216
13.1.4 Higher Derivatives . . . . . . . . . . . . . . . . . . . . 216
13.1.5 Example . . . . . . . . . . . . . . . . . . . . . . . . . . 217
13.2 Taylor's Theorem . . . . . . . . . . . . . . . . . . . . . . . 218
13.3 Stationary Points . . . . . . . . . . . . . . . . . . . . . . . . . 218
13.3.1 Types of points . . . . . . . . . . . . . . . . . . . . . . 218
13.3.2 Finding points . . . . . . . . . . . . . . . . . . . . . . . 219
13.3.3 Classifying points . . . . . . . . . . . . . . . . . . . . . 219
13.3.4 Summary . . . . . . . . . . . . . . . . . . . . . . . . . 220
13.3.5 Example . . . . . . . . . . . . . . . . . . . . . . . . . . 221
13.4 Implicit functions . . . . . . . . . . . . . . . . . . . . . . . . . 222
13.5 Lagrange Multipliers . . . . . . . . . . . . . . . . . . . . . . . 223
13.5.1 Example . . . . . . . . . . . . . . . . . . . . . . . . . . 224
13.6 Jacobians . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 225
13.6.1 Differential . . . . . . . . . . . . . . . . . . . . . . . . 226
13.7 Parametric functions . . . . . . . . . . . . . . . . . . . . . . . 226
13.7.1 Example . . . . . . . . . . . . . . . . . . . . . . . . . . 227
13.8 Chain Rule . . . . . . . . . . . . . . . . . . . . . . . . . . . . 227
14 Integration in several variables 229
14.1 Double integrals . . . . . . . . . . . . . . . . . . . . . . . . . . 229
14.1.1 Example . . . . . . . . . . . . . . . . . . . . . . . . . . 229
14.2 Change of order . . . . . . . . . . . . . . . . . . . . . . . . . . 230
14.3 Examples . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 231
14.3.1 Example . . . . . . . . . . . . . . . . . . . . . . . . . . 232
14.3.2 Example . . . . . . . . . . . . . . . . . . . . . . . . . . 232
14.4 Triple integrals . . . . . . . . . . . . . . . . . . . . . . . . . . 233
14.4.1 Example . . . . . . . . . . . . . . . . . . . . . . . . . . 233
14.5 Change of variable . . . . . . . . . . . . . . . . . . . . . . . . 234
14.5.1 Polar coordinates . . . . . . . . . . . . . . . . . . . . . 235
14.5.2 Example . . . . . . . . . . . . . . . . . . . . . . . . . . 235
14.5.3 Cylindrical Polar Coordinates . . . . . . . . . . . . . . 236
14.5.4 Spherical Polar Coordinates . . . . . . . . . . . . . . . 236
15 Fourier Series 237
15.1 Periodic functions . . . . . . . . . . . . . . . . . . . . . . . . . 237
15.1.1 Example . . . . . . . . . . . . . . . . . . . . . . . . . . 237
15.1.2 Example . . . . . . . . . . . . . . . . . . . . . . . . . . 238
15.2 Sets of functions . . . . . . . . . . . . . . . . . . . . . . . . . . 239
15.2.1 Orthogonal functions . . . . . . . . . . . . . . . . . . . 239
15.2.2 Orthonormal functions . . . . . . . . . . . . . . . . . . 239
15.2.3 Norm of a function . . . . . . . . . . . . . . . . . . . . 239
15.3 Fourier concepts . . . . . . . . . . . . . . . . . . . . . . . . . . 239
15.3.1 Fourier coefficients . . . . . . . . . . . . . . . . . . . . 239
15.3.2 Fourier series . . . . . . . . . . . . . . . . . . . . . . . 240
15.3.3 Convergence . . . . . . . . . . . . . . . . . . . . . . . . 240
15.4 Important functions . . . . . . . . . . . . . . . . . . . . . . . . 240
15.4.1 Trigonometric system . . . . . . . . . . . . . . . . . . . 241
15.4.2 Exponential system . . . . . . . . . . . . . . . . . . . . 242
15.5 Trigonometric expansions . . . . . . . . . . . . . . . . . . . . . 242
15.5.1 Even functions . . . . . . . . . . . . . . . . . . . . . . 242
15.5.2 Odd functions . . . . . . . . . . . . . . . . . . . . . . . 243
15.5.3 Other Ranges . . . . . . . . . . . . . . . . . . . . . . . 243
15.6 Harmonics . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 243
15.6.1 Odd and Even Harmonics . . . . . . . . . . . . . . . . 244
15.6.2 Trigonometric system . . . . . . . . . . . . . . . . . . . 244
15.6.3 Exponential system . . . . . . . . . . . . . . . . . . . . 245
15.6.4 Percentage harmonic . . . . . . . . . . . . . . . . . . . 245
15.7 Examples . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 245
15.7.1 Example . . . . . . . . . . . . . . . . . . . . . . . . . . 245
15.8 Exponential Series . . . . . . . . . . . . . . . . . . . . . . . . 246
16 Laplace transforms 248
16.1 Definition . . . . . . . . . . . . . . . . . . . . . . . . . . . 248
16.1.1 Example . . . . . . . . . . . . . . . . . . . . . . . . . . 248
16.1.2 Example . . . . . . . . . . . . . . . . . . . . . . . . . . 249
16.1.3 Example . . . . . . . . . . . . . . . . . . . . . . . . . . 249
16.1.4 Inverse Transform . . . . . . . . . . . . . . . . . . . . . 250
16.1.5 Elementary properties . . . . . . . . . . . . . . . . . . 250
16.1.6 Example . . . . . . . . . . . . . . . . . . . . . . . . . . 250
16.2 Important Transforms . . . . . . . . . . . . . . . . . . . . . . 251
16.2.1 First shifting property . . . . . . . . . . . . . . . . . . 251
16.2.2 Further Laplace transforms . . . . . . . . . . . . . . . 253
16.3 Transforming derivatives . . . . . . . . . . . . . . . . . . . . . 254
16.3.1 First derivative . . . . . . . . . . . . . . . . . . . . . . 254
16.3.2 Second derivative . . . . . . . . . . . . . . . . . . . . . 254
16.3.3 Higher derivatives . . . . . . . . . . . . . . . . . . . . . 254
16.4 Transforming integrals . . . . . . . . . . . . . . . . . . . . . . 254
16.5 Differential Equations . . . . . . . . . . . . . . . . . . . . . 255
16.5.1 Example . . . . . . . . . . . . . . . . . . . . . . . . . . 255
16.5.2 Example . . . . . . . . . . . . . . . . . . . . . . . . . . 256
16.5.3 Example . . . . . . . . . . . . . . . . . . . . . . . . . . 258
16.5.4 Example . . . . . . . . . . . . . . . . . . . . . . . . . . 260
16.5.5 Exercise . . . . . . . . . . . . . . . . . . . . . . . . . . 261
16.5.6 Example . . . . . . . . . . . . . . . . . . . . . . . . . . 261
16.5.7 Example . . . . . . . . . . . . . . . . . . . . . . . . . . 262
16.6 Other theorems . . . . . . . . . . . . . . . . . . . . . . . . . . 263
16.6.1 Change of Scale . . . . . . . . . . . . . . . . . . . . . . 263
16.6.2 Derivative of the transform . . . . . . . . . . . . . . . . 263
16.6.3 Convolution Theorem . . . . . . . . . . . . . . . . . . . 264
16.6.4 Example . . . . . . . . . . . . . . . . . . . . . . . . . . 264
16.6.5 Example . . . . . . . . . . . . . . . . . . . . . . . . . . 264
16.6.6 Example . . . . . . . . . . . . . . . . . . . . . . . . . . 265
16.7 Heaviside unit step function . . . . . . . . . . . . . . . . . . . 265
16.7.1 Laplace transform of u(t - c) . . . . . . . . . . . . . . 268
16.7.2 Example . . . . . . . . . . . . . . . . . . . . . . . . . . 268
16.7.3 Example . . . . . . . . . . . . . . . . . . . . . . . . . . 270
16.7.4 Delayed functions . . . . . . . . . . . . . . . . . . . . . 270
16.7.5 Example . . . . . . . . . . . . . . . . . . . . . . . . . . 271
16.8 The Dirac Delta . . . . . . . . . . . . . . . . . . . . . . . . . . 272
16.8.1 Delayed impulse . . . . . . . . . . . . . . . . . . . . . . 273
16.8.2 Example . . . . . . . . . . . . . . . . . . . . . . . . . . 274
16.9 Transfer Functions . . . . . . . . . . . . . . . . . . . . . . . . 274
16.9.1 Impulse Response . . . . . . . . . . . . . . . . . . . . . 276
16.9.2 Initial value theorem . . . . . . . . . . . . . . . . . . . 277
16.9.3 Final value theorem . . . . . . . . . . . . . . . . . . . . 277
17 Z-transform 279
17.1 Concept . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 279
17.2 Important Z-transforms . . . . . . . . . . . . . . . . . . . . . 281
17.2.1 Unit step function . . . . . . . . . . . . . . . . . . . . 281
17.2.2 Linear function . . . . . . . . . . . . . . . . . . . . . . 282
17.2.3 Exponential function . . . . . . . . . . . . . . . . . . . 283
17.2.4 Elementary properties . . . . . . . . . . . . . . . . . . 283
17.2.5 Real translation theorem . . . . . . . . . . . . . . . . . 283
18 Statistics 285
18.1 Sigma Notation . . . . . . . . . . . . . . . . . . . . . . . . . . 285
18.1.1 Example . . . . . . . . . . . . . . . . . . . . . . . . . . 285
18.2 Populations and Samples . . . . . . . . . . . . . . . . . . . . . 287
18.2.1 Sampling . . . . . . . . . . . . . . . . . . . . . . . . . 287
18.3 Parameters and Statistics . . . . . . . . . . . . . . . . . . . . 288
18.4 Frequency . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 288
18.5 Measures of Location . . . . . . . . . . . . . . . . . . . . . . . 288
18.5.1 Arithmetic Mean . . . . . . . . . . . . . . . . . . . . . 289
18.5.2 Mode . . . . . . . . . . . . . . . . . . . . . . . . . . . . 289
18.5.3 Median . . . . . . . . . . . . . . . . . . . . . . . . . . 289
18.5.4 Example . . . . . . . . . . . . . . . . . . . . . . . . . . 289
18.5.5 Example . . . . . . . . . . . . . . . . . . . . . . . . . . 290
18.6 Measures of Dispersion . . . . . . . . . . . . . . . . . . . . . . 290
18.6.1 Range . . . . . . . . . . . . . . . . . . . . . . . . . . . 291
18.6.2 Standard deviation . . . . . . . . . . . . . . . . . . . . 291
18.6.3 Inter-quartile range . . . . . . . . . . . . . . . . . . . . 292
18.7 Frequency Distributions . . . . . . . . . . . . . . . . . . . . . 292
18.7.1 Class intervals . . . . . . . . . . . . . . . . . . . . . . . 292
18.8 Cumulative frequency . . . . . . . . . . . . . . . . . . . . . . . 293
18.8.1 Calculating the median . . . . . . . . . . . . . . . . . . 293
18.8.2 Calculating quartiles . . . . . . . . . . . . . . . . . . . 294
18.8.3 Calculating other ranges . . . . . . . . . . . . . . . . . 294
18.9 Skew . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 294
18.10Correlation . . . . . . . . . . . . . . . . . . . . . . . . . . . . 294
18.10.1Linear regression . . . . . . . . . . . . . . . . . . . . . 295
18.10.2 Correlation coefficient . . . . . . . . . . . . . . . . . . 295
19 Probability 297
19.1 Events . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 297
19.1.1 Probability of an Event . . . . . . . . . . . . . . . . . . 297
19.1.2 Exhaustive lists . . . . . . . . . . . . . . . . . . . . . . 297
19.2 Multiple Events . . . . . . . . . . . . . . . . . . . . . . . . . . 298
19.2.1 Notation . . . . . . . . . . . . . . . . . . . . . . . . . . 298
19.2.2 Relations between events . . . . . . . . . . . . . . . . . 298
19.3 Probability Laws . . . . . . . . . . . . . . . . . . . . . . . . . 299
19.3.1 A or B (mutually exclusive events) . . . . . . . . . . . 299
19.3.2 not A . . . . . . . . . . . . . . . . . . . . . . . . . . . 299
19.3.3 1 event of N . . . . . . . . . . . . . . . . . . . . . . . . 300
19.3.4 n events of N . . . . . . . . . . . . . . . . . . . . . . . 300
19.3.5 Examples . . . . . . . . . . . . . . . . . . . . . . . . . 300
19.3.6 A and B (independent events) . . . . . . . . . . . . . . 300
19.3.7 Example . . . . . . . . . . . . . . . . . . . . . . . . . . 301
19.3.8 A or B or C or ... . . . . . . . . . . . . . . . . . . . . . 301
19.3.9 A and B and C and ... . . . . . . . . . . . . . . . . . . 301
19.3.10Example . . . . . . . . . . . . . . . . . . . . . . . . . . 302
19.3.11A or B revisited . . . . . . . . . . . . . . . . . . . . . . 302
19.3.12Example . . . . . . . . . . . . . . . . . . . . . . . . . . 302
19.3.13A and B revisited . . . . . . . . . . . . . . . . . . . . . 303
19.3.14Conditional probability . . . . . . . . . . . . . . . . . . 303
19.3.15Example . . . . . . . . . . . . . . . . . . . . . . . . . . 304
19.3.16Bayes Theorem . . . . . . . . . . . . . . . . . . . . . . 304
19.4 Discrete Random Variables . . . . . . . . . . . . . . . . . . . . 304
19.4.1 Notation . . . . . . . . . . . . . . . . . . . . . . . . . . 304
19.4.2 Expected Value . . . . . . . . . . . . . . . . . . . . . . 305
19.4.3 Variance . . . . . . . . . . . . . . . . . . . . . . . . . . 306
19.4.4 Example . . . . . . . . . . . . . . . . . . . . . . . . . . 307
19.5 Continuous Random Variables . . . . . . . . . . . . . . . . . . 308
19.5.1 Definition . . . . . . . . . . . . . . . . . . . . . . . . . 309
19.5.2 Probability Density Function . . . . . . . . . . . . . . 309
20 The Normal Distribution 310
20.1 Definition . . . . . . . . . . . . . . . . . . . . . . . . . . . 310
20.2 Standard normal distribution . . . . . . . . . . . . . . . . . . 311
20.2.1 Transforming variables . . . . . . . . . . . . . . . . . . 311
20.2.2 Calculation of areas . . . . . . . . . . . . . . . . . . . . 312
20.2.3 Example . . . . . . . . . . . . . . . . . . . . . . . . . . 312
20.2.4 Confidence limits . . . . . . . . . . . . . . . . . . . . . 313
20.2.5 Sampling distribution . . . . . . . . . . . . . . . . . . . 313
20.3 The central limit theorem . . . . . . . . . . . . . . . . . . . . 314
20.4 Finding the Population mean . . . . . . . . . . . . . . . . . . 315
20.5 Hypothesis Testing . . . . . . . . . . . . . . . . . . . . . . . . 315
20.5.1 Two tailed tests . . . . . . . . . . . . . . . . . . . . . . 316
20.6 Difference of two normal distributions . . . . . . . . . . . . . 317
A Statistical Tables 318
B Greek Alphabet 321
List of Tables
1.1 Basic notation . . . . . . . . . . . . . . . . . . . . . . . . . . . 2
1.2 The law of signs . . . . . . . . . . . . . . . . . . . . . . . . . . 2
1.3 Order of precedence . . . . . . . . . . . . . . . . . . . . . . . . 3
1.4 SI prefixes for large numbers . . . . . . . . . . . . . . . . . . 6
1.5 SI prefixes for small numbers . . . . . . . . . . . . . . . . . . 7
3.1 The laws of indices . . . . . . . . . . . . . . . . . . . . . . . . 21
3.2 The laws of surds . . . . . . . . . . . . . . . . . . . . . . . . . 23
3.3 Examples of quadratic equations . . . . . . . . . . . . . . . . . 25
3.4 Laws of Logarithms . . . . . . . . . . . . . . . . . . . . . . . . 34
3.5 Pascal's Triangle . . . . . . . . . . . . . . . . . . . . . . . 39
4.1 Table of trigonometric values . . . . . . . . . . . . . . . . . . . 54
4.2 Conversion between degrees and radians . . . . . . . . . . . . 62
5.1 Powers of j . . . . . . . . . . . . . . . . . . . . . . . . . . . . 73
6.1 Matrix algebra - Addition . . . . . . . . . . . . . . . . . . . . 104
6.2 Matrix algebra - Multiplication . . . . . . . . . . . . . . . . . 105
6.3 Matrix algebra - Mixed operations . . . . . . . . . . . . . . . . 105
7.1 Adding and Subtracting even and odd functions . . . . . . . . 139
7.2 Multiplying even and odd functions . . . . . . . . . . . . . . . 140
9.1 Second derivative test . . . . . . . . . . . . . . . . . . . . . . . 160
9.2 First derivative turning point classication . . . . . . . . . . . 161
15.1 Symmetry in Fourier Series . . . . . . . . . . . . . . . . . . . . 244
16.1 Common Laplace transforms . . . . . . . . . . . . . . . . . . . 252
16.2 Further Laplace transforms . . . . . . . . . . . . . . . . . . . . 253
16.3 Examples of transfer function denominators . . . . . . . . . . 276
18.1 An example of class intervals . . . . . . . . . . . . . . . . . . . 293
19.1 Probabilities for total of two rolled dice . . . . . . . . . . . . . 305
19.2 Calculating E(X) and var (X) for two rolled dice. . . . . . . . 308
A.1 Table of Φ(x) (Normal Distribution) . . . . . . . . . . . . . . 319
A.2 Table of χ^2 distribution (Part I) . . . . . . . . . . . . . . . 320
A.3 Table of χ^2 distribution (Part II) . . . . . . . . . . . . . . . 320
List of Figures
2.1 The real number line . . . . . . . . . . . . . . . . . . . . . . . 9
3.1 The quadratic equation . . . . . . . . . . . . . . . . . . . . . . 25
4.1 Labelling right-angled triangles . . . . . . . . . . . . . . . . . 50
4.2 Generating the trigonometric graphs . . . . . . . . . . . . . . 55
4.3 The graphs of sin and cos . . . . . . . . . . . . . . . . . . . 56
4.4 The CAST diagram . . . . . . . . . . . . . . . . . . . . . . 57
4.5 Labelling a scalene triangle . . . . . . . . . . . . . . . . . . . . 59
4.6 Length of Arc, Area of Sector . . . . . . . . . . . . . . . . . . 63
5.1 The Argand diagram . . . . . . . . . . . . . . . . . . . . . . . 71
5.2 Polar representation of a complex number . . . . . . . . . . . 77
6.1 Vector Addition . . . . . . . . . . . . . . . . . . . . . . . . . . 88
6.2 Vector Subtraction . . . . . . . . . . . . . . . . . . . . . . . . 89
7.1 The graph of x^2 + 2x - 3 . . . . . . . . . . . . . . . . . . . . 129
7.2 The graph of 2x + 3 . . . . . . . . . . . . . . . . . . . . . . . 130
7.3 The graph of 1/x . . . . . . . . . . . . . . . . . . . . . . . . . 132
7.4 The graph of 1/x^2 . . . . . . . . . . . . . . . . . . . . . . . . 133
7.5 The graph of e^x . . . . . . . . . . . . . . . . . . . . . . . . . 134
7.6 The graph of ln x . . . . . . . . . . . . . . . . . . . . . . . . . 135
7.7 Closeup of graph of ln x . . . . . . . . . . . . . . . . . . . . . 136
7.8 Graph of sin x . . . . . . . . . . . . . . . . . . . . . . . . . . . 137
7.9 Graph of sin x + 1 . . . . . . . . . . . . . . . . . . . . . . . . . 138
7.10 Graph of 2 sin x . . . . . . . . . . . . . . . . . . . . . . . . . . 139
7.11 Graph of sin(x + 90) . . . . . . . . . . . . . . . . . . . . . . . 140
7.12 Closeup of graph of sin(2x) . . . . . . . . . . . . . . . . . . . . 141
8.1 Relationships between two points . . . . . . . . . . . . . . . . 142
9.1 Types of turning point . . . . . . . . . . . . . . . . . . . . . . 159
9.2 The Newton-Raphson method . . . . . . . . . . . . . . . . . . 171
13.1 The graph of f(x, y) = xy^2 + 1 . . . . . . . . . . . . . . . . . 215
13.2 The graph of f(x, y) = x^3(sin xy + 3x + y + 3) . . . . . . . . . 216
13.3 A graph with four turning points . . . . . . . . . . . . . . . . 222
14.1 Double integration over the simple region R. . . . . . . . . . . 230
14.2 Double integration over x then y. . . . . . . . . . . . . . . . . 231
14.3 Double integration over y then x. . . . . . . . . . . . . . . . . 232
16.1 An L and R circuit. . . . . . . . . . . . . . . . . . . . . . . . 261
16.2 An L, C, and R circuit. . . . . . . . . . . . . . . . . . . . . . 262
16.3 The unit step function u(t). . . . . . . . . . . . . . . . . . . . 266
16.4 The displaced unit step function u(t - c). . . . . . . . . . . . . 266
16.5 Building functions that are on and off when we please. . . . . 267
16.6 A positive waveform built from steps. . . . . . . . . . . . . . . 269
16.7 A waveform built from steps. . . . . . . . . . . . . . . . . . . 270
16.8 A waveform built from delayed linear functions. . . . . . . . . 272
16.9 An impulse train built from Dirac deltas. . . . . . . . . . . . . 274
17.1 A continuous (analog) function . . . . . . . . . . . . . . . . . 279
17.2 Sampling the function . . . . . . . . . . . . . . . . . . . . . . 280
17.3 The digital view . . . . . . . . . . . . . . . . . . . . . . . . . . 280
20.1 The Normal Distribution . . . . . . . . . . . . . . . . . . . . . 310
20.2 Close up of the Normal Distribution . . . . . . . . . . . . . . . 311
Chapter 1
Preliminaries
1.1 Introduction
We shall start the course by recapping many definitions and results that
may already be well known. As mathematics is a cumulative subject, it is
necessary, however, to ensure that all the basics are in place before we can
go on.
We assume the reader is familiar with the elementary arithmetic of numbers:
positive, negative and zero. We also assume the reader is familiar with
the decimal representation of numbers, and that they can evaluate simple
expressions, including fractional arithmetic.
1.2 Notation
We now list some mathematical notation that we may use in the course, or
that may be encountered elsewhere; this is shown in table 1.1.
Another important bit of notation is . . . , which is used as a sort of
mathematician's etcetera. For example
1, 2, 3, . . . , 10 is shorthand for 1, 2, 3, 4, 5, 6, 7, 8, 9, 10.
1, 2, 3, . . . is shorthand for 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, etc.
It's probably worth noting that in algebra, when we use letters to represent
numbers, then
=  equal to                  ≠  not equal to
<  less than                 ≤  less than or equal to
>  greater than              ≥  greater than or equal to
≡  equivalent to             ≈  approximately equal to
⇒  implies                   ∞  infinity
Σ  sum of what follows       Π  product of what follows
Table 1.1: Basic notation
3a is a shorthand for 3 × a
a is a shorthand for 1 × a
-a is a shorthand for -1 × a
1.3 Arithmetic
Two often forgotten pieces of arithmetic are:
1.3.1 The law of signs
When we combine signs, either by multiplying two numbers, or by subtracting
a negative number for example, we use table 1.2 to determine the sign of
the outcome. Put simply, a minus sign reverses our direction, and so two of
them take us back to the + direction, and so on.
First sign   Second sign   Result
+            +             +
+            -             -
-            +             -
-            -             +
Table 1.2: The law of signs
1.3.2 Order of precedence
We are familiar with the fact that expressions inside brackets must be evaluated
first; that is what the bracket signifies. However, without brackets there
is still an inherent order in which operations must be done. Consider this
simple calculation
2 + 3 × 4
Opinion is usually split as to whether the answer is 20 or 14. The reason
is that multiplication should be performed before addition, and so the
3 × 4 segment should be calculated first. Be aware that not all calculators
understand this; test yours with this calculation.
Calculations should be performed in the order shown in table 1.3.
B Brackets first - they override all priority
O Order (Powers, roots)
D Division
M Multiplication
A Addition
S Subtraction
Table 1.3: Order of precedence
We note that the table provides us with a handy reminder, BODMAS.
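As an aside (not part of the original notes), you can check this order of precedence with a short Python snippet; Python follows the same convention, so it makes a handy second opinion if your calculator does not.

    # Multiplication is performed before addition, as BODMAS dictates.
    print(2 + 3 * 4)      # 14, not 20

    # Brackets override the usual priority.
    print((2 + 3) * 4)    # 20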
1.4 Decimal Places & Significant Figures
Often we are required to produce answers to a specific degree of accuracy.
The most well known way to do this is with the number of decimal places.
1.4.1 Decimal Places
The number of decimal places is a measure of how to truncate answers to a
given accuracy. If four decimal places are required then we look at the fifth
decimal place and beyond; if it is 5 or greater, we round up the last decimal
place that is written, otherwise we simply leave it alone. Let us take an
example. π is a mathematical constant that continues infinitely through all
its decimal places, never repeating its pattern.
π = 3.141592653589793 . . .
Rounded to five decimal places we obtain
π ≈ 3.14159
since the next digit is 2, whereas rounding to four decimal places gives
π ≈ 3.1416
since the next digit is 9, which is clearly bigger than 5.
It is good to use enough decimal places to obtain an accurate answer, but
one must always remember the context of the answer. There is little point
in calculating that the length of a piece of metal should be 2.328745 cm if all
we will have to measure it with is a ruler accurate to 1 mm.
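If you want to experiment with rounding to a given number of decimal places, the short Python sketch below (an illustration only, not part of the original notes) reproduces the rounding of π used above.

    import math

    print(round(math.pi, 5))   # 3.14159  (the next digit is 2, so no rounding up)
    print(round(math.pi, 4))   # 3.1416   (the next digit is 9, so we round up)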
1.4.2 Significant Figures
Sometimes decimal places are not the most appropriate way to define accuracy.
There is no specific number of decimal places that suits all situations.
For example, if we quote the radius of the Earth in metres, then probably no
number of decimal places is appropriate for most purposes, as the answer
will not be that accurate, and there will be so many other figures before it
that they are unlikely to be significant.
An alternative often used is to specify a number of significant figures.
This is essentially the number of non-zero digits that should be displayed.
Suppose that we specify four significant figures. Then the speed of light in
m/s is written as:
c = 299,792,458 ≈ 299,800,000 m/s
which can be written more simply again in standard form (see below). The
issue here is that the other figures are less likely to have any real impact on
the answer of a problem. Similarly, the standard atomic mass of Uranium is
238.02891 g/mol ≈ 238.0 g/mol
since we only have four significant figures, we round after the zero. Note that
writing the zero helps indicate the precision of the answer.
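Rounding to a number of significant figures can also be checked by machine. The sketch below is only one of several ways of doing it in Python, using the general-purpose 'g' format; the '#' flag keeps the trailing zero that indicates the precision.

    c = 299_792_458          # speed of light in m/s
    m = 238.02891            # standard atomic mass of Uranium in g/mol

    print(f"{c:.4g}")        # 2.998e+08, i.e. four significant figures
    print(f"{m:#.4g}")       # 238.0, keeping the trailing zero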
1.5 Standard Form
In science, large and small numbers are often represented by standard form.
This takes the form
a.bcd × 10^n
if we are using four significant figures. For example, we saw above that to
four significant figures
c = 299,800,000 m/s = 2,998 × 100,000 m/s
or, working a bit more, we move the decimal place each time to the left
(which divides the left hand number by ten), and multiply by another ten
on the right to compensate:
c = 2.998 × 100,000,000 m/s
now all that remains to do is to write the number on the right as a power
of ten. We count the zeros; there are eight, and so
c ≈ 2.998 × 10^8 m/s.
The same applies for small numbers. The light emitted by a Helium-Neon
Laser has a wavelength of
λ = 0.000,000,632,8 m
but this is clearly rather unwieldy to write down. This time we move the
decimal place to the right until we get to after the first non-zero digit. Each
time we do this we essentially multiply by 10, and so to compensate we
have to divide by ten. This can be represented by increasingly large negative
values of the power.¹
So here, we need to move the decimal place seven times to the right, and
so we will multiply by 10^-7:
λ = 6.328 × 10^-7 m.
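Standard form output like this can be produced directly with the exponent format; a minimal Python illustration (not part of the original notes):

    c = 299_792_458            # speed of light, m/s
    wavelength = 0.0000006328  # He-Ne laser wavelength, m

    print(f"{c:.3e}")           # 2.998e+08, i.e. 2.998 x 10^8
    print(f"{wavelength:.3e}")  # 6.328e-07, i.e. 6.328 x 10^-7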
1.5.1 Standard prefixes
There are a number of prefixes applied to large and small numbers to allow
us to write them more meaningfully. You will have met many of them before.
The prefixes for large numbers are shown in table 1.4.
¹We will have to wait a while, until 3.5, to see exactly why this is.
Name     Prefix   In English       Power of Ten
deca     da       tens             10^1
hecto    h        hundreds         10^2
kilo     k        thousands        10^3
Mega     M        millions         10^6
Giga     G        billions         10^9
Tera     T        trillions        10^12
Peta     P        quadrillions     10^15
Exa      E        quintillions     10^18
Table 1.4: SI prefixes for large numbers
Note that in the past there was a difference between billions as used in
British English and American English. The English billion was one million
million, whereas the American billion is one thousand million. The latter has
won out now, and most references to a billion are to the American one. Also,
there is a slight disparity between normal quantities, and the bytes used
in computing. Since computing is based on binary, and therefore powers of
2, a kilobyte (kB) is not 1000 bytes, but 1024 bytes.² So in computing, 1024
is used rather than thousands to build up such quantities.
The prefixes for small numbers are shown in table 1.5.
Because of these prefixes, it is normal within engineering to adapt standard
scientific form to get the power of ten to be a multiple of three. Let us
revisit our wavelength example:
λ = 6.328 × 10^-7 m.
So we would prefer to tweak the power of ten here. We could do this
λ = 0.6328 × 10^-6 m = 0.6328 µm,
but this is pretty ugly to have a fractional number. It would be more normal
to write
λ = 632.8 × 10^-9 m = 632.8 nm.
²1024 = 2^10 and so is the closest power of two to one thousand.
Name     Prefix   In English         Power of Ten
deci     d        tenths             10^-1
centi    c        hundredths         10^-2
milli    m        thousandths        10^-3
micro    µ        millionths         10^-6
nano     n        billionths         10^-9
pico     p        trillionths        10^-12
femto    f        quadrillionths     10^-15
atto     a        quintillionths     10^-18
Table 1.5: SI prefixes for small numbers
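The adjustment to a power of ten that is a multiple of three is easy to automate. The Python sketch below is a rough illustration only; the engineering() helper and its small prefix table are hypothetical, not something defined in these notes.

    import math

    # A few SI prefixes, keyed by the power of ten they represent.
    PREFIXES = {-9: "n", -6: "u", -3: "m", 0: "", 3: "k", 6: "M", 9: "G"}

    def engineering(value):
        """Rewrite value with an exponent that is a multiple of three."""
        exponent = 3 * math.floor(math.log10(abs(value)) / 3)
        mantissa = value / 10 ** exponent
        return f"{mantissa:.4g} {PREFIXES.get(exponent, f'e{exponent}')}"

    print(engineering(0.0000006328) + "m")    # 632.8 nm
    print(engineering(299_792_458) + "m/s")   # 299.8 Mm/s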
Chapter 2
Number Systems
We remind ourselves of different sets of numbers that we will refer to later.
2.1 Natural numbers
The set of all numbers
1, 2, 3, . . .
is known as the set of positive integers (or natural numbers, or whole
numbers, or counting numbers) and is denoted by N.
2.2 Prime numbers
A prime number is a positive integer which has exactly two factors¹, namely
itself and one.
Thus 2 is the first prime number, and the only even prime number. So
the prime numbers are given by
2, 3, 5, 7, 11, 13, 17, 19, . . .
¹Recall that a factor of a number x is one that divides into x with no remainder.
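As an aside, the definition above (a number with exactly two factors) can be checked directly by counting factors; the Python sketch below is purely illustrative, and its is_prime helper is not something from these notes.

    def is_prime(n):
        """True if n has exactly two factors: 1 and n itself."""
        factors = [d for d in range(1, n + 1) if n % d == 0]
        return len(factors) == 2

    print([n for n in range(1, 20) if is_prime(n)])
    # [2, 3, 5, 7, 11, 13, 17, 19]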
2.3 Integers
The set of all numbers given by
. . . , -3, -2, -1, 0, 1, 2, 3, . . .
is known as the set of integers and is denoted by the symbol Z.
2.4 Real numbers
The collection of all numbers in ordinary arithmetic, i.e. including fractions,
integers, zero, positive and negative numbers etc. is called the set of real
numbers and is denoted R. The set of real numbers can be visualised as a
line, called the real number line or simply the real line. Each point on the
line represents a unique real number, and every number, including exotic
examples such as π, is represented by a unique point on the line.
Figure 2.1: The real number line
2.5 Rational numbers
The set of all numbers that can be written
m/n, where m and n are integers,
is known as the set of rational numbers and is denoted by Q. (Note that n
cannot be zero, as division by zero is not permitted).
For example,
2/3, 165/2096, 2 = 2/1, 0 = 0/1
are all rational numbers.
(Note that although division by zero is not permitted, dividing zero by
another number is, and as no other number can fit into zero at all, the result
is zero).
It turns out that if you add, subtract, divide or multiply any two rational
numbers together, you still get a rational number.
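This closure property is easy to see numerically; the sketch below (illustrative only, not part of the original notes) uses Python's fractions module, which keeps every result as an exact ratio of integers.

    from fractions import Fraction

    a = Fraction(2, 3)
    b = Fraction(165, 2096)

    print(a + b)   # 4687/6288
    print(a - b)   # 3697/6288
    print(a * b)   # 55/1048
    print(a / b)   # 4192/495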
There's an easy way of working out whether a given number is rational
or not. Simply expand it in its decimal form. Rational numbers always have
a decimal expansion that ends, or repeats itself every so many digits.
For example
2/3 = 0.6666666 . . .
3436.234523452345 . . .
34.68
are all rational.
2.6 Irrational Numbers
Of course, not all real numbers are rational, and in fact many numbers you
will already have met are not. These numbers are called irrational numbers.
Examples are
√2, √3, and in fact √p where p is prime.
When written in their decimal forms, irrationals are never ending and
non-repeating. This means that irrational numbers can never be written
down exactly.
Although in secondary schools we often write π = 22/7, suggesting that
π is rational, this is only a simple (and not very accurate) approximation. In
fact π is also irrational, as is Euler's constant e.
Chapter 3
Basic Algebra
Algebra is perhaps the most important part of mathematics to master. If
you do not, you will find problems in all the areas you study, all caused by
the underlying weakness in your algebra skills. In many ways, algebra is
representative of mathematics in that it deals with forming an easy problem
out of a difficult one.
3.1 Rearranging Equations
Imagine an equation as a pair of balanced weighing scales. What will happen
if we add 2kg on both sides? The scales will remain in balance. If we multiply
the weights by three on both sides? The scales will remain in balance. In
fact, even if we take the sine of both weights, the scales remain in balance.
The leads to the fundamental result you must remember.
You can do anything to both sides of an equation and you will
obtain an equivalent equation.
3.1.1 Example
We shall look at an extremely trivial example of this concept in use. In our learning of algebraic manipulation we are often told that we can take things across the equals sign and change the sign. Rarely are we told why this works. Let's examine it.
x + 4 = 9
Well, it is simple to see what value x has in this case. However, we wish to show how rearranging works in these very simple cases. We wish to find x, and this is really saying we want to manipulate the equation into the form:
x = ??
where ?? represents the answer. Therefore, we wish to have an x on its own, on one side of the equation, with everything else on the other side of the equals sign. To that end, we start to look at what is attached to x, how it is attached, and how we should remove it. In our example, 4 is attached to the x by the process of addition. Now, how do you get rid of a 4 that has been added? Of course, the answer is to subtract it, but we must not simply do this on one side; rather, in accordance with 3.1, we must do it on both sides of the equation to maintain its validity.
So we obtain
x + 4 − 4 = 9 − 4
Now the +4 − 4 on the L.H.S. cancel, leaving zero, and this step wouldn't be written normally. So we finally obtain
x = 9 − 4 = 5
You may observe that it appears that the +4 crossed the equals sign to become a −4 on the R.H.S.. However, now we know what has actually happened.
3.1.2 Order of Rearranging
Of course, in most examples, more than one thing is attached to the x, and usually by a combination of operations. It may be equally correct to rearrange by removing these in any order, but some ways will almost certainly be easier than others.
We have already noted in 1.3.2 that some operations are naturally done before others, and so when we see an expression such as:
3x² − 4 = 8
it really means
((3(x²)) − 4) = 8
where the brackets serve simply to underline the order in which things are done. The effect is somewhat similar to an onion with the x in the very centre. We could peel the onion from the outside in for the most tidy approach. That is, we remove things in the reverse order to the way they were attached in the first place.
So in our simple example, the 4 is subtracted last, so remove it first (adding 4 on both sides).
3x² − 4 + 4 = 8 + 4 ⇒ 3x² = 12
Now the x still has two things attached: the three, which is multiplied on, and the 2, which is a power. Powers are done before multiplication, so we remove in the reverse order again. Therefore we divide by 3 on both sides.
(3/3)x² = 12/3 ⇒ x² = 4
Now we only have one thing stuck to the x, and that is the power of 2. To remove this we simply take the square root on both sides:
x = ±2.
Recall that −2 squared is also 4.
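As a check on this onion-peeling argument, a computer algebra system will return both roots at once. Here is a minimal sketch using Python's sympy library (the tool is my own choice, not part of the notes, and assumes sympy is installed):

    from sympy import symbols, solve, Eq

    x = symbols('x')
    # The equation 3*x**2 - 4 = 8, rearranged step by step in the text above.
    solutions = solve(Eq(3*x**2 - 4, 8), x)
    print(solutions)   # [-2, 2] -- both branches of the square root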
3.1.3 Example
Rearrange the following expression for x.
4x + 6 = 2x − 3
Solution
We still wish to rearrange to get x = , but we must notice here that x occurs in two places. We could remove the 6 and the 4 on the LHS to obtain
x = (2x − 9)/4
(try it as an exercise), but this is not very helpful, as x is now defined in terms of itself, so we still don't know its value.
Instead we first gather all the x terms together, and we do this by performing the same operation on both sides. For example, we don't want the 2x on the RHS; it is positive and so is present by addition. We subtract it on both sides.
4x + 6 − 2x = 2x − 3 − 2x ⇒ 2x + 6 = −3
So we now have a simpler equation, with x only on one side. We can proceed as before now to remove things from the x on the LHS. Subtract 6 on both sides.
2x + 6 − 6 = −3 − 6 ⇒ 2x = −9
Finally divide by 2 on both sides.
x = −9/2
3.1.4 Example
Rearrange the following expression for x:
3(2x + 3) − 6 = 0
Solution
In this case we find that x is encased in brackets. To get at x so we can rearrange for it we could multiply out the bracket and rearrange from there. This is left as an exercise for the reader.
Another way to deal with it is to think of the bracket as an onion within an onion, to continue the analogy we began above. Begin by taking things off this bracket, rather than x directly. We add 6 to both sides
3(2x + 3) − 6 + 6 = 0 + 6 ⇒ 3(2x + 3) = 6.
Now divide by 3 on both sides to obtain
3(2x + 3)/3 = 6/3 ⇒ 2x + 3 = 2.
Note the brackets can fall off at this point naturally. Now we start with our new onion, subtracting 3 on both sides.
2x + 3 − 3 = 2 − 3 ⇒ 2x = −1
and finally divide by 2 on both sides
2x/2 = −1/2 ⇒ x = −1/2
3.1.5 Example
Rearrange the following expression for x:
−3 = 10/x − 5.
Solution
We have a more serious problem here, namely that x is on the bottom line. We begin by removing the −5 to clarify the equation, by adding 5 on both sides of course.
−3 + 5 = 10/x − 5 + 5 ⇒ 2 = 10/x
Now, there's not much attached to x, but the x is still on the bottom line. That means the x has been divided into something (the 10 in this case). To cancel the division by x, we multiply by x on both sides.
2x = 10x/x ⇒ 2x = 10
which simplifies our equation quite a lot. We can now divide by two on both sides to finish.
2x/2 = 10/2 ⇒ x = 5
3.2 Function Notation
Very often when we wish to analyse the behaviour of an expression, we make it a function. A function can be thought of as a box, into which goes a value and out of which comes a, usually different, value.
You will have seen functions written as formulae before, for example
A = πr²
is a function for calculating the area of a circle. A value goes in (the radius of the circle) and a value comes out (the area of the circle).
At times we may use notation such as f(x) to represent a function. For example
f(x) = 2x − 3
is a very simple function. For different values we put in (x), we will get different values out f(x). The notation f(x) simply means the value of the output of the function.
This notation is very useful when we want to consider specific values that we insert. For example, we write f(2) to mean find the output value of the function f when the input value for x is 2. You can see that we have simply replaced the x by a 2.
In our example stated above we get
f(2) = 2(2) − 3 = 1
f(−3) = 2(−3) − 3 = −9
f(0) = 2(0) − 3 = −3
f(w) = 2(w) − 3
In the last example, we had to replace x by w, but we can't work out anything further, so we stop there.
Very often we want to be able to undo the result of our function. For
example, we need to be able to reverse multiplication with division in order
to rearrange equations, or reverse a square with a square root.
To do this we use an inverse function. An inverse function can be thought of as a complementary box to our original one, so that when we plug the output of the first function into its input we get the original value. For example, with our simple f(x) above, we inserted 2 and got 1. Our inverse will have to take 1 and give us 2.
To find the inverse function, it is usually easier to give f(x) a letter, like y. In our example we obtain
y = 2x − 3
Now we rearrange the equation for x, using the rules described above. We obtain
x = (y + 3)/2
this is left as an exercise for the reader.
We're pretty much done, but it is usual to label our inverse of f(x) with the notation f⁻¹(x) and have our function in terms of x, not y. So, we swap x and y to obtain
y = (x + 3)/2
and now use our inverse function formula
f⁻¹(x) = (x + 3)/2
Recall that with our simple example for f(x), that
f(2) = 2(2) − 3 = 1.
If we now feed this output into the input of the inverse, we should get back to our starting position (2).
f⁻¹(1) = (1 + 3)/2 = 2.
Just as before we insert the value in the brackets into x throughout the expression for the inverse function, and you can see that indeed the inverse function here has taken us back to the start.
We will meet other examples of inverse functions throughout this module. It's not always possible to do this, and not all functions have inverses, unfortunately.
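A quick way to see the box picture is to code the function and its inverse directly. The sketch below is in Python (my own illustration; the notes themselves contain no code) and simply checks that f⁻¹ undoes f for a few inputs.

    def f(x):
        """The simple function f(x) = 2x - 3 used in the text."""
        return 2 * x - 3

    def f_inverse(x):
        """Its inverse, f^-1(x) = (x + 3) / 2, found by rearranging y = 2x - 3."""
        return (x + 3) / 2

    # Feeding the output of f back into f_inverse should return the original input.
    for value in [2, -3, 0, 7.5]:
        assert f_inverse(f(value)) == value
    print(f(2), f_inverse(1))   # 1 and 2.0, as in the worked example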
3.3 Expansion of Brackets
When we have to multiply something by a bracketed expression, we use the
so called distributive law.
a(b + c) = ab + ac;   (a + b)c = ac + bc
We can easily show that this can be extended.
a(b + c + d + · · ·) = ab + ac + ad + · · ·
There are some simple things worth remembering:
• The absence of a number before an expression is the same as multiplying by 1.
• The sign preceding a number belongs to that number and must be included in the multiplication.
• In particular, a minus sign before a bracket means −1 multiplied by that bracket.
3.3.1 Examples
Here are some examples.
1. −2(3x − 5y + z) = −6x + 10y − 2z
2. 3(2x − y) − (x + 2y) = 6x − 3y − x − 2y = 5x − 5y
3. 4x(y − z + 2(x − y)) = 4x(y − z + 2x − 2y) = 4x(2x − y − z) = 8x² − 4xy − 4xz
4. 2y(3x − 4z(x + z)) = 2y(3x − 4xz − 4z²) = 6xy − 8xyz − 8yz²
3.3.2 Brackets upon Brackets
When we encounter a bracketed expression multiplied by another bracketed expression we can apply the same technique, although it appears more complicated.
Consider
(a + b)(c + d)
For the moment, we shall call z = (a + b). Then our expression appears simpler.
z(c + d) = zc + zd
Now we reinsert the true value of z.
= (a + b)c + (a + b)d = ac + bc + ad + bd
So we reduce the whole problem to two separate expansions of the type we have already met. There are a number of rules of thumb to make this technique rather simpler, but many depend on multiplying only two brackets together, each of which with exactly two terms. We shall examine a general technique without this shortcoming.
Consider once more
(a + b)(c + d)
Pick any bracket; for the sake of demonstration, we shall pick the first. Now take the first term in it (which is a). We now multiply this term on each term of the other bracket in turn, adding all the results.
= ac + ad + · · ·
When we reach the end of the other bracket, we return to the first bracket and move onto the next term, which is now b, and do the same again, adding to our existing terms.
= ac + ad + bc + bd + · · ·
Now we return to the first bracket, and move to the next term. We find we have actually exhausted our supply of terms, and so our expansion is really complete.
(a + b)(c + d) = ac + ad + bc + bd
To multiply several brackets together at once we should multiply two only at a time. For example
(a + b)(c + d)(e + f).
We begin by multiplying one pair together, let us say the first two, to obtain:
= (ac + ad + bc + bd)(e + f).
We may then complete the expansion; it is left to the reader as an exercise to confirm that the full expansion will be:
= ace + ade + bce + bde + acf + adf + bcf + bdf.
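These expansions are easy to verify with a computer algebra system. A minimal sympy sketch (the tool is my own choice and assumes sympy is installed):

    from sympy import symbols, expand

    a, b, c, d, e, f = symbols('a b c d e f')

    # Two-bracket and three-bracket products from the text.
    print(expand((a + b) * (c + d)))            # a*c + a*d + b*c + b*d
    print(expand((a + b) * (c + d) * (e + f)))  # the eight-term expansion above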
3.3.3 Examples
Here are some examples.
1. (x + 1)(x + 2) = x² + 2x + x + 2 = x² + 3x + 2
2. (x + y)² = (x + y)(x + y) = x² + xy + xy + y² = x² + 2xy + y²
(See binomial expansions later).
3. (x + y)(x − y) = x² − xy + xy − y² = x² − y²
(This is called the difference of two squares).
4. (3x − 2y)(x + 3) = 3x² + 9x − 2xy − 6y
5. (2x − y)(x + 2y) = 2x² + 4xy − xy − 2y² = 2x² + 3xy − 2y²
3.4 Factorization
Factorization is the opposite of expansion; we often prefer to condense and simplify expressions rather than expand them. Indeed, even in the expansion examples above we simplified the expressions along the way to make life easier for ourselves.
Factorization is recognizing that an expression like this
4x + 2y
could be written as
4x + 2y = 2(2x + y)
simply because we can clearly see that expanding the result gives us the original. A factor common to all terms is observed - in this case the number 2 clearly divides into all the terms. The factor is divided into the expression and written outside the result, which is bracketed.
The factor may often be some algebra, and not just a number. In the expression
x² − 3x
we see that x divides into both terms. We can thus write
x² − 3x = x(x − 3).
The ability to spot factors does not come easily, but with a great deal of practice.
3.4.1 Examples
Let us look at some examples. Remember that in each case, expanding the end result should give us our original expression, and this allows you to check and follow the logic.
1. 3x + 12y² − 6z = 3(x + 4y² − 2z)
2. x³ + 3x² + 4x = x(x² + 3x + 4)
3. x³ + 3x² = x²(x + 3)
4. 2x² + 4xy + 8x²z = 2x(x + 2y + 4xz)
We can also factorise expressions into two or more brackets multiplied together, but this is more difficult and we shall examine it later.
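A computer algebra system can both check and suggest factorizations; a short sympy sketch (again, the tool is my own choice, not part of the notes):

    from sympy import symbols, factor

    x, y, z = symbols('x y z')

    # Each call should reproduce the bracketed forms given in the examples,
    # though sympy may print the terms inside the bracket in a different order.
    print(factor(3*x + 12*y**2 - 6*z))        # 3*(x + 4*y**2 - 2*z)
    print(factor(x**3 + 3*x**2 + 4*x))        # x*(x**2 + 3*x + 4)
    print(factor(x**3 + 3*x**2))              # x**2*(x + 3)
    print(factor(2*x**2 + 4*x*y + 8*x**2*z))  # 2*x*(x + 2*y + 4*x*z)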
3.5 Laws of Indices
The term index is a formal term for a power, such as squaring, cubing etc,
and the plural of index is indices.
There are some simple laws of indices, which are shown in table 3.1.
1   x^a × x^b = x^(a+b)
2   x^a ÷ x^b = x^(a−b)
3   (x^a)^b = x^(ab)
4   x^0 = 1
5   x^(−b) = 1/x^b
6   x^(1/b) = ᵇ√x (the b-th root of x)
7   x^(a/b) = ᵇ√(x^a)
Table 3.1: The laws of indices
3.5.1 Example proofs
We shall attempt to show how a selection of these results work, but such demonstrations are for understanding and are not examinable.
Let us consider the first law, with a concrete example:
x³ × x²
We don't know what the number x is, but all that is important is that the base values of each number are the same.
We recall that powers mean a string of the same thing multiplied together, so that:
x³ × x² = (x × x × x) × (x × x)
where the first bracketed group is x³ and the second is x². Clearly there is no difference between the ×'s inside each group and the × between them. In other words, this is just
x³ × x² = (x × x × x) × (x × x) = x × x × x × x × x = x⁵
a string of five x's multiplied together, exactly the definition of x⁵, and the 3 and 2 add to make 5.
We shall show how one other result works, using the more general a and b.
Consider
(x^a)^b.
By definition, this is just
x^a × x^a × · · · × x^a   (b times)
with b of these x^a terms. (Recall, x⁶ just means 6 of the x terms multiplied together).
Now each x^a term is itself a collection of a x's multiplied together. So we can expand further into b groups,
(x × · · · × x) × (x × · · · × x) × · · · × (x × · · · × x)
where each group contains a x's. Each collection of x terms is an expanded x^a, so each contains a x's multiplied together. There were b of the x^a terms, and so we have b groups, each containing a x's. Therefore we have a long string of x's multiplied together, which are a × b in number, exactly the definition of x^(ab).
It is worth going through this argument in a concrete case, for example (x²)³, to help follow the logic.
3.5.2 Examples
Here are some examples
1. 2⁵ × 2³ = 2⁵⁺³ = 2⁸ (= 256)
2. 2⁸ ÷ 2³ = 2⁸⁻³ = 2⁵ (= 32)
3. (3²)² = 3^(2×2) = 3⁴ (= 81)
4. 16^(1/4) = ⁴√16 (= 2)
5. 15⁻³ = 1/15³ (= 1/3375)
6. 27^(−2/3) = 1/27^(2/3) = 1/(³√27)² = 1/9
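Fractional and negative indices follow the same laws, which is easy to confirm numerically. A small Python check (my own illustration, not part of the notes):

    # Numerical checks of the laws of indices from table 3.1.
    print(2**5 * 2**3, 2**8)      # both 256 (law 1)
    print(2**8 / 2**3, 2**5)      # both 32  (law 2)
    print((3**2)**2, 3**4)        # both 81  (law 3)
    print(16**(1/4))              # 2.0      (law 6: the fourth root)
    print(27**(-2/3))             # ~0.1111..., i.e. 1/9 (law 7 with a negative index)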
3.6 Laws of Surds
A surd is technically a square root which has an irrational value, but we
often talk about surds whenever we manipulate square roots. We have two
main results for manipulating square roots, as shown in table 3.2
1   √a × √b = √(ab)
2   √a ÷ √b = √(a/b)
Table 3.2: The laws of surds
3.6.1 Examples
The second law of surds is most often used to work out the square roots of fractions.
1. √(1/4) = √1/√4 = 1/2
2. √(4/7) = √4/√7 = 2/√7
The first law is often used to split large square roots into smaller ones, by attempting to divide the original number by a perfect square.
1. √12 = √(4 × 3) = √4 × √3 = 2√3
2. √80 = √(16 × 5) = √16 × √5 = 4√5
A very important application of this to come later is that of complex
numbers.
3.7 Quadratic Equations
A quadratic equation in x is an equation of the form
ax² + bx + c = 0,   a ≠ 0
where a, b and c are constants (that is, they do not change value as x does).
It's vital that a is not zero, or the equation collapses into that of a straight line. However, it is quite allowable to have b or c (or both) zero.
3.7.1 Examples
Here are some examples of equations which are, and which are not, quadratic equations, shown in table 3.3. Equations 1, 2 and 3 are genuine quadratics, even though terms are missing in 2 and 3. Equation 4 is the equation of a straight line, or if you like a = 0, which is not permitted. Equation 5 contains an x³ term, and so is a cubic equation and not a quadratic, whose highest term must be x².
    Equation               Quadratic?   a     b     c
1   2x² + 3x − 4 = 0       YES          2     3     −4
2   x² + 2x = 0            YES          1     2     0
3   x² + 3 = 0             YES          1     0     3
4   4x + 2 = 0             NO           n/a   n/a   n/a
5   x³ + 3x − 2 = 0        NO           n/a   n/a   n/a
Table 3.3: Examples of quadratic equations
3.7.2 Graphical interpretation
To examine the solutions of this equation, it is helpful to consider the graph of the function given by
y = ax² + bx + c.
When this graph is plotted it gives a characteristic U-shaped curve, called a parabola. It has many interesting properties, but the most important is that it is symmetrical about a vertical line through the maximum, or minimum, point. Our equation corresponds to the above function when y = 0, or to put it another way, the solutions of our equation occur when the curve cuts the x-axis (where y is zero).
Figure 3.1: The quadratic equation
This gives us our first problem, which is that we cannot even be sure that the equation has solutions. If this seems strange or confusing, recall that we never claimed that all equations could be solved. Our problem falls into three categories.
1. The curve cuts the x-axis twice;
2. The curve cuts the x-axis once only (just touching, no more);
3. The curve does not cut the x-axis at all.
This situation is shown in figure 3.1. In this figure all the curves have been drawn as maximum parabolas; many quadratics show minimum parabolas but the cases are just the same. The y-axis has been omitted as it is not relevant to the problem at hand, which is determining the number of solutions of the corresponding equation.
3.7.3 Factorization
One approach to try and solve a quadratic is the so called factorization or
sum and product method of solution.
We imagine that we can write our equation in the form
(x − α)(x − β) = 0.
Why have we written it this way? Well, for a product of two things to be zero, one or other, or both, must be zero. Therefore in this case, if the left-hand bracket is zero we see that x = α, and if the right-hand bracket is zero we see that x = β. Therefore α and β are our two solutions.
If we expand these brackets out, we obtain
x² − αx − βx + αβ = x² − (α + β)x + αβ = 0
Now we try to compare this to our quadratic equation, but before we do so, we standardize our equation, by dividing it by a:
ax² + bx + c = 0 ⇒ x² + (b/a)x + c/a = 0.
We now compare the coefficients of the x terms in each equation (assuming they are in fact two forms of the same equation). (A coefficient is a number multiplied upon a term).
Comparing we obtain
x² term:         1 = 1   (this is why we standardize);
x term:          b/a = −(α + β)
constant term:   c/a = αβ
The two significant equations are thus
−b/a = α + β;   c/a = αβ
which tell us that the sum of the solutions is −b/a and that the product of the solutions is c/a. If we can guess two numbers with these properties, we have solved the quadratic equation.
This method has the advantages that the theory is simple and straightforward, but has two crippling disadvantages. Firstly, we have to guess two numbers, and even in simple cases this may be very difficult, in a hard example next to impossible. Secondly, and more importantly, before we start we have no way of knowing how many solutions the equation has. If it has two, this theory works well; if it has one, then α and β have the same value (we say the solution is repeated). However, if there are no solutions, it will be impossible to guess our two numbers, as they do not exist, but this may not be obvious from the start.
3.7.4 Quadratic solution formula
Fortunately, we do have a more flexible method of solution, but its proof is difficult.
The proof is given here, but is not required.
First of all consider the problem: we cannot simply take a square root to remove the square, as the other x term gets in the way. Consequently, we try to absorb the x term and the x² term into one term. This is a difficult procedure known as completing the square.
Consider
(x + y)² = (x + y)(x + y) = x² + 2xy + y²
We wish to let this be our x² + (b/a)x terms. For this to be true, we must let y = b/(2a). Note that this does not correspond to exactly what we want:
(x + b/(2a))² = x² + (b/a)x + b²/(4a²)
which is our quadratic, except that the c/a term is missing, and the last term above is extra, so we add and subtract these two terms respectively.
x² + (b/a)x + c/a = 0
⇒ (x + b/(2a))² + c/a = b²/(4a²)
⇒ (x + b/(2a))² = b²/(4a²) − c/a = (b² − 4ac)/(4a²)
⇒ x + b/(2a) = ±√((b² − 4ac)/(4a²))
which, using the laws of surds (see 3.6), yields our final result.
x = (−b ± √(b² − 4ac)) / (2a)
3.7.5 The discriminant
As well as giving a fool-proof method of solving a quadratic, the solution formula has a small section which tells us some useful information. The section
b² − 4ac
which was the contents of the square root is known as the discriminant of the equation, as it tells us how many solutions the equation has.
b² − 4ac > 0   Two real solutions   Case 1 above
b² − 4ac = 0   One real solution    Case 2 above
b² − 4ac < 0   No real solutions    Case 3 above
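The solution formula and the discriminant translate directly into a short program. The following Python sketch (my own illustration, not part of the notes) returns zero, one or two real solutions accordingly.

    import math

    def solve_quadratic(a, b, c):
        """Real solutions of a*x**2 + b*x + c = 0 (a must be non-zero)."""
        if a == 0:
            raise ValueError("not a quadratic: a must not be zero")
        disc = b**2 - 4*a*c          # the discriminant
        if disc < 0:
            return []                # case 3: no real solutions
        if disc == 0:
            return [-b / (2*a)]      # case 2: one (repeated) solution
        root = math.sqrt(disc)       # case 1: two real solutions
        return [(-b + root) / (2*a), (-b - root) / (2*a)]

    print(solve_quadratic(1, 2, -3))   # [1.0, -3.0]
    print(solve_quadratic(1, -4, 4))   # [2.0]
    print(solve_quadratic(1, 1, 1))    # []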
3.7.6 Examples
Here are some examples of quadratic equations and their solutions.
1. x² + 2x − 3 = 0
2. x² − 4x + 4 = 0
3. x² + x + 1 = 0
Solutions
1. In this case a = 1, b = 2, c = −3. We plug this into the solution formula to obtain
x = (−2 ± √(2² − 4(1)(−3))) / 2
which when calculated out yields answers of 1 and −3.
2. In this case a = 1, b = −4, c = 4. The solution formula yields
x = (+4 ± √((−4)² − 4(1)(4))) / 2
which when calculated out yields answers of 2 and 2. The repeated solution is nothing unusual, and indicates that this quadratic has only one solution.
3. In this case a = 1, b = 1, c = 1. We shall follow this case in more detail.
x = (−1 ± √(1² − 4(1)(1))) / 2 = (−1 ± √(1 − 4)) / 2 = (−1 ± √−3) / 2
Note that we have a negative square root here. We cannot calculate this: suppose that the answer was a positive number, then when squared we get a positive number, not −3. We also get a positive number when we square a negative number, and we get zero when we square zero. Thus no number can be √−3.
There are no solutions to this quadratic equation.
3.7.7 Special cases
Observe that if b = 0, or if c = 0 in the quadratic, the equation can be solved
more directly. We show how for completeness, although the formula may still
be used in these cases.
b = 0
We have here
ax² + c = 0 ⇒ ax² = −c ⇒ x² = −c/a
Therefore
x = ±√(−c/a).
Note that −c/a must be positive, and this will be the case only if a and c have different signs, otherwise there are no solutions. Note also the ± in the formula; solving directly often leads to forgetting the solution coming from the negative branch of the square root.
c = 0
In this case we have
ax² + bx = 0 ⇒ x(ax + b) = 0
from which we obtain that either
x = 0
which is one solution, or
ax + b = 0 ⇒ x = −b/a
3.8 Notation
We now introduce two more items of notation.
3.8.1 Modulus or absolute value
A very useful idea in mathematics is the notion of absolute value of a real number. If x is a real number, we define the modulus of x, or mod x, denoted |x|, as follows.
|x| = larger of x and −x
Note that this is not the same as the mod operation in computing.
Examples
Here are some examples
|2| = larger of 2 and −2 = 2;
|−2| = larger of −2 and 2 = 2;
|0| = larger of 0 and −0 = 0.
In other words, the modulus function simply strips off any leading minus sign on the number.
Alternative definitions
There are some alternative, but equivalent, definitions of |x|:
|x| = x if x ≥ 0;  −x if x < 0
and
|x| = √(x²)
where we adopt the convention that the square root takes the positive branch only, unless we include ±, which is commonly accepted.
3.8.2 Sigma notation
Suppose that f(k) is some expression involving k (a function of k formally speaking). For example, f(k) could be 2^k or 2k + 1 etc. Then, the notation
Σ_{k=m}^{n} f(k)
is a shorthand for the expression
f(m) + f(m + 1) + · · · + f(n)
In other words, we insert the value at the bottom of the sigma into the f expression, then insert that value plus one, plus two, etc., until we reach the value on top of the sigma, and add all the results together.
We will use the symbol ∞ to denote the lack of an endpoint in the summation.
The ranges above and below the sigma are sometimes omitted when it is clear what is being summed.
This may be easier to understand after some simple examples.
Examples
Here are some examples of sigma notation
1. Σ_{k=0}^{4} 2^k = 2⁰ + 2¹ + 2² + 2³ + 2⁴ = 31
2. Σ_{k=4}^{∞} k = 4 + 5 + 6 + 7 + 8 + · · ·
3. Σ_{k=2}^{5} (−1)^(k+1) (1/3^k) = (−1)³(1/3²) + (−1)⁴(1/3³) + (−1)⁵(1/3⁴) + (−1)⁶(1/3⁵)
   = −(1/3²) + (1/3³) − (1/3⁴) + (1/3⁵)
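Sigma notation maps directly onto a loop or, in Python, the built-in sum over a range. A small sketch of the first and third examples (my own illustration):

    # Example 1: sum of 2**k for k = 0..4 (note the upper limit is inclusive in sigma notation).
    print(sum(2**k for k in range(0, 5)))                          # 31

    # Example 3: the alternating sum of 1/3**k for k = 2..5.
    print(sum((-1)**(k + 1) * (1 / 3**k) for k in range(2, 6)))    # ~ -0.0823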
3.8.3 Factorials
The notation n!, where n is an integer greater than or equal to 0, is spoken "n factorial" and is a shorthand for
n × (n − 1) × (n − 2) × · · · × 2 × 1.
So for example
5! = 5 × 4 × 3 × 2 × 1 = 120.
It should be clear that this type of expression gets very large, very quickly, and indeed 69! is so large that most calculators are unable to represent it.
We note that by convention we accept 1! = 1 and 0! = 1.
3.9 Exponential and Logarithmic functions
We now consider another, slightly more complex type of function.
3.9.1 Exponential functions
In quadratics, we had x terms with specific constant powers, such as x². A very powerful function is formed when x is in the exponent (yet another name for power or index).
An exponential function is one of the form
y = k^x
where k is some positive constant. These functions are important due to their extraordinary ability to climb or decrease, as we shall see later when we examine their graphs.
3.9.2 Logarithmic functions
For a positive number n, the logarithm or log of n to base b, written log_b n, is the power to which b must be raised to give n. To put this in mathematics:
log_b n = x ⇔ b^x = n
The antilogarithm or antilog of n to the base b is b^n.
There are two main types of logarithms in use today:
• log_10, often written simply as log;
• log_e, often written¹ as ln.
Examples
Logs are often used to condense a very large range of numbers to a more manageable one.
Examine how in the following table the logs of a large range of values, from one thousandth to one thousand, are contracted to a range from -3 to 3. Verifying that these logarithms are correct is left as a simple exercise to the reader.
Value    0.001   0.01   0.1   1   10   100   1000
Log_10   -3      -2     -1    0   1    2     3
The scales of pH (chemical scale of acidity), the decibel range and the Richter scale of earthquake intensity are all logarithmic², base 10.
Therefore a pH of 6 is 10 times more acidic than the neutral 7, and 5 is 100 times more acidic than 7, etc.
¹ The Scottish mathematician John Napier (1550 - 1617) did a great deal of work on methods of computation and published his Mirifici logarithmorum canonis descriptio (Description of the marvellous rule of logarithms) in 1614. In his honour logs base e are often called Naperian logarithms. Logs were first used to help perform large calculations.
² A lesser known example is the warp-speed scale
Warnings
Just as we have problems with the square roots of negative numbers, so we cannot take the logarithms of negative numbers, or of zero.
Laws of Logarithms
Logarithms are so useful because they exhibit the following properties, known as the laws of logarithms. These are closely related to the laws of indices and are shown in table 3.4.
1   log_n(x × y) = log_n(x) + log_n(y)
2   log_n(x ÷ y) = log_n(x) − log_n(y)
3   log_n(x^y) = y log_n(x)
4   log_n n = 1
5   log_n 1 = 0
Table 3.4: Laws of Logarithms
These laws easily account for their use in computation. To multiply two large numbers, one would find their logarithms from tables, add these two values, and then by the first law, this is the logarithm of the product, so one merely antilogs this number to obtain the answer.
The other major application of these is in solving equations where the variable is in the index, i.e. some sort of exponential expression.
3.9.3 Logarithms to solve equations
When we obtain equations like this
a^x = b
it is sometimes possible to guess the power to which we must raise a to obtain b. However, more often than not, a and b are not simple integers, and it is therefore almost impossible to guess x. In this case, take logarithms on both sides of the equation.
log_n a^x = log_n b
⇒ x log_n a = log_n b   (using law of logs 3)
⇒ x = log_n b / log_n a
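In code the same manipulation is one line; a short Python sketch using the standard math module (my own illustration, not part of the notes):

    import math

    def solve_exponential(a, b):
        """Solve a**x = b for x by taking logs on both sides: x = log(b) / log(a)."""
        return math.log(b) / math.log(a)

    print(solve_exponential(3, 27))   # 3.0 (up to floating point)
    print(solve_exponential(3, 30))   # ~3.0959, as in the worked example below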
3.9.4 Examples
Solve the following equations for x.
1. 3^x = 27
2. 3^x = 30
3. 2^(2x) − 5(2^x) + 6 = 0
Solutions
1. Although it is clear here that the answer is x = 3, we use logs to illustrate the point. It really doesn't matter what base of logs we use, provided we are consistent. Take logs on both sides
log 3^x = log 27 ⇒ x log 3 = log 27
using the third law of logs (see 3.9.2). We now can perform ordinary rearranging.
x = log 27 / log 3 = 3
2. This is only slightly more complicated, although it is not possible to guess the answer easily.
log 3^x = log 30 ⇒ x log 3 = log 30
x = log 30 / log 3 = 3.0959
to four decimal places.
3. This problem looks very complicated. Taking logs immediately will get us into trouble. Partly because we have no law to split logs over addition (note this carefully, a common mistake), and because the log of zero is a problem (it's something like −∞).
We might observe that the equation roughly resembles a quadratic (see 3.7), and this is the key. Let us assign u = 2^x, then clearly
u = 2^x ⇒ u² = (2^x)² = 2^(2x)
by the laws of indices (see 3.5). Placing these into the equation yields
u² − 5u + 6 = 0
which is now a plain quadratic. Solving by whatever method yields u = 2 and u = 3 as solutions. Having dealt with the quadratic, we now deal with the exponential problem; recall that u = 2^x.
Consider the u = 2 solution: this means 2^x = 2, so that clearly x = 1, with no logarithms required.
The u = 3 solution proceeds as follows
2^x = 3
and following the procedure above we obtain
x = log 3 / log 2 = 1.5850
to four decimal places. So we have two solutions: x = 1, x = 1.5850.
3.9.5 Anti-logging
Logarithms are used to undo expressions in which the x we wish to obtain is in the power. Similarly, exponential functions may be used to undo logarithms. Recall our definition.
log_b n = x ⇔ b^x = n
this means that
b^(log_b n) = n
To put it another way, we can remove a logarithm by taking the base number to the power of the logarithm. Let's use some numbers to illustrate the point.
Example
We know that
log_10(1000) = 3
so suppose we were asked to solve the equation
log_10 x = 3
we know that x = 1000, but we might not notice this in a more difficult problem, or one we had not previously seen. To remove the log, we take the base number (in this case we are using base 10) and raise it to both sides:
10^(log_10 x) = 10³.
Now we know that the log and the anti-log (the process of raising the base to this power) cancel each other out on the left, so we obtain
x = 10³ = 1000
and our equation is solved.
3.9.6 Examples
Solve the following equations for x:
1. log x² = 2
2. 4 ln x = 70
Solutions
1. In the absence of a specific base, we assume base 10, as noted above. So we may antilog both sides directly here, raising 10 to both sides.
10^(log x²) = 10²
On the LHS, the anti-log and log cancel, leaving
x² = 10² = 100 ⇒ x = ±10
2. In this case, the base of the logarithm is e. We could immediately take anti-logs on both sides, but the 4 in front of the ln x makes it messy. It is easier to rearrange to ln x first (onion within onion as before).
ln x = 70/4
Now we anti-log, base e; note that exp(·) is another notation for e^(·).
exp(ln x) = exp(70/4) ⇒ x = exp(70/4) ⇒ x = 39,824,784.4
3.10 Binomial Expansion
The binomial theorem or expansion has two main purposes. It allows us to easily expand out high powers of brackets which contain two terms.
It is also highly suited to calculating probabilities associated with a simple repeated experiment with a fixed probability of success each time.
We shall concentrate on the first application here.
3.10.1 Theory
Consider
(p + q)^n
for all numbers, and in particular positive integers, n.
We now examine the expansion of powers of (p + q), and let us start by considering some examples³:
(p + q)² = p² + 2pq + q²;
(p + q)³ = p³ + 3p²q + 3pq² + q³;
(p + q)⁴ = p⁴ + 4p³q + 6p²q² + 4pq³ + q⁴.
If you examine this you will note some facts:
Patterns
Binomial expansions follow these rules:
• the powers of each term add to the power of the expansion;
• the powers of p begin with n and decrement each time until they reach 0;
• the powers of q begin with 0 and increment each time until they reach n.
Armed with these facts we can write out the expansion for any value of n, except that we as yet can't work out the coefficients. However, we can note something about these too.
Binomial coefficients follow these rules:
• the coefficients on the first and last term are both 1;
• the coefficients on the second and one but last term are both n;
• if the coefficients are laid out in a triangle, then any one can be found by adding the two above.
This triangle layout is shown in table 3.5 as far as n = 8 and is known as Pascal's triangle⁴.
1
1 1
1 2 1
1 3 3 1
1 4 6 4 1
1 5 10 10 5 1
1 6 15 20 15 6 1
1 7 21 35 35 21 7 1
1 8 28 56 70 56 28 8 1
Table 3.5: Pascal's Triangle
We can now work out simple expansions, and this can be done in a two step process for beginners.
³ These were produced by considering (p + q)(p + q) and expanding, and then multiplying by another (p + q) each time.
⁴ Named after Blaise Pascal (1623-1662) the noted French mathematician, physicist and philosopher. Due to his work in hydrostatics the SI unit of pressure takes his name, and a computer language is named after him for his production of a calculating machine at age 19. Pascal, along with Fermat and De Moivre, was a pioneer in the mathematics of probability.
3.10.2 Example
Write out the expansion of (p + q)⁸.
Start out by writing the terms; we will start with p⁸ and end with q⁸; each term in between will lose 1 in power from p and gain 1 in power on q.
p⁸ + p⁷q + p⁶q² + p⁵q³ + p⁴q⁴ + p³q⁵ + p²q⁶ + pq⁷ + q⁸
This still lacks the coefficients; we can place these in front of the terms by taking them straight from Pascal's triangle, the correct row being the one whose second number matches n = 8.
This gives us
p⁸ + 8p⁷q + 28p⁶q² + 56p⁵q³ + 70p⁴q⁴ + 56p³q⁵ + 28p²q⁶ + 8pq⁷ + q⁸
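Rows of Pascal's triangle, and hence these coefficients, can be generated by the add-the-two-above rule. A short Python sketch (my own illustration) builds the row for any n and checks it against the built-in combination function.

    import math

    def pascal_row(n):
        """Return row n of Pascal's triangle (row 0 is [1]), using the add-two-above rule."""
        row = [1]
        for _ in range(n):
            # Each new row is formed by adding neighbouring pairs of the previous row.
            row = [1] + [row[i] + row[i + 1] for i in range(len(row) - 1)] + [1]
        return row

    print(pascal_row(8))                         # [1, 8, 28, 56, 70, 56, 28, 8, 1]
    print([math.comb(8, r) for r in range(9)])   # the same coefficients via nCr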
3.10.3 Examples
We can deal with more difficult problems in the same way. Expand the following expressions fully
1. (x − 1/x)⁴
2. (2a + b)⁵
Solutions
1. It is usually easier to do the simple expansion first. That is, expand out (a + b)⁴, which is
a⁴ + 4a³b + 6a²b² + 4ab³ + b⁴
and now let a = x and let b = −1/x (note that the minus sign is included in b itself). Now insert these into the expansion, inserting brackets for safety.
(x)⁴ + 4(x)³(−1/x) + 6(x)²(−1/x)² + 4(x)(−1/x)³ + (−1/x)⁴
Which, with careful working out using the laws of signs, gives us
= x⁴ − 4x³(1/x) + 6x²(1/x)² − 4x(1/x)³ + (1/x)⁴
and finally we have
= x⁴ − 4x² + 6 − 4/x² + 1/x⁴
2. Similarly, begin with (x + y)⁵ (I choose x and y because they are not present in the original problem).
x⁵ + 5x⁴y + 10x³y² + 10x²y³ + 5xy⁴ + y⁵
Now we let x = 2a and y = b to obtain
(2a)⁵ + 5(2a)⁴(b) + 10(2a)³(b)² + 10(2a)²(b)³ + 5(2a)(b)⁴ + (b)⁵
Now the brackets around the b terms are not required, but it's a good habit, as they are certainly required around the 2a terms. Note that
(2a)² = 2a × 2a = 4a² ≠ 2a².
In other words, without the brackets the power only acts on the a and not the 2 as well; this is a common error. Expanding all these powers of 2a gives us
= 32a⁵ + 5(16a⁴)b + 10(8a³)b² + 10(4a²)b³ + 5(2a)b⁴ + b⁵
which finally leaves us with
= 32a⁵ + 80a⁴b + 80a³b² + 40a²b³ + 10ab⁴ + b⁵
3.10.4 High values of n
This is not examinable and is included for completeness.
For high powers of n working out the powers of the terms is not difficult, but calculating the coefficients is more difficult; for this we use the following formula.
The number of combinations of r objects picked from n is given by
ⁿCᵣ = n! / (r!(n − r)!)
This formula gives us the terms from Pascal's triangle precisely, but note that r must be counted from 0. So we could in fact use this to define the expansion.
The binomial expansion for (p + q)^n is
Σ_{r=0}^{n} (ⁿCᵣ p^(n−r) q^r).
Remember that the summation starts with r = 0.
3.11 Arithmetic Progressions
An arithmetic progression is a sequence of numbers of the following form:
a, a + d, a + 2d, a + 3d, . . . , a + (n − 1)d, . . .
where a and d are constants. Note that a + (n − 1)d is the nth term in the progression, rather than a + nd, as the first term has no d. In this context, we call
• a the initial term;
• d the common difference.
So in an arithmetic progression (we abbreviate this to A.P.) each term except the first is obtained by adding the common difference d to its predecessor.
3.11.1 Examples
Here are some examples of A.P.s:
1. 1, 2, 3, 4, 5, . . . is an A.P. with a = 1 and d = 1.
2. 50, 100, 150, 200, . . . is an A.P. with a = 50 and d = 50.
3. 100, 50, 0, −50, −100, . . . is an A.P. with a = 100 and d = −50.
3.11.2 Sum of an arithmetic progression
Let
a, a + d, a + 2d, . . .
be an A.P.. Then the sum of the first n terms of the A.P., denoted S_n, is given by
S_n = (n/2)(2a + (n − 1)d)
Although it is not examinable, a simple proof is given below.
Clearly we can write
S_n = a + (a + d) + (a + 2d) + · · · + (a + (n − 2)d) + (a + (n − 1)d)
Now, we write the same R.H.S., but back to front.
S_n = (a + (n − 1)d) + (a + (n − 2)d) + · · · + (a + 2d) + (a + d) + a
Now, obviously there's the same number of terms in the R.H.S. of both expressions; remember there are n terms. We add them together from left to right to obtain:
2S_n = (2a + (n − 1)d) + (2a + (n − 1)d) + (2a + (n − 1)d) + · · · + (2a + (n − 1)d) + (2a + (n − 1)d).
Now, as pointed out earlier, there are n terms, and so this becomes
2S_n = n(2a + (n − 1)d)
dividing by 2 on both sides gives the result.
(The mathematician Karl Gauss was allegedly able to perform this trick in his head in primary school, where his mathematical abilities were first observed. Many believe Gauss to be the greatest mathematician ever.)
Note
The following notation is often used in texts:
S_n = (n/2)(2a + \overline{n − 1} d)
in this context, the line over n − 1 acts as a bracket.
3.11.3 Example
The 4th term of an A.P. is −1/2 and the 8th term is 3/2. Find the initial term, the 3rd term, and the sum of the first 100 terms.
Solution
Suppose the A.P. has initial term a and common difference d as before.
Then we obtain
a + 3d = −1/2   (i)  (as the 4th term is −1/2)
a + 7d = 3/2    (ii) (as the 8th term is 3/2)
So here we have two equations in two unknowns, so we solve by subtracting (i) from (ii) to obtain
4d = 2
and so we have found that d = 1/2. Now that we've found d we can use it to find a by simply inserting the value for d into either equation. Inserting it into (ii) yields:
a + 7(1/2) = 3/2 ⇒ a = −2.
We have found a and d and so we can now go on to finish off the question.
The 3rd term will be a + 2d and so is −2 + 2(1/2) = −2 + 1 = −1,
and the sum of the first 100 terms is given by
S_100 = (100/2)(2(−2) + (100 − 1)(1/2))
      = 50(−4 + 99(1/2))
      = 50(45.5)
      = 2275
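A few lines of Python (my own illustration) confirm the values just found by building the progression directly:

    a, d = -2, 0.5          # initial term and common difference found above

    def nth_term(n):
        """The nth term of the A.P., a + (n - 1)d."""
        return a + (n - 1) * d

    print(nth_term(4), nth_term(8))                    # -0.5 and 1.5, matching the question
    print(nth_term(3))                                 # -1.0, the 3rd term
    print(sum(nth_term(n) for n in range(1, 101)))     # 2275.0, agreeing with the formula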
3.11.4 Example
The sum of the first ten terms of an A.P. is −10; the sum of the first hundred terms is 8900. Find the initial term and common difference of the progression.
Solution
First of all we formulate the problem in mathematics rather than words.
S_10 = (10/2)(2a + 9d) = −10
⇒ 5(2a + 9d) = −10
⇒ 2a + 9d = −2   (i)
S_100 = (100/2)(2a + 99d) = 8900
⇒ 50(2a + 99d) = 8900
⇒ 2a + 99d = 178   (ii)
We now have two equations in two unknowns which we can subtract as they are to yield
90d = 180 ⇒ d = 2.
We can now insert this value for d into equation (i) giving
2a + 9(2) = −2 ⇒ 2a = −2 − 18 ⇒ 2a = −20 ⇒ a = −10.
3.12 Geometric Progressions
A geometric progression (or G.P. for short) is a sequence of numbers of the
form
a, ar, ar², ar³, . . . , ar^(n−1), . . .
Note that the nth term is ar^(n−1), and not ar^n, similarly to A.P.s, because there is no r in the first term. In this context, we call
• a the initial term;
• r the common ratio.
3.12.1 Examples
Here are some examples of G.P.s
1. 1, 2, 4, 8, 16, . . . is a G.P. with a = 1 and r = 2.
2. 1, 1/2, 1/4, 1/8, 1/16, . . . is a G.P. with a = 1 and r = 1/2.
3.12.2 Sum of a geometric progression
Let
a, ar, ar², ar³, . . . , ar^(n−1), . . .
be a G.P.. Then the sum of the first n terms of the G.P., denoted S_n, is given by
S_n = a(1 − r^n) / (1 − r)
provided that r ≠ 1.
Although it is not examinable, a simple proof is presented below:
We can write
S_n = a + ar + ar² + · · · + ar^(n−2) + ar^(n−1)
and multiplying both sides by r we obtain
rS_n = ar + ar² + · · · + ar^(n−1) + ar^n
just as in the proof of the sum for A.P.s, we combine these two equations, this time by subtraction, to obtain
S_n − rS_n = a − ar^n ⇒ (1 − r)S_n = a(1 − r^n)
and dividing by 1 − r on both sides now gives us the result. Of course, the r ≠ 1 restriction comes from this division, ensuring that we are not dividing by zero.
3.12.3 Sum to infinity
Sometimes, we may be asked to find the sum to infinity of a G.P., that is, the sum of all the terms added together. This is not a relevant question with A.P.s, by the way, as we keep on adding a sizeable chunk each time. The only way the sum to infinity can exist is if the amount we add on with each new term shrinks to negligible levels as n tends to infinity. For example, consider that |r| < 1 in a given G.P.; then every subsequent power of r gets smaller and smaller, numerically speaking.
The sum to infinity is given by
S_∞ = a / (1 − r);   |r| < 1
3.12.4 Example
The 6th term of a geometric progression is 32, and the 11th term is 1. Determine
1. the common ratio of the progression;
2. the initial term of the progression;
3. the sum to infinity of the progression.
Solution
1. We begin by formulating the given information as equations.
ar⁵ = 32   (i)  (6th term is 32)
ar¹⁰ = 1   (ii) (11th term is 1)
This gives us two equations in two unknowns. There are two main methods of solving these; you could for example rearrange each equation to give a and then put those two expressions equal to each other, and solve for r. We will solve by simply dividing equation (i) into (ii) to obtain
ar¹⁰ / ar⁵ = 1/32 ⇒ r⁵ = 1/32
You can see that the a's cancel, and the r's combine according to the laws of indices (see 3.5). We find r by taking the 5th root on both sides, showing that r = 1/2.
2. We now have to obtain a, and we can do this by inserting our discovered value for r into one of our equations. If we pick (i) we obtain
a(1/2)⁵ = 32 ⇒ a/32 = 32 ⇒ a = 32 × 32 = 1024.
3. We have now found the two numbers which characterise this progression, and we are hence in a position to answer almost any subsequent question on it.
The sum to infinity exists if |r| < 1, and in our example this is certainly the case. We can then plug in our values
S_∞ = 1024 / (1 − 1/2) = 2048.
3.12.5 Example
Here is a slight reworking of a classical example.
Take a chess board (which consists of an 8 by 8 grid of squares). Place a single 10 pence piece on the first square, two on the second, four on the third and so on, doubling for all 64 squares. How much money is there on the last square, and how much money is there on the entire board?
Solution
This is a geometric progression, as we are multiplying by a constant (2) each time. Therefore r = 2, and the first term is 1 (for 1 coin) so a = 1. The number of coins on the 64th square is given by ar⁶³, which is 1 × 2⁶³ coins, worth about 9.223 × 10¹⁷ pounds, or in more usual notation, an incredible 922 337 000 000 000 000 pounds.
The quantity of money on the entire board is the sum of the first 64 terms, given by
1(1 − 2⁶⁴) / (1 − 2) = (1 − 2⁶⁴) / (−1) = 2⁶⁴ − 1
coins, which is as close to 2⁶⁴ as makes no difference (the error in this case is 10p). So the money on the board is about 1.845 × 10¹⁸ pounds, or 1 844 670 000 000 000 000 pounds.
This example demonstrates the incredible ability of geometric progressions to grow at speed.
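The exact figures are easy to obtain with integer arithmetic. A short Python sketch of this chessboard calculation (my own illustration; amounts in pounds, assuming 10p coins):

    # Coins double on each of the 64 squares: a geometric progression with a = 1, r = 2.
    last_square_coins = 2**63
    total_coins = sum(2**k for k in range(64))    # equals 2**64 - 1

    # Each coin is worth 10 pence, i.e. one tenth of a pound.
    print(last_square_coins / 10, "pounds on the last square")
    print(total_coins / 10, "pounds on the whole board")
    print(total_coins == 2**64 - 1)               # True: matches the closed-form sum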
Chapter 4
Trigonometry
Trigonometry is an essential skill for performing many real life mathematical
calculations, including the analysis of simple harmonic motion and waves.
There is a strong link between trigonometry and complex numbers as
evidenced by the polar and exponential form of complex numbers.
We recall the most elementary principles quickly.
4.1 Right-angled triangles
We first recall the trigonometry of simple right-angled triangles.
4.1.1 Labelling
In a right-angled triangle, we call the side opposite the right angle (that is,
the side that does not touch the right angle) the hypotenuse.
Given any other angle (which clearly cannot be other than acute), we
label sides relative to that angle. Thus the side opposite this acute angle is
called simply the opposite and the remaining side becomes the adjacent.
It should be clear that the opposite and adjacent will switch, depending
on what acute angle is considered, while the hypotenuse remains the same
side at all times.
Figure 4.1: Labelling right-angled triangles
4.1.2 Pythagoras Theorem
We remember this most fundamental of theorems concerning the triangle, proved before the birth of Christ and known in principle in ancient Egypt. The proof is usually attributed to Pythagoras¹.
Given a right-angled triangle with hypotenuse length c and other sides of length a and b,
a² + b² = c²
Thus, this relation may be used when we have two sides of a right-angled triangle and wish to find the third.
With fewer sides known, we must use angles to solve the triangle completely.
4.1.3 Basic trigonometric functions
Recall that
sin θ = O/H;   cos θ = A/H;   tan θ = O/A
where O is the opposite relative to θ, A is the adjacent relative to θ, and H is the hypotenuse.
¹ Pythagoras (c. 560-480 BC) was a Greek mathematician and mystic who founded a cult in which astronomy, geometry and especially numbers were central. Pythagoras was exiled to Metapontum around 500 BC for the political ambitions of the cult.
The words SOH, CAH and TOA are often used as mnemonics for these equations. There's a variety of ways of remembering these, and it's important you are able to do so.
We define some other simple short-cut functions relative to these.
We shall define
sec θ = 1/cos θ;   csc θ = 1/sin θ;   cot θ = 1/tan θ
4.1.4 Procedure
The trigonometry of right-angled triangles is particularly simple. The method
we follow tends to follow these steps
1. Identify 3 things, 2 you know and 1 you want;
2. Determine the correct equation to use (perhaps using a mnemonic);
3. Insert what you know into the equation;
4. Rearrange to find what you want.
In step 1, you might for example have two sides and want the third side - this is accomplished using Pythagoras' theorem and nothing else. Generally however, two of the things will be sides and one will be an angle (other than the right angle itself).
For example, we might know one angle and a side and use this to find another side. We might have two sides and use this to find an angle.
4.1.5 Example
A plane takes off and maintains an angle of 15° to the horizontal while it covers 1000m as measured along the ground. Calculate
(a) the distance it has travelled in the air; (b) the height it has achieved;
by this time.
Solution
If we draw the triangle which reflects this situation then the path of the plane is the hypotenuse, the length of 1000m along the ground is the adjacent and the height is the opposite.
(a) We know the angle and the adjacent; we want the hypotenuse. Looking above (see 4.1.3) we see we need cos, so write the equation, filling in the blanks.
cos 15° = 1000/H ⇒ H cos 15° = 1000 ⇒ H = 1000/cos 15° = 1035.276m
(b) We could now use Pythagoras' theorem (see 4.1.2) for this, but we shall use trigonometry to demonstrate. We have the angle and the adjacent, and we want the opposite. We see that tan is the function that associates these together.
tan 15° = O/1000 ⇒ O = 1000 tan 15° ⇒ O = 267.949m
All figures accurate to 3 decimal places.
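The same two calculations in Python (my own illustration; note that the math module works in radians, so the angle must be converted first):

    import math

    angle = math.radians(15)            # the climb angle, converted from degrees
    adjacent = 1000                     # metres measured along the ground

    hypotenuse = adjacent / math.cos(angle)   # distance travelled through the air
    opposite = adjacent * math.tan(angle)     # height gained

    print(round(hypotenuse, 3))   # 1035.276
    print(round(opposite, 3))     # 267.949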
4.2 Notation
We now introduce a short-hand notation in common use. If you have a specific trig function, which we shall denote as fn, then:
fnⁿ(x) = (fn(x))ⁿ
For example
sin²x = (sin x)²
We only use this notation when the index n is a positive integer. This is to prevent confusion when using −1.
The notation
fn⁻¹ x
represents the correct inverse function; for example
sin⁻¹ x
is the function which takes a sine, and returns the angle.
Please note that
sin⁻¹ x ≠ 1/sin x.
Remember that all trig functions work on angles, and they are meaningless without them. Therefore sin 2θ does not mean sin × 2θ; rather it is similar to the notation √2, which does not mean √ × 2.
4.2.1 Example
A ramp which travels 2m along the ground, and finishes 1m up, is constructed. At what angle is this ramp to the ground?
Solution
We know the opposite (1m) and the adjacent (2m) and we want the angle. The tan function relates these things together.
tan θ = 1/2
We now need to obtain θ, so we take the inverse tan function on both sides to obtain
tan⁻¹(tan(θ)) = tan⁻¹(0.5)
where the brackets are added for emphasis. Now the tan⁻¹ and tan functions cancel each other (that's the whole point) so we obtain
θ = tan⁻¹(0.5) = 26.565°
4.3 Table of values
The exact values of the sine, cosine and tan functions at important angles are given in table 4.1.
Note that tan 90° is not defined.
4.4 Graphs of Functions
Consider figure 4.2.
A point P moves around the circumference of a circle of radius 1, and its angle from the positive x axis is taken to be θ.
Angle   sin     cos     tan
0       0       1       0
30°     1/2     √3/2    √3/3
45°     √2/2    √2/2    1
60°     √3/2    1/2     √3
90°     1       0       undefined
180°    0       −1      0
Table 4.1: Table of trigonometric values
Observe that
sin θ = y/1;   cos θ = x/1;   tan θ = y/x.
So to summarise, in this case we have
sin θ = y;   cos θ = x;   tan θ = y/x
As we let θ move, even to angles larger than 90°, we can work out the values of the major trigonometric functions and plot their graphs, and these can be seen in figure 4.3.
We can see that sin and cos are bounded between −1 and 1, while tan takes extreme values as cos θ = x approaches 0.
4.5 Multiple Solutions
Examining the graphs above shows an interesting situation. If we try to solve the following equation
sin θ = 1/2
this corresponds to where the line y = 1/2 cuts the graph of y = sin θ, but this clearly happens at several points.
Of course, the trig functions repeat themselves every 360° and so we need only look for solutions in that range.
Figure 4.2: Generating the trigonometric graphs
To aid us we use a diagram often known as a CAST diagram. This diagram is a helpful memory aid for which functions are positive and which are negative for different angles. Armed with this and the symmetry of trigonometric functions we can find an easy way to locate multiple solutions.
4.5.1 CAST diagram
We use a type of diagram called a CAST diagram (see figure 4.4 on page 57) to remind us of the behaviour of the three basic trig functions. The name of the diagram comes from the single letter to be found in each quadrant. The significance is as follows
A  All functions are positive in this quadrant;
S  Only sin is positive, cos and tan are negative;
T  Only tan is positive, sin and cos are negative;
C  Only cos is positive, sin and tan are negative.
The use of the diagram is detailed below. Note that the diagram here is labelled in the range 0° to 360°, but it is as easy to label it in the range −180° to 180°.
Figure 4.3: The graphs of sin and cos
4.5.2 Procedure
We can follow this step-by-step method to find multiple solutions, given a principal solution from a calculator or tables.
1. Draw the principal solution on the CAST diagram;
2. Work out the angle this solution is to the horizontal;
3. Draw the other three lines with the same angle to the horizontal;
4. Examine the sign of the RHS of our equation, that is whether our target number is positive or negative;
5. Select only those solutions whose quadrants are the right sign for the function you are using.
4.5.3 Example
Solve the equations
1. sin θ = 1/2
2. sin θ = −1/3
in the range −180° < θ ≤ 180°.
Figure 4.4: The CAST diagram
Solution
1. From a calculator or some other source we can get a principal solution of 30°, and from our CAST diagram our other solution is 150°. This is because sin is positive in the two upper quadrants, and our equation is sin θ = +1/2.
2. Our principal solution is −19.471°, and from our CAST diagram our other solution is −160.529° (all to 3 decimal places). This is because sin is negative in the two lower quadrants, and our equation is sin θ = −1/3.
Other ranges
If we had been asked to solve in the range 0° ≤ θ < 360°, the solutions of part one would be unchanged. The solutions for part two would be read from a CAST diagram labelled appropriately to obtain 199.471° and 340.529°.
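A brute-force way to see all the solutions in a range is to scan it numerically. The Python sketch below is my own rough illustration, not the CAST construction itself: it uses a half-degree grid, a sign-change test and bisection, and recovers the solutions of sin θ = −1/3 in 0° ≤ θ < 360°.

    import math

    def solutions_sin_equals(c, start=0.0, stop=360.0, step=0.5):
        """Approximate all angles (degrees) in [start, stop) where sin(angle) = c,
        by looking for sign changes of sin(angle) - c and refining by bisection."""
        found = []
        a = start
        while a < stop:
            b = min(a + step, stop)
            fa = math.sin(math.radians(a)) - c
            fb = math.sin(math.radians(b)) - c
            if fa == 0:
                found.append(a)
            elif fa * fb < 0:                      # a root lies between a and b
                lo, hi = a, b
                for _ in range(60):                # bisection refinement
                    mid = (lo + hi) / 2
                    if (math.sin(math.radians(lo)) - c) * (math.sin(math.radians(mid)) - c) <= 0:
                        hi = mid
                    else:
                        lo = mid
                found.append(round((lo + hi) / 2, 3))
            a = b
        return found

    print(solutions_sin_equals(-1/3))   # [199.471, 340.529], matching the CAST result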
4.6 Scalene triangles
Given an equilateral, or isosceles triangle, it is often possible to split it down
the middle to form two right angled triangles.
In this way it is usually not necessary to employ sophisticated trigonom-
etry to such triangles.
However, we need more powerful techniques to deal with scalene or ir-
regular triangles, with no two sides the same length.
4.6.1 Labelling
For non-right-angled triangles, another naming convention and set of equa-
tions are required. We name the sides a, b and c in an arbitrary order.
We then name the angle opposite side a with A, the angle opposite side
b becomes labelled B and similarly for c and C.
It should be noted that there is a close relationship between angles and
their opposite sides. In particular the larger the angle, the larger the opposite
side.
This labelling is shown in gure 4.5 on page 59.
4.6.2 Scalene trigonometry
Under this naming system, we have the following useful laws.
4.6.3 Sine Rule
In a suitably labelled triangle
sin A / a = sin B / b = sin C / c
This rule can be used when we know one side and its corresponding angle. Then, given another side we can find its corresponding angle, or conversely given the angle we can find the other side.
Figure 4.5: Labelling a scalene triangle
When finding angles we will need to solve an equation of the form sin θ = x, and in this case we must consider the possibility of multiple solutions (see 4.5). This is because there will be a solution in the range 0° to 90° as usual, but also one in the range 90° to 180°, and because the triangle is not right angled, such a large angle may be possible.
4.6.4 Cosine Rule
In a suitably labelled triangle
a² = b² + c² − 2bc cos A
This could also be written
b² = a² + c² − 2ac cos B
or
c² = a² + b² − 2ab cos C
This rule is very useful when we know two sides and the angle between them. Then, we can calculate the remaining side's length.
We can also rearrange each equation (left as an exercise for the reader) to obtain
cos A = (b² + c² − a²) / (2bc)
allowing us to obtain an angle given all three sides.
For technical reasons there is no ambiguous case for the cosine rule (the second solution will lie in the range of 270° to 360° and thus cannot appear in a triangle).
4.6.5 Example
A conventionally labelled triangle has A = 63°, a = 12 and b = 13. Solve it completely (that is, find all its angles and sides).
Solution
We have one side and its corresponding angle, and another side. This is a case for the sine rule.
sin A / a = sin B / b ⇒ sin 63° / 12 = sin B / 13 ⇒ sin B = 13 sin 63° / 12 = 0.965
Using the inverse function sin⁻¹ we may show that B = 74.853°, but this is no right angled triangle, so we must check for multiple solutions. Using the method shown above (see 4.5) we can show that B = 105.147° is also possible, and there seems to be no reason why this could not be correct also.
We have no choice but to accept that this is an ambiguous question, and find both possible triangles, splitting it into two cases.
(a) Consider the case B = 74.853°; then as all the angles add to 180° we obtain that C = 42.147°. All that remains to be found is c. We could use the sine rule for this, but we shall use the cosine rule for variety.
c² = a² + b² − 2ab cos C ⇒ c² = 12² + 13² − 2(12)(13) cos 42.147° = 81.675.
Thus we obtain c = 9.037. The complete solution is
A = 63°, B = 74.853°, C = 42.147°, a = 12, b = 13, c = 9.037
(b) Consider the case B = 105.147°. Then C = 11.853°, and using the cosine rule shows that
c² = 12² + 13² − 2(12)(13) cos 11.853° = 7.652
so that c = 2.766 and the complete solution in this case is
A = 63°, B = 105.147°, C = 11.853°, a = 12, b = 13, c = 2.766
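The two cases can be checked numerically; a short Python sketch of this particular example (my own illustration, not part of the notes):

    import math

    A_deg, a, b = 63.0, 12.0, 13.0

    # Sine rule: sin B / b = sin A / a, giving two candidate angles for B.
    sin_B = b * math.sin(math.radians(A_deg)) / a
    for B_deg in (math.degrees(math.asin(sin_B)), 180 - math.degrees(math.asin(sin_B))):
        C_deg = 180 - A_deg - B_deg                       # angles of a triangle sum to 180
        # Cosine rule for the remaining side: c^2 = a^2 + b^2 - 2ab cos C.
        c = math.sqrt(a**2 + b**2 - 2*a*b*math.cos(math.radians(C_deg)))
        print(round(B_deg, 3), round(C_deg, 3), round(c, 3))
    # Prints 74.853 42.147 9.037 and then 105.147 11.853 2.766, the two triangles above.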
4.7 Radian Measure
There is nothing special about the degree. The fact that we have 360 degrees in a circle is a historical accident, and has no mathematical significance.
There is a natural unit of angle, which is mathematically simple. This unit is called the radian.
There are 2π radians in a circle, so the radian is quite big, there being only slightly over 6 in the whole circle.
4.7.1 Conversion
To convert from degrees to radians we divide by 360 and multiply by 2π, or alternatively multiply by π/180.
We multiply by 180/π to convert from radians to degrees.
Some significant angle conversions are given in table 4.2.
4.7.2 Length of Arc

In a circle with radius r and an arc² subtended by an angle θ measured in radians (see figure 4.6), the arc length s is given by

s = rθ

² An unbroken section of the circumference
Degrees   Radians
0         0
30°       π/6
45°       π/4
60°       π/3
90°       π/2
180°      π
360°      2π

Table 4.2: Conversion between degrees and radians
Proof

We present a short, non-examinable, proof.
The ratio that the angle is of the whole angle in the circle is the same as the ratio of the arc length to the whole circumference. That is

θ / (2π) = s / (2πr)  ⇒  s = 2πrθ / (2π)  ⇒  s = rθ

Exercise

It is left as an exercise for the reader to show that if the angle was in degrees the formula would be

s = πrθ / 180

which is clearly less natural.
4.7.3 Area of Sector

In a circle with radius r and a sector³ subtended by an angle θ measured in radians, the sector area A is given by

A = (1/2) r² θ

³ A region bounded by two radii and an arc

Figure 4.6: Length of Arc, Area of Sector

Proof

We present a short, non-examinable, proof.
The ratio that the angle is of the whole angle in the circle is the same as the ratio of the sector area to the whole area. That is

θ / (2π) = A / (πr²)  ⇒  A = πr²θ / (2π)  ⇒  A = (1/2) r² θ

Exercise

It is left as an exercise for the reader to show that if the angle was in degrees the formula would be

A = πr²θ / 360

which is clearly less natural.
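The conversions and the two radian formulas can be collected into a few lines of Python; this is a sketch added for illustration (the function names are my own, not from the notes).

```python
import math

def deg_to_rad(d):
    return d * math.pi / 180.0    # multiply by pi/180

def arc_length(r, theta):
    return r * theta              # s = r * theta, theta in radians

def sector_area(r, theta):
    return 0.5 * r * r * theta    # A = (1/2) r^2 theta

theta = deg_to_rad(60)            # 60 degrees = pi/3 radians
print(theta, arc_length(2.0, theta), sector_area(2.0, theta))
```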
4.8 Identities

Trigonometry is rich in identities. Identities are equations, but with a special significance. For example

x + 2 = 4

is an algebraic equation. It is true for some values of x (in this case true for one value), and false for others. Now this equation

x + x = 2x

is always true, regardless of the value of x. This is an identity, an equation that holds true for all values of the variables in it.
We may use trigonometric identities to change the form of expressions to make them easier to deal with.
4.8.1 Basic identities

tan θ = sin θ / cos θ

sin²θ + cos²θ = 1

1 + tan²θ = sec²θ

cot²θ + 1 = csc²θ

4.8.2 Compound angle identities

sin(θ ± φ) = sin θ cos φ ± sin φ cos θ

cos(θ ± φ) = cos θ cos φ ∓ sin θ sin φ

tan(θ ± φ) = (tan θ ± tan φ) / (1 ∓ tan θ tan φ)

4.8.3 Double angle identities

sin(2θ) = 2 sin θ cos θ

cos(2θ) = cos²θ − sin²θ

tan(2θ) = 2 tan θ / (1 − tan²θ)
4.9 Trigonometric equations

In some applications trigonometry leads an almost independent life from angles. For example the expression

A sin(ωt + φ)

is often used to represent a waveform with angular frequency ω and a so called phase shift φ.
When we deal with equations like this we need to use all we have learned so far of multiple solutions and/or trigonometric identities.

4.9.1 Example

Solve the following trigonometric equation for values of θ in the range 0° ≤ θ < 360°:

sin θ = cos θ

Solution

This is a simple example. Before learning about identities this would have presented us with a severe problem; we might have been able to solve it by graphical means, but exact solutions would be hard to find.
In this case I will first divide by cos θ on both sides. Of course that is a little dangerous if that can be zero, so we need to be careful, but when cos θ is zero sin θ = ±1, so the two are not equal and there are no solutions like this. Therefore we can exclude this possibility and divide. This obtains

sin θ / cos θ = cos θ / cos θ  ⇒  tan θ = 1

(see 4.8.1) which is a much simpler, routine equation. A calculator will yield a principal solution of 45°, and using our CAST diagram (see 4.5) we can find the other solution in the range to be 225°.
So

θ = 45°; θ = 225°
4.9.2 Example

Write the following trigonometric equation in terms of sin θ only. Hence or otherwise solve the equation for values of θ in the range −180° ≤ θ < 180°:

6 cos²θ + sin θ = 5

Solution

The first problem consists of getting rid of the cos²θ term, which is not in terms of sin θ. We recall from 4.8.1 that

sin²θ + cos²θ = 1  ⇒  cos²θ = 1 − sin²θ

so inserting this into our equation, we obtain

6(1 − sin²θ) + sin θ = 5  ⇒  6 − 6 sin²θ + sin θ = 5

⇒  6 sin²θ − sin θ − 1 = 0

which is actually a quadratic equation (see 3.7), which can be made more clear by letting

s = sin θ  ⇒  s² = sin²θ

so we obtain

6s² − s − 1 = 0  ⇒  (2s − 1)(3s + 1) = 0  ⇒  s = 1/2; s = −1/3

which we could have solved using the solution formula of course. Now let us not forget that s was a temporary place holder; we must now solve the equations

sin θ = 1/2; sin θ = −1/3

which we solved previously in example 4.5.3, so we can write down our solutions as

θ = 30°; θ = 150°; θ = −19.471°; θ = −160.529°
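A numerical cross-check of the four solutions is straightforward; the sketch below (added for illustration, not part of the original notes) follows the same substitution used above.

```python
import math

# Solve 6 cos^2(t) + sin(t) = 5 for -180 <= t < 180, via s = sin(t),
# which gives 6s^2 - s - 1 = 0, i.e. (2s - 1)(3s + 1) = 0.
solutions = []
for s in (0.5, -1.0 / 3.0):
    p = math.degrees(math.asin(s))        # principal solution
    for t in (p, 180.0 - p):              # sin(180 - t) = sin(t)
        t = (t + 180.0) % 360.0 - 180.0   # wrap into [-180, 180)
        solutions.append(round(t, 3))
print(sorted(solutions))   # [-160.529, -19.471, 30.0, 150.0]
```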
4.9.3 Example

Suppose that A₁ = sin(ωt), A₂ = 2 cos(ωt) are two waves; what is the resultant wave if they are superimposed?

Solution

The principle of superposition says that we can algebraically add these two signals to find the resultant, but we want to find the resulting wave exactly.
When we add the signals together we get

sin(ωt) + 2 cos(ωt)

and let us suppose the resulting signal is of the form r sin(ωt + φ); then⁴ by the results in 4.8.2 we can expand this out to obtain

r(sin(ωt) cos φ + sin φ cos(ωt))

Comparing these two expressions gives us

r cos φ = 1   (i)
r sin φ = 2   (ii)

If we divide equation (ii) by equation (i), then using the results from 4.8.1 we obtain

tan φ = 2  ⇒  φ = 63.435°

Either using this to find r directly, or by squaring (i) and (ii) and adding together, we obtain

r² cos²φ + r² sin²φ = 5  ⇒  r²(cos²φ + sin²φ) = 5  ⇒  r² = 5  ⇒  r = √5

So the resultant wave is

A = √5 sin(ωt + 63.435°)

⁴ We could have just as easily used r cos(ωt + φ)
Chapter 5
Complex Numbers
We saw in 3.5 that we have a difficulty with certain square roots, namely the square roots of negative numbers. We introduce the set of Complex Numbers to address this situation.

5.1 Basic Principle

Rather than dealing with the square roots of all negative numbers, we can reduce the problem to one such square root. For example, we can use the laws of surds detailed in 3.5 to split negative square roots up.

√(−9) = √9 √(−1) = 3√(−1)

√(−25) = √25 √(−1) = 5√(−1)

In general, if we take a to be some positive real number

√(−a) = √a √(−1)

so in all cases the problem comes down to the square root of −1. Now we know that no real number can have this value, but we shall assign this special square root a specific symbol, namely j.
Please note that i is used throughout the mathematical world itself, both in textbooks and in calculators, while engineers often prefer j to represent this square root. You should be aware of this for background reading.
Thus we now say that

√(−9) = 3j;  √(−25) = 5j;  √(−2) = √2 j.

Remember, none of these are real numbers.
We also sometimes define j by the equivalent equation

j² = −1.
5.1.1 Imaginary and Complex Numbers

We shall call any number of the form yj an imaginary number, where y is a real number.
For many people, this terminology produces much of their mistrust and difficulty with this number system. We have invented these numbers in order to solve a specific type of equation, that is true, but it is not the first time. The number zero, and all negative numbers, were invented in order to solve equations which couldn't be handled with the positive numbers alone.
Numbers of the form x + yj are called complex numbers, where x and y are real numbers.
Observe that complex numbers are in general hybrid numbers, having a real section or component and an imaginary component.
All real numbers can be thought of as being complex, as for example we can write

π = π + 0j

and similarly all imaginary numbers are complex, as for example

3j = 0 + 3j

so that the complex numbers encompass all the numbers we have used before, and add more besides.

5.2 Examples

Here are some examples of equations that have complex number solutions, and which we could not solve previously. In particular, we now see that quadratic equations (see 3.7) which have no real solutions in fact have two complex solutions.
1. x² + x + 1 = 0

Recall that we showed this to be

x = (−1 ± √(−3)) / 2 = (−1 ± √3 j) / 2.

So our two solutions are

x = (−1 + √3 j) / 2;  x = (−1 − √3 j) / 2

2. x² − 6x + 13 = 0

Using the quadratic solution formula once more we obtain

x = (6 ± √((−6)² − 4(1)(13))) / 2 = (6 ± √(−16)) / 2 = (6 ± 4j) / 2.

So again we have two solutions, which are

x = 3 + 2j;  x = 3 − 2j
5.3 Argand Diagram Representation

The real numbers are frequently represented by a line, often called the number line or simply the real line.
Complex numbers are inherently two dimensional and cannot be presented on a line; instead we represent these on a plane. We often talk of the complex plane as an interchangeable name for the set of complex numbers, and diagrams in the form of two dimensional graphs are called Argand diagrams.
We take the horizontal axis to be, essentially, the real line itself, and we call it the real axis. We call the vertical axis the imaginary axis. Then, given a complex number z = x + yj we represent it as the point (x, y) on the diagram.
Note that every point on the diagram corresponds to a unique complex number, and every complex number has a unique point on the diagram.
Figure 5.1 shows a typical Argand diagram with two complex numbers, z and w, z = 3 + 5j and w = 4 − 2j.
5.4 Algebra of Complex Numbers

It is surprisingly easy to perform algebraic operations with complex numbers, if we simply do not overly concern ourselves with the origin of j.

Figure 5.1: The Argand diagram

In the remainder of this section, we shall assume that z = a + bj and w = c + dj and that a, b, c and d are all real numbers.

5.4.1 Addition

Adding two complex numbers is extremely elementary.

z + w = (a + bj) + (c + dj) = (a + c) + (b + d)j

Essentially we treat the js as any other algebraic expression and manipulate it normally. Note that the result of the addition is also a complex number.
Examples

Here are some concrete examples involving specific numbers.

1. (2 + 3j) + (−4 + j) = −2 + 4j

2. (j − 4) + (−2 − 3j) = −6 − 2j

3. (1 + 2j) + (2 + 3j) + (−3 − 5j) = 0

5.4.2 Subtraction

Subtraction is really no different than the addition of a negative number.

z − w = (a + bj) − (c + dj) = (a − c) + (b − d)j

Examples

Here are some concrete examples involving specific numbers.

1. (2 + 3j) − (−4 + j) = 6 + 2j

2. (j − 4) − (−2 − 3j) = −2 + 4j

3. (3 + j) − (3 + 4j) = −3j

5.4.3 Multiplication

Multiplication of complex numbers usually requires the expansion of brackets. Additionally it often gives rise to a term containing j²; at first glance we might consider such a term imaginary, but recall that j² = −1, so this term is real. Here is the process.

z × w = (a + bj)(c + dj) = ac + adj + bcj + bdj² = ac + (ad + bc)j + (−1)bd = (ac − bd) + (ad + bc)j

Because of the confusion caused by the power of j, it is always a good idea to reduce such powers as quickly as possible.
Powers of j

We can deal with powers of j as a special case.

j¹ = j
j² = −1
j³ = j × j² = −1 × j = −j
j⁴ = j² × j² = −1 × −1 = 1
j⁵ = j × j⁴ = j × 1 = j

Table 5.1: Powers of j

We have shown how the powers may be calculated in table 5.1. Note that by j⁵ we are repeating ourselves; we can go round and round this table for any power of j.
Examples

Here are some concrete examples involving specific numbers.

1. (2 + 3j) × (−4 + j) = −8 − 12j + 2j + 3j² = −8 − 3 − 10j = −11 − 10j

2. (2 + 3j)² = (2 + 3j)(2 + 3j) = 4 + 12j + 9j² = −5 + 12j

3. (2 + 3j)(2 − 3j) = 4 + 6j − 6j − 9j² = 4 + 9 = 13

4. (2 + 3j)³ = (−5 + 12j)(2 + 3j) = −10 + 24j − 15j + 36j² = −46 + 9j

We could use the binomial expansion (see 3.10) to work out high powers of complex numbers, but it is quite tedious to work out all the powers of j. It is easier to use a technique still to come, De Moivre's theorem (see 5.7).
5.4.4 Division

Division is perhaps the least routine of all the arithmetic operations. We begin by multiplying top and bottom by another, specific, complex number. This leads to two multiplications - it's a more drawn out process.
Remember that multiplying a fraction by a number top and bottom does not change its value, just its form.

z/w = (a + bj)/(c + dj) = ((a + bj)(c − dj)) / ((c + dj)(c − dj)) = (ac + bcj − adj − bdj²) / (c² + cdj − cdj − d²j²) = ((ac + bd) + (bc − ad)j) / (c² + d²)

This doesn't exactly look more friendly, but looking closely you will see that we now only have a real number on the bottom line. The division is now routine:

= (ac + bd)/(c² + d²) + ((bc − ad)/(c² + d²)) j

It is rather more pleasant to see this in real examples.
5.4.5 Examples

Here are some concrete examples involving specific numbers.

1. (2 + 3j)/2 = 1 + (3/2)j

No trickery is required here; it's easy to divide by real numbers.

2. (2 + 3j)/(−4 + j) = ((2 + 3j)(−4 − j)) / ((−4 + j)(−4 − j))

Performing the multiplications fully gives rise to

= (−5 − 14j)/17 = −5/17 − (14/17)j

Note that dividing the 17 in finally is not required; it is done here to stress that the answer really is just a complex number in x + yj form.

3. (2 + 3j)/j = ((2 + 3j)(j)) / ((j)(j)) = (2j + 3j²) / j²

Following the method above religiously, we should have multiplied by −j top and bottom, not j; remember, we reverse the sign on the imaginary section. However, the aim of this is to create a real number on the bottom line, and in this case we can obtain that simply by multiplying by j. It's quite OK to multiply by −j though, and some people would prefer this.

= (−3 + 2j)/(−1) = 3 − 2j
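As a quick check of the examples above, Python's built-in complex type (which, like engineers, writes the imaginary unit as j) reproduces each result. This sketch is added for verification only and is not part of the original notes.

```python
z = 2 + 3j
w = -4 + 1j

print(z + w)                 # (-2+4j)
print(z - w)                 # (6+2j)
print(z * w)                 # (-11-10j)
print(z / w)                 # (-0.294...-0.823...j), i.e. -5/17 - (14/17)j
print((2 + 3j) * (2 - 3j))   # (13+0j), a real number
print((2 + 3j) / 1j)         # (3-2j)
```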
5.5 Definitions

5.5.1 Modulus

The modulus of a complex number z = x + yj, written |z|, is given by

|z| = √(x² + y²).

Note that we met the concept of modulus before (see 3.8.1) and so a new definition seems odd. Close examination will show that this definition is exactly the same as the previous modulus on the real numbers (note that in this case y = 0 and so |z| = √(x²)).

5.5.2 Conjugate

The conjugate of a complex number z = x + yj, written z* or z̄, is given by

z* = x − yj.

On an Argand diagram this represents the reflection of the number in the real axis. Note that

(z*)* = z.

The conjugate is particularly important in the operation of division, because

z z* = |z|²

which is a real number.
Observe that when a quadratic equation has two complex solutions, they are conjugates of each other.
5.6 Representation 76
5.5.3 Real part
The real part of a complex number z = x + yj, denoted Re(z) or 1(z) is
given by
1(z) = x.
This is a highly intuitive denition.
5.5.4 Imaginary part
The imaginary part of a complex number z = x yj, denoted Im(z) or (z)
is given by
(z) = y.
Warning
Note that, confusingly, the imaginary part of a complex number is a real
number. It is only the coecient of j and does not include j itself.
5.6 Representation

There is more than one way to represent a complex number.

5.6.1 Cartesian form

We normally use the form z = x + yj, representing the point (x, y) on our Argand diagram.
This form of representation is called cartesian form, after René Descartes.¹

¹ René Descartes (1596-1650) was a French mathematician and philosopher responsible for popularising analytical co-ordinate geometry using the system of co-ordinates that now carry his name (in part at least).
5.6.2 Polar form

Sometimes, especially in problems with strong circular symmetry, we use a form of representation called polar form. In this case, we examine the location of z on the Argand diagram (see figure 5.2) and work out two new parameters.

Figure 5.2: Polar representation of a complex number

First of all we work out r, the distance of z from 0. If z = x + yj in cartesian form, this distance will be

r = √(x² + y²)

which is clearly the previous definition for |z|, the modulus of z.
Secondly, we work out the angle θ of the line connecting z and 0, measured from the positive real axis. Recall that, as in trigonometry, angles are measured anticlockwise from this line. This parameter is called the argument of z, and is often written arg z. It is easy to show that

θ = arg z = tan⁻¹(y/x),

but note that you should take care to ensure you have the correct solution. That is, you should confirm the answer given by the calculator or tables is really in the same quadrant as the complex number, and if not, use the multiple solutions techniques to find the correct solution (see 4.5).
It can be shown that

z = x + yj = r(cos θ + j sin θ).

So this polar form doesn't change the value, as for example r cos θ = x; it only alters the form of expression.

Arithmetic

We can rework multiplication and division in polar form to get a useful result. Let us take two complex numbers, in polar form.

z = r(cos θ + j sin θ);  w = s(cos φ + j sin φ)

Then

z × w = rs(cos θ + j sin θ)(cos φ + j sin φ)
= rs[(cos θ)(cos φ) + j(sin θ)(cos φ) + j(cos θ)(sin φ) + j²(sin θ)(sin φ)]
= rs[(cos θ)(cos φ) − (sin θ)(sin φ) + j((cos θ)(sin φ) + (sin θ)(cos φ))]

which, by two of the identities in 4.8, can be written as

z × w = rs(cos(θ + φ) + j sin(θ + φ))

It is left to the reader as an exercise to show that

z/w = (r/s)(cos(θ − φ) + j sin(θ − φ)).

Thus to multiply two complex numbers in polar form we multiply their moduli and add their arguments. To divide, we divide their moduli and subtract their arguments.
5.6.3 Exponential form

It can be shown, by the power series expansion of all the functions involved, that

e^(jθ) = cos θ + j sin θ.

It follows that

r e^(jθ) = r(cos θ + j sin θ).

Therefore we can represent a complex number in this form also. This expansion is produced using calculus, and therefore, because we are mixing calculus with trigonometry, we must use radian measure for any values of θ.
5.6.4 Examples

Represent the following complex numbers in polar and exponential form.

1. 1 + j

r = √(1² + 1²) = √2;  θ = tan⁻¹(1/1) = π/4

We check our value for θ: this angle corresponds to a complex number in the top right quadrant, which is correct. We proceed.

1 + j = √2 (cos(π/4) + j sin(π/4)) = √2 e^(jπ/4)

2. −1 − j

r = √(1² + 1²) = √2;  θ = tan⁻¹(1/1) = π/4

We check θ: it indicates a complex number in the upper right quadrant. This is clearly wrong, as our number is in the lower left quadrant. Using the techniques described in 4.5 we can determine that the correct solution is 5π/4 or −3π/4 (depending on the range we are using - they are the same angle). Thus

−1 − j = √2 (cos(5π/4) + j sin(5π/4)) = √2 e^(j5π/4)

3. 1 + √3 j

r = √(1² + (√3)²) = 2;  θ = tan⁻¹ √3 = π/3

Our check on θ reveals it to be in the correct quadrant, so

1 + √3 j = 2(cos(π/3) + j sin(π/3)) = 2 e^(jπ/3).

4. 1

r = √(1² + 0²) = 1;  θ = tan⁻¹ 0 = 0

A quick check reveals that this is the correct solution. So

1 = 1(cos 0 + j sin 0) = 1 e^(j0)

Clearly this isn't a very useful representation.

5. j

r = √(0² + 1²) = 1;  θ = tan⁻¹(1/0)

We have a problem here: we cannot solve the argument equation because of division by zero. However, from an Argand diagram we can clearly see that the argument is π/2. Thus

j = 1(cos(π/2) + j sin(π/2)) = 1 e^(jπ/2) = e^(jπ/2) = exp(jπ/2)

which also doesn't look very helpful, but this representation is useful to solve some problems.
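Python's cmath module can reproduce these conversions; the sketch below is an added illustration, not part of the original notes. Note that cmath.polar returns the argument in the range (−π, π], so for −1 − j it gives −3π/4 rather than 5π/4 — the same angle, as remarked in example 2.

```python
import cmath, math

for z in (1 + 1j, -1 - 1j, 1 + math.sqrt(3) * 1j, 1 + 0j, 1j):
    r, theta = cmath.polar(z)            # modulus and argument in radians
    rebuilt = r * cmath.exp(1j * theta)  # exponential form r e^{j theta}
    print(z, round(r, 4), round(theta, 4), rebuilt)
```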
5.6.5 Examples

1. Calculate e^(jπ)

By looking at the exponential representation of a complex number (see 5.6.3) we can see that this is a complex number with modulus 1 and argument π. By looking at this number on an Argand diagram we can see that it is in fact −1. This identity

e^(jπ) = −1

is one of the most famous in mathematics, combining as it does an imaginary number and two irrational numbers in a simple way to produce a simple, integer, result.

2. Calculate j^j

This is another question most easily answered in exponential form. In the previous examples we showed that

j = e^(jπ/2)

so that

j^j = (e^(jπ/2))^j = e^(j²π/2) = e^(−π/2) = exp(−π/2)

which is, surprisingly, an ordinary real number, about 0.2079 to four decimal places.
5.7 De Moivre's Theorem

A simple extension of the multiplication demonstrated in the discussion of the polar form of a complex number (see 5.6.2) is De Moivre's Theorem, which says that if

z = r(cos θ + j sin θ)

then

z^n = r^n (cos nθ + j sin nθ).

That is, to take z to the power of n we take the modulus r to the power of n, and multiply the argument by n.
This result allows us to easily work out real number powers of complex numbers, as well as nth roots.
5.7.1 Examples

Evaluate the following expressions.

1. (1 + √3 j)⁴

We first have to put this number in polar form (see 5.6.2) and we have seen above that

1 + √3 j = 2(cos(π/3) + j sin(π/3))

Invoking De Moivre's theorem says that

(1 + √3 j)⁴ = 2⁴ (cos(4π/3) + j sin(4π/3))

We could put this back in cartesian form if necessary

= 16(−1/2 − (√3/2)j) = −8 − 8√3 j

2. (1 + √3 j)⁶

It is left to the reader to modify the above argument slightly to show that this evaluates to be simply 64, a real number.
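A short numerical check of both examples, added here for illustration (not part of the original notes): the power is computed once via De Moivre's theorem and once directly, and the two agree.

```python
import cmath, math

z = 1 + math.sqrt(3) * 1j
r, theta = cmath.polar(z)                      # r = 2, theta = pi/3

n = 4
via_de_moivre = (r ** n) * cmath.exp(1j * n * theta)   # r^n e^{j n theta}
print(via_de_moivre)    # approximately -8 - 8*sqrt(3) j
print(z ** n)           # direct computation agrees

print((z ** 6).real)    # 64 (up to rounding), as in example 2
```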
5.7.2 Roots of Unity

We have already seen that a real number has two square roots, for example

√25 = ±5;  √(−25) = ±5j.

As an exercise, it is interesting to plot these answers on an Argand diagram. You will note that the square roots are distributed at 180° to each other.
Observe further that

1⁴ = (−1)⁴ = (j)⁴ = (−j)⁴ = 1

which tells us that 1 has at least four fourth-roots, given by 1, −1, j and −j. Draw these solutions on an Argand diagram. Again the roots are distributed perfectly, this time at 90° to each other.
What about the cube roots of 1?
5.7 De Moivres Theorem 83
Cube roots of unity
Clearly the cube roots of unity satisfy the following equations
z =
3

1 z
3
= 1 z
3
1 = 0
and it may be shown (by expansion of the RHS) that
a
3
b
3
= (a b)(a
2
+ ab + b
2
)
and so
z
3
1 = 0 (z 1)(z
2
+ z + 1) = 0
which yields solutions when z 1 = 0, that is when z = 1, which is hardly
a surprise and when
z
2
+ z + 1 = 0
which yields solutions
z =
1 +

3j
2
, z =
1

3j
2
When we plot these solutions on an Argand diagram we nd they are sym-
metrically distributed. The modulus of each solution is 1, and the arguments
are 0

, 120

and 240

respectively.
nth roots of unity
In general then, for the nth root, we can show that the solutions in polar
form (and radians) are
cos
_
2m
n
_
+ j sin
_
2m
n
_
5.7.3 Roots of other numbers

Let z = r(cos θ + j sin θ); then we can add any multiple of 2π radians and the complex number is the same. Thus, if m is an integer

z = r(cos(θ + 2mπ) + j sin(θ + 2mπ))

and thus, by De Moivre's theorem

ⁿ√z = z^(1/n) = r^(1/n) (cos((θ + 2mπ)/n) + j sin((θ + 2mπ)/n))

where m can be any integer, but the range m = 0, 1, . . . , n − 1 covers all the solutions; after this we repeat existing solutions.
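The formula above translates directly into a few lines of Python; this sketch (with a function name of my own choosing) lists all n distinct roots by letting m run from 0 to n − 1.

```python
import cmath, math

def nth_roots(z, n):
    """All n nth roots of z, from r^(1/n) (cos((theta + 2 m pi)/n) + j sin(...))."""
    r, theta = cmath.polar(z)
    return [(r ** (1.0 / n)) * cmath.exp(1j * (theta + 2 * math.pi * m) / n)
            for m in range(n)]

for w in nth_roots(1, 3):      # the cube roots of unity
    print(w, round(abs(w), 4)) # each has modulus 1; arguments 0, 120, 240 degrees
```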
5.8 Trigonometric functions

We saw in 5.6.3 that

e^(jθ) = cos θ + j sin θ

and we know that

cos(−θ) = cos(θ),  sin(−θ) = −sin(θ)

(this can be deduced from the power series expansions of these functions, or their graphs). Thus we know that

e^(−jθ) = cos(−θ) + j sin(−θ) = cos θ − j sin θ

If we add this equation to our first equation we obtain

e^(jθ) + e^(−jθ) = 2 cos θ

and if we subtract the equations we obtain

e^(jθ) − e^(−jθ) = 2j sin θ

which can be rearranged to give

cos θ = (e^(jθ) + e^(−jθ)) / 2

and

sin θ = (e^(jθ) − e^(−jθ)) / (2j)

or more readably

cos θ = (exp(jθ) + exp(−jθ)) / 2

and

sin θ = (exp(jθ) − exp(−jθ)) / (2j)

showing further the clear link between trigonometric and exponential functions. There are times when this form may be easier to handle than the trigonometric function itself.
Chapter 6
Vectors & Matrices
6.1 Vectors

A scalar quantity has magnitude only, but a vector quantity has magnitude and direction.
The real numbers are often called scalars, although when the sign of the number is used it can represent direction in a very limited way (there are only two possible choices, positive and negative).
The complex numbers are often used to represent vectors. We have already seen in 5.6.2 that complex numbers have a magnitude (length or modulus) and a direction (argument).
Vectors are frequently represented by lower case, bold letters. For example we talk about the vector a or b. When handwritten we usually underline the vector as we cannot easily emulate a bold typeface; thus we talk about a or b.
Vectors are often represented pictorially as arrows, with the direction of the line representing the direction of the vector (and so we must label which direction we travel along the line) and the magnitude of the vector represented by the length of the line.

6.1.1 Modulus

The magnitude of a vector is called its modulus. It should be clear that this is totally analogous to the concept of the modulus of a complex number (see 5.5.1), where the modulus represented the length of the line joining the number and zero.
Sometimes the modulus of a vector will be denoted in texts by the same letter as the vector, without the bold typeface, so that for example

|a| = a.

6.1.2 Unit Vector

A unit vector is a vector that has a modulus of 1. That is to say, it is essentially representative of a direction only. To make a vector of a given length in that direction we simply multiply by the modulus we desire.

6.1.3 Cartesian unit vectors

As we mainly deal in three dimensional space or less, and we usually use cartesian coordinates, a special set of vectors has been laid aside for this purpose.

i is the unit vector in the x direction;
j is the unit vector in the y direction;
k is the unit vector in the z direction.

Example

Thus we might have a vector in the following form

a = 3i + 4j + 12k

which represents taking a point 3 units along the x axis, 4 units along the y axis and 12 units along the z axis and finding the arrow which joins the origin to this point.
Modulus

We can work out the modulus of this type of vector easily. If we have a vector of the form

a = xi + yj + zk

then we may show that

|a| = √(x² + y² + z²)

6.1.4 Examples

Find the modulus of the following vectors

1. a = 3i + 4j + 12k

2. b = −i + 2j − k

3. c = 3i − 4k

Solutions

In each case look at the coefficients of the unit vectors.

1. |a| = √(3² + 4² + 12²) = √169 = 13

2. |b| = √((−1)² + 2² + (−1)²) = √6

3. |c| = √(3² + 0² + (−4)²) = √25 = 5
6.1.5 Signs of vectors

We interpret the vector −a as being a vector of the same modulus as a, but in the opposite direction. It is simple to confirm that this works with our cartesian unit vectors.

Example

Let the vector a be given by

a = 2i − 3j + k

Then

−a = −(2i − 3j + k) = −2i + 3j − k

Exercise

Prove that a and −a in the above example have the same modulus.

6.1.6 Addition

We add vectors by combining them "nose to tail". That is, to add a and b we follow a to its endpoint, and then glue b on at this point. See figure 6.1 for a graphical representation of vector addition.
This corresponds to the natural addition in the cartesian unit vectors.

Figure 6.1: Vector Addition

6.1.7 Subtraction

We define

a − b = a + (−b).

In other words, we reverse the direction of vector b, but add it to a in the usual way. See figure 6.2 for a graphical representation of vector subtraction.

Figure 6.2: Vector Subtraction

6.1.8 Zero vector

The zero vector, denoted 0, has a magnitude of zero, and so its direction is irrelevant. As you should expect

a + 0 = a

so that adding or subtracting the zero vector has no effect.
6.1.9 Scalar Product

The scalar product or dot product of two vectors a and b is given by

a.b = |a||b| cos θ

where θ is the angle between the two vectors. Note that this is an ordinary real number answer and not a vector; this is why we call this the scalar product.
Note further that we always use a "." to denote this multiplication, and not a "×" symbol (see 6.1.12).

Cartesian unit vectors

Note in particular that

i.i = 1 × 1 × cos 0 = 1

Similarly

j.j = k.k = 1

Now

i.j = 1 × 1 × cos 90° = 0

and similarly

i.j = j.i = i.k = k.i = j.k = k.j = 0

Thus if we have two vectors

a = x₁i + y₁j + z₁k,  b = x₂i + y₂j + z₂k

then

a.b = (x₁i + y₁j + z₁k).(x₂i + y₂j + z₂k)
    = x₁i.x₂i + x₁i.y₂j + x₁i.z₂k
    + y₁j.x₂i + y₁j.y₂j + y₁j.z₂k
    + z₁k.x₂i + z₁k.y₂j + z₁k.z₂k
    = x₁x₂ + y₁y₂ + z₁z₂

Thus the scalar product is very simple to work out when the vectors have this form.

Application

The scalar product can be used to calculate the angle between two vectors, in the following manner: by rearranging the formula which defines the dot product we obtain

cos θ = a.b / (|a||b|).
6.1.10 Example

Find the angle between the vectors

a = 3i + 4j + 12k,  b = −i + 2j − k

Solution

We calculate the modulus of each vector (we have done this in an example above) to find |a| = 13 and |b| = √6. We now calculate the scalar product

a.b = (3 × −1) + (4 × 2) + (12 × −1) = −7

and now we have

cos θ = a.b / (|a||b|) = −7 / (13√6) = −0.2198

to four decimal places. Thus θ = 102.699°.

6.1.11 Example

Let

a = 2i − 2j + 2k,  b = 5i + 4j + zk.

Find z so that a and b are perpendicular.

Solution

If a and b are perpendicular then θ = 90° and a.b = 0.
Thus

a.b = (2 × 5) + (−2 × 4) + (2 × z) = 0

⇒  10 − 8 + 2z = 0  ⇒  −2 = 2z  ⇒  z = −1
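Both examples can be checked with a few lines of Python. This is an added sketch using plain tuples and the standard library only; the helper names are my own.

```python
import math

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def modulus(u):
    return math.sqrt(dot(u, u))

a = (3, 4, 12)
b = (-1, 2, -1)
cos_theta = dot(a, b) / (modulus(a) * modulus(b))
print(math.degrees(math.acos(cos_theta)))   # about 102.7 degrees

# Example 6.1.11: with z = -1 the dot product vanishes, so the vectors are perpendicular
print(dot((2, -2, 2), (5, 4, -1)))          # 0
```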
6.1.12 Vector Product

The vector product or cross product of two vectors a and b is denoted a × b and is given by a vector of magnitude

|a||b| sin θ

where θ is the angle between the vectors. The product will have a direction that is perpendicular¹ to both a and b, such that a, b and a × b form a right-handed triple. (Imagine using a screwdriver to turn from vector a to b; then the direction in which the screw would travel is that of a × b.)
This product differs from the scalar product (see 6.1.9) in that it produces another vector as an answer, not just a scalar.

¹ at right angles to
Warning

Vector products are not commutative. That is

a × b ≠ b × a

although a × b and b × a have the same magnitude, they have opposite directions. Therefore

a × b = −(b × a)

Cartesian unit vectors

Note in particular that i × j has a magnitude of 1 × 1 × sin 90° = 1 and has a direction perpendicular to i and j such that i, j and k form a right-handed triple; thus it is the direction k.
So

i × j = k,  j × k = i,  k × i = j

and

j × i = −k,  k × j = −i,  i × k = −j

while

i × i = 0,  j × j = 0,  k × k = 0
Thus if we have two vectors

a = x₁i + y₁j + z₁k,  b = x₂i + y₂j + z₂k

then

a × b = (x₁i + y₁j + z₁k) × (x₂i + y₂j + z₂k)
      = x₁i × x₂i + x₁i × y₂j + x₁i × z₂k
      + y₁j × x₂i + y₁j × y₂j + y₁j × z₂k
      + z₁k × x₂i + z₁k × y₂j + z₁k × z₂k
      = x₁y₂k − x₁z₂j − y₁x₂k + y₁z₂i + z₁x₂j − z₁y₂i
      = (y₁z₂ − y₂z₁)i + (x₂z₁ − x₁z₂)j + (x₁y₂ − x₂y₁)k

It will be shown later (see 6.4) that we can write this using a matrix determinant.

a × b = | i   j   k  |
        | x₁  y₁  z₁ |
        | x₂  y₂  z₂ |
Application

The vector product is useful in a variety of situations, such as finding the area of the triangle enclosed by a and b (this is half the magnitude of a × b).
We shall use it simply to find a vector at right angles to two given vectors.

6.1.13 Example

Find a vector which is perpendicular to both the following vectors

a = 3i + 4j + 12k,  b = −i + 2j − k

Solution

We simply find a × b, which has this property by definition.

a × b = | i   j   k  |
        | 3   4   12 |
        | −1  2   −1 |

or we could use the other definition

a × b = i((4)(−1) − (12)(2)) − j((3)(−1) − (12)(−1)) + k((3)(2) − (4)(−1))
      = −28i − 9j + 10k
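The component formula for the cross product can be written directly as a small function; this added sketch also confirms that the result is perpendicular to both a and b by checking the two dot products are zero.

```python
def cross(u, v):
    # (y1 z2 - y2 z1, x2 z1 - x1 z2, x1 y2 - x2 y1)
    return (u[1] * v[2] - v[1] * u[2],
            v[0] * u[2] - u[0] * v[2],
            u[0] * v[1] - v[0] * u[1])

a = (3, 4, 12)
b = (-1, 2, -1)
n = cross(a, b)
print(n)   # (-28, -9, 10), matching the worked example

# perpendicularity check: both dot products should be zero
print(sum(x * y for x, y in zip(n, a)), sum(x * y for x, y in zip(n, b)))   # 0 0
```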
6.2 Matrices

A matrix is a rectangular array of numbers. We use bold typeface and capital letters to stand for matrices, such as A and B. In handwriting it is common to use underlining to replace the bold typeface, so that we have A and B.
We say that a matrix is "m by n", written m × n, when it has m (horizontal) rows and n (vertical) columns.
We often refer to the numbers in a matrix as entries or cells.

6.2.1 Square matrices

A square matrix has an equal number of rows and columns, and we say that it is n × n.
6.2.2 Row and Column vectors

A row vector is a matrix consisting of just one row, and is therefore a 1 × n matrix.
A column vector is a matrix consisting of just one column, and is therefore an n × 1 matrix.

6.2.3 Examples

Consider the matrices

A = [ 2 3 ]      B = [ 2 0 3 ]      C = [ 5 2 3 ]
    [ 1 0 ]          [ 2 1 0 ]          [ 1 1 0 ]
    [ 1 1 ]                             [ 1 1 2 ]

D = [ 1 2 0 9 ]      E = [ 2 ]
                         [ 0 ]

Then A is a 3 × 2 matrix, B is a 2 × 3 matrix, C is a 3 × 3 matrix, D is a 1 × 4 matrix (row vector) and E is a 2 × 1 matrix (column vector).
6.2.4 Zero and Identity

The zero matrix is a square matrix, denoted by 0, in which all the entries are zero. There is a family of zero matrices, one for each size of square matrix.

0 = [ 0 0 ]      0 = [ 0 0 0 ]      . . .
    [ 0 0 ]          [ 0 0 0 ]
                     [ 0 0 0 ]

The identity matrix is a square matrix, denoted by I, in which all the entries are zero except for those entries on the leading diagonal², which are 1. Thus, once again there is a family of identity matrices, one for each size of square matrix.

I = [ 1 0 ]      I = [ 1 0 0 ]      . . .
    [ 0 1 ]          [ 0 1 0 ]
                     [ 0 0 1 ]

² the leading diagonal goes from the top-left to the bottom-right of the matrix
6.3 Matrix Arithmetic

We must define the arithmetic of matrices next.

6.3.1 Addition

We only define the addition A + B of two matrices A and B when they are of exactly the same size, that is, they are both m × n. Then we add the matrices by adding corresponding cells.
6.3.2 Examples

1.
[ 2  3 0 ]   [ 3 −3 −1 ]   [ (2 + 3)  (3 − 3)  (0 − 1) ]   [ 5 0 −1 ]
[ −1 2 1 ] + [ 3  4  0 ] = [ (−1 + 3) (2 + 4)  (1 + 0) ] = [ 2 6  1 ]

2.
[ 3 2 ]   [ −2  0 ]   [ (3 − 2) (2 + 0) ]   [ 1 2 ]
[ 0 2 ] + [ 4  −2 ] = [ (0 + 4) (2 − 2) ] = [ 4 0 ]
6.3.3 Subtraction

We define the subtraction A − B, of B from A, only when the matrices are of exactly the same size, that is, they are both m × n. Then we subtract corresponding cells to perform the subtraction.

6.3.4 Examples

1.
[ 2  3 0 ]   [ 3 −3 −1 ]   [ (2 − 3)  (3 + 3)  (0 + 1) ]   [ −1  6 1 ]
[ −1 2 1 ] − [ 3  4  0 ] = [ (−1 − 3) (2 − 4)  (1 − 0) ] = [ −4 −2 1 ]

2.
[ 3 2 ]   [ −2  0 ]   [ (3 + 2) (2 − 0) ]   [ 5  2 ]
[ 0 2 ] − [ 4  −2 ] = [ (0 − 4) (2 + 2) ] = [ −4 4 ]
6.3.5 Multiplication by a scalar

Multiplying a matrix A by a scalar³ k simply requires that each cell of the matrix A is multiplied by k. We define kA and Ak identically.

6.3.6 Examples

2 [ 2 3 0 ]  =  [ 2×2  2×3  2×0 ]  =  [ 4 6 0 ]
  [ 0 1 1 ]     [ 2×0  2×1  2×1 ]     [ 0 2 2 ]

6.3.7 Domino Rule

We shall define the product of two matrices A.B or A × B⁴, of A and B, only when the number of columns in A is equal to the number of rows in B. We use the so called domino rule to help us remember this relationship.
To do this, we write down the size of the matrix under each matrix. So if A is an m × n matrix and B is a p × q matrix we write

A          B
(m × n)    (p × q)

and the product is defined only if the two inner numbers match, in this case if n = p.
Moreover, we can use the outer numbers. If the product is defined then it will be of a size defined by the outer numbers, that is m × q in this example.

³ In other words: an ordinary number, which could be real or complex.
⁴ Unlike in vectors, there is no distinction between . and × in matrix arithmetic.
6.3.8 Multiplication

We begin by defining the product of a row vector and a column vector, which are of equal length. Note that this satisfies the requirements of the domino rule, and the product will be 1 × 1, or in other words an ordinary number.
To do the multiplication we multiply corresponding components and add:

A = [ a₁ a₂ . . . aₙ ],   B = [ b₁ ]
                              [ b₂ ]
                              [ ⋮  ]
                              [ bₙ ]

AB = a₁b₁ + a₂b₂ + · · · + aₙbₙ

Note that we do not define the product BA.
All matrix multiplication returns to this simple multiplication, so it is most important that it is grasped.
To multiply a matrix A by B we first check, using the domino rule, whether the multiplication is possible, and also determine the size of the answer.
Then, to calculate the entry in the ith row and jth column of the product, we multiply the ith row in A by the jth column in B using the approach above.
We always multiply rows in the first matrix by columns in the second.
6.3.9 Examples

1.
[ 3 2 ]   [ 2 0 ]   [ (3×2)+(2×4)  (3×0)+(2×2) ]   [ 14 4 ]
[ 0 2 ] × [ 4 2 ] = [ (0×2)+(2×4)  (0×0)+(2×2) ] = [ 8  4 ]

2.
[ 2 0 ]   [ 3 2 ]   [ (2×3)+(0×0)  (2×2)+(0×2) ]   [ 6  4  ]
[ 4 2 ] × [ 0 2 ] = [ (4×3)+(2×0)  (4×2)+(2×2) ] = [ 12 12 ]

Note that our second example demonstrates that when we multiply matrices in the opposite order (cf. our first example) we usually get a different result. In fact, it can be even more strange than this, as our next examples show.

3.
[ 2  3 1 ]   [ 1 0 ]   [ (2×1)+(3×0)+(1×2)   (2×0)+(3×1)+(1×3)  ]   [ 4 6 ]
[ 0 −1 2 ]   [ 0 1 ] = [ (0×1)+(−1×0)+(2×2)  (0×0)+(−1×1)+(2×3) ] = [ 4 5 ]
             [ 2 3 ]

4.
[ 1 0 ]                  [ (1×2)+(0×0)  (1×3)+(0×−1)  (1×1)+(0×2) ]   [ 2  3 1 ]
[ 0 1 ]   [ 2  3 1 ]   = [ (0×2)+(1×0)  (0×3)+(1×−1)  (0×1)+(1×2) ] = [ 0 −1 2 ]
[ 2 3 ]   [ 0 −1 2 ]     [ (2×2)+(3×0)  (2×3)+(3×−1)  (2×1)+(3×2) ]   [ 4  3 8 ]

In this case multiplication in the opposite order of the same matrices (examples 3 and 4) produces not only different matrices as a result, but different sizes of matrices.

5.
[ 2  1  0 ]   [ 1 ]   [ (2×1)+(1×2)+(0×3)   ]   [ 4  ]
[ −1 0  2 ]   [ 2 ] = [ (−1×1)+(0×2)+(2×3)  ] = [ 5  ]
[ 1  2 −2 ]   [ 3 ]   [ (1×1)+(2×2)+(−2×3)  ]   [ −1 ]

The opposite product is not defined in this case.

6.
              [ 2  1  0 ]
[ 0 1 2 ]  ×  [ −1 0  2 ]  =  [ 1 4 −2 ]
              [ 1  2 −2 ]

The opposite product is not defined in this case.
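For readers who wish to experiment, the first two examples can be reproduced with numpy (assumed to be installed; this sketch is not part of the original notes), which uses the @ operator for matrix multiplication.

```python
import numpy as np   # numpy is assumed to be available

A = np.array([[3, 2],
              [0, 2]])
B = np.array([[2, 0],
              [4, 2]])

print(A @ B)   # [[14  4] [ 8  4]]  -- matches example 1
print(B @ A)   # [[ 6  4] [12 12]]  -- matches example 2; AB != BA in general
```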
6.3.10 Exercise

Write down any 3 × 3 matrix, and multiply it by I for that size, in both orders. What effect does the multiplication have on the original matrix? Try this for other sizes of square matrices.
6.4 Determinant of a matrix

The determinant of a square matrix A, denoted |A|, has many applications. We build up the calculation of the determinant starting with a 2 × 2 matrix.

| a b |
| c d |  =  ad − bc

6.4.1 Examples

1. | 2 3 |  =  (2 × 4) − (3 × 1)  =  5
   | 1 4 |

2. | 2 1 |  =  (2 × 2) − (0 × 1)  =  4
   | 0 2 |

3. | 2 4 |  =  (2 × 6) − (4 × 3)  =  0
   | 3 6 |

4. | 1 3 |  =  (1 × 4) − (3 × 2)  =  −2
   | 2 4 |
6.4.2 Sign rule for matrices

To build up larger determinants we first introduce the sign rule for matrices. This is just a square array of + and − characters, such that the top left character is always + and we alternate in each direction. Thus we have a different rule of signs for each size of matrix.

[ + − ]      [ + − + ]      [ + − + − ]
[ − + ],     [ − + − ],     [ − + − + ],     . . .
             [ + − + ]      [ + − + − ]
                            [ − + − + ]
6.4.3 Order 3

To work out the determinant of a 3 × 3 matrix we select a single row or column to traverse. It does not matter which one, but in practice we usually select one that contains a zero if possible.
We shall talk about the first row by way of example. Take each number in the row in turn, and multiply it by the sign in the same cell in the rule of signs. Cover all the cells in the same row and column as our number, form a 2 × 2 matrix out of what remains, and multiply by this determinant. Continue to the end of the row.

| a b c |
| d e f |  =  +a | e f |  −  b | d f |  +  c | d e |
| g h i |        | h i |       | g i |       | g h |

6.4.4 Examples

To emphasise that we can expand along different rows we shall evaluate the same determinant in two different ways.

| 2 1 3 |
| 4 0 5 |
| 0 1 2 |

We first expand along the top row, which is the most usual way.

= +2 | 0 5 |  −  1 | 4 5 |  +  3 | 4 0 |
     | 1 2 |       | 0 2 |       | 0 1 |

= 2(−5) − 1(8) + 3(4) = −6

This works fine, but it would be better to expand along something containing a zero. Let's use the bottom row (remember to readjust for the table of signs).

= +0 | 1 3 |  −  1 | 2 3 |  +  2 | 2 1 |
     | 0 5 |       | 4 5 |       | 4 0 |

= 0(. . .) − 1(−2) + 2(−4) = −6

As you can see, one of the smaller 2 × 2 determinants does not need to be calculated, saving us some work, and achieving the same result.
6.5 Inverse of a matrix 101
6.4.5 Order 4
We shall not thoroughly examine the determinants of larger matrices, but
the principle is simple. We use the same procedure for the 4 4 case, but
we reduce to 3 3 determinants.

a b c d
e f g h
i j k l
m n o p

= +a

f g h
j k l
n o p

e g h
i k l
m o p

+c

e f h
i j l
m n p

e f g
i j k
m n o

Each 3 3 determinant can now be reduced to 2 2 determinants and


so on.
We may work out the determinant of any higher size of a matrix in an
analogous way.
There are techniques for reducing determinant in complexity rather than
working them out directly, we shall not cover these.
6.5 Inverse of a matrix

Note that we have not defined division in terms of matrices. Indeed there is no exact counterpart of division in matrices, but we have a similar concept. We multiply by the inverse of a matrix rather than talking of dividing.
The inverse of a square matrix A, denoted A⁻¹, is such that

AA⁻¹ = A⁻¹A = I.

The inverse does not exist for all matrices. When A has an inverse we say it is invertible, otherwise we say that A is singular.
The singular matrices are those that have a determinant of 0.
It is worth noting that if A⁻¹ is the inverse of A, then A is the inverse of A⁻¹, so that invertible matrices exist in pairs, hence the reason for calling non-invertible matrices singular.
6.5.1 Order 2

The inverse of a 2 × 2 matrix A is defined when |A| ≠ 0. If

A = [ a b ]
    [ c d ]

then

A⁻¹ = (1/(ad − bc)) [ d  −b ]
                    [ −c  a ]

Note that |A| = ad − bc, and so if |A| = 0 we have division by zero, which is why the inverse will not be defined.

Exercise

It is left to the reader as an exercise to show that

AA⁻¹ = A⁻¹A = I

in this case.

6.5.2 Examples

1. [ 2 3 ]⁻¹  =  (1/5) [ 4  −3 ]
   [ 1 4 ]             [ −1  2 ]

2. [ 2 1 ]⁻¹  =  (1/4) [ 2 −1 ]
   [ 0 2 ]             [ 0  2 ]

3. [ 2 4 ]⁻¹  =  (1/0) (. . .)
   [ 3 6 ]

This matrix is singular; its inverse is not defined, as the determinant is zero.

4. [ 1 3 ]⁻¹  =  (1/(−2)) [ 4  −3 ]
   [ 2 4 ]                [ −2  1 ]
6.5.3 Other orders

There are several techniques for finding the inverse of larger matrices, including row reduction and cofactors. We shall not cover these formally by examination, but an outline of one, known as row reduction or Gaussian elimination⁵, is given below.
We first write the matrix, with the identity matrix I of the appropriate size glued onto the right.
We then use any of the following operations repeatedly:

- swap any two rows;
- multiply any row by a constant;
- add any row to another;
- subtract any row from another.

The aim is to produce the identity matrix I on the left of the block, whereupon the matrix on the right will be the inverse. It is not always possible to do this (if the determinant of the original matrix is zero it will be impossible).
We can much more easily verify that a matrix is an inverse of another.

6.5.4 Exercise

Show that

[ −1  −3  −4 ]
[ −2  −2  −5 ]
[ −4   3  −5 ]

is the inverse of the matrix

[ 25  −27   7 ]
[ 10  −11   3 ]
[ −14  15  −4 ]

⁵ Named after Carl Friedrich Gauss (1777-1855), thought by many to be the greatest of all mathematicians. In any case Gauss was a prolific mathematician and contributed to diverse fields in the subject such as number theory and mathematical astronomy.
Solution

To prove this you must multiply the matrices together in both directions. Let us call the matrices A and B respectively. Then you should show that

A.B = I

and

B.A = I

6.6 Matrix algebra

We now summarise some results of elementary matrix algebra which can be easily verified.

6.6.1 Addition

Algebra concerning the addition of matrices is shown in table 6.1.

A + B = B + A                  Commutative law
A + (B + C) = (A + B) + C      Associative law
A + 0 = A = 0 + A

Table 6.1: Matrix algebra - Addition

6.6.2 Multiplication

Algebra concerning the multiplication of matrices is shown in table 6.2.

6.6.3 Mixed

There is one major result concerning the mixture of addition and multiplication, which is the distributive law, shown in table 6.3.

AB ≠ BA                        In general
A(BC) = (AB)C                  Associative law
AI = A = IA
AA⁻¹ = I = A⁻¹A

Table 6.2: Matrix algebra - Multiplication

A(B + C) = AB + AC             Distributive law
(A + B)C = AC + BC             Distributive law

Table 6.3: Matrix algebra - Mixed operations
6.7 Solving equations

Consider the following simultaneous equations

3x − y = −1
−5x + 2y = 3

and compare them to the following matrix equation

[ 3  −1 ] [ x ]   [ −1 ]
[ −5  2 ] [ y ] = [ 3  ]

Expanding out the matrix equation will show it to be the same as the two simultaneous equations. We have simply taken the coefficients of the x and y terms and placed them in a square matrix, multiplied upon the column vector containing x and y, and set them equal to a column vector with the RHS of each equation as its cells.
We have thus reduced the two simultaneous equations to the single matrix equation

A.X = Y

where X is the matrix we wish to find. If we were dealing in ordinary algebra here we would simply divide by A on both sides, but we are working with matrices and we do not define division.
The next best thing would be to multiply by the inverse of A on both sides. Remember that order in matrix multiplication is vital, so we must either premultiply (multiply on the front of both sides) or postmultiply (multiply on the back of both sides). Postmultiplication gives an equation which, while technically correct, does not help us in our problem. If we premultiply, however, we obtain:

A⁻¹.A.X = A⁻¹.Y  ⇒  I.X = A⁻¹.Y  ⇒  X = A⁻¹.Y

and so we have obtained X.
6.7.1 Example

Solve the set of simultaneous equations using matrices.

3x − y = −1
−5x + 2y = 3

Solution

It is rarely worthwhile to use matrix methods to solve two equations in two unknowns, but we do it this way by way of an example.
Recall that we wrote this as the equation:

[ 3  −1 ] [ x ]   [ −1 ]
[ −5  2 ] [ y ] = [ 3  ]

Now

A = [ 3  −1 ]   ⇒   A⁻¹ = (1/((3 × 2) − (−1 × −5))) [ 2 1 ]  =  [ 2 1 ]
    [ −5  2 ]                                       [ 5 3 ]     [ 5 3 ]

So we premultiply our matrix equation by A⁻¹ on both sides.

[ 2 1 ] [ 3  −1 ] [ x ]   [ 2 1 ] [ −1 ]
[ 5 3 ] [ −5  2 ] [ y ] = [ 5 3 ] [ 3  ]

which multiplies out on the LHS to give

[ 1 0 ] [ x ]   [ 2 1 ] [ −1 ]
[ 0 1 ] [ y ] = [ 5 3 ] [ 3  ]

so that

[ x ]   [ 2 1 ] [ −1 ]
[ y ] = [ 5 3 ] [ 3  ]

Now we know from the matrix algebra that the LHS should collapse simply to the matrix X, so we could have started from this step; it is a matter of choice. Some people like to do the multiplication on the LHS to verify that the inverse was correct. In any case, we now proceed with the multiplication on the RHS.

[ x ]   [ 1 ]
[ y ] = [ 4 ]

Thus x = 1 and y = 4.
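As an added check (not part of the original notes), numpy can confirm both the inverse and the solution in a couple of lines.

```python
import numpy as np   # numpy is assumed to be available

A = np.array([[3.0, -1.0],
              [-5.0, 2.0]])
Y = np.array([-1.0, 3.0])

print(np.linalg.inv(A))        # [[2. 1.] [5. 3.]]
print(np.linalg.solve(A, Y))   # [1. 4.], i.e. x = 1, y = 4
```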
6.7.2 Example

Solve the following simultaneous equations

25a − 27b + 7c = 1
10a − 11b + 3c = 2
−14a + 15b − 4c = 3

Solution

This corresponds to the following matrix equation:

[ 25  −27   7 ] [ a ]   [ 1 ]
[ 10  −11   3 ] [ b ] = [ 2 ]
[ −14  15  −4 ] [ c ]   [ 3 ]

and normally this would be pretty difficult to solve, as we would need to find the inverse of the square matrix using some more time consuming techniques. Fortunately, we showed in an example above that the inverse of this matrix was

[ −1  −3  −4 ]
[ −2  −2  −5 ]
[ −4   3  −5 ]

and so we can say that

[ a ]   [ −1  −3  −4 ] [ 1 ]
[ b ] = [ −2  −2  −5 ] [ 2 ]
[ c ]   [ −4   3  −5 ] [ 3 ]

In this case we have skipped the multiplication on the left for brevity; we know this should be the result if we have been correct in our choice of inverse. We now perform the multiplication on the right.

[ a ]   [ (−1 × 1) + (−3 × 2) + (−4 × 3) ]   [ −19 ]
[ b ] = [ (−2 × 1) + (−2 × 2) + (−5 × 3) ] = [ −21 ]
[ c ]   [ (−4 × 1) + (3 × 2) + (−5 × 3)  ]   [ −13 ]

Therefore a = −19, b = −21 and c = −13.
6.7.3 Row reduction

It is also possible to solve equations like this directly with row reduction, without finding the inverse at all.
To do this, form the matrix composed of A with Y glued to the right, and row reduce until we get the identity in the left hand block; at this point the entries in the right hand column will be X, the variables we require.
It is not always possible to do this. We now examine many features of matrices in greater depth and detail.
6.8 Row Operations

When one recalls how to solve simultaneous equations by elimination, it becomes apparent that there are many operations we routinely use.
For example, to solve

3x + 2y = 4    (i)
2x − y = 5     (ii)

we might multiply equation (ii) throughout by 2. Then we might add equation (i) to the modified version of (ii), etc. We list the operations frequently used:

- Swapping the order of equations for convenience;
- Multiplying any equation by a number;
- Dividing any equation by a number;
- Adding any equation to any other;
- Subtracting any equation from any other.

Now when we remind ourselves that this system of equations can be written

[ 3  2 ] [ x ]   [ 4 ]
[ 2 −1 ] [ y ] = [ 5 ]
   A       X       B

In fact, since the useful information is mainly to be found in A, X and B, we often look at the so called augmented matrix, which we shall call A_B.

[ 3  2 | 4 ]
[ 2 −1 | 5 ]

This matrix embodies the whole system of equations. We can perform the above row operations on this matrix and this will not affect the solutions.
When we perform row operations it is easy to lose track of the changes you make. In these notes we shall denote rows as R1, R2 etc. and write the changes as we go.
6.8.1 Determinants

These principles can be applied to determinants too, but with some care, as determinants depend intimately upon the values of the coefficients. For example

| 1 0 |             | 2 0 |
| 0 1 | = 1,   but  | 0 1 | = 2

so here it is clear that multiplying through on a row by a number does affect the determinant. In effect all of the row operations discussed in 6.8 may be used, with care taken for multiplication and division. For example, when dividing by a constant on a given row, that constant is taken outside the determinant, where it must be multiplied back on. So from above we see that

| 2 0 |       | 1 0 |
| 0 1 | = 2 × | 0 1 | = 2 × 1 = 2

It is also useful to note that swapping rows has an impact on the value of the determinant. For example

| 1 0 |                         | 0 1 |
| 0 1 | = 1;   (swap R1, R2)    | 1 0 | = −1

In fact it can be shown that swapping any two rows in a determinant swaps the sign of the result.
It is frequently possible to substantially lower the amount of work required to find a determinant in this way. However, it is helpful to see that there is a consistent form that we try to move the matrix towards. We will see a reasoning for this later when we revisit systems of equations. For now we just look at an example.
6.8.2 Example

Find the determinant of the following matrix

[ 1 2  1 ]
[ 4 2 −3 ]
[ 2 3  1 ]

Solution

We know that introducing zeros makes determinants far easier to calculate. So we proceed this way, starting off by subtracting row 1 from row 3 twice, and then row 1 from row 2 four times.

| 1 2  1 |                 | 1  2  1 |                 | 1  2  1 |
| 4 2 −3 |   R3=R3−2R1     | 4  2 −3 |   R2=R2−4R1     | 0 −6 −7 |
| 2 3  1 |       =         | 0 −1 −1 |       =         | 0 −1 −1 |

Although these determinants are equal, this last version is far easier to calculate, since if we expand along the first column we obtain

= 1 | −6 −7 |  −  0 |. . .|  +  0 |. . .|
    | −1 −1 |

= (−6)(−1) − (−7)(−1) = 6 − 7 = −1.
6.9 Solving systems of equations

We can use this approach to help us solve systems of equations where the inverse is difficult or impossible to find. For example, to solve the set of equations represented by the augmented matrix above, we simply apply row operations in a sensible manner. For clarity, at each stage we indicate just what operation has occurred, by referring to the rows as R1, R2 etc.

[ 3  2 | 4 ]   R2=R2×2   [ 3  2 | 4  ]   R2=R2+R1   [ 3 2 | 4  ]
[ 2 −1 | 5 ]     →       [ 4 −2 | 10 ]      →       [ 7 0 | 14 ]

We could continue, but in fact this is already possible to solve very trivially now. Remembering what this augmented matrix represents, we see that our system of equations has now become

3x + 2y = 4    (i)
7x = 14        (iii)

so that a solution for one variable is now obvious. From equation (iii) we see that x = 2, and placing this value into equation (i) we see that

3(2) + 2y = 4  ⇒  y = −1

and so the system is solved.
6.9.1 Gaussian Elimination

There is an ideal form for the rearranged section of the augmented matrix to the left of the line. We again begin by looking at the simple example represented by the system we just examined.

[ 3  2 | 4 ]   R2=R2×2   [ 3  2 | 4  ]   R2=R2+R1   [ 3 2 | 4  ]
[ 2 −1 | 5 ]     →       [ 4 −2 | 10 ]      →       [ 7 0 | 14 ]

We stopped at this point because the second row represented an almost direct readout for the variable x. We could have gone further

R2=R2÷7   [ 3 2 | 4 ]
   →      [ 1 0 | 2 ]

as the second row now really does give a direct readout for x = 2. However, we could have continued again

R1=R1−3R2   [ 0 2 | −2 ]
    →       [ 1 0 | 2  ]

which now means the first row gives 2y = −2. We could take two more steps.

R1=R1÷2   [ 0 1 | −1 ]   R1↔R2   [ 1 0 | 2  ]
   →      [ 1 0 | 2  ]     →     [ 0 1 | −1 ]

The second step is just to make it very clear that the left closely resembles the identity matrix. If we now examine this set of equations we see it is simply

0x + y = −1;  1x + 0y = 2

or

x = 2, y = −1.

Therefore the most ideal solution might be to rearrange the left side of the augmented matrix to the identity matrix, whereupon the right hand side becomes the solutions for the variables. To accomplish this we attempt to produce zeros in the left hand column everywhere except the top entry, then we try to produce zeros everywhere in the next column except for the second entry, and so on.
There are two reasons why we do not always do this:

- the left hand side may not be square, such as when we have fewer equations than unknowns;
- the extra operations aren't really necessary to solve the equations.

In practice therefore we try to reduce the cells to the bottom and left of the leading diagonal to zero. It is nice, but not essential, if the first number on each row is a 1. This will allow us to begin with the bottom row and solve for one unknown, and then we move upwards through the rows, substituting the known variables to find each next unknown.
This process is known as Gaussian Elimination.
6.9.2 Example

Use Gaussian elimination to solve the following system of equations:

x + 2y + z = 1
4x + 2y − 3z = 2
2x + 3y + z = 3

Solution

The augmented matrix for this system is as follows

A_B = [ 1 2  1 | 1 ]
      [ 4 2 −3 | 2 ]
      [ 2 3  1 | 3 ]

and we now begin working to produce zeros in the first column, except for the top number. We use the top number to let us produce the zeros. In this case the top number is 1, which is very convenient. If that had occurred in another row an initial swap might have been a good idea.

R2=R2−4R1   [ 1  2  1 | 1  ]   R3=R3−2R1   [ 1  2  1 | 1  ]
    →       [ 0 −6 −7 | −2 ]       →       [ 0 −6 −7 | −2 ]
            [ 2  3  1 | 3  ]               [ 0 −1 −1 | 1  ]

so that's a great start. For our purposes row 1 is fine as it is now, and we won't touch it in subsequent work. Now we work only with rows 2 and 3. We now want to introduce a zero at the bottom of column 2, to continue our job of zeroing the elements to the bottom and left of the leading diagonal. Now we could add a sixth of row 2 to row 3 to achieve this, but it'll be easier to swap rows 2 and 3 first.

R2↔R3   [ 1  2  1 | 1  ]   R2=R2×(−1)   [ 1  2  1 | 1  ]
  →     [ 0 −1 −1 | 1  ]       →        [ 0  1  1 | −1 ]
        [ 0 −6 −7 | −2 ]                [ 0 −6 −7 | −2 ]

I have also taken the liberty of multiplying through row 2 by −1, as it makes our life a little easier. It's much easier to make the bottom of column 2 a zero now, and doesn't introduce fractions so early. We simply add row 2 to row 3 six times.

R3=R3+6R2   [ 1 2  1 | 1  ]
    →       [ 0 1  1 | −1 ]
            [ 0 0 −1 | −8 ]

and we need not go further. Row 3 now gives

−z = −8  ⇒  z = 8

and Row 2 now gives

y + z = −1  ⇒  y + 8 = −1  ⇒  y = −9

and Row 1 now gives

x + 2y + z = 1  ⇒  x − 18 + 8 = 1  ⇒  x = 11

and so we have a unique solution

x = 11, y = −9, z = 8.
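The whole procedure can be written as a short program. The sketch below is a minimal Gaussian elimination in Python, added for illustration: it assumes the system has a unique solution and uses partial pivoting (a row swap to bring the largest entry into the pivot position) to avoid dividing by zero, followed by back substitution as in the worked example.

```python
def gaussian_solve(aug):
    """Solve a system from its augmented matrix [A | b] by elimination
    and back substitution. Assumes a unique solution exists."""
    n = len(aug)
    M = [row[:] for row in aug]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]            # swap rows
        for r in range(col + 1, n):
            factor = M[r][col] / M[col][col]
            M[r] = [a - factor * b for a, b in zip(M[r], M[col])]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):                     # back substitution
        x[r] = (M[r][n] - sum(M[r][c] * x[c] for c in range(r + 1, n))) / M[r][r]
    return x

print(gaussian_solve([[1, 2, 1, 1],
                      [4, 2, -3, 2],
                      [2, 3, 1, 3]]))    # [11.0, -9.0, 8.0]
```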
6.9.3 Example

Use Gaussian elimination to solve the following system of equations:

3x − 2y + z = 1
2x + 3y + z = 2
4x − 7y + z = 3

Solution

The augmented matrix for this system is

A_B = [ 3 −2 1 | 1 ]
      [ 2  3 1 | 2 ]
      [ 4 −7 1 | 3 ]

We begin by producing a 1 somewhere in the first column, in order to make subsequent work more simple. This isn't necessary, but just easier.

R2=R2÷2   [ 3  −2   1   | 1 ]   R1↔R2   [ 1  3/2  1/2 | 1 ]
   →      [ 1  3/2  1/2 | 1 ]     →     [ 3  −2   1   | 1 ]
          [ 4  −7   1   | 3 ]           [ 4  −7   1   | 3 ]

This 1 can now be used to produce an elimination of the other numbers in the column.

R2=R2−3R1   [ 1   3/2   1/2 | 1  ]   R3=R3−4R1   [ 1    3/2   1/2 | 1  ]
    →       [ 0 −13/2  −1/2 | −2 ]       →       [ 0  −13/2  −1/2 | −2 ]
            [ 4   −7    1   | 3  ]               [ 0  −13    −1   | −1 ]

Now we would next work to produce a zero at the bottom of column 2. To reduce the fractions, let's multiply row 2 by 2.

R2=R2×2   [ 1   3/2   1/2 | 1  ]
   →      [ 0 −13    −1   | −4 ]
          [ 0 −13    −1   | −1 ]

Now we can create a zero in the bottom of column 2 by subtracting row 2 from row 3:

R3=R3−R2   [ 1   3/2   1/2 | 1  ]
    →      [ 0 −13    −1   | −4 ]
           [ 0   0     0   | 3  ]

Now actually this represents a problem. The bottom row is now equivalent to the equation

0x + 0y + 0z = 3  ⇒  0 = 3

and this is clearly impossible. This indicates that our system of equations is inconsistent and there is in fact no solution. A quick calculation of the determinant of the coefficient matrix will reveal it to be zero.
6.9.4 Example

Use Gaussian elimination to solve the following system of equations:

3x − 2y + z = 1
2x + 3y + z = 2
4x − 7y + z = 0

Solution

You may note this is extremely similar to the previous example. Again the determinant of the coefficient matrix is zero. This means there is no inverse, but critically it does not in itself mean there are no solutions; it just means there is not a unique solution.

The augmented matrix for this system is

(A|B) = [ 3  −2   1 |  1 ]
        [ 2   3   1 |  2 ]
        [ 4  −7   1 |  0 ]

We begin by producing a 1 somewhere in the first column, in order to make subsequent work more simple. This isn't necessary, but just easier.

R2 = R2 ÷ 2:   [ 3  −2    1   |  1 ]
               [ 1  3/2   1/2 |  1 ]
               [ 4  −7    1   |  0 ]

R1 ↔ R2:   [ 1  3/2   1/2 |  1 ]
           [ 3  −2    1   |  1 ]
           [ 4  −7    1   |  0 ]

This 1 can now be used to eliminate the other numbers in the first column.

R2 = R2 − 3R1:   [ 1    3/2    1/2 |  1 ]
                 [ 0  −13/2  −1/2  | −2 ]
                 [ 4   −7     1    |  0 ]

R3 = R3 − 4R1:   [ 1    3/2    1/2 |  1 ]
                 [ 0  −13/2  −1/2  | −2 ]
                 [ 0  −13    −1    | −4 ]

Next we work to produce a zero at the bottom of the second column. To reduce the fractions, let's first multiply row 2 by 2.

R2 = R2 × 2:   [ 1   3/2   1/2 |  1 ]
               [ 0  −13   −1   | −4 ]
               [ 0  −13   −1   | −4 ]

Now we can create a zero at the bottom of column 2 by subtracting row 2 from row 3:

R3 = R3 − R2:   [ 1   3/2   1/2 |  1 ]
                [ 0  −13   −1   | −4 ]
                [ 0    0    0   |  0 ]

Compare this with example 6.9.3 at this point. It might be tempting to think this is the same situation, but it is not. This time the bottom row is

0x + 0y + 0z = 0  ⇒  0 = 0

which is true. Therefore there is no inconsistency; it is just that one equation out of the original three is essentially worthless, because it can be built from the other two. (If you look you will see that R3 = 2R1 − R2 in the original system.)

This means there will be no unique solution. In this case we proceed as follows. There is no constraint on z, so z can take any value, say t, where t is any real number.

Then row 2 gives (multiplying by −1 first for clarity)

13y + z = 4  ⇒  13y = 4 − t  ⇒  y = (4 − t)/13

and row 1 gives

x + (3/2)y + (1/2)z = 1  ⇒  x + (3/2)·(4 − t)/13 + (1/2)t = 1

⇒  x = 1 − 3(4 − t)/26 − t/2 = (26 − 12 + 3t − 13t)/26 = (7 − 5t)/13

and thus

x = (7 − 5t)/13;   y = (4 − t)/13;   z = t

represents all the (infinite!) solutions of this problem. This is called a parametric solution since the solution depends on one or more parameters (in this case t), and as t takes all the values a real number can, we obtain all the solutions of this system.
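The same one-parameter family can be reproduced symbolically; the sketch below uses SymPy (an assumed dependency), whose solver leaves z free and so plays the role of t.

```python
from sympy import symbols, linsolve

x, y, z = symbols("x y z")
eqs = [3*x - 2*y + z - 1,
       2*x + 3*y + z - 2,
       4*x - 7*y + z]

# linsolve returns the whole solution set; with a free variable it is
# expressed in terms of one of the unknowns (here z).
print(linsolve(eqs, x, y, z))
# {(7/13 - 5*z/13, 4/13 - z/13, z)}   (up to how SymPy orders the terms)
```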
6.9.5 Number of equations vs. unknowns

The previous example also shows that elimination can be undertaken with fewer equations than unknowns. If one repeats the analysis of the system with the third equation removed, one obtains exactly the same solutions.

If one has more equations than unknowns then either the system will be inconsistent, because some equations contradict one another, or one or more equations are redundant. An analysis of rank (see below) will reveal which, but elimination can still take place. Redundant equations will be revealed by rows of zeros appearing, while inconsistent systems will be revealed by impossible equations (a row of zeros except for the right-most column).
6.10 Inversion by Row operations

It is also possible to find the inverse of a matrix entirely via row operations. The theory is beyond the scope of this module, but the principle is that to invert A an augmented matrix is formed thus

(A | I)

with the appropriately sized identity matrix to the right of the line. Row operations are undertaken on this matrix until the identity matrix is produced on the left hand side. The matrix on the right will then be the inverse matrix,

(I | A⁻¹)

providing this is possible. As noted above, for singular matrices a row containing only zeros left of the line will be produced, which cannot be made into the identity and will signify this.
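A minimal Gauss-Jordan sketch of the (A | I) → (I | A⁻¹) idea in plain Python follows; the pivot selection is an addition of mine (not part of the procedure above) included only to keep the arithmetic stable.

```python
def invert(A, tol=1e-12):
    """Invert a square matrix by row operations on the augmented matrix (A | I)."""
    n = len(A)
    # Build the augmented matrix (A | I) as a list of rows.
    M = [list(map(float, A[i])) + [1.0 if j == i else 0.0 for j in range(n)]
         for i in range(n)]
    for col in range(n):
        # Choose a pivot row with a non-zero entry in this column.
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        if abs(M[pivot][col]) < tol:
            raise ValueError("singular matrix: a zero row appears left of the line")
        M[col], M[pivot] = M[pivot], M[col]
        # Scale the pivot row so the pivot entry becomes 1.
        p = M[col][col]
        M[col] = [v / p for v in M[col]]
        # Eliminate this column from every other row.
        for r in range(n):
            if r != col:
                factor = M[r][col]
                M[r] = [a - factor * b for a, b in zip(M[r], M[col])]
    # The right-hand block is now the inverse.
    return [row[n:] for row in M]

print(invert([[2, 3], [1, 1]]))   # [[-1.0, 3.0], [1.0, -2.0]]
```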
6.11 Rank

The rank of a matrix A, denoted r(A), is the order of the largest square submatrix contained within A that has a non-zero determinant. The submatrix can be formed out of any of the rows and columns of A.

The rank is essentially a measure of the amount of information in a matrix, or alternatively of the number of redundant rows in the matrix. Clearly the rank of an m × n matrix cannot exceed the smaller of m and n, as no larger square submatrix can be constituted.
6.11.1 Example

Find the rank of the following matrices.

(a)  [ 0 0 0 ]      (b)  [ 1 0 0 ]      (c)  [ 3  −2  1 ]
     [ 0 0 0 ]           [ 0 1 0 ]           [ 2   3  1 ]
     [ 0 0 0 ]           [ 0 0 1 ]           [ 4  −7  1 ]

Solution

(a) It should be clear that the determinant of this 3 × 3 matrix is zero. Therefore the rank is not 3. One can construct 9 possible submatrices that are 2 × 2 (see the note below), but these again all consist of zeros, and their determinants will be zero. Therefore the rank cannot be 2. Finally there are 9 possible submatrices that are 1 × 1, which are all zero. Therefore the rank cannot be 1. So the rank in this case is zero.

Can you see that if exactly one number in the matrix was changed, to 4 say, then the rank would become 1?

(b) We take the determinant of this matrix to obtain 1 ≠ 0 and therefore the rank of this matrix is 3.

(c) It is left as an exercise for the reader to show that the determinant of this 3 × 3 matrix is 0. Therefore the rank of the matrix cannot be three. By eliminating any one of the rows and any one of the columns we can generate 9 matrices which are 2 × 2. For example, if we eliminate row 3 and column 2 we obtain

| 3  1 |
| 2  1 |  = (3)(1) − (1)(2) = 1 ≠ 0

so we see that the rank of this matrix is 2.

Looking at example (a) we can see that no information is contained in this matrix and the rank is 0; (b) may surprise by having a rank of 3 compared to that in example (c) which is 2, but all the rows in (b) are genuinely different. A close analysis of row 3 in example (c) will show it to be R3 = 2R1 − R2, and therefore that row really doesn't add anything to the discussion.

Note: it is important to notice that the possible 2 × 2 submatrices can be found by eliminating R1 and C1, R1 and C2, R1 and C3, R2 and C1, and so on; it is not simply a matter of testing the matrices found by eliminating the outer rows and columns.
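These three ranks can be spot-checked with NumPy's matrix_rank (an assumed dependency; it is a numerical routine rather than the determinant-of-submatrices definition used above).

```python
import numpy as np

zeros    = np.zeros((3, 3))
identity = np.eye(3)
C        = np.array([[3, -2, 1],
                     [2,  3, 1],
                     [4, -7, 1]])

for name, M in [("(a)", zeros), ("(b)", identity), ("(c)", C)]:
    print(name, np.linalg.matrix_rank(M))   # 0, 3, 2
```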
6.11.2 Systems of equations

Matrix rank is particularly useful due to what it can tell you about the solutions presented by a system of equations. In particular, a comparison between the rank of a coefficient matrix A and the rank of the augmented matrix (A|B) is very useful. We look at some examples that may prove instructive.

6.11.3 Example

Find the rank of the coefficient and augmented matrix for the following system of equations.

3x + 2y = 4   (i)
2x − y = 5    (ii)

Solution

You may note that this is the very simple system we looked at before in 6.9. This system gave rise to the coefficient matrix and augmented matrix as follows.

A = [ 3   2 ] ;   (A|B) = [ 3   2 | 4 ]
    [ 2  −1 ]             [ 2  −1 | 5 ]

Now |A| = (3)(−1) − (2)(2) = −7 ≠ 0, and so r(A) = 2. To look at the rank of the augmented matrix there are two further square 2 × 2 submatrices that can be formed, but we already know that the left hand one (found by eliminating the third column) has a non-zero determinant because it is the same as A. Therefore

r(A) = r(A|B) = 2.
6.11.4 Example

Find the rank of the coefficient and augmented matrix for the following system of equations.

3x + 2y = 1   (i)
6x + 4y = 2   (ii)

Solution

It is clear here that equation (ii) is simply double equation (i). Therefore there is actually only one equation's worth of information here.

This system gives rise to the coefficient matrix and augmented matrix as follows.

A = [ 3  2 ] ;   (A|B) = [ 3  2 | 1 ]
    [ 6  4 ]             [ 6  4 | 2 ]

Now |A| = (3)(4) − (2)(6) = 0, and so the rank of A cannot be two. Taking any of the possible one by one submatrices we can clearly see that the determinant will not be zero. Therefore r(A) = 1.

For the augmented matrix we know that the submatrix formed by eliminating the third column has a zero determinant, but we have not checked the submatrices formed by eliminating the first or middle column. However a brief inspection shows that both of these also have a zero determinant. Once again, any of the six possible 1 × 1 submatrices shows that r(A|B) = 1. Therefore

r(A) = r(A|B) = 1.

It should be clear that since both of these equations are the same, then by simply taking equation (i) we obtain

3x + 2y = 1  ⇒  y = (1 − 3x)/2

and consequently, since there is no constraint on the choice of x and a choice of x determines y, there are infinitely many solutions.
6.11.5 Example

Find the rank of the coefficient and augmented matrix for the following system of equations.

3x + 2y = 1   (i)
6x + 4y = 1   (ii)

Solution

This is very similar to the previous example. However the second equation is no longer quite the exact double of the first. In fact it should be obvious that this system of equations is inconsistent, since the two contradict each other.

This system gives rise to the coefficient matrix and augmented matrix as follows.

A = [ 3  2 ] ;   (A|B) = [ 3  2 | 1 ]
    [ 6  4 ]             [ 6  4 | 1 ]

Now |A| = (3)(4) − (2)(6) = 0, and so the rank of A cannot be two. Taking any of the possible one by one submatrices we can clearly see that the determinant will not be zero. Therefore r(A) = 1.

For the augmented matrix we know that the submatrix formed by eliminating the third column has a zero determinant, but we have not checked the submatrices formed by eliminating the first or middle column. If we eliminate the first column the resulting submatrix has a determinant of (2)(1) − (1)(4) = −2 ≠ 0. Therefore r(A|B) = 2, and

r(A) = 1 < r(A|B) = 2.

Remember that this system of equations has no solutions.
6.11.6 Summary

Although the previous examples do not constitute a proof, they illustrate the following results.

With a system of equations in n unknowns such that the matrix of coefficients is A and the augmented matrix is (A|B), then if

- r(A) = r(A|B) = n then there exists a unique solution;
- r(A) = r(A|B) < n then there are infinitely many solutions (note that this implies that the matrix A has a zero determinant, and thus is singular, so no inverse exists and other methods must be employed to find the solutions);
- r(A) < r(A|B) then there is no solution.

We have previously known that to obtain a unique solution for n unknowns we require n equations. Rank can be thought of as a way of showing that the n equations are genuinely different and we have sufficient information to solve the system uniquely.

When the rank of the coefficient and augmented matrix match but are less than n then we do not have enough information to uniquely determine a solution, but there is consistency.

However, when adding the extra column in the augmented matrix gives a higher rank than that of the original matrix, it has somehow added more information. Unfortunately that symbolises a clash with the information available in the original system.
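The three cases can be restated as a short rank test; the sketch below uses NumPy (assumed available) and simply encodes the summary rather than adding anything new.

```python
import numpy as np

def classify(A, b):
    """Classify a linear system A x = b by comparing r(A) with r(A|b)."""
    A = np.atleast_2d(A)
    aug = np.hstack([A, np.reshape(b, (-1, 1))])
    rA, rAug, n = np.linalg.matrix_rank(A), np.linalg.matrix_rank(aug), A.shape[1]
    if rA < rAug:
        return "no solution"
    return "unique solution" if rA == n else "infinitely many solutions"

print(classify([[3, 2], [2, -1]], [4, 5]))   # unique solution            (6.11.3)
print(classify([[3, 2], [6, 4]], [1, 2]))    # infinitely many solutions  (6.11.4)
print(classify([[3, 2], [6, 4]], [1, 1]))    # no solution                (6.11.5)
```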
6.11.7 Exercise
Use rank to check the systems of equations already analysed in 6.9.2, 6.9.3
and 6.9.4.
6.12 Eigenvalues and Eigenvectors

In various important problems in Engineering and Science, it is useful to find vectors (column vectors) that are unchanged in direction by a matrix.

For a square matrix A, there are certain vectors X such that

AX = λX

for a constant λ. The vectors X, which are not simply the trivial vectors composed of zeros, are known as eigenvectors. The constants λ are known as the eigenvalues.

For example, consider the matrix A, the vector X and the product AX below.

A = [  3  1 ] ;   X = [  1 ] ;   AX = [  2 ] = 2 [  1 ] = 2X
    [ −1  1 ]         [ −1 ]          [ −2 ]     [ −1 ]

One can see that this product is in exactly the same direction as the original vector, as it is just a multiple of the vector. Furthermore the eigenvector is known (X) and the eigenvalue is λ = 2. However, there may be other such vectors.

6.12.1 Finding Eigenvalues

To find eigenvalues, note that

AX = λX  ⇒  AX = λIX  ⇒  AX − λIX = 0
⇒  (A − λI)X = 0

and provided that X is not trivial this implies that

|A − λI| = 0

which is called the characteristic equation of A.
6.12.2 Example

Find the eigenvalues of

A = [  3  1 ]
    [ −1  1 ]

Solution

The characteristic equation is

| [  3  1 ]  −  [ λ  0 ] |  =  | 3 − λ    1   |  =  0
| [ −1  1 ]     [ 0  λ ] |     |  −1    1 − λ |

This determinant is routine to expand:

(3 − λ)(1 − λ) + 1 = 0  ⇒  λ² − 4λ + 4 = 0  ⇒  (λ − 2)(λ − 2) = 0

and therefore this matrix has only one eigenvalue, λ = 2.

6.12.3 Finding eigenvectors

Once the eigenvalues for the matrix A are found, insert each value in turn into the equation

AX = λX  ⇒  (A − λI)X = 0

and solve for X. Note that there is no unique solution for each X, since eigenvectors are exactly those vectors that turn into multiples of themselves upon transformation.

6.12.4 Example

Find the eigenvectors of

A = [  3  1 ]
    [ −1  1 ]

Solution

We know already that this matrix has a single eigenvalue of λ = 2. Therefore the corresponding eigenvector will be a solution of the equation

(A − 2I)X = 0  ⇒  [  1   1 ] [ x ] = [ 0 ]
                  [ −1  −1 ] [ y ]   [ 0 ]

for some x and y, that is

[  x + y ] = [ 0 ]
[ −x − y ]   [ 0 ]

You will note that in fact this is really only one equation in two unknowns. This is entirely to be expected, as noted above. Let x = t for some parameter t; then y = −t and the eigenvector for this eigenvalue is

[  t ]
[ −t ]

for all the possible real values of t. So in fact each eigenvector is really a class of vectors (that all lie in the same direction), with different magnitudes. Some people like to normalise the vector so that it has modulus one, which is a trivial extra step.
6.12.5 Example

Find the eigenvalues and eigenvectors of

[ 5  −6 ]
[ 1   0 ]

Solution

The characteristic equation will be given by

| 5 − λ  −6 | = 0   ⇒   (5 − λ)(−λ) + 6 = 0
|   1    −λ |
⇒   λ² − 5λ + 6 = 0   ⇒   (λ − 2)(λ − 3) = 0

and so the eigenvalues are 2 and 3 respectively.

We now find each eigenvector in turn, by substitution into the equation AX = λX.

For λ = 2 we obtain

[ 5  −6 ] [ x ] = 2 [ x ]   ⇒   5x − 6y = 2x;   x = 2y
[ 1   0 ] [ y ]     [ y ]

and a little investigation shows these are the same equation. A solution then would be (for example)

[ 2 ]
[ 1 ]

For λ = 3 we obtain

[ 5  −6 ] [ x ] = 3 [ x ]   ⇒   5x − 6y = 3x;   x = 3y
[ 1   0 ] [ y ]     [ y ]

and again a little investigation shows these are the same equation. A solution then would be (for example)

[ 3 ]
[ 1 ]

Therefore the eigenvalues are 2 and 3, with corresponding eigenvectors

[ 2 ]   and   [ 3 ]
[ 1 ]         [ 1 ]
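A numerical cross-check with NumPy (assumed available); note that eig returns unit-length eigenvectors, so its columns are scalar multiples of the [2, 1] and [3, 1] found above.

```python
import numpy as np

A = np.array([[5.0, -6.0],
              [1.0,  0.0]])
values, vectors = np.linalg.eig(A)
print(values)        # 3 and 2 (order is not guaranteed)
print(vectors)       # columns proportional to [3, 1] and [2, 1]

# The matrix from 6.12.2 to 6.12.4: a single repeated eigenvalue 2.
B = np.array([[3.0, 1.0],
              [-1.0, 1.0]])
print(np.linalg.eig(B)[0])   # approximately [2, 2]; small numerical error
                             # is possible for this repeated root
```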
6.12.6 Other orders

For larger matrices the problem is tackled in the same way. Note that for a 3 × 3 matrix the characteristic equation is likely to be a cubic equation, and so on, so there may be algebraic difficulties in solving it, but the principle remains the same.

Also, in our 2 × 2 case, when finding the eigenvectors we noted that we get two equations that are essentially the same. For larger matrices, such as a 3 × 3 problem, there may appear to be three equations and it is less likely one will obviously be a multiple of another, but it may be a combination of the other two. If the system is solved using the techniques discussed above then any redundant equation will soon be eliminated. The solution will always be parametric in nature.
6.13 Diagonalisation

Diagonal matrices are square matrices which have zeros in all their entries except for those on the leading diagonal.

An important problem in matrix theory is the process of producing a diagonal matrix D from a given square matrix A. This process is known as diagonalisation.

If A is an n × n square matrix, and has n independent eigenvectors, then the diagonal matrix D is given by

D = S⁻¹AS

where S, known as the modal matrix, is the matrix formed by using each eigenvector as a column in turn. The diagonal matrix D will be of the form

D = [ λ1   0   ...   0  ]
    [  0  λ2   ...   0  ]
    [  .   .    .    .  ]
    [  0   0   ...  λn  ]

where λ1, λ2, ..., λn are the eigenvalues of A.
6.13.1 Powers of diagonal matrices

It is extremely simple to take powers of a diagonal matrix. In fact, all one needs to do is take the powers of the entries on the leading diagonal.

6.13.2 Example

Find

[ 2  0 ] ³
[ 0  1 ]

Solution

We start by squaring:

[ 2  0 ] ²  =  [ 2  0 ] [ 2  0 ]  =  [ 4 + 0   0 + 0 ]  =  [ 4  0 ]
[ 0  1 ]       [ 0  1 ] [ 0  1 ]     [ 0 + 0   0 + 1 ]     [ 0  1 ]

Now finally we multiply once more:

[ 2  0 ] ³  =  [ 2  0 ] [ 4  0 ]  =  [ 8 + 0   0 + 0 ]  =  [ 8  0 ]
[ 0  1 ]       [ 0  1 ] [ 0  1 ]     [ 0 + 0   0 + 1 ]     [ 0  1 ]

Clearly we would have achieved the same result simply by cubing all the numbers on the leading diagonal.
6.13.3 Powers of other matrices

This can be used to find powers of other matrices. Suppose A is a square matrix that has diagonalisation

D = S⁻¹AS

then premultiply by S and postmultiply by S⁻¹ on both sides:

SD = SS⁻¹AS = AS  ⇒  SDS⁻¹ = ASS⁻¹ = A.

Then A² can be found in the following way:

A² = SDS⁻¹SDS⁻¹ = SD²S⁻¹

since the inner matrices cancel. It can be proved by induction that

Aⁿ = SDⁿS⁻¹.

6.13.4 Example

Calculate

[ 5  −6 ] ⁴
[ 1   0 ]

Solution

We have already solved the eigenvalue problem for this matrix in example 6.12.5 and so we can write that

A = SDS⁻¹

where

S = [ 2  3 ] ;   D = [ 2  0 ]
    [ 1  1 ]         [ 0  3 ]

It follows that

S⁻¹ = [ −1   3 ]
      [  1  −2 ]

Therefore

A⁴ = SD⁴S⁻¹ = [ 2  3 ] [ 16   0 ] [ −1   3 ]
              [ 1  1 ] [  0  81 ] [  1  −2 ]

            = [ 2  3 ] [ −16    48 ]
              [ 1  1 ] [  81  −162 ]

            = [ −32 + 243    96 − 486 ]
              [ −16 + 81     48 − 162 ]

            = [ 211  −390 ]
              [  65  −114 ]
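A check of this arithmetic with NumPy (assumed available), comparing the S D⁴ S⁻¹ route with direct matrix powering; the two agree up to floating-point rounding.

```python
import numpy as np

A = np.array([[5, -6],
              [1,  0]])
S = np.array([[2, 3],
              [1, 1]])
D = np.diag([2, 3])

via_diag = S @ np.linalg.matrix_power(D, 4) @ np.linalg.inv(S)
direct   = np.linalg.matrix_power(A, 4)
print(via_diag)   # [[ 211. -390.]  [  65. -114.]]
print(direct)     # [[ 211 -390]    [  65 -114]]
```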
Chapter 7

Graphs of Functions

We now look at the graphs of some simple functions that are of use to us.

7.1 Simple graph plotting

While differential calculus can make the job of plotting graphs much easier, it is possible to do some basic graph plotting using very simple techniques and observations.

The simplest way to plot a graph of y = f(x) is to insert a range of values of x, calculate the corresponding y values, and plot these on graph paper. The advantages of this method are that it is simple to understand and can even be easily undertaken by a computer. The disadvantages are that, for humans, it can be very tedious, and we have no way of knowing if interesting features are occurring between our chosen x values, or outside the range.

We can use simple observations to improve the situation in more complicated examples.

7.1.1 Example

Let us plot the graph of a simple quadratic function

y = x² + 2x − 3.

It is a simple exercise to produce a table of values of x ranging from −5 to 5 and calculating the value of y for each one. We only show a few values here; as an exercise you might include the even values of x.
x   −5   −3   −1    1    3    5
y   12    0   −4    0   12   32
We then attempt to draw the smoothest curve through these points and we obtain a graph like figure 7.1.

Figure 7.1: The graph of x² + 2x − 3
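A table of values is exactly the kind of tedious work a computer does well; a small sketch in plain Python (no plotting library assumed) is shown below.

```python
def f(x):
    return x**2 + 2*x - 3

for x in range(-5, 6):            # -5, -4, ..., 5
    print(f"{x:3d} {f(x):4d}")
# Plotting these points (for example with matplotlib) gives the parabola
# shown in figure 7.1.
```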
7.1.2 Example

We now turn our attention to an even simpler function,

y = 2x + 3.

We can plot this in the same way, but in fact this is a straight line. If you recognise this then two points are sufficient to plot, but three is perhaps prudent as it gives you a chance to spot an error if your three points do not lie along a straight line.

x   −5    0    5
y   −7    3   13

Plotting these points and lining them up produces a graph like that shown in figure 7.2.

Figure 7.2: The graph of 2x + 3

7.1.3 Example

Sometimes we have extremely simple functions such as y = 2. In this case, regardless of the value of x, the y-value will be 2, so this simply represents a straight line parallel to the x-axis at a height 2 units above it.
7.2 Important functions

There are a number of very important functions and relationships that appear throughout mathematics, science and engineering. We look briefly at the most important below.

7.2.1 Direct Proportion

y is said to be directly proportional to x if when we double x we double y, if we triple x we triple y, and so on. Mathematically this can be written as

y ∝ x.

Note that the symbol ∝ is not the same as the very similar Greek letter α. Mathematically, it is possible to replace the ∝ symbol by "= k ×" for some constant k. For example

y ∝ x  ⇒  y = kx

where k is known as the constant of proportionality. It follows that this is a straight line relationship, and a simple straight line that passes through the origin at that. This is the way to graphically identify quantities that are in direct proportion.

Example

For a constant current I, Ohm's law is an example of direct proportion, with the voltage drop V being directly proportional to the resistance R:

V = IR.

For a constant mass m, Newton's second law shows that the force F exerted on a particle is directly proportional to the acceleration a that is produced:

F = ma.
7.2.2 Inverse Proportion

y is said to be inversely proportional to x if, when we double x we halve y, when we triple x, y falls to a third of its value, and so on. We write

y ∝ 1/x  ⇒  y = k/x

where k is again our constant of proportionality. To help examine the graph of such a function we shall begin with the simple case where k = 1. Note that negative values of k will alter the argument that follows.

Observe that when x is very small (close to zero) and positive, then it fits into one many times and y will be very large and positive. Similarly when x is very large (far to the right) and positive, then it hardly fits into one at all and so y will be very small and positive. When x is negative this reasoning continues to hold, but y will also be negative. This yields a graph like that shown in figure 7.3.

Observe that different positive values of k will affect the precise numbers but little of the above reasoning, and the characteristic shape of the graph remains. If k is negative however the graph will appear in the top left and lower right quadrants.

Figure 7.3: The graph of 1/x

Examples

The pressure P produced when a force F acts on an area A is inversely proportional to that area, if the force is constant:

P = F/A.
7.2.3 Inverse Square Proportion

When y is said to be inversely proportional to the square of x, it is essentially a similar case to that above. Here the defining relationship is given by

y ∝ 1/x²  ⇒  y = k/x².

We can use very similar reasoning to that above to determine the shape of the graph. Let us again take k = 1 to perform some of that reasoning. Note that this time x² will always be positive, regardless of the sign of x itself. Therefore y will always be positive, so nothing will appear in the lower half of the graph. Additionally the graph will be perfectly symmetrical about the y axis. Finally you will note that, as x² is much smaller than x for very small x and much larger than x for large x, the shape of the curve will be more extreme. See figure 7.4.

Figure 7.4: The graph of 1/x²

Example

Some of the most important relationships in science obey this law.

Newton's law of universal gravitation states that the force F between two bodies of mass m1 and m2 respectively, at a distance r apart, is given by

F = G m1 m2 / r²

where G is the gravitational constant. If the two bodies in question stay the same mass then the whole top line is constant and this relationship becomes an inverse square law.

Coulomb's law states that the force F between two charges Q1 and Q2 at a distance r is given by

F = Q1 Q2 / (4π ε0 r²)

where ε0 is a constant known as the permittivity of free space. Once again, if the charges are constant then all the material at the front may be gathered into one large constant term.
7.2.4 Exponential Functions

We met exponential functions before (see 3.9.1) and looked at them in some detail, so we simply give an example of one such graph here. You will observe that the graph is always positive (never below the x axis) and, in the case of exponential growth, y = eˣ in this case (see figure 7.5), we have extremely rapid growth to the right and fall off to the left.

Figure 7.5: The graph of eˣ

Exponential decay is simply the mirror image of this graph, y = e⁻ˣ for example.

Examples

Exponential functions characterise processes like population growth and discharge from a capacitor.

7.2.5 Logarithmic Functions

Again, we have met logarithmic functions before (see 3.9.2) and we restrict ourselves to plotting some examples here.

The graph of y = ln x (see figure 7.6) is undefined for negative values of x because we cannot take the logarithm of a negative number (at least without complex numbers). The logs of very small numbers (less than 1) are negative and those of large numbers (greater than one) are positive. The rate at which the graph climbs falls very steeply as x becomes large. Recall that logs can be used to express very large scales.

Figure 7.6: The graph of ln x

No matter what the base of the logarithm, the graph will pass through the point (1, 0), because any base raised to the power of 0 will yield 1. A closeup of the graph around this point is shown in figure 7.7.

7.3 Transformations on graphs

Sometimes it is useful to observe the effects of certain operations on a graph. We shall consider some examples on the graph of sin x, which we reproduce below in figure 7.8.
Figure 7.7: Closeup of the graph of ln x

Figure 7.8: Graph of sin x

7.3.1 Addition or Subtraction

If we simply add or subtract an expression from the original expression, this adds or subtracts that value to every y value and therefore moves the entire graph up or down that many units without altering its shape.

See figure 7.9 for an example, y = sin x + 1.

7.3.2 Multiplication or Division

If we restrict ourselves to multiplying or dividing our expression by a positive number only, then the effect is to perform that operation on the original y value. If we multiply by 2, for example, this has the effect of stretching the graph vertically, centred about the x-axis.

See figure 7.10 for an example, y = 2 sin x.

If the number is negative this will also reflect the graph in the x-axis.

7.3.3 Adding to or Subtracting from x

If we replace every single occurrence of x with x + ε, this means that we reach every x value ε earlier than before, and that has the effect of moving the graph ε units to the left. Subtracting will move it to the right. In periodic graphs this value is known as the phase angle.

See figure 7.11 for an example, y = sin(x + 90°); you might note that this is identical to the graph of cos x, and so all that separates these two examples is a phase angle of 90°.

7.3.4 Multiplying or Dividing x

If we multiply x by a positive number we reach values more quickly than before, which has the effect of squeezing the graph horizontally. Dividing will slow the rate at which we reach values and will therefore stretch the graph horizontally. This is therefore affecting the frequency of the graph if it is periodic.

See figure 7.12 for an example, y = sin 2x.

If we multiply x by a negative number it will flip the graph around the y-axis.
Figure 7.9: Graph of sin x + 1

7.4 Even and Odd functions

Some graphs have very specific symmetry patterns.

7.4.1 Even functions

An even function f(x) has the property that f(−x) = f(x) for all values of x.

Essentially this means the height of the function is the same whether we insert x or −x, or in other words the graph has reflective symmetry in the y axis. The name "even" comes about because when such functions are expanded as powers of x they only contain even powers.

Examples of even functions are 1 = x⁰, x², xⁿ where n is even, cos x, etc.

Figure 7.10: Graph of 2 sin x

7.4.2 Odd functions

An odd function f(x) has the property that f(−x) = −f(x).

This means that the graph could be rotated through 180° around the origin and it would look the same. Such functions result in only odd powers of x when expanded as a power series.

Examples of odd functions are x, x³, xⁿ where n is odd, sin x, etc.

7.4.3 Combinations of functions

Many functions are neither odd nor even, and when such functions are combined the properties can be changed or lost.

Combination     Result
Odd ± Odd       Odd
Even ± Even     Even
Odd ± Even      Neither
Even ± Odd      Neither

Table 7.1: Adding and subtracting even and odd functions

Combination     Result
Odd × Odd       Even
Even × Even     Even
Odd × Even      Odd
Even × Odd      Odd

Table 7.2: Multiplying even and odd functions

Figure 7.11: Graph of sin(x + 90°)

7.4.4 Examples

1. Consider

f(x) = x², g(x) = x⁴

then clearly both functions are even.

f(x) + g(x) = x² + x⁴   (even)
f(x) × g(x) = x⁶        (even)

2. Consider

f(x) = x², g(x) = x³

then clearly f(x) is even and g(x) is odd.

f(x) + g(x) = x² + x³   (neither)
f(x) × g(x) = x⁵        (odd)

3. Consider

f(x) = x, g(x) = x³

then clearly both functions are odd.

f(x) + g(x) = x + x³    (odd)
f(x) × g(x) = x⁴        (even)

Figure 7.12: Closeup of graph of sin 2x
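These symmetry properties can be illustrated numerically; the sampling-based check below (plain Python) is only a heuristic illustration of the definitions, not a proof.

```python
def is_even(f, points=(0.5, 1.0, 2.3, 4.0)):
    return all(abs(f(-x) - f(x)) < 1e-12 for x in points)

def is_odd(f, points=(0.5, 1.0, 2.3, 4.0)):
    return all(abs(f(-x) + f(x)) < 1e-12 for x in points)

f = lambda x: x**2          # even
g = lambda x: x**3          # odd
h = lambda x: f(x) * g(x)   # even x odd -> odd, as in table 7.2

print(is_even(f), is_odd(g), is_odd(h))   # True True True
```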
Chapter 8

Coordinate geometry

We now look at some basic coordinate geometry, and we restrict ourselves to two dimensions in the discussions that follow.

8.1 Elementary concepts

For the following concepts we will consider two arbitrary points, A defined by (x1, y1) and B defined by (x2, y2). See figure 8.1 for an explanation of the geometry that follows.

Figure 8.1: Relationships between two points

Note that we have formed a right angled triangle ABC, and that the distances AC and BC can be calculated as

AC = x2 − x1;   BC = y2 − y1.

8.1.1 Distance between two points

We can see from Pythagoras' theorem (see 4.1.2) that

AB² = AC² + BC²  ⇒  AB² = (x2 − x1)² + (y2 − y1)²

Thus the distance will be given by

AB = √((x2 − x1)² + (y2 − y1)²)

8.1.2 Example

Find the distance between (2, 3) and (5, 7).

Solution

From the above reasoning (or even from a brief sketch like that above) we see that the distance will be given by

d = √((5 − 2)² + (7 − 3)²) = √(3² + 4²) = √25 = 5
8.1.3 Example

Find the distance between (1, 2) and (−3, −4).

Solution

Again, but being very careful with our signs, we obtain

d = √((−3 − 1)² + (−4 − 2)²) = √((−4)² + (−6)²) = √52 = 2√13 ≈ 7.211
8.1.4 Midpoint of two points

We can find the midpoint M of the line segment joining A and B by using the fact that triangles AMD and ABC are similar.

It can be shown that

M = ( (x1 + x2)/2 , (y1 + y2)/2 )

Note that this is simply the average x coordinate and the average y coordinate.

8.1.5 Example

Find the midpoint of the line segment joining (2, 3) and (5, 7).

Solution

From the above formula we obtain

M = ( (2 + 5)/2 , (3 + 7)/2 ) = (3.5, 5)

8.1.6 Example

Find the midpoint of the line segment joining (1, 2) and (−3, −4).

Solution

This time we obtain

M = ( (1 + (−3))/2 , (2 + (−4))/2 ) = (−1, −1)

8.1.7 Gradient

The gradient of a line is simply the amount it rises for every unit it travels to the right. It follows that a horizontal line has a gradient of zero, and a vertical line has an infinite, or undefined, gradient.

Using the diagram above we see that the gradient of the line segment from A to B is given by

m = (y2 − y1) / (x2 − x1)

8.1.8 Example

Find the gradient of the line segment joining (2, 3) and (5, 7).

Solution

Using the formula above, we obtain

m = (7 − 3) / (5 − 2) = 4/3

8.1.9 Example

Find the gradient of the line segment joining (1, 2) and (−3, −4).

Solution

Again, but being very careful with our signs, we obtain

m = (−4 − 2) / (−3 − 1) = −6/−4 = 3/2
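The three formulas above can be collected as small helper functions; a plain Python sketch follows.

```python
from math import hypot

def distance(a, b):
    return hypot(b[0] - a[0], b[1] - a[1])

def midpoint(a, b):
    return ((a[0] + b[0]) / 2, (a[1] + b[1]) / 2)

def gradient(a, b):
    return (b[1] - a[1]) / (b[0] - a[0])   # undefined for a vertical line

A, B = (2, 3), (5, 7)
print(distance(A, B), midpoint(A, B), gradient(A, B))
# 5.0 (3.5, 5.0) 1.3333333333333333
```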
8.2 Equation of a straight line

The equation of a straight line is given by

y = mx + c

where x and y are the two variables and m and c are constants. These constants are the gradient and the y-intercept respectively. The y-intercept is the y value at which the line crosses the y axis. This can easily be seen by setting

x = 0  ⇒  y = m·0 + c = c

so that the line passes through (0, c). It follows that a straight line of the form

y = mx

is a straight line of gradient m passing through the origin. We examined the problem of plotting straight lines in the previous chapter.

8.2.1 Meaning of the equation of a line

The equation of a line can be thought of as a statement that can be true or false. For example

y = 2x + 3

is certainly not true for the point (1, 3). When we insert this point we obtain

3 = 2(1) + 3  ⇒  3 = 5.

The line the equation defines consists precisely of those points for which the equation is correct. We can therefore conclude that the point (1, 3) does not lie on this line. By contrast, if we consider (1, 5) we see that

5 = 2(1) + 3  ⇒  5 = 5

which is clearly a true statement. Thus we know that (1, 5) lies on this particular line.

8.2.2 Finding the equation of a line

It is frequently the case that we wish to find the equation of a line. To do this we normally have either two points the line passes through, or the gradient of the line and a single point the line passes through. If using experimental data it is good practice to take points as distant from each other as possible, as the measuring error is then less significant in the calculations. We now look at these cases.

Given Two Points

If we have two points the line passes through, our method is as follows (see the sketch after example 8.2.4 below):

1. Find the gradient between these two points, m;
2. Insert this value into the equation y = mx + c;
3. We know a point that is on the line, so it makes the equation true; insert its x and y into the equation;
4. Rearrange this equation to obtain c (m and c are the constants that define the line, so once we have them we are finished; even though x and y change for different points on the line, m and c remain the same);
5. Rewrite the equation y = mx + c with m and c filled in.
Given a Gradient and One Point

This case enters the method at the second step in the list above, and is therefore somewhat simpler.

8.2.3 Example

Find the equation of the line joining (2, 3) and (5, 7).

Solution

We have already performed the first step in the example above by calculating the gradient as m = 4/3. So we know that

y = (4/3)x + c

for some value of c.

We now insert either of the two points we have above. Let us choose the first one:

3 = (4/3)(2) + c  ⇒  3 = 8/3 + c  ⇒  c = 3 − 8/3 = 1/3.

Therefore the equation of the line is

y = (4/3)x + 1/3.

8.2.4 Example

Find the equation of the line joining (1, 2) and (−3, −4).

Solution

Once again, we have previously determined the gradient between these two points; this time it is m = 3/2. Therefore our equation is

y = (3/2)x + c

for some value of c.

We insert a point on the line; again we pick the first one.

2 = (3/2)(1) + c  ⇒  2 = 3/2 + c  ⇒  c = 2 − 3/2 = 1/2

and so the equation of the line in this case is

y = (3/2)x + 1/2.
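The two-point method as a sketch (plain Python), reproducing both examples:

```python
def line_through(p, q):
    """Return (m, c) for the line y = m*x + c through points p and q."""
    (x1, y1), (x2, y2) = p, q
    m = (y2 - y1) / (x2 - x1)      # step 1 (fails for a vertical line)
    c = y1 - m * x1                # steps 2-4: insert a known point, solve for c
    return m, c

print(line_through((2, 3), (5, 7)))      # (1.333..., 0.333...)  i.e. y = (4/3)x + 1/3
print(line_through((1, 2), (-3, -4)))    # (1.5, 0.5)            i.e. y = (3/2)x + 1/2
```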
Chapter 9

Differential Calculus

Calculus is essentially the precise study of varying quantities. In a problem involving a constant, such as a car travelling at constant speed, it is easy to answer most questions about it, such as the distance travelled. However, if the speed varies, everything about the problem becomes a little harder.

It is a testament to the genius of Isaac Newton that he developed calculus as a stepping stone to his great work on the physics of gravity. Another mathematician, Leibniz, developed calculus independently.

9.1 Concept

If y is a function of x, then the derivative of y with respect to x, written

dy/dx,

is the rate of change of y with respect to x. Graphically this is the gradient of the function y.

The process of producing the derivative is known as differentiation.

9.2 Notation

If y = f(x) then we often use the shorthand f′(x) for the derivative. Thus

f′(x) = dy/dx

We use the notation

d/dx

to represent the phrase "the derivative with respect to x of what follows".

Frequently we shall need to differentiate repeatedly (that is, to differentiate the derivative) and we represent subsequent derivatives with the following notation.

Second derivative:

f″(x) = d²y/dx².

Third derivative:

f‴(x) = d³y/dx³.

In general the nth derivative will be

dⁿy/dxⁿ.

When it becomes impractical to represent the derivative of f(x) by adding ′ symbols we sometimes write the order of the derivative in roman numerals as a superscript:

f″″(x) = f^(IV)(x)

9.3 Rules & Techniques

It is possible to calculate the derivative for a variety of functions using a process known as differentiation from first principles. Fortunately, this has already been done for most major functions, and we may use the results. We proceed to list the results before looking at examples.

9.3.1 Power Rule

Most functions we deal with contain x or simple powers of x.

d/dx (xⁿ) = n xⁿ⁻¹.

Note that n must be a constant, that is, it must not depend on the value of x.
9.3.2 Addition and Subtraction

We can split differentiation over a + or − sign. If u and v are functions of x then

d/dx (u + v) = d/dx (u) + d/dx (v);   d/dx (u − v) = d/dx (u) − d/dx (v).

9.3.3 Constants upon functions

If a constant k is multiplied upon a function u we can differentiate ku by multiplying k by the derivative of u, that is

d/dx (ku) = k d/dx (u).

9.3.4 Chain Rule

If y is a function of u, and u is a function of x, then

dy/dx = dy/du × du/dx.

9.3.5 Product Rule

Unfortunately, we cannot split differentiation up over a multiplication. If y = u × v where u and v are functions then

d/dx (u × v) = v du/dx + u dv/dx.

9.3.6 Quotient Rule

If y = u/v where u and v are functions then

d/dx (u/v) = (v du/dx − u dv/dx) / v².

9.3.7 Trigonometric Rules

The derivatives of the two major trigonometric functions are given by

d/dx (sin x) = cos x;   d/dx (cos x) = −sin x.

It is possible to work out the derivative of the other functions using trigonometric identities and the other rules. Note that to use calculus and trigonometry together one must use radians; the results for angles in degrees are slightly different (and more complicated).

9.3.8 Exponential Rules

d/dx (eˣ) = eˣ.

9.3.9 Logarithmic Rules

d/dx (ln x) = 1/x.
9.4 Examples

Differentiate the following with respect to x.

1. (a) x²   (b) 4x³   (c) 2x   (d) 1/x   (e) √x   (f) 2/√x

2. (a) 2x² + x³ − 3x   (b) 3x(x − 2)   (c) (x + 1)/x   (d) (x − 1)(x + 2)

3. (a) (2x + 5)⁴   (b) (3x² + 4)⁵   (c) (2x + 3)¹⁰⁰   (d) 1/(3x − 2)⁵⁰
   (e) sin 3x   (f) cos² x   (g) e²ˣ   (h) cos³ 2x

4. (a) x sin x   (b) (3x² + 1) cos x   (c) (ln x)(sin x)   (d) (x² + 3x)eˣ

5. (a) x / sin x   (b) (cos x) / x   (c) e²ˣ / (3x² − 4)   (d) (x² − 3x + 6) / (x³ + 2x − 5)
9.4.1 Solutions

1. It is important to use everything we learned of the laws of indices (see 3.5) to see when we can use the power rule.

(a) d/dx (x²) = 2x¹ = 2x

(b) d/dx (4x³) = 3 × 4 × x² = 12x²

(c) d/dx (2x) = d/dx (2x¹) = 1 × 2 × x⁰ = 2, and in general d/dx (kx) = k

(d) d/dx (1/x) = d/dx (x⁻¹) = −1 × x⁻² = −1/x²

(e) d/dx (√x) = d/dx (x^(1/2)) = (1/2) x^(−1/2) = 1/(2√x)

(f) d/dx (2/√x) = d/dx (2x^(−1/2)) = −(1/2) × 2 × x^(−3/2) = −x^(−3/2) = −1/√(x³)

Note also that constants vanish when we differentiate them:

d/dx (k) = d/dx (kx⁰) = 0 × kx⁻¹ = 0

2. Although in some of these cases we could use exotic rules like the product and quotient rule, all of these examples can be differentiated using the power rule only, with a little algebra to get them into the right form; we also use the fact that we can work around addition and subtraction.

(a) Power rule, working between each addition and subtraction:

2 × 2x¹ + 3x² − 3 = 3x² + 4x − 3

(b) We could use the product rule, but it's overkill to do so; instead consider that

3x(x − 2) = 3x² − 6x  ⇒  d/dx (3x² − 6x) = 2 × 3x¹ − 6 = 6x − 6.

(c) We could use the quotient rule, but some algebra is better:

(x + 1)/x = 1 + 1/x = 1 + x⁻¹  ⇒  d/dx (1 + x⁻¹) = 0 + (−1)x⁻² = −1/x².

(d) We could use the product rule, but algebra is easier:

(x − 1)(x + 2) = x² + x − 2  ⇒  d/dx (x² + x − 2) = 2x + 1 − 0 = 2x + 1.

3. The key to chain rules is to see some block of symbols we wish was simply x. We let u be this block, and then write y in terms of u. Finally we plug things into the chain rule formula, like so.

(a) Chain rule, y = u⁴, u = 2x + 5, and dy/dx = dy/du × du/dx, so

dy/dx = 4u³ × (2) = 4(2x + 5)³ × 2 = 8(2x + 5)³.

(b) Chain rule, y = u⁵, u = 3x² + 4:

dy/dx = 5u⁴ × (6x + 0) = 30x(3x² + 4)⁴.

(c) Chain rule, y = u¹⁰⁰, u = 2x + 3:

dy/dx = 100u⁹⁹ × 2 = 200(2x + 3)⁹⁹.

(d) Chain rule, y = u⁻⁵⁰, u = 3x − 2:

dy/dx = −50u⁻⁵¹ × 3 = −150(3x − 2)⁻⁵¹.

(e) Chain rule, y = sin u, u = 3x:

dy/dx = cos u × 3 = 3 cos 3x.

(f) Chain rule, y = u², u = cos x:

dy/dx = 2u × (−sin x) = −2 sin x cos x = −sin 2x.

(g) Chain rule, y = eᵘ, u = 2x:

dy/dx = eᵘ × 2 = 2e²ˣ.

(h) Chain rule, y = u³, u = cos 2x:

dy/dx = 3u² × (−2 sin 2x) = −6 (sin 2x)(cos² 2x).

(We use the chain rule again to work out d/dx (cos 2x), similarly to part (e).)

4. (a) Product rule, u = x, v = sin x:

(sin x)(1) + (x)(cos x) = sin x + x cos x.

(b) Product rule, u = 3x² + 1, v = cos x:

(cos x)(6x) + (3x² + 1)(−sin x).

(c) Product rule, u = ln x, v = sin x:

(sin x)(1/x) + (ln x)(cos x).

(d) Product rule, u = x² + 3x, v = eˣ:

eˣ(2x + 3) + (x² + 3x)eˣ = eˣ(x² + 5x + 3).

5. (a) Quotient rule, u = x, v = sin x:

((sin x)(1) − x(cos x)) / sin² x = (sin x − x cos x) / sin² x.

(b) Quotient rule, u = cos x, v = x:

((x)(−sin x) − (cos x)(1)) / x² = (−x sin x − cos x) / x².

(c) Quotient rule, u = e²ˣ, v = 3x² − 4:

((3x² − 4)(2e²ˣ) − (e²ˣ)(6x)) / (3x² − 4)² = 2e²ˣ(3x² − 3x − 4) / (3x² − 4)².

(d) Quotient rule, u = x² − 3x + 6, v = x³ + 2x − 5:

((x³ + 2x − 5)(2x − 3) − (x² − 3x + 6)(3x² + 2)) / (x³ + 2x − 5)².
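These results can be spot-checked with SymPy (an assumed dependency); two of the trickier ones are shown below, with the caveat that SymPy may order the terms differently from the hand-worked forms.

```python
import sympy as sp

x = sp.symbols("x")

# 3(h): d/dx cos^3(2x)
print(sp.diff(sp.cos(2*x)**3, x))            # -6*sin(2*x)*cos(2*x)**2

# 5(c): d/dx e^(2x) / (3x^2 - 4), simplified
expr = sp.exp(2*x) / (3*x**2 - 4)
print(sp.simplify(sp.diff(expr, x)))         # 2*(3*x**2 - 3*x - 4)*exp(2*x)/(3*x**2 - 4)**2
```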
9.5 Tangents

A tangent is a line which just grazes a curve at a specific point. The gradient of the tangent and the curve are equal at that point.

Differentiation allows us to easily find tangents to curves at given points.

The procedure is to differentiate the formula for the curve, as the derivative is the gradient. Recall that the equation of a straight line is

y = mx + c

where m and c are constants (and m is the gradient).

So once we find the derivative we have calculated m. To finish we need only insert the coordinates of the point at which the tangent touches the curve in order to calculate c.

9.5.1 Example

Find the tangent to the curve

y = x² + 2x − 3

at x = 1.

Solution

When x = 1, y = 1² + 2(1) − 3 = 0, so our tangent touches the curve at the point (1, 0).

Now we need to find the gradient of the tangent, which is the gradient of the curve, which is the derivative:

dy/dx = 2x + 2,

and so when x = 1

dy/dx = m = 2(1) + 2 = 4.

Thus our line has the form

y = 4x + c

so all we need to do now is to find c, which we do by plugging in the coordinates of some point on the line, which is (1, 0); so let x = 1, y = 0 and we get

0 = 4(1) + c  ⇒  c = −4

and we therefore have our final line as

y = 4x − 4.

9.5.2 Example

Find the tangent to the curve

y = xeˣ

when x = 0.

Solution

When x = 0, y = 0·e⁰ = 0 × 1 = 0, so the tangent passes through the point (0, 0).

The gradient of the tangent is the gradient of the curve, and

dy/dx = eˣ·1 + x·eˣ = eˣ(1 + x)

so when x = 0

dy/dx = m = e⁰(1 + 0) = 1

so our line has the form

y = 1·x + c

and inserting the point we know is on the line, (0, 0), gives

0 = 1·0 + c  ⇒  c = 0

and thus our equation is

y = x
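The same procedure written as a short SymPy sketch (assumed dependency):

```python
import sympy as sp

x = sp.symbols("x")

def tangent_line(f, x0):
    """Return the tangent to y = f(x) at x = x0 as an expression in x."""
    m = sp.diff(f, x).subs(x, x0)       # gradient of the curve at x0
    c = f.subs(x, x0) - m * x0          # force the line through (x0, f(x0))
    return m * x + c

print(tangent_line(x**2 + 2*x - 3, 1))   # 4*x - 4
print(tangent_line(x * sp.exp(x), 0))    # x
```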
9.6 Turning Points

There are certain features of curves that particularly interest us. An example would be the "peaks" and "troughs" of a curve, or in more exact language, the maxima and minima. These, together with two other cases we shall see later, are known as turning points.

Looking at such points on a curve reveals an important fact: at just the point of the maximum or minimum, the curve is horizontal. In other words the gradient is zero at these points.

9.6.1 Types of turning point

There are four types of turning point, or points where the gradient is zero for an instant. These are shown in figure 9.1.

Figure 9.1: Types of turning point

Local maximum

A local maximum is a point on the curve such that if we were to imagine standing at that point, the curve would descend in both directions, at least for a little distance. This is case 1 in figure 9.1.

This local maximum may not be the same as the maximum point of the curve (there may be some other peak which is higher).

Local minimum

A local minimum is a point on the curve such that if we were to imagine standing at that point, the curve would ascend in both directions, at least for a little distance. This is case 2 in figure 9.1.

This local minimum may not be the same as the minimum point of the curve (there may be some other trough which is deeper).

Stationary points of inflection

A stationary point of inflection is a point on the curve such that if we were to imagine standing at that point, it would be flat at that exact point, and the curve would ascend in one direction and descend in the other, at least for a little distance.

There are two varieties of this turning point: one that goes down to the left and up to the right (this is case 3 in figure 9.1), and one which goes up to the left and down to the right (this is case 4 in figure 9.1).

9.6.2 Finding turning points

To find the turning points of a curve y = f(x) we first of all find the derivative, f′(x), and solve the equation

f′(x) = dy/dx = 0

for x. There may be no, one, or several values of x that solve this equation. For each value of x, place it into the formula for y = f(x) to find the corresponding y coordinate.

In doing this we have found all the turning points of the curve.

9.6.3 Classification of turning points

Once we have located our turning points we usually wish to know what type of turning point each one is. There are two ways of determining this.

Second derivative test

The first method is to differentiate the derivative again, to obtain the second derivative.

We plug the x value from each turning point into this function and, depending upon the sign of our answer, we can determine which type of turning point we have. This is shown in table 9.1.
f″(x)   Turning Point                     Comments
+       Local minimum                     Conclusive
−       Local maximum                     Conclusive
0       Often a stationary inflection     Inconclusive

Table 9.1: Second derivative test

Note that if the second derivative is zero the test is inconclusive: the point is frequently a stationary point of inflection, but we cannot tell which case from this test alone. We shall need to use another test in this case.
Direct analysis of the gradient

This method is usually more difficult to employ, but in some cases it can be easier, such as when f′(x) is a complex fraction and differentiating again would be very arduous. Alternatively, we may be forced to use this method when the second derivative test is inconclusive.

To use this method we examine the sign of the gradient f′(x) a little to the left and a little to the right of the turning point. It is important we stay close, however. The results we can conclude are shown in table 9.2.

Left   Right   Conclusion
+      −       Local maximum
−      +       Local minimum
+      +       Stationary inflection, bottom left to top right
−      −       Stationary inflection, top left to bottom right

Table 9.2: First derivative turning point classification
9.6.4 Example

Find and classify the turning points of the function

y = x³ + 3x² − 24x + 3.

Solution

Find the first and second derivatives:

dy/dx = 3x² + 6x − 24;   d²y/dx² = 6x + 6.

We find the turning points by solving

dy/dx = 0  ⇒  3x² + 6x − 24 = 0,

and this is simply a quadratic equation:

x² + 2x − 8 = 0  ⇒  (x + 4)(x − 2) = 0  ⇒  x = −4, x = 2

so there are two turning points. We find the matching y coordinate for each x value:

x = −4  ⇒  y = (−4)³ + 3(−4)² − 24(−4) + 3 = 83

and

x = 2  ⇒  y = 2³ + 3(2)² − 24(2) + 3 = −25.

Finally it simply remains to classify the points, using the second derivative test.

x = −4  ⇒  d²y/dx² = 6(−4) + 6 = −ve

which indicates a local maximum, and

x = 2  ⇒  d²y/dx² = 6(2) + 6 = +ve

which is a local minimum.

Thus we have (−4, 83), a local maximum, and (2, −25), a local minimum.
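A SymPy sketch (assumed dependency) that finds and classifies the same turning points:

```python
import sympy as sp

x = sp.symbols("x")
y = x**3 + 3*x**2 - 24*x + 3

dy = sp.diff(y, x)
d2y = sp.diff(y, x, 2)

for x0 in sp.solve(dy, x):              # the roots -4 and 2
    curvature = d2y.subs(x, x0)
    kind = "local max" if curvature < 0 else "local min" if curvature > 0 else "inconclusive"
    print(x0, y.subs(x, x0), kind)
# -4 83 local max
#  2 -25 local min
```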
9.6.5 Example

Find and classify the turning points of the function

y = x³.

Solution

Find the first and second derivatives:

dy/dx = 3x²;   d²y/dx² = 6x.

Now solve to find the turning points:

dy/dx = 0  ⇒  3x² = 0  ⇒  x = 0

so we have only one turning point, at x = 0. Now we classify it, trying the second derivative test first:

x = 0  ⇒  d²y/dx² = 6(0) = 0

which is inconclusive.

We have to fall back on using the first derivative. Try just left of the turning point, say when x = −1:

dy/dx = 3(−1)² = 3 = +ve

and just a bit to the right, say when x = 1:

dy/dx = 3(1)² = 3 = +ve

which gives a pattern of "up, across and up", which is a stationary point of inflection from bottom left to top right.

So we can see that we don't have to find exotic functions to find these inflection points.
9.6.6 Example

Find and classify the turning points of

y = ax² + bx + c.

Solution

This is, of course, the standard quadratic equation. Find the first and second derivatives, remembering that a, b and c are constants:

dy/dx = 2ax + b;   d²y/dx² = 2a.

So our turning point can be found when

dy/dx = 0  ⇒  2ax + b = 0  ⇒  x = −b/(2a)

so that we have a single turning point where x = −b/(2a). (Comparing this with the quadratic solution formula is rather interesting; it should become clear that the solution formula gives two solutions in general, each an equal distance away from the turning point, but on either side.)

To classify it, use the second derivative; but there is nowhere to plug in x in this expression. If a is positive then 2a is positive and so is the second derivative, and we have a local minimum. Similarly we have a local maximum if a is negative.

So we can tell at a glance which quadratics have a maximum and which have a minimum simply by this method.

9.7 Newton-Raphson

The Newton-Raphson method is a numerical method for refining approximate solutions of equations. By the word "numerical" in this context we mean that
it will not (usually) find the exact solution to the equation, but one which is close enough for our purposes.

Suppose that we have an approximate solution of the equation

f(x) = 0,

say x = a1. Then we can find a better approximation to the solution, x = a2, and in general improve the approximation an to an+1, from the following formula:

an+1 = an − f(an) / f′(an).

This process is repeated until we gain our desired accuracy.

The process is shown graphically in figure 9.2: essentially the technique traces a line from the approximate solution up to the curve, then it rides down a tangent from that point back to the x-axis. This is our next approximation, and we continue this process until we achieve the desired accuracy.

9.7.1 Example

Numerically solve the equation

x² − 3x + 1 = 0

accurate to four decimal places.

Solution

Of course we could solve this using the quadratic solution formula, and normally would, but we use it as a simple example of the method, and of an example with two solutions.

A quick sketch graph shows two solutions, one close to x = 0 and one close to x = 2.5. We now need to work out our iteration formula.

f(x) = x² − 3x + 1  ⇒  f′(x) = 2x − 3

and so

an+1 = an − f(an)/f′(an)  ⇒  an+1 = an − (an² − 3an + 1) / (2an − 3).

We shall do one iteration of one solution explicitly. Let a1 = 0; then

a2 = 0 − (0 − 3(0) + 1) / (2(0) − 3) = 0 − (−0.3333) = 0.3333

and continuing, placing this value back into the formula and so on, we obtain:

a3 = 0.3333 + 0.0476 = 0.3809
a4 = 0.3809 + 1.0131 × 10⁻³ = 0.3820
a5 = 0.3820 + 4.5907 × 10⁻⁷ = 0.3820.

The solution has stopped changing to our desired precision, so we have x = 0.3820.

Using the same formula, but starting with the approximate value for the other solution, a1 = 2.5, yields:

a2 = 2.5 + 0.125 = 2.6250
a3 = 2.6250 − 6.9444 × 10⁻³ = 2.6181
a4 = 2.6181 − 2.1567 × 10⁻⁵ = 2.6180
a5 = 2.6180 − 2.0800 × 10⁻¹⁰ = 2.6180.

The solution has stopped changing to our desired precision, so we have x = 2.6180.
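The iteration written as a short function (plain Python), reproducing both roots:

```python
def newton(f, df, a, tol=1e-10, max_iter=50):
    """Refine an approximate root a of f(x) = 0 using Newton-Raphson."""
    for _ in range(max_iter):
        step = f(a) / df(a)
        a -= step
        if abs(step) < tol:          # stopped changing to our precision
            return a
    raise RuntimeError("did not converge")

f  = lambda x: x**2 - 3*x + 1
df = lambda x: 2*x - 3

print(round(newton(f, df, 0.0), 4))    # 0.382
print(round(newton(f, df, 2.5), 4))    # 2.618
```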
9.7.2 Example

Numerically solve the equation

ln x = x − 10

accurate to four decimal places.

Solution

This is not of the form f(x) = 0 which we require, but we can see, by roughly plotting the graphs of y = ln x and y = x − 10, that a solution occurs near x = 12. Furthermore, we can rearrange to obtain

ln x − x + 10 = 0

and so we now have a suitable f(x):

f(x) = ln x − x + 10  ⇒  f′(x) = 1/x − 1.

Thus our iteration formula is

an+1 = an − f(an)/f′(an)

which will be

an+1 = an − (ln(an) − an + 10) / (1/an − 1).

We shall only do the first iteration explicitly:

a2 = 12 − (ln 12 − 12 + 10) / ((1/12) − 1) = 12 + 0.5290 = 12.5290.

Continuing in the same way, feeding this new value back in, we obtain

a3 = 12.5290 − 1.0259 × 10⁻³ = 12.5280
a4 = 12.5280 − 3.6431 × 10⁻⁹ = 12.5280

which is no longer changing at the required precision. So we expect the solution to be

x = 12.5280
9.8 Partial Differentiation

When we have more than one variable, for example if some quantity z depends on both x and y, then we must use partial differentiation.

The partial derivative of z with respect to x, denoted

∂z/∂x,

means the derivative of z with respect to x, treated as though any variables other than x are constant.

In an analogy with our previous notation, we accept the convention that

∂/∂x

means the partial derivative, with respect to x, of what follows.

It is possible to partially differentiate z with respect to x and then y, which is denoted by

∂/∂y ( ∂/∂x (z) ) = ∂/∂y ( ∂z/∂x ) = ∂²z/∂y∂x.

Note that this is usually the same as the partial derivative of z with respect to y and then x, denoted

∂²z/∂x∂y,

but not always, so the order on the bottom line of the notation is relevant. Note that the derivatives take place in the reverse order from their listing on the bottom line.

Of course, if we partially differentiate z with respect to x twice, we get the more familiar notation

∂²z/∂x².

9.8.1 Example

Given

z = x²y + 3x³y²

find

∂z/∂x,  ∂z/∂y,  ∂²z/∂x∂y,  ∂²z/∂y∂x,  ∂²z/∂x²,  ∂²z/∂y².

Solution

First find ∂z/∂x. To do this we differentiate the formula for z with respect to x, treating every y as a constant:

∂z/∂x = 2xy + 9x²y².

To find ∂z/∂y we differentiate with respect to y, treating occurrences of x as constants:

∂z/∂y = x² + 6x³y.

Now we can proceed with the other derivatives more easily.

∂²z/∂x∂y = ∂/∂x ( ∂z/∂y ) = 2x + 18x²y;
∂²z/∂y∂x = ∂/∂y ( ∂z/∂x ) = 2x + 18x²y.

Note that these two derivatives are the same; this is almost always, but not always, the case.

∂²z/∂x² = ∂/∂x ( ∂z/∂x ) = 2y + 18xy²;
∂²z/∂y² = ∂/∂y ( ∂z/∂y ) = 6x³.
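A check of this example with SymPy (assumed dependency; the printed term order may differ):

```python
import sympy as sp

x, y = sp.symbols("x y")
z = x**2 * y + 3 * x**3 * y**2

print(sp.diff(z, x))        # 2*x*y + 9*x**2*y**2
print(sp.diff(z, y))        # x**2 + 6*x**3*y
print(sp.diff(z, x, y))     # 2*x + 18*x**2*y   (mixed derivative)
print(sp.diff(z, y, 2))     # 6*x**3
```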
9.9 Small Changes

Another application of differential calculus is that of estimating small changes.

Consider our relationship y = f(x) as usual. A small change in x, say by adding δx, will give rise to a small change in y, say δy. (Note that δx does not signify two numbers δ and x multiplied together; in this context the small Greek delta represents a small increase in the variable x. This is common throughout the literature.) It may be shown that

δy/δx ≈ dy/dx.

It therefore follows that

δy ≈ (dy/dx) δx.

Thus, given our increase δx in x, we can calculate the corresponding increase in y, given by δy.

We can do the same thing with partial differentiation if more than two variables are concerned. In general then

δz ≈ (∂z/∂x) δx + (∂z/∂y) δy.

9.9.1 Example

Consider a circle of radius 5. Without direct calculation, find the approximate area of the circle if the radius is increased to 5.1.

Solution

We know that the area of a circle is given by A = πr², and that therefore

dA/dr = 2πr

and we know that

δA ≈ (dA/dr) δr.

So, when r = 5, A = 25π and dA/dr = 10π. Thus

δA ≈ 10π × 0.1 = π

so that we estimate our new area to be

A + δA ≈ 26π ≈ 81.681.

In fact the directly calculated answer is 81.713 to three decimal places.
9.9.2 Example

A voltage of 12 V is placed across a resistor of value 10 Ω. Without direct calculation, find the change in the current through the resistor if the resistance is changed to 10.2 Ω.

Voltage V, current I and resistance R are governed by Ohm's law,

V = IR.

Solution

We need to calculate I, so we first rearrange by dividing by R on both sides to obtain

V/R = I.

Our initial current is thus 12/10 = 1.2 A.

Here we have three variables, but we shouldn't be alarmed about this because the voltage is constant, and we want the change in current with respect to resistance, so we may use the relationship

δI/δR ≈ ∂I/∂R  ⇒  δI ≈ (∂I/∂R) δR.

Now

I = V R⁻¹  ⇒  ∂I/∂R = −V R⁻² = −V/R²

where we have first rewritten I to make it a little easier to differentiate (power rule rather than the quotient rule), and we remember that V is treated as a constant in the partial derivative.

Now, plugging in our values, we obtain

δI ≈ −(12/10²) δR  ⇒  δI ≈ −0.12 × 0.2 = −0.024.

So the change results in a downward movement of the current of 0.024 A (or in other words, we estimate the new current to be 1.176 A).

Direct calculation shows the answer in this case to be 1.176 A (to three decimal places).
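The estimate against the exact value, as a quick numeric sketch (plain Python):

```python
V, R, dR = 12.0, 10.0, 0.2

I        = V / R                    # 1.2 A
dI_dR    = -V / R**2                # partial derivative, V held constant
estimate = I + dI_dR * dR           # small-change estimate
exact    = V / (R + dR)             # direct calculation

print(round(estimate, 4), round(exact, 4))   # 1.176 1.1765
```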
Figure 9.2: The Newton-Raphson method
Chapter 10

Integral Calculus

10.1 Concept

We define integration as the reverse process of differentiation. That is, the integral with respect to x of a function f(x), denoted

∫ f(x) dx,

is F(x) if

d/dx (F(x)) = f(x).

Note that the dx in the integral is not something to be integrated; rather it is the end of the notation and denotes something like "with respect to x" in this context.

10.1.1 Constant of Integration

Consider that

d/dx (5x − 10) = d/dx (5x) = d/dx (5x + 1000) = 5.

If we consider the problem of finding the integral of 5 with respect to x, we need something that differentiates to be 5, so any of these functions would do.

When we differentiate, any constant present is lost, and therefore when we integrate we assume that a constant may have been present. This is called the constant of integration, and is usually denoted by c. Thus

∫ 5 dx = 5x + c.

10.2 Rules & Techniques

Differentiation is a relatively simple procedure; with practice we can differentiate almost any function. Integration is, in general, much harder and many of our favourite rules have no equivalent, so we must use a variety of less perfect techniques. As before we look at some rules and techniques before introducing examples.

10.2.1 Power Rule

Integrating powers of x is almost as simple as differentiating:

∫ xⁿ dx = xⁿ⁺¹ / (n + 1) + c   (n ≠ −1)

So we raise the power by one and divide by the new power. Note that this rule breaks down in one case, when n = −1, which corresponds to the integral of 1/x; at this value we wind up with division by zero in the above formula. We need something that differentiates to this function, and fortunately we met something like this in differentiation:

∫ x⁻¹ dx = ∫ (1/x) dx = ln x + c

so that all powers of x are covered.

10.2.2 Addition & Subtraction

Suppose that u and v are functions of x. Then

∫ (u + v) dx = ∫ u dx + ∫ v dx

and

∫ (u − v) dx = ∫ u dx − ∫ v dx.

In other words, we can split integration up over addition and subtraction, just as we could do with differentiation.
10.2.3 Multiplication by a constant
Suppose that u is a function of x and k is a constant. Then
_
kudx = k
_
udx.
In other words, we can take constants outside integrals, just as we could
do with dierentiation.
10.2.4 Substitution
We have no analog of the chain rule in integration, but the closest thing is
the idea of substitution.
Once again, we look for an ackward expression that would be easier if
it were x, and we let this be u. We then dierentiate u with respect to x,
and imagine that we can split the du from the dx. We rearrange for dx (and
sometimes other bits of x terms and replace the dx with du. We then perform
the u based integral.
10.2.5 Limited Chain Rule
When we looked at the chain rule in differentiation (see 9.3.4) we saw that there was a quick way to think about it. We can differentiate the object u as if it were x, provided we multiply by du/dx. For example, we have the basic rule
d/dx (sin x) = cos x,
but the chain rule can extend this to
d/dx (sin u) = cos u · du/dx.
This is very quick to operate, and the laborious process of substitution in integration does not easily lend itself to this speed.
Fortunately, if our u = ax + b, where a and b are constants (b may be zero), we can take a shortcut. We integrate u just as we would x, and then divide by a to compensate. So for example, we have the basic integration rule
∫ (1/x) dx = ln x + c.
We cannot extend this to anything like the extent we could with the chain rule in differentiation, but if the thing replacing x is simple, namely ax + b, we can. For example:
∫ 1/(ax + b) dx = (1/a) ln(ax + b) + c,
and we can extend every integration result in this way, provided we only replace x by ax + b.
We shall see later that
∫ e^x dx = e^x + c
and so
∫ e^(ax+b) dx = (1/a) e^(ax+b) + c,
and so on.
10.2.6 Logarithm rule
Again, appealing to the differentiation chain rule (see 9.3.4), we can extend the basic result
d/dx (ln x) = 1/x
to
d/dx (ln u) = (1/u) · du/dx = u′/u
where u is some function of x. It follows that
∫ f′(x)/f(x) dx = ln f(x) + c.
That is, if we need to integrate a function where the numerator is the derivative of the denominator, we simply state it to be the natural logarithm of the bottom line (plus the usual constant, of course).
10.2.7 Partial Fractions
Because of the lack of solid rules in integration, we must rely on our ingenuity in algebra far more than was the case in differentiation. One technique that is useful for dealing with fractional expressions is called partial fractions, in which we try to split a complex fraction into several smaller, simpler ones.
The first step of the method is to attempt to factorize the bottom line as completely as possible; the exact action is decided partly by how well the bottom factors, and partly by the degree of the top line of the expression. We shall assume that the degree¹ of the top line is always less than the degree of the bottom line. If this is not the case we must begin with long division in algebra, which is beyond the scope of this module.
¹ The degree of an expression is the index of the highest power of x; thus a quadratic is of degree 2.
We consider several cases, and in each case we give a suitable right hand side which we hope to simplify our expression to.
Denominator factors completely into linear factors
In this case the bottom line of the expression can be decomposed entirely into linear factors, that is, factors of the form (ax + b), like (2x − 3) or 3x. In this case we shall use the following target expression:
(gx + h) / ((ax + b)(cx + d)) = A/(ax + b) + B/(cx + d)
So the expression on the right hand side is formed entirely of numbers over each factor. The numbers are represented by the capitals A and B and are to be determined.
Denominator factors with repeated linear factors
This is similar to the previous case in that the bottom line decomposes completely into linear factors, but this time one of those factors is repeated. For example
(gx + h) / ((ax + b)(cx + d)^2).
In this case it is not correct to use the expected expansion
(gx + h) / ((ax + b)(cx + d)^2) = A/(ax + b) + B/(cx + d) + C/(cx + d)
as the bottom lines on the right will not give the correct denominator when a common denominator is taken. Instead we list a constant over the factor, another constant over the factor squared, and so on until we reach the power of the factor we began with. In this case the factor was squared, so the correct treatment is:
(gx + h) / ((ax + b)(cx + d)^2) = A/(ax + b) + B/(cx + d) + C/(cx + d)^2.
Note the square on the bottom of the last bracket.
Denominator factors incompletely
Sometimes one factor on the bottom line will not factorize completely. For example, if we get a factor of (x^2 + 1) then this cannot be factorized into linear form (as the quadratic has no real solutions). We shall only consider the case where one factor is a quadratic that cannot be factorized further:
(gx + h) / ((ax + b)(cx^2 + dx + e)) = A/(ax + b) + (Bx + C)/(cx^2 + dx + e)
Method
Once the correct form for simplification has been established as discussed above, we need to determine the constants A, B, . . . .
Note that the left hand side and the right hand side are identically equal; that is to say, they match for all values of x. One way of establishing the values of A, B, . . . is to insert values of x repeatedly. Each of these will produce an equation, so if we have constants A, B and C only then we will need three equations. In practice, however, we can pick strategic values of x to make this easier.
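For checking purposes only (this is not part of the hand method above), a computer algebra system can perform the partial fraction split directly; the SymPy sketch below uses the fraction from the worked example that appears later in 10.3.3.

import sympy as sp

x = sp.symbols('x')
expr = (4*x + 5) / ((2*x - 1)*(x + 2))
print(sp.apart(expr, x))       # 14/(5*(2*x - 1)) + 3/(5*(x + 2)), the split found by hand in 10.3.3
print(sp.integrate(expr, x))   # a sum of logarithms, matching the worked answer up to a constant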
10.2.8 Integration by Parts
The integration by parts formula is the closest thing that we have to a product rule for integration, and it is nothing like as powerful.
If u and v are functions of x then
∫ v (du/dx) dx = uv − ∫ u (dv/dx) dx
Proof
We include a simple proof for interest and completeness; it is not examinable. The result comes from the differentiation product rule:
d/dx (uv) = v du/dx + u dv/dx.
Integrating both sides with respect to x gives
uv = ∫ v (du/dx) dx + ∫ u (dv/dx) dx,
with the derivative and integral cancelling on the LHS. We now rearrange this into the parts formula by, for example, subtracting ∫ u (dv/dx) dx from both sides.
Observations
Note that to use this as a product rule we need to pick one part of the product to be v and the other to be du/dx. We will also need to find dv/dx and u. It follows that we must pick one part we can differentiate (v) and one part we can integrate (du/dx).
The RHS of the formula consists of two terms; the term uv is of no concern, it is already integrated, but the other term can be a problem. If we cannot work out the integral on the RHS this method is no use. We must make this integral easier than our first one, or at least no worse.
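Again purely as a check, not a replacement for the method: differentiating a candidate answer confirms a parts integration. The SymPy sketch below assumes the integral ∫ x sin x dx, which is worked by hand in 10.3.5.

import sympy as sp

x = sp.symbols('x')
candidate = -x*sp.cos(x) + sp.sin(x)                     # proposed result of the parts integration
print(sp.simplify(sp.diff(candidate, x) - x*sp.sin(x)))  # 0, so the candidate differentiates back to x*sin(x)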
10.2.9 Other rules
There are some other useful rules in integration, which are as follows.
Trigonometric functions
∫ sin x dx = −cos x + c
∫ cos x dx = sin x + c
∫ tan x dx = ln |sec x| + c
∫ cot x dx = ln |sin x| + c
∫ sec x dx = ln |sec x + tan x| + c
∫ csc x dx = ln |tan(x/2)| + c
Exponential functions
∫ e^x dx = e^x + c
Miscellaneous functions
∫ 1/(x^2 + a^2) dx = (1/a) tan⁻¹(x/a) + c
∫ 1/(x^2 − a^2) dx = (1/(2a)) ln |(x − a)/(x + a)| + c
∫ 1/√(a^2 − x^2) dx = sin⁻¹(x/a) + c
10.3 Examples
Here are some examples of various integrations.
10.3.1 Examples
Find the following integrals:
1. ∫ (x^3 + 4x^2 + 6x − 5) dx
2. ∫ √x dx
3. ∫ (x^2 + 1)/x dx
Solutions
These are all plain power-rule integrations.
1. = x^4/4 + 4x^3/3 + 3x^2 − 5x + c.
2. = ∫ x^(1/2) dx = (2/3) x^(3/2) + c.
3. = ∫ (x + 1/x) dx = x^2/2 + ln x + c.
10.3.2 Examples
Find the following integrals:
1. ∫ sin 3x dx
2. ∫ (3x − 2)^10 dx
3. ∫ x √(x^2 − 3) dx
4. ∫ x/(x + 1) dx
Solutions
These can all be tackled with substitution (see 10.2.4).
1. Let
u = 3x, du/dx = 3, so dx = du/3.
So we can replace our 3x by u to obtain
= ∫ sin u dx,
but this isn't enough; although the function is now simple to integrate, it is simple to integrate with respect to u, and not x. We also need to change the dx, so we use the formula we obtained from rearranging du/dx:
= (1/3) ∫ sin u du = −(1/3) cos u + c,
but it is not appropriate to leave u in the answer (just as in the differentiation chain rule questions), so we substitute it back in:
= −(1/3) cos 3x + c.
This question is also (more quickly) solvable using the limited chain rule (see 10.2.5), which is left as an exercise.
2. Let
u = 3x − 2, du/dx = 3, so dx = du/3.
So we can replace occurrences of 3x − 2 and dx to obtain
= (1/3) ∫ u^10 du = (1/3) · (1/11) u^11 + c.
Plug the expression for u back in to obtain
= (1/33) (3x − 2)^11 + c.
This question is also (more quickly) solvable using the limited chain rule (see 10.2.5), which is left as an exercise.
3. Let
u = x^2 − 3, du/dx = 2x, so du/2 = x dx.
Note that we try to place x terms with dx, and constants and any u terms (there aren't any here) with du. So we obtain
= ∫ x √u dx,
but note that we know x dx = du/2, so
= (1/2) ∫ √u du = (1/2) ∫ u^(1/2) du
= (1/2) · (2/3) u^(3/2) + c,
and now plug u back in to obtain
= (1/3) (x^2 − 3)^(3/2) + c.
4. Let
u = x + 1, du/dx = 1, so du = dx,
so plugging this in we obtain
= ∫ (x/u) dx,
and we know that dx = du so we get
= ∫ (x/u) du.
However there is still an x present; it was not soaked up in the change of differential as happened in the previous question. So we rearrange the substitution:
u = x + 1, so x = u − 1,
so we can replace x with u − 1 and we obtain
= ∫ (u − 1)/u du = ∫ (1 − 1/u) du
= u − ln u + c,
and now plugging in u again
= x + 1 − ln(x + 1) + c.
10.3.3 Example
Integrate the following:
∫ (4x + 5) / ((2x − 1)(x + 2)) dx
Solution
This is a partial fraction problem (see 10.2.7).
Before any integration takes place at all, we first perform the algebra of simplifying the expression in the integral. From partial fractions we have
(4x + 5) / ((2x − 1)(x + 2)) = A/(2x − 1) + B/(x + 2).
Now we multiply up by the denominator of the LHS, (2x − 1)(x + 2), on both sides, which removes the fractions from the problem:
4x + 5 = A(x + 2) + B(2x − 1),
and we now have to find A and B. This expression is an identity, that is to say, it is true for all values of x. We could insert a couple of values of x to get simultaneous equations, but we can make life very easy by picking carefully, so that one bracket vanishes (to zero):
x = −2  ⇒  −3 = −5B  ⇒  B = 3/5,
and similarly
x = 1/2  ⇒  7 = (5/2) A  ⇒  A = 14/5.
Thus we have shown
(4x + 5) / ((2x − 1)(x + 2)) = (14/5) · 1/(2x − 1) + (3/5) · 1/(x + 2)
and the form on the RHS is easy to integrate:
∫ (4x + 5) / ((2x − 1)(x + 2)) dx = (14/5) ∫ dx/(2x − 1) + (3/5) ∫ dx/(x + 2)
= (7/5) ln(2x − 1) + (3/5) ln(x + 2) + c,
where these integrals may be done by the limited chain rule given above (see 10.2.5) or full-blown substitution (see 10.2.4), as you prefer.
10.3.4 Example
Find the following integrals.
1. ∫ (4x^3 − 6x + 6) / (x^4 − 3x^2 + 6x − 4) dx
2. ∫ x / (x^2 − 3) dx
Solutions
These are both solvable using the logarithm rule (see 10.2.6).
1. Straight away we observe that the numerator is the derivative of the denominator. So
= ln(x^4 − 3x^2 + 6x − 4) + c.
2. The numerator is not quite the derivative of the denominator, but we can tweak things:
= (1/2) ∫ 2x/(x^2 − 3) dx = (1/2) ln(x^2 − 3) + c.
10.3.5 Examples
Find the following integrals:
1. ∫ x sin x dx;
2. ∫ x e^(2x) dx.
Solutions
These are classic parts integrations (see 10.2.8).
1. Pick the part to differentiate,
v = x, dv/dx = 1,
and the part to integrate,
du/dx = sin x, u = −cos x,
and plug into the parts formula:
∫ x sin x dx = (x)(−cos x) + ∫ (cos x)(1) dx,
and the integral on the right is certainly easier, routine in fact:
∫ x sin x dx = −x cos x + sin x + c.
2. Pick the part to differentiate,
v = x, dv/dx = 1,
and the part to integrate,
du/dx = e^(2x), u = (1/2) e^(2x),
and plug into the parts formula:
∫ x e^(2x) dx = (x)((1/2) e^(2x)) − ∫ ((1/2) e^(2x))(1) dx,
and the integral on the right is certainly easier, but may require substitution (or the limited chain rule):
∫ x e^(2x) dx = (1/2) x e^(2x) − (1/4) e^(2x) + c.
10.4 Definite Integration
We define the definite integral with respect to x from a to b of f(x), denoted
∫_a^b f(x) dx,
to be
F(b) − F(a)
where
F(x) = ∫ f(x) dx.
This last integral, of the type we have already met, is known as an indefinite integral, and is generally speaking a function of x.
The definite integral will be a number, formed by inserting b and then a into the indefinite integral and subtracting.
We call a and b the limits of integration.
Note that because the definite integral is a number, independent of x, the variable x is sometimes called a dummy variable: it vanishes from the result, and we might just as easily have used t or some other name with no consequence.
10.4.1 Notation
If
∫ f(x) dx = F(x) + c
we normally write
∫_a^b f(x) dx = [F(x)]_a^b,
which signifies that the expression has been integrated, but the limits have not yet been inserted.
10.4.2 Concept
The definite integral usually has some significance, usually that of summing up strips of area.
10.4.3 Areas
The area enclosed by the curve y = f(x) and above the x-axis from x = a to x = b is given by
A = ∫_a^b y dx
Note that integration treats areas below the x-axis as negative.
10.4.4 Example
Find the area enclosed by y = 3x^2, the x-axis and the lines x = 2 and x = 4.
Solution
The required area is
∫_2^4 3x^2 dx = [x^3]_2^4 = (4^3) − (2^3) = 56.
10.4.5 Example
Find the area enclosed by y = 2x^2 − 3x − 2 and the x-axis.
Solution
We are not given limits here, but this is a quadratic, so we can find where the graph meets the x-axis and hence obtain our limits.
Solve
2x^2 − 3x − 2 = 0  ⇒  (2x + 1)(x − 2) = 0  ⇒  x = −1/2, x = 2.
So these are our limits. The area we require is given by
∫_{−1/2}^{2} (2x^2 − 3x − 2) dx
= [ (2/3)x^3 − (3/2)x^2 − 2x ]_{−1/2}^{2}
= ( (2/3)(2)^3 − (3/2)(2)^2 − 2(2) ) − ( (2/3)(−1/2)^3 − (3/2)(−1/2)^2 − 2(−1/2) )
= −5.2083;
as this area lies below the axis, we can disregard the minus sign.
10.4.6 Volumes of Revolution
The volume formed when the curve y = f(x) is rotated 360° around the x-axis, from x = a to x = b, is given by
V = π ∫_a^b y^2 dx
10.4.7 Example
Find the volume formed when the curve y = e^x is revolved around the x-axis between x = 1 and x = 3.
Solution
First of all we must find y^2:
y^2 = (e^x)^2 = e^(2x),
and therefore the volume will be
V = π ∫_1^3 y^2 dx = π ∫_1^3 e^(2x) dx = π [ (1/2) e^(2x) ]_1^3 = (π/2) (e^6 − e^2).
10.4.8 Mean Values
We sometimes wish to find the mean value of a function y = f(x) in some range, from x = a to x = b. This can be done by finding the area enclosed and finding what constant value would enclose the same area. This results in the following formula for the mean value M:
M = 1/(b − a) ∫_a^b y dx
10.4.9 Example
Find the mean value of the function
y = 2x + 3
between x = 2 and x = 4.
Solution
M = 1/(4 − 2) ∫_2^4 (2x + 3) dx
= (1/2) [ x^2 + 3x ]_2^4
= (1/2) ( (4^2 + 3(4)) − (2^2 + 3(2)) ) = 9
10.4.10 Example
Find the mean value of the function
y = A sin x
between x = 0 and x = 2π, where A is a constant.
Solution
Recall that we use radians in calculus, so we are integrating over 360° here.
M = 1/(2π − 0) ∫_0^{2π} A sin x dx
= (A/2π) [−cos x]_0^{2π}
= (A/2π) ( (−cos 2π) − (−cos 0) )
= (A/2π) (−1 + 1) = 0
Oddly, at first sight, the mean value of this function is zero. This is because the sin graph spends an equal amount of time above and below the axis over this range, and the two areas cancel out. This is not really an acceptable way of dealing with this sort of signal (which could represent, for example, an A.C. current).
10.4.11 RMS Values
The mean value suffers from the problem that integration considers areas below the x-axis to be negative, and sometimes we wish to use the superior concept of the RMS, or Root-Mean-Square, value of a function. This is given by
RMS = √( 1/(b − a) ∫_a^b y^2 dx )
so that it is literally the square root, of the mean, of the function squared. The squaring ensures the function will be positive, while the square root takes us back to the same units afterwards.
10.4.12 Example
Find the RMS value of the function
y = 2x + 3
between x = 2 and x = 4.
Solution
We shall concentrate on finding the contents of the square root first.
RMS^2 = 1/(4 − 2) ∫_2^4 (2x + 3)^2 dx
= (1/2) ∫_2^4 (4x^2 + 12x + 9) dx
= (1/2) [ (4/3)x^3 + 6x^2 + 9x ]_2^4
= (1/2) ( ( (4/3)(4)^3 + 6(4)^2 + 9(4) ) − ( (4/3)(2)^3 + 6(2)^2 + 9(2) ) )
= 247/3
Thus the RMS is the square root of this:
RMS = √(247/3) ≈ 9.0738
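The RMS value can also be estimated numerically, which is a useful sanity check; the NumPy sketch below (an illustration, not part of the notes) approximates the mean of y^2 with the trapezium rule.

import numpy as np

a, b = 2.0, 4.0
x = np.linspace(a, b, 100001)
y = 2*x + 3
rms = np.sqrt(np.trapz(y**2, x) / (b - a))   # trapezium-rule estimate of the mean of y**2
print(rms)                                   # about 9.0738, i.e. sqrt(247/3)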
10.4.13 Example
Find the RMS value of the function
y = A sin x
between x = 0 and x = 2π, where A is a constant.
Solution
This is a sophisticated example, not examinable but included to demonstrate an interesting point about A.C. versus D.C. power.
Again, we start by finding RMS^2:
RMS^2 = 1/(2π − 0) ∫_0^{2π} (A sin x)^2 dx
= A^2 · (1/2π) [ (1/2)(x − (1/2) sin 2x) ]_0^{2π}
= A^2 · (1/2π) ( (1/2)(2π − (1/2) sin 4π) − (1/2)(0 − (1/2) sin 0) )
= A^2 · (1/2π) · π = A^2/2.
Taking the square root we now obtain
RMS = √(A^2/2) = A/√2,
that is, we divide by √2 to find the analogous D.C. value for an A.C. signal of raw amplitude A.
Note
I rather glossed over the problem of integrating sin^2 x, so I do this now for completeness.
Recall from our trigonometry that
cos 2x = cos^2 x − sin^2 x
and
sin^2 x + cos^2 x = 1.
Thus
cos 2x = (1 − sin^2 x) − sin^2 x = 1 − 2 sin^2 x
⇒ sin^2 x = (1/2)(1 − cos 2x);
whereas the function on the left is hard to integrate, the one on the right is relatively simple:
∫ (1/2)(1 − cos 2x) dx = (1/2) ( x − (1/2) sin 2x ) + c,
by using the limited chain rule (see 10.2.5).
10.5 Numerical Integration
As we have seen, integration is much harder than differentiation, and indeed it is easy to produce relatively simple expressions that cannot be integrated by formula. However, integrals have very many applications and are hence valuable to obtain. Fortunately there are a number of numerical approximation methods we may use to find a definite integral. We shall look only at a result known as Simpson's rule.
10.5.1 Simpson's rule
This is a method of approximately evaluating a definite integral of the form
∫_a^b f(x) dx.
We split the interval from x = a to x = b into an even number of strips of equal width, which we shall label h; the more numerous the strips, the more accurate the approximation and the smaller the value of h (and the more calculations we must do). The boundary of each strip gives us a particular x value; we label these values x_0, x_1, . . . , x_n.
That is
x_0 = a;
x_1 = a + h;
x_2 = a + 2h;
. . .
x_n = b.
We then evaluate the y value at each of these x values, usually by simply inserting the x value into the expression for y. We label these values y_0, y_1, . . . , y_n respectively. Then the integral is given (approximately) by
∫_a^b f(x) dx ≈ (1/3) h [ (y_0 + y_n) + 4(y_1 + y_3 + · · · + y_{n−1}) + 2(y_2 + y_4 + · · · + y_{n−2}) ].
The following memory aid is sometimes used:
∫_a^b f(x) dx ≈ (1/3) h [ (F + L) + 4(O) + 2(E) ]
where F stands for the First y value, L stands for the Last y value, O stands for all the y values with an Odd index totalled, and E stands for all the y values with an Even index (excluding the first and last) totalled.
10.5.2 Example
Use Simpson's rule to approximate the integral
∫_0^1 √(1 − x^2) dx
using four strips of equal width.
Solution
The function is indeed difficult to integrate directly, but nothing of the kind is required. We first split the interval from x = 0 to x = 1 into 4 strips, clearly giving
x_0 = 0, x_1 = 0.25, x_2 = 0.5, x_3 = 0.75, x_4 = 1.
Now we must work out the y values for each of these values of x, where y is the expression we are integrating. To 3 decimal places of accuracy we obtain
y_0 = 1, y_1 = 0.968, y_2 = 0.866, y_3 = 0.661, y_4 = 0.
Finally we can insert into the formula:
∫_0^1 √(1 − x^2) dx ≈ (0.25/3) [ (1 + 0) + 4(0.968 + 0.661) + 2(0.866) ],
which comes out as 0.771.
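Simpson's rule is straightforward to implement in code; the following Python sketch (an illustration, with this example's function and strip count assumed) reproduces the hand calculation.

import numpy as np

def simpson(f, a, b, n):
    # Approximate the integral of f from a to b with n strips (n must be even).
    if n % 2:
        raise ValueError("n must be even")
    x = np.linspace(a, b, n + 1)
    y = f(x)
    h = (b - a) / n
    # First and last, plus 4 times the odd-indexed values and 2 times the interior even-indexed values.
    return h/3 * (y[0] + y[-1] + 4*np.sum(y[1:-1:2]) + 2*np.sum(y[2:-1:2]))

print(simpson(lambda x: np.sqrt(1 - x**2), 0.0, 1.0, 4))   # about 0.771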
Chapter 11
Power Series
11.1 Definition
A power series in x is a function of the form
f(x) = a_0 + a_1 x + a_2 x^2 + a_3 x^3 + · · ·
where a_0, a_1, . . . are constants. We could write
f(x) = Σ_{i=0}^{∞} a_i x^i.
11.1.1 Convergence
Sometimes it is possible to calculate the result of this series of x terms, but sometimes the value cannot be calculated. We say that the power series converges when it is possible to work out the value of the function; otherwise we say that the series diverges.
Most power series converge for certain values of x and diverge for other values of x. We normally find that there is a so-called radius of convergence r. This means that f(x) converges when |x| < r, which in the context of real numbers means that −r < x < r (but this concept can apply to complex numbers too).
Clearly, the series will always converge when x = 0, and so a radius of convergence of 0 indicates that the series only converges when x = 0. When we say the radius of convergence is ∞ we mean that the series converges for all real (and complex) numbers.
11.2 Maclaurin's Expansion
It is often useful to be able to determine an equivalent power series expansion for f(x). We start by assuming that
f(x) = a_0 + a_1 x + (a_2/2!) x^2 + (a_3/3!) x^3 + · · ·
or that
f(x) = Σ_{i=0}^{∞} (a_i / i!) x^i.
We don't have to place the factorials in this expression, but it does make our life rather simpler.
If we differentiate this we obtain
f′(x) = a_1 + a_2 x + (a_3/2!) x^2 + · · ·
and again to obtain
f″(x) = a_2 + a_3 x + (a_4/2!) x^2 + · · ·
and so on.
Observe that
f(0) = a_0, f′(0) = a_1, f″(0) = a_2,
and this holds in general. Thus
a_i = f^(i)(0)
and we can show that
f(x) = f(0) + f′(0) x + (f″(0)/2!) x^2 + (f‴(0)/3!) x^3 + · · ·
Note that this says nothing about convergence; we haven't answered the question of how useful the series is.
11.2.1 Odd and Even
A function whose power series contains only odd powers of x is called an odd function. Odd functions obey the rule
f(−x) = −f(x).
A function whose power series contains only even powers of x is called an even function. Even functions obey the rule
f(−x) = f(x).
11.2.2 Example
Find the Maclaurin expansion for sin x.
Solution
Let f(x) = sin x, and then
f(x) = sin x, f(0) = 0
f′(x) = cos x, f′(0) = 1
f″(x) = −sin x, f″(0) = 0
f‴(x) = −cos x, f‴(0) = −1
f^(4)(x) = sin x, f^(4)(0) = 0;
you will note that we repeat ourselves after the fourth derivative. Thus we obtain the expansion
f(x) = 0 + 1x + (0/2!) x^2 + (−1/3!) x^3 + (0/4!) x^4 + · · ·
Thus
sin x = x − x^3/3! + x^5/5! − x^7/7! + · · ·
It can be shown that this series has an infinite radius of convergence.
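The expansion can be checked with a computer algebra system; an illustrative SymPy sketch (not part of the original notes):

import sympy as sp

x = sp.symbols('x')
print(sp.series(sp.sin(x), x, 0, 8))   # x - x**3/6 + x**5/120 - x**7/5040 + O(x**8)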
11.2.3 Exercise
Obtain the Maclaurin expansion for cos x, and show that cos x is an even function.
It can be shown that this series has an infinite radius of convergence.
11.2.4 Example
Find the Maclaurin expansion for e^x.
Solution
This is a particularly simple example. We know that
f′(x) = f″(x) = f‴(x) = · · · = e^x
and so
f(0) = f′(0) = f″(0) = f‴(0) = · · · = 1,
so that the Maclaurin expansion is
e^x = 1 + 1x + (1/2!) x^2 + (1/3!) x^3 + · · ·
or, as it is more usually stated,
e^x = 1 + x + x^2/2! + x^3/3! + x^4/4! + · · ·
It can be shown that this series has an infinite radius of convergence.
11.2.5 Exercise
Prove Euler's identity that
e^(jθ) = cos θ + j sin θ.
11.2.6 Example
Obtain the Maclaurin expansion for ln x.
Solution
This presents us with a serious problem. Differentiating is not hard, but look at f(0) for example: the logarithm of zero is not defined in any base.
We cannot calculate the Maclaurin expansion for this function; we need another tool.
11.3 Taylor's Expansion
Taylor's expansion is very similar to Maclaurin's expansion, except that it is not centred at 0, but at some value a:
f(a + x) = f(a) + f′(a) x + (f″(a)/2!) x^2 + (f‴(a)/3!) x^3 + · · ·
11.3.1 Example
Find the Taylor expansion of ln(1 + x).
Solution
Proceed as before with Maclaurin's expansion, by repeatedly differentiating. Let f(x) = ln x; then
f′(x) = 1/x, f″(x) = −1/x^2, f‴(x) = 2/x^3, f^(4)(x) = −3!/x^4, . . .
and so
f(1) = 0, f′(1) = 1, f″(1) = −1, f‴(1) = 2, f^(4)(1) = −3!, . . .
and so
f(1 + x) = 0 + 1x + (−1/2!) x^2 + (2/3!) x^3 + · · ·
or, after tidying up, we obtain
ln(1 + x) = x − x^2/2 + x^3/3 − x^4/4 + x^5/5 − · · ·
11.3.2 Identification of Turning Points
We prove that if f(x) has a turning point at x = a then:
if f″(a) > 0, the turning point is a local minimum;
if f″(a) < 0, the turning point is a local maximum.
We used this result frequently in classifying turning points in 9.6.3. We now present a proof.
Proof
The Taylor expansion of f(x) around x = a is given by
f(a + x) = f(a) + f′(a) x + (f″(a)/2!) x^2 + (f‴(a)/3!) x^3 + · · ·
So note that
f(a + x) − f(a) = f′(a) x + (f″(a)/2!) x^2 + (f‴(a)/3!) x^3 + · · ·
and, at a turning point, f′(a) = 0, so
f(a + x) − f(a) = (f″(a)/2!) x^2 + (f‴(a)/3!) x^3 + · · ·
If x is sufficiently small (i.e. we are very close to the turning point) then we may neglect terms of order x^3 or higher:
f(a + x) − f(a) ≈ (f″(a)/2!) x^2.
Now x^2/2! is clearly positive, so the sign of the LHS depends entirely upon the sign of f″(a).
If f″(a) > 0 then f(a + x) − f(a) > 0, i.e. f(a + x) > f(a) for small x;
if f″(a) < 0 then f(a + x) − f(a) < 0, i.e. f(a + x) < f(a) for small x.
These statements are precisely that there is a local minimum, or maximum, at x = a respectively.
Chapter 12
Differential Equations
A very important application of integration is that of differential equations. These are equations in terms of the derivatives of a variable.
12.1 Concept
We shall restrict ourselves to differential equations involving only two variables, x and y.
A Differential Equation, or D.E. for short, is an equation involving x, y and the derivatives of y with respect to x.
The order of a differential equation is the number of the highest derivative present in the equation.
In general we wish to find the function y = f(x) which satisfies the D.E., and in general, unless we have information to help us calculate them, we will have a constant for each order of the D.E. when we solve it.
12.2 Exact D.E.s
Differential equations of the form
dy/dx = f(x)
are called exact differential equations, as the right hand side is exactly a derivative and all we need do is integrate both sides with respect to x. This is a first order exact equation; we could have second or third order equations, but we will restrict ourselves to first order equations of this type here.
Equations of the form
d/dx (f(x, y)) = g(x)
are also exact, as all that has to be done is to integrate on both sides with respect to x to remove all derivatives.
12.2.1 Example
Solve the differential equation
dy/dx = 3x^2
Solution
Integrate on both sides with respect to x, remembering that integration is the reverse of differentiation and cancels it:
∫ (dy/dx) dx = ∫ 3x^2 dx.
The next step is usually omitted when we feel comfortable:
∫ dy = ∫ 3x^2 dx
⇒ y = x^3 + c
is the final solution.
Note that ∫ dy is a shorthand for ∫ 1 dy, which is y.
12.2.2 Example
Solve the differential equation
x dy/dx = √x + 1
Solution
This doesn't look exact, but all we have to do is divide by x on both sides to obtain
dy/dx = (√x + 1)/x = (x^(1/2) + 1)/x,
so that we obtain
dy/dx = x^(−1/2) + 1/x.
Now integrate on both sides with respect to x:
∫ (dy/dx) dx = ∫ ( x^(−1/2) + 1/x ) dx,
so that we obtain
y = x^(1/2)/(1/2) + ln x + c,
which is
y = 2x^(1/2) + ln x + c.
12.2.3 Example
Solve the following differential equation.
x^2 dy/dx + 2xy = 4e^(2x)
Solution
This is an exact equation, but it doesn't look like it. The left hand side can be written as the derivative of a product, as so:
d/dx ( x^2 y ) = 4e^(2x),
and so integrating both sides with respect to x removes the derivative (cancels it):
x^2 y = ∫ 4e^(2x) dx = 2e^(2x) + c,
and so dividing by x^2 we obtain
y = (2e^(2x) + c)/x^2.
12.3 Variables separable D.E.s
Differential equations of the form
f(y) dy/dx = g(x)
are called variables separable differential equations, because it is possible to separate the function of y on the left hand side from the function of x on the right hand side.
If we integrate with respect to x on both sides we obtain
∫ f(y) (dy/dx) dx = ∫ g(x) dx,
which, by a variation of the chain rule, can be shown to be
∫ f(y) dy = ∫ g(x) dx.
In fact most people think of rearranging the original D.E. to get terms involving x on one side, together with dx, and those of y and dy on the other, like so:
f(y) dy/dx = g(x)  ⇒  f(y) dy = g(x) dx,
and then placing integral signs in front of both sides. Although this does not formally reflect what is happening mathematically, it is a useful way to think of these problems and makes solving them much easier.
12.3.1 Example
Solve the following differential equation
y dy/dx = x
Solution
There are two ways of doing this, the formal way and the informal way. For this first example, we shall do both.
Formally, we integrate with respect to x on both sides:
∫ y (dy/dx) dx = ∫ x dx
⇒ ∫ y dy = ∫ x dx
⇒ y^2/2 = x^2/2 + c.
Note that each integration technically gives rise to a constant, but we can absorb them into a single one. We're finished with the calculus now; it's only algebra to polish up. Multiply by 2 on both sides:
y^2 = x^2 + 2c,
but 2c is just a constant, say d, so
y^2 = x^2 + d.
We could leave it here.
The other way of thinking is to imagine we split the dy from the dx. So that here, we multiply up by dx on both sides:
y dy = x dx,
and then we place integral signs in front of both sides:
∫ y dy = ∫ x dx;
this is not really what happens, but it works out the same, and is faster.
From here we continue as above.
12.3.2 Example
Solve the differential equation¹
dy/dx = −λy
where λ is a constant.
¹ This is quite an important D.E. in science. If we let y = A and x = t it is the equation which defines the activity of a nuclear sample over time (radioactive decay), in which case λ is a positive constant known as the decay constant (which is related to the half life). If we let y = I and x = t and λ = 1/(CR) we get the D.E. which describes the current in a discharging capacitor.
Solution
This is variables separable; even though it is not of the classic form described above, we can put it into that form. Divide by y on both sides to obtain
(1/y) dy/dx = −λ;
there are no x terms, which just makes things easier for us. Do our trick of rearranging the derivative (placing all the y and dy pieces on one side and the x and dx pieces on the other):
(1/y) dy = −λ dx,
and place integral signs in front of both sides (we could have done this formally as above, of course):
∫ (1/y) dy = ∫ −λ dx,
and we can take the constant out on the RHS to obtain
∫ (1/y) dy = −λ ∫ dx,
which gives us
ln y = −λx + c;
please remember ∫ dx = ∫ 1 dx = x. We can remove the logs by taking e to the power of both sides (please see 3.9.5):
e^(ln y) = e^(−λx + c)  ⇒  y = e^(−λx + c).
We can now use the laws of indices (see 3.5) (the first one, in reverse) to write
y = e^(−λx) e^c,
and note that c is a constant, so that e^c is a constant, let's say A. We finally obtain
y = A e^(−λx).
12.3.3 Example
Solve the following differential equation
dy/dx = 2 / ((x + 3)(y − 2))
Solution
Again, this is variables separable, but the variables have not yet been separated. Move all the x and dx terms to one side and the y and dy terms to the other:
(y − 2) dy = 2/(x + 3) dx.
Now place integral signs in front of both sides
∫ (y − 2) dy = ∫ 2/(x + 3) dx
and integrate on both sides. Now
∫ (y − 2) dy = y^2/2 − 2y;
don't worry about the constant, we will include one for the whole equation as usual. The other integral is
∫ 2/(x + 3) dx = 2 ∫ dx/(x + 3),
which can be done in several ways; the numerator is the derivative of the denominator, so we can say
2 ∫ dx/(x + 3) = 2 ln(x + 3),
but we could also have used the substitution u = x + 3, or the quick substitution method for ax + b instead of x. Therefore we have
y^2/2 − 2y = 2 ln(x + 3) + c,
which can't be cleaned up much more with algebra.
12.4 First order linear D.E.s
Differential equations of the form
dy/dx + f(x) y = g(x)
are known as first order linear differential equations. To solve these we normally attempt to convert them to an exact differential equation. We do this by multiplying through by an integrating factor. Let us call this function i(x). We attempt to find a formula for i(x); please note this derivation is not examinable, it is the final result that is important.
i(x) dy/dx + i(x) f(x) y = i(x) g(x).
Now if the left hand side is to be exact, it seems likely it may be the derivative of a product:
d/dx ( i(x) y ) = i(x) g(x),
in which case, from the product rule, we need
d/dx ( i(x) ) = i(x) f(x)  ⇒  di/dx = i(x) f(x),
which is first order variables separable:
∫ di/i(x) = ∫ f(x) dx  ⇒  ln(i(x)) = ∫ f(x) dx,
which finally gives us
i(x) = e^(∫ f(x) dx) = exp( ∫ f(x) dx ).
12.4.1 Example
Solve the following differential equation.
dy/dx + 2y = e^x
Solution
This is a first order linear differential equation in classic form. In this case, we first find the integrating factor i(x):
i(x) = e^(∫ f(x) dx) = e^(∫ 2 dx) = e^(2x),
so we multiply through by this on both sides:
e^(2x) dy/dx + 2y e^(2x) = e^x e^(2x) = e^(3x).
The left hand side is now exact, although this might not be obvious:
d/dx ( e^(2x) y ) = e^(3x),
and if we now integrate on both sides we obtain
e^(2x) y = (1/3) e^(3x) + c,
and thus we can now divide by e^(2x) on both sides:
y = (1/3) e^x + c e^(−2x).
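For interest only, the same answer can be recovered with SymPy (an illustrative sketch, not part of the original notes):

import sympy as sp

x = sp.symbols('x')
y = sp.Function('y')
sol = sp.dsolve(sp.Eq(y(x).diff(x) + 2*y(x), sp.exp(x)), y(x))
print(sol)   # equivalent to y = exp(x)/3 + C1*exp(-2*x), up to the naming of the constant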
12.4.2 Example
Solve the following differential equation.
x^3 dy/dx + x^2 y = x − 1
Solution
This is not yet in classic form; note that in the form we stated for this type of equation there is nothing in front of the dy/dx term, so let's divide by x^3 throughout to try and achieve this:
dy/dx + (1/x) y = (x − 1)/x^3.
Now this is in classic first order linear form, and we find the integrating factor i(x):
i(x) = e^(∫ f(x) dx) = e^(∫ (1/x) dx) = e^(ln x) = x.
So we now multiply throughout by this factor, which is simply x, to obtain
x dy/dx + y = (x − 1)/x^2,
which is now an exact D.E.; looking at the LHS we see that
d/dx ( x y ) = (x − 1)/x^2,
and so integrating both sides with respect to x yields
x y = ∫ (x − 1)/x^2 dx = ∫ ( 1/x − 1/x^2 ) dx = ln x + 1/x + c,
and finally, dividing both sides by x, we obtain
y = (ln x)/x + 1/x^2 + c/x.
12.5 Second order D.E.s
Second order differential equations are in general more difficult to solve than their first order counterparts, but we shall only deal with very simple cases of this class of D.E.s, which many consider simpler to solve than first order equations.
We shall consider only second order differential equations with constant coefficients, which have the form
a d^2y/dx^2 + b dy/dx + c y = f(x).
When f(x) = 0 this is called a homogeneous differential equation, otherwise it is called inhomogeneous. We shall only consider solutions to the homogeneous case.
The solution to the homogeneous equation is known as the complementary function, or C.F. for short.
There is a step-by-step procedure for solving these D.E.s.
12.5.1 Homogeneous D.E. with constant coefficients
Consider
a d^2y/dx^2 + b dy/dx + c y = 0.
We begin by forming a quadratic equation known as the auxiliary equation. This equation is
a m^2 + b m + c = 0.
Now we know from 3.7 that this equation can have two, one or no real (two complex) solutions.
Two real solutions
Suppose that we have two real solutions m = α and m = β; then the C.F. is given by
y = A e^(αx) + B e^(βx)
where A and B are constants² which can often be determined from data in the question.
² These arise, in fact, out of the two integrations required for the second order derivative in the equation, but this process does not require us to do any integration at all.
One real solution
Suppose that we have one (repeated) real solution m = α; then the C.F. is given by
y = (A + Bx) e^(αx)
where once again A and B are constants.
Two complex solutions
Suppose that we have two complex solutions³ m = α + βj and m = α − βj. Then the C.F. is given by
y = A e^((α + βj)x) + B e^((α − βj)x),
but this is a very clumsy representation and infrequently used. We can use the laws of indices (see 3.5) to improve it. Please note that this derivation is not examinable; it is the final result we want here.
y = A e^(αx) e^(βjx) + B e^(αx) e^(−βjx).
Now if we use the identity from 5.6.3 we see that we obtain
y = A e^(αx) (cos βx + j sin βx) + B e^(αx) (cos(−βx) + j sin(−βx)),
which, by the nature of sin and cos, may be shown to be
y = A e^(αx) (cos βx + j sin βx) + B e^(αx) (cos βx − j sin βx),
and finally, with some basic manipulation,
y = e^(αx) ( (A + B) cos βx + (Aj − Bj) sin βx ).
Recall that A and B are constants, and so (A + B) and (A − B)j are also constants; let us call them C and D respectively. Thus
y = e^(αx) ( C cos βx + D sin βx ).
This is the form that should be used with two complex roots, and this result should be known.
³ Recall at this point that when we solve a quadratic equation to give two complex solutions, these solutions form a conjugate pair (see 5.5.2).
12.5.2 Example
Solve the following differential equation
d^2y/dx^2 − 5 dy/dx + 6y = 0
Solution
This is a second order homogeneous D.E. with constant coefficients, with a = 1, b = −5 and c = 6. The auxiliary equation is thus
m^2 − 5m + 6 = 0,
so solve this plain vanilla quadratic equation (see 3.7):
(m − 2)(m − 3) = 0  ⇒  m = 2, m = 3.
This gives two real roots, so the solution is
y = A e^(2x) + B e^(3x).
12.5.3 Example
Solve the following differential equation
d^2y/dx^2 + 2 dy/dx + y = 0
Solution
This is a second order homogeneous D.E. with constant coefficients, with a = 1, b = 2 and c = 1. The auxiliary equation is thus
m^2 + 2m + 1 = 0,
so solve this quadratic:
(m + 1)(m + 1) = 0  ⇒  m = −1.
This is one (repeated) real root, so the solution is
y = (A + Bx) e^(−1x),
which is
y = (A + Bx) e^(−x).
12.5.4 Example
Solve the following differential equation
2 d^2y/dx^2 + dy/dx − 6y = 0
Solution
This is a second order homogeneous D.E. with constant coefficients, with a = 2, b = 1 and c = −6. The auxiliary equation is thus
2m^2 + m − 6 = 0,
so solve this quadratic:
(2m − 3)(m + 2) = 0  ⇒  m = 3/2, m = −2.
This gives two real roots, so the solution is
y = A e^((3/2)x) + B e^(−2x).
12.5.5 Example
Solve the following differential equation
d^2y/dx^2 + 2 dy/dx + 5y = 0
Solution
This is a second order homogeneous D.E. with constant coefficients, with a = 1, b = 2 and c = 5. The auxiliary equation is thus
m^2 + 2m + 5 = 0,
so solve this quadratic:
m = ( −2 ± √(2^2 − 4(1)(5)) ) / 2 = ( −2 ± √(−16) ) / 2 = −1 ± 2j  ⇒  m = −1 + 2j, m = −1 − 2j.
So we have two complex roots, with α = −1 and β = 2 in the form used above, so that
y = e^(−1x) ( C cos 2x + D sin 2x ),
which is
y = e^(−x) ( C cos 2x + D sin 2x ),
which incidentally reflects simple harmonic motion under damping.
12.5.6 Example
Solve the following differential equation⁴
d^2y/dx^2 = −ω^2 y
where ω is a constant.
⁴ This is another very important D.E., as it is the D.E. which describes all simple harmonic motion (behind all sin and cos waves and similar phenomena).
Solution
First place it in the more usual form:
d^2y/dx^2 + ω^2 y = 0,
which is the usual second order homogeneous D.E. with constant coefficients. The auxiliary equation is
m^2 + ω^2 = 0.
Remember, ω is just a constant. Solving this quadratic, we could use the formula of course, but it's easier to say
m^2 = −ω^2  ⇒  m = ±√(−ω^2) = ±ωj,
and we have
m = ωj, m = −ωj,
two complex solutions, this time with α = 0 and β = ω in the above form. Thus
y = e^(0x) ( C cos ωx + D sin ωx ),
and finally we have
y = C cos ωx + D sin ωx,
which is the general form of a wave with angular frequency ω; the phase angle and amplitude can be determined using a procedure demonstrated in our trigonometry section (see 4.9.3).
Chapter 13
Differentiation in several variables
The function z = f(x, y) may be represented in three dimensions by a surface, where the value z represents the height of the surface above the x, y plane.
13.1 Partial Differentiation
If z = f(x, y) is a function of two independent variables x and y, then we define the two partial derivatives of z with respect to x and y as
∂z/∂x = lim_{δx→0} [ f(x + δx, y) − f(x, y) ] / δx
and
∂z/∂y = lim_{δy→0} [ f(x, y + δy) − f(x, y) ] / δy.
13.1.1 Procedure
The actual practice of partial differentiation, rather than ordinary differentiation, is simply that we differentiate with respect to one variable, assuming all the others are constant for this differentiation. There are some warnings associated with this that come later, but for now we look at some examples.
13.1.2 Examples
For each of the following functions z = f(x, y), find ∂z/∂x and ∂z/∂y.
1. z = xy^2 + 1 (see figure 13.1)
Figure 13.1: The graph of f(x, y) = xy^2 + 1
2. z = x^3 (sin xy + 3x + y + 4) (see figure 13.2)
Solutions
1. ∂z/∂x = y^2;  ∂z/∂y = 2xy.
2. ∂z/∂x = 3x^2 (sin xy + 3x + y + 4) + x^3 (y cos xy + 3);
∂z/∂y = x^3 (x cos xy + 1).
Figure 13.2: The graph of f(x, y) = x^3 (sin xy + 3x + y + 3)
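Partial derivatives are also easy to check by machine; the SymPy sketch below (illustrative only, not part of the original notes) recomputes the derivatives of the two functions above by holding the other variable constant.

import sympy as sp

x, y = sp.symbols('x y')
z1 = x*y**2 + 1
z2 = x**3*(sp.sin(x*y) + 3*x + y + 4)
for z in (z1, z2):
    print(sp.diff(z, x))   # partial derivative with respect to x (y treated as a constant)
    print(sp.diff(z, y))   # partial derivative with respect to y (x treated as a constant)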
13.1.3 Notation
Due to the ambiguity of the f′ notation (we do not know which variable we differentiated with respect to), a different notation is adopted for partial derivatives:
f_x = ∂/∂x f = ∂f/∂x,
f_y = ∂/∂y f = ∂f/∂y.
13.1.4 Higher Derivatives
If the partial derivatives exist, there is nothing to stop us differentiating repeatedly. Note however that we can differentiate either with respect to the same variable or a different one, giving some interesting notation. We show the shorthand notations at the same time.
Please note the order of differentiation and notation carefully.
f_xx = ∂/∂x (∂f/∂x) = ∂^2 f/∂x^2
f_yy = ∂/∂y (∂f/∂y) = ∂^2 f/∂y^2
f_xy = ∂/∂y (∂f/∂x) = ∂^2 f/∂y∂x
f_yx = ∂/∂x (∂f/∂y) = ∂^2 f/∂x∂y
While it is usually the case that f_xy = f_yx, it is not always the case.
13.1.5 Example
For the functions in Example 13.1.2, find the higher order derivatives f_xy, f_yx, f_yy and f_xx.
Solution
1.
f_xx = ∂/∂x (y^2) = 0;
f_xy = ∂/∂y (y^2) = 2y;
f_yx = ∂/∂x (2xy) = 2y;
f_yy = ∂/∂y (2xy) = 2x.
2.
f_xx = ∂/∂x ( 3x^2 sin xy + 9x^3 + 3x^2 y + 12x^2 + x^3 y cos xy + 3x^3 )
= 3x^2 y cos xy + 6x sin xy + 27x^2 + 6xy + 24x − x^3 y^2 sin xy + 3x^2 y cos xy + 9x^2;
f_xy = ∂/∂y ( 3x^2 sin xy + 9x^3 + 3x^2 y + 12x^2 + x^3 y cos xy + 3x^3 )
= 3x^3 cos xy + 3x^2 − x^4 y sin xy + x^3 cos xy = 4x^3 cos xy + 3x^2 − x^4 y sin xy;
f_yx = ∂/∂x ( x^4 cos xy + x^3 ) = −x^4 y sin xy + 4x^3 cos xy + 3x^2;
f_yy = ∂/∂y ( x^4 cos xy + x^3 ) = −x^5 sin xy.
13.2 Taylor's Theorem
We now extend Taylor's theorem to two variables (see 11.3).
Suppose that f(x, y) and its partial derivatives (including those of higher order) exist and are continuous in a neighbourhood of (a, b). Then
f(a + h, b + k) = f(a, b) + ( h ∂/∂x + k ∂/∂y ) f(a, b) + (1/2!) ( h ∂/∂x + k ∂/∂y )^2 f(a, b)
+ · · · + (1/(n − 1)!) ( h ∂/∂x + k ∂/∂y )^(n−1) f(a, b) + (1/n!) ( h ∂/∂x + k ∂/∂y )^n f(a + θh, b + θk),
where 0 < θ < 1 and the notation used indicates
( h ∂/∂x + k ∂/∂y )^r f(x, y) = ( h ∂/∂x + k ∂/∂y ) ( h ∂/∂x + k ∂/∂y )^(r−1) f(x, y),
where
( h ∂/∂x + k ∂/∂y ) f(x, y) = h f_x(x, y) + k f_y(x, y).
13.3 Stationary Points
In simple two variable calculus, one of the most important problems to be solved is finding and classifying the turning points of curves: that is, the points where the curve becomes horizontal, for however short a time.
We wish to accomplish the same feat in three dimensions, and this corresponds to finding places where the landscape is flat at a specific point.
13.3.1 Types of points
The most important types of points to find are local maxima and local minima.
A local maximum (a, b) is a point such that for small h and k
f(a + h, b + k) ≤ f(a, b)  ⇔  f(a + h, b + k) − f(a, b) ≤ 0.
In other words, even slight movements off centre result in a smaller value (and they must be small, for larger maxima may exist elsewhere).
Conversely, a local minimum (a, b) is a point such that for small h and k
f(a + h, b + k) ≥ f(a, b)  ⇔  f(a + h, b + k) − f(a, b) ≥ 0.
The conditions on the left are more understandable, but the form on the right is useful later.
13.3.2 Finding points
At any such point the rate of change with respect to any variable must be zero (otherwise the surface would be "steep" rather than "flat" approaching from the direction of that variable's axis).
Therefore, in three variables, where z = f(x, y),
∂z/∂x = 0,  ∂z/∂y = 0.
These conditions are necessary for locating local maxima and minima but they are not sufficient. Points that satisfy these equations are known as critical points or stationary points. We need to classify the points we find in this way.
13.3.3 Classifying points
If we use Taylor's theorem (see 13.2) to expand at the critical point¹, we obtain
f(a + h, b + k) − f(a, b) = (1/2) ( h ∂/∂x + k ∂/∂y )^2 f(a, b) + · · ·
and therefore
f(a + h, b + k) − f(a, b) = (1/2) ( h^2 ∂^2/∂x^2 + 2hk ∂^2/∂x∂y + k^2 ∂^2/∂y^2 ) f(a, b) + · · ·
¹ Remember that at the critical point the first partial derivatives of z with respect to x and y are zero, which causes the first-derivative term in the Taylor expansion to vanish.
Therefore, if h and k are sufficiently small, the sign of f(a + h, b + k) − f(a, b) will be determined by
(1/2) ( h^2 ∂^2/∂x^2 + 2hk ∂^2/∂x∂y + k^2 ∂^2/∂y^2 ) f(a, b).
For simplicity we shall make the substitutions A = f_xx, B = f_xy and C = f_yy, evaluated at (a, b). So we wish to examine the sign of
A h^2 + 2B hk + C k^2,
and we start by completing the square. Provided that A ≠ 0, then
A h^2 + 2B hk + C k^2 = A ( h^2 + (2B/A) hk + (C/A) k^2 )
= A ( ( h + (B/A) k )^2 + ( (AC − B^2)/A^2 ) k^2 ).
As the squared term will always be positive, the sign of the whole expression (which, remember, is close to f(a + h, b + k) − f(a, b)) is dictated by A and AC − B^2.
If AC − B^2 < 0
In this case, depending on the values of h and k, the expression can take both positive and negative values however close to (a, b) we approach. Therefore this type of turning point is neither a maximum nor a minimum, but is in fact a saddle point.
If AC − B^2 > 0
In this case the expression will have the same sign as A itself, and we shall have either a maximum or a minimum.
13.3.4 Summary
To find and classify turning points in three variables, where z = f(x, y), the whole procedure is:
1. Solve f_x = f_y = 0, and find all critical points (a, b).
2. If f_xx f_yy − f_xy^2 < 0 at (a, b), then we have a saddle point.
3. If f_xx f_yy − f_xy^2 > 0, then:
if f_xx > 0, (a, b) is a local minimum (and f_yy will also be positive);
if f_xx < 0, (a, b) is a local maximum (and f_yy will also be negative).
4. If f_xx f_yy − f_xy^2 = 0, further investigation is required.
13.3.5 Example
Find the critical points of
z = f(x, y) = x^3 + y^3 − 3x − 12y + 20
and determine their nature.
Solution
First of all we need to find the two partial derivatives and set them equal to zero:
f_x = 3x^2 − 3;  f_y = 3y^2 − 12,
and now we set them equal to zero:
f_x = 0  ⇒  x = ±1;  f_y = 0  ⇒  y = ±2,
and both of these equations must be satisfied at once. Therefore we have four critical points:
(1, 2), (1, −2), (−1, 2), (−1, −2).
Now we calculate f_xx, f_yy and f_xy:
f_xx = 6x;  f_yy = 6y;  f_xy = 0,
so to determine the type of turning point in the first instance we use
f_xx f_yy − f_xy^2 = 36xy.
Now look at each point in turn.
(−1, −2): 36xy is positive and f_xx < 0, so we have a local maximum. Calculating z we find the point at (−1, −2, 38).
(−1, 2): 36xy is negative, so we have a saddle point. Calculating z we find the point at (−1, 2, 6).
(1, −2): 36xy is negative, so we have a saddle point. Calculating z we find the point at (1, −2, 34).
(1, 2): 36xy is positive and f_xx > 0, so we have a local minimum. Calculating z we find the point at (1, 2, 2),
as shown in figure 13.3.
Figure 13.3: A graph with four turning points
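The whole classification can be automated for this example; the SymPy sketch below (illustrative only, not part of the original notes) finds and classifies the four critical points.

import sympy as sp

x, y = sp.symbols('x y', real=True)
f = x**3 + y**3 - 3*x - 12*y + 20
fx, fy = sp.diff(f, x), sp.diff(f, y)
fxx, fyy, fxy = sp.diff(f, x, 2), sp.diff(f, y, 2), sp.diff(f, x, y)

for p in sp.solve([fx, fy], [x, y], dict=True):
    D = (fxx*fyy - fxy**2).subs(p)   # here D = 36*x*y; the D = 0 case does not occur in this example
    A = fxx.subs(p)
    if D < 0:
        kind = 'saddle point'
    elif A > 0:
        kind = 'local minimum'
    else:
        kind = 'local maximum'
    print(p, kind, 'z =', f.subs(p))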
13.4 Implicit functions
If x, y, . . . are all functions of one variable t then
dz/dt = (∂z/∂x)(dx/dt) + (∂z/∂y)(dy/dt) + · · ·
Thus if
z = f(x, y) = 0
then
dz/dx = (∂z/∂x)(dx/dx) + (∂z/∂y)(dy/dx) = 0.
Thus
dy/dx = − (∂z/∂x) / (∂z/∂y).
13.5 Lagrange Multipliers
Lagrange² multipliers are used for problems where there is some constraint on one or more of the variables that must be satisfied.
² Comte Joseph Louis Lagrange (1736-1813) was a French mathematician who did a great deal of work on what is now called classical mechanics.
Normally the problem is expressed as finding the maxima and minima of u = f(x, y, z) provided that φ(x, y, z) = 0.
We can regard φ(x, y, z) = 0 as expressing z in terms of x and y; that is, z is a function z(x, y). This means we are in fact trying to find the maxima and minima of
f(x, y, z(x, y)),
which is only a function of two independent variables. If f(x, y, z(x, y)) has a critical point at (a, b), then
0 = ∂/∂x f(x, y, z(x, y)) = f_x + f_z ∂z/∂x,
0 = ∂/∂y f(x, y, z(x, y)) = f_y + f_z ∂z/∂y.
Recall that φ(x, y, z) = 0, and this time we can regard z as a function of x and y in exactly the same way to give
φ_x + φ_z ∂z/∂x = 0,
φ_y + φ_z ∂z/∂y = 0.
If we eliminate ∂z/∂x and ∂z/∂y we see that (a, b, c) is a critical point if and only if
φ(a, b, c) = 0,
φ_x f_z − φ_z f_x = 0,
φ_y f_z − φ_z f_y = 0,
when all functions are evaluated at (a, b, c). If we now define λ = −f_z/φ_z, these conditions become
φ(a, b, c) = 0
f_x + λ φ_x = 0
f_y + λ φ_y = 0
f_z + λ φ_z = 0.
The value λ is called the Lagrange multiplier, and we find turning points by working out the values which satisfy the above four equations.
Another way of putting this is that we define
F = f + λφ
and solve the equations
F_x = F_y = F_z = φ = 0.
13.5.1 Example
Find the closest distance from the surface z = x^2 + y^2 to the point (3, −3, 4).
Solution
The distance from (x, y, z) to the point (3, −3, 4), which we shall call d, satisfies the equation
d^2 = f(x, y, z) = (x − 3)^2 + (y + 3)^2 + (z − 4)^2.
The surface specifies the constraint: given that (x, y, z) lies on the surface z = x^2 + y^2, we have
φ(x, y, z) = x^2 + y^2 − z = 0.
We now consider F = f + λφ:
F = (x − 3)^2 + (y + 3)^2 + (z − 4)^2 + λ(x^2 + y^2 − z)
and thus
F_x = 2(x − 3) + 2λx = 0
F_y = 2(y + 3) + 2λy = 0
F_z = 2(z − 4) − λ = 0
φ = x^2 + y^2 − z = 0.
If we rearrange each of the first three equations for its variable in turn, and insert the results into the fourth, we obtain
9/(1 + λ)^2 + 9/(1 + λ)^2 − (λ + 8)/2 = 0
⇒ 36 − (λ + 8)(λ^2 + 2λ + 1) = 0
⇒ (λ − 1)(λ + 7)(λ + 4) = 0.
We examine the distance at each value of λ:
λ = 1  ⇒  x = 3/2, y = −3/2, z = 9/2, d^2 = 19/4;
λ = −7  ⇒  x = −1/2, y = 1/2, z = 1/2, d^2 = 147/4;
λ = −4  ⇒  x = −1, y = 1, z = 2, d^2 = 36;
so the shortest distance is √19 / 2.
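The four Lagrange equations can also be solved by machine; the SymPy sketch below (illustrative only, not part of the original notes) reproduces the three candidate points and their distances.

import sympy as sp

x, y, z, lam = sp.symbols('x y z lam', real=True)
f = (x - 3)**2 + (y + 3)**2 + (z - 4)**2   # squared distance to (3, -3, 4)
phi = x**2 + y**2 - z                      # constraint: the point lies on z = x**2 + y**2
F = f + lam*phi
eqs = [sp.diff(F, v) for v in (x, y, z)] + [phi]
for s in sp.solve(eqs, [x, y, z, lam], dict=True):
    print(s, sp.sqrt(f.subs(s)))           # the smallest distance printed is sqrt(19)/2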
13.6 Jacobians
The determinant
| ∂x/∂u  ∂x/∂v |
| ∂y/∂u  ∂y/∂v |
is called the Jacobian³ of x and y with respect to u and v. This is often denoted by
J( (x, y)/(u, v) ) = ∂(x, y)/∂(u, v) = | ∂x/∂u  ∂x/∂v |
                                       | ∂y/∂u  ∂y/∂v |
³ Carl Gustav Jacob Jacobi (1804-1851) was a German mathematician who worked on various fields such as analysis and number theory, and who helped to found the area known as elliptic functions, which were used to produce the recent proof of Fermat's Last Theorem.
This can be extended to higher dimensions, so that for example the Jacobian of x, y and z with respect to u, v and w is
∂(x, y, z)/∂(u, v, w) = | ∂x/∂u  ∂x/∂v  ∂x/∂w |
                        | ∂y/∂u  ∂y/∂v  ∂y/∂w |
                        | ∂z/∂u  ∂z/∂v  ∂z/∂w |
13.6.1 Differential
If z = f(x, y), then if f(x, y) is differentiable,
δz = f(x + δx, y + δy) − f(x, y) = δx f_x + δy f_y + ε_1 δx + ε_2 δy,
where ε_1, ε_2 → 0 as δx, δy → 0.
The expression
dz = f_x dx + f_y dy
is called the total differential of z.
13.7 Parametric functions
Suppose z = f(x, y) is differentiable and suppose x = x(t) and y = y(t) are differentiable. Then
dz/dt = f_x dx/dt + f_y dy/dt,
in other words
dz/dt = (∂z/∂x)(dx/dt) + (∂z/∂y)(dy/dt).
The proof is omitted.
13.7.1 Example
If z = x^2 y − 4x y^20 + 1, where x = t^3 − t and y = cos t, find dz/dt when t = 0.
Solution
We need to evaluate f_x and f_y:
∂z/∂x = y(2x) − 4y^20 + 0 = 2xy − 4y^20,
∂z/∂y = x^2 − 4x · 20y^19 + 0 = x^2 − 80x y^19.
We also require the ordinary derivatives of the x and y functions with respect to t:
dx/dt = 3t^2 − 1,  dy/dt = −sin t.
Therefore
dz/dt = (2xy − 4y^20)(3t^2 − 1) + (x^2 − 80x y^19)(−sin t).
Now when t = 0 we have
x = 0^3 − 0 = 0,  y = cos 0 = 1;
plugging this all in yields
dz/dt = (0 − 4(1))(0 − 1) + (0 − 0)(0) = (−4)(−1) = 4.
13.8 Chain Rule
Suppose z = f(x, y), and that x = x(u, v) and y = y(u, v) are functions such that x_u, x_v, y_u and y_v all exist. Then
∂z/∂u = f_x x_u + f_y y_u,
i.e.
∂z/∂u = (∂f/∂x)(∂x/∂u) + (∂f/∂y)(∂y/∂u),
and similarly
∂z/∂v = f_x x_v + f_y y_v,
i.e.
∂z/∂v = (∂f/∂x)(∂x/∂v) + (∂f/∂y)(∂y/∂v).
Chapter 14
Integration in several variables
14.1 Double integrals
Consider the volume between the surface z = f(x, y) and a region R of the x, y plane. Provided f(x, y) is positive over the region R, then
V = lim_{δx,δy→0} Σ f(x, y) δx δy,
which we denote as
V = ∬_R f(x, y) dx dy,
which, if R is considered to be the simple box shown in figure 14.1 (that is, those points for which a ≤ x ≤ b and c ≤ y ≤ d), may be evaluated by
V = ∬_R f(x, y) dx dy = ∫_c^d ∫_a^b f(x, y) dx dy.
14.1.1 Example
Calculate the volume under the graph of f(x, y) = xy + 1 over the region 0 ≤ x ≤ 2 and 0 ≤ y ≤ 4.
Figure 14.1: Double integration over the simple region R.
Solution
This is very straightforward, because the region of integration is simply a rectangle:
V = ∫_0^4 ∫_0^2 (xy + 1) dx dy = ∫_0^4 [ (x^2 y)/2 + x ]_0^2 dy = ∫_0^4 (2y + 2) dy = [ y^2 + 2y ]_0^4 = 24.
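A quick symbolic check of this double integral (an illustrative SymPy sketch, not part of the original notes):

import sympy as sp

x, y = sp.symbols('x y')
print(sp.integrate(x*y + 1, (x, 0, 2), (y, 0, 4)))   # 24, integrating over x first and then y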
14.2 Change of order
It is possible, with care, to change the order in which the integration is performed (that is, whether we sum over x or y first). This is sometimes required because the order we try first results in a very difficult integration.
Usually it helps to sketch the region of integration first. We shall consider a simple situation in which R is such that any line parallel to the x or y axis meets the boundary of R at most twice. If this is not the case then it is possible to subdivide R and treat each section in this way.
One possibility is that we split the boundary of R into two curves y = φ_1(x) and y = φ_2(x) such that φ_1(x) ≤ φ_2(x) for a < x < b. We divide R into vertical strips of width δx, each of which is then divided into sections of height δy. See figure 14.2 for details.
Figure 14.2: Double integration over x then y.
The integral in this manner is
∬_R f(x, y) dy dx = ∫_{x=a}^{x=b} ( ∫_{y=φ_1(x)}^{y=φ_2(x)} f(x, y) dy ) dx,
where normally we shall not emphasise the "x =" or "y =" in the limits.
However, we can also describe the region by two functions x = ψ_1(y) and x = ψ_2(y), where ψ_1(y) ≤ ψ_2(y) for all c < y < d. This is shown in figure 14.3. In this case we can write the integral as
∬_R f(x, y) dx dy = ∫_{y=c}^{y=d} ( ∫_{x=ψ_1(y)}^{x=ψ_2(y)} f(x, y) dx ) dy.
14.3 Examples
Here are some examples of double integrations.
Figure 14.3: Double integration over y then x.
14.3.1 Example
Reverse the order of integration to evaluate the integral in example 14.1.1.
Solution
V = ∫_0^4 ∫_0^2 (xy + 1) dx dy
was the original integral. Because the region of integration is so simple (just a rectangle) we obtain
V = ∫_0^2 ∫_0^4 (xy + 1) dy dx
= ∫_0^2 [ (x y^2)/2 + y ]_0^4 dx = ∫_0^2 (8x + 4) dx
= [ 4x^2 + 4x ]_0^2 = 24,
which is the same result we achieved before.
14.3.2 Example
Find the volume enclosed under the surface
z = cos xy,
bounded in the xy plane by y = x^2, x > 0, y < 1.
Solution
This volume is given by
∫_0^1 ∫_0^{x^2} cos xy dy dx.
So evaluate the inner integral first:
= ∫_0^1 [ (1/x) sin xy ]_{y=0}^{y=x^2} dx
= ∫_0^1 ( (1/x) sin x^3 − (1/x) sin 0 ) dx
= ∫_0^1 (1/x) sin x^3 dx.
This leaves us with a single integral in one variable.
14.4 Triple integrals
Double integrals sum a function over some region R, and if the function translates as a height then the result calculated is a volume.
A volume could be obtained directly by a triple integral over some three dimensional region V (over the function 1). This on its own is usually pointless, but sometimes we wish to sum a function over a volume and not just an area.
For example, suppose that the density of some region of space is given by the function ρ = ρ(x, y, z); then the mass of the region could be found by multiplying the density by the volume elements over the whole volume. In other words
m = ∭_V ρ dx dy dz.
14.4.1 Example
The density of a gas in a cubic box which extends in each axis from 0 to 5 is given by
ρ = x^2 y + z;
find the mass enclosed in the box.
Solution
The mass will be given by
m = ∭_V ( x^2 y + z ) dx dy dz,
which is
= ∫_0^5 ∫_0^5 ∫_0^5 ( x^2 y + z ) dx dy dz,
and as before we do the integrals from the inside out:
= ∫_0^5 ∫_0^5 [ x^3 y/3 + xz ]_0^5 dy dz = ∫_0^5 ∫_0^5 ( 125y/3 + 5z ) dy dz
= ∫_0^5 [ 125y^2/6 + 5yz ]_0^5 dz = ∫_0^5 ( 3125/6 + 25z ) dz
= [ 3125z/6 + 25z^2/2 ]_0^5 = 15625/6 + 625/2 = 8750/3.
14.5 Change of variable
Sometimes it is easier to calculate an integral by changing the variables it is integrated with respect to.
Suppose that an integral in terms of x and y is changed to an integral in terms of u and v, where
x = φ(u, v), y = ψ(u, v);
then the integral over the original region R in the xy plane changes to one over a region R′ in the uv plane thus:
∬_R f(x, y) dx dy = ∬_{R′} f(φ(u, v), ψ(u, v)) | ∂(x, y)/∂(u, v) | du dv,
where
∂(x, y)/∂(u, v) = | ∂x/∂u  ∂x/∂v |
                  | ∂y/∂u  ∂y/∂v | = J
is a Jacobian (see 13.6) related to the transformation.
14.5.1 Polar coordinates
If we wish to transform from x, y to r, θ we can use the transformation
x = r cos θ, y = r sin θ, r ≥ 0, 0 ≤ θ < 2π,
and in this case the Jacobian is
| ∂x/∂r  ∂x/∂θ |   | cos θ  −r sin θ |
| ∂y/∂r  ∂y/∂θ | = | sin θ   r cos θ | = r(cos^2 θ + sin^2 θ) = r.
This idea can be extended to triple integrals.
14.5.2 Example
Evaluate
I = ∬_R ( 1 − √(x^2 + y^2) ) dx dy,
where R is the region bounded by the circle x^2 + y^2 = 1.
Solution
We note that the function is simply 1 − r, where r is the distance of the point from the origin, and the region is polar in nature. It follows that a transformation is useful here.
Transforming to polars,
I = ∫_0^{2π} ∫_0^1 (1 − r) r dr dθ,
where the region of the circle has been represented by letting r run from 0 to 1 and letting θ run from 0 to 2π; we have rewritten the integrand in terms of the new variables and added an r from the Jacobian for this transformation. We now proceed:
= ∫_0^{2π} [ r^2/2 − r^3/3 ]_0^1 dθ = ∫_0^{2π} (1/6) dθ = π/3.
14.5.3 Cylindrical Polar Coordinates
In three dimensions, two systems of coordinates are commonly used. Cylindrical polar coordinates are simply two dimensional polar coordinates where the height z is left in Cartesian form.
We use the transformation
x = ρ cos φ, y = ρ sin φ, z = z,
where
ρ ≥ 0, 0 ≤ φ < 2π,
and therefore
J = ∂(x, y, z)/∂(ρ, φ, z) = | cos φ  −ρ sin φ  0 |
                            | sin φ   ρ cos φ  0 |
                            |   0        0     1 | = ρ(cos^2 φ + sin^2 φ) = ρ,
so note that in this coordinate system ρ is the distance from the z axis, not from the origin as such.
14.5.4 Spherical Polar Coordinates
To use three dimensional coordinates based on the distance from the origin, and not the z-axis, we need to define the three coordinates r, the distance from the origin; φ, the angle of the projection of the point onto the xy plane measured from the x axis; and θ, the angle between the z-axis and the line joining the point to the origin.
This means the transformation equations are
x = r sin θ cos φ, y = r sin θ sin φ, z = r cos θ,
r ≥ 0, 0 ≤ θ ≤ π, 0 ≤ φ < 2π,
and thus
J = ∂(x, y, z)/∂(r, θ, φ) = | sin θ cos φ  r cos θ cos φ  −r sin θ sin φ |
                            | sin θ sin φ  r cos θ sin φ   r sin θ cos φ |
                            |    cos θ       −r sin θ           0       |
= r^2 ( cos^2 θ sin θ (cos^2 φ + sin^2 φ) + sin^3 θ (cos^2 φ + sin^2 φ) )
= r^2 sin θ ( cos^2 θ + sin^2 θ ) = r^2 sin θ.
Chapter 15
Fourier Series
Fourier
1
series are a powerful tool. On one hand they simply allow a periodic
function to be expressed as an innite series of simple functions, usually
trigonometric or exponential, but this also allows great insight into a function
be splitting into component frequencies.
15.1 Periodic functions
A periodic function is one that repeats its values at regular intervals. So that
if the repeat occurs ever T units on the x-axis.
f(x) = f(x + T) = f(x + 2T) + + f(x + nT)
The contant value T is known as the period of oscillation.
15.1.1 Example
Of course the classic examples of periodic functions are the graphs of sin x
and cos x.
Consider
y = A sin(ωt + φ)
1
Baron Jean Baptiste Fourier (1768-1830) was a French mathematician who narrowly
avoided the guillotine during the French Revolution. Fourier contributed greatly to the
use of differential equations to solve problems in physics.
where t is the time in seconds. This is the classic representation of a wave
with amplitude A. The constant ω is known as the angular frequency and
the constant φ is known as the phase angle.
Note that

ω = 2πf

where f is the traditional frequency measured in Hertz. As for all oscillations
with a frequency f the time period of oscillation is given by

T = 1/f.
The frequency f is known as the fundamental frequency of the signal since
this governs the overall timing of the repetitive nature of the function.
15.1.2 Example
Find the period T and the fundamental frequency f for
(a)
f(t) = sin t
(b)
g(t) = 3.4 cos(4.5t + 1.2)
where t is the time in seconds.
Solution
(a) In this case we know that sin t repeats every 2π radians. Therefore

T = 2π s ≈ 6.283 s,    f = 1/(2π) ≈ 0.159 Hz

(b) Everything here is a distraction apart from the constant 4.5 which
precedes the t. This is the angular frequency ω and so

ω = 2πf  ⇒  f = ω/(2π) = 4.5/(2π) ≈ 0.716 Hz

and therefore

T = 1/f ≈ 1.396 s.
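A two-line numerical check of part (b), purely as a sketch:

import math

omega = 4.5
f = omega / (2 * math.pi)   # fundamental frequency in Hz
T = 1 / f                   # period in seconds
print(round(f, 3), round(T, 3))   # 0.716 1.396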
15.2 Sets of functions
Fourier series require expansions in terms of certain collections, or sets, of
functions.
15.2.1 Orthogonal functions
A sequence (φ_k(x)) of functions is said to be orthogonal on [a, b] if

∫_a^b φ_m(x) φ_n(x) dx  { = 0  if m ≠ n
                          ≠ 0  if m = n
15.2.2 Orthonormal functions
A sequence (φ_k(x)) of functions is said to be orthonormal on [a, b] if

∫_a^b φ_m(x) φ_n(x) dx = { 0  if m ≠ n
                           1  if m = n
15.2.3 Norm of a function
The quantity

||φ_k|| = √( ∫_a^b (φ_k(x))² dx )

is known as the norm of the function φ_k(x). If we divide each function in
the collection of an orthogonal system by its norm we obtain an orthonormal
system.
15.3 Fourier concepts
We now examine the most important concepts at the heart of the Fourier
Series.
15.3.1 Fourier coefficients
If f is a function defined for a ≤ x ≤ b such that the integral

c_k = (1/||φ_k||²) ∫_a^b f(x) φ_k(x) dx

exists, then the sequence (c_k) is called the Fourier coefficients of f with
respect to the orthogonal system (φ_k(x)).
15.3.2 Fourier series
Using the Fourier coefficients for f, the infinite series

Σ_{k=0}^{∞} c_k φ_k(x)

is called the Fourier series of f with respect to the orthogonal system.
15.3.3 Convergence
We already know that not all infinite series converge to a value.
If we suppose that

f(x) is defined for a ≤ x ≤ b;
f(x + nT) = f(x) for all integers n;
f(x) is integrable on −π ≤ x ≤ π.

Then the Fourier series with respect to the trigonometric system exists,
and if f(x)

has a finite number of maxima and minima;
has a finite number of discontinuities

for a ≤ x ≤ b then the Fourier series converges to f(x) where f(x) is continuous,
and to the midpoint of the two pieces on either side of x otherwise.
These are known as the Dirichlet conditions.
15.4 Important functions
The most important sets of functions for Fourier Series are the trigonometric
and exponential functions.
15.4.1 Trigonometric system
The system of trigonometric functions
1, cos x, sin x, cos 2x, sin 2x, cos 3x, sin 3x, . . .
is orthogonal in the range −π ≤ x ≤ π. Furthermore

||1|| = √(2π);    ||cos nx|| = ||sin nx|| = √π

if n is 1, 2, 3, . . . .
Proof
||1||² = ∫_{−π}^{π} 1² dx = 2π;

||cos nx||² = ∫_{−π}^{π} cos²nx dx = (1/2) ∫_{−π}^{π} (1 + cos 2nx) dx
= (1/2) [ x + sin 2nx/(2n) ]_{−π}^{π} = π

||sin nx||² = ∫_{−π}^{π} sin²nx dx = (1/2) ∫_{−π}^{π} (1 − cos 2nx) dx
= (1/2) [ x − sin 2nx/(2n) ]_{−π}^{π} = π

Now we must show that the inner product of any function with any different
function evaluates to zero. It is left as a trivial exercise to show that
the products of 1 with sin nx and 1 with cos nx produce a zero integral.

∫_{−π}^{π} cos mx cos nx dx = (1/2) ∫_{−π}^{π} [ cos(m − n)x + cos(m + n)x ] dx
= (1/2) [ sin(m − n)x/(m − n) + sin(m + n)x/(m + n) ]_{−π}^{π} = 0 for (m ≠ n).

∫_{−π}^{π} sin mx sin nx dx = (1/2) ∫_{−π}^{π} [ cos(m − n)x − cos(m + n)x ] dx
= (1/2) [ sin(m − n)x/(m − n) − sin(m + n)x/(m + n) ]_{−π}^{π} = 0 for (m ≠ n).

∫_{−π}^{π} sin mx cos nx dx = (1/2) ∫_{−π}^{π} [ sin(m + n)x + sin(m − n)x ] dx
= (1/2) [ −cos(m + n)x/(m + n) − cos(m − n)x/(m − n) ]_{−π}^{π} = 0.
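The orthogonality integrals can be spot-checked with sympy; the sketch below (not part of the original notes) uses particular integers m and n rather than a general proof.

import sympy as sp

x = sp.symbols('x')
m, n = 2, 5
print(sp.integrate(sp.cos(m*x)*sp.cos(n*x), (x, -sp.pi, sp.pi)))  # 0
print(sp.integrate(sp.sin(m*x)*sp.sin(n*x), (x, -sp.pi, sp.pi)))  # 0
print(sp.integrate(sp.sin(m*x)*sp.cos(n*x), (x, -sp.pi, sp.pi)))  # 0
print(sp.integrate(sp.cos(n*x)**2, (x, -sp.pi, sp.pi)))           # pi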
15.4.2 Exponential system
The system of exponential functions
. . . , e^{−3jx}, e^{−2jx}, e^{−jx}, 1, e^{jx}, e^{2jx}, e^{3jx}, . . .

is orthogonal in the range −π ≤ x ≤ π.
15.5 Trigonometric expansions
Using the trigonometric system of functions in the range −π ≤ x ≤ π we
can see that the Fourier coefficients of f, if they exist, (c_k), can be more
easily written by splitting them into the coefficient for the function 1, a_0/2,
the coefficients for cos nx, a_n, and those for sin nx, b_n, whence

a_0 = (1/π) ∫_{−π}^{π} f(x) dx;

and for values of n = 1, 2, 3, . . .

a_n = (1/π) ∫_{−π}^{π} f(x) cos nx dx        b_n = (1/π) ∫_{−π}^{π} f(x) sin nx dx

and the Fourier series of f is given by

a_0/2 + Σ_{n=1}^{∞} (a_n cos nx + b_n sin nx)
15.5.1 Even functions
If f(x) is even on the range −π ≤ x ≤ π then it is easy to show² that

a_n = (2/π) ∫_0^{π} f(x) cos nx dx,  n ≥ 0,

b_n = 0,  n > 0.
Note these are not different expansions, but what the previously stated
expansions collapse to in this very special case.
2
This follows from the discussion in 7.4 before, noting the nature of each product being
integrated.
15.5.2 Odd functions
If f(x) is odd on the range −π ≤ x ≤ π then it is easy to show that

a_n = 0,  n ≥ 0

b_n = (2/π) ∫_0^{π} f(x) sin nx dx,  n > 0
15.5.3 Other Ranges
Note that in much of the theory above, we have not assumed that all expansions
run between −π and π. In the trigonometric system it is relatively easy
to adapt the process for periodic functions that repeat from −L to L. It can
be shown quite easily that
f(x) = a_0/2 + Σ_{n=1}^{∞} [ a_n cos(nπx/L) + b_n sin(nπx/L) ]

where

a_0 = (1/L) ∫_{−L}^{L} f(x) dx

a_n = (1/L) ∫_{−L}^{L} f(x) cos(nπx/L) dx  for n > 0

b_n = (1/L) ∫_{−L}^{L} f(x) sin(nπx/L) dx  for n > 0
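As a sketch of these formulae in use, the coefficients for an assumed example f(x) = x² on [−1, 1] (so L = 1) can be computed with sympy; this example is not from the notes.

import sympy as sp

x = sp.symbols('x')
L = 1
f = x**2
a0 = sp.integrate(f, (x, -L, L)) / L        # 2/3
print(a0)
for n in range(1, 4):
    an = sp.integrate(f*sp.cos(n*sp.pi*x/L), (x, -L, L)) / L
    bn = sp.integrate(f*sp.sin(n*sp.pi*x/L), (x, -L, L)) / L
    print(n, an, bn)   # a_n = 4*(-1)**n/(n*pi)**2, b_n = 0 (f is even)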
15.6 Harmonics
When the Fourier series of a function f(x) is produced using the trigonomet-
ric system, it is clear that the function is made up of signals with specic
frequencies.
The functions

a_n cos nx + b_n sin nx

could be combined together into one signal of the form

c_n sin(nx + φ_n)

using the technique shown in 4.9.3. Therefore this whole term represents the
component of the function or signal f(x) with angular frequency n.
This is called the n-th harmonic. The 1st harmonic,

a_1 cos x + b_1 sin x = c_1 sin(x + φ_1),

is known as the first harmonic or fundamental harmonic.
The term

a_0/2

can be looked at as a constant background level (a d.c. offset) that does not
rely on frequency at all.
15.6.1 Odd and Even Harmonics
Sometimes symmetry in the original function will tell us that some harmonics
will not be present in the final result.
Even Harmonics Only
If f(x) = f(x + π) then there will be only even harmonics.
Odd Harmonics Only
If f(x) = −f(x + π) then there will be only odd harmonics.
We can now surmise a great deal about the terms we expect to find in
the series before expansion; these are shown in table 15.1.
                     f(−x) = f(x) (Cosine only)    f(−x) = −f(x) (Sine only)
f(x) = f(x + π)      even cosine terms only        even sine terms only
f(x) = −f(x + π)     odd cosine terms only         odd sine terms only

Table 15.1: Symmetry in Fourier Series
15.6.2 Trigonometric system
For the trigonometric system, as described above, it is often useful to combine
the signals as described in 4.9.3. In general however this can be done to yield
the formulae
c_n = √(a_n² + b_n²)

φ_n = tan^{−1}(a_n / b_n)

for integers n > 0.
15.6.3 Exponential system
The exponential system is less intuitive than the trigonometric system, but
the constants c_n are often simpler to determine.
15.6.4 Percentage harmonic
With the constants c_n defined as above we define the percentage of the n-th
harmonic to be

(c_n / c_1) × 100.

That is, the percentage that the amplitude of the n-th harmonic is of the
amplitude of the fundamental harmonic.
15.7 Examples
15.7.1 Example
If f(x) = x for x such that −π < x ≤ π and f(x + 2π) = f(x) for all real x,
evaluate the Fourier series of f(x).
Solution
Since f(x) is an odd function we have that a_n = 0 and

b_n = (2/π) ∫_0^{π} x sin nx dx = (2/π) { [ −x cos nx / n ]_0^{π} + (1/n) ∫_0^{π} cos nx dx }

b_n = −(2 cos nπ)/n = 2(−1)^{n+1}/n.

Thus the Fourier series of f(x) is

2( sin x − sin 2x/2 + sin 3x/3 − sin 4x/4 + · · · ).
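The coefficients b_n = 2(−1)^{n+1}/n can be confirmed with sympy; this is only an illustrative sketch.

import sympy as sp

x = sp.symbols('x')
for n in range(1, 5):
    bn = (2/sp.pi) * sp.integrate(x*sp.sin(n*x), (x, 0, sp.pi))
    print(n, bn)   # 2, -1, 2/3, -1/2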
15.8 Exponential Series
We have noted above the exponential set of functions forms an orthogonal
set of functions and hence it will be possible to produce Fourier series in this
set of functions.
We can approach this idea from another direction. We know from Euler's
identity (5.6.3) that

e^{jθ} = cos θ + j sin θ

and therefore

e^{jnx} = cos nx + j sin nx

while

e^{−jnx} = cos(−nx) + j sin(−nx)

e^{−jnx} = cos nx − j sin nx

due to the even and odd nature of the cosine and sine functions respectively.
We can combine these two equations to find that

cos nx = (e^{jnx} + e^{−jnx})/2        sin nx = (e^{jnx} − e^{−jnx})/(2j)
Now consider the formula for the Fourier series of f(x) which is
f(x) = a_0/2 + Σ_{n=1}^{∞} (a_n cos nx + b_n sin nx)
and we shall insert these terms
= a_0/2 + Σ_{n=1}^{∞} [ (a_n/2)(e^{jnx} + e^{−jnx}) + (b_n/(2j))(e^{jnx} − e^{−jnx}) ]

To simplify things, we multiply top and bottom of the right most term by j,
which gives us j on the top line and means we are dividing by j² = −1. We
absorb that into the brackets, inverting the subtraction to give
= a_0/2 + Σ_{n=1}^{∞} [ (a_n/2)(e^{jnx} + e^{−jnx}) + (b_n j/2)(e^{−jnx} − e^{jnx}) ]

= a_0/2 + Σ_{n=1}^{∞} [ ((a_n − b_n j)/2) e^{jnx} + ((a_n + b_n j)/2) e^{−jnx} ]
and this can be written more simply as

= Σ_{n=−∞}^{∞} c_n e^{jnx}.

Note carefully here that n now runs from −∞ to ∞. The relationships of the
constants c_n are
c_n = { (1/2)(a_n − j b_n)        n > 0
        (1/2) a_0                 n = 0
        (1/2)(a_{−n} + j b_{−n})  n < 0
Chapter 16
Laplace transforms
Laplace¹ transforms are (among other things) a way of transforming a differential
equation into an algebraic equation. This equation is then rearranged
and we attempt to reverse the transform. Unfortunately, this last part is
usually the hardest.
16.1 Definition
Suppose that f(t) is some function of t, then the Laplace transform of f(t),
denoted L{f(t)}, is given by

F(s) = L{f(t)} = ∫_0^{∞} e^{−st} f(t) dt
16.1.1 Example
Find L{e^{at}}.
Solution
L{e^{at}} = ∫_0^{∞} e^{at} e^{−st} dt = ∫_0^{∞} e^{(a−s)t} dt

= [ e^{(a−s)t} / (a − s) ]_0^{∞}

which will be defined (convergent) when s > a. In which case

L{e^{at}} = 0 − 1/(a − s) = 1/(s − a)

¹ These are named after Pierre Simon de Laplace (1749-1827), a brilliant French
mathematician sometimes known as the Newton of France.
16.1.2 Example
Find L{1}.
Solution
L{1} = ∫_0^{∞} 1 · e^{−st} dt = [ −e^{−st}/s ]_0^{∞}

Thus

L{1} = 0 − (−1/s) = 1/s
16.1.3 Example
Find L{t^n} where n is a positive integer.
Solution
L{t^n} = ∫_0^{∞} t^n e^{−st} dt

Now, recall the integration by parts formula

∫ u (dv/dt) dt = uv − ∫ v (du/dt) dt

and allow

u = t^n  ⇒  du/dt = n t^{n−1};    dv/dt = e^{−st}  ⇒  v = −e^{−st}/s

Thus we obtain

L{t^n} = [ −t^n e^{−st}/s ]_0^{∞} − ∫_0^{∞} ( −e^{−st}/s ) n t^{n−1} dt

The already integrated part of the expression evaluates to zero: when ∞ is
used as a limit the exponential term crushes the polynomial term to zero,
and when zero is used as a limit the polynomial term is also zero. Thus,
tidying up the integral that remains, gives

L{t^n} = (n/s) ∫_0^{∞} t^{n−1} e^{−st} dt = (n/s) L{t^{n−1}}

so that we obtain a reduction formula. Now, we know L{t^0} = L{1} = 1/s.
Therefore,

L{t} = (1/s)(1/s) = 1/s²

L{t²} = (2/s)(1/s²) = 2/s³

and in general

L{t^n} = n!/s^{n+1}
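These transforms can be confirmed with sympy's built-in laplace_transform; a brief sketch, not part of the original notes:

import sympy as sp

t, s, a = sp.symbols('t s a', positive=True)
print(sp.laplace_transform(sp.exp(a*t), t, s, noconds=True))  # 1/(s - a)
print(sp.laplace_transform(sp.S(1), t, s, noconds=True))      # 1/s
print(sp.laplace_transform(t**3, t, s, noconds=True))         # 6/s**4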
16.1.4 Inverse Transform
We define the inverse Laplace transform, denoted L^{−1}{F(s)}, in the obvious
way.

L{f(t)} = F(s)  ⇔  L^{−1}{F(s)} = f(t)
16.1.5 Elementary properties
As the Laplace transform is an integral, we can employ automatically those
laws we take for granted in integration. In particular
L{f(t) ± g(t)} = L{f(t)} ± L{g(t)}

and

L{k f(t)} = k L{f(t)}
In other words, we may split the transform (and for that matter the inverse
transform) over addition and or subtraction, and we can take constants in
and out of the transform.
16.1.6 Example
Find L{sin ωt} and L{cos ωt}.
Solution
We use Euler's identity

e^{jωt} = cos ωt + j sin ωt

Now

L{e^{jωt}} = 1/(s − jω) = 1/(s − jω) × (s + jω)/(s + jω) = s/(s² + ω²) + j ω/(s² + ω²)

So

L{e^{jωt}} = L{cos ωt + j sin ωt}

and equating real and imaginary components we obtain

L{cos ωt} = s/(s² + ω²)

and

L{sin ωt} = ω/(s² + ω²)
16.2 Important Transforms
We present a table (see table 16.1) of important transforms, without proof,
as this would require the precise definition of indefinite integrals. In most
cases a condition must be satisfied for the integral to be convergent (and thus
for the transform to exist).
16.2.1 First shifting property
Let f(t) be a function of t with Laplace transform F(s), which exists for
s > b, then if a is a real number
L{e^{at} f(t)} = F(s − a)

for s > a + b.
Proof
Clearly, if

∫_0^{∞} e^{−st} f(t) dt = F(s)
f(t)       L{f(t)}          Condition
e^{at}     1/(s − a)        s > a
k          k/s              s > 0
t^n        n!/s^{n+1}       s > 0
sin at     a/(s² + a²)      s > 0
cos at     s/(s² + a²)      s > 0
sinh at    a/(s² − a²)      s > |a|
cosh at    s/(s² − a²)      s > |a|

Table 16.1: Common Laplace transforms
f(t)            L{f(t)}                    Condition
e^{at} t^n      n!/(s − a)^{n+1}           s > a
e^{at} sin bt   b/((s − a)² + b²)          s > a
e^{at} cos bt   (s − a)/((s − a)² + b²)    s > a
e^{at} sinh bt  b/((s − a)² − b²)          s > a + |b|
e^{at} cosh bt  (s − a)/((s − a)² − b²)    s > a + |b|

Table 16.2: Further Laplace transforms
then

F(s − a) = ∫_0^{∞} e^{−(s−a)t} f(t) dt

We note that

L{e^{at} f(t)} = ∫_0^{∞} e^{at} f(t) e^{−st} dt = ∫_0^{∞} e^{−(s−a)t} f(t) dt

where the right most term is clearly F(s − a) as shown above.
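A quick sympy check of the first shifting property on an assumed example f(t) = sin bt (a sketch, not part of the notes):

import sympy as sp

t, s = sp.symbols('t s', positive=True)
a, b = sp.symbols('a b', positive=True)
F = sp.laplace_transform(sp.sin(b*t), t, s, noconds=True)           # b/(s**2 + b**2)
G = sp.laplace_transform(sp.exp(a*t)*sp.sin(b*t), t, s, noconds=True)
print(sp.simplify(G - F.subs(s, s - a)))                            # 0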
16.2.2 Further Laplace transforms
The first shifting property of Laplace transforms gives rise to the following
other transforms, shown in table 16.2.
16.3 Transforming derivatives
The most important property of the Laplace transform is how derivatives of
f(t) transform.
16.3.1 First derivative
Let |f(t)| ≤ M e^{at} when t ≥ T for non-negative constants M, a and T. Then

L{f′(t)} = s L{f(t)} − f(0)
16.3.2 Second derivative
For the second derivative, we obtain
Lf

(t) = s
2
Lf(t) sf(0) f

(0)
16.3.3 Higher derivatives
We can continue (using proof by induction) to obtain
L
_
f
(n)
(t)
_
= s
n
Lf(t) s
n1
f(0) s
n2
f

(0) f
(n1)
(0)
The most important thing to note is that all these derivatives transform to
simple algebraic expressions in terms of the Laplace transform of f(t) itself.
In other words, the differentiation is undone.
In general, transforming an equation is relatively simple. It must then be
rearranged for the transform of f(t) and then, and this is the hard part, we
struggle to recognise forms for inverse transforms.
16.4 Transforming integrals
If L{f(t)} = F(s) then

L{ ∫_0^{t} f(t) dt } = (1/s) F(s).
Proof
Let

∫_0^{t} f(t) dt = g(t)

or in other words

f(t) = (d/dt) g(t)

which under Laplace transformation yields

L{f(t)} = s L{g(t)} − g(0)

Note that

g(0) = ∫_0^{0} f(x) dx = 0

where x has been used to stress the dummy variable. Therefore

L{g(t)} = (1/s) L{f(t)}  ⇒  L{ ∫_0^{t} f(t) dt } = (1/s) F(s)
16.5 Differential Equations
We are now in a position to solve some differential equations using Laplace
transforms.
16.5.1 Example
In nuclear decay, the rate of decay is directly proportional to the number of
nuclei yet to decay. Therefore the number of nuclei remaining, N(t), follows
the differential equation

dN/dt = −λN

Assuming that N = N_0 when t = 0, solve the differential equation.
Solution
There are much easier ways of solving this equation than resorting to Laplace
transforms as the reader is no doubt aware, but we use it as a very simple
example of the principles involved.
First we transform the whole equation

s L{N} − N(0) = −λ L{N}

Now we have an algebraic equation, into which we plug our initial known
conditions.

s L{N} − N_0 = −λ L{N}

We now rearrange this to make L{N} the subject of the equation.

s L{N} + λ L{N} = N_0

(s + λ) L{N} = N_0

L{N} = N_0/(s + λ)

In theory now we need only find the expression that transforms to the RHS
and we have solved our differential equation. In practice this can be quite
tricky and we often have to manipulate the expression a great deal first. In
this very simple case, very little has to be done.

L{N} = N_0 · 1/(s + λ)

Now the expression 1/(s + λ) is L{e^{at}} where we take a = −λ and so

L{N} = N_0 L{e^{−λt}}

L{N} = L{N_0 e^{−λt}}

Therefore, removing the transform on both sides we have

N = N_0 e^{−λt}
16.5.2 Example
Solve the differential equation

2 d²x/dt² + 5 dx/dt − 3x = t − 4

given that x = 0 and dx/dt = 2 when t = 0.
Solution
First of all we transform the equation on both sides

2(s² x̄ − s x(0) − x′(0)) + 5(s x̄ − x(0)) − 3 x̄ = 1/s² − 4/s.

Now we insert the initial conditions, which are x(0) = 0 and x′(0) = 2,
to obtain

2s² x̄ − 4 + 5s x̄ − 3 x̄ = 1/s² − 4/s.

Next, we rearrange to make the Laplace transform of x(t), denoted by x̄
for short, the subject of the equation.

(2s² + 5s − 3) x̄ = 1/s² − 4/s + 4 = (1 − 4s + 4s²)/s²

x̄ = (4s² − 4s + 1)/(s²(2s² + 5s − 3)).

Finally, we have the most difficult part: we have already transformed the
whole differential equation into an algebraic equation and solved it, now we
have to invert the transform.
Factorize first, to simplify as much as possible.

x̄ = (2s − 1)²/(s²(2s − 1)(s + 3)) = (2s − 1)/(s²(s + 3))

and now use partial fractions

(2s − 1)/(s²(s + 3)) = A/s + B/s² + C/(s + 3)

and multiply both sides by the denominator of the L.H.S.

2s − 1 = A s(s + 3) + B(s + 3) + C s².

This is an identity, or in other words it is true for all values of s, so certainly
it is true for specific values.

s = −3  ⇒  −7 = 9C  ⇒  C = −7/9

s = 0   ⇒  −1 = 3B  ⇒  B = −1/3

and comparing coefficients of s² we obtain

0 = A + C  ⇒  A = 7/9.

Thus

x̄ = (7/9)(1/s) − (1/3)(1/s²) − (7/9)(1/(s + 3))

and therefore

x = 7/9 − t/3 − (7/9) e^{−3t}.
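A short sympy sketch (not part of the notes) verifying that this x(t) satisfies the differential equation and the initial conditions:

import sympy as sp

t = sp.symbols('t')
x = sp.Rational(7, 9) - t/3 - sp.Rational(7, 9)*sp.exp(-3*t)
residual = 2*sp.diff(x, t, 2) + 5*sp.diff(x, t) - 3*x - (t - 4)
print(sp.simplify(residual))                     # 0
print(x.subs(t, 0), sp.diff(x, t).subs(t, 0))    # 0 and 2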
16.5.3 Example
Solve the differential equation

d⁴y/dt⁴ − 81y = 0

given that

y = 1,    dy/dt = d²y/dt² = d³y/dt³ = 0

when t = 0.
Solution
Transforming the equation yields

s⁴ ȳ − s³ y(0) − s² y′(0) − s y″(0) − y‴(0) − 81 ȳ = 0

and inserting initial conditions simplifies this to

s⁴ ȳ − s³ − 81 ȳ = 0

which rearranges to

(s⁴ − 81) ȳ = s³  ⇒  ȳ = s³/(s⁴ − 81) = s³/((s² − 9)(s² + 9)).

Another simplification is possible here.

ȳ = s³/((s − 3)(s + 3)(s² + 9)) = A/(s − 3) + B/(s + 3) + (Cs + D)/(s² + 9)

s³ = A(s + 3)(s² + 9) + B(s − 3)(s² + 9) + (Cs + D)(s − 3)(s + 3).

We begin inserting key values of s once again.

s = −3  ⇒  −27 = B(−6)(18)  ⇒  B = 1/4;

s = 3   ⇒  27 = A(6)(18)    ⇒  A = 1/4;

s = 0   ⇒  0 = 27A − 27B − 9D  ⇒  D = 0;

s = 1   ⇒  1 = 40A − 20B − 8C  ⇒  C = 1/2.

Therefore

ȳ = (1/4) · 1/(s − 3) + (1/4) · 1/(s + 3) + (1/2) · s/(s² + 9)

and we can easily now employ the inverse transform to obtain

y = (1/4) e^{3t} + (1/4) e^{−3t} + (1/2) cos 3t
Alternative Solution
Suppose that we missed the second factorization of the bottom line, then we
would have proceeded as follows.

s³/((s² − 9)(s² + 9)) = (As + B)/(s² − 9) + (Cs + D)/(s² + 9)

and multiply by the denominator of the L.H.S.

s³ = (As + B)(s² + 9) + (Cs + D)(s² − 9),

s = 3   ⇒  27 = 54A + 18B    ⇒  3 = 6A + 2B

s = −3  ⇒  −27 = −54A + 18B  ⇒  −3 = −6A + 2B

which solved together yield A = 1/2 and B = 0.

s = 3j  ⇒  −27j = (3jC + D)(−18)

which gives D = 0 and C = 1/2.
Now we have

ȳ = (1/2) · s/(s² − 9) + (1/2) · s/(s² + 9)

and using our table of transforms we see that

y = (1/2)(cosh 3t + cos 3t)

This looks like a different solution, but in fact, using the definition of the
cosh function we see that

y = (1/2)[ (1/2) e^{3t} + (1/2) e^{−3t} ] + (1/2) cos 3t

which can be seen to be equivalent.
16.5.4 Example
Solve the differential equation

d²x/dt² = −ω²x

using Laplace transforms² assuming that x(0) = A and x′(0) = 0.
Solution
Taking the transforms on both sides gives

s² x̄ − s x(0) − x′(0) = −ω² x̄

and filling in initial conditions gives

s² x̄ − sA = −ω² x̄

and rearranging for x̄

(s² + ω²) x̄ = As  ⇒  x̄ = A s/(s² + ω²)

from which it follows simply that

x = A cos ωt
² This is the equation for Simple Harmonic Motion and is a very important differential
equation. It is not necessary to use Laplace transforms to solve it and this method is used
as an example.
16.5.5 Exercise
Using the previous differential equation

d²x/dt² = −ω²x

but the initial conditions x(0) = 0 and x′(0) = v, show that

x = (v/ω) sin ωt
16.5.6 Example
Consider the network shown in figure 16.1.
Figure 16.1: An L and R circuit.
Find i given that i = 0 when t = 0.
Solution
By Kirchhoff's laws

R i + L di/dt = E

and taking Laplace transforms on both sides gives

R ī + L(s ī − i(0)) = E/s

inserting initial conditions gives

(R + Ls) ī = E/s

and so

ī = E/(s(Ls + R)) = A/s + B/(Ls + R).

Therefore

E = A(Ls + R) + Bs

which is true for all values of s, so in particular

s = 0  ⇒  E = AR  ⇒  A = E/R

while comparing coefficients of s gives

0 = AL + B  ⇒  B = −EL/R.

Inserting these values for A and B gives

ī = (E/R)(1/s) − (EL/R)(1/(Ls + R)) = (E/R)(1/s) − (E/R)(1/(s + R/L))

So finally we have to invert the transform to obtain

i = E/R − (E/R) e^{−(R/L)t}

or

i = (E/R)(1 − exp(−(R/L)t))
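A brief sympy sketch (not from the notes) confirming that this current satisfies the circuit equation and the initial condition:

import sympy as sp

t = sp.symbols('t', positive=True)
R, L, E = sp.symbols('R L E', positive=True)
i = (E/R)*(1 - sp.exp(-R*t/L))
print(sp.simplify(R*i + L*sp.diff(i, t) - E))   # 0
print(i.subs(t, 0))                             # 0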
16.5.7 Example
Consider the circuit shown in figure 16.2.
Figure 16.2: An L, C, and R circuit.
Find i given that R = 250 Ω, C = 10^{−4} F, E = 10 V and L = 1 H and that
the initial current is zero.
Solution
From Kirchhoff's laws we have

R i + L di/dt + (1/C) ∫_0^{t} i dt = 10

which when transformed gives

R ī + L(s ī − i(0)) + (1/(Cs)) ī = 10/s

then we fill in our conditions to give

ī (s + 250 + 10⁴/s) = 10/s

which with some work yields

ī = 10/((s + 50)(s + 200))

finally leaving us with

i = (1/15)( e^{−50t} − e^{−200t} )
16.6 Other theorems
Here are some more theorems about Laplace transforms for which no proof
is given. In each case we assume that f(t) is a function and that
F(s) = L{f(t)}.
16.6.1 Change of Scale
Given a constant a, then

L{f(at)} = (1/a) F(s/a).
16.6.2 Derivative of the transform
Given a constant n, then

L{t f(t)} = −(d/ds) F(s)

or more generally

L{t^n f(t)} = (−1)^n (d^n/ds^n) F(s)
16.6.3 Convolution Theorem
Given that g(t) is such that

G(s) = L{g(t)}

then

L^{−1}{F(s) G(s)} = ∫_0^{t} f(r) g(t − r) dr.
This integral on the right is often denoted by g ∗ f or f ∗ g.
16.6.4 Example
Given that

L{cos t} = s/(s² + 1)

find

L{cos 3t}.
Solution
We already can work this out straight away from our table 16.1 but we do
this as an exercise.
From 16.6.1 we see that

L{cos 3t} = (1/3) · (s/3)/((s/3)² + 1) = (1/9) · s/(s²/9 + 1) = (1/9) · 9s/(s² + 9) = s/(s² + 9)
16.6.5 Example
Given that
L
_
e
4t
_
=
1
s 4
nd
L
_
te
4t
_
Solution
Again, we can work this out directly from our table 16.2 but this is an
exercise.
We can see from 16.6.2 that we can write

L{t e^{4t}} = −(d/ds)( 1/(s − 4) ) = −(−1)(s − 4)^{−2} = 1/(s − 4)²
16.6.6 Example
Find
L^{−1}{ 1/((s − 2)(s − 3)) }
Solution
Once more, this problem would normally be tackled using partial fractions,
but we use it as a very simple application of the convolution theorem (see 16.6.3).
Let

F(s) = 1/(s − 2)  ⇒  f(t) = e^{2t};    G(s) = 1/(s − 3)  ⇒  g(t) = e^{3t}

Then

L^{−1}{ 1/((s − 2)(s − 3)) } = ∫_0^{t} e^{2r} e^{3(t−r)} dr = ∫_0^{t} e^{2r} e^{3t} e^{−3r} dr

Note that t is a constant with respect to this integration, and so we may
bring a term outside.

= e^{3t} ∫_0^{t} e^{−r} dr = e^{3t} [ −e^{−r} ]_0^{t}

giving a final answer of

= −e^{2t} + e^{3t}.
16.7 Heaviside unit step function
The Heaviside unit step function³ is defined by

u(t) = { 0 for t < 0
         1 for t ≥ 0

and is shown in figure 16.3.

³ Oliver Heaviside (1850-1925) was a British physicist with no university education.
He produced much of the theoretical underpinning of cable telegraphy.
Figure 16.3: The unit step function u(t).
At times we may wish to cause the unit step to occur earlier or later than
t = 0, and we can do this quite easily by examining the function u(t − c)
where c is some constant. This function takes the form

u(t − c) = { 0 for t < c
             1 for t ≥ c

and is shown in figure 16.4.
Figure 16.4: The displaced unit step function u(t − c).
The effect of this function is to allow us to switch on and off other functions
and thus help us make very complicated waveforms. We do this by
multiplying the unit function against the function we wish to control, say
f(t).
Thus
f(t) · u(t − c) = { 0    for t < c
                    f(t) for t ≥ c

and we have successfully switched off the function f(t) for times before
t = c. Frequently we shall want much more fine control than this however,
requiring that we can again switch off the function f(t) after some time
interval. This can easily be done by combining variations of u(t − c).
Figure 16.5: Building functions that are on and off when we please.
For example, consider figure 16.5. Here we have shown the
graphs of u(t − 2) and u(t − 4) and clearly, by subtracting the second graph
from the first, we obtain the third graph, which shows that we can switch
on between t = 2 and t = 4 or any other values we please.
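A small numerical sketch of this idea (not part of the notes; the helper u is defined here for illustration):

import numpy as np

def u(t, c=0.0):
    """Heaviside unit step u(t - c)."""
    return np.where(t >= c, 1.0, 0.0)

t = np.linspace(0, 6, 13)
window = u(t, 2) - u(t, 4)      # 1 for 2 <= t < 4, 0 otherwise
print(window)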
16.7.1 Laplace transform of u(t − c)

L{u(t − c)} = e^{−cs}/s
Proof
By definition

L{u(t − c)} = ∫_0^{∞} e^{−st} u(t − c) dt

but we know that

e^{−st} u(t − c) = { 0       for t < c
                     e^{−st} for t ≥ c

and thus

L{u(t − c)} = ∫_c^{∞} e^{−st} dt.

Note in particular the change of limits of this integral. Therefore

L{u(t − c)} = [ −e^{−st}/s ]_c^{∞} = 0 − ( −e^{−sc}/s )

and so

L{u(t − c)} = e^{−sc}/s

and allowing c to be zero, note in particular that

L{u(t)} = 1/s
16.7.2 Example
Find the function f(t), described by step functions, and the transform F(s)
for the waveform shown in figure 16.6.
Figure 16.6: A positive waveform built from steps.
Solution
First of all we begin to compose the waveform.
f(t) = 4[u(t) − u(t − 2)] + 2[u(t − 4) − u(t − 6)] + [u(t − 7) − u(t − 8)]

In each case we multiply a combination of step functions, which switch the
function on and off in the appropriate places, by the function we want in that
region; here that is just a constant each time.
It is easier to expand this before the transform is applied

f(t) = 4u(t) − 4u(t − 2) + 2u(t − 4) − 2u(t − 6) + u(t − 7) − u(t − 8).

This may look a little intimidating at first, but in fact it is easy to transform
this when we recall that

We can split the transform over addition and subtraction;
We can take constants outside the transform;
Transforming u(t − c) is straightforward.

So we obtain

L{f(t)} = F(s) = 4(1/s) − 4 e^{−2s}/s + 2 e^{−4s}/s − 2 e^{−6s}/s + e^{−7s}/s − e^{−8s}/s

= (1/s)( 4 − 4e^{−2s} + 2e^{−4s} − 2e^{−6s} + e^{−7s} − e^{−8s} )
16.7.3 Example
Find the function f(t), described by step functions, and the transform F(s)
for the waveform shown in figure 16.7.
Figure 16.7: A waveform built from steps.
Solution
From the graph we see that

f(t) = −2[u(t − 1) − u(t − 3)] + 1[u(t − 3) − u(t − 5)]

which when expanded yields

= −2u(t − 1) + 2u(t − 3) + u(t − 3) − u(t − 5) = −2u(t − 1) + 3u(t − 3) − u(t − 5).

Now we apply the transform and obtain

L{f(t)} = F(s) = −2 e^{−s}/s + 3 e^{−3s}/s − e^{−5s}/s

= (1/s)( −2e^{−s} + 3e^{−3s} − e^{−5s} )
16.7.4 Delayed functions
If L{f(t)} = F(s), then the Laplace transform of the delayed function
f(t − a)u(t − a) is given by

L{f(t − a) u(t − a)} = e^{−as} F(s)
Proof
Clearly

L{f(t − a) u(t − a)} = ∫_0^{∞} e^{−st} f(t − a) u(t − a) dt

but note that no area can occur before t = a (due to the switching off of the
function with u(t − a)). Therefore this integral becomes

= ∫_a^{∞} e^{−st} f(t − a) dt.

We now make a substitution, let T = t − a. Then clearly

dT/dt = 1  ⇒  dT = dt

under this substitution the integral becomes

∫_0^{∞} e^{−s(T+a)} f(T) dT.

Note carefully the change of limits, and observe that s is a constant with
respect to this integral.

= e^{−as} ∫_0^{∞} e^{−sT} f(T) dT = e^{−as} F(s)

which proves the result.
which proves the result.
16.7.5 Example
Find the function f(t), described by step functions, and the transform F(s)
for the waveform shown in figure 16.8.
Solution
Some simple examination shows that this function could be defined as

f(t) = { t        when 0 ≤ t < 2
         2        when 2 ≤ t < 5
         12 − 2t  when 5 ≤ t < 6

and which is zero at all other times. Converting this into step function form,
we obtain

f(t) = t[u(t) − u(t − 2)] + 2[u(t − 2) − u(t − 5)] + (12 − 2t)[u(t − 5) − u(t − 6)].

Figure 16.8: A waveform built from delayed linear functions.

Actually there is more than one way of tackling the problem from this
point; this is one method.⁴ We expand f(t) thinking ahead to what we can
tackle.

f(t) = t u(t) − t u(t − 2) + 2u(t − 2) − 2u(t − 5) + 12u(t − 5) − 2t u(t − 5) + 2(t − 6)u(t − 6)

where the last section has been expanded to make it a delayed function. If
we proceed to do this in the other sections where possible we obtain

f(t) = t u(t) − (t − 2)u(t − 2) − 2(t − 5)u(t − 5) + 2(t − 6)u(t − 6)

So now we appeal to the result for delayed functions, noting that the
function being delayed is f(t) = t in each case, to obtain

F(s) = 1/s² − e^{−2s}/s² − 2 e^{−5s}/s² + 2 e^{−6s}/s²

= (1/s²)( 1 − e^{−2s} − 2e^{−5s} + 2e^{−6s} )

⁴ Another way in this case would be to expand with no forethought whereupon we would
have a list of step functions, and step functions multiplied by t. We could then use the
derivative of the transform (see 16.6.2) to crack the problem.
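As a sketch (not part of the notes), the step-function form of f(t) can be handed directly to sympy, which supports Heaviside in laplace_transform, and the result compared with F(s) above:

import sympy as sp

t, s = sp.symbols('t s', positive=True)
H = sp.Heaviside
f = t*H(t) - (t - 2)*H(t - 2) - 2*(t - 5)*H(t - 5) + 2*(t - 6)*H(t - 6)
F = sp.laplace_transform(f, t, s, noconds=True)
print(sp.simplify(F))   # should simplify to (1 - exp(-2*s) - 2*exp(-5*s) + 2*exp(-6*s))/s**2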
16.8 The Dirac Delta
The impulse function or Dirac delta is defined by
δ(t) = { 0    for t < 0
         1/ε  for 0 ≤ t < ε
         0    for t ≥ ε

where ε → 0 and

∫_{−∞}^{∞} δ(t) dt = 1

The Laplace transform of δ(t) is given by

L{δ(t)} = 1
Proof
L{δ(t)} = L{ (1/ε)[ u(t) − u(t − ε) ] } = (1/(εs))( 1 − e^{−sε} )

Now, recall that

e^{−x} = 1 − x + x²/2! − x³/3! + · · ·

so that

L{δ(t)} = (1/(εs))( 1 − 1 + sε − s²ε²/2! + s³ε³/3! − · · · )

= 1 − sε/2! + s²ε²/3! − · · ·

and now, allowing ε → 0 we obtain

L{δ(t)} = 1
16.8.1 Delayed impulse
The Laplace transform of the delayed impulse function, δ(t − a), is given by

L{δ(t − a)} = e^{−as}
Proof
This is a direct consequence of the first shifting property (16.2.1) and the
transform of the normal Dirac delta (16.8).
16.8.2 Example
Find the Laplace transform of the wave train shown in figure 16.9.
Figure 16.9: An impulse train built from Dirac deltas.
Solution
We could describe the train as a function as follows

f(t) = 3δ(t − 1) − 2δ(t − 3) + δ(t − 4) − 3δ(t − 5) + 2δ(t − 6) − δ(t − 8)

and thus the transform will be

L{f(t)} = F(s) = 3e^{−s} − 2e^{−3s} + e^{−4s} − 3e^{−5s} + 2e^{−6s} − e^{−8s}
16.9 Transfer Functions
Consider once again the circuit in figure 16.2, with a potentially varying EMF
e. If we take Laplace transforms on both sides and adopt the shorthands

ī = L{i(t)},    ē = L{e(t)}

then we obtain

L s ī + R ī + (1/(Cs)) ī = ē.
Thus

ī (LCs² + RCs + 1)/(Cs) = ē  ⇒  ī/ē = Cs/(LCs² + RCs + 1).

This is called the transfer function for the circuit. In general the transfer
function of a system is given by

F_T(s) = L{output}/L{input}.
The transfer function is determined by the system, and once found is un-
changed by varying inputs and their corresponding outputs. The analysis of
the transfer function can reveal information about the stability of the system.
The system can be considered stable if the output remains bounded for all
values of t, even as t → ∞. So terms in the output of the form e^t, t, and
t² cos 3t are all unbounded. On the other hand, terms of the form e^{−t} and
e^{−2t} cos t show stability.
We can determine the presence of such terms by analysing the poles⁵ of
the transfer function. This really comes down to examining the denominator
and finding what values of s cause it to become zero.
We can then analyse the stability of the system by plotting the poles on
an Argand diagram, and using the following simple rules.
If all the poles occur to the left of the imaginary axis then the system
is stable;
If any pole occurs to the right of the imaginary axis then the system is
unstable;
If a pole occurs on the imaginary axis the system is marginally stable
if the pole is of order⁶ 1. The system is unstable if the pole is of higher
order.
So for example, consider the factors in the denominator of the transfer
function shown in table 16.3 and what they indicate. In each case we can see
an expression that would arise from the inverse and see the stability of the
end system.
⁵ A pole of a function f(z) is a value which when inserted into z causes an infinite
value of f(z). Note that z may be complex.
⁶ The order of a pole is essentially how often it occurs. If a pole is listed once, it has
order 1, and if listed twice it has order 2, etc.
Factor     Inverse     Pole    Order   Stable?
(s − 3)    e^{3t}       3       1      no
(s + 2)    e^{−2t}     −2       1      yes
(s + 2)²   t e^{−2t}   −2       2      yes
(s² + 4)   sin 2t      ±2j      1      yes
s          1            0       1      yes
s²         t            0       2      no

Table 16.3: Examples of transfer function denominators
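The pole-location rule is easy to apply numerically; the sketch below (not part of the notes) finds the poles of the RLC transfer-function denominator above with numpy and checks they lie in the left half-plane.

import numpy as np

# denominator L*C*s**2 + R*C*s + 1 with the component values from 16.5.7
L, C, R = 1.0, 1e-4, 250.0
poles = np.roots([L*C, R*C, 1.0])
print(poles)                               # -200 and -50 (in some order)
print(all(p.real < 0 for p in poles))      # True, so the system is stable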
16.9.1 Impulse Response
If the input to the system is the impulse function δ(t), and the response is
the function h(t), then the transfer function is given by

F_T(s) = L{h(t)}/L{δ(t)} = L{h(t)},

since L{δ(t)} = 1. So one way to determine the transfer function for a
system is simply to take the Laplace transform of the impulse response for
that system.
Supposing that now we change the input function to f(t) and that the
corresponding output function is g(t), we obtain

F_T(s) = L{g(t)}/L{f(t)}  ⇒  L{h(t)} = L{g(t)}/L{f(t)}  ⇒  L{g(t)} = L{f(t)} L{h(t)}.

So we can obtain the transform of the general output, provided we know the
Laplace transform of the input and the impulse response for the system.
To obtain the output from the system g(t) we now take the inverse transform
on both sides.

g(t) = L^{−1}{ L{f(t)} L{h(t)} }.

From the convolution theorem (see 16.6.3), we see that

g(t) = f ∗ h.
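A numerical sketch of g = f ∗ h for an assumed first-order system h(t) = e^{−t} driven by a unit step (none of this is from the notes):

import numpy as np

dt = 0.001
t = np.arange(0, 10, dt)
h = np.exp(-t)                          # assumed impulse response
f = np.ones_like(t)                     # unit-step input
g = np.convolve(f, h)[:len(t)] * dt     # discrete approximation to f * h
print(round(g[-1], 3))                  # approaches 1 - exp(-10), i.e. about 1.0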
16.9.2 Initial value theorem
If f(t) is a function of t with Laplace transform F(s) then

lim_{t→0} f(t) = lim_{s→∞} s F(s).

Proof

We know that

L{ df/dt } = ∫_0^{∞} e^{−st} f′(t) dt = s F(s) − f(0).

Now, let s → ∞, to obtain

0 = lim_{s→∞} s F(s) − f(0)

as the integral will tend to zero, because as s → ∞ the term e^{−st} tends
rapidly to zero. Since f(0) = lim_{t→0} f(t), this gives

0 = lim_{s→∞} s F(s) − lim_{t→0} f(t)

and rearranging gives us the theorem.
16.9.3 Final value theorem
If f(t) is a function of t with Laplace transform F(s), and if lim_{t→∞} f(t)
exists, then

lim_{t→∞} f(t) = lim_{s→0} s F(s).
Proof
Once again, we begin with the observation

L{ df/dt } = ∫_0^{∞} e^{−st} f′(t) dt = s F(s) − f(0).

This time we allow s → 0. On the left hand side (the integral) we obtain

∫_0^{∞} f′(t) dt = lim_{t→∞} ∫_0^{t} f′(t) dt = lim_{t→∞} f(t) − f(0)

and on the right hand side we obtain simply

lim_{s→0} s F(s) − f(0).

So equating these again we obtain

lim_{t→∞} f(t) − f(0) = lim_{s→0} s F(s) − f(0)

and adding f(0) on both sides we obtain the final result.
Chapter 17
Z-transform
17.1 Concept
Consider a signal of a continuous function f(t) and how it appears when the
signal is examined at discrete time intervals, say when
t = kT, k = 0, 1, 2, . . .
where T is some fixed time period.
Figure 17.1: A continuous (analog) function
Then instead of seeing a continuous curve we instead obtain a series of
spikes indicating the value of the signal at each sampling period, these can
be represented by Dirac deltas.
Figure 17.2: Sampling the function
Figure 17.3: The digital view
In this way, the discrete version of the sample appears as

f_D(t) = Σ_{k=0}^{∞} f(kT) δ(t − kT)

where f(kT) is the value of the signal at the sampling instants and δ(t − kT)
is the delayed Dirac delta. Note that k is a dummy variable in this summation
and won't appear in the final expansion, although T will.
If we consider the Laplace transform of this sum, we obtain

F(s) = L{f_D(t)} = f(0) + f(T) e^{−Ts} + f(2T) e^{−2Ts} + · · ·

F(s) = Σ_{k=0}^{∞} f(kT) e^{−kTs}

In this sort of expansion the exponential term occurs very frequently. If we
make the simplifying substitution

z = e^{Ts}

then the transformed equation becomes

Σ_{k=0}^{∞} f(kT) z^{−k}

This expression is the Z-transform of the discrete function, and we write

F(z) = Z{f_D(t)} = Σ_{k=0}^{∞} f(kT) z^{−k}
17.2 Important Z-transforms
17.2.1 Unit step function
Z{u(t)} = z/(z − 1)
Proof
From the definition of the Z-transform,
U(z) = Z{u(t)} = Σ_{k=0}^{∞} u(kT) z^{−k}

Now u(kT) means the value of the step function at each of the specified
sampling periods, but this is always simply one.

U(z) = Σ_{k=0}^{∞} 1 · z^{−k} = 1 + z^{−1} + z^{−2} + z^{−3} + · · ·

which is a geometric progression. Using the formula for the sum to infinity
we obtain that (if |z| > 1)

U(z) = 1/(1 − z^{−1})

which, although a completely transformed equation, is simpler in appearance
if we multiply the top and bottom by z, which obtains the desired result.
Note that T does not appear in the formula.
17.2.2 Linear function
Z{t} = Tz/(z − 1)²
Proof
From the definition

F(z) = Z{t} = Σ_{k=0}^{∞} f(kT) z^{−k}.

Now f(t) = t, and t = kT so f(kT) = kT and therefore

F(z) = Σ_{k=0}^{∞} kT z^{−k} = T( z^{−1} + 2z^{−2} + 3z^{−3} + · · · ).

Which is another standard summation

F(z) = T z^{−1}/(1 − z^{−1})²

which upon multiplying through by z² top and bottom produces the desired
result.
17.2.3 Exponential function
Z{e^{at}} = z/(z − e^{aT})
Proof
From the definition

F(z) = Z{e^{at}} = Σ_{k=0}^{∞} f(kT) z^{−k} = Σ_{k=0}^{∞} e^{akT} z^{−k}

= 1 + e^{aT} z^{−1} + e^{2aT} z^{−2} + e^{3aT} z^{−3} + · · ·

= 1 + (e^{aT}/z)^1 + (e^{aT}/z)^2 + (e^{aT}/z)^3 + · · ·

which is another geometric progression. Using the sum to infinity formula
we obtain that

F(z) = 1/(1 − e^{aT} z^{−1})

which, multiplying through top and bottom by z, gives the required result.
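A quick numerical sketch (not part of the notes) summing the series directly and comparing it with the closed form, using assumed sample values with |e^{aT}/z| < 1:

import math

a, T, z = 0.3, 0.1, 2.0
partial = sum(math.exp(a*k*T) * z**(-k) for k in range(200))
closed = z / (z - math.exp(a*T))
print(round(partial, 6), round(closed, 6))   # both approximately 2.0628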
17.2.4 Elementary properties
Just as for the Laplace transform, the Z-transform obeys the following basic
rules. If f(t) and g(t) are functions of t and c is a constant.
Z{f(t) ± g(t)} = Z{f(t)} ± Z{g(t)}

and

Z{c f(t)} = c Z{f(t)}
17.2.5 Real translation theorem
In Laplace transforms the most important property was that of the transform
of the derivative, which allowed differential equations to be solved easily.
If f(t) is such that F(z) = Z{f(t)} then

Z{f(t − nT)} = z^{−n} F(z)
Proof
From the definition

Z{f(t − nT)} = Σ_{k=0}^{∞} f(kT − nT) z^{−k} = Σ_{k=0}^{∞} f((k − n)T) z^{−k}

= f(−nT) + f((1 − n)T) z^{−1} + f((2 − n)T) z^{−2} + · · ·

For a signal that is zero before t = 0 the first n terms vanish, and what
remains is z^{−n} Σ_{m=0}^{∞} f(mT) z^{−m} = z^{−n} F(z), as required.
Chapter 18
Statistics
We are frequently required to describe the properties of large numbers of
objects in a simplified way. For example, political opinion polls seek to
distill the complex variety of political opinions in a region or country into
just a few figures. Similarly, when we talk about the mean time to failure
of a component, we use a single number to reflect the behaviour of a large
population of components.
We shall review some definitions.
18.1 Sigma Notation
Throughout statistics and mathematics in general, we often use sigma nota-
tion as a shorthand for a sum of objects. A capital sigma is used, and the
range of the summation is given above and below, unless this is obvious.
Σ_{i=a}^{b} f(i) = f(a) + f(a + 1) + · · · + f(b)
Usually an index value (i in this example) assumes a start value and
increases by one until it reaches a final value. The expression is evaluated in
each case and the results are summed.
18.1.1 Example
Here are some examples of sigma notation.
Σ_{i=1}^{n} i = 1 + 2 + 3 + · · · + n

Σ_{i=1}^{n} 1/i² = 1/1 + 1/4 + · · · + 1/n²

We use the symbol ∞ to represent the absence of an endpoint.

Σ_{i=1}^{∞} 1/2^i = 1/2 + 1/4 + · · · = 1
In statistics we are usually concerned with summations where the i is just
an index. For example
Σ_{i=1}^{n} x_i = x_1 + x_2 + · · · + x_n

is a common abbreviation for adding up n items of data labelled appropriately.
There are two specific summations which we examine carefully.
If k is a constant, and f(i) is a function of i we can show easily:

Σ_{i=1}^{n} k = nk.

This is because

Σ_{i=1}^{n} k = k + k + k + · · · + k   (n of these)   = nk.

Another important result is

Σ_{i=1}^{n} k f(i) = k Σ_{i=1}^{n} f(i).

This is because

Σ_{i=1}^{n} k f(i) = k f(1) + k f(2) + k f(3) + · · · + k f(n)

= k [ f(1) + f(2) + f(3) + · · · + f(n) ]

which gives us the result.
18.2 Populations and Samples
When we embark upon research, we usually have a specic population in
mind. The population is the entire collection of objects, people, etc. which
we wish to answer questions about.
Populations are often vast, and problems in health care can have a pop-
ulation consisting of all the people on Earth, now over six billion. It is clear
that in most cases it will be impossible to consider every element in the
population.
For this reason, we normally work with a sample, or several samples. A
sample is a collection drawn from the population in some sensible way. By
sensible, we mean that carelessly selecting a sample may give a distorted
collection from the population. For example, if we conduct our opinion poll
by choosing names from a phone book, we may exclude the other members
of that house who are not listed, and those without phones. As the latter
group may have different political opinions than the average, we can run into
trouble here.
18.2.1 Sampling
There is a lot to consider in choosing a sample from a population with care,
and we discuss here the most elementary sampling methods.
Random sampling
This term is applied to any method of sampling which ensures that each
member of the population has an equal chance of being selected in the sample.
We could do this by drawing from a hat, or using a table of random
numbers.
Stratified random sampling
In this method, we attempt to contrive the sample so that the proportions of
subsets in the parent population are preserved. For example, in a population
of people that may be (typically) 55% female, we would attempt to ensure
that 55% of the sample were female, but otherwise randomly selected.
Systematic sampling
In this method, we take some starting point in the population, and select
every kth entry.
Cluster sampling
In this method, we divide the population into several sections (clusters)
and randomly select a few of the sections, choosing all the members from
them.
Convenience sampling
In this method, we use the data that is readily available (cf. exit polls). This
method has many potential pit-falls.
18.3 Parameters and Statistics
Technically, a number describing some property of the population is called
a parameter and a number describing some property of the sample is called
a statistic.
18.4 Frequency
In many situations, a specific value will occur many times within a sample.
The frequency of a value is the number of times that value occurs within the
sample.
The relative frequency of a value is determined by dividing the frequency
for that value by the sum of all frequencies (which is the same as the number
of all values). Thus, relative frequency is always scaled between 0 and 1
inclusive and is closely related to the concept of probability.
18.5 Measures of Location
Measures of location attempt to generalise all the values with a single, central
value. These are often called averages, and the quantity that is normally
called the average in everyday speech is just one example of this class of
measures. There are three main averages.
18.5.1 Arithmetic Mean
The mean, or more precisely arithmetic mean, of a set of n data elements
x_1, x_2, . . . , x_n is given by

(1/n) Σ_{i=1}^{n} x_i

It is customary to use the symbol x̄ to denote the mean of a sample, and
the symbol μ to denote the mean of the population.
Clearly the mean provides a rough measure of the centre point of the
data, note that it may be somewhat unreliable for many uses if the data
is very skewed (see 18.9).
There are two other commonly used averages which may be used.
18.5.2 Mode
The mode of a sample is the value that occurs the most within the sample,
that is, the value with the highest frequency (see 18.4).
When two values are tied with the highest frequency the sample is called
bimodal and both values are modes.
If more than two values are tied, we call the sample multimodal.
18.5.3 Median
The median of a sample is the middle value when the values of the sample
are ordered in ascending or descending order. Note that when there are an
even number of values, we usually use the mean of the middle two elements.
18.5.4 Example
Consider a sample formed of the numbers 1, 2, 3, 4, 5, 6, 7, 8, 9.
State the mean, mode and median.
Solution
The sample mean will be given by
x̄ = (1 + 2 + 3 + 4 + 5 + 6 + 7 + 8 + 9)/9 = 5.
The mode is the element with the highest frequency, but all the numbers
1 to 9 have the same frequency (1). Consequently there is no unique
mode or one might even suggest that every number is the mode.
The median is the middle number when the sample is arranged in order
(and it already is), so here the median is 5.
This kind of a distribution is called the Uniform Distribution because (at
least within a specific range) every item has an equal chance of being picked.
18.5.5 Example
State the mean, mode and median of the sample formed of the numbers 5,
5, 5, 5, 5, 5, 5, 5, 5.
Solution
The sample mean will be given by
x̄ = (5 + 5 + 5 + 5 + 5 + 5 + 5 + 5 + 5)/9 = 5.
This time the mode is clear cut, there is only one element, and it has
a frequency of 9, so the mode is 5.
The median is 5 once again.
18.6 Measures of Dispersion
The previous examples demonstrate that averages on their own do not tell
us enough about the data, we also want to know how spread out or dispersed
the data is.
18.6.1 Range
The simplest form of this measure is the range, which is simply the smallest
value subtracted from the largest value.
This measure is often unreliable, as the largest and smallest values can
often be freakish and error-prone (called outliers).
Note the whole measure depends on two values only.
18.6.2 Standard deviation
The standard deviation of a set of data elements x_1, x_2, . . . , x_n is given by

σ = √( (1/n) Σ_{i=1}^{n} (x̄ − x_i)² )

We use the symbol σ to refer to the standard deviation of the population,
and the symbol s to refer to the standard deviation of the sample (which has
a very slightly altered formula).
This form of the standard deviation formula is quite intuitive, showing
the way the measure is formed, but it is computationally awkward. A simple
bit of sigma manipulation provides a computationally simpler expression.
It can be shown that

σ = √( (1/n) Σ_{i=1}^{n} (x̄ − x_i)² ) = √( (1/n) Σ_{i=1}^{n} x_i² − x̄² )

A correction is made when we are finding the standard deviation of a
sample as opposed to a population. Therefore

s = √( (1/(n − 1)) Σ_{i=1}^{n} (x̄ − x_i)² ) = √( (1/(n − 1)) ( Σ_{i=1}^{n} x_i² − n x̄² ) )

That is, compared with the population formula, the n found on the bottom
line is altered to n − 1. (Note that in the frequency versions of the formulae
below it can be easily seen that n = Σ f.)
The advantage of the standard deviation is that it uses all the values in
the sample to calculate the dispersion.
The standard deviation is strongly focused on the concept of the mean.
Examining its formulation reveals that we are summing the squared differences
from the mean. It follows that in situations where the mean is unsuitable,
the standard deviation is usually unsuitable also.
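A short numpy sketch (not part of the notes) showing the population and sample versions on the data of example 18.5.4; ddof=0 gives σ and ddof=1 gives s:

import numpy as np

data = np.array([1, 2, 3, 4, 5, 6, 7, 8, 9], dtype=float)
print(data.mean())        # 5.0
print(data.std(ddof=0))   # population sigma, about 2.582
print(data.std(ddof=1))   # sample s, about 2.739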
18.6.3 Inter-quartile range
Just as we did to calculate the median of a sample, we arrange the values in
the sample in ascending order.
The middle values is of course the median, the value found one quarter
of the way through the values is the lower quartile and the value found three
quarters through the values is the upper quartile.
The value found when we subtract the lower quartile from the upper is
called the inter-quartile range.
This measure is superior to the range, and often harder to evaluate than
the standard deviation. However, it is very useful with skewed samples (18.9).
18.7 Frequency Distributions
In practice, we often use frequencies in large samples to reduce the work.
Note that many assumptions may rely on data with little or no skew.
Suppose that in a sample, the only distinct values that occur are given by
x_1, x_2, . . . , x_n; further suppose that each value has a frequency given by
f_1, f_2, . . . , f_n. Then, for example, the mean and standard deviation are given
by

μ = Σ_{i=1}^{n} f_i x_i / Σ_{i=1}^{n} f_i

and

σ = √( Σ_{i=1}^{n} f_i (x̄ − x_i)² / Σ_{i=1}^{n} f_i )
18.7.1 Class intervals
We often consider, either for simplicity or some other reason, that data falls
into certain ranges of values, known as class intervals.
For example, suppose we table the ages of all people admitted to casualty
in a specified time. Rather than consider every age value, we might consider
ages tabulated as shown in table 18.1. There is no need to insist on equal
widths of interval, we can change them as required.
The fundamental trick to dealing with this situation is to imagine that all
the items in each interval are concentrated on the central value. For example,
we consider the 7 items in the rst interval all to be concentrated at 4.5 years,
and so on. The calculation can then go on unimpeded.
Age Frequency
0 - 9 7
10 - 19 4
20 - 29 6
30 - 39 8
40 - 49 5
50 - 59 13
60 - 69 6
70 - 79 1
80 + 1
Table 18.1: An example of class intervals
Note that if we have a value which falls between the intervals, we use
rounding to determine which interval it belongs to. Thus, a value of 9.2
years belongs to the 0-9 interval, while 9.8 years belongs to the 10-19 interval.
Therefore the catchment areas are slightly wider than the intervals and these
are called class boundaries.
18.8 Cumulative frequency
Frequently, it is useful to construct a cumulative frequency graph of our data.
This is useful for finding the median, and other key points in the data range.
The cumulative frequency is the sum of all the frequencies above or below
a specied point.
18.8.1 Calculating the median
Once we have plotted our cumulative frequency graph, we find the value that
is half of the largest cumulative frequency. If we draw a horizontal line from
that cumulative frequency to the graph and then vertically down, the figure
we land on is the median.
This is very useful for calculating (or estimating) the median in data
arranged by frequency tables.
18.8.2 Calculating quartiles
Finding the quartiles is very similar. To find the lower quartile we take one
quarter of the highest cumulative frequency, draw a horizontal line to
the graph and then vertically down to the axis to obtain the quartile. We start
with three quarters of the highest cumulative frequency to find the upper
quartile.
18.8.3 Calculating other ranges
We are not restricted to the 50%, or 25% and 75% marks in this procedure,
we can equally find the 10%, 5% or any figure we care to by working out this
percentage of the total cumulative frequency and tracing to the right and
down.
18.9 Skew
In addition to this, it is possible to measure the skew of a sample.
Skew is a measure of how much the data tends to one side or the other
of the mean. Calculation of skew is beyond the scope of this course.
18.10 Correlation
When we have two sets of interlinked numbers of equal size, we are often
interested in whether the numbers show a correlation. That is to say, whether
or not there is a link between the two sets.
Suppose that we have two sets of numbers
x_1, x_2, x_3, . . . , x_n;    y_1, y_2, y_3, . . . , y_n

then clearly we can plot the values

(x_1, y_1), (x_2, y_2), (x_3, y_3), . . . , (x_n, y_n)
on a graph. If the points are relatively scattered, this usually suggests no
link, but if the points lie along a line we can make some guesses about a link.
For example, if the xs represented the heights in metres of a certain
population, and the ys represented the weights in kilograms, we would expect
a certain correlation.
Although it is possible to deal with curves, we shall consider the situation
when the points lie approximately in a straight line.
Of course, we prefer to have quantitative measures.
18.10.1 Linear regression
If we assume that the graph of
y = mx + c
approximately passes through the points, we can show that the best es-
timates for m and c are given by the formula
m = [ n(Σ xy) − (Σ x)(Σ y) ] / [ n(Σ x²) − (Σ x)² ]

and

c = [ (Σ y)(Σ x²) − (Σ x)(Σ xy) ] / [ n(Σ x²) − (Σ x)² ]
after which we can use the formula of the line to estimate the value of y
given a value for x. It is possible to reverse the variables when we wish to
estimate x given y.
Essentially this can be proven by measuring how far each point is from
the proposed line, totalling these distances and then trying to make them as
small as possible (calculus).
All summations here are ranging from i = 1 to n as usual.
18.10.2 Correlation coecient
The Pearson product moment correlation coefficient or simply linear correlation
coefficient r is given by

r = [ n Σ xy − (Σ x)(Σ y) ] / [ √( n(Σ x²) − (Σ x)² ) √( n(Σ y²) − (Σ y)² ) ]

and is a measure of how well the scattered points fit the straight line above.
Interpretation
The value of r is bounded by −1 and 1. That is

−1 ≤ r ≤ 1.

A value of 1 or −1 signifies a perfect line up, with 1 representing a positive
gradient (larger x produces larger y), and with −1 representing a negative
gradient (larger x produces smaller y).
So values close to 1 or −1 suggest a correlation, while values close to 0 in
the middle suggest no correlation.
Scaling
The value of r is unaffected by scaling - that is the units used do not change
it, nor does swapping the x and y values.
Warnings
Remember, this is designed for linear relationships. If you suspect a curved
relationship you should transform to a straight line first.
Note also that correlation does not indicate causality.
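The regression and correlation formulae above can be checked numerically against numpy's built-in routines; the data below is made up purely for illustration.

import numpy as np

x = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
y = np.array([2.1, 3.9, 6.2, 8.1, 9.8])
n = len(x)

m = (n*np.sum(x*y) - np.sum(x)*np.sum(y)) / (n*np.sum(x**2) - np.sum(x)**2)
c = (np.sum(y)*np.sum(x**2) - np.sum(x)*np.sum(x*y)) / (n*np.sum(x**2) - np.sum(x)**2)
r = (n*np.sum(x*y) - np.sum(x)*np.sum(y)) / np.sqrt(
        (n*np.sum(x**2) - np.sum(x)**2) * (n*np.sum(y**2) - np.sum(y)**2))

print(m, c, r)
print(np.polyfit(x, y, 1))        # slope and intercept, should match m and c
print(np.corrcoef(x, y)[0, 1])    # should match r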
Chapter 19
Probability
Probability is concerned with the likelihood of certain events. In many
ways then, probability can be thought of as an attempt to predict the likely
outcomes of experiments, while statistics provides the means of analysing
data after an event has occurred.
19.1 Events
An event is a specific, well defined occurrence. That is, it must be totally
unambiguous whether or not the event has occurred. It is commonplace to use
capital letters to denote events.
19.1.1 Probability of an Event
The probability of an event is a real number between 0 and 1 inclusive.
For our purposes a probability of 0 denotes that the event cannot occur,
and a probability of 1 denotes that the event must occur.
We shall use the shorthand notation P(A) to denote the probability of
an event A.
19.1.2 Exhaustive lists
We say that a set of events A_1, A_2, . . . , A_n form an exhaustive list of events
if and only if they cover all possibilities. That is, at least one of the events
must occur.
19.2 Multiple Events
Commonly, we have several events, and we are interested in the probability
of some combination of events. For example, we may wish to know the
probability that A and B both occur, or the probability that A occurs or B
does not.
19.2.1 Notation
Certain notation is used throughout probability theory to denote these logical
combinations, and we now review this.
The following notation is used.
The probability that both A and B occur is denoted by P(A ∩ B) or
P(AB);
The probability that either A or B occurs is denoted by P(A ∪ B) or
P(A + B);
The probability that A does not occur is denoted by P(Ā) or P(A′);
The probability that A will occur given that B has already occurred is
denoted P(A|B).
It is vital that you note that P(A + B) is a notation for the probability
of (A or B), and not (A and B) as you might have guessed.
19.2.2 Relations between events
Before we introduce methods for obtaining the probabilities of these combi-
nations, we need to look at two important concepts.
Mutually exclusive events
Two events A and B are said to be mutually exclusive if and only if there is
no way that both A and B can occur.
Independent events
Two events A and B are said to be independent if and only if the occurrence
of A does not affect the probability of B, and the occurrence of B does not
affect the probability of A.
More than two events
We can easily extend these ideas to more than two events. For example, we
say that events A, B and C are mutually exclusive if and only if
A and B are mutually exclusive;
B and C are mutually exclusive;
A and C are mutually exclusive.
We perform a similar extension to the notion of independence, and if we
are dealing with more than three events.
19.3 Probability Laws
Now we are ready to examine simple probability laws.
19.3.1 A or B (mutually exclusive events)
Given two mutually exclusive events A and B, the probability of A or B
occurring is given by
P(A ∪ B) = P(A) + P(B).
A helpful consequence of this result follows
19.3.2 not A
Given any event A

P(Ā) = 1 − P(A).

This is a simple application of the previous law. Clearly A and Ā cannot
occur at once and so they are mutually exclusive. Furthermore, either A
happens or it does not (in which case Ā happens) and so P(A ∪ Ā) = P(A) +
P(Ā) = 1. Rearranging this equation yields our result.
19.3.3 1 event of N
If a set of N events, A_1, A_2, . . . , A_N, are mutually exclusive, an exhaustive list
and equally probable then the probability of each event is 1/N.
19.3.4 n events of N
If an experiment has N equally likely exhaustive outcomes, and the event A
corresponds to n of those outcomes then P(A) = n/N.
We use this result all the time, without even giving it any thought.
19.3.5 Examples
Given a fair coin, the probability of getting a head on a single toss of the
coin is 1/2.
Given a fair die, the probability of getting a 6 on a single roll of the die
is 1/6.
We now know how to find the probability of A or B in certain circumstances,
so we turn to the problem of A and B.
19.3.6 A and B (independent events)
Given two independent events A and B, the probability of both A and B
occurring is given by
P(A ∩ B) = P(A) × P(B).
Warning
The result above does not work when the events are dependent; be especially
careful here, it is usually simple to see if events are mutually exclusive, but
more difficult to determine their independence.
19.3.7 Example
A certain device consists of two components A and B, which are wired in
series. If either component is not functioning the circuit will break. Each
component is unaffected by the state of the other.
The probability of A being functional is 0.9, and the probability of B being
functional is 0.8. What is the probability of the device being functional?
Solution
We shall label A as the event that component A is functioning, and similarly
label B as the event that component B is functioning.
We are told that A and B are independent events, the components work
independently of each other. Therefore we can use our simple law described
in 19.3.6.
P(A ∩ B) = P(A) × P(B) = 0.9 × 0.8 = 0.72
It is worth noting that the probability that the device is working is lower
than the probability of each component. We shouldn't be surprised that placing several vulnerable components in series has this effect.
Of course many experiments concern many more than three events, but
this does not cause us any concern as we can easily extend these two proba-
bility laws.
19.3.8 A or B or C or ...
Given n mutually exclusive events A_1, A_2, . . . , A_n, the probability that one of them occurs is given by

P(A_1 ∪ A_2 ∪ · · · ∪ A_n) = P(A_1) + P(A_2) + · · · + P(A_n).
Clearly this is an extension of 19.3.1.
19.3.9 A and B and C and ...
Given n independent events A_1, A_2, . . . , A_n, the probability that all of them occur is given by

P(A_1 ∩ A_2 ∩ · · · ∩ A_n) = P(A_1) × P(A_2) × · · · × P(A_n).
Clearly this is an extension of 19.3.6.
19.3.10 Example
A fair die is rolled once; show that the probability of rolling a number less than 6 is 5/6.
Solution
There are at least three ways of tackling this problem.
Let A_1 be the event of rolling a 1, and so on up to A_6 being the event of rolling a 6.
1. We could simply observe that there are only six equally likely outcomes, and our event pertains to five of them. That, with a simple application of 19.3.4, gives us a probability of 5/6.
2. Clearly A_1, A_2, . . . , A_5 are mutually exclusive (it is impossible to roll two numbers at the same time on one roll). Therefore (by 19.3.1) we can simply add the probabilities, and as each probability is 1/6 we arrive at an answer of 5/6.
3. We could observe that the question is equivalent to the probability of not rolling a 6. We know the probability of rolling a 6 is simply 1/6, so we subtract this from 1 to find 5/6 (see 19.3.2).
These results are useful, but we will frequently encounter situations where
two events are not mutually exclusive, or not independent.
19.3.11 A or B revisited
Given two events A and B, the probability of A or B occurring is given by
P(A ∪ B) = P(A) + P(B) − P(A ∩ B).
Compare this result to 19.3.1.
Let us use this result to examine the reliability of two objects in parallel.
19.3.12 Example
A certain device consists of two components A and B, which are wired in par-
allel. Each component has enough capacity individually to work the device.
Each component is unaffected by the state of the other.
The probability of A being functional is 0.9, and the probability of B being
functional is 0.8. What is the probability of the device being functional?
Solution
We shall label A as the event that component A is functioning, and similarly
label B as the event that component B is functioning.
Unlike the example above, this device will work if A or B is functional.
We note that A and B are certainly not mutually exclusive; it is quite possible
for both components to be working at the same time. Therefore we have to
use the general form of the probability law (19.3.11), and not the basic form
(19.3.1).
P(A ∪ B) = P(A) + P(B) − P(A ∩ B) = 0.9 + 0.8 − 0.72 = 0.98
Note that A and B were independent, so we were able to simply use the P(A ∩ B) = P(A)P(B) relation (see 19.3.6) for that part of the calculation.
Note also that the reliability of the parallel system is much greater than
that of the individual components. Compare this to the previous example of
the series system.
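The two reliability calculations above lend themselves to a short numerical check. The sketch below is a minimal Python example of my own (not part of the text); it uses the component probabilities 0.9 and 0.8 from the two examples and applies the independence laws to the series and parallel arrangements.

def series_reliability(p_a, p_b):
    # Series device works only if A and B both work: P(A and B) = P(A) * P(B).
    return p_a * p_b

def parallel_reliability(p_a, p_b):
    # Parallel device works if A or B works:
    # P(A or B) = P(A) + P(B) - P(A and B), with P(A and B) = P(A) * P(B).
    return p_a + p_b - p_a * p_b

p_a, p_b = 0.9, 0.8
print(f"series:   {series_reliability(p_a, p_b):.2f}")    # 0.72, as in 19.3.7
print(f"parallel: {parallel_reliability(p_a, p_b):.2f}")   # 0.98, as in 19.3.12

The output mirrors the conclusion drawn above: the series arrangement is less reliable than either component, while the parallel arrangement is more reliable than both.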
So, we no longer require events to be mutually exclusive to find the probability of one of them occurring. So what about the notion of independence?
19.3.13 A and B revisited
Given any two events A and B, the probability that A and B both occur is
given by
P(A ∩ B) = P(A)P(B|A) = P(B)P(A|B).
Compare this result to 19.3.6.
This relationship also provides a neat way of working out conditional
probability.
19.3.14 Conditional probability
Given any two events A and B, the probability that A occurs given B already
has is given by
P(A|B) = P(A ∩ B) / P(B)

provided that P(B) ≠ 0.
19.3.15 Example
The probability that two components A and B in a device are both working
is 0.63. The probability that B is working is 0.7. Calculate the probability
that A is working given that B is.
Solution
Clearly P(A ∩ B) = 0.63; from the above, P(A|B) = 0.63/0.7 = 0.9.
Question. Do we have enough information in this example to calculate
P(A)?
We can also rearrange our probability law 19.3.13 to produce the following
important result.
19.3.16 Bayes' Theorem

P(B|A) = P(B)P(A|B) / P(A).

Once again, this assumes that P(A) ≠ 0.
More generally we can write

P(A|F) = P(A)P(F|A) / Σ_i P(F|B_i)P(B_i).
The letter F has been picked here to represent failure.
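As a quick numerical illustration of the conditional probability and Bayes' theorem results, the following minimal Python sketch reuses the figures from Example 19.3.15, together with an assumed value of P(A) = 0.9 which is purely illustrative and is not given in the text.

# Conditional probability and Bayes' theorem with the figures from Example 19.3.15.
p_b = 0.7          # P(B), from the example
p_a_and_b = 0.63   # P(A and B), from the example

p_a_given_b = p_a_and_b / p_b                 # P(A|B) = P(A and B) / P(B)
print(f"P(A|B) = {p_a_given_b:.2f}")          # 0.90

p_a = 0.9                                     # hypothetical P(A), assumed for illustration only
p_b_given_a = p_b * p_a_given_b / p_a         # Bayes: P(B|A) = P(B) P(A|B) / P(A)
print(f"P(B|A) = {p_b_given_a:.2f}")          # 0.70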
19.4 Discrete Random Variables
A discrete random variable is any variable quantity that can only take exact
or separate numerical values. For example, while the lengths of people's feet may assume any value (and are therefore continuous), the shoe sizes they wear can take only discrete values.
19.4.1 Notation
It is usual to use capital letters to denote a variable, and lower case letters
to denote possible outcomes. These outcomes should all be arranged to be
mutually exclusive.
Therefore suppose that X can take one of the following outcomes

x_1, x_2, x_3, . . . , x_n,

then

Σ_{i=1}^{n} P(X = x_i) = 1

and indeed, this is one definition of a discrete random variable.
19.4.2 Expected Value
The expected value of a discrete random variable is analogous to the mean
value that the variable will have. We denote the expected value of the
variable X by E(X).
Let us examine the probability distribution for a simple random variable.
If we roll two fair dice and add the scores, we obtain the following distribution.
x_i   f(x_i)
2     1/36
3     2/36
4     3/36
5     4/36
6     5/36
7     6/36
8     5/36
9     4/36
10    3/36
11    2/36
12    1/36

Table 19.1: Probabilities for total of two rolled dice
It is clear in this case that the probability distribution is symmetrical
and that we would expect 7 to be the most common score produced by this
experiment.
How could this be quantitatively calculated, however? Well, suppose that we calculate the total of the totals if we repeat the experiment 36 times. So we would expect to have 1 occurrence of 2, 2 occurrences of 3 and so on. Therefore in total we would have

T = 2 × 1 + 3 × 2 + 4 × 3 + · · · + 11 × 2 + 12 × 1 = 252

and therefore, if 252 is the combined total of 36 experiments, we can divide by 36 to work out the expected score from one such experiment. Thus we obtain E(X) = 252/36 = 7.
Another way this could be written would be to perform the division by 36 earlier.

E(X) = T/36 = 2 × 1/36 + 3 × 2/36 + 4 × 3/36 + · · · + 11 × 2/36 + 12 × 1/36 = 252/36 = 7
The division has been moved to the second part of the multiplication to
help highlight that each term is merely the product of the outcome with the
probability of that outcome. We can generalise this as follows.
Definition
If a discrete random variable X has the possible outcomes x_1, x_2, . . . , x_n, with a probability density function f(x) such that f(x_i) = P(X = x_i), then

E(X) = Σ_{i=1}^{n} x_i f(x_i).
19.4.3 Variance
Given that we have some measure of the mean outcome for a discrete random
variable we turn our attention to how dispersed the outcomes may be around
this mean. We do this by examining the variance, which is the name for the square of the standard deviation (see 18.6.2).
Now the standard deviation is essentially the root of the mean of the squared deviations from the mean, so the variance will be the same calculation without this final root. We shall denote the variance of the variable X with the notation var(X).
Definition
If a discrete random variable X has the possible outcomes x_1, x_2, . . . , x_n, with a probability density function f(x) such that f(x_i) = P(X = x_i), and we adopt the notation that

μ = E(X)

then

var(X) = E[(X − μ)²].
Now although this definition is intuitive, showing as it does that we are averaging the squares of the deviations of the variable from the mean, it is tedious to calculate. One can see that the mean has to be subtracted from all the figures, then these must be squared and totalled. We can use some algebra to find an alternative expression.

var(X) = E[(X − μ)²] = E[X² − 2μX + μ²] = E(X²) − 2μE(X) + μ² = E(X²) − 2μ² + μ² = E(X²) − μ²

and so in summary

var(X) = E(X²) − μ²
which is easier to calculate even if it is less obviously true.
19.4.4 Example
We revisit a previous problem. Let X be the score obtained by totalling the
rolls of two fair dice. Find E(X) and var (X).
Solution
We solve the whole problem as we would from the start, without the assump-
tions we made above.
By definition E(X) is the sum of the products of the outcomes and their probabilities. This is the sum of the third column of Table 19.2, which is 7. We now let μ = 7. Now
var(X) = E(X²) − μ²

and E(X²) is simply found by squaring the x_i, multiplying them by their respective probabilities and summing these. This is the sum of the fourth column, which is 54 5/6. Thus

var(X) = 54 5/6 − 49 = 5 5/6 = 35/6
x_i    f(x_i)   x_i f(x_i)   x_i² f(x_i)
2      1/36     2/36         4/36
3      2/36     6/36         18/36
4      3/36     12/36        48/36
5      4/36     20/36        100/36
6      5/36     30/36        180/36
7      6/36     42/36        294/36
8      5/36     40/36        320/36
9      4/36     36/36        324/36
10     3/36     30/36        300/36
11     2/36     22/36        242/36
12     1/36     12/36        144/36
total  1        7            54 5/6

Table 19.2: Calculating E(X) and var(X) for two rolled dice.
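A short computational check of Table 19.2 may be helpful. The sketch below is a minimal Python example of my own which rebuilds the distribution of the total of two fair dice and applies the definitions E(X) = Σ x_i f(x_i) and var(X) = E(X²) − μ², reproducing the values 7 and 35/6 exactly by using rational arithmetic.

from fractions import Fraction
from itertools import product

# Distribution of the total of two fair dice: count each total out of 36 outcomes.
counts = {}
for a, b in product(range(1, 7), repeat=2):
    counts[a + b] = counts.get(a + b, 0) + 1

f = {x: Fraction(c, 36) for x, c in counts.items()}   # f(x_i) = P(X = x_i)

mean = sum(x * p for x, p in f.items())               # E(X) = sum of x_i f(x_i)
mean_sq = sum(x * x * p for x, p in f.items())        # E(X^2) = sum of x_i^2 f(x_i)
variance = mean_sq - mean**2                          # var(X) = E(X^2) - mu^2

print(mean)      # 7
print(mean_sq)   # 329/6, i.e. 54 5/6
print(variance)  # 35/6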
19.5 Continuous Random Variables
A continuous random variable is not limited to discrete values, but instead can take any value in a specified range. As an analogy, consider the real number line. The real numbers make up the entire line, and every point is a real number. No matter which two real numbers one may pick, there are infinitely more in between. By contrast there are huge gaps between the integers as they appear on the number line, and in fact it is easy to pick two such numbers, such as 2 and 3, so that there is no integer between them. So the integers are a discrete collection of numbers while the real numbers are continuous.¹

¹ In fact this nature of the real numbers is noted in the fact that they are often described as the continuum. Furthermore, when people discuss the sizes of the integers versus that of the real numbers, the integers are of the size of the smallest infinity, ℵ_0, while the real numbers are of the size of the continuum, written c.
19.5.1 Definition
If X is a continuous variable such that there are real numbers a and b such that

P(a ≤ X < b) = 1;    P(X < a) = 0;    P(X ≥ b) = 0

then X is a continuous random variable.
19.5.2 Probability Density Function
For a continuous random variable X the probability density function or p.d.f. will in general take the form of a function that varies with X over all the values in the range.
Instead of asking for the probability that X has a certain value, we usually need to calculate the probability that X lies in a certain range.² We do this by calculating the area under the curve in that range. Thus

P(a ≤ X < b) = ∫_a^b f(x) dx

where f(x) is the p.d.f. for the continuous random variable X.

² For example, if we want to find P(X = 2) for a continuous variable, we are normally looking for all values of X that round to the number 2 and so we find P(1.5 ≤ X < 2.5) instead.
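To make the area interpretation concrete, here is a minimal numerical sketch of my own (not from the text) using an illustrative p.d.f., f(x) = 2x on 0 ≤ x < 1, which has total area 1; the probability of X lying in any sub-range is the corresponding area under the curve.

# Numerical check of the area interpretation of a p.d.f., using the trapezium rule.
# Illustrative p.d.f.: f(x) = 2x on 0 <= x < 1 (zero elsewhere), which has total area 1.

def trapezium(f, a, b, n=10000):
    # Approximate the integral of f from a to b using n strips.
    h = (b - a) / n
    total = 0.5 * (f(a) + f(b))
    for i in range(1, n):
        total += f(a + i * h)
    return total * h

f = lambda x: 2 * x

print(trapezium(f, 0.0, 1.0))   # about 1.0  : total probability
print(trapezium(f, 0.5, 1.0))   # about 0.75 : P(0.5 <= X < 1) = 1 - 0.25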
Chapter 20
The Normal Distribution
20.1 Definition
In many situations, data follows a very characteristic distribution known as
the normal distribution, or Gaussian distribution.
Figure 20.1: The Normal Distribution
This distribution is totally symmetrical (has no skew), and most data is
to be found in the middle values, while large or small values are much rarer.
The distribution is centered about the mean and about 99.7% of the data
lies within three standard deviations on either side of the mean.
Figure 20.2: Close up of the Normal Distribution
20.2 Standard normal distribution
The normal distribution with a mean of 0 and a standard deviation of 1 is
called the standard normal distribution.
It is important because questions about all normal distributions can be
reduced to a problem on the standard distribution, for which a set of tables
can be produced (see table A.1 below).
Areas under the graph of the distribution represent the proportion of the
population residing there.
20.2.1 Transforming variables
Essentially to use the standard distribution, we must be able to transform
more difficult problems.
Let us look at a specific problem. Suppose that the heights of 1000 people are normally distributed with mean 170 cm and standard deviation 5 cm. How many people will have heights below 180 cm?
We deal with this problem by calculating how far we are away from the
mean, measured in standard deviations. In this example the distance between
180 and the mean 170 is 10 cm, but we want to nd the number of standard
deviations away, so dividing by the deviation, 5, we obtain 2.
In general, if the boundary value we are interested in is x, the appropriate standard normal boundary value z is given by

z = (x − μ)/σ
To complete our problem, we examine our tables which show the area
under the graph (proportion) up to and including our value (which is 2).
From the tables we get a value of 0.977, and this fraction of our population
represents 977 people.
20.2.2 Calculation of areas
Note that the tables provided, like most tables, show areas calculated only in a certain way. To work out other areas we must use ingenuity combined with the following facts:
1. The total area under the graph is 1;
2. The graph is perfectly symmetrical about 0;
3. Therefore the area under each side is 1/2.
20.2.3 Example
Let us return to our example, and make it slightly more complex.
Suppose that the heights of 1000 people are normally distributed with
mean 170 cm, and standard deviation 5 cm. How many people will have
heights between 167 cm and 180 cm?
A sketch graph of each situation is highly advisable.
Next we calculate z for each of our values.
z_1 = (180 − 170)/5 = 2;    z_2 = (167 − 170)/5 = −0.6
We already know from the tables that the area enclosed under the graph
up to z = 2 is 0.977. We are only interested in the region above 167 as well
however.
We simplify the problem by calculating the area from z = 0 to z = 2, this
is the same area, without the region below z = 0 which we know has area
0.5. Therefore the contribution from z = 0 to z = 2 is 0.477.
Now we work out the area from z = −0.6 to z = 0. Our table does not consider negative values of z and so we must do something different. By symmetry, the area from z = 0 to z = +0.6 will be the same. The area up to z = 0.6 is 0.726 from our tables, so once more we subtract 0.5 to find our desired area, with a value of 0.226.
Adding these regions together gives an area of 0.477 + 0.226 = 0.703, so
multiplying this area (proportion) by our population, we come up with an
answer of 703 people.
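Both height calculations can be checked numerically. The cumulative probability Φ(z) of the standard normal distribution can be written in terms of the error function as Φ(z) = ½(1 + erf(z/√2)); the minimal Python sketch below (my own illustration) reproduces the answers of roughly 977 and 703 people.

from math import erf, sqrt

def phi(z):
    # Cumulative standard normal distribution via the error function.
    return 0.5 * (1.0 + erf(z / sqrt(2.0)))

mu, sigma, population = 170.0, 5.0, 1000

# Heights below 180 cm: z = (180 - 170)/5 = 2
z = (180.0 - mu) / sigma
print(round(phi(z) * population))                 # about 977 people

# Heights between 167 cm and 180 cm: area between z = -0.6 and z = 2
z1 = (180.0 - mu) / sigma
z2 = (167.0 - mu) / sigma
print(round((phi(z1) - phi(z2)) * population))    # about 703 people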
20.2.4 Confidence limits
Consider the central 95% of our population, that is a proportion of 0.950.
As that section is spread symmetrically, half of this lies on either side of the
mean, so 0.475 lies above the mean for example.
If we wish to find the z boundary point for this, we need to get an area like those in our tables, that is, all the area up to that point. The area below the mean is 0.5, and the area from the mean up to the boundary is 0.475, creating a total of 0.975.
This corresponds to a z value of 1.96.
The values −1.96 and +1.96 are called the 95% confidence limits
for our data. Picking a member of the population at random we can be 95%
sure they will fall into this region.
20.2.5 Sampling distribution
Suppose that we have a large population, out of which we select many samples
of size n. If we calculate the sample mean for each sample, then we could
consider the sample formed by these averages.
This is a sample taken from the population of all possible averages of
samples of size n from the original population.
This is confusing; you must realise that there are two populations: the original one (P_1, say), and the population (P_2, say) of samples of size n taken from P_1.
Example
Take the following, extremely small example.
Suppose our population P_1 consists of the numbers

1, 2, 3, 4, 5

and that we are taking samples of size 3; then the population P_2 of all possible samples looks like this.
1 2 3
1 2 4
1 3 4
2 3 4
1 2 5
1 3 5
2 3 5
1 4 5
2 4 5
3 4 5
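This enumeration is easy to reproduce programmatically. The sketch below is a minimal Python example of my own; it lists every sample of size 3 drawn without replacement from the population {1, 2, 3, 4, 5}, together with the mean of each sample, which is the quantity the sampling distribution is built from.

from itertools import combinations

population = [1, 2, 3, 4, 5]

# All possible samples of size 3 (order ignored), and their sample means.
for sample in combinations(population, 3):
    mean = sum(sample) / len(sample)
    print(sample, mean)

# There are 5C3 = 10 such samples, matching the listing above.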
20.3 The central limit theorem
This is an enormously important result in statistics. Many populations we
deal with are not normally distributed, but this theorem gives us hope here.
The central limit theorem suggests that as the sample size n increases, the sampling distribution approaches a normal distribution.¹
In other words, our population may not be very normal (see our exam-
ple above), but the means of the samples drawn from it are roughly normal,
and become more so as n increases.
More specifically, if we have a distribution (which may or may not be normal) with mean μ and standard deviation σ, and we randomly select samples of size n from this population, then:
The distribution of sample means x̄ approaches a normal distribution as n becomes larger;
The mean of the sample means will be the population mean μ;
The standard deviation of the sample means will be σ/√n.
A value of n = 30 is quite safe for most original parent populations, but any sample size is safe for a normally distributed parent.
We write

μ_x̄ = μ

and

σ_x̄ = σ/√n.

¹ There are exceptions; the Cauchy distribution does not obey the Central Limit Theorem, for example.
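The behaviour described by the central limit theorem is easy to see by simulation. The sketch below is a minimal Python example of my own, using only the standard library; it draws many samples from a decidedly non-normal parent, the uniform distribution on [0, 1), and shows that the mean and standard deviation of the sample means come out close to μ and σ/√n.

import random
from math import sqrt

random.seed(1)

n = 30            # sample size
trials = 20000    # number of samples drawn

# Parent population: uniform on [0, 1), so mu = 0.5 and sigma = 1/sqrt(12).
mu, sigma = 0.5, 1 / sqrt(12)

sample_means = [sum(random.random() for _ in range(n)) / n for _ in range(trials)]

mean_of_means = sum(sample_means) / trials
sd_of_means = sqrt(sum((m - mean_of_means) ** 2 for m in sample_means) / trials)

print(mean_of_means)   # close to mu = 0.5
print(sd_of_means)     # close to sigma / sqrt(n), about 0.0527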
20.4 Finding the Population mean
Imagine a population is thought to have a mean μ. Then, if a sample of size n is taken at random, we can be 95% confident that the sample mean x̄ will lie in the following range

μ − 1.96 σ/√n ≤ x̄ ≤ μ + 1.96 σ/√n

where the figure of 1.96 is the corresponding z score for a two-tailed 95% confidence interval, and we are taking samples from the population of sample means, and so use its standard deviation σ/√n as above.
If n = 1 then clearly this just becomes the normal 95% confidence limit, but as n becomes larger this interval becomes smaller and smaller, giving us a more detailed and precise picture of the possible locations of x̄.
Of course, normally the whole point is that we don't have the population mean, but instead have simply the sample mean and wish to deduce the population mean from it. The same logic can be employed. We can interpret the interval above like so:

|μ − x̄| ≤ 1.96 σ/√n

so that the distance between μ and x̄ has an upper limit (within our confidence level). Or to put it yet another way,

x̄ − 1.96 σ/√n ≤ μ ≤ x̄ + 1.96 σ/√n

so that, given x̄ and an estimate for σ, we can produce a range of possible values for μ which will narrow as n increases, given the relevant confidence level.
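Here is a short sketch of that calculation, a minimal Python example with illustrative numbers of my own (the sample mean, σ and n below are not taken from the text), producing a 95% confidence interval for the population mean μ.

from math import sqrt

# Illustrative figures: sample mean, assumed population standard deviation, sample size.
x_bar = 172.3
sigma = 5.0
n = 50

z = 1.96   # two-tailed 95% confidence
half_width = z * sigma / sqrt(n)

lower, upper = x_bar - half_width, x_bar + half_width
print(f"95% confidence interval for mu: {lower:.2f} to {upper:.2f}")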
20.5 Hypothesis Testing
Another use for this result is to determine whether a mean has moved significantly. This is generally only useful if we really were sure we knew it before.
Suppose that the mean of a population was reliably found to be μ. Some time later a sample of size n was taken from the new population, and the mean found to be x̄; assume the standard deviation σ has not changed.
20.5.1 Two tailed tests
Suppose that we have these hypotheses:
H_1 : the means are significantly different;
H_0 : the means are in fact the same (the null hypothesis).
Once again suppose that we wish to use a confidence level that corresponds to a score of Z for a two-tailed test. This is indeed a two-tailed test since hypothesis H_1 is simply whether the two means are different, in either direction.
We first of all obtain a test statistic by working out how far away our new mean is from the old,

|μ − x̄|

but we are really interested in how far this is in standard deviations. So we divide by the standard deviation for this situation, which is σ/√n since we are looking at a sample of size n taken from the population. So our test statistic is

z = |μ − x̄| / (σ/√n)

where once again, due to the two-tailed nature of the problem, we are only interested in the magnitude of the difference, not the sign. If this test statistic is larger than the value Z obtained above then either
1. the mean has not changed but this sample lay outside the confidence interval even so; this is called a Type I error, and the probability of this occurring is usually denoted α and called the significance level. Clearly α = 0.05 for 95% confidence.
2. the hypothesis H_1 is actually true.
There is no way to determine for sure which is the case, which is why a high level of confidence is useful in order to make the error unlikely. We are therefore forced to accept H_1 and reject H_0.
Conversely, if the value of z < Z then either
1. the mean has changed significantly, but our random sample simply did not reflect this; this is called a Type II error, and the probability of this occurring is usually denoted β;
2. the hypothesis H_0 is actually true.
Again we reject H_1 and accept the null hypothesis that the change is not statistically significant.
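A sketch of this test in code, a minimal Python example with invented figures of my own (μ, σ, n and x̄ below are illustrative, not from the text), compares the test statistic against the two-tailed 95% critical value Z = 1.96.

from math import sqrt

# Illustrative figures for a two-tailed z-test at 95% confidence.
mu = 170.0      # previously established population mean
sigma = 5.0     # population standard deviation, assumed unchanged
n = 40          # size of the new sample
x_bar = 171.8   # mean of the new sample

Z = 1.96        # two-tailed critical value for 95% confidence

z = abs(mu - x_bar) / (sigma / sqrt(n))
print(f"test statistic z = {z:.2f}")

if z > Z:
    print("Reject H0: the mean appears to have changed significantly.")
else:
    print("Accept H0: no statistically significant change detected.")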
20.6 Difference of two normal distributions
Another useful sampling distribution is that of the difference of two normal distributions.
Suppose that we have two normal distributions, with means μ_1 and μ_2 and standard deviations σ_1 and σ_2. Then if we take a sample of size n_1 from the first distribution and n_2 from the second, the difference in the means of the samples will have a mean μ_d of

μ_d = μ_1 − μ_2

and standard deviation of

σ_d = √(σ_1²/n_1 + σ_2²/n_2).
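As a small worked sketch, the following minimal Python example (with made-up means, standard deviations and sample sizes of my own) computes μ_d and σ_d for the difference of two sample means using the formulas above.

from math import sqrt

# Illustrative parameters for two normal populations and the samples taken from them.
mu1, sigma1, n1 = 170.0, 5.0, 40
mu2, sigma2, n2 = 165.0, 6.0, 50

mu_d = mu1 - mu2
sigma_d = sqrt(sigma1**2 / n1 + sigma2**2 / n2)

print(f"mean of the difference of sample means: {mu_d:.2f}")
print(f"standard deviation of the difference:   {sigma_d:.3f}")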
Appendix A
Statistical Tables
For the standard normal distribution, that is the normal distribution with mean 0 and standard deviation 1, the probability density function is given by:

φ(x) = (1/√(2π)) e^(−x²/2) = (1/√(2π)) exp(−x²/2).

Table A.1 shows the cumulative normal distribution calculated as

Φ(x) = ∫_{−∞}^{x} φ(z) dz.
The χ² distribution, for a given confidence P and ν degrees of freedom, is defined by

P/100 = I(χ², ∞) / I(0, ∞)

where

I(a, b) = ∫_a^b x^(ν/2 − 1) e^(−x/2) dx,

and is shown tabulated in tables A.2 and A.3.
x 0 1 2 3 4 5 6 7 8 9
0.0 0.500 0.504 0.508 0.512 0.516 0.520 0.524 0.528 0.532 0.536
0.1 0.540 0.544 0.548 0.552 0.556 0.560 0.564 0.567 0.571 0.575
0.2 0.579 0.583 0.587 0.591 0.595 0.599 0.603 0.606 0.610 0.614
0.3 0.618 0.622 0.626 0.629 0.633 0.637 0.641 0.644 0.648 0.652
0.4 0.655 0.659 0.663 0.666 0.670 0.674 0.677 0.681 0.684 0.688
0.5 0.691 0.695 0.698 0.702 0.705 0.709 0.712 0.716 0.719 0.722
0.6 0.726 0.729 0.732 0.736 0.739 0.742 0.745 0.749 0.752 0.755
0.7 0.758 0.761 0.764 0.767 0.770 0.773 0.776 0.779 0.782 0.785
0.8 0.788 0.791 0.794 0.797 0.800 0.802 0.805 0.808 0.811 0.813
0.9 0.816 0.819 0.821 0.824 0.826 0.829 0.831 0.834 0.836 0.839
1.0 0.841 0.844 0.846 0.848 0.851 0.853 0.855 0.858 0.860 0.862
1.1 0.864 0.867 0.869 0.871 0.873 0.875 0.877 0.879 0.881 0.883
1.2 0.885 0.887 0.889 0.891 0.893 0.894 0.896 0.898 0.900 0.901
1.3 0.903 0.905 0.907 0.908 0.910 0.911 0.913 0.915 0.916 0.918
1.4 0.919 0.921 0.922 0.924 0.925 0.926 0.928 0.929 0.931 0.932
1.5 0.933 0.934 0.936 0.937 0.938 0.939 0.941 0.942 0.943 0.944
1.6 0.945 0.946 0.947 0.948 0.949 0.951 0.952 0.953 0.954 0.954
1.7 0.955 0.956 0.957 0.958 0.959 0.960 0.961 0.962 0.962 0.963
1.8 0.964 0.965 0.966 0.966 0.967 0.968 0.969 0.969 0.970 0.971
1.9 0.971 0.972 0.973 0.973 0.974 0.974 0.975 0.976 0.976 0.977
2.0 0.977 0.978 0.978 0.979 0.979 0.980 0.980 0.981 0.981 0.982
2.1 0.982 0.983 0.983 0.983 0.984 0.984 0.985 0.985 0.985 0.986
2.2 0.986 0.986 0.987 0.987 0.987 0.988 0.988 0.988 0.989 0.989
2.3 0.989 0.990 0.990 0.990 0.990 0.991 0.991 0.991 0.991 0.992
2.4 0.992 0.992 0.992 0.992 0.993 0.993 0.993 0.993 0.993 0.994
2.5 0.994 0.994 0.994 0.994 0.994 0.995 0.995 0.995 0.995 0.995
2.6 0.995 0.995 0.996 0.996 0.996 0.996 0.996 0.996 0.996 0.996
2.7 0.997 0.997 0.997 0.997 0.997 0.997 0.997 0.997 0.997 0.997
2.8 0.997 0.998 0.998 0.998 0.998 0.998 0.998 0.998 0.998 0.998
2.9 0.998 0.998 0.998 0.998 0.998 0.998 0.998 0.999 0.999 0.999
3.0 0.999 0.999 0.999 0.999 0.999 0.999 0.999 0.999 0.999 0.999
Table A.1: Table of (x) (Normal Distribution)
P 99 95 90 85 80 60 50
ν = 1 0.000157 0.00393 0.0158 0.0358 0.064 0.275 0.455
2 0.0201 0.103 0.211 0.325 0.446 1.02 1.39
3 0.115 0.352 0.584 0.798 1.01 1.87 2.37
4 0.297 0.711 1.06 1.37 1.65 2.75 3.36
5 0.554 1.15 1.61 1.99 2.34 3.66 4.35
6 0.872 1.64 2.20 2.66 3.07 4.57 5.35
7 1.24 2.17 2.83 3.36 3.82 5.49 6.35
8 1.65 2.73 3.49 4.08 4.59 6.42 7.34
9 2.09 3.33 4.17 4.82 5.38 7.36 8.34
10 2.56 3.94 4.87 5.57 6.18 8.30 9.34
11 3.05 4.57 5.58 6.34 6.99 9.24 10.34
12 3.57 5.23 6.30 7.11 7.81 10.18 11.34
13 4.11 5.89 7.04 7.90 8.63 11.13 12.34
14 4.66 6.57 7.79 8.70 9.47 12.08 13.34
15 5.23 7.26 8.55 9.50 10.31 13.03 14.34
20 8.26 10.85 12.44 13.60 14.58 17.81 19.34
30 14.95 18.49 20.60 22.11 23.36 27.44 29.34
40 22.16 26.51 29.05 30.86 32.34 37.13 39.34
50 29.71 34.76 37.69 39.75 41.45 46.86 49.33
60 37.48 43.19 46.46 48.76 50.64 56.62 59.33
Table A.2: Table of the χ² distribution (Part I)
P 40 30 20 10 5 2 1 0.1
ν = 1 0.708 1.07 1.64 2.71 3.84 5.41 6.63 10.83
2 1.83 2.41 3.22 4.61 5.99 7.82 9.21 13.82
3 2.95 3.66 4.64 6.25 7.81 9.84 11.34 16.27
4 4.04 4.88 5.99 7.78 9.49 11.67 13.28 18.47
5 5.13 6.06 7.29 9.24 11.07 13.39 15.09 20.51
6 6.21 7.23 8.56 10.64 12.59 15.03 16.81 22.46
7 7.28 8.38 9.80 12.02 14.07 16.62 18.48 24.32
8 8.35 9.52 11.03 13.36 15.51 18.17 20.09 26.12
9 9.41 10.66 12.24 14.68 16.92 19.68 21.67 27.88
10 10.47 11.78 13.44 15.99 18.31 21.16 23.21 29.59
11 11.53 12.90 14.63 17.28 19.68 22.62 24.73 31.26
12 12.58 14.01 15.81 18.55 21.03 24.05 26.22 32.91
13 13.64 15.12 16.98 19.81 22.36 25.47 27.69 34.53
14 14.69 16.22 18.15 21.06 23.68 26.87 29.14 36.12
15 15.73 17.32 19.31 22.31 25.00 28.26 30.58 37.70
20 20.95 22.77 25.04 28.41 31.41 35.02 37.57 45.31
30 31.32 33.53 36.25 40.26 43.77 47.96 50.89 59.70
40 41.62 44.16 47.27 51.81 55.76 60.44 63.69 73.40
50 51.89 54.72 58.16 63.17 67.50 72.61 76.15 86.66
60 62.13 65.23 68.97 74.40 79.08 84.58 88.38 99.61
Table A.3: Table of the χ² distribution (Part II)
Appendix B
Greek Alphabet
The Greek alphabet has been included for completeness here. Of course, not all the Greek letters are used in this course, but a full reference may prove useful.
Name Lower Case Upper Case
Alpha α A
Beta β B
Gamma γ Γ
Delta δ Δ
Epsilon ε E
Zeta ζ Z
Eta η H
Theta θ Θ
Iota ι I
Kappa κ K
Lambda λ Λ
Mu μ M
Nu ν N
Xi ξ Ξ
Omicron o O
Pi π Π
Rho ρ P
Sigma σ Σ
Tau τ T
Upsilon υ Υ
Phi φ Φ
Chi χ X
Psi ψ Ψ
Omega ω Ω
Index
!, 32
<, 2
=, 2
>, 2
Im(z), 76
Re(z), 76
j, 68
ℑ(z), 76
ℜ(z), 76
Σ, 2, 31
cos θ, 50
cot θ, 51
csc θ, 51
δ, 273
ω, 64
≠, 2
φ, 64
sec θ, 51
sin θ, 50
tan θ, 50
i, 86
j, 86
k, 86
nCr, 41
i, 68
j, 68
absolute value, 30
adjacent, 49
angles
converting, 61
degrees, 61
radians, 61
angular frequency, 64
antilog, 33, 36
antilogarithm, 33
arc length, 61
area of sector, 62
Argand diagrams, 70
augmented matrix, 108
binomial expansion, 38
binomial theorem, 38
BODMAS, 3, 12
brackets
expanding, 17
multiplying, 18
CAH, 51
calculus
derivative, 149
differential, 149
adding, 151
chain rule, 151
product rule, 151
quotient rule, 152
subtracting, 151
trigonometry, 152
integral, 172
areas, 186
definite, 185
fractions, 176
logarithm, 175
mean value, 188
numerical, 191
parts, 177
power rule, 173
RMS, 189
substitution, 174
volumes, 187
several variables
Lagrange multipliers, 223
stationary points, 218
turning points, 158
Cartesian vectors, 86
CAST diagram, 55
combinations, 41
complex numbers, 68, 69
addition, 71
algebra, 71
Argand diagrams, 70
cartesian form, 76
conjugate, 75
division, 74, 78
exponential form, 78
imaginary, 69
imaginary part, 76
modulus, 75, 77
multiplication, 72, 78
plane, 70
polar form, 77
real part, 76
subtraction, 72
complex plane, 70
constant of integration, 172
convolution, 264
cosine rule, 59
Coulomb's law, 133
counting numbers, 8
cross product, 91
decibel, 33
decimal places, 3
delta, Dirac, 273
differential equations
Laplace transforms, 256
differentiation, see calculus, 149
partial, 214
discriminant, 28
domino rule, 96
dot product, 89
equation of straight line, 145
equations
exponential, 34
quadratic, 24
discriminant, 28
in trigonometry, 65
simple cases, 29
solving, 26, 27
rearranging, 11
solving
multiple solutions, 54
with matrices, 104
trigonometric, 54, 64
events, 297
independent, 299, 303
multiple, 298
mutually exclusive, 298
exp(), 37
expanding brackets, 17
expansion
binomial, 38
expected value, 305
exponent
exponential equations, 34
exponential functions, 32, 36
factor, 20
factorial, 32
factorization, 20, 26
Fourier
coefficients, 239
series, 240
function notation, 15
functions
exponential, 32, 36
inverse, 16
logarithmic, 33
trigonometric, 50
Gaussian elimination, 102
gradient, 144
Heaviside function, see Laplace transforms
hypotenuse, 49
identities, 63
identity matrix, 94
imaginary numbers, 69
impulse, 273
index, 21, 32
indices, 21
integers, 9
positive, 8
integration, 172
inverse function, 16
invertible, 101
irregular triangles, 57
Jacobians, 225
Lagrange multipliers, 223
Laplace transforms, 248
change of scale, 264
convolution, 264
derivatives, 253, 264
differential equations, 256
first shifting, 251
Heaviside function, 266
important transforms, 251, 253
integrals, 255
inverse, 250
unit step function, 266
laws
of indices, 21
of logs, 34
of signs, 2
of surds, 23, 68
leading diagonal, 94
length of arc, 61
log, 33
logarithm, 33
logarithmic functions, 33
logarithms, 34
matrices, 93
addition, 103
algebra, 103
augmented, 108
cells, 93
column vector, 94
determinant, 99
domino rule, 96
eigenvalues, 122
eigenvectors, 122
entries, 93
identity, 94
inverse, 101
multiplication, 96, 104
rank, 117
row operations, 108
row reduction, 102
row vector, 94
rule of signs, 99
solving equations, 104
square, 93
zero, 94
modulus
complex number, 75
complex numbers, 77
of real number, 30
vector, 85, 86
multiple solutions, 54
multiplying brackets, 18
Napier
John, 33
Newton's laws
Gravitation, 133
second, 131
notation
function, 15
sigma, 31, 285
trigonometric, 52
numbers, 8
complex, see complex numbers
imaginary, 69
prime, 8
rational, 9
real, 9
whole, 8
Ohms law, 131
operator precedence, 3, 12
opposite, 49
order of precedence, 3
parabola, 25
partial dierentiation, 214
partial fractions, 176
Pascal's triangle, 39, 41
pH, 33
phase shift, 64
positive integers, 8
power, 21, 32
Power Series, 194
prime numbers, 8
principle of superposition, 66
Probability
distributions
uniform, 290
probability, 297
conditional, 303
continuous variables, 308
discrete variables, 304
exhaustive list, 297
expected value, 305
laws, 299
n of N, 300
1 of N, 300
and, 300, 301, 303
not, 299
or, 299, 301, 302
multiple events, 298
probability density function, 309
proportion
direct, 130
inverse, 131
inverse square, 132
Pythagoras' theorem, 50
quadratic equations, 24, 27
complex solutions, 69, 75
radians, 61
rational numbers, 9
real line, 70
real numbers, 9
line, 9
rearranging equations, 11
Richter scale, 33
right angled triangles, 49
RMS value, 189
row reduction, 102
scalar product, 89
scalene triangles, 57
sector area, 62
Sigma notation, 31
sigma notation, see notation
significant figures, 4
Simpson's rule, 191
sine rule, 59
singular, 101
SOH, 51
stationary points, 158, 218
Statistics, 285
statistics
averages, 288, 289
cumulative frequency, 288
frequency, 288
measures
of dispersion, 290
of location, 288
population, 287
range, 291
relative frequency, 288
sample, 287
sampling, 287
standard deviation, 291
sum and product, 26
surd, 23
Taylor's Expansion, 195
Taylor's Theorem, 218
Taylor's theorem, 197
TOA, 51
triangles
irregular, 57
labelling, 49, 58
right angled, 49
scalene, 57
trigonometric equations, 64
trigonometry, 49
common values, 53
cosine rule, 59
functions, 50
identities, 63
compound angles, 64
double angles, 64
non right angled, 58
sine rule, 59
turning points, 158
unit step function, see Laplace transforms
variables
continuous random, 308
discrete random, 304
vector product, 91
vectors, 85
addition, 88
Cartesian, 86
column, 94
cross product, 91
dot product, 89
modulus, 85
row, 94
scalar product, 89
subtraction, 88
unit, 86
zero, 88
warpspeed, 33
waveform, 64
whole numbers, 8
zero matrix, 94
zero vector, 88