
Introduction to computational methods in science and engineering using MATLAB

Dr. rer. nat. Hans-Georg Matuttis
University of Electro-Communications,
Department of Mechanical and Control Engineering
Chofu Chofugaoka 1-5-1
Tokyo 182-8585
Japan
http://www.matuttis.mce.uec.ac.jp/
1. When something is unclear, ask me, not your neighbor, who is busy himself. Ask
as many questions as you need.
2. The script will be downloadable from http://www.matuttis.mce.uec.ac.jp/
or from the E-Learning system. You can read it online or print it out. If you
print out more than 100 pages, you have to submit an application (signed by
me) for more printout pages. Reading the script does not replace attending the
lecture.
3. The homework is an exercise to learn programming; you cannot learn programming
by reading the script.
4. Learn to use the online help of MATLAB.
5. For the credit:
   Presence in the lecture
   Number of points in the programming homework and the E-Learning system

Getting Started

0.1 Why Computational Methods

Most problems in science and engineering differ from undergraduate problems in the
respect that no closed solutions exist: Whereas there is a closed solution (solution
function) for the harmonic oscillator with viscous damping,

\ddot{x} + 2\gamma\dot{x} + \omega_0^2 x = 0,

x(t) = A \exp(-\gamma t) \exp\left( i \sqrt{\omega_0^2 - \gamma^2}\; t \right),

[Figure: displacement over time for viscous damping]
there is no closed solution for the harmonic oscillator
with sliding friction (see graphics to the right),

m\ddot{x} + \mathrm{sgn}(v)\, \mu F_N + kx = 0, \qquad \mathrm{sgn}(v) = \frac{v}{|v|},

[Figure: displacement over time for sliding friction]

but solutions can only be given piecewise for the ranges
where the friction force is constant:

x(t) = (\hat{x} - x_0) \sin(\omega t + \pi/2) + x_0 \qquad \text{for } 0 \le t \le \tfrac{T}{2},

x(t) = (\hat{x} - 3x_0) \sin(\omega t + \pi/2) - x_0 \qquad \text{for } \tfrac{T}{2} \le t \le T,
\qquad x_0 = \frac{\mu F_N}{k}.
For forces which are not linear in x and its derivatives, in general not even piecewise
solutions can be given. Other problems for which no solutions exist are problems with
many degrees of freedom (e.g. planetary systems), or flow problems.
For the technically important fields of structural analysis and fluid mechanics, most
results are nowadays obtained by computer simulations.

Fluid mechanics: Flow around a sphere with increasing Reynolds number/flow speed:
analytical solutions exist only for the Stokes flow problem.
[Figure: flow around a sphere with increasing Reynolds number: Stokes flow, vortices, vortex street, turbulence]

0.2 New MAC-Installation

Since 4/2006, the exercises room is equipped with MAC computers instead of the old
SUN-Unix workstation terminals. Software can either be started via the window
icons in the Applications directory, or via the command-line terminal, which is
started by clicking on the X icon. MATLAB can be started by clicking on the MATLAB
icon. It is recommended for the course to use the EMACS editor. Because the current
MAC-OSX operating system is based on the Unix operating system, the following
comments on Unix are useful, last but not least because UNIX commands (for directory
listings, previewing of graphics, removal of unneeded data etc.) can be used from the
MATLAB prompt via ! as escape sequence.
WARNING! The new MAC installation allows the teacher to view the screen and the
currently active programs on each student terminal. Applications which are unneeded
for the lesson can be terminated from the teacher console.

0.3 UNIX-Workstations

The MAC computers in the terminal room cannot be used remotely. If students want
to log in remotely from outside the terminal room, access is possible to the SUN
cluster with the name sun.edu.cc.uec.ac.jp, which consists neither of MACs
nor PCs, but of UNIX workstations. The login (also possible from the MAC computers)
has to be via the secure shell; login with X-terminal forwarding is possible as
ssh -X sun.edu.cc.uec.ac.jp
or
ssh -Y sun.edu.cc.uec.ac.jp
(whether the option -Y or -X is needed depends on the version of the operating system
one is logging in from). UNIX was originally written to be used from a command-prompt
window, not from GUI/window systems. It is advisable to be able to use the original
UNIX commands. If you like UNIX and would like a UNIX-like environment on your PC,
install the free CYGWIN package, www.cygwin.com.
Some survival UNIX commands:

cp source destination       : copy the file source to the file destination; if
                              destination exists, overwrite it with source
cp -r source destination    : like cp; -r means recursive, works also with
                              directories
pwd                         : display the current directory
mv source destination       : rename the file source to the file destination
cd                          : change directory
ps x                        : display existing jobs and their job-ids
kill job-id                 : kills the job with the number job-id
ls                          : list directory content
ls -d                       : list all directories in the current directory
ls -lrt                     : list all files with information about their size, in
                              the order in which they have been created, the
                              newest ones at the end
ls [a-c]*                   : list all files in this directory with names
                              beginning with a, b or c
find . -name thisname -print  : look for the file or directory thisname in the
                              current directory and in all the subdirectories
find . -name "*arg*" -print : display all files and directories in the current
                              directory and in all the subdirectories which
                              contain the string arg in their name
fgrep asdf *.txt            : look for all lines which contain the string asdf
                              in all files which have the extension .txt
UNIX is a multi-user, multi-process operating system, so several users can run commands
at the same time on a single computer. It is also possible to move jobs into the background
when starting them, so that they don't block the command prompt, by appending the
ampersand &. If a program was started in the foreground and blocks the prompt, it
can be suspended via CTRL-Z, which interrupts the execution. If bg is then typed in
the same terminal window, the execution is resumed in the background. (Of course,
this does not work with programs which have an input prompt in the foreground, like
MATLAB.)
Some special directories:

./    : the current directory
../   : the parent directory of the current directory
~/    : the user's login directory

Some special characters:

*     : any string
?     : any single character
[a-f] : any of the characters a,b,c,d,e,f
The recommendation for this course is to work in a directory which is dedicated to
MATLAB alone, and which is not the login directory of the account. The name
ml should do just fine:

mkdir ml
cd ml

0.4 MATLAB

0.4.1 Introduction: Interpreters and Compilers

In general, for programming projects with high numerical complexity, it is best to
develop the algorithms in MATLAB. MATLAB is, like BASIC or symbolic
computer languages like MAPLE, MATHEMATICA, MACSYMA and REDUCE, an
interpreter language, i.e. the language commands are translated into processor
instructions one at a time during execution. Nevertheless, MATLAB is not a symbolic
language, but performs all calculations numerically, i.e. with floating point numbers.1
The language can be used either from a command prompt or as a functional (or
object-oriented) programming language. In compiler languages like FORTRAN, C or
PASCAL, the program is fully translated into processor instructions before execution.
If errors occur at runtime, the memory contents are difficult to analyze, usually only
with the help of a debugger, which may alter the program execution and memory layout
up to the point that some errors cannot be reproduced. The debuggers' properties vary
much more than the language itself. In MATLAB, after a program crash, the data are
still accessible in MATLAB's memory and can be analyzed using the commands of the
MATLAB language itself.
Interpreters allow fast program development. As a rule, their execution times are
higher than those of compiler languages, but during program development, usually the
compile time is more costly than the actual runtime. In MATLAB, when complex
builtin functions are invoked via small commands, like a matrix inversion, very often
the advantage in speed of the compiler languages is negligible.
Many programming languages have a whole zoo of data types. MATLAB's elementary
data type is the complex matrix. (Recently, MATLAB also offers more kinds of data
types, but we will not use them in this course.) Variables are promoted automatically,
up to the point where they take a complex value. Variables which are used as
indices must nevertheless have an integer value.
Because it is not possible to declare variables in MATLAB, it refuses to process
variables which are not initialized. In FORTRAN77, for example, it was possible to use
variables which were neither declared nor initialized, and which assumed the value 0
at the moment they were used.
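A small sketch of this behavior at the MATLAB prompt (the exact wording of the
error message depends on the MATLAB version):

```matlab
a = 3;            % assigns the (real) value 3
b = sqrt(-a)      % b automatically becomes complex: 0 + 1.7321i
% c_undefined + 1 % would abort with an error ("Undefined function or variable")
whos              % lists the assigned variables a and b with size and type
```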

0.4.2 Getting started

The MATLAB interpreter is started on our installation by typing

matlab

at the command prompt, which starts the MATLAB desktop. If you are busy and you
don't want to see the splash screen (MATLAB commercial) at the program start, use

matlab -nosplash
1 The symbolic package available with MATLAB is basically MAPLE with a MATLAB-Interface.


Basic commands:

edit      : starts the MATLAB editor with syntax highlighting of MATLAB
            commands. You can use any editor you like to write MATLAB files,
            but the line ends may vary between operating systems and may lead
            to trouble
clear     : empties the memory
clear a   : clears the variable a from the memory
who       : displays the variables which have been assigned
help      : gives help concerning a specific topic
help help : tells you how to use the help function
lookfor   : looks for a word in the help files; useful if you are looking for a
            command according to context, but are not sure about the command
            name
disp(a)   : displays the value of the variable a
disp('a') : displays just the string 'a'
rand      : random number generator, will be used a lot to initialize data
format    : formats the output; format compact suppresses the output of empty
            lines, format short rounds the displayed output to about five
            significant digits, but the computations are still performed with
            full precision
%         : comment sign
ls        : lists the files in the current working directory of MATLAB, i.e.
            the directory in which MATLAB can access files directly
cd        : changes the current working directory of MATLAB
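A short interactive sketch of some of these commands:

```matlab
format compact   % no blank lines between outputs
x = 1/3          % displayed as 0.3333 (format short is the default)
format long
x                % displayed with about 15 digits: same value, more digits shown
who              % lists the assigned variables (here: x)
clear x          % removes x from the memory
```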
The MATLAB desktop is written in JAVA (another interpreter-based programming
language), which still has some stability problems2, so the desktop crashes relatively
often. If you don't want to work with the desktop to avoid unnecessary crashes, but
want to write the programs in a Unix editor you know, you can also start MATLAB
with the command prompt only as

matlab -nosplash -nodesktop

2 To get an idea why the JAVA interface of MATLAB crashes so often, see the internal memo from
SUN from http://www.internalmemos.com/memos/memodetails.php?memo id=1321

Special characters:

!      : escape sequence, allows using UNIX commands like cd, pwd from
         the MATLAB prompt
[...]  : 1. vector brackets referring to the values of the entries: [1 2 3] is a
         vector with the entries 1, 2 and 3;
         2. brackets referring to the output arguments of functions
(...)  : 1. brackets referring to the indices of a vector: a(3) is the third
         element of the vector a;
         2. brackets referring to the input arguments of a function
...    : three dots mark the end of a line which is continued in the next line
;      : has no syntactical function like in C, but is only used to suppress the
         output of the operation
i, j   : stand for the complex unit sqrt(-1), but can also be overwritten
         for other uses
pi     : is indeed 3.1415...
,      : separates commands when several commands should be written in
         the same editor line
:      : separates loop bounds, lower_bound:stepwidth:upper_bound.
         WARNING!
         lower_bound,stepwidth,upper_bound only displays the variables
         lower_bound, stepwidth, upper_bound
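The difference between : and , in the warning above can be seen directly at the
prompt:

```matlab
lower_bound = 1; stepwidth = 2; upper_bound = 7;
v = lower_bound:stepwidth:upper_bound   % builds the vector [1 3 5 7]
lower_bound, stepwidth, upper_bound     % merely displays the variables 1, 2, 7
```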
As a first reference, Kermit Sigmon's MATLAB primer at
http://math.ucsd.edu/driver/21d-s99/matlab-primer.html
can be recommended. It gives a short overview of the available commands, but it is a
good idea to get used to the builtin help function of MATLAB (just type help at the
prompt). For most purposes, the internal help is sufficient. Manuals for MATLAB
are available, but there is not much information which one needs beyond the builtin
help command on a daily basis, except the references to the algorithms used. This is a
huge difference to e.g. MATHEMATICA, where the algorithms are secret. Beware:
in contrast e.g. to FORTRAN, MATLAB is case sensitive, ABC is not the same as
abc. If you use the same variable names in lower case and upper case in the same
program, you will run into trouble anyway. Information about a public-domain clone
of MATLAB, OCTAVE, can be found at www.octave.org.
Control statements are usually terminated with the end command, no matter whether
it is an if statement or a for loop:

a=2
b=3
for i=1:10
  if (a>b)
    i
    disp('a>b')
  else
    disp('a<=b')
  end
end


0.4.3 Matrix Processing

MATLAB was started by Cleve Moler, a famous researcher in numerical linear algebra,
as a MATRIX LABORATORY for his students, which should allow fast, safe and easy
development of algorithms for numerical matrix analysis.
MATLAB has evolved into a general-purpose language with specialized applications in
many fields. Many books in the meantime use MATLAB either as a formal language or
for the programming examples; have a look at http://www.mathworks.com/support/
books/index.jsp.
Matrix syntax:

*            : multiplies two matrices according to the conventions of the
               inner/outer/matrix product
.*           : multiplies two matrices elementwise
a(2:4)       : elements of the vector a from the second to the fourth element
end          : the last element in a row/column of a vector/matrix
a(2:end)     : elements of the vector a from the second to the last element
b=c(2:3,2:6) : assigns to b the values of the matrix c from rows 2 to 3
               and columns 2 to 6

With the matrix syntax and the proper use of brackets, many operations can be
simplified without the use of loops:

a=[0.5:0.5:10]

or

for i=1:20
  a(i)=i/2
end

or

a=linspace(0.5,10,20)
Many functions either operate on vectors and matrices elementwise, or they are matrix
functions in the sense that the operations are performed in the matrix sense.

Matrix/vector functions:

length          : gives the longest dimension of a matrix, or the length of a vector
size            : gives the dimensions of a matrix
linspace(a,b,m) : makes a vector with m equidistantly spaced entries between a and b
rand(n,m)       : sets up a random matrix with n rows and m columns
exp             : exponential function, works elementwise on a matrix
expm            : matrix exponential function, works on the eigenvalues of a matrix
                  and can only be used for square matrices
eig             : eigenvalue decomposition
inv             : matrix inversion
norm            : matrix/vector norm
det             : determinant
svd             : singular value decomposition
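The difference between elementwise and matrix operations can be seen by comparing
exp with expm, and * with .*, on a small example:

```matlab
A = [0 1; 0 0];   % nilpotent 2x2 matrix, A*A is the zero matrix
exp(A)            % elementwise: [exp(0) exp(1); exp(0) exp(0)] = [1 2.7183; 1 1]
expm(A)           % matrix exponential: I + A + A^2/2 + ... = [1 1; 0 1]
A*A               % matrix product: the zero matrix
A.*A              % elementwise product: [0 1; 0 0]
```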

0.4.4 User-defined Functions (m-files)

User-defined functions can be written as ASCII files with the extension .m. A function
my_function would be contained in the file my_function.m:

function [out_arg1,out_arg2,arg3]=my_function(in_arg1,in_arg2,arg3)
% function [out_arg1,out_arg2,arg3]=my_function(in_arg1,in_arg2,arg3)
% The first comment after the function declaration is
% displayed if "help my_function" is typed, so write
% self-documenting functions
........
return

It is advisable to always end a function with a return statement, and the main
program as well.
For input arguments, MATLAB functions use call by value, which means that the
input arguments (in round brackets) cannot be modified in the function. Only the
output arguments (in []-brackets) can be modified by the called function. If an
argument is to be used both as an input argument and an output argument, it must
appear in the round brackets and in the []-brackets, like arg3 in the above example.
Global variables can be defined with the statement global, in a similar way as variables
are declared in other programming languages. The same declaration must then be used
in the functions which use the variable. Global variables can also be overwritten in the
functions; they are call-by-reference variables.
FORTRAN uses call by reference for all input variables of subroutines. C uses call by
value for scalars and call by reference for arrays, so a pointer to a variable must
be used if a scalar is to be modified in a function.
Functions can be overloaded for different numbers of input parameters and for scalar
and matrix arguments. If the operations used in the function allow an interpretation
in the matrix sense, the function can automatically be used for matrices.
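Overloading for different numbers of input arguments is done by inspecting nargin
inside the function. A minimal sketch (the function name and the default value are
made up for illustration):

```matlab
function y=my_scale(x,factor)
% function y=my_scale(x,factor)
% Multiplies x by factor; if factor is omitted, 2 is used.
% Works for scalars and matrices alike, since factor stays a scalar.
if (nargin<2)
  factor=2;       % default when called as my_scale(x)
end
y=factor*x;
return
```

Called as my_scale(3) this returns 6, called as my_scale(3,10) it returns 30, and
because multiplication by a scalar works elementwise, my_scale([1 2 3]) returns
[2 4 6].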


Exercises
1. Set up a vector with the entries (1, 2, 3, 4, . . . , n), once using a for-loop, the second
time using an implicit loop.
2. Multiply every second element with a constant a, once using a for-loop, once
using an implicit loop.
3. Write a program which finds out which elements of a vector are even.
4. See what happens if you set up ones(L), ones(L,1), ones(1,L), and what
happens when you try to multiply these objects with each other.
5. a) What do you expect the following program to do?

clear
step=2
upper_bound=10
for i=1,step,upperbound
disp(i)
end
return

b) What does the program really do? c) How do you have to rewrite the program
so that it does what you expected it to do in a)?
6. Write a program which computes the factorial n! of an integer number n:

n  n!
1  1
2  1*2
3  1*2*3
4  1*2*3*4

7. Rewrite the factorial program as a subroutine.
8. Rewrite the factorial subroutine so that the input arguments are checked and
only proper input arguments are accepted.
9. Use the help function of MATLAB to find out the relation between the built-in
function gamma and the factorial.

Chapter 1
How to write better programs

In this chapter, I will discuss the basics of programming style for numerical computing.1
Everything seems to be a matter of course, and during several courses, some students
who considered themselves experienced programmers skipped these lessons. Usually,
after two weeks of homework, they ran into exactly those pitfalls, problems and errors
that are discussed in these pages, and usually wasted several hours which could have
been spent productively. My usual comment was: We had this two weeks ago, when
you didn't attend . . .

1.1 Programming Style

1.1.1 Choosing variable names

Of course, nobody would use variable names in scientific computing which have no
scientific meaning, like linda, charly, taro, when there is no documentation of what the
variables mean. Variables which are difficult to spell, like asdtfgl or suchlike, should
better be avoided, except if there is a convention for how to compose such variable names.
Some variable names in scientific programming are self-explaining, like

x,y,z,vx,vy,vz,omega

etc. It is very easy to overdo the self-explanation by choosing too long variable names,
as I once saw in the programs of a master's thesis:

this_is_the_coordinate_of_x,
this_is_the_coordinate_of_y,

1 I will use the terminology Computational Physics, Computational Engineering, Scientific
Computing, Scientific Programming pretty much as synonyms. Numerical methods, numerical
mathematics, numerical algorithms I will use when I want to emphasize mathematical techniques
to handle floating point computations, minimize roundoff errors, control discretization errors etc.
Numerical physics I will use if I want to emphasize that the techniques for computational
physics require an understanding of the floating point computations involved.


etc. Meaningless short variable names like

xx,xxx,xxxx,xxxxxy

should also be avoided. If one wants to express the order of a derivative, e.g. the time
derivative of a coordinate (the Gear predictor-corrector method uses up to the 5th time
derivative ...), it may be good practice to use

x_0, x_1, x_2, x_3 . . .

for the coordinate, first derivative, second derivative and so on. A convention like

dx_f, dy_f, dx2_f, dx_dy_f . . .

is also not a bad idea.


The Fortran77-standard allowed only variable (and subroutine) names of six characters, so once I spend a happy week in 1989 rewriting my longer into shorter variable
names. That was before the coming of the Fortran90-standard, nowadays, all FortranCompilers accept longer variable names, but you may come across programs in the old
convention. I am not sure about the variable-lengths for C++-compilers, but be aware
that internally the compiler will expand internally the variable name variablename in
structure structurename in the object objectname to something like
objectname_structurename_variablename
and when these names become too long, this may also cause trouble. A colleagues
program once refused to compile because the internal name representation was longer
than 256 characters, and also debugging tools have problems if e.g. subroutine names
in objects or modules are becoming too long. As far as too long variable names are
concerned, one may run into similar Problems with new C++ Compilers as one did
with Fortran77 Compilers decades ago.
Be aware that similar variable names can easily be confused, especially if they make
use of uppercase/lowercase letters and the underscore, like

Variablename, variablename, Variable_name

so the use of all three in the same program will certainly cause problems. It is a
good convention to use variable names which sound different. At one point in one's
programming career, one should decide whether to write composite variable names
with an underscore, Variable_name, or not, as Variablename, or as VariableName.
More considerations about the conventions and choices for variable names can be found
in Code Complete.
It is practical to reserve i,j,k for loop variables for short loops, which increases
readability, especially if the original mathematical formulae, e.g. for vector operations,
use i, j, k as indices. In scientific applications, n is very often reserved for particle
numbers, and l,lx,ly,lz for system sizes. Using short variable names increases the
readability when the usage of the variable is clear, e.g. when one just implements a
formula according to a text book. Of course, it is not a bad idea to include the reference
(page and title) of the text book in a comment line.
Original formula:

a = v/t
x = (1/2) a t^2

Good:

% A.B. Meier, Mechanics,
% p. 15, eq. 4
a=v/t
x=.5*a*t*t

Not so good:

acceleration=velocity/time
position=.5*acceleration*time^2
For long, complex programs, the usage of e.g. n as particle number or m as number
of timesteps becomes increasingly cumbersome, especially if one tries to recycle the
code and the variable names have already been used e.g. as loop variables. It is better
to append some information, so if one has to treat walls and particles in a program,
one should define n_wall, n_particle, and for the computation of e.g. the mass, the
program would best be written like

for i_particle=1:n_particle
  m_particle(i_particle)=r_particle^2*pi
end

and in the same way for the walls.

1.1.2 Readability of code and Computational effort

Try to write your code as readably as possible. One definition of a programming guru2
is: he still understands his programs after he has not looked at them for ten years. If
you consider yourself a guru, try to read your programs from ten years ago. There is a
world championship in writing the most unreadable C program, the infamous
International Obfuscated C Code Contest3, and one of the winners wrote the following:

/*
 * Program to compute an approximation of pi
 * by Brian Westley, 1988
 * (requires pcc macro concatenation; try gcc -traditional-cpp)
 */
#define _ -F<00||--F-OO--;
int F=00,OO=00;
main(){F_OO();printf("%1.3f\n",4.*-F/OO/OO);}F_OO()
{
_-_-_-_
_-_-_-_-_-_-_-_-_
_-_-_-_-_-_-_-_-_-_-_-_
_-_-_-_-_-_-_-_-_-_-_-_-_-_
_-_-_-_-_-_-_-_-_-_-_-_-_-_-_
_-_-_-_-_-_-_-_-_-_-_-_-_-_-_
_-_-_-_-_-_-_-_-_-_-_-_-_-_-_-_
_-_-_-_-_-_-_-_-_-_-_-_-_-_-_-_
_-_-_-_-_-_-_-_-_-_-_-_-_-_-_-_
_-_-_-_-_-_-_-_-_-_-_-_-_-_-_-_
_-_-_-_-_-_-_-_-_-_-_-_-_-_-_
_-_-_-_-_-_-_-_-_-_-_-_-_-_-_
_-_-_-_-_-_-_-_-_-_-_-_-_-_
_-_-_-_-_-_-_-_-_-_-_-_
_-_-_-_-_-_-_-_
_-_-_-_
}

2 Further information on how to write good programs can be found in: Code Complete,
Steve McConnell, Microsoft Press, Paperback 1993, also available in Japanese.
3 Homepage at http://www.ioccc.org/
As the purpose of science is clarity, the purpose of writing scientific code is also clarity and
readability. Unreadable code is code which is hard to debug, and errors in scientific computing
are much more difficult to detect than in commercial software: if the second digit of your
result is wrong, nothing announces it, whereas in commercial software you can always tell
that something went wrong by messages like segmentation violation. Moreover,
commercial software vendors can make money by selling software updates, whereas in
scientific computing, people who wrote buggy code will have trouble in their career.
Unreadable code is not the fault of the programming language, though some programming
languages attract chaotic programmers more than others. The advantage of restrictive
programming languages like ADA is that you cannot make certain classes of errors.

What is in a line

Be aware that identical operations in a computer are not sped up by cramming everything
into the same line:

a=2*b
a=a^2
a=a/c+d

will take the same computer time as

a=((2*b)^2)/c+d

Which is more readable depends on the implemented formulae. There are tricks called
performance optimization which actually allow faster program execution due to the style
in which the code is written, but this has nothing to do with cramming many commands
into a single line, and can only be discussed in a later chapter.

Coherence

If you are not sure which lines in the code should be grouped together, it is best to stick
to the concept of coherence: writing operations which affect the same variables in
consecutive lines. Instead of

a1=b1*c1
a2=b2+c2
a3=b3/c3
a1=a1-d1/e1
a2=(a1+a2)/2
a3=a3*a2
it is better to write
a1=b1*c1
a1=a1-d1/e1
a2=b2+c2
a2=(a1+a2)/2
a3=b3/c3
a3=a3*a2
Once I had to find the error in the program of a student. The result was correct, except that
it was 10 orders of magnitude off. He should have divided the result by a timestep dt.
The student knew that if one has to do many divisions by the same number dt, it is faster
to compute the inverse i_dt=1/dt once and to multiply by i_dt. And he thought that he
could save programming time by not defining a new variable i_dt, so his program looked like

dt=10^-5
dt=1/dt
............
(one page of code)
............
result=preliminary_result/dt

A perfect interaction of a stupid choice of variable names (the name of the variable at the
end did not match its meaning), a code which was longer than one page, so one could not
read it in a single window, and an incoherent way of using the variable dt.

Line length and Subroutine length

Fortran77 and also some other programming languages limited the line length to 72
characters. Many C programmers consider it advanced programming style to indent their
code so much that they cannot even get the full line length on a 19-inch screen, and they
have to scroll their windows to the left and right. Be aware that what you cannot see at a
single glance, but only after repeated scrolling, can easily cause errors, as you have no
overview of your code. This applies to vertical as well as horizontal scrolling, so keep the
number of lines of a subroutine below a certain limit (two A4 pages may already be too
long), and the number of columns should not be more than maybe 80 characters; every
program which needs more should have more subroutines.


1.2 Safety first

The most important aspect of scientific programming is the safety of the programs. Never
in the history of mankind has it been possible to produce so many wrong answers so fast.4

1.2.1 Check Input variables

Always check the input variables of your subroutines. You may know with which parameters
the subroutine must be used, but there may be somebody else who does not know it, usually
the next student who uses the program after you, and who will produce a lot of numerical
garbage. So even if you have a simulation of a mechanical system which should only be used
with positive timestep and positive masses, you had better check whether the timestep and
the masses are larger than 0 at the beginning of the program. Moreover, errors in passing
arguments to the subroutine can be detected more easily like that. If you find a wrong input
parameter, don't replace the input with a default value, but stop the program good and hard:

mass=input('mass: ')
if (mass<=0)
  error('mass must be larger than 0')
end

For general software, it may be a good idea to define a default value. For most numerical
applications (except for accuracy thresholds), specifying a default input may be a very bad
idea.
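Inside a user-defined function, the same kind of check can be combined with nargin.
A sketch (the function and variable names are made up for illustration):

```matlab
function x_new=euler_step(x,v,dt)
% function x_new=euler_step(x,v,dt)
% One explicit Euler step for a position; refuses bad timesteps.
if (nargin~=3)
  error('euler_step needs exactly three arguments: x, v, dt')
end
if (dt<=0)
  error('dt must be larger than 0')  % stop good and hard, no default value
end
x_new=x+v*dt;
return
```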

1.2.2 Operator precedence

For analytic arithmetic expressions, the order of the arithmetic operations is usually well
defined, so that a + b·c^d is automatically evaluated as a + (b·(c^d)). Usually, the order
of the operations is equally clear with logical expressions, but in numerical code it is a
priori not clear whether, for the logical operators not, and, or written as ~,&,|,

(~a<b*c&d==0)

is evaluated as ((~(a<b*c))&(d==0)), or as (~((a<b*c)&(d==0))), or whether the logical
operations can even be applied bitwise to integer values, as ~(10101010)=(01010101), and
the result then be used as a number of the respective type. So whenever anything occurs
which is more ambiguous than addition and multiplication, one should use brackets.
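In MATLAB itself, the unary ~ binds even more tightly than < and *, so the expression
above is actually read in yet a third way, ((~a)<(b*c))&(d==0); a small sketch of why
the brackets matter:

```matlab
a=0; b=2; c=3; d=0;
r1 = (~a<b*c&d==0)        % MATLAB reads ((~a)<(b*c))&(d==0): (1<6)&1 = 1
r2 = ((~(a<b*c))&(d==0))  % explicit brackets: (~(0<6))&1 = 0&1 = 0
% different readings give different results, so always write the brackets
```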

1.3 Program documentation

Always document your program, and the best method is to write the explanations within
the code; if they are elsewhere, they will get lost over the years. I will reject any project
which is not well documented.

4 Carl-Erik Froberg

1.3.1 Stupid comments

There are useful ways and stupid ways to write comments. When I once emphasized the
importance of comments for computer programs, in the next exercise lesson one student
wrote the following comment:

% here is a comment

When I asked why he wrote such a comment, he said: "Because you said we should write
comments." But he had not written in his program what the program should do, and during
one hour of programming actually forgot what he should program... Another stupid comment
would be

% Divide by c
a=b/c

Of course, the division is self-explaining, but for the same short line a comment like

% c from function XXYYZZ, not yet checked whether c becomes 0
a=b/c

may help a lot in debugging the code. Generally, focus on what the code is doing, not how,
because how it is done can be read from the programmed lines anyway.

1.3.2 Comments

Usually, every line which contains information which is not self-explaining, like

volume=lx*lx*lz
mass=volume*rho

should better be documented. Of course, the amount of comments necessary grows with the
number of people who are supposed to use the code, with the number of functions and lines
in the code, and with its complexity. If you are not sure who will use the code, better
write your program documentation in English. It is generally a good idea to formalize one's
documentation, especially at the beginning of functions/subroutines:

%PURPOSE: What the program is supposed to do
%USAGE: When and how the program is to be used
%AUTHOR: Who wrote the program
%DATE: Date when the program was written
%ALGORITHM: If the algorithm used is more complicated
%   than what you can document in the body of the subroutine, you
%   better explain the algorithm here
%LITERATURE: If you have used a complicated algorithm, e.g. for
%   matrix inversion etc., write from which book or article the algorithm
%   comes; usually you have also used its naming conventions, and anybody
%   who wants to understand the algorithm (maybe you after ten years) better
%   reads the literature first.
%CAVEATS: If you have programmed around restrictions or pitfalls,
%   document them here
%TODO: How to improve the algorithm the next time you have time
%REVISION HISTORY: Write the date when you modified the algorithm,
%   and what was changed


The above example is easy to maintain, modify, or add to. What is not easy to
maintain would be something like

%          PURPOSE:
% +-------------+

and so on. The simpler you design your comments, the more likely it is that
you really write them in the way they should be written. If any of the above points do not
apply, leave them out. If the routine is complete and runs as it should, don't write an empty
TODO point. If your routine name is my_asin (my arcsine), then you don't have to document
much in principle. But if the routine actually computes the arcsine in a non-standard way
by polynomial approximation, you had better write where you have it from in the literature.
If the routine is vectorized, this should be stated in the PURPOSE. If the vectorization works
only if a vectorized division is available, this should be written in the CAVEATS. If you write
a routine for the first time, you don't have to write a REVISION HISTORY; the date is enough.
And when you change the routine, also change the comments! Nothing is more confusing
than working with a correct routine for my_sinus which actually calculates a cosine.

Exercises
1. Check whether the MATLAB programs you wrote up to now are in accordance
with the above ideas.
2. Write a program which creates a matrix where the first column contains equally spaced
x-values between -5 and 5, and the second column contains the values of the second-order
polynomial y = ax^2 + bx + c.
3. Write a program which creates a matrix where the first column contains equally
spaced x-values between -5 and 5, and the second column contains the values of the
function y = 1/(1 + x).
4. Write a program which can detect whether the result of a mathematical computation
has complex parts.

Chapter 2
Stochastic methods I
Stochastic methods use concepts from probability theory. Knowledge about stochastic methods is important in every field of science and engineering, because each data series contains
a certain element of chance or a certain scattering of the data.

2.1 Random Number Generators

In computer simulations, the element of chance is usually simulated by so-called random
numbers, or pseudo-random numbers. A random number generator is a function which should
generate a sequence of numbers which are distributed according to certain probability rules.
In case of equally distributed random numbers, the numbers are usually between 0 and 1,
and all values can be obtained with the same probability. The random number generator in
MATLAB is called rand, and it can be called with arguments so that the result is not just a
single random number but a vector or matrix:
clear, format compact, format short
rand        % output a random number
a=rand(1,4) % output a 1x4 vector of random numbers
b=rand(4)   % output a 4x4 matrix of random numbers
This program using the function rand for equally distributed random numbers gives the
following output:
>> showrand
ans =
    0.9501
a =
    0.2311    0.6068    0.4860    0.8913
b =
    0.7621    0.4447    0.7382    0.9169
    0.4565    0.6154    0.1763    0.4103
    0.0185    0.7919    0.4057    0.8936
    0.8214    0.9218    0.9355    0.0579

2.1.1 Mean and Variance

Standard quantities which characterise the statistical properties of a set a_1, a_2, ..., a_n
of n numbers are the mean

    mean: μ = ⟨a⟩ = (1/n) Σ_{i=1}^{n} a_i

and the variance

    variance: σ^2 = Var(a) = (1/(n-1)) Σ_{i=1}^{n} (a_i - ⟨a⟩)^2,

the mean of the squares of the differences between the respective samples and their mean.
The square root of the variance is called the standard deviation.
% PURPOSE: Calculate mean and variance
%          for the MATLAB random number generator
clear
format compact
format short
n_rn=10000
rn_vec=rand(n_rn,1);
% rand(n_rn,1) gives a column vector of length 10000,
% rand(1,n_rn) gives a row vector of length 10000,
% rand(n_rn) would give a 10000x10000 matrix and crash the program
mean_rn=mean(rn_vec)
var_rn=var(rn_vec)
return
Exercise 1: Calculate by hand the theoretical mean and variance for random numbers
equally distributed between 0 and 1.
Another random number generator in MATLAB is randn, which creates random numbers
according to the Gauss distribution

    G(x) = 1/(σ sqrt(2π)) exp( -(x - x_m)^2 / (2σ^2) ),

and the normally distributed random numbers from randn have mean x_m = 0 and standard
deviation σ = 1.
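As a small sketch (the shift-and-scale transformation is standard, but the concrete values xm=3 and sigma=0.5 are made-up examples, not from the script), Gaussian random numbers with another mean and standard deviation can be obtained from randn as follows:

```matlab
% Sketch: Gaussian random numbers with prescribed mean xm and standard
% deviation sigma, obtained by shifting and scaling the output of randn.
% The values xm=3 and sigma=0.5 are arbitrary example values.
xm=3; sigma=0.5;
x=xm+sigma*randn(100000,1);
mean(x)   % should be close to 3
std(x)    % should be close to 0.5
```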
Exercise 2: Estimate how the error of a statistical sequence of random numbers depends
on the number of random numbers used, by comparing the theoretical variance for the
randn random number generator with the actually measured variance.

2.1.2 Distributions and tests of random numbers

A simple visualization for random numbers is to draw the histogram: how many random
numbers fall into each interval Δx. These intervals are called the bins of the histogram, and
the collection of the data into the histogram is often called binning. For a certain number
of bins in the histogram, the distribution of the random numbers can be studied. For the
following program, the output and the drawn histogram are given below:

clear
format compact
format short
a=rand(1,4)
hist(a)

>> randhist
a =
    0.3423    0.3544    0.7965    0.5617

[Figure: histogram of the four random numbers, x-axis from 0.3 to 0.8.]

If many bins and few random numbers are used, the histogram is rough; if more
random numbers are used, the histogram becomes smooth:

clear
format compact
format short
a=rand(1,50);
subplot(4,2,1), hist(a)
set(gca,'Xticklabel','')
title('50 random numbers, 10 bins')
axis tight
b=rand(1,500);
subplot(4,2,3), hist(b)
set(gca,'Xticklabel','')
title('500 random numbers, 10 bins')
axis tight
d=rand(1,50000);
subplot(4,2,5), hist(d)
set(gca,'Xticklabel','')
title('50000 random numbers, 10 bins')
axis tight
subplot(4,2,7), hist(d,100)
title('50000 random numbers, 100 bins')
axis tight

[Figure: the four histograms, for 50, 500 and 50000 random numbers with 10 bins,
and for 50000 random numbers with 100 bins.]

Exercise 3: Estimate the dependence of the statistical fluctuations, i.e. of the
differences in the number of entries per histogram bin, on the number of entries.
A basic test for random numbers is whether the numbers of entries in the bins are the same
within the statistical fluctuations. Much more sophisticated tests for random numbers can
be found in Knuth[1]. Nevertheless, to evaluate the usability of a random number algorithm
for a given problem, one should not rely on the theoretically available tests alone, but should
test the algorithm, before applying it to a problem with unknown solution, on a related
problem for which one knows the solution. Another visual way of controlling random number
sequences is to plot one sequence as the x- and the other as the y-coordinate:

[1] Donald Knuth, The Art of Computer Programming, Addison-Wesley 1998
clear
format compact
n_rn=100;
a=rand(n_rn,1);
b=rand(n_rn,1);
plot(a,b,'.')
axis image

[Figure: scatter plot of the 100 random points (a,b) in the unit square.]

2.2 Usage of Random Numbers

2.2.1 Initializing the Seed

Random numbers are used to verify statistical hypotheses, or to initialize simulations in an
arbitrary way. To test statistical hypotheses, one needs several independent sequences of
random numbers. Nevertheless, during program development it is advantageous to test and
debug the program always with the same random number sequence. The start value which
determines the sequence is called the seed, and it is set by

rand('seed',X)

where X should be a numerical value. For general random number generators, very often
prime numbers have to be used as seed, so always read the documentation first.
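A minimal sketch of the reproducibility (the seed value 12345 is arbitrary; newer MATLAB versions use rng(X) instead, so treat the exact syntax as version dependent):

```matlab
% Sketch: the same seed reproduces the same random sequence
rand('seed',12345)   % 12345 is an arbitrary example seed
a=rand(1,3);
rand('seed',12345)   % resetting to the same seed ...
b=rand(1,3);         % ... gives the same three numbers again
a-b                  % all differences are 0
```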

2.2.2 Monte Carlo Method: Calculate π by random numbers

Before random numbers could be generated easily and fast with computer algorithms,
mathematicians used tabulated random numbers[2], similar to the way tabulated values of
integrals are still used today. Some of these random number tables had been compiled using
roulette results from the casino of Monte Carlo in Monaco, and so Monte Carlo methods got
their name. In recent years, in computer science it has become fashionable to name some
methods Las Vegas methods instead of Monte Carlo methods, but the difference is purely
academic.

[2] See e.g. Random numbers in uniform and normal distribution: with indices for subsets,
compiled by Charles E. Clark, Chandler Pub. Co, 1966

To calculate π with random numbers, let us consider a quarter circle of radius r=1 and
area a = π/4, inside a square of side length 1 and area a' = 1. If we choose a point randomly
inside the square, the probability P that it is inside the quarter circle is

    P = a/a' = (π/4)/1 = N(in circle)/N(in square),   N(in square) = N_total,

where P is the relative frequency with which points are found inside the quarter circle.
Therefore, π can be computed via the relative frequencies as

    π = 4 N(in circle)/N_total.

A program which does this computation is:

clear
format compact, format short
mc_step=10000
n_inside=zeros(mc_step,1);
n_try=zeros(mc_step,1);
i_inside=0;
for i_mc=1:mc_step
  x=rand;
  y=rand;
  r2=x*x+y*y;
  if r2<=1
    i_inside=i_inside+1;
  end
  n_try(i_mc)=i_mc;
  n_inside(i_mc)=i_inside;
end
4*i_inside/mc_step
return

Exercise: Try to understand the time behavior by plotting the difference to the exact
result in different scales (logarithmically, double-logarithmically).

2.2.3 Simulation of Stochastic processes
Random numbers allow us to simulate, in a stochastic way, processes which are often
considered to be deterministic. Let us in the following consider a league of teams; which
sport (baseball, soccer, basketball, ...) does not matter. Each of the six teams has a certain
game strength S_i. Let us define the probability for a team A to win against another team
B as

    P_AB = (S_A / S_B) · max_i(S_i) / (S_A + S_B).

In the following program, the game strength (team_quality) is the same for each team;
nevertheless you will find that usually one team wins clearly. In the run which is depicted
behind the listing, the percentages of wins for all the teams are plotted. One can see that
team 2, which was leading in the beginning, finishes as the last team, whereas the
also-leading team 6 wins the championship. In real life, sports reporters waste a lot of
time and energy on explaining such developments, but in our simulation, we can see that
such narrow outcomes are just a result of chance. For stock exchange fluctuations, the same
reasoning applies.
clear, format compact, format short
n_team=6, n_game=100  % six teams, as in the text and the plot
for i=1:n_team
  team_quality(i)=11;
end
n_games_played(1:n_team)=0;
n_games_won(1:n_team)=0;
for i_game=1:n_game
  for i_team=1:n_team
    for j_team=i_team+1:n_team
      n_games_played(i_team)=n_games_played(i_team)+1;
      n_games_played(j_team)=n_games_played(j_team)+1;
      win_probability=...
        ...% relative probability
        (team_quality(i_team)/team_quality(j_team))*...
        ...% normalization
        max(team_quality)/(team_quality(i_team)+team_quality(j_team));
      % assign a winner according to the probability
      if (win_probability>rand)
        n_games_won(i_team)=n_games_won(i_team)+1;
      else
        n_games_won(j_team)=n_games_won(j_team)+1;
      end
      score(n_games_played(i_team),i_team)=n_games_won(i_team);
      score(n_games_played(j_team),j_team)=n_games_won(j_team);
    end
  end
end
% normalize the number of games won to a winning probability
normalization=ones(size(score));
normalization=cumsum(normalization(:,1));
plot(normalization,score(:,1)./normalization,'--',...
     normalization,score(:,2)./normalization,'.',...
     normalization,score(:,3)./normalization,'+',...
     normalization,score(:,4)./normalization,'-.',...
     normalization,score(:,5)./normalization,':',...
     normalization,score(:,6)./normalization,'-')
legend('team 1','team 2','team 3','team 4','team 5','team 6')

[Figure: fraction of games won by each of the six teams versus the number of games
played (10 to 100); legend: team 1 ... team 6.]

Exercise: Modify the quality and see how the winning probability changes. Find out how
strongly you have to modify the quality of one team so that it wins in all test runs.

2.2.4 Time averages and ensemble averages

In the example of calculating π with random numbers, better statistics can be obtained by
using more Monte-Carlo steps. As an alternative, one can also run the program with few
Monte-Carlo steps several times with different seeds, save the results and average over them.
This will reduce the noise in the data. For statistically independent data, as in our calculation
of π, both approaches are equivalent.
The law of large numbers states that the actual probabilities are only realized after infinitely
many tries. For a finite number of realizations, the fluctuations in the system can clearly be
felt.
An approach like the one above, where successive Monte-Carlo data are obtained
independently from each other, is called simple sampling. If the Monte-Carlo data are chosen
depending on the previous data, the procedure is called importance sampling.
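As a sketch of such an ensemble average (the run length and the number of runs are arbitrary example values, and the seeds are simply 1...10), the π-computation from above can be repeated with different seeds and the results averaged:

```matlab
% Sketch: ensemble average of the Monte-Carlo estimate of pi over
% several short runs, each started with a different seed
n_run=10; mc_step=1000;        % arbitrary example values
results=zeros(n_run,1);
for i_run=1:n_run
  rand('seed',i_run);          % a different seed for every run
  x=rand(mc_step,1);
  y=rand(mc_step,1);
  results(i_run)=4*sum(x.^2+y.^2<=1)/mc_step;
end
mean(results)                  % ensemble average, close to pi
```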
Homework 1: The obsolete why-function
Implement the old why-function (which does not exist any more) from MATLAB Version 5.
When you typed why, you got one of the possible answers:
why not?
don't ask!
it's your karma.
stupid question!
how should I know?
can you rephrase that?
it should be obvious.
the devil made me do it.
the computer did it.
the customer is always right.
in the beginning, God created the heavens and the earth...
don't you have something better to do?
because you deserve it.
or

Cleve / Jack / Bill / Joe / Pete
insisted on it
suggested it
told me to
wanted it
knew it was a good idea
wanted it that way
Write a program which randomly gives one of the answers above, with equal probability.
Homework 2: The goat quiz
At the end of a quiz show, the winner has to choose his prize behind one of three doors.
Behind two of the doors there is a goat; if the winner chooses one of these doors, he gets
nothing.
First the winner chooses one door. Then the show-master opens one of the remaining doors
which has a goat behind it.
The winner is now allowed to switch his choice to the third remaining door, or to stick to the
door he has chosen.
If his choice was successful, he gets the prize.
Write a program which allows you to find out whether it is better for the winner to switch
the door after the show-master shows the goat, or whether it is better to stick with the first
choice.
Think about what is better before you write the program, but don't manipulate the program
outcome to obtain your conjectured result.

Chapter 3
Numerical Analysis I

3.1 Data types: Integers

Integers are represented according to the number representation the computer uses
internally. For example, in the binary representation, integers are represented as combinations
of 0 and 1; in the hexadecimal (Greek-Latin for 16) representation, integers are represented
as combinations of the digits 0 to 9 and A to F, see Tab. 3.1. If you need the conversion
from decimal to binary,
decimal  binary  hexadecimal     decimal  binary  hexadecimal
  00     00000       00            10     01010       0A
  01     00001       01            11     01011       0B
  02     00010       02            12     01100       0C
  03     00011       03            13     01101       0D
  04     00100       04            14     01110       0E
  05     00101       05            15     01111       0F
  06     00110       06            16     10000       10
  07     00111       07            17     10001       11
  08     01000       08            18     10010       12
  09     01001       09            19     10011       13

Table 3.1: Integers from 0 to 19 in decimal, binary and hexadecimal representation.


hexadecimal to decimal or whatever, you can always use the MATLAB functions dec2hex,
hex2dec, dec2bin and bin2dec. The 2 is pronounced as "to", like in "decimal to hexadecimal".
The same naming logic is applied in num2str, the conversion from numeric to string.
The difference between one integer and the next largest representable integer is always one,
and integers in different representations are still the same integers.
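A few calls as a sketch; note that the 2bin/2hex directions return strings, while hex2dec and bin2dec return numbers:

```matlab
% Sketch: the conversion functions between representations
dec2bin(18)      % gives the string '10010'
dec2hex(18)      % gives the string '12'
hex2dec('12')    % gives the number 18
bin2dec('10010') % gives the number 18
```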
Integers in FORTRAN are also sometimes declared as INTEGER*4, because 4 Byte = 4 × 8
Bit = 32 binary digits are used to represent these standard integers. As one bit is reserved
for the sign of the integer, the largest representable integer is something like 2^31 - 1, the
smallest -2^31 + 1. As extensions to standard FORTRAN, some compilers also provide the
INTEGER*8 type (8-Byte integers, from -2^63 + 1 to 2^63 - 1) and the INTEGER*2 type
(2-Byte integers). INTEGER*8 is convenient when large integer values have to be expressed
without the rounding occurring in floating point computations, whereas INTEGER*2 is
convenient if large arrays


of integers must be stored where the integers can only take very few values. The danger of
using the non-standard integer types is that if one changes the compiler (or the computer
one works on), these data types may not be available any more, and one has to rewrite the
whole program.
The C/C++ standards do not define the exact accuracy of their data types, but provide
the types int and long int, where long int possibly has the larger number of digits (but may
have the same number as short int). Additionally, there is the unsigned data type, which
allows to represent a largest number which is about twice as large as in the signed data
type.

3.1.1 Fixed point numbers

Fixed-point numbers are created from integers by renormalizing the integer with a prefactor.
Fixed-point numbers are needed in environments where a constant absolute precision is
needed, for example in the banking sector, where the result of an operation always must
be rounded to a certain digit, e.g. 1/10000 $, and this accuracy must be maintained over the
whole data range, from the smallest transactions of a few dollars to billions of dollars.
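A minimal sketch of the idea (basic MATLAB has no built-in fixed-point type, so the renormalization is done by hand here; the amounts are invented example values):

```matlab
% Sketch: fixed-point amounts stored as integers counting units of
% 1/10000 dollar, so the absolute precision is constant over the range
scale=10000;                 % 1 unit = 1/10000 dollar
a=round(1.23456789*scale);   % stored as the integer 12346
b=round(2.5*scale);          % stored as the integer 25000
(a+b)/scale                  % sum in dollars: 3.7346
```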

3.2 Data types: Floating point numbers

In technical and scientific applications, the orders of magnitude used are much larger than
e.g. in banking or administration. Trillions of dollars (10^12) are a lot of money, but trillions
of molecules is something rather microscopic. Therefore, the preferred data type in scientific
computations is the floating point number, where the numbers are spaced out irregularly,
with more numbers in smaller intervals, so that the relative accuracy of operations is
constant, not the absolute accuracy as with integer numbers.
MATLAB performs all operations in floating point numbers (actually, in complex floating
point numbers). In contrast, many standard programming languages like C, C++ and
FORTRAN do not perform type conversion during arithmetic operations, but only at the
time of the assignment of the result. That means an integer division of a number by a larger
number gives 0, and depending on the data type, the results have different accuracies, as in
the following example in FORTRAN90:

program test_implicit
implicit none
write(*,*) 3/7         ! = 0                 Integer division
write(*,*) 3./7.       ! = 0.428571          REAL*4 division
write(*,*) 3.d0/7.d0   ! = 0.428571428571429 REAL*8 division
stop
end

3.2.1 Error

For the following sections, it will be convenient to define the numerical error of an operation,
the difference between the outcome of an exact operation using real numbers and the
numerical operation using numbers as they are stored in the computer:

    exact operation:     A ∘ B = C
    numerical operation: A ∘ B = C~

    absolute error: ε_absolute = |C~ - C|
    relative error: ε_relative = |C~ - C| / |C|

With respect to representing mathematical real numbers, e.g. multiples of 5./9., on the
computer, integers have a constant absolute error (on average, the error is of the order of 1),
whereas floating point numbers have a constant relative error, as can be seen in the following
table:

    Floating point number                     Integer
    Operation                   Error         Operation  Error
    50./9.=5.555555555555556    < O(10^-14)   50/9=5     < O(1)
    500./9.=55.55555555555556   < O(10^-13)   500/9=55   < O(1)
    relative error constant                   absolute error constant

3.2.2 Usage

Floating point numbers are the only numbers on a computer with which fast numerical
computations are possible over a large range of possible values. Floating point operations
per second (FLOPS) are usually given as the benchmark for computers; currently the fastest
computer in the world, the Earth Simulator near Yokohama, can do about 40 Tera-FLOPS.
The precision of the declared variables is usually expressed in the declaration statement: in
the FORTRAN77 standard, REAL*4/REAL*8 (or DOUBLE PRECISION) expressed that
4/8 Byte were used to represent the data.

3.2.3 Data-Layout

In floating point numbers, mantissa and exponent are stored in such a way that the number
is represented as a sum of powers of the base β, with precision t and lower and upper bounds
for the exponent e, L ≤ e ≤ U. A floating point number x can then be represented as

    x = ± ( d_1/β + d_2/β^2 + ... + d_t/β^t ) β^e   with 0 ≤ d_i ≤ β-1,  (i = 1, ..., t).

The usual real numbers in a higher programming language like C or FORTRAN have the
following characteristics:
Kind    Byte/Bit   mantissa/exponent   Range                          valid digits
Real    4/32       23/8                8.43·10^-37 ... 3.37·10^38     6-7
Double  8/64       52/11               4.19·10^-307 ... 1.67·10^308   15-16

3.2.4 Example

The above representation does not give equidistant numbers, as can be seen if the
distribution of numbers is plotted for β = 2, -1 ≤ e ≤ 2, t = 3:

[Figure: the representable numbers plotted on the number line between -4 and 4; the
spacing shrinks toward 0.]

As can be seen from the above graph, floating point numbers have as many numbers between
1 and 10 as between 10 and 100, whereas integers and fixed point numbers have as many
numbers in the interval from 0 to 1 as from 1 to 2. In other words, if numbers are rounded
to fixed point numbers, there is a constant absolute error over the whole range of numbers,
whereas for floating point numbers, there is a constant relative error over the whole range of
available numbers.
The builtin function in MATLAB to find out the largest relative spacing between
successive floating-point numbers is eps. This function depends on the implementation of
MATLAB as well as on the hardware and will give different results on different processors. If
you use a computer language other than MATLAB or FORTRAN90, where these functions
are built in, you can use the following algorithm:
% program eval_myeps
clear
format compact
% compute machine-epsilon
myeps=1.;
myepsp1=myeps+1.;
while (myepsp1>1)
  myeps=0.5*myeps;
  myepsp1=1+myeps;
end
myeps
Other builtin functions which are convenient to get an idea about the feasibility of some
numerical algorithm are realmax, the largest representable floating point number, and
realmin, the smallest floating point number which is larger than 0. All these functions eps,
realmin, realmax are implementation dependent, i.e. their results may be different on
different computer models, because the mathematical operations are wired in a different way
on the chip.
The actual numbers of valid digits of mantissa and exponent are usually not defined in


language standards, so the IEEE standard (IEEE = Institute of Electrical and Electronics
Engineers) uses for double precision

S EEEEEEEEEEE FFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFF
(bit 0: sign S, bits 1-11: exponent digits E, bits 12-63: mantissa digits F)

whereas CRAY used something like

S EEEEEEEEEEEEEEEEE FFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFF
(bit 0: sign S, bits 1-17: exponent digits E, bits 18-63: mantissa digits F)

which, due to the lower accuracy and other idiosyncrasies in rounding, has now totally
vanished. For most numerical computations, double precision is sufficient, and the errors
of single precision computations will be too large. Additionally to double precision, many
manufacturers offered REAL*16/quadruple precision, which usually is considerably slower
than double precision.
Modern compilers and processors, such as the Pentium 4 and the G4/G5, allow faster
computations if the compiler options allow rounding for double precision, so that the results
will be considerably less accurate than double precision/16 digits, but still more accurate
than single precision/8 digits.
That 4/8 Byte are used does not mean that all compiler functions which operate on these
data types are computed with the full accuracy, correct up to the last bit. The IEEE standard
defines that all results have to be given in such a way that only the last/least significant bit
is rounded. Because this can become quite costly, one can usually choose compiler options
which offer higher accuracy but not so good performance, or faster but less accurate code.
Self-implemented routines may suffer from additional errors, which will be discussed in the
next sections.

3.3 Checking for Equality

Concerning what has been said in this section about accuracy, there are some things one can
do with integers which one should not do with floating point numbers. For a start, do not
check floating point numbers for equality

if (a==b)

but check for equality with a certain error ε. Be sure whether you need the absolute error

n=10
epsilon=10^(-n)
if (abs(a-b)<epsilon)

or the relative error, which for two values a, b ≠ 0 can be defined as

n=10
epsilon=10^(-n)
if (abs(a-b)<epsilon*max(abs(a),abs(b)))


The last example is much safer than writing it as

n=10
epsilon=10^(-n)
if (abs(a-b)/max(abs(a),abs(b))<epsilon) % don't do this!

which will crash the program in case a = b = 0. Moreover, a multiplication can be
executed faster than a division, so if the if-condition is inside an often-executed loop, the
division can slow down the execution of the loop considerably.

3.4 Impossible Numbers

Several mathematical operations are not well defined, like dividing by 0, or computing the
real value of an asin of a number with an absolute value larger than 1. The computer has to
do something if the operation is mathematically undefined or meaningless. Programs in
compiler languages like C and FORTRAN usually crash, and leave it to the programmer to
find out where the error occurred.
If a variable is defined in FORTRAN as real, the result must also be real, so expressions like
sqrt(-1) or asin(1.5) crash the program. As the elementary data type in MATLAB is the
complex array, such operations give in MATLAB the correct result, e.g.

> sqrt(-1)
ans = 0 + 1i
> asin(1.5)
ans = 1.57080 - 0.96242i

This may become a problem if the expected result is indeed real, but very near the undefined
value, e.g. if the result without rounding error should be 1, but due to rounding it is e.g.
1.000000000001, and the asin computed from it is

> asin(1.000000000001)
ans = 1.5708e+00 - 1.4143e-06i

so that the computation will be continued with a complex part. In such cases, the input
should always be checked with an if-statement whether it conforms to the expectations.
There is also an IEEE standard which defines such exceptions, e.g. what should be done if
a number is divided by 0. The result is stored in a bit pattern which is output as NaN,
"Not a Number". MATLAB is a bit more sophisticated. For a start, it gives the correct
result for the division:

> 4/0
warning: division by zero
ans = Inf
> -2/0
warning: division by zero
ans = -Inf

For Inf, the usual rules apply, but some cases are different:


> Inf+3
ans = Inf
> Inf+Inf
ans = Inf
> Inf/Inf
ans = NaN
> Inf-Inf
ans = NaN
When tested for equality via the ==-operator, one idiosyncrasy is that Infinity is always
equal to Infinity in MATLAB, but NaN is always unequal to NaN:
> 4==4
ans = 1
> Inf==Inf
ans = 1
> NaN==NaN
ans = 0
and for tests for NaN, the isnan function must be used:

> isnan(4)
ans = 0
> isnan(NaN)
ans = 1
To test which numbers are the largest and smallest, the MATLAB functions realmin and
realmax can be used. Because Inf, -Inf and NaN must be represented as floating-point
bit patterns in MATLAB, there are about three to four bit patterns less available in MATLAB
than in compilers for e.g. FORTRAN or C which don't use Inf and NaN. Because the
bit patterns of the largest numbers are used, the largest representable floating-point number
is smaller than in the compilers.

3.5 Errors

As we have seen in the previous chapter, the representation of real numbers as floating point
approximations leads intrinsically to rounding errors. In the following, we will treat additional
sources of error which occur in the evaluation of algebraic equations.

3.5.1 Truncation error

Function evaluation

Many mathematical expressions are defined as an infinite process; for example, the
exponential function is

    exp(x) = 1 + x/1! + x^2/2! + x^3/3! + ...        (3.1)

The error which results when e.g. the infinite series is instead computed with only a finite
number of operations, i.e. truncated after a finite step, is called the truncation error. In fact,
if in a given interval a function f(x) is given by an infinite polynomial series with series
coefficients a_1, a_2, a_3, ..., and the series is truncated after n steps, an approximation

    f(x) = Σ_{i=0}^{n} a'_i x^i + O(x^{n+1})         (3.2)

can be found which has a smaller error than the truncated series using the coefficients a_i of
the infinite series. Such a series is called an n-th order approximation of f(x), which often
makes use of the expansion of the function in terms of orthogonal polynomials[1].
Whereas the exponential function exp(x) is defined by the infinite series with the coefficients

    exp(x) = Σ_{n=0}^{∞} x^n / n!,

the best finite approximation to exp(x) in the interval 0 ≤ x ≤ ln 2 with 10 digits is

    exp(x) = a_0 + a_1 x + a_2 x^2 + a_3 x^3 + a_4 x^4 + a_5 x^5 + a_6 x^6 + a_7 x^7 + ε(x)

with |ε(x)| ≤ 2·10^-10; the coefficients are given in Tab. 3.2. Be aware that the coefficients
of the truncated polynomial approximation depend on the interval for which the
approximation should be used, to minimize the error.

n    a_n               (-1)^n/n!
0     1.00000 00000     1.000000000000000000
1    -0.99999 99995    -1.000000000000000000
2     0.49999 99206     0.500000000000000000
3    -0.16666 53019    -0.166666666666666657
4     0.04165 73475     0.041666666666666664
5    -0.00830 13598    -0.008333333333333333
6     0.00132 98820     0.001388888888888889
7    -0.00014 131261   -0.000198412698412698

Table 3.2: Coefficients for the polynomial approximation of exp(x) in the interval
0 ≤ x ≤ ln 2 (middle column) and the corresponding coefficients of the infinite Taylor
series.
In practice, many transcendental functions f(x) which are introduced in elementary classes
of mathematical lessons are numerically better approximated by approximations other than
polynomials, e.g. by making explicit use of divisions, which can itself mimic operations of
infinite order in x, either via Pade approximation (quotient of two polynomial expressions)
or via continued fractions.
An effective strategy, especially with periodic functions, is argument reduction, so that one
does not have to compute the Taylor series for large x, but for a small x near the origin,

[1] Chap. 22, Handbook of Mathematical Functions, M. Abramowitz, I. Stegun, National
Bureau of Standards.


by either shifting the argument of the periodic functions like sin, cos into the interval
[0, π/4], or by decomposing the argument into the sum of an integer part and a non-integer
part, like in the case of the exponential function, where one computes

    exp(x) = exp(m + f) = exp(m) exp(f),   m integer, |f| < 1.

Many approximations of transcendental functions can be found in Abramowitz/Stegun[2].
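The decomposition can be sketched as follows (the example argument x=4.7 and the choice of 10 Taylor terms are arbitrary; a library implementation would use a tuned approximation for exp(f) instead of the plain Taylor series):

```matlab
% Sketch: argument reduction exp(x)=exp(m)*exp(f), m integer, |f|<=1/2
x=4.7;                          % arbitrary example argument
m=round(x);                     % integer part, here 5
f=x-m;                          % reduced argument, here -0.3
% short Taylor series for exp(f); converges fast because |f| is small
expf=sum(f.^(0:10)./factorial(0:10));
myexp=exp(1)^m*expf;            % exp(m) = e^m
myexp-exp(x)                    % the difference is tiny
```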
Other examples for truncation error can be found in other series expansion methods, e.g.
Fourier series truncated after a certain number of coefficients, or Pade approximations, where
an analytical function

    f(x) = ( Σ_{i=1}^{∞} a_i x^i ) / ( Σ_{i=1}^{∞} b_i x^i )

is approximated by the truncated Pade approximation

    f_approx(x) = ( Σ_{i=1}^{n} a'_i x^i ) / ( Σ_{i=1}^{m} b'_i x^i ).

3.5.2 Rounding error
Because we have only a finite number of digits available, when we try e.g. in Octave to
compute 5/9, we get

> format long
> 5/9
ans = 0.555555555555556

So, first of all, it is not necessary to input 5./9. as in FORTRAN when one wants to use
floating point numbers. On the other hand, one sees that the periodic fraction which is the
result must be rounded to 16 decimal digits.
Therefore, when we compute the exponential function in the following program,
Therefore, when we compute the exponential function in the following program,
% Example for rounding error in computing transcendental functions
clear
format compact
format long
x=-20.5
n_iter=100
myexp=0.
for i=0:n_iter-1
% Compute the Taylor-series for the exp-function
% x!=gamma(x+1)
myexp=myexp+x^i/gamma(i+1)
end
exp(x)
return
[2] Handbook of Mathematical Functions, M. Abramowitz, I. Stegun, National Bureau of
Standards.


we obtain
myexp=-4.422614950123058e-07
as a result, instead of the correct
exp(x) = 1.250152866386743e 09.
As we see, the result is so wrong that not even its sign is correct: we get
a negative value for a computation which should always give positive values. The problem is
not the range of the numbers, because the smallest number representable in MATLAB
precision is about 10^-300, much smaller than the correct result of about 10^-9. Neither is it a
truncation error, as we are still adding Taylor terms, even though the result no longer
changes after about the 95th iteration. The problem is that we try to add contributions which
are smaller than the last digit of the running sum: the terms of this alternating series grow
to about 10^8 in magnitude, so their rounding errors swamp the tiny result.
There are possibilities to circumvent such problems, which will be explained later in
the lecture.
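One such possibility can already be sketched here: for a negative argument, sum the series for |x|, where all terms are positive and nothing cancels, and then use exp(−|x|) = 1/exp(|x|). A Python sketch with our own function names:

```python
import math

def taylor_exp(x, terms=150):
    # Plain Taylor sum of exp(x), building x^i/i! incrementally.
    s, term = 0.0, 1.0
    for i in range(terms):
        s += term
        term *= x / (i + 1)
    return s

def safe_exp(x):
    # For x < 0 the terms alternate in sign and cancel catastrophically;
    # summing for |x| > 0 avoids the cancellation entirely.
    return taylor_exp(x) if x >= 0 else 1.0 / taylor_exp(-x)

print(safe_exp(-20.5))   # ~1.25e-09, with the correct sign this time
```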

3.5.3  Catastrophic cancellation

Even for the last bit of a floating-point function evaluation in double precision, which gives
about 16 digits of accuracy, the 17th digit is of course wrong. The subtraction of numbers of
nearly equal size shifts these invalid digits to the front, so that few or no valid digits
remain in the result. For expressions like
a=cos(x)^2-sin(x)^2
this gives dubious results whenever the argument x is close to an odd multiple of π/4, with an
arbitrary number of canceled digits. The problem can simply be circumvented by using the
trigonometric identity cos(2x) = cos(x)² − sin(x)², so that

    a = cos(2x)                                        (3.3)

always gives the result with the accuracy of the library's cos evaluation.
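A quick numerical check of the cancellation, sketched in Python: near an odd multiple of π/4 the naive expression typically loses several digits, while cos(2x) does not.

```python
import math

x = math.pi / 4 + 1e-9                     # close to an odd multiple of pi/4
naive = math.cos(x)**2 - math.sin(x)**2    # two values near 0.5 cancel
stable = math.cos(2 * x)                   # no cancellation
print(naive, stable)                       # both are ~ -2e-9, naive has fewer valid digits
```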

3.6  Good and bad directions in Numerics

The following integral is positive, because the integrand is positive in the whole
integration interval [0,1]:

    E_n = ∫_0^1 x^n e^(x-1) dx,   n = 1, 2, . . .

From partial integration we obtain a relation between E_n and E_{n-1}, which can be used to
iteratively compute E_n if we have E_1 given:

    ∫_0^1 x^n e^(x-1) dx = [x^n e^(x-1)]_0^1 − n ∫_0^1 x^(n-1) e^(x-1) dx
                         = 1 − n ∫_0^1 x^(n-1) e^(x-1) dx,

so that

    E_n = 1 − n E_{n-1},   n = 2, . . .


In a REAL*8 implementation, starting with E_1 = exp(−1), we obtain the results in the
following table. We can be sure that at least E_18, and
therefore all the following values, are wrong, because the
result should not become negative in the first place.

E1    0.367879441171442
E2    0.264241117657115
E3    0.207276647028654
E4    0.170893411885384
E5    0.145532940573080
E6    0.126802356561519
E7    0.112383504069363
E8    0.100931967445092
E9    0.09161229299417073
E10   0.08387707005829270
E11   0.07735222935878028
E12   0.07177324769463667
E13   0.06694777996972334
E14   0.06273108042387321
E15   0.05903379364190187
E16   0.05545930172957014
E17   0.05719187059730757
E18   -0.02945367075153627
E19   1.559619744279189
E20   -30.19239488558378

Because the E_n should fall monotonically, the term n E_{n-1} in the iteration
E_n = 1 − n E_{n-1} approaches 1, and the correct
information in the iteration is quickly annihilated, so that
only the erroneous last digits survive. If we instead use the
reordered equation

    E_n = 1 − n E_{n-1},   n = 2, . . .                     (3.4)
    so that  E_{n-1} = (1 − E_n)/n,   n = . . . , 3, 2      (3.5)

we can approximate the starting value:

    E_n = ∫_0^1 x^n e^(x-1) dx                              (3.6)
        ≤ ∫_0^1 x^n dx                                      (3.7)
        = [x^(n+1)/(n+1)]_0^1                               (3.8)
        = 1/(n+1),                                          (3.9)

which shows that for very large n, E_n is very small, so E_n ≈ 0 is an educated
guess for large n. Here we start the backward iteration with the deliberately crude value
E_20 = 0.5. The output of the backward iteration E20 → E1 is given on the left of the
following table, together with the output of the forward iteration E1 → E20 on the right.
      backward E20 → E1        forward E1 → E20
E1    0.3678794411714423       0.3678794411714423
E2    0.2642411176571154       0.2642411176571153
E3    0.2072766470286539       0.207276647028654
E4    0.1708934118853843       0.170893411885384
E5    0.1455329405730786       0.1455329405730801
E6    0.1268023565615286       0.1268023565615195
E7    0.1123835040692999       0.1123835040693635
E8    0.1009319674456008       0.1009319674450921
E9    0.09161229298959281      0.09161229299417073
E10   0.083877070104072        0.0838770700582927
E11   0.07735222885520793      0.07735222935878028
E12   0.0717732537375049       0.07177324769463667
E13   0.06694770141243632      0.06694777996972334
E14   0.06273218022589153      0.06273108042387321
E15   0.05901729661162711      0.05903379364190187
E16   0.05572325421396629      0.05545930172957014
E17   0.0527046783625731       0.05719187059730757
E18   0.05131578947368421      -0.02945367075153627
E19   0.025                    1.559619744279189
E20   0.5                      -30.19239488558378

As can be seen, the backward iteration with the wrong starting value converges to the
correct end value exp(−1), whereas the forward iteration with the correct starting value
converges to a wrong result. This illustrates the art of numerical computing: to obtain a
correct end result with a good routine and a wrong starting value, rather than a
wrong end result with a correct starting value but a bad routine.
It will later become obvious in this course that integration is the good direction
in numerical computing, which can decrease initial errors, whereas differentiation is the
bad direction, which can increase initial errors. This is in contrast to manual calculation,
where differentiation is easier to treat than integration.
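The two directions of the iteration can be reproduced in a few lines; here a Python sketch (the MATLAB version is analogous, and the function names are ours):

```python
import math

def forward(n_max):
    # E_n = 1 - n*E_(n-1), starting from the exact E_1 = exp(-1):
    # any error is multiplied by n in every step and explodes.
    E = math.exp(-1)
    for n in range(2, n_max + 1):
        E = 1 - n * E
    return E

def backward(n_max, start=0.5):
    # E_(n-1) = (1 - E_n)/n, starting from a crude guess for E_20:
    # any error is divided by n in every step and dies out.
    E = start
    for n in range(n_max, 1, -1):
        E = (1 - E) / n
    return E

print(forward(20))    # wildly wrong, around -30
print(backward(20))   # agrees with exp(-1) to machine precision
```

Even a grossly different starting guess for E_20 is forgiven, since the initial error is divided by 20! on the way down to E_1.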

3.7  Calculus and order of Methods

3.7.1  Taylor Approximation revisited

Many functions can be approximated by a polynomial series

    f(x) = Σ_{i=0}^∞ a_i x^i

in such a way that around the point x_0, with the νth derivative f^(ν)(x_0), we have

    f(x) = Σ_{ν=0}^∞ f^(ν)(x_0)/ν! · (x − x_0)^ν.

[Figure: Taylor approximations of sin(x) of 1st, 3rd, 5th and 7th order, and of cos(x) of 0th, 2nd, 4th and 6th order.]

For the functions exp(t), sin(t), cos(t), the Taylor series
are given below:

    exp(t) = Σ_{n=0}^∞ t^n/n!                 = 1 + t/1! + t²/2! + t³/3! + . . .
    sin(t) = Σ_{n=0}^∞ (−1)^n t^(2n+1)/(2n+1)! = t − t³/3! + t⁵/5! − . . .
    cos(t) = Σ_{n=0}^∞ (−1)^n t^(2n)/(2n)!     = 1 − t²/2! + t⁴/4! − t⁶/6! + . . .

[Figure: Taylor approximations of exp(x) of 0th to 3rd order.]

If we truncate the (for transcendental functions infinite) series after a finite number of terms,
we obtain the Taylor approximation. The evaluation of a Taylor approximation, e.g. of
fourth order with the coefficients a, b, c, d, e,

    f(x) = a + bx + cx² + dx³ + ex⁴,

can be done in an efficient and in an inefficient way. Using the above formula directly,
we can write
f(x)=a+b*x+c*x*x+d*x*x*x+e*x*x*x*x

so that we need four additions and ten multiplications. If we place brackets around the expression
in a skilled way (Horner's scheme), four additions and four multiplications are sufficient:
f(x)=a+(b+(c+(d+e*x)*x)*x)*x
It is easy to write down the derivative of the above polynomial as
f'(x)=b+(2*c+(3*d+4*e*x)*x)*x
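The bracketed form corresponds to one multiplication and one addition per coefficient; a Python sketch with our own helper name horner:

```python
def horner(coeffs, x):
    # coeffs = [a, b, c, d, e] for a + b*x + c*x^2 + ...
    # (note that MATLAB's polyval expects the opposite ordering).
    result = 0.0
    for c in reversed(coeffs):
        result = result * x + c   # one multiplication, one addition
    return result

print(horner([1, 0, -1], 2.0))   # 1 - x^2 at x = 2 gives -3.0
```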
In MATLAB, the evaluation of polynomials is implemented with the function polyval, the
derivative with polyder, but the order of the coefficients is the opposite
from the above example, and the graph can be seen on the right:
clear, format compact
P=[1 0 -1]
x=linspace(-3,3,100);
y=polyval(P,x);
P_deriv=polyder(P);
y_deriv=polyval(P_deriv,x);
plot(x,y,'-',x,y_deriv,'--')
legend('f(x)=x^2-1','d/dx f(x)=2x')
grid
axis image

Typical functions which cannot be approximated by
Taylor series are functions with a jump, like the sign
function sign(x) = x/|x|.
Because the Taylor series is an infinite series, one needs
comparatively many terms to obtain a good approximation. If convergence is sought only on a
finite interval, the Chebyshev approximation, which minimizes
the error over a finite interval, is usually a much better
approximation for the same number of terms.

3.7.2  Integration I

In the same way that many transcendental functions can be represented by an infinite Taylor
series but approximated by a finite polynomial in x, integrals and derivatives can be
approximated by replacing the infinitely small differential dx by the finite difference Δx,
and the error can be expressed as a power of Δx, as in the approximation of transcendental
functions by finite power series. The simplest method to numerically compute an integral

    I = ∫_a^b f(x) dx                                          (3.10)

consists of the simple evaluation of the corresponding Riemann sum as

    I^(1) = Δx Σ_i f(x_i),   x_i ∈ {a, a + Δx, . . . , b − 2Δx, b − Δx},   (3.11)

where b − a is a multiple of Δx, and the integration points are spaced equidistantly.³ I^(1)
means that the method is of first order in Δx; the error is of the order of Δx².
Numerical integration is sometimes called quadrature, maybe from the time when the
integral was approximated numerically by drawing squares under the graph, and this box
counting was the first non-analytical quadrature. As an example, let us compute the
integral

    ∫_a^b exp(−x²) dx = (√π/2) (erf(b) − erf(a)),

which is a bit unintuitive because it needs the error function erf to be represented
analytically. With the integration bounds [0, 1], the integral is, with about 15 digits accuracy,

    ∫_0^1 exp(−x²) dx = 0.7468241328124270.
Now let us approximate this integral with the rectangle midpoint rule, where we replace the
integral with a Riemann sum over the n − 1 intervals of equal width h between the n corner
points, with the function evaluated in the middle of each interval instead of at its left
or right end:

    ∫_0^1 f(x) dx ≈ h (f(x̃_1) + f(x̃_2) + . . . + f(x̃_{n-1})),   x̃_i the midpoint of the ith interval.

clear
format long
n=101       % n odd !
dx=1/(n-1)  % stepsize
xrect=[dx/2:dx:1-dx/2];
yrect=exp(-xrect.*xrect);
sum(yrect)*dx

[Figure: Integration with the rectangle midpoint rule.]

For 100 intervals / 101 corner points, the
result 0.74682719849232 is correct up to 5
digits for the rectangle midpoint rule, which is amazing, as we used only 100 points. In our
Monte-Carlo evaluation of π, we needed on the order of 10000 points when we wanted an accuracy
of only two digits.
Very often, numerical integration methods are introduced not via the rectangle midpoint
rule but via the trapeze rule, which is slightly more complicated than the midpoint rule, as
each integration interval must be approximated as a trapeze:

    ∫ f(x) dx ≈ (h/2) (f(x_0) + 2f(x_1) + 2f(x_2) + . . . + f(x_n))

³There are methods which don't choose the points equidistantly, but optimize the choice of points
so that the most accurate approximation is obtained with the minimum number of points.
Instead of evaluating the left and right
bound of each interval, we count the
function values between the upper and
lower integration bounds once, and the function
values at the upper and lower bound only
half. This corresponds to the summation
over the trapeze areas:

clear
format long
n=101       % n odd !
dx=1/(n-1)  % stepsize
xtrap=[0:dx:1];
ytrap=exp(-xtrap.*xtrap);
(sum(ytrap)-.5*(ytrap(1)+ytrap(n)))*dx

[Figure: Integration with the trapeze rule.]

Surprisingly, the result of 0.74681800146797 is one digit less accurate than the result with the
midpoint rule, though the program was more complicated, because we had to think about
a proper way to implement the trapeze shape for each interval. If we think about a graph
with mostly negative curvature, the trapeze rule will end up with an approximation which is
constantly below the true function value. The rectangles of the midpoint rule are partly
above, partly below the graph, so that there is error compensation already within one interval.
In the rectangle midpoint rule, we have chosen the quadrature points in the middle of the
interval. If we had chosen the values for the function evaluation at the left/right
boundary of each interval, we would have obtained 0.74997860426211 / 0.74365739867383,
considerably less accurate than the rectangle midpoint rule.
It can be shown⁴ that the
midpoint rule has an error of

    (1/24) Σ_i h³ f''(x_i),                            (3.12)

smaller than that of the trapeze rule, which
has an error of

    (1/12) Σ_i h³ f''(x_i),                            (3.13)

so it is surprising that textbooks usually introduce numerical quadrature via the trapeze rule.

⁴G.E. Forsythe, M. Malcolm, C. Moler, Computer Methods for Mathematical Computations, Prentice Hall 1977

Because both formulae are correct up to the second power of h, and the error is of the third
power, they are called formulae of second order.

[Figure: Integral over a convex curve: the trapeze rule underestimates, the rectangle midpoint rule overestimates.]

More accurate, namely of third order, is the composite Simpson rule S(f), which makes use
of combining the rectangle midpoint rule R(f) and the trapeze rule T(f). When we compare
the integrals for the midpoint rule and for the trapeze rule, we see that in our integration

interval with a convex function, the trapeze rule always gives a too small result, and the
midpoint rule always gives a too large result. Therefore, if we average R(f) and T(f), we get a
better result than from R(f) or T(f) alone. Because the error of T(f) (Eqn. 3.13) is twice as
large as the error of R(f) (Eqn. 3.12), we should not take the direct average ½R(f) + ½T(f),
but the weighted average in which the errors of both rules cancel:

    S(f) = (2/3) R(f) + (1/3) T(f).
Its error can be shown⁵ to be of the order of

    (1/2880) Σ_i h⁴ f''''(x_i).

For our example with the integral from 0 to 1 over exp(−x²), we obtain 0.746824132817537
as the result, instead of the exact 0.7468241328124270 . . . .
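The weighted average S(f) = (2/3)R(f) + (1/3)T(f) is easy to check numerically; a Python sketch with our own helper names:

```python
import math

def midpoint(f, a, b, n):
    # Rectangle midpoint rule R(f) with n intervals.
    h = (b - a) / n
    return h * sum(f(a + (i + 0.5) * h) for i in range(n))

def trapeze(f, a, b, n):
    # Trapeze rule T(f): the endpoints are weighted half.
    h = (b - a) / n
    return h * (0.5 * f(a) + sum(f(a + i * h) for i in range(1, n)) + 0.5 * f(b))

def simpson_combined(f, a, b, n):
    # The trapeze error is twice the midpoint error and of opposite
    # sign, so this weighted average cancels the leading error terms.
    return (2 * midpoint(f, a, b, n) + trapeze(f, a, b, n)) / 3

f = lambda x: math.exp(-x * x)
print(simpson_combined(f, 0.0, 1.0, 100))   # close to 0.7468241328124270
```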
In our second-order formulae, we tried to approximate the graph with straight lines and
integrated the area below the curve. A parabola is determined by 3 points, and therefore one
can also try to approximate the graph via a parabola instead of a straight line, to obtain a
Simpson rule directly by supplying three integration points for each interval.
It is therefore necessary to have an odd number of integration points. The direct derivation
of the Simpson rule can be done for an integration interval of length 2h by inserting
the Taylor expansion of the function f(x) with the νth derivatives f^(ν) around the point x_0,

    f(x) = Σ_{ν=0}^∞ f^(ν)(x_0)/ν! · (x − x_0)^ν,

instead of the function f(x) itself, which yields:
instead of the function f (x) itself yields:

2h %

2h

f (x)dx =

%%

'

f (h) + f (h) x + f (h) x + . . . dx

2h %

f (h) + f (0) f (2h)/(2h) x +

f (0) 2f (h) + f (2h)/(2h)

'

g(x)

f(x)

dx =

h
0
h
2h 3h 4h
(f (0) + 4f (h) + f (2h))
3
Using this formula for the single interval of length 2h, we can compose the formula for the
whole integration over [a, b] with the integration points from x_0 = a to x_2n = b:

    ∫ f(x) dx ≈ (h/3) (f(x_0) + 4f(x_1) + 2f(x_2) + 4f(x_3) + . . . + f(x_2n)).

⁵G.E. Forsythe, M. Malcolm, C. Moler, Computer Methods for Mathematical Computations, Prentice Hall 1977

The MATLAB program is

clear
format long,format compact
n=101
% n odd !
dx=1/(n-1)
% Stepsize
xsimp=[0:dx:1];
ysimp=exp(-xsimp.*xsimp);
(4*sum(ysimp(2:2:n))+2*sum(ysimp(3:2:n-1))+ysimp(1)+ysimp(n))*dx/3
which gives the result 0.74682413289418, slightly worse than the composite Simpson rule.
In the following table, we compare the errors of the different orders by marking the correct
digits and introduce the big-O notation:

Method                      Integral ∫_0^1 exp(−x²) dx   Order of correctness
Exact                       0.7468241328124270 . . .     exact
Rectangle, left endpoint    0.74997860426211             O(h)
Rectangle, right endpoint   0.74365739867383             O(h)
Rectangle, midpoint         0.74682719849232             O(h²)
Trapeze rule                0.74681800146797             O(h²)
Simpson                     0.74682413289418             O(h³)
Composite Simpson           0.746824132817537            O(h³)

Several conclusions can be drawn from viewing the above table, which also hold for other
numerical methods which have an intrinsic truncation error:
1. If a formula is of nth order and a discretization of 1/100 of the interval length is used,
   for a first-order method the error is about 1/100 = 1%, for a second-order
   method it will be about 1/10,000, and for a third-order method about 1/1,000,000.
   (Of course, the prefactors of the error terms also have to be taken into consideration.)
2. Therefore, it is not always necessary to increase the number of discretization steps to
   obtain a more accurate result. The change from first order to second order in the
   rectangle rule resulted from just shifting the integration points by h/2.
3. If the theoretical accuracy cannot be reached, it is necessary to consider whether
   (a) the function under consideration does not fulfill the necessary criteria (smoothness
       etc.), or
   (b) there is an error in the program, resulting from incorrect prefactors, intervals
       with incorrect bounds etc. If a formula of second order gives results with an error
       proportional to 1/(number of points), then the interval bounds are usually determined
       wrongly.
4. Be aware that it is also not possible to integrate functions numerically if their integral
   has no solution due to divergence etc.
In this section, we have discussed the error resulting from the integration over a whole
interval. This is also called the global error, in contrast to the local error, which occurs in
the approximation of a single interval. Numerical methods suffering from truncation error
differ in whether the global error is the same as the local error, larger than the local error
(as for many solvers for differential equations which do not conserve energy), or smaller than
the local error (error compensation, as in the case of the Simpson integration).
One generally should be very careful in using a method with low-order accuracy and a small
time step. First of all, for many problems, such a method can become quite time-consuming.
Furthermore, the more function evaluations occur, the more rounding errors are accumulated.
The diagram to the right shows the cost-performance diagram, the number of integration
points plotted with respect to the accuracy. Cost-performance diagrams vary depending on
the evaluated functions. As can be seen, beyond 1000 integration points, the accuracy of the
Simpson method has already reached the limit of 16 digits of the double precision accuracy,
and therefore the accuracy of the integral evaluation cannot be increased further by
increasing the number of integration points.

[Figure: Relative accuracy for integrating exp(−x²) between 0 and 1, plotted against the number of integration points, for the rectangular rule (left and right boundary), the trapeze rule, the rectangular midpoint rule and the Simpson rule.]

There are integration formulas which are easier to use than the numerical approximations
of the Riemann sum we introduced here, which are called Newton-Cotes formulae. The
midpoint rule is called an open Newton-Cotes formula, because the endpoints of the
integration interval are not evaluated; the formulae for which the endpoints must be
evaluated are called closed Newton-Cotes formulae. The following table shows the Taylor
expansion for a single interval of length h for Newton-Cotes formulae of different order with
the corresponding error term:

Name          Integral formula                                                                    Error term
Trapeze rule  ∫_x1^x2 f(x)dx = h[(1/2)f_1 + (1/2)f_2]                                             +O(h³ f'')
Simpson       ∫_x1^x3 f(x)dx = h[(1/3)f_1 + (4/3)f_2 + (1/3)f_3]                                  +O(h⁵ f^(4))
Simpson 3/8   ∫_x1^x4 f(x)dx = h[(3/8)f_1 + (9/8)f_2 + (9/8)f_3 + (3/8)f_4]                       +O(h⁵ f^(4))
Bode          ∫_x1^x5 f(x)dx = h[(14/45)f_1 + (64/45)f_2 + (24/45)f_3 + (64/45)f_4 + (14/45)f_5]  +O(h⁷ f^(6))

Carrying the error compensation in formulas with truncation error further to higher orders,
by combining low-order methods as in the case of the composite Simpson rule so that a
higher-order method results, is called Romberg integration. If the limit for infinitely high
orders is taken, this is called Richardson extrapolation, and these ideas can also be applied
to differentiation and the numerical solution of differential equations.

3.7.3  Differentiation I

In the same way one can derive the Newton-Cotes formulae for integrals from their Taylor
expansion, as in the previous section, one can derive formulae for the derivatives using the
Taylor expansion⁶. Such approximations are often called finite difference formulae, as they
approximate the differential with a finite difference. For data which take the values
f_{i-2}, f_{i-1}, f_i, f_{i+1}, f_{i+2} at equidistant points, we get the following finite
difference schemes for first-order derivatives:

Name                  Finite difference scheme                              Leading error
Forward difference    (f_{i+1} − f_i)/Δx                                    Δx f''(x)/2
Backward difference   (f_i − f_{i-1})/Δx                                    Δx f''(x)/2
3-point symmetric     (f_{i+1} − f_{i-1})/(2Δx)                             Δx² f'''(x)/6
3-point asymmetric    (−1.5f_i + 2f_{i+1} − 0.5f_{i+2})/Δx                  Δx² f'''(x)/3
5-point symmetric     (f_{i-2} − 8f_{i-1} + 8f_{i+1} − f_{i+2})/(12Δx)      Δx⁴ f^(5)(x)/30

Note that the coefficients in front of f_{i-2}, f_{i-1}, f_i, f_{i+1}, f_{i+2} have to add up
to 0. For second-order derivatives, similar schemes are written down in the following table,
and again the coefficients add up to 0:

Name                  Finite difference scheme                                        Leading error
3-point symmetric     (f_{i-1} − 2f_i + f_{i+1})/Δx²                                  Δx² f''''(x)/12
3-point asymmetric    (f_i − 2f_{i+1} + f_{i+2})/Δx²                                  Δx f'''(x)
5-point symmetric     (−f_{i-2} + 16f_{i-1} − 30f_i + 16f_{i+1} − f_{i+2})/(12Δx²)    Δx⁴ f^(6)(x)/90
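The difference in order between the schemes is easy to observe numerically; a Python sketch comparing the forward and the 3-point symmetric (central) difference for d/dx sin(x) = cos(x):

```python
import math

def forward_diff(f, x, h):
    return (f(x + h) - f(x)) / h             # first order in h

def central_diff(f, x, h):
    return (f(x + h) - f(x - h)) / (2 * h)   # second order in h

# d/dx sin(x) = cos(x) at x = 1, step h = 1e-3:
x, h = 1.0, 1e-3
err_fwd = abs(forward_diff(math.sin, x, h) - math.cos(x))
err_cen = abs(central_diff(math.sin, x, h) - math.cos(x))
print(err_fwd, err_cen)   # the central error is orders of magnitude smaller
```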

In contrast to numerical integration, which smoothes out errors via error compensation,
numerical differentiation roughens up the solution. If high accuracy is desired, there are
usually better solutions than computing the derivatives directly via finite difference
schemes. The graph below shows the numerical integral of

    ∫_0^x sin(y) dy = 1 − cos(x)

and the numerical differential of sin(x),

    d/dx sin(x) = cos(x),

with some additional noise of 1% in sin(x). The graph was produced with the following
program:
clear
format compact
nstep=200
x=linspace(0,4*pi,nstep);
dx=mean(diff(x));
idx=1/dx;
y=sin(x-dx/2)+0.01*(rand(size(x))-.5);
subplot(3,1,1)
⁶Clive A.J. Fletcher, Computational Techniques for Fluid Dynamics, Vol. 1, 2nd ed., Springer 1990

[Figure: the noisy sin(x), its numerical derivative and integral compared with cos(x), and the absolute and relative errors of both (top to bottom).]
plot(x,y,'-.',x(1:nstep-1),diff(y)*idx,...
     x(1:nstep-1),cos(x(1:nstep-1)),':',...
     x(1:nstep-1),1-cumsum(y(1:nstep-1))*dx,'--')
axis tight
legend('sin(x)+-0.005*rand','d/dx sin(x)','cos(x)','1-int(sin(x))')
subplot(3,1,2)
plot(x(1:nstep-1),diff(y)*idx-cos(x(1:nstep-1)),...
     x(1:nstep-1),1-cumsum(y(1:nstep-1))*dx-cos(x(1:nstep-1)),':')
legend('cos(x)-d/dx sin(x)','cos(x)-1+int(sin(x))')
title('absolute error')
axis tight
subplot(3,1,3)
plot(x(1:nstep-1),((diff(y)*idx-cos(x(1:nstep-1)))./cos(x(1:nstep-1))),...
     x(1:nstep-1),(1-cumsum(y(1:nstep-1))*dx-cos(x(1:nstep-1)))...
     ./cos(x(1:nstep-1)),':')
legend('(cos(x)-d/dx sin(x))/cos(x)','(cos(x)-1+int(sin(x)))/cos(x)')
axis tight
title('relative error')
return
Both the differential and the integral should give cos(x), but the differential is so noisy that
the result deviates visibly from the exact solution. The integral over the noisy data gives
nevertheless a smooth curve. This is again a case of a good and a bad direction of
numerical computing, as we encountered before by rewriting the iterative computation of the
equation

    E_n = 1 − n E_{n-1}   (numerically unstable)

into

    E_{n-1} = (1 − E_n)/n   (numerically stable).

As can be seen, for differentiation and its inverse operation, integration, in numerical
analysis, differentiation constitutes the bad direction, and integration the good direction.
In numerical analysis, integrals, also of higher order, can usually be computed with
sufficient precision, in contrast to derivatives, whereas in analytical calculations it is
usually possible to compute derivatives, but very often the computation of closed forms for
integrals is problematic.
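The good and the bad direction can be demonstrated on noisy data; a Python sketch of the experiment above (a left Riemann sum instead of MATLAB's cumsum, and our own variable names):

```python
import math, random

random.seed(0)
n = 2000
dx = 4 * math.pi / n
x = [i * dx for i in range(n)]
y = [math.sin(v) + 0.01 * (random.random() - 0.5) for v in x]  # 1% noise

# Differentiation divides the noise by dx and so amplifies it ...
deriv = [(y[i + 1] - y[i]) / dx for i in range(n - 1)]
err_d = max(abs(deriv[i] - math.cos(x[i])) for i in range(n - 1))

# ... while integration (a left Riemann sum here) averages it away.
integral, err_i = 0.0, 0.0
for i in range(n - 1):
    integral += y[i] * dx
    err_i = max(err_i, abs(integral - (1 - math.cos(x[i + 1]))))

print(err_d, err_i)   # the derivative error is far larger
```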
Exercises:
1. Write a program which produces floating-point numbers for base 2 and mantissa 4, as well
as for base 4 with mantissa 2.
a) Choose the exponent so that both number systems are roughly comparable.
b) Plot the position of the numbers.
c) Compare both number systems: which number system can be supposed to have the better
roundoff properties?
2. Write a program which computes the exponential function exp(x) using the Taylor series
and one program which computes the exponential function by evaluating the integer part of
x using powers of the Euler number e and the non-integer part using the Taylor series. For
which size of the arguments become the

Chapter 4

Graphics

4.0.4  Initializing and manipulating vectors

Instead of using for loops for setting up vectors and matrices, it is convenient in MATLAB to
use the implicit loops provided by the colon operator : and brackets for the array constructor
[]
>> a=[3:6]
a =
     3     4     5     6

For step sizes different from one, the step size can be specified as
[lower_bound:stepsize:upper_bound], like in
>> a=[3:.5:6]
a =
    3.0000    3.5000    4.0000    4.5000    5.0000    5.5000    6.0000

This is different from loops in FORTRAN and C, where the stepsize is added as the third
argument of the loop statement. Whereas the colon operator : constructs a vector
with a given lower and upper bound for a given stepsize, [lower_bound:stepsize:upper_bound],
it is more convenient to use the linspace function if instead of the stepsize the number of
points is known:
>> a=linspace(3,6,7)
a =
    3.0000    3.5000    4.0000    4.5000    5.0000    5.5000    6.0000
There is also a function which gives vectors in logarithmic spacing; note that its first two
arguments are the exponents of the bounds, not the bounds themselves:
>> b=logspace(1,1000,3)
b =
    10   Inf   Inf
>> b=logspace(1,4,4)
b =
       10     100    1000    10000

If several vectors should be concatenated, this can be done with the brackets of the array
constructor []:
>> c=[1 3]
c =
     1     3
>> c=[4 c b]
c =
  Columns 1 through 6
           4           1           3          10         100        1000
  Column 7
       10000

After a lot of vector operations, one usually also needs functions which give information
about the vectors used. The most elementary function, which displays information about
variables, is
>> who
Your variables are:
a    ans
The length of a vector is displayed by
>> length(a)
ans =
     7
but this function makes no difference between column and row vectors. For information on
higher dimensions, one has to use the function
>> size(a)
ans =
     1     7
Vector elements can be accessed either via for loops, like in other programming languages,
as in
>> for i=1:length(b)
f(i)=2*b(i)
end
f =
    20
f =
    20   200
f =
    20   200   2000
f =
    20   200   2000   20000

or via the colon notation with : and round brackets, so that for a vector
>> c=.2:.2:1.2
c =
    0.2000    0.4000    0.6000    0.8000    1.0000    1.2000

the assignment of the second to the fourth element to a vector g can be written as
>> g=c(2:4)
g =
    0.4000    0.6000    0.8000

The whole of a vector can be assigned without specifying the bounds like in
>> h=c(:)
h =
0.2000
0.4000
0.6000
0.8000
1.0000
1.2000
If the vector from a lower bound up to the end should be assigned, this can be done via
the end statement in round brackets together with the colon operator :
>> v=c(4:end)
v =
    0.8000    1.0000    1.2000

Functions which operate on vectors are usually defined in the canonical way, that means
in the way in which one expects the function to work. The functions prod and sum acting on
a vector behave in the way one expects, i.e. they give as a result the product and the sum of
the vector elements. Whereas prod and sum act on vectors and give a scalar as a result,
the functions cumsum and cumprod, which compute the cumulative sum and the cumulative
product, give a vector as a result:
>> cumsum(1:5)
ans =
     1     3     6    10    15

One must be careful with the use of the multiplicative operators *, / and ^, which in
MATLAB are in general interpreted in the sense of numerical linear algebra, so that column
and row dimensions must match. If one wants to use these operators elementwise, one should
use their elementwise variants, which are preceded by a dot, as in .*, ./ and .^.

4.1  Setting up and manipulating Matrices

Matrices can be manipulated in the same way as vectors, preferably with the colon operator
: and the brackets of the array constructor []. Some elementary built-in MATLAB matrix
functions will be explained here, because they make matrix construction easier. The ones
function sets up a matrix with ones as every element; as usual in MATLAB, a single
argument sets up a square two-dimensional matrix:
>> ones(2)
ans =
     1     1
     1     1
For non-square matrices, two indices have to be specified, where the first is the row index
and the second is the column index, for example
>> ones(3,2)
ans =
     1     1
     1     1
     1     1
The zeros function behaves in the same way as the ones function, only that it sets up
matrices with 0 as every element:
>> zeros(2,3)
ans =
     0     0     0
     0     0     0

In linear algebra, the identity matrix is very important, and therefore the unit matrix in
MATLAB is named eye (eye-dentity / identity):
>> eye(3)
ans =
     1     0     0
     0     1     0
     0     0     1

It may be surprising, but the identity matrix is also defined for non-square matrices, as the
following example shows:
>> eye(2,5)
ans =
     1     0     0     0     0
     0     1     0     0     0

Another important matrix function is the constructor for a random matrix:
>> rand(2,4)
ans =
    0.8214    0.6154    0.9218    0.1763
    0.4447    0.7919    0.7382    0.4057

Matrices can then be constructed via matrix functions alone, like in
>> c=ones(2)-eye(2)
c =
     0     1
     1     0
or with the help of the matrix constructor brackets [], so that
>> b=zeros(2)
b =
     0     0
     0     0
>> d=[2 3
4 5]
d =
     2     3
     4     5
>> e=[c b
d c]
e =
     0     1     0     0
     1     0     0     0
     2     3     0     1
     4     5     1     0

A very convenient function, similar to linspace in one dimension, which can be used to set up
arguments for functions in higher dimensions, is the meshgrid function, whose functionality
is as follows:
>> [X,Y] = meshgrid(1:3,10:14)
X =
     1     2     3
     1     2     3
     1     2     3
     1     2     3
     1     2     3
Y =
    10    10    10
    11    11    11
    12    12    12
    13    13    13
    14    14    14

4.2  Graphs and Visualization

4.2.1  Visualizing Vectors

The elementary command in MATLAB for plotting functions etc. is the plot command.
Plots can be shown in the plotting window either alone or as one of many sub-plots, like in
the following example:
>> x=[.1:.1:.5]
x =
    0.1000    0.2000    0.3000    0.4000    0.5000
>> y=[20:-4:1]
y =
    20    16    12     8     4
>> subplot(2,2,1)
>> plot(y)
>> subplot(2,2,2)
>> plot(x,y)
which displays on the screen (note the different scale on the x-axis):

[Figure: the same decreasing data plotted against the index (left) and against x (right).]
Plots of vectors can be done either by plotting the vector directly or by specifying two
vectors, where the first will be taken as the x-axis. If the vector lengths do not match,
MATLAB issues an error message and stops the program execution. The plots are
automatically done in the sub-plot which has been called last.
A subtle way of plotting is the plot of a vector of complex numbers. If you have a complex
vector c, you can get the real part x and the imaginary part y via
x=real(c)
y=imag(c)
The command plot(c) then has the same effect as plot(x,y), which means that the
imaginary part is plotted versus the real part.
If a new plotting window should be opened, this can be done via the figure command; the
first window is created by the figure(1) command, which is automatically executed if no
plotting window is open, figure(2) opens a second plotting window and so on. Plots are
done in the window for which the figure command was called last.
There is a wide variety of ways to influence graph annotation in MATLAB:

% example for graph annotation
subplot(2,2,1)
x=[.1:.1:.5]
plot(x,x.*x,x,x.*x.*log(x))
xlabel('X-axis')
ylabel('Y-axis')
title('Plot annotations')
text(.14,.2,'any label here')
legend('x^2','x^2*log(x)')

[Figure: the annotated plot, with axis labels, title, text label and legend.]

The legend created by label can be moved with the mouse. MATLAB graphics can be saved
in various styles (Postscript, encapsulated Postscript, JPEG, .....) via the print command.
The line-style (full lines, dotted lines, symbols) can be changed via the arguments in the plot
command
plot(x,log(x),...
     x,x,':',...
     x,x.^2,'+',...
     x,x.^3,'*-')

To look at a drawing in higher resolution, use the zoom command, aim with the mouse pointer at the region which should be zoomed and click the left mouse button (the right mouse button unzooms the region again).

4.3


Visualizing Arrays

As an example in this section, we will use the Rosser matrix


>> rosser
ans =
   611   196  -192   407    -8   -52   -49    29
   196   899   113  -192   -71   -43    -8   -44
  -192   113   899   196    61    49     8    52
   407  -192   196   611     8    44    59   -23
    -8   -71    61     8   411  -599   208   208
   -52   -43    49    44  -599   411   208   208
   -49    -8     8    59   208   208    99  -911
    29   -44    52   -23   208   208  -911    99

If a matrix is displayed with the plot command, each column is plotted as a separate curve, as in
the example for plot(rosser) below on the right.

58

CHAPTER 4. GRAPHICS

Arrays can be plotted as surfaces with the mesh command, which plots the data in a
wire-frame type of graph, as below on the right.
The view command can be used to set a different viewing angle for three-dimensional plots. It
is also possible to change the viewpoint interactively via the rotate3d command by pointing
with the mouse on the frame of the 3D graph and dragging it.
[Figure: plot(rosser), each column drawn as a curve, and the wire-frame mesh plot of the Rosser matrix.]

4.4

Analyzing systems via plotting

In the following, a linear function, an exponential function, a hyperbola, a logarithm and an inverse square root are plotted via the following program, in linear, x- and y-logarithmic as well as double logarithmic scale:
clear
format compact
x=[linspace(0.1,10,100)];
subplot(2,2,1)
plot(x,x,'-',x,exp(x),'--',x,1./x,':',x,log(x),'-.',x,1./sqrt(x),'-+')
axis([0 10 -3 20 ])
title('linear plot')
legend('x','exp(x)','1/x','log(x)','1/sqrt(x)')
subplot(2,2,2)
semilogx(x,x,'-',x,exp(x),'--',x,1./x,':',x,log(x),'-.',x,1./sqrt(x),'-+')
axis([0.1 10 -3 20 ])
title('semilogarithmic in x-direction')
legend('x','exp(x)','1/x','log(x)','1/sqrt(x)',2)
subplot(2,2,3)
semilogy(x,x,'-',x,exp(x),'--',x,1./x,':',x,log(x),'-.',x,1./sqrt(x),'-+')
axis([0.1 10 .01 20000 ])
title('semilogarithmic in y-direction')
legend('x','exp(x)','1/x','log(x)','1/sqrt(x)',2)


subplot(2,2,4)
loglog(x,x,'-',x,exp(x),'--',x,1./x,':',x,log(x),'-.',x,1./sqrt(x),'-+')
axis([0.1 10 .01 20000 ])
title('logarithmic')
legend('x','exp(x)','1/x','log(x)','1/sqrt(x)',2)

[Figure: the four subplots (linear plot; semilogarithmic in x-direction; semilogarithmic in y-direction; double logarithmic), each showing x, exp(x), 1/x, log(x) and 1/sqrt(x).]

Many systems in science and mathematics can be better understood by just plotting typical
properties in different scales. Logarithmic, linear, exponential and power laws can be found
in nature, and are easily identifiable by plotting the data in different scales.

4.4.1

Linear Plots

Typical linear plots result from linear response functions; the simplest is probably Hooke's law, which is sketched in the drawing to the right. It is a linear law, and in the dynamical situation, when the spring is pulled with a force at a certain frequency, the elongation changes with the same frequency.
Such a linear response is not a matter of course.
There are nonlinear systems which respond with e.g.
frequency-doubling to an external stimulation, like in
the case where a high intensity red laser beam going into
a target comes out as a blue laser beam (blue light with
twice the frequency of the red light).

[Figure: LASER, red beam, crystal target, blue beam.]
4.4.2

Logarithmic Plots

If the y-axis of a plot is chosen logarithmically, exponential curves appear as straight lines, so that logarithmic plots allow one to identify exponential behavior. Typical examples for exponential behavior are time evolutions: radioactive decay and the increase of the GDP in economies are two of them. Below, the increase of the Dow Jones Industrial Stock Index is shown. The curves in the linear plot are bent, but more or less straight in the logarithmic plot. It is a matter of ongoing debate whether this reflects the exponential increase in the strength of the US economy or just exponential inflationary effects.

If instead of the y-axis the x-axis is chosen in logarithmic scale, logarithmic curves become straight lines. Logarithmic curves grow more slowly than linear curves. Typical examples for logarithmic behavior are animal senses: light and sound are perceived on a logarithmic scale, i.e. sound is not twice as loud if the pressure of the sound wave is twice as high, but when it is $10^2$ times as high.

4.4.3

Double logarithmic Plots

Double logarithmic plots allow one to identify power laws, functions of the form $x^r$, where r does not necessarily have to be an integer. Power laws are usually found in nature when systems suffer from finite size effects.

The Gutenberg-Richter law, which states that the probability of earthquakes of magnitude x is proportional to $1/x^r$, is an example where a system (the tectonic plates) creates earthquakes only up to a maximum size, the size of the plate itself.
The curves of a power law can be written as a superposition of exponential curves, like in
the following program
clear
format compact
a=linspace(.0,100,1000)
la=length(a)
x=linspace(0.1,1000,100);
y=zeros(size(x));
for i=1:la
y=y+exp(-.1*a(i).*x);
end
loglog(x,y)
This means that a power-law is found in a system if there are many different scales, which
contribute to an exponential phenomenon and on each size scale (variable a in the above
example) there is a different prefactor in the exponential law.
In the same way as the curve of a power law can be written as a superposition of exponential
curves, a Lorentzian can be written as a superposition of Gaussian curves.
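This can be checked numerically; the following sketch approximates the identity $1/(1+x^2)=\int_0^\infty e^{-t}\,e^{-t x^2}\,dt$ by a Riemann sum (the grid for t is an arbitrary choice):

```matlab
% a Lorentzian 1/(1+x^2) as a superposition of Gaussians exp(-t*x.^2)
x=linspace(-5,5,200);
t=linspace(0.005,30,3000);
dt=t(2)-t(1);
y=zeros(size(x));
for i=1:length(t)
  y=y+exp(-t(i))*exp(-t(i)*x.^2)*dt;
end
plot(x,y,x,1./(1+x.^2),'--')   % the two curves nearly coincide
```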
Another example for power laws is the 1/f-noise in many technical applications. It is very
often found in systems where a seemingly continuous process is the result of discrete processes,
and the deviation from the mean causes the noisy fluctuations.

4.5

Specialized Plots and specialized styles

4.5.1

Graph Properties

The axes of a graph can be easily modified with the axis command.

axis image
defines the same length unit for the x- and y-axis.

axis([xmin xmax ymin ymax])
defines the minimal and maximal coordinates for the x- and y-axis, respectively. Because MATLAB usually
chooses the axes so as to end at multiples of 1, 10, 100, it is sometimes
necessary to set

axis tight
so that the axes terminate at the extremal values of the plot. Apart from the axis, a grid can
be specified by using grid. Sometimes it is necessary to modify picture properties like the
axis labels etc. from the default values chosen by MATLAB. For the program
clear
format compact
x=linspace(0,2*pi,10);
y=sin(x);
h=plot(x,y)
g=axes
get(h)
get(g)
the plot and the axes are defined as variables, and these variables can be displayed with the get command.
The entries for h and g can then be directly modified using the set command, by specifying
the object name, h or g, the property to modify, e.g. Color, and the new value, e.g.
set(g,'Color',[.5 .5 .5])
Another possible usage, if one already knows the property name, is e.g.
set(gca,'XTickLabel',{'One';'Two';'Three';'Four'})
which labels the first four tick marks on the x-axis and then reuses the labels until all ticks
are labeled. Numeric labels can be set in the same way, e.g.
set(gca,'XTickLabel',{'1';'10';'100'})

4.5.2

Including Images

The command
image
puts a default-image (in GIF-Format) on the graphics-screen. In general, graphics of nearly
any format can be read and displayed using
name=imread('name.gif')
image(name)
The data in the variable name can then be manipulated like a usual MATLAB array.
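A minimal sketch of such a manipulation (name.gif is a placeholder file name; depending on the file, imread may also return a colormap):

```matlab
% read an image, manipulate it as an ordinary array, display it
a=imread('name.gif');
a=flipud(a);     % turn the image upside down
image(a)
```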

4.6

MATLAB-output into a file

Very often, one wants to save some output of MATLAB onto a file to include it in other
documents. For short output, it is simplest under a window-system to copy the desired lines
with the mouse into an editor. If the output becomes too long, one can use the command
diary on
and MATLAB will then write its output not only to the screen, but also into the file diary. If one
wants to redirect the output into a file with a special name, one can use the command
diary('special_filename')
To end the output in the diary, use the command
diary off
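A short sketch of a complete diary session (the file name session.txt is an arbitrary choice):

```matlab
diary('session.txt')  % start recording screen output
x=1:5;
sum(x)                % this output goes to the screen and the file
diary off             % stop recording
```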
If you want to include the output in a LaTeX document and preserve the computer-output look, you can use the
\begin{verbatim}
\end{verbatim}
environment; all the program examples in this scriptum are produced in such a way.

4.7

Graphics to include into documents

If you want to save the graphics on the MATLAB-Graphics screen as a file that should be
included in a text (for e.g. LaTeX or WORD), you have to use the print-command, with
the syntax
print -dFORMAT FILENAME


The following table introduces some graphics formats:

print -dps2 name.ps     Black-white Postscript output; graphics which can be directly
                        printed on a (Postscript) printer. All graphics are on a single
                        page in the paper format.
print -dpsc2 name.ps    Like above, but in color.
print -deps2 name.eps   Black-white encapsulated Postscript output; can be included
                        in word-processor programs like LaTeX.
print -depsc2 name.eps  Like above, but in color.
print -djpeg name.jpg   Output in JPEG format, which can be included in word processors
                        like Word or in Internet pages.

If the orientation should be changed, e.g. to portrait, one can use the MATLAB command

orient portrait

4.8

Including Encapsulated Postscript in LaTeX

If you want to include jpeg in GUI-based word-processors, you can use the corresponding
menus and/or the mouse.
LaTeX, probably the most widely used word-processing program in the sciences, is not GUI-based. A (not so) short introduction about the various commands in various languages can
be found on ftp://ctan.tug.org/tex-archive/info/lshort/. LaTeX is rather a text-programming language, which converts a file name.tex (the program) into a file name.dvi
(the device-independent file), which can then be converted into general formats like Postscript
via
via
dvips -o name.ps name
which produces a Postscript file. These Postscript files can then be transformed into PDF format (Adobe Acrobat), e.g. with the command ps2pdf. If you want to prepare a LaTeX document with the corresponding graphics, you have to load
a package with the software for including graphics. A widely used package is the epsfig package, so whereas for a conventional LaTeX report, the header looks like
\documentclass[twoside,12pt]{report}
the header must contain
\documentclass[twoside,12pt]{report}
\usepackage{epsfig}
if postscript-graphics should be included. Graphics can then be included with
\epsfig{file=filename,width=??,height=??,angle=??}, where either width or height must be given;
the angle can also be left away. Here are some examples:
\epsfig{file=graphiken/circle_square.eps,width=2cm}


\epsfig{file=graphiken/circle_square.eps,height=2cm}

\epsfig{file=graphiken/circle_square.eps,height=2cm,angle=-90}

\epsfig{file=graphiken/circle_square.eps,height=2cm,width=4cm}

In principle, all Postscript files should be includable in LaTeX, but some programs produce
Postscript output which is not compatible with LaTeX. Under UNIX, one can use the command
ps2epsi name.ps name.epsi
to convert a file name.ps to a file name.epsi which corresponds to the encapsulated postscript
interchange format.

4.9

Processing Graphics

If you have graphics in another format than postscript, like *.jpeg or *.gif files, which you
want to include in LaTeX, you have to convert them into postscript with some other software.
One of the most widely used programs for this task under UNIX is xv, which allows one to load
graphics in one format and to save them in another format; for example
xv name.jpg
will load the file name.jpg. Pressing the right button of the mouse will make a menu
appear, and the graphics can be saved as Postscript by choosing the appropriate menu (SAVE
FORMAT POSTSCRIPT).

Chapter 5
Linear Algebra
Usually, one learns about linear algebra in the first year of study, but often, one needs it much
later, when one has forgotten most of it already. MATLAB means Matrix LABoratory and
its first version was written by Cleve Moler so that his students could learn linear algebra
more easily.
General documentation of MATLAB can also be found on http://www.mathworks.com/
access/helpdesk/help/techdoc/matlab.shtml

5.1

Matrix Manipulation

5.1.1

Matrix commands

The diagonal of a matrix can be extracted with the diag command in the following way:
A =
   0.520109   0.340012   0.470293
   0.510104   0.326988   0.636776
   0.010375   0.782090   0.900370

> diag(A)
ans =
   0.52011
   0.32699
   0.90037
If the input of the diag command is a vector, diag constructs a matrix with the vector on
the diagonal, a typical example of how commands are overloaded in MATLAB:
> b=[3 5 7]
b =
3 5 7
> A=diag(b)
A =
   3   0   0
   0   5   0
   0   0   7
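With an optional second argument, diag also constructs or extracts off-diagonals, so a tridiagonal matrix can be sketched as:

```matlab
% tridiagonal matrix built from three calls to diag
% (the second argument is the offset of the diagonal)
n=4;
T=2*diag(ones(n,1))-diag(ones(n-1,1),1)-diag(ones(n-1,1),-1)
```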

The operator for the matrix transpose is the apostrophe ':


> A=rand(2)
A =
   0.66166   0.48661
   0.69184   0.39113
> B=A'
B =
   0.66166   0.69184
   0.48661   0.39113

Because MATLAB knows the difference between column and row vectors, the transpose operator can also be used to transform column into row vectors and vice versa:
> v=[1 2 3 4 5]
v =
1 2 3 4 5
> u=v'
u =
1
2
3
4
5
For complex-valued matrices, the '-operator gives the Hermitian conjugate matrix:
> H=rand(3)+sqrt(-1)*rand(3)
H =
   0.59574 + 0.89043i   0.91601 + 0.87663i   0.19920 + 0.74066i
   0.71691 + 0.73996i   0.31324 + 0.44034i   0.19254 + 0.85119i
   0.38660 + 0.13756i   0.33661 + 0.71527i   0.29184 + 0.58186i

> G=H'
G =
   0.59574 - 0.89043i   0.71691 - 0.73996i   0.38660 - 0.13756i
   0.91601 - 0.87663i   0.31324 - 0.44034i   0.33661 - 0.71527i
   0.19920 - 0.74066i   0.19254 - 0.85119i   0.29184 - 0.58186i

The commands which extract the upper/lower triangular part of a matrix are triu/tril:

> A
A =
   0.951650   0.084814   0.208357
   0.109170   0.585341   0.562931
   0.667123   0.528991   0.860920

> tril(A)
ans =
   0.95165   0.00000   0.00000
   0.10917   0.58534   0.00000
   0.66712   0.52899   0.86092

> triu(A)
ans =
   0.95165   0.08481   0.20836
   0.00000   0.58534   0.56293
   0.00000   0.00000   0.86092

If the columns or rows should be flipped, i.e. if their order should be inverted, this can be
done with the commands flipud and fliplr, flip up-down and flip left-right:
> fliplr(A)
ans =
0.73180 0.40541
0.55208 0.79014
> flipud(A)
ans =
0.79014 0.55208
0.40541 0.73180
These two commands can be combined to rotate a complex matrix by 180 degrees; note that this is not the transpose (the transpose without complex conjugation is obtained with the .' operator):
> A=rand(2)+sqrt(-1)*rand(2)
A =
   0.839504 + 0.572899i   0.466803 + 0.675260i
   0.086815 + 0.252680i   0.132638 + 0.086518i
> B=fliplr(flipud(A))
B =
   0.132638 + 0.086518i   0.086815 + 0.252680i
   0.466803 + 0.675260i   0.839504 + 0.572899i

5.2

Matrix Products

For matrices and vectors, there are a lot of ways products can be computed. There is no
difference between vectors and matrices; a vector is just a matrix with only one row or column.
The simplest form is the elementwise product, which uses the operator .*:


> u=[1 2 3 4]
u =
1 2 3 4
> v=[5 6 7 8]
v =
5 6 7 8
> u.*v
ans =
    5   12   21   32

The inner product for a row-vector u and a column-vector w is computed with the operator
*:
> u=[1 2 3 4]
u =
1 2 3 4
> w=[1 1 2 2]
w =
1
1
2
2
> u*w
ans = 17
If instead of u*w we compute w*u, the result is the outer product
> w*u
ans =
   1   2   3   4
   1   2   3   4
   2   4   6   8
   2   4   6   8

Matrices can be treated in the same way as vectors with elementwise multiplication .* or
multiplication in the sense of linear algebra:
> A=[1 2
> 3 4]
A =
1 2
3 4
> B=[1 -1
> -2 2]
B =
   1  -1
  -2   2
> A*B
ans =
  -3   3
  -5   5
> A.*B
ans =
   1  -2
  -6   8
A matrix-vector product is performed like this:
> A=[1 2
> 3 4]
A =
1 2
3 4
> v=[1
> 2]
v =
1
2
> A*v
ans =
5
11
MATLAB also has the Kronecker product as a built-in function, kron:
> u=[1 2 3 4]
u =
1 2 3 4
> v=[5 6 7 8]
v =
5 6 7 8
> kron(u,v)
ans =
    5    6    7    8   10   12   14   16   15   18   21   24   20   24   28   32
> kron(u,v')
ans =
    5   10   15   20
    6   12   18   24
    7   14   21   28
    8   16   24   32

Whereas the elementwise matrix product computed with .* is commutative, the matrix product computed with * is of course not commutative.

5.3

Repetition of elementary linear Algebra

The angle $\varphi$ between two vectors v and w of finite length can be computed via their
inner (scalar) product as
$$\cos\varphi = \frac{v\cdot w}{\sqrt{v\cdot v}\,\sqrt{w\cdot w}}.$$
For the inner (scalar) product, we have the Cauchy-Schwarz inequality
$$|v\cdot w| \le \sqrt{v\cdot v}\,\sqrt{w\cdot w}.$$

Vectors for which the scalar product is 0 are called orthogonal. Whereas orthogonality of two
vectors v and w can be defined in theoretical mathematics as the property that their scalar
product is zero, $v \cdot w = 0$, in numerical mathematics it is necessary to define orthogonality
in a way that takes possible rounding errors into account, as the following example
shows:
> w=[sqrt(3) sqrt(3)]
w =
1.73205080756888 1.73205080756888
> v=[sqrt(3) -sqrt(3)]
v =
1.73205080756888 -1.73205080756888
> v*w'
ans = -9.64636952420157e-17
Obviously the last result should be exactly zero, but due to the rounding errors in the
computation, there is a finite error. How the definition of orthogonality can be applied in
such a way that rounding errors are taken into account can be seen in the next section about
the rank of matrices.

5.3.1

Size and Rank of Matrices

The size of a matrix A can be computed with the size command:
> size(A)
ans =
2 2
size gives a two-element row vector as an answer; the number of columns is size(A,2), the number
of rows size(A,1). The length of a column or row vector v can be computed with size(v,1) or size(v,2), respectively.
The rank of a matrix is in theoretical linear algebra the number of linearly independent
rows/columns. Because the definition of linear independence is equivalent to the definition
of orthogonality, we will use the rank computation as the criterion for orthogonality. The
rank of a matrix can be computed with MATLAB's rank command. How the rank command
works will be explained later in the section about the singular value decomposition, along
with how one should choose the optional threshold in MATLAB's rank command. First let us
review some theorems about the rank of matrices1

5.3.2

Some Theorems on the rank of matrices

The outer product of two vectors always gives a matrix of rank 1.
Random matrices nearly always have full rank, i.e. the rank of a matrix constructed with rand is the same as the number of columns/rows, and if the number of
rows/columns is larger than the number of columns/rows, we have rank(A)=min(size(A)). Example:
> A=rand(3,4)
A =
   0.23382   0.43570   0.42862   0.97961
   0.79868   0.34546   0.69142   0.74305
   0.66927   0.71192   0.15419   0.11667

> rank(A)
ans = 3

Square matrices which have a rank smaller than their number of columns/rows are
called singular. They cannot be inverted, and systems of linear equations where the
equations form a singular matrix cannot be solved. Their determinant vanishes.
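A minimal example of a singular matrix:

```matlab
A=[1 2
   2 4]       % the second row is twice the first
rank(A)       % 1, smaller than the number of rows
det(A)        % the determinant vanishes
```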
The rank of a matrix does not change through transposition, complex conjugation or Hermitian conjugation.
The product of non-singular matrices has the same rank as the matrices themselves:
> A=rand(3,4)
A =
   0.476924   0.068071   0.420827   0.883968
   0.061165   0.885041   0.027155   0.417966
   0.139677   0.708093   0.489577   0.978820

> B=rand(3)
B =
   0.908288   0.703948   0.363589
   0.245781   0.950685   0.097344
   0.942011   0.726192   0.064962

> C=B*A
C =
   0.52703   0.94231   0.57935   1.45301
   0.18896   0.92705   0.17690   0.70990
   0.50276   0.75283   0.44795   1.19982

> rank(A)
ans = 3
> rank(B)
ans = 3
> rank(C)
ans = 3

1 Roger A. Horn, Charles R. Johnson, Matrix Analysis, Cambridge University Press 1991
For a rank-deficient matrix, the rank of the product matrix is the same as that of the
factor with the lowest rank:
> A=rand(2,4)
A =
   0.200421   0.795092   0.896583   0.454798
   0.838726   0.220597   0.018236   0.018493

> A(3,:)=A(2,:)
A =
   0.200421   0.795092   0.896583   0.454798
   0.838726   0.220597   0.018236   0.018493
   0.838726   0.220597   0.018236   0.018493

> B=rand(3)
B =
   0.94359   0.31700   0.20635
   0.50896   0.36833   0.40063
   0.48172   0.19705   0.42594

> C=B*A
C =
   0.62806   0.85555   0.86569   0.43882
   0.74695   0.57430   0.47035   0.24569
   0.61907   0.52044   0.44326   0.23061

> rank(A)
ans = 2
> rank(B)
ans = 3
> rank(C)
ans = 2

5.3.3

Rank-Inequalities

For $A \in M_{m,n}$ we have $\mathrm{rank}\,A \le \min(m, n)$.

When a column or a row of a matrix is deleted, the rank of the resulting matrix
cannot be larger than the rank of the original matrix.

For $A \in M_{m,k}$, $B \in M_{k,n}$ we have
$$\mathrm{rank}\,A + \mathrm{rank}\,B - k \le \mathrm{rank}\,AB \le \min(\mathrm{rank}\,A,\,\mathrm{rank}\,B).$$

For $A, B \in M_{m,k}$ we have $\mathrm{rank}(A + B) \le \mathrm{rank}\,A + \mathrm{rank}\,B$.

For $A \in M_{m,k}$, $B \in M_{k,p}$, $C \in M_{p,n}$ we have
$$\mathrm{rank}\,AB + \mathrm{rank}\,BC \le \mathrm{rank}\,B + \mathrm{rank}\,ABC.$$
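The second inequality (the Sylvester rank inequality) can be checked numerically, e.g. with random matrices (the sizes are an arbitrary choice):

```matlab
% check rank(A)+rank(B)-k <= rank(A*B) <= min(rank(A),rank(B))
m=5; k=3; n=4;
A=rand(m,k);
B=rand(k,n);
[rank(A)+rank(B)-k, rank(A*B), min(rank(A),rank(B))]
```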

5.3.4

Norms of a Matrix

Every matrix norm can also be used as a vector norm, but not vice versa. Therefore, we
explain here only definitions for matrix norms. Analogous to real and complex scalars, one
wants to use something like an absolute value also for matrices. Something which behaves
like an absolute value under addition is the norm of a matrix.

Properties of Matrix-Norms
$||A|| \ge 0$ (non-negativity)
$||A|| = 0$ if and only if $A = 0$
$||cA|| = |c|\,||A||$ for all real and complex c (homogeneity)
$||A + B|| \le ||A|| + ||B||$ (triangle inequality)
$||AB|| \le ||A||\,||B||$ (sub-multiplicativity)
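These properties can be verified numerically for random matrices; the comparison operators return 1 for true:

```matlab
A=rand(3); B=rand(3); c=-2.5;
norm(A+B) <= norm(A)+norm(B)       % triangle inequality
norm(A*B) <= norm(A)*norm(B)       % sub-multiplicativity
abs(norm(c*A)-abs(c)*norm(A))      % homogeneity, up to rounding errors
```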


Definitions for Norms

In MATLAB, all of the following norms exist, and they can be computed via the function
norm(x) and, if necessary, further arguments.

Name                          Definition                                    MATLAB function
1. Spectral norm ||A||_2      square root of the maximal eigenvalue of A*A  norm(A), norm(A,2)
2. 1-norm ||A||_1             maximal column sum of A                       norm(A,1)
3. inf-norm ||A||_inf         maximal row sum of A                         norm(A,inf)
4. Frobenius norm ||A||_Fro   sqrt of the sum of all |a_{i,j}|^2            norm(A,'fro')
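For a small example matrix the different norms can be compared directly:

```matlab
A=[1 -2
   3  4];
norm(A,1)       % maximal column sum: |-2|+|4| = 6
norm(A,inf)     % maximal row sum: |3|+|4| = 7
norm(A,'fro')   % sqrt(1+4+9+16) = sqrt(30)
norm(A)         % spectral norm
```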

5.3.5

Determinant of a Matrix

The norm only fulfills sub-multiplicativity, i.e. the norm of a matrix product is equal to or
smaller than the product of the norms of the factors. An absolute value which fulfills
multiplicativity is the determinant, which can be computed in MATLAB via det(A):
$$\det A \,\det B = \det(A\,B)$$

Further properties of the determinant are:

The exchange of two adjacent columns/rows inverts the sign of the determinant:
> A=rand(3)
A =
   0.209224   0.413728   0.212479
   0.106481   0.192283   0.074438
   0.291095   0.436435   0.508115

> det(A)
ans = -0.0017939
> B=[A(:,2) A(:,1) A(:,3)]
B =
   0.413728   0.209224   0.212479
   0.192283   0.106481   0.074438
   0.436435   0.291095   0.508115

> det(B)
ans = 0.0017939
The determinant of the identity matrix is one, independent of its dimension.
Never use the Cramer rule or the Jacobi expansion for the computation of a determinant; it is wasteful and numerically unstable.
The numerically most suitable computation method for determinants is the so-called
LU-decomposition, where the matrix A is decomposed as a product of a lower triangular matrix L with 1s on the diagonal and an upper triangular matrix U as
$$L\,U = A.$$


The determinant of A is therefore the product of the diagonal entries of U. Row- and
column-permutations, so-called pivoting, increase the numerical accuracy of the
decomposition; for details, see [Gol89]. The MATLAB command which computes the
matrix determinant via LU-decomposition is det.
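A sketch of this relation, using the permuted factorization [L,U,P]=lu(A) so that the sign of the row permutations is accounted for by det(P) = ±1:

```matlab
A=rand(4);
[L,U,P]=lu(A);         % P*A = L*U
d=det(P)*prod(diag(U)) % determinant from the diagonal of U
det(A)                 % agrees up to rounding errors
```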

5.3.6

Matrix inverses

Nonsingular square matrices are inverted by the inv command. The elementwise division of
one matrix by another in MATLAB is written as A./B, where all entries of the divisor matrix
must be non-zero to avoid an error message. This is totally different from the matrix division
A/B, which corresponds to the multiplication of matrix A with the inverse of matrix B:
> A=rand(2)
A =
0.29975 0.85007
0.88812 0.33290
> B=rand(2)
B =
0.89979 0.72370
0.53648 0.97567
> A/B
ans =
  -0.33410   1.11909
   1.40492  -0.70090

> C=inv(B)
C =
   1.9926  -1.4780
  -1.0956   1.8376
> A*C
ans =
  -0.33410   1.11909
   1.40492  -0.70090

Because the product C*A does not necessarily give the same result as the product A*C, there is
also the left division of a matrix, which with the above matrices gives
> C*A
ans =
  -0.71537   1.20181
   1.30362  -0.31963
> B\A
ans =
  -0.71537   1.20181
   1.30362  -0.31963

If one tries to invert a singular matrix, MATLAB gives a result (usually wrong) and issues
a warning:
> A=[1 1
> 1 1]
A =
1 1
1 1
> inv(A)
warning: inverse: matrix singular to machine precision, rcond = 0
ans =
1 1
1 0
> B=inv(A)
warning: inverse: matrix singular to machine precision, rcond = 0
B =
1 1
1 0
> B*A
ans =
2 2
1 1

5.4

How many matrix products are possible

A matrix product is computed using three indices i, j, k,
$$a_{ij} = \sum_k b_{ik}\, c_{kj}.$$
Therefore, there are 6 possible orders in which to program the loops, but basically, there are only
two possibilities:
clear
format long
n=20
b=randn(n).*10.^(16*randn(n));
c=randn(n).*10.^(16*randn(n));

tic
% Version 1: Dot-Product
a1=zeros(n);
for j=1:n
for i=1:n
for k=1:n
a1(i,j)=a1(i,j)+b(i,k)*c(k,j);
end
end
end
toc

tic
% Equivalent to
a2=zeros(n);
for j=1:n
for i=1:n
a2(i,j)=b(i,:)*c(:,j);
end
end
toc
tic
% Version 2: Daxpy-Product
a3=zeros(n);
for j=1:n
for k=1:n
for i=1:n
a3(i,j)=a3(i,j)+b(i,k)*c(k,j);
end
end
end
toc
tic
% equivalent to
a4=zeros(n);
for j=1:n
for k=1:n
a4(:,j)=a4(:,j)+c(k,j)*b(:,k);
end
end
toc
return


We have also included the tic and toc commands to profile the time used for a matrix
multiplication. It can be seen that MATLAB performs much faster if the inner loop is
evaluated using the :-notation.
The first version of the matrix-matrix multiplication has an inner vector product as its kernel,
the inner part of the routine. The second version of the matrix multiplication has a kernel
which can be written as
$$\vec{y} = a\,\vec{x} + \vec{y},$$
an operation where the left side in words is "A times X Plus Y", for which the acronym
SAXPY or DAXPY (S for single, D for double precision) is often in use.
It turns out that both versions are numerically equivalent, and for l-by-l matrices both need $2l^3$ floating point
operations (multiplications and additions).
It is common to give the speed of computers by how many Floating Point Operations Per
Second (Flops) they can perform. Modern PCs are in the range of a few hundred MFlops,
workstations are nowadays in the GFlops range, and the Earth Simulator, a supercomputer
near Yokohama, can do about 4 TeraFlops.
Using programs to test the speed of computers is called benchmarking.

5.5

Matrix Inverses again

5.5.1

How to solve a linear system by hand

The standard way to solve a linear system of k equations with k unknowns,
$$A x = b,$$
with the unknowns in the vector x and the right-hand side b,
$$
\begin{pmatrix}
a_{1,1} & a_{1,2} & \cdots & a_{1,k} \\
a_{2,1} & a_{2,2} & \cdots & a_{2,k} \\
\vdots  & \vdots  & \ddots & \vdots  \\
a_{k,1} & a_{k,2} & \cdots & a_{k,k}
\end{pmatrix}
\begin{pmatrix} x_1 \\ x_2 \\ \vdots \\ x_k \end{pmatrix}
=
\begin{pmatrix} b_1 \\ b_2 \\ \vdots \\ b_k \end{pmatrix},
$$
is to rewrite the system in augmented form
$$
\left(\begin{array}{cccc|c}
a_{1,1} & a_{1,2} & \cdots & a_{1,k} & b_1 \\
a_{2,1} & a_{2,2} & \cdots & a_{2,k} & b_2 \\
\vdots  & \vdots  & \ddots & \vdots  & \vdots \\
a_{k,1} & a_{k,2} & \cdots & a_{k,k} & b_k
\end{array}\right)
$$
and transform the matrix and the right-hand-side vector b via elementary row and column operations (subtracting multiples of some rows from other rows) to upper triangular form,
where all the elements below the diagonal are 0:
$$
\left(\begin{array}{cccc|c}
\tilde a_{1,1} & \tilde a_{1,2} & \cdots & \tilde a_{1,k} & \tilde b_1 \\
0 & \tilde a_{2,2} & \cdots & \tilde a_{2,k} & \tilde b_2 \\
\vdots  & \vdots  & \ddots & \vdots  & \vdots \\
0 & 0 & \cdots & \tilde a_{k,k} & \tilde b_k
\end{array}\right)
$$


The solutions for $x_1, x_2, \ldots, x_k$ can then be computed via back-substitution as
$$x_k = \tilde b_k / \tilde a_{k,k}$$
$$x_{k-1} = \left(\tilde b_{k-1} - \tilde a_{k-1,k}\, x_k\right) / \tilde a_{k-1,k-1}$$
$$x_{k-2} = \left(\tilde b_{k-2} - \tilde a_{k-2,k-1}\, x_{k-1} - \tilde a_{k-2,k}\, x_k\right) / \tilde a_{k-2,k-2}$$
$$x_i = \frac{1}{\tilde a_{i,i}} \left( \tilde b_i - \sum_{j=i+1}^{k} \tilde a_{i,j}\, x_j \right)$$
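The back-substitution formula can be sketched directly in MATLAB, e.g. as a function file backsub.m (the function name is an arbitrary choice):

```matlab
function x=backsub(U,b)
% solve the upper triangular system U*x=b by back-substitution
k=length(b);
x=zeros(k,1);
for i=k:-1:1
  % the product of the empty ranges for i=k is 0
  x(i)=(b(i)-U(i,i+1:k)*x(i+1:k))/U(i,i);
end
```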

This scheme of eliminating elements so that a triangular coefficient matrix survives, for which
the unknowns can be computed in a trivial way, is called Gaussian elimination. As an example,
$$\begin{pmatrix} 9 & 3 & 4 \\ 4 & 3 & 4 \\ 1 & 1 & 1 \end{pmatrix}
\begin{pmatrix} x_1 \\ x_2 \\ x_3 \end{pmatrix} = \begin{pmatrix} 7 \\ 8 \\ 3 \end{pmatrix}$$
in augmented form
$$\left(\begin{array}{ccc|c} 9 & 3 & 4 & 7 \\ 4 & 3 & 4 & 8 \\ 1 & 1 & 1 & 3 \end{array}\right).$$
We start by interchanging the first and the last row:
$$\left(\begin{array}{ccc|c} 1 & 1 & 1 & 3 \\ 4 & 3 & 4 & 8 \\ 9 & 3 & 4 & 7 \end{array}\right).$$
Next, we subtract 4 times the first row from the second row and 9 times the first row from
the last row:
$$\left(\begin{array}{ccc|c} 1 & 1 & 1 & 3 \\ 0 & -1 & 0 & -4 \\ 0 & -6 & -5 & -20 \end{array}\right).$$
Finally, we add -6 times the second row to the last row, and obtain the triangular system
$$\left(\begin{array}{ccc|c} 1 & 1 & 1 & 3 \\ 0 & -1 & 0 & -4 \\ 0 & 0 & -5 & 4 \end{array}\right),$$
from which we can compute the unknowns successively as $x_3 = -4/5$, $x_2 = 4$ and $x_1 = -1/5$.
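The hand calculation can be checked with MATLAB's backslash operator:

```matlab
A=[9 3 4
   4 3 4
   1 1 1];
b=[7
   8
   3];
x=A\b     % gives -0.2000, 4.0000, -0.8000, i.e. -1/5, 4 and -4/5
```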

5.5.2

The numerical variants: LU-decomposition

For numerical purposes, the two steps, reduction to a triangular system (elimination step)
and backward substitution (solution step), are often split up into two routines. A common
collection of subroutines for numerical linear algebra is the LINPACK package, which includes matrix inversions and orthogonalization methods for real and complex matrices. MATLAB's routines for linear algebra are basically routines from LINPACK,
and Cleve Moler, the inventor of MATLAB, was also a co-author of LINPACK.


Numerically, the Gaussian elimination is usually implemented as an LU-decomposition, a
factorization of the matrix A of the system
$$A x = b$$
into an upper triangular matrix U and a lower triangular matrix L, so that
$$L U = A$$
and the solution can again be computed in a trivial way. In MATLAB, the LU-factorization
can be computed via the lu command, for example as
> a
a =
  -1.0688456296920776   0.5834664106369019  -0.0174380335956812
   0.0473232455551624  -0.6955339908599854  -0.2883380949497223
  -0.5952438712120056  -0.0617007017135620  -1.1060823202133179

> [l,u]=lu(a)
l =
   1.000000000000000   0.000000000000000   0.000000000000000
  -0.044275098518011   1.000000000000000   0.000000000000000
   0.556903499136249   0.577325122175704   1.000000000000000

u =
  -1.068845629692078   0.583466410636902  -0.017438033595681
   0.000000000000000  -0.669700958047086  -0.289110165605131
   0.000000000000000   0.000000000000000  -0.929460456605607
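Once the factors are known, a linear system can be solved by one forward and one backward substitution, a sketch:

```matlab
A=rand(3); b=rand(3,1);
[L,U]=lu(A);    % here L also contains the row permutations
y=L\b;          % forward substitution
x=U\y;          % backward substitution
norm(A*x-b)     % of the order of rounding errors
```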

The solution of a linear system Ax = b can be computed in MATLAB with the backslash command, which is not only the division from the left for scalars,
> 5/4
ans = 1.25000000000000
> 5\4
ans = 0.800000000000000
but also for matrices. The algebraic meaning is
$$A\backslash B = A^{-1} B, \qquad A/B = A\, B^{-1},$$
and for matrices (remember that MATLAB means MATrix LABoratory), this is not necessarily the same. The solution for Ax = b can be obtained by formally dividing through A
from the left,
$$Ax = b \;\Rightarrow\; A\backslash A x = A\backslash b \;\Rightarrow\; x = A\backslash b.$$
The solution of the system, including a test whether Ax really equals b,
can then be programmed in the following way:


> A=rand(3)
A =
   0.63356   0.25786   0.71159
   0.98480   0.13788   0.62761
   0.60858   0.76457   0.90059

> b=rand(3,1)
b =
0.81931
0.61835
0.14195
> x=A\b
x =
-0.74296
-2.36996
2.67169
> A*x
ans =
0.81931
0.61835
0.14195
In LINPACK, the elimination step is called factoring (because the LU-decomposition produces
two factors, L and U), and the Double precision GEneral matrix FActoring routine is therefore called
DGEFA. The solution/substitution routine of the system is DGESL, SL for solution.
There exists also a LINPACK benchmark, which sets up matrices in a well-defined way and
computes the matrix inverses, then computes the number of floating point operations and
the time, and from those the Flop rate. In this way, the speed of computers has been
evaluated for decades.2

5.5.3

Matrix inversion

A matrix inversion can be computed in the same way as the solution of a linear system, as
we see if we write the problem as
$$A A^{-1} = E, \qquad \text{with the identity matrix } E =
\begin{pmatrix}
1 & 0 & \cdots & 0 \\
0 & 1 & \cdots & 0 \\
\vdots & \vdots & \ddots & \vdots \\
0 & 0 & \cdots & 1
\end{pmatrix},$$

2 http://www.netlib.org/benchmark/linpackd in FORTRAN, but also available in other languages; the results in http://www.netlib.org/benchmark/linpackd/performance.ps


The columns of the identity matrix E have the role of the right-hand side b, and the columns of A^(-1) are the unknowns x. It is now clear why it is advantageous to use the LU-decomposition, as it allows the simultaneous solution of systems with arbitrarily many columns on the right-hand side. After the factoring is completed, the solution step for computing the inverse of an l × l matrix takes l times as many steps as the solution of the system for a single-column right-hand side. We can see this by using the flops command, which was available in old versions of MATLAB (before version 6) and measured the number of floating point operations, together with the example program on the right.
For 150 × 150 matrices, the number of FLOPS necessary for the solution of the linear system is 2419042; for the computation of the matrix inverse it is 6907967. This means that the matrix inversion requires about three times as many FLOPS as the solution of the linear system. For the solution of the linear system, the highest computational cost is actually the factoring, not the backward substitution, and for the inversion we can see that the backward substitution for all l right-hand sides takes about twice as many operations as the factoring itself.
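The flops command has been removed from current MATLAB versions, so the counts above cannot be reproduced directly any more; a rough comparison of the computational cost is still possible with tic and toc (a sketch, not from the original script; timings are machine-dependent):

```matlab
% compare cost of solving Ax=b with backslash and with an explicit inverse
n=1500;
A=randn(n);
b=randn(n,1);
tic; x1=A\b;      t_backslash=toc
tic; x2=inv(A)*b; t_inverse=toc   % typically noticeably slower than backslash
```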

clear
format compact
n=150
A=randn(n);
b=randn(n,1);
flops(0)
x1=A\b;
flops
flops(0)
x2=inv(A)*b;
flops
return
>>
n =
150
ans =
2419042
ans =
6907967

5.5.4 Accuracy of the matrix inversion

Up to now, we have not discussed the error in matrix inversions. As we have not used any
order of approximation, it is clear that there will be neither truncation error nor discretization
error, and only rounding errors have to be taken into consideration. As a test case for the
matrix inversion, let us consider the matrix
A = ( 1     1
      1-ε   1 ),

which has the inverse

A^(-1) = (1/ε) (  1      -1
                -(1-ε)    1 ).

If we compute the inverse in double precision, for ε = 10^-8, we obtain for

A =
   1.000000000000000   1.000000000000000
   0.999999990000000   1.000000000000000

the inverse

    99999999.4975241   -99999999.4975241
   -99999998.4975241    99999999.4975241

instead of the expected result

    100000000   -100000000
    -99999999    100000000.

What went wrong? The numerical parameter which describes how accurately a matrix inversion can be computed, or a linear system can be solved, is the condition number κ, which is implemented in MATLAB's cond function. The condition number of a matrix A is defined as the product of the norm of the matrix and the norm of its inverse, κ = |A|·|A^(-1)|.
There is a heuristic which says that if the condition number of a matrix A is 10^k, about k digits of accuracy will be lost in a matrix inversion. For our above matrix with ε = 10^-8, the condition number is κ = 4·10^8. As the error is about 0.5 for a matrix whose entries are of the order of 10^8, we see that the prediction of the heuristic is quite accurate.
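The heuristic can be checked directly with MATLAB's cond function (a minimal sketch, not part of the original example):

```matlab
% condition number and digit loss for the 2x2 example matrix
epsilon=1e-8;
A=[1 1; 1-epsilon 1];
kappa=cond(A)                        % of the order 4e8: expect ~8 lost digits
Ainv_exact=(1/epsilon)*[1 -1; -(1-epsilon) 1];
err=max(max(abs(inv(A)-Ainv_exact))) % absolute error of order 1,
                                     % i.e. ~8 of 16 digits lost
```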
We have discussed that there are two possible implementations of the matrix multiplication, the DOT and the DAXPY product. The LU-decomposition can formally be written as an operator O acting on the original matrix A, so that formally

O A = L U.

In other words, the LU-decomposition is a very special matrix-matrix multiplication, and therefore there are two variants. The conventional DAXPY variant, which is widely treated in textbooks on numerical analysis, is implemented in MATLAB and in packages like LINPACK, LAPACK, NAG and Visual Numerics. The rather rarely mentioned DOT variant, for which, depending on the implementation, names like Doolittle, Crout or Crout-Doolittle are used, is basically only used in NUMERICAL RECIPES, a compendium of numerical routines where none of the authors has a background in numerical analysis. A DDOT routine in MATLAB (not shown, because I don't want anybody to use it) for the above problem produced the result

    1.99999999   -1.00000000
   -0.99999999    1.00000000

and the error was not in the eighth digit, but already in the first digit! The scalar product as a kernel introduces rounding errors which cannot be predicted with the conventional formula using the condition number.

5.6 Eigenvalues

The eigenvalues can be computed in MATLAB via the eig command. For a random matrix A, we obtain the eigenvalues as
>> A=randn(2)
A =
   0.5181   -1.2274
   0.8397    0.1920
>> eig(A)
ans =
   0.3551 + 1.0020i
   0.3551 - 1.0020i

so one can see that the eigenvalues of a real square matrix are in general not real. For a symmetric matrix, as in
>> A=A+A'
A =
   1.0363   -0.3876
  -0.3876    0.3840
>> eig(A)
ans =
   1.2167
   0.2035
we obtain real eigenvalues. Formally, the eigenvalues λ_i of a matrix A are often introduced as the roots of the characteristic polynomial of A,

det(A - λE) = 0,

with the identity matrix

E = ( 1 0 ... 0
      0 1 ... 0
      ...
      0 0 ... 1 ).

For a diagonal 2 × 2 matrix,

det( [a 0; 0 b] - λ [1 0; 0 1] ) = (a - λ)(b - λ) = 0,

we see that the solutions for λ are exactly a and b. In other words, the eigenvalues of a diagonal matrix are the diagonal entries themselves. As we have seen above, the eigenvalues of a real symmetric matrix are real. If we look at the characteristic polynomial, we see that for an upper triangular matrix the off-diagonal elements do not enter the determinant, so also for a triangular matrix the eigenvalues are exactly the diagonal elements:

A =
   1.0363   -0.3876
        0    0.3840
>> eig(A)
ans =
   1.0363
   0.3840

5.6.1 What to do with eigenvalues
What to do with eigenvalues

For a non-diagonal matrix A, the action of the matrix can usually be replaced somehow by the action of one or more eigenvalues. One example for such an action is the multiplication of a matrix with a vector. If we multiply a vector iteratively with a matrix A and compute the norm of the vector, for example with the following program, we find that after several iterations the length of the iterated vector becomes the absolute value of the largest eigenvalue of the matrix:

clear
format compact
format long
v=[1
   1];
v=v/norm(v);
A=[1 2
   2 3];
eig(A)
for i=1:8
  v=A*v;
  norm_of_v=norm(v)
  v=v/norm(v);
end

The output is

ans =
  -0.23606797749979
   4.23606797749979
norm_of_v =
   4.12310562561766
norm_of_v =
   4.23570259468110
norm_of_v =
   4.23606684261261
norm_of_v =
   4.23606797397526
norm_of_v =
   4.23606797748884
norm_of_v =
   4.23606797749976
norm_of_v =
   4.23606797749979
norm_of_v =
   4.23606797749979

5.6.2 Diagonalization and Eigenvectors


Now, if we have a matrix A = [1 2; 2 3], we can call eig not only to compute the eigenvalues, but, using two output arguments in constructor brackets [], we can also obtain the eigenvectors as
>> [u,l]=eig(A)
u =
   0.85065080835204   0.52573111211913
  -0.52573111211913   0.85065080835204
l =
  -0.23606797749979                  0
                  0   4.23606797749979

(the eigenvalues l are then not output as a vector, but as a diagonal matrix). In our above example, where we iteratively multiplied the vector v with the matrix A, the end result for v is
>> v
v =
   0.52573111213781
   0.85065080834049
which is the right column of u, and therefore the eigenvector to the larger eigenvalue λ₂ = 4.23606797749979. In other words, our iterative multiplication of a vector with a matrix is a way to find the largest eigenvalue and the eigenvector corresponding to this largest eigenvalue; in the literature, this method is often called the power method, because it corresponds to multiplying a power of A onto v:

A^n v → λ_max^n u_max   (up to normalization).

The matrix u which contains the eigenvectors is at the same time the transformation which transforms A onto diagonal form, so that

u' A u = ( -0.23606797749979                  0
                          0   4.23606797749979 ).

The matrix u is called a unitary transformation.
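This diagonalization can be checked directly in MATLAB (a minimal sketch):

```matlab
A=[1 2
   2 3];
[u,l]=eig(A);
u'*A*u               % diagonal matrix of the eigenvalues, up to rounding
norm(u'*u-eye(2))    % u'*u = E: u is orthogonal (unitary, in the real case)
```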

5.6.3 Computing the characteristic polynomial

As we have used Newton-Raphson iteration for the computation of roots of polynomials, one could think that this would also be a good method to compute the eigenvalues from the characteristic polynomial

det(A - λE) = 0.

Actually, this is not the case. The numerical algorithm for the computation of the eigenvalues, which will not be elaborated here, makes use of all of the l × l matrix entries, whereas the solution of the characteristic polynomial only makes use of l coefficients computed from the l × l matrix entries, so again we lose significant information, as in the example for the intersection computation of ellipses via the fourth order polynomial.
On the contrary, instead of computing eigenvalues via roots, it is usually feasible to compute the roots of a polynomial by rewriting it as the corresponding eigenvalue problem. First let us divide the polynomial P(x),

P(x) = ã_k x^k + ã_{k-1} x^{k-1} + ... + ã_0 = 0,

by the leading coefficient ã_k, so that our equation looks like

P(x) = x^k + a_{k-1} x^{k-1} + ... + a_0 = 0.

Then one can set up the so-called companion matrix C_P for P(x) e.g. as

C_P = (   0    1    0  ...    0
          0    0    1  ...    0
         ...
          0    0    0  ...    1
        -a_0 -a_1 -a_2 ... -a_{k-1} )

and the eigenvalues of C_P are the roots of the polynomial P(x). For example, the polynomial

P(x) = x^3 - 2x^2 - 5x + 6 = 0

has the roots x_1 = 1, x_2 = -2, x_3 = 3. If we set up the companion matrix as

C =
    0   1   0
    0   0   1
   -6   5   2

we obtain as the eigenvalues of the companion matrix C

> eig(C)
ans =
   3.0000
   1.0000
  -2.0000
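This is in fact how the MATLAB function roots works internally; a minimal check (note that MATLAB's compan uses a different but equivalent convention, with the coefficients in the first row, so the eigenvalues are the same):

```matlab
p=[1 -2 -5 6];     % coefficients of x^3 - 2x^2 - 5x + 6
C=compan(p)        % companion matrix in MATLAB's first-row convention
sort(eig(C))       % eigenvalues: -2, 1, 3
sort(roots(p))     % roots uses the same companion-matrix eigenvalue approach
```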

5.6.4 Stability Analysis

Eigenvalues play an important role in stability analysis, i.e. in the analysis of whether a numerical problem is stable or not. Instability usually results from eigenvalues which are larger than 1 (or, in some cases, different from 1). As an example of how eigenvalues enter in the solution of problems, let us look at the example problem for the ordinary differential equations:
function dydt = f(t,y)
% necessary parameters as global variables
global D
global omega0
dydt=zeros(2,1);
% velocity component
dydt(1)=-omega0^2*y(2)-2*D*y(1);
% position component
dydt(2)=y(1);
return
This can also be written in matrix notation as

function dydt = f(t,y)
% necessary parameters as global variables
global D
global omega0
dydt=[-2*D  -omega0^2
         1         0 ]*y;
return

Obviously, the matrix

A = ( -2D  -omega0^2
        1          0 )

has eigenvalues, and the integration step is governed by A*dt, so errors in the time integration can be analyzed by analyzing the eigenvalues of A*dt. For this harmonic oscillator, the matrix is constant in time, so the eigenvalues of A*dt are constant in time as well, and the problem can also be analyzed purely analytically. For more complicated problems, like for the ordinary differential equations of the Lorenz attractor
dx/dt = σ(y - x)
dy/dt = rx - y - xz
dz/dt = xy - bz,

with real constants σ, r, b, the matrix of the ordinary differential equation is obviously a non-linear function, because the time evolution (dx/dt, dy/dt, dz/dt) cannot be written as the product of a matrix A independent of x, y, z and the vector (x, y, z), like in the case of the harmonic oscillator. The classical way to analyze the stability of such a system is to linearize the matrix, usually a risky business, because the linearized matrix is not guaranteed to reproduce the full behavior of the non-linear system. The modern approach is to simply perform the time integration and output representative values for the eigenvalues of the matrix

B = ( -σ     σ    0
      r-z   -1    0
       y     0   -b ).

Now let us solve the Lorenz model with constant stepsize using the Euler method and let us plot the eigenvalues of the matrix. We know already that the Euler method is bad, so our solution will be inaccurate, but it will be much more interesting to implement the Euler method in two different ways and see how the two solutions diverge from each other.
% Compute the Lorenz model
clear,format compact
n=20;
r=60;
b=8/3;
sigma=10;
t_max=1.3
dt=0.01 % diverges with this timestep: dt=0.011;
ndt=round(t_max/dt);
x=zeros(ndt,1);
mat_eig=zeros(ndt,1);
x(1)=1;
y=x; z=x;
k=[1;1;1];
k(:,2:ndt)=zeros(3,ndt-1);
prop=[-sigma sigma  0
          0    -1   0
          0     0  -b];
% solution of the ODE directly
for i=1:ndt-1
  dx=sigma*(y(i)-x(i))*dt;
  dy=(x(i)*(r-z(i))-y(i))*dt;
  dz=(x(i)*y(i)-b*z(i))*dt;
  x(i+1)=x(i)+dx;
  y(i+1)=y(i)+dy;
  z(i+1)=z(i)+dz;
end
% solution of the ODE with matrix-vector multiplications
for i=1:ndt-1
  prop(2,1)=(r-k(3,i));
  prop(3,1)=(k(2,i));
  k(:,i+1)=dt*(prop*k(:,i))+k(:,i);
  mat_eig(i+1)=max(abs(eig(prop*dt)));
end
subplot(4,1,1)
plot3(x(1:ndt),y(1:ndt),z(1:ndt));
subplot(4,1,2)
plot3(k(1,1:ndt),k(2,1:ndt),k(3,1:ndt));
subplot(4,1,3)
plot3(k(1,1:ndt)-x(1:ndt)',...
      k(2,1:ndt)-y(1:ndt)',...
      k(3,1:ndt)-z(1:ndt)');
subplot(4,1,4)
plot(mat_eig)
This is the first surprise: two implementations of the Euler method don't give numerically identical results, and the difference increases if we increase the maximal time. The next surprise comes when we increase the timestep from dt=0.01 to dt=0.013. We can see that then the solution and the maximal eigenvalues start to diverge, and if the maximal time is taken longer, the program even crashes because it reaches infinity. Here we have found the property of the solution of differential equations that the eigenvalues of the corresponding matrix times the time step may not become larger than 1, or the solution does not converge any more.
The eigenvalue spectrum obtained from the Euler method is also representative for the eigenvalues which we would obtain from higher-order methods like Runge-Kutta, which are themselves only a sophisticated concatenation of Euler steps with different step sizes.

5.6.5 Eigenvalue condition number

As in the case of the matrix inversion, there is a parameter which tells how accurately the computation of the eigenvalues can be performed. In MATLAB, the function which gives the eigenvalue condition numbers (different from the condition number cond for matrix inverses) is called condeig; it returns the reciprocals of the cosines of the angles between the left and right eigenvectors of the matrix.
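A small sketch of the difference between well- and ill-conditioned eigenvalues (this example is not from the script):

```matlab
condeig(eye(2))    % symmetric matrices: eigenvalue condition numbers are 1
A=[1 1000
   0 2];           % nearly parallel eigenvectors
condeig(A)         % large values: eigenvalues react strongly to perturbations
```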

Chapter 6
Ordinary differential equations
For ordinary differential equations there is a closed theory about which solution method should be applied in which case. In the case of ordinary differential equations, the total differential imposes additional constraints on the solution, so that the numerical equations can be satisfied more easily. In contrast, partial differential equations are much more difficult to treat numerically, because the boundary conditions impose certain constraints on the solution method, so that in the case of nonlinear equations, the optimal choice of a solution strategy is far from obvious.

6.1 Reference Example
6.1.1 Newton's equations of motion

Ordinary differential equations play an important role in science and engineering, and maybe the most central equation is Newton's equation of motion, which relates a time-, velocity- and position-dependent force F(x, ẋ, t), mass m and acceleration a:

F(x, ẋ, t) = m a.

Rewriting the equation with the second derivative of the position, ẍ, we get

F(x, ẋ, t) = m ẍ,

which, due to the second derivative of x, is called an ordinary differential equation of second order. In general, it can be shown that n ordinary differential equations of order m can be rewritten into n·m coupled differential equations of first order. For the case of Newton's equations of motion, this can be done by introducing the velocity v as the derivative of x, so that

F(x, v, t) = m v̇,   v = ẋ.

Because standard texts in numerical analysis prefer to deal with first order differential equations, it is important to understand the latter form.

6.1.2 Linear oscillator

For simplicity, we set the mass m = 1. If the force takes the form -2Dẋ - ω₀²x, which corresponds to a linear spring with linear damping, the equation of motion takes the form

ẍ = -2Dẋ - ω₀²x.

The solution of this equation with the damping term 2D and the frequency ω₀ of the undamped oscillation is

x(t) = x₀ exp(-Dt) cos(ω_D t),   ω_D = sqrt(ω₀² - D²).

Though there are solution schemes to solve second order equations directly, it is usually simpler to solve equations of second order by reducing them to a system of coupled first order equations. For our problem, we introduce the velocity v and its time derivative v̇ = a, so this leads to the system of first order equations

v̇ = -2Dv - ω₀²x,   ẋ = v.

It is customary in the mathematical community to introduce the vector y = (v, x) (without vector symbol) and to rewrite the equation as

dy/dt = F(y, t),

where F(y, t) becomes a vector-valued function with the time t and the vector y as arguments.

6.2 The Euler Method

When faced with a differential equation for a numerical implementation, intuitively one first wants to replace the differential operator d by the finite difference as

dy/dt ≈ (y(t + Δt) - y(t)) / Δt,

so that the first order approximation in the solution from one time-step t_i with value y(t_i) and the function value F_i = F(y_i, t_i) to the next is

y(t_i + Δt) ≈ y(t_i) + F_i Δt,

which is called the Euler method.
clear
format compact
D=.2 , omega0=1 % Damping and Force constant
x0=1 , v0=0 % Initial conditions
dt=0.1, t0=0, t_max=30 % time-step, start-time, end time
y(1,1)=v0
y(1,2)=x0
t(1)=t0
n=1
while (t(n)<t_max)
  % velocity
  y(n+1,1)=y(n,1)+dt*(-omega0^2*y(n,2)-2*D*y(n,1));
  % position
  y(n+1,2)=y(n,2)+dt*y(n,1);
  % current time
  t(n+1)=t(n)+dt;
  n=n+1;
end
% exact solution
omega_d=sqrt(omega0^2-D^2);
y_ex=(x0*exp(-D*t).*cos(omega_d*t));
subplot(2,2,1)
plot(t,y(:,2),'-',t,y_ex,':')
legend('Euler, dt=0.1','Exact')
axis tight

[Figure] Strategy of the Euler method: Evaluate the value at the right side of the interval via the starting value at the left side and the tangent at the left side of the interval.
[Figure] Result of the Euler method for the damped harmonic oscillator: The period and the amplitude are wrong.
By construction, the Euler method is a first-order method, because we only retained the terms proportional to Δt in the expansion. Of course, if we plot the absolute error for our exponentially vanishing solution, the error also vanishes exponentially; therefore we don't draw the error here.

6.2.1 Discussion of the Error

Geometrically speaking, the Euler method uses the starting value and the tangent at the same point for the integration, which is correct only in the infinitesimal limit. It can be seen that the results obtained from the Euler method are far from satisfying for our timestep of dt = 0.1. If we decrease the step size, we can reduce the discretization error per time step, but there is a limit, which is reached for the above differential equation with dt = 0.001: even using 1/10 and 1/100 of that timestep does not change the result significantly any more, at 10 times and 100 times the computational cost.
[Figure] Absolute error of the Euler method for the damped harmonic oscillator with time-steps dt = 10^-2, 10^-3, 10^-4, 10^-5.

For some ordinary differential equations, decreasing dt will also increase the rounding error (from adding up more timesteps) for integrating over the same time interval, so in the limit dt → 0 we won't obtain the correct result with the Euler method. Therefore there is one thing about the Euler method which should be kept in mind: NEVER USE THE EULER METHOD IN A SERIOUS APPLICATION¹
¹ Except for stochastic differential equations, where the stochastic noise destroys the systematic error, but even then there may be better choices ...

6.2.2 Modified Euler Method

There are several strategies which can be used to reduce the error in the Euler method. One possibility is to use for the timestep from t₀ to t₀ + dt the value of F(y, t₀ + dt/2) in the middle of the interval, instead of F(y, t₀) at the left of the interval, which results in a second order method. This is similar to the midpoint method in numerical quadrature, which gives a second order method, whereas the rectangular method with the value at one end of the interval gives only a quadrature rule of first order.
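For the damped harmonic oscillator, the loop of the Euler program above changes as follows (a sketch with the same variable names; this listing is not part of the original script):

```matlab
% modified (midpoint) Euler for the damped harmonic oscillator
D=.2; omega0=1;        % damping and force constant
dt=0.1; t_max=30;
y(1,1)=0; y(1,2)=1;    % v0=0, x0=1
t(1)=0; n=1;
while (t(n)<t_max)
  % half Euler step to the middle of the interval
  v_mid=y(n,1)+.5*dt*(-omega0^2*y(n,2)-2*D*y(n,1));
  x_mid=y(n,2)+.5*dt*y(n,1);
  % full step using the tangent evaluated in the middle
  y(n+1,1)=y(n,1)+dt*(-omega0^2*x_mid-2*D*v_mid);
  y(n+1,2)=y(n,2)+dt*v_mid;
  t(n+1)=t(n)+dt;
  n=n+1;
end
```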

[Figure] Strategy of the modified Euler method: Evaluate the value at the right side of the interval via the starting value at the left side of the interval and the tangent in the middle of the interval.
[Figure] Result of the modified Euler method for the damped harmonic oscillator: The period and the amplitude are computed much more accurately than for the Euler method.

6.2.3 Heun's method

Heun's method uses the value y₀ and the tangent F(y₀, t₀) at the left of the interval to compute the Euler step to the right of the interval, F(y₀ + dt F(y₀, t₀), t₀ + dt), as an estimate/prediction of the value at the right side of the interval; then it calculates as corrected value for F(y, t) the average between F(y₀, t₀) at the left-hand side of the interval and F(y₀ + dt F(y₀, t₀), t₀ + dt) at the right side.
Heun's method is a second order method, and there is a certain structural similarity to the trapezoidal rule in quadrature, where also the left-hand value and the right-hand value are used.
Nevertheless, there are some new perspectives about this method which allow the development of a new class of integration methods for ordinary differential equations which are of higher than second order:
1. The idea of first advancing the time integration in a predictor step, then modifying the result in a corrector step, is the basis of the so-called predictor-corrector methods.

2. The idea to use more than a single value of F(y, t) within a single time interval dt is the basis of the Runge-Kutta-type methods.

clear; format compact
D=.2 , omega0=1 % Damping and Force constant
x0=1 , v0=0 % Initial conditions
dt=0.1, t0=0, t_max=30 % time-step, start-time, end time
y(1,1)=v0
y(1,2)=x0
t(1)=t0
n=1
while (t(n)<t_max)
  % predictor step
  % velocity
  y_pred1=y(n,1)+dt*(-omega0^2*y(n,2)-2*D*y(n,1));
  % position
  y_pred2=y(n,2)+dt*y(n,1);
  % corrector step
  % velocity
  y(n+1,1)=y(n,1)+dt*(-omega0^2*.5*(y(n,2)+y_pred2)-2*D*.5*(y(n,1)+y_pred1));
  % position
  y(n+1,2)=y(n,2)+dt*.5*(y(n,1)+y_pred1);
  % current time
  t(n+1)=t(n)+dt;
  n=n+1;
end
% exact solution
omega_d=sqrt(omega0^2-D^2);
y_ex=(x0*exp(-D*t).*cos(omega_d*t));
subplot(2,2,1)
plot(t,y(:,2),'-',t,y_ex,':')
legend('Heun, dt=0.1','Exact')
axis tight

[Figure] Strategy of Heun's method: Evaluate the values and tangents at the left and right side of the interval as predicted values and take the average as corrected value.
[Figure] Result of Heun's method for the damped harmonic oscillator: The period and the amplitude are computed much more accurately than for the Euler method.

6.2.4 Stability

Up to now we have judged our investigations purely from the point of view of accuracy, in the sense that a numerical solution will have some finite error in comparison to the exact solution of the problem, but will have more or less the same shape. Actually, a more fundamental problem in numerical analysis is stability: loosely speaking, the question whether a numerical solution has the same shape as the exact solution at all. Consider the following first order differential equation,

dy/dt = 1 - t y^(1/3),

which for y(0) = 1 is strictly real in the interval [0,5]. The numerical solution overshoots for too large time-steps, as is shown in the following graphs for the numerical solution with the Euler method, so that y(t) becomes negative, and MATLAB therefore delivers the complex roots of the negative values of y(t). The result of the numerical integration for too large time-steps shows a totally different shape than the exact solution, and is therefore called unstable. It is therefore a primary aim to choose numerical methods and time-steps so that the solution is stable; accuracy is only a secondary concern.
Regrettably, some methods which give very high accuracy for some problems give very poor stability for other problems. It is often advisable to check the stability of a method by using different time steps and to see whether the numerical solution changes or not. If a small change of the time-step leads to only a small change in the solution, the solution is stable. The mathematical definition of stability is that a solution undergoes only a small change for a small change of the initial conditions, and in this respect, the time-step represents something like an initial condition.
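The experiment behind the following figure can be sketched as follows (assumed parameters; this is not the original plotting code):

```matlab
% Euler integration of dy/dt = 1 - t*y^(1/3) for different time-steps
for dt=[0.5 0.25 0.125 0.05]
  t=0:dt:5;
  y=zeros(size(t)); y(1)=1;
  for n=1:length(t)-1
    % for large dt, y overshoots below 0; y^(1/3) then becomes complex
    y(n+1)=y(n)+dt*(1-t(n)*y(n)^(1/3));
  end
  plot(t,real(y)); hold on
end
```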

[Figure] Real and imaginary components of the numerical solution of dy/dt = 1 - t y^(1/3) with the Euler method for dt = 0.5, 0.25, 0.125, 0.05, compared with the exact solution.

6.3 Programming Ordinary differential equations
6.3.1 Readability

Before we proceed to higher order formulae, we should improve the readability of the code. Here is a good opportunity to introduce MATLAB functions. For the Euler method, we had
% velocity
y(n+1,1)=y(n,1)+dt*(-omega0^2*y(n,2)-2*D*y(n,1));
% position
y(n+1,2)=y(n,2)+dt*y(n,1);
As was emphasized in the first chapter, readability is paramount in programming, and whereas the Euler algorithm was still quite readable, for Heun's method we had to write
% velocity
y_pred1=y(n,1)+dt*(-omega0^2*y(n,2)-2*D*y(n,1));
% position
y_pred2=y(n,2)+dt*y(n,1);
% velocity
y(n+1,1)=y(n,1)+dt*(-omega0^2*.5*(y(n,2)+y_pred2)-2*D*.5*(y(n,1)+y_pred1));
% position
y(n+1,2)=y(n,2)+dt*.5*(y(n,1)+y_pred1);
and this is certainly not readable any more. One insight is that the force law for the time integration of the spring was entered twice, once for the predictor step and once for the corrector step, so it would be a good idea to implement the force law evaluation as a MATLAB function.

6.3.2 Global variables

Most ordinary differential equations do not only need the time as an input parameter, which we specified in the above example, but they need other input parameters as well. One of the simplest ways to incorporate such input parameters is via MATLAB's global attribute, which allows the specification of global variables; these are of course not limited in their use to functions for ordinary differential equations.

6.3.3 MATLAB functions

A typical MATLAB function is the following code, where the constructor brackets [] have to be used for the output arguments and the round brackets () for the input arguments:
function [output_arg1,output_arg2]=function_name(input_arg1,input_arg2);
% Comment following the function declaration; this comment will be displayed
% when you type
% "help function_name"
% from the MATLAB prompt.
global a
% a global variable, which must be declared as global
% somewhere else and initialized
output_arg1=input_arg1+input_arg2*a
output_arg2=input_arg1*input_arg2
return % end of function
The function is called for example as
[out1,out2]=function_name(25,24)
If not all input arguments are supplied in the function call, like
[out1,out2]=function_name(25)
MATLAB terminates with an error message as soon as the missing argument is used. MATLAB functions cannot override input arguments; in the following example,

function [output_arg1,output_arg2]=function_name(input_arg1,input_arg2);
output_arg1=input_arg1+input_arg2
output_arg2=input_arg1*input_arg2
input_arg1=15
return % end of function
the line
input_arg1=15
does not have any effect in the calling program, because only output arguments (in constructor brackets []) are copied back to the calling program. If not all output or input arguments are assigned, MATLAB terminates with an error message. Overloading, the use of a variable number of input arguments, is possible; in this case one has to query the number of input arguments with the MATLAB function nargin and the number of output arguments with nargout. We will not treat overloading here, but it is easy to find examples for overloaded functions by looking at some MATLAB functions which exist as MATLAB code (most MATLAB functions are written in MATLAB) in the toolbox directory.
which hist
will display the directory in which the toolbox MATLAB function hist can be found, and it is possible to load the function into the editor and view the usage of overloading.

6.3.4 MATLAB-functions for ODE-Solvers

It is customary to write MATLAB functions for ODE solvers with a header like

function dydt = f(t,y)

with the time t and the generalized coordinates y as input and the first order derivative of the generalized coordinates dydt as output. We can rewrite Heun's method
% velocity
y_pred1=y(n,1)+dt*(-omega0^2*y(n,2)-2*D*y(n,1));
% position
y_pred2=y(n,2)+dt*y(n,1);
% velocity
y(n+1,1)=y(n,1)+dt*(-omega0^2*.5*(y(n,2)+y_pred2)-2*D*.5*(y(n,1)+y_pred1));
% position
y(n+1,2)=y(n,2)+dt*.5*(y(n,1)+y_pred1);
using the MATLAB function (we will retain the time as function argument though there is no explicit time-dependence in our force law)

function dydt = f(t,y)
% necessary parameters as global variables
global D
global omega0
dydt=zeros(2,1);
% velocity component
dydt(1)=-omega0^2*y(2)-2*D*y(1);
% position component
dydt(2)=y(1);
return
as
clear; format compact
global D, D=.2 , global omega0, omega0=1 % Damping and Force constant
x0=1 , v0=0 % Initial conditions
dt=0.1, t0=0, t_max=30 % time-step, start-time, end time
y(1,1)=v0
y(1,2)=x0
t(1)=t0
n=1
while (t(n)<t_max)
  % predicted value
  y_pred=y(n,:)+dt*f(t(n),y(n,:));
  % corrected value
  y(n+1,:)=y(n,:)+.5*dt*(f(t(n),y(n,:))+f(t(n)+dt,y_pred));
  t(n+1)=t(n)+dt;
  n=n+1;
end
% exact solution
omega_d=sqrt(omega0^2-D^2);
y_ex=(x0*exp(-D*t).*cos(omega_d*t));
subplot(2,2,1)
plot(t,y(:,2),'-',t,y_ex,':')
legend('Heun, dt=0.1','Exact')
axis tight
which is much more readable than the original. This also allows us to see how many function evaluations are necessary for a given timestep: whereas for the original Euler method we used one function evaluation per timestep, for Heun's method we need two function evaluations per timestep. Therefore, Heun's method is not only more accurate, but also more costly. Higher order methods will need even more function evaluations. We therefore rewrite Heun's method as a function, along with the computation of the number of steps and a check of the reasonableness of the input parameters. The feval command of MATLAB is used so that the MATLAB file which contains the differential equation can be passed as an argument. Moreover, we have initialized tout and yout in advance so that we don't lose time by allocating new memory space when a new element is added to these vectors in each timestep:
function [tout,yout] = heun(yinit,tstart,tend,dt,f)
% Heun's method:
% Runge-Kutta integrator (2nd order)
% Input arguments:
%   yinit  = initial value of the dependent variable
%   tstart = start of the independent variable (usually time)
%   tend   = end of the independent variable
%   dt     = step size (usually timestep)
%   f      = right hand side of the ODE; f is the
%            name of the function which returns dy/dt,
%            calling format f(t,y)
% Output arguments:
%   tout = vector of times
%   yout = values of y after each stepsize dt
nsteps=ceil((tend-tstart)/dt)
dt
if nsteps<0
  tstart
  tend
  dt
  error('tend-tstart is not a positive multiple of dt')
end
if (abs(nsteps*dt-(tend-tstart))>1e-6*dt)
  disp('warning: time interval not a multiple of timestep')
  disp('inputed timestep:')
  dt
  dt=(tend-tstart)/nsteps;
  disp('use instead')
  dt
end
if (size(yinit,1)==1)
  yinit=yinit'; % make sure yinit is a column vector
end
% allocate the necessary memory in advance to save time:
yout=zeros(length(yinit),nsteps+1);
tout=zeros(1,nsteps+1);
yout(:,1)=yinit;
y=yinit;
tout(1)=tstart;
n=1;
for k=1:nsteps
  F1 = feval(f,tout(n),y);
  t_full = tout(n) + dt;
  ytemp = y + dt*F1;
  F2 = feval(f,t_full,ytemp);
  n=n+1;
  y = y + .5*dt*(F1 + F2);
  yout(:,n)=y;
  tout(n)=t_full;
end
return
This program can then be called from a driver routine (a routine which does nothing else than call a specific function) in such a way:
clear;
format compact
global D, D=.2 ,
global omega0, omega0=1 % Damping and Force constant
omega_d=sqrt(omega0^2-D^2);
dt=0.1, t0=0, t_max=20 % time-step, start-time, end time
x0=1 % Initial conditions
v0=-D*exp(-D*t0)*cos(omega_d*t0)-omega_d*exp(-D*t0)*sin(omega_d*t0);
[t,y]=heun([v0;x0],t0,t_max,dt,'harm_osc');
% exact solution
y_ex=(x0*exp(-D*t).*cos(omega_d*t));
subplot(2,2,1)
plot(t,(y(2,:)-y_ex)./y_ex,':')
legend('Heun, dt=0.1')
axis tight

6.4 The classical Runge-Kutta formula

6.4.1 The Idea

The idea of evaluating not only a single integration point, but computing within a single
timestep additional integration points from previously computed auxiliary values, is realized in
the so-called Runge-Kutta algorithm. The formulae for the so-called classical Runge-Kutta
method are

y_{i+1} = y_i + (dt/6) (k_1 + 2 k_2 + 2 k_3 + k_4)
k_1 = f(t_i, y_i)
k_2 = f(t_i + dt/2, y_i + k_1 dt/2)



k_3 = f(t_i + dt/2, y_i + k_2 dt/2)
k_4 = f(t_i + dt, y_i + k_3 dt)

It uses four evaluations F1, F2, F3, F4 of the right hand side at the intermediate times
t_0, t_0 + dt/2, t_0 + dt/2 and t_0 + dt.
F2 is computed from F1, F3 from F2 and F4 from F3. Afterwards, the new y is computed
as a weighted average of F1, F2, F3, F4. The function then looks like this:
function [tout,yout] = rk4_class(yinit,tstart,tend,dt,f)
% Classical Runge-Kutta integrator (4th order)
% Input arguments:
%   yinit  = initial value of the dependent variable
%   tstart = start of the integration interval
%   tend   = end of the integration interval
%   dt     = step size (usually timestep)
%   f      = name of the function which returns dy/dt,
%            the right hand side of the ODE;
%            calling format f(y,t)
% Output arguments:
%   tout = row vector of times
%   yout = values of y at the times in tout
nsteps=ceil((tend-tstart)/dt)
if nsteps<0
  tstart
  tend
  dt
  error('tend-tstart is not a positive multiple of dt')
end
if (abs(nsteps*dt-(tend-tstart))>1e-6*dt)
  disp('warning: time interval not a multiple of timestep')
  disp('inputted timestep:')
  dt
  dt=(tend-tstart)/nsteps;
  disp('using instead:')
  dt
end
if (size(yinit,1)==1)
  yinit=yinit'; % make sure yinit is a column vector
end
% allocate necessary memory to save time:
yout=zeros(length(yinit),nsteps+1);
tout=zeros(1,nsteps+1);
yout(:,1)=yinit;
y=yinit;
tout(1)=tstart;


n=1;
half_dt = 0.5*dt;
dt_6=dt/6;
for k=1:nsteps
F1 = feval(f,y,tout(n));
t_half = tout(n) + half_dt;
ytemp = y + half_dt*F1;
F2 = feval(f,ytemp,t_half);
ytemp = y + half_dt*F2;
F3 = feval(f,ytemp,t_half);
t_full = tout(n) + dt;
ytemp = y + dt*F3;
F4 = feval(f,ytemp,t_full);
y = y + dt_6*(F1 + F4 + 2.*(F2+F3));
n=n+1;
yout(:,n) = y;
tout(n)=t_full;
end
return;
The same driver as for Heun's program above can be used, just with the line
[t,y]=heun([v0;x0],t0,t_max,dt,'harm_osc');
replaced by
[t,y]=rk4_class([v0;x0],t0,t_max,dt,'harm_osc');

6.4.2 The importance of the initial condition

In the sections on Euler's and Heun's method, we used the initial condition x0 = 1, v0 = 0 to
compare the numerical result with the exact solution

y_ex = x0 exp(-D t) cos(omega_d t)

Actually, this is not the exact solution for the initial value problem with v0 = 0, but
for the initial value problem with

v0 = -D exp(-D t0) cos(omega_d t0) - omega_d exp(-D t0) sin(omega_d t0)

The improvement of the Runge-Kutta method over Heun's method would not have been
visible, because the initial value for the integration is so far off that the numerical solution
is quite wrong. Whenever one computes numerical solutions to compare them with exact
solutions, one should make sure that they are solutions of the identical problem.
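To see the effect, one can run Heun's method once with the correct and once with the naive initial velocity and compare both with the exact solution; a sketch along the lines of the driver above (it assumes the heun function and the harm_osc file from this script are on the path):

```matlab
% Sketch: Heun's method with the correct and the incorrect
% initial velocity, compared with the exact solution
global D, D=.2
global omega0, omega0=1
omega_d=sqrt(omega0^2-D^2);
dt=0.1; t0=0; t_max=20; x0=1;
% initial velocity from the derivative of the exact solution:
v0_correct=-D*exp(-D*t0)*cos(omega_d*t0)-omega_d*exp(-D*t0)*sin(omega_d*t0);
v0_wrong=0;
[t,y_good]=heun([v0_correct;x0],t0,t_max,dt,'harm_osc');
[t,y_bad ]=heun([v0_wrong ;x0],t0,t_max,dt,'harm_osc');
y_ex=x0*exp(-D*t).*cos(omega_d*t);
% with the wrong v0, the deviation is dominated by the offset
% in the initial condition, not by the integrator:
max(abs(y_good(2,:)-y_ex))
max(abs(y_bad(2,:)-y_ex))
```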


[Figure: solution with the correct and the incorrect initial condition, compared with the exact solution]

6.4.3 Accuracy

Now that we have outlined several algorithms of different order (different truncation error
with respect to the Taylor expansion in dt), we should compare the above methods with
respect to their cost and accuracy. As has been mentioned above, the cost of a
Runge-Kutta step is four function evaluations per timestep, in contrast to a single function
evaluation for Euler and two function evaluations for Heun. Let us compare the accuracy
of the three methods, once for the absolute accuracy y_computed - y_exact, and once for
the relative accuracy (y_computed - y_exact)/y_exact. For the Euler method, we obtain an
exponentially decaying absolute error, due to the fact that the solution decays exponentially,
while the relative error increases exponentially. The absolute error starts at the order of 10^-2,
which is the square of the timestep, (dt)^2 = (0.1)^2, as was expected.
[Figure: Euler, dt=0.1: absolute error (left) and relative error (right)]

For Heun's method, the absolute error starts at the order of 10^-3, which is the third power
of the timestep, (dt)^3 = (0.1)^3, which was also expected. The relative error is constant for a certain
time, and then increases exponentially.



[Figure: Heun, dt=0.1: absolute error (left) and relative error (right)]

For the classical method by Runge and Kutta, the absolute error starts at the order of
10^-6, which is, by sheer luck, one order more accurate than the fifth power of the timestep,
(dt)^5 = (0.1)^5, which was expected as the absolute error. Again, as for Heun's method, the
relative error is constant for a certain time, and then diverges exponentially.
[Figure: Classical Runge-Kutta, dt=0.1: absolute error (left) and relative error (right)]

The last investigations have reviewed some old concepts and shown some important new
concepts for error analysis:
1. The orders of the Euler, Heun and Runge-Kutta methods are 1, 2 and 4, respectively;
therefore the absolute error at the beginning of the integration process is of order
n+1 in the timestep, i.e. dt^2, dt^3 and dt^5, for an initial amplitude of the order of 1.
2. The local error is the error for a single timestep, and the local absolute error at the
beginning of the integration is the same as the local relative error.
3. The behavior of the relative error is a bit more complicated: as can be seen, the relative
error increases during the integration process, but not monotonically. The error at
the end of the integration process is called the global error, and it can be seen that
the global relative error is much larger than the local absolute error. Whenever one
performs a time-integration of ordinary differential equations, one should know the
actually permitted error, and this is determined by the physical problem.
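These orders can also be checked empirically: halving the timestep should reduce the global error of a method of order p by a factor of about 2^p. A sketch, assuming the rk4_class function and the harm_osc file from this script are on the path:

```matlab
% Empirical order check: for the 4th order Runge-Kutta method,
% halving dt should shrink the error by roughly 2^4=16
global D, D=.2
global omega0, omega0=1
omega_d=sqrt(omega0^2-D^2);
t0=0; t_max=20; x0=1;
v0=-D*exp(-D*t0)*cos(omega_d*t0)-omega_d*exp(-D*t0)*sin(omega_d*t0);
for dt=[0.1 0.05 0.025]
  [t,y]=rk4_class([v0;x0],t0,t_max,dt,'harm_osc');
  y_ex=x0*exp(-D*t).*cos(omega_d*t);
  % maximal absolute error over the whole interval:
  disp([dt max(abs(y(2,:)-y_ex))])
end
```

The same loop with heun instead of rk4_class should show the errors shrinking by a factor of about 4.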

6.5 Adaptive Stepsize Control

It is possible to construct Runge-Kutta schemes using redundant evaluations of f(y, t) so
that the timestep can be computed in order n and in order n+1 simultaneously. One of the
first such methods was proposed by Fehlberg for fourth and fifth order; another, currently
very popular method of the same order is the more stable scheme by Prince and Dormand.
The knowledge of both orders allows one to estimate the error of the solution, and one can
therefore devise strategies to reduce the timestep if the error is too large, or to increase the
timestep if the solution is more accurate than desired (and therefore takes too much computer
time). Such Runge-Kutta methods are built into MATLAB as ode23 and ode45, and with the driver
clear
format compact
global D, D=.2
global omega0, omega0=1 % Damping and Force constant
omega_d=sqrt(omega0^2-D^2);
dt=0.1, t0=0, t_max=50 % time-step, start-time, end time
x0=1 % Initial conditions
v0=-D*exp(-D*t0)*cos(omega_d*t0)-omega_d*exp(-D*t0)*sin(omega_d*t0);
[t,y]=ode23('harm_osc2',[t0 t_max],[v0 x0]);
y_ex=(x0*exp(-D*t).*cos(omega_d*t));
subplot(2,2,1)
semilogy(t,abs(y(:,2)-y_ex),':')
title('ode23, absolute error')
axis tight
subplot(2,2,2)
semilogy(t,abs((y(:,2)-y_ex)./y_ex),':')
title('ode23, relative error')
axis tight
and the file for the differential equation, harm_osc2.m:
function dydt=harm_osc2(t,y)
format compact
global D
global omega0
% d velocity/dt
dydt(1,1)=-omega0^2 * y(2) - 2*D*y(1);
% d position/dt
dydt(2,1)=y(1);
return
This file is different from our previously used harm_osc.m file in that the order of the input
parameters t and y is exchanged, as required by MATLAB's ODE solvers. The following solution
for our damped harmonic oscillator has been computed:


[Figure: ode23 solution of the damped harmonic oscillator (top) and the timestep chosen by the integrator (bottom)]

Above the solution was plotted, below we see the timestep. The time-adaption algorithm
changed the timestep depending on whether the oscillation was at a relative minimum or in
a straight motion. The accuracy of the time integration can be set by the input parameters
of the ode23 function, see help ode23. The accuracy diagram for the default accuracy is the
following:
[Figure: ode23, default accuracy: absolute error (left) and relative error (right)]
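The tolerances mentioned above are passed to the solver as an options structure created with odeset; a minimal sketch, using the variable names of the driver above:

```matlab
% Tighten the error tolerances of the adaptive integrator;
% RelTol and AbsTol are documented under "help odeset"
options=odeset('RelTol',1e-6,'AbsTol',1e-9);
[t,y]=ode23('harm_osc2',[t0 t_max],[v0 x0],options);
% the price for the higher accuracy is a larger number
% of (smaller) timesteps:
length(t)
```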

The same plots can be made for the ode45 algorithm, which gives the following accuracy
diagram



[Figure: ode45 accuracy: absolute error (left) and relative error (right)]

The computed solution and the timestep are


[Figure: ode45 solution (top) and timestep (bottom)]

and it can be seen that MATLAB starts with a very small timestep and then increases the
timestep significantly while still reaching the default accuracy of the time integrator. The advantages of
these adaptive methods for "reasonable" ordinary differential equations are:
- One can specify the relative and absolute errors on input, and obtain a solution which
is guaranteed to be inside the specified errors.
- The performance is optimal, i.e. for the given method there will be no solution which
can be computed with fewer timesteps/less computer time.
- Without knowing anything about the system, or the relation between the timestep
and the error resulting from the timestep for the given set of equations, one obtains a


correct solution.
Therefore, it is always a good idea to start the investigation of a problem with the above
methods. But there are some caveats for the case of "unreasonable" differential equations,
and these are systems which are often encountered in daily life; they are treated in the next
subsection.

6.5.1 Problems with adaptive stepsize control

Adaptive stepsize control needs some assumptions about the smoothness of the treated
differential equations. There are some notorious physical situations which lead to non-smooth
problems:
- Coulomb friction: If an ordinary differential equation contains terms with
the sign of a function, like in the case of Coulomb friction,
F_Coul = -mu f_n sign(v),
it may happen that the solution of the equation is not smooth enough, so that even a
reduction of the timestep does not lead to the same solution for the different orders of the
function evaluation of the ODE-solver. In that case, the solver stops, or it continues
only with very small timesteps, so that the solution is not finished within finite time.
- Bouncing balls: If an object flies in free motion in a gravitational field, its trajectories
are parabolic. If it hits a target, the motion is suddenly reversed. For the numerical
time integration, the free motion allows a very large timestep, whereas in the moment
where the target is hit, the timestep has to be reduced drastically. It is possible that
numerical solvers with adaptive stepsize control are not able to reduce the stepsize
appropriately, so that in the simulation the impacting particle may not be reflected, but
may fly through the target. The risk of such a mishap is higher for higher order
solvers, e.g. of 8th order.
Because the adaptive stepsize control needs some information about how the timestep must
be reduced, MATLAB allows one to specify the way in which the timestep is changed via
the odeset options command.
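For the bouncing-ball case, MATLAB's solvers can locate the impact exactly via the 'Events' entry of that options structure, instead of relying on the stepsize control alone. A sketch for a ball bouncing on the floor z = 0; the function and variable names here are our own, not from this script:

```matlab
function bounce_demo
% ball in free fall, reflected at z=0 with restitution e=0.9;
% the event function stops the integration exactly at impact
z0=1; v0=0; e=0.9;
options=odeset('Events',@hit_floor);
tall=[]; zall=[]; t0=0;
for k=1:5                       % five successive flight phases
  [t,y]=ode45(@fall,[t0 t0+10],[z0 v0],options);
  tall=[tall; t]; zall=[zall; y(:,1)];
  t0=t(end);                    % impact time found by the solver
  z0=0; v0=-e*y(end,2);         % reverse and damp the velocity
end
plot(tall,zall)

function dydt=fall(t,y)
g=9.81;
dydt=[y(2); -g];

function [value,isterminal,direction]=hit_floor(t,y)
value=y(1);        % event when the height crosses zero
isterminal=1;      % stop the integration there
direction=-1;      % only when falling
```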

6.5.2 Coulomb Friction

In contrast to the friction of and in fluids, which for small velocities v is proportional to the
velocity, Coulomb friction, the friction of solid on solid surfaces, depends on the velocity only
via its sign. Using the Coulomb friction coefficient mu and the normal force f_n, we can
write the Coulomb friction as
F_Coul = -mu f_n sign(v).
Obviously, this force law has a jump at v = 0, and we know from physics that the friction
F_Coul can take any value from -mu f_n to +mu f_n for v = 0. Actually, there is a method to solve


such an undetermined problem in a numerically exact way², but we will just try to use
the adaptive stepsize control in the hope that we get a reasonable solution by decreasing
the stepsize. Using the ordinary differential equation

y'' = -y - 2 D sign(y'),

let us check the output for different values of D using the programs below, and let us look
at the timestep:

[Figure: solution of the Coulomb-friction oscillator for D=0.05 and D=0.1 (left) and the timestep dt used by ode23 (right)]

It can be seen that for D = 0.05, as long as the oscillation resembles the oscillation of the
damped harmonic oscillator, the timestep is comparable to the one one expects for the
harmonic oscillator. For D = 0.1, where zero amplitude is reached, the timestep goes down
by several orders of magnitude to guarantee the vanishing of the amplitude, and the
integration is slowed down by several orders of magnitude in comparison with the damped
harmonic oscillator.

clear
format compact
global D
tmax=7
D=0.05
[t1,y1]=ode23('lin_coul_osc',[0 tmax],[1 0]);
t1_plot=linspace(0,max(t1),2*length(t1));
y1_plot=interp1(t1,y1,t1_plot,'spline');
D=0.1
[t2,y2]=ode23('lin_coul_osc',[0 tmax],[1 0]);
t2_plot=linspace(0,max(t2),2*length(t2));
y2_plot=interp1(t2,y2,t2_plot,'spline');
t2_plot=[t2_plot tmax];
y2_plot=[y2_plot ; [0,1]];
subplot(2,2,1)
plot(t1_plot,y1_plot(:,1),...
     t2_plot,y2_plot(:,1),':')
axis tight
legend('D=0.05','D=0.1')
title('d^2x/dt^2 + 2 D sign(dx/dt) + x = 0')
subplot(2,2,2)
semilogy(t1(2:end),diff(t1),t2(2:end),diff(t2),':')
ylabel('dt')
xlabel('timestep')
legend('D=0.05','D=0.1')
axis([0 7 1e-5 1])
return

function dydt = lin_coul_osc(t,y)
% lin_coul_osc.m
global D
dydt = [y(2); -y(1)-2*D*sign(y(2))];
return
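A common workaround, not used in this script, is to regularize the jump by replacing sign(v) with tanh(v/epsilon) for a small epsilon; this makes the right hand side smooth, at the price of slightly falsifying the force law near v = 0. A hypothetical variant of lin_coul_osc.m:

```matlab
function dydt = lin_coul_osc_reg(t,y)
% regularized Coulomb oscillator: sign(v) is replaced by
% tanh(v/epsilon); epsilon is a numerical tuning parameter,
% not a physical quantity
global D
epsilon=1e-3;
dydt=[y(2); -y(1)-2*D*tanh(y(2)/epsilon)];
return
```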

² Hairer et al., Solving Ordinary Differential Equations I, Springer

6.6 Stiff differential equations

There is a class of differential equations which are called "stiff". They usually involve two
very different timescales/periods for the oscillation of a phenomenon, like in the case of the
Van der Pol equation

y_1' = y_2
y_2' = mu (1 - y_1^2) y_2 - y_1,

which for values of mu of the order of 1 is an absolutely ordinary set of differential
equations, but for increasing mu, usual integrators need very small timesteps for the
integration process. For mu = 500, we have computed the solution using the standard
ode23 integrator and the stiff variant ode23s, and it can be seen that the stiff integrator
uses much longer timesteps to obtain the same result, and is ten times as fast.
MATLAB offers several stiff solvers, among them ode23s, which is of Runge-Kutta type,
and ode15s, which has several options for the choice of the solution method.
Currently, there is no clear definition of when an ordinary differential equation is stiff,
because it is not always possible to identify timescales in the system. The current
heuristic definition is: a stiff differential equation is a differential equation for which a
stiff solver works much better than an ordinary solver.
If very high accuracy is necessary for the solution of the system, the timestep of the stiff
solver is reduced to the timestep of the ordinary solver.

clear
format compact
global mu, mu=500
t_max=1.5
tic
[t1,y1]=ode23('vanderpol',[0 t_max],[2 0]);
t1_plot=linspace(0,max(t1),2*length(t1));
y1_plot=interp1(t1,y1,t1_plot,'spline');
toc
tic
[t2,y2]=ode23s('vanderpol',[0 t_max],[2 0]);
t2_plot=linspace(0,max(t2),2*length(t2));
y2_plot=interp1(t2,y2,t2_plot,'spline');
toc
subplot(2,2,1)
plot(t1,y1(:,1),'*',t1_plot,y1_plot(:,1))
title('van der Pol ODE, ode23')
subplot(2,2,2)
semilogy(diff(t1))
ylabel('timestep')
subplot(2,2,3)
plot(t2,y2(:,1),'*',t2_plot,y2_plot(:,1))
title('van der Pol ODE, ode23s')
subplot(2,2,4)
semilogy(diff(t2))
ylabel('timestep')

function dydt = vanderpol(t,y)
% vanderpol.m
global mu
dydt=[y(2); mu*(1-y(1)^2)*y(2)-y(1)];
return
For the Coulomb friction example of the previous section, the solution using a
stiff solver is not better than for the ordinary integrator. On the contrary, for some parameters


it may happen that the solver reduces the timestep to numerically 0 and the solution process
terminates with an error message.
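The difference in efficiency between the two solvers can be quantified by simply comparing the number of timesteps they take over the same interval; a sketch using the vanderpol file from above:

```matlab
% Count the steps of the ordinary and the stiff solver
% for the stiff van der Pol equation (mu=500)
global mu, mu=500
[t1,y1]=ode23 ('vanderpol',[0 1.5],[2 0]);
[t2,y2]=ode23s('vanderpol',[0 1.5],[2 0]);
% the stiff solver needs far fewer steps:
length(t1)
length(t2)
```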
[Figure: van der Pol ODE, mu=500: solution (left) and timestep (right) for ode23 (top) and ode23s (bottom)]

6.7 Symplectic differential equations
The examples we have discussed up to now were implementations of Newton's equation of
motion where the system underwent continuous energy loss. Actually, there is a huge class
of systems obeying Newton's equation of motion for which no energy loss occurs; examples
are systems of atoms and molecules. Such systems are called symplectic, and can be written
via canonical equations (using generalized coordinates and generalized momenta).

6.7.1 Störmer-Verlet Method

In the previous examples, we always included some damping in the system and rewrote the
ordinary differential equation as a system of coupled first order differential equations. For the
most widely used symplectic (energy-conserving) time integrator, it is not necessary to rewrite
the differential equation from second to first order; on the contrary, this method
is not able to handle velocity-dependent (first order) terms at all. Using the acceleration a
(= force/mass), we can write the Verlet method for the coordinate x as

x_{i+1} = 2 x_i - x_{i-1} + a_i dt².


As one can see, this method uses not only the information of the current timestep (x_i and
a_i), but also the information from the previous timestep (x_{i-1}); it is therefore a so-called
multistep method, a method which uses information from several steps. The previously
described Runge-Kutta methods are members of the class of so-called one-step methods.
A possible implementation of the Verlet method, with computation of the number of
timesteps, is shown below. As can be seen from the formula for the Verlet algorithm, when
we start at timestep 0, we also need the position at timestep -dt, which is computed in
the following example program before the loop, in an approximate and unsatisfying way.
Because multistep methods don't allow by themselves the computation of the previous
timesteps, they are called non-self-starting, in contrast to one-step methods, which are
called self-starting. In conventional implementations of multistep methods, usually a
self-starting method is used at the beginning to compute the previous timestep.
Because Verlet-type algorithms are mostly used for molecular simulations, where the
details of the initial conditions don't matter, at least not up to an error of order dt, for
practical applications it is no problem that the Verlet method is not self-starting.


function [tout,yout] = verlet(yinit,tstart,tend,dt,f)
% Stoermer-Verlet method:
% symplectic method (2nd order)
% Input arguments:
%   yinit  = initial value [velocity; position]
%   tstart = start of the integration interval
%   tend   = end of the integration interval
%   dt     = step size (usually timestep)
%   f      = name of the function which returns the
%            acceleration, calling format f(x)
% Output arguments:
%   tout = row vector of times
%   yout = values of y at the times in tout
nsteps=ceil((tend-tstart)/dt)
dt
if nsteps<0
  tstart
  tend
  dt
  error('tend-tstart is not a positive multiple of dt')
end
if (abs(nsteps*dt-(tend-tstart))>1e-6*dt)
  disp('warning: time interval not a multiple of timestep')
  disp('inputted timestep:')
  dt
  dt=(tend-tstart)/nsteps;
  disp('using instead:')
  dt
end
if (size(yinit,1)==1)
  yinit=yinit'; % make sure yinit is a column vector
end
yout=zeros(length(yinit),nsteps+1);
tout=zeros(1,nsteps+1);
yout(:,1)=yinit;
y=yinit;
tout(1)=tstart;
n=1;
dt2=dt*dt;
% compute the position one timestep BEFORE the initial time;
% this implementation is BAD, as it is one order less
% accurate than the Verlet scheme itself!


F1 = feval(f,yout(2,1));
% first order approximation of the position at time tstart-dt:
y_mdt=yout(2,1)-yout(1,1)*dt;
yout(2,2)=2*yout(2,1)-y_mdt+F1*dt2;
tout(2)=tstart+dt;
for k=2:nsteps
  F1 = feval(f,yout(2,k));
  t_full = tout(k) + dt;
  yout(2,k+1)=2*yout(2,k)-yout(2,k-1)+F1*dt2;
  tout(k+1)=t_full;
end
return

6.7.2 Precision

We compare the numerical solution for the harmonic oscillator without damping
function out=verlet_lin_osc(in)
% verlet-lin-osc
% linear oscillator with frequency omega
% for use with verlet-type integrator
global omega0
out=-omega0^2*in;
return
using the main program
clear
format compact
global D, D=0
global omega0, omega0=1 % Damping and Force constant
omega_d=omega0;
dt=0.01, t0=0, t_max=300 % time-step, start-time, end time
x0=1 % Initial conditions
v0=-D*exp(-D*t0)*cos(omega_d*t0)-omega_d*exp(-D*t0)*sin(omega_d*t0);
tic
[t,y]=verlet([v0 x0],t0,t_max,dt,'verlet_lin_osc');
y_ex=(x0*cos(omega_d*t)); % exact solution
[rkt2,rky2]=ode23('lin_osc',[t0 t_max],[v0 x0]);
y_rk2=(x0*exp(-D*rkt2).*cos(omega_d*rkt2)); % exact solution


[rkt4,rky4]=ode45('lin_osc',[t0 t_max],[v0 x0]);
y_rk4=(x0*exp(-D*rkt4).*cos(omega_d*rkt4)); % exact solution
subplot(3,1,1)
semilogy(t,abs(y(2,:)-y_ex))
title('Error for verlet')
axis tight
subplot(3,1,2)
semilogy(rkt2,abs(rky2(:,2)-y_rk2))
title('Error for ode23')
axis tight
subplot(3,1,3)
semilogy(rkt4,abs(rky4(:,2)-y_rk4))
title('Error for ode45')
axis tight
return
[Figure: absolute error over time for the Verlet method (top), ode23 (middle) and ode45 (bottom)]

It can be seen that the Verlet algorithm has a larger error for the initial timesteps, due to
our first-order computation of the position before the start. Nevertheless, the error bound is
constant over the whole integration interval. The remarkable property of the Verlet method is that its


global error is the same as its local error.
Though ode23 and ode45 from MATLAB start with a much smaller error, their global error
is proportional to the integration time, i.e. it grows with time beyond all bounds. Physically that
means that if non-symplectic algorithms like Runge-Kutta are used for energy-conserving
systems, the energy will drift significantly over the integration interval. Verlet-type
algorithms are stable even for millions of integration steps.

6.7.3 Velocities

The Verlet method only makes use of the coordinates, not of the velocities. Because the
velocities don't occur in the equations, they can only be estimated using the relation

v_i = (r_{i+1} - r_{i-1}) / (2 dt),

so that the velocity of a timestep is only known after the completion of the following
timestep. Therefore, it is not possible to incorporate velocity-dependent interactions in the
Verlet scheme.
Often, it is not clear how large a timestep should be chosen for a given dissipative problem.
There are some people who advocate the following procedure: run the problem without
dissipation and fix the timestep so that the change in energy during the simulation is negligible,
then use this timestep for the dissipative system. Our exploration of the symplectic integrator
shows that such a strategy is meaningless. The non-dissipative systems are a totally
different class than the dissipative systems; even the best non-symplectic integrators cannot
compete with quite mediocre symplectic integrators. Conversely, symplectic integrators
cannot be used with dissipation: in the above Störmer-Verlet integration there is no
possibility to implement velocity-dependent forces, because at the time the forces must be
computed, the velocity is not yet known. The same is true for modifications like the
velocity-Verlet scheme, where one knows the velocity half a timestep too late; using the
velocity from the previous timestep introduces errors which are of the order of the error of
the Verlet scheme itself.
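For reference, one step of the velocity-Verlet scheme as it is commonly written (general knowledge, not code from this script); note that the acceleration function a may depend only on the position, for the reason just discussed:

```matlab
function [x,v] = velocity_verlet_step(x,v,a,dt)
% one velocity-Verlet step for dx/dt=v, dv/dt=a(x);
% a is a function handle that returns the acceleration
a_old = a(x);
x = x + v*dt + 0.5*a_old*dt^2;    % update the position
a_new = a(x);                     % acceleration at the new position
v = v + 0.5*(a_old+a_new)*dt;     % velocity from the averaged acceleration
return
```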
