Getting Started
0.1
Most problems in Science and Engineering differ from undergraduate problems in the respect that no closed solutions exist: Whereas there is a closed solution (solution function) for the harmonic oscillator with viscous damping,

$$\ddot{x} + 2\beta\dot{x} + \omega_0^2 x = 0 \quad\text{(viscous damping)}, \qquad x(t) = x_0\, e^{-\beta t} \cos\!\left(\sqrt{\omega_0^2 - \beta^2}\; t\right),$$

for the oscillator with sliding friction,

$$\ddot{x} + \mu \frac{F_N}{m} \frac{v}{|v|} + \omega_0^2 x = 0 \quad\text{(sliding friction)},$$

only piecewise solutions can be given, one for each half-period $T/2$ in which the velocity does not change sign, e.g.

$$x(t) = (x - x_0) \sin(\omega_0 t) + x_0 \quad \text{for } 0 \le t \le \frac{T}{2}, \qquad x_0 = \mu \frac{F_N}{k}.$$
For forces which are not linear in x and its derivatives, in general not even piecewise
solutions can be given. Other problems for which no solutions exist are problems with
many degrees of freedom (e.g. planetary systems), or flow problems.
For the technically important fields of structural analysis and fluid mechanics, most
results are nowadays obtained by computer simulations.
Fluid mechanics: Flow around a sphere with increasing Reynolds number / flow speed:
analytical solutions exist only for the Stokes flow problem.
(Figure: flow regimes from Stokes flow over vortices and a vortex street to turbulence.)
0.2
New MAC-Installation
Since 4/2006, the exercises room is equipped with MAC computers instead of the old
SUN Unix workstation terminals. Software can be started either via the window icons
in the Applications directory, or via the command-line terminal, which is started by
clicking on the X icon. MATLAB can be started by clicking on the MATLAB icon. It is
recommended to use the EMACS editor for the course. Because the current MAC OS X
operating system is based on the Unix operating system, the following comments on
Unix are useful, last but not least because UNIX commands (for directory listings,
previewing of graphics, removal of unneeded data etc.) can be used from the MATLAB
prompt via ! as escape sequence.
WARNING! The new MAC installation allows the teacher to view the screen and the
currently active programs on each student terminal. Applications which are not needed
for the lesson can be terminated from the teacher console.
0.3
UNIX-Workstations
cp source destination
mkdir ml
cd ml
0.4
MATLAB

0.4.1
Introduction: Interpreters and Compilers
In general, for programming projects with high numerical complexity, it is usually best
to develop the algorithms in MATLAB. MATLAB is, like BASIC or symbolic computer
languages like MAPLE, MATHEMATICA, MACSYMA and REDUCE, an interpreter
language, i.e. the language commands are translated into processor instructions one by
one at runtime. Nevertheless, MATLAB is not a symbolic language, but performs all
calculations numerically, i.e. with floating point numbers.¹ The language can be used
either from a command prompt or as a functional (or object-oriented) programming
language. In compiler languages like FORTRAN, C or PASCAL, the program is fully
translated into processor instructions before execution. If errors occur at runtime, the
memory contents are difficult to analyze, usually only with the help of a debugger,
which may alter the program execution and memory layout up to the point that some
errors cannot be reproduced. Debuggers' properties vary much more than the languages
themselves. In MATLAB, after a program crash, the data are still accessible in
MATLAB's memory and can be analyzed using the commands of the MATLAB language itself.
Interpreters allow fast program development. As a rule, their execution times are
higher than those of compiler languages, but during program development, the compile
time is usually more costly than the actual runtime. In MATLAB, when complex
built-in functions like a matrix inversion are invoked via short commands, the speed
advantage of the compiler languages is very often negligible.
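This can be checked directly with MATLAB's own stopwatch functions; a minimal sketch (the matrix size 500 is an arbitrary choice):

```matlab
% The interpreter overhead of the single command line is negligible:
% almost all the work is done inside the compiled built-in routine.
n = 500;        % arbitrary matrix size
A = rand(n);    % random n-by-n test matrix
tic             % start the stopwatch
B = inv(A);     % built-in matrix inversion
toc             % print the elapsed time
```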
Many programming languages have a whole zoo of data types. MATLAB's elementary
data type is the complex matrix. (Recently, MATLAB has also offered more kinds of
data types, but we will not use them in this course.) Variables may take complex
values in the course of a computation. Variables which are used as indices must
nevertheless have an integer value.
Because variables cannot be declared in MATLAB, it refuses to process variables which
have not been initialized. In FORTRAN77, for example, it was possible to use variables
which were neither declared nor initialized, and which assumed the value 0 at the
moment they were used.
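Both points can be seen in a minimal session (the variable names are arbitrary):

```matlab
a = 2           % a is created by assignment, no declaration needed
a = sqrt(-a)    % a silently becomes complex: 0 + 1.4142i
disp(b)         % error: b was never initialized, MATLAB refuses to process it
```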
0.4.2
Getting started

¹The symbolic package available with MATLAB is basically MAPLE with a MATLAB interface.
Basic Commands:

edit        starts the MATLAB editor with syntax highlighting of MATLAB
            commands. You can use any editor you like to write MATLAB files,
            but the line ends may vary between operating systems and may lead
            to trouble.
clear       empties the memory
clear a     clears the variable a from the memory
who         displays the variables which have been assigned
help        gives help concerning a specific topic
help help   tells you how to use the help function
lookfor     looks for a word in the help files; useful if you are looking for
            a command according to context, but are not sure about the
            command name
disp(a)     displays the value of the variable a
disp('a')   displays just the string 'a'
rand        random number generator, will be used a lot to initialize data
format      formats the output; format compact suppresses the output of empty
            lines, format short rounds the displayed output to five significant
            digits, but the computations are still performed with full precision
%           comment sign
ls          lists the contents of the current working directory of MATLAB,
            i.e. the directory whose files MATLAB can access directly
cd          changes the current working directory of MATLAB
The MATLAB desktop is written in JAVA (another interpreter-based programming
language), which still has some stability problems², so the desktop crashes relatively
often. If you don't want to work with the desktop to avoid unnecessary crashes, but
want to write the programs in a Unix editor you know, you can also start MATLAB
from the command prompt without the desktop.
²To get an idea why the JAVA interface of MATLAB crashes so often, see the internal memo from
SUN at http://www.internalmemos.com/memos/memodetails.php?memo_id=1321
Special Characters:

!       escape sequence; allows the use of UNIX commands like cd, pwd from
        the MATLAB prompt
[...]   1. vector brackets referring to the values of the entries: [1 2 3] is a
           vector with the entries 1, 2 and 3.
        2. brackets referring to the output arguments of functions.
(...)   1. brackets referring to the indices of a vector: a(3) is the third
           element of the vector a.
        2. brackets referring to the input arguments of a function.
...     three dots mark the end of a line which is continued on the next line
;       has no syntactical function as in C, but is only used to suppress the
        output of the operation
i, j    stand for the imaginary unit sqrt(-1), but can also be overwritten
        for other uses
pi      is indeed 3.1415...
,       divides commands, when several command lines should be written in
        the same editor line
:       divides loop bounds, lower_bound:stepwidth:upper_bound.
        WARNING! lower_bound,stepwidth,upper_bound (with commas) only
        displays the variables lower_bound, stepwidth and upper_bound.
As a first reference, Kermit Sigmon's MATLAB primer at
http://math.ucsd.edu/driver/21d-s99/matlab-primer.html
can be recommended. It gives a short overview of the available commands, but it is a
good idea to get used to the built-in help function of MATLAB (just type help at the
prompt). For most purposes, the internal help is sufficient. Manuals for MATLAB
are available, but beyond the built-in help command there is not much information
one needs on a daily basis, except the references to the algorithms used. This is a
huge difference to e.g. MATHEMATICA, where the algorithms are secret. Beware:
in contrast e.g. to FORTRAN, MATLAB is case sensitive, ABC is not the same as
abc. If you use the same variable names in lower case and upper case in the same
program, you will run into trouble anyway. Information about a public-domain clone
of MATLAB, OCTAVE, can be found at www.octave.org.
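A two-line demonstration of the case sensitivity:

```matlab
ABC = 1;
abc = 2;        % a different variable, despite the same letters
disp(ABC)       % displays 1
disp(abc)       % displays 2
```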
Control statements are usually terminated via the end command, no matter whether
it is an if statement or a for loop:

a=2
b=3
for i=1:10
  if (a>b)
    disp('a>b')
  else
    disp('a<=b')
  end
end
0.4.3
Matrix Processing
MATLAB was started by Cleve Moler, a famous researcher in numerical linear algebra,
as a MATRIX LABORATORY for his students, which should allow fast, safe and easy
development of algorithms for numerical matrix analysis.
MATLAB has evolved into a general purpose language with specialized applications in
many fields. Many books in the meantime use MATLAB either as a formal language or
for the programming examples; have a look at http://www.mathworks.com/support/
books/index.jsp.
Matrix Syntax:

*              matrix multiplication
.*             elementwise multiplication
a(2:4)         the second to fourth elements of the vector a
end            the last index of a vector or matrix dimension
a(2:end)       the elements of a from the second to the last
b=c(2:3,2:6)   the submatrix of c built from rows 2 to 3 and columns 2 to 6
With the matrix syntax and the proper use of brackets, many operations can be simplified without the use of loops:

a=[0.5:0.5:10]

or

for i=1:20
  a(i)=i/2
end

or

a=linspace(0.5,10,20)
Many functions either operate elementwise on vectors and matrices, or they are matrix
functions in the sense that the operation is performed on the matrix as a whole.

Matrix/Vector Functions:

length           gives the longest dimension of a matrix, or the length of a vector
size             gives the dimensions of a matrix
linspace(a,b,m)  creates a vector of m equally spaced entries between a and b
rand(n,m)        sets up a random matrix with n rows and m columns
exp              exponential function, works elementwise on a matrix
expm             matrix exponential function, works on the eigenvalues of a
                 matrix and can only be used for square matrices
eig              eigenvalue decomposition
inv              matrix inversion
norm             matrix/vector norm
det              determinant
svd              singular value decomposition
0.4.4
User-defined functions can be written as ASCII files with the extension .m. A function
my_function would be contained in the file my_function.m:

function [out_arg1,out_arg2,arg3]=my_function(in_arg1,in_arg2,arg3)
% function [out_arg1,out_arg2,arg3]=my_function(in_arg1,in_arg2,arg3)
% The first comment after the function declaration is
% displayed if "help my_function" is typed; use it to write
% self-documenting functions
........
return

It is advisable to always end a function with a return statement, and the main
program as well.
For input arguments, MATLAB functions use call by value, which means that the
input arguments (in round brackets) cannot be modified in the functions. Only the
output arguments (in []-brackets) can be modified by the called function. If an argument
is to be used both as an input argument and an output argument, it must appear in
the round brackets and in the []-brackets, like arg3 in the above example.
Global variables can be defined with the statement global, in a similar way as variables
are declared in other programming languages. The same declaration must then be used
in the functions which use the variable. Global variables can also be overwritten in the
functions; they are call by reference variables.
FORTRAN uses call by reference for all input variables of subroutines. C uses call by
value for scalars and call by reference for arrays, so that a pointer to a variable must
be used if a scalar is to be modified in a function.
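A sketch of both mechanisms (the function name scale_data and the global name unit_factor are made up for this illustration):

```matlab
function x = scale_data(x, factor)
% function x = scale_data(x, factor)
% x appears as input AND output argument, so the caller can
% "modify" it via  x = scale_data(x,2);  inside the function,
% only the local copy is changed (call by value).
global unit_factor   % call by reference: must also be declared
                     % global in the calling program
x = x * factor * unit_factor;
return
```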
Functions can be overloaded for different numbers of input parameters and for scalar
and matrix arguments. If the operations used in the function allow an interpretation in
the matrix sense, the function can automatically be used for matrices.
Exercises
1. Set up a vector with the entries (1, 2, 3, 4, . . . n) once using a for-loop, the second
time using an implicit loop.
2. Multiply every second element with a constant a, once using a for-loop, once
using an implicit loop.
3. Write a program which finds out which elements of a vector are even.
4. See what happens if you set up ones(L) , ones(L,1), ones(1,L), and what
happens when you try to multiply these objects with each other.
5. a) What do you expect the following program to do?
clear
step=2
upper_bound=10
for i=1,step,upperbound
disp(i)
end
return
b) What does the program really do? c) How do you have to rewrite the program
so that it does what you expected it to do in a)?
6. Write a program which computes the factorial n! of an integer number n:

   n   n!
   1   1
   2   1*2
   3   1*2*3
   4   1*2*3*4
7. Rewrite the factorial-program as a subroutine
8. Rewrite the factorial subroutine so that the input arguments are checked and
only proper input arguments are accepted.
9. Use the help-function of MATLAB to find out the relation between the built-in
function gamma and the factorial.
Chapter 1
How to write better programs
In this chapter, I will discuss the basics of programming style for numerical computing.¹
Everything seems to be a matter of course, and during several courses, some students
who considered themselves experienced programmers skipped these lessons. Usually,
after two weeks of homework, they ran into exactly those pitfalls, problems and errors
that are discussed in these pages, and usually wasted several hours which could have
been spent productively. My usual comment was: "We had this two weeks ago when
you didn't attend . . ."
1.1
Programming Style

1.1.1
Choosing variable names
Of course, nobody would use variable names in scientific computing which have no
scientific meaning, like linda, charly, taro, when there is no documentation of what the
variables mean. Variables which are difficult to spell, like asdtfgl or such-like, should
better be avoided, except if there is a convention for how to compose such variable names.
Some variable names in scientific programming are self-explaining, like

x,y,z,vx,vy,vz,omega

etc. It is very easy to overdo the self-explanation by choosing too long variable names,
as I once saw in the programs of a master's thesis:

this_is_the_coordinate_of_x,
this_is_the_coordinate_of_y.
¹I will use the terminology Computational Physics, Computational Engineering, Scientific
Computing and Scientific Programming pretty much as synonyms. Numerical methods, numerical
mathematics and numerical algorithms I will use when I want to emphasize mathematical techniques
to handle floating point computations, minimize roundoff errors, control discretization errors etc.
Numerical physics I will use if I want to emphasize that the techniques of computational
physics require an understanding of the floating point computations involved.
1.1.2
Try to write your code as readably as possible. One definition of a programming guru²
is: he still understands his programs after not having looked at them for ten years. If
you consider yourself a guru, try to read your programs from ten years ago. There is a world
championship in writing the most unreadable C program, the infamous International
Obfuscated C Code Contest³, and one of the winners wrote the following:
/*
 * Program to compute an approximation of pi
 * by Brian Westley, 1988
 * (requires pcc macro concatenation; try gcc -traditional-cpp)
 */
#define _ -F<00||--F-OO--;
int F=00,OO=00;
main(){F_OO();printf("%1.3f\n",4.*-F/OO/OO);}F_OO()
{
_-_-_-_
_-_-_-_-_-_-_-_-_
_-_-_-_-_-_-_-_-_-_-_-_
_-_-_-_-_-_-_-_-_-_-_-_-_-_
_-_-_-_-_-_-_-_-_-_-_-_-_-_-_
_-_-_-_-_-_-_-_-_-_-_-_-_-_-_
_-_-_-_-_-_-_-_-_-_-_-_-_-_-_-_
_-_-_-_-_-_-_-_-_-_-_-_-_-_-_-_
_-_-_-_-_-_-_-_-_-_-_-_-_-_-_-_
_-_-_-_-_-_-_-_-_-_-_-_-_-_-_-_
_-_-_-_-_-_-_-_-_-_-_-_-_-_-_
_-_-_-_-_-_-_-_-_-_-_-_-_-_-_
_-_-_-_-_-_-_-_-_-_-_-_-_-_
_-_-_-_-_-_-_-_-_-_-_-_
_-_-_-_-_-_-_-_
_-_-_-_
}

²Further information on how to write good programs can be found in: Code Complete, Steve
McConnell, Microsoft Press, paperback 1993, also available in Japanese.
³Homepage at http://www.ioccc.org/
As the purpose of science is clarity, the purpose of writing scientific code is also clarity and
readability. Unreadable code is code which is hard to debug, and errors in scientific computing
are much more difficult to detect than in commercial software: if you get the second digit of
your result wrong, there is no error message, whereas in commercial software you can always
tell that something went wrong by messages like "segmentation violation". Moreover,
commercial software vendors can make money by selling software updates, whereas in
scientific computing, people who wrote buggy code will have trouble in their career. Unreadable
code is not the fault of the programming language, though some programming languages
attract chaotic programmers more than others. The advantage of restrictive programming
languages like ADA is that you cannot make certain classes of errors.
What is in a line

Be aware that identical operations in a computer are not sped up by cramming everything
into the same line:

a=2*b
a=a^2
a=a/c+d

will take the same computer time as

a=((2*b)^2)/c+d

Which is more readable depends on the implemented formulae. There are tricks called
performance optimization which actually allow faster program execution due to the style
in which the code is written, but this has nothing to do with cramming many commands
into a single line; it can only be discussed in a later chapter.
Coherence
If you are not sure which lines in the code should be grouped together, it is best to stick
to the concept of coherence, writing operations in consecutive lines which affect the same
variables. Instead of
a1=b1*c1
a2=b2+c2
a3=b3/c3
a1=a1-d1/e1
a2=(a1+a2)/2
a3=a3*a2
it is better to write
a1=b1*c1
a1=a1-d1/e1
a2=b2+c2
a2=(a1+a2)/2
a3=b3/c3
a3=a3*a2
Once I had to find the error in the program of a student. The result was correct, except that
it was 10 orders of magnitude wrong: he should have divided the result by a timestep dt.
The student knew that if one has to do many divisions by the same number dt, it is faster to
compute the inverse i_dt=1/dt once and multiply with i_dt. And he thought that he could save
programming time by not defining a new variable i_dt, so his program looked like

dt=10^-5
dt=1/dt
............
(one page of code)
............
result=preliminary_result/dt

A perfect interaction of a stupid choice of variable names (the name of the variable at the
end did not match its meaning), a code which was longer than one page, so one could not
read it in a single window, and an incoherent way of using the variable dt.
1.2
Safety first
The most important aspect of scientific programming is the safety of the programs. "Never
in the history of mankind has it been possible to produce so many wrong answers so fast."⁴
1.2.1
Always check the input variables of your subroutines. You may know with which parameters
the subroutine must be used, but there may be somebody else who does not know it, usually
the next student who uses the program after you, and who will produce a lot of numerical
garbage. So even if you have a simulation of a mechanical system which should be used with
positive timestep and positive masses, you had better check whether the timestep and the
masses are larger than 0 at the beginning of the program. Moreover, errors in passing
arguments to the subroutine can be detected more easily like that. If you find a wrong input
parameter, don't replace the input with a default value, but stop the program good and hard:

mass=input('mass: ')
if (mass<=0)
  error('mass must be larger than 0')
end
For general software, it may be a good idea to define a default value. For most numerical
applications (except for accuracy thresholds), specifying a default input may be a very bad
idea.
1.2.2
Operator precedence

For analytic arithmetic expressions, the order of the arithmetic operations is usually well
defined, so that a + b·c^d is automatically evaluated as a + (b·(c^d)). Usually, the order of
the operations is equally clear with logical expressions, but in numerical code it is a priori
not clear whether, for the logical operators not, and, or written as ~,&,|,

(~a<b*c&d==0)

is evaluated as ((~(a<b*c))&(d==0)), or as (~((a<b*c)&(d==0))), or whether the logical
operations can even be applied bitwise to the integer values as ~(10101010)=(01010101)
and then be used as numbers of the respective type. So if anything occurs which is more
ambiguous than addition and multiplication, one should use brackets.
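A small MATLAB example where the two possible readings differ (the values are chosen arbitrarily so that the bracketing decides the result):

```matlab
a = 1; b = 2; c = 3; d = 1;
r1 = (~(a < b*c)) & (d == 0)   % r1 = 0
r2 = ~((a < b*c) & (d == 0))   % r2 = 1: the brackets decide the result
```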
1.3
Program documentation
Always document your programs, and the best method is to write the explanations within
the code; if they are elsewhere, they will get lost over the years. I will reject any project
which is not well documented.
⁴Carl-Erik Fröberg

1.3.1
Stupid comments
There are useful ways and stupid ways to write comments. When I once emphasized the
importance of comments for computer programs, in the next exercise lesson one student
wrote the following comment:

% here is a comment

When I asked why he wrote such a comment, he said: "Because you said we should write
comments." But he had not written in his program what the program should do, and during
one hour of programming actually forgot what he should program... Another stupid comment
would be

% Divide by c
a=b/c

Of course, the division is self-explaining, but for the same short line a comment like

% c from function XXYYZZ, not yet checked whether c becomes 0
a=b/c

may help a lot in debugging the code. Generally, focus on what the code is doing, not how,
because how it is done can be read from the programmed lines.
1.3.2
Comments
Usually, every line which contains information which is not self-explaining, like

volume=lx*lx*lz
mass=volume*rho

should be documented. Of course, the amount of comments necessary grows with the
number of people who are supposed to use the code, with the number of functions and lines
in the code, and with the complexity. If you are not sure who will use the code, then better
write your program documentation in English. It is generally a good idea to formalize one's
documentation, especially at the beginning of functions/subroutines:
%PURPOSE: What the program is supposed to do
%USAGE: When and how the program is used
%AUTHOR: Who wrote the program
%DATE: Date when the program was written
%ALGORITHM: If the algorithm used is more complicated
%   than what you can document in the body of the subroutine, you
%   better explain the algorithm here
%LITERATURE: If you have used a complicated algorithm e.g. for
%   matrix inversion etc, write from which book or article the algorithm
%   comes, usually you have also used the naming conventions, and anybody
%   who wants to understand the algorithm (maybe you after ten years) better
%   reads the literature first.
%CAVEATS: If you have programmed
%TODO: How to improve the algorithm the next time you have time
%REVISION HISTORY: Write the date when you modified the algorithm
The above example is easy to maintain, to modify or to add to. What is not easy to
maintain would be something like

%    PURPOSE:
% +-------------+

and so on and so on. The simpler you design your comments, the more likely it is that
you really write them in the way they should be written. If any of the above points do not
apply, leave them out. If the routine is complete and runs as it should, don't write an empty
TODO point. If your routine name is my_asin (my arcsine), then you don't have to document
much in principle. But if the routine actually computes the arcsine in a non-standard way by
polynomial approximation, you had better write where in the literature you have it from. If
the routine is vectorized, this should be stated in the PURPOSE. If the vectorization works
only if a vectorized division is available, this should be written in the CAVEATS. If you write
a routine for the first time, you don't have to write a REVISION HISTORY, the date is enough.
And when you change the routine, also change the comments! Nothing is more confusing
than working with a "correct" routine for my_sinus which calculates a cosine.
Exercises
1. Check whether the MATLAB programs you have written so far are in accordance
with the above ideas.
2. Write a program which creates a matrix where the first column contains equally spaced
x-values between -5 and 5, and the second column contains the values of the second-order
polynomial y = ax^2 + bx + c.
3. Write a program which creates a matrix where the first column contains equally
spaced x-values between -5 and 5, and the second column contains the values of the
function y = 1/(1 + x).
4. Write a program which can detect whether the result of a mathematical computation
has complex parts.
Chapter 2
Stochastic methods I
Stochastic methods use concepts from probability theory. Knowledge about stochastic methods is important in every field of science and engineering, because each data series contains
a certain element of chance or a certain scattering of the data.
2.1
In computer simulations, the element of chance is usually simulated by so-called random
numbers, or pseudo-random numbers. A random number generator is a function which
generates a sequence of numbers which are distributed according to certain probability rules.
In the case of equally distributed random numbers, the numbers are usually between 0 and 1,
and all values can be obtained with the same probability. The random number generator in
MATLAB is called rand, and it can be called with arguments so that the result is not just a
single random number but a vector or matrix:
clear, format compact, format short
rand        % output a random number
a=rand(1,4) % output a 1x4 vector of random numbers
b=rand(4)   % output a 4x4 matrix of random numbers
This program using the function rand for equally distributed random numbers gives the
following output:
>> showrand
ans =
    0.9501
a =
    0.2311    0.6068    0.4860    0.8913
b =
    0.7621    0.4447    0.7382    0.9169
    0.4565    0.6154    0.1763    0.4103
    0.0185    0.7919    0.4057    0.8936
    0.8214    0.9218    0.9355    0.0579
2.1.1
The mean of $n$ samples $a_i$ is $\langle a \rangle = \frac{1}{n}\sum_{i=1}^{n} a_i$, and the variance $\sigma^2 = \mathrm{Var}(a) = \frac{1}{n-1}\sum_{i=1}^{n} \left(a_i - \langle a \rangle\right)^2$ is
the mean of the squares of the differences between the respective samples and their mean.
The square root of the variance is called the standard deviation.
% PURPOSE: Calculate mean and variance
% for the MATLAB random number generator
clear
format compact
format short
n_rn=10000
rn_vec=rand(n_rn,1);
% rand(n_rn,1) gives a column vector of length 10000,
% rand(1,n_rn) gives a row vector of length 10000,
% rand(n_rn) gives a square 10000x10000 matrix and crashes the program
mean_rn=mean(rn_vec)
var_rn=var(rn_vec)
return
Exercise: Calculate by hand the theoretical mean and variance of random numbers
equally distributed between 0 and 1.
Another random number generator in MATLAB is randn, which creates random numbers
according to the Gauss distribution

$$G(x) = \frac{1}{\sqrt{2\pi}\,\sigma} \exp\left(-\frac{(x-x_m)^2}{2\sigma^2}\right),$$

and the normally distributed random numbers from randn have mean $x_m = 0$ and standard
deviation $\sigma = 1$.
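Normally distributed random numbers with a different mean $x_m$ and standard deviation $\sigma$ can be obtained from randn by shifting and scaling; a sketch (the example values are arbitrary):

```matlab
xm = 3; sigma = 0.5; n = 10000;  % arbitrary example values
x = xm + sigma*randn(n,1);       % shift and scale the standard normal numbers
mean(x)                          % close to 3
std(x)                           % close to 0.5
```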
Exercise 2: Estimate how the statistical error of a sequence of random numbers depends
on the number of random numbers used, by comparing the theoretical variance of the
randn random number generator with the actually measured variance.
2.1.2
A visualization for random numbers is simply to draw the histogram: how many random
numbers fall into each interval ΔX. These intervals are called the bins of the histogram,
and the collection of the data into the histogram is often called binning. For a given
number of bins, the distribution of the random numbers can be studied. For the following
program, the output is given below and the drawn histogram is given on the right:
clear
format compact
format short
a=rand(1,4)
hist(a)

>> randhist
a =
    0.3423    0.3544    0.7965    0.5617

(Figure: histogram of the four random numbers.)
If many bins are used and few random numbers, the histogram is rough; if more
random numbers are used, the histogram is smooth:

clear
format compact
format short
a=rand(1,50);
subplot(4,2,1), hist(a)
set(gca,'Xticklabel','')
title('50 random numbers,10 bins')
axis tight
b=rand(1,500);
subplot(4,2,3), hist(b)
set(gca,'Xticklabel','')
title('500 random numbers,10 bins')
axis tight
d=rand(1,50000);
subplot(4,2,5), hist(d)
set(gca,'Xticklabel','')
title('50000 random numbers,10 bins')
axis tight
subplot(4,2,7), hist(d,50)
title('50000 numbers,50 bins')
axis tight

(Figure: the four histograms, from 50 to 50000 random numbers.)
Exercise 3: Estimate the dependence of the statistical fluctuations, i.e. of the
differences in the number of entries per histogram bin, on the number of entries.
A basic test for random numbers is whether the numbers of entries in the bins are the same
within the statistical fluctuations. Much more sophisticated tests for random numbers can
be found in Knuth. Nevertheless, to evaluate the usability of a random number algorithm
for a given problem, one should not rely only on theoretical tests; before applying the
algorithm to a problem with an unknown solution, one should test it on a related problem
for which one knows the solution. Another visual way of controlling random number
sequences is to plot one sequence as the x- and another as the y-coordinate:
clear
format compact
n_rn=100;
a=rand(n_rn,1);
b=rand(n_rn,1);
plot(a,b,'.')
axis image
2.2
2.2.1
where X should be a numerical value. For general random number generators, very often
prime numbers have to be used as seed, so always read the documentation first.
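In the MATLAB versions of that time, the state of rand could be set explicitly, so that a run becomes reproducible (the argument 0 below is an arbitrary choice):

```matlab
rand('state',0)   % reset the generator to a fixed, reproducible state
rand              % the same number is produced after every such reset
```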
2.2.2
Before random numbers could be generated easily and fast with computer algorithms,
mathematicians used tabulated random numbers², similar to the way tabulated values of
integrals are still used today. Some of these random number tables had been compiled using
roulette results from the Casino of Monte Carlo in Monaco, and so Monte Carlo methods got
their name. In recent years, it has become fashionable in computer science to name some
methods Las Vegas methods instead of Monte Carlo methods, but the difference is purely
academic.
²See e.g. Random numbers in uniform and normal distribution: with indices for subsets, compiled
by Charles E. Clark, Chandler Pub. Co., 1966.
For $N_{\text{total}}$ random points equally distributed over a square with side length $a$, the fraction of
points which fall inside the inscribed quarter circle of radius $a$ is proportional to its area:

$$\frac{N(\text{in Circle})}{N(\text{in Square})} = \frac{\pi a^2/4}{a^2} = \frac{\pi}{4}, \qquad N(\text{in Square}) = N_{\text{total}},$$

so that

$$\pi \approx 4\,\frac{N(\text{in Circle})}{N_{\text{total}}}.$$
2.2.3
clear
format compact, format short
mc_step=10000
n_inside=zeros(mc_step,1);
n_try=zeros(mc_step,1);
i_inside=0;
for i_mc=1:mc_step
  x=rand;
  y=rand;
  r2=x*x+y*y;
  if r2<=1
    i_inside=i_inside+1;
  end
  n_try(i_mc)=i_mc;
  n_inside(i_mc)=i_inside;
end
4*i_inside/mc_step
return
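With the matrix syntax of Section 0.4.3, the same estimate can be written without the loop; a sketch of the vectorized variant (it does not record the running estimate in n_inside):

```matlab
clear
mc_step=10000;
x=rand(mc_step,1);            % all x-coordinates at once
y=rand(mc_step,1);            % all y-coordinates at once
i_inside=sum(x.^2+y.^2<=1);   % count the points inside the quarter circle
4*i_inside/mc_step            % estimate of pi
```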
Random numbers allow one to simulate, in a stochastic way, processes which are often
considered to be deterministic. Let us in the following consider a league of teams; which
sport (baseball, soccer, basketball, ...) does not matter. Each of the six teams has a certain
game strength $S_i$. Let us define the probability for a team A to win against another team B as

$$P_{AB} = \frac{S_A \,\max_i(S_i)}{S_B\,(S_A + S_B)}.$$
In the following program, the game strength (team_quality) is the same for each team;
nevertheless you will find that usually one team wins. In the run which is depicted after
the listing, the percentages of wins for all the teams are plotted. One can see that
team 2, which was leading in the beginning, finishes as the last team, whereas team 6,
which was also leading, wins the championship. In real life, sports reporters waste a lot
of time and energy on explaining such developments, but in our simulation, we can see
that such narrow outcomes are just a result of chance. For stock exchange fluctuations,
the same reasoning applies.
clear, format compact, format short
n_team=6, n_game=100
for i=1:n_team
  team_quality(i)=11
end
n_games_played(1:n_team)=0
n_games_won(1:n_team)=0
for i_game=1:n_game
  for i_team=1:n_team
    for j_team=i_team+1:n_team
      n_games_played(i_team)=n_games_played(i_team)+1;
      n_games_played(j_team)=n_games_played(j_team)+1;
      win_probability=...
        ...% relative probability
        (team_quality(i_team)/team_quality(j_team))*...
        ...% normalization
        max(team_quality)/(team_quality(i_team)+team_quality(j_team));
      % assign a winner according to the probability
      if (win_probability>rand)
        n_games_won(i_team)=n_games_won(i_team)+1;
      else
        n_games_won(j_team)=n_games_won(j_team)+1;
      end
      score(n_games_played(i_team),i_team)=n_games_won(i_team);
      score(n_games_played(j_team),j_team)=n_games_won(j_team);
    end
  end
end
% normalize the number of games won to a winning probability
normalization=ones(size(score));
normalization=cumsum(normalization(:,1));
plot(normalization,score(:,1)./normalization,'--',...
     normalization,score(:,2)./normalization,'.',...
     normalization,score(:,3)./normalization,'+',...
     normalization,score(:,4)./normalization,'-.',...
     normalization,score(:,5)./normalization,':',...
     normalization,score(:,6)./normalization,'-')
legend('team 1','team 2','team 3','team 4','team 5','team 6')
[Figure: percentage of wins for each team (team 1 ... team 6) plotted against the number of games played; y-axis from 0 to 1, x-axis from 10 to 100 games]
Exercise: Modify the quality and see how the winning probability changes. Find out how
strongly you have to modify the team quality so that the corresponding team wins in all test runs.
2.2.4 Simple Sampling and Importance Sampling
In the example of calculating with random numbers, better statistics can be obtained by
using more Monte-Carlo steps. As an alternative, one can also run the program with few
Monte-Carlo steps several times with different seeds, save the results, and average them.
This will reduce the noise in the data. For statistically independent data, as in our calculation
of pi, both approaches are equivalent.
The law of large numbers states that the actual probabilities are only realized after infinitely
many tries. For a finite number of realizations, the fluctuations in the system can clearly be
felt.
An approach like the one above, where successive Monte-Carlo data are obtained independently from each other, is called simple sampling. If the Monte-Carlo data are chosen
depending on the previous data, the procedure is called importance sampling.
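The noise reduction from averaging runs with different seeds can be illustrated with a small sketch (here in Python rather than MATLAB; the hit-or-miss estimate of pi serves as the example, and estimate_pi is an illustrative helper, not a function from the course):

```python
import random

def estimate_pi(n, seed):
    """Hit-or-miss Monte-Carlo estimate of pi in the unit square."""
    rng = random.Random(seed)
    hits = sum(1 for _ in range(n)
               if rng.random() ** 2 + rng.random() ** 2 < 1.0)
    return 4.0 * hits / n

# one long run ...
long_run = estimate_pi(100_000, seed=1)
# ... versus the average of ten short runs with different seeds
short_runs = [estimate_pi(10_000, seed=s) for s in range(10)]
averaged = sum(short_runs) / len(short_runs)
print(long_run, averaged)
```

For this statistically independent sampling, both estimates carry the same amount of noise, since the total number of Monte-Carlo steps is the same.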
Homework 1: The obsolete Why-Function
Implement the old why-function (which does not exist any more) from MATLAB Version 5.
When you typed why, you got one of the possible answers:
why not?
don't ask!
it's your karma.
stupid question!
how should I know?
can you rephrase that?
it should be obvious.
the devil made me do it.
the computer did it.
the customer is always right.
in the beginning, God created the heavens and the earth...
don't you have something better to do?
because you deserve it.
or ...
Chapter 3
Numerical Analysis I
3.1 Integers
Integers are represented according to the number representation the computer uses internally. For example, in the binary representation, integers are represented as combinations of
0 and 1; in the hexadecimal (Greek-Latin for 16) representation, as combinations of the digits 0-9 and A-F, see Tab. 3.1. If you need the conversion from decimal to binary
or hexadecimal, MATLAB provides dec2bin and dec2hex (and their inverses bin2dec and hex2dec).
decimal   binary   hexadecimal     decimal   binary   hexadecimal
  00      00000        00            10      01010        0A
  01      00001        01            11      01011        0B
  02      00010        02            12      01100        0C
  03      00011        03            13      01101        0D
  04      00100        04            14      01110        0E
  05      00101        05            15      01111        0F
  06      00110        06            16      10000        10
  07      00111        07            17      10001        11
  08      01000        08            18      10010        12
  09      01001        09            19      10011        13

Table 3.1: Decimal, binary and hexadecimal representation of the integers 0 to 19.
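The table entries can be checked quickly, in MATLAB with dec2bin/dec2hex or, as a sketch here, with Python's built-in base formatting:

```python
# check a few rows of Tab. 3.1 with Python's built-in base conversion
for n in (5, 10, 15, 19):
    print(n, format(n, '05b'), format(n, '02X'))
# e.g. 19 -> 10011 (binary) and 13 (hexadecimal)
```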
Some compilers additionally offer non-standard short integer types (1 or 2 Byte), which can save memory when large arrays
of integers must be stored where the integers can only take very few values. The danger of
using the non-standard integer types is that if one changes the compiler (or the computer
one works on), these data types may not be available any more, and one has to rewrite the
whole program.
The C/C++ standards do not define the absolute accuracy of their data types, but provide
the types int and long int, where long int possibly has the larger number of digits (but may
have the same number as short int). Additionally, there are the unsigned data types, whose
largest representable number is twice as large as that of the corresponding signed types.
3.1.1 Fixed-Point Numbers
Fixed-point numbers are created from integers by rescaling the integer with a prefactor. Fixed-point numbers are needed in environments where a constant absolute precision is
required, for example in the banking sector, where the result of an operation must always
be rounded to a certain digit, e.g. 1/10000 $, and this accuracy must be maintained over the
whole data range, from the smallest transactions of a few dollars to billions of dollars.
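A minimal sketch of the idea (in Python; amounts are stored as integer multiples of 1/10000 $, so the absolute precision is constant over the whole range; SCALE, to_fixed etc. are illustrative names):

```python
SCALE = 10_000  # one unit = 1/10000 $

def to_fixed(dollars):
    """Round a dollar amount to the nearest 1/10000 $, stored as an integer."""
    return round(dollars * SCALE)

def fixed_to_str(units):
    """Render the integer unit count back as a dollar string."""
    return f"{units / SCALE:.4f}"

a = to_fixed(0.10)
b = to_fixed(0.20)
# integer addition is exact: no floating-point surprise like 0.1 + 0.2 != 0.3
print(fixed_to_str(a + b))  # 0.3000
```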
3.2 Floating Point Numbers
In technical and scientific applications, the orders of magnitude involved are much larger than
e.g. in banking or administration. Trillions of dollars (10^12) are a lot of money, but trillions
of molecules are something rather microscopic. Therefore, the preferred data type in scientific
computations is the floating point number, where the numbers are spaced out irregularly,
with more numbers in smaller intervals, so that the relative accuracy of operations is constant,
not the absolute accuracy as with integer numbers.
MATLAB performs all operations in floating point numbers (actually, in complex floating
point numbers). In contrast, many standard programming languages like C, C++ and FORTRAN do not perform type conversion during an arithmetic operation, but only at the time
of the assignment of the result. That means an integer division of a number by a larger
number gives 0, and depending on the data type, the results have different accuracies, as in
the following example in FORTRAN90:
program test_implicit
implicit none
write(*,*) 3/7        ! = 0                  integer division
write(*,*) 3./7.      ! = 0.428571           REAL*4 division
write(*,*) 3.d0/7.d0  ! = 0.428571428571429  REAL*8 division
stop
end
3.2.1 Error

If C_fl denotes the result of a floating point operation and C the result of the exact operation using real numbers, the absolute error is ε_absolute = |C_fl - C|.
3.2.2 Usage
Floating point numbers are the only numbers on a computer with which fast numerical computations are
possible over a large range of values. Floating point operations per second (FLOPS) are usually
given as the benchmark for computers, and currently the fastest computer in the world,
the Earth Simulator near Yokohama, can do about 40 TeraFLOPS. The precision of the
declared variables is usually expressed in the declaration statement: in the FORTRAN77
standard, REAL*4/REAL*8 (or DOUBLE PRECISION) expressed that 4/8 Byte were used
to represent the data.
3.2.3 Data-Layout
In floating point numbers, mantissa and exponent are stored in such a way that the number
is represented via powers of a base β, with precision t and lower and upper bounds
L ≤ e ≤ U for the exponent e. A floating point number x can then be represented as

x = ± β^e ( d_1/β + d_2/β^2 + ... + d_t/β^t )    with 0 ≤ d_i ≤ β - 1,  (i = 1, ..., t).
The usual real numbers in a higher programming language like C or FORTRAN have the
following characteristics:

Kind     Byte/Bit   mantissa/exponent   Range                          valid digits
Real     4/32       23/8                8.43*10^-37 ... 3.37*10^38     6-7
Double   8/64       52/11               4.19*10^-307 ... 1.67*10^308   15-16
3.2.4 Example
The above representation does not give equidistant numbers, as can be seen if the distribution
of representable numbers is plotted for β = 2, -1 ≤ e ≤ 2, t = 3:

[Figure: the representable floating point numbers for β = 2, -1 ≤ e ≤ 2, t = 3, marked on the number line]
As can be seen from the above graph, floating point numbers have as many numbers between 1
and 10 as between 10 and 100, whereas integers and fixed point numbers have as many numbers
in the interval from 0 to 1 as from 1 to 2. In other words, if numbers are rounded to fixed
point numbers, there is a constant absolute error over the whole range of numbers, whereas
for floating point numbers, there is a constant relative error over the whole range of available
numbers.
The builtin function in MATLAB to find out the largest relative spacing between
successive floating point numbers is eps. This function depends on the implementation of
MATLAB as well as on the hardware and will give different results on different processors. If
you use a computer language other than MATLAB or FORTRAN90, where these functions
are built in, you can use the following algorithm:
% program eval_myeps
clear
format compact
% compute machine-epsilon
myeps=1.
myepsp1=myeps+1.
while (myepsp1>1)
myeps=0.5*myeps;
myepsp1=1+myeps;
end
myeps
Other builtin functions which are convenient for judging the feasibility of a numerical algorithm are realmax, the largest representable floating point number, and realmin,
the smallest floating point number which is larger than 0. All these functions (eps, realmin,
realmax) are implementation dependent, i.e. their results may differ between computer models, because the mathematical operations are wired differently on the
chip.
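In Python, the corresponding constants live in sys.float_info; the halving loop from the listing above can be transcribed directly:

```python
import sys

# the machine constants that MATLAB exposes as eps, realmin, realmax
print(sys.float_info.epsilon)   # largest relative spacing
print(sys.float_info.min)       # smallest normalized positive number
print(sys.float_info.max)       # largest representable number

# the halving loop from the MATLAB listing, transcribed directly
myeps = 1.0
while 1.0 + myeps > 1.0:
    myeps *= 0.5
print(2 * myeps)                # equals sys.float_info.epsilon
```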
The actual numbers of valid digits of mantissa and exponent are usually not defined in
language standards. The IEEE standard (IEEE = Institute of Electrical and Electronics Engineers)
uses for double precision
S EEEEEEEEEEE FFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFF
(bits: 0 = S, 1-11 = E, 12-63 = F)
with the sign S, the exponent digits E and mantissa digits F, whereas CRAY used something like
S EEEEEEEEEEEEEEEEE FFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFF
(bits: 0 = S, 1-17 = E, 18-63 = F)
which, due to its lower accuracy and other idiosyncrasies in rounding, has now totally vanished. For most numerical computations, double precision is sufficient, and the errors of
single precision computations will be too large. In addition to double precision, many manufacturers offered REAL*16/quadruple precision, which is usually considerably slower
than double precision.
Modern compilers and processors, such as the Pentium 4 and the G4/G5, allow faster computations if the compiler options permit relaxed rounding for double precision, so that the results will be
considerably less accurate than double precision/16 digits, but still more accurate than single
precision/8 digits.
That 4/8 Byte are used does not mean that all compiler functions operating on these data
types compute with the full accuracy, correct up to the last bit. The IEEE standard
demands that all results be given in such a way that only the last/least significant bit
is rounded. Because this can become quite costly, one can usually choose between compiler options
which offer higher accuracy but lower performance, and options for faster but less accurate code.
Self-implemented routines may suffer from additional errors, which will be discussed
in the next sections.
3.3 Comparing Floating Point Numbers
Concerning what has been said in this section about accuracy, there are some things one can
do with integers which one should not do with floating point numbers. For a start, do not
check floating point numbers for equality
if (a==b)
but check for equality up to a certain error ε. Be sure whether you need the absolute error
n=10
epsilon=10^(-n)
if (abs(a-b)<epsilon)
or the relative error, which for two values a, b ≠ 0 can be defined as
n=10
epsilon=10^(-n)
if (abs(a-b)<epsilon*max(abs(a),abs(b)))
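Many languages ship such a comparison ready-made; in Python, for example, math.isclose implements the relative test:

```python
import math

a = 0.1 + 0.2
b = 0.3
print(a == b)              # False: the two roundings differ in the last bit
print(math.isclose(a, b))  # True: comparison with a relative tolerance
```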
3.4 Impossible Numbers

Several mathematical operations are not well defined, like dividing by 0, or
computing the real value of the asin of a number with an absolute value larger than 1. The
computer has to do something if the operation is mathematically undefined or meaningless.
Programs in compiled languages like C and FORTRAN usually crash, and leave it to the
programmer to find out where the error occurred.
If a variable is defined in FORTRAN as real, the result must also be real, so expressions like
sqrt(-1) or asin(1.5) crash the program. As the elementary datatype in MATLAB is the
complex array, such operations give in MATLAB the correct result, e.g.
> sqrt(-1)
ans = 0 + 1i
> asin(1.5)
ans = 1.57080 - 0.96242i
This may become a problem if the expected result is indeed real, but very near the undefined
value, e.g. if the result without rounding error should be 1, but due to rounding it is
1.000000000001, and the asin computed from it is
> asin(1.000000000001)
ans = 1.5708e+00 - 1.4143e-06i
so that the computation will be continued with a complex part. In such cases, the input
should always be checked with an if-statement to see whether it conforms to the expectations.
There is also an IEEE standard which defines such exceptions, e.g. what should be done if
a number is divided by 0. The result is stored in a bit pattern which is output as NaN,
Not a Number. MATLAB is a bit more sophisticated; for a start, it gives the correct
limit for the division:
> 4/0
warning: division by zero
ans = Inf
> -2/0
warning: division by zero
ans = -Inf
for Inf, the usual rules apply, but some cases are different:
> Inf+3
ans = Inf
> Inf+Inf
ans = Inf
> Inf/Inf
ans = NaN
> Inf-Inf
ans = NaN
When tested for equality via the ==-operator, one idiosyncrasy is that Infinity is always
equal to Infinity in MATLAB, but NaN is always unequal to NaN:
> 4==4
ans = 1
> Inf==Inf
ans = 1
> NaN==NaN
ans = 0
and for tests for NaN, the isnan function must be used:
> isnan(4)
ans = 0
> isnan(NaN)
ans = 1
To test which numbers are the largest and smallest, the MATLAB functions realmin and
realmax can be used. Because Inf, -Inf and NaN must be represented as floating point
bit patterns in MATLAB, about three to four bit patterns fewer are available in MATLAB
than in compilers for e.g. FORTRAN or C which don't use Inf and NaN. Because the
bit patterns of the largest numbers are used, the largest representable floating point number
is smaller than in those compilers.
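Python follows the same IEEE rules, so the table above can be reproduced directly:

```python
import math

inf = float('inf')
nan = float('nan')
print(inf + 3, inf + inf)    # inf inf
print(inf - inf, inf / inf)  # nan nan
print(inf == inf)            # True
print(nan == nan)            # False: use math.isnan instead
print(math.isnan(nan))       # True
```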
3.5 Errors
As we have seen in the previous sections, the representation of real numbers as floating point
approximations intrinsically leads to rounding errors. In the following, we will treat additional
sources of error which occur in the evaluation of algebraic equations.
3.5.1 Truncation error
Function evaluation
Many mathematical expressions are defined as an infinite process; for example, the exponential
function is

exp(x) = 1 + x/1! + x^2/2! + x^3/3! + ...    (3.1)
The error which results when e.g. the infinite series is computed with only a finite
number of operations, i.e. truncated after a finite step, is called the truncation error. In fact,
if in a given interval a function f(x) is given by an infinite polynomial series with
coefficients a_1, a_2, a_3, ..., and the series is truncated after n steps, an
approximation

f(x) = Σ_{i=0}^{n} ã_i x^i + O(x^{n+1})    (3.2)

can be found which has a smaller error than the truncated series using the coefficients a_i of the
infinite series. Such a series is called an n-th order approximation of f(x), which often
makes use of the expansion of the function in terms of orthogonal polynomials¹.
Whereas the exponential function exp(x) is defined by the infinite series

exp(x) = Σ_{n=0}^{∞} x^n/n!,

the best finite approximation to exp(x) in the interval 0 ≤ x ≤ ln 2 with 10 digits is

exp(x) = a_0 + a_1 x + a_2 x^2 + a_3 x^3 + a_4 x^4 + a_5 x^5 + a_6 x^6 + a_7 x^7 + ε(x)

with ε(x) ≤ 2·10^-10; the coefficients are given in Tab. 3.2. Be aware that the coefficients for the truncated polynomial approximation depend on the interval for which the
approximation should be used, to minimize the error.
n    a_n               1/n!
0     1.00000 00000     1.000000000000000000
1    -0.99999 99995    -1.000000000000000000
2     0.49999 99206     0.500000000000000000
3    -0.16666 53019    -0.166666666666666657
4     0.04165 73475     0.041666666666666664
5    -0.00830 13598    -0.008333333333333333
6     0.00132 98820     0.001388888888888889
7    -0.00014 131261   -0.000198412698412698
Table 3.2: Coefficients for the polynomial approximation of exp(x) in the interval
0 ≤ x ≤ ln 2 (middle column) and the corresponding coefficients 1/n! of the infinite Taylor
series (right column).
In practice, many transcendental functions f(x) which are introduced in elementary mathematics
lessons are numerically better approximated by approximations other than
polynomial ones, e.g. by making explicit use of divisions, which can themselves mimic
operations of infinite order in x, either via Padé approximation (quotient of two polynomial
expressions) or via continued fractions.
An effective strategy, especially with periodic functions, is argument reduction, so that one
does not have to compute the Taylor series for large x, but for a small x near the origin,
either by shifting the periodic functions like sin, cos into the interval [0, π/4], or by
decomposing the argument into an integer and a non-integer part, like in the case of the
exponential function, where one computes

exp(x) = exp(m + f) = exp(m) · exp(f),   m integer, |f| < 1.

Many approximations of transcendental functions can be found in Abramowitz/Stegun².

¹Chap. 22, Handbook of Mathematical Functions, M. Abramowitz, I. Stegun, National Bureau of Standards.
²M. Abramowitz, I. Stegun, Handbook of Mathematical Functions, National Bureau of Standards.
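A sketch of argument reduction for exp in Python (exp_taylor and exp_reduced are illustrative helpers; math.exp is used only for comparison):

```python
import math

def exp_taylor(f, terms=30):
    """Partial sum of the Taylor series of exp, accurate for small |f|."""
    s, term = 1.0, 1.0
    for n in range(1, terms):
        term *= f / n
        s += term
    return s

def exp_reduced(x):
    """Argument reduction: exp(m + f) = exp(m) * exp(f), m integer, |f| < 1."""
    m = math.floor(x)
    f = x - m
    return math.e ** m * exp_taylor(f)

print(exp_reduced(12.7))
print(math.exp(12.7))
```

The Taylor series only ever sees the small remainder f, where it converges after a few terms.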
Other examples of truncation error can be found in other series expansion methods, e.g.
Fourier series truncated after a certain number of coefficients, or Padé approximations, where
an analytical function

f(x) = ( Σ_{i=1}^{∞} a_i x^i ) / ( Σ_{i=1}^{∞} b_i x^i )

is approximated by the truncated Padé approximation

f̃(x) = ( Σ_{i=1}^{n} ã_i x^i ) / ( Σ_{i=1}^{m} b̃_i x^i ).

3.5.2 Rounding error
Because we have only a finite number of digits available, when we try e.g. in Octave to
compute 5/9, we get
> format long
> 5/9
ans = 0.555555555555556
>
So, first of all, it is not necessary to input 5./9. as in FORTRAN when one wants to use
floating point numbers. On the other hand, one sees that the periodic fraction which is the
result must be rounded to 16 decimal digits.
Therefore, when we compute the exponential function in the following program,
% Example for rounding error in computing transcendental functions
clear
format compact
format long
x=-20.5
n_iter=100
myexp=0.
for i=0:n_iter-1
% Compute the Taylor-series for the exp-function
% x!=gamma(x+1)
myexp=myexp+x^i/gamma(i+1)
end
exp(x)
return
we obtain
myexp = -4.422614950123058e-07
as a result, instead of the correct
exp(x) = 1.250152866386743e-09.
As we see, the result is so wrong that not even its sign is correct: we get
a negative value for a computation which should always give positive values. The problem is
not the range of the numbers, because the smallest number representable in MATLAB
precision is about 10^-308, much smaller than the correct result of about 10^-9. The problem is also not a
truncation error, as we are still adding Taylor contributions, even though the result does not
change any more after the 95th iteration. The problem is that we try to add terms which
are smaller than the last digit of the sum.
There are possibilities to circumvent such kinds of problems, which will be explained later in
the lecture.
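The effect is easy to reproduce; one common remedy (a sketch, not necessarily the one meant above) is to sum the series for |x|, where all terms are positive, and invert the result:

```python
import math

def exp_taylor(x, terms=100):
    """Partial sum of the Taylor series of exp(x)."""
    s, term = 1.0, 1.0
    for n in range(1, terms):
        term *= x / n
        s += term
    return s

naive = exp_taylor(-20.5)        # huge alternating terms cancel: garbage
better = 1.0 / exp_taylor(20.5)  # all terms positive, then invert
exact = math.exp(-20.5)
print(naive)
print(better, exact)
```

In the alternating sum the intermediate terms reach about 10^7, so the last digits of the tiny final result are pure rounding noise; in the all-positive sum no cancellation occurs.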
3.5.3 Catastrophic cancellation
Even in a floating point function evaluation in double precision, which gives
about 16 digits of accuracy, the 17th digit is of course wrong. The subtraction of numbers of
nearly equal size shifts these invalid digits to the front. For expressions like
a=cos(x)^2-sin(x)^2
this gives dubious results whenever the argument x is near an odd multiple of π/4, with an arbitrary number
of canceled digits. The problem can simply be circumvented by using the trigonometric
identity cos(2x) = cos(x)^2 - sin(x)^2, so that

a = cos(2x)    (3.3)

always gives the result with the accuracy of the compiler's evaluation of cos.
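A quick check of the identity (in Python; near π/4 the naive expression loses digits while cos(2x) does not):

```python
import math

x = math.pi / 4 + 1e-8          # close to a zero of cos(x)^2 - sin(x)^2
naive = math.cos(x) ** 2 - math.sin(x) ** 2   # cancellation: both terms are ~0.5
stable = math.cos(2 * x)                      # same value without cancellation
print(naive)
print(stable)
```

Both results are about -2e-8, but the naive difference carries an absolute rounding error of order 1e-16, i.e. roughly half of its digits are invalid.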
3.6 Example: An Unstable Recursion
The following integral is positive, because the integrand is positive in the whole
integration interval [0, 1]:

E_n = ∫_0^1 x^n e^{x-1} dx,   n = 1, 2, ...
From partial integration we obtain a relation between E_n and E_{n-1}, which can be used to
iteratively compute E_n if E_1 is given:

∫_0^1 x^n e^{x-1} dx = [x^n e^{x-1}]_0^1 - n ∫_0^1 x^{n-1} e^{x-1} dx = 1 - n E_{n-1},   (3.4)

so that

E_n = 1 - n E_{n-1},   n = 2, 3, ...   (3.5)

Because e^{x-1} ≤ 1 on [0, 1], there is also the bound

E_n = ∫_0^1 x^n e^{x-1} dx ≤ ∫_0^1 x^n dx = [x^{n+1}/(n+1)]_0^1 = 1/(n+1),   (3.6)

which shows that for very large n, E_n is very small. Starting from E_1 = 1/e, the forward iteration (3.5) gives

E1    0.367879441171442
E2    0.264241117657115
E3    0.207276647028654
E4    0.170893411885384
E5    0.145532940573080
E6    0.126802356561519
E7    0.112383504069363
E8    0.100931967445092
E9    0.09161229299417073
E10   0.08387707005829270
E11   0.07735222935878028
E12   0.07177324769463667
E13   0.06694777996972334
E14   0.06273108042387321
E15   0.05903379364190187
E16   0.05545930172957014
E17   0.05719187059730757
E18   -0.02945367075153627
E19   1.559619744279189
E20   -30.19239488558378

Alternatively, one can take E_21 ≈ 0 as an educated guess and run the recursion backwards,
E_{n-1} = (1 - E_n)/n; below, the deliberately crude starting value E_20 = 0.5 was used. The
output of this backward iteration E20 → E1 is given on the left, overlaid on the right with the
output of the forward iteration E1 → E20.
       backward iteration           forward iteration
       (start E20 = 0.5)            (start E1 = exp(-1))
E1     0.3678794411714423           0.3678794411714423
E2     0.2642411176571154           0.2642411176571153
E3     0.2072766470286539           0.207276647028654
E4     0.1708934118853843           0.170893411885384
E5     0.1455329405730786           0.1455329405730801
E6     0.1268023565615286           0.1268023565615195
E7     0.1123835040692999           0.1123835040693635
E8     0.1009319674456008           0.1009319674450921
E9     0.09161229298959281          0.09161229299417073
E10    0.083877070104072            0.0838770700582927
E11    0.07735222885520793          0.07735222935878028
E12    0.0717732537375049           0.07177324769463667
E13    0.06694770141243632          0.06694777996972334
E14    0.06273218022589153          0.06273108042387321
E15    0.05901729661162711          0.05903379364190187
E16    0.05572325421396629          0.05545930172957014
E17    0.0527046783625731           0.05719187059730757
E18    0.05131578947368421          -0.02945367075153627
E19    0.0250000000000000           1.559619744279189
E20    0.5000000000000000           -30.19239488558378
As can be seen, the backward iteration with the wrong starting value converges to the
right end value E_1 = exp(-1), whereas the forward iteration with the right starting value converges
to a wrong result. This shows the art of numerical computing, which is to obtain a
correct end result with a good routine and a wrong starting value, instead of obtaining a
wrong end result with a correct starting value but a bad routine.
It will become obvious later in this course that integration is always the 'good' direction
in numerical computing, which can decrease initial errors, whereas differentiation is the
'bad' direction, which can increase initial errors. This is in contrast to manual calculation,
where differentiation is easier to treat than integration.
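Both iterations fit in a few lines of Python; the comments mark which direction amplifies and which damps the initial error:

```python
import math

def forward(E1, nmax=20):
    """E_n = 1 - n*E_{n-1}: multiplies the initial error by n at each step."""
    E = [E1]
    for n in range(2, nmax + 1):
        E.append(1 - n * E[-1])
    return E

def backward(E20, nmax=20):
    """E_{n-1} = (1 - E_n)/n: divides the initial error by n at each step."""
    E = [E20]
    for n in range(nmax, 1, -1):
        E.insert(0, (1 - E[0]) / n)
    return E

fwd = forward(math.exp(-1))   # exact starting value E1 = 1/e
bwd = backward(0.5)           # deliberately crude guess for E20
print(fwd[-1])   # E20 from the forward iteration: nonsense
print(bwd[0])    # E1 from the backward iteration: accurate 1/e
```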
3.7 Taylor Series, Integration and Differentiation

3.7.1 Taylor Series

A function f(x) can be written as a power series with coefficients a_i,

f(x) = Σ_{i=0}^{∞} a_i x^i,

or, expanded around a point x_0, as the Taylor series

f(x) = Σ_{ν=0}^{∞} (f^{(ν)}(x_0)/ν!) (x - x_0)^ν.
[Figure: Taylor approximations of sin(x) (1st, 3rd, 5th, 7th order), cos(x) (0th, 2nd, 4th, 6th order) and exp(x) (0th, 1st, 2nd, 3rd order), compared with the exact functions]

The standard examples are

exp(t) = Σ_{n=0}^{∞} t^n/n! = 1 + t/1! + t^2/2! + t^3/3! + ...

sin(t) = Σ_{n=0}^{∞} (-1)^n t^{2n+1}/(2n+1)! = t - t^3/3! + t^5/5! - ...

cos(t) = Σ_{n=0}^{∞} (-1)^n t^{2n}/(2n)! = 1 - t^2/2! + t^4/4! - t^6/6! + ...
If we truncate the (for transcendental functions infinite) series after a finite number of terms,
we obtain the Taylor approximation. The evaluation of a Taylor approximation, e.g. of
fourth order with the coefficients a, b, c, d, e,

f(x) = a + bx + cx^2 + dx^3 + ex^4,

can be done in an efficient and in an inefficient way. Using the above formula directly,
we can write
f(x)=a+b*x+c*x*x+d*x*x*x+e*x*x*x*x
so that we need four additions and ten multiplications. If we use brackets around the expression in a skilled way (Horner's scheme), four additions and four multiplications are sufficient:
f(x)=a+(b+(c+(d+e*x)*x)*x)*x
It is easy to write down the derivative of the above polynomial as
f'(x)=b+(2*c+(3*d+4*e*x)*x)*x
In MATLAB, the evaluation of polynomials is implemented with the function polyval, and the derivative with
polyder, but the order of the coefficients is the opposite of the above example; the resulting graph can be seen
on the right:
clear, format compact
P=[1 0 -1]
x=linspace(-3,3,100);
y=polyval(P,x);
P_deriv=polyder(P);
y_deriv=polyval(P_deriv,x);
plot(x,y,'-',x,y_deriv,'--')
legend('f(x)=x^2-1','d/dx f(x)=2x')
grid
axis image
3.7.2 Integration I
In the same way that many transcendental functions can be represented by an infinite Taylor
series but approximated as a finite polynomial series in x, integrals and derivatives can be
approximated by replacing the infinitely small differential dx by the finite difference Δx,
and the error can be expressed as a power of Δx, as in the approximation of transcendental
functions by finite power series. The simplest method to numerically compute an integral

I = ∫_a^b f(x) dx    (3.10)

is to replace it by a finite sum of rectangles,

I^(1) = Δx Σ_i f(x_i),    (3.11)

where b - a is a multiple of Δx, and the integration points are spaced equidistantly.³ I^(1)
means that the method is of first order in Δx; the error is of the order of Δx².
Numerical integration is sometimes called quadrature, maybe from the time when the
integral was approximated numerically by drawing squares under the graph, and this box
counting was the first non-analytical quadrature. As an example, let us compute the
integral

∫_a^b exp(-x^2) dx = (√π/2) (erf(b) - erf(a)),

which is a bit unintuitive because it needs the error function erf to be represented analytically. With the integration bounds [0, 1], the integral is, with about 15 digits of accuracy,
0.746824132812427.
Now let us approximate this integral with the rectangle midpoint rule, where we replace the integral by a Riemann sum over n - 1 intervals of equal width and evaluate
the function in the middle of each interval instead of at its left
or right end:
clear
format long
n=101
% n odd !
dx=1/(n-1)
% stepsize
xrect=[dx/2:dx:1-dx/2];
yrect=exp(-xrect.*xrect);
sum(yrect)*dx

[Figure: Integration with rectangle midpoint rule: the graph of exp(-x^2) on [0, 1] covered by midpoint rectangles]
The trapeze rule instead connects the function values at the interval boundaries by straight lines, which leads to

∫_a^b f(x) dx ≈ (h/2) [f(x_0) + 2f(x_1) + 2f(x_2) + ... + f(x_n)]

³There are methods which don't choose the points equidistantly, but optimize the choice of points
so that the most accurate approximation is obtained with the minimum number of points.
[Figure: Integration with trapeze rule: the graph of exp(-x^2) on [0, 1] covered by trapezes]

clear
format long
n=101
% n odd !
dx=1/(n-1)
% stepsize
xtrap=[0:dx:1];
ytrap=exp(-xtrap.*xtrap);
(sum(ytrap)-.5*(ytrap(1)+ytrap(n)))*dx
Surprisingly, the result of 0.74681800146797 is one digit less accurate than the result of the
midpoint rule, although the program was more complicated, because we had to think about
a proper way to implement the trapeze shape for each interval. If we think about a graph
with mostly negative curvature, the trapeze rule yields an approximation which is
constantly below the true function value. The rectangles of the midpoint rule, in contrast, lie partly
above and partly below the graph, so that there is error compensation already within one interval.
In the rectangle midpoint rule, we have chosen the quadrature points in the middle of the
interval. If we had chosen the values for the function evaluation at the left/right
boundary of each interval, we would have obtained 0.74997860426211 / 0.74365739867383,
considerably less accurate than the rectangle midpoint rule.
It can be shown⁴ that for a single interval of width h the errors are

Rectangle midpoint rule:  I - R(f) = (h^3/24) f''(ξ),    (3.12)
Trapeze rule:             I - T(f) = -(h^3/12) f''(ξ),   (3.13)

i.e. where the midpoint rule overestimates a graph of negative curvature, the trapeze rule underestimates it with twice the error,
so it is surprising that textbooks usually introduce numerical quadrature via the trapeze rule.
Because both formulae are correct up to the second power of h, and the error is of the third
power, they are called formulae of second order.
More accurate, accurate to third order, is the composite Simpson rule S(f), which makes use
of a combination of the rectangle midpoint rule R(f) and the trapeze rule T(f). When we compare
the integrals for the midpoint rule and for the trapeze rule, we see that in our integration
interval, with a function of mostly negative curvature, the trapeze rule always gives a too small result, and the midpoint
rule always a too large one. Therefore, if we average R(f) and T(f), we will get a
better result than with R(f) or T(f) alone. Because the error of T(f) (Eqn. 3.13) is twice as
large as the error of R(f) (Eqn. 3.12), we should not take the direct average (1/2)R(f) + (1/2)T(f),
but the weighted average for which the errors of both rules cancel:

⁴G.E. Forsythe, M. Malcolm, C. Moler, Computer Methods for Mathematical Computations, Prentice Hall 1977
S(f) = (2/3) R(f) + (1/3) T(f).
Its error can be shown⁵ to be of the order of

(1/2880) Σ_i h^4 f''''(x_i).

For our example with the integral from 0 to 1 over exp(-x^2), we obtain 0.746824132817537
as the result, instead of the exact 0.7468241328124270...
In our second-order formulae, we tried to approximate the graph with straight lines and
integrated the area below them. A parabola is determined by 3 points, and therefore one
can also try to approximate the graph by a parabola instead of a straight line, to obtain a
Simpson rule directly by supplying three integration points for each interval.
It is therefore necessary to have an odd number of integration points. The direct derivation
of the Simpson rule can be done for an integration interval of length 2h by inserting
the Taylor expansion of the function f(x) with the νth derivatives f^(ν) around the point
x_0,

f(x) = Σ_{ν=0}^{∞} (f^(ν)(x_0)/ν!) (x - x_0)^ν,

which yields the Simpson rule

∫_0^{2h} f(x) dx ≈ (h/3) (f(0) + 4f(h) + f(2h)).

[Figure: the graph of f(x) and the interpolating parabola g(x) through the points at 0, h and 2h]
Using this formula for a single interval of length 2h, we can compose the formula for the
whole integration over [a, b] with the integration points from x_0 = a to x_2n = b:

∫_a^b f(x) dx ≈ (h/3) [f(x_0) + 4f(x_1) + 2f(x_2) + 4f(x_3) + ... + f(x_2n)]

⁵G.E. Forsythe, M. Malcolm, C. Moler, Computer Methods for Mathematical Computations, Prentice Hall 1977

The MATLAB program is
clear
format long,format compact
n=101
% n odd !
dx=1/(n-1)
% Stepsize
xsimp=[0:dx:1];
ysimp=exp(-xsimp.*xsimp);
(4*sum(ysimp(2:2:n))+2*sum(ysimp(3:2:n-1))+ysimp(1)+ysimp(n))*dx/3
which gives the result 0.74682413289418, slightly worse than the composite Simpson rule.
In the following table, we compare the errors of the different orders by the number of correct
digits, and introduce the big-O notation:
Method                      Integral ∫_0^1 exp(-x^2) dx    Order of correctness
Exact                       0.7468241328124270...           O(h^∞)
Rectangle, left endpoint    0.74997860426211                O(h)
Rectangle, right endpoint   0.74365739867383                O(h)
Rectangle, midpoint         0.74682719849232                O(h^2)
Trapeze rule                0.74681800146797                O(h^2)
Simpson                     0.74682413289418                O(h^3)
Composite Simpson           0.746824132817537               O(h^3)
Several conclusions can be drawn from the above table, which also hold for other
numerical methods with an intrinsic truncation error:
1. If a formula is of nth order and a discretization of 1/100 of the interval length is used,
the error for a first order implementation is about 1/100 = 1%, for the second order
method about 1/10,000, and for the third order method about 1/1,000,000.
(Of course, the prefactors in the order also have to be taken into consideration.)
2. It is therefore not always necessary to increase the number of discretization steps to
obtain a more accurate result. The change from first order to second order in the
rectangle rule resulted from just shifting the integration points by h/2.
3. If the theoretical accuracy cannot be reached, it is necessary to consider whether
(a) the function under consideration does not fulfill the necessary criteria (smoothness
etc.), or
(b) there is an error in the program resulting from incorrect prefactors, intervals
with incorrect bounds etc. If a formula of second order gives results with an error
proportional to 1/(number of points), then the interval bounds are usually determined
wrongly.
4. Be aware that it is not possible to integrate functions numerically if their integral has
no solution due to divergence etc.
In this section, we have discussed the error resulting from the integration over a whole interval.
This is also called the global error, in contrast to the local error, which occurs in the approximation
of a single interval. Numerical methods suffering from truncation error differ in
whether the global error is the same as the local error, larger than the local error (many
solvers for differential equations which do not conserve energy), or smaller than the local error
(error compensation, as in the case of the Simpson integration).
One generally should be very careful in using a method with low-order accuracy and a small time step. First of all, for many problems such a method can become quite time consuming. Furthermore, the more function evaluations occur, the more rounding errors are accumulated. The diagram below shows the cost-performance diagram, the number of integration points plotted with respect to the accuracy. Cost-performance diagrams vary depending on the evaluated functions. As can be seen, beyond 1000 integration points the accuracy of the Simpson method has already reached the limit of 16 digits of the double precision accuracy, and therefore the accuracy of the integral evaluation cannot be increased further by increasing the number of integration points.

[Figure: relative accuracy for integrating exp(x*x) between 0 and 1, plotted over the number of integration points, for the rectangular rule with left and right boundary, the trapeze rule, the rectangular midpoint rule and the Simpson rule.]
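The behaviour described above can be reproduced with a few lines of MATLAB; this is only a sketch (the loop bounds and the use of integral as reference value are our choices, not from the text):

```matlab
% Compare the relative accuracy of the midpoint and the Simpson rule
% for integrating exp(x^2) between 0 and 1.
f    = @(x) exp(x.^2);
Iref = integral(f, 0, 1, 'AbsTol', 1e-15);     % reference value
for n = [10 100 1000 10000]
  h  = 1/n;
  xm = h/2 : h : 1-h/2;                        % interval midpoints
  Im = h*sum(f(xm));                           % midpoint rule
  x  = 0 : h/2 : 1;                            % nodes and half-nodes
  Is = h/6*( f(0)+f(1) + 2*sum(f(x(3:2:end-2))) + 4*sum(f(x(2:2:end-1))) );
  fprintf('%6d  %10.2e  %10.2e\n', n, abs(Im-Iref)/Iref, abs(Is-Iref)/Iref)
end
```

The midpoint error decreases with the second, the Simpson error with the fourth power of the number of points, until the rounding-error limit of the double precision accuracy is reached.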
There are integration formulas which are easier to use than the numerical approximations of the Riemann sum we introduced here; they are called Newton-Cotes formulae. The midpoint rule is called an open Newton-Cotes formula, because the endpoints of the integration interval are not evaluated; the formulae for which the endpoints must be evaluated are called closed Newton-Cotes formulae. The following table shows the integration formula for a single interval with node spacing h for Newton-Cotes formulae of different order, with the corresponding error term:
Name         Integral Formula                                                            Error Term
Midpoint     int f(x)dx = h f_1   (f_1 at the interval midpoint)                         +O(h^3 f'')
Trapez Rule  int f(x)dx = h [1/2 f_1 + 1/2 f_2]                                          +O(h^3 f'')
Simpson      int f(x)dx = h [1/3 f_1 + 4/3 f_2 + 1/3 f_3]                                +O(h^5 f^(4))
Simpson 3/8  int f(x)dx = h [3/8 f_1 + 9/8 f_2 + 9/8 f_3 + 3/8 f_4]                      +O(h^5 f^(4))
Bode         int f(x)dx = h [14/45 f_1 + 64/45 f_2 + 24/45 f_3 + 64/45 f_4 + 14/45 f_5]  +O(h^7 f^(6))
Carrying the error compensation in formulas with truncation error further to higher orders, by combining low-order methods (as in the case of the composite Simpson rule) so that a higher-order method results, is called Romberg integration. If the limit of infinitely high orders is taken, this is called Richardson extrapolation. These ideas can also be applied to differentiation and to the numerical solution of differential equations.
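The first Romberg step can be sketched as follows (the test function and the interval counts are our choices): combining the trapeze sums for n and 2n intervals eliminates the leading error term and yields Simpson accuracy.

```matlab
f    = @(x) exp(x.^2);  a = 0;  b = 1;
trap = @(n) (b-a)/n * ( sum(f(linspace(a,b,n+1))) - (f(a)+f(b))/2 );
T1 = trap(100);            % composite trapeze sum, error O(h^2)
T2 = trap(200);            % trapeze sum with half the step size
R  = T2 + (T2 - T1)/3;     % first Romberg step: error O(h^4)
```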
3.7.3
Differentiation I
In the same way as one can derive the Newton-Cotes formulae for integrals from their Taylor expansion in the previous section, one can derive formulae for the derivatives using the Taylor expansion6. Such approximations are often called finite difference formulas, as they approximate the differential with a finite difference. For data which take the values f_{i-2}, f_{i-1}, f_i, f_{i+1}, f_{i+2} at equidistant points with spacing Dx, we get the following finite difference schemes for first-order derivatives:
Name:                Formula                                              Leading error
Forward Difference   (f_{i+1} - f_i)/Dx                                   Dx f''(x)/2
Backward Difference  (f_i - f_{i-1})/Dx                                   Dx f''(x)/2
3-point symmetric    (f_{i+1} - f_{i-1})/(2 Dx)                           Dx^2 f'''(x)/6
3-point asymmetric   (-3 f_i + 4 f_{i+1} - f_{i+2})/(2 Dx)                Dx^2 f'''(x)/3
5-point symmetric    (f_{i-2} - 8 f_{i-1} + 8 f_{i+1} - f_{i+2})/(12 Dx)  Dx^4 f^(5)(x)/30
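The predicted orders can be checked numerically; the following sketch (test function and step sizes are our choices) compares the forward and the 3-point symmetric difference for f(x) = sin(x):

```matlab
f = @(x) sin(x);  x0 = 1;                   % exact derivative: cos(1)
for dx = [1e-1 1e-2 1e-3]
  forward = (f(x0+dx) - f(x0))/dx;          % first-order scheme
  central = (f(x0+dx) - f(x0-dx))/(2*dx);   % second-order scheme
  fprintf('%8.0e  %12.4e  %12.4e\n', dx, ...
          abs(forward-cos(x0)), abs(central-cos(x0)))
end
```

Reducing dx by a factor of 10 reduces the forward error by about a factor of 10, but the central error by about a factor of 100.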
Note that the coefficients in front of f_{i-2}, f_{i-1}, f_i, f_{i+1}, f_{i+2} have to add up to 0. For second-order derivatives, similar schemes are written down in the following table, and again the coefficients add up to 0:
Name:                Formula                                                            Leading error
3-point symmetric    (f_{i-1} - 2 f_i + f_{i+1})/Dx^2                                   Dx^2 f^(4)(x)/12
3-point asymmetric   (f_i - 2 f_{i+1} + f_{i+2})/Dx^2                                   Dx f'''(x)
5-point symmetric    (-f_{i-2} + 16 f_{i-1} - 30 f_i + 16 f_{i+1} - f_{i+2})/(12 Dx^2)  Dx^4 f^(6)(x)/90
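A corresponding check for the 3-point symmetric second-derivative scheme (again with sin(x) as our test function):

```matlab
f = @(x) sin(x);  x0 = 1;  dx = 1e-3;
d2  = (f(x0-dx) - 2*f(x0) + f(x0+dx))/dx^2;  % 3-point symmetric scheme
err = abs(d2 + sin(x0))                      % exact second derivative is -sin(1)
```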
In contrast to numerical integration, which smoothes out errors via error compensation,
numerical differentiation roughens up the solution. If high accuracy is desired, there are
usually better solutions than computing the derivatives directly via finite difference schemes.
The graph below shows the numerical derivative and the numerical integral of the noisy data sin(x) + 0.005*rand; the exact results are d/dx sin(x) = cos(x) and

1 - int_0^x sin(y) dy = cos(x).

6 Clive A.J. Fletcher, Computational Techniques for Fluid Dynamics, Vol. 1, 2nd ed., Springer 1990
[Figure: the noisy data sin(x)+0.005*rand together with d/dx sin(x), cos(x) and 1-int(sin(x)) (top); the absolute errors cos(x)-d/dx sin(x) and cos(x)-1+int(sin(x)) (middle); the corresponding relative errors divided by cos(x) (bottom).]
% assumed setup (the first lines of the original listing are not shown):
nstep=1000; x=linspace(0,2*pi,nstep); dx=x(2)-x(1); idx=1/dx;
y=sin(x)+0.005*rand(1,nstep);
subplot(3,1,1)
plot(x,y,'-.',x(1:nstep-1),diff(y)*idx,...
     x(1:nstep-1),cos(x(1:nstep-1)),':',...
     x(1:nstep-1),1-cumsum(y(1:nstep-1))*dx,'--')
axis tight
legend('sin(x)+0.005*rand','d/dx sin(x)','cos(x)','1-int(sin(x))')
subplot(3,1,2)
plot(x(1:nstep-1),diff(y)*idx-cos(x(1:nstep-1)),...
     x(1:nstep-1),1-cumsum(y(1:nstep-1))*dx-cos(x(1:nstep-1)),':')
legend('cos(x)-d/dx sin(x)','cos(x)-1+int(sin(x))')
title('absolute error')
axis tight
subplot(3,1,3)
plot(x(1:nstep-1),((diff(y)*idx-cos(x(1:nstep-1)))./cos(x(1:nstep-1))),...
     x(1:nstep-1),(1-cumsum(y(1:nstep-1))*dx-cos(x(1:nstep-1)))...
     ./cos(x(1:nstep-1)),':')
legend('(cos(x)-d/dx sin(x))/cos(x)','(cos(x)-1+int(sin(x)))/cos(x)')
axis tight
title('relative error')
return
Both the differential and the integral should give cos(x), but the differential is so noisy that the result deviates visibly from the exact solution. The integral over the noisy data nevertheless gives a smooth curve. This is again a case of a good and a bad direction of numerical computing, as we encountered before when an iterative computation was rewritten into a numerically benign form. As can be seen, of differentiation and its inverse operation, integration, differentiation constitutes the bad direction in numerical analysis, while integration is the good direction.
In numerical analysis, integrals, also of higher order, can usually be computed with sufficient precision, in contrast to derivatives; in analytical calculations, by contrast, it is almost always possible to compute differentials, but very often the computation of closed forms for integrals is problematic.
Exercises:
1. Write a program which produces the floating point numbers for base 2 and mantissa length 4, as well as for base 4 with mantissa length 2.
a) Choose the exponents so that both number systems are roughly comparable.
b) Plot the positions of the numbers.
c) Compare both number systems: which number system can be supposed to have the better roundoff properties?
2. Write a program which computes the exponential function exp(x) using the Taylor series, and one program which computes the exponential function by evaluating the integer part of x using powers of the Euler number e and the non-integer part using the Taylor series. For which size of the arguments ...
Chapter 4
Graphics
4.0.4
Instead of using for loops for setting up vectors and matrices, it is convenient in MATLAB to
use the implicit loops provided by the colon operator : and brackets for the array constructor
[]
>> a=[3:6]
a =
   3   4   5   6
This is different from loops in FORTRAN and C, where the stepsize is added as the third argument for a loop statement. Whereas the colon operator notation using : constructs a vector
with a given lower and upper bound for a given stepsize, [lower_bound:stepsize:upper_bound],
if instead of the stepsize the number of points is known, it is more convenient to use the
linspace-function
>> a=linspace(3,6,7)
a =
    3.0000    3.5000    4.0000    4.5000    5.0000    5.5000    6.0000
If several vectors should be concatenated, this can be done with the brackets for the array-constructor []:

>> b=[10 100 1000 10000]
b =
      10     100    1000   10000
>> c=[1 3]
c =
   1   3
>> c=[4 c b]
c =
Columns 1 through 6
     4     1     3    10   100  1000
Column 7
 10000
After a lot of vector operations, one usually also needs functions which give information about the variables used. The most elementary function, which displays information about variables, is

>> who

Your variables are:

a    ans
Parts of a vector can be extracted via the colon-notation with : and round brackets, so that for a vector
>> c=.2:.2:1.2
c =
    0.2000    0.4000    0.6000    0.8000    1.0000    1.2000
the assignment of the second to the fourth element to a vector g can be written as
>> g=c(2:4)
g =
    0.4000    0.6000    0.8000
The whole of a vector can be assigned without specifying the bounds like in
>> h=c(:)
h =
0.2000
0.4000
0.6000
0.8000
1.0000
1.2000
If the vector from a lower bound up to the end should be assigned, this can be done via the end statement in round brackets together with the colon operator :

>> v=c(4:end)
v =
    0.8000    1.0000    1.2000
Functions which operate on vectors are usually defined in the canonical way, that means in the way in which one expects the function to work. The functions prod and sum acting on a vector behave as expected: they give as a result the product and the sum of the vector elements. Whereas prod and sum act on vectors and give a scalar as a result, the functions cumsum and cumprod, which compute the cumulative sum and the cumulative product, give a vector as a result:

>> cumsum(1:5)
ans =
    1    3    6   10   15
One must be careful with the use of the multiplicative operators *, / and ^, which are in MATLAB in general interpreted in the sense of numerical linear algebra, so that row- and column-dimensions must match. If one wants to use these operators elementwise, one should use their elementwise variants, which are preceded by a dot, as in .*, ./ and .^.
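The difference between the two families of operators can be seen in a small example:

```matlab
v = [1 2 3];
v .* v     % elementwise product: [1 4 9]
v * v'     % * in the linear-algebra sense: the inner product, 14
% v * v would be an error: the inner dimensions of (1x3)*(1x3) do not match
```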
4.1
Matrices can be manipulated in the same way as vectors, preferably with the colon operator : and the brackets for the array-constructor []. Some elementary builtin MATLAB matrix functions will be explained here because they make matrix construction easier. The ones function sets up a matrix with ones as every element; as usual in MATLAB, a single argument sets up a square two-dimensional matrix:

>> ones(2)
ans =
   1   1
   1   1
For non-square matrices, two indices have to be specified, where the first is the row index and the second is the column index, for example

>> ones(3,2)
ans =
   1   1
   1   1
   1   1
The zeros function behaves in the same way as the ones function, only that it sets up matrices with 0 as every element:

>> zeros(2,3)
ans =
   0   0   0
   0   0   0
In linear algebra, the identity matrix is very important, and therefore the unit matrix in MATLAB is named eye (eye-dentity / identity):

>> eye(3)
ans =
   1   0   0
   0   1   0
   0   0   1
It may be surprising, but the identity-matrix is also defined for non-square matrices, as the following example shows:

>> eye(2,5)
ans =
   1   0   0   0   0
   0   1   0   0   0
Another important matrix function is the constructor for the random matrix, rand:

>> rand(2,3)
ans =
   0.6154   0.7919   0.9218
   0.7382   0.1763   0.4057
A very convenient function, similar to linspace in one dimension, which can be used to set up arguments for functions in higher dimensions, is the meshgrid-function, whose functionality is as follows:
>> [X,Y] = meshgrid(1:3,10:14)
X =
    1    2    3
    1    2    3
    1    2    3
    1    2    3
    1    2    3
Y =
   10   10   10
   11   11   11
   12   12   12
   13   13   13
   14   14   14
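A typical use of meshgrid is to evaluate a function of two variables on the grid and to plot the resulting surface (the function here is our example):

```matlab
[X,Y] = meshgrid(-2:.1:2, -2:.1:2);
Z = exp(-X.^2 - Y.^2);     % evaluate f(x,y) elementwise on the grid
mesh(X,Y,Z)                % wire-frame plot of the surface
```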
4.2
4.2.1
The elementary command in MATLAB for plotting functions etc. is the plot command.
Plots can be shown in the plotting window either alone or as one of many sub-plots, like in
the following example
>> x=[.1:.1:.5]
x =
    0.1000    0.2000    0.3000    0.4000    0.5000
>> y=[20:-4:1]
y =
   20   16   12    8    4
>> subplot(2,2,1)
>> plot(y)
>> subplot(2,2,2)
>> plot(x,y)
which displays on the screen (note the different scale on the x-axis):

[Figure: plot(y) against the index (left) and plot(x,y) against x from 0.1 to 0.5 (right).]
Plots of vectors can be done either by plotting one vector directly or by specifying two vectors, where the first will be taken as the x-axis. If the vector lengths do not match, MATLAB issues an error message and stops the program execution. The plots are automatically done in the sub-plot which has been selected last.
A subtle way of plotting is the plot of a vector of complex numbers. If you have a complex vector c, you can get the real part x and the imaginary part y via

x=real(c)
y=imag(c)

The command plot(c) then has the same effect as plot(x,y), which means that the imaginary part is plotted versus the real part.
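This can be used, for example, to draw curves in the complex plane; the unit circle is one line (our example):

```matlab
c = exp(2i*pi*(0:100)/100);   % 101 points on the unit circle
plot(c)                       % same effect as plot(real(c),imag(c))
axis equal
```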
If a new plotting window should be opened, this can be done via the figure command: the first window is created by the figure(1) command, which is automatically executed if no plotting window is open; figure(2) opens a second plotting window, and so on. Plots are done in the window for which the figure command was called last.
There is a wide variety of ways to influence graph annotation in MATLAB:

[Figure: an annotated plot of x^2 and x^2*log(x) with labeled X-axis and Y-axis and a legend.]
The legend created by the legend command can be moved with the mouse. MATLAB graphics can be saved in various formats (Postscript, encapsulated Postscript, JPEG, ...) via the print command.
The line-style (full lines, dotted lines, symbols) can be changed via the arguments in the plot command:

plot(x,log(x),...
     x,x,':',...
     x,x.^2,'+',...
     x,x.^3,'*-')

[Figure: the resulting graph with the four different line styles.]

4.3
Visualizing Arrays
As an example, consider the rosser test matrix:

>> rosser
ans =
   611   196  -192   407    -8   -52   -49    29
   196   899   113  -192   -71   -43    -8   -44
  -192   113   899   196    61    49     8    52
   407  -192   196   611     8    44    59   -23
    -8   -71    61     8   411  -599   208   208
   -52   -43    49    44  -599   411   208   208
   -49    -8     8    59   208   208    99  -911
    29   -44    52   -23   208   208  -911    99
If a matrix is displayed with the plot command, the columns are each plotted as a vector, as in the example for plot(rosser) below on the left.
Arrays are plotted as surfaces with the mesh command, which displays the data in a wire-frame type of graph, as below on the right.
The view command can be used to set a different viewing angle for three-dimensional plots. It is also possible to change the viewpoint interactively via the rotate3d command by pointing with the mouse on the frame and pulling the frame of the 3D-graph.
[Figure: the rosser matrix displayed with plot (left) and with mesh (right).]
4.4
subplot(2,2,4)
loglog(x,x,'-',x,exp(x),'--',x,1./x,':',x,log(x),'-.',x,1./sqrt(x),'-+')
axis([0.1 10 .01 20000])
title('logarithmic')
legend('x','exp(x)','1/x','log(x)','1/sqrt(x)',2)
[Figure: the functions x, exp(x), 1/x, log(x) and 1/sqrt(x) in a linear plot, semilogarithmic in x-direction, semilogarithmic in y-direction, and double logarithmic.]
Many systems in science and mathematics can be better understood by just plotting typical properties in different scales. Logarithmic, linear, exponential and power laws can be found in nature, and are easily identifiable by plotting the data in different scales.
4.4.1
Linear Plots
Typical linear plots result from linear response functions; the simplest is probably Hooke's law, which is sketched in the drawing to the right. It is a linear law, and in the dynamical situation, when the spring is pulled with a force at a certain frequency, the elongation changes with the same frequency.
Such a linear response is not a matter of course. There are nonlinear systems which respond with e.g. frequency-doubling to an external stimulation, as in the case where a high-intensity red laser beam going into a crystal target comes out as a blue laser beam (blue light with twice the frequency of the red light).

[Sketch: a red laser beam hits a crystal target and emerges as a blue beam.]
4.4.2
Logarithmic Plots
If the y-axis of a plot is chosen logarithmically, exponential curves appear as straight lines, so that logarithmic plots allow one to identify exponential behavior. Typical examples of exponential behavior are time evolution plots. Radioactive decay and the increase of the GDP in economies are examples of such a time evolution. Below, the increase of the Dow Jones Industrial Stock Index is shown. The curves in the linear plot are bent, but more or less straight in a logarithmic plot. It is a matter of ongoing debate whether this reflects rather the exponential increase in the strength of the US economy or just the exponential inflationary effects.
If instead of the y-axis the x-axis is chosen in logarithmic scale, logarithmic curves become straight lines. Logarithmic curves grow more slowly than linear curves. Typical examples of logarithmic behavior are animal senses. Light and sound are perceived on a logarithmic scale, i.e. a sound is not twice as loud if the pressure of the sound wave is twice as high, but only if it is about 10^2 times as high.
4.4.3
Double Logarithmic Plots
Double logarithmic plots allow one to identify power laws, functions of the form x^r, where r does not necessarily have to be an integer. Power laws are usually found in nature when systems suffer from finite size effects.
4.5
4.5.1
4.5.2
Including Images
The command

image

puts a default image (in GIF format) on the graphics screen. In general, graphics of nearly any format can be read and displayed using

name=imread('name.gif');
image(name)

The data in the variable name can then be manipulated like a usual MATLAB array.
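Since the image data is an ordinary array, the array commands from the previous sections apply to it directly; a small sketch (the file name is hypothetical):

```matlab
img = imread('photo.jpg');    % hypothetical file name
image(flipud(img))            % display the image upside down
```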
4.6
Very often, one wants to save some output of MATLAB to a file to include it in other documents. For short output, it is simplest under a window-system to copy the desired lines with the mouse into an editor. If the output becomes too long, one can use the command

diary on

and MATLAB will then write the output not only to the screen, but also into the file diary. If one wants to redirect the output into a file with a special name, one can use the command

diary('special_filename')

To end the output into the diary, use the command

diary off

If you want to include the output in a LaTeX document and preserve the computer-output look, you can use the

\begin{verbatim}
\end{verbatim}

environment; all the program examples in this scriptum are produced in this way.
4.7
If you want to save the graphics on the MATLAB graphics screen as a file that can be included in a text (for e.g. LaTeX or WORD), you have to use the print command, with the syntax

print -dFORMAT FILENAME

The orientation of the printed page can be set with the orient command, e.g.

orient portrait
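For example, to save the current figure as encapsulated Postscript or as JPEG (the file names are our choice):

```matlab
plot(1:10)
print -deps  myfigure.eps    % black-and-white encapsulated Postscript
print -djpeg myfigure.jpg    % JPEG format
```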
4.8
If you want to include JPEG graphics in GUI-based word-processors, you can use the corresponding menus and/or the mouse.
LaTeX, probably the most widely used word-processing program in the sciences, is not GUI-based. A (not so) short introduction, available in various languages, can be found at ftp://ctan.tug.org/tex-archive/info/lshort/. LaTeX is rather a text-programming language, which converts a file name.tex (the program) into a file name.dvi (the device-independent file), which can then be converted into general formats like Postscript via

dvips -o name.ps name

which produces a postscript file. These postscript files can then be transformed into PDF format (Adobe Acrobat), e.g. with the command ps2pdf. If you want to prepare a LaTeX document with the corresponding graphics, you have to load a package with the software for including graphics. A widely used package is the epsfig package, so whereas for a conventional LaTeX report, the header looks like
\documentclass[twoside,12pt]{report}
the header must contain
\documentclass[twoside,12pt]{report}
\usepackage{epsfig}
if postscript graphics should be included. Graphics can then be included with

\epsfig{file=filename,width=??,height=??,angle=??}

where either width or height must be given; the angle can also be left out. Here are some examples:

\epsfig{file=graphiken/circle_square.eps,width=2cm}
\epsfig{file=graphiken/circle_square.eps,height=2cm}
\epsfig{file=graphiken/circle_square.eps,height=2cm,angle=-90}
\epsfig{file=graphiken/circle_square.eps,height=2cm,width=4cm}
In principle, all postscript files should be includable in LaTeX, but some programs produce postscript output which is not compatible with LaTeX. Under UNIX, one can use the command

ps2epsi name.ps name.epsi

to convert a file name.ps into a file name.epsi, which corresponds to the encapsulated postscript interchange format.
4.9
Processing Graphics
If you have graphics in a format other than postscript, like *.jpeg or *.gif files, which you want to include in LaTeX, you have to convert them into postscript with some other software. One of the most widely used programs for this task under UNIX is xv, which allows one to load graphics in one format and to save them in another format; for example

xv name.jpg

will load the image name.jpg. Pressing the right button of the mouse will make a menu appear, and the graphics can be saved as Postscript by choosing the appropriate menu entry (SAVE FORMAT POSTSCRIPT).
Chapter 5
Linear Algebra
Usually, one learns about linear algebra in the first year of study, but often one needs it much later, when one has forgotten most of it already. MATLAB means MATrix LABoratory, and its first version was written by Cleve Moler so that his students could learn linear algebra more easily.
General documentation of MATLAB can also be found at http://www.mathworks.com/access/helpdesk/help/techdoc/matlab.shtml
5.1
5.1.1
Matrix Manipulation
Matrix commands
The diagonal of a matrix can be extracted with the diag command in the following way:

A =
  0.520109  0.510104  0.010375
  0.340012  0.326988  0.782090
  0.470293  0.636776  0.900370
> diag(A)
ans =
  0.52011
  0.32699
  0.90037
If the input of the diag-command is a vector, diag constructs a matrix with the vector on the diagonal, a typical example of how commands are overloaded in MATLAB:

> b=[3 5 7]
b =
  3  5  7
> A=diag(b)
A =
  3  0  0
  0  5  0
  0  0  7
Because MATLAB knows the difference between column- and row-vectors, the transpose operator ' can also be used to transform column- into row-vectors and vice versa:

> v=[1 2 3 4 5]
v =
  1  2  3  4  5
> u=v'
u =
  1
  2
  3
  4
  5
For complex-valued matrices, the '-operator gives the Hermitian conjugate matrix:

> H=rand(3)+sqrt(-1)*rand(3)
H =
  0.59574 + 0.89043i  0.91601 + 0.87663i  0.19920 + 0.74066i
  0.71691 + 0.73996i  0.31324 + 0.44034i  0.19254 + 0.85119i
  0.38660 + 0.13756i  0.33661 + 0.71527i  0.29184 + 0.58186i
> G=H'
G =
  0.59574 - 0.89043i  0.71691 - 0.73996i  0.38660 - 0.13756i
  0.91601 - 0.87663i  0.31324 - 0.44034i  0.33661 - 0.71527i
  0.19920 - 0.74066i  0.19254 - 0.85119i  0.29184 - 0.58186i
The commands which extract the upper/lower triangular part of a matrix are triu/tril:

A =
  0.95165  0.08481  0.20836
  0.10917  0.58534  0.56293
  0.66712  0.52899  0.86092
> tril(A)
ans =
  0.95165  0.00000  0.00000
  0.10917  0.58534  0.00000
  0.66712  0.52899  0.86092
> triu(A)
ans =
  0.95165  0.08481  0.20836
  0.00000  0.58534  0.56293
  0.00000  0.00000  0.86092
If the columns or rows should be flipped, i.e. if their order should be inverted, this can be done with the commands flipud and fliplr (flip up-down and flip left-right):

> A
A =
  0.40541  0.73180
  0.79014  0.55208
> fliplr(A)
ans =
  0.73180  0.40541
  0.55208  0.79014
> flipud(A)
ans =
  0.79014  0.55208
  0.40541  0.73180
These two commands can be used to form a transposition for a complex matrix which is not the Hermitian conjugate:

> A=rand(2)+sqrt(-1)*rand(2)
A =
  0.839504 + 0.572899i  0.466803 + 0.675260i
  0.086815 + 0.252680i  0.132638 + 0.086518i
> B=fliplr(flipud(A))
B =
  0.132638 + 0.086518i  0.086815 + 0.252680i
  0.466803 + 0.675260i  0.839504 + 0.572899i

5.2
Matrix Products
For matrices and vectors, there are a lot of ways products can be computed. There is no difference between vectors and matrices; a vector is just a matrix with only one row or column.
The simplest form is the elementwise product, which uses the operator .*:

> u=[1 2 3 4]
u =
  1  2  3  4
> v=[5 6 7 8]
v =
  5  6  7  8
> u.*v
ans =
   5  12  21  32
The inner product of a row-vector u and a column-vector w is computed with the operator *:

> u=[1 2 3 4]
u =
  1  2  3  4
> w=[1 1 2 2]'
w =
  1
  1
  2
  2
> u*w
ans = 17
If instead of u*w we compute w*u, the result is the outer product:

> w*u
ans =
  1  2  3  4
  1  2  3  4
  2  4  6  8
  2  4  6  8
Matrices can be treated in the same way as vectors, with elementwise multiplication .* or multiplication in the sense of linear algebra:

> A=[1 2
> 3 4]
A =
  1  2
  3  4
> B=[1 -1
> -2 2]
B =
   1  -1
  -2   2
> A*B
ans =
  -3   3
  -5   5
> A.*B
ans =
   1  -2
  -6   8
A matrix-vector product is performed like this:

> A=[1 2
> 3 4]
A =
  1  2
  3  4
> v=[1
> 2]
v =
  1
  2
> A*v
ans =
   5
  11
MATLAB also has the Kronecker product as a builtin function:

> u=[1 2 3 4]
u =
  1  2  3  4
> v=[5 6 7 8]
v =
  5  6  7  8
> kron(u,v)
ans =
   5   6   7   8  10  12  14  16  15  18  21  24  20  24  28  32
> kron(u',v)
ans =
   5   6   7   8
  10  12  14  16
  15  18  21  24
  20  24  28  32
Whereas the elementwise matrix product computed with .* is commutative, the matrix product computed with * is of course not commutative.
5.3
The angle between two vectors v and w of any finite dimension can be computed via their inner/scalar product as

cos(theta) = (v.w) / ( sqrt(v.v) sqrt(w.w) ).
For the inner/scalar product, we have the Cauchy-Schwarz inequality

|v.w| <= sqrt(v.v) sqrt(w.w).
Vectors for which the scalar product is 0 are called orthogonal. Whereas orthogonality of two vectors v and w can be defined in theoretical mathematics as the property that their scalar product is zero, v.w = 0, in numerical mathematics it is necessary to define orthogonality in a way which takes possible rounding errors into account, as the following example shows:
> w=[sqrt(3) sqrt(3)]
w =
  1.73205080756888  1.73205080756888
> v=[sqrt(3) -sqrt(3)]
v =
  1.73205080756888  -1.73205080756888
> v*w'
ans = -9.64636952420157e-17
Obviously the last result should be exactly zero, but due to the rounding errors in the
computation, there is a finite error. How the definition of orthogonality can be applied in
such a way that rounding errors are taken into account can be seen in the next section about
the rank of matrices.
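A rounding-error-aware orthogonality test therefore compares the scalar product against a tolerance scaled by the vector lengths; one possible sketch (the choice of tolerance is ours):

```matlab
v   = [sqrt(3) -sqrt(3)];  w = [sqrt(3) sqrt(3)];
tol = length(v)*eps*norm(v)*norm(w);  % tolerance scaled with the problem size
abs(v*w') < tol                       % returns 1 (true): numerically orthogonal
```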
5.3.1
> size(A)
ans =
  2  2

size gives a two-element row vector as the answer; the number of columns is size(A,2), the number of rows size(A,1). The length of a row/column vector v can be computed with size(v,2) / size(v,1).
The rank of a matrix is, in theoretical linear algebra, the number of linearly independent rows/columns. Because linear independence can be tested in a similar way as orthogonality, we will use the rank computation as the criterion for orthogonality. The rank of a matrix can be computed with MATLAB's rank command. How the rank command works will be explained later in the section about the singular value decomposition, along with how one should choose the optional threshold in MATLAB's rank command. First let us review some theorems about the rank of matrices1.
5.3.2
The outer product of two vectors always gives a matrix of rank 1.
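For example (the vectors are our choice):

```matlab
u = [1;2;3];  v = [4 5 6];
rank(u*v)     % the 3x3 outer product u*v has rank 1
```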
Random matrices nearly always have full rank, i.e. the rank of a matrix constructed with rand is the same as the number of columns/rows, and if the number of rows/columns is larger than the number of columns/rows, we have rank(A)=min(size(A)):
> A=rand(3,4)
A =
  0.23382  0.43570  0.42862  0.97961
  0.79868  0.34546  0.69142  0.74305
  0.66927  0.71192  0.15419  0.11667
> rank(A)
ans = 3
Square matrices which have a rank smaller than their number of columns/rows are called singular. They cannot be inverted, and systems of linear equations whose equations form a singular matrix cannot be solved. Their determinant vanishes.
The rank of a matrix does not change through transposition, complex conjugation or Hermitian conjugation.
The product of non-singular matrices has the same rank as the matrices themselves:

> A=rand(3,4);
> B=rand(3);
> C=B*A;
> rank(A)
ans = 3
> rank(B)
ans = 3
> rank(C)
ans = 3

1 Roger A. Horn, Charles R. Johnson, Matrix Analysis, Cambridge University Press 1991
For rank-deficient matrices, the rank of the product matrix is the same as that of the factor with the lowest rank:

> A=rand(2,4)
A =
  0.200421  0.795092  0.896583  0.454798
  0.838726  0.220597  0.018236  0.018493
> A(3,:)=A(2,:)
A =
  0.200421  0.795092  0.896583  0.454798
  0.838726  0.220597  0.018236  0.018493
  0.838726  0.220597  0.018236  0.018493
> B=rand(3)
B =
  0.94359  0.31700  0.20635
  0.50896  0.36833  0.40063
  0.48172  0.19705  0.42594
> C=B*A
C =
  0.62806  0.86569  0.85555  0.43882
  0.74695  0.57430  0.47035  0.24569
  0.61906  0.52044  0.44326  0.23061
> rank(A)
ans = 2
> rank(B)
ans = 3
> rank(C)
ans = 2
5.3.3
Rank-Inequalities
5.3.4
Norms of a Matrix
Every matrix norm can also be used as a vector norm, but not vice versa. Therefore, we explain here only the definitions for matrix norms. Analogous to real and complex scalars, one wants to use something like an absolute value also for matrices. Something which behaves like an absolute value under addition is the norm of a matrix.
Properties of matrix norms:

||A|| >= 0 (non-negativity)
||A|| = 0 only if A = 0
||cA|| = |c| ||A|| for all real and complex c (homogeneity)
||A + B|| <= ||A|| + ||B|| (triangle inequality)
||A B|| <= ||A|| ||B|| (sub-multiplicativity)
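MATLAB's norm command implements the common matrix norms; a small example (the matrix is our choice):

```matlab
A = [1 2; 3 4];
norm(A,1)       % maximum absolute column sum: 6
norm(A,inf)     % maximum absolute row sum: 7
norm(A,'fro')   % Frobenius norm: sqrt(1+4+9+16) = sqrt(30)
norm(A)         % 2-norm: the largest singular value
```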
4. Frobenius norm: ||A||_Fro = sqrt( sum_{i,j} |A_{i,j}|^2 ), computed with norm(A,'fro')
5.3.5
Determinant of a Matrix
The norm only fulfills sub-multiplicativity, i.e. the norm of a matrix product is equal to or smaller than the product of the norms of the factors. An absolute value which fulfills multiplicativity is the determinant, which can be computed in MATLAB via det(A):

det(A) det(B) = det(A B)
The exchange of two adjacent columns/rows inverts the sign of the determinant:

> A=rand(3)
A =
  0.209224  0.413728  0.212479
  0.106481  0.192283  0.074438
  0.291095  0.436435  0.508115
> det(A)
ans = -0.0017939
> B=[A(:,2) A(:,1) A(:,3)]
B =
  0.413728  0.209224  0.212479
  0.192283  0.106481  0.074438
  0.436435  0.291095  0.508115
> det(B)
ans = 0.0017939
The determinant of the identity matrix is one, independent of its dimension.
Never use the Cramer rule or the Jacobi expansion for the computation of a determinant; it is wasteful and numerically unstable.
The numerically most suitable computation method for determinants is the so-called LU-decomposition, where the matrix A is decomposed into a product of a lower triangular matrix L with 1s on the diagonal and an upper triangular matrix U as

L U = A.

The determinant of A is then the product of the diagonal entries of U. Row- and column-permutations, so-called pivoting, increase the numerical accuracy of the decomposition; for details, see [Gol89]. The MATLAB command which computes the matrix determinant via LU-decomposition is det.
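The connection between det and the LU-decomposition can be checked directly (the matrix is our choice); with partial pivoting, the sign of the permutation has to be taken into account:

```matlab
A = [4 3; 6 3];
[L,U,P] = lu(A);        % P*A = L*U with partial pivoting
det(P)*prod(diag(U))    % determinant from the diagonal of U (det(P) = +-1)
det(A)                  % the same value via the det command
```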
5.3.6
Matrix inverses
Nonsingular square matrices are inverted with the inv command. The elementwise division of one matrix by another is written in MATLAB as A./B, where all entries of the divisor matrix must be != 0 to avoid division by zero. This is totally different from the matrix division A/B, which corresponds to the multiplication of matrix A with the inverse of matrix B:
> A=rand(2)
A =
  0.29975  0.85007
  0.88812  0.33290
> B=rand(2)
B =
  0.89979  0.72370
  0.53648  0.97567
> A/B
ans =
  -0.33410   1.11909
   1.40492  -0.70090
> C=inv(B)
C =
   1.9926  -1.4780
  -1.0956   1.8376
> A*C
ans =
  -0.33410   1.11909
   1.40492  -0.70090
Because the product C*A does not necessarily give the same result as the product A*C, there is also the left division of a matrix, which with the above matrices gives

> C*A
ans =
  -0.71537   1.20181
   1.30362  -0.31963
> B\A
ans =
  -0.71537   1.20181
   1.30362  -0.31963
If one tries to invert a singular matrix, MATLAB gives a result (usually wrong) and issues a warning:

> A=[1 1
> 1 1]
A =
  1  1
  1  1
> inv(A)
warning: inverse: matrix singular to machine precision, rcond = 0
ans =
  1  1
  1  0
> B=inv(A)
warning: inverse: matrix singular to machine precision, rcond = 0
B =
  1  1
  1  0
> B*A
ans =
  2  2
  1  1
5.4
Writing out the matrix product

a_{ij} = sum_k b_{ik} c_{kj}

requires three nested loops over i, j and k. Therefore, there are 6 possible orders in which to program the loops, but basically, there are only two possibilities:
clear
format long
n=20
b=randn(n).*10.^(16*randn(n));
c=randn(n).*10.^(16*randn(n));
tic
% Version 1: inner-product kernel
a2=zeros(n);
for j=1:n
  for i=1:n
    a2(i,j)=b(i,:)*c(:,j);
  end
end
toc
tic
% Version 2: Daxpy-Product
a3=zeros(n);
for j=1:n
  for k=1:n
    for i=1:n
      a3(i,j)=a3(i,j)+b(i,k)*c(k,j);
    end
  end
end
toc
tic
% Version 3: daxpy with colon notation (equivalent to Version 2)
a4=zeros(n);
for j=1:n
  for k=1:n
    a4(:,j)=a4(:,j)+c(k,j)*b(:,k);
  end
end
toc
return
We have also included the tic and toc commands to profile the time used for a matrix multiplication. It can be seen that MATLAB performs much faster if the inner loop is evaluated using the :-notation.
The first version of the matrix-matrix multiplication has an inner vector product as its kernel, the inner part of the routine. The second version of the matrix multiplication has a kernel which can be written as

y = a*x + y,

an operation whose left side in words is "A X Plus Y", for which the acronym SAXPY or DAXPY (S for single, D for double precision) is in use.
It turns out that both operations are numerically equivalent, and both need 2 n^3 floating point operations (multiplications and additions) for n x n matrices.
It is common to give the speed of computers by how many Floating Point Operations Per Second (Flops) they can perform. Modern PCs are in the range of a few hundred MFlops, workstations are nowadays in the GFlops range, and the Earth Simulator, a supercomputer near Yokohama, can do about 40 TeraFlops.
Using programs to test the speed of computers is called benchmarking.
5.5
5.5.1
We write the system of linear equations in matrix form,

  [ a_{1,1}  a_{1,2} ...  a_{1,k} ] [ x_1 ]   [ b_1 ]
  [ a_{2,1}  a_{2,2} ...  a_{2,k} ] [ x_2 ] = [ b_2 ]
  [   ...      ...   ...    ...   ] [ ... ]   [ ... ]
  [ a_{k,1}  a_{k,2} ...  a_{k,k} ] [ x_k ]   [ b_k ]

and transform the matrix and the right-hand-side vector b via elementary row- and column-operations (subtracting multiples of some rows from other rows) to upper triangular form, where all the elements below the diagonal are 0:
  [ a'_{1,1}  a'_{1,2} ...  a'_{1,k} | b'_1 ]
  [    0      a'_{2,2} ...  a'_{2,k} | b'_2 ]
  [   ...       ...    ...    ...    | ...  ]
  [    0         0     ...  a'_{k,k} | b'_k ]
From the last row, x_k = b'_k / a'_{k,k}; the remaining unknowns follow successively, e.g.

x_{k-2} = ( b'_{k-2} - a'_{k-2,k-1} x_{k-1} - a'_{k-2,k} x_k ) / a'_{k-2,k-2}

and in general

x_i = ( b'_i - sum_{j=i+1}^{k} a'_{i,j} x_j ) / a'_{i,i}
This scheme of eliminating elements so that a triangular coefficient matrix survives, for which the unknowns can be computed in a trivial way, is called Gaussian elimination. As an example, consider the system
$$\begin{pmatrix} 9 & 3 & 4\\ 4 & 3 & 4\\ 1 & 1 & 1 \end{pmatrix}
\begin{pmatrix} x_1\\ x_2\\ x_3 \end{pmatrix}
=
\begin{pmatrix} 7\\ 8\\ 3 \end{pmatrix},$$
which in augmented form reads
$$\left(\begin{array}{ccc|c} 9 & 3 & 4 & 7\\ 4 & 3 & 4 & 8\\ 1 & 1 & 1 & 3 \end{array}\right).$$
First we exchange the first and the last row, so that the pivot element becomes 1:
$$\left(\begin{array}{ccc|c} 1 & 1 & 1 & 3\\ 4 & 3 & 4 & 8\\ 9 & 3 & 4 & 7 \end{array}\right).$$
Next, we subtract 4 times the first row from the second row and 9 times the first row from the last row:
$$\left(\begin{array}{ccc|c} 1 & 1 & 1 & 3\\ 0 & -1 & 0 & -4\\ 0 & -6 & -5 & -20 \end{array}\right).$$
Finally, we add -6 times the second row to the last row, and obtain the triangular system
$$\left(\begin{array}{ccc|c} 1 & 1 & 1 & 3\\ 0 & -1 & 0 & -4\\ 0 & 0 & -5 & 4 \end{array}\right),$$
from which we can compute the unknowns successively as $x_3 = -4/5$, $x_2 = 4$ and $x_1 = -1/5$.
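The whole scheme, elimination followed by back substitution, can be sketched compactly; the following is an illustrative sketch in Python, applied to the $3\times 3$ example above:

```python
# Minimal Gaussian elimination with partial pivoting and back substitution,
# applied to the 3x3 example; expected solution x = (-1/5, 4, -4/5).
def gauss_solve(A, b):
    n = len(A)
    A = [row[:] for row in A]   # work on copies
    b = b[:]
    for col in range(n):
        # partial pivoting: move the largest entry of the column to the diagonal
        p = max(range(col, n), key=lambda r: abs(A[r][col]))
        A[col], A[p] = A[p], A[col]
        b[col], b[p] = b[p], b[col]
        for row in range(col + 1, n):
            m = A[row][col] / A[col][col]
            for j in range(col, n):
                A[row][j] -= m * A[col][j]
            b[row] -= m * b[col]
    # back substitution on the triangular system
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        x[i] = (b[i] - sum(A[i][j] * x[j] for j in range(i + 1, n))) / A[i][i]
    return x

x = gauss_solve([[9.0, 3.0, 4.0], [4.0, 3.0, 4.0], [1.0, 1.0, 1.0]], [7.0, 8.0, 3.0])
```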
5.5.2
For numerical purposes, the two steps, reduction to a triangular system (elimination step) and backward substitution (solution step), are often split into two routines. A common collection of subroutines for numerical linear algebra is the LINPACK package, which includes matrix inversions and orthogonalization methods for real and complex matrices. MATLAB's routines for linear algebra are basically routines from LINPACK, and Cleve Moler, the inventor of MATLAB, was also a co-author of LINPACK.
a =
  -1.0688456   0.5834664  -0.0174380
   0.0473232  -0.6955340  -0.2883381
  -0.5952438  -0.0617007  -1.1060823
> [l,u]=lu(a)
l =
   1.000000000000000   0.000000000000000   0.000000000000000
  -0.044275098518011   1.000000000000000   0.000000000000000
   0.556903499136249   0.577325122175704   1.000000000000000
u =
  -1.068845629692078   0.583466410636902  -0.017438033595681
   0.000000000000000  -0.669700958047086  -0.289110165605131
   0.000000000000000   0.000000000000000  -0.929460456605607
The solution of a linear system $A\vec x = \vec b$ can be computed in MATLAB with the slash command, which is not only a division for scalars,
> 5/4
ans = 1.25000000000000
> 5\4
ans = 0.800000000000000
but is also defined for matrices. The algebraic meaning is
$$A\backslash B = A^{-1}B,\qquad A/B = A\,B^{-1},$$
and for matrices (remember that MATLAB means MATrix LABoratory), these are not necessarily the same. The solution of $A\vec x = \vec b$ can be obtained by formally dividing through $A$ from the left,
$$A\vec x = \vec b \;\Rightarrow\; A\backslash A\vec x = A\backslash\vec b \;\Rightarrow\; \vec x = A\backslash\vec b.$$
The solution of the linear system, including a test of whether $A\vec x$ really equals $\vec b$, can then be programmed in the following way:
> A=rand(3)
A =
   0.63356   0.25786   0.71159
   0.98480   0.13788   0.62761
   0.60858   0.76457   0.90059
> b=rand(3,1)
b =
   0.81931
   0.61835
   0.14195
> x=A\b
x =
  -0.74296
  -2.36996
   2.67169
> A*x
ans =
   0.81931
   0.61835
   0.14195
In LINPACK, the elimination step is called factoring (because the LU-decomposition produces two factors, L and U), and the Double precision GEneral matrix FActoring routine is therefore called DGEFA. The solution/substitution step is DGESL, SL for solution.
There exists also a LINPACK benchmark, which sets up matrices in a well-defined way, computes the matrix inverses, then counts the floating point operations, measures the time, and computes the Flop rate. In this way, the speed of computers has been evaluated for decades.²
5.5.3
Matrix inversion
A matrix inversion can be computed in the same way as the solution of a linear system, as we see if we write the problem as
$$A\,A^{-1} = \begin{pmatrix}
1 & 0 & \cdots & 0\\
0 & 1 & \cdots & 0\\
\vdots & \vdots & \ddots & \vdots\\
0 & 0 & \cdots & 1
\end{pmatrix},$$
so that the $i$-th column of $A^{-1}$ is the solution of the linear system with the $i$-th unit vector as right-hand side.
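This column-by-column view can be made concrete in a small sketch; the following Python illustration (with a hypothetical little 2x2 Cramer solver, not part of the original text) inverts a matrix by solving against unit vectors:

```python
# Sketch: the i-th column of the inverse solves A x = e_i (i-th unit vector);
# here for a 2x2 matrix with an explicit little solver.
def solve2(A, b):
    # Cramer's rule for a 2x2 system
    det = A[0][0] * A[1][1] - A[0][1] * A[1][0]
    return [(b[0] * A[1][1] - b[1] * A[0][1]) / det,
            (A[0][0] * b[1] - A[1][0] * b[0]) / det]

def invert2(A):
    cols = [solve2(A, e) for e in ([1.0, 0.0], [0.0, 1.0])]
    # the solutions are the columns of A^-1, so transpose the list of columns
    return [[cols[0][0], cols[1][0]], [cols[0][1], cols[1][1]]]

A = [[4.0, 7.0], [2.0, 6.0]]
Ainv = invert2(A)
```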
² http://www.netlib.org/benchmark/linpackd (in FORTRAN, but also available in other languages); the results are in http://www.netlib.org/benchmark/linpackd/performance.ps
5.5.4
clear
format compact
n=150
A=randn(n);
b=randn(n,1);
flops(0)   % reset the floating point operation counter
x1=A\b;
flops      % operations for the solution of the linear system
flops(0)
x2=inv(A)*b;
flops      % operations for the explicit inversion
return
>>
n =
150
ans =
2419042
ans =
6907967
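The measured operation counts agree with the leading-order cost formulas: solving via elimination needs about $(2/3)n^3$ operations, explicit inversion about $2n^3$. A quick check (Python sketch, comparing against the flops output above):

```python
# Rough operation counts for n = 150, to compare with the measured flops
# above (2419042 for A\b and 6907967 for inv(A)*b); the leading-order
# formulas are (2/3)*n^3 for Gaussian elimination and 2*n^3 for inversion.
n = 150
cost_solve = 2 * n**3 // 3      # elimination + back substitution
cost_invert = 2 * n**3          # full inverse
print(cost_solve, cost_invert)  # 2250000 6750000
```

Both estimates are within about 10% of the counts MATLAB reports; the remainder is lower-order terms of the algorithms.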
Up to now, we have not discussed the error in matrix inversions. As we have not used any order of approximation, it is clear that there will be neither truncation error nor discretization error, and only rounding errors have to be taken into consideration. As a test case for the matrix inversion, let us consider the matrix
$$A = \begin{pmatrix} 1 & 1\\ 1 & 1+\varepsilon \end{pmatrix},$$
which has the inverse
$$A^{-1} = \begin{pmatrix} (1+\varepsilon)/\varepsilon & -1/\varepsilon\\ -1/\varepsilon & 1/\varepsilon \end{pmatrix}.$$
For $\varepsilon = 10^{-8}$, MATLAB displays the matrix as
A =
   1.000000000000000   1.000000000000000
   1.000000000000000   1.000000010000000
and computes the inverse as
ans =
   99999999.4975241  -99999999.4975241
  -99999998.4975241   99999999.4975241
whereas the exact inverse is
$$\begin{pmatrix} 100000001 & -10^8\\ -10^8 & 10^8 \end{pmatrix}.$$
What went wrong? The numerical parameter which describes how accurately a matrix inversion can be computed, or a linear system can be solved, is the condition number $\kappa$, which is implemented in MATLAB's cond function. The condition number of a matrix $A$ is defined as the norm of the matrix times the norm of the inverse matrix,
$$\kappa = \|A\|\cdot\|A^{-1}\| \ge 1.$$
There is a heuristic which says that if the condition number of a matrix $A$ is $10^k$, about $k$ digits will be lost in accuracy in a matrix inversion. For our above matrix with $\varepsilon = 10^{-8}$, the condition number is $\kappa \approx 4\cdot 10^8$. As the error is about 0.5 for a matrix whose entries are of the order of $10^8$, we see that the prediction of the heuristic is quite accurate.
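The digit-loss heuristic can be reproduced in a few lines; the following Python sketch uses a hand-rolled infinity-norm estimate instead of MATLAB's cond (an assumption of this illustration, not part of the original text):

```python
# Sketch of the digit-loss heuristic for A = [[1, 1], [1, 1 + eps]]:
# the infinity-norm condition number is about 4/eps, and for eps = 1e-8
# the computed inverse entries (~1e8) are off by an amount of order one.
eps_ = 1e-8
A = [[1.0, 1.0], [1.0, 1.0 + eps_]]
det = A[0][0] * A[1][1] - A[0][1] * A[1][0]     # tiny, and inexact in floats
inv11 = A[0][0] / det                           # exact value would be 1/eps = 1e8
norm_A = max(abs(r[0]) + abs(r[1]) for r in A)  # infinity norm of A
norm_Ainv = (abs(A[1][1]) + 1.0) / abs(det)     # infinity norm of A^-1
kappa = norm_A * norm_Ainv                      # ~4e8, so ~8 digits are lost
```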
We have discussed that there are two possible implementations of the matrix multiplication, the DOT and the DAXPY product. The LU-decomposition can formally be written as an operator $O$ acting on the original matrix $A$, so that formally
$$O \circ A = L \cdot U.$$
For the test matrix above, corresponding entries computed with the two kernels came out as
-1.00000000
1.00000000
and the error was not in the eighth digit, but already in the first digit! The scalar product as a kernel introduces rounding errors which cannot be predicted with the conventional formula using the condition number.
5.6
Eigenvalues
The eigenvalues can be computed in MATLAB via the eig command. For a random matrix A, we obtain the eigenvalues as
>> A=randn(2)
A =
   0.5181   0.8397
  -1.2274   0.1920
>> eig(A)
ans =
   0.3551 + 1.0020i
   0.3551 - 1.0020i
so one can see that the eigenvalues of a real square matrix are in general not real. For a symmetric matrix,
>> A=A+A'
A =
   1.0363  -0.3876
  -0.3876   0.3840
>> eig(A)
ans =
   1.2167
   0.2035
we obtain real eigenvalues. Formally, the eigenvalues $\lambda_i$ of a matrix $A$ are often introduced as the roots of the characteristic polynomial of $A$,
$$\det(A - \lambda E) = 0,\qquad E = \begin{pmatrix} 1 & 0 & \cdots & 0\\ 0 & 1 & \cdots & 0\\ \vdots & \vdots & \ddots & \vdots\\ 0 & 0 & \cdots & 1 \end{pmatrix}.$$
If we compute for a diagonal matrix
$$\det\left(\begin{pmatrix} a & 0\\ 0 & b \end{pmatrix} - \lambda\begin{pmatrix} 1 & 0\\ 0 & 1 \end{pmatrix}\right) = (a-\lambda)(b-\lambda) = 0,$$
we see that the solutions for $\lambda$ are exactly $a$ and $b$. In other words, the eigenvalues of a diagonal matrix are the diagonal matrix entries themselves. As we have seen above, the eigenvalues of a real symmetric matrix are real. If we look at the characteristic polynomial, we see that for an upper triangular matrix, the off-diagonal elements drop out of the characteristic polynomial, so also for a triangular matrix the eigenvalues are exactly the diagonal elements:
A =
   1.0363  -0.3876
        0   0.3840
>> eig(A)
ans =
   1.0363
   0.3840
5.6.1
5.6.2
The largest eigenvalue of a matrix can also be found by repeatedly multiplying a vector with the matrix and normalizing it:

clear
format compact
format long
v=[1
1];
v=v/norm(v);
A=[1 2
2 3];
eig(A)
for i=1:8
v=A*v;
norm_of_v=norm(v)
v=v/norm(v);
end

The largest eigenvalue returned by eig is 4.23606797749979, and the norms printed in the loop converge to it:
4.12310562561766
4.23570259468110
4.23606684261261
4.23606797397526
4.23606797748884
4.23606797749976
4.23606797749979
4.23606797749979
Now, if we have the matrix
$$A = \begin{pmatrix} 1 & 2\\ 2 & 3 \end{pmatrix},$$
we can call eig not only to compute the eigenvalues; using two output arguments in the brackets [], we can also obtain the eigenvectors as
>> [u,l]=eig(A)
u =
   0.85065080835204   0.52573111211913
  -0.52573111211913   0.85065080835204
l =
  -0.23606797749979                  0
                  0   4.23606797749979
(the eigenvalues l are then not output as a vector, but as a diagonal matrix). In our above example, where we iteratively multiplied the vector v with the matrix A, the end result for v is
>> v
v =
   0.52573111213781
   0.85065080834049
which is the right column of u, and therefore the eigenvector to the larger eigenvalue $\lambda_2 = 4.23606797749979$. In other words, our iterative multiplication of a vector with the matrix is a way to find the largest eigenvalue and the eigenvector corresponding to this largest eigenvalue, and in the literature, this method is often called the power method, because it corresponds to multiplying a power of $A$ onto $\vec v$:
$$A^n \vec v \propto \lambda_{\max}^n\, \vec u_{\max}\quad\text{for large } n.$$
The matrix u which contains the eigenvectors is at the same time the transformation which transforms A to diagonal form, so that
$$u^{T} A\, u = \begin{pmatrix} -0.23606797749979 & 0\\ 0 & 4.23606797749979 \end{pmatrix}.$$
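The power method from the MATLAB loop above can be sketched in a few lines of Python; the largest eigenvalue of $A = \begin{pmatrix}1&2\\2&3\end{pmatrix}$ is $2+\sqrt 5 = 4.23606\ldots$:

```python
# Power-method sketch for A = [[1, 2], [2, 3]]: repeated multiplication and
# normalization; the printed norms converge to the largest eigenvalue 2 + sqrt(5).
from math import sqrt

A = [[1.0, 2.0], [2.0, 3.0]]
v = [1.0, 1.0]
for _ in range(40):
    w = [A[0][0] * v[0] + A[0][1] * v[1],
         A[1][0] * v[0] + A[1][1] * v[1]]
    lam = sqrt(w[0] ** 2 + w[1] ** 2)   # norm of A*v, as in the MATLAB loop
    v = [w[0] / lam, w[1] / lam]        # normalize for the next iteration
```

After enough iterations, lam approximates the modulus of the dominant eigenvalue and v the corresponding eigenvector.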
5.6.3
Then one can set up the so-called companion matrix $C_P$ for $P(x)$, e.g. as
$$C_P = \begin{pmatrix}
0 & 1 & 0 & \cdots & 0\\
0 & 0 & 1 & \cdots & 0\\
\vdots & & & \ddots & \vdots\\
0 & 0 & \cdots & 0 & 1\\
-a_0 & -a_1 & -a_2 & \cdots & -a_{k-1}
\end{pmatrix},$$
and the eigenvalues of $C_P$ are the roots of the polynomial $P(x)$. For example, the polynomial
$$P(x) = x^3 - 2x^2 - 5x + 6 = 0$$
has the roots $x_1 = 1$, $x_2 = -2$, $x_3 = 3$. If we set up the companion matrix as
C =
    0    1    0
    0    0    1
   -6    5    2
its eigenvalues, computed with eig(C), are exactly these roots.
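The connection can be checked without an eigenvalue solver: the characteristic polynomial of the companion matrix vanishes at the roots of $P$. A small Python sketch (with an explicit 3x3 determinant, an illustration only):

```python
# Companion-matrix sketch for P(x) = x^3 - 2x^2 - 5x + 6: the characteristic
# polynomial of C equals P up to sign, so P's roots 1, -2, 3 are the
# eigenvalues of C; we check det(C - x*I) = 0 at those points.
C = [[0.0, 1.0, 0.0],
     [0.0, 0.0, 1.0],
     [-6.0, 5.0, 2.0]]

def det3(M):
    # explicit 3x3 determinant (cofactor expansion along the first row)
    return (M[0][0] * (M[1][1] * M[2][2] - M[1][2] * M[2][1])
            - M[0][1] * (M[1][0] * M[2][2] - M[1][2] * M[2][0])
            + M[0][2] * (M[1][0] * M[2][1] - M[1][1] * M[2][0]))

def char_poly(x):
    M = [[C[i][j] - (x if i == j else 0.0) for j in range(3)] for i in range(3)]
    return det3(M)

vals = [char_poly(x) for x in (1.0, -2.0, 3.0)]  # all zero at the roots
```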
5.6.4
Stability Analysis
Eigenvalues play an important role in stability analysis, i.e. in the analysis of whether a numerical problem is stable or not. Instability usually results from eigenvalues which are larger than 1 (or, in some cases, different from 1). As an example of how eigenvalues enter in the solution of problems, let us look at the example problem for the ordinary differential equations:

function dydt = f(t,y)
% necessary parameters as global variables
global D
global omega0
dydt=zeros(2,1);
% velocity component
dydt(1)=-omega0^2 * y(2) - 2*D*y(1);
% position component
dydt(2)=y(1);
return
This can also be written in matrix notation as
$$\frac{d}{dt}\begin{pmatrix} v\\ x \end{pmatrix} = \underbrace{\begin{pmatrix} -2D & -\omega_0^2\\ 1 & 0 \end{pmatrix}}_{A}\begin{pmatrix} v\\ x \end{pmatrix}.$$
The integration step is governed by $A\,dt$, so in fact errors in the time integration can be analyzed by analyzing the eigenvalues of $A\,dt$. For this harmonic oscillator, the matrix is constant in time, and therefore so are its eigenvalues, so the problem can also be analyzed purely analytically. For more complicated problems, like the ordinary differential equations of the Lorenz attractor,
$$\frac{dx}{dt} = \sigma(y - x),$$
$$\frac{dy}{dt} = rx - y - xz,$$
$$\frac{dz}{dt} = xy - bz,$$
with real constants $\sigma, r, b$, the right-hand side is obviously a non-linear function, because the time evolution $(\frac{dx}{dt}, \frac{dy}{dt}, \frac{dz}{dt})$ cannot be written as a product of a matrix $A$ independent of $x, y, z$ with the vector $(x, y, z)$, as in the case of the harmonic oscillator. The classical way to analyze the stability of such a system is to linearize the equations, usually a risky business, because the linearized matrix is not guaranteed to reproduce the full behavior of the non-linear system. The modern approach is to simply perform the time integration and output representative values for the eigenvalues of the matrix
$$B = \begin{pmatrix} -\sigma & \sigma & 0\\ r-z & -1 & -x\\ y & x & -b \end{pmatrix}.$$
Now let us solve the Lorenz model with constant stepsize using the Euler method and plot the eigenvalues of the matrix. We know already that the Euler method is bad, so our solution will be inaccurate, but it will be much more interesting to implement the Euler method in two different ways and see how the two solutions diverge from each other.
% Compute the Lorenz-Model
clear,format compact
n=20;
r=60;
b=8/3;
sigma=10;
t_max=1.3
dt=0.01 % diverges with this timestep: dt=0.011;
ndt=round(t_max/dt);
x=zeros(ndt,1);
mat_eig=zeros(ndt,1);
x(1)=1;
y=x; z=x;
bild=0;
k=[1; 1; 1];
k(:,2:ndt)=zeros(3,ndt-1);
prop=[-sigma sigma 0
0 -1 0
0 0 -b];
% direct solution of the ODE
for i=1:ndt-1
dx=sigma*(y(i)-x(i))*dt;
dy=(x(i)*(r-z(i))-y(i))*dt;
dz=(x(i)*y(i)-b*z(i))*dt;
x(i+1)=x(i)+dx;
y(i+1)=y(i)+dy;
z(i+1)=z(i)+dz;
end
% solution of the ODE with matrix-vector multiplications
for i=1:ndt-1
prop(2,1)=(r-k(3,i));
prop(3,1)=(k(2,i));
k(:,i+1)=dt*(prop*k(:,i))+k(:,i);
mat_eig(i+1)=max(abs(eig(prop*dt)));
end
subplot(4,1,1)
plot3(x(1:ndt),y(1:ndt),z(1:ndt));
subplot(4,1,2)
plot3(k(1,1:ndt),k(2,1:ndt),k(3,1:ndt));
subplot(4,1,3)
plot3(k(1,1:ndt)-x(1:ndt),...
k(2,1:ndt)-y(1:ndt),...
k(3,1:ndt)-z(1:ndt));
subplot(4,1,4)
plot(mat_eig)
This is the first surprise: two implementations of the Euler method don't give numerically identical results, and the difference increases if we increase the maximal time. The next surprise comes when we increase the timestep from dt=0.01 to dt=0.013. We can see that then the solution and the maximal eigenvalues start to diverge, and if the maximal time is increased, the program even crashes because it reaches infinity. Here we have found a property of the numerical solution of differential equations: the eigenvalues of the corresponding matrix times the time step must not become larger than 1, or the solution does not converge any more.
The eigenvalue spectrum obtained from the Euler method is also representative for the eigenvalues which we would obtain from higher-order methods like Runge-Kutta, which are themselves only a sophisticated concatenation of Euler steps with different step sizes.
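The eigenvalue-times-timestep criterion is easiest to see for a single linear equation; a Python sketch for $y' = -\lambda y$, where one Euler step multiplies $y$ by $(1 - \lambda\,dt)$:

```python
# Stability sketch for the Euler method on y' = -lam*y: one step multiplies
# y by (1 - lam*dt), so the numerical solution decays only if
# |1 - lam*dt| < 1, i.e. dt < 2/lam.
def euler_final(lam, dt, t_max=10.0, y0=1.0):
    y = y0
    for _ in range(int(round(t_max / dt))):
        y = y + dt * (-lam * y)
    return y

stable = abs(euler_final(1.0, 0.5))    # dt < 2/lam: |y| decays towards 0
unstable = abs(euler_final(1.0, 2.5))  # dt > 2/lam: |y| grows step by step
```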
5.6.5
As in the case of the matrix inversion, there is a parameter which tells how accurately the computation of the eigenvalues can be performed. In MATLAB, the function which gives the eigenvalue condition numbers (different from the condition number cond for matrix inverses) is called condeig; it returns a vector of condition numbers for the eigenvalues of the matrix.
Chapter 6
Ordinary differential equations
For ordinary differential equations there is a closed theory about which solution method should be applied in which case. In the case of ordinary differential equations, the total differential imposes additional constraints on the solution, so that the numerical equations can be satisfied more easily. In contrast, partial differential equations are much more difficult to treat numerically, because the boundary conditions impose certain constraints on the solution method, so that in the case of nonlinear equations, the optimal choice of a solution strategy is far from obvious.
6.1
Reference Example
6.1.1
Newton's equations of motion
Ordinary differential equations play an important role in science and engineering, and maybe the most central equation is Newton's equation of motion, which relates a time-, velocity- and position-dependent force $F(x,\dot x,t)$, mass $m$ and acceleration $a$:
$$F(x,\dot x,t) = ma.$$
Rewriting the equation with the second derivative of the position, $\ddot x$, we get
$$F(x,\dot x,t) = m\ddot x,$$
which due to the second derivative of $x$ is called an ordinary differential equation of second order. In general, it can be shown that $n$ ordinary differential equations of order $m$ can be rewritten into $n\cdot m$ coupled differential equations of first order. For the case of Newton's equations of motion, this can be done by introducing the velocity $v$ as the derivative of $x$, so that
$$F(x,\dot x,t) = m\dot v,\qquad \dot x = v.$$
Because standard texts in numerical analysis prefer to deal with first order differential equations, it is important to understand the latter form.
6.1.2
Linear oscillator
For simplicity, we set the mass $m = 1$. If the force takes the form $-2D\dot x - \omega_0^2 x$, which corresponds to a linear spring with linear damping, the equation of motion takes the form
$$\ddot x = -2D\dot x - \omega_0^2 x.$$
The solution of this equation with the damping term $2D$ and the frequency $\omega_0$ of the undamped oscillation is
$$x(t) = x_0 \exp(-Dt)\cos(\omega_D t),\qquad \omega_D = \sqrt{\omega_0^2 - D^2}.$$
Though there are solution schemes to solve second order equations directly, it is usually simpler to solve equations of second order by reducing them to a system of coupled first order equations. For our problem, we introduce the velocity $v$ and its time derivative $\dot v = a$, so this leads to the system of first order equations
$$\dot v = -2Dv - \omega_0^2 x,\qquad \dot x = v.$$
With the vector $\vec y = \binom{v}{x}$, this system (without writing out the explicit time dependence) takes the general form
$$\frac{d}{dt}\vec y = F(\vec y, t),$$
where $F(\vec y, t)$ becomes a vector-valued function with the time $t$ and the vector $\vec y$ as arguments.
6.2
When faced with a differential equation for a numerical implementation, intuitively one first wants to replace the differential operator $d$ by the finite difference
$$\frac{dy}{dt} \approx \frac{y(t+\Delta t) - y(t)}{\Delta t},$$
so that the first order approximation in the solution from one time step $t_i$ with value $y(t_i)$ and function value $F_i = F(y_i, t_i)$ to the next is
$$y(t_i + \Delta t) \approx y(t_i) + F_i\,\Delta t,$$
which is called the Euler method.
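The update rule and its first-order character can be sketched in Python on the test equation $y' = -y$ with exact solution $e^{-t}$:

```python
# Euler-method sketch for dy/dt = -y, y(0) = 1 (exact solution e^-t):
# halving the timestep roughly halves the error, i.e. the method is
# of first order.
from math import exp

def euler_error(dt, t_max=1.0):
    y, n = 1.0, int(round(t_max / dt))
    for i in range(n):
        y = y + dt * (-y)          # y(t + dt) ~ y(t) + F(y, t) * dt
    return abs(y - exp(-t_max))

ratio = euler_error(0.01) / euler_error(0.005)  # ~2 for a first-order method
```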
clear
format compact
D=.2 , omega0=1 % Damping and Force constant
x0=1 , v0=0 %Initial conditions
[Figure: Strategy of the Euler method: evaluate the value at the right side of the interval via the starting value and the tangent at the left side of the interval. Result of the Euler method for the damped harmonic oscillator (Euler, dt=0.1, vs. exact solution): the period and the amplitude are wrong.]
By construction, the Euler method is a first-order method, because we only retained the terms proportional to $\Delta t$ in the expansion. Of course, if we plot the absolute error for our exponentially vanishing solution, the error also vanishes exponentially; therefore we don't draw the error here.
6.2.1
Geometrically speaking, the Euler method chooses the starting value and the tangent at the same point for the integration, which is correct only in the infinitesimal limit. It can be seen that the results obtained from the Euler method are far from satisfying for our timestep of dt = 0.1. If we decrease the step size, we can reduce the discretization error for the time step, but there is a limit, which is reached for the above differential equation with dt = 0.001: even using 1/10 and 1/100 of that timestep does not change the result significantly any more - at 10 times and 100 times the computational cost.
[Figure: absolute error of the Euler method for timesteps dt = 10^-2, 10^-3, 10^-4, 10^-5.]
For some ordinary differential equations, the rounding error (from adding each new timestep) also increases when integrating over the same time interval, so in the limit $dt \to 0$ we won't obtain the correct result with the Euler method. Therefore there is one thing about the Euler method which should be kept in mind: NEVER USE THE EULER METHOD IN A SERIOUS APPLICATION¹
¹ Except for stochastic differential equations, where the stochastic noise destroys the systematic error, but even then there may be better choices ...
6.2.2
There are several strategies which can be used to reduce the error in the Euler method. One possibility is to use for the timestep from $t_0$ to $t_0 + dt$ the value of $F(y, t_0 + dt/2)$, in the middle of the interval, instead of $F(y, t_0)$ at the left of the interval, which results in a second order method. This is similar to the midpoint method in numerical quadrature, which gives a second order method, whereas the rectangular rule with the value at one end of the interval gives only a quadrature rule of first order.
[Figure: Strategy of the modified Euler method: evaluate the tangent in the middle of the interval, at t0 + dt/2. Result for the damped harmonic oscillator (dt = 0.1, vs. exact solution).]

6.2.3
Heun's method
Heun's method uses the value $y_0$ and the tangent $F(y_0, t_0)$ at the left of the interval to compute the Euler step to the right of the interval, $F(y_0 + dt\,F(y_0, t_0),\, t_0 + dt)$, as an estimate/prediction of the value at the right side of the interval; then it calculates as corrected value for $F(y, t)$ the average of $F(y_0, t_0)$ at the left-hand side of the interval and $F(y_0 + dt\,F(y_0, t_0),\, t_0 + dt)$ at the right side.
Heun's method is a second order method, and there is a certain structural similarity to the trapezoidal rule in quadrature, where also the left-hand value and the right-hand value are used.
Nevertheless, there are some new perspectives in this method which allow to develop a new class of integration methods for ordinary differential equations which are of higher than second order:
1. The idea of first advancing the time integration in a predictor step, then modifying the result in a corrector step, is the basis of the so-called predictor-corrector methods.
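The predictor-corrector structure can be sketched in Python on the same test equation $y' = -y$:

```python
# Heun's method (predictor-corrector sketch) for dy/dt = -y: halving
# the timestep reduces the error by ~4, i.e. the method is second order.
from math import exp

def heun_error(dt, t_max=1.0):
    y, n = 1.0, int(round(t_max / dt))
    for i in range(n):
        f1 = -y                       # tangent at the left side
        y_pred = y + dt * f1          # Euler step as predictor
        f2 = -y_pred                  # tangent at the predicted right side
        y = y + 0.5 * dt * (f1 + f2)  # corrector: average of both tangents
    return abs(y - exp(-t_max))

ratio = heun_error(0.01) / heun_error(0.005)  # ~4 for a second-order method
```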
clear; format compact

[Figure: Strategy of Heun's method: evaluate the values and tangents at the left and right side of the interval as predicted values and take the average as corrected value. Result of Heun's method for the damped harmonic oscillator (Heun, dt=0.1, vs. exact solution): the period and the amplitude are computed much more accurately than for the Euler method.]
6.2.4
Stability
Up to now we have focused in our investigations purely on accuracy, in the sense that a numerical solution will have some finite error in comparison to the exact solution of the problem, but will have more or less the same shape. Actually, a more fundamental problem in numerical analysis is stability: loosely speaking, the question whether a numerical solution has the same shape as the exact solution at all. Let us look at the following first order differential equation,
$$\frac{d}{dt}y(t) = 1 - t\,y^{1/3},$$
which for $y(0) = 1$ is strictly real in the interval [0,5]. The numerical solution overshoots for too large time-steps, as is shown in the following graphs for the numerical solution with the Euler method, so that $y(t)$ becomes negative, and MATLAB therefore delivers the complex root for the negative values of $y(t)$. The result of the numerical integration for too large time-steps shows a totally different shape than the exact solution, and is therefore called unstable. It is therefore a primary aim to choose numerical methods and time-steps so that the solution is stable; accuracy is only a secondary concern.
Regrettably, some methods which give very high accuracy for some problems give very poor stability for other problems. It is often advisable to check the stability of a method by using different time steps to see if the numerical solution changes, or not. If a small change of the time-step leads to only a small change in the solution, the solution is stable. The mathematical definition of stability is that a solution undergoes only a small change for a small change of the initial conditions, and in this respect, the time-step represents something like an initial condition.
[Figure: numerical solution (left) and imaginary component (right) for the Euler method with dt = 0.5, 0.25, 0.125, 0.05, compared with the exact solution.]
6.3
6.3.1
Before we proceed to higher order formulae, we should improve the readability of the code.
Here is a good opportunity to introduce MATLAB functions. For the Euler method, we had
% velocity
y(n+1,1)=y(n,1)+dt*(-omega0^2 * y(n,2)- 2*D*y(n,1) );
% position
y(n+1,2)=y(n,2)+dt*y(n,1);
As was emphasized in the first chapter, readability is paramount in programming, and whereas the Euler algorithm was still quite readable, for Heun's method we had to write
% velocity
y_pred1=y(n,1)+dt*(-omega0^2 * y(n,2)- 2*D*y(n,1) );
% position
y_pred2=y(n,2)+dt*y(n,1);
% velocity
y(n+1,1)=y(n,1)+dt*(-omega0^2*.5*(y(n,2)+y_pred2)- 2*D*.5*(y(n,1)+y_pred1));
% position
y(n+1,2)=y(n,2)+dt*.5*(y(n,1)+y_pred1);
and this is certainly not readable any more. One insight is that the force law for the time integration of the spring was entered twice, once for the predictor step and once for the corrector step, so it would be a good idea to put the force law evaluation into a MATLAB function.
6.3.2
Global variables
Most ordinary differential equations do not only need the time as an input parameter, which we specified in the above example; they need other input parameters as well. One of the simplest ways to incorporate such input parameters is via MATLAB's global attribute, which allows the specification of global variables, and which is of course not limited in its use to functions for ordinary differential equations.
6.3.3
MATLAB functions
A typical MATLAB function is the following code, where the constructor brackets [] have
to be used for the output arguments and the round brackets () for the input arguments:
function [output_arg1,output_arg2]=function_name(input_arg1,input_arg2);
% Comment following the function declaration; This comment will be displayed
% when you type
% "help function_name"
% from the MATLAB prompt.
global a
% a global variable , which must be declared as global
% somewhere else and initialized
output_arg1=input_arg1+input_arg2*a
output_arg2=input_arg1*input_arg2
return % end of function
The function is called for example as
[out1,out2]=function_name(25,24)
If not all input arguments are supplied in the function call, like
[out1,out2]=function_name(25)
MATLAB terminates with an error message as soon as the missing argument is accessed. MATLAB functions cannot overwrite input arguments in the calling program; in the following example,
function [output_arg1,output_arg2]=function_name(input_arg1,input_arg2);
output_arg1=input_arg1+input_arg2
output_arg2=input_arg1*input_arg2
input_arg1=15
return % end of function
the line
input_arg1=15
does not have any effect outside the function, because only output arguments (in the brackets []) are copied back to the calling program. If not all output or input arguments are assigned, MATLAB terminates with an error message. Overloading, the use of a variable number of input arguments, is possible; in this case one has to query the number of input arguments with the MATLAB function nargin and the number of output arguments with nargout. We will not treat overloading here, but it is easy to find examples for overloaded functions by looking at some MATLAB functions which exist in MATLAB code (most MATLAB functions are written in MATLAB) in the toolbox directory.
which histo
will display the directory in which the toolbox MATLAB function histo can be found, and it is possible to load the function into the editor and view the usage of overloading.
6.3.4
an argument. Moreover, we have initialized tout and yout so that we don't lose time by allocating new memory space when a new element is added to these vectors in each timestep:
function [tout,yout] = heun(yinit,tstart,tend,dt,f)
% Heun's method: Runge-Kutta integrator (2nd order)
% Input arguments:
%   yinit  = initial value of the dependent variable
%   tstart = start of the integration interval (usually time)
%   tend   = end of the integration interval
%   dt     = step size (usually timestep)
%   f      = right hand side of the ODE; f is the name of the
%            function which returns dy/dt, calling format f(y,t)
% Output arguments:
%   tout   = vector of times
%   yout   = values of y after each step of size dt
nsteps=ceil((tend-tstart)/dt);
if nsteps<0
tstart
tend
dt
error('tend-tstart is not a positive multiple of dt')
end
if (abs(nsteps*dt-(tend-tstart))>1e-6*dt)
disp('warning: time interval not a multiple of timestep')
disp('inputed timestep:')
dt
dt=(tend-tstart)/nsteps;
disp('use instead')
dt
end
if (size(yinit,1)==1)
yinit=yinit'; % make sure yinit is a column vector
end
% allocate necessary memory to save time:
yout=zeros(length(yinit),nsteps+1);
tout=zeros(1,nsteps+1);
yout(:,1)=yinit;
y=yinit;
tout(1)=tstart;
n=1;
for k=1:nsteps
F1 = feval(f,y,tout(n));    % tangent at the left side of the interval
t_full = tout(n) + dt;
ytemp = y + dt*F1;          % Euler step as predictor
F2 = feval(f,ytemp,t_full); % tangent at the predicted right side
n=n+1;
y = y + .5*dt*(F1 + F2);    % corrector: average of both tangents
yout(:,n)=y;
tout(n)=t_full;
end
return
This program can then be called from a driver routine (a routine which does nothing else
than call a specific function) in such a way:
clear
format compact
global D, D=.2
global omega0, omega0=1 % Damping and Force constant
omega_d=sqrt(omega0^2-D^2);
dt=0.1, t0=0, t_max=20 % time-step, start-time, end time
x0=1 % Initial conditions
v0=-D*exp(-D*t0)*cos(omega_d*t0)-omega_d*exp(-D*t0)*sin(omega_d*t0);
[t,y]=heun([v0;x0],t0,t_max,dt,'harm_osc');
% exact solution
y_ex=(x0*exp(-D*t).*cos(omega_d*t));
subplot(2,2,1)
plot(t,(y(2,:)-y_ex)./y_ex,':')
legend('Heun, dt=0.1','Exact')
axis tight
6.4
6.4.1
The idea to evaluate not only a single integration point, but moreover to compute within a single timestep more integration points from previously computed auxiliary steps, is realized in the so-called Runge-Kutta algorithms. The formulae for the so-called classical Runge-Kutta method are
$$k_1 = f(t_i, y_i),$$
$$k_2 = f(t_i + dt/2,\; y_i + (dt/2)\,k_1),$$
$$k_3 = f(t_i + dt/2,\; y_i + (dt/2)\,k_2),$$
$$k_4 = f(t_i + dt,\; y_i + dt\,k_3),$$
$$y_{i+1} = y_i + \frac{dt}{6}\left(k_1 + 2k_2 + 2k_3 + k_4\right).$$
yout(:,1)=yinit;
y=yinit;
tout(1)=tstart;
n=1;
half_dt = 0.5*dt;
dt_6=dt/6;
for k=1:nsteps
F1 = feval(f,y,tout(n));
t_half = tout(n) + half_dt;
ytemp = y + half_dt*F1;
F2 = feval(f,ytemp,t_half);
ytemp = y + half_dt*F2;
F3 = feval(f,ytemp,t_half);
t_full = tout(n) + dt;
ytemp = y + dt*F3;
F4 = feval(f,ytemp,t_full);
y = y + dt_6*(F1 + F4 + 2.*(F2+F3));
n=n+1;
yout(:,n) = y;
tout(n)=t_full;
end
return;
The same driver as for Heun's program above can be used, just with the line
[t,y]=heun([v0;x0],t0,t_max,dt,'harm_osc');
replaced by
[t,y]=rk4_class([v0;x0],t0,t_max,dt,'harm_osc');
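The four-stage scheme and its fourth-order convergence can be sketched in Python on the test equation $y' = -y$:

```python
# Classical Runge-Kutta (RK4) sketch for dy/dt = -y: halving the
# timestep reduces the error by ~16, i.e. the method is fourth order.
from math import exp

def rk4_error(dt, t_max=1.0):
    f = lambda y: -y
    y, n = 1.0, int(round(t_max / dt))
    for i in range(n):
        k1 = f(y)
        k2 = f(y + 0.5 * dt * k1)   # auxiliary point at the half step
        k3 = f(y + 0.5 * dt * k2)   # refined half-step tangent
        k4 = f(y + dt * k3)         # tangent at the full step
        y = y + dt / 6.0 * (k1 + 2 * k2 + 2 * k3 + k4)
    return abs(y - exp(-t_max))

ratio = rk4_error(0.1) / rk4_error(0.05)  # ~16 for a fourth-order method
```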
6.4.2
In the section on Euler's and Heun's method, we used the initial conditions $x_0 = 1$, $v_0 = 0$ to compare the numerical result with the exact solution
$$y_{ex} = x_0 \exp(-Dt)\cos(\omega_d t).$$
Actually, this is not the exact solution for the initial value problem with $v_0 = 0$, but for the initial value problem with
$$v_0 = -D\exp(-Dt_0)\cos(\omega_d t_0) - \omega_d \exp(-Dt_0)\sin(\omega_d t_0).$$
The improvement of the Runge-Kutta method compared to Heun's method would not have been visible, because the initial value for the integration would have been so far off that the numerical solution is quite wrong. Whenever one computes numerical solutions to compare them with exact solutions, one should be sure that they are the solutions for the identical problem.
[Figure: numerical solutions for the correct and the incorrect initial condition, compared with the exact solution.]
6.4.3
Accuracy
Now that we have outlined several algorithms of different order (different truncation error with respect to the Taylor expansion in dt), we should compare the above methods with respect to their cost and accuracy. As has been mentioned already, the cost of a Runge-Kutta step is four function evaluations per timestep, in contrast to a single function evaluation for Euler and two function evaluations for Heun. Let us compare the accuracy of the three methods, once for the absolute accuracy $|y_{computed} - y_{exact}|$, and once for the relative accuracy $|y_{computed} - y_{exact}|/|y_{exact}|$. For the Euler method, we obtain an exponentially decaying absolute error due to the fact that the solution decays exponentially, while the relative error increases exponentially. The absolute error starts at the order of $10^{-2}$, which is the square of the timestep, $(dt)^2 = (0.1)^2$, as was expected.
[Figure: Euler, dt=0.1: absolute error (left) and relative error (right).]
For Heun's method, the absolute error starts at the order of $10^{-3}$, which is the third power of the timestep, $(dt)^3 = (0.1)^3$, as was also expected. The relative error is constant for a certain time, and then increases exponentially.
[Figure: Heun, dt=0.1: absolute error (left) and relative error (right).]
For the classical method by Runge and Kutta, the absolute error starts at the order of $10^{-6}$, which is by sheer luck one order more accurate than the fifth power of the timestep, $(dt)^5 = (0.1)^5$, which was expected for the absolute error. Again, as for Heun's method, the relative error is constant for a certain time, and then diverges exponentially.
[Figure: Classical Runge-Kutta, dt=0.1: absolute error (left) and relative error (right).]
The last investigations have reviewed some old concepts and shown some important new concepts for error analysis:
1. The orders of the Euler, Heun and Runge-Kutta methods are 1, 2 and 4 respectively; therefore the absolute error at the beginning of the integration process is of the order of the timestep to the power of the order plus one ($dt^2$, $dt^3$ and $dt^5$) for an initial amplitude of the order of 1.
2. The local error is the error for a single timestep, and the local absolute error at the beginning of the integration is the same as the local relative error.
3. The behavior of the relative error is a bit more complicated: as can be seen, the relative error increases during the integration process, but not monotonically. The error at the end of the integration process is called the global error, and it can be seen that the global relative error is much larger than the local absolute error. Whenever one performs a time integration of ordinary differential equations, one should know which error is actually permitted, and this is determined by the physical problem.
6.5
[Figure: ode23 solution of the damped oscillator (top) and the timestep chosen by the integrator (bottom).]
Above the solution was plotted, below we see the timestep. The time-adaption algorithm
changed the timestep depending on whether the oscillation was at a relative minimum or in
a straight motion. The accuracy of the time integration can be set by the input parameters
of the ode23 function, see help ode23. The accuracy diagram for the default accuracy is the
following:
[Figure: ode23: absolute error (left) and relative error (right).]
The same plots can be made for the ode45 algorithm, which gives the following accuracy
diagram
[Figure: ode45: absolute and relative error, and the solution with the timestep chosen by the integrator.]
and it can be seen that MATLAB starts with a very small timestep and then increases the timestep significantly to reach the default accuracy of the time integrator. The advantages of these adaptive methods for reasonable ordinary differential equations are:
1. One can specify the relative and absolute errors on input, and obtain a solution which is guaranteed to be inside the specified errors.
2. The performance is optimal, i.e. for the given method there will be no solution which is computable with fewer timesteps/less computer time.
3. Without knowing anything about the system, or about the relation between the timestep and the error resulting from the timestep for the given set of equations, one obtains a correct solution.
Therefore, it is always a good idea to start an investigation of a problem with the above method. But there are some caveats for the case of unreasonable differential equations; such systems are often encountered in daily life, and they are treated in the next subsection.
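The effect of the tolerance settings can also be seen outside MATLAB. As a sketch, SciPy's solve_ivp offers an analogous rtol/atol interface, and the returned time points show directly how many steps the adaptive solver decided to take (the damped-oscillator right-hand side below is chosen freely for illustration):

```python
from scipy.integrate import solve_ivp

import numpy as np

def lin_osc(t, y, D=0.1):
    # damped linear oscillator x'' + 2 D x' + x = 0 as first-order system
    return [y[1], -2 * D * y[1] - y[0]]

# loose vs tight tolerances: the solver chooses the timesteps itself
loose = solve_ivp(lin_osc, [0, 30], [1.0, 0.0], rtol=1e-3, atol=1e-6)
tight = solve_ivp(lin_osc, [0, 30], [1.0, 0.0], rtol=1e-8, atol=1e-10)

print("steps (rtol=1e-3):", len(loose.t))
print("steps (rtol=1e-8):", len(tight.t))
print("largest accepted timestep:", np.diff(loose.t).max())
```

Tightening the tolerances forces the solver to use many more (and smaller) timesteps, exactly as in the ode23/ode45 diagrams above.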
6.5.1
Adaptive stepsize control needs some assumptions about the smoothness of the treated differential equations. There are some notorious physical situations which lead to non-smooth problems:
Coulomb friction: If an ordinary differential equation contains terms which contain the sign of a function, as in the case of Coulomb friction,
F_Coul = -mu f_n sign(v),
it may happen that the solution of the equation is not smooth enough, so that even a reduction of the time step does not lead to the same solution for the different orders of the function evaluation of the ODE solver. In that case, the solver stops, or it continues only with very small time steps, so that the solution is not finished within finite time.
Bouncing balls: If an object flies in free motion in a gravitational field, its trajectory is parabolic. If it hits a target, the motion is suddenly reversed. For the numerical
time integration, the free motion allows a very large timestep, whereas in the moment
where the target is hit, the timestep has to be drastically reduced. It is possible that
numerical solvers with adaptive stepsize control are not able to reduce the step-size
appropriately, and in the simulation the impacting particle may not be reflected, but
may fly through the target. The risk for such a mishap is higher for higher order
solvers, e.g. for 8th order.
Because the adaptive stepsize control needs some information about how the timestep must be reduced, MATLAB allows one to specify the way in which the timestep should be changed via the options argument (created with odeset).
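One common remedy for the bouncing-ball situation, sketched here in Python (not part of the lecture code; names, the drop height and the restitution coefficient are chosen for illustration), is to let the solver locate the impact with an event function, stop there, and restart with the reflected velocity instead of hoping the stepsize control catches the impact:

```python
from scipy.integrate import solve_ivp

def free_fall(t, y):
    # y = (height, velocity), constant gravitational acceleration g = 9.81
    return [y[1], -9.81]

def hit_ground(t, y):
    return y[0]            # zero-crossing marks the impact
hit_ground.terminal = True   # stop the integration at the impact
hit_ground.direction = -1    # only trigger while falling

# integrate piecewise: stop at each impact, reflect the velocity
t0, y0, bounces = 0.0, [1.0, 0.0], 0
while t0 < 3.0:
    sol = solve_ivp(free_fall, [t0, 3.0], y0, events=hit_ground, max_step=0.1)
    if sol.t_events[0].size == 0:
        break                                     # no further impact in the interval
    t0 = sol.t_events[0][0]
    y0 = [0.0, -0.8 * sol.y_events[0][0][1]]      # restitution coefficient 0.8
    bounces += 1
print("bounces:", bounces)
```

Because each impact terminates the integration exactly at the target, the particle can never "fly through" the floor, no matter how large the timestep was during the free flight.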
6.5.2
Coulomb Friction
In contrast to the friction of and in fluids, which for small velocities v is proportional to the velocity, Coulomb friction, the friction of solid on solid surfaces, is proportional to the sign of the velocity only. Using the Coulomb friction coefficient mu and the normal force f_n, we can write the Coulomb friction as
F_Coul = -mu f_n v/|v|.
Obviously, this force law has a jump at v = 0, and we know from physics that the friction F_Coul can take any value from -mu f_n to +mu f_n for v = 0. Actually, there is a method to solve such an undetermined problem in a numerically exact way, but we will just try to use the adaptive stepsize control in the hope that we get a reasonable solution by decreasing the stepsize. Using the ordinary differential equation
y'' = -y - 2 D sign(y'),
let us check the output for different values of D using the program below, and let us look at the timestep:
[Figure: solution of d^2/dt^2 x + 2 D sign(dx/dt) + x = 0 for D=0.05 and D=0.1 (left) and the timestep dt on a logarithmic scale (right)]
clear
format compact
global D
tmax=7
D=0.05
[t1,y1]=ode23(@lin_coul_osc,[0 tmax],[1 0]);
t1_plot=linspace(0,max(t1),2*length(t1));
y1_plot=interp1(t1,y1,t1_plot,'spline');
D=0.1
[t2,y2]=ode23(@lin_coul_osc,[0 tmax],[1 0]);
t2_plot=linspace(0,max(t2),2*length(t2));
y2_plot=interp1(t2,y2,t2_plot,'spline');
t2_plot=[t2_plot tmax];
y2_plot=[y2_plot ; [0,1]];
subplot(2,2,1)
plot(t1_plot,y1_plot(:,1),...
     t2_plot,y2_plot(:,1),':')
axis tight
legend('D=0.05','D=0.1')
title('d^2/dt^2 x + 2 D sign(dx/dt) + x = 0')
subplot(2,2,2)
semilogy(t1(2:end),diff(t1),t2(2:end),diff(t2),':')
ylabel('dt')
xlabel('timestep')
legend('D=0.05','D=0.1')
axis([0 7 1e-5 1])
return
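The right-hand side function lin_coul_osc is not reproduced here. As a sketch of the same experiment in Python (names and tolerances chosen for illustration), the sign term can be integrated with solve_ivp, and the characteristic linear (rather than exponential) decay of the amplitude is visible: each half period the amplitude shrinks by 4D, so for D = 0.05 it drops from 1 to 0.6 after one full period:

```python
import numpy as np
from scipy.integrate import solve_ivp

D = 0.05  # Coulomb friction strength

def lin_coul_osc(t, y):
    # x'' + 2 D sign(x') + x = 0, written as a first-order system
    return [y[1], -2 * D * np.sign(y[1]) - y[0]]

sol = solve_ivp(lin_coul_osc, [0, 7], [1.0, 0.0],
                rtol=1e-8, atol=1e-10, dense_output=True)

t = np.linspace(0, 7, 7001)
x = sol.sol(t)[0]
amp0 = np.abs(x).max()                # initial amplitude, 1
amp_after = np.abs(x[t > 5.5]).max()  # turning point near t = 2*pi: 1 - 2*(4*D)
print("initial amplitude:", amp0)
print("amplitude after one period:", amp_after)
```

The tight tolerances are needed because the solver has to chatter through the sign discontinuity at every turning point, which is exactly where the timestep plot above collapses.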
6.6
y2' = mu (1 - y1^2) y2 - y1
clear
format compact
global mu, mu=500
t_max=1.5
D=0.2
tic
[t1,y1]=ode23(@vanderpol,[0 t_max],[2 0]);
t1_plot=linspace(0,max(t1),2*length(t1));
y1_plot=interp1(t1,y1,t1_plot,'spline');
toc
For large mu the equation becomes stiff, and it may happen that the solver reduces the timestep to numerically zero and the solution process terminates with an error message.
[Figure: van der Pol ODE with ode23 (top) and with the stiff solver ode23s (bottom): solution y1 near 2, and the timestep index]
6.7
6.7.1
Störmer-Verlet Method
In the previous examples, we always included some damping in the system and rewrote the ordinary differential equation as a system of coupled first-order differential equations. For the most widely used symplectic (energy-conserving) time integrator, it is not necessary to rewrite the differential equation from second to first order; on the contrary, this method is not able to handle velocity-dependent (first-order) terms at all. Using the acceleration a (= force/mass), we can write the Verlet method for the coordinate x as
x_{i+1} = 2 x_i - x_{i-1} + a_i dt^2.
F1 = feval(f,yout(2,1));
y_mdt = yout(2,1) - y(1)*dt;               % backward step x(-dt) = x0 - v0*dt, first order
yout(2,2) = 2*yout(2,1) - y_mdt + F1*dt2;  % first Verlet step
tout(2) = dt;
for k=2:nsteps-1
  F1 = feval(f,yout(2,k));
  t_full = tout(k) + dt;
  yout(2,k+1) = 2*yout(2,k) - yout(2,k-1) + F1*dt2;
  tout(k+1) = t_full;
end
return;
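Since only a fragment of the verlet driver is shown here, a compact Python sketch of the same scheme may help (names chosen freely, not the course code). For the undamped harmonic oscillator with dt = 0.01 integrated to t = 300, as in the MATLAB run below, the amplitude stays exactly bounded and the absolute error remains small:

```python
import numpy as np

def verlet(a, x0, v0, dt, nsteps):
    """Stoermer-Verlet recursion x_{i+1} = 2 x_i - x_{i-1} + a(x_i) dt^2."""
    x = np.empty(nsteps)
    x[0] = x0
    # start the recursion with a Taylor step for x(dt)
    x[1] = x0 + v0 * dt + 0.5 * a(x0) * dt ** 2
    for i in range(1, nsteps - 1):
        x[i + 1] = 2 * x[i] - x[i - 1] + a(x[i]) * dt ** 2
    return x

omega0 = 1.0
dt, nsteps = 0.01, 30001          # integrate to t = 300
x = verlet(lambda x: -omega0 ** 2 * x, 1.0, 0.0, dt, nsteps)

t = np.arange(nsteps) * dt
err = np.abs(x - np.cos(omega0 * t))   # error against the exact solution
print("max error over [0, 300]:", err.max())
print("max amplitude:", np.abs(x).max())
```

For the linear oscillator the Verlet recursion reproduces a cosine at a slightly shifted frequency with exactly constant amplitude; the error is therefore a slowly growing phase error, not an amplitude drift.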
6.7.2
Precision
We compare the numerical solution for the harmonic oscillator without damping
function out=verlet_lin_osc(in)
% verlet-lin-osc
% linear oscillator with frequency omega
% for use with verlet-type integrator
global omega0
out=-omega0^2*in;
return
using the main program
clear;
format compact
global D, D=0
global omega0, omega0=1 % Damping and Force constant
omega_d=omega0;
dt=0.01, t0=0, t_max=300 % time-step, start-time, end time
x0=1 % Initial conditions
v0=-D*exp(-D*t0)*cos(omega_d*t0)-omega_d*exp(-D*t0)*sin(omega_d*t0);
tic
[t,y]=verlet([v0 x0],t0,t_max,dt,@verlet_lin_osc);
y_ex=(x0*exp(-D*t).*cos(omega_d*t)); % exact solution
[rkt2,rky2]=ode23(@lin_osc,[t0 t_max],[v0 x0]);
y_rk2=(x0*exp(-D*rkt2).*cos(omega_d*rkt2)); % exact solution
[Figure: absolute error of the Verlet integration, error for ode23, and error for ode45, over the integration interval 0 <= t <= 300]
It can be seen that the Verlet algorithm has a larger error for the initial timesteps, due to our choice of the earliest timestep in first order. Nevertheless, the error bound is constant over the whole integration interval. The remarkable property of the Verlet method is that its error stays bounded even for very long integration times.
6.7.3
Velocities
The Verlet method only makes use of the coordinates, not of the velocities. Because the velocities don't occur in the equations, they can only be estimated using the relation
v_i = (r_{i+1} - r_{i-1}) / (2 dt),
so that the velocities of a timestep are only known after the completion of the following timestep. Therefore, it is not possible to incorporate velocity-dependent interactions in the Verlet scheme.
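Although the central-difference estimate arrives one step late, it is second-order accurate, which can be checked with a short Python sketch (illustrative names, the exact trajectory cos(t) is used as input):

```python
import numpy as np

def velocity_estimate(x, dt):
    # v_i = (x_{i+1} - x_{i-1}) / (2 dt), available only one step late
    return (x[2:] - x[:-2]) / (2 * dt)

def max_velocity_error(dt):
    t = np.arange(0, 10, dt)
    x = np.cos(t)                 # positions of the undamped oscillator
    v_exact = -np.sin(t[1:-1])    # exact velocities at the interior points
    return np.abs(velocity_estimate(x, dt) - v_exact).max()

# halving dt reduces the error by about a factor of 4 (second order)
print(max_velocity_error(0.1) / max_velocity_error(0.05))
```

The factor of 4 under halving of dt shows that the velocity estimate has the same order as the Verlet scheme itself, so nothing is lost by computing it afterwards.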
Often, it is not clear how large a timestep should be chosen for a given dissipative problem. Some people advocate the following procedure: run the problem without dissipation, fix the timestep so that the change in energy during the simulation is negligible, and then use this timestep for the dissipative system. Our exploration of the symplectic integrator shows that such a strategy is meaningless. The non-dissipative systems are a totally different class than the dissipative systems; even the best non-symplectic integrators cannot compete with quite mediocre symplectic integrators. Conversely, symplectic integrators cannot be used with dissipation: in the above Verlet-Störmer integration there is no possibility to implement velocity-dependent forces, because at the time the forces must be computed, the velocity is not yet known. The same is true for modifications like the velocity-Verlet scheme, where one knows the velocity half a timestep too late; using the velocity from the previous timestep introduces errors which are of the order of the error of the Verlet scheme itself.