
PowerPoint to accompany

Introduction to MATLAB
for Engineers, Third Edition

William J. Palm III

Chapter 7
Statistics, Probability,
and Interpolation
(Module 9)

Copyright 2010. The McGraw-Hill Companies, Inc. This work is only for nonprofit use by instructors in courses for which this textbook has been adopted.
Any other use without publisher's consent is unlawful.

Breaking Strength of Thread

Thread breaking strength (in N) for 20 tests:

92, 94, 93, 96, 93, 94, 95, 96, 91, 93, 95, 95, 95,
92, 93, 94, 91, 94, 92, 93

The six possible outcomes are:

91, 92, 93, 94, 95, 96.
We can use the histogram function to plot the
histogram (next figure).


(page 297)

% Thread breaking strength data for 20 tests.
y = [92,94,93,96,93,94,95,96,91,93,...
     95,95,95,92,93,94,91,94,92,93];
% The six possible outcomes are 91,92,93,94,95,96.
x = 91:96;
histogram(y)
ylabel('Absolute Frequency')
title(['Absolute Frequency Histogram',...
       ' for 20 Tests'])
This creates the next figure.

Absolute frequency histogram for 100 thread tests.

Figure 7.12. This was created by the program on page 298.

How do you think this compares to 20 tests?

Use of the bar function for relative frequency histograms (page 299)

Absolute frequency histograms show the number of samples in each bin.
Relative frequency histograms show the fraction of samples in each bin.
% Relative frequency histogram.
tests = 100;
y = [13,15,22,19,17,14]/tests;
x = 91:96;
bar(x,y),ylabel('Relative Frequency'),...
title(['Relative Frequency Histogram',...
       ' for 100 Tests'])
This creates the next figure.


Figure 7.13

Histogram functions (Table 7.11)

The hist function should be replaced by the histogram function.

Command                   Description

bar(x,y)                  ... centres.

histogram(y)

histogram(y,n)            Aggregates the data in the vector y into n bins evenly
                          spaced between the minimum and maximum values in y.

histogram(y,x)            Aggregates the data in the vector y into bins whose edges
                          are specified by the vector x. The bin widths are the
                          distances between the edges. This behaviour is different
                          from hist and bar.

[z,edges] = histcounts(y) Returns two vectors z and edges that contain the frequency
                          count and the bin edges. This uses the same algorithm as
                          histogram, but without producing the plot.
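
The counts behind the thread histogram can be obtained without plotting. The following is a minimal sketch using histcounts on the 20-test data; the bin edges (placed halfway between the integer outcomes) and the variable name rel are illustrative choices, not taken from the slides.

```matlab
% Thread breaking strength data (N) for the 20 tests.
y = [92,94,93,96,93,94,95,96,91,93,...
     95,95,95,92,93,94,91,94,92,93];

% Assumed bin edges: halfway between the integer outcomes 91..96,
% so that each bin contains exactly one possible outcome.
edges = 90.5:1:96.5;

% Absolute frequency in each bin, computed without producing a plot.
[z, edges] = histcounts(y, edges);

% Relative frequency: fraction of the tests that fall in each bin.
rel = z / numel(y);

% Rows: outcome, absolute frequency, relative frequency.
disp([91:96; z; rel])
```

Specifying the edges explicitly keeps the bins aligned with the six outcomes rather than leaving the choice to the automatic binning algorithm.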

Scaled Frequency Histogram (pages 301-303)

Scaled frequency histograms show the likelihood of a number
occurring: they approximate the probability density function (pdf).
The total area under a scaled frequency histogram is 1, so you can
directly compare histograms with different bin widths.
% Absolute frequency data.
y_abs=[1,0,0,0,2,4,5,4,8,11,12,10,9,8,7,5,4,4,3,1,1,0,1];
binwidth = 0.5;
% Compute scaled frequency data.
area = binwidth*sum(y_abs);
y_scaled = y_abs/area;
% Define the bins.
bins = 64:binwidth:75;
% Plot the scaled histogram.
bar(bins,y_scaled),...
ylabel('Scaled Frequency'),xlabel('Height (in.)')
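
To see why scaling makes histograms with different bin widths comparable, the sketch below rescales the same height data twice: once with the 0.5 in. bins used above and once after merging neighbouring bins into 1 in. bins. The merging step and the names y_abs2, y_scaled2 and binwidth2 are assumptions added for illustration.

```matlab
% Absolute frequency data in 0.5 in. bins (same data as above).
y_abs = [1,0,0,0,2,4,5,4,8,11,12,10,9,8,7,5,4,4,3,1,1,0,1];
binwidth = 0.5;
y_scaled = y_abs/(binwidth*sum(y_abs));

% Assumed coarser binning: pad to an even length, then add
% neighbouring bins in pairs to obtain 1 in. bins.
binwidth2 = 1.0;
y_abs2 = sum(reshape([y_abs, 0], 2, []));
y_scaled2 = y_abs2/(binwidth2*sum(y_abs2));

% Both scaled histograms enclose unit area.
area1 = binwidth*sum(y_scaled);
area2 = binwidth2*sum(y_scaled2);
fprintf('Areas: %.4f and %.4f\n', area1, area2)

% The peak heights are similar even though the bin widths differ.
fprintf('Peaks: %.2f and %.2f\n', max(y_scaled), max(y_scaled2))
```

The absolute counts in the coarse bins are roughly twice as large as in the fine bins, but after scaling the two histograms have similar heights.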

Scaled histogram of height data for very many measurements.
Once we have enough samples, we can specify a model for the
population's distribution (the line).


The basic shape of the normal distribution curve.

Figure 7.22, page 304


The effect on the normal distribution curve of increasing $\sigma$.

For this case $\mu = 10$, and the three curves correspond to
$\sigma = 1$, $\sigma = 2$, and $\sigma = 3$.
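
Curves of this kind can be reproduced from the standard normal pdf, $p(x) = \frac{1}{\sigma\sqrt{2\pi}} e^{-(x-\mu)^2/(2\sigma^2)}$. The following is a minimal sketch; the plotting range 0 to 20 and the number of points are arbitrary choices, not taken from the figure.

```matlab
% Normal pdf for a mean of 10 and standard deviations of 1, 2 and 3.
mu = 10;
sigmas = [1 2 3];
x = linspace(0, 20, 401);   % assumed plotting range

figure
hold on
for s = sigmas
    % Standard formula for the normal probability density function.
    p = exp(-(x - mu).^2/(2*s^2))/(s*sqrt(2*pi));
    plot(x, p)
end
hold off
xlabel('x'), ylabel('p(x)')
legend('\sigma = 1', '\sigma = 2', '\sigma = 3')

% Numerical check on the last curve: the area under a pdf is close to 1
% (slightly less here because the range 0..20 cuts off the tails).
area = trapz(x, p);
```

Increasing the standard deviation lowers and widens the curve while the enclosed area stays at 1.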


The probability density function (pdf) is estimated by the scaled
frequency: the absolute frequency divided by $N\,\Delta x$, where $N$ is the
number of samples and $\Delta x$ is the bin width.

Note that $N\,\Delta x$ = area under an absolute frequency plot.

The probability that the random variable x is no less
than a and no greater than b is written as $P(a \le x \le b)$.
It is the area under the pdf between a and b.

For a normal distribution, it can be computed as follows:

$$P(a \le x \le b) = \frac{1}{2}\left[\operatorname{erf}\left(\frac{b-\mu}{\sigma\sqrt{2}}\right) - \operatorname{erf}\left(\frac{a-\mu}{\sigma\sqrt{2}}\right)\right] \qquad (7.23)$$

See pages 305-306.

Sample Statistics vs Population Statistics

A population is every single instance of something, e.g.:

all the people in Australia
every bolt manufactured in a factory

Measuring the properties of the entire population is expensive and
often impractical.
A realisation is a single event or instance (one person or bolt).
A sample is a subset of a population (a collection of realisations),
which is affordable and practical to measure. If it is a
representative sample, the sample statistics will provide useful
information about the population statistics: you can infer the
population statistics and make decisions based on that information.
To be representative, the sample must be:

sufficiently large
randomly selected from the population

Representative: 20% of the people in Australia
Unrepresentative: 0.01% of the people in Australia; all of Toowoomba to
represent all of Australia

Measuring how Good Statistics Are

A confidence interval is a measure of the accuracy of a
sample statistic.
It provides a range of values.
You expect the population statistic to lie within that range.
The standard definition is a 95% confidence interval, which
means that there is a 95% chance that the population statistic
lies within the range.
The width of the range increases as the probability of the
population statistic lying within the range increases.
The width of the range decreases as the sample size
increases.
Look at functions such as normfit to see how to obtain
these values in MATLAB.
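
As an illustration of a 95% confidence interval for a mean, the sketch below uses the 20 thread tests from the start of the chapter and base MATLAB only. It assumes the data are approximately normally distributed and takes the critical value 2.093 from a standard t table (19 degrees of freedom); normfit, from the Statistics and Machine Learning Toolbox, is expected to give a similar interval but is not used here.

```matlab
% Thread breaking strength data (N) for the 20 tests.
y = [92,94,93,96,93,94,95,96,91,93,...
     95,95,95,92,93,94,91,94,92,93];

n = numel(y);
m = mean(y);          % sample mean
s = std(y);           % sample standard deviation (divides by n-1)
sem = s/sqrt(n);      % standard error of the mean

% Assumed critical value: 97.5th percentile of the t distribution
% with n-1 = 19 degrees of freedom, taken from a standard table.
tcrit = 2.093;

% 95% confidence interval for the population mean.
ci = [m - tcrit*sem, m + tcrit*sem];
fprintf('mean = %.2f N, 95%% CI = [%.2f, %.2f] N\n', m, ci(1), ci(2))
```

Because the standard error is s divided by the square root of n, quadrupling the number of tests would roughly halve the width of the interval.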

Measuring Confidence Intervals

Common sample statistics that are measured are the mean
and standard deviation.
Because each sample is different, the sample statistic varies.
Therefore the sample statistic has a distribution (ergo a pdf).
Irrespective of the population distribution, it is expected that
the sample statistic has a normal distribution.
For instance, the sample pdf might be well-represented by a
binomial distribution (the population pdf has a binomial
distribution).

However, the sample mean is expected to have a normal distribution.
The width of this distribution (the standard deviation of the sample
mean) decreases as the number of samples increases (the
Central Limit Theorem).
As your number of samples increases, your accuracy improves.
A consequence of this is that 5 samples each containing 10
realisations could be more accurate than 1 sample containing 100
realisations.
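
The behaviour of the sample mean can be explored by simulation. The following is a minimal sketch, assuming a uniform population on [0, 1] and arbitrary choices of 25 realisations per sample and 5000 samples; none of these numbers come from the slides.

```matlab
rng(0)                      % fixed seed so the run is repeatable

n = 25;                     % realisations per sample (assumed)
nSamples = 5000;            % number of samples (assumed)

% Each column is one sample drawn from a uniform population on [0,1].
data = rand(n, nSamples);

% Sample mean of every column.
means = mean(data);

% A uniform population on [0,1] has standard deviation 1/sqrt(12).
sigma = 1/sqrt(12);
predicted = sigma/sqrt(n);  % expected standard deviation of the sample mean
observed = std(means);
fprintf('predicted %.4f, observed %.4f\n', predicted, observed)

% The histogram of sample means is bell shaped even though the
% population itself is uniform.
histogram(means)
xlabel('Sample mean'), ylabel('Absolute Frequency')
```

Changing n and rerunning shows the spread of the sample means shrinking in proportion to one over the square root of n.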

Sums and Differences of Random Variables (page 307)

It can be proved that the mean of the sum (or difference) of
two independent normally distributed random variables
equals the sum (or difference) of their means, but the
variance is always the sum of the two variances. That is, if x
and y are normally distributed with means $\mu_x$ and $\mu_y$, and
variances $\sigma_x^2$ and $\sigma_y^2$, and if $u = x + y$ and $v = x - y$, then

$$\mu_u = \mu_x + \mu_y \qquad (7.24)$$

$$\mu_v = \mu_x - \mu_y \qquad (7.25)$$

$$\sigma_u^2 = \sigma_v^2 = \sigma_x^2 + \sigma_y^2 \qquad (7.26)$$

Random number functions (Table 7.31)

Command          Description

rand             Generates a single uniformly distributed random number
                 between 0 and 1.

rand(n)          Generates an n × n matrix containing uniformly
                 distributed random numbers between 0 and 1.

rand(m,n)        Generates an m × n matrix containing uniformly
                 distributed random numbers between 0 and 1.

randn            Generates a single normally distributed random number
                 having a mean of 0 and a standard deviation of 1.

randn(n)         Generates an n × n matrix containing normally
                 distributed random numbers having a mean of 0 and a
                 standard deviation of 1.

randn(m,n)       Generates an m × n matrix containing normally
                 distributed random numbers having a mean of 0 and a
                 standard deviation of 1.

rng(s)           Sets the state (seed) of all random number generators
                 to s.

rng(0)           ... state.

rng('shuffle')   Seeds all random number generators to a different state
                 (determined by the current time) each time it is
                 executed.

randperm(n)      Generates a random permutation of the integers from 1
                 to n.
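
Equations (7.24) to (7.26) can be checked numerically with randn from the table above. The following is a minimal sketch; the means and standard deviations (2 and 1.5 for x, -1 and 2 for y) and the number of draws are illustrative assumptions, not values from the slides.

```matlab
rng(0)                       % fixed seed so the run is repeatable
N = 1e5;                     % number of draws (assumed)

% Two independent normal variables, built by scaling and shifting
% standard normal numbers from randn.
mux = 2;   sigx = 1.5;       % assumed mean and standard deviation of x
muy = -1;  sigy = 2;         % assumed mean and standard deviation of y
x = mux + sigx*randn(1, N);
y = muy + sigy*randn(1, N);

u = x + y;
v = x - y;

% The means add or subtract; the variances add in both cases.
fprintf('mean(u) = %.3f (expected %.3f)\n', mean(u), mux + muy)
fprintf('mean(v) = %.3f (expected %.3f)\n', mean(v), mux - muy)
fprintf('var(u)  = %.3f, var(v) = %.3f (expected %.3f)\n', ...
        var(u), var(v), sigx^2 + sigy^2)
```

The estimates agree with the formulas only approximately, because they are computed from a finite number of draws.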

The following session shows how to obtain the same
sequence every time rand is called.

>>rng(0)
>>rand
ans =
    0.8147
>>rand
ans =
    0.9058
>>rng(0)
>>rand
ans =
    0.8147
>>rand
ans =
    0.9058

You need not start with the initial state in order to
generate the same sequence. To show this, continue
the above session as follows.

>>s = rng;
>>rng(s)
>>rand
ans =
    0.1270
>>rng(s)
>>rand
ans =
    0.1270

The general formula for generating a uniformly distributed
random number y in the interval [a, b] is

$$y = (b - a)x + a \qquad (7.31)$$

where x is a random number uniformly distributed in the
interval [0, 1]. For example, to generate a vector y
containing 1000 uniformly distributed random numbers in
the interval [2, 10], you type y = 8*rand(1,1000) + 2.
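
The scaling in Equation (7.31) can be written with named endpoints so that the interval is easy to change. The following is a minimal sketch; the variable names a and b and the fixed seed are illustrative choices.

```matlab
rng(0)                         % fixed seed so the run is repeatable

a = 2;                         % lower end of the interval
b = 10;                        % upper end of the interval

% Scale uniform numbers on [0,1] to the interval [a,b].
y = (b - a)*rand(1, 1000) + a;

% Every value lies inside the interval, and the mean is close to
% the midpoint (a + b)/2.
fprintf('min = %.3f, max = %.3f, mean = %.3f\n', min(y), max(y), mean(y))
```

With a = 2 and b = 10 this is the same as typing y = 8*rand(1,1000) + 2.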


If x is a random number with a mean of 0 and a
standard deviation of 1, use the following equation to
generate a new random number y having a standard
deviation of $\sigma$ and a mean of $\mu$.

$$y = \sigma x + \mu \qquad (7.32)$$

For example, to generate a vector y containing 2000
random numbers normally distributed with a mean of 5
and a standard deviation of 3, you type

y = 3*randn(1,2000) + 5.

If y and x are linearly related as

$$y = bx + c \qquad (7.33)$$

and if x is normally distributed with a mean $\mu_x$ and
standard deviation $\sigma_x$, it can be shown that the mean
and standard deviation of y are given by

$$\mu_y = b\mu_x + c \qquad (7.34)$$

$$\sigma_y = |b|\sigma_x \qquad (7.35)$$
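
The example above can be checked numerically, and it also gives a way to test the erf formula for $P(a \le x \le b)$ in Equation (7.23). The following is a minimal sketch; the fixed seed and the choice of the interval from one standard deviation below the mean to one above it are assumptions made for illustration.

```matlab
rng(0)                          % fixed seed so the run is repeatable

mu = 5;                         % target mean
sigma = 3;                      % target standard deviation

% Scale and shift standard normal numbers, as in Equation (7.32).
y = sigma*randn(1, 2000) + mu;
fprintf('mean = %.3f, std = %.3f\n', mean(y), std(y))

% Assumed interval: one standard deviation either side of the mean.
a = mu - sigma;
b = mu + sigma;

% Fraction of the generated numbers that fall inside [a,b].
frac = mean(y >= a & y <= b);

% Probability predicted by Equation (7.23) for a normal distribution.
P = 0.5*(erf((b - mu)/(sigma*sqrt(2))) - erf((a - mu)/(sigma*sqrt(2))));
fprintf('observed fraction = %.3f, predicted = %.3f\n', frac, P)
```

The generated vector is also an instance of Equations (7.34) and (7.35) with b = 3, c = 5, a mean of 0 and a standard deviation of 1 for x.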
