BIOL 350
LECTURER: DR. GORDON LIGHTBOURN
$p^{1}q^{k-1},\ p^{2}q^{k-2},\ p^{3}q^{k-3}, \ldots$
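$\frac{1}{e^{\mu}},\ \frac{\mu}{1!\,e^{\mu}},\ \frac{\mu^{2}}{2!\,e^{\mu}},\ \frac{\mu^{3}}{3!\,e^{\mu}},\ \ldots,\ \frac{\mu^{r}}{r!\,e^{\mu}},\ \ldots$   (5.11)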
These are the relative expected frequencies corresponding to
the following counts of the rare event Y:
0, 1, 2, 3, 4, ..., r, ...
The first term represents the relative expected frequency of
samples containing no rare events (0).
The second term, one rare event.
The third term, two rare events.
The fourth term, three rare events.
And so forth.
Explanation of the terms: e, the base of the natural logarithm, is
approximately 2.71828, and $\mu$ is the parametric mean of the distribution.
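As a minimal sketch of the series above, the terms $\mu^{r}/(r!\,e^{\mu})$ can be evaluated directly. The function name `poisson_relative_frequencies` is hypothetical, and the value $\mu$ = 1.8 is only an illustrative choice (it is the yeast-cell mean used in Example 5.5).

```python
import math

def poisson_relative_frequencies(mu, max_count):
    """Return the terms mu**r / (r! * e**mu) for r = 0, 1, ..., max_count."""
    return [mu**r / (math.factorial(r) * math.exp(mu))
            for r in range(max_count + 1)]

# mu = 1.8 is an illustrative value, not part of expression 5.11 itself
terms = poisson_relative_frequencies(1.8, 4)
for r, term in enumerate(terms):
    print(r, round(term, 4))
```

The first term printed is the relative expected frequency of samples with no rare events, the second of samples with one, and so on. Summed over all r, the terms add up to 1.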
(1)               (2)            (3)                 (4)
Number of cells   Observed       Absolute expected   Deviation from expectation
per square, Y     frequencies    frequencies         (sign of observed - expected)
_________________________________________________________________________
0                  75             66.1               +
1                 103            119.0               -
2                 121            107.1               +
3                  54             64.3               -
4                  30             28.9               +
5                  13             10.4               +
6                   2              3.1               -
7                   1              0.8               +
8                   0              0.2               -
9+                  1              0.0               +
(5 to 9+ pooled)   17             14.5               +
Total             400            399.9
__________________________________________________________________________
EXAMPLE 5.5
Distribution of yeast cells in 400 squares of a
haemocytometer.
Column (1) lists the number of yeast cells observed in
each haemocytometer square.
Column (2) gives the observed frequency, that is, the
number of squares containing a given number of
yeast cells.
Note 75 squares contain no (0) yeast cells.
Most squares held either one or two cells.
Only 17 squares contained 5 or more yeast cells.
EXAMPLE 5.5
Why would we expect this frequency distribution to
be distributed in Poisson fashion?
We have a relatively rare event.
On average there are 1.8 cells per square.
Relative to the amount of space, the number found is
very low.
We expect the occurrence of individual yeast cells in
a square is independent of the occurrence of other
yeast cells.
EXAMPLE 5.5
The mean of the rare events is the only quantity we
need to know to calculate the relative expected
frequencies (of a Poisson distribution).
We do not know the parametric mean of the yeast
cells.
We employ an estimate (the sample mean) and
calculate the expected frequencies with $\mu$ set equal to
the sample mean $\bar{Y}$ of Table 5.5.
It is convenient to rewrite expression 5.11 as:
$\hat{f}_i = \hat{f}_{i-1}\left(\frac{\bar{Y}}{i}\right)$ for $i = 1, 2, \ldots$, where $\hat{f}_0 = e^{-\bar{Y}}$   (5.12)
EXAMPLE 5.5
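Note that the parametric mean $\mu$ has been replaced by
the sample mean $\bar{Y}$.
Expression 5.12 yields relative expected frequencies.
For absolute expected frequencies, start the series from
$\hat{f}_0 = n / e^{\bar{Y}}$, where n is the number of samples (here n = 400 squares).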
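The recursion in expression 5.12 can be sketched in a few lines. This is only an illustration: the function name `poisson_expected_frequencies` is hypothetical, and the inputs ($\bar{Y}$ = 1.8, n = 400) are the yeast-cell values from Example 5.5.

```python
import math

def poisson_expected_frequencies(sample_mean, n, max_count):
    """Absolute expected frequencies: f_0 = n / e**Ybar, then f_i = f_(i-1) * Ybar / i."""
    freqs = [n / math.exp(sample_mean)]
    for i in range(1, max_count + 1):
        freqs.append(freqs[-1] * sample_mean / i)
    return freqs

# Yeast-cell example: sample mean 1.8 cells per square, 400 squares
expected = poisson_expected_frequencies(1.8, 400, 8)
print([round(f, 1) for f in expected])
# [66.1, 119.0, 107.1, 64.3, 28.9, 10.4, 3.1, 0.8, 0.2]
```

These values agree with column (3) of Table 5.5 for the classes 0 through 8. Each term needs only one multiplication and one division, which is why the recursive form is convenient for hand calculation.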
EXAMPLE 5.5
The biological interpretation: the yeast cells seem to be
randomly dispersed in the counting chamber, indicating
thorough mixing of the suspension.
Note that in Table 5.5 the low frequencies at one
tail of the curve are grouped (united by a bracket in the printed table). For a
goodness of fit test, no expected frequency should be
less than 5.
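The grouping rule can be sketched as follows. This is one simple way to pool classes, assuming that only the upper tail needs merging; the helper name `pool_upper_tail` is hypothetical, and the data are columns (2) and (3) of Table 5.5.

```python
def pool_upper_tail(observed, expected, minimum=5.0):
    """Merge classes from the upper tail until the last expected frequency is >= minimum."""
    obs, exp = list(observed), list(expected)
    while len(exp) > 1 and exp[-1] < minimum:
        last_exp = exp.pop()
        last_obs = obs.pop()
        exp[-1] += last_exp   # fold the removed class into its neighbour
        obs[-1] += last_obs
    return obs, exp

observed = [75, 103, 121, 54, 30, 13, 2, 1, 0, 1]
expected = [66.1, 119.0, 107.1, 64.3, 28.9, 10.4, 3.1, 0.8, 0.2, 0.0]
pooled_obs, pooled_exp = pool_upper_tail(observed, expected)
print(pooled_obs)
print([round(e, 1) for e in pooled_exp])
```

For the yeast data this merges the classes 5 through 9+ into one class with 17 observed squares and an expected frequency of 14.5, the same grouping shown in the table.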
Poisson distribution facts:
To compute expected frequencies we need to know only
one parameter, the mean of the distribution.
The mean completely defines the shape of a given
Poisson distribution.
We have a simple relation between the mean and the variance: $\mu = \sigma^2$
The variance is equal to the mean.
EXAMPLE 5.5
In our example, variance = 1.965, not much larger than
the mean 1.80, indicating that the yeast cells are
distributed approximately in Poisson fashion.
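The mean and variance quoted here can be checked from the observed frequencies in Table 5.5. The sketch below assumes that the open class "9+" is treated as exactly 9 cells; the function name `mean_and_variance` is hypothetical.

```python
def mean_and_variance(counts, freqs):
    """Sample mean and variance (n - 1 denominator) of a frequency distribution."""
    n = sum(freqs)
    total = sum(y * f for y, f in zip(counts, freqs))
    total_sq = sum(y * y * f for y, f in zip(counts, freqs))
    mean = total / n
    variance = (total_sq - total**2 / n) / (n - 1)
    return mean, variance

counts = list(range(10))                       # 9 stands in for the "9+" class (assumption)
freqs = [75, 103, 121, 54, 30, 13, 2, 1, 0, 1]  # observed frequencies, Table 5.5
mean, variance = mean_and_variance(counts, freqs)
print(round(mean, 2), round(variance, 3))       # 1.8 1.965
```

Under that assumption the result reproduces the mean of 1.80 and the variance of 1.965 given above.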
FIGURE 5.3
(1)                   (2)            (3)                 (4)
Number of moss        Observed       Absolute expected   Deviation from expectation
shoots per quadrat, Y frequencies    frequencies         (sign of observed - expected)
_________________________________________________________________________
0                     100             77.7               +
1                       9             37.6               -
2                       6              9.1               -
3                       8              1.5               +
4                       1              0.2               +
5                       0              0.0               0
6+                      2              0.0               +
(2 to 6+ pooled)       17             10.8               +
Total                 126            126.1
__________________________________________________________________________
$\bar{Y}$ = 0.4841    $s^2$ = 1.308    CD = 2.702
TABLE 5.6
The first example is from an ecological study of
mosses of the species Hypnum schreberi invading
mica residue of china clay. The ecologist laid out 126
quadrats and counted the number of moss shoots
in each. Expected frequencies are calculated
using the mean number of moss shoots, $\bar{Y}$ = 0.4841, as
an estimate of $\mu$.
We expect only 78 quadrats without a moss plant, but we
find 100.
We also expect 1.7 quadrats containing 3 or more
moss shoots, but we find 11.
The central classes are less frequent than expected.
TABLE 5.6
Instead of the nearly 38 expected quadrats with one
moss plant each, we find only 9.
This case illustrates clumping, which was also
encountered in the binomial distribution.
The sample variance $s^2$ = 1.308, much larger than the
mean $\bar{Y}$ = 0.4841, yields a coefficient of dispersion CD =
$s^2/\bar{Y}$ = 2.702.
Biological explanation: the protonemata, or spores,
of the moss were carried in by water and deposited
at random, but each protonema gave rise to a
number of upright shoots, so counts of the latter
indicated a clumped distribution.
(1)                   (2)            (3)                 (4)
Number of weed        Observed       Poisson expected    Deviation from expectation
seeds per sample, Y   frequencies    frequencies         (sign of observed - expected)
_________________________________________________________________________
0                      37             31.3               +
1                      32             35.7               -
2                      16             20.4               -
3                       9              7.8               +
4                       2              2.2               -
5                       0              0.5               -
6                       1              0.1               +
7+                      1              0.0               +
(3 to 7+ pooled)       13             10.6               +
Total                  98             98.0
__________________________________________________________________________
$\bar{Y}$ = 1.1429    $s^2$ = 1.711    CD = 1.497
TABLE 5.7
The second example tests the randomness of
distribution of weed seeds in samples of grass seed.
We can estimate k (which is several thousand), and
q, which represents the large proportion of grass
seeds, as compared with p, the small proportion of
weed seed.
The data are structured as in a binomial distribution
with alternative states: weed seed and grass seed.
Only the number of weed seeds must be considered.
This is a binomial in which the frequency of one
outcome is very much smaller than that of the other,
and the sample size is large.
TABLE 5.7
We can use the Poisson distribution as a useful
approximation of the binomial frequencies for the tail of
the distribution.
We use the average number of weed seeds per sample of
seeds as our estimate of the mean and calculate Poisson
frequencies from the mean.
Although the pattern of deviations and the coefficient of
dispersion indicate clumping, this tendency is not
pronounced and we do not have sufficient evidence to
suggest this is not a Poisson distribution.
We conclude the seeds are randomly distributed
throughout the sample.
If clumping had been found, it might mean that weed
seeds stuck together for some physical reason.
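The claim that the Poisson distribution approximates a binomial with very small p and large k can be checked numerically. In this sketch k = 5000 is a hypothetical sample size standing in for "several thousand", and p is chosen so that kp equals the Table 5.7 mean of 1.1429; the function names are also hypothetical.

```python
import math

def binomial_pmf(r, k, p):
    """Probability of exactly r weed seeds among k seeds, each a weed with probability p."""
    return math.comb(k, r) * p**r * (1 - p)**(k - r)

def poisson_pmf(r, mu):
    """Poisson relative expected frequency of r rare events with mean mu."""
    return mu**r / (math.factorial(r) * math.exp(mu))

k = 5000          # hypothetical number of seeds per sample ("several thousand")
mu = 1.1429       # mean number of weed seeds per sample, Table 5.7
p = mu / k        # implied small proportion of weed seeds
for r in range(5):
    print(r, round(binomial_pmf(r, k, p), 4), round(poisson_pmf(r, mu), 4))
```

With these assumed values the two columns differ only slightly, which is why only the mean is needed and k, p and q do not have to be known exactly.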
(1)                   (2)            (3)                 (4)
Number of eggs        Observed       Poisson expected    Deviation from expectation
per bean, Y           frequencies    frequencies         (sign of observed - expected)
_________________________________________________________________________
0                      61             70.4               -
1                      50             32.7               +
2                       1              7.6               -
3                       0              1.2               -
4+                      0              0.1               -
(2 to 4+ pooled)        1              8.9               -
Total                 112            112.0
__________________________________________________________________________
$\bar{Y}$ = 0.4643    $s^2$ = 0.269    CD = 0.579
Biological explanation:
It was found that adult female weevils tend to
deposit their eggs evenly rather than randomly over the
available beans.
This prevents too many eggs being placed on any one
bean and precludes heavy competition among the
developing larvae.
A contributing factor was competition between
larvae feeding on the same bean, generally resulting
in all but one being killed or driven away.
(1)                   (2)            (3)                 (4)
                      Observed       Poisson expected    Deviation from expectation
Y                     frequencies    frequencies         (sign of observed - expected)
_________________________________________________________________________
0                     109            108.7               +
1                      65             66.3               -
2                      22             20.2               +
3                       3              4.1               -
4                       1              0.6               +
5+                      0              0.1               -
(3 to 5+ pooled)        4              4.8               -
Total                 200            200.0
__________________________________________________________________________
$\bar{Y}$ = 0.610    $s^2$ = 0.611    CD = 1.002
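The coefficients of dispersion quoted under the tables can be recomputed as the ratio of the sample variance to the sample mean. The sketch below uses only the summary values printed in these notes; the function name and the dictionary labels are hypothetical.

```python
def coefficient_of_dispersion(variance, mean):
    """CD = sample variance / sample mean; close to 1 for a Poisson distribution."""
    return variance / mean

# (mean, variance) pairs as printed under each table
summaries = {
    "moss shoots (Table 5.6)": (0.4841, 1.308),
    "weed seeds (Table 5.7)": (1.1429, 1.711),
    "weevil eggs": (0.4643, 0.269),
    "final table": (0.610, 0.611),
}
for label, (mean, variance) in summaries.items():
    print(label, round(coefficient_of_dispersion(variance, mean), 3))
```

Read against the discussion in these notes, a CD well above 1 (moss shoots) goes with clumping, a CD well below 1 (weevil eggs) goes with a more even spread than random, and a CD close to 1 (final table) is what a Poisson distribution would give.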