K B Athreya
years earlier, Henri Lebesgue of France extended the notion of length of intervals in R, the real line, to a large class M of sets in R, now called Lebesgue measurable sets. The extended function λ on M satisfied the condition that (R, M, λ) is a measure space, i.e., M is what is now known as a σ-algebra of subsets of R that includes all intervals, and λ : M → [0, ∞] is a measure. (See the precise definition later.)
Kolmogorov saw in Lebesgue's theory of measure on R an appropriate mathematical model for studying random phenomena.
First, one identifies the set Ω of possible outcomes associated with the given random phenomenon. This set Ω is called the sample space, and a typical individual element ω in Ω is called a sample point. Even though the outcome
of the experiment is not predictable ahead of time, one
may be able to determine the ‘chances’ that some par-
ticular statement about the outcome is valid. The set
of ω’s for which a given statement is valid is called an
event. Thus, an event is a subset of the sample space Ω.
After identifying the sample space, one identifies a class
F of subsets of Ω (not necessarily all of P(Ω), the power
set of Ω, i.e., the collection of all possible subsets of Ω)
and then a set function P on F such that for an event
A in F , P (A) will represent the chance of the event A
happening. Thus, to a given random phenomenon, one associates a triplet (Ω, F, P), where Ω is the set of all possible outcomes (called the sample space), F ⊂ P(Ω) is a collection of subsets of Ω (called the events collection), and P is a function from F to [0, ∞] (called a probability distribution). It
is reasonable to impose the following conditions on F
and P .
i) Ω ∈ F; A ∈ F ⇒ Aᶜ ∈ F; and A1, A2, · · · ∈ F ⇒ ∪i Ai ∈ F (that is, F is a σ-algebra of subsets of Ω);

ii) P(Ω) = 1, P(A) ≥ 0 for all A in F, and P(∪i Ai) = Σi P(Ai) for every sequence A1, A2, · · · of pairwise disjoint events in F (that is, P is a probability measure).

A collection of random variables on such a triplet (Ω, F, P) is called independent if, for every finite subcollection, the joint distribution factors into the product of the individual distributions; equivalently, the events determined by distinct variables in the collection are independent with respect to P.
An example of an infinite sequence of independent random variables is the following. Let Ω = [0, 1], F = B[0, 1], the Borel σ-algebra of [0, 1], and P = Lebesgue measure. For each ω, let ω ≡ Σ_{i=1}^∞ δi(ω)/2^i be the binary expansion of ω. Then, it can be shown that for each k < ∞, the functions δ1(ω), · · · , δk(ω) are independent on this (Ω, F, P), with each δi having distribution

P{ω : δi(ω) = 0} = 1/2 = P{ω : δi(ω) = 1},

called the Bernoulli(1/2) distribution. One just needs to verify that {ω : δ1(ω) = s1, · · · , δk(ω) = sk}, for any given s1, · · · , sk ∈ {0, 1}, is simply an interval of length 1/2^k in [0, 1]. A similar result holds for the expansion to base p, where p is an integer > 1.
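The behaviour of the digit functions δi can also be checked empirically. The following Python sketch (illustrative, not from the article; binary_digits is a hypothetical helper) extracts the first three binary digits of uniformly sampled points and confirms that each digit equals 1 about half the time, and that a pair of digits behaves independently.

```python
import random

# Illustrative sketch: the binary digits delta_i of a uniform omega in [0, 1)
# behave like independent Bernoulli(1/2) random variables.
def binary_digits(omega, k):
    """Return the first k binary digits delta_1, ..., delta_k of omega."""
    digits = []
    for _ in range(k):
        omega *= 2
        d = int(omega)        # delta_i is the integer part after doubling
        digits.append(d)
        omega -= d
    return digits

random.seed(0)
trials = 50_000
ones = [0, 0, 0]              # count of 1s in digit positions 1, 2, 3
both = 0                      # count of samples with delta_1 = delta_2 = 1
for _ in range(trials):
    d = binary_digits(random.random(), 3)
    for i in range(3):
        ones[i] += d[i]
    both += d[0] * d[1]

# Each digit frequency should be near 1/2; the pair frequency near 1/4,
# as independence predicts (1/2 x 1/2).
print([c / trials for c in ones], both / trials)
```

For example, ω = 0.625 = 0.101 in base 2, so binary_digits(0.625, 3) returns [1, 0, 1].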
The following results known as the ‘laws of large num-
bers’ are consequences of slightly more general results
due to Kolmogorov.
Theorem 1 (Weak Law of Large Numbers). Let X1, X2, X3, · · · be independent and identically distributed random variables on some probability space (Ω, F, P) such that E|X1| < ∞. Then, for every ε > 0,

P{ω : |X̄n(ω) − EX1| > ε} → 0 as n → ∞,

where X̄n ≡ (X1 + X2 + · · · + Xn)/n.

Theorem 2 (Strong Law of Large Numbers). Under the same hypotheses,

P{ω : X̄n(ω) → EX1 as n → ∞} = 1.
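A minimal empirical sketch (illustrative, not from the article): simulating fair coin flips, for which EX1 = 1/2, and watching the sample mean X̄n settle near 1/2 as n grows.

```python
import random

# Illustrative sketch: the law of large numbers for fair coin flips
# (Bernoulli(1/2) random variables, so EX1 = 1/2).
random.seed(1)

def sample_mean(n):
    """X-bar_n: the average of n independent fair coin flips."""
    return sum(random.randint(0, 1) for _ in range(n)) / n

for n in (10, 1_000, 100_000):
    print(n, sample_mean(n))   # the averages drift toward 1/2 as n grows
```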
A refinement of the laws of large numbers is the central limit theorem: if, in addition, σ² ≡ E(X1 − μ)² is in (0, ∞), where μ = EX1, then for all −∞ < a < b < ∞,

i) lim_{n→∞} P( a ≤ √n (X̄n − μ)/σ ≤ b ) = ∫_a^b (1/√(2π)) e^{−x²/2} dx,

ii) lim_{n→∞} P( a ≤ √n (X̄n − μ)/σn ≤ b ) = ∫_a^b (1/√(2π)) e^{−x²/2} dx,

where

σn² ≡ (1/n) Σ_{i=1}^n (Xi − X̄n)², n ≥ 1.
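The convergence in i) can be seen numerically. In this illustrative Python sketch (not from the article), the Xi are Uniform[0, 1], so μ = 1/2 and σ = 1/√12; the fraction of standardized sample means falling in [−1, 1] approaches the normal mass on that interval, about 0.683.

```python
import random, math

# Illustrative sketch: the central limit theorem for Uniform[0, 1]
# variables, for which mu = 1/2 and sigma = 1/sqrt(12).
random.seed(2)
mu, sigma = 0.5, 1 / math.sqrt(12)
n, reps = 200, 10_000

hits = 0
for _ in range(reps):
    xbar = sum(random.random() for _ in range(n)) / n
    z = math.sqrt(n) * (xbar - mu) / sigma     # standardized sample mean
    if -1 <= z <= 1:
        hits += 1

# The fraction should be close to Phi(1) - Phi(-1), roughly 0.683.
print(hits / reps)
```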
The function Φ(y) ≡ ∫_{−∞}^y (1/√(2π)) e^{−x²/2} dx, −∞ < y < ∞, is called the standard normal distribution function or Gaussian distribution, named after the great German mathematician Carl F Gauss³. The function φ(y) = dΦ(y)/dy = (1/√(2π)) e^{−y²/2} is called the standard normal probability density function. The graph of the curve (x, φ(x)) as x varies over (−∞, ∞) looks like a bell and is referred to as the bell curve.

³ Resonance, Vol.2, No.6, 1997.
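Φ has no elementary closed form, but it can be evaluated through the error function via the standard identity Φ(y) = (1 + erf(y/√2))/2. A small Python sketch (illustrative, not from the article):

```python
import math

def Phi(y):
    """Standard normal distribution function:
    Phi(y) = (1 + erf(y / sqrt(2))) / 2."""
    return 0.5 * (1.0 + math.erf(y / math.sqrt(2.0)))

def phi(y):
    """Standard normal density: phi(y) = exp(-y^2/2) / sqrt(2*pi)."""
    return math.exp(-0.5 * y * y) / math.sqrt(2.0 * math.pi)

print(Phi(0.0))              # 0.5, by symmetry of the bell curve
print(Phi(1.0) - Phi(-1.0))  # the normal mass within one standard deviation
```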
Suppose, given 0 < α < 1, one wants to produce an interval In based on observations X1, X2, · · · , Xn such that

P(μ ∈ In) → (1 − α) as n → ∞,

where μ = EX1. For this, one first chooses a ∈ (0, ∞) such that

Φ(a) − Φ(−a) = ∫_{−a}^{+a} (1/√(2π)) e^{−u²/2} du = (1 − α).
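By the symmetry of φ, Φ(a) − Φ(−a) = 2Φ(a) − 1, so the required value is a = Φ⁻¹(1 − α/2). A Python sketch (illustrative, not from the article) using the standard library's NormalDist:

```python
from statistics import NormalDist

# Sketch: choose a with Phi(a) - Phi(-a) = 1 - alpha. By symmetry this is
# 2*Phi(a) - 1 = 1 - alpha, i.e. a = Phi^{-1}(1 - alpha/2).
def choose_a(alpha):
    return NormalDist().inv_cdf(1.0 - alpha / 2.0)

a = choose_a(0.05)
print(round(a, 2))         # 1.96, the familiar value for a 95% interval
check = NormalDist().cdf(a) - NormalDist().cdf(-a)
print(round(check, 4))     # 0.95
```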
Note that if

In ≡ ( X̄n − a σn/√n , X̄n + a σn/√n )