Random Variables: Previously we defined a numerical variable as some phenomenon of interest to which
numerical values may be assigned. Now we define the random variable a little differently, by associating it
with the outcomes of a random experiment.
By a random variable (rv), we mean a real number connected with the outcome of an experiment by a
predefined rule: for different outcomes it takes different values. The set of all possible values of a random
variable is called its range set. Random variables are classified into discrete and continuous random
variables. A discrete random variable can take only a limited number of values in a range, whereas a
continuous random variable can take any value within a range.
Random variables are denoted by uppercase letters, and the values assumed by them are denoted by lowercase
letters.
Example: Consider the experiment of tossing three coins. The sample space contains the points {HHH,
HHT, HTH, THH, TTH, THT, HTT, TTT}. To each of these outcomes we can assign random variables,
say: X = number of heads, Y = number of tails, Z = excess of heads over tails, etc. Then we have the following table.
It is apparent from the above table that the variables X and Y can take the values 0, 1, 2 or 3. The variable Z
can take the values 3, 1, -1 and -3.
Let X be a discrete rv and x1, x2, … the possible values of X. Then to each value xi we can
associate a number p(xi) = P(X = xi) and call it the probability that X takes the value xi.
The set of pairs (xi, p(xi)) forms the probability distribution of X. Whenever the context
permits, we write x for xi and p(x) for p(xi). Thus we may define the probability distribution of a discrete
random variable to be a mutually exclusive listing of all possible numerical values of that random variable
such that a particular probability of occurrence is associated with each value. Further, the probability
distribution of a discrete random variable may be:
1) A theoretical listing of all values of outcomes and their probabilities. (The numbers p(xi) are generated
using some mathematical logic.) For example,
2) An empirical listing of values and their respective observed relative frequencies. (The p(xi) are calculated as
relative frequencies.) At this point, we should understand that the relative frequencies we discussed in
previous lectures are nothing but the empirical probabilities of the variable falling in a class or taking a
particular value.
Stat I_Random Variables and Probability Distribution
3) A subjective listing of values associated with their subjective probabilities. (The p(xi) are calculated using
subjective reasoning.)
Probability Mass Function: A mathematical model or formula which gives the probabilities for the different
outcomes of a random variable is called the probability mass function of that variable.
Class Work:
• In the previous example of tossing 3 coins, what are the probability distributions of X, Y and Z?
• Three balls are drawn from a box containing 6 red and 8 white balls. Suppose X = number of red balls
drawn and Y = number of white balls drawn. Form the probability distributions of X and Y.
2. A frequency distribution shows how the total frequency N (or n) is distributed among the different values,
or groups of values, of the variable; a probability distribution shows how the total probability of 1 is
distributed among the different values, or range of values, of the variable.
In order to summarize a discrete probability distribution we shall compute its major characteristics, the mean
and the standard deviation.
The mean (denoted by µ) of a probability distribution is the expected value of its random variable. The
expected Value of a discrete random variable may be considered as the weighted average over all possible
values- the weights being the probability associated with each of the values. This measure can be obtained by
multiplying each possible value x of X, by its corresponding probability P(X = x) and then summing the
products.
µ = E(X) = Σ x · P(X = x)
In the roll of a fair die, we can show that the expected value of X = the number shown is
E(X) = 3.5. This is not literally meaningful, as we can never obtain a face of 3.5 while rolling a die. However,
we can expect to observe the six different faces with equal probabilities, so we should have equal numbers of
ones, twos, …, and sixes. In the long run, over many rolls, the average value would be 3.5.
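The long-run average just described can be verified with a short computation. The sketch below (illustrative Python, not part of the original notes) applies the formula µ = Σ x · P(X = x) to the fair die:

```python
from fractions import Fraction

# E(X) for the roll of a fair die: each face 1..6 has probability 1/6.
p = Fraction(1, 6)
expected = sum(x * p for x in range(1, 7))
print(expected)  # 7/2, i.e. 3.5
```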
Similarly, we can find the Variance of X, denoted by σ² or Var(X), by using the following formula.
The Carnival Game: It is played against a house (casino). In this game, a player pays Rs. 4 for each roll
of a die; the house, in turn, pays the player the amount shown on the face of the die. In this case the
expected payoff per game is only Rs. 3.5, because the expected value on the die is 3.5. It means that over
many rolls, the payoff can be expected to average out to Rs. 3.5 per roll.
This situation can be described using another variable Y = rupees won, or monetary value.
This shows that we are expected to lose 50 paisa per roll, on average. This type of game is not in favor of
the player. If the pay per roll to the house is decreased to Rs. 3.5, then the game becomes fair.
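The expected loss of 50 paisa per roll can be checked directly; this is an illustrative sketch, not part of the original notes:

```python
from fractions import Fraction

# Y = net winnings per roll: the face shown minus the Rs. 4 charge per roll.
p = Fraction(1, 6)
expected_face = sum(face * p for face in range(1, 7))  # 7/2, i.e. Rs. 3.5
expected_net = expected_face - 4
print(expected_net)  # -1/2: an expected loss of 50 paisa per roll
```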
The Game of Craps: This deals with the rolling of a pair of fair dice. A field bet in the game of craps is a
one-roll bet and is based on the outcome of the pair of dice. For every Rs. 1 bet you make: you lose Rs. 1 if the sum
is 5, 6, 7 or 8; you win Rs. 1 if the sum is 3, 4, 9, 10 or 11; and you win Rs. 2 if the sum is 2 or 12.
Discuss the game in terms of the player's long-run profit or loss.
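One way to approach this question is to tabulate the distribution of the sum of two dice and weight the payoffs by their probabilities; the sketch below (illustrative only) does exactly that:

```python
from fractions import Fraction

# Distribution of the sum of two fair dice: each of the 36 outcomes has probability 1/36.
sum_prob = {}
for d1 in range(1, 7):
    for d2 in range(1, 7):
        s = d1 + d2
        sum_prob[s] = sum_prob.get(s, Fraction(0)) + Fraction(1, 36)

# Field-bet payoff per Rs. 1 bet, as stated in the rules above.
def payoff(s):
    if s in (5, 6, 7, 8):
        return -1   # lose the Rs. 1 bet
    if s in (2, 12):
        return 2    # win Rs. 2
    return 1        # win Rs. 1 on 3, 4, 9, 10 or 11

expected = sum(payoff(s) * p for s, p in sum_prob.items())
print(expected)  # -1/18: about a 5.6 paisa loss per Rs. 1 bet, in the long run
```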
Under-or-Over-Seven: In the game of Under-or-Over-Seven, a pair of fair dice is rolled once, and the
resulting sum determines whether the player wins or loses his/her bet. The player can bet on
UNDER (seeing a sum under 7), SEVEN (seeing exactly 7) or OVER (seeing a sum over 7). If the player
bets on under, i.e. on seeing 2, 3, 4, 5 or 6, he/she wins the amount of the bet if the sum is
under 7, and loses the amount if the sum is not under 7. Similarly, he/she can bet on seeing over 7, i.e. 8, 9,
10, 11 or 12. He/she can also bet on seeing 7; in this case he/she wins four times the bet if the sum is 7,
and otherwise loses.
Discuss this situation considering the player's long-run profit or loss, for each of the three types of bets.
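The three bets can be compared with the same expected-value technique; the sketch below (illustrative, not from the notes) shows that all three bets have the same long-run loss:

```python
from fractions import Fraction

# Probabilities for the sum of two fair dice.
probs = {}
for a in range(1, 7):
    for b in range(1, 7):
        probs[a + b] = probs.get(a + b, Fraction(0)) + Fraction(1, 36)

p_under = sum(p for s, p in probs.items() if s < 7)   # 15/36
p_seven = probs[7]                                    # 6/36
p_over = sum(p for s, p in probs.items() if s > 7)    # 15/36

# Expected profit per Rs. 1 bet for each of the three bets.
e_under = 1 * p_under - 1 * (1 - p_under)
e_over = 1 * p_over - 1 * (1 - p_over)
e_seven = 4 * p_seven - 1 * (1 - p_seven)
print(e_under, e_over, e_seven)  # all -1/6: every bet loses about 16.7 paisa per rupee
```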
Roulette Betting Strategies (Single and Multiple): Students who know the game may use this concept for
finding out expected profit in each betting strategy.
These are probability distributions in which the variables are distributed (the probabilities are assigned to the
various values of a variable) according to some definite probability law expressed in the form of a
mathematical model. These distributions are classified as discrete and continuous according to the nature of
the underlying random variable. The study of these distributions is crucial because they can be used in one
or another real-life situation to describe the distribution of different variables in a population under
consideration.
It is the distribution of a random variable which is related to the Bernoulli Process. A Bernoulli Process is an
experiment in which trials are repeated a fixed number of times. The characteristics of the trials are as
follows:
i) The outcome of each trial is dichotomously classified: occurrence of a predefined event (called a
success) or non-occurrence of that event (called a failure).
ii) For each trial the probability of success is p, and the probability of failure is q = 1 - p, where p is a
number between zero and one.
iii) The trials are statistically independent: the outcome of a trial does not affect and is not affected by the
outcomes of preceding or succeeding trials.
Whenever a group of individuals, animate or inanimate, can be classified dichotomously with respect to an
attribute, we may call the possession of that attribute a "Success" and the non-possession of that attribute a
"Failure", and we can describe the underlying experiment as a Bernoulli Process.
For example,
i) Selecting persons from a dichotomously classified population (e.g. classified as Vegetarian/ Non-Veg)
ii) Quality Control Inspection of some product (Defective and Non-defective)
iii) A series of dice-rolling, coin-flipping or card-drawing experiments (a particular outcome being called a
success)
iv) Answering multiple-choice questions in an examination (wild guessing)
v) A series of Langoor Boorja or cowry-shell (Kauda) games
In any such Bernoulli process, we are interested in finding out the probabilities of obtaining a given number
of successes in the n trials.
The Binomial Probability Distribution assigns such probabilities by a mathematical model, called the
probability mass function of the Binomial Distribution. If we suppose X = number of successes in n trials,
then the possible values of X are 0, 1, 2, …, n, and
P(X = r) = nCr p^r q^(n-r), r = 0, 1, 2, …, n.
(Note: the derivation of this formula will be discussed in class, but the derivation is not necessary
for exams.)
A discrete random variable whose probability distribution is given by the above formula is said to have a
Binomial Distribution with parameters n and p, written symbolically as X ~ B(n, p); X is called a Binomial
variable.
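As an illustration (not part of the original notes), the binomial probability mass function can be evaluated directly, for example with Python's math.comb:

```python
from math import comb

def binomial_pmf(r, n, p):
    """P(X = r) = nCr * p**r * q**(n-r), with q = 1 - p, for X ~ B(n, p)."""
    return comb(n, r) * p**r * (1 - p)**(n - r)

# Example: X = number of heads in 3 tosses of a fair coin, X ~ B(3, 0.5)
print(binomial_pmf(2, 3, 0.5))                            # 0.375
print(sum(binomial_pmf(r, 3, 0.5) for r in range(4)))     # 1.0 (total probability)
```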
Cumulative Binomial Probabilities: Sometimes we are interested in finding the probability of getting a
number of successes in a range; such probabilities are found by adding the required exact probabilities, as
shown:
Prob. of getting at most r successes in n trials = P(X ≤ r) = ∑_{j=0}^{r} P(X = j)
Prob. of getting at least r successes in n trials = P(X ≥ r) = ∑_{j=r}^{n} P(X = j)
Prob. of getting less than r successes in n trials = P(X < r) = ∑_{j=0}^{r-1} P(X = j), etc.
Caution!!: In a discrete probability distribution, including or excluding an equality sign makes a
difference in the answer. Note that P(X < r) and P(X ≤ r) are not the same for a discrete probability
distribution; however, they can be used interchangeably in the case of a continuous probability distribution.
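The caution above can be checked numerically. The sketch below (illustrative Python, with n and p chosen arbitrarily) shows that P(X ≤ 4) and P(X < 4) are different probabilities for X ~ B(10, 0.5):

```python
from math import comb

def binomial_pmf(j, n, p):
    return comb(n, j) * p**j * (1 - p)**(n - j)

n, p, r = 10, 0.5, 4
p_at_most = sum(binomial_pmf(j, n, p) for j in range(r + 1))  # P(X <= 4)
p_less = sum(binomial_pmf(j, n, p) for j in range(r))         # P(X < 4) = P(X <= 3)
print(round(p_at_most, 4), round(p_less, 4))  # 0.377 0.1719 -- the equality sign matters
```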
In the Binomial Tables (see Appendix Table 3 in your textbook) the binomial probabilities P(X = r) = nCr p^r q^(n-r)
are listed for various values of n and p. The use of this kind of table will be practiced in class.
1. Mean and Standard Deviation: Let X denote the number of successes in n trials, the probability of success
in each trial being p.
Then, the expected number of successes in n trials is the mean of this binomial variable X: µ = E(X) = np.
The variance of the number of successes is σ² = npq.
The standard deviation of the number of successes is σ = √(npq). Note that the mean is always greater than the
variance in a Binomial Distribution (since q < 1 implies npq < np).
2. Shape (Skewness): Each time the set of parameters n and p is specified, a particular binomial distribution
is generated. The parameter n is any positive integer and p is a number between 0 and 1, so for
different combinations of values of n and p we have different binomial distributions, with different
means, different standard deviations and different graphical shapes. A binomial distribution may be
skewed or symmetric depending on the values of n and p. We have the following results. (Also see figures on pages
241, 242, 243.)
a) Student's Dilemma: In a multiple-choice 50-question test, each question has 4 options marked A, B, C and
D, one of which is correct. Each question carries 2 marks, and 1 mark is deducted in case of a wrong
answer.
In such an exam a student answered 30 questions correctly. Now she wonders whether guessing the remaining
answers would pay.
In such a situation, how would you use the concepts of the Binomial Distribution and the Expected Value of marks, so
that the student may come out of this dilemma?
i) Also discuss this question when the negative marking is 0.5 marks per wrong answer.
ii) Further suppose that out of the 20 remaining questions, in 8 questions she has confusion over 2 choices, in 7 questions
she has confusion over 3 choices, and in 5 questions she has no idea at all. Is it worth guessing in this situation?
What would be the best strategy? Negative marking is 1 per wrong answer.
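One way into this dilemma is to compute the expected marks from a single guess under each marking scheme; the sketch below is an illustration of that reasoning, not a full solution:

```python
from fractions import Fraction

# Expected marks from one wild guess: P(correct) = 1/4, +2 marks if right, -1 if wrong.
p = Fraction(1, 4)
e_wild = 2 * p - 1 * (1 - p)
print(e_wild)  # -1/4: wild guessing loses marks on average

# i) With negative marking of only 0.5 marks per wrong answer:
e_half = 2 * p - Fraction(1, 2) * (1 - p)
print(e_half)  # 1/8: guessing now pays on average

# ii) Per-question expectation when confusion narrows the choices (1 mark deducted):
e_two_choices = 2 * Fraction(1, 2) - 1 * Fraction(1, 2)    # 1/2 mark per question
e_three_choices = 2 * Fraction(1, 3) - 1 * Fraction(2, 3)  # 0 marks per question
e_no_idea = 2 * Fraction(1, 4) - 1 * Fraction(3, 4)        # -1/4 mark per question
print(e_two_choices, e_three_choices, e_no_idea)
```

Comparing the three per-question expectations suggests guessing where she can narrow the choices to two, being indifferent at three choices, and skipping the questions where she has no idea.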
b) Langoor Boorja: The game of Langoor Boorja is a very popular gambling game in Kathmandu in the
Dashain and Tihar season; though illegal, you would see people playing this game in every locality. The
game is played with 3 cubes (in another variation, with 6 cubes), each having 6 faces painted with six
different shapes, namely Langoor or the flag, Boorja or the crown, Paan♥ or the heart, Iiet♦ or the diamond,
Chidi♣ or the club, and Surath♠ or the spade. The house thoroughly mixes these cubes inside a can, and then
turns the can upside down. The players bet amounts on the different shapes. After the betting is complete, the
house opens the can. The amounts bet on shapes not shown are won by the house, and single, double or
triple the amount of the bet is paid to the players whose shapes are shown, according to how many of that
shape show up.
The situations in both variations of the game are shown in the table below.
Table: Possible happenings and winnings in the Langoor Boorja game, with Rs. 10 bet on Boorja

Three-cube game:
  Number of Boorja shown   Amount won by you
  0                        you lose your Rs. 10
  1                        Rs. 10
  2                        Rs. 20
  3                        Rs. 30

Six-cube game:
  Number of Boorja shown   Amount won by you
  0                        you lose your Rs. 10
  1                        you lose your Rs. 10
  2                        Rs. 20
  3                        Rs. 30
  4                        Rs. 40
  5                        Rs. 50
  6                        Rs. 60
Now suppose X is the random variable, which denotes the number of boorja turned on any game.
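For the three-cube game, X ~ B(3, 1/6), since a boorja shows on each cube with probability 1/6. The player's expected winnings on a Rs. 10 bet can then be sketched as follows (illustrative Python):

```python
from fractions import Fraction
from math import comb

# X = number of boorja shown among 3 cubes, X ~ B(3, 1/6).
p = Fraction(1, 6)
pmf = {r: comb(3, r) * p**r * (1 - p)**(3 - r) for r in range(4)}

# Winnings on a Rs. 10 bet on boorja (three-cube game, from the table above).
winnings = {0: -10, 1: 10, 2: 20, 3: 30}
expected = sum(winnings[r] * pmf[r] for r in range(4))
print(expected)  # -85/108: an expected loss of about 79 paisa per Rs. 10 bet
```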
The Poisson distribution is named after the French mathematician Simeon Denis Poisson (1781-1840). It is
a discrete probability distribution which is used to describe discrete random variables arising with
rare events: the mean number of occurrences of such events is finite, but the probability of the event
happening is very small because the total number of possible cases is very large. We study the Poisson
Distribution in two approaches:
A Poisson Process is said to exist if we can observe discrete events in an ‘area of opportunity’ – a continuous
interval (of time, of length, of surface area, etc.) – in such a manner that, if we shorten the area of
opportunity or interval sufficiently, we find the following characteristics of the successes (happenings of
the event):
Suppose we examine the number of customers arriving during the 1 PM to 2 PM lunch hour at a bank located in
the city center. An arrival of a customer, considered a success, is a discrete event at a particular point over
the continuous one-hour interval. Further suppose that, from past data, we obtain a mean of 180 arrivals
over such an interval. Now if we break the one-hour interval into 3600 consecutive one-second
intervals, then
1. The expected number of customers arriving in any one-second interval would be 0.05.
2. The probability of having more than one customer arriving in any one second interval is zero.
3. The arrival of one customer in any one-second interval is independent of the situation of that one-second
interval within the hour.
4. The arrival of one customer in any one-second interval is independent of the arrival of any other
customer in any other one second interval.
More Examples of Poisson Processes: The Poisson distribution has found application in a variety of fields such
as queuing theory (waiting-time problems), insurance, business, economics and industry.
Following are some situations in which we can use the Poisson distribution for the assessment of probabilities:
a) The number of telephone calls arriving at a telephone switchboard
b) The number of defects per unit of manufactured product
c) The number of suicides reported on a particular day, the number of accidents at a crossroad, or the
number of casualties due to a rare disease
d) The number of typographical errors per page in typed material
Let X denote the number of occurrences of an event (successes) and λ (lambda) the mean number of occurrences
of that event in a certain ‘area of opportunity’. It is apparent that the random variable X may range from 0
to ∞ (0, 1, 2, …). The mathematical model of the Poisson distribution, called the probability mass function of
the Poisson distribution, gives the probability of exactly r successes:
P(X = r) = probability of getting exactly r successes = λ^r e^(-λ) / r!,  (r = 0, 1, 2, …)
All the probabilities can be assigned with knowledge of this single parameter λ, just as by n and p in the case
of the Binomial distribution. Note that as r increases, the probabilities decrease very rapidly, justifying the
rarity of the events.
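The pmf above is straightforward to evaluate; the sketch below (illustrative, with λ = 2 chosen arbitrarily) also recovers the mean λ from the distribution itself:

```python
from math import exp, factorial

def poisson_pmf(r, lam):
    """P(X = r) = lam**r * e**(-lam) / r!, for X ~ P(lam)."""
    return lam**r * exp(-lam) / factorial(r)

# With lam = 2, the probabilities fall off rapidly as r grows.
for r in range(6):
    print(r, round(poisson_pmf(r, 2), 4))

# The mean recovered from the pmf is lam itself (the tail beyond r = 99 is negligible).
mean = sum(r * poisson_pmf(r, 2) for r in range(100))
print(round(mean, 6))  # 2.0
```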
Use of Poisson Probability Table Appendix Table 4(a) and Appendix Table 4(b) will be discussed in the
class.
Cumulative Poisson Probabilities: The cumulative Poisson probabilities are obtained by adding the
appropriate exact probabilities, as in the case of the Binomial Distribution. However, since in the case of the
Poisson distribution the value of X may range to infinity, we cannot calculate cumulative probabilities of the
type P(X > r) and P(X ≥ r) by adding the exact probabilities. In this case, we have to subtract the
complementary probabilities from 1, as such:
P(X > r) = 1 - P(X ≤ r) and P(X ≥ r) = 1 - P(X ≤ r - 1).
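Both complement identities can be demonstrated with a short sketch (illustrative only, with λ = 3 and r = 5 chosen arbitrarily):

```python
from math import exp, factorial

def poisson_pmf(j, lam):
    return lam**j * exp(-lam) / factorial(j)

lam, r = 3.0, 5
p_le = sum(poisson_pmf(j, lam) for j in range(r + 1))   # P(X <= 5)
p_gt = 1 - p_le                                         # P(X > 5) = 1 - P(X <= 5)
p_ge = 1 - sum(poisson_pmf(j, lam) for j in range(r))   # P(X >= 5) = 1 - P(X <= 4)
print(round(p_gt, 4), round(p_ge, 4))  # 0.0839 0.1847 -- they differ by P(X = 5)
```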
1. Mean and Variance: Let X ~ P(λ). Then the mean of the Poisson-distributed variable X is given by E(X) = λ and
the variance of X is Var(X) = λ. It is an interesting property of the Poisson Distribution that its
mean and variance are equal.
2. Shape: Each time the parameter λ is specified, a particular Poisson Distribution is generated. A
Poisson Distribution is skewed to the right (positively skewed) for small values of λ, and gradually
approaches symmetry, with the peak in the middle, as λ gets large. With different values of λ, we obtain a
family of Poisson Distributions.
It should be noted that the Poisson rv may theoretically range from zero to infinity. However, when used as an
approximation to the binomial, the Poisson rv cannot exceed n. Moreover, with large n and small p, the second
equation given above implies that the probability of observing a large number of successes becomes small and
approaches zero quite rapidly. Due to the severe degree of right-skewness in such a probability distribution,
no difficulty arises when applying the Poisson distribution to the binomial.
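How good the approximation is can be seen by comparing the two pmfs side by side; the sketch below (illustrative, with n = 1000 and p = 0.003 chosen as an example of large n and small p) uses λ = np:

```python
from math import comb, exp, factorial

# Large n, small p: X ~ B(1000, 0.003), approximated by a Poisson with lam = np = 3.
n, p = 1000, 0.003
lam = n * p

def binom_pmf(r):
    return comb(n, r) * p**r * (1 - p)**(n - r)

def poisson_pmf(r):
    return lam**r * exp(-lam) / factorial(r)

for r in range(5):
    print(r, round(binom_pmf(r), 5), round(poisson_pmf(r), 5))
# the two columns of probabilities agree to about three decimal places
```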
A random variable is said to be continuous if it can take all possible values within a certain range. When a
mathematical expression is available to represent the underlying continuous random variable, the
probability that values of the rv occur within a certain range or interval may be calculated.
However, the exact probability that the rv takes a particular value is zero, i.e. P(X = x) = 0 when X is a
continuous rv. In such a case, we are interested in finding the probability that the value of the rv falls in an
infinitesimal interval (x - dx/2, x + dx/2), and we write
f(x) dx = P(x - dx/2 ≤ X ≤ x + dx/2),
where f(x) dx represents the area bounded by the curve y = f(x) and the ordinates at the points x - dx/2 and
x + dx/2. The function f(x) is called the Probability Density Function of X, and it is specified mathematically
for the different theoretical continuous distributions (one of which is the Normal Distribution). When the
function f(x) is graphed, it follows a curve known as the probability curve; the total area under this curve
is 1. The area between the curve, the x-axis and the ordinates at the points x = a and x = b gives the
probability that the value of the variable falls in the range (a, b):
P(a ≤ X ≤ b) = ∫_a^b f(x) dx
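This area interpretation can be sketched numerically. The density below is only an assumed example (f(x) = 2x on [0, 1] is a valid pdf, since its total area is 1), and the integral is approximated by a simple midpoint-rule sum:

```python
# P(a <= X <= b) as the area under the density curve, via a midpoint-rule sum.
# Assumed density for illustration: f(x) = 2x on [0, 1].
def f(x):
    return 2 * x

def integrate(g, a, b, steps=100_000):
    """Midpoint-rule approximation of the definite integral of g from a to b."""
    h = (b - a) / steps
    return sum(g(a + (i + 0.5) * h) for i in range(steps)) * h

print(round(integrate(f, 0, 1), 6))      # 1.0  (total area under the curve)
print(round(integrate(f, 0.2, 0.5), 6))  # 0.21 (= 0.5**2 - 0.2**2)
```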
THE NORMAL DISTRIBUTION:
The Normal Distribution is a particular continuous probability distribution. It has several important
theoretical properties.
In actual practice, however, the theoretical properties described above are rarely observed exactly in a
batch of data; we may only observe that, in some data batches, some variables satisfy them approximately.
The distribution is nonetheless important for the following reasons:
i) Numerous continuous phenomena seem to follow this distribution or can be approximated by it.
ii) Various discrete and continuous distributions may be approximated by this distribution, which
avoids computational difficulty. For example, we can use the normal distribution to find binomial
probabilities whenever np ≥ 5.
iii) It provides the basis of statistical inference because of its relation to the Central Limit Theorem.
The Mathematical Model for the Normal Distribution: The probability density function used to obtain the
desired probabilities is given by
f(x) = (1 / (σ√(2π))) e^(-(1/2)((x - µ)/σ)^2),
where e = mathematical constant ≈ 2.71828, π = mathematical constant ≈ 3.14159,
µ is the mean (-∞ < µ < +∞), σ is the standard deviation (0 < σ < +∞), and x is the value of the
variable (-∞ < x < +∞).
Examining the above expression, we see that the probabilities of the random variable X depend only on the
values of µ and σ (they are called the parameters of the distribution). As such, for different combinations of
the values of µ and σ we would have different Normal Distributions. (See figures on page 259.)
A variable X having a Normal Distribution with mean µ and standard deviation σ is denoted by writing
X ~ N(µ, σ²); X is called a Normal variable.
The Standard Normal Distribution: Let X ~ N(µ, σ²), and define a new variable Z by Z = (x - µ)/σ. Such a
variable has no unit of measurement, and is called a standard normal variable. The mean of Z is 0 and its
variance and standard deviation are 1: Z ~ N(0, 1). The probability density function of a standard normal
variable is given by
f(z) = (1/√(2π)) e^(-z^2/2), where -∞ < z < ∞.
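Standardising reduces every normal probability to an area under this one curve; the sketch below (illustrative, using Python's math.erf, which is related to the standard normal cumulative probability by Φ(z) = (1 + erf(z/√2))/2) shows the idea:

```python
from math import erf, sqrt

def phi(z):
    """Standard normal cumulative probability P(Z <= z), via math.erf."""
    return 0.5 * (1 + erf(z / sqrt(2)))

def normal_prob(a, b, mu, sigma):
    """P(a <= X <= b) for X ~ N(mu, sigma**2), after standardising z = (x - mu)/sigma."""
    return phi((b - mu) / sigma) - phi((a - mu) / sigma)

print(round(phi(0), 4))                    # 0.5    (half the area lies below the mean)
print(round(normal_prob(-1, 1, 0, 1), 4))  # 0.6827 (area within one sigma of the mean)
```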
Area Property of the Normal Distribution: Whatever the values of µ and σ, the total area under the normal
distribution curve is 1 (or 100%), so we may think of areas under the curve as probabilities. The normal
distribution has the following area properties.
Use of the Normal Table in Finding Probabilities: In dealing with a continuous distribution, we always
calculate probabilities over a range; to calculate such a probability directly, we would need a thorough
knowledge of definite integration. Fortunately, tables are available for the calculation of these probabilities.
In Normal Probability tables, the probabilities (areas under the normal curve) are listed for different values
of Z. (Detailed discussion and practice will be done in class.)
The table may be used for two purposes.
To find the fractiles of the normal distribution, the following equation should be solved:
P(X ≤ Fk) = k, where Fk = the k-fractile, k = 0.01, 0.02, …, 0.99.
For large n, it would be very inconvenient to calculate binomial probabilities in situations like the above, and
tables are not provided for every value of n and p. In such situations we may use the normal distribution to
approximate the binomial distribution.
Caution: We must always adjust the upper and lower limits using the Continuity Correction, by adding and
subtracting 0.5.
Use this approximation only when both np and nq in the binomial distribution are at least 5.
Also note relations of the type P(X < 3) = P(X ≤ 2), P(X > 5) = P(X ≥ 6), and P(3 ≤ X ≤ 7) = P(2 < X < 8), to
avoid confusion in applying the continuity correction. (End of Random Variables and Probability Distributions)
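The approximation with continuity correction can be checked against the exact binomial answer; the sketch below is illustrative only, with X ~ B(100, 0.5) chosen so that np = nq = 50 comfortably exceeds 5:

```python
from math import comb, erf, sqrt

# X ~ B(100, 0.5): mu = np = 50, sigma = sqrt(npq) = 5.
n, p = 100, 0.5
mu, sigma = n * p, sqrt(n * p * (1 - p))

def phi(z):
    """Standard normal cumulative probability P(Z <= z)."""
    return 0.5 * (1 + erf(z / sqrt(2)))

# Exact binomial probability P(40 <= X <= 60).
exact = sum(comb(n, r) * 0.5**n for r in range(40, 61))

# Normal approximation with continuity correction: the limits become 39.5 and 60.5.
approx = phi((60.5 - mu) / sigma) - phi((39.5 - mu) / sigma)
print(round(exact, 4), round(approx, 4))  # the two values agree to about three decimals
```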