Class 2
AMS-UCSC
AMS-133/206
Topics
We will talk about...
- Maximum likelihood estimators
- Examples of maximum likelihood estimators
- Limitations of maximum likelihood estimators
- Problems
Maximum likelihood estimation

This is a method that chooses as the estimate of $\theta$ the value that provides the largest value of the likelihood function.

This concept was introduced by R.A. Fisher in 1912. It can be applied to most problems and often produces reasonable values for an estimator of $\theta$. For large samples it yields an excellent estimator of $\theta$.

Notation: Let the random variables $X_1, X_2, \ldots, X_n$ be a sample from a continuous or discrete distribution with p.d.f. or p.f. $f(x|\theta)$, with $\theta$ belonging to some parameter space $\Omega$. The parameter $\theta$ can be a real value or a vector. If we observe a sample vector $\mathbf{x} = (x_1, x_2, \ldots, x_n)$, the joint p.d.f. or p.f. will be denoted as $f_n(\mathbf{x}|\theta)$.
Likelihood function

Definition: When the joint p.d.f. or the joint p.f. $f_n(\mathbf{x}|\theta)$ is regarded as a function of $\theta$ given the values $x_1, x_2, \ldots, x_n$, it is called the likelihood function.

- If $\mathbf{x}$ comes from a discrete distribution and the probability $f_n(\mathbf{x}|\theta)$ is very high when $\theta$ takes a particular value $\theta_0$, and very small for any other value of $\theta$, then we would naturally estimate the value of $\theta$ by $\theta_0$.
- If $\mathbf{x}$ comes from a continuous distribution, we will try to find the value of $\theta$ that makes $f_n(\mathbf{x}|\theta)$ very large, and use this value to estimate $\theta$.

This concept is formalized as follows:
Maximum likelihood estimator/estimate

For each possible value of $\mathbf{x}$, we consider $\delta(\mathbf{x})$, the value of $\theta$ which maximizes $f_n(\mathbf{x}|\theta)$. Let $\hat{\theta} = \delta(\mathbf{X})$ be the estimator defined that way. The estimator $\hat{\theta}$ is a maximum likelihood estimator of $\theta$. After $\mathbf{X} = \mathbf{x}$ is observed, the value $\delta(\mathbf{x})$ is the maximum likelihood estimate.

- MLE is an abbreviation for maximum likelihood estimate or estimator.
- The MLE is required to be an element of the parameter space $\Omega$.
Example 1
Lifetimes of electronic components
The following data are observed: $X_1 = 3$, $X_2 = 1.5$ and $X_3 = 2.1$. The random variables have been modeled as a sample from an exponential distribution with parameter $\theta$. The likelihood function is:

\[ f_3(\mathbf{x}|\theta) = \theta^3 \exp(-6.6\,\theta) \]

where $\mathbf{x} = (3, 1.5, 2.1)$ and $6.6 = 3 + 1.5 + 2.1$. Because the logarithm is increasing, the value of $\theta$ maximizing $f_3(\mathbf{x}|\theta)$ is the same value that maximizes $\log f_3(\mathbf{x}|\theta)$. We find the MLE by maximizing:

\[ L(\theta) = \log f_3(\mathbf{x}|\theta) = 3\log(\theta) - 6.6\,\theta \]

Taking the derivative $dL(\theta)/d\theta = 3/\theta - 6.6$, setting it to zero and solving for $\theta$ yields $\theta = 3/6.6 \approx 0.455$. To check whether this is a maximum, calculate the second derivative, $-3/\theta^2$, and check whether it is negative. It is negative for every $\theta > 0$, in particular at 0.455. Therefore this value is the maximum likelihood estimate.
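The calculation in Example 1 can be checked numerically. The sketch below assumes only the three observations from the slide; the function name and the grid used for the brute-force check are illustrative choices, not part of the lecture.

```python
import math

# Observed lifetimes from Example 1.
x = [3.0, 1.5, 2.1]

def log_likelihood(theta, data):
    """Exponential log-likelihood: n*log(theta) - theta*sum(data)."""
    return len(data) * math.log(theta) - theta * sum(data)

# Closed-form MLE obtained by setting the derivative to zero: n / sum(x).
theta_hat = len(x) / sum(x)

# Independent check: brute-force search over an arbitrary grid on (0, 2].
grid = [k / 1000 for k in range(1, 2001)]
theta_grid = max(grid, key=lambda t: log_likelihood(t, x))

print(round(theta_hat, 3), theta_grid)
```

The grid search should land within one grid step of the closed-form value, which is a quick sanity check that the stationary point is a maximum rather than a minimum.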
Winter 2012. Session 1 (Class 2) AMS-133/206 Jan 12, 2012 6 / 19
Example 2
Test for a disease
Suppose a medical test for a certain disease is given to the public by the Department of Public Health. The test is 90% reliable in the following sense:

- There is a 90% probability that the test will give a positive response if the person has the disease, and a 10% probability that it will give a positive response if the person does not.

Let $X$ be the random variable that represents the test result: $X = 0$ when the test is negative and $X = 1$ when the test is positive. Consider the parameter space $\Omega = \{0.1, 0.9\}$, where $\theta = 0.1$ means that the person tested does not have the disease, and $\theta = 0.9$ means that the person tested has the disease. Given $\theta$, $X$ has a Bernoulli distribution with parameter $\theta$. The likelihood function is:

\[ f(x|\theta) = \theta^{x}(1-\theta)^{1-x} \]
Example 2 (Cont.)
Test for a disease
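If $x = 0$ is observed:

\[ f(0|\theta) = \begin{cases} 0.9 & \text{if } \theta = 0.1 \\ 0.1 & \text{if } \theta = 0.9 \end{cases} \]

so $\theta = 0.1$ maximizes the likelihood when $x = 0$ is observed. If $x = 1$ is observed:

\[ f(1|\theta) = \begin{cases} 0.1 & \text{if } \theta = 0.1 \\ 0.9 & \text{if } \theta = 0.9 \end{cases} \]

In this case $\theta = 0.9$ maximizes the likelihood. The MLE is:

\[ \hat{\theta} = \begin{cases} 0.1 & \text{if } X = 0 \\ 0.9 & \text{if } X = 1 \end{cases} \]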
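Because the parameter space in Example 2 has only two elements, the MLE can be found by direct enumeration rather than calculus. The following is a minimal sketch of that idea; the names `OMEGA`, `likelihood` and `mle` are hypothetical.

```python
OMEGA = (0.1, 0.9)  # parameter space from Example 2

def likelihood(x, theta):
    """Bernoulli p.f. theta^x * (1 - theta)^(1 - x) for x in {0, 1}."""
    return theta ** x * (1 - theta) ** (1 - x)

def mle(x):
    """Return the element of OMEGA with the largest likelihood for outcome x."""
    return max(OMEGA, key=lambda theta: likelihood(x, theta))

for outcome in (0, 1):
    print(outcome, mle(outcome))
```

Enumeration also makes it explicit that the estimate is always an element of the parameter space, as the definition of the MLE requires.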
Example 3
Sampling from a Bernoulli distribution
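Let $X_1, X_2, \ldots, X_n$ be a random sample from a Bernoulli distribution with unknown parameter $\theta$ ($0 \le \theta \le 1$). For all observed values $x_1, x_2, \ldots, x_n$, where each $x_i$ is 0 or 1, the likelihood function can be written as:

\[ f_n(\mathbf{x}|\theta) = \prod_{i=1}^{n} \theta^{x_i}(1-\theta)^{1-x_i} \]

Taking logarithms gives:

\[ L(\theta) = \log f_n(\mathbf{x}|\theta) = \Big(\sum_{i=1}^{n} x_i\Big)\log\theta + \Big(n - \sum_{i=1}^{n} x_i\Big)\log(1-\theta) \]

We take the derivative $dL(\theta)/d\theta$, set this derivative to zero and solve for $\theta$. This derivative is zero at $\theta = \bar{x}_n$. By examining the second derivative it can be shown that this value maximizes $L(\theta)$.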
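Example 3 (Cont.)
Sampling from a Bernoulli distribution

- If $\sum_{i=1}^{n} x_i = 0$, then $L(\theta)$ is a decreasing function of $\theta$ for all $\theta$, and $L$ achieves its maximum at $\theta = 0$.
- If $\sum_{i=1}^{n} x_i = n$, then $L(\theta)$ is an increasing function of $\theta$ for all $\theta$, and $L$ achieves its maximum at $\theta = 1$.

In both boundary cases the maximizer again equals $\bar{x}_n$. The MLE is $\hat{\theta} = \bar{X}_n$.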
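The interior and boundary cases of Example 3 can be illustrated with a brute-force search. This is a sketch under stated assumptions: the sample values, the grid resolution and the helper names are hypothetical, and the convention 0 * log(0) = 0 is used so that the endpoints can be evaluated.

```python
import math

def bernoulli_log_likelihood(theta, xs):
    """L(theta) = s*log(theta) + (n - s)*log(1 - theta), where s = sum(xs).

    Uses the convention 0 * log(0) = 0 so that theta = 0 and theta = 1 are handled.
    """
    def term(count, p):
        if count == 0:
            return 0.0
        return count * math.log(p) if p > 0 else -math.inf

    s, n = sum(xs), len(xs)
    return term(s, theta) + term(n - s, 1 - theta)

GRID = [k / 100 for k in range(101)]  # 0.00, 0.01, ..., 1.00 (arbitrary resolution)

def grid_mle(xs):
    """Grid point with the largest log-likelihood."""
    return max(GRID, key=lambda t: bernoulli_log_likelihood(t, xs))

sample = [1, 1, 0, 1, 1, 0, 1, 1]  # hypothetical data: 6 successes in 8 trials
print(grid_mle(sample), sum(sample) / len(sample))

# Boundary cases: all failures and all successes.
print(grid_mle([0, 0, 0]), grid_mle([1, 1, 1]))
```

In each case the grid maximizer coincides with the sample mean, including the two boundary cases where the derivative never vanishes.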
Example 4
Sampling from a Normal distribution with unknown mean
Let $X_1, X_2, \ldots, X_n$ be a random sample from a Normal distribution with unknown mean $\mu$ and known variance $\sigma^2$. For all observed values $x_1, x_2, \ldots, x_n$ the likelihood function is:

\[ f_n(\mathbf{x}|\mu) = \frac{1}{(2\pi\sigma^2)^{n/2}} \exp\Big[-\frac{1}{2\sigma^2}\sum_{i=1}^{n}(x_i-\mu)^2\Big] \]

From this equation we see that $f_n(\mathbf{x}|\mu)$ will be maximized by the value of $\mu$ that minimizes:

\[ Q(\mu) = \sum_{i=1}^{n}(x_i-\mu)^2 = \sum_{i=1}^{n} x_i^2 - 2\mu\sum_{i=1}^{n} x_i + n\mu^2 \]

Note that $Q$ is a quadratic function in $\mu$ with a positive coefficient on $\mu^2$. This means that $Q$ will be minimized where its derivative is zero.
Example 4 (Cont.)
Sampling from a Normal distribution with unknown mean
After calculating the derivative $dQ(\mu)/d\mu = -2\sum_{i=1}^{n} x_i + 2n\mu$, setting it to zero and solving for $\mu$, we find $\mu = \bar{x}_n$. Therefore the MLE for $\mu$ is $\hat{\mu} = \bar{X}_n$. This estimator is not affected by the value of $\sigma^2$.
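The claim that the estimator does not depend on the known variance can be checked numerically. The sketch below uses made-up data and an arbitrary grid of candidate means; none of the values come from the lecture.

```python
import math

def normal_log_likelihood(mu, xs, sigma2):
    """Log of the Normal likelihood with mean mu and known variance sigma2."""
    n = len(xs)
    q = sum((x - mu) ** 2 for x in xs)  # Q(mu) from Example 4
    return -0.5 * n * math.log(2 * math.pi * sigma2) - q / (2 * sigma2)

xs = [4.9, 5.3, 5.1, 4.7, 5.0]  # hypothetical observations
x_bar = sum(xs) / len(xs)

# Candidate means 4.000, 4.001, ..., 6.000 (arbitrary range and resolution).
grid = [4 + k / 1000 for k in range(2001)]

best_by_sigma2 = {}
for sigma2 in (0.5, 1.0, 10.0):
    best_by_sigma2[sigma2] = max(
        grid, key=lambda m: normal_log_likelihood(m, xs, sigma2)
    )

print(x_bar, best_by_sigma2)
```

Changing the variance rescales the log-likelihood but leaves the location of its maximum at the sample mean.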
Example 5
Sampling from a Uniform distribution
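Let $X_1, X_2, \ldots, X_n$ be a random sample from a Uniform distribution on the interval $[0, \theta]$, where the value of the parameter $\theta$ is unknown ($\theta > 0$). The p.d.f. for each observation is as follows:

\[ f(x|\theta) = \begin{cases} 1/\theta & \text{if } 0 \le x \le \theta \\ 0 & \text{otherwise} \end{cases} \]

The joint p.d.f. has the form:

\[ f_n(\mathbf{x}|\theta) = \begin{cases} 1/\theta^n & \text{if } 0 \le x_i \le \theta \ (i = 1, \ldots, n) \\ 0 & \text{otherwise} \end{cases} \]

The MLE will be the value of $\theta$ such that $\theta \ge x_i$ for $i = 1, \ldots, n$ and that maximizes $1/\theta^n$. Since this is a decreasing function of $\theta$, the MLE will be the smallest value of $\theta$ such that $\theta \ge x_i$ for $i = 1, \ldots, n$. This value is $\theta = \max\{x_1, x_2, \ldots, x_n\}$, and the MLE of $\theta$ is $\hat{\theta} = \max\{X_1, X_2, \ldots, X_n\}$.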
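Example 5 is a case where the maximum sits at a boundary rather than at a zero of the derivative. A minimal sketch, with hypothetical data and function names, shows the likelihood dropping to zero just below the sample maximum and decreasing above it.

```python
def uniform_likelihood(theta, xs):
    """Joint p.d.f. of Uniform[0, theta]: theta**(-n) if all 0 <= x_i <= theta, else 0."""
    if theta <= 0 or any(x < 0 or x > theta for x in xs):
        return 0.0
    return theta ** (-len(xs))

xs = [0.8, 2.4, 1.1, 1.9]  # hypothetical observations
theta_hat = max(xs)        # MLE: the largest observation

for theta in (theta_hat - 0.1, theta_hat, theta_hat + 0.1, theta_hat + 1.0):
    print(round(theta, 2), uniform_likelihood(theta, xs))
```

No derivative is involved here: the estimate is read directly from the constraint that every observation must lie inside the interval.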
Nonexistence of an MLE
Example
Let $X_1, X_2, \ldots, X_n$ be a random sample from a Uniform distribution on the interval $(0, \theta)$, where the value of the parameter $\theta$ is unknown ($\theta > 0$). The p.d.f. for each observation is as follows:

\[ f(x|\theta) = \begin{cases} 1/\theta & \text{if } 0 < x < \theta \\ 0 & \text{otherwise} \end{cases} \]

The only difference from the example above is that the weak inequalities have been replaced with strict inequalities. This means that the MLE would be a value of $\theta$ for which $\theta > x_i$ for $i = 1, \ldots, n$ and which maximizes $1/\theta^n$. Since $\theta$ must be strictly greater than every $x_i$, the admissible set does not include the value $\theta = \max\{x_1, x_2, \ldots, x_n\}$. The parameter can be chosen arbitrarily close to $\max\{x_1, x_2, \ldots, x_n\}$, with the likelihood increasing as it gets closer, but it cannot be chosen equal to this value. Therefore the MLE does not exist.
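The nonexistence argument can be made concrete by evaluating the likelihood along a sequence of parameter values approaching the sample maximum from above. This is an illustrative sketch only; the data and the sequence of offsets are hypothetical.

```python
def open_uniform_likelihood(theta, xs):
    """Joint p.d.f. of Uniform(0, theta) with strict inequalities 0 < x_i < theta."""
    if any(x <= 0 or x >= theta for x in xs):
        return 0.0
    return theta ** (-len(xs))

xs = [0.8, 2.4, 1.1, 1.9]  # hypothetical observations
m = max(xs)
supremum = m ** (-len(xs))  # approached but never attained

# Likelihood along theta = m + eps for shrinking eps.
values = [open_uniform_likelihood(m + eps, xs) for eps in (1e-1, 1e-2, 1e-3, 1e-4)]

print(values, supremum, open_uniform_likelihood(m, xs))
```

Every value in the sequence is larger than the one before and still below the supremum, while the likelihood at the sample maximum itself is zero, so no parameter value achieves the largest likelihood.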
Non-uniqueness of an MLE
Example
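Let $X_1, X_2, \ldots, X_n$ be a random sample from a Uniform distribution on the interval $[\theta, \theta + 1]$, where the value of the parameter $\theta$ is unknown ($-\infty < \theta < \infty$). The joint p.d.f. is as follows:

\[ f_n(\mathbf{x}|\theta) = \begin{cases} 1 & \text{for } \theta \le x_i \le \theta + 1 \ (i = 1, \ldots, n) \\ 0 & \text{otherwise} \end{cases} \]

- The condition $\theta \le x_i$ for $i = 1, \ldots, n$ is equivalent to the condition that $\theta \le \min\{x_1, x_2, \ldots, x_n\}$.
- The condition $x_i \le \theta + 1$ for $i = 1, \ldots, n$ is equivalent to the condition $\theta \ge \max\{x_1, x_2, \ldots, x_n\} - 1$.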
The joint likelihood $f_n(\mathbf{x}|\theta)$ can be written as follows:

\[ f_n(\mathbf{x}|\theta) = \begin{cases} 1 & \text{for } \max\{x_1, x_2, \ldots, x_n\} - 1 \le \theta \le \min\{x_1, x_2, \ldots, x_n\} \\ 0 & \text{otherwise} \end{cases} \]

It is then possible to select as an MLE any value of $\theta$ in the interval:

\[ \max\{x_1, x_2, \ldots, x_n\} - 1 \le \theta \le \min\{x_1, x_2, \ldots, x_n\} \]

The MLE is not unique. All values inside the interval are MLEs.
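A short sketch can show the flat likelihood behind the non-uniqueness. The data below are hypothetical and were chosen as exact binary fractions so that the interval endpoints are represented exactly in floating point; the function names are also illustrative.

```python
def shifted_uniform_likelihood(theta, xs):
    """Joint p.d.f. of Uniform[theta, theta + 1]: 1 if every x_i is in the interval, else 0."""
    return 1.0 if all(theta <= x <= theta + 1 for x in xs) else 0.0

def mle_interval(xs):
    """Endpoints of the set of maximizers: [max(xs) - 1, min(xs)]."""
    return max(xs) - 1, min(xs)

xs = [2.25, 2.75, 2.5, 2.375]  # hypothetical observations
lo, hi = mle_interval(xs)

for theta in (lo - 0.05, lo, (lo + hi) / 2, hi, hi + 0.05):
    print(theta, shifted_uniform_likelihood(theta, xs))
```

The likelihood equals 1 everywhere on the interval and 0 outside it, so reporting a single number as "the" MLE requires an extra convention, such as the midpoint, that the likelihood itself does not supply.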
Problems
Problem 1
It is not known what proportion $p$ of the purchases of a certain brand of breakfast cereal is made by women and what proportion is made by men. In a random sample of 70 purchases of this cereal, it was found that 58 were made by women and 12 were made by men. Find the MLE of $p$.
Problem 2
Suppose that $X_1, X_2, \ldots, X_n$ form a random sample from a Normal distribution for which the mean $\mu$ is known, but the variance $\sigma^2$ is unknown. Find the MLE of $\sigma^2$.