1 Exercise 1
Recall that we have two estimators of the parameter θ in a Uniform(0, θ) distribution. One is the MLE, the maximum of the sample, and the other is the MOME, which is twice the sample average. Suppose we have a random sample of size n from a Uniform(0, θ) population. Which of the two estimators is preferable? Answer this question by comparing the mean squared errors, i.e., E[(estimator of θ − θ)^2], of the two estimators, computed using Monte Carlo simulations. Use a variety of values for n and θ, e.g., n = 5, 10, 30, 100, and θ = 1, 2, 4. Do you see any patterns in the results?
1.1 Overview
The code is written so that results are obtained for all suggested values of θ and n at once. The parameter is estimated from many simulated random samples of different sizes, generated with the R command runif(n, min=0, max=θ), which draws random numbers from a uniform distribution, for the predetermined values of θ and the sample sizes. For each θ and each sample size n, a Monte Carlo simulation of 10000 runs is performed. In every run the parameter is estimated by both the method of moments and maximum likelihood. After each simulation the mean squared error is calculated for both methods and the two are compared. To visualize the results, superimposed curves of the mean squared error are plotted for the two methods and the different θ values, showing how the error of each method changes as a function of the sample size.
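The procedure described above can be sketched for a single (n, θ) pair as follows. This is a minimal illustration rather than the script listed in Section 1.3: the helper name mc_mse and the fixed seed are assumptions made here, and the loop over runs is replaced by a matrix with one simulated sample per row.

```r
# Hypothetical helper (not part of the original script): Monte Carlo
# estimate of the MSE of both estimators for one sample size and one theta.
mc_mse <- function(n, theta, m = 10000) {
  # m simulated samples of size n, one sample per row
  draws <- matrix(runif(m * n, min = 0, max = theta), nrow = m, ncol = n)
  mle   <- apply(draws, 1, max)   # MLE: sample maximum of each row
  mome  <- 2 * rowMeans(draws)    # MOME: twice the sample mean of each row
  c(mle = mean((mle - theta)^2), mome = mean((mome - theta)^2))
}

set.seed(1)                       # arbitrary seed, for reproducibility only
res <- mc_mse(n = 30, theta = 2)
print(res)
```

Calling mc_mse over every combination of n and θ would fill the same kind of table that the full script stores in its two result arrays.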
Figure 1.1: Superimposed curves of the mean squared error as a function of sample size for the method of moments and maximum likelihood estimators, for the different values of θ.
Figure 1.2: Zoom into the mean squared error range [0.0, 0.1], to show the errors more clearly when the sample size is 100.
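As a cross-check on the simulated curves, both mean squared errors have standard closed forms for a Uniform(0, θ) sample: 2θ²/((n+1)(n+2)) for the sample maximum and θ²/(3n) for twice the sample mean. The sketch below evaluates them on the suggested grid of n and θ. The function names are illustrative and not part of the original script.

```r
# Closed-form MSEs for a Uniform(0, theta) sample of size n.
# Function names are illustrative, not taken from the original script.
mse_mle_exact  <- function(n, theta) 2 * theta^2 / ((n + 1) * (n + 2))
mse_mome_exact <- function(n, theta) theta^2 / (3 * n)

grid <- expand.grid(n = c(5, 10, 30, 100), theta = c(1, 2, 4))
grid$mle   <- mse_mle_exact(grid$n, grid$theta)
grid$mome  <- mse_mome_exact(grid$n, grid$theta)
grid$ratio <- grid$mome / grid$mle   # equals (n + 1)(n + 2) / (6 n), free of theta
print(grid)
```

Both expressions scale with θ², so the ratio of the two errors depends only on n. Under these formulas the MLE has the smaller error for every n ≥ 3, and its advantage grows with the sample size.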
1.3 R Code
m <- 10000                    # number of Monte Carlo simulations
theta <- c(1, 2, 4)           # parameter values to estimate
n <- c(5, 10, 30, 100)        # sample sizes
symb <- c(1, 5, 4)            # line type for each theta
mle <- rep(0, m)              # initialize mle variable
mome <- rep(0, m)             # initialize mome variable
mlerms <- array(0, dim = c(length(n), length(theta)))   # MSE of the MLE
momerms <- array(0, dim = c(length(n), length(theta)))  # MSE of the MOME
for (iii in seq_along(theta)) {     # loop over the suggested theta values
  for (ii in seq_along(n)) {        # loop over the suggested sample sizes
    for (i in 1:m) {                # loop for the Monte Carlo simulation
      nd <- runif(n[ii], min = 0, max = theta[iii])  # uniform sample
      mle[i] <- max(nd)             # MLE calculation
      mome[i] <- 2.0 * mean(nd)     # MOME calculation
    }
    mlerms[ii, iii] <- mean((mle - theta[iii])^2.0)    # MLE mean squared error
    momerms[ii, iii] <- mean((mome - theta[iii])^2.0)  # MOME mean squared error
  }
}
# PLOTTING RESULTS
pdf(file = "rmsFull.pdf", width = 13, height = 8)
for (iii in seq_along(theta)) {
  plot(n, mlerms[, iii], type = "o", lty = symb[iii], lwd = 2, cex = 2,
       cex.lab = 2, ylim = c(0.0, max(momerms)), xlab = NA, ylab = NA,
       axes = FALSE)
  box()
  axis(side = 1, cex.axis = 1.5)
  axis(side = 2, cex.axis = 1.5)
  mtext(side = 1, "Sample Size (n)", line = 3, cex = 2.5)
  mtext(side = 2, "Mean Squared Error", line = 2.3, cex = 2.5)
  par(new = TRUE)                   # overlay the next curve on the same axes
  plot(n, momerms[, iii], type = "o", lty = symb[iii], lwd = 2, cex = 2,
       cex.lab = 2, pch = 22, ylim = c(0.0, max(momerms)), xlab = NA,
       ylab = NA, axes = FALSE)
  par(new = TRUE)
}
legend("topright",
       c(expression(paste("MLE, ", theta == 1)),
         expression(paste("MOME, ", theta == 1)),
         expression(paste("MLE, ", theta == 2)),
         expression(paste("MOME, ", theta == 2)),
         expression(paste("MLE, ", theta == 4)),
         expression(paste("MOME, ", theta == 4))),
       pch = 21:22, cex = 2, lty = c(1, 1, 5, 5, 4, 4), lwd = 2)
dev.off()
./mp2ex1.r
2 Exercise 2
We know how to construct a large-sample confidence interval for a population proportion p. How good is this confidence interval when n is not very large? Answer this question by computing the coverage probability of this interval using Monte Carlo simulations. Take the level of confidence to be 95%, but use a variety of values for n and p, e.g., n = 5, 10, 30, 100, and p = 0.05, 0.1, 0.25, 0.5, 0.9, 0.95. Do you see any patterns in the results?
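For reference, the large-sample interval is taken here to be the usual Wald form, p̂ ± z·sqrt(p̂(1 − p̂)/n), which matches the half-width expression used in the listing in Section 2.3. The helper name wald_ci and its arguments are assumptions introduced for this sketch.

```r
# Illustrative helper (name and arguments are assumptions):
# Wald interval for a proportion from x successes in n trials.
wald_ci <- function(x, n, level = 0.95) {
  p_hat <- x / n
  z     <- qnorm(1 - (1 - level) / 2)         # 0.975 quantile when level = 0.95
  half  <- z * sqrt(p_hat * (1 - p_hat) / n)  # margin of error
  c(lower = p_hat - half, upper = p_hat + half)
}

ci_mid  <- wald_ci(x = 15, n = 30)  # p_hat = 0.5
ci_zero <- wald_ci(x = 0,  n = 5)   # p_hat = 0: the interval collapses to a point
print(ci_mid)
print(ci_zero)
```

The second call shows one reason small samples are a problem: when no successes are observed, the estimated standard error is zero and the interval has no width.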
2.1 Overview
As in Exercise 1, the code generates results for all suggested values. To investigate how the coverage probability is affected when the sample size is small, a Monte Carlo simulation is performed. To estimate the proportion, samples of different sizes are generated from a Bernoulli distribution, and the sample mean is used as the proportion estimate from which the coverage probability is computed. The simulation is performed for all combinations of n and p. To smooth the values obtained from each simulation, the results are averaged over the runs for each combination of n and p. The behavior across sample sizes is then examined with two plots: coverage probability versus sample size for the different p, and coverage probability versus p for the different sample sizes.
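One way to measure coverage directly is to count the share of simulated 95% intervals that contain the true p. The sketch below does this under that definition. It is not the listing in Section 2.3, which averages the interval half-width, and the helper name coverage_mc and the seed are assumptions.

```r
# Hypothetical helper: Monte Carlo coverage of the Wald interval.
coverage_mc <- function(n, p, m = 10000, level = 0.95) {
  z <- qnorm(1 - (1 - level) / 2)
  # The sum of n Bernoulli(p) draws is Binomial(n, p), so one rbinom draw
  # per run gives the same p_hat as averaging n Bernoulli draws.
  p_hat <- rbinom(m, size = n, prob = p) / n
  half  <- z * sqrt(p_hat * (1 - p_hat) / n)
  mean(p_hat - half <= p & p <= p_hat + half)  # share of intervals containing p
}

set.seed(1)                                    # arbitrary seed
cov_small <- coverage_mc(n = 5,   p = 0.05)
cov_large <- coverage_mc(n = 100, p = 0.5)
print(c(n5_p0.05 = cov_small, n100_p0.5 = cov_large))
```

With n = 5 and p = 0.05 most samples contain no successes, so the interval degenerates to the single point 0 and misses p. Coverage measured this way then falls far below the nominal 95%, while for n = 100 and p = 0.5 it is close to nominal.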
[Figure: Coverage Probability versus Sample Size (n)]
[Figure: Coverage Probability versus Probability (p)]
2.3 R Code
m <- 10000                              # number of Monte Carlo simulations
n <- c(5, 10, 30, 1000)                 # sample sizes
p <- c(0.05, 0.1, .25, .5, .9, .95)     # suggested probabilities
symbnum <- c(0, 1, 2, 4, 5, 16)         # point symbols for plotting
covprob_tmp <- rep(0, m)
covprob <- array(0, dim = c(length(n), length(p)))  # init covprob array
for (iii in seq_along(p)) {             # loop over the suggested p
  for (ii in seq_along(n)) {            # loop over the suggested n
    for (i in 1:m) {
      rb <- rbinom(n[ii], 1, p[iii])    # Bernoulli sample
      p_est <- mean(rb)                 # estimated proportion
      # half-width of the 95% interval around p_est (the quantity plotted)
      covprob_tmp[i] <- qnorm(0.975) * sqrt(p_est * (1 - p_est) / n[ii])
    }
    covprob[ii, iii] <- mean(covprob_tmp)  # average over the Monte Carlo runs
  }
}
# PLOTTING RESULTS
pdf(file = "covprob.pdf", width = 13, height = 8)
for (iii in seq_along(p)) {
  plot(n, covprob[, iii], type = "o", lwd = 2, cex = 2, cex.lab = 1.5,
       pch = symbnum[iii], ylim = c(0.0, 0.4), xlab = NA, ylab = NA,
       axes = FALSE)
  box()
  axis(side = 1, cex.axis = 1.5)
  axis(side = 2, cex.axis = 1.5)
  mtext(side = 1, "Sample Size (n)", line = 3, cex = 2.5)
  mtext(side = 2, "Coverage Probability", line = 2.3, cex = 2.5)
  par(new = TRUE)                       # overlay the next curve on the same axes
}
legend("topright", c("p=0.05", "p=0.1", "p=0.25", "p=0.5", "p=0.9", "p=.95"),
       pch = symbnum, cex = 2, lty = 1, lwd = 2)
dev.off()
pdf(file = "covprob2.pdf", width = 13, height = 8)
for (iii in seq_along(n)) {
  plot(p, covprob[iii, ], type = "o", lwd = 2, cex = 2, cex.lab = 1.5,
       pch = symbnum[iii], ylim = c(0.0, 0.4), xlab = NA, ylab = NA,
       axes = FALSE)
  box()
  axis(side = 1, cex.axis = 1.5)
  axis(side = 2, cex.axis = 1.5)
  mtext(side = 1, "Probability (p)", line = 3, cex = 2.5)
  mtext(side = 2, "Coverage Probability", line = 2.3, cex = 2.5)
  par(new = TRUE)
}
# labels are built from n so that they match the sample sizes actually simulated
legend("topright", paste0("n=", n), pch = symbnum[seq_along(n)], cex = 2,
       lty = 1, lwd = 2)
dev.off()
./mp2ex2.r