
International Journal of Computer and Information Technology (ISSN: 2279 – 0764)

Volume 05 – Issue 03, May 2016

Missing Values Estimation Comparison in Split-Plot Design
Layla A. Ahmed
Department of Mathematics, College of Education,
University of Garmian
Kurdistan Region –Iraq
Email: laylaaziz424 [AT] yahoo.com

Abstract: The present study addresses the treatment of missing values in the split-plot design. Three methods are used to estimate the missing values: the Coons method, the Haseman and Gaylor method, and the Rubin method. To compare these methods, three statistical measures are used: absolute error (AE), mean square error (MSE) and the Akaike information criterion (AIC). The practical work leads to the following conclusions. With one missing value, all three methods give the same estimate. With two and three missing values, the Coons method gives the best estimates.

Keywords: Split-Plot Design, Estimating Missing Values, Mean Square Error, Akaike Information Criterion.

I. INTRODUCTION

In designed experiments it sometimes happens that the observations on some experimental units are not available. For example, in an industrial experiment the observations may be misplaced or impossible to collect; in a medical experiment, patients may withdraw from the treatment programme or the experimenter may fail to record the results. Similarly, in an agricultural experiment the plants may be eaten by animals. In all these situations the resulting data are called non-orthogonal [7].

One of the first papers on estimating a missing yield was published by Allan and Wishart (1930) [10]. Yates (1933) showed that by choosing values that minimize the residual sum of squares, one can obtain the correct least squares estimates of all estimable parameters as well as the correct residual sum of squares [1][3]. Bartlett (1937), Anderson (1946) and Coons (1957) used the analysis of covariance model to analyze experiments with missing data [9].

Rubin (1972) used a non-iterative least squares technique that estimates the missing values so that the residuals of the estimated cells are equal to zero. Haseman and Gaylor (1973) described a simple non-iterative technique that estimates missing values by solving a set of simultaneous linear equations that can be written down directly. Subramani and Ponnuswamy (1989) discussed the non-iterative least squares estimation of missing values in experimental designs and presented results for randomized block designs and Latin square designs [7]. Bhatra (2013) studied the estimation of m missing observations in a randomized block design, both with and without specifying the positions of the missing values [1].

In this study, three methods are used to treat the missing values: the Coons method, the Haseman and Gaylor method, and the Rubin method. To compare these methods, three statistical measures are used: absolute error (AE), mean square error (MSE) and the Akaike information criterion (AIC).

II. SPLIT-PLOT DESIGN

Split-plot designs were originally developed by Fisher (1925) for use in agricultural experiments [5]. The split-plot design is usually used because of some limitation in space or to facilitate treatment application. The two factors are divided into a main-plot effect and a sub-plot effect. The precision is greater for the sub-plot factor than for the main-plot factor. If one factor is more important to the researcher, and if the experiment allows it, that factor should be assigned to the sub-plots.

The mathematical model for the split-plot design is [11]:

$$y_{ijk} = \mu + \alpha_i + \rho_k + \beta_j + (\alpha\beta)_{ij} + \delta_{ik} + \varepsilon_{ijk} \tag{1}$$

$$i = 1, 2, \dots, a; \quad j = 1, 2, \dots, b; \quad k = 1, 2, \dots, r$$

Where:
$y_{ijk}$ : The value of any observation
$\mu$ : General mean
$\alpha_i$ : Effect of main-plot factor (A)
$\rho_k$ : Effect of block
$\beta_j$ : Effect of sub-plot factor (B)
$(\alpha\beta)_{ij}$ : Effect of the interaction between A and B
$\delta_{ik}$ : Error of main plot
$\varepsilon_{ijk}$ : Error of sub plot
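As an illustration of model (1), the following minimal sketch draws one data set with the two error terms kept separate. It is not part of the original study: the function name `simulate_split_plot`, the effect sizes and the two standard deviations `sd_main` and `sd_sub` are hypothetical values chosen only to show how each term of the model enters an observation.

```python
import numpy as np

def simulate_split_plot(a, b, r, mu=40.0, sd_main=5.0, sd_sub=3.0, seed=0):
    """Draw y[i, j, k] from model (1); all parameter values are illustrative."""
    rng = np.random.default_rng(seed)
    alpha = rng.normal(0.0, 2.0, size=a)             # main-plot factor A
    beta = rng.normal(0.0, 2.0, size=b)              # sub-plot factor B
    rho = rng.normal(0.0, 2.0, size=r)               # blocks
    alpha_beta = rng.normal(0.0, 1.0, size=(a, b))   # A x B interaction
    delta = rng.normal(0.0, sd_main, size=(a, r))    # main-plot error
    eps = rng.normal(0.0, sd_sub, size=(a, b, r))    # sub-plot error
    # Broadcast every term to the common shape (a, b, r) and add them up.
    return (mu
            + alpha[:, None, None]
            + rho[None, None, :]
            + beta[None, :, None]
            + alpha_beta[:, :, None]
            + delta[:, None, :]
            + eps)

y = simulate_split_plot(a=2, b=4, r=3)
print(y.shape)
```

The main-plot error `delta` is shared by all sub-plots of the same main plot, whereas `eps` varies from sub-plot to sub-plot, which is why the design has two error lines in its analysis of variance.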
The analysis of variance for the split-plot design is based on the following sums of squares:


Sum of squares of total:

$$SST = \sum_{i=1}^{a}\sum_{j=1}^{b}\sum_{k=1}^{r} Y_{ijk}^2 - \frac{Y_{...}^2}{abr} \tag{2}$$

Sum of squares of blocks:

$$SS_{block} = \frac{\sum_{k=1}^{r} Y_{..k}^2}{ab} - \frac{Y_{...}^2}{abr} \tag{3}$$

Sum of squares of factor A (main-plot):

$$SSA = \frac{\sum_{i=1}^{a} Y_{i..}^2}{br} - \frac{Y_{...}^2}{abr} \tag{4}$$

Sum of squares of error (a):

$$SSE(a) = \frac{\sum_{i=1}^{a}\sum_{k=1}^{r} Y_{i.k}^2}{b} - \frac{Y_{...}^2}{abr} - SSA - SS_{block} \tag{5}$$

Sum of squares of factor B (sub-plot):

$$SSB = \frac{\sum_{j=1}^{b} Y_{.j.}^2}{ar} - \frac{Y_{...}^2}{abr} \tag{6}$$

Sum of squares of the interaction effect AB:

$$SSAB = \frac{\sum_{i=1}^{a}\sum_{j=1}^{b} Y_{ij.}^2}{r} - \frac{Y_{...}^2}{abr} - SSA - SSB \tag{7}$$

Sum of squares of error (b):

$$SSE(b) = SST - SS_{block} - SSA - SSB - SSAB - SSE(a) \tag{8}$$

Each sum of squares has an associated number of degrees of freedom. Mean squares are defined as sums of squares divided by their degrees of freedom. The analysis of variance is shown in Table 1.

TABLE 1: ANOVA FOR SPLIT-PLOT DESIGN

S.O.V.     d.f.           S.S.       M.S.                      F.Cal
Blocks     r-1            SSblock    SSblock/(r-1)             MSblock/MSE(a)
A          a-1            SSA        SSA/(a-1)                 MSA/MSE(a)
Error(a)   (a-1)(r-1)     SSE(a)     SSE(a)/[(a-1)(r-1)]
B          b-1            SSB        SSB/(b-1)                 MSB/MSE(b)
AB         (a-1)(b-1)     SSAB       SSAB/[(a-1)(b-1)]         MSAB/MSE(b)
Error(b)   a(b-1)(r-1)    SSE(b)     SSE(b)/[a(b-1)(r-1)]
Total      abr-1          SST

III. METHODS OF ESTIMATING MISSING VALUES

A. Coons Method

Coons (1957) used the analysis of covariance model to analyze experiments with missing values. The technique employs the computational procedures of a covariance analysis using a dummy covariate X, as follows.

In the case of one missing value, the missing value is estimated by covariance analysis through the following steps [9], [5]:
1) Consider the original data as the dependent variable Y of the covariance analysis and insert the value of zero in the cell which has the missing observation.
2) Define a variable X where:
   $X = 0$ if $Y \neq 0$
   $X = -n$ if $Y = 0$
   where n is the total number of observations in the experiment, including the missing value.
3) Carry out the analysis of covariance.
4) Compute the estimate of the regression coefficient:

$$\beta_E = \frac{E_{XY}}{E_{XX}} \tag{9}$$

   and multiply it by n to estimate the missing value:

$$X = n\beta_E \tag{10}$$

In the case of more than one missing value:
1) Put $Y = 0$ for all missing values.
2) Define a variable $X_m$ for each missing value, where:
   $X_m = 0$ if $Y \neq 0$
   $X_m = -n$ if $Y = 0$
3) With more than one missing observation a multiple covariance analysis is required.


   The computations required to obtain the sums of products $X_m X_n$ and $X_m Y$ are simple, since each $X_m$ is associated with a single missing value and therefore has only one non-zero cell. In computing $X_m X_n$, two situations may be encountered:
   a) The two missing values associated with $X_m$ and $X_n$ occur in the same level of the given source of variation:
      $X_m X_n = n$ (degrees of freedom for the given source of variation)
   b) The two missing values occur in different levels of the given source of variation:
      $X_m X_n = -n$ (degrees of freedom for the given source of variation)
4) Compute the estimates of the regression coefficients $(\beta_{1E}, \beta_{2E}, \dots, \beta_{mE})$ by solving the m equations:

$$\begin{aligned}
E_{X_1X_1}\beta_{1E} + E_{X_1X_2}\beta_{2E} + \dots + E_{X_1X_m}\beta_{mE} &= E_{X_1Y} \\
&\ \ \vdots \\
E_{X_mX_1}\beta_{1E} + E_{X_mX_2}\beta_{2E} + \dots + E_{X_mX_m}\beta_{mE} &= E_{X_mY}
\end{aligned} \tag{11}$$

The missing values are estimated by the following formula:

$$X_i = n\beta_{iE}, \quad i = 1, 2, 3, \dots, m \tag{12}$$

B. Haseman and Gaylor Method

Haseman and Gaylor (1973) suggested a non-iterative technique to estimate m missing values by solving m simultaneous linear equations of the following form [6]:

$$(r-1)(b-1)\,Y_h + \sum_{g \neq h}^{m} Y_g \left[\psi_{gh}(A_3) - r\,\psi_{gh}(A_1) - b\,\psi_{gh}(A_2)\right] = r\,T_h(A_1) + b\,T_h(A_2) - T_h(A_3) \tag{13}$$

Where:

$r$ = Number of replicates.
$b$ = Number of levels of the second factor (B).

$\psi_{gh}(A_1)$ = 1 if $Y_h$ and $Y_g$ are at different levels of factor B, but at the same level of factor A and in the same block; 0 otherwise.

$\psi_{gh}(A_2)$ = 1 if $Y_h$ and $Y_g$ are at the same levels of factors A and B; 0 otherwise.

$\psi_{gh}(A_3)$ = 1 if $Y_h$ and $Y_g$ are at the same level of factor A; 0 otherwise.

$T_h(A_1)$ = Total for the main unit containing the missing value.

$T_h(A_2)$ = Total of all sub-units that receive the treatment combination $a_i b_j$.

$T_h(A_3)$ = Total of all observations that receive the i-th level of A.

C. Rubin Method

Rubin (1972) used a non-iterative least squares technique that estimates the missing values so that the residuals of the estimated cells are equal to zero [2]:

$$X = -PR^{-1} \tag{14}$$

Where:
$P$, $X$ = Vectors $(1 \times m)$.
$R$ = Non-singular matrix $(m \times m)$.

The elements of $P$ are the residuals of the missing cells, computed with zero substituted for each missing value:

$$e_{ijk} = Y_{ijk} - \frac{Y_{i.k}}{b} - \frac{Y_{ij.}}{r} + \frac{Y_{i..}}{br} \tag{15}$$

The diagonal elements of $R$ are:

$$r_{kk} = 1 - \frac{1}{b} - \frac{1}{r} + \frac{1}{br} \tag{16}$$

and the off-diagonal elements are:

$$r_{kl} = \frac{1}{br}, \quad k \neq l \tag{17}$$

IV. STATISTICAL MEASUREMENTS

After the missing values have been estimated, they are replaced by their estimates and the usual analysis of variance computations are applied to the augmented data set, with one modification: one degree of freedom is subtracted from the error degrees of freedom for each missing value.

The following statistical measures are used. The mean square error (MSE) is calculated as shown in equation (8) and Table 1. The absolute error (AE) is the absolute value of the difference between the estimated value and the real value, calculated as follows:

$$AE = \left| y_i - \hat{y}_i \right| \tag{18}$$

Where:
$y_i$ : Real value.
$\hat{y}_i$ : Estimated value.


The Akaike information criterion (AIC) is a measure of the relative quality of statistical models for a given set of data. It is calculated as follows:

$$AIC = n \ln(\hat{\sigma}^2) + 2(k + 1) \tag{19}$$

Where:
$\hat{\sigma}^2$ : Mean square of error.
$k$ : Number of variables in the model.
$n$ : Total number of observations.

V. THE PRACTICAL PART

A. Data Description

The data are heights (cm) of eucalyptus plants from a field trial under a split-plot design with two main-plot treatments and three blocks, given in Jayaraman [8]. Let A denote the main-plot factor (pit size) and B the sub-plot factor (fertilizer treatments). The resulting data are shown in Table 2. The values treated as missing in this study were not missing originally; they are assumed to be missing for the purpose of the comparison.

TABLE 2: DATA EXPERIMENT

                    Blocks
A       B        I        II       III      Total
a1      b1       25.38    61.35    37.00    123.73
        b2       46.56    66.73    28.00    141.29
        b3       66.22    35.70    35.7     137.62
        b4       30.68    58.96    21.58    111.22
        Total    168.84   222.74   122.28   513.86
a2      b1       19.26    55.8     57.6     132.66
        b2       19.96    33.96    31.7     85.62
        b3       22.22    58.4     51.96    132.6
        b4       16.82    45.6     26.55    88.97
        Total    78.26    193.76   167.83   439.85
Total            247.1    416.5    290.11   953.71

Before analyzing the data, the distribution of the data should be checked. A histogram was used to examine normality, as shown in Figure 1.

FIGURE 1: HISTOGRAM FOR DATA (histogram of y with fitted normal curve; Mean 38.74, StDev 15.31, N 24; horizontal axis: y, vertical axis: Frequency)

The chart in Figure 1 indicates that the data of the experiment are approximately normally distributed. To test the homogeneity of variance, the hypotheses are:

$$H_0: \text{all variances are equal} \qquad H_1: \text{at least one variance is not equal} \tag{20}$$

The value of Bartlett's test statistic is 2.57 with a p-value of 0.92. Since the p-value is greater than the significance level ($\alpha = 0.05$), the null hypothesis cannot be rejected and there is no problem with homogeneity of variance.

The analysis of variance is given in Table 3.

TABLE 3: ANOVA FOR DATA EXPERIMENT

S.O.V.     d.f.   S.S.       M.S.      F.Cal   F.Tab (α = 0.05)
Blocks     2      1938.5     969.25
A          1      228.35     228.35    0.39    4.75
Error(a)   2      1161.34    580.67
B          3      487.82     162.61    1.01    3.49
AB         3      388.21     129.40    0.81    3.49
Error(b)   12     1928.15    160.68
Total      23     6132.37

Three methods are used to treat the missing values: the Coons method, the Haseman and Gaylor method, and the Rubin method. To compare these methods, three statistical measures are used: absolute error (AE), mean square error (MSE) and the Akaike information criterion (AIC).
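The sums of squares in equations (2) to (8) can be checked numerically against Table 3. The sketch below is an illustration rather than the author's own procedure: it assumes the 24 cell values of Table 2 exactly as printed, stored in an array indexed as `y[i, j, k]` (level of A, level of B, block), and the helper name `split_plot_ss` is hypothetical.

```python
import numpy as np

# Heights from Table 2, indexed y[i, j, k] = (level of A, level of B, block).
y = np.array([
    [[25.38, 61.35, 37.00],
     [46.56, 66.73, 28.00],
     [66.22, 35.70, 35.70],
     [30.68, 58.96, 21.58]],
    [[19.26, 55.80, 57.60],
     [19.96, 33.96, 31.70],
     [22.22, 58.40, 51.96],
     [16.82, 45.60, 26.55]],
])

def split_plot_ss(y):
    """Sums of squares of equations (2)-(8) for a complete split-plot data set."""
    a, b, r = y.shape
    cf = y.sum() ** 2 / (a * b * r)                 # correction factor Y...^2 / abr
    ss = {}
    ss["total"] = (y ** 2).sum() - cf                                   # (2)
    ss["block"] = (y.sum(axis=(0, 1)) ** 2).sum() / (a * b) - cf        # (3)
    ss["A"] = (y.sum(axis=(1, 2)) ** 2).sum() / (b * r) - cf            # (4)
    # Main-plot totals Y_i.k are obtained by summing over the levels of B.
    ss["error_a"] = ((y.sum(axis=1) ** 2).sum() / b - cf
                     - ss["A"] - ss["block"])                           # (5)
    ss["B"] = (y.sum(axis=(0, 2)) ** 2).sum() / (a * r) - cf            # (6)
    # Treatment-combination totals Y_ij. are obtained by summing over blocks.
    ss["AB"] = (y.sum(axis=2) ** 2).sum() / r - cf - ss["A"] - ss["B"]  # (7)
    ss["error_b"] = (ss["total"] - ss["block"] - ss["A"]
                     - ss["B"] - ss["AB"] - ss["error_a"])              # (8)
    return ss

ss = split_plot_ss(y)
for name, value in ss.items():
    print(f"{name:8s} {value:10.2f}")
```

Dividing each sum of squares by its degrees of freedom from Table 1 gives the mean squares, and the ratios MSA/MSE(a), MSB/MSE(b) and MSAB/MSE(b) give the calculated F values.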


B. Estimating Missing Value

Case 1: One Missing Value

In Table 2, assume that $y_{122}$ is the missing observation. Let $X_1$ be the corresponding value, which is unknown. We estimate the missing value $X_1$ using the following methods:

1. Coons Method

Let n equal the total number of observations in the experiment, including the missing one. Consider the original data as the dependent variable Y of the covariance analysis and insert the value of zero in the cell which has the missing observation. Set up a concomitant variable X which takes the value $-n$ in the cell corresponding to the substituted zero and the value zero elsewhere. The resulting sums of products are given in Table 4.

TABLE 4: ANCOVA FOR CASE 1

S.O.V    d.f    XY         X²
Block    2      -162.33    48
A        1      -7.28      24
E(a)     2      120.53     48
B        3      240.26     72
AB       3      200.64     72
E(b)     12     638.28     288
Total    23                552

By using equation (9), we get

$$\beta_E = \frac{E_{XY}}{E_{XX}} = \frac{638.28}{288} = 2.216$$

The missing value is estimated by equation (10):

$$X = 24(2.216) = 53.19$$

2. Haseman and Gaylor Method

By equation (13), we get

$$(3-1)(4-1)X = 3(156.01) + 4(74.56) - 447.13$$
$$X = 53.19$$

3. Rubin Method

By equations (15) and (16), we get

$$P = 0 - \frac{156.01}{4} - \frac{74.56}{3} + \frac{447.13}{12} = -26.595$$

$$r = 1 - \frac{1}{4} - \frac{1}{3} + \frac{1}{12} = 0.5$$

By equation (14), we get

$$X = \frac{-(-26.595)}{0.5} = 53.19$$

The usual analysis of variance is then calculated, as shown in Table 5.

TABLE 5: THE ANALYSIS AFTER ESTIMATING MISSING VALUE

Methods   Estimate of missing value   AE      MSE       AIC
Coons     53.19                       8.16    166.95    132.82
H&G       53.19                       8.16    166.95    132.82
Rubin     53.19                       8.16    166.95    132.82

The missing value was estimated by the methods above, and the methods were compared on the basis of the AE, MSE and AIC of the estimated value in order to ascertain its proximity to the real value. For one missing value, all three methods produce the same result.

Case 2: Two Missing Values

Assume that $y_{122}$ and $y_{213}$ are the missing observations. Let $X_1$ and $X_2$ be the corresponding values, which are unknown. We estimate the missing values $X_1$ and $X_2$ using the following methods:

1. Coons Method

A multiple covariance analysis is used to handle the problem of two missing values. Assign the value of zero to the two missing values ($y_{122} = 0$) and ($y_{213} = 0$). Set up two concomitant variables $X_1$ and $X_2$, one for each missing value. $X_1 = 0$ in all cells except the cell corresponding to $y_{122}$, where $X_1 = -n$. Similarly, $X_2 = 0$ in all cells except the cell corresponding to $y_{213}$, where $X_2 = -n$. The computation of the multiple covariance is given in Table 6.

TABLE 6: ANCOVA WITH TWO MISSING VALUES

S.O.V    d.f    X1Y        X2Y       X1X2    Xm²
Block    2      -219.93    131.84    -24     48
A        1      -64.88     64.88     -24     24
E(a)     2      178.13     -28.73    24      48
B        3      188.66     34.22     -24     72
AB       3      109.12     129.8     24      72
E(b)     12     638.28     497.36    0       288
Total    23     829.38     829.38    -24     552

By using equation (11), we get

$$\begin{aligned}
288\,\beta_{1E} + 0\,\beta_{2E} &= 638.28 \\
0\,\beta_{1E} + 288\,\beta_{2E} &= 497.36
\end{aligned}$$
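The system above can also be solved numerically. The following is a minimal sketch, assuming the Error(b) line of Table 6 as input; the variable names `E_xx`, `E_xy`, `beta` and `estimates` are illustrative and not taken from the original. Because the code carries the coefficients at full precision, its results can differ in the second decimal place from hand calculations that round the coefficients first.

```python
import numpy as np

n = 24  # total number of observations, including the missing ones

# Error(b) line of Table 6: sums of squares and products of the dummy
# covariates X1 and X2, and their sums of products with Y.
E_xx = np.array([[288.0, 0.0],
                 [0.0, 288.0]])
E_xy = np.array([638.28, 497.36])

beta = np.linalg.solve(E_xx, E_xy)   # regression coefficients, equation (11)
estimates = n * beta                 # missing-value estimates, equation (12)

print("beta      =", np.round(beta, 3))
print("estimates =", np.round(estimates, 2))
```

The same two lines of linear algebra apply to any number m of missing values: `E_xx` becomes an m × m matrix and `E_xy` a vector of length m.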


$$\begin{pmatrix} \beta_1 \\ \beta_2 \end{pmatrix} = \begin{pmatrix} 288 & 0 \\ 0 & 288 \end{pmatrix}^{-1} \begin{pmatrix} 638.28 \\ 497.36 \end{pmatrix} = \begin{pmatrix} 2.215 \\ 1.726 \end{pmatrix}$$

The missing values are estimated by equation (12):

$$\begin{pmatrix} X_1 \\ X_2 \end{pmatrix} = \begin{pmatrix} 53.16 \\ 41.424 \end{pmatrix}$$

2. Haseman and Gaylor Method

By equation (13), we get

$$\begin{aligned}
6X_1 + X_2 &= 319.14 \\
X_1 + 6X_2 &= 248.68
\end{aligned}$$

$$\begin{pmatrix} X_1 \\ X_2 \end{pmatrix} = \begin{pmatrix} 6 & 1 \\ 1 & 6 \end{pmatrix}^{-1} \begin{pmatrix} 319.14 \\ 248.68 \end{pmatrix} = \begin{pmatrix} 47.61 \\ 33.51 \end{pmatrix}$$

3. Rubin Method

By equations (15), (16) and (17), we get

$$p_1 = 0 - \frac{156.01}{4} - \frac{74.56}{3} + \frac{447.13}{12} = -26.595$$

$$p_2 = 0 - \frac{110.23}{4} - \frac{75.06}{3} + \frac{382.25}{12} = -20.72$$

$$r_{kk} = 1 - \frac{1}{4} - \frac{1}{3} + \frac{1}{12} = 0.5$$

$$r_{kl} = \frac{1}{12} = 0.083$$

$$P = \begin{pmatrix} -26.595 & -20.72 \end{pmatrix}, \qquad R = \begin{pmatrix} 0.5 & 0.083 \\ 0.083 & 0.5 \end{pmatrix}$$

The missing values are estimated by equation (14):

$$\begin{pmatrix} X_1 \\ X_2 \end{pmatrix} = \begin{pmatrix} 47.62 \\ 33.53 \end{pmatrix}$$

TABLE 7: THE ANALYSIS AFTER ESTIMATING TWO MISSING VALUES

Methods   Estimate of missing value   AE       MSE       AIC
Coons     53.16                       8.19     170.59    133.340
          41.42                       16.18
H&G       47.61                       13.74    175.29    133.995
          33.51                       24.09
Rubin     47.62                       13.73    175.27    133.992
          33.53                       24.07

The analysis in Table 7 shows that the MSE and AIC of the Coons method are less than those of the H&G and Rubin methods, and that the estimated values of the Coons method are closer to the real values.

Case 3: Three Missing Values

Assume that $y_{122}$, $y_{213}$ and $y_{231}$ are the missing observations. Let $X_1$, $X_2$ and $X_3$ be the corresponding values, which are unknown. We estimate the missing values $X_1$, $X_2$ and $X_3$ using the following methods:

1. Coons Method

A multiple covariance analysis is used to handle the problem of three missing values. Assign the value of zero to the three missing values ($y_{122} = 0$), ($y_{213} = 0$) and ($y_{231} = 0$). Set up three concomitant variables $X_1$, $X_2$ and $X_3$, one for each missing value. The computation of the multiple covariance is given in Table 8.

TABLE 8: ANCOVA WITH THREE MISSING VALUES

S.O.V    X1Y       X2Y       X3Y        X1X2    X1X3    X2X3    Xm²
Block    242.15    109.63    132.52     -24     -24     -24     48
A        -87.1     87.1      87.1       -24     -24     -24     24
E(a)     200.35    -50.95    251.3      24      24      24      48
B        166.44    12        -184.84    -24     -24     -24     72
AB       131.34    107.58    21.86      24      24      24      72
E(b)     638.28    541.8     499.22     0       0       0       288
Total    807.16    807.16    807.16     -24     -24     -24     552

By using equation (11):

$$\begin{aligned}
288\,\beta_{1E} + 0 + 0 &= 638.28 \\
0 + 288\,\beta_{2E} + 0 &= 541.8 \\
0 + 0 + 288\,\beta_{3E} &= 499.22
\end{aligned}$$


$$\begin{pmatrix} \beta_1 \\ \beta_2 \\ \beta_3 \end{pmatrix} = \begin{pmatrix} 288 & 0 & 0 \\ 0 & 288 & 0 \\ 0 & 0 & 288 \end{pmatrix}^{-1} \begin{pmatrix} 638.28 \\ 541.8 \\ 499.22 \end{pmatrix} = \begin{pmatrix} 2.216 \\ 1.881 \\ 1.733 \end{pmatrix}$$

The missing values are estimated by equation (12):

$$\begin{pmatrix} X_1 \\ X_2 \\ X_3 \end{pmatrix} = \begin{pmatrix} 53.19 \\ 45.15 \\ 41.60 \end{pmatrix}$$

2. Haseman and Gaylor Method

By equation (13), we get

$$\begin{aligned}
6X_1 + X_2 + X_3 &= 319.14 \\
X_1 + 6X_2 + X_3 &= 270.90 \\
X_1 + X_2 + 6X_3 &= 249.61
\end{aligned}$$

$$\begin{pmatrix} X_1 \\ X_2 \\ X_3 \end{pmatrix} = \begin{pmatrix} 6 & 1 & 1 \\ 1 & 6 & 1 \\ 1 & 1 & 6 \end{pmatrix}^{-1} \begin{pmatrix} 319.14 \\ 270.90 \\ 249.61 \end{pmatrix} = \begin{pmatrix} 42.84 \\ 33.19 \\ 28.93 \end{pmatrix}$$

3. Rubin Method

By equations (15), (16) and (17), we get

$$p_1 = 0 - \frac{156.01}{4} - \frac{74.56}{3} + \frac{447.13}{12} = -26.595$$

$$p_2 = 0 - \frac{110.23}{4} - \frac{75.06}{3} + \frac{360.03}{12} = -22.575$$

$$p_3 = 0 - \frac{56.04}{4} - \frac{110.38}{3} + \frac{360.03}{12} = -20.8$$

$$r_{kk} = 1 - \frac{1}{4} - \frac{1}{3} + \frac{1}{12} = 0.5$$

$$r_{kl} = \frac{1}{12} = 0.083$$

$$P = \begin{pmatrix} -26.595 & -22.575 & -20.8 \end{pmatrix}, \qquad R = \begin{pmatrix} 0.5 & 0.083 & 0.083 \\ 0.083 & 0.5 & 0.083 \\ 0.083 & 0.083 & 0.5 \end{pmatrix}$$

The missing values are estimated by equation (14):

$$\begin{pmatrix} X_1 \\ X_2 \\ X_3 \end{pmatrix} = \begin{pmatrix} 42.87 \\ 33.23 \\ 28.97 \end{pmatrix}$$

TABLE 9: THE ANALYSIS AFTER ESTIMATING THREE MISSING VALUES

Methods   Estimates of missing values   AE       MSE       AIC
Coons     53.19                         8.16     185.67    135.375
          45.15                         12.45
          41.60                         19.38
H&G       42.84                         18.51    191.46    136.112
          33.19                         24.41
          28.93                         6.71
Rubin     42.87                         18.48    191.36    136.099
          33.23                         24.37
          28.97                         6.75

The analysis in Table 9 shows that the MSE and AIC of the Coons method are less than those of the H&G and Rubin methods, and that the estimated values of the Coons method are closer to the real values.

VI. CONCLUSIONS

The results of estimating the missing values are summarized in Tables 5, 7 and 9, which contain the MSE, AE and AIC. We have observed that:

1. In the case of one missing value, all three methods give the same estimate of the missing value.

2. In the cases of two and three missing values, the best method for estimating the missing values is the Coons method, because it has the minimum mean square error, the minimum absolute error and the minimum Akaike information criterion.

3. Increasing the number of missing values increases the differences between the estimates given by the different methods.

REFERENCES

[1] B. N. Ch. Charyulu and T. Dharamyadav, Estimation of Missing Observation in Randomized Block Design, International Journal of Technology and Engineering Science, Vol. 1, No. 6, 2013, pp. 618-621.
[2] D. B. Rubin, A Non-Iterative Algorithm for Least Squares Estimation of Missing Values in any Analysis of Variance Design, Journal of Applied Statistics, 21, 1972, pp. 136-141.
[3] F. Yates, Analysis of Replicated Experiments When the Field Results are Incomplete, Journal of Experimental Agriculture, 1, 2, 1933, pp. 129-142.
[4] G. E. Boyhan, Agricultural Statistical Data Analysis Using Stata, Taylor and Francis Group, Boca Raton, London, New York, 2013.
[5] I. Coons, The Analysis of Covariance as a Missing Plot Technique, Journal of Biometrics, Vol. 13, No. 3, 1957, pp. 387-405.


[6] J. K. Haseman and D. W. Gaylor, An Algorithm for Non-Iterative Estimation of Multiple Missing Values for Crossed Classifications, Journal of Technometrics, Vol. 15, No. 3, 1973, pp. 631-636.
[7] J. Subramani, Non-Iterative Least Squares Estimation of Missing Values in Cross-Over Designs Without Residual Effect, Journal of Biometrics, Vol. 36, No. 3, 1994, pp. 285-292.
[8] K. Jayaraman, A Statistical Manual for Forestry Research, Kerala Forest Research Institute, FAO publication, Kerala, India, 2000. http://www.fao.org
[9] L. W. Ching, Missing Plot Techniques, a master's report of statistics, National Taiwan University, 1973.
[10] R. L. Anderson, Missing Plot Techniques, Journal of Biometrics, American Statistical Association, Vol. 2, No. 3, 1946, pp. 41-47.
[11] W. G. Cochran and G. M. Cox, Experimental Designs, Second Edition, John Wiley and Sons, Inc., New York, 1957.
