
International Journal of Advanced Scientific and Technical Research, Issue 2, Volume 5, October 2012

Available online at http://www.rspublication.com/ijst/index.html ISSN 2249-9954

APPLICATION OF EXTRA SUMS OF SQUARES IN DETECTING MULTICOLLINEARITY

AITUSI D. N., BELLO A. Ojutomori, EHIGIE T. O.
Department of Statistics, Auchi Polytechnic, Auchi

ABSTRACT
This paper examined the level of correlation among the variables of the cumulative grading system used in computing student academic achievement. One response and four predictor variables are examined. The method of Extra Sums of Squares (ESS), obtained from four fitted regression models, is adopted; a correlation matrix is also computed and the presence of multicollinearity examined. Secondary data were obtained from the published 2009/2010 academic session results of graduating students of the National Diploma (ND) programme in Statistics of Auchi Polytechnic. A sample of 50, randomly selected from a list of 120 graduates who received the award of National Diploma (ND) in Statistics, was analyzed. The predictor variables exhibit some level of association: there is indication of correlation between pairs of predictor variables, and of multicollinearity among the predictor variables Previous Total Credit Units (PTCU), Previous Total Credit Points (PTCP), Current Total Credit Units (CTCU) and Current Total Credit Points (CTCP).

KEY WORDS: Award, Multicollinearity, Credit, Association, Predictor, Grade-point-average.

INTRODUCTION
To progress from one level (year) of an academic programme to the next, and to meet the requirements for graduation, a student must achieve satisfactory performance in Course Work (CW) and in Semester Examinations. Assessment is a continuous process throughout the four-semester duration of all courses taken in the first and second years of the National Diploma and Higher National Diploma programmes of Polytechnics.
The overall performance of each student is determined by means of the Grade Point Average (GPA). Value Points are obtained by multiplying each course's Credit Unit by the numerical value of the grade obtained; the Grade Point Average is then the total number of Value Points divided by the total number of Credit Units. The Grade Point Average made by a student over two or more semesters is referred to as the student's Cumulative Grade Point Average; it is a weighted average (Auchi Polytechnic, 2010).
The final Cumulative Grade Point Average (CGPA) measures the academic achievement leading to the class of award. The classes of Academic Award and the associated CGPA ranges are shown below:

Class of Award    CGPA Range
Distinction       3.50 - 4.00
Upper Credit      3.00 - 3.49
Lower Credit      2.50 - 2.99
Pass              2.00 - 2.49
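The grading arithmetic described above can be sketched in a few lines of Python. The courses, credit units, and grade values below are made-up illustrations, not actual student records, and the handling of a CGPA below 2.00 is an assumption the paper does not address.

```python
# Each course: (credit units, numerical grade value); illustrative data only.
courses = [(3, 4.0), (2, 3.5), (4, 3.0), (3, 2.5)]

value_points = sum(units * grade for units, grade in courses)  # unit x grade value
credit_units = sum(units for units, _ in courses)
gpa = value_points / credit_units   # total Value Points / total Credit Units

def award_class(cgpa):
    """Map a final CGPA to the class of award in the table above
    ('Fail' below 2.00 is an assumption, not stated in the paper)."""
    if cgpa >= 3.50: return "Distinction"
    if cgpa >= 3.00: return "Upper Credit"
    if cgpa >= 2.50: return "Lower Credit"
    if cgpa >= 2.00: return "Pass"
    return "Fail"

print(round(gpa, 2), award_class(gpa))
```

A CGPA over several semesters is the same ratio taken over all courses to date, which is why it is a weighted (by credit units) average.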


The student academic achievement of interest is the Cumulative Grade Point Average at the fourth (4th) semester, which determines the class of National Diploma award. What is the level of association among the four predictor variables? Is there multicollinearity among the four predictors?
The existence of substantial correlation among a set of independent variables creates difficulties in computational accuracy and sampling stability, and can mislead the substantive interpretation of partial coefficients (Cohen, 1975).
The variables of interest are:
Dependent variable: Cumulative Grade Point Average (CGPA), denoted Y
The predictor variables:
Previous Total Credit Units (PTCU), denoted X1
Previous Total Credit Points (PTCP), denoted X2
Current Total Credit Units (CTCU), denoted X3
Current Total Credit Points (CTCP), denoted X4
This paper examines the level of correlation among the variables of interest and applies Extra Sums of Squares to detect the presence of multicollinearity among the predictor variables.

METHODOLOGY
Secondary data were obtained from the published 2009/2010 academic session results of graduating students of the National Diploma (ND) programme in Statistics of Auchi Polytechnic. A sample of 50, randomly selected from a list of 120 graduates who received the various classes of academic award of National Diploma (ND) in Statistics, was analyzed.
The basic idea of the Extra Sum of Squares (ESS) is that it measures the marginal reduction in the error sum of squares when one or several predictor variables are added to the regression model, given that other predictor variables are already in the model. Equivalently, ESS measures the marginal increase in the regression sum of squares when one or several predictor variables are added to the model.
An extra sum of squares can therefore be viewed either as a reduction in the error sum of squares or as an increase in the regression sum of squares when an additional predictor variable is added to the regression model.
Extra sums of squares are of interest because they occur in a variety of tests about regression coefficients where the question of concern is whether certain X variables can be dropped from the regression model.
SSR(X1) = SSTO - SSE(X1) ... (i)
If X2 is the extra variable, then the extra sum of squares is obtained as:
SSR(X2/X1) = SSE(X1) - SSE(X1, X2) ... (ii)
SSR(X3/X1, X2) = SSE(X1, X2) - SSE(X1, X2, X3) ... (iii)
SSR(X4/X1, X2, X3) = SSE(X1, X2, X3) - SSE(X1, X2, X3, X4) ... (iv)
The regression sum of squares SSR(X1, X2) decomposes into two marginal components:
i) SSR(X1), measuring the contribution of including X1 alone in the model;
ii) SSR(X2/X1), measuring the additional contribution when X2 is included, given that X1 is already in the model (Neter et al., 1996).
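The decomposition in equations (i)-(iv) can be checked numerically. The sketch below uses NumPy on synthetic data (the paper's student records are not reproduced here; variable names and coefficients are illustrative assumptions): each nested model is fitted by ordinary least squares, and the marginal components plus the final error sum of squares telescope back to the total sum of squares.

```python
import numpy as np

def sse(y, *xs):
    """Error sum of squares from an OLS fit of y on an intercept plus the given predictors."""
    X = np.column_stack([np.ones_like(y)] + list(xs))
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    resid = y - X @ beta
    return float(resid @ resid)

# Synthetic stand-ins for the paper's variables (illustrative only).
rng = np.random.default_rng(0)
n = 50
x1 = rng.normal(size=n)
x2 = 0.8 * x1 + 0.2 * rng.normal(size=n)   # deliberately correlated with x1
x3 = rng.normal(size=n)
y = 1.0 + 2.0 * x1 + 1.5 * x2 + 0.5 * x3 + rng.normal(scale=0.3, size=n)

ssto = float(((y - y.mean()) ** 2).sum())              # total sum of squares
ssr_x1 = ssto - sse(y, x1)                             # eq. (i)
ssr_x2_g_x1 = sse(y, x1) - sse(y, x1, x2)              # eq. (ii)
ssr_x3_g_x1x2 = sse(y, x1, x2) - sse(y, x1, x2, x3)    # eq. (iii)

# Marginal components + final SSE recover SSTO (telescoping sum).
total = ssr_x1 + ssr_x2_g_x1 + ssr_x3_g_x1x2 + sse(y, x1, x2, x3)
print(abs(total - ssto) < 1e-8)  # True
```

Each extra sum of squares is nonnegative, since adding a predictor can never increase the error sum of squares.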


ANOVA Table with Decomposition of SSR

Source of Variation   Extra Sum of Squares    Degrees of Freedom   Mean Square
X1                    SSR(X1)                 1                    MSR(X1)
X2/X1                 SSR(X2/X1)              1                    MSR(X2/X1)
X3/X1, X2             SSR(X3/X1, X2)          1                    MSR(X3/X1, X2)
X4/X1, X2, X3         SSR(X4/X1, X2, X3)      1                    MSR(X4/X1, X2, X3)
Error                 SSE(X1, X2, X3, X4)     n - 5                MSE(X1, X2, X3, X4)
Total                 SSTO                    n - 1

When predictor variables are correlated, the marginal contribution of any one predictor variable in reducing the error sum of squares varies, depending on which other variables are already in the regression model. In other words, there is no unique sum of squares that can be ascribed to any one predictor variable as reflecting its effect in reducing the total variation in the response variable Y.
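This order-dependence is easy to demonstrate numerically. In the sketch below (synthetic data; the variable names and coefficients are illustrative assumptions), X2 is nearly a duplicate of X1: entered first, X2 earns a large sum of squares; entered after X1, its extra sum of squares is close to zero.

```python
import numpy as np

def sse(y, *xs):
    """Error sum of squares from an OLS fit of y on an intercept plus the given predictors."""
    X = np.column_stack([np.ones_like(y)] + list(xs))
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    r = y - X @ beta
    return float(r @ r)

rng = np.random.default_rng(1)
n = 50
x1 = rng.normal(size=n)
x2 = 0.9 * x1 + 0.1 * rng.normal(size=n)   # x2 nearly duplicates x1
y = 3.0 * x1 + rng.normal(scale=0.5, size=n)

ssto = float(((y - y.mean()) ** 2).sum())
ssr_x2_first = ssto - sse(y, x2)                # X2 entered first: large
ssr_x2_after_x1 = sse(y, x1) - sse(y, x1, x2)   # X2 entered after X1: near zero
print(ssr_x2_first, ssr_x2_after_x1)
```

With uncorrelated predictors the two quantities would coincide; their divergence here is exactly the symptom the ESS diagnostic exploits.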

RESULTS AND DISCUSSION

Table 1: Correlation Matrix of Cumulative Grading System

       Y       X1      X2      X3      X4
Y      1      -0.52    0.97   -0.21    0.87
X1    -0.52    1      -0.32    0.15   -0.52
X2     0.97   -0.32    1      -0.20    0.78
X3    -0.21    0.15   -0.20    1       0.07
X4     0.87   -0.52    0.78    0.07    1

The correlation matrix in Table 1 indicates some levels of correlation. Previous Total Credit Units (X1) and Current Total Credit Units (X3) are negatively correlated with Cumulative Grade Point Average (Y: -0.52 and -0.21 respectively). Previous Total Credit Points (X2) and Current Total Credit Points (X4) have a high level of positive correlation with Cumulative Grade Point Average (Y: 0.97 and 0.87 respectively). As Table 1 further shows, the predictor variables themselves exhibit some level of association.
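A matrix of the form of Table 1 can be produced with NumPy's `corrcoef`. The five columns below are hypothetical stand-ins for Y, X1, X2, X3 and X4, since the paper's raw student records are not reproduced.

```python
import numpy as np

# Hypothetical records standing in for (Y, X1, X2, X3, X4); illustrative only.
data = np.array([
    [3.1, 40, 120, 42, 130],
    [2.4, 44,  98, 40,  95],
    [3.6, 38, 135, 41, 140],
    [2.1, 46,  90, 43,  88],
    [2.9, 41, 112, 39, 118],
])
R = np.corrcoef(data, rowvar=False)  # columns as variables -> 5x5 matrix
print(np.round(R, 2))
```

The result is symmetric with a unit diagonal, matching the layout of Table 1.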

Table 2: Regression Results for four fitted models of Cumulative Grading System
We considered the marginal effect of adding several variables.

a. Regression of Y on X1
Y = -0.002 + 0.019X1

Source of Variation   Sum of Squares   Degrees of Freedom   Mean Square   F         P-value
Regression            8.453            1                    8.453         746.903   0.000
Residual              0.543            48                   0.011
Total                 8.996            49

Variable   Estimated Regression Coefficient   Estimated Standard Error   |t|-value
X1         b1 = 0.019                         0.001                      27.330


b. Regression of Y on X1 and X2
Y = 2.623 - 0.043X1 + 0.017X2

Source of Variation   Sum of Squares   Degrees of Freedom   Mean Square   F          P-value
Regression            8.881            2                    4.440         1815.654   0.000
Residual              0.115            47                   0.002
Total                 8.996            49

Variable   Estimated Regression Coefficient   Estimated Standard Error   |t|-value
X1         b1 = -0.043                        0.003                      13.233
X2         b2 = 0.017                         0.000                      51.029

c. Regression of Y on X1, X2 and X3
Y = 1.848 - 0.034X1 + 0.015X2 + 0.011X3

Source of Variation   Sum of Squares   Degrees of Freedom   Mean Square   F          P-value
Regression            8.961            3                    2.987         3947.247   0.000
Residual              0.035            46                   0.001
Total                 8.996            49

Variable   Estimated Regression Coefficient   Estimated Standard Error   |t|-value
X1         b1 = -0.034                        0.002                      16.639
X2         b2 = 0.015                         0.000                      52.703
X3         b3 = 0.011                         0.001                      10.291

d. Regression of Y on X1, X2, X3 and X4
Y = 2.161 - 0.030X1 + 0.014X2 - 0.031X3 + 0.014X4

Source of Variation   Sum of Squares   Degrees of Freedom   Mean Square   F           P-value
Regression            8.990            4                    2.247         16563.858   0.000
Residual              0.006            45                   0.000
Total                 8.996            49

Variable   Estimated Regression Coefficient   Estimated Standard Error   |t|-value
X1         b1 = -0.030                        0.001                      33.400
X2         b2 = 0.014                         0.000                      97.477
X3         b3 = -0.031                        0.002                      14.545
X4         b4 = 0.014                         0.001                      28.280

Table 3: Extra Sum of Squares Diagnostics of Multicollinearity in Cumulative Grading System

Source of Variation   Extra Sum of Squares   Degrees of Freedom   Mean Square
X1                    8.453                  1                    8.453
X2/X1                 0.428                  1                    0.428
X3/X1, X2             0.080                  1                    0.080
X4/X1, X2, X3         0.029                  1                    0.029
Error                 0.006                  45                   1.333 x 10^-4
Total                 8.996                  49

Table 3 shows the marginal increase in the regression sum of squares as additional predictor variables are included in the regression model. The observed Extra Sums of Squares associated with the predictor variables vary depending on the order of entry, indicating the existence of multicollinearity. The marginal contributions of the later predictors are small compared with the full regression sum of squares (8.990) because the predictors are correlated with each other and with the response variable.
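The entries of Table 3 feed directly into the partial F-tests mentioned earlier. As a sketch, the statistic for testing whether X4 can be dropped from the full model, computed from Table 3's values:

```python
# Partial F statistic for H0: X4 can be dropped, using the entries of Table 3.
ssr_x4_given_rest = 0.029   # SSR(X4 / X1, X2, X3), 1 degree of freedom
sse_full = 0.006            # SSE(X1, X2, X3, X4)
df_error = 45

mse_full = sse_full / df_error                 # mean square error of full model
f_star = (ssr_x4_given_rest / 1) / mse_full
print(f_star)  # far exceeds conventional critical values of F(1, 45)
```

So even though X4's extra sum of squares looks small next to the full SSR, it is large relative to the residual mean square, and X4 is retained.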

CONCLUSION
The final cumulative grade point average exhibits a high level of correlation with the previous and current total credit units, and also with the total credit points. The predictor variables - Previous Total Credit Units (PTCU), Previous Total Credit Points (PTCP), Current Total Credit Units (CTCU) and Current Total Credit Points (CTCP) - exhibit some level of correlation; thus, there is an indication of multicollinearity.

REFERENCES
Auchi Polytechnic (2010). Students' Handbook of Information. The Information & Public Relations Unit, Office of the Rector, Auchi Polytechnic, Auchi.
Cohen, J. (1975). Applied Multiple Regression/Correlation Analysis for the Behavioral Sciences. John Wiley & Sons, New York.
Leabo, D. A. (1976). Basic Statistics. Richard D. Irwin, Inc., Irwin-Dorsey Limited, Georgetown, Ontario.
Neter, J., Kutner, M. H., Nachtsheim, C. J., & Wasserman, W. (1996). Applied Linear Statistical Models (4th ed.). The McGraw-Hill Companies, Inc., USA.
