Three Variable Regression Model

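Population regression function (PRF), nonstochastic form: $E(Y_i) = B_1 + B_2 X_{2i} + B_3 X_{3i}$
PRF, stochastic form: $Y_i = B_1 + B_2 X_{2i} + B_3 X_{3i} + u_i$
$B_2$ and $B_3$ are called partial regression (or partial slope) coefficients.
$B_2$ measures the change in the mean value of $Y$ per unit change in $X_2$, holding the value of $X_3$ constant.
Sample regression function (SRF): $Y_i = b_1 + b_2 X_{2i} + b_3 X_{3i} + e_i$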

Assumptions
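The relationship between Y and the Xs is linear.
The Xs are nonstochastic variables.
No exact linear relationship exists between two or more independent variables (no multicollinearity). Example of a violation: $X_{2i} = 3 + 2X_{3i}$.
The error term has zero expected value and constant variance, and is normally distributed.

Least squares estimators

The estimators $b_1$, $b_2$ and $b_3$ are chosen to minimize the residual sum of squares:
$\text{RSS} = \sum e_i^2 = \sum (Y_i - \hat{Y}_i)^2 = \sum (Y_i - b_1 - b_2 X_{2i} - b_3 X_{3i})^2$
As in the two-variable case, we can derive formulae for var($b_1$), var($b_2$) and var($b_3$), and hence their standard errors.
We can also estimate $\sigma^2$ as $\hat{\sigma}^2 = \sum e_i^2 / (n - 3)$.
Goodness of fit: $R^2 = \text{ESS}/\text{TSS}$
$R^2 = (b_2 \sum y_i x_{2i} + b_3 \sum y_i x_{3i}) / \sum y_i^2$, with $0 \le R^2 \le 1$, where lowercase letters denote deviations from sample means.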
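The slide does not write out the estimator formulas themselves. As a minimal sketch, assuming the standard textbook deviation-form expressions for the three-variable model, the estimates, $R^2$ and $\hat{\sigma}^2$ can be computed as follows. The function name `ols_three_variable` and the toy data are illustrative and not taken from the slides.

```python
def ols_three_variable(y, x2, x3):
    """OLS for Y = b1 + b2*X2 + b3*X3 using deviation-form formulas."""
    n = len(y)
    my, m2, m3 = sum(y) / n, sum(x2) / n, sum(x3) / n
    # Deviations from sample means (the lowercase y, x2, x3 of the slides).
    dy = [v - my for v in y]
    d2 = [v - m2 for v in x2]
    d3 = [v - m3 for v in x3]
    s_y2 = sum(a * b for a, b in zip(dy, d2))
    s_y3 = sum(a * b for a, b in zip(dy, d3))
    s_22 = sum(a * a for a in d2)
    s_33 = sum(a * a for a in d3)
    s_23 = sum(a * b for a, b in zip(d2, d3))
    s_yy = sum(a * a for a in dy)
    det = s_22 * s_33 - s_23 ** 2
    if det == 0:
        # Crude check only: exact collinearity makes the formulas undefined.
        raise ValueError("X2 and X3 are perfectly collinear")
    b2 = (s_y2 * s_33 - s_y3 * s_23) / det
    b3 = (s_y3 * s_22 - s_y2 * s_23) / det
    b1 = my - b2 * m2 - b3 * m3
    r2 = (b2 * s_y2 + b3 * s_y3) / s_yy   # ESS / TSS
    rss = (1 - r2) * s_yy
    sigma2_hat = rss / (n - 3)            # n - 3 degrees of freedom
    return b1, b2, b3, r2, sigma2_hat


# Made-up data generated exactly as Y = 1 + 2*X2 + 3*X3 (no error term).
x2 = [1, 2, 3, 4, 5, 6]
x3 = [2, 1, 4, 3, 6, 5]
y = [1 + 2 * a + 3 * b for a, b in zip(x2, x3)]
b1, b2, b3, r2, sigma2_hat = ols_three_variable(y, x2, x3)
print(b1, b2, b3, r2)
```

Because the toy data contain no error term, the sketch should recover the generating coefficients with $R^2$ essentially equal to 1; with real data, $\hat{\sigma}^2$ feeds into the standard errors mentioned above.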

Testing of hypothesis, t-test


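Say the estimated regression is
$\hat{Y}_i = -1336.09 + 12.7413 X_{2i} + 85.7640 X_{3i}$
se = (175.2725) (0.9123) (8.8019)
p = (0.000) (0.000) (0.000)
$R^2 = 0.89$, $n = 32$
$H_0: B_1 = 0$, test statistic $b_1/\text{se}(b_1) \sim t(n-3)$
$H_0: B_2 = 0$, test statistic $b_2/\text{se}(b_2) \sim t(n-3)$
$H_0: B_3 = \beta$ (any hypothesized value), test statistic $(b_3 - \beta)/\text{se}(b_3) \sim t(n-3)$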
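The t ratios implied by the reported coefficients and standard errors can be checked directly. The sketch below uses only the numbers given in the example; the 5% two-tailed critical value of 2.045 for 29 degrees of freedom is read from a standard t table, and the names `t_ratio` and `T_CRIT_5PCT` are illustrative.

```python
# Coefficients and standard errors as reported in the example above.
estimates = {
    "b1": (-1336.09, 175.2725),
    "b2": (12.7413, 0.9123),
    "b3": (85.7640, 8.8019),
}
n, k = 32, 3
df = n - k            # 29 degrees of freedom
T_CRIT_5PCT = 2.045   # two-tailed 5% critical value for 29 df (t table)


def t_ratio(b, se, hypothesized=0.0):
    """t statistic for H0: B = hypothesized."""
    return (b - hypothesized) / se


t_values = {name: t_ratio(b, se) for name, (b, se) in estimates.items()}
for name, t in t_values.items():
    verdict = "reject H0" if abs(t) > T_CRIT_5PCT else "do not reject H0"
    print(f"{name}: t = {t:.2f} ({verdict} at the 5% level)")
```

All three ratios are far larger in absolute value than the critical value, which agrees with the reported p-values of 0.000.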

Testing Joint Hypothesis, F Test


$H_0: B_2 = B_3 = 0$, or equivalently $H_0: R^2 = 0$ ($X_2$ and $X_3$ together explain zero percent of the variation in $Y$).
$H_1$: at least one $B \ne 0$.

A test of either hypothesis is called a test of the overall significance of the estimated multiple regression.
We know that TSS = ESS + RSS.

F test
If the computed F value exceeds the critical F value, we reject the null hypothesis that the impact of the explanatory variables is simultaneously equal to zero.
Otherwise, we cannot reject the null hypothesis.
It may happen that not all the explanatory variables individually have much impact on the dependent variable (i.e., some of the t values may be statistically insignificant), yet all of them collectively influence the dependent variable ($H_0$ is rejected in the F test).
This happens when the explanatory variables are highly collinear (the problem of multicollinearity).
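The F formula itself does not survive on the slide. As a sketch, assuming the standard form $F = \frac{\text{ESS}/(k-1)}{\text{RSS}/(n-k)} = \frac{R^2/(k-1)}{(1-R^2)/(n-k)}$ with $k$ counting the intercept, the statistic for the earlier example ($R^2 = 0.89$, $n = 32$, $k = 3$) can be computed as below. The function name `f_from_r2` is illustrative.

```python
def f_from_r2(r2, n, k):
    """F statistic for H0: all slope coefficients are zero.

    k is the number of estimated parameters, including the intercept.
    """
    return (r2 / (k - 1)) / ((1 - r2) / (n - k))


f_stat = f_from_r2(0.89, n=32, k=3)
print(f"F(2, 29) = {f_stat:.2f}")
```

The result, a little over 117, is far above the 5% critical value of F with (2, 29) degrees of freedom, which is roughly 3.3, so the null hypothesis of no overall relationship is rejected.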

Specification error
In this example we have seen that the coefficients of both explanatory variables are individually and jointly significantly different from zero.

If we omit either of these explanatory variables from the model, there will be a specification error.
What would $b_1$, $b_2$ and $R^2$ be in the two-variable model?

Specification error
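Full model:
$\hat{Y}_i = -1336.09 + 12.7413 X_{2i} + 85.7640 X_{3i}$
se = (175.2725) (0.9123) (8.8019)
p = (0.000) (0.000) (0.000)
$R^2 = 0.89$, $n = 32$

Omitting $X_3$:
$\hat{Y}_i = -191.66 + 10.48 X_{2i}$
se = (264.43) (1.79)
$R^2 = 0.53$

Omitting $X_2$:
$\hat{Y}_i = 807.95 + 54.57 X_{3i}$
se = (231.95) (23.57)
$R^2 = 0.15$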
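One way to read these three regressions together is through a standard OLS identity, which the slides do not state: the slope in the short regression equals the slope in the full regression plus the omitted variable's coefficient times the slope from regressing the omitted variable on the included one. The sketch below backs out those auxiliary slopes from the reported coefficients. The auxiliary slopes are implied values, not results reported in the source, and they inherit the rounding of the published numbers.

```python
# Coefficients reported in the three regressions above.
b2_full, b3_full = 12.7413, 85.7640   # full model
b2_short = 10.48                      # model with X3 omitted
b3_short = 54.57                      # model with X2 omitted

# Identity: short slope = full slope + (omitted coefficient) * (auxiliary slope)
implied_slope_x3_on_x2 = (b2_short - b2_full) / b3_full
implied_slope_x2_on_x3 = (b3_short - b3_full) / b2_full

print(f"implied slope of X3 on X2: {implied_slope_x3_on_x2:.4f}")
print(f"implied slope of X2 on X3: {implied_slope_x2_on_x3:.4f}")
```

Both implied slopes come out negative, which is consistent with each short-regression coefficient being smaller than its full-model counterpart: when a relevant variable is left out, the included variable's coefficient absorbs part of its effect.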

R2 versus Adjusted R2
The larger the number of explanatory variables in the model, the higher $R^2$ will be (adding a variable never lowers it).
However, $R^2$ does not take degrees of freedom (dof) into account.
Therefore, comparing the $R^2$ values of two models with the same dependent variable but different numbers of explanatory variables is essentially like comparing apples and bananas.
We need a measure of fit that is adjusted for the number of explanatory variables in the model.

R2 versus Adjusted R2
Such a measure is called the adjusted $R^2$ (Adj $R^2$, written $\bar{R}^2$):

$\bar{R}^2 = 1 - (1 - R^2)\,\dfrac{n - 1}{n - k}$

where $k$ is the number of estimated parameters, including the intercept.
If $k > 1$, Adj $R^2 \le R^2$; as the number of explanatory variables in the model increases, Adj $R^2$ becomes increasingly smaller than $R^2$.
It enables us to compare two models that have the same dependent variable but different numbers of independent variables.
In our example, it can be shown that Adj $R^2 = 0.88 < 0.89 = R^2$.
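The value quoted for the example can be reproduced from $R^2 = 0.89$, $n = 32$ and $k = 3$. This is a small sketch of that arithmetic; the function name `adjusted_r2` is illustrative.

```python
def adjusted_r2(r2, n, k):
    """Adjusted R2; k counts all estimated parameters, including the intercept."""
    return 1 - (1 - r2) * (n - 1) / (n - k)


adj = adjusted_r2(0.89, n=32, k=3)
print(f"Adj R2 = {adj:.2f}")
```

The penalty factor $(n-1)/(n-k) = 31/29$ is only slightly above 1 here, which is why the adjusted value stays close to $R^2$.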

When to add an additional variable?


We are often faced with the problem of deciding among several competing explanatory variables.
Common practice is to add variables as long as Adj $R^2$ increases, even though its numerical value may be smaller than $R^2$.
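As an illustration of this rule, the sketch below compares Adj $R^2$ for the two-variable model with $X_2$ only ($R^2 = 0.53$) and the three-variable model ($R^2 = 0.89$) from the specification-error example, both with $n = 32$. It assumes the usual formula with $k$ counting the intercept; the helper name `adjusted_r2` is illustrative.

```python
def adjusted_r2(r2, n, k):
    """Adjusted R2; k counts all estimated parameters, including the intercept."""
    return 1 - (1 - r2) * (n - 1) / (n - k)


n = 32
adj_x2_only = adjusted_r2(0.53, n, k=2)    # Y on X2
adj_x2_and_x3 = adjusted_r2(0.89, n, k=3)  # Y on X2 and X3

add_x3 = adj_x2_and_x3 > adj_x2_only
print(f"Adj R2 with X2 only: {adj_x2_only:.3f}")
print(f"Adj R2 with X2, X3:  {adj_x2_and_x3:.3f}")
print("Add X3" if add_x3 else "Do not add X3")
```

Adj $R^2$ rises when $X_3$ is added, so by this rule of thumb $X_3$ belongs in the model, in line with its significant t ratio.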

Computer output & Reporting

The Chicken Consumption Example


Explain US consumption of chicken.
Time series observations, 1950-1984.

Variable Definitions
CHCONS: chicken consumption in the US
LDY: log of disposable income in the US
PC/PB: price of chicken relative to the price of best red meat

Data Time plots


Actual plots of the data over time follow.
Note the trends and cycles.
What are the relationships between the variables?
Are movements in CHCONS related to movements in LDY and PC/PB?

[Figure: Time plot of CHCONS, actual data. Vertical axis: CHCONS (0.0 to 60.0); horizontal axis: Year (1950 to 1984).]

[Figure: Time plot of LDY, actual data. Vertical axis: LDY (0.0 to 10.0); horizontal axis: Year.]

[Figure: Time plot of PC/PB, actual data. Vertical axis: PC/PB (0.0 to 1.6); horizontal axis: Year (1950 to 1983).]

Chicken Consumption vs. Income


There may be a relationship between CHCONS and LDY.
A simple plot of the two variables seems to reveal this.
Note the positive relationship.

[Figure: Scatter plot of CHCONS (vertical axis, 0.0 to 60.0) against LDY (horizontal axis, 7.0 to 9.5).]

Chicken Consumption vs. Relative Price of Chicken


There may also be a relationship between CHCONS and PC/PB.
A plot of these two variables shows the relationship.
Note the negative relationship.

[Figure: Scatter plot of CHCONS (vertical axis, 0.0 to 60.0) against PC/PB (horizontal axis, 0.0 to 1.6).]

CHCONS = f(LDY)
Simple linear regression captures the relationship between CHCONS and LDY, assuming no other relationships.
This regression explains much of the change in CHCONS, but not everything.
The plotted regression line shows the hypothesized relationship and the actual data.

CHCONS = f(LDY)
Variable   Coeff    SE(b)
LDY        15.86    0.53
Const.    -92.17    4.34

R2 = 0.9641, SE(y) = 2.03, F = 879.05, df = 33
SSReg = 3639.12 (explained sum of squares; also called SSE or ESS)
SSResid = 136.61 (residual sum of squares; also called SSR or RSS)

[Figure: Regression line CHCONS = f(LDY) plotted with the actual data. Vertical axis: CHCONS (0.00 to 60.00); horizontal axis: LDY (7.0 to 9.5).]

CHCONS = f(PC/PB)
Another simple regression examines the relationship between CHCONS and PC/PB.
While the line explains some of the variation in CHCONS, more of the variation is left unexplained than in the LDY regression.

CHCONS = f(PC/PB)
Variable   Coeff    SE(b)
PC/PB     -28.83    2.93
Const.     50.77    1.75

R2 = 0.746, SE(y) = 5.39, F = 97.14, df = 33
SSReg = 2818.32 (also called ESS)
SSResid = 957.42 (also called RSS)

[Figure: Regression line CHCONS = f(PC/PB) plotted with the actual data. Vertical axis: CHCONS (0.00 to 60.00); horizontal axis: PC/PB (0.0 to 1.5).]

CHCONS = f(LDY,PC/PB)
Variable   Coeff    SE(b)
LDY        12.79    0.54
PC/PB      -8.08    1.12
Const.    -63.19    4.84

R2 = 0.986, SE(y) = 1.27, F = 1149.89, df = 32
SSReg = 3723.92 (SSE, i.e. ESS)
SSResid = 51.82 (SSR, i.e. RSS)
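The summary statistics in the three output tables are tied together by the sums of squares. As a rough check, assuming $n = 35$ annual observations (1950 to 1984) and the usual definitions $R^2 = \text{ESS}/\text{TSS}$, $F = [\text{ESS}/(k-1)]/[\text{RSS}/(n-k)]$ and $SE(y) = \sqrt{\text{RSS}/(n-k)}$, they can be recomputed from SSReg and SSResid. The names `summary` and `models` are illustrative.

```python
import math

N_OBS = 35  # annual observations, 1950-1984 (assumed from the stated sample period)

# (SSReg, SSResid, k) as reported for each model; k includes the intercept.
models = {
    "f(LDY)": (3639.12, 136.61, 2),
    "f(PC/PB)": (2818.32, 957.42, 2),
    "f(LDY, PC/PB)": (3723.92, 51.82, 3),
}


def summary(ess, rss, k, n=N_OBS):
    """Recompute R2, F, regression standard error and residual df."""
    df = n - k
    r2 = ess / (ess + rss)
    f = (ess / (k - 1)) / (rss / df)
    se = math.sqrt(rss / df)
    return r2, f, se, df


results = {name: summary(*vals) for name, vals in models.items()}
for name, (r2, f, se, df) in results.items():
    print(f"{name}: R2 = {r2:.3f}, F = {f:.2f}, SE(y) = {se:.2f}, df = {df}")
```

The recomputed values agree with the reported ones up to rounding in the published sums of squares, and they show the two-regressor model leaving the smallest residual sum of squares of the three.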

[Figure: Actual vs. predicted CHCONS over time. Series: Actual and CHCONS = f(LDY, PC/PB). Vertical axis: CHCONS (0.0 to 60.0); horizontal axis: Year.]

Table 7.8, Gujarati: US defense budget outlays, 1962-1981

Yt = defense budget outlays for year t (billions of dollars)
X2t = GNP for year t (billions of dollars)
X3t = US military sales/assistance (billions of dollars)
X4t = aerospace industry sales (billions of dollars)
X5t = military conflicts involving troops: 0 if troops < 100,000; 1 if troops > 100,000

Table 8.10, Gujarati


The table gives data used by a telephone cable manufacturer to predict sales to a major consumer for the period 1968-1983.
Y = annual sales in MPF (million paired feet)
X2 = GNP (billions of dollars)
X3 = housing starts (thousands of units)
X4 = unemployment rate (%)
X5 = prime rate lagged 6 months
X6 = customer line gains (%)
To be introduced later.

Table 7.10, Gujarati

Consider the following demand function for money in the US for 1980-1998:

$M_t = b_1 Y_t^{b_2} r_t^{b_3} e^{u_t}$

where

M = real money demand
Y = real GDP
r = interest rate
LTRATE: long-term interest rate (30-year Treasury bond)
TBRATE: 3-month Treasury bill rate
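The demand function is nonlinear in the variables, so it cannot be estimated directly with the linear three-variable model. A standard step, not shown on the slide, is to take natural logarithms of both sides, which gives a model that is linear in the parameters:

```latex
\ln M_t = \ln b_1 + b_2 \ln Y_t + b_3 \ln r_t + u_t
```

In this log-linear form the intercept is $\ln b_1$, and $b_2$ and $b_3$ can be read as the elasticities of real money demand with respect to real GDP and the interest rate, with either LTRATE or TBRATE used as the measure of $r$.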