
12/15/2014

EXCEL Multiple Regression

EXCEL 2007: Multiple Regression
A. Colin Cameron, Dept. of Economics, Univ. of Calif. - Davis
This January 2009 help sheet gives information on
Multiple regression using the Data Analysis Add-in.
Interpreting the regression statistics.
Interpreting the ANOVA table (often this is skipped).
Interpreting the regression coefficients table.
Confidence intervals for the slope parameters.
Testing for statistical significance of coefficients.
Testing hypothesis on a slope parameter.
Testing overall significance of the regressors.
Predicting y given values of regressors.
Excel limitations.
There is little extra to know beyond regression with one explanatory variable.
The main addition is the F-test for overall fit.
MULTIPLE REGRESSION USING THE DATA ANALYSIS ADD-IN
This requires the Data Analysis Add-in: see Excel 2007: Access and Activating the Data Analysis Add-in.
The data used are in carsdata.xls.
We then create a new variable in cells C2:C6, cubed household size, as a regressor.
Then in cell C1 give the heading CUBED HH SIZE.
(It turns out that for these data squared HH SIZE has a coefficient of exactly 0.0, so the cube is used.)
The spreadsheet cells A1:C6 then hold the data, with the regressors in columns B and C.

We have regression with an intercept and the regressors HH SIZE and CUBED HH SIZE.
The population regression model is: y = β1 + β2 x2 + β3 x3 + u
It is assumed that the error u is independent with constant variance (homoskedastic); see EXCEL LIMITATIONS at the bottom.
We wish to estimate the regression line: y = b1 + b2 x2 + b3 x3
We do this using the Data Analysis Add-in and Regression.
http://cameron.econ.ucdavis.edu/excel/ex61multipleregression.html


The only change over one-variable regression is to include more than one column in the Input X Range.
Note, however, that the regressors need to be in contiguous columns (here columns B and C).
If this is not the case in the original data, then columns need to be copied to get the regressors in contiguous columns.
Hitting OK we obtain the regression output.


The regression output has three components:
Regression statistics table
ANOVA table
Regression coefficients table.
INTERPRET REGRESSION STATISTICS TABLE
This is the following output. Of greatest interest is R Square.

                    Value      Explanation
Multiple R          0.895828   R = square root of R2
R Square            0.802508   R2
Adjusted R Square   0.605016   Adjusted R2, used if more than one x variable
Standard Error      0.444401   This is the sample estimate of the standard deviation of the error u
Observations        5          Number of observations used in the regression (n)
The above gives the overall goodness-of-fit measures:
R2 = 0.8025
Correlation between y and y-hat is 0.8958 (when squared gives 0.8025).
Adjusted R2 = R2 - (1-R2)*(k-1)/(n-k) = .8025 - .1975*2/2 = 0.6050.
The standard error here refers to the estimated standard deviation of the error term u.
It is sometimes called the standard error of the regression. It equals sqrt(SSE/(n-k)).
It is not to be confused with the standard error of y itself (from descriptive statistics) or with the standard errors of the regression coefficients given below.
R2 = 0.8025 means that 80.25% of the variation of yi around ybar (its mean) is explained by the regressors x2i and x3i.
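The adjusted R2 and standard-error formulas can be checked by hand. The following is a minimal sketch in Python (not part of the Excel workflow); the variable names are illustrative, and the residual sum of squares 0.39498 is the value Excel reports for this regression.

```python
import math

# Values copied from the Excel output for this regression.
r2 = 0.802508          # R Square
n = 5                  # number of observations
k = 3                  # number of regressors, including the intercept
residual_ss = 0.39498  # residual (error) sum of squares, SSE

# Adjusted R2 = R2 - (1 - R2) * (k - 1) / (n - k)
adj_r2 = r2 - (1 - r2) * (k - 1) / (n - k)

# Standard error of the regression = sqrt(SSE / (n - k))
se_regression = math.sqrt(residual_ss / (n - k))

print(f"Adjusted R2    = {adj_r2:.4f}")
print(f"Standard error = {se_regression:.4f}")
```

Both results should agree with the Adjusted R Square and Standard Error rows of the regression statistics table up to rounding.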

INTERPRET ANOVA TABLE
An ANOVA table is given. This is often skipped.

             df   SS       MS       F        Significance F
Regression   2    1.6050   0.8025   4.0635   0.1975
Residual     2    0.3950   0.1975
Total        4    2.0

The ANOVA (analysis of variance) table splits the sum of squares into its components.
Total sum of squares
= Residual (or error) sum of squares + Regression (or explained) sum of squares.
Thus Σi (yi - ybar)^2 = Σi (yi - yhati)^2 + Σi (yhati - ybar)^2
where yhati is the value of yi predicted from the regression line
and ybar is the sample mean of y.
For example:
R2 = 1 - Residual SS / Total SS (general formula for R2)
= 1 - 0.3950/2.0 (from data in the ANOVA table)
= 0.8025 (which equals R2 given in the regression statistics table).
The column labeled F gives the overall F-test of H0: β2 = 0 and β3 = 0 versus Ha: at least one of β2 and β3 does not equal zero.
Aside: Excel computes F as:
F = [Regression SS/(k-1)] / [Residual SS/(n-k)] = [1.6050/2] / [.39498/2] = 4.0635.
The column labeled Significance F has the associated P-value.
Since 0.1975 > 0.05, we do not reject H0 at significance level 0.05.
Note: Significance F in general = FDIST(F, k-1, n-k) where k is the number of regressors including the intercept.
Here FDIST(4.0635, 2, 2) = 0.1975.
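As a cross-check on the ANOVA arithmetic, here is a small Python sketch that recomputes R2, F and the right-tail probability. The helper f_upper_tail_num_df2 is a hypothetical name; it uses a closed form that is valid only when the numerator degrees of freedom equal 2, as they do here, so it is not a general replacement for Excel's F-distribution functions.

```python
# Sums of squares copied from the ANOVA table.
regression_ss = 1.6050
residual_ss = 0.39498
total_ss = regression_ss + residual_ss   # about 2.0
n, k = 5, 3                              # observations; regressors incl. intercept

r2 = 1 - residual_ss / total_ss
f_stat = (regression_ss / (k - 1)) / (residual_ss / (n - k))

def f_upper_tail_num_df2(f, den_df):
    """P(F > f) when the numerator df is exactly 2 (closed form).

    For F(2, m) the upper tail is (1 + 2*f/m) ** (-m/2).
    Assumption: this shortcut applies only because k - 1 = 2 here.
    """
    return (1 + 2 * f / den_df) ** (-den_df / 2)

p_value = f_upper_tail_num_df2(f_stat, n - k)

print(f"R2 = {r2:.4f}, F = {f_stat:.4f}, Significance F = {p_value:.4f}")
```

The printed values should match the R Square, F and Significance F entries of the Excel output up to rounding.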
INTERPRET REGRESSION COEFFICIENTS TABLE
The regression output of most interest is the following table of coefficients and associated output:

                Coefficient   St. error   t Stat   P-value   Lower 95%   Upper 95%
Intercept       0.89655       0.76440     1.1729   0.3616    -2.3924     4.1855
HH SIZE         0.33647       0.42270     0.7960   0.5095    -1.4823     2.1552
CUBED HH SIZE   0.00209       0.01311     0.1594   0.8880    -0.0543     0.0585


Let βj denote the population coefficient of the jth regressor (intercept, HH SIZE and CUBED HH SIZE).
Then
Column "Coefficient" gives the least squares estimates of βj.
Column "Standard error" gives the standard errors (i.e. the estimated standard deviation) of the least squares estimates bj of βj.
Column "t Stat" gives the computed t-statistic for H0: βj = 0 against Ha: βj ≠ 0.
This is the coefficient divided by the standard error. It is compared to a t with (n-k) degrees of freedom where here n = 5 and k = 3.
Column "P-value" gives the p-value for the test of H0: βj = 0 against Ha: βj ≠ 0.
This equals Pr{|t| > |t Stat|} where t is a t-distributed random variable with n-k degrees of freedom and t Stat is the computed value of the t-statistic given in the previous column.
Note that this p-value is for a two-sided test. For a one-sided test divide this p-value by 2 (also checking the sign of the t Stat).
Columns "Lower 95%" and "Upper 95%" values define a 95% confidence interval for βj.
A simple summary of the above output is that the fitted line is
y = 0.8966 + 0.3365*x + 0.0021*z
where x is HH SIZE and z is CUBED HH SIZE.
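To see how the t Stat and P-value columns follow from the first two columns, the sketch below recomputes them in Python. The function two_sided_p_df2 is a hypothetical helper that relies on the closed-form t distribution for exactly 2 degrees of freedom (n - k = 5 - 3), so it would not work for other sample sizes.

```python
import math

# (coefficient, standard error) pairs copied from the coefficients table.
COEFS = {
    "Intercept": (0.89655, 0.76440),
    "HH SIZE": (0.33647, 0.42270),
    "CUBED HH SIZE": (0.00209, 0.01311),
}

def two_sided_p_df2(t_stat):
    """Pr{|t| > |t_stat|} for a t distribution with exactly 2 df.

    Uses the closed-form CDF F(t) = 1/2 + t / (2*sqrt(2 + t^2)),
    which holds only for 2 degrees of freedom.
    """
    return 1 - abs(t_stat) / math.sqrt(2 + t_stat ** 2)

results = {}
for name, (b, se) in COEFS.items():
    t_stat = b / se                      # coefficient divided by standard error
    results[name] = (t_stat, two_sided_p_df2(t_stat))
    print(f"{name:14s} t = {results[name][0]:.4f}  p = {results[name][1]:.4f}")
```

Each printed pair should reproduce the corresponding t Stat and P-value entries up to rounding.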

CONFIDENCE INTERVALS FOR SLOPE COEFFICIENTS
95% confidence interval for slope coefficient β2 is from Excel output (-1.4823, 2.1552).
Excel computes this as
b2 ± t_.025(2) * se(b2)
= 0.33647 ± TINV(0.05, 2) * 0.42270
= 0.33647 ± 4.303 * 0.42270
= 0.33647 ± 1.8189
= (-1.4823, 2.1552).
Other confidence intervals can be obtained.
For example, to find 99% confidence intervals: in the Regression dialog box (in the Data Analysis Add-in), check the Confidence Level box and set the level to 99%.
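The interval arithmetic can be reproduced outside Excel. This Python sketch is illustrative only: t_crit_df2 and conf_int are hypothetical helper names, and the critical value comes from inverting the closed-form t CDF for 2 degrees of freedom, which plays the role of TINV(alpha, 2) for this data set and no other.

```python
import math

def t_crit_df2(alpha):
    """Two-sided critical value c with Pr{|t| > c} = alpha, for 2 df only.

    For 2 df, Pr{|t| <= c} = c / sqrt(2 + c^2); solving for c gives the
    expression below. Assumption: n - k = 2, as in this regression.
    """
    q = 1 - alpha
    return math.sqrt(2 * q * q / (1 - q * q))

def conf_int(b, se, alpha):
    """Confidence interval b ± t_crit * se at level 1 - alpha."""
    half_width = t_crit_df2(alpha) * se
    return b - half_width, b + half_width

b2, se_b2 = 0.33647, 0.42270          # HH SIZE row of the coefficients table

lower95, upper95 = conf_int(b2, se_b2, 0.05)
lower99, upper99 = conf_int(b2, se_b2, 0.01)

print(f"95% CI: ({lower95:.4f}, {upper95:.4f})")
print(f"99% CI: ({lower99:.4f}, {upper99:.4f})")
```

The 95% interval should match Excel's Lower 95% and Upper 95% columns; the 99% interval is wider because the critical value rises from about 4.303 to about 9.925.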
TEST HYPOTHESIS OF ZERO SLOPE COEFFICIENT ("TEST OF STATISTICAL SIGNIFICANCE")
The coefficient of HH SIZE has estimated standard error of 0.4227, t-statistic of 0.7960 and p-value of 0.5095.
It is therefore statistically insignificant at significance level α = .05 as p > 0.05.
The coefficient of CUBED HH SIZE has estimated standard error of 0.0131, t-statistic of 0.1594 and p-value of 0.8880.
It is therefore statistically insignificant at significance level α = .05 as p > 0.05.
There are 5 observations and 3 regressors (intercept, HH SIZE and CUBED HH SIZE) so we use t(5-3) = t(2).
For example, for HH SIZE p = TDIST(0.796, 2, 2) = 0.5095.


TEST HYPOTHESIS ON A REGRESSION PARAMETER
Here we test whether HH SIZE has coefficient β2 = 1.0.
Example: H0: β2 = 1.0 against Ha: β2 ≠ 1.0 at significance level α = .05.
Then
t = (b2 - H0 value of β2) / (standard error of b2)
= (0.33647 - 1.0) / 0.42270
= -1.569.
Using the p-value approach
p-value = TDIST(1.569, 2, 2) = 0.257. [Here n = 5 and k = 3 so n-k = 2.]
Do not reject the null hypothesis at level .05 since the p-value is > 0.05.
Using the critical value approach
We computed t = -1.569.
The critical value is t_.025(2) = TINV(0.05, 2) = 4.303. [Here n = 5 and k = 3 so n-k = 2.]
So do not reject the null hypothesis at level .05 since |t| = 1.569 < 4.303.
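Both approaches can be sketched in a few lines of Python. This is an illustration under the same assumption as the Excel calculation (n - k = 2 degrees of freedom); the closed-form expressions stand in for TDIST and TINV and are valid only for 2 df. With the rounded inputs the t-statistic comes out near -1.57, in line with the value reported in the help sheet.

```python
import math

b2 = 0.33647         # estimated HH SIZE coefficient
se_b2 = 0.42270      # its standard error
beta2_null = 1.0     # value of the coefficient under H0
alpha = 0.05

t_stat = (b2 - beta2_null) / se_b2

# p-value approach: two-sided tail probability of a t with 2 df
# (closed form; the absolute value mirrors passing |t| to TDIST).
p_value = 1 - abs(t_stat) / math.sqrt(2 + t_stat ** 2)

# Critical value approach: c such that Pr{|t| > c} = alpha for 2 df.
q = 1 - alpha
critical = math.sqrt(2 * q * q / (1 - q * q))

reject_by_p = p_value < alpha
reject_by_critical = abs(t_stat) > critical

print(f"t = {t_stat:.3f}, p-value = {p_value:.3f}, critical value = {critical:.3f}")
print("Reject H0" if reject_by_p else "Do not reject H0")
```

The two decision rules always agree, since they are two views of the same comparison.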
OVERALL TEST OF SIGNIFICANCE OF THE REGRESSION PARAMETERS
We test H0: β2 = 0 and β3 = 0 versus Ha: at least one of β2 and β3 does not equal zero.
From the ANOVA table the F-test statistic is 4.0635 with p-value of 0.1975.
Since the p-value is not less than 0.05 we do not reject the null hypothesis that the regression parameters are zero at significance level 0.05.
Conclude that the parameters are jointly statistically insignificant at significance level 0.05.
Note: Significance F in general = FDIST(F, k-1, n-k) where k is the number of regressors including the intercept.
Here FDIST(4.0635, 2, 2) = 0.1975.

PREDICTED VALUE OF Y GIVEN REGRESSORS
Consider the case where x = 4, in which case CUBED HH SIZE = x^3 = 4^3 = 64.
yhat = b1 + b2 x2 + b3 x3 = 0.8966 + 0.3365*4 + 0.0021*64 = 2.3770
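A prediction like this is easy to wrap in a small function. The sketch below is in Python rather than Excel; predict_y is a hypothetical name, and its defaults are the rounded coefficients of the fitted line y = 0.8966 + 0.3365*x + 0.0021*z, so results can differ slightly from a calculation that uses Excel's unrounded estimates.

```python
def predict_y(hh_size, b1=0.8966, b2=0.3365, b3=0.0021):
    """Predicted y for a given household size.

    The cubed regressor is built from hh_size, mirroring the
    CUBED HH SIZE column. Default coefficients are the rounded
    estimates of the fitted line (an assumption of this sketch).
    """
    cubed_hh_size = hh_size ** 3
    return b1 + b2 * hh_size + b3 * cubed_hh_size

y_hat = predict_y(4)
print(f"Predicted y at HH SIZE = 4: {y_hat:.4f}")
```

Calling predict_y with other household sizes gives the corresponding fitted values, though predictions far outside the observed data should be treated with caution.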

EXCEL LIMITATIONS
Excel restricts the number of regressors (only up to 16 regressors??).
Excel requires that all the regressor variables be in adjoining columns.
You may need to move columns to ensure this.
e.g. If the regressors are in columns B and D you need to copy at least one of columns B and D so that they are adjacent to each other.

Excel standard errors, t-statistics and p-values are based on the assumption that the error is independent with constant variance (homoskedastic).
Excel does not provide alternatives, such as heteroskedastic-robust or autocorrelation-robust standard errors, t-statistics and p-values.
More specialized software such as STATA, EVIEWS, SAS, LIMDEP, PC-TSP, ... is needed.
For further information on how to use Excel go to
http://cameron.econ.ucdavis.edu/excel/excel.html
