USING REGRESSION ANALYSIS
© A. MITRA
DETERMINISTIC AND PROBABILISTIC MODELS
Deterministic
$Y = f(X_1, X_2, \ldots, X_{p-1})$
Probabilistic
$Y = f(X_1, X_2, \ldots, X_{p-1}) + \varepsilon$
LINEAR MODEL
$f(X) = \beta_0 + \beta_1 X$
$E(Y) = \beta_0 + \beta_1 X$
$\hat{Y} = \hat{\beta}_0 + \hat{\beta}_1 X$
LOGARITHMIC RELATIONSHIP
$E(Y) = \beta_0 \log(\beta_1 X)$
[Figure panel (b): Logarithmic relationship]
INVERSE RELATIONSHIP
$E(Y) = \beta_0 + \beta_1 (1/X)$
QUADRATIC RELATIONSHIP
$E(Y) = \beta_0 + \beta_1 X + \beta_2 X^2$
MODEL ASSUMPTIONS
Assumption 1 The mean of the probability distribution of the error component is 0, i.e., $E(\varepsilon) = 0$, for each and every level of the independent variable.
Assumption 2 The variance of the probability distribution of the error component is constant for all levels of the independent variable. The constant variance is denoted by $\sigma^2$; this is often referred to as the homoscedasticity assumption.
Assumption 3 The probability distribution of the error component is normal for each level of the independent variable.
Assumption 4 The error components for any two different observations are independent of each other.
FIGURE 13-2 Assumptions for a regression model (horizontal axis: X, quantity ordered)
MULTIPLE REGRESSION MODEL
$Y = \beta_0 + \beta_1 X_1 + \beta_2 X_2 + \cdots + \beta_{p-1} X_{p-1} + \varepsilon$
LEAST SQUARES METHOD
Residuals:
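$e_i = Y_i - \hat{Y}_i, \quad i = 1, 2, \ldots, n$
The least squares method finds the estimated model coefficients $\hat{\beta}_0, \hat{\beta}_1, \ldots, \hat{\beta}_{p-1}$ such that the sum of squared residuals, $SSE = \sum_{i=1}^{n} e_i^2$, is minimized.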
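The minimization described above can be sketched in plain Python. This is only an illustration under stated assumptions: the data are made up, the helper names (`solve_linear_system`, `fit_least_squares`, `sse`) are hypothetical, and the coefficients are obtained by solving the normal equations $(X'X)\hat{\beta} = X'Y$, which is one common way to compute least squares estimates but is not stated on the slide.

```python
def solve_linear_system(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [list(row) + [rhs] for row, rhs in zip(a, b)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):  # back substitution
        known = sum(m[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (m[r][n] - known) / m[r][r]
    return x


def fit_least_squares(predictors, y):
    """Return [b0, b1, ..., b_{p-1}] from the normal equations (X'X) b = X'y."""
    design = [[1.0] + list(row) for row in predictors]  # leading 1 for the intercept
    p = len(design[0])
    xtx = [[sum(d[i] * d[j] for d in design) for j in range(p)] for i in range(p)]
    xty = [sum(d[i] * yi for d, yi in zip(design, y)) for i in range(p)]
    return solve_linear_system(xtx, xty)


def sse(predictors, y, beta):
    """Sum of squared residuals e_i = Y_i - Yhat_i for the coefficients beta."""
    total = 0.0
    for row, yi in zip(predictors, y):
        y_hat = beta[0] + sum(b * x for b, x in zip(beta[1:], row))
        total += (yi - y_hat) ** 2
    return total


# Made-up data: two predictors, five observations.
predictors = [(1, 2), (2, 1), (3, 4), (4, 3), (5, 6)]
y = [4.1, 5.4, 9.2, 10.4, 14.0]

beta_hat = fit_least_squares(predictors, y)
perturbed = [b + 0.1 for b in beta_hat]  # any other coefficients give a larger SSE
print("estimates:", [round(b, 3) for b in beta_hat])
print("SSE at estimates:", round(sse(predictors, y, beta_hat), 4))
print("SSE after perturbing:", round(sse(predictors, y, perturbed), 4))
```

Comparing the two printed SSE values shows the defining property: moving the coefficients away from the least squares estimates increases SSE.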
FIGURE 13-3 Method of least squares
SIMPLE LINEAR REGRESSION
$\hat{\beta}_0 = \bar{Y} - \hat{\beta}_1 \bar{X}$
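A small worked sketch of the intercept formula follows. The slide text only preserves the intercept; the slope used here, $\hat{\beta}_1 = \sum (X_i - \bar{X})(Y_i - \bar{Y}) / \sum (X_i - \bar{X})^2$, is the standard least squares slope and is an assumption about what the original slide showed. The data and the function name `fit_simple_linear` are made up for illustration.

```python
def fit_simple_linear(x, y):
    """Return (b0, b1), the least squares intercept and slope."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    s_xy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    s_xx = sum((xi - x_bar) ** 2 for xi in x)
    b1 = s_xy / s_xx          # slope (standard formula, assumed)
    b0 = y_bar - b1 * x_bar   # intercept, as given on the slide
    return b0, b1


# Made-up data for illustration.
x = [1, 2, 3, 4, 5]
y = [2.1, 3.9, 6.2, 7.8, 10.1]
b0, b1 = fit_simple_linear(x, y)
print(f"intercept = {b0:.3f}, slope = {b1:.3f}")
```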
INTERACTION BETWEEN INDEPENDENT VARIABLES
The functional relationship of Y with $X_1$ is influenced by the level of another independent variable $X_2$.
[Figure: Y versus $X_1$, with separate lines for $X_2 = 0$ and $X_2 = 1$]
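One common way to represent such an interaction is a cross-product term, $E(Y) = \beta_0 + \beta_1 X_1 + \beta_2 X_2 + \beta_3 X_1 X_2$. The slide does not give this equation, so the model form and the coefficient values below are assumptions chosen only to show that the slope of Y with respect to $X_1$ changes with the level of $X_2$.

```python
# Hypothetical coefficients for E(Y) = b0 + b1*X1 + b2*X2 + b3*X1*X2.
B0, B1, B2, B3 = 1.0, 2.0, 0.5, 1.5


def mean_response(x1, x2):
    """Mean response under the assumed interaction model."""
    return B0 + B1 * x1 + B2 * x2 + B3 * x1 * x2


def slope_in_x1(x2):
    """Change in E(Y) per unit increase in X1 at a fixed level of X2."""
    return B1 + B3 * x2


for level in (0, 1):
    print(f"X2 = {level}: slope of Y in X1 = {slope_in_x1(level)}")
```

With no interaction ($\beta_3 = 0$) the two lines would be parallel; a nonzero $\beta_3$ makes their slopes differ.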
PERFORMANCE MEASURES OF A REGRESSION MODEL
For a given data set the total sum of squares, SST = SSR + SSE, is fixed, so as model performance improves SSR increases and SSE decreases.
PARTITIONING OF TOTAL SUM OF SQUARES
Coefficient of determination:
$R^2 = \frac{SSR}{SST} = 1 - \frac{SSE}{SST}$
$0 \le R^2 \le 1$
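The partition and the coefficient of determination can be checked numerically. This sketch uses made-up data and the least squares line fitted to it; the variable names are illustrative only.

```python
# Made-up data and the least squares line for it (b0 = 0.05, b1 = 1.99).
x = [1, 2, 3, 4, 5]
y = [2.1, 3.9, 6.2, 7.8, 10.1]
y_hat = [0.05 + 1.99 * xi for xi in x]

y_bar = sum(y) / len(y)
sst = sum((yi - y_bar) ** 2 for yi in y)               # total sum of squares
ssr = sum((fi - y_bar) ** 2 for fi in y_hat)           # regression sum of squares
sse = sum((yi - fi) ** 2 for yi, fi in zip(y, y_hat))  # error sum of squares

r2 = ssr / sst
print(f"SST = {sst:.3f}, SSR = {ssr:.3f}, SSE = {sse:.3f}")
print(f"R^2 = {r2:.4f}, 1 - SSE/SST = {1 - sse / sst:.4f}")
```

Both expressions for $R^2$ agree because SST = SSR + SSE holds for a least squares fit that includes an intercept.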
ADJUSTED $R^2$
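The formula for this slide did not survive extraction. A commonly used definition, assumed here rather than taken from the source, is $R_a^2 = 1 - \frac{SSE/(n-p)}{SST/(n-1)}$, where p counts all model parameters including the intercept. The sketch below applies it to illustrative sums of squares; the function name is hypothetical.

```python
def adjusted_r_squared(sse, sst, n, p):
    """Adjusted R^2 = 1 - [SSE/(n-p)] / [SST/(n-1)] (standard definition, assumed)."""
    return 1 - (sse / (n - p)) / (sst / (n - 1))


# Illustrative values: n = 5 observations, p = 2 parameters.
sse, sst, n, p = 0.107, 39.708, 5, 2
r2 = 1 - sse / sst
r2_adj = adjusted_r_squared(sse, sst, n, p)
print(f"R^2 = {r2:.4f}, adjusted R^2 = {r2_adj:.4f}")
```

Unlike $R^2$, the adjusted value penalizes additional parameters through the degrees of freedom, so it does not automatically rise when a predictor is added.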
MODEL UTILITY
$H_0: \beta_1 = \beta_2 = \cdots = \beta_{p-1} = 0$
$H_1$: At least one of the $\beta_i$ parameters $\neq 0$
Test statistic:
$F_0 = \frac{MSR}{MSE} = \frac{SSR/(p-1)}{SSE/(n-p)}$
When $H_0$ is true, the test statistic has an F-distribution with (p-1) degrees of freedom in the numerator and (n-p) degrees of freedom in the denominator.
Reject $H_0$ if $F_0 > F_{\alpha, p-1, n-p}$
or if p-value < α (chosen level of significance)
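A minimal sketch of the model utility test, using illustrative sums of squares. The critical value is read from an F table rather than computed, since the Python standard library has no F-distribution; the function name and the numbers are assumptions for illustration.

```python
def f_statistic(ssr, sse, n, p):
    """F0 = MSR / MSE with MSR = SSR/(p-1) and MSE = SSE/(n-p)."""
    msr = ssr / (p - 1)
    mse = sse / (n - p)
    return msr / mse


# Illustrative values: n = 5 observations, p = 2 parameters.
ssr, sse, n, p = 39.601, 0.107, 5, 2
f0 = f_statistic(ssr, sse, n, p)

f_crit = 10.13  # F_{0.05, 1, 3}, taken from an F table
reject_h0 = f0 > f_crit
print(f"F0 = {f0:.1f}, critical value = {f_crit}, reject H0: {reject_h0}")
```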
MODEL UTILITY
Source of Variation   Degrees of Freedom   Sum of Squares   Mean Square       F-statistic     p-value
Model                 p-1                  SSR              MSR = SSR/(p-1)   F0 = MSR/MSE    P[F(p-1, n-p) > F0]
SIGNIFICANCE OF INDIVIDUAL PREDICTORS
Test statistic:
$t_0 = \frac{\hat{\beta}_i}{SE(\hat{\beta}_i)}$
Critical value: Two-tailed, found from the t-distribution with (n-p) degrees of freedom and α/2 area in each tail.
Reject $H_0$ if $|t_0| \ge t_{\alpha/2, n-p}$
or if p-value < α.
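As an illustration, the t-test can be carried out for the slope of a simple linear regression. The standard error formula $SE(\hat{\beta}_1) = \sqrt{MSE / \sum (X_i - \bar{X})^2}$ is the standard one for that special case and is an assumption here, as are the made-up data; the critical value is read from a t table.

```python
import math

# Made-up data for illustration.
x = [1, 2, 3, 4, 5]
y = [2.1, 3.9, 6.2, 7.8, 10.1]
n, p = len(x), 2  # p = 2 parameters: intercept and slope

x_bar = sum(x) / n
y_bar = sum(y) / n
s_xx = sum((xi - x_bar) ** 2 for xi in x)
s_xy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
b1 = s_xy / s_xx
b0 = y_bar - b1 * x_bar

sse = sum((yi - (b0 + b1 * xi)) ** 2 for xi, yi in zip(x, y))
mse = sse / (n - p)
se_b1 = math.sqrt(mse / s_xx)  # standard error of the slope (assumed formula)
t0 = b1 / se_b1

t_crit = 3.182  # t_{0.025, 3}, taken from a t table
reject_h0 = abs(t0) >= t_crit
print(f"t0 = {t0:.2f}, critical value = {t_crit}, reject H0: {reject_h0}")
```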
CONFIDENCE INTERVAL FOR MODEL PARAMETER
100(1 - α)% CI for $\beta_i$: $\hat{\beta}_i \pm t_{\alpha/2, n-p}\, SE(\hat{\beta}_i)$
MODEL VALIDATION AND REMEDIAL MEASURES
RESIDUAL PLOTS FOR FUNCTIONAL FORMS
FIGURE 13-5 Residual plot for functional forms
CONSTANCY OF ERROR VARIANCE
$Y^* = \arcsin\sqrt{Y} = \arcsin\sqrt{p}$
or
$Y^* = \ln\left(\frac{p}{1-p}\right)$
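Both transformations for a proportion p are easy to apply in code. The sketch assumes the arcsine transformation is the usual arcsine square root form and that p lies strictly between 0 and 1 for the logit; the function names are illustrative.

```python
import math


def arcsine_sqrt(p):
    """Arcsine square root transformation of a proportion, in radians."""
    return math.asin(math.sqrt(p))


def logit(p):
    """Log odds ln(p / (1 - p)); requires 0 < p < 1."""
    return math.log(p / (1 - p))


# Made-up proportions for illustration.
proportions = [0.05, 0.25, 0.50, 0.75, 0.95]
for p in proportions:
    print(f"p = {p:.2f}: arcsin sqrt = {arcsine_sqrt(p):.4f}, logit = {logit(p):.4f}")
```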
EXPONENTIAL GROWTH OR DECAY OF DEPENDENT VARIABLE
Model:
$Y = E(Y)\,\varepsilon$
Multiplicative Model
Variance of residuals increases with $\hat{Y}$.
EXPONENTIAL GROWTH OR DECAY
FIGURE 13-7 Residual plots for nonconstant error variability
$Y^* = \log Y$
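To see why the log transformation helps, suppose the mean follows exponential growth, $E(Y) = a e^{bX}$. Taking logs of the multiplicative model turns it into a linear, additive one: $\log Y = \log a + bX + \log \varepsilon$. The sketch below assumes natural logs, noise-free made-up data, and a hypothetical helper `fit_line`; it recovers a and b by fitting a straight line to $\log Y$.

```python
import math


def fit_line(x, y):
    """Least squares (intercept, slope) for a straight line."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    s_xx = sum((xi - x_bar) ** 2 for xi in x)
    s_xy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    slope = s_xy / s_xx
    return y_bar - slope * x_bar, slope


# Made-up, noise-free exponential growth: Y = 2 * exp(0.3 * X).
x = [0, 1, 2, 3, 4]
y = [2.0 * math.exp(0.3 * xi) for xi in x]

log_y = [math.log(yi) for yi in y]  # transformed response Y* = log Y
intercept, slope = fit_line(x, log_y)
print(f"estimated b = {slope:.3f}, estimated a = {math.exp(intercept):.3f}")
```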
NORMALITY OF ERROR COMPONENT
Histogram of residuals
Box plot of residuals
Normal probability plot of residuals
Anderson-Darling test:
If p-value < α, reject the null hypothesis of normality of residuals.
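For reference, the Anderson-Darling statistic itself can be computed with the standard library. This is a sketch under assumptions: it uses the usual computing formula $A^2 = -n - \frac{1}{n}\sum_{i=1}^{n} (2i-1)\,[\ln F(z_{(i)}) + \ln(1 - F(z_{(n+1-i)}))]$ with the normal mean and standard deviation estimated from the residuals, and it does not compute the p-value, which statistical software normally reports. The residuals are made up.

```python
import math
from statistics import NormalDist, mean, stdev


def anderson_darling_statistic(residuals):
    """A^2 statistic for normality, with mean and sd estimated from the data."""
    n = len(residuals)
    fitted = NormalDist(mean(residuals), stdev(residuals))
    cdf = [fitted.cdf(v) for v in sorted(residuals)]
    total = sum(
        (2 * i + 1) * (math.log(cdf[i]) + math.log(1 - cdf[n - 1 - i]))
        for i in range(n)  # i is 0-based, so 2*i + 1 equals 2*(i+1) - 1
    )
    return -n - total / n


# Made-up residuals for illustration.
residuals = [0.06, -0.13, 0.18, -0.21, 0.10]
a2 = anderson_darling_statistic(residuals)
print(f"A^2 = {a2:.4f}")
```

Larger values of $A^2$ indicate a poorer fit to the normal distribution and correspond to smaller p-values.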
ESTIMATION AND INFERENCES FROM REGRESSION MODEL
Inferences on individual parameters:
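$H_0: \beta_i = 0$
$H_a: \beta_i \neq 0$
Test statistic = $t_0 = \frac{b_i}{SE(b_i)}$, where $b_i$ denotes the estimate of $\beta_i$
Inferences on all parameters:
$H_0: \beta_1 = \beta_2 = \cdots = \beta_{p-1} = 0$
$H_a$: At least one $\beta_i \neq 0$
Test statistic = $F_0 = \frac{MSR}{MSE}$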
INFERENCES ON MODEL PARAMETERS
CI for $\beta_i$:
$\hat{\beta}_i \pm t_{\alpha/(2g),\, n-p}\, SE(\hat{\beta}_i), \quad i = 1, 2, \ldots, g$
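The divisor 2g in the tail area is what makes these intervals simultaneous for g parameters (a Bonferroni-type adjustment). The sketch below uses hypothetical estimates and standard errors, assumes n - p = 10, and takes the critical value from a t table; the function name is illustrative.

```python
def simultaneous_intervals(estimates, std_errors, t_crit):
    """Return (lower, upper) for each estimate using a common critical value."""
    return [(b - t_crit * se, b + t_crit * se) for b, se in zip(estimates, std_errors)]


alpha = 0.05
g = 5                          # number of parameters estimated jointly
tail_area = alpha / (2 * g)    # area in each tail of the t-distribution
t_crit = 3.169                 # t_{0.005, 10} from a t table (assumes n - p = 10)

# Hypothetical estimates and standard errors.
estimates = [2.0, -1.5, 0.8, 0.1, 3.2]
std_errors = [0.4, 0.5, 0.2, 0.3, 1.0]

intervals = simultaneous_intervals(estimates, std_errors, t_crit)
for b, (lo, hi) in zip(estimates, intervals):
    print(f"estimate {b:+.2f}: ({lo:+.3f}, {hi:+.3f})")
```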
HYPOTHESIS TEST ON SUBSET OF PARAMETERS
Test statistic: $F_0 = \frac{(SSE_R - SSE_F)/(p-g-1)}{SSE_F/(n-p)}$
HYPOTHESIS TESTING ON SUBSET OF PARAMETERS
If $F_0 \ge F_{\alpha, p-g-1, n-p}$, reject $H_0$
If p-value < α, reject $H_0$
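A short sketch of the subset test statistic. It assumes $SSE_R$ and $SSE_F$ are the error sums of squares of the reduced and full models, and it takes the numerator degrees of freedom as an argument (the slide gives p - g - 1). All numbers and the function name are hypothetical.

```python
def partial_f(sse_reduced, sse_full, df_num, df_den):
    """F0 = [(SSE_R - SSE_F) / df_num] / [SSE_F / df_den]."""
    return ((sse_reduced - sse_full) / df_num) / (sse_full / df_den)


# Hypothetical values: dropping the subset raises SSE from 80 to 120.
sse_reduced, sse_full = 120.0, 80.0
df_num = 2    # numerator degrees of freedom, p - g - 1 on the slide
df_den = 20   # denominator degrees of freedom, n - p

f0 = partial_f(sse_reduced, sse_full, df_num, df_den)
print(f"F0 = {f0:.2f}")
```

A large $F_0$ means the full model reduces SSE by more than chance alone would explain, so the subset of parameters should not be dropped.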
CONFIDENCE INTERVAL FOR MEAN RESPONSE
PREDICTION INTERVAL FOR INDIVIDUAL OBSERVATIONS
$s^2(Y_{r(\text{new})}) = SE^2(\hat{Y}_r) + s^2$
Prediction Interval:
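The interval formulas on these slides were lost in extraction. As a hedged illustration for simple linear regression, the sketch below assumes $s^2 = MSE$, the standard result $SE^2(\hat{Y}_r) = s^2\left[\frac{1}{n} + \frac{(X_r - \bar{X})^2}{\sum (X_i - \bar{X})^2}\right]$, and intervals of the form estimate ± $t_{\alpha/2, n-p}$ × standard error. The data are made up and the critical value comes from a t table.

```python
import math

# Made-up data for illustration.
x = [1, 2, 3, 4, 5]
y = [2.1, 3.9, 6.2, 7.8, 10.1]
n, p = len(x), 2

x_bar = sum(x) / n
y_bar = sum(y) / n
s_xx = sum((xi - x_bar) ** 2 for xi in x)
s_xy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
b1 = s_xy / s_xx
b0 = y_bar - b1 * x_bar
s2 = sum((yi - (b0 + b1 * xi)) ** 2 for xi, yi in zip(x, y)) / (n - p)  # MSE

x_new = 6
y_hat_new = b0 + b1 * x_new
se2_mean = s2 * (1 / n + (x_new - x_bar) ** 2 / s_xx)  # variance of the estimated mean
s2_pred = se2_mean + s2                                # variance for a new observation

t_crit = 3.182  # t_{0.025, 3}, taken from a t table
ci = (y_hat_new - t_crit * math.sqrt(se2_mean), y_hat_new + t_crit * math.sqrt(se2_mean))
pi = (y_hat_new - t_crit * math.sqrt(s2_pred), y_hat_new + t_crit * math.sqrt(s2_pred))
print(f"predicted value at X = {x_new}: {y_hat_new:.2f}")
print(f"95% CI for mean response: ({ci[0]:.3f}, {ci[1]:.3f})")
print(f"95% prediction interval:  ({pi[0]:.3f}, {pi[1]:.3f})")
```

The prediction interval is always wider than the confidence interval for the mean response because it adds the variance $s^2$ of an individual observation to the variance of the estimated mean.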