DEMAND FORECASTING
OVERVIEW
Demand Curve Estimation
Identification Problem
Interview and Experimental Methods
Regression Analysis
Measuring Regression Model Significance
Measures of Individual Variable Significance
Demand/Sales/Revenue/Profit Forecasting Methods:
Single Equation Regression Models
Simultaneous Equation Regression Models
Autoregressive Integrated Moving Average (ARIMA) Models
Vector Autoregressive (VAR) Models
KEY CONCEPTS
simultaneous relation
identification problem
consumer interview
market experiments
regression analysis
deterministic relation
statistical relation
time series
cross section
scatter diagram
linear model
multiplicative model
simple regression model
multiple regression model
standard error of the estimate (SEE)
correlation coefficient
coefficient of determination
degrees of freedom
corrected coefficient of determination
F statistic
t statistic
two-tail t tests
one-tail t tests
Demand Curve Estimation
Simple Linear Demand Curves
The best estimation method balances marginal costs and marginal benefits.
Simple linear relations are useful for demand estimation.
Using Simple Linear Demand Curves
Straight-line relations give useful approximations.
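For illustration, a hypothetical linear demand curve (the numbers are made up):
$Q = 100 - 2P$
Each one-unit price increase reduces quantity demanded by 2 units; even when true demand is curved, the straight line is a useful local approximation.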
Identification Problem
Changing Nature of Demand Relations
Demand relations are dynamic.
Interplay of Supply and Demand
Economic conditions affect demand and supply.
Shifts in Demand and Supply
Curve shifts can be estimated.
Simultaneous Relations
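A standard sketch of why simultaneity creates the identification problem, with hypothetical linear demand and supply relations:
$Q_d = a - bP \qquad Q_s = c + dP$
Observed prices and quantities are equilibrium points where $Q_d = Q_s$, so a regression of quantity on price alone traces out neither the demand curve nor the supply curve unless something shifts one curve while the other stays put.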
Interview and Experimental Methods
Consumer Interviews
Interviews can solicit useful information when market data is scarce.
Interview opinions often differ from actual market transaction data.
Market Experiments
Controlled experiments can generate useful insight.
Experiments can become expensive.
Regression Analysis
What Is a Statistical Relation?
A statistical relation exists when averages are related.
Dependent Variable/Predictand/Regressand/Response/Endogenous Variable
Explanatory Variable/Independent Variable/Predictor/Regressor/Stimulus or Control Variable/Exogenous Variable
Dependent variable Y is caused by X.
More later!
Gathering Data
[Figure: scatter plot of Y against X with the fitted line $Y = \beta_0 + \beta_1 X$; $\beta_0$ is the intercept and $\beta_1$ the slope.]
The Error Term
$Y = \beta_0 + \beta_1 X + \varepsilon$
ε is purely theoretical
A Stochastic Error Term Is Needed Because:
Minor influences on Y are omitted from the equation (data not available)
It is impossible to avoid some measurement error in at least one of the equation's variables
The true functional form may not be linear
There is pure randomness of variation (remember human behavior!)
Example of Error
[Figure: scatter of observed Q values around the best possible linear line through the data; the vertical deviations from the line are the errors.]
True vs. Estimated Regression Line
No one knows the parameters of the true regression line:
$Y_t = \beta_0 + \beta_1 X_t + \varepsilon_t$ (theoretical)
We must come up with estimates:
$Y_t = \hat{\beta}_0 + \hat{\beta}_1 X_t + e_t$ (estimated)
So how does OLS work?
OLS selects the estimates of β0 and β1 that minimize the sum of squared residuals
It minimizes the difference between Y and Ŷ
Statistical software does the complex math behind the scenes
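A minimal sketch of what the software computes, using the closed-form OLS solution for simple regression (hypothetical data):

    import numpy as np

    # Hypothetical (X, Y) observations
    X = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
    Y = np.array([2.1, 3.9, 6.2, 8.1, 9.8])

    # Closed-form OLS for Y = b0 + b1*X + e:
    # slope = cov(X, Y) / var(X); intercept = mean(Y) - slope * mean(X)
    b1 = np.sum((X - X.mean()) * (Y - Y.mean())) / np.sum((X - X.mean()) ** 2)
    b0 = Y.mean() - b1 * X.mean()

    e = Y - (b0 + b1 * X)          # residuals e_t = Y_t - Yhat_t
    print(b0, b1, np.sum(e ** 2))  # estimates and the minimized sum of squared residuals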
OLS Regression Coefficient Interpretation
Regression coefficients (the β's) indicate the change in the dependent variable associated with a one-unit increase in the independent variable in question, holding constant the other independent variables in the equation (but not variables omitted from the equation)
A controlled economic experiment?
Another Example
The demand for beef:
$B = \beta_0 + \beta_1 P + \beta_2 Y_d$
B = per capita consumption of beef per year
Yd = per capita disposable income per year
P = price of beef (cents/pound)
Estimate this using Eviews
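The slides estimate this in Eviews; for illustration, a minimal Python sketch of the same regression using statsmodels and hypothetical data:

    import numpy as np
    import statsmodels.api as sm

    # Hypothetical data for the beef demand equation B = b0 + b1*P + b2*Yd
    P  = np.array([250.0, 260.0, 255.0, 270.0, 280.0, 265.0, 275.0, 290.0])  # cents/pound
    Yd = np.array([30.0, 31.5, 31.0, 32.0, 33.5, 32.5, 34.0, 35.0])          # disposable income
    B  = np.array([66.0, 64.5, 65.2, 63.8, 63.0, 64.0, 63.5, 62.5])          # lbs per capita

    X = sm.add_constant(np.column_stack([P, Yd]))  # prepend the intercept column
    results = sm.OLS(B, X).fit()
    print(results.params)    # [b0, b1 (price), b2 (income)]
    print(results.summary()) # coefficients, t-stats, r-squared, as Eviews would report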
Overall Fit of the Model
We need a way to evaluate a model:
Compare one model with another
Compare one functional form with another
Compare combinations of independent variables
Use the coefficient of determination, r2
r2 – The Coefficient of Determination
Reported by Eviews every time you run a regression
Between 0 and 1
The larger, the better
Close to one indicates an excellent fit
Near zero indicates that the estimated regression fails to explain the variance in Y
A relative term
r2 = .85 says that 85% of the variation in the dependent variable is explained by the independent variables
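For reference, the standard definition in terms of the residuals et and the mean of the dependent variable:
$r^2 = 1 - \frac{\sum_t e_t^2}{\sum_t (Y_t - \bar{Y})^2}$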
Graphical r2
[Figure: three scatter plots illustrating fits with r2 = 0, r2 = .95, and r2 = 1.]
The Adjusted r2
Ranges from slightly negative to 1
Accounts for degrees of freedom
A better estimate of fit
Don't rely on any one statistic
Common sense and theory are more important
Same interpretation as r2
Use the adjusted r2 from now on!
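The adjustment penalizes extra regressors through the degrees of freedom (n = sample size, k = estimated coefficients, including the intercept):
$\bar{r}^2 = 1 - (1 - r^2)\,\frac{n - 1}{n - k}$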
The Classical Linear Regression (CLR) Model
These are some basic assumptions which, when met, make the Ordinary Least Squares procedure the "Best Linear Unbiased Estimator" (aka BLUE).
When one or more of these assumptions is violated, it is sometimes necessary to make adjustments to our model.
Assumptions
($Y_t = \beta_0 + \beta_1 X_{1t} + \beta_2 X_{2t} + \ldots + \varepsilon_t$)
Linearity in coefficients and error term
ε has zero population mean
All independent variables are independent of ε
Error term observations are uncorrelated with each other (no serial correlation)
ε has constant variance (no heteroskedasticity)
No independent variables are perfectly correlated (no perfect multicollinearity)
$H_0$: $\beta_1 = 0$ (i.e., X1 is not important)
$H_A$: $\beta_1 \neq 0$ (i.e., X1 is important, either positively or negatively)
Testing the Hypothesis
Set up the null and alternative hypotheses
Run the regression and generate the t-score
Look up the critical value of the t-statistic (tc), given the degrees of freedom (n − k), in a two-tailed test at the X% level of significance (1%, 5%, 10%)
n = sample size, k = estimated coefficients (including the intercept)
Reject the null (β = 0) if |tk| > tc
t statistic table on page 754 of Hirschey
Interpretation of the level of significance: 5% means there is only a 5% chance of rejecting the null when the coefficient really is zero, i.e., a 95% confidence level
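A minimal sketch of these mechanics with scipy (the estimate and standard error are hypothetical):

    from scipy import stats

    beta_hat, se = 0.42, 0.15   # hypothetical coefficient estimate and standard error
    n, k = 30, 3                # sample size; estimated coefficients incl. intercept

    t_score = beta_hat / se                       # t-score under H0: beta = 0
    t_crit = stats.t.ppf(1 - 0.05 / 2, df=n - k)  # two-tailed critical value, 5% level

    if abs(t_score) > t_crit:
        print("Reject H0: the coefficient is statistically significant")
    else:
        print("Fail to reject H0")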
Example
Taurus example with t-stats
Limitations of the t-Test
Remedies for Multicollinearity
Possibly do nothing!
If t-scores are at or near significance levels, you may want to "live with it".
Drop one or more collinear variables.
Let the remaining variable pick up the joint impact.
This is OK if you have redundancies.
Remedies - continued
Form a new variable:
e.g., if income and population are correlated, you could form per capita income: I/Pop (see the sketch below).
There are other solutions I can help with on your projects.
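A minimal pandas sketch of this remedy (the column names income and pop are hypothetical):

    import pandas as pd

    # Hypothetical dataset with collinear regressors
    df = pd.DataFrame({
        "income": [500.0, 650.0, 700.0, 900.0, 1100.0],  # total income
        "pop":    [100.0, 120.0, 125.0, 150.0, 170.0],   # population
    })

    print(df["income"].corr(df["pop"]))                  # high correlation flags collinearity
    df["income_per_capita"] = df["income"] / df["pop"]   # use I/Pop instead of the pair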
The Problem of Serial Correlation
The fourth assumption of the CLR model is:
"Observations of the error term are uncorrelated with each other"
When this is not satisfied, we have a problem known as serial correlation.
Examples of Serial Correlation
[Figure: two plots of Q against P, one illustrating positive serial correlation and one illustrating negative serial correlation.]
Consequences of Serial Correlation
Pure serial correlation does not bias the estimates
Serial correlation tends to distort t-scores
Serial correlation produces a pattern of observations in which OLS appears to fit the data better than it would in the absence of the problem (t-scores are higher), because the error term is being used to "explain" the dependent variable
QUESTION: Why is this a problem?
It suggests that t-statistics are overestimated!
Type I error: you may falsely reject the null hypothesis when it is in fact true.
Neither F-statistics nor t-statistics can be trusted in the presence of serial correlation.
Detection: The Durbin-Watson d-test
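For reference, the Durbin-Watson statistic is computed from the OLS residuals et:
$d = \frac{\sum_{t=2}^{n} (e_t - e_{t-1})^2}{\sum_{t=1}^{n} e_t^2}$
Values of d near 2 indicate no serial correlation; values well below 2 suggest positive serial correlation, and values well above 2 suggest negative serial correlation.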
ORDINARY LEAST SQUARES ESTIMATORS (OLS)
The sum of squared errors
$SSE = \sum_{t=1}^{n} (Y_t - \beta_0 - \beta_1 X_t)^2$
is to be minimised with respect to the parameters β0 and β1. We have to choose those values of β0 and β1 which will give as close a fit to the data as is possible with this specification. Let these values be $\hat{\beta}_0$ and $\hat{\beta}_1$.
We see that the standard model, or the ordinary least squares (OLS) model as it is more popularly called, and the classical linear regression model [OLS with the additional assumption $\varepsilon_t \sim N(0, \sigma^2)$] have four parameters: β0 and β1, the parameters of linear dependence, and E(εt) and σ2, the parameters of the probability distribution of ε. However, the assumption E(εt) = 0, i.e., randomness of the disturbance, is not very restrictive. Let the expected value of εt be non-zero, say k; then
$E(Y_t \mid X_t) = \beta_0 + \beta_1 X_t + E(\varepsilon_t) = \beta_0 + \beta_1 X_t + k = (\beta_0 + k) + \beta_1 X_t$
A non-zero mean disturbance therefore merely shifts the intercept from β0 to (β0 + k); the slope β1 is unaffected.
Returning to the minimisation, $\hat{\beta}_0$ and $\hat{\beta}_1$ satisfy the first-order conditions
$\frac{\partial SSE}{\partial \beta_0} = 0 \quad \text{and} \quad \frac{\partial SSE}{\partial \beta_1} = 0,$
where $SSE = \sum_t (Y_t - \beta_0 - \beta_1 X_t)^2$.
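Solving these two conditions yields the standard OLS estimators:
$\hat{\beta}_1 = \frac{\sum_t (X_t - \bar{X})(Y_t - \bar{Y})}{\sum_t (X_t - \bar{X})^2}, \qquad \hat{\beta}_0 = \bar{Y} - \hat{\beta}_1 \bar{X}$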