
Demand Estimation

DEMAND FORECASTING
OVERVIEW
 Demand Curve Estimation
 Identification Problem
 Interview and Experimental Methods
 Regression Analysis
 Measuring Regression Model Significance
 Measures of Individual Variable Significance
 Demand/Sales/Revenue/Profit Forecasting Methods:
 Single Equation Regression Models,
 Simultaneous Equation Regression Models,
 Autoregressive Integrated Moving Average (ARIMA)
Models, and
 Vector Autoregressive (VAR) Models
KEY CONCEPTS
 simultaneous relation
 identification problem
 consumer interview
 market experiments
 regression analysis
 deterministic relation
 statistical relation
 time series
 cross section
 scatter diagram
 linear model
 multiplicative model
 simple regression model
 multiple regression model
 standard error of the estimate (SEE)
 correlation coefficient
 coefficient of determination
 degrees of freedom
 corrected coefficient of determination
 F statistic
 t statistic
 two-tail t tests
 one-tail t tests
Demand Curve Estimation
 Simple Linear Demand Curves
 The best estimation method balances
marginal costs and marginal benefits.
 Simple linear relations are useful for demand
estimation.
 Using Simple Linear Demand Curves
 Straight-line relations give useful
approximations.
Identification Problem
 Changing Nature of Demand Relations
 Demand relations are dynamic.
 Interplay of Supply and Demand
 Economic conditions affect demand and
supply.
 Shifts in Demand and Supply
 Curve shifts can be estimated.
 Simultaneous Relations
Interview and Experimental
Methods
 Consumer Interviews
 Interviews can solicit useful information when
market data is scarce.
 Interview opinions often differ from actual
market transaction data.
 Market Experiments
 Controlled experiments can generate useful
insight.
 Experiments can become expensive.
Regression Analysis
 What Is a Statistical Relation?
 A statistical relation exists when averages are related.

 A deterministic relation is true by definition.

 Specifying the Regression Model


 Dependent Variable/Explained

Variable/Predictand/Regressand/Response/Endogenous
Variable
 Explanatory Variable/Independent

Variable/Predictor/Regressor/Stimulus or Control
Variable/Exogenous Variable
 Dependent variable Y is caused by X.

 X variables are determined independently of Y.

 Least Squares Method


 Minimize sum of squared residuals.
Measuring Regression Model
Significance
 Standard Error of the Estimate (SEE) increases with
scatter about the regression line.
 Goodness of Fit, r and R²
 r = 1 means perfect correlation; r = 0 means no correlation.
 R² = 1 means perfect fit; R² = 0 means no relation.
 Corrected Coefficient of Determination, R̄²
 Adjusts R² downward for small samples.
 F statistic
 Tells if R² is statistically significant.
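A minimal sketch in Python (numpy assumed available; the data and fitted values are hypothetical) of how SEE and R² follow from the residuals of a fitted regression:

import numpy as np

# Hypothetical observed Y and fitted values Y_hat from an estimated regression
y = np.array([10.0, 12.0, 15.0, 19.0, 24.0])
y_hat = np.array([9.5, 13.0, 15.5, 18.5, 23.5])
n, k = len(y), 2                      # observations and estimated coefficients (incl. intercept)

residuals = y - y_hat
sse = np.sum(residuals**2)            # sum of squared residuals
see = np.sqrt(sse / (n - k))          # standard error of the estimate: grows with scatter
tss = np.sum((y - y.mean())**2)       # total variation in Y
r2 = 1 - sse / tss                    # coefficient of determination
print(see, r2)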
Measures of Individual Variable
Significance
t statistics
 t statistics compare a sample characteristic to the
standard deviation of that characteristic.
 A calculated t statistic greater than two (in absolute value) suggests a strong effect of X on Y (95% confidence).
 A calculated t statistic greater than three suggests a very strong effect of X on Y (99% confidence).
 Two-tail t Tests
 Tests of effect.
 One-Tail t Tests
 Tests of magnitude or direction.
Demand Estimation
 What will happen to quantity demanded,
total revenue and profit if we increase
prices?
 What will happen to demand if consumer
incomes increase or decrease due to an
economic expansion or contraction?
 What effect will a tuition increase have on
Marquette’s revenue?
Practical Example: Port Authority
Transit Case

 How will the fare price increase affect demand and overall revenues?
 What other factors, besides fares, affect demand?
Demand Estimation Using Market
Research Techniques
 How do we estimate the Demand
Function?
 Econometric Techniques (Your Project)
 Non-econometric Techniques
 Look first at Non-econometric Approaches
 What are these?
Consumer Surveys: Just Ask
Them
 Question customers to estimate demand
 “How many bags of chips would you buy if the price was Rs. 2.29/bag?”
 “How many cases of beer would you buy if the price of beer was Rs. 11.99/case?”
 Compare different individuals’ responses
 Advantages:
 Flexible
 Relatively inexpensive to conduct
 Disadvantages:
 Many potential biases
 Strategic
 Information
 Hypothetical
 Interviewer
Market Experiments

 Firms vary prices and/or advertising and compare consumer behavior
 Over time (e.g., before and after rebate offer)
 Over space (e.g., compare Delhi and Haryana
consumption when prices are varied between two
regions)
 Potential Problems
 Control of other factors not guaranteed.
 “Playing” with market prices may be risky.
 Expensive
Consumer Clinics and Focus
Groups
 Simulated market setting in which consumers are given income to spend on a variety of goods
 The experimenters control income, prices, advertising, packaging, etc.
 Advantages
 Flexibility
 Disadvantages
 Selectivity bias
 Very expensive
Econometrics
 “Economic Measurement”
 Collection of statistical techniques
available for testing economic theories by
empirically measuring relationships among
economic variables.
 Quantify economic reality – bridge the gap
between abstract theory and real world
human activity.
Practical Example
 How does the state of
Delhi set a budget?
 What is the process?
The Econometric Modeling
Process

1. Specification of the theoretical model


2. Identification of the variables
3. Collection of the data
4. Estimation of the parameters of the
model and their interpretation
5. Development of forecasts (estimates)
based on the model
Numbers Instead of Symbols!
 Normal model of consumer demand
 Q = f(P, Ps, Yd)
 Q = quantity demanded of good, P = good price, Ps = price of substitute good, Yd = disposable income
 Econometrics allows us to estimate the
relationship between Q and P, Ps and Id
based on past data for these variables
Q = 31.5 – 0.73P + 0.11Ps + 0.23Yd
 Instead of just expecting Q to “increase” if
there is an increase in Yd – we estimate that
Q will increase by 0.23 units per 1 dollar of
increased disposable income
 0.23 is called an estimated regression
coefficient
 The ability to estimate these coefficients is
what makes econometrics useful
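As a small illustration (the coefficients are those on the slide; the input values below are hypothetical), the estimated equation can simply be evaluated to see the 0.23-per-dollar effect:

# Estimated demand equation from the slide: Q = 31.5 - 0.73P + 0.11Ps + 0.23Yd
def predict_q(p, ps, yd):
    return 31.5 - 0.73 * p + 0.11 * ps + 0.23 * yd

base = predict_q(p=10.0, ps=12.0, yd=50.0)       # hypothetical baseline values
plus_one = predict_q(p=10.0, ps=12.0, yd=51.0)   # disposable income up by 1
print(plus_one - base)                            # 0.23, the estimated regression coefficient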
Regression Analysis
 One econometric approach
 Most popular among economists, business analysts and social scientists
 Allows quantitative estimates of economic
relationships that previously had been
completely theoretical
 Answer “what if” questions
Regression Analysis Continued
 Regression analysis is a statistical technique that
attempts to “explain” movements in one variable,
the dependent variable, as a function of
movements in a set of other variables, called the
independent (or explanatory) variables, through
the quantification of a single equation.
 Q = f(P, Ps, Yd)
 Q = dependent variable
 P, Ps , Yd = independent variables
 Deals with the frequent questions of cause and
effect in business
What is Regression Really
Doing?
 Regression is the fitting of curves to data.

More later!

Gathering Data

 Once the model is specified, we must collect


data.
 Time-series data
 e.g., sales for my company over time.
 What most of you will be using in your projects.
 Cross-sectional data
 e.g., sales of 10 companies in the food processing
industry at one point in time.
 Panel Data/Longitudinal Data/Pooled Data
 e.g., sales of 10 companies in the food processing industry at various points in time.
Garbage In, Garbage Out

 Your empirical estimates will be only as


reliable as your data.
 Look at the two quotes from Stamp and
Valavanis that follow.
 You will want to take particular care in
developing your databases.
Sir Josiah Stamp
“Some Economic Factors in Modern Life”

 The government are very keen on amassing


statistics. They collect them, add them, raise
them to the n’th power, take the cube root and
prepare wonderful diagrams. But you must never
forget that every one of those figures comes in
the first instance from the village watchman, who
just puts down what he damn well pleases.
 Moral: Know where your data comes from!
Valavanis
 “Econometric theory is like an
exquisitely balanced French recipe,
spelling out precisely with how many
turns to mix the sauce, how many
carats of spice to add, and for how
many milliseconds to bake the
mixture at exactly 474 degrees of
temperature.”
Valavanis - continued
 “But when the statistical cook turns to raw
materials, he finds that hearts of cactus
fruit are unavailable, so he substitutes
chunks of cantaloupe; where the recipe
calls for vermicelli he uses shredded wheat;
and he substitutes green garment dye for
curry, ping-pong balls for turtle’s eggs, and,
for Chaligougnac vintage 1883, a can of
turpentine.”
 Moral: Be careful in your choice of proxy
variables
Economic Data
 You are in the process of gathering
economic data.
 Some will come from your firm.
 Some may come from trade publications.
 Some will come from the government.
 Must be of the same time scale (monthly,
quarterly, yearly, etc.)
Always be Skeptical
 Always approach your data with a critical
eye.
 Remember the quotes
 Just because something appears in a table
somewhere, does not mean it is necessarily
correct.
 Government data revisions.
 Does your data pass the “smell test”?
How to Begin the Data
Exercise
 First question you should ask yourself is:
 “If money were no object, what would be the
perfect data for my demand model?”
 From that basis, you can then start finding
what actual data you can get your hands on.
 There will be compromises that you have to
make. These are called proxy variables!
 Remember the Valavanis quote.
How to Choose a Good Proxy

 Proxy variables should be variables whose


movements closely mirror the desired variable
for which you do not have a measure.
 For example: Tastes of consumers are
difficult to measure.
 May use a time trend variable if you suspect these
are changing over time.
 May include demographic characteristics of the
population.
Dummy Variables
 Binary Variable
 Take on a “1” or a “0”
 Example: Trying to model salaries
 1 if you have a college degree, 0 if you
don’t
 Example: Model effect of Harley-Davidson
reunion years on demand
 1 for reunion years, 0 otherwise
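A minimal sketch (hypothetical data; pandas assumed available) of coding the reunion-year dummy:

import pandas as pd

# Hypothetical annual demand data
df = pd.DataFrame({
    "year": [2000, 2001, 2002, 2003, 2004],
    "sales": [110, 125, 180, 130, 190],
})
reunion_years = {2002, 2004}                       # hypothetical reunion years

# Dummy variable: 1 in reunion years, 0 otherwise
df["reunion"] = df["year"].isin(reunion_years).astype(int)
print(df)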
Back to Regression Analysis
 Theoretical Model: Y = β0 + β1X + ε
 Y is dependent variable
 X is independent variable
 Linear Equation (no powers greater than 1)
 β’s are coefficients – they determine the coordinates of the straight line at any point
 β0 is the constant term – value of Y when X is 0 (more on this later - no economic meaning but required)
 β1 is the slope term – amount Y will change when X increases by one unit (can be β2 … βn); holds all other β’s constant (except those not in the model!)
 More about ε, the error term, later
Graphical Representation of
Regression Coefficients
 [Figure: regression line Y = β0 + β1X plotted against X, with intercept β0 where the line crosses the Y-axis and slope β1 = ΔY/ΔX]
The Error Term
 Y = 0 + 1X + 
  is purely theoretical
 Stochastic Error Term Needed Because:
 Minor influences on Y are omitted from equation
(data not available)
 Impossible not to have some measurement error
in one of the equation’s variables
 Different functional form (not linear)
 Pure randomness of variation (remember human
behavior!)
Example of Error

 Trying to estimate demand for SUV’s


 Demand may fall because of uncertainty
about the economy (what data do we use for
uncertainty?)
 Other independent variables may be omitted
 Demand function may be non-linear
 Demand for SUV’s is determined by human
behavior – some purely random variation
 All end up in error term
The Estimated Regression
Equation
 Theoretical Regression Equation:
Y = β0 + β1X + ε
 Estimated Regression Equation:
Ŷ = 103.40 + 6.38X (with residual e = Y – Ŷ)
 Observed, real-world X and Y values are used to calculate the coefficient estimates 103.40 and 6.38
 Estimates are used to determine Y-hat, the fitted
value of Y
 “Plug-in” X and get estimate of Y
Differences Between Theoretical and
Estimated Regression Equations
 β0, β1 replaced with estimates β̂0, β̂1 (103.40 and 6.38)
 Can’t observe true coefficients, we make estimates
 Best guesses given data for X and Y
 Ŷ is estimated value of Y – calculated from the regression equation (line through Y data)
 Residual e = Y – Ŷ
 Residual is difference between Y (data) and Ŷ (estimated Y with regression)
 Theoretical model has error, estimated model has residual
A Simple Regression Example
in Eviews
 Demand for Ford Taurus
Ordinary Least Squares
Regression
 OLS Regression
 Most Common
 Easy to use
 Estimates have useful
characteristics
How Does Ordinary Least
Squares Regression Work?
 We attempt to find the curve that best fits the data among all possibilities
 While there are a number of ways of doing this, OLS minimizes the sum of the squared residuals
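A minimal sketch of that idea (hypothetical data): for a simple regression the intercept and slope that minimize the sum of squared residuals have a closed form, and no other line does better:

import numpy as np

# Hypothetical price (X) and quantity (Y) observations
x = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
y = np.array([7.8, 6.1, 5.2, 3.9, 2.5])

# Closed-form OLS estimates for a simple regression
b1 = np.sum((x - x.mean()) * (y - y.mean())) / np.sum((x - x.mean())**2)
b0 = y.mean() - b1 * x.mean()

residuals = y - (b0 + b1 * x)
print(b0, b1, np.sum(residuals**2))   # the minimized sum of squared residuals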
Finding Best Fitting Line using
Ordinary Least Squares
Actual data points are the dependent variable (Y’s)
 True model: Y = β0 + β1X + ε
 Estimated line: Ŷ = β̂0 + β̂1X; the “hat” denotes a sample estimate of the true value
 Residual: e = Y – Ŷ
 OLS minimizes Σe² = Σ(Y – Ŷ)²
 [Figure: scatter of P–Q data points with the best possible linear line through the data]
True vs. Estimated Regression Line
 No one knows the parameters of the true
regression line:
Yt = β0 + β1Xt + εt (theoretical)
 We must come up with estimates.
Yt = β̂0 + β̂1Xt + et (estimated)
So how does OLS work?
 OLS selects the estimates of β0 and β1 that minimize the sum of squared residuals
 Minimize difference between Y and Ŷ
 Statistical Software
 Complex math behind
the scenes
OLS Regression Coefficient
Interpretation
 Regression coefficients (β’s) indicate the
change in the dependent variable
associated with a one-unit increase in the
independent variable in question holding
constant the other independent variables
in the equations (but not those not in the
equation)
 A controlled economic experiment?
Another Example
 The demand for beef
 B = 0 + 1P + 2Yd
 B = per capita consumption of beef per
year
 Yd = per capita disposable income per year
 P = price of beef (cents/pound)
 Estimate this using Eviews
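Since the class itself uses EViews, here is only a hedged Python sketch of the same estimation (statsmodels assumed available; the data are made up purely for illustration):

import numpy as np
import pandas as pd
import statsmodels.api as sm

rng = np.random.default_rng(0)
n = 40
# Hypothetical data: price of beef (cents/pound) and per capita disposable income
df = pd.DataFrame({"P": rng.uniform(150, 300, n), "Yd": rng.uniform(20000, 40000, n)})
df["B"] = 60 - 0.08 * df["P"] + 0.0005 * df["Yd"] + rng.normal(0, 1.5, n)

X = sm.add_constant(df[["P", "Yd"]])   # adds the intercept term
model = sm.OLS(df["B"], X).fit()
print(model.summary())                  # coefficients, t-stats, R-squared, F-stat, Durbin-Watson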
Overall Fit of the Model
 Need a way to evaluate model
 Compare one model with another
 Compare one functional form with another
 Compare combinations of independent
variables
 Use coefficient of determination r2
r2 – The Coefficient of
Determination
 Reported by Eviews every time you run a regression
 Between 0 and 1
 The larger the better
 Close to one shows an excellent fit
 Near zero shows failure of estimated regression to
explain variance in Y
 Relative term
 r2 = .85 says that 85% of the variation in the
dependent variable is explained by the independent
variables
Graphical r²
 [Figure: three scatter plots illustrating r² = 0, r² = .95, and r² = 1]
The Adjusted r²
 Problem with r²: Adding another independent variable never decreases r²
 Even a nonsensical variable
 Need to account for a decrease in “degrees of
freedom”
 Degrees of freedom = data observations –
coefficients estimated
 Example: 100 years of data, 3 variables
estimated (including constant)
 DF = 97
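A quick sketch of the adjustment, using the slide’s 100 observations and 3 estimated coefficients (the R² value plugged in is hypothetical):

def adjusted_r2(r2, n, k):
    # k = number of estimated coefficients, including the constant
    return 1 - (1 - r2) * (n - 1) / (n - k)

print(adjusted_r2(r2=0.85, n=100, k=3))   # degrees of freedom = 100 - 3 = 97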
Adjusted r²
 Ranges from slightly negative to 1
 Accounts for degrees of freedom
 Better estimate of fit
 Don’t rely on any one statistic
 Common sense and theory more important
 Same interpretation as r2
 Use adjusted r2 from now on!
The Classical Linear
Regression (CLR) Model
 These are some basic assumptions which
when met, make the Ordinary Least
Squares procedure the “Best Linear
Unbiased Estimator” (aka BLUE).
 When one or more of these assumptions is
violated, it is sometimes necessary to
make adjustments to our model.
Assumptions
(Yt=X1t+ X2t+...+t)
 Linearity in coefficients and error term
  has zero population mean
 All independent variables are independent of 
 Error term observations are uncorrelated with
each other (no serial correlation)
  has constant variance (no heteroskedasticity)
 No independent variables are perfectly
correlated (multicollinearity)

Will come back to some of these when we test our models


1st Assumption: Linearity
 We assume that the model is linear (additive) in the coefficients and in the error term, and the specification is correct.
 e.g., Yt = β0 + β1X1 + β2X2 + εt is linear in both, whereas a specification in which the β’s or ε enter nonlinearly is not.
 Some nonlinear models can be transformed into linear models.
 e.g., Yt = β0 X1^β1 X2^β2 εt
 We showed this can be transformed using logs to: lnYt = lnβ0 + β1 lnX1 + β2 lnX2 + lnεt
Hypothesis Testing
 In statistics we
cannot “prove” a
theory is correct
 Can “reject” a
hypothesis with a
certain degree of
confidence
Common Hypothesis Test
 H0: β = 0 – Null Hypothesis
 HA: β ≠ 0 – Alternative Hypothesis
 Test whether or not the coefficient is
statistically significantly different from zero
 Does the coefficient affect demand?
 Two-tailed test
Does Rejecting the Null Hypothesis
Guarantee that the Theory is Correct?

 NO! It is possible that we are committing


what is known as a Type I error.
 A Type I error is rejecting that Null hypothesis
when it is in fact correct.
 Likewise, we may also commit a Type II
error
 A Type II error is failing to reject the Null
hypothesis when the alternative hypothesis is
correct.
Type I and Type II Error
Example
 Presumption of innocence until proven
guilty
 H0: The defendant is innocent
 HA: The defendant is guilty
 Type I error: sending an innocent
defendant to jail
 Type II error: freeing a guilty defendant
The t-Test, and the t-Statistic
 We can use the t-Test to do
hypothesis testing on individual
coefficients.
 Given the linear regression model:
 Yt = β0 + β1X1 + β2X2 + ... + εt
 We can calculate the t-statistic for each estimated value of βk (i.e., β̂k), and test hypotheses on that estimate.
Setting up the Null and
Alternative Hypotheses

 H0: 1= 0
 (i.e., X1 is not important)
 HA: 10
 (i.e., X1 is important, either
positively or negatively)
Testing the Hypothesis
 Set up null and alternative hypothesis
 Run regression and generate t-score.
 Look up the critical value of the t-Statistic (tc), given
the degrees of freedom (n-k) in a two-tailed test
using X% level of significance (1%, 5%, 10%)
 n = sample size, k = estimated coefficients
(including intercept)
 Reject null (β = 0) if abs(tk) > tc
 t Statistic Table on Page 754 of Hirschey
 Interpretation of level of significance: 5% means only a 5% chance of rejecting the null when the coefficient really is zero (i.e., 95% confidence)
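A small sketch of the mechanics (scipy assumed available; the estimate, standard error and sample size are hypothetical): compare |t| with the two-tailed critical value for n - k degrees of freedom:

from scipy import stats

beta_hat, se = 6.38, 2.10       # hypothetical coefficient estimate and its standard error
n, k = 30, 2                    # sample size and estimated coefficients (incl. intercept)

t_stat = beta_hat / se                           # tests H0: beta = 0
t_crit = stats.t.ppf(1 - 0.05 / 2, df=n - k)     # 5% two-tailed critical value
print(t_stat, t_crit, abs(t_stat) > t_crit)      # reject H0 if |t| > critical value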
Example
 Taurus example with t-stats
Limitations of the t-Test

1. Does not indicate theoretical


validity
2. Does not test the importance of the variable.
 The size of the coefficient does this.
F-test and the F-statistic

 You can also test


whether a group of
coefficients is
statistically significant.
 Look at the F-test for all
of the independent
variable coefficients.
 First set up the null and
alternative hypotheses.
H0 and HA for the F-test

 H0: β1 = β2 = ... = βk = 0
 i.e., all of the slope coefficients are simultaneously zero.
 HA: not H0
 i.e., at least one, if not more slope coefficients,
are nonzero.
 Note: It does not indicate which one or ones
of the coefficients are nonzero.
The Critical F
 As with the t-statistic, you must compare the
actual value of F with its critical value (Fc):
 Actual value from EVIEWS, or
Fk-1, n-k = [r²/(k-1)] / [(1-r²)/(n-k)]
 FC must be looked up in a table, using the
appropriate degrees of freedom for the
numerator (k-1) and the denominator (n-k)
 Table on Page 751 (10%), 752 (5%) and 753
(1%) of Hirschey
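A sketch of the formula above with hypothetical numbers, using scipy for the critical value instead of the printed table:

from scipy import stats

r2, n, k = 0.85, 100, 3                                 # hypothetical fit, sample size, coefficients
f_stat = (r2 / (k - 1)) / ((1 - r2) / (n - k))
f_crit = stats.f.ppf(1 - 0.05, dfn=k - 1, dfd=n - k)    # 5% critical F
print(f_stat, f_crit, f_stat > f_crit)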
The F-Test
 If FFC then you reject H0 (all
coefficients not equal to zero)
 If F<FC then you fail to reject H0

 Look at an Eviews example


Specification Errors
 Suppose that you make a mistake in your choice of independent variables. There are 2 possibilities:
 You omit an important variable
 You include an extraneous variable
 There are consequences in both cases.
Omitting an Important Variable
 Suppose your true regression model is:
Qt = β0 + β1Pt + β2It + εt
 Suppose you specify the model as:
Qt = β0 + β1Pt + εt*
 Thus, the error term of the misspecified model captures the influence of income, It.
εt* = β2It + εt
Consequences
 Prevents you from getting a
coefficient for income
 Causes bias in the price estimate
 Violates classical assumption of error
term not being correlated with an
explanatory (independent) variable
Inclusion of an Irrelevant
Variable
 A variable that is included but does not belong in your model also has consequences.
 Does NOT bias the other coefficients.
 Lowers t-scores of other coefficients (so you might wrongly conclude they are insignificant)
 Will raise r2 but will likely decrease the
adjusted r2 (help you identify)
Example
 Annual Consumption of Chicken
 Y = consumption of chicken, PC = price of
chicken, PB = price of beef, I = disposable
income
 Y^ = 31.5 – 0.73PC + 0.11PB + 0.23I
 PC t-stat = -9.12, PB t-stat = 2.50, I t-stat =
14.22
 Adjusted r2 = 0.986
 Interpretation?
Example
 Add interest rate to the equation, R
 Y^ = 30 – 0.73PC + 0.12PB + 0.22I + 0.17R
 PC t-stat = -9.10, PB t-stat = 2.08, I t-stat = 11.05, R t-stat = 0.82
 Adjusted r2 = .985
 Lowers t-stats and adjusted r²
 The low t-stat on R suggests dropping it, and so does the fall in adjusted r²
How do you decide whether a
variable should be included?
 Trial and Error – Many EVIEWS runs!
 Start with THEORY!
 Use your judgement here!
 If theory does not provide a clear answer, then:
 Look at t-test
 Look at adjusted r2
 Look at whether other coefficients appear to be
biased when you exclude the variable from the
model.
Inclusion of Lagged Variables
 Some independent variables influence demand
with a lag.
 For example, advertising may primarily influence
demand in the following month, rather than the
current month.
 Thus, Qt = β0 + β1Pt + β2It + β3At-1 + εt
 When there is a good reason to suspect a lag
(i.e., when theory suggests a lagged
relationship), you can investigate this option.
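A small sketch (hypothetical monthly data; pandas assumed available) of building the lagged advertising variable, since a lag is just a one-period shift:

import pandas as pd

df = pd.DataFrame({
    "sales": [100, 104, 98, 110, 115],
    "advertising": [20, 25, 22, 30, 28],
})
# A(t-1): last period's advertising explains this period's sales
df["advertising_lag1"] = df["advertising"].shift(1)
print(df)   # the first lagged value is NaN and is dropped before estimation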
Eviews Lagged Variable
 Unemployment in previous time periods
important to current demand for Taurus?
Functional Form

 Don’t forget the constant term – no meaning


but required for classical assumptions
 Linear Form
 Double Log Form
 There are many others that we won’t discuss
Linear Form
 Y = β0 + β1X + ε
 What we have looked at thus far
 Constant slope is assumed
 ΔY/ΔX = β1
Double-Log Form
 Second most common
 Natural log of Y is the dependent variable and natural logs of the X’s are the independent variables
 lnY = β0 + β1lnX1 + β2lnX2 + ε
 Elasticities of the model are constant
 ElasticityY,X1 = %ΔY / %ΔX1 = β1 = constant
 Interpretation of coefficients: if X1 increases by 1% while the other X2 is held constant, Y will change by β1%
 Can’t be any negative or 0 observations in your data set (natural log not defined)
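A hedged sketch of the double-log form (made-up constant-elasticity data; numpy and statsmodels assumed available): take logs of both sides, run the same OLS, and read the slope directly as an elasticity:

import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(1)
price = rng.uniform(2, 10, 50)
# Hypothetical constant-elasticity demand: Q = 100 * P^(-1.2) * noise
quantity = 100 * price**(-1.2) * np.exp(rng.normal(0, 0.05, 50))

X = sm.add_constant(np.log(price))
model = sm.OLS(np.log(quantity), X).fit()
print(model.params[1])   # slope is the price elasticity, roughly -1.2, constant across the data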
Violations of the Classical
Model
 Multicollinearity
 Serial Correlation
 Others
Problem of Multicollinearity
 Recall the CLR assumption that the
independent variables are not perfectly
correlated with each other
 This is called “perfect multicollinearity”
 Easy to detect
 OLS cannot estimate parameters in this
situation (put in the same independent twice
and Eviews can’t do it)
 Look at problem of imperfect
multicollinearity
Imperfect Multicollinearity
 This occurs when two or more
independent variables are highly, but
not perfectly correlated with each
other!
 If this is severe enough, it can influence the
estimation of the ’s in the model.
How to Detect the Problem
 There are some formal tests
 Beyond scope of this course
 Look for the tell-tale signs of the problem:
 High adjusted r2, high F-statistics, and low t-
scores on suspected collinear variables.
 Eviews example with Taurus
Remedies

 Possibly do nothing!
 If t-scores are at or near significance
levels, you may want to “live with it”.
 Drop one or more collinear variables.
 Let the remaining variable pick up the joint
impact.
 This is ok if you have redundancies.
Remedies - continued
 Form a new variable:
 e.g., if income and population are correlated,
you could form per capita income: I/Pop.
 Other solutions I can help with on your projects
The Problem of
Serial Correlation
 The fourth assumption of the CLR model
is:
“Observations of the error term are
uncorrelated with each other”
 When this is not satisfied, we have a
problem known as serial correlation.
Examples of Serial Correlation
 Positive Serial Correlation
 Negative Serial Correlation
 [Figure: two P–Q scatter plots, one showing positively serially correlated residuals, the other negatively serially correlated residuals]
Consequences of Serial
Correlation
 Pure serial correlation does not bias the
estimates
 Serial correlation tends to distort t-scores
 Serial correlation results in a pattern of
observations in which OLS gives a better
fit to the data than would be obtained in
the absence of the problem (t scores
higher).
 Uses error to explain dependent variable
QUESTION:
Why is this a problem?
 This suggests that t-statistics are
overestimated!
 Type I error: You may falsely reject the
null hypothesis, when it is in fact true.
 Neither F-statistics nor t-statistics can be trusted in the presence of serial correlation.
Detection:
The Durbin-Watson d-test

 This is a test for first order serial


correlation
 This is the most common type in economic
models.
 Note that there are other tests (Q-test,
Breusch-Godfrey LM test), but we will not
cover them here.
 The d-statistic is derived from the
regression residuals (e).
Theoretical range of d-statistic
 If there is perfect positive serial correlation then
d=0.
 If there is perfect negative serial correlation then
d=4.
 If there is no serial correlation, then d=2
 Check this statistic in Eviews on your project
 If near 2 no problem, if different than 2 then …
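A sketch of computing d from a set of regression residuals (the residuals here are hypothetical; statsmodels also reports d automatically):

import numpy as np
from statsmodels.stats.stattools import durbin_watson

# Hypothetical OLS residuals; the long runs of same-signed values hint at positive serial correlation
residuals = np.array([1.2, 0.9, 0.7, 0.4, -0.2, -0.6, -0.9, -1.1, -0.5, 0.1])
d = durbin_watson(residuals)    # sum of (e_t - e_{t-1})^2 divided by sum of e_t^2
print(d)                        # near 2 = no problem; well below 2 suggests positive serial correlation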
Correction for Serial Correlation
using GLS
 Adding an autoregressive term solves the serial correlation problem
 Details are outside the scope of class
 Soviet Defense spending model
 If your original regression model was:
LS SDH C USD SY SP
DW=0.62 a problem

 Simply add an AR(1) term to your command line:


LS SDH C USD SY SP AR(1)
DW=1.97 problem solved
Summary Steps for Project
 Think about theoretical model: what independent
variables make sense based on theory? (already
doing this)
 Collect data and examine it (already doing this)
 Choose a functional form (likely linear)
 Run regression models in Eviews
 Examine adjusted r2, t-stats, F-stat and exclude or
include variables based on these and theory
 Do you need lagged variables?
 Look for evidence of (and correct for)
multicollinearity or serial correlation
Summary Steps for Project
 Interpret your results
 Use model to forecast demand (next topic)
 I’ll do a “sample project” next time using
the Taurus data
Homework Continued
 Interpret Adjusted r2
 Which is a better measure of overall fit? Why?
 Is F-Stat Significant? What does it mean?
 Any evidence of serial correlation?
 How could this be corrected for?
 Estimate the equation as a log-log model
 Interpret the results
 Is beef a normal good?
 Is demand elastic or inelastic (for price and
income)?
ORDINARY LEAST SQUARES ESTIMATORS (OLS)
 The OLS estimation method estimates the values of the parameters by minimising the sum of the squared error terms.
 Let the theoretical econometric model be
 Yt = β0 + β1Xt + εt
 With the usual assumptions about ε.
 ε is an unknown random (stochastic) disturbance term. However, statistical estimation requires regression of Y on X and estimating β0, β1 and residuals – particular realised values of ε. To distinguish them from the unknown disturbance ε, the sample residuals are termed errors and denoted by e. Thus, the empirical model is
 Yt = β̂0 + β̂1Xt + et
 et = Yt – (β̂0 + β̂1Xt)
 = Actual Value – Predicted Value
 = (Yt – Ŷt)
ORDINARY LEAST SQUARES ESTIMATORS (OLS)
 The method of least squares minimises the sum
of squared residuals (or errors), SSE or Error Sum
of Squares (ESS).
 This means that we are minimising the sum of
the squares of the vertical distances from the line
of regression. Alternatively, we could have
minimised the absolute sum of vertical distances
or sum of squares of the horizontal distances or
perpendicular distances (orthogonal estimators).
OLS confines to the minimisation of the sum of
squares of the vertical distances.
 i.e., it minimises Σ(t=1 to n) et² = Σ(t=1 to n) (Yt – β̂0 – β̂1Xt)²
ORDINARY LEAST SQUARES ESTIMATORS (OLS)
 SSE is to be minimised with respect to the parameters β0 and β1. We have to choose those values of β0 and β1 which will give as close a fit to the data as is possible with this specification. Let these values be β̂0 and β̂1.
 We see that the standard model, or the ordinary least squares (OLS) model as it is more popularly called, and the classical linear regression model [OLS with the additional assumption that εt ~ N(0, σ²)] have 4 parameters: β0 and β1, the parameters of linear dependence, and E(εt) and σ², the parameters of the probability distribution of ε. However, the assumption that E(εt) = 0, i.e., randomness of the disturbance, is not very restrictive. Let the expected value of εt be non-zero, say k; then
ORDINARY LEAST SQUARES ESTIMATORS (OLS)
 E(Yt | Xt) = β0 + β1Xt + E(εt)
 = β0 + β1Xt + k
 = (β0 + k) + β1Xt
 so a non-zero mean of the disturbance is simply absorbed into the intercept.
 The first-order conditions for the minimisation are
 ∂ESS/∂β̂0 = 0 and ∂ESS/∂β̂1 = 0,
 where ESS = Σt et² = Σt [Yt – β̂0 – β̂1Xt]² and β̂0, β̂1 are the estimated values of the parameters.
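A minimal sketch (hypothetical data) of solving those first-order conditions: setting the two derivatives to zero gives the normal equations, which can be solved directly for the estimates:

import numpy as np

# Hypothetical observations
x = np.array([2.0, 4.0, 6.0, 8.0, 10.0])
y = np.array([5.1, 4.4, 3.2, 2.8, 1.9])

# Normal equations in matrix form: (X'X) b = X'y, with a column of ones for the intercept
X = np.column_stack([np.ones_like(x), x])
b0_hat, b1_hat = np.linalg.solve(X.T @ X, X.T @ y)
print(b0_hat, b1_hat)   # the values that minimise ESS = sum of (Y_t - b0 - b1*X_t)^2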
