• We have developed the simple regression model in which we included only an intercept term and
• Oftentimes we would think that this is rather naive because we are not able to explain much of
the variation in Y and we may have theoretical justification for including other variables.
• We have hinted already at the idea that we would include more than one right-hand side variable
in a model. Indeed, we often have any number of regressors on the right-hand side.
• To this end, we develop the OLS model with more than one rhs variable. To make the notation
$$Y = X\beta + \epsilon$$
• We can only include observations with fully defined values for each Y and X.
7. #6 implies that the error terms have no covariance: $E[\epsilon_i \epsilon_j] = 0 \;\; \forall \; i \neq j$.
• The linear estimator for β, denoted β̂, is found by minimizing the sum of squared errors over β,
where
$$SSE = \sum_{i=1}^{N} \epsilon_i^2 = \sum_{i=1}^{N} (y_i - X_i\beta)^2$$
or in matrix notation
• We know that $\epsilon'\epsilon = y'y - 2\beta'X'y + \beta'X'X\beta$.
• The second term is clearly linear in β since $X'y$ is a k-element vector of known scalars, whereas
$$f(\beta) = a'\beta = a_1\beta_1 + a_2\beta_2 + \cdots + a_k\beta_k = \beta'a$$
where $a = X'y$.
• Taking partial derivatives with respect to each of the βi and arranging the results in a column
vector yields
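$$\frac{\partial(a'\beta)}{\partial\beta} = \frac{\partial(\beta'a)}{\partial\beta} = \begin{bmatrix} a_1 \\ a_2 \\ \vdots \\ a_k \end{bmatrix} = a$$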
$$\frac{\partial(2\beta'X'y)}{\partial\beta} = 2X'y$$
• The quadratic term can be rewritten as $\beta'A\beta$ where the matrix A is of known constants, i.e.,
$X'X$. We can write this as
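$$\begin{aligned}
f(\beta) &= \begin{bmatrix} \beta_1 & \beta_2 & \beta_3 \end{bmatrix}
\begin{bmatrix} a_{11} & a_{12} & a_{13} \\ a_{21} & a_{22} & a_{23} \\ a_{31} & a_{32} & a_{33} \end{bmatrix}
\begin{bmatrix} \beta_1 \\ \beta_2 \\ \beta_3 \end{bmatrix} \\
&= a_{11}\beta_1^2 + a_{22}\beta_2^2 + a_{33}\beta_3^2 + 2a_{12}\beta_1\beta_2 + 2a_{13}\beta_1\beta_3 + 2a_{23}\beta_2\beta_3
\end{aligned}$$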
$$\frac{\partial(\beta'A\beta)}{\partial\beta} = 2A\beta$$
for any symmetric A. From our SSE, we have $A = X'X$ and substituting we obtain
$$\frac{\partial(\beta'X'X\beta)}{\partial\beta} = 2(X'X)\beta.$$
$$\frac{\partial SSE}{\partial\beta} = -2X'y + 2X'X\beta = 0$$
• This is a matrix version of the simple regression model. There is one first-order necessary
condition (FONC) for every parameter to be estimated.
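The gradient of the SSE can be checked numerically. The following is a minimal sketch, assuming NumPy is available and using simulated data (the sample size, seed, and trial value of β are arbitrary choices, not taken from the notes). It compares the analytic expression −2X'y + 2X'Xβ with a central finite-difference approximation.

```python
import numpy as np

# Simulated data (hypothetical): 20 observations, a constant and two regressors.
rng = np.random.default_rng(0)
X = np.column_stack([np.ones(20), rng.normal(size=(20, 2))])
y = rng.normal(size=20)
beta = np.array([0.5, -1.0, 2.0])  # arbitrary point at which to evaluate the gradient


def sse(b):
    """Sum of squared errors (y - Xb)'(y - Xb)."""
    resid = y - X @ b
    return resid @ resid


# Analytic gradient from the derivation: -2X'y + 2X'X beta.
analytic = -2 * X.T @ y + 2 * X.T @ X @ beta

# Central finite differences, one coordinate of beta at a time.
h = 1e-6
numeric = np.array([
    (sse(beta + h * e) - sse(beta - h * e)) / (2 * h)
    for e in np.eye(3)
])

print(analytic)
print(numeric)
```

The two printed vectors should agree to several decimal places; because the SSE is quadratic in β, the only discrepancy comes from floating-point rounding.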
• We solve these k first-order conditions by dividing by 2, taking $X'y$ to the right-hand side and
solving for β.
• Unfortunately, we cannot divide when it comes to matrices, but we do have the matrix analogue
• The first two matrices on the left hand side cancel each other out to become the identity matrix,
$$\hat\beta = (X'X)^{-1}X'y$$
• Note that the matrix-notation version of β̂ is very analogous to the scalar version derived in the
• $(X'X)^{-1}$ is the matrix analogue to the denominator of the simple regression estimator β̂, $\sum x_i^2$.
• Likewise $X'y$ is the matrix analogue to the numerator of the simple regression estimator β̂, $\sum x_i y_i$.
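The estimator β̂ = (X'X)⁻¹X'y can be computed directly. The sketch below assumes NumPy and uses a small made-up data set with no error term, so the estimator should recover the coefficients used to build y. It solves the normal equations (X'X)b = X'y with `np.linalg.solve`, which is generally preferred numerically to forming the inverse, and then confirms that the explicit-inverse formula gives the same answer.

```python
import numpy as np

# Hypothetical data: N = 6 observations, a constant plus two regressors.
X = np.array([
    [1.0, 2.0, 1.0],
    [1.0, 3.0, 0.0],
    [1.0, 5.0, 2.0],
    [1.0, 7.0, 1.0],
    [1.0, 8.0, 3.0],
    [1.0, 9.0, 2.0],
])
beta_true = np.array([1.0, 2.0, -0.5])  # illustrative coefficients
y = X @ beta_true                        # no error term, so OLS recovers beta_true

XtX = X.T @ X   # k x k matrix, the analogue of sum(x_i^2)
Xty = X.T @ y   # k-vector, the analogue of sum(x_i * y_i)

# Solve the normal equations (X'X) b = X'y.
beta_hat = np.linalg.solve(XtX, Xty)

# Same estimate from the textbook formula (X'X)^{-1} X'y.
beta_hat_inv = np.linalg.inv(XtX) @ Xty

print(beta_hat)
print(beta_hat_inv)
```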
• We look again at the first two moments of β̂ in matrix form: E[β̂] and var(β̂).
E[β̂] = β + 0
E[β̂] = β
• The cov(β̂) is found by taking $E[(\hat\beta - \beta)(\hat\beta - \beta)']$. This leads to the following:
$$\operatorname{cov}(\hat\beta) = \sigma^2 (X'X)^{-1}$$
• We use the fitted residuals $\hat\epsilon$ and adjust for the appropriate degrees of freedom:
$$\hat\sigma^2 = \frac{\hat\epsilon'\hat\epsilon}{N - k}$$
where k is the number of right-hand side variables (including the constant term).
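As an illustration of the variance estimator, the sketch below (assuming NumPy and simulated data; the sample size, true coefficients, and error standard deviation are arbitrary) computes the fitted residuals, σ̂² with the N − k degrees-of-freedom adjustment, the estimated covariance matrix σ̂²(X'X)⁻¹, and the standard errors as the square roots of its diagonal.

```python
import numpy as np

# Simulated data (hypothetical): N = 200, k = 3 including the constant.
rng = np.random.default_rng(1)
N, k = 200, 3
X = np.column_stack([np.ones(N), rng.normal(size=(N, k - 1))])
beta = np.array([1.0, 2.0, -0.5])
y = X @ beta + rng.normal(scale=1.5, size=N)  # true sigma^2 = 2.25

XtX_inv = np.linalg.inv(X.T @ X)
beta_hat = XtX_inv @ X.T @ y

# Fitted residuals and the degrees-of-freedom-adjusted variance estimate.
resid = y - X @ beta_hat
sigma2_hat = (resid @ resid) / (N - k)

# Estimated covariance matrix of beta_hat and the implied standard errors.
cov_beta_hat = sigma2_hat * XtX_inv
std_err = np.sqrt(np.diag(cov_beta_hat))

print(sigma2_hat)
print(std_err)
```

The standard errors reported in regression output are these square roots of the diagonal elements.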
• Having shown that $E[\hat\beta] = \beta$ and $\operatorname{cov}(\hat\beta) = \sigma^2 (X'X)^{-1}$ we move to prove the Gauss-Markov
Theorem.
• The Gauss-Markov Theorem states that β̂ is BLUE or Best Linear Unbiased Estimator. Our
estimator is the "best" because it has the minimum variance of all linear unbiased estimators.
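• For $E[\tilde\beta] = \beta$ it must be true that
$$E[\tilde\beta] = E[C'X\beta + C'\epsilon] = C'X\beta$$
equals β for every β, so unbiasedness requires $C'X = I$.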
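Lemma: $\tilde\beta = \hat\beta + [C' - (X'X)^{-1}X']y$
Proof: $\tilde\beta = C'y$, thus
$$\begin{aligned}
\tilde\beta &= C'y + (X'X)^{-1}X'y - (X'X)^{-1}X'y \\
&= \hat\beta + [C' - (X'X)^{-1}X']y
\end{aligned} \tag{1}$$
Lemma: $\tilde\beta = \hat\beta + [C' - (X'X)^{-1}X']\epsilon$
$$\begin{aligned}
\tilde\beta &= \hat\beta + [C' - (X'X)^{-1}X'][X\beta + \epsilon] \\
&= \hat\beta + C'X\beta - \beta + [C' - (X'X)^{-1}X']\epsilon \\
&= \hat\beta + [C' - (X'X)^{-1}X']\epsilon
\end{aligned}$$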
• With these two lemmas we can continue to prove the Gauss-Markov theorem. We have determined
that both β̂ and β̃ are unbiased. Now, we must prove that cov(β̂) ≤ cov(β̃).
Now, take advantage of our lemmas and that β̃ is unbiased to obtain
• The matrix $[C' - (X'X)^{-1}X'][C' - (X'X)^{-1}X']'$ is non-negative semi-definite. This is the matrix
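The Gauss-Markov argument can be illustrated numerically. The sketch below is an assumption-laden example, not part of the proof: it uses NumPy, simulated regressors, and σ² = 1. It builds an alternative linear unbiased estimator with weights C' = (X'X)⁻¹X' + D, where D is constructed so that DX = 0 (hence C'X = I), and checks that cov(β̃) − cov(β̂) equals σ²DD', whose eigenvalues are non-negative.

```python
import numpy as np

rng = np.random.default_rng(2)
N, k = 30, 3
sigma2 = 1.0  # assumed error variance for the illustration
X = np.column_stack([np.ones(N), rng.normal(size=(N, k - 1))])

A = np.linalg.inv(X.T @ X) @ X.T        # OLS weights (X'X)^{-1} X'
M = np.eye(N) - X @ A                   # residual-maker matrix, M X = 0
D = 0.1 * rng.normal(size=(k, N)) @ M   # arbitrary D satisfying D X = 0
C_t = A + D                             # C' for the alternative estimator

# Covariance matrices of the two linear unbiased estimators.
cov_ols = sigma2 * np.linalg.inv(X.T @ X)
cov_alt = sigma2 * C_t @ C_t.T

# The difference should equal sigma^2 * D D', which is positive semi-definite.
diff = cov_alt - cov_ols
eigvals = np.linalg.eigvalsh((diff + diff.T) / 2)

print(np.allclose(C_t @ X, np.eye(k)))  # unbiasedness condition C'X = I
print(eigvals)
```

Here D plays the role of C' − (X'X)⁻¹X' in the notes, so DD' is the matrix whose non-negative definiteness delivers cov(β̂) ≤ cov(β̃).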
• Is β̂ consistent?
• Assume
$$\lim_{N \to \infty} \frac{1}{N}(X'X) = Q_{xx}, \quad \text{which is nonsingular}$$
• Theorem: $\operatorname{plim} \hat\beta = \beta$.
Then
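$$\lim_{N \to \infty} \operatorname{cov}(\hat\beta) = \lim_{N \to \infty} \frac{\sigma^2}{N} Q_{xx}^{-1} = 0$$
which implies that the covariance matrix of β̂ collapses to zero, which then implies
$$\operatorname{plim} \hat\beta = \beta$$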
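A small simulation can make the consistency result concrete. This is a sketch under assumed values (NumPy, one simulated regressor, σ² = 4, and three arbitrary sample sizes): as N grows, the trace of σ²(X'X)⁻¹ shrinks roughly in proportion to 1/N and β̂ settles near the true β.

```python
import numpy as np

rng = np.random.default_rng(3)
sigma2 = 4.0                  # assumed error variance
beta = np.array([1.0, 2.0])   # assumed true coefficients


def ols_and_cov(N):
    """Simulate N observations; return beta_hat and sigma^2 (X'X)^{-1}."""
    X = np.column_stack([np.ones(N), rng.normal(size=N)])
    y = X @ beta + rng.normal(scale=np.sqrt(sigma2), size=N)
    XtX_inv = np.linalg.inv(X.T @ X)
    return XtX_inv @ X.T @ y, sigma2 * XtX_inv


results = {N: ols_and_cov(N) for N in (100, 10_000, 1_000_000)}
traces = [np.trace(cov) for _, cov in results.values()]

for N, (b_hat, cov) in results.items():
    print(N, b_hat, np.trace(cov))
```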
• Some express concern that there might be price manipulation in the retail gasoline market. To
see if this is true, monthly price, tax, and cost data were gathered from the Energy Information
• Here is a time plot of the retail and wholesale price of gasoline (U.S. Average)
[Figure: time plot of gasprice and wprice; vertical axis runs from 50 to 250.]
------------------------------------------------------------------
gasprice | Coef. Std. Err. t P>|t| [95% Conf. Int]
--------------+---------------------------------------------------
fedtax | 1.268 .159 7.94 0.000 .953 1.583
avestatetax | .725 .203 3.57 0.000 .325 1.125
wholesaleprie | 1.091 .011 92.62 0.000 1.068 1.115
trend | .033 .009 3.62 0.000 .015 .051
_cons | 5.281 2.698 1.96 0.051 -.031 10.594
-----------------------------------------------------------------
• The dependent variable is measured in pennies per gallon, as are all independent variables.
1. For every penny in federal tax, the retail gasoline price increases by 1.268 pennies.
2. For every penny in state sales tax the price increases by only 0.725 cents.
3. For every penny in wholesale price, the retail price increases by 1.091 pennies.
4. The time trend, which advances by one unit for every month starting in January 1985,
indicates that the average real price of gasoline increases by about 0.03 cents per gallon per
5. The multiple regression results do not suggest a tremendous amount of pricing power on the
6. The R² is very high; approximately 99.2% of the variation in retail gasoline prices is
explained by the variables included in the model (although it should be noted that the data
7. To return to the conspiracy theory that prices are actively manipulated by retailers, the
95% confidence interval of the wholesale price parameter is [1.068, 1.115]. At the maximum,
the historical pre-tax markup on marginal cost at the retail level is approximately 11.5%,
8. One other conclusion is that while wholesale price increases are associated with retail price
increases, it is also true that wholesale price decreases are associated with retail price de-
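The markup figure in item 7 is simple arithmetic on the wholesale price coefficient. The sketch below copies the point estimate and confidence bounds from the regression table and treats a coefficient of 1.0 as one-for-one pass-through, so that the excess over 1.0 is read as a percentage markup; that reading follows the notes' interpretation and is not a separate result.

```python
# Figures copied from the wholesale price row of the regression table.
coef = 1.091
ci_low, ci_high = 1.068, 1.115

# A coefficient of 1.0 would mean one-for-one pass-through of the wholesale
# price to the retail price; the excess over 1.0 is read here as a markup.
markup_point = (coef - 1.0) * 100
markup_low = (ci_low - 1.0) * 100
markup_high = (ci_high - 1.0) * 100

print(f"point estimate: {markup_point:.1f}%")
print(f"95% interval:   {markup_low:.1f}% to {markup_high:.1f}%")
```

The upper bound of the interval is the source of the "approximately 11.5%" maximum quoted in item 7.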
• What if we defined a dummy variable that took a value of one when the wholesale price of gasoline
declined from one month to the next and included that as an additional regressor. If the retail
market reacts symmetrically to increases and decreases in wholesale price changes, this dummy
. tsset obs
time variable: obs, 1 to 264
. sum wpdown
. reg allgradesprice fedtax avestatetax wprice wpdown obs
------------------------------------------------
gasprice | Coef. Std. Err. t P>|t|
-------------+----------------------------------
fedtax | 1.328 .132 10.00 0.000
avestatetax | .741 .168 4.39 0.000
wprice | 1.101 .009 112.04 0.000
wpdown | 3.777 .348 10.84 0.000
obs | .027 .007 3.63 0.000
_cons | 2.245 2.258 0.99 0.321
-------------------------------------------------
• The historical data suggest that the price of gasoline is 3.777 cents higher in months when the
wholesale price declines. This suggests that there is an asymmetric effect of wholesale price
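For readers who want to see how a dummy such as wpdown could be built, here is a minimal sketch in plain Python. The price series is made up, and coding the first month as 0 is an assumption (it has no previous month to compare against and could instead be treated as missing).

```python
# Hypothetical wholesale prices (cents per gallon) for six consecutive months.
wprice = [60.2, 61.5, 59.8, 59.8, 63.1, 62.0]

# wpdown[t] = 1 if the wholesale price fell from month t-1 to month t, else 0.
# The first month has no predecessor; it is coded 0 here, which is an
# assumption (it could instead be treated as missing and dropped).
wpdown = [0] + [int(curr < prev) for prev, curr in zip(wprice, wprice[1:])]

print(wpdown)
```

With this series the dummy equals 1 in the third and sixth months only; a month with an unchanged price is coded 0 because the price did not decline.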
• Many in information-providing industries are anxious about software piracy. Many policy
suggestions have been made and the industry is pursuing legal remedies against individuals.
• However, there might be economic influences on the prevalence of software piracy. Bezmen
and Depken (2006, Economics Letters) look at the impact of various socio-economic factors on
estimated software piracy rates in the United States for 1999, 2000, and 2001.
• Consider the simple regression model in which piracy is related to per capita income (measured
in thousands):
-----------------------------------------------------------------
Robust
Piracy | Coef. Std. Err. t P>|t| [95% Conf. Interval]
--------+--------------------------------------------------------
lninc | -24.39808 3.932914 -6.20 0.000 -32.17 -16.62616
_cons | 114.1777 13.55488 8.42 0.000 87.39163 140.9638
-----------------------------------------------------------------
• As expected, states with greater income levels have lower levels of software piracy.
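The size of the income effect can be read off the coefficient with a little arithmetic. The sketch below assumes that lninc is the natural log of per capita income, which the variable name suggests but the notes do not state; under that assumption, a given percentage rise in income changes predicted piracy by the coefficient times ln(1 + percentage/100), in whatever units the piracy measure uses.

```python
import math

b_lninc = -24.39808  # coefficient on lninc, copied from the table


def piracy_change(pct_income_increase):
    """Predicted change in the piracy measure for a given % rise in income.

    Assumes lninc is the natural log of per capita income.
    """
    return b_lninc * math.log(1 + pct_income_increase / 100)


change_1pct = piracy_change(1)
change_10pct = piracy_change(10)

print(round(change_1pct, 3))
print(round(change_10pct, 3))
```

Under this reading, a 1% rise in income is associated with a fall of roughly a quarter of a unit in the piracy measure, and a 10% rise with a fall of a little over two units.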
• What if we include other factors such as Economic Freedom (from the Fraser Institute), the level
of taxation (from the Tax Foundation), unemployment (from the Bureau of Labor Statistics), and
two dummy variables to control for year 2000 and year 2001:
3. States with greater taxation (which might proxy for enforcement efforts) tend to pirate less.
5. The parameter on yr01 suggests that piracy was greater in 2001 than in 2000 or 1999.