
Lecture 4

The Bivariate Regression: Inference

Aims and Learning Objectives

By the end of this session students should be able to:
- Understand why we conduct statistical inference
- Calculate and interpret interval estimates and hypothesis tests
- Distinguish between Type I and Type II errors
- Interpret p-values

4.1 Introduction

In Lecture 2 we looked at how to calculate point estimates of the regression parameters, and in Lecture 3 at the circumstances under which these are considered to be BLUE (best linear unbiased estimators).

We also determined the probability distribution of the OLS estimators. The fact that an estimator follows a particular probability distribution is what allows us to relate the sample to the population.

4.2 Statistical Inference

Our goal, therefore, is to use the estimates from the sample to infer something about the population. For our purposes, we assume the sample data we have is our best and only information about the population.

How do we decide whether our sample estimates are close to the population parameters? Remember, each sample from the population gives us different estimates of β̂1 and β̂2, resulting in the sampling distributions discussed in Lecture 3. It is quite possible that one particular estimate comes from an unbiased distribution but is nevertheless far from the population parameter. We therefore use statistical inference, running tests on β̂2.

Recall from Lecture 3 that, if assumptions A1 to A6 hold:

    β̂2 ~ N( β2 , σ² / Σxᵢ² )

We would test the hypothesis H0: β2 = b0 versus H1: β2 ≠ b0.

We use the (Student) t-distribution:

    t = (β̂2 − β2) / se(β̂2) ~ t(n−2)

where

    se(β̂2) = √( σ̂² / Σxᵢ² )

Under H0, t has a t distribution with n−2 degrees of freedom.
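The slope estimate, its standard error, and the resulting t-statistic can be computed by hand. A minimal sketch with hypothetical data (the x and y values below are assumed for illustration; lowercase xᵢ in the lecture's formula denotes deviations from the mean):

```python
import numpy as np

# Hypothetical sample data; n = 6 observations
x = np.array([1.0, 2.0, 3.0, 4.0, 5.0, 6.0])
y = np.array([2.1, 3.9, 6.2, 7.8, 10.1, 11.9])
n = len(x)

# OLS estimates in deviation form: beta2_hat = sum(x_dev * y_dev) / sum(x_dev^2)
xd = x - x.mean()
yd = y - y.mean()
beta2_hat = (xd * yd).sum() / (xd ** 2).sum()
beta1_hat = y.mean() - beta2_hat * x.mean()

# Residual variance estimate: sigma2_hat = sum(e_i^2) / (n - 2)
resid = y - (beta1_hat + beta2_hat * x)
sigma2_hat = (resid ** 2).sum() / (n - 2)

# Standard error of the slope: se(beta2_hat) = sqrt(sigma2_hat / sum(x_dev^2))
se_beta2 = np.sqrt(sigma2_hat / (xd ** 2).sum())

# t-statistic under H0: beta2 = 0, with n - 2 degrees of freedom
t_stat = beta2_hat / se_beta2
print(beta2_hat, se_beta2, t_stat)
```

With this made-up data the slope estimate is about 1.98 with a very small standard error, so the t-statistic is far into the rejection region.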

Student-t vs. Normal Distribution

[Figure: density curves of the standard normal and of the Student-t distribution with 10 and 5 degrees of freedom.]

1. Both are symmetric bell-shaped distributions.
2. The Student-t distribution has fatter tails than the normal.
3. The Student-t converges to the normal as the sample size goes to infinity.
4. The Student-t is conditional on the degrees of freedom (df).
5. The normal is a good approximation of the Student-t for the first few decimal places when df > 30 or so.
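The convergence of the t distribution to the normal can be checked directly by comparing critical values. A small sketch using scipy (an assumed dependency; the df values are chosen for illustration):

```python
from scipy.stats import t, norm

# 97.5th percentile = two-sided 5% critical value, for various degrees of freedom
for df in (5, 10, 30, 1000):
    print(df, round(t.ppf(0.975, df), 3))

# The standard normal critical value that the t critical values converge towards
print("normal", round(norm.ppf(0.975), 3))
```

The critical value shrinks from about 2.57 at 5 df towards 1.96, and is already close to the normal value by 30 df, matching the rule of thumb above.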

4.3 Interval Estimation

A sample is never a perfect representation of the population from which it is drawn. In order to quantify the likely magnitude of the sampling error, it is often useful to specify a range of values within which we can state, with reasonable certainty, that the population parameter we are estimating should lie.

Standard deviation known (95% confidence interval):

    β̂2 − 1.96 sd(β̂2) ≤ β2 ≤ β̂2 + 1.96 sd(β̂2)

Standard deviation unknown, estimated by the standard error (95% confidence interval):

    β̂2 − t(n−2, 2.5%) se(β̂2) ≤ β2 ≤ β̂2 + t(n−2, 2.5%) se(β̂2)

General formula:

    Pr[sample estimate − critical value × standard error ≤ parameter ≤ sample estimate + critical value × standard error] = 1 − α
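A minimal sketch of the unknown-standard-deviation case, using scipy for the t critical value (the slope estimate, standard error, and sample size below are assumed values, not from the lecture):

```python
from scipy.stats import t

# Hypothetical output from a bivariate regression with n = 32 observations
beta2_hat = 0.75   # slope estimate (assumed)
se_beta2 = 0.20    # its standard error (assumed)
n = 32

# A two-sided 95% interval uses the 2.5% critical value of t(n - 2)
t_crit = t.ppf(0.975, n - 2)
lower = beta2_hat - t_crit * se_beta2
upper = beta2_hat + t_crit * se_beta2
print(round(lower, 3), round(upper, 3))
```

Since the interval excludes zero here, a two-sided test of H0: β2 = 0 at the 5% level would reject, illustrating the link between interval estimation and hypothesis testing.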

4.4 Hypothesis Testing

1. Determine the null and alternative hypotheses.
2. Specify the test statistic and its distribution as if the null hypothesis were true.
3. Select the significance level and determine the rejection region.
4. Calculate the sample value of the test statistic.
5. State your conclusion.

Step 1: State the null hypothesis, H0: the hypothesis we wish to test (e.g. H0: β2 = b0).

Step 2: State the alternative hypothesis, H1, which is true if H0 is false.
- One-sided (e.g. H1: β2 > b0 or H1: β2 < b0)
- Two-sided (e.g. H1: β2 ≠ b0)

Step 3: Select the significance level, α, of the test (typically α = 0.1, 0.05 or 0.01).

Step 4: Calculate the test statistic:

    t = (β̂2 − b0) / se(β̂2)

Step 5: Find the critical value(s) of the distribution: t(α; n−2) if one-sided, t(α/2; n−2) if two-sided.

Step 6: Apply the decision rule.
- One-sided test, H1: β2 < b0: if t ≤ −t(α; n−2), reject H0; otherwise do not reject.
- One-sided test, H1: β2 > b0: if t ≥ t(α; n−2), reject H0; otherwise do not reject.
- Two-sided test, H1: β2 ≠ b0: if |t| ≥ t(α/2; n−2), reject H0; otherwise do not reject.
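The six steps can be sketched end to end for a two-sided test. The regression output below is assumed for illustration; scipy supplies the critical value:

```python
from scipy.stats import t

# Hypothetical regression output: test H0: beta2 = 0 vs H1: beta2 != 0
beta2_hat = 0.75   # slope estimate (assumed)
se_beta2 = 0.20    # its standard error (assumed)
n = 32
b0 = 0.0           # hypothesised value under H0
alpha = 0.05       # Step 3: significance level

t_stat = (beta2_hat - b0) / se_beta2      # Step 4: test statistic
t_crit = t.ppf(1 - alpha / 2, n - 2)      # Step 5: two-sided critical value
reject = bool(abs(t_stat) >= t_crit)      # Step 6: decision rule
print(t_stat, round(t_crit, 3), reject)
```

Here |t| = 3.75 exceeds the critical value of roughly 2.04, so H0 is rejected at the 5% level.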

[Figure: density f(t) of the t distribution, showing the central region of non-rejection and a rejection region of area α/2 in each tail, beyond −t(α/2) and +t(α/2); the shaded area is the rejection region for a two-sided test.]

If the s.d. of β̂2 is known, we measure the discrepancy between the hypothetical value and the sample estimate in terms of the s.d.:

    z = (β̂2 − b0) / sd(β̂2)

5% significance test: reject H0: β2 = b0 if z > 1.96 or z < −1.96.

If the s.d. of β̂2 is not known, we measure the discrepancy in terms of the s.e.:

    t = (β̂2 − b0) / se(β̂2)

5% significance test: reject H0: β2 = b0 if t > tc or t < −tc, where tc is the 2.5% critical value of t(n−2).

Accordingly, we refer to the test statistic as a t statistic. In other respects the test procedure is much the same.

4.5 Type I and Type II Errors

Type I error: we make the mistake of rejecting the null hypothesis when it is true.

    α = Prob(rejecting H0 when it is true)

Type II error: we make the mistake of failing to reject the null hypothesis when it is false.

    β = Prob(failing to reject H0 when it is false)
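The meaning of α can be checked by simulation: if H0 is true and we test at the 5% level, we should reject in roughly 5% of repeated samples. A stdlib-only sketch (the one-sample t-test setup, sample size, and replication count are assumed for illustration):

```python
import random
import statistics

random.seed(0)
n, reps = 30, 2000
t_crit = 2.045           # 2.5% critical value of t(29), from standard tables
rejections = 0
for _ in range(reps):
    # Draw a sample for which H0: mean = 0 is TRUE
    sample = [random.gauss(0.0, 1.0) for _ in range(n)]
    mean = statistics.fmean(sample)
    se = statistics.stdev(sample) / n ** 0.5
    if abs(mean / se) >= t_crit:   # two-sided test at the 5% level
        rejections += 1
print(rejections / reps)           # empirical Type I error rate, close to 0.05
```

Every rejection in this simulation is a Type I error, since the null is true by construction; the empirical rejection rate estimates α.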

4.6 p-Values

The p-value of a test is calculated from the absolute value of the t-statistic. It provides an alternative approach to reporting the significance of regression coefficients.

The p-value is the probability, if the null hypothesis β2 = 0 were true, of obtaining a t-statistic at least as extreme as the one observed (against the alternative β2 ≠ 0). It thereby gives the exact significance level at which the test is on the margin between rejecting and not rejecting the null.

General rule: if the p-value is smaller than the chosen significance level α, then the test procedure leads to rejection of the null hypothesis (based on a two-sided test).
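The two-sided p-value is the tail probability of the t(n−2) distribution beyond |t|, doubled. A minimal sketch using scipy (the t-statistic and sample size below are assumed values):

```python
from scipy.stats import t

# Hypothetical regression output (assumed values)
t_stat = 2.50
n = 32

# Two-sided p-value: probability mass beyond |t| in both tails of t(n - 2)
p_value = 2 * (1 - t.cdf(abs(t_stat), n - 2))
print(round(p_value, 4))

# Apply the general rule at the 5% significance level
alpha = 0.05
print("reject H0" if p_value < alpha else "do not reject H0")
```

Here the p-value is below 0.05 but above 0.01, so the null is rejected at the 5% level but not at the 1% level, which is exactly the extra information a p-value conveys over a bare reject/do-not-reject verdict.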
