
Introduction to Econometrics

Ordinary Least Squares


REVIEW

Variables, constants, coefficients and parameters
• Variable: something whose magnitude can change, that is, something that can take on different values.
Examples: price, profit, revenue, cost, national income, consumption, investment, imports, exports, etc.
Usually represented by symbols such as P, π, R.
• An economic model can be solved to obtain the solution values of a certain set of variables.
– Endogenous variable: a variable whose solution value we seek from the model.
– Exogenous variable: a variable determined outside the model.
• Constant: something whose magnitude or value does not change.
• Coefficient: a constant attached to a variable, for example the 7 in 7P.
• Parameter: a constant that can be assigned different values. Usually represented by symbols such as a, b, c, or by Greek letters: α, β, γ.
Types of Variables

Data
• Qualitative or attribute (e.g., gender)
• Quantitative or numeric
– Discrete (e.g., number of children)
– Continuous (e.g., hours of sleep per night)
Qualitative Variables
• Have non-numeric characteristics.
• Examples: gender, religion, motorcycle brand, tastes, etc.
Qualitative variables are sometimes converted to numbers to make statistical calculations easier. For example, the answer “Like” is coded 1 and “Dislike” is coded 0. The coding does not change the nature of the variable.
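The coding step described above can be sketched in a few lines of Python. The answers below are made up for illustration, and the names `answers`, `codes`, and `coded` are assumptions, not part of the slides.

```python
# Hypothetical survey answers (invented for this illustration).
answers = ["Like", "Dislike", "Like", "Like", "Dislike"]

# Map each category to a numeric code: "Like" -> 1, "Dislike" -> 0.
codes = {"Like": 1, "Dislike": 0}
coded = [codes[a] for a in answers]

# The mean of a 0/1 variable is the share of "Like" answers.
# The numbers are still only labels: the variable remains qualitative.
share_like = sum(coded) / len(coded)
print(coded, share_like)
```

The 0/1 choice is a convention; swapping the codes would change the arithmetic but not the information in the variable.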
Quantitative Variables
• Information is reported numerically.
• Examples: height, weight, number of children, etc.

1. Discrete quantitative:
data obtained by counting. The data are whole numbers that are separate from one another.
Examples:
– number of students in a class: 1, 2, 3, 10, and so on,
– number of cars in Surabaya

2. Continuous quantitative:
data obtained by measuring. The data can take any value within a given range.
Examples: height, distance between home and campus, etc.
Scales of measurement
1. Nominal,
2. Ordinal,
3. Interval, and
4. Ratio.
Nominal Scale
• The simplest level of measurement for classifying data.
• Data are classified into categories that have no order, that is, they do not run from low to high or the reverse.

• Examples
– the gender variable: male and female
– color, party, location, etc.
Ordinal Scale
• Covers data that are arranged in order, but the differences between data values cannot be determined or are not meaningful.

• Examples
– the attitude variable: 3 = “agree”, 2 = “undecided / no opinion”, and 1 = “disagree”
– education (elementary school, junior high school, senior high school, university)
Interval Scale
• Similar to the ordinal level, with the addition that differences between data values are meaningful and can be determined. There is no natural zero point.
• It is also a level of measurement ordered from low to high.

• Example
the exam-grade variable: A = 86-99, B = 76-85, C = 66-75, D = 56-65
Ratio Scale
• Similar to the interval level, except that here the value zero is meaningful.
• It is also a level of measurement ordered from low to high.
• Gives information about the actual value of the respondent or object being measured.

• Example
– the exam-score variable, from 0 to 100
Getting a Feel for the Data
Introduction
• Economics suggests interesting relations, often with
policy implications, but most of the time theory
does not provide quantitative magnitudes of causal
effects.
• What is the price elasticity of cigarettes?
• What is the effect of reducing class size on student
achievement?
• What is the effect on earnings of a year of education?

Controlled experiments
• The focus of this course is the use of statistical and
econometric methods to quantify causal effects.
• Ideally, we would like to conduct a controlled
experiment with cigarette prices, class size, returns
to education, etc.
• But most of the time, we cannot do so and must use
observational (non-experimental) data.

Observational studies

• Observational studies pose major challenges: e.g. consider the estimation of returns to education…
• Confounding effects (omitted factors)
• Simultaneous causality
• “Correlation does not necessarily imply causation”

In this course…
You will…
• learn methods for estimating causal effects using
observational data;
• learn some tools that can be used for other
purposes, for example forecasting using time series
data;
• focus on applications; theory is used only as
needed to understand the “why” behind the methods;
• learn to carry out your own analysis and to evaluate
the econometric work of others.

Sounds exciting?
Stages of applied econometric research
Types of data
• Cross-sectional data
• Time-series data
• Panel data

The Simple Regression Model

• We begin our study with an extensive investigation of the estimation of a linear relationship between two variables, $Y_i$ and $X_i$, of the form:

$Y_i = \alpha + \beta X_i + u_i$

for each observation i.

Y                        X
Dependent variable       Independent variable
Explained variable       Explanatory variable
Predictand / Predicted   Predictor
Regressand               Regressor
Outcome                  Covariate
Response                 Stimulus
Controlled variable      Control variable
The Population Regression Function (PRF)
• Imagine we wish to find the relationship between fathers’ heights and sons’ heights.
• Imagine we had data on the heights of ALL the world’s fathers and sons.
• Then we could obtain the Population Regression Function!

$\text{son's height}_i = \alpha + \beta\,(\text{father's height})_i + u_i$

• But we cannot observe the heights of ALL fathers and sons in the world throughout time.
• So we have to make do with a sample!
Pearson’s sample
• The following scatter diagram shows the heights of 1,078 fathers and their full-grown sons, in England, circa 1900.
• There is one dot for each father-son pair.
So how can we draw a line?
• Using the ordinary least squares (OLS) method, we can get the sample regression line:

$\widehat{\text{son's height}}_i = \hat{\alpha} + \hat{\beta}\,(\text{father's height})_i$
Idea of OLS
• Given

• The idea is to find the $\hat{\alpha}$ and $\hat{\beta}$ (which together define a line) that minimize the sum of squared residuals, $\sum \hat{u}_i^2$.
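A minimal sketch of this idea in Python, using the standard closed-form solution for simple regression (slope = sample covariance of X and Y divided by sample variance of X; intercept = mean of Y minus slope times mean of X). The data are a small made-up sample, not Pearson's heights, and the function name `ols_fit` is an assumption for illustration.

```python
def ols_fit(x, y):
    """Return (alpha_hat, beta_hat) that minimize the sum of squared residuals."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    beta_hat = sxy / sxx
    alpha_hat = y_bar - beta_hat * x_bar
    return alpha_hat, beta_hat


def sum_sq_resid(x, y, a, b):
    """Sum of squared residuals for the line a + b*x."""
    return sum((yi - (a + b * xi)) ** 2 for xi, yi in zip(x, y))


# Hypothetical toy sample (invented for this illustration).
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.0, 4.0, 5.0, 4.0, 5.0]

alpha_hat, beta_hat = ols_fit(x, y)
fitted = [alpha_hat + beta_hat * xi for xi in x]
residuals = [yi - fi for yi, fi in zip(y, fitted)]

ssr_ols = sum_sq_resid(x, y, alpha_hat, beta_hat)
# Any other line, such as one with a slightly different slope, fits worse.
ssr_other = sum_sq_resid(x, y, alpha_hat, beta_hat + 0.1)
print(alpha_hat, beta_hat, ssr_ols, ssr_other)
```

Comparing `ssr_ols` with `ssr_other` is only a spot check against one alternative line; the closed-form solution is what guarantees the minimum.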

Fitted Values and Residuals

Hypothesis testing

Step 1:
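Step 2: Compute the test statistic

$t\text{-stat} = \dfrac{\text{observed} - \text{expected}}{SE}$

Step 3: Decide, using any one of
i) the p-value (reject the null if the two-sided p-value < α)
ii) the critical value (reject the null if |t-stat| > 1.96, the large-sample critical value for α = 0.05)
iii) the confidence interval (reject the null if the hypothesized value lies outside the 100(1 − α)% C.I.)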
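The three decision rules can be compared side by side in a short Python sketch. It assumes a large sample, so the standard normal distribution is used (which is where the 1.96 comes from). The estimate, hypothesized value, and standard error are invented numbers, and `z_test` is a hypothetical helper name.

```python
from statistics import NormalDist


def z_test(estimate, null_value, se, alpha=0.05):
    """Large-sample, two-sided test of H0: parameter == null_value."""
    std_normal = NormalDist()  # mean 0, standard deviation 1
    t_stat = (estimate - null_value) / se  # (observed - expected) / SE
    p_value = 2 * (1 - std_normal.cdf(abs(t_stat)))
    crit = std_normal.inv_cdf(1 - alpha / 2)  # about 1.96 when alpha = 0.05
    ci = (estimate - crit * se, estimate + crit * se)
    return {
        "t_stat": t_stat,
        "p_value": p_value,
        "ci": ci,
        "reject_by_p_value": p_value < alpha,
        "reject_by_critical_value": abs(t_stat) > crit,
        "reject_by_ci": not (ci[0] <= null_value <= ci[1]),
    }


# Hypothetical numbers: an estimated slope of 0.5 with SE 0.2, testing H0: slope = 0.
result = z_test(estimate=0.5, null_value=0.0, se=0.2)
print(result)
```

With the same α, the three rules give the same decision, so in practice one of them is enough. In small samples the t distribution would replace the normal approximation used here.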
The Population and Sample Regression Lines
• In general, the simple population regression line can be written as:

$E(Y \mid X_i) = \alpha + \beta X_i$  or  $Y_i = \alpha + \beta X_i + u_i$

• Taking a sample, the estimated regression line can be written as either:

$\hat{Y}_i = \hat{\alpha} + \hat{\beta} X_i$  or  $Y_i = \hat{\alpha} + \hat{\beta} X_i + \hat{u}_i$
The Stochastic Term (ui)
• 1. Randomness of human behavior
• 2. Unavailability of data
• 3. Omission of variables from the function
• 4. Wrong or imperfect specification of the
functional form
• 5. Errors of aggregation
• 6. Errors of measurement

Decomposing the Variation
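[Figure: Y plotted against X, marking $\bar{Y}$ and the deviations $(Y - \hat{Y})$, $(Y - \bar{Y})$, and $(\hat{Y} - \bar{Y})$.]

$Y_i - \bar{Y} = [\hat{\alpha} + \hat{\beta} X_i - \bar{Y}] + [Y_i - (\hat{\alpha} + \hat{\beta} X_i)]$

Total = Explained + Residual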


SST, SSE, SSR

$\sum (Y_i - \bar{Y})^2 = \sum [(\hat{\alpha} + \hat{\beta} X_i) - \bar{Y}]^2 + \sum [Y_i - (\hat{\alpha} + \hat{\beta} X_i)]^2$

SST = SSE + SSR

Total Sum of Squares (SST)
Total variation of Y around its sample average

Explained Sum of Squares (SSE)
Variation explained by the regression line

Residual (unexplained) Sum of Squares (SSR)
Variation not explained by the regression line
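The decomposition can be checked numerically. The sketch below fits the OLS line to a small made-up sample and computes the three sums of squares; the data and variable names are assumptions for illustration only.

```python
# Hypothetical toy sample (invented for this illustration).
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.0, 4.0, 5.0, 4.0, 5.0]

n = len(x)
x_bar = sum(x) / n
y_bar = sum(y) / n

# OLS estimates from the closed-form simple-regression formulas.
beta_hat = (sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
            / sum((xi - x_bar) ** 2 for xi in x))
alpha_hat = y_bar - beta_hat * x_bar
y_hat = [alpha_hat + beta_hat * xi for xi in x]

sst = sum((yi - y_bar) ** 2 for yi in y)               # total
sse = sum((fi - y_bar) ** 2 for fi in y_hat)           # explained
ssr = sum((yi - fi) ** 2 for yi, fi in zip(y, y_hat))  # residual

print(sst, sse, ssr, sse + ssr)
```

The equality SST = SSE + SSR relies on the line being the OLS fit with an intercept; for an arbitrary line the cross term does not vanish and the two sides generally differ.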

$R^2 = \dfrac{SSE}{SST} = 1 - \dfrac{SSR}{SST}, \qquad 0 \le R^2 \le 1$

• The $R^2$ can be interpreted as the fraction of the sample variation in y that is explained by x.
Note 1: in simple regression analysis, $R^2$ is the same as the square of the correlation coefficient, r.
Note 2:

$\dfrac{SSE}{SST} = 1 - \dfrac{SSR}{SST}$

$\dfrac{\sum (\hat{Y}_i - \bar{Y})^2}{\sum (Y_i - \bar{Y})^2} = 1 - \dfrac{\sum \hat{u}_i^2}{\sum (Y_i - \bar{Y})^2}$
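Notes 1 and 2 can be illustrated with a short Python sketch that computes $R^2$ both ways and compares it with the squared sample correlation. The data are made up, and all variable names are assumptions for illustration.

```python
import math

# Hypothetical toy sample (invented for this illustration).
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.0, 4.0, 5.0, 4.0, 5.0]

n = len(x)
x_bar = sum(x) / n
y_bar = sum(y) / n
sxx = sum((xi - x_bar) ** 2 for xi in x)
syy = sum((yi - y_bar) ** 2 for yi in y)  # this is SST
sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))

# OLS fit and its residuals.
beta_hat = sxy / sxx
alpha_hat = y_bar - beta_hat * x_bar
y_hat = [alpha_hat + beta_hat * xi for xi in x]
sse = sum((fi - y_bar) ** 2 for fi in y_hat)
ssr = sum((yi - fi) ** 2 for yi, fi in zip(y, y_hat))

r2_from_sse = sse / syy      # R^2 = SSE / SST
r2_from_ssr = 1 - ssr / syy  # R^2 = 1 - SSR / SST

# Sample correlation coefficient between x and y.
r = sxy / math.sqrt(sxx * syy)

print(r2_from_sse, r2_from_ssr, r ** 2)
```

The match between $R^2$ and $r^2$ is specific to regression with a single explanatory variable and an intercept.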