
OUTLINE

PART I: INDUCTIVE STATISTICS

Chapters:
• Sampling Methods and Distributions
• Statistical Estimation Theory
• Large-Sample Hypothesis Testing
• Small-Sample Hypothesis Testing
• Linear Regression and Correlation Analysis
• Multiple Regression and Correlation Analysis
• Functions, Variables, and Problems in Regression Analysis

Topics in Sampling Methods and Distributions:
• Definition of Population and Sample
• Sampling Methods
• Sampling Error
• Sampling Distribution of the Mean and Proportion
• Sampling Distribution of the Difference of Means and Proportions
• Correction Factor for Finite Populations
• Central Limit Theorem

RELATIONSHIP BETWEEN SAMPLE AND POPULATION

[Figure: population and sample]

Sampling Methods and the Central Limit Theorem

What is a sample?

[Figure: a POPULATION of items (A, B, C, D, ..., Z) from which a SAMPLE (B) is drawn]

Why Sample the Population?
• The physical impossibility of checking all items in the population.
• The cost of studying all the items in a population.
• The time-consuming aspect of contacting the whole population.
• The destructive nature of certain tests.
• The adequacy of sample results in most cases.
DEFINITION

Population
The collection of all possible people, objects, and other measurements that are the object of interest; in short, the entire set of objects under study.

• Finite population: the number of elements is limited, with size N. Examples: the population of banks, the population of mutual fund companies.
• Infinite population: a population that undergoes a continuous process, so that its values keep changing and its size N is unbounded.
DEFINITION

Sample
A part of a particular population that is the object of interest.

• Probability sample: a sample selected from the population in such a way that every member of the population has the same probability of being included in the sample.
• Nonprobability sample: a sample selected from the population in such a way that the members do not all have the same probability of being included in the sample.
SAMPLING METHODS

Probability sampling:
1. Simple random sampling
2. Stratified random sampling
3. Cluster sampling

Nonprobability sampling:
1. Systematic sampling
2. Quota sampling
3. Purposive sampling
SAMPLING METHODS

Simple Random Sample
• A sample drawn at random from the population, without regard to any strata in the population, so that every member of the population has the same chance of being included.

Simple Random Sample: A sample formulated so that each item or person in the population has the same chance of being included.

Two ways to draw a simple random sample:
• Lottery system: members are drawn by lot, in the same way as an arisan (a rotating savings group draw).
• Random number table: members are selected using a table of random numbers. A starting point in the table is fixed first.
SAMPLING METHODS

Systematic Sample
• Sampling is systematic when the members of the population are first arranged in some order (alphabetically, from largest to smallest, or the reverse), a starting point is chosen at random, and then every k-th member of the population is selected for the sample.

Systematic Random Sampling: The items or individuals of the population are arranged in some order. A random starting point is selected and then every kth member of the population is selected for the sample.
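The procedure described above can be sketched in a few lines of Python. This is only an illustration: the function name `systematic_sample`, the member names, and the choice k = 3 are hypothetical, not taken from the slides.

```python
import random

def systematic_sample(items, k, seed=None):
    """Return every k-th item of an ordered list, from a random starting point."""
    rng = random.Random(seed)
    start = rng.randrange(k)      # random starting point among the first k positions
    return items[start::k]        # then every k-th member

# Hypothetical population, arranged alphabetically before sampling.
members = sorted(["Dewi", "Andi", "Budi", "Citra", "Eka", "Fajar",
                  "Gita", "Hadi", "Indah", "Joko", "Kiki", "Lina"])
sample = systematic_sample(members, k=3, seed=42)
print(sample)
```

With 12 members and k = 3, every call returns 4 members spaced three positions apart; only the starting point is random.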
Chapter 11: Sampling Methods and Distributions
SAMPLING METHODS

Stratified Random Sample
• Stratified random sampling divides the members of the population into subgroups called strata, and a sample is then selected from each stratum.

Stratified Random Sampling: A population is first divided into subgroups, called strata, and a sample is selected from each stratum.
THE STRATIFICATION PROCESS

[Figure: an unstratified population next to the same population grouped into strata]

EXAMPLE OF STRATIFIED RANDOM SAMPLING

Stratum   Group      Members   Percent of total   Sample size per stratum
1         Circle     5         21                 2 = (0.21 × 10)
2         Triangle   7         29                 3 = (0.29 × 10)
3         Square     12        50                 5 = (0.50 × 10)
Total                24        100                10

EXAMPLE OF STRATIFIED RANDOM SAMPLING

Stratum     Members   Percent of total   Sample size per stratum
Banking     20        36                 5 = (20/55) × 15
Insurance   17        31                 5 = (17/55) × 15
Financing   9         16                 2 = (9/55) × 15
Securities  9         16                 2 = (9/55) × 15
Total       55        100                15

The percentages and per-stratum sample sizes are rounded to whole numbers, so the rounded allocations (5 + 5 + 2 + 2 = 14) fall one short of the target of 15.
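The proportional allocation used in the two tables can be reproduced with a short sketch. The function name `proportional_allocation` and the English stratum labels are illustrative assumptions; the counts come from the tables.

```python
def proportional_allocation(strata_sizes, total_sample):
    """Allocate a sample to strata in proportion to stratum size, rounded to whole units."""
    population = sum(strata_sizes.values())
    return {name: round(size / population * total_sample)
            for name, size in strata_sizes.items()}

shapes = {"Circle": 5, "Triangle": 7, "Square": 12}
finance = {"Banking": 20, "Insurance": 17, "Financing": 9, "Securities": 9}

print(proportional_allocation(shapes, 10))    # allocation for the first table
print(proportional_allocation(finance, 15))   # allocation for the second table
```

Because each stratum is rounded separately, the allocations need not add up to the target: for the second table they sum to 14 rather than 15, so in practice one unit would have to be assigned by some additional rule.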


CLUSTER SAMPLING

[Figure: a stratified sample compared with a cluster sample]

SAMPLING ERROR

Sampling error
• The difference between the value of a sample statistic and the value of the corresponding population parameter.
SAMPLING DISTRIBUTION OF THE MEAN AND PROPORTION

Sampling distribution:
The sampling distribution of the sample mean is a probability distribution consisting of all possible sample means for a given sample size drawn from the population, together with the probability associated with each sample mean.

Example: return on assets (ROA) of five banks

Bank                    ROA
Bank Lippo Tbk          2
Bank BRI Tbk            4
Maybank Indocorp Tbk    6
BPD Jawa Tengah         4
Bank BTPN               4

a. Population mean

$\mu = \frac{\sum x}{N} = \frac{2 + 4 + 6 + 4 + 4}{5} = \frac{20}{5} = 4$

b. Sample means when samples of 2 banks are drawn from the 5 banks

1. Number of possible samples

$C_n^N = \frac{N!}{n!(N-n)!} = \frac{5!}{2!(5-2)!} = \frac{5!}{2!\,3!} = 10$

2. Mean of each sample

No.   Combination              ROA     Mean
1     Lippo - BRI              2 + 4   6/2 = 3
2     Lippo - Maybank          2 + 6   8/2 = 4
3     Lippo - BPD Jateng       2 + 4   6/2 = 3
4     Lippo - BTPN             2 + 4   6/2 = 3
5     BRI - Maybank            4 + 6   10/2 = 5
6     BRI - BPD Jateng         4 + 4   8/2 = 4
7     BRI - BTPN               4 + 4   8/2 = 4
8     Maybank - BPD Jateng     6 + 4   10/2 = 5
9     Maybank - BTPN           6 + 4   10/2 = 5
10    BPD Jateng - BTPN        4 + 4   8/2 = 4

3. Mean of the sample means

$\bar{X}_{\bar{x}} = \frac{1}{C_n^N}\sum \bar{X} = \frac{1}{10}(3 + 4 + 3 + 3 + 5 + 4 + 4 + 5 + 5 + 4) = \frac{40}{10} = 4$
SAMPLING DISTRIBUTION OF THE MEAN AND PROPORTION

c. Probability distributions of the population values and of the sample means

Population                                 Sample
Value   Frequency   Probability            Mean   Frequency   Probability
2       1           1/5 = 0.20             3      3           3/10 = 0.30
4       3           3/5 = 0.60             4      4           4/10 = 0.40
6       1           1/5 = 0.20             5      3           3/10 = 0.30
Total   5           1.00                          10          1.00

[Figure: the two probability distributions drawn as polygons, population on the left and sample means on the right]
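SAMPLING DISTRIBUTION OF THE MEAN AND PROPORTION

d. Population standard deviation

$\sigma = \sqrt{\frac{\sum (X - \mu)^2}{N}}$

X    X - μ   (X - μ)²
2    -2      4
4    0       0
6    2       4
4    0       0
4    0       0

$\sum X = 20$, $\mu = 20/5 = 4$, $\sum (X - \mu)^2 = 8.0$

$\sigma = \sqrt{8/5} \approx 1.3$

Standard deviation of the sample means

$s_{\bar{x}} = \frac{\sigma}{\sqrt{n}}\sqrt{\frac{N-n}{N-1}}$

X̄    X̄ - X̄_x̄   (X̄ - X̄_x̄)²
3    -1         1
4    0          0
3    -1         1
3    -1         1
5    1          1
4    0          0
4    0          0
5    1          1
5    1          1
4    0          0

$\sum \bar{X} = 40$, $\bar{X}_{\bar{x}} = 40/10 = 4$, $\sum (\bar{X} - \bar{X}_{\bar{x}})^2 = 6.0$

$s_{\bar{x}} = \sqrt{\frac{1}{C_n^N}\sum (\bar{X} - \bar{X}_{\bar{x}})^2} = \sqrt{6/10} \approx 0.77$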
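The bank ROA example can be checked by brute force. The sketch below, using only the Python standard library, enumerates every sample of 2 banks out of 5 and compares the standard deviation of the sample means with the finite-population formula; the variable names are illustrative.

```python
from itertools import combinations
from math import sqrt
from statistics import mean, pstdev

roa = [2, 4, 6, 4, 4]          # ROA of the five banks (the whole population)
N, n = len(roa), 2

# All C(5, 2) = 10 possible samples and their means.
sample_means = [mean(pair) for pair in combinations(roa, n)]

mu = mean(roa)                                   # population mean
sigma = pstdev(roa)                              # population standard deviation
mean_of_means = mean(sample_means)               # mean of the sampling distribution
se_enumerated = pstdev(sample_means)             # its standard deviation, by enumeration
se_formula = sigma / sqrt(n) * sqrt((N - n) / (N - 1))   # with the finite correction

print(len(sample_means), mu, mean_of_means)
print(round(se_enumerated, 2), round(se_formula, 2))
```

Both routes give a standard deviation of about 0.77, and the mean of the sample means equals the population mean of 4.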
Example 1

The law firm of Hoya and Associates has five partners. At their weekly partners meeting each reported the number of hours they billed clients for their services last week.

Partner    Hours
Dunn       22
Hardy      26
Kiers      30
Malory     26
Tillman    22

If two partners are selected randomly, how many different samples are possible?

5 objects taken 2 at a time:

$_5C_2 = \frac{5!}{2!\,(5-2)!} = 10$

A total of 10 different samples.

Partners   Total   Mean
1,2        48      24
1,3        52      26
1,4        48      24
1,5        44      22
2,3        56      28
2,4        52      26
2,5        48      24
3,4        56      28
3,5        52      26
4,5        48      24

Example 1 continued

As a sampling distribution:

Sample mean   Frequency   Relative frequency (probability)
22            1           1/10
24            4           4/10
26            3           3/10
28            2           2/10

Example 1 continued

Compute the mean of the sample means. Compare it with the population mean.

The mean of the sample means:

$\mu_{\bar{X}} = \frac{22(1) + 24(4) + 26(3) + 28(2)}{10} = 25.2$

The population mean:

$\mu = \frac{22 + 26 + 30 + 26 + 22}{5} = 25.2$

Notice that the mean of the sample means is exactly equal to the population mean.

Example 1 continued
Thinking Challenge

Suppose you're interested in the average amount of money that students in this class (the population) have on them. How would you find out?

Rp. 50.000
or
Rp. 25.000 – Rp. 100.000
Terms of estimation
• Parameter
• Statistic
• Estimator
• Estimate

Estimate Population with Sample

Quantity      Parameter             Statistic
Mean          $\mu$                 $\bar{x}$
Proportion    $p$                   $\hat{p}$
Variance      $\sigma^2$            $s^2$
Differences   $\mu_1 - \mu_2$       $\bar{x}_1 - \bar{x}_2$

SAMPLING PROCESS

[Figure: a random sample is drawn from a population whose mean $\mu$ is unknown; the sample mean is $\bar{X} = 50$]

A point estimate is a single value (statistic) used to estimate a population value (parameter).

Example: The midterm scores of 50 students are available. Identify the target parameter and the point estimator if 10 students are chosen at random.

Randomly chosen sample (n = 10):

$\hat{\mu} = \bar{x} = \frac{55 + 71 + 76 + 77 + 85 + \cdots + 96}{10} = 81.5$

$\hat{\sigma}^2 = s^2 = \frac{(55 - 81.5)^2 + (71 - 81.5)^2 + \cdots + (96 - 81.5)^2}{9} = 146.056$

How about the population (N = 50)?

$\mu = \frac{39 + 48 + \cdots + 97 + 99}{50} = 79.98$

$\sigma^2 = \frac{(39 - 79.98)^2 + (48 - 79.98)^2 + \cdots + (97 - 79.98)^2 + (99 - 79.98)^2}{49} = 152.387$

Central Limit Theorem

For a population with a mean $\mu$ and a variance $\sigma^2$, the sampling distribution of the means of all possible samples of size n generated from the population will be approximately normally distributed.

This approximation improves with larger samples. The mean of the sampling distribution is equal to $\mu$ and the variance is equal to $\sigma^2/n$.

The standard error of the mean is the standard deviation of the sample means, given as:

$\sigma_{\bar{x}} = \frac{\sigma}{\sqrt{n}}$

Sample Means

Sample means follow the normal probability distribution under two conditions:
• the underlying population follows the normal distribution
OR
• the sample size is large enough, even when the underlying population may be nonnormal
Sample Means

To determine the probability that a sample mean falls within a particular region, use

$z = \frac{\bar{X} - \mu}{s/\sqrt{n}}$

Use $\sigma$ in place of s if the population standard deviation is known.

Suppose the mean selling


price of a gallon of gasoline
in the United States is $1.30.
Further, assume the
distribution is positively
skewed, with a standard
deviation of $0.28. What is
the probability of selecting a
sample of 35 gasoline
stations and finding the
sample mean within $.08?

Example 2
Example 2 continued

Step One: Find the z-values corresponding to $1.22 and $1.38. These are the two points within $0.08 of the population mean.

$z = \frac{\bar{X} - \mu}{s/\sqrt{n}} = \frac{\$1.38 - \$1.30}{\$0.28/\sqrt{35}} = 1.69$

$z = \frac{\bar{X} - \mu}{s/\sqrt{n}} = \frac{\$1.22 - \$1.30}{\$0.28/\sqrt{35}} = -1.69$

Step Two: Determine the probability of a z-value between -1.69 and 1.69.

$P(-1.69 \le z \le 1.69) = 2(.4545) = .9090$

We would expect about 91 percent of the sample means to be within $0.08 of the population mean.

Example 2 continued
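The two steps of Example 2 can be reproduced without a z table. A minimal sketch using `statistics.NormalDist` from the Python standard library; the variable names are illustrative.

```python
from math import sqrt
from statistics import NormalDist

mu, sigma, n = 1.30, 0.28, 35      # population mean, standard deviation, sample size
margin = 0.08                      # "within $0.08" of the population mean

se = sigma / sqrt(n)               # standard error of the mean
z = margin / se                    # z-value for $1.38 (and -z for $1.22)

std_normal = NormalDist()          # standard normal distribution
prob = std_normal.cdf(z) - std_normal.cdf(-z)

print(round(z, 2), round(prob, 3))
```

The computed probability is about 0.909, in line with the table-based value of 2(.4545).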
Illustration: a sample whose members are known

The grade point averages (GPA) of 15 first-year Statistics students at UGM, selected at random from one cohort, are as follows: 2.30; 2.60; 3.10; 1.90; 2.10; 3.40; 2.10; 2.70; 2.00; 1.80; 2.90; 2.40; 2.60; 3.60; and 2.10. (a) What percentage of students have a GPA below 3.00? (b) What percentage of students have a GPA above 2.90? (c) What percentage have a GPA between 2.50 and 3.80?
Finding the mean ($\bar{x}$) and the standard deviation (s)

No   Value (x)   $x - \bar{x}$   $(x - \bar{x})^2$
1    2.30        -0.20667        0.042711
2    2.60        0.093333        0.008711
3    3.10        0.593333        0.352044
4    1.90        -0.60667        0.368044
5    2.10        -0.40667        0.165378
6    3.40        0.893333        0.798044
7    2.10        -0.40667        0.165378
8    2.70        0.193333        0.037378
9    2.00        -0.50667        0.256711
10   1.80        -0.70667        0.499378
11   2.90        0.393333        0.154711
12   2.40        -0.10667        0.011378
13   2.60        0.093333        0.008711
14   3.60        1.093333        1.195378
15   2.10        -0.40667        0.165378
Total: 37.60                     Total: 4.229333

$s^2 = \frac{\sum (x - \bar{x})^2}{n - 1}$, $s = \sqrt{s^2}$

Mean: $\bar{x} = 37.60 / 15 = 2.506667$
$s^2 = 4.229333 / 14 = 0.302095$
$s = 0.549632$
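The mean and standard deviation in the table can be verified directly. A short sketch using the `statistics` module, whose `variance` and `stdev` functions use the n - 1 divisor shown above:

```python
from statistics import mean, stdev, variance

gpa = [2.30, 2.60, 3.10, 1.90, 2.10, 3.40, 2.10, 2.70,
       2.00, 1.80, 2.90, 2.40, 2.60, 3.60, 2.10]

x_bar = mean(gpa)        # sample mean
s2 = variance(gpa)       # sample variance, divisor n - 1
s = stdev(gpa)           # sample standard deviation

print(round(x_bar, 6), round(s2, 6), round(s, 6))
```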
Computing the Z-values

The three questions concern the GPAs of individual students, not sample means, so each value is standardized with the sample standard deviation itself. GPAs are assumed to be approximately normally distributed, with $\bar{x} \approx 2.51$ and $s \approx 0.55$:

$Z = \frac{x - \bar{x}}{s}$

a. Percentage of students with a GPA below 3.00, that is P(X < 3.00).

$Z = \frac{3.00 - 2.51}{0.55} = \frac{0.49}{0.55} = 0.89$

From the normal distribution table, P(Z < 0.89) = 0.8133.
So about 81.33% of UGM Statistics students have a GPA below 3.00.

b. Percentage of students with a GPA above 2.90, that is P(X > 2.90).

$Z = \frac{2.90 - 2.51}{0.55} = \frac{0.39}{0.55} = 0.71$

P(X > 2.90) = 1 - P(X ≤ 2.90) = 1 - P(Z ≤ 0.71) = 1 - 0.7611 = 0.2389, or 23.89%.

c. Percentage of students with a GPA between 2.50 and 3.80, that is P(2.50 ≤ X ≤ 3.80).

$Z = \frac{2.50 - 2.51}{0.55} = \frac{-0.01}{0.55} = -0.02$

$Z = \frac{3.80 - 2.51}{0.55} = \frac{1.29}{0.55} = 2.35$

P(2.50 ≤ X ≤ 3.80) = P(-0.02 ≤ Z ≤ 2.35) = P(Z ≤ 2.35) - P(Z ≤ -0.02) = 0.9906 - 0.4920 = 0.4986, or 49.86%.
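RELATIONSHIP BETWEEN THE SAMPLE AND POPULATION STANDARD DEVIATIONS

Relationship between $s_{\bar{x}}$ and $\sigma$ for a finite population:

$s_{\bar{x}} = \frac{\sigma}{\sqrt{n}}\sqrt{\frac{N-n}{N-1}}$

Relationship between $s_{\bar{x}}$ and $\sigma$ for an infinite population:

$s_{\bar{x}} = \frac{\sigma}{\sqrt{n}}$

SAMPLING DISTRIBUTION OF THE PROPORTION

Mean of the sample proportions:

$\bar{P}_p = \frac{1}{C_n^N}\sum p$

Standard deviation of the sample proportions:

$S_p = \sqrt{\frac{1}{C_n^N}\sum (p - \bar{P}_p)^2}$

Standard deviation of the proportion:

$S_p = \sqrt{\frac{P(1-P)}{n}}\sqrt{\frac{N-n}{N-1}}$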
SCHEME FOR THE DIFFERENCE BETWEEN POPULATIONS OR SAMPLES

[Figure: Population 1 ($\mu_1$, $\sigma_1$) yields Sample 1 with $\bar{X}_1$, $S_{x_1}$; Population 2 ($\mu_2$, $\sigma_2$) yields Sample 2 with $\bar{X}_2$, $S_{x_2}$; the question is whether $\bar{X}_1$, $\bar{X}_2$ reflect $\mu_1$, $\mu_2$]

In principle, a statistic computed from any sample of size n drawn from a population is a random variable whose distribution tends toward the normal. The distributions of the difference of means and of the difference of proportions therefore also follow the normal distribution.

Distribution of the difference of means:

$\bar{X}_{\bar{x}_1 - \bar{x}_2} = \bar{X}_1 - \bar{X}_2 = \mu_1 - \mu_2$

Distribution of the difference of proportions:

$\bar{P}_{p_1 - p_2} = \bar{P}_{p_1} - \bar{P}_{p_2} = P_1 - P_2$
SAMPLING DISTRIBUTION OF THE DIFFERENCE OF MEANS AND PROPORTIONS

Mean of the sampling distribution of the difference of means $\bar{x}_1 - \bar{x}_2$:

$\bar{X}_{\bar{x}_1 - \bar{x}_2} = \bar{X}_1 - \bar{X}_2 = \mu_1 - \mu_2$

Standard deviation of the sampling distribution of the difference of means $\bar{x}_1 - \bar{x}_2$:

$s_{\bar{x}_1 - \bar{x}_2} = \sqrt{s_{\bar{x}_1}^2 + s_{\bar{x}_2}^2} = \sqrt{\frac{s_1^2}{n_1} + \frac{s_2^2}{n_2}}$

Z-value for the sampling distribution of the difference of means:

$Z = \frac{(\bar{x}_1 - \bar{x}_2) - (\mu_1 - \mu_2)}{s_{\bar{x}_1 - \bar{x}_2}}$

Mean of the sampling distribution of the difference of proportions $p_1 - p_2$:

$\bar{P}_{p_1 - p_2} = \bar{P}_{p_1} - \bar{P}_{p_2} = P_1 - P_2$

Standard deviation of the sampling distribution of the difference of proportions $p_1 - p_2$:

$S_{p_1 - p_2} = \sqrt{S_{p_1}^2 + S_{p_2}^2} = \sqrt{\frac{P_1(1 - P_1)}{n_1} + \frac{P_2(1 - P_2)}{n_2}}$

Z-value for the sampling distribution of the difference of proportions:

$Z = \frac{(p_1 - p_2) - (P_1 - P_2)}{S_{p_1 - p_2}}$
CORRECTION FACTOR

Adjusted standard deviation for the mean:

$s_{\bar{x}} = \frac{\sigma}{\sqrt{n}}\sqrt{\frac{N-n}{N-1}}$

Adjusted standard deviation for the proportion:

$s_p = \sqrt{\frac{p(1-p)}{n}}\sqrt{\frac{N-n}{N-1}}$

SAMPLE MEAN EQUALS POPULATION MEAN, SAMPLING VARIANCE $\sigma^2/n$

• Sampling distribution:
• For a population with mean $\mu$ and variance $\sigma^2$, the distribution of the sample means over all possible samples of size n drawn from the population approaches the normal distribution, where the mean of the sampling distribution equals the population mean ($\bar{x} = \mu$) and the variance of the sampling distribution equals $\sigma^2/n$.
POINT ESTIMATION

OUTLINE

Part I: Inductive Statistics

Chapters:
• Sampling Methods and Distributions
• Statistical Estimation Theory
• Large-Sample Hypothesis Testing
• Small-Sample Hypothesis Testing
• Linear Regression and Correlation Analysis
• Multiple Regression and Correlation Analysis
• Basic Concepts of Simultaneous Equations

Topics in Statistical Estimation Theory:
• Definition, Theory, and Uses of Estimation
• Point Estimation of Parameters
• Interval Estimation
• Standard Error of the Sample Mean
• Constructing Confidence Intervals
• Confidence Intervals for the Mean and Proportion
• Confidence Intervals for the Difference of Means and Proportions
• Choosing the Sample Size

SCOPE OF STATISTICS: ESTIMATION METHODS

Estimation
• Point estimation
• Interval estimation
POINT ESTIMATION

Thinking Challenge
Suppose you're interested in the average amount of money that students in this class (the population) have on them. How would you find out?

Rp. 50.000
or
Rp. 25.000 – Rp. 100.000

Terms of estimation
– Parameter
– Statistic
– Estimator
– Estimate

Estimate Population with Sample

Quantity      Parameter             Statistic
Mean          $\mu$                 $\bar{x}$
Proportion    $p$                   $\hat{p}$
Variance      $\sigma^2$            $s^2$
Differences   $\mu_1 - \mu_2$       $\bar{x}_1 - \bar{x}_2$

What is a point estimate?

A point estimate is a single value (statistic) used to estimate a population value (parameter).

ESTIMATION PROCESS

[Figure: a random sample is drawn from a population whose mean $\mu$ is unknown; the sample mean is $\bar{X} = 50$]

Example: The midterm scores of 50 students are available. Identify the target parameter and the point estimator if 10 students are chosen at random.

Randomly chosen sample (n = 10):

$\hat{\mu} = \bar{x} = \frac{55 + 71 + 76 + 77 + 85 + \cdots + 96}{10} = 81.5$

$\hat{\sigma}^2 = s^2 = \frac{(55 - 81.5)^2 + (71 - 81.5)^2 + \cdots + (96 - 81.5)^2}{9} = 146.056$

How about the population (N = 50)?

$\mu = \frac{39 + 48 + \cdots + 97 + 99}{50} = 79.98$

$\sigma^2 = \frac{(39 - 79.98)^2 + (48 - 79.98)^2 + \cdots + (97 - 79.98)^2 + (99 - 79.98)^2}{49} = 152.387$
PROPERTIES OF ESTIMATORS

• Unbiased estimator
• Efficient estimator
• Consistent estimator

UNBIASED ESTIMATOR

• An estimator is unbiased if, for random samples drawn from the population, the mean or expected value of the sample statistic, $E(\bar{X})$, equals the population parameter ($\mu$):

$E(\bar{X}) = \mu$

Unbiased estimators

$\bar{X}$ and $S^2$ are random variables; they are examples of statistics or estimators.
$\mu$ and $\sigma^2$ are fixed constants; they are examples of parameters or targets.

$\bar{X}$ is an unbiased estimator of $\mu$ if $E(\bar{X}) = \mu$.
$S^2$ is an unbiased estimator of $\sigma^2$ if $E(S^2) = \sigma^2$.

Bias = $E(\bar{X}) - \mu$

[Figure: a sampling distribution centred at $E(\bar{X})$, with the bias shown as the distance between the true $\mu$ and $E(\bar{X})$]
EFFICIENT ESTIMATOR

An efficient estimator is an estimator that is unbiased and has the smallest variance ($s_x^2$) among the competing estimators.

[Figure: two sampling distributions centred on the same value; the narrower one has variance $s_{x_1}^2 < s_{x_2}^2$]

Efficient estimators (minimum variance)
An estimator is called efficient if its distribution is highly concentrated, that is, if it has a smaller variance than another estimator.

[Figure: estimator b with variance $\sigma_b^2 < \sigma_a^2$ is a more efficient estimator than a]

Efficiency of u relative to w = Var(w) / Var(u)


CONSISTENT ESTIMATOR

A consistent estimator is one whose estimate ($\bar{X}$) gets closer and closer to the true value $\mu$ as the sample size (n) increases.

[Figure: sampling distributions for small n, large n, very large n, and infinite n, each more tightly concentrated than the last]

Consistent estimator
One condition that makes an estimator consistent is that its bias and variance both approach zero:

$\lim_{n \to \infty} E(\bar{X}_n) = \mu$ and $\lim_{n \to \infty} Var(\bar{X}_n) = 0$

NOTE: Consistency is more abstract, because it is defined as a limit: a consistent estimator is one that concentrates in a narrower and narrower band around its target as the sample size n increases indefinitely.
Conclusion of Point Estimation

1. Provides a single value


• Based on observations from one
sample
2. Gives no information about how
close the value is to the unknown
population parameter
3. Example: the sample mean $\bar{x} = 3$ is a point estimate of the unknown population mean
EXERCISE
1. Suppose each of the 200,000 adults in a city under study has eaten a number X of fast-food meals in the past week. However, a residential phone survey on a weekday afternoon misses those who are working, the very people most likely to eat fast foods. As shown in the table below, this leaves a small population who would respond, especially small for higher values of X.

X = Number     Whole target population      Subpopulation responding
of meals       Freq. f    Rel. freq. f/N    Freq. f    Rel. freq. f/N
0              100,000    0.50              38,000     0.76
1              40,000     0.20              6,000      0.12
2              40,000     0.20              4,000      0.08
3              20,000     0.10              2,000      0.04
Total          200,000    1.00              50,000     1.00

a. Find the mean µ of the whole target population.
b. Find the mean of the subpopulation who would respond.
c. Is this estimator unbiased? Is it efficient?

2. Suppose that a surveyor is trying to determine the area of a rectangular field, in which the measured length X and the measured width Y are independent random variables that fluctuate widely about the true values, according to the following probability distribution:

X     P(X)      Y     P(Y)
8     0.25      4     0.50
10    0.25      6     0.50
11    0.50

The calculated area A = XY is of course a random variable, and is used to estimate the true area. If the true length and width are 10 and 5, respectively:
a. Is X an unbiased estimator of the true length?
b. Is Y an unbiased estimator of the true width?
c. Is A an unbiased estimator of the true area?
INTERVAL ESTIMATION

Estimate
• Point estimate: sample mean, sample proportion
• Interval estimate: confidence interval for the mean, confidence interval for the proportion

The point estimate is always within the interval estimate.

From point estimation we obtain:
Parameter = Statistic ± Error

Bias = $E(\bar{X}) - \mu$

[Figure: the bias shown as the distance between $\mu$ and $E(\bar{X})$]

$\mu = \bar{X} \pm$ Error
Error = $\bar{X} - \mu$ (in either direction)
Motivation (continued)

A point estimate of the parameter is not sufficient on its own. It is necessary to analyse how confident we can be about this particular estimate. One way of doing so is to define confidence intervals. Having estimated $\theta$, we want to know whether the "true" parameter is close to our estimate. In other words, we want to find an interval that satisfies the following relation:

$P(L \le \theta \le U) = 1 - \alpha$

[Figure: a confidence interval running from the lower confidence limit L to the upper confidence limit U, with the sample statistic (point estimate) inside it; the distance from L to U is the width of the confidence interval]

$1 - \alpha$ is the probability that the population parameter falls somewhere within the interval.

• A point estimate is a single number. How much uncertainty is associated with a point estimate of a population parameter?
• An interval estimate provides more information about a population characteristic than does a point estimate. It provides a confidence level for the estimate. Such interval estimates are called confidence intervals.
INTERVAL ESTIMATION FORMULA

$P(S - Z s_x < P < S + Z s_x) = C$

S : the statistic that estimates the population parameter (P)
P : the unknown population parameter
$s_x$ : the standard deviation of the sampling distribution of the statistic
Z : a value determined by the probability associated with the interval estimate; Z is obtained from the table of areas under the normal curve
C : the probability or confidence level, which in practice is fixed in advance
$S - Z s_x$ : the lower confidence limit
$S + Z s_x$ : the upper confidence limit
What is interval estimation?

Interval estimation gives an interval between two statistics within which a population parameter probably lies.

$L \le \mu \le U$, with $\bar{X} = \mu \pm Z\sigma_{\bar{x}}$

[Figure: sampling distribution of $\bar{x}$ centred at $\mu$; 90% of sample means lie within $\pm 1.65\sigma_{\bar{x}}$, 95% within $\pm 1.96\sigma_{\bar{x}}$, and 99% within $\pm 2.58\sigma_{\bar{x}}$]

A confidence interval is a range of values within which the population parameter is expected to occur.

The two confidence intervals that are used extensively are the 95% and the 99%.
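Unknown Population Parameters Are Estimated

Estimate Population with Sample

Quantity      Parameter             Statistic
Mean          $\mu$                 $\bar{x}$
Proportion    $p$                   $\hat{p}$
Variance      $\sigma^2$            $s^2$
Differences   $\mu_1 - \mu_2$       $\bar{x}_1 - \bar{x}_2$

From $\bar{X} = \mu \pm Z\sigma_{\bar{x}}$ to $\mu = \bar{X} \pm Z\sigma_{\bar{x}}$:

(1) $\mu = \bar{X} \pm$ Error
(2) Error = $\bar{X} - \mu$ (in either direction)
(3) $Z = \frac{\bar{X} - \mu}{\sigma_{\bar{x}}} = \frac{\text{Error}}{\sigma_{\bar{x}}}$
(4) Error = $Z\sigma_{\bar{x}}$
(5) $\mu = \bar{X} \pm Z\sigma_{\bar{x}}$

$L \le \mu \le U$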
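Sampling Distribution of the Sample Mean

[Figure: sampling distribution of $\bar{X}$ centred at $\mu_{\bar{x}} = \mu$, with central area $1 - \alpha$ and area $\alpha/2$ in each tail, above a large number of intervals]

Intervals extend from $\bar{X} - Z\sigma_{\bar{X}}$ to $\bar{X} + Z\sigma_{\bar{X}}$. Of a large number of such intervals, $(1 - \alpha)100\%$ contain $\mu$ and $\alpha \cdot 100\%$ do not.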
Factors that determine the width of a confidence interval

1. Data dispersion: the variability in the population
• Measured by $\sigma$, usually estimated by s
2. Sample size: the sample size, n
• $\sigma_{\bar{X}} = \frac{\sigma}{\sqrt{n}}$
3. Level of confidence $(1 - \alpha)$: the desired level of confidence
• Affects Z

[Diagram: confidence intervals for the mean ($\sigma$ known, $\sigma$ unknown) and for the proportion]
A. Large Sample with $\sigma$ Unknown

$\sigma_{\bar{x}} \approx s/\sqrt{n}$
$\mu = \bar{X} \pm Z\sigma_{\bar{x}}$

1. A sample of size 49 is taken from an approximately normal population. The sample mean and standard deviation are 24 and 14, respectively. Find a 90% confidence interval for the population mean.

Solution:
Sample size: n = 49
Sample mean: $\bar{x} = 24$
Sample standard deviation: s = 14
Confidence level: $(1 - \alpha)100\% = 90\%$, so $\alpha = 10\%$ and $\alpha/2 = 5\%$
$Z_{\alpha/2} \approx 1.645$
$\sigma_{\bar{x}} \approx s/\sqrt{n} = 14/\sqrt{49} = 2$

$\mu = 24 \pm (1.645 \times 2) = 24 \pm 3.29$

The interval extends from 20.71 to 27.29, that is $20.7 \le \mu \le 27.3$, or (20.7, 27.3).

In general:

$\bar{x} - Z_{\alpha/2}\sigma_{\bar{x}} \le \mu \le \bar{x} + Z_{\alpha/2}\sigma_{\bar{x}}$
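The large-sample interval can be computed without a z table. The sketch below is one way to do it with the Python standard library; the helper name `z_interval` is an illustrative assumption, and it uses the exact normal quantile (about 1.645 for 90% confidence), so results can differ in the second decimal from values based on a rounded table entry.

```python
from math import sqrt
from statistics import NormalDist

def z_interval(x_bar, s, n, confidence):
    """Large-sample confidence interval for the mean, with s estimating sigma."""
    alpha = 1 - confidence
    z = NormalDist().inv_cdf(1 - alpha / 2)   # two-sided critical value
    margin = z * s / sqrt(n)                  # z times the standard error
    return x_bar - margin, x_bar + margin

low, high = z_interval(x_bar=24, s=14, n=49, confidence=0.90)
print(round(low, 1), round(high, 1))
```

For the example with n = 49, mean 24, and s = 14, this gives roughly (20.7, 27.3).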

B. Interval Estimate of a Single Mean with $\sigma$ Known

$\bar{X} - Z_{\alpha/2}\frac{\sigma}{\sqrt{n}} \le \mu \le \bar{X} + Z_{\alpha/2}\frac{\sigma}{\sqrt{n}}$

2. A sample of size 100 is taken from a population with a known standard deviation of 30. The sample mean is found to be 150. Find a 95% confidence interval for the population mean.

Solve the problem.


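C. Small Sample (n < 30) with $\sigma$ Unknown

$\bar{X} \pm t\frac{s}{\sqrt{n}}$

$\bar{X} - t_{\alpha/2}\frac{S}{\sqrt{n}} \le \mu \le \bar{X} + t_{\alpha/2}\frac{S}{\sqrt{n}}$

$df = n - 1$

Illustration
A random sample of n = 25 has $\bar{x} = 50$ and s = 8. Set up a 95% confidence interval estimate for $\mu$.

$\bar{X} - t_{\alpha/2}\frac{S}{\sqrt{n}} \le \mu \le \bar{X} + t_{\alpha/2}\frac{S}{\sqrt{n}}$

$50 - 2.064\frac{8}{\sqrt{25}} \le \mu \le 50 + 2.064\frac{8}{\sqrt{25}}$

$46.70 \le \mu \le 53.30$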
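The arithmetic of the small-sample interval is easy to script. The Python standard library has no t distribution, so in this sketch the critical value is passed in by hand; 2.064 is the table value for 95% confidence with df = 24 used in the illustration, and the helper name `t_interval` is an assumption.

```python
from math import sqrt

def t_interval(x_bar, s, n, t_crit):
    """Small-sample confidence interval for the mean, given a t critical value."""
    margin = t_crit * s / sqrt(n)     # t times the estimated standard error
    return x_bar - margin, x_bar + margin

# n = 25, so df = 24; t_crit = 2.064 is read from a t table, not computed here.
low, high = t_interval(x_bar=50, s=8, n=25, t_crit=2.064)
print(round(low, 2), round(high, 2))
```

The margin is 2.064 × 8/5 ≈ 3.30 on either side of the sample mean of 50.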
3. A sample of size 9 is taken from an approximately normal population. The sample mean and standard deviation are 1.7 and 0.6, respectively. Find a 95% confidence interval for the population mean.

Solve the problem.


1. How much money does the average professional football
fan spend on food at a single football game? That
question was posed to 47 randomly selected football
fans. The sample results provided a sample mean and
standard deviation of $17.00 and $3.40, respectively.
Find and interpret a 99% confidence interval for µ.

2. The following data represent the scores of a sample of 50


randomly chosen students on a standardized test.
39 48 55 63 66 68 68 69 70 71
71 71 73 74 76 76 76 77 78 79
79 79 79 80 80 82 83 83 83 85
85 86 86 88 88 88 88 89 89 89
90 91 92 92 93 95 96 97 97 99
a. Write a 95% confidence interval for the mean score of all
students who took the test.
b. Identify the target parameter and the point estimator.
3. To help consumers assess the risks they are taking, the Food
and Drug Administration (FDA) publishes the amount of
nicotine found in all commercial brands of cigarettes. A new
cigarette has recently been marketed. The FDA tests on this
cigarette yielded a mean nicotine content of 25.1 milligrams
and standard deviation of 2.5 milligrams for a sample of n = 90
cigarettes. Find a 95% confidence interval for µ.
4. Find the value of t0 such that the following statement is true:
P(-t0 ≤ t ≤ t0) = .99 where df = 9.
5. Find the value of t0 such that the following statement is true:
P(-t0 ≤ t ≤ t0) = .95 where df = 15.
6. A random sample of 20 professional working mothers revealed they listened to the radio an average (mean) of 40 minutes per day, with a standard deviation of 8.6 minutes. Develop a 95 percent confidence interval for the population mean listening time.
7. You are interested in purchasing a new car. One of the many
points you wish to consider is the resale value of the car after 5
years. Since you are particularly interested in a certain foreign
sedan, you decide to estimate the resale value of this car with
a 99% confidence interval. You manage to obtain data on 17
recently resold 5-year-old foreign sedans of the same model.
These 17 cars were resold at an average price of $12,610 with
a standard deviation of $700. What is the 99% confidence
interval for the true mean resale value of a 5- year-old car of
this model?
8. We wish to develop a confidence interval for the population mean. The shape of the population is not known, but we have a sample of 40 observations. We decide to use the 92 percent level of confidence. Find the appropriate value of z.
9. Colleges and universities rely on money contributed by
individuals and corporations for their operating expenses. Much of
this money is invested in a fund called an endowment, and the
college spends only the interest earned by the fund. A recent
survey of eight private colleges in the United States revealed the
following endowments (in millions of dollars): 70.5, 55.4, 233.9,
497.4, 117, 155.6, 105.2, and 216.6. What value will be used as the
interval estimate for the mean endowment of all private colleges in
the United States?
10. A marketing research company is estimating the average total
compensation of CEOs in the service industry. Data were
randomly collected from 18 CEOs and results provided a sample
mean and standard deviation of $ 4,008,720 and $1,107,552
respectively. Find a 95% confidence interval for µ.
11. A marketing research company is estimating the average total
compensation of CEOs in the service industry. Data were randomly
collected from 18 CEOs and the 99% confidence interval was
calculated to be ($2,181,260, $5,836,180). Based on the interval above,
do you believe the average total compensation of CEOs in the service
industry is more than $1,500,000?
a. Variances ($\sigma^2$) known, large samples

Two population means are commonly compared by forming their difference:

$(\mu_1 - \mu_2)$

The difference is the population target to be estimated. A reasonable estimate of this is the corresponding difference in sample means:

$(\bar{x}_1 - \bar{x}_2)$

The appropriate confidence interval around the estimate:

$(\mu_1 - \mu_2) = (\bar{x}_1 - \bar{x}_2) \pm Z_{\alpha/2} \cdot SE$, where $SE = \sqrt{\frac{\sigma_1^2}{n_1} + \frac{\sigma_2^2}{n_2}}$

b. Variances ($\sigma^2$) unknown, large samples

$(\mu_1 - \mu_2) = (\bar{x}_1 - \bar{x}_2) \pm Z_{\alpha/2} \cdot SE$, where $SE = \sqrt{\frac{s_1^2}{n_1} + \frac{s_2^2}{n_2}}$
4. Independent random samples of 100 observations each are chosen from two normal populations with the following means and standard deviations.

Population 1       Population 2
$\mu_1 = 15$       $\mu_2 = 13$
$\sigma_1 = 3$     $\sigma_2 = 2$

Find the mean and standard deviation of the sampling distribution of $(\bar{x}_1 - \bar{x}_2)$ and calculate a 94% confidence interval for $(\bar{x}_1 - \bar{x}_2)$.

Solution:

$E(\bar{x}_1 - \bar{x}_2) = \mu_1 - \mu_2 = 15 - 13 = 2$

$SE = \sqrt{\frac{\sigma_1^2}{n_1} + \frac{\sigma_2^2}{n_2}} = \sqrt{\frac{3^2}{100} + \frac{2^2}{100}} = \sqrt{\frac{13}{100}} = \frac{1}{10}\sqrt{13} \approx 0.361$

$(\mu_1 - \mu_2) = (\bar{x}_1 - \bar{x}_2) \pm Z_{\alpha/2} \cdot SE$ = ?
5. In order to compare the means of two populations, independent random samples of 144 observations are selected from each population with the following results.

Sample 1               Sample 2
$\bar{x}_1$ = 7,123    $\bar{x}_2$ = 6,957
$s_1^2$ = 175          $s_2^2$ = 225

Use a 97% confidence interval to estimate the difference between the population means $(\mu_1 - \mu_2)$. Interpret the confidence interval.

Solve the problem.
c. Variances ($\sigma^2$) unknown, small samples

Confidence Interval for the Difference between Two Means for Independent Samples

General Case

A Particular Case: Both Populations Have Equal Variances

$S_p^2 = \frac{\sum (X_1 - \bar{X}_1)^2 + \sum (X_2 - \bar{X}_2)^2}{(n_1 - 1) + (n_2 - 1)}$

This recommended formula requires the additional assumption that $\sigma_1 = \sigma_2$. In this case the degrees of freedom are d.f. = $n_1 + n_2 - 2$.

From a large class, a sample of 4 grades was drawn: 64, 66, 89, 77. From a second large class, an independent sample of 3 grades was drawn: 56, 71, and 53. Calculate the 95% confidence interval for the difference between the two class means, $\mu_1 - \mu_2$.

Solution

Class 1                                                  Class 2
$X_1$   $X_1 - \bar{X}_1$   $(X_1 - \bar{X}_1)^2$        $X_2$   $X_2 - \bar{X}_2$   $(X_2 - \bar{X}_2)^2$
64      -10                 100                          56      -4                  16
66      -8                  64                           71      11                  121
89      15                  225                          53      -7                  49
77      3                   9
Total   0                   398                          Total   0                   186

$\bar{X}_1 = 296/4 = 74.0$, $\bar{X}_2 = 180/3 = 60.0$

Illustration (continued)

$S_p^2 = \frac{\sum (X_1 - \bar{X}_1)^2 + \sum (X_2 - \bar{X}_2)^2}{(n_1 - 1) + (n_2 - 1)} = \frac{398 + 186}{(4 - 1) + (3 - 1)} = \frac{584}{5} = 116.8 \approx 117$
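The pooled variance for the two classes can be checked with a short sketch. The slides stop at $S_p^2$; the last lines below go one step further under stated assumptions: they use the standard equal-variance interval $(\bar{X}_1 - \bar{X}_2) \pm t_{\alpha/2} S_p \sqrt{1/n_1 + 1/n_2}$, which is not shown in the slides, and a critical value of 2.571 read from a t table for 95% confidence with 5 degrees of freedom. The function name `pooled_variance` is illustrative.

```python
from math import sqrt
from statistics import mean

def pooled_variance(a, b):
    """Pooled variance of two independent samples, assuming equal population variances."""
    ss_a = sum((x - mean(a)) ** 2 for x in a)   # sum of squared deviations, sample 1
    ss_b = sum((x - mean(b)) ** 2 for x in b)   # sum of squared deviations, sample 2
    return (ss_a + ss_b) / ((len(a) - 1) + (len(b) - 1))

class_1 = [64, 66, 89, 77]
class_2 = [56, 71, 53]

sp2 = pooled_variance(class_1, class_2)
diff = mean(class_1) - mean(class_2)

# Assumption: t critical value for 95% confidence, df = 4 + 3 - 2 = 5, from a t table.
t_crit = 2.571
margin = t_crit * sqrt(sp2 * (1 / len(class_1) + 1 / len(class_2)))

print(sp2, diff, round(margin, 2))
```

Under these assumptions the interval is the difference of 14 points plus or minus a margin of about 21 points, so with samples this small it is wide enough to include zero.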
Problem

5. Consider the following set of salary data:

27 20

a. Find the mean and standard deviation of the sampling
distribution of (x̄1 − x̄2).
b. Construct a 99% confidence interval for (x̄1 − x̄2).
c. Construct a 90% confidence interval for (x̄1 − x̄2).
d. Construct a 96% confidence interval for (x̄1 − x̄2).
The sample consists of n matched pairs (dependent samples). For example,
when a sample of students' grades in the fall is compared with a sample of
students' grades in the spring, the samples are dependent; this is
called the matched or paired samples case.
The difference between the paired observations X1 and X2 is D = X1 − X2 (or d = X1 − X2).

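Confidence Interval for Matched Pairs is

$$\mu_D = \bar{D} \pm t_{\alpha/2} \cdot \frac{s_D}{\sqrt{n}}$$

with

$$\bar{D} = \frac{\sum D}{n}, \qquad s_D = \sqrt{\frac{\sum (D - \bar{D})^2}{n - 1}}, \qquad \text{d.f.} = n - 1$$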
illustration
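A minimal sketch of a matched-pairs interval follows. The grades are hypothetical, invented only to illustrate the arithmetic, and do not come from the slides. It assumes the usual small-sample form D̄ ± t(α/2) · s_D/√n with n − 1 degrees of freedom, and a table value t = 2.776 for 4 degrees of freedom at 95% confidence.

```python
from math import sqrt
from statistics import mean, stdev

# Hypothetical fall and spring grades for the same five students.
fall = [72, 65, 80, 58, 90]
spring = [75, 70, 78, 66, 91]

# Work with the per-pair differences D = X1 - X2.
d = [x1 - x2 for x1, x2 in zip(fall, spring)]
n = len(d)

d_bar = mean(d)          # mean difference
s_d = stdev(d)           # sample standard deviation (divides by n - 1)

# Assumption: hard-coded table value t_{0.025} for n - 1 = 4 d.f.
T_CRIT = 2.776

margin = T_CRIT * s_d / sqrt(n)
lower, upper = d_bar - margin, d_bar + margin
print(f"mean difference = {d_bar}, interval = ({lower:.2f}, {upper:.2f})")
```

Reducing each pair to a single difference turns the two-sample problem into a one-sample interval for the mean of D.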
Confidence Intervals: Mean (σ known, σ unknown) and Proportion
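1. Assumptions
• Random sample selected
• Normal approximation can be used if

$$n\hat{p} \ge 15 \quad \text{and} \quad n\hat{q} \ge 15$$

2. Confidence interval estimate

$$\hat{p} - z_{\alpha/2}\sqrt{\frac{\hat{p}\hat{q}}{n}} \;\le\; p \;\le\; \hat{p} + z_{\alpha/2}\sqrt{\frac{\hat{p}\hat{q}}{n}}$$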
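A random sample of 400 graduates showed 32
went to graduate school. Set up a 95%
confidence interval estimate for p.

$$\hat{p} = \frac{32}{400} = .08$$

$$\hat{p} - Z_{\alpha/2}\sqrt{\frac{\hat{p}\hat{q}}{n}} \;\le\; p \;\le\; \hat{p} + Z_{\alpha/2}\sqrt{\frac{\hat{p}\hat{q}}{n}}$$

$$.08 - 1.96\sqrt{\frac{.08 \times .92}{400}} \;\le\; p \;\le\; .08 + 1.96\sqrt{\frac{.08 \times .92}{400}}$$

$$.053 \le p \le .107$$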
You're a production manager for a newspaper. You want
to find the % defective. Of 200 newspapers, 35
had defects. What is the 90% confidence
interval estimate of the population proportion
defective?

$$\hat{p} = \frac{35}{200} = .175$$

$$\hat{p} - z_{\alpha/2}\sqrt{\frac{\hat{p}\hat{q}}{n}} \;\le\; p \;\le\; \hat{p} + z_{\alpha/2}\sqrt{\frac{\hat{p}\hat{q}}{n}}$$

$$.175 - 1.645\sqrt{\frac{.175 \times .825}{200}} \;\le\; p \;\le\; .175 + 1.645\sqrt{\frac{.175 \times .825}{200}}$$

$$.1308 \le p \le .2192$$
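Both proportion examples above follow the same recipe, so they can be checked with one small helper. This is a sketch, not part of the original slides: the function name `proportion_interval` is hypothetical, and the critical value is taken from Python's standard-library normal distribution rather than from a table.

```python
from math import sqrt
from statistics import NormalDist


def proportion_interval(successes, n, confidence):
    """Large-sample (normal approximation) confidence interval for p."""
    p_hat = successes / n
    q_hat = 1 - p_hat
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    margin = z * sqrt(p_hat * q_hat / n)
    return p_hat - margin, p_hat + margin


# Graduates example: 32 of 400, 95% confidence.
grad_low, grad_high = proportion_interval(32, 400, 0.95)
# Newspaper example: 35 defective of 200, 90% confidence.
news_low, news_high = proportion_interval(35, 200, 0.90)

print(f"graduates:  {grad_low:.3f} <= p <= {grad_high:.3f}")
print(f"newspapers: {news_low:.4f} <= p <= {news_high:.4f}")
```

In both examples n·p̂ and n·q̂ are at least 15 (32 and 35 successes), so the normal approximation condition stated earlier holds.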
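SE = Sampling Error
"I don't want to sample too much or too little!"

$$(1)\quad Z = \frac{\bar{X} - \mu}{\sigma_{\bar{x}}} = \frac{SE}{\sigma_{\bar{x}}}$$

$$(2)\quad SE = Z_{\alpha/2}\,\sigma_{\bar{x}} = Z_{\alpha/2}\,\frac{\sigma}{\sqrt{n}}$$

$$(3)\quad n = \frac{(Z_{\alpha/2})^2\,\sigma^2}{(SE)^2}$$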
What sample size is needed to be 90% confident
the mean is within ±5? A pilot study suggested
that the standard deviation is 45.

$$n = \frac{(Z_{\alpha/2})^2\,\sigma^2}{(SE)^2} = \frac{(1.645)^2 (45)^2}{(5)^2} = 219.2 \approx 220$$
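The sample-size calculation for a mean can be sketched as a short function. The name `sample_size_for_mean` is hypothetical; the only assumption beyond the slide's formula is that the result is always rounded up, since rounding down would leave the sampling error slightly too large.

```python
from math import ceil
from statistics import NormalDist


def sample_size_for_mean(sigma, error, confidence):
    """Smallest n for which z * sigma / sqrt(n) does not exceed `error`."""
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    return ceil((z * sigma / error) ** 2)


# Example above: sigma = 45, sampling error = 5, 90% confidence.
n_required = sample_size_for_mean(45, 5, 0.90)
print(n_required)
```

Because n grows with the square of σ/SE, halving the allowed error roughly quadruples the required sample.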
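SE = Sampling Error

$$(1)\quad Z = \frac{\hat{p} - p}{\sigma_{\hat{p}}} = \frac{SE}{\sigma_{\hat{p}}}$$

$$(2)\quad SE = Z_{\alpha/2}\,\sigma_{\hat{p}} = Z_{\alpha/2}\sqrt{\frac{pq}{n}}$$

$$(3)\quad n = \frac{(Z_{\alpha/2})^2\,pq}{(SE)^2}$$

If no estimate of p is available, use p = q = .5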
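What sample size is needed to estimate p with
90% confidence and a width of .03?

$$SE = \frac{\text{width}}{2} = \frac{.03}{2} = .015$$

$$n = \frac{(Z_{\alpha/2})^2\,pq}{(SE)^2} = \frac{(1.645)^2 (0.5)(0.5)}{(0.015)^2} = 3006.7 \approx 3007$$

(Rounding Z up to 1.65 gives n = 3025.)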
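The proportion case can be sketched the same way. The function name `sample_size_for_proportion` is hypothetical. The sketch takes Z from the standard normal distribution (about 1.645 for 90% confidence), which gives 3007; a Z rounded up to 1.65 gives the slightly larger 3025.

```python
from math import ceil
from statistics import NormalDist


def sample_size_for_proportion(p, error, confidence):
    """Smallest n for which z * sqrt(p * q / n) does not exceed `error`."""
    q = 1 - p
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    return ceil(z**2 * p * q / error**2)


# Interval width .03 means a sampling error of .03 / 2 = .015.
# With no prior estimate of p, use the conservative p = q = .5.
n_required = sample_size_for_proportion(0.5, 0.03 / 2, 0.90)
print(n_required)
```

Using p = q = .5 maximises pq, so the resulting n is large enough whatever the true proportion turns out to be.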
You work in Human
Resources at Merrill Lynch.
You plan to survey
employees to find their
average medical expenses.
You want to be 95%
confident that the sample
mean is within ± $50.
A pilot study showed that σ
was about $400. What
sample size do you use?

$$n = \frac{(Z_{\alpha/2})^2\,\sigma^2}{(SE)^2} = \frac{(1.96)^2 (400)^2}{(50)^2} = 245.86 \approx 246$$
The American Kennel Club
wanted to estimate the proportion
of children that have a dog as a
pet. If the club wanted the
estimate to be within 3% of the
population proportion, how many children would they need to
contact? Assume a 95% level of confidence and that the club
estimated that 30% of the children have a dog as a pet.

$$n = (.30)(.70)\left(\frac{1.96}{.03}\right)^2 = 896.4 \approx 897$$
Summary
1. A Florida newspaper reported on the topics that teenagers most want to
discuss with their parents. The findings, the results of a poll, showed that 46%
would like more discussion about the family's financial situation, 37% would
like to talk about school, and 30% would like to talk about religion. These and
other percentages were based on a national sampling of 510 teenagers.
a. Estimate the proportion of all teenagers who want more family
discussions about the family's financial situation. Use a 98% confidence level.
b. Estimate the proportion of all teenagers who want more family
discussions about school. Use a 95% confidence level.
c. Estimate the proportion of all teenagers who want more family
discussions about religion. Use a 96% confidence level.
2. A random sample of 4000 U.S. citizens yielded 2110 who are in favor of gun
control legislation.
a. Find the point estimate for estimating the proportion of all Americans
who are in favor of gun control legislation.
b. Estimate the true proportion of all Americans who are in favor of gun
control legislation using a 90% confidence interval.
3. We intend to estimate the average driving time of Chicago commuters.
From a previous study, we believe that the average time is 42 minutes with
a standard deviation of 10 minutes. We want our 90 percent confidence
interval to have a margin of error of no more than plus or minus 5 minutes.
What is the smallest sample size that we should consider?

4. Sales of a new line of athletic footwear are crucial to the success of a
company. The company wishes to estimate the average weekly sales of the
new footwear to within $300 with 99% reliability. The initial sales indicate
that the standard deviation of the weekly sales figures is approximately
$1300. How many weeks of data must be sampled for the company to get
the information it desires?

5. Private colleges and universities rely on money contributed by individuals
and corporations for their operating expenses. Much of this money is
invested in a fund called an endowment, and the college spends only the
interest earned by the fund. A recent survey of eight private colleges in the
United States revealed the following endowments (in millions of dollars):
70.5, 55.4, 233.9, 497.4, 117, 155.6, 105.2, and 216.6. What value will be
used as the point estimate and 94% confidence interval for the mean
endowment of all private colleges in the United States?
Practice Problems
A. An importer receives a very large shipment of two brands of
incandescent lamps, Sinar and Terang. The importer randomly
selects 50 lamps of each brand and carefully tests their
lifetimes. The tests show that Sinar lamps have a mean lifetime
of 1,282 hours, while Terang lamps have a mean lifetime of
1,208 hours. From the importer's experience with such lamps, it
is also known that the standard deviations of the two brands are
roughly constant at 80 and 94 hours, respectively. Construct an
interval estimate of the difference in mean lamp lifetime at a
95% confidence level.

B. A quarter of 300 consumers interviewed said they do not like
Dewi brand bath soap. Determine a 99% confidence interval for
estimating the population proportion of consumers who do not
like Dewi brand bath soap.

C. A sample of 100 motor vehicles was selected from each of the
motor vehicle populations of two cities, A and B. In City A, 72
vehicles had paid their vehicle tax, while in City B only 66
vehicles had paid. Construct a 97% confidence interval for
estimating the difference in the proportion of vehicle tax
payment between the two cities.

D. A company that sponsors a TV program wants to estimate the
proportion (percentage) of viewers who watch the program it
sponsors. For this estimate, the company knows that its past
programs were watched by 25% of all home viewers.
a) How large a sample must be used if the company is willing to accept
an error of ±5% at a 95% confidence level?
b) If, in the problem above, the company knows nothing at all about
how viewers respond to its program, how large a sample must be used
at a 96% confidence level?
THANK YOU
