Linear Regression and Correlation Analysis
Multiple Regression and Correlation Analysis
Sampling Distribution of the Difference in Means and Proportions
Finite Population Correction Factor
Population and Sample
What is a sample?
[Figure: a POPULATION of items (A, B, C, D, ..., Z) from which a SAMPLE (here item B) is drawn.]
Why sample the population?
• The physical impossibility of checking all items in the population.
• The cost of studying all the items in a population.
• The time-consuming aspect of contacting the whole population.
• The adequacy of sample results in most cases.
• The destructive nature of certain tests.
DEFINITIONS
Population
The collection of all possible people, objects, and other measurements of interest; that is, the entire set of objects under study.
Sample
A part of a particular population of interest.
Probability sample
A sample selected from the population in such a way that each member of the population has the same probability (chance) of being included in the sample.
Nonprobability sample
A sample selected from the population in such a way that the members do not all have the same probability (chance) of being included in the sample.
Stratified Random Sampling: A population is first divided into subgroups, called strata, and a sample is selected from each stratum.
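A minimal sketch of the idea in Python may help make the definition concrete. The function name `stratified_sample`, the `fraction` parameter, and the list of firms are all hypothetical and chosen only for illustration; the sketch assumes proportional allocation (the same sampling fraction in every stratum), which is one common choice among several.

```python
import random

def stratified_sample(population, stratum_of, fraction, seed=0):
    """Draw a simple random sample of the given fraction from each stratum."""
    rng = random.Random(seed)  # fixed seed only so the sketch is repeatable
    strata = {}
    for item in population:
        # Step 1: divide the population into subgroups (strata).
        strata.setdefault(stratum_of(item), []).append(item)
    sample = []
    for label in sorted(strata):
        members = strata[label]
        k = max(1, round(len(members) * fraction))  # at least one item per stratum
        # Step 2: select a random sample within each stratum.
        sample.extend(rng.sample(members, k))
    return sample

# Hypothetical population: 6 "small" and 4 "large" firms.
firms = [(f"S{i}", "small") for i in range(6)] + [(f"L{i}", "large") for i in range(4)]
chosen = stratified_sample(firms, stratum_of=lambda firm: firm[1], fraction=0.5)
```

With a fraction of 0.5 the sketch takes 3 of the 6 small firms and 2 of the 4 large firms, so every stratum is represented in the sample.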
Chapter 11: Sampling Methods and Sampling Distributions
STRATIFICATION PROCESS
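Sampling distribution:
Population mean: $\mu = \frac{\sum X}{N} = \frac{2 + 4 + 6 + 4 + 4}{5} = \frac{20}{5} = 4$
b. Population mean and sample means when samples of 2 are drawn from the 5 banks
Number of possible samples: $C_n^N = \frac{N!}{n!\,(N-n)!} = \frac{5!}{2!\,(5-2)!} = \frac{5!}{2!\,3!} = 10$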
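SAMPLING DISTRIBUTION OF THE MEAN AND PROPORTION
Mean of the sample means:
$\mu_{\bar X} = \frac{1}{C_n^N}\sum \bar X = \frac{1}{10}(3 + 4 + 3 + 3 + 5 + 4 + 4 + 5 + 5 + 4) = \frac{40}{10} = 4$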
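The enumeration above can be checked with a short Python sketch. It uses the five population values from the bank example (2, 4, 6, 4, 4) and lists every sample of size 2; the variable names are illustrative only.

```python
from itertools import combinations
from math import comb
from statistics import mean

population = [2, 4, 6, 4, 4]   # the five bank values from the example
n = 2                          # sample size

# Every possible sample of size n, drawn without replacement.
samples = list(combinations(population, n))
sample_means = [mean(s) for s in samples]

mu = mean(population)          # population mean
mu_xbar = mean(sample_means)   # mean of all the sample means
number_of_samples = comb(len(population), n)
```

The sketch gives 10 samples whose means are 3, 4, 3, 3, 5, 4, 4, 5, 5, 4, and the mean of those sample means equals the population mean of 4.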
c. Distribution of the population values and of the sample means
Population                          Sample
Value  Frequency  Probability       Mean  Frequency  Probability
2      1          1/5 = 0.20        3     3          3/10 = 0.30
4      3          3/5 = 0.60        4     4          4/10 = 0.40
6      1          1/5 = 0.20        5     3          3/10 = 0.30
[Figure: two bar charts of probability against value, one for the population and one for the sample means.]
d. Population standard deviation
$\sigma = \sqrt{\frac{\sum (X - \mu)^2}{N}}$
X    X - μ   (X - μ)²
2    -2      4
4    0       0
6    2       4
4    0       0
4    0       0
ΣX = 20, μ = 20/5 = 4, Σ(X - μ)² = 8.0
$\sigma = \sqrt{8/5} \approx 1.3$
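Standard deviation of the sample means
$s_{\bar X} = \frac{\sigma}{\sqrt{n}}\sqrt{\frac{N-n}{N-1}}$
X̄    X̄ - μ_X̄   (X̄ - μ_X̄)²
3    -1        1
4    0         0
3    -1        1
3    -1        1
5    1         1
4    0         0
4    0         0
5    1         1
5    1         1
4    0         0
ΣX̄ = 40, μ_X̄ = 40/10 = 4, Σ(X̄ - μ_X̄)² = 6.0
$s_{\bar X} = \sqrt{\frac{1}{C_n^N}\sum (\bar X - \mu_{\bar X})^2} = \sqrt{6/10} \approx 0.77$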
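As a cross-check, the following Python sketch computes the standard deviation of the sample means in two ways for the bank data (2, 4, 6, 4, 4 with n = 2): directly from the enumerated sample means, and from σ/√n multiplied by the finite population correction factor. The variable names are illustrative only.

```python
from itertools import combinations
from math import sqrt
from statistics import mean, pstdev

population = [2, 4, 6, 4, 4]
N, n = len(population), 2

sigma = pstdev(population)  # population standard deviation, sqrt(8/5)

# Route 1: enumerate all samples and take the standard deviation of their means.
sample_means = [mean(s) for s in combinations(population, n)]
sd_enumerated = pstdev(sample_means)  # sqrt(6/10)

# Route 2: sigma / sqrt(n) times the finite population correction factor.
sd_formula = sigma / sqrt(n) * sqrt((N - n) / (N - 1))
```

Both routes give about 0.77, and σ itself comes out at about 1.26, which the slides round to 1.3.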
Example 1
The law firm of Hoya and Associates has five partners. At their weekly partners meeting each reported the number of hours they billed clients for their services last week.
Partner   Hours
Dunn      22
Hardy     26
Kiers     30
Malory    26
Tillman   22
If two partners are selected randomly, how many different samples are possible?
5 objects taken 2 at a time: $_5C_2 = \frac{5!}{2!\,(5-2)!} = 10$, a total of 10 different samples.
Partners  Total  Mean
1,2       48     24
1,3       52     26
1,4       48     24
1,5       44     22
2,3       56     28
2,4       52     26
2,5       48     24
3,4       56     28
3,5       52     26
4,5       48     24
As a sampling distribution:
Sample mean  Frequency  Relative frequency
22           1          1/10
24           4          4/10
26           3          3/10
28           2          2/10
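The sampling distribution of Example 1 can be reproduced with a few lines of Python. The partner names and hours are taken from the example; the variable names are illustrative only.

```python
from collections import Counter
from fractions import Fraction
from itertools import combinations

hours = {"Dunn": 22, "Hardy": 26, "Kiers": 30, "Malory": 26, "Tillman": 22}

# Mean billed hours for every possible pair of partners.
pair_means = [Fraction(a + b, 2) for a, b in combinations(hours.values(), 2)]

# Tally the means and convert the counts to relative frequencies.
counts = Counter(pair_means)
distribution = {
    int(m): Fraction(c, len(pair_means)) for m, c in sorted(counts.items())
}
```

The result has 10 pairs, with means 22, 24, 26, and 28 occurring with relative frequencies 1/10, 4/10, 3/10, and 2/10.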
Sample Means
Sample means follow the normal probability distribution under two conditions:
• the underlying population follows the normal distribution
OR
• the sample is large enough (a common rule of thumb is n ≥ 30), whatever the shape of the underlying population.
To determine the probability that a sample mean falls within a particular region, use
$z = \frac{\bar X - \mu}{s/\sqrt{n}}$
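Example 2
$z = \frac{\bar X - \mu}{s/\sqrt{n}} = \frac{\$1.38 - \$1.30}{\$0.28/\sqrt{35}} = 1.69$
$z = \frac{\bar X - \mu}{s/\sqrt{n}} = \frac{\$1.22 - \$1.30}{\$0.28/\sqrt{35}} = -1.69$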
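The two z-values can be recomputed with the Python standard library. The sketch below uses the figures shown in Example 2 (μ = $1.30, s = $0.28, n = 35); the function name is hypothetical, and the final line, which turns the two z-values into the probability that the sample mean falls between $1.22 and $1.38, is an assumption about what the example asks, since the problem statement is not shown.

```python
from math import sqrt
from statistics import NormalDist

def z_for_sample_mean(xbar, mu, s, n):
    """Standardize a sample mean using the standard error s / sqrt(n)."""
    return (xbar - mu) / (s / sqrt(n))

z_upper = z_for_sample_mean(1.38, 1.30, 0.28, 35)
z_lower = z_for_sample_mean(1.22, 1.30, 0.28, 35)

# Area under the standard normal curve between the two z-values.
standard_normal = NormalDist()
prob_between = standard_normal.cdf(z_upper) - standard_normal.cdf(z_lower)
```

Rounded to two decimals the z-values are 1.69 and -1.69, and the area between them is roughly 0.91.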
No.   x      x - x̄       (x - x̄)²
...
7     2.10   -0.40667    0.165378
8     2.70   0.193333    0.037378
9     2.00   -0.50667    0.256711
10    1.80   -0.70667    0.499378
11    2.90   0.393333    0.154711
12    2.40   -0.10667    0.011378
13    2.60   0.093333    0.008711
14    3.60   1.093333    1.195378
15    2.10   -0.40667    0.165378
Total: Σx = 37.60, Σ(x - x̄)² = 4.229333
$s^2 = \frac{\sum (x - \bar x)^2}{n - 1}$, $s = \sqrt{s^2}$
Mean: x̄ = 37.60 / 15 = 2.506667
s² = 4.229333 / 14 = 0.302095
s = 0.549632
Computing the Z-value:
$Z = \frac{\bar x - \mu_{\bar x}}{s/\sqrt{n}}$
a. The percentage of students with a GPA below 3.00 is P(x̄ < 3.00), or in terms of the Z-value P(Z < 0.23), which equals 1 - P(Z ≥ 0.23). From the normal distribution table, P(Z ≥ 0.23) = 0.409, so P(Z < 0.23) = 1 - 0.409 = 0.591.
So 59.10% of UGM statistics students have a GPA below 3.
Relationship between $s_{\bar x}$ and $\sigma$ for an infinite population:
$s_{\bar x} = \frac{\sigma}{\sqrt{n}}$
SAMPLING DISTRIBUTION OF THE PROPORTION
Mean of the sample proportions: $P_p = \frac{1}{C_n^N}\sum p$
Standard deviation of the sample proportions: $S_p = \sqrt{\frac{1}{C_n^N}\sum (p - P_p)^2}$
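SCHEME FOR THE DIFFERENCE BETWEEN TWO POPULATIONS OR SAMPLES
[Diagram: Population 1 (μ1, σ1) yields Sample 1 of size n1 with X̄1 and Sx1; Population 2 (μ2, σ2) yields Sample 2 of size n2 with X̄2 and Sx2. Question: do X̄1, X̄2 reflect μ1, μ2?]
Mean of the sampling distribution of the difference in means:
$\mu_{\bar x_1 - \bar x_2} = \mu_{\bar X_1} - \mu_{\bar X_2} = \mu_1 - \mu_2$
Standard deviation of the sampling distribution of the difference in means x̄1 - x̄2:
$s_{\bar x_1 - \bar x_2} = \sqrt{s_{\bar x_1}^2 + s_{\bar x_2}^2} = \sqrt{\frac{s_{x_1}^2}{n_1} + \frac{s_{x_2}^2}{n_2}}$
$Z = \frac{(\bar x_1 - \bar x_2) - (\mu_1 - \mu_2)}{s_{\bar x_1 - \bar x_2}}$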
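SAMPLING DISTRIBUTION OF THE DIFFERENCE IN MEANS AND PROPORTIONS
Mean of the sampling distribution of the difference in proportions: $P_{p_1 - p_2} = P_1 - P_2$
Standard deviation: $S_{p_1 - p_2} = \sqrt{S_{p_1}^2 + S_{p_2}^2} = \sqrt{\frac{P_1(1 - P_1)}{n_1} + \frac{P_2(1 - P_2)}{n_2}}$
$Z = \frac{(p_1 - p_2) - (P_1 - P_2)}{S_{p_1 - p_2}}$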
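CORRECTION FACTOR
Adjusted standard deviation for the sample mean:
$s_{\bar x} = \frac{\sigma}{\sqrt{n}}\sqrt{\frac{N-n}{N-1}}$
Adjusted standard deviation for the sample proportion:
$s_p = \sqrt{\frac{p(1-p)}{n}}\sqrt{\frac{N-n}{N-1}}$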
SAMPLE EQUAL TO THE POPULATION, SAMPLE VARIANCE $\sigma^2/N$
• Sampling distribution:
Interval Estimation
• Standard Error of the Sample Mean
• Constructing Confidence Intervals
• Confidence Intervals for the Mean and Proportion
• Confidence Intervals for the Difference in Means and Proportions
• Choosing the Sample Size
Large-Sample Hypothesis Testing
Small-Sample Hypothesis Testing
Linear Regression and Correlation Analysis
Multiple Regression and Correlation Analysis
Basic Concepts of Simultaneous Equations
SCOPE OF STATISTICS
ESTIMATION METHODS
Estimation
• Point Estimation
• Interval Estimation
POINT ESTIMATION
Thinking Challenge
Suppose you’re interested in the average amount of money that students in this class (the population) have on them. How would you find out?
Rp. 50.000
or
Rp. 25.000 – Rp. 100.000
Terms of estimation
– Parameter
– Statistic
– Estimator
– Estimate
ESTIMATION PROCESS
[Diagram: a population whose mean μ is unknown; a random sample is drawn and gives the sample mean X̄ = 50.]
Ex:
Given the midterm test scores of 50 students, identify the target parameter and the point estimator if 10 students are chosen at random.
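Sample of 10 students:
$\hat\mu = \bar x = \frac{55 + 71 + 76 + 77 + 85 + \dots + 96}{10} = 81.5$, with $E(\bar x) = \mu$
$\hat\sigma^2 = s^2 = \frac{(55 - 81.5)^2 + (71 - 81.5)^2 + \dots + (96 - 81.5)^2}{9} = 146.056$, with $E(s^2) = \sigma^2$
How about the population?
$\mu = \frac{39 + 48 + 63 + \dots + 97 + 99}{50} = 79.98$
$\sigma^2 = \frac{(39 - 79.98)^2 + (48 - 79.98)^2 + \dots + (97 - 79.98)^2 + (99 - 79.98)^2}{49} = 152.387$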
PROPERTIES OF ESTIMATORS
UNBIASED ESTIMATOR
An estimator is unbiased if the expected value, E(X̄), of the sample statistic equals the population parameter (μ), which is written E(X̄) = μ.
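Unbiased estimators
$\bar X$ (or $\hat\mu$) and $S^2$: these random variables are examples of statistics or estimators.
$\mu$ and $\sigma^2$: these fixed constants are examples of parameters or targets.
$E(\bar X) = \mu$ and $E(S^2) = \sigma^2$
Bias = $E(\bar X) - \mu$
[Figure: sampling distribution of X̄ marking the true μ, E(X̄), and the bias between them.]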
EFFICIENT ESTIMATOR
An efficient estimator is an unbiased estimator that has the smallest variance ($s_x^2$) among the competing estimators.
Efficient estimators (minimum variance)
An estimator is called efficient if its distribution is more highly concentrated, that is, it has a smaller variance than another estimator.
[Figure: two sampling distributions with variances $\sigma_a^2$ and $\sigma_b^2$, where $\sigma_b^2 < \sigma_a^2$; b is a more efficient estimator than a.]
CONSISTENT ESTIMATOR
A consistent estimator is one whose estimate (X̄) moves closer to the true value as the sample size (n) increases.
[Figure: sampling distributions for n small, n large, n very large, and n infinite, narrowing around the true value.]
Consistent estimator
One condition that makes an estimator consistent is that its bias and variance both approach zero as n increases.
INTERVAL ESTIMATION
Interval estimate: $\bar X \pm \text{Error}$, where Error = $\bar X - \mu$ or $\mu - \bar X$
$P(L \le \mu \le U) = 1 - \alpha$
[Figure: a confidence interval centered on the sample statistic (point estimate), running from the lower confidence limit L to the upper confidence limit U; the distance between the limits is the width of the confidence interval.]
A point estimate is a single number. How much uncertainty is associated with a point estimate of a population parameter?
An interval estimate provides more information about a population characteristic than does a point estimate. It provides a confidence level for the estimate. Such interval estimates are called confidence intervals.
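INTERVAL ESTIMATION FORMULA
What is interval estimation?
$\mu = \bar X \pm Z\sigma_{\bar x}$
[Figure: sampling distribution of x̄ with intervals from L to U covering 90%, 95%, and 99% of the area.]
A confidence interval is a range of values within which the population parameter is expected to occur.
Quantity      Parameter   Point estimator
Mean          μ           x̄
Proportion    p           p̂
Variance      σ²          s²
Difference    μ1 - μ2     x̄1 - x̄2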
3. Level of confidence
The desired level of confidence is (1 – α).
• Affects Z
Confidence Intervals
• Mean: σ known, σ unknown
• Proportion
A. Large Sample with Unknown σ
$\sigma_{\bar x} \approx s/\sqrt{n}$
$\mu = \bar X \pm Z\sigma_{\bar x}$
1. A sample of size 49 is taken from an approximately normal population. The sample mean and standard deviation are 24 and 14 respectively. Find a 90% confidence interval for the population mean.
$\bar X - Z_{\alpha/2}\frac{\sigma}{\sqrt{n}} \le \mu \le \bar X + Z_{\alpha/2}\frac{\sigma}{\sqrt{n}}$
2. A sample of size 100 is taken from a population of known standard deviation of 30. The sample mean is found to be 150. Find a 95% confidence interval for the population mean.
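One way to work exercises 1 and 2 is sketched below in Python. The function name `z_interval` is hypothetical, and the critical value is taken from the standard normal quantile function rather than from a printed table, so the last digits can differ slightly from a hand calculation with Z = 1.645 or Z = 1.96.

```python
from math import sqrt
from statistics import NormalDist

def z_interval(xbar, sd, n, confidence):
    """Confidence interval xbar +/- Z * sd / sqrt(n) for a population mean."""
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)  # Z at alpha/2
    margin = z * sd / sqrt(n)
    return xbar - margin, xbar + margin

# Exercise 1: n = 49, mean 24, s = 14, 90% confidence (s stands in for sigma).
exercise_1 = z_interval(24, 14, 49, 0.90)
# Exercise 2: n = 100, mean 150, sigma = 30, 95% confidence.
exercise_2 = z_interval(150, 30, 100, 0.95)
```

Under these assumptions the intervals come out at about 20.71 to 27.29 for exercise 1 and about 144.12 to 155.88 for exercise 2.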
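$\bar X \pm t\,\frac{s}{\sqrt{n}}$
$\bar X - t_{\alpha/2}\frac{S}{\sqrt{n}} \le \mu \le \bar X + t_{\alpha/2}\frac{S}{\sqrt{n}}$, with $df = n - 1$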
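Illustration
A random sample of n = 25 has x̄ = 50 and s = 8. Set up a 95% confidence interval estimate for μ.
$\bar X - t_{\alpha/2}\frac{S}{\sqrt{n}} \le \mu \le \bar X + t_{\alpha/2}\frac{S}{\sqrt{n}}$
$50 - 2.064\,\frac{8}{\sqrt{25}} \le \mu \le 50 + 2.064\,\frac{8}{\sqrt{25}}$
$46.70 \le \mu \le 53.30$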
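The arithmetic of the illustration can be checked with a short Python sketch. The function name `t_interval` is hypothetical; the critical value 2.064 (t at α/2 = 0.025 with 24 degrees of freedom) is passed in from the illustration rather than computed, because the Python standard library has no t-distribution quantile function.

```python
from math import sqrt

def t_interval(xbar, s, n, t_crit):
    """Confidence interval xbar +/- t * s / sqrt(n); t_crit comes from a t table."""
    margin = t_crit * s / sqrt(n)
    return xbar - margin, xbar + margin

# n = 25, xbar = 50, s = 8, t(0.025, df = 24) = 2.064 as given in the illustration.
lower, upper = t_interval(50, 8, 25, 2.064)
```

The margin is 2.064 × 8 / 5 = 3.3024, so the limits are 46.6976 and 53.3024.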
3. A sample of size 9 is taken from an approximately normal
population. The sample mean and standard deviation are
1.7 and .6 respectively. Find a 95% confidence interval for
the population mean
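Confidence interval for the difference of two means:
$(\mu_1 - \mu_2) = (\bar x_1 - \bar x_2) \pm Z_{\alpha/2}\cdot SE$
where $SE = \sqrt{\frac{\sigma_1^2}{n_1} + \frac{\sigma_2^2}{n_2}}$
or, using the sample variances, $SE = \sqrt{\frac{s_1^2}{n_1} + \frac{s_2^2}{n_2}}$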
4. Independent random samples of 100 observations each are chosen from two normal populations with the following means and standard deviations.
Population 1    Population 2
µ1 = 15         µ2 = 13
σ1 = 3          σ2 = 2
Find the mean and standard deviation of the sampling distribution of (x̄1 - x̄2) and calculate a 94% confidence interval for (x̄1 - x̄2).
Solution:
$\mu_{\bar x_1 - \bar x_2} = \mu_1 - \mu_2 = 15 - 13 = 2$
$\sigma_{\bar x_1 - \bar x_2} = \sqrt{\frac{\sigma_1^2}{n_1} + \frac{\sigma_2^2}{n_2}} = \sqrt{\frac{3^2}{100} + \frac{2^2}{100}} = \sqrt{\frac{13}{100}} = \frac{1}{10}\sqrt{13} = 0.361$
$(\mu_1 - \mu_2) = (\bar x_1 - \bar x_2) \pm Z_{\alpha/2}\cdot SE$ = ?
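The remaining step, the 94% interval, can be sketched in Python as follows. This is one possible reading of the exercise: it centers the interval on the mean of the sampling distribution, 2, and takes Z at α/2 = 0.03 from the standard normal quantile function; the variable names are illustrative only.

```python
from math import sqrt
from statistics import NormalDist

mu1, mu2 = 15, 13
sigma1, sigma2 = 3, 2
n1 = n2 = 100

center = mu1 - mu2                            # mean of the sampling distribution
se = sqrt(sigma1**2 / n1 + sigma2**2 / n2)    # sqrt(13) / 10

confidence = 0.94
z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)  # Z at alpha/2 = 0.03
lower, upper = center - z * se, center + z * se
```

With Z at about 1.88 and SE at about 0.361, the interval is roughly 1.32 to 2.68.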
5. In order to compare the means of two populations, independent random samples of 144 observations are selected from each population with the following results.
Sample 1        Sample 2
x̄1 = 7,123      x̄2 = 6,957
s1² = 175       s2² = 225
A particular case: both populations have equal variances
$S_p^2 = \frac{\sum (X_1 - \bar X_1)^2 + \sum (X_2 - \bar X_2)^2}{(n_1 - 1) + (n_2 - 1)}$
Solution
Class 1                              Class 2
X1    X1 - X̄1   (X1 - X̄1)²           X2    X2 - X̄2   (X2 - X̄2)²
64    -10       100                  56    -4        16
66    -8        64                   71    11        121
89    15        225                  53    -7        49
77    3         9
Σ 296  0        398                  Σ 180  0        186
X̄1 = 296/4 = 74.0                    X̄2 = 180/3 = 60.0
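The pooled variance for the two classes can be computed with a short Python sketch that follows the formula above. The function name `pooled_variance` is hypothetical; the scores are the ones listed for Class 1 and Class 2.

```python
from statistics import mean

def pooled_variance(sample1, sample2):
    """Pooled variance: combined squared deviations over (n1 - 1) + (n2 - 1)."""
    m1, m2 = mean(sample1), mean(sample2)
    ss1 = sum((x - m1) ** 2 for x in sample1)  # squared deviations, class 1
    ss2 = sum((x - m2) ** 2 for x in sample2)  # squared deviations, class 2
    return (ss1 + ss2) / ((len(sample1) - 1) + (len(sample2) - 1))

class_1 = [64, 66, 89, 77]
class_2 = [56, 71, 53]
sp2 = pooled_variance(class_1, class_2)
```

The squared deviations sum to 398 and 186, so the pooled variance is (398 + 186) / (3 + 2) = 116.8.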
27 20
1. Assumptions
• Random sample selected
• Normal approximation can be used if
$\hat p - z_{\alpha/2}\sqrt{\frac{\hat p\hat q}{n}} \le p \le \hat p + z_{\alpha/2}\sqrt{\frac{\hat p\hat q}{n}}$
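A random sample of 400 graduates showed 32 went to graduate school. Set up a 95% confidence interval estimate for p.
$\hat p - Z_{\alpha/2}\sqrt{\frac{\hat p\hat q}{n}} \le p \le \hat p + Z_{\alpha/2}\sqrt{\frac{\hat p\hat q}{n}}$, with $\hat p = 32/400 = .08$
$.053 \le p \le .107$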
You’re a production manager for a newspaper. You want to find the % defective. Of 200 newspapers, 35 had defects. What is the 90% confidence interval estimate of the population proportion defective?
$\hat p - z_{\alpha/2}\sqrt{\frac{\hat p\hat q}{n}} \le p \le \hat p + z_{\alpha/2}\sqrt{\frac{\hat p\hat q}{n}}$, with $\hat p = 35/200 = .175$
$.1308 \le p \le .2192$
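Both proportion examples (32 of 400 graduates at 95%, and 35 of 200 newspapers at 90%) can be reproduced with the Python sketch below. The function name `proportion_interval` is hypothetical, and the critical value is taken from the standard normal quantile function instead of a table.

```python
from math import sqrt
from statistics import NormalDist

def proportion_interval(successes, n, confidence):
    """Large-sample interval p_hat +/- Z * sqrt(p_hat * q_hat / n)."""
    p_hat = successes / n
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)  # Z at alpha/2
    margin = z * sqrt(p_hat * (1 - p_hat) / n)
    return p_hat - margin, p_hat + margin

graduates = proportion_interval(32, 400, 0.95)    # p_hat = 0.08
newspapers = proportion_interval(35, 200, 0.90)   # p_hat = 0.175
```

Rounded as in the slides, the intervals are .053 to .107 for the graduates and .1308 to .2192 for the newspapers.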
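SE = Sampling Error
"I don’t want to sample too much or too little!"
(1) $Z = \frac{\bar X - \mu}{\sigma_{\bar x}} = \frac{SE}{\sigma_{\bar x}}$
(2) $SE = Z_{\alpha/2}\,\sigma_{\bar x} = Z_{\alpha/2}\,\frac{\sigma}{\sqrt{n}}$
(3) $n = \frac{(Z_{\alpha/2})^2\,\sigma^2}{(SE)^2}$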
What sample size is needed to be 90% confident the mean is within 5? A pilot study suggested that the standard deviation is 45.
$n = \frac{(Z_{\alpha/2})^2\,\sigma^2}{(SE)^2} = \frac{(1.645)^2 (45)^2}{5^2} = 219.2 \approx 220$
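The sample size calculation can be sketched in Python as below. The function name `sample_size_for_mean` is hypothetical; the result is always rounded up, because rounding down would leave the sampling error slightly above the target.

```python
from math import ceil
from statistics import NormalDist

def sample_size_for_mean(sigma, error, confidence):
    """Smallest n with Z * sigma / sqrt(n) <= error, i.e. n = (Z * sigma / error)^2."""
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)  # Z at alpha/2
    return ceil((z * sigma / error) ** 2)

# 90% confidence, sigma = 45 from the pilot study, mean within 5.
n_required = sample_size_for_mean(sigma=45, error=5, confidence=0.90)
```

For these inputs the unrounded value is about 219.1, so the required sample size is 220.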
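SE = Sampling Error
(1) $Z = \frac{\hat p - p}{\sigma_{\hat p}} = \frac{SE}{\sigma_{\hat p}}$
(2) $SE = Z_{\alpha/2}\,\sigma_{\hat p} = Z_{\alpha/2}\sqrt{\frac{pq}{n}}$
(3) $n = \frac{(Z_{\alpha/2})^2\,pq}{(SE)^2}$
If no estimate of p is available, use p = q = .5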
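What sample size is needed to estimate p with 90% confidence and a width of .03?
$SE = \frac{\text{width}}{2} = \frac{.03}{2} = .015$
$n = \frac{(Z_{\alpha/2})^2\,pq}{(SE)^2} = \frac{(1.65)^2 (0.5)(0.5)}{(0.015)^2} = 3025$
(With the more precise value $Z_{\alpha/2} = 1.645$ the result is 3006.7, which rounds up to 3007.)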
You work in Human Resources at Merrill Lynch. You plan to survey employees to find their average medical expenses. You want to be 95% confident that the sample mean is within ± $50. A pilot study showed that σ was about $400. What sample size do you use?
$n = \frac{(Z_{\alpha/2})^2\,\sigma^2}{(SE)^2} = \frac{(1.96)^2 (400)^2}{(50)^2} = 245.86 \approx 246$
The American Kennel Club wanted to estimate the proportion of children that have a dog as a pet. If the club wanted the estimate to be within 3% of the population proportion, how many children would they need to contact? Assume a 95% level of confidence and that the club estimated that 30% of the children have a dog as a pet.
$n = (.30)(.70)\left(\frac{1.96}{.03}\right)^2 = 896.4 \approx 897$
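A Python sketch of the sample size formula for a proportion is given below. The function name `sample_size_for_proportion` is hypothetical; the default p = 0.5 follows the rule stated earlier for the case where no estimate of p is available, and the result is rounded up.

```python
from math import ceil
from statistics import NormalDist

def sample_size_for_proportion(error, confidence, p=0.5):
    """Smallest n with Z * sqrt(p * q / n) <= error, i.e. n = Z^2 * p * q / error^2."""
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)  # Z at alpha/2
    return ceil(z * z * p * (1 - p) / error**2)

# Kennel Club: within 3%, 95% confidence, prior estimate p = 0.30.
n_kennel = sample_size_for_proportion(error=0.03, confidence=0.95, p=0.30)
```

For the Kennel Club figures the unrounded value is about 896.3, so 897 children are needed.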
Summary
1. A Florida newspaper reported on the topics that teenagers most want to
discuss with their parents. The findings, the results of a poll, showed that 46%
would like more discussion about the family's financial situation, 37% would
like to talk about school, and 30% would like to talk about religion. These and
other percentages were based on a national sampling of 510 teenagers.
a. Estimate the proportion of all teenagers who want more family
discussions about the family's financial situation. Use a 98% confidence level.
b. Estimate the proportion of all teenagers who want more family
discussions about school. Use a 95% confidence level.
c. Estimate the proportion of all teenagers who want more family
discussions about religion. Use a 96% confidence level.
2. A random sample of 4000 U.S. citizens yielded 2110 who are in favor of gun
control legislation.
a. Find the point estimate for estimating the proportion of all Americans
who are in favor of gun control legislation.
b. Estimate the true proportion of all Americans who are in favor of gun
control legislation using a 90% confidence interval.
3. We intend to estimate the average driving time of Chicago commuters.
From a previous study, we believe that the average time is 42 minutes with
a standard deviation of 10 minutes. We want our 90 percent confidence
interval to have a margin of error of no more than plus or minus 5 minutes.
What is the smallest sample size that we should consider?
Exercise
A. An importer receives a very large shipment of two brands of incandescent lamps, Sinar and Terang. The importer randomly selects 50 lamps of each brand and tests their lifetimes carefully. The tests show that the Sinar lamps have a mean lifetime of 1,282 hours, while the Terang lamps have a mean lifetime of 1,208 hours. From experience as an importer of such lamps, it is also known that the standard deviations of the two brands are roughly constant, at 80 and 94 hours respectively. Construct an interval estimate of the difference in mean lifetimes of the two brands at a 95% confidence level.