
μ = population mean
σ = population standard deviation
σ₀² = hypothesized population variance
x̄ = sample mean
s = sample standard deviation
s² = sample variance
n = number of sample values (sample size)
E = margin of error
z(α/2) or t(α/2) = confidence coefficient
m = the number of sample elements
p̂ = sample proportion
π = population proportion
σp̂ = standard error of the proportion
β = Type II error probability
1 − β = probability of a correct decision
μ = population mean (unknown)
μ₀ = hypothesized value of the population mean
σx̄ = standard error of the mean
CI = confidence interval
f₀ = observed frequency in a particular cell
fe = expected frequency in a particular cell if H₀ is true
n = sum of sample sizes in all groups
c = number of groups
Tj = sum of ranks in the jth group
nj = number of values in the jth group (j = 1, 2, ..., c)
β₀ + β₁Xᵢ → linear component
εᵢ → random error component
Yᵢ = dependent variable
β₀ = population Y intercept
β₁ = population slope coefficient
Xᵢ = independent variable, the value of X for observation i
Ŷᵢ = estimated (predicted) Y value for observation i
b₀ = estimate of the regression intercept
b₁ = estimate of the regression slope
Ȳ = mean value of the dependent variable
Yᵢ = observed value of the dependent variable
Ŷᵢ = predicted value of Y for the given Xᵢ value
r² = coefficient of determination
SYX = standard error of the estimate
b₁ = regression slope coefficient
β₁ = hypothesized slope
Sb₁ = standard error of the slope
k = the number of independent variables
FSTAT follows an F distribution with k numerator and (n − k − 1) denominator degrees of freedom

Equally Likely Outcomes Rule
P(A) = M / N
M = number of ways A can happen (outcomes in A)
N = number of possible outcomes (outcomes in S)

Example: one coin tossed → P(T) = 1/2 = 0.5 = 50%, P(H) = 1/2 = 0.5 = 50%

Interval estimate of π
p̂ ± z(α/2)·√(p̂q̂/n)

Sample size for estimating π
→ When π known: n = (z(α/2))²·π(1 − π) / E²
→ When π unknown: n = (z(α/2))²·0.25 / E²
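As a quick numeric check, the two sample-size formulas can be evaluated in plain Python. This is a minimal sketch; the 95% critical value z(α/2) = 1.96 and the margins of error used below are illustrative choices, not values from the sheet:

```python
import math

def sample_size_for_proportion(E, z=1.96, p=None):
    """Minimum n to estimate a population proportion within margin E.

    Uses n = z^2 * p*(1 - p) / E^2; when p is unknown, the conservative
    worst case p*(1 - p) = 0.25 is substituted, as in the formula sheet.
    """
    pq = 0.25 if p is None else p * (1 - p)
    return math.ceil(z**2 * pq / E**2)

# 95% confidence (z = 1.96), margin of error of 3 percentage points:
n_unknown = sample_size_for_proportion(E=0.03)        # worst case, p = 0.5
n_known = sample_size_for_proportion(E=0.03, p=0.10)  # prior estimate p = 0.10
print(n_unknown, n_known)
```

Rounding is always upward (`math.ceil`), since a fractional requirement must be covered by a whole extra observation.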
Mean of a Random Variable
μ = mean of a random variable
x̄ = mean of a data variable
μ = Σ xᵢ·P(xᵢ)

Variance of a Random Variable
σ² = variance of a random variable
s² = variance of a data variable
σ² = Σ (xᵢ − μ)²·P(xᵢ)

Standardization Formula
Z = (X − μ) / σ

Reverse Standardization
X = Z·σ + μ

Z-test
z = (x̄ − μ₀) / σx̄

Standard error of the mean σx̄
→ If f = n/N < 0.05 (5%):
  If σ known: σx̄ = σ/√n
  If σ unknown: σx̄ = s/√n
→ If f = n/N > 0.05 (5%):
  If σ known: σx̄ = (σ/√n)·√((N − n)/(N − 1))
  If σ unknown: σx̄ = (s/√n)·√((N − n)/(N − 1))

ALL CONFIDENCE INTERVALS:
Point estimate ± (Critical Value × Standard Error)
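A short sketch of the random-variable formulas, using a made-up distribution (number of heads in two fair coin tosses) purely for illustration:

```python
import math

# Discrete random variable X: values and their probabilities.
# Example: number of heads in two fair coin tosses.
xs = [0, 1, 2]
ps = [0.25, 0.50, 0.25]

mu = sum(x * p for x, p in zip(xs, ps))             # mu = sum x_i * P(x_i)
var = sum((x - mu)**2 * p for x, p in zip(xs, ps))  # sigma^2 = sum (x_i - mu)^2 * P(x_i)
sigma = math.sqrt(var)

z = (2 - mu) / sigma       # standardization: Z = (X - mu) / sigma
x_back = z * sigma + mu    # reverse standardization: X = Z*sigma + mu

print(mu, var, z, x_back)
```

Reverse standardization recovers the original value exactly, confirming the two formulas are inverses.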

Margin of Error (E)
σ known: E = z(α/2)·σ/√n
σ unknown: E = t(α/2)·s/√n
Confidence level = 1 − α, with critical value z(α/2)

Confidence interval
CI = μ₀ ± z(α/2)·σx̄
- LCL (lower confidence limit): CI = μ₀ − z(α/2)·σx̄
- UCL (upper confidence limit): CI = μ₀ + z(α/2)·σx̄

→ Interval estimate of μ when σ known:
x̄ ± z(α/2)·σ/√n
→ Interval estimate of μ when σ unknown:
x̄ ± t(α/2)·s/√n

Standard error of the difference between two means:
→ When σ₁ and σ₂ are known:
σ(x̄₁ − x̄₂) = √(σ₁²/n₁ + σ₂²/n₂)
→ When σ₁ and σ₂ are unknown and n₁, n₂ ≥ 30:
σ(x̄₁ − x̄₂) = √(s₁²/n₁ + s₂²/n₂)
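The recipe "point estimate ± (critical value × standard error)", with the finite population correction when the sampling fraction exceeds 5%, can be sketched as follows. The inputs (x̄ = 50, σ = 8, n = 64, z = 1.96) are illustrative, not from the sheet:

```python
import math

def mean_ci(xbar, sd, n, crit, N=None):
    """Point estimate ± (critical value × standard error) for a mean.

    crit is z(alpha/2) when sigma is known (pass sd = sigma), or
    t(alpha/2) with n-1 d.f. when it is not (pass sd = s).  If a finite
    population size N is given and n/N > 0.05, the finite population
    correction sqrt((N - n)/(N - 1)) is applied to the standard error.
    """
    se = sd / math.sqrt(n)
    if N is not None and n / N > 0.05:
        se *= math.sqrt((N - n) / (N - 1))
    E = crit * se                       # margin of error
    return xbar - E, xbar + E

lcl, ucl = mean_ci(xbar=50.0, sd=8.0, n=64, crit=1.96)  # 95%, sigma known
print(lcl, ucl)
```

The same helper covers every interval on this sheet by swapping in the appropriate critical value and standard error.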
McNemar Test
ZSTAT = (B − C) / √(B + C)

Finite population
f = n/N
√((N − n)/(N − 1)) - used to reduce the width of the interval
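The McNemar statistic uses only the two discordant cell counts B and C. A one-line sketch, with made-up counts:

```python
import math

def mcnemar_z(B, C):
    """McNemar Z statistic: (B - C) / sqrt(B + C), where B and C are the
    counts in the two discordant cells of a paired 2x2 table."""
    return (B - C) / math.sqrt(B + C)

# Illustrative counts: 25 pairs switched one way, 16 the other.
z = mcnemar_z(25, 16)
print(z)
```

Compare the result to the standard normal critical value for the chosen α.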

Proportion
p̂ = m/n

Margin of error E when estimating a population proportion:
E = z(α/2)·√(p̂(1 − p̂)/n) or z(α/2)·√(p̂q̂/n)
where q̂ = 1 − p̂, only if n is large

Chi-Square Test for a Variance or Standard Deviation
χ²STAT = (n − 1)·s² / σ²
Reject H₀ if χ²STAT > χ²(α/2) or if χ²STAT < χ²(1−α/2)
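The chi-square statistic for a variance is a one-line computation; the sample values below are illustrative:

```python
def chi_square_variance_stat(n, s2, sigma2_0):
    """Chi-square statistic for a test of a variance: (n - 1) * s^2 / sigma0^2,
    compared against chi-square critical values with n - 1 d.f."""
    return (n - 1) * s2 / sigma2_0

# Sample of n = 25 with s^2 = 1.2 against hypothesized sigma^2 = 1.0:
stat = chi_square_variance_stat(25, 1.2, 1.0)
print(stat)
# Reject H0 if stat exceeds the upper alpha/2 cutoff or falls below the
# lower 1 - alpha/2 cutoff (both with 24 degrees of freedom).
```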
Simple Linear Regression
Yᵢ = β₀ + β₁Xᵢ + εᵢ

An estimate of the population regression line:
Ŷᵢ = b₀ + b₁Xᵢ

The Least Squares Method
b₀ and b₁ are the values that achieve
min Σ (Yᵢ − Ŷᵢ)² = min Σ (Yᵢ − (b₀ + b₁Xᵢ))²

Large Sample Proportion Hypothesis Testing
Z-test: z = (p̂ − P₀) / σp̂

Standard error of the proportion
σp̂ = √(P₀Q₀/n) when f < 0.05
σp̂ = √(P₀Q₀/n)·√((N − n)/(N − 1)) when f > 0.05
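The least-squares minimization has the closed-form solution b₁ = Σ(xᵢ − x̄)(yᵢ − ȳ)/Σ(xᵢ − x̄)² and b₀ = ȳ − b₁x̄. A minimal pure-Python sketch on made-up data:

```python
def least_squares(x, y):
    """Least-squares estimates b0, b1 minimizing sum (Y_i - (b0 + b1*X_i))^2."""
    n = len(x)
    xbar = sum(x) / n
    ybar = sum(y) / n
    ssx = sum((xi - xbar) ** 2 for xi in x)
    sxy = sum((xi - xbar) * (yi - ybar) for xi, yi in zip(x, y))
    b1 = sxy / ssx              # slope estimate
    b0 = ybar - b1 * xbar       # intercept estimate
    return b0, b1

# Illustrative data:
x = [1, 2, 3, 4, 5]
y = [2.1, 3.9, 6.2, 8.0, 9.8]
b0, b1 = least_squares(x, y)
print(b0, b1)   # fitted line: Yhat = b0 + b1 * X
```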
The Chi-square test statistic
χ²STAT = Σ(all cells) (f₀ − fe)² / fe
χ²STAT for the 2×2 case has 1 degree of freedom

Expected cell frequencies
fe = (row total × column total) / n
where n = overall sample size

The average proportion
p̄ = (x₁ + x₂) / (n₁ + n₂) = x/n

The Marascuilo Procedure
Critical range = √(χ²α) · √(pⱼ(1 − pⱼ)/nⱼ + pⱼ′(1 − pⱼ′)/nⱼ′)
Reject if |pⱼ − pⱼ′| > critical range for the j, j′ pair
H₀: π₁ = π₂ = π₃
H₁: Not all of the πⱼ are equal (j = 1, 2, 3)

Measures of Variation
SST = SSR + SSE
SST = total sum of squares (total variation): SST = Σ (Yᵢ − Ȳ)²
  Measures the variation of the Yᵢ values around their mean Ȳ
SSR = regression sum of squares (explained variation): SSR = Σ (Ŷᵢ − Ȳ)²
  Variation attributable to the relationship between X and Y
SSE = error sum of squares (unexplained variation): SSE = Σ (Yᵢ − Ŷᵢ)²
  Variation in Y attributable to factors other than X
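The chi-square statistic and the expected cell frequencies combine into a short routine; the 2×2 table below is made up for illustration:

```python
def chi_square_independence(table):
    """Chi-square statistic for a contingency table (list of rows).

    f_e = row total * column total / n;  stat = sum (f_o - f_e)^2 / f_e.
    """
    row_totals = [sum(r) for r in table]
    col_totals = [sum(c) for c in zip(*table)]
    n = sum(row_totals)
    stat = 0.0
    for i, row in enumerate(table):
        for j, f_o in enumerate(row):
            f_e = row_totals[i] * col_totals[j] / n
            stat += (f_o - f_e) ** 2 / f_e
    return stat

# 2x2 table (1 degree of freedom):
stat = chi_square_independence([[30, 20], [20, 30]])
print(stat)
```

For the 2×2 case, compare the result to the chi-square critical value with 1 degree of freedom.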

Wilcoxon Rank-Sum Test for differences in 2 medians
Checking the rankings to verify T₁ and T₂:
T₁ + T₂ = n(n + 1)/2, where n = n₁ + n₂
T₁ = sum of ranks from the smaller sample

Kruskal-Wallis Rank Test
H = [12/(n(n + 1)) · Σ(j=1..c) Tⱼ²/nⱼ] − 3(n + 1)
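A minimal sketch of the H statistic from raw samples. It assumes no tied values (ties would need average ranks, which this sketch does not handle), and the three groups are invented:

```python
def kruskal_wallis_h(groups):
    """Kruskal-Wallis H from raw samples (no tied values assumed).

    Pools all observations, ranks them (1 = smallest), then applies
    H = [12 / (n(n+1)) * sum T_j^2 / n_j] - 3(n + 1).
    """
    pooled = sorted(v for g in groups for v in g)
    rank = {v: i + 1 for i, v in enumerate(pooled)}
    n = len(pooled)
    h = 12 / (n * (n + 1)) * sum(
        sum(rank[v] for v in g) ** 2 / len(g) for g in groups
    ) - 3 * (n + 1)
    return h

groups = [[12, 15, 18], [22, 25, 28], [31, 34, 37]]
print(kruskal_wallis_h(groups))
```

Compare H to the chi-square critical value with c − 1 degrees of freedom.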

Residual Analysis
eᵢ = Yᵢ − Ŷᵢ

The Durbin-Watson Statistic
D = Σ(i=2..n) (eᵢ − eᵢ₋₁)² / Σ(i=1..n) eᵢ²

Inferences About the Slope
Sb₁ = SYX / √SSX = SYX / √(Σ (xᵢ − x̄)²)
Sb₁ = estimate of the standard error of the slope
SYX = √(SSE / (n − 2)) - standard error of the estimate

t-test
tSTAT = (b₁ − β₁) / Sb₁, with d.f. = n − 2
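The slope formulas chain together: fit the line, compute SSE, then SYX, then Sb₁, then the t statistic. A sketch on made-up data, testing H₀: β₁ = 0:

```python
import math

def slope_inference(x, y):
    """Returns b1, its standard error S_b1 = S_YX / sqrt(SSX), and
    t_STAT = b1 / S_b1 for H0: beta1 = 0 (d.f. = n - 2)."""
    n = len(x)
    xbar = sum(x) / n
    ybar = sum(y) / n
    ssx = sum((xi - xbar) ** 2 for xi in x)
    b1 = sum((xi - xbar) * (yi - ybar) for xi, yi in zip(x, y)) / ssx
    b0 = ybar - b1 * xbar
    sse = sum((yi - (b0 + b1 * xi)) ** 2 for xi, yi in zip(x, y))
    syx = math.sqrt(sse / (n - 2))      # standard error of the estimate
    sb1 = syx / math.sqrt(ssx)          # standard error of the slope
    return b1, sb1, b1 / sb1

# Illustrative data:
b1, sb1, t = slope_inference([1, 2, 3, 4, 5], [2.0, 4.1, 5.9, 8.2, 9.8])
print(b1, sb1, t)
```

Compare t to the t critical value with n − 2 = 3 degrees of freedom.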

F-test for significance
FSTAT = MSR / MSE, where MSR = SSR / k and MSE = SSE / (n − k − 1)

Confidence interval estimate for the slope
b₁ ± t(α/2)·Sb₁
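The F statistic follows directly from the sums of squares above; the SSR, SSE, and n values here are invented for illustration:

```python
def regression_f_stat(ssr, sse, n, k):
    """F_STAT = MSR / MSE with MSR = SSR / k and MSE = SSE / (n - k - 1)."""
    msr = ssr / k
    mse = sse / (n - k - 1)
    return msr / mse

# One predictor (k = 1), n = 12 observations:
print(regression_f_stat(ssr=80.0, sse=20.0, n=12, k=1))
```

For simple regression (k = 1), this F statistic equals the square of the slope's t statistic.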

t-test for a Correlation Coefficient
tSTAT = r / √((1 − r²)/(n − 2)), with d.f. = n − 2
where: r = +√r² if b₁ > 0; r = −√r² if b₁ < 0
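This t statistic is another one-liner; r = 0.6 and n = 27 below are illustrative values:

```python
import math

def correlation_t_stat(r, n):
    """t_STAT = r / sqrt((1 - r^2) / (n - 2)), with d.f. = n - 2."""
    return r / math.sqrt((1 - r**2) / (n - 2))

print(correlation_t_stat(0.6, 27))
```

Compare the result to the t critical value with n − 2 degrees of freedom.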