The t-test
Used to compare quantitative data between two samples, to see if the difference between them is just random
variance in a single population or the samples are truly from two populations w/ different mean values. It
simply calculates the chance of seeing the observed difference if the two samples are from the same population
the critical cut-off value is usually 5%. Like other tests, the test involves a summary statistic, distributed over a
characteristic distribution.
The subtracted 0 is because our null hypo states that the difference between the two samples is 0, it reminds us
of what we’re testing. The “standard” t test procedure is for n<30 and the two samples w/ the same variance. It
is as follows:
(̅̅̅
𝑦1 − 𝑦 ̅̅̅2 ) − 0
𝑡𝑠 =
𝑆𝐸(𝑌̅̅̅1 −𝑌̅̅̅2)
Unpooled SE:
2
(𝑛1 − 1)𝑠12 + (𝑛2 − 1)𝑠22
𝑠pooled =
(𝑛1 + 𝑛2 ) − 2
2
1 1
𝑆𝐸pooled = √𝑠pooled ( + )
𝑛1 𝑛2
2 2
𝑠1 𝑠2 𝑠pooled 𝑠pooled
𝑆𝐸(𝑌̅̅̅1 −𝑌̅̅̅2 ) =√ + & 𝑆𝐸pooled = √ +
√𝑛1 √𝑛2 𝑛1 𝑛2
In analyzing data w/ unequal n, one needs to decide between the two methods. With σ1=σ2, the pooled method
should be used , however the unpooled method yields pretty similar results in this case. With unequal
population standard deviations, the pooling is wrong and the unpooled method should be used. Therefore, the
pooled method is never necessary, and is of no extra benefit; that’s why everyone prefers the unpooled method,
which is easier. The confidence interval for the difference between the two means is as follows:
(𝑦̅1 − 𝑦̅2 ) ∓ 𝑡0.05 SE(𝑌̅1 −𝑌̅2 )
One sample t-test for the mean of a normal distribution w/ unknown variance
Used when we want to, for example, compare a subpopulation w/ the whole population. Population variance is
unknown, but the sample SD is known. One sample t-test is used, w/ the denominator being the sample’s SE.
𝑥̅ − 𝜇0
𝑡=
𝑠𝑑 ⁄√𝑛
It’s just like the z statistic, but for a small sample size (n). The t distribution is different for every value of n-1,
but for n=∞, it will become a normal distribution.
The Z-test
One sample z-test for the mean of a normal distribution w/ known variance
Same as the previous, but w/ the denominator being the population’s σ, divided by the sample’s n
𝑥̅ − 𝜇0
𝑧=
𝜎⁄√𝑛
It uses the normal distribution for any value of n. use P(Z=X)=97.5% for 5% significance level.
The F-test
Can be interpreted as a global mean value comparison, not one between two mean values. This test involves two
df values, one for the numerator and one for denominator.
𝐻0 : 𝜇1 = 𝜇2 = ⋯ = 𝜇𝑖 , 𝐻1 : The 𝜇𝑖 are not equal
The test statistic for the f-test:
𝑝𝑜𝑜𝑙𝑒𝑑 𝑣𝑎𝑟𝑖𝑎𝑛𝑐𝑒
𝑓=
𝑀𝑆(𝑤𝑖𝑡ℎ𝑖𝑛)
There is quite the relation between f-test and t-test. T-test w/ pooling is actually an f-test w/ I=2, and 𝑡𝑠2 = 𝐹𝑠