
STAT 166: STATISTICS FOR SOCIAL SCIENCES Exercise: Test of Hypothesis on Two Population

EXAMPLES FOR TWO POPULATIONS (INDEPENDENT)

Suppose students were classified as smokers and nonsmokers. Randomly selected students from each group were asked how many times they came down with a cough last year. The data collected are as follows:

    Smokers:      3  5  4  2  4  5  6
    Non-smokers:  0  1  2  1  2

At the 5% level of significance, can it be said that smokers tend to have more occurrences of cough than the nonsmokers?

STAT166 Statistics for Social Sciences First Semester 2012-2013 JCYnion


=> TEST FOR NORMALITY

Let X1 = number of times a smoker had cough
    X2 = number of times a nonsmoker had cough

Ho: The number of times smokers and nonsmokers had cough follows a Normal distribution.
Ha: The number of times smokers and nonsmokers had cough does not follow a Normal distribution.

Test Procedure/Statistic: Shapiro-Wilk test / W
Decision Rule: Reject Ho if p-value < 0.05. Otherwise, fail to reject Ho.
Computation (using Stata):

    sort students
    by students: swilk cough

    -> students = SMOKERS
                   Shapiro-Wilk W test for normal data
        Variable |    Obs        W          V         z      Prob>z
    -------------+--------------------------------------------------
           cough |      7    0.97270      0.359    -1.386    0.91720

    -> students = nonsmokers
                   Shapiro-Wilk W test for normal data
        Variable |    Obs        W          V         z      Prob>z
    -------------+--------------------------------------------------
           cough |      5    0.97076      0.345    -1.176    0.88013

Decision:
Smokers: Since p-value = 0.91720 > 0.05, fail to reject Ho.
Nonsmokers: Since p-value = 0.88013 > 0.05, fail to reject Ho.

Conclusion: At $\alpha = 0.05$, there is insufficient evidence to say that the number of times smokers and nonsmokers had cough departs from a Normal distribution, so the normality assumption is taken as satisfied.


=> TEST FOR EQUALITY OF VARIANCE

Let $\sigma_1^2$ = variance of no. of times smokers had cough
    $\sigma_2^2$ = variance of no. of times nonsmokers had cough

Ho: $\sigma_1^2 = \sigma_2^2$
Ha: $\sigma_1^2 \neq \sigma_2^2$

Test Procedure: F-test at $\alpha = 0.05$
Test Statistic: $F_c = s_1^2 / s_2^2$, with $s_1^2 > s_2^2$
Decision Rule: Reject Ho if $F_c > F_{0.05}(6,4) = 6.16$ or p-value < $\alpha = 0.05$. Otherwise, fail to reject Ho.
Computation (manual):

$$F_c = \frac{s_1^2}{s_2^2} = \frac{1.345185^2}{0.83666^2} = 2.5850$$

Using Stata: sdtest cough, by(students)

    ratio = sd(SMOKERS) / sd(nonsmoke)                    f =   2.5850
    Ho: ratio = 1                        degrees of freedom =     6, 4

        Ha: ratio < 1          Ha: ratio != 1           Ha: ratio > 1
      Pr(F < f) = 0.8114    2*Pr(F > f) = 0.3771     Pr(F > f) = 0.1886

Decision: Since $F_c = 2.585 < F_{0.05}(6,4) = 6.16$ (equivalently, p-value = 0.3771 > $\alpha = 0.05$), we fail to reject Ho.
Conclusion: At $\alpha = 0.05$, there is insufficient evidence to say that the variances of the no. of times smokers and nonsmokers had cough differ, so equal variances are assumed in the test that follows.
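The variance ratio can be cross-checked outside Stata. The following is a minimal Python sketch using only the standard library; the variable names are illustrative assumptions, and the data are the two samples listed in the problem.

```python
from statistics import variance

smokers = [3, 5, 4, 2, 4, 5, 6]
nonsmokers = [0, 1, 2, 1, 2]

# Sample variances (n - 1 in the denominator)
s1_sq = variance(smokers)
s2_sq = variance(nonsmokers)

# The larger sample variance goes in the numerator
f_c = max(s1_sq, s2_sq) / min(s1_sq, s2_sq)
print(round(f_c, 4))  # 2.585
```

The result agrees with the f = 2.5850 on (6, 4) degrees of freedom reported by sdtest.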


=> TEST FOR MEAN DIFFERENCE (INDEPENDENT)

Let $\mu_1$ = mean no. of cough of smokers
    $\mu_2$ = mean no. of cough of nonsmokers

Ho: $\mu_1 = \mu_2$
Ha: $\mu_1 > \mu_2$

Test Procedure: One-tailed t-test at $\alpha = 0.05$
Test Statistic:

$$t_c = \frac{(\bar{x}_1 - \bar{x}_2) - D_0}{s_p \sqrt{\dfrac{1}{n_1} + \dfrac{1}{n_2}}}$$

Decision Rule: Reject Ho if $t_c > t_{0.05}(7+5-2) = 1.812$ or p-value < $\alpha = 0.05$. Otherwise, fail to reject Ho.
Computation (manual):

$$\bar{x}_1 - \bar{x}_2 = 4.142857 - 1.2 = 2.942857$$

$$s_p^2 = \frac{(7-1)(1.345185)^2 + (5-1)(0.83666)^2}{7+5-2} = \frac{13.65714}{10} = 1.3657$$

$$t_c = \frac{2.942857 - 0}{\sqrt{1.3657\left(\dfrac{1}{7} + \dfrac{1}{5}\right)}} = \frac{2.942857}{0.684284} = 4.3006$$

Using Stata: ttest cough, by(students)

    diff = mean(SMOKERS) - mean(nonsmoke)                 t =   4.3006
    Ho: diff = 0                         degrees of freedom =       10

        Ha: diff < 0           Ha: diff != 0            Ha: diff > 0
     Pr(T < t) = 0.9992   Pr(|T| > |t|) = 0.0016    Pr(T > t) = 0.0008

Decision: Since $t_c = 4.3006 > t_{0.05}(10) = 1.812$ (equivalently, p-value = 0.0008 < 0.05), we reject Ho.
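As a cross-check of the pooled-variance t statistic, here is a minimal Python sketch using only the standard library. The course itself uses Stata; the variable names below are illustrative assumptions, and $D_0 = 0$ as in Ho.

```python
from math import sqrt
from statistics import mean, variance

smokers = [3, 5, 4, 2, 4, 5, 6]
nonsmokers = [0, 1, 2, 1, 2]
n1, n2 = len(smokers), len(nonsmokers)

# Pooled variance: weighted average of the two sample variances
sp_sq = ((n1 - 1) * variance(smokers) + (n2 - 1) * variance(nonsmokers)) / (n1 + n2 - 2)

# t statistic for Ho: mu1 - mu2 = 0
t_c = (mean(smokers) - mean(nonsmokers) - 0) / sqrt(sp_sq * (1 / n1 + 1 / n2))
print(round(sp_sq, 4), round(t_c, 4))  # 1.3657 4.3006
```

This reproduces the t = 4.3006 on 10 degrees of freedom shown in the Stata output.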



Conclusion: At $\alpha = 0.05$, we have sufficient evidence to say that the mean no. of times smokers had cough is higher than that of nonsmokers.

Suppose the normality assumption is not satisfied; use a nonparametric test.

=> TEST FOR TWO MEDIAN (INDEPENDENT)

Let Md1 = median no. of cough of smokers
    Md2 = median no. of cough of nonsmokers

Ho: Md1 = Md2
Ha: Md1 $\neq$ Md2

Test Procedure: Mann-Whitney U test
Decision Rule: Reject Ho if $T_c < T_{0.05}(5, 7) = 5$. Otherwise, fail to reject Ho.
Computation: pool the two samples, sort, and rank them (N = nonsmoker, S = smoker; tied values receive the average of their ranks).

    Group:  N    N    N    S    N    N    S    S    S    S     S     S
    Value:  0    1    1    2    2    2    3    4    4    5     5     6
    Rank:   1    2.5  2.5  5    5    5    7    8.5  8.5  10.5  10.5  12

R1 (nonsmokers, n = 5) = 1 + 2.5 + 2.5 + 5 + 5 = 16
R2 (smokers, n = 7) = 5 + 7 + 8.5 + 8.5 + 10.5 + 10.5 + 12 = 62

$$U_1 = (5)(7) + \frac{5(5+1)}{2} - 16 = 34$$

$$U_2 = (5)(7) + \frac{7(7+1)}{2} - 62 = 1$$

$T_c = \min(34, 1) = 1$

Decision: Since $T_c = 1 < T_{0.05}(5, 7) = 5$, we reject Ho.
Conclusion: At $\alpha = 5\%$, we have sufficient evidence to say that the two samples come from different populations.
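The ranking and the two U values can be reproduced with a short Python sketch (standard library only). The helper name average_ranks and the variable names are illustrative assumptions; the first sample here is the nonsmokers (n = 5), matching R1 in the computation.

```python
def average_ranks(values):
    """Return 1-based ranks; tied values get the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j to the end of the run of tied values
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1  # mean of positions i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks

nonsmokers = [0, 1, 2, 1, 2]
smokers = [3, 5, 4, 2, 4, 5, 6]
n1, n2 = len(nonsmokers), len(smokers)

ranks = average_ranks(nonsmokers + smokers)
r1 = sum(ranks[:n1])  # rank sum of nonsmokers
r2 = sum(ranks[n1:])  # rank sum of smokers

u1 = n1 * n2 + n1 * (n1 + 1) / 2 - r1
u2 = n1 * n2 + n2 * (n2 + 1) / 2 - r2
t_c = min(u1, u2)
print(r1, r2, u1, u2, t_c)  # 16.0 62.0 34.0 1.0 1.0
```

A quick sanity check on any such computation is that U1 + U2 equals n1 times n2 (here 34 + 1 = 35).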

EXAMPLES FOR TWO POPULATIONS (RELATED)

A researcher performed an experiment to determine if an atmosphere involving exposure to carbon monoxide (CO) has an impact on breathing capability. The individuals in the study were exposed to breathing chambers, one of which contained a high concentration of CO. Several breathing measures were made for each subject in each chamber, and the breathing frequency is assumed to follow the normal distribution. Nine individuals participated in the experiment, with the following results on breathing frequency (in number of breaths per minute).

    Individual   With CO (popn1)   Without CO (popn2)   di = With CO - Without CO
        1              31                  30                      1
        2              26                  27                     -1
        3              25                  24                      1
        4              50                  43                      7
        5              46                  41                      5
        6              34                  34                      0
        7              50                  46                      4
        8              35                  23                     12
        9              37                  30                      7

Does it appear that the average breathing frequency is higher when there is exposure to CO? Use $\alpha = 5\%$.

    . summarize di

        Variable |    Obs        Mean    Std. Dev.       Min        Max
    -------------+------------------------------------------------------
              di |      9           4    4.213075         -1         12
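The paired differences and their summary statistics can be checked with a minimal Python sketch (standard library only; the list names are illustrative assumptions).

```python
from statistics import mean, stdev

with_co = [31, 26, 25, 50, 46, 34, 50, 35, 37]
without_co = [30, 27, 24, 43, 41, 34, 46, 23, 30]

# di = With CO - Without CO, one value per individual
d = [w - wo for w, wo in zip(with_co, without_co)]
print(d)  # [1, -1, 1, 7, 5, 0, 4, 12, 7]

# Same quantities that Stata's summarize reports
print(mean(d), round(stdev(d), 6), min(d), max(d))
```

The mean, standard deviation, minimum, and maximum agree with the summarize di output (4, 4.213075, -1, 12).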


=> TEST FOR NORMALITY

Ho: The data on the differences of breathing frequency are normally distributed.
Ha: The data on the differences of breathing frequency are not normally distributed.

Test Procedure/Statistic: Shapiro-Wilk test / W
Decision Rule: Reject Ho if p-value < $\alpha = 0.05$. Otherwise, fail to reject Ho.
Computation (see Stata output):

    . swilk di

                   Shapiro-Wilk W test for normal data
        Variable |    Obs        W          V         z      Prob>z
    -------------+--------------------------------------------------
              di |      9    0.93571      0.945    -0.094    0.53754

Decision: Since p-value = 0.53754 > 0.05, we fail to reject Ho.
Conclusion: At $\alpha = 0.05$, there is insufficient evidence to say that the differences of breathing frequency depart from a Normal distribution, so the normality assumption is taken as satisfied.


=> TEST FOR EQUALITY OF VARIANCE

Ho: $\sigma_1^2 = \sigma_2^2$
Ha: $\sigma_1^2 \neq \sigma_2^2$

Test Procedure/Statistic: F-test / $F_c$
Decision Rule: Reject Ho if $F_c > F_{0.05}(8,8) = 3.44$ or p-value < $\alpha = 0.05$. Otherwise, fail to reject Ho.

    . sdtest wco = woco

    ratio = sd(wco) / sd(woco)                            f =   1.2883
    Ho: ratio = 1                        degrees of freedom =     8, 8

        Ha: ratio < 1          Ha: ratio != 1           Ha: ratio > 1
      Pr(F < f) = 0.6356    2*Pr(F > f) = 0.7287     Pr(F > f) = 0.3644

Decision: Since p-value = 0.7287 > 0.05, we fail to reject Ho.
Conclusion: At $\alpha = 0.05$, there is insufficient evidence to say that the two populations have different variances.


=> TEST FOR MEAN DIFFERENCE (RELATED)

Ho: $\mu_1 = \mu_2$ or $\mu_1 - \mu_2 = 0$
Ha: $\mu_1 > \mu_2$ or $\mu_1 - \mu_2 > 0$

Test Procedure: One-tailed t-test at $\alpha = 0.05$
Test Statistic:

$$t_c = \frac{\bar{d} - D_0}{s_d / \sqrt{n}}$$

Decision Rule: Reject Ho if $t_c > t_{0.05}(8) = 1.860$ or p-value < $\alpha = 0.05$. Otherwise, fail to reject Ho.
Computation:

$$t_c = \frac{4 - 0}{4.213075 / \sqrt{9}} = 2.8483$$

    . ttest di = 0

    One-sample t test
    ------------------------------------------------------------------------------
    Variable |     Obs        Mean    Std. Err.   Std. Dev.   [95% Conf. Interval]
    ---------+--------------------------------------------------------------------
          di |       9           4    1.404358    4.213075     .761544    7.238456
    ------------------------------------------------------------------------------
        mean = mean(di)                                           t =   2.8483
    Ho: mean = 0                                 degrees of freedom =        8

        Ha: mean < 0             Ha: mean != 0              Ha: mean > 0
     Pr(T < t) = 0.9892     Pr(|T| > |t|) = 0.0215      Pr(T > t) = 0.0108

Decision: Since $t_c = 2.8483 > 1.860$ (equivalently, p-value = 0.0108 < 0.05), we reject Ho.
Conclusion: At $\alpha = 0.05$, we have sufficient evidence to say that the mean breathing frequency is higher when there is exposure to CO.
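The paired t statistic is a one-sample t computed on the differences. A minimal Python sketch (standard library only; names are illustrative assumptions, with the differences taken as With CO minus Without CO):

```python
from math import sqrt
from statistics import mean, stdev

# Differences di = With CO - Without CO for the nine individuals
d = [1, -1, 1, 7, 5, 0, 4, 12, 7]
n = len(d)

std_err = stdev(d) / sqrt(n)    # standard error of the mean difference
t_c = (mean(d) - 0) / std_err   # Ho: mean difference = 0
print(round(std_err, 6), round(t_c, 4))  # 1.404358 2.8483
```

Both values agree with the Std. Err. and t columns of the ttest di = 0 output.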

Suppose the normality assumption is not satisfied; use a nonparametric test.

=> TEST FOR TWO MEDIAN (RELATED)

    Individual   With CO (popn1)   Without CO (popn2)   di = With CO - Without CO
        1              31                  30                      1
        2              26                  27                     -1
        3              25                  24                      1
        4              50                  43                      7
        5              46                  41                      5
        6              34                  34                      0
        7              50                  46                      4
        8              35                  23                     12
        9              37                  30                      7

Ho: Md1 = Md2
Ha: Md1 > Md2

Test Procedure: Sign test
Decision Rule: Reject Ho if p-value < $\alpha = 0.05$. Otherwise, fail to reject Ho.
Computation (using Stata): signtest wco = woco

    One-sided tests:
      Ho: median of wco - woco = 0 vs.
      Ha: median of wco - woco > 0
          Pr(#positive >= 7) =
             Binomial(n = 8, x >= 7, p = 0.5) =  0.0352

Seven of the nine differences are positive, one is negative, and one is zero; the zero difference is dropped, leaving n = 8.

Decision: Since p-value = 0.0352 < 0.05, we reject Ho.
Conclusion: At $\alpha = 0.05$, we have sufficient evidence to say that the median breathing frequency is higher when there is exposure to CO.
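The one-sided sign-test p-value is an upper-tail Binomial(n, 0.5) probability, which can be computed directly. A minimal Python sketch (standard library only; names are illustrative assumptions, and differences are With CO minus Without CO):

```python
from math import comb

d = [1, -1, 1, 7, 5, 0, 4, 12, 7]

n_pos = sum(1 for x in d if x > 0)
n_neg = sum(1 for x in d if x < 0)
n = n_pos + n_neg  # zero differences are dropped

# P(X >= n_pos) for X ~ Binomial(n, 0.5)
p_value = sum(comb(n, k) for k in range(n_pos, n + 1)) / 2 ** n
print(n_pos, n, round(p_value, 4))  # 7 8 0.0352
```

Here the tail probability is (8 + 1) / 256 = 9/256, matching the Binomial(n = 8, x >= 7, p = 0.5) line of the Stata output.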


TEST ON TWO POPULATION PROPORTION

1. A random check of drivers on South Luzon Expressway (SLEX) revealed that 44 out of 100 drivers from the province and 65 out of 100 drivers from Manila use seatbelts. Do the data provide evidence that more drivers from the province use seatbelts while driving on the SLEX compared to drivers from Manila? Test at the 5% level of significance.

Let P1 = the proportion of drivers from the province who use seatbelts
    P2 = the proportion of drivers from Manila who use seatbelts

Ho: P1 = P2; the proportion of drivers from the province who use seatbelts is equal to the proportion of drivers from Manila who do.
Ha: P1 > P2; the proportion of drivers from the province who use seatbelts is greater than the proportion of drivers from Manila who do.

Test Procedure: Z-test
Test Statistic:

$$Z_c = \frac{\hat{p}_1 - \hat{p}_2}{\sqrt{\hat{p}(1 - \hat{p})\left(\dfrac{1}{n_1} + \dfrac{1}{n_2}\right)}}$$

Decision Rule: Reject Ho if $Z_c > Z_{0.05} = 1.645$. Otherwise, fail to reject Ho.
Computation:

$$\hat{p} = \frac{100(0.44) + 100(0.65)}{100 + 100} = 0.545$$

$$Z_c = \frac{0.44 - 0.65}{\sqrt{0.545(1 - 0.545)(2/100)}} = -2.9819$$


Decision: Since $Z_c = -2.9819 < Z_{0.05} = 1.645$, we fail to reject Ho.
Conclusion: At $\alpha = 5\%$, we do not have sufficient evidence to say that the proportion of drivers from the province who use seatbelts is greater than the proportion of drivers from Manila who do.
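The pooled two-proportion Z statistic for this problem can be sketched in Python with the standard library only. The variable names are illustrative assumptions; the upper-tail p-value is an addition for the one-sided alternative P1 > P2 and is not part of the original solution.

```python
from math import sqrt
from statistics import NormalDist

x1, n1 = 44, 100  # province: seatbelt users, sample size
x2, n2 = 65, 100  # Manila: seatbelt users, sample size

p1, p2 = x1 / n1, x2 / n2
p_pool = (x1 + x2) / (n1 + n2)  # pooled proportion under Ho

z_c = (p1 - p2) / sqrt(p_pool * (1 - p_pool) * (1 / n1 + 1 / n2))
p_value = 1 - NormalDist().cdf(z_c)  # upper tail, for Ha: P1 > P2
print(round(p_pool, 3), round(z_c, 4))  # 0.545 -2.9819
```

Because the statistic is negative, the upper-tail p-value is close to 1, which is consistent with failing to reject Ho in favor of P1 > P2.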

2. IMF Telecommunications Company decided to give a call incentive to those subscribers who averaged 15 hours per month in calls last year. After promoting it and giving the incentive to the subscribers, they randomly selected 100 subscribers and asked them to evaluate the company's service before and after the promo. The following data were obtained:

[Table: evaluation before the incentive (satisfied / not satisfied) cross-classified with evaluation after the incentive (satisfied / not satisfied); cell counts as extracted: 35, 3, 56, 6]

Test if there is a difference between the proportion of subscribers who were satisfied before and after the incentive. Use $\alpha = 5\%$.


Ho: The proportion of subscribers who were satisfied before the incentive is equal to the proportion of subscribers who were satisfied after the incentive.
Ha: The proportion of subscribers who were satisfied before the incentive is not equal to the proportion of subscribers who were satisfied after the incentive.

Test Procedure: Approximate Z-test
Test Statistic:

$$Z_c = \frac{SF - FS}{\sqrt{SF + FS}}$$

where SF and FS are the two discordant counts, that is, the numbers of subscribers whose evaluation changed between the two occasions.

Decision Rule: Reject Ho if $|Z_c| > Z_{0.025} = 1.96$. Otherwise, fail to reject Ho.
Computation:

$$Z_c = \frac{SF - FS}{\sqrt{SF + FS}} = \frac{35 - 56}{\sqrt{35 + 56}} = -2.2014$$

Decision: Since $|Z_c| = 2.2014 > Z_{0.025} = 1.96$, we reject Ho.
Conclusion: At the 5% level of significance, the proportion of subscribers who were satisfied after the incentive is not equal to the proportion of subscribers who were satisfied before the incentive.
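The approximate Z statistic for two related proportions depends only on the two discordant counts. A minimal Python sketch (standard library only; names are illustrative assumptions, the counts 35 and 56 are the ones used in the computation above, and the two-sided p-value is an addition not shown in the original solution):

```python
from math import sqrt
from statistics import NormalDist

sf, fs = 35, 56  # discordant counts used in the computation

z_c = (sf - fs) / sqrt(sf + fs)
p_value = 2 * (1 - NormalDist().cdf(abs(z_c)))  # two-sided alternative

print(round(z_c, 4))  # -2.2014
print(p_value < 0.05)  # True
```

Squaring this statistic gives the uncorrected McNemar chi-square statistic, so the two forms lead to the same decision.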
