CHI-SQUARE TESTS

THE CHI-SQUARE DISTRIBUTION

Definition

The chi-square distribution has only one parameter, called the degrees of freedom (df). The shape of a chi-square distribution curve is skewed to the right for small df and becomes nearly symmetric for large df. The entire chi-square distribution curve lies to the right of the vertical axis. The chi-square distribution assumes nonnegative values only, and these are denoted by the symbol χ² (read as “chi-square”).

Figure 11.1 Three chi-square distribution curves.

Example 11-1

Find the value of χ² for 7 degrees of freedom and an area of .10 in the right tail of the chi-square distribution curve.

Table 11.1 χ² for df = 7 and .10 Area in the Right Tail

Area in the Right Tail Under the Chi-Square Distribution Curve

df     .995     …   .100      …   .005
1      .000     …   2.706     …   7.879
2      .010     …   4.605     …   10.597
.      …        …   …         …   …
7      .989     …   12.017    …   20.278
.      …        …   …         …   …
100    67.328   …   118.498   …   140.169

Required value of χ² = 12.017 (row df = 7, column .100)

Figure 11.2 Chi-square curve for df = 7, with an area of .10 in the right tail beyond χ² = 12.017.

Example 11-2

Find the value of χ² for 12 degrees of freedom and an area of .05 in the left tail of the chi-square distribution curve.

Solution 11-2

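The chi-square table is indexed by the area in the right tail, so the left-tail area is converted first:

Area in the right tail = 1 – Area in the left tail = 1 – .05 = .95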

Table 11.2 χ² for df = 12 and .95 Area in the Right Tail

Area in the Right Tail Under the Chi-Square Distribution Curve

df     .995     …   .950     …   .005
1      .000     …   .004     …   7.879
2      .010     …   .103     …   10.597
.      …        …   …        …   …
12     3.074    …   5.226    …   28.300
.      …        …   …        …   …
100    67.328   …   77.929   …   140.169

Required value of χ² = 5.226 (row df = 12, column .950)

Figure 11.3 Chi-square curve for df = 12, with an area of .05 in the left tail below χ² = 5.226.
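The table lookups in Examples 11-1 and 11-2 can be cross-checked numerically. The sketch below is an illustration only, not part of the original slides: the function names `chi2_cdf` and `chi2_value_for_right_tail` are hypothetical, the left-tail area is computed from the standard power series for the regularized incomplete gamma function, and the required value is found by bisection. The printed results should agree with the table entries 12.017 and 5.226 to three decimals.

```python
import math


def chi2_cdf(x, df):
    """Left-tail area P(X <= x) for a chi-square variable with df degrees of freedom.

    Uses the power series for the regularized lower incomplete gamma function.
    """
    if x <= 0:
        return 0.0
    a, z = df / 2.0, x / 2.0
    term = 1.0 / a
    total = term
    n = 1
    while term > 1e-16 * total:
        term *= z / (a + n)
        total += term
        n += 1
    return total * math.exp(a * math.log(z) - z - math.lgamma(a))


def chi2_value_for_right_tail(area, df):
    """Return the chi-square value that leaves `area` in the right tail (bisection)."""
    low, high = 0.0, 1.0
    while 1.0 - chi2_cdf(high, df) > area:   # widen the bracket until it contains the answer
        high *= 2.0
    for _ in range(100):
        mid = (low + high) / 2.0
        if 1.0 - chi2_cdf(mid, df) > area:
            low = mid
        else:
            high = mid
    return (low + high) / 2.0


# Example 11-1: df = 7, right-tail area .10
print(round(chi2_value_for_right_tail(0.10, 7), 3))

# Example 11-2: df = 12, left-tail area .05, so right-tail area = 1 - .05 = .95
print(round(chi2_value_for_right_tail(1 - 0.05, 12), 3))
```

In practice a statistics library would normally replace the hand-written series; it is spelled out here only so that the example has no dependencies.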

A GOODNESS-OF-FIT TEST

Definition

An experiment with the following characteristics is called a multinomial experiment.

1. It consists of n identical trials (repetitions).
2. Each trial results in one of k possible outcomes (or categories), where k > 2.
3. The trials are independent.
4. The probabilities of the various outcomes remain constant for each trial.

Definition

The frequencies obtained from the performance of an experiment are called the observed frequencies and are denoted by O. The expected frequencies, denoted by E, are the frequencies that we expect to obtain if the null hypothesis is true. The expected frequency for a category is obtained as

E = np

where n is the sample size and p is the probability that an element belongs to that category if the null hypothesis is true.

Degrees of Freedom for a Goodness-of-Fit Test

In a goodness-of-fit test, the degrees of freedom are

df = k – 1

where k denotes the number of possible outcomes (or categories) for the experiment.

Test Statistic for a Goodness-of-Fit Test

The test statistic for a goodness-of-fit test is χ², and its value is calculated as

$$\chi^2 = \sum \frac{(O - E)^2}{E}$$

where
O = observed frequency for a category
E = expected frequency for a category

A chi-square goodness-of-fit test is always right-tailed.

Example 11-3

A bank has an ATM installed inside the bank, and it is available to its customers only from 7 AM to 6 PM Monday through Friday. The manager of the bank wanted to investigate if the percentage of transactions made on this ATM is the same for each of the five days (Monday through Friday) of the week. She randomly selected one week and counted the number of transactions made on this ATM on each of the five days during this week. The information she obtained is given in the following table, where the number of users represents the number of transactions on this ATM on these days. For convenience, we will refer to these transactions as “people” or “users.”

Day               Monday   Tuesday   Wednesday   Thursday   Friday
Number of users   253      197       204         279        267

At the 1% level of significance, can we reject the null hypothesis that the proportion of people who use this ATM on each of the five days of the week is the same? Assume that this week is typical of all weeks in regard to the use of this ATM.

Solution 11-3

H0: p1 = p2 = p3 = p4 = p5 = .20
H1: At least two of the five proportions are not equal to .20

There are five categories: the five days on which the ATM is used. This is a multinomial experiment, so we use the chi-square distribution to make this test.

Area in the right tail = α = .01
k = number of categories = 5
df = k – 1 = 5 – 1 = 4
The critical value of χ² = 13.277

Figure 11.4 Rejection region: α = .01 in the right tail, beyond the critical value of χ² = 13.277.

Table 11.3

Category    Observed      p     Expected            (O – E)   (O – E)²   (O – E)²/E
(Day)       Frequency O         Frequency E = np
Monday      253           .20   1200(.20) = 240       13        169         .704
Tuesday     197           .20   1200(.20) = 240      -43       1849        7.704
Wednesday   204           .20   1200(.20) = 240      -36       1296        5.400
Thursday    279           .20   1200(.20) = 240       39       1521        6.338
Friday      267           .20   1200(.20) = 240       27        729        3.038
            n = 1200                                                 Sum = 23.184

All the required calculations to find the value of the test statistic χ² are shown in Table 11.3.

$$\chi^2 = \sum \frac{(O - E)^2}{E} = 23.184$$

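The value of the test statistic χ² = 23.184 is larger than the critical value of χ² = 13.277, so it falls in the rejection region. Hence, we reject the null hypothesis and conclude that the proportion of people who use this ATM is not the same for each of the five days of the week.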
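The goodness-of-fit calculation for Example 11-3 can be sketched in a few lines of Python. This is a minimal illustration rather than the slides' own method: the function name `chi_square_gof` is hypothetical, the critical value is copied from the chi-square table instead of being computed, and the Monday count of 253 is inferred from the sample size n = 1200 used in Table 11.3.

```python
def chi_square_gof(observed, probs):
    """Return (chi2, df) for a goodness-of-fit test.

    observed: observed frequencies O, one per category
    probs:    category probabilities p under the null hypothesis
    """
    n = sum(observed)
    expected = [n * p for p in probs]            # E = np
    chi2 = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
    df = len(observed) - 1                       # df = k - 1
    return chi2, df


# Example 11-3: Monday through Friday (Monday = 253 is inferred from n = 1200)
observed = [253, 197, 204, 279, 267]
probs = [0.20] * 5
critical_value = 13.277                          # table value for alpha = .01, df = 4

chi2, df = chi_square_gof(observed, probs)
print(f"chi2 = {chi2:.3f}, df = {df}")
print("reject H0" if chi2 > critical_value else "fail to reject H0")
```

Carried at full precision the statistic is 23.183; the value 23.184 in Table 11.3 comes from adding terms that were already rounded to three decimals. The decision is the same either way.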

Example 11-4

In a National Public Transportation survey conducted in 1995 on the modes of transportation used to commute to work, 79.6% of the respondents said that they drive alone, 11.1% car pool, 5.1% use public transit, and 4.2% depend on other modes of transportation (USA TODAY, April 14, 1999). Assume that these percentages hold true for the 1995 population of all commuting workers. Recently 1000 randomly selected workers were asked what mode of transportation they use to commute to work. The following table lists the results of this survey.

Mode of transportation   Drive alone   Car pool   Public transit   Other
Number of workers        812           102        57               29

Test at the 2.5% significance level whether the current pattern of use of transportation modes is different from that for 1995.

Solution 11-4

H0: The current percentage distribution of the use of transportation modes is the same as that for 1995.
H1: The current percentage distribution of the use of transportation modes is different from that for 1995.

There are four categories: drive alone, car pool, public transit, and other. This is a multinomial experiment, so we use the chi-square distribution to make the test.

Area in the right tail = α = .025
k = number of categories = 4
df = k – 1 = 4 – 1 = 3
The critical value of χ² = 9.348

Figure 11.5 Rejection region: α = .025 in the right tail, beyond the critical value of χ² = 9.348.

Table 11.4

Category         Observed      p      Expected             (O – E)   (O – E)²   (O – E)²/E
                 Frequency O          Frequency E = np
Drive alone      812           .796   1000(.796) = 796       16        256         .322
Car pool         102           .111   1000(.111) = 111       -9         81         .730
Public transit   57            .051   1000(.051) = 51         6         36         .706
Other            29            .042   1000(.042) = 42       -13        169        4.024
                 n = 1000                                                  Sum = 5.782

All the required calculations to find the value of the test statistic χ² are shown in Table 11.4.

$$\chi^2 = \sum \frac{(O - E)^2}{E} = 5.782$$

The value of the test statistic χ² = 5.782 is less than the critical value of χ² = 9.348, so it falls in the nonrejection region. Hence, we fail to reject the null hypothesis.
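The row-by-row work behind Table 11.4 can be reproduced with a short script. The sketch below is illustrative only; the variable names are hypothetical and the critical value 9.348 is taken from the table lookup above rather than computed. It differs from the equal-probability case in Example 11-3 only in that each category has its own p.

```python
categories = ["Drive alone", "Car pool", "Public transit", "Other"]
observed = [812, 102, 57, 29]                   # current sample, n = 1000
probs_1995 = [0.796, 0.111, 0.051, 0.042]       # 1995 percentages as proportions

n = sum(observed)
rows = []
for name, o, p in zip(categories, observed, probs_1995):
    e = n * p                                   # E = np
    rows.append((name, o, e, o - e, (o - e) ** 2 / e))

# One line per category: O, E, O - E, and the contribution (O - E)^2 / E
for name, o, e, diff, contrib in rows:
    print(f"{name:<15}{o:>5}{e:>8.1f}{diff:>8.1f}{contrib:>8.3f}")

chi2 = sum(r[4] for r in rows)
critical_value = 9.348                          # table value for alpha = .025, df = 3
print(f"chi2 = {chi2:.3f}")
print("reject H0" if chi2 > critical_value else "fail to reject H0")
```

At full precision the sum is 5.781; the 5.782 in the slides is the sum of the rounded contributions. Both fall well below the critical value.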

CONTINGENCY TABLES

University

          Full-Time   Part-Time
Male      6768        2615
Female    7658        3717

The cell containing 2615 gives the number of students who are male and enrolled part-time.

A TEST OF INDEPENDENCE OR HOMOGENEITY

A Test of Independence
A Test of Homogeneity

A Test of Independence

Definition

A test of independence involves a test of the null hypothesis that two attributes of a population are not related. The degrees of freedom for a test of independence are

df = (R – 1)(C – 1)

where R and C are the number of rows and the number of columns, respectively, in the given contingency table.

Test Statistic for a Test of Independence

The value of the test statistic χ² for a test of independence is calculated as

$$\chi^2 = \sum \frac{(O - E)^2}{E}$$

where O and E are the observed and expected frequencies, respectively, for a cell.

Example 11-5

Violence and lack of discipline have become major problems in schools in the United States. A random sample of 300 adults was selected, and they were asked if they favor giving more freedom to schoolteachers to punish students for violence and lack of discipline. The two-way classification of the responses of these adults is represented in the following table.

            In Favor (F)   Against (A)   No Opinion (N)
Men (M)     93             70            12
Women (W)   87             32            6

Calculate the expected frequencies for this table assuming that the two attributes, gender and opinions on the issue, are independent.

Solution 11-5

Table 11.6

                In Favor (F)   Against (A)   No Opinion (N)   Row Totals
Men (M)         93             70            12               175
Women (W)       87             32            6                125
Column Totals   180            102           18               300

Expected Frequencies for a Test of Independence

The expected frequency E for a cell is calculated as

$$E = \frac{(\text{Row total})(\text{Column total})}{\text{Sample size}}$$
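As a quick illustration of this formula, the sketch below computes every expected frequency for the Example 11-5 table. The helper name `expected_frequencies` is hypothetical, and the table is assumed to be given as a list of rows of observed counts.

```python
def expected_frequencies(table):
    """Expected cell counts under independence: (row total)(column total) / sample size."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    return [[r * c / n for c in col_totals] for r in row_totals]


# Observed counts from Example 11-5
# rows: Men, Women; columns: In Favor, Against, No Opinion
observed = [[93, 70, 12],
            [87, 32, 6]]

for label, row in zip(["Men", "Women"], expected_frequencies(observed)):
    print(label, row)
# Men [105.0, 59.5, 10.5]
# Women [75.0, 42.5, 7.5]
```

For instance, the Men / In Favor cell is (175)(180)/300 = 105.00.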

Table 11.7 (expected frequencies in parentheses)

                In Favor (F)   Against (A)   No Opinion (N)   Row Totals
Men (M)         93             70            12               175
                (105.00)       (59.50)       (10.50)
Women (W)       87             32            6                125
                (75.00)        (42.50)       (7.50)
Column Totals   180            102           18               300

Example 11-6

Reconsider the two-way classification table given in Example 11-5. In that example, a random sample of 300 adults was selected, and they were asked if they favor giving more freedom to schoolteachers to punish students for violence and lack of discipline. Based on the results of the survey, a two-way classification table was prepared and presented in Example 11-5. Does the sample provide sufficient information to conclude that the two attributes, gender and opinions of adults, are dependent? Use a 1% significance level.

Solution 11-6

H0: Gender and opinions of adults are independent
H1: Gender and opinions of adults are dependent

α = .01
df = (R – 1)(C – 1) = (2 – 1)(3 – 1) = 2
The critical value of χ² = 9.210

Figure 11.6 Rejection region: α = .01 in the right tail, beyond the critical value of χ² = 9.210.

Table 11.8 (expected frequencies in parentheses)

                In Favor (F)   Against (A)   No Opinion (N)   Row Totals
Men (M)         93             70            12               175
                (105.00)       (59.50)       (10.50)
Women (W)       87             32            6                125
                (75.00)        (42.50)       (7.50)
Column Totals   180            102           18               300

$$\chi^2 = \sum \frac{(O - E)^2}{E} = \frac{(93 - 105.00)^2}{105.00} + \frac{(70 - 59.50)^2}{59.50} + \frac{(12 - 10.50)^2}{10.50} + \frac{(87 - 75.00)^2}{75.00} + \frac{(32 - 42.50)^2}{42.50} + \frac{(6 - 7.50)^2}{7.50}$$

$$= 1.371 + 1.853 + .214 + 1.920 + 2.594 + .300 = 8.252$$

The value of the test statistic χ² = 8.252 is less than the critical value of χ² = 9.210, so it falls in the nonrejection region. Hence, we fail to reject the null hypothesis.
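The whole test of independence for Example 11-6 can be sketched as one function that works on any R × C table. The name `chi_square_independence` is hypothetical and the critical value 9.210 is taken from the table lookup above; this is an illustration under those assumptions, not a reference implementation.

```python
def chi_square_independence(table):
    """Return (chi2, df) for a contingency table given as a list of rows."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    chi2 = 0.0
    for i, row in enumerate(table):
        for j, o in enumerate(row):
            e = row_totals[i] * col_totals[j] / n    # expected frequency for the cell
            chi2 += (o - e) ** 2 / e
    df = (len(table) - 1) * (len(table[0]) - 1)      # df = (R - 1)(C - 1)
    return chi2, df


# Example 11-6: rows Men, Women; columns In Favor, Against, No Opinion
opinions = [[93, 70, 12],
            [87, 32, 6]]
chi2, df = chi_square_independence(opinions)
print(f"chi2 = {chi2:.3f}, df = {df}")
print("reject H0" if chi2 > 9.210 else "fail to reject H0")
```

At full precision the statistic is 8.253 rather than 8.252, because the slides add six terms that were each rounded to three decimals. The conclusion is unchanged.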

Example 11-7

A researcher wanted to study the relationship between gender and owning cell phones. She took a sample of 2000 adults and obtained the information given in the following table.

        Own Cell Phones   Do Not Own Cell Phones
Men     640               450
Women   440               470

At the 5% level of significance, can you conclude that gender and owning cell phones are related for all adults?

Solution 11-7

H0: Gender and owning a cell phone are not related
H1: Gender and owning a cell phone are related

We are performing a test of independence, so we use the chi-square distribution.

α = .05
df = (R – 1)(C – 1) = (2 – 1)(2 – 1) = 1
The critical value of χ² = 3.841

Figure 11.7 Rejection region: α = .05 in the right tail, beyond the critical value of χ² = 3.841.

Table 11.9 (expected frequencies in parentheses)

                Own Cell     Do Not Own        Row Totals
                Phones (Y)   Cell Phones (N)
Men (M)         640          450               1090
                (588.60)     (501.40)
Women (W)       440          470               910
                (491.40)     (418.60)
Column Totals   1080         920               2000

$$\chi^2 = \sum \frac{(O - E)^2}{E} = \frac{(640 - 588.60)^2}{588.60} + \frac{(450 - 501.40)^2}{501.40} + \frac{(440 - 491.40)^2}{491.40} + \frac{(470 - 418.60)^2}{418.60}$$

$$= 4.489 + 5.269 + 5.376 + 6.311 = 21.445$$

The value of the test statistic χ² = 21.445 is larger than the critical value of χ² = 3.841, so it falls in the rejection region. Hence, we reject the null hypothesis and conclude that gender and owning cell phones are related.

A Test of Homogeneity

Definition

A test of homogeneity involves testing the null hypothesis that the proportions of elements with certain characteristics in two or more different populations are the same against the alternative hypothesis that these proportions are not the same.

Example 11-8

Consider the data on income distributions for households in California and Wisconsin given in the following table:

                California   Wisconsin   Row Totals
High Income     70           34          104
Medium Income   80           40          120
Low Income      100          76          176

Using the 2.5% significance level, test the null hypothesis that the distribution of households with regard to income levels is similar (homogeneous) for the two states.

Solution 11-8

H0: The proportions of households that belong to different income groups are the same in both states
H1: The proportions of households that belong to different income groups are not the same in both states

α = .025
df = (R – 1)(C – 1) = (3 – 1)(2 – 1) = 2
The critical value of χ² = 7.378

Figure 11.8 Rejection region: α = .025 in the right tail, beyond the critical value of χ² = 7.378.

Table 11.11 (expected frequencies in parentheses)

                California   Wisconsin   Row Totals
High income     70           34          104
                (65)         (39)
Medium income   80           40          120
                (75)         (45)
Low income      100          76          176
                (110)        (66)
Column Totals   250          150         400

$$\chi^2 = \sum \frac{(O - E)^2}{E} = \frac{(70 - 65)^2}{65} + \frac{(34 - 39)^2}{39} + \frac{(80 - 75)^2}{75} + \frac{(40 - 45)^2}{45} + \frac{(100 - 110)^2}{110} + \frac{(76 - 66)^2}{66}$$

$$= .385 + .641 + .333 + .556 + .909 + 1.515 = 4.339$$

The value of the test statistic χ² = 4.339 is less than the critical value of χ² = 7.378, so it falls in the nonrejection region. Hence, we fail to reject the null hypothesis.
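To make the idea of homogeneity concrete, the sketch below first prints the sample proportion of each income level within each state, which is what the null hypothesis says should match across populations, and then computes the same χ² statistic used for a test of independence. The variable names are hypothetical and the critical value 7.378 is copied from the lookup above.

```python
# Example 11-8: rows High, Medium, Low income; columns California, Wisconsin
counts = [[70, 34],
          [80, 40],
          [100, 76]]
levels = ["High", "Medium", "Low"]

col_totals = [sum(col) for col in zip(*counts)]   # households sampled per state
row_totals = [sum(row) for row in counts]
n = sum(row_totals)

# Sample proportion of each income level within each state
for level, row in zip(levels, counts):
    shares = [o / t for o, t in zip(row, col_totals)]
    print(f"{level:<8}" + "".join(f"{s:>8.3f}" for s in shares))

# Expected frequency for a cell is (row total)(column total) / n, as before
chi2 = sum(
    (o - r * c / n) ** 2 / (r * c / n)
    for row, r in zip(counts, row_totals)
    for o, c in zip(row, col_totals)
)
df = (len(counts) - 1) * (len(col_totals) - 1)
print(f"chi2 = {chi2:.3f}, df = {df}")
print("reject H0" if chi2 > 7.378 else "fail to reject H0")
```

The within-state proportions differ somewhat (for example, .280 versus about .227 for high income), but not by enough to push the statistic past the critical value.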

INFERENCES ABOUT THE POPULATION VARIANCE

Estimation of the Population Variance
Hypothesis Tests About the Population Variance

Sampling Distribution of (n – 1)s² / σ²

If the population from which the sample is selected is (approximately) normally distributed, then

$$\frac{(n - 1)s^2}{\sigma^2}$$

has a chi-square distribution with n – 1 degrees of freedom.

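Estimation of the Population Variance

Assuming that the population from which the sample is selected is (approximately) normally distributed, the (1 – α)100% confidence interval for the population variance σ² is

$$\frac{(n - 1)s^2}{\chi^2_{\alpha/2}} \quad \text{to} \quad \frac{(n - 1)s^2}{\chi^2_{1-\alpha/2}}$$

where the two χ² values are read from the chi-square table for n – 1 degrees of freedom, with areas α/2 and 1 – α/2, respectively, in the right tail.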

Example 11-9

One type of cookie manufactured by Haddad Food Company is Cocoa Cookies. The machine that fills packages of these cookies is set up in such a way that the average net weight of these packages is 32 ounces with a variance of .015 square ounce.

From time to time the quality control inspector at the company selects a sample of a few such packages, calculates the variance of the net weights of these packages, and constructs a 95% confidence interval for the population variance. If either one or both of the two limits of this confidence interval fall outside the interval .008 to .030, the machine is stopped and adjusted.

A recently taken random sample of 25 packages from the production line gave a sample variance of .029 square ounce. Based on this sample information, do you think the machine needs an adjustment? Assume that the net weights of cookies in all packages are normally distributed.

Solution 11-9

n = 25 and s² = .029
α = 1 – .95 = .05
α / 2 = .05 / 2 = .025
1 – α / 2 = 1 – .025 = .975
df = n – 1 = 25 – 1 = 24
χ² for 24 df and .025 area in the right tail = 39.364
χ² for 24 df and .975 area in the right tail = 12.401

Figure 11.9 Chi-square curves for df = 24. The χ² value for α/2 is 39.364, which leaves an area of .025 in the right tail; the χ² value for 1 – α/2 is 12.401, which leaves an area of .975 in the right tail.

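$$\frac{(n - 1)s^2}{\chi^2_{\alpha/2}} \quad \text{to} \quad \frac{(n - 1)s^2}{\chi^2_{1-\alpha/2}}$$

$$\frac{(25 - 1)(.029)}{39.364} \quad \text{to} \quad \frac{(25 - 1)(.029)}{12.401}$$

.0177 to .0561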

Thus, with 95% confidence, we can state that the variance for all packages of Cocoa Cookies lies between .0177 and .0561 square ounce. Because the upper limit .0561 falls outside the interval .008 to .030, the machine should be stopped and adjusted.
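The interval arithmetic in Solution 11-9 is easy to script. The sketch below is an illustration with a hypothetical function name, `variance_confidence_interval`; the two chi-square values are passed in from the table rather than computed, and the adjustment rule is the one stated in the example.

```python
def variance_confidence_interval(n, s2, chi2_upper, chi2_lower):
    """Confidence interval for the population variance.

    chi2_upper: table value with area alpha/2 in the right tail
    chi2_lower: table value with area 1 - alpha/2 in the right tail
    Both are looked up for n - 1 degrees of freedom.
    """
    numerator = (n - 1) * s2
    return numerator / chi2_upper, numerator / chi2_lower


# Example 11-9: n = 25, s^2 = .029, 95% confidence, df = 24
low, high = variance_confidence_interval(25, 0.029, 39.364, 12.401)
print(f"{low:.4f} to {high:.4f}")

# Rule from the example: both limits must lie within .008 to .030
needs_adjustment = not (0.008 <= low and high <= 0.030)
print("adjust machine" if needs_adjustment else "no adjustment needed")
```

Note that the larger chi-square value produces the lower limit, since it appears in the denominator.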

Hypothesis Tests About the Population Variance

The value of the test statistic χ² is calculated as

$$\chi^2 = \frac{(n - 1)s^2}{\sigma^2}$$

where s² is the sample variance, σ² is the hypothesized value of the population variance, and n – 1 represents the degrees of freedom. The population from which the sample is selected is assumed to be (approximately) normally distributed.

Example 11-10

One type of cookie manufactured by Haddad Food Company is Cocoa Cookies. The machine that fills packages of these cookies is set up in such a way that the average net weight of these packages is 32 ounces with a variance of .015 square ounce. From time to time the quality control inspector at the company selects a sample of a few such packages, calculates the variance of the net weights of these packages, and makes a test of hypothesis about the population variance.

She always uses α = .01. The acceptable value of the population variance is .015 square ounce or less. If the conclusion from the test of hypothesis is that the population variance is not within the acceptable limit, the machine is stopped and adjusted.

A recently taken random sample of 25 packages from the production line gave a sample variance of .029 square ounce. Based on this sample information, do you think the machine needs an adjustment? Assume that the net weights of cookies in all packages are normally distributed.

Solution 11-10

H0: σ² ≤ .015 (The population variance is within the acceptable limit)
H1: σ² > .015 (The population variance exceeds the acceptable limit)

α = .01
df = n – 1 = 25 – 1 = 24
The critical value of χ² = 42.980

Figure 11.10 Rejection region: α = .01 in the right tail, beyond the critical value of χ² = 42.980.

$$\chi^2 = \frac{(n - 1)s^2}{\sigma^2} = \frac{(25 - 1)(.029)}{.015} = 46.400$$

Here σ² = .015 is the value from H0.

The value of the test statistic χ² = 46.400 is greater than the critical value of χ² = 42.980. Hence, we reject the null hypothesis H0 and conclude that the population variance is not within the acceptable limit. The machine should be stopped and adjusted.

Example 11-11

The variance of scores on a standardized mathematics test for all high school seniors was 150 in 2002. A sample of scores for 20 high school seniors who took this test this year gave a variance of 170. Test at the 5% significance level if the variance of current scores of all high school seniors on this test is different from 150. Assume that the scores of all high school seniors on this test are (approximately) normally distributed.

Solution 11-11

H0: σ² = 150 (The population variance is not different from 150)
H1: σ² ≠ 150 (The population variance is different from 150)

α = .05
Area in each tail = α / 2 = .025
df = n – 1 = 20 – 1 = 19
The critical values of χ² are 32.852 and 8.907

Figure 11.11 Two rejection regions, each with area α/2 = .025: to the left of the critical value χ² = 8.907 and to the right of the critical value χ² = 32.852.

$$\chi^2 = \frac{(n - 1)s^2}{\sigma^2} = \frac{(20 - 1)(170)}{150} = 21.533$$

Here σ² = 150 is the value from H0.

The value of the test statistic χ² = 21.533 is between the two critical values of χ², 8.907 and 32.852, so it falls in the nonrejection region. Consequently, we fail to reject H0.
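Examples 11-10 and 11-11 use the same test statistic and differ only in where the rejection region lies. The sketch below illustrates both cases; the function names `variance_test_statistic` and `decide` are hypothetical, and the critical values are the table values quoted in the solutions rather than computed ones.

```python
def variance_test_statistic(n, s2, sigma2_0):
    """chi2 = (n - 1) s^2 / sigma^2, with sigma^2 taken from H0."""
    return (n - 1) * s2 / sigma2_0


def decide(chi2, upper_critical, lower_critical=None):
    """Right-tailed test if only upper_critical is given; two-tailed if both are given."""
    if chi2 > upper_critical:
        return "reject H0"
    if lower_critical is not None and chi2 < lower_critical:
        return "reject H0"
    return "fail to reject H0"


# Example 11-10: right-tailed, alpha = .01, df = 24
chi2_cookies = variance_test_statistic(25, 0.029, 0.015)
print(f"{chi2_cookies:.3f}", decide(chi2_cookies, 42.980))

# Example 11-11: two-tailed, alpha = .05, df = 19
chi2_scores = variance_test_statistic(20, 170, 150)
print(f"{chi2_scores:.3f}", decide(chi2_scores, 32.852, 8.907))
```

A left-tailed test is not covered by this helper because none appears in the examples above.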
