
MODULE 1 Case of One Sample

This module discusses selected nonparametric methods for one-sample cases. The topics included are the following:

1. The Binomial Test (Dichotomous Test)
2. The Chi-square Goodness-of-fit Test
3. The Kolmogorov-Smirnov One-sample Test
4. Test for Distributional Symmetry
5. The One-sample Runs Test for Randomness
6. The Change-point Test
   a. Method for Binomial Variables
   b. Method for Continuous Variables

This module also provides detailed descriptions of when and how to use the nonparametric methods mentioned above. One or more examples are presented at the end of each topic.


At the end of this module, you should be able to:

1. understand the different nonparametric tests applied to one-sample cases;
2. know when and how to use these tests; and
3. know how to use SPSS, SAS, and RStudio for these types of procedures.



1. The Binomial Test

When do we apply the binomial test? The binomial test is used when we have dichotomous data, that is, data that involve only two outcomes. Examples of dichotomous data are male or female, yes or no, agree or disagree, category A or category B, and many others. We use this test to determine whether the proportions of objects or individuals falling in each category differ from chance or from some prespecified probabilities of falling into those categories.

What are the assumptions of the binomial test?

1. The experiment consists of repeated trials.
2. The outcome of each trial can be classified as a success or a failure.
3. The probability of success remains constant from trial to trial.
4. The trials are independent.

Procedure: To test the null hypothesis $H_0$ about the probability $p$ of one category (with $q = 1 - p$):

1. Determine $N$, the total number of cases observed.
2. Determine the frequencies of the observed occurrences in each of the two categories.
3. The method of finding the probability of occurrence under $H_0$ of the observed values, or values even more extreme, depends upon the sample size:
   a. If $N \leq 35$, Appendix Table D gives the one-tailed probabilities under $H_0$ of various values as small as an observed $Y$, where $Y$ is the observed frequency in one of the categories. Specify $H_1$, and determine whether the test should be one-tailed or two-tailed.
   b. If $N > 35$, test $H_0$ using

      $$z = \frac{(Y \pm 0.5) - Np}{\sqrt{Npq}}$$

      where $Y + 0.5$ is used when $Y < Np$ and $Y - 0.5$ is used when $Y > Np$. Appendix Table A gives the probability associated with the occurrence under $H_0$ of values as large as an observed $z$. Table A gives one-tailed probabilities; for a two-tailed test, double the obtained probability.
4. If the probability associated with the observed value or an even more extreme value is equal to or less than $\alpha$, reject $H_0$. Otherwise, do not reject $H_0$.

Remark:
1. As $N$ increases, the binomial distribution tends toward the normal distribution.
2. The tendency is rapid when $p$ is close to $1/2$, but it is slower when $p$ is close to 0 or 1.

Example: In a study of the effects of stress, an experimenter taught 18 college students 2 different methods of tying the same knot. Half of the respondents (randomly selected from the group of 18) learned method A first, and half learned method B first. The prediction was that stress would induce regression, i.e., that the subject would revert to the first-learned method of tying the knot. Each subject was categorized according to whether the subject used the knot-tying method learned first or the one learned second when asked to tie the knot under stress.

Knot-tying method chosen under stress

Method chosen    First-learned    Second-learned    Total
Frequency        16               2                 18

Solution: To be encoded.
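The procedure above can be sketched in Python for the knot-tying data. This is a minimal illustration, not the module's own solution: it assumes $H_0: p = q = 1/2$ and a one-tailed alternative (regression to the first-learned method), and the function names `binomial_tail_lower` and `normal_approx_z` are hypothetical helpers introduced only for this sketch.

```python
from math import comb, sqrt

def binomial_tail_lower(y, n, p=0.5):
    """P(Y <= y) for Y ~ Binomial(n, p), by direct summation of the binomial terms."""
    return sum(comb(n, k) * p**k * (1 - p)**(n - k) for k in range(y + 1))

def normal_approx_z(y, n, p=0.5):
    """Large-sample z with continuity correction: +0.5 when y < n*p, otherwise -0.5."""
    q = 1 - p
    correction = 0.5 if y < n * p else -0.5
    return (y + correction - n * p) / sqrt(n * p * q)

# Knot-tying example: 18 subjects, 2 of whom used the second-learned method.
n_cases = 18
y_second_learned = 2

# N <= 35, so the exact one-tailed probability is used (assumed H0: p = 1/2).
p_one_tailed = binomial_tail_lower(y_second_learned, n_cases)
print(f"one-tailed p-value = {p_one_tailed:.6f}")

# The z approximation is meant for N > 35; it is evaluated here only to show the formula.
print(f"z approximation    = {normal_approx_z(y_second_learned, n_cases):.3f}")
```

Under these assumptions the exact one-tailed probability is 172/262144, about 0.0007, which can be compared with the chosen $\alpha$ and with the SPSS output.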


For further illustrations and to know how to use SPSS open the website:

Using the data given in the example above, run the test in SPSS. Are the results the same?

2. The Chi-square Goodness-of-fit Test

When do we use the chi-square goodness-of-fit test? This test is used to determine whether a significant difference exists between the observed number of objects or responses falling in each category and the expected number based upon the null hypothesis. In other words, the test assesses the degree of correspondence between the observed and expected frequencies in each category.

Assumptions:
1. When $k = 2$ ($df = 1$), $E_i \geq 5$ for all $i$ (otherwise use the binomial test).
2. When $k > 2$, no more than 20% of the $E_i$ should be smaller than 5 and none should be smaller than 1.
3. The data are in discrete categories (nominal or ordinal).

Procedures:
1. Cast the observed frequencies into the $k$ categories. The sum of the frequencies should be $N$, the number of independent observations.
2. Determine the expected frequencies (the $E_i$'s) for each of the $k$ cells. When $k > 2$ and more than 20% of the $E_i$'s are less than 5, combine adjacent categories where this is reasonable, thereby reducing $k$ and increasing the values of some $E_i$'s. When $k = 2$, the chi-square test for the one-sample goodness-of-fit test is accurate only if each expected frequency is 5 or larger.
3. Compute $X^2$ using

   $$X^2 = \sum_{i=1}^{k} \frac{(O_i - E_i)^2}{E_i}$$

   where $O_i$ is the observed frequency and $E_i$ is the expected frequency in category $i$.
4. Determine the degrees of freedom, $df = k - n_p - 1$, where $n_p$ is the number of parameters estimated from the data and used in calculating the expected frequencies.
5. By reference to Appendix Table C, determine the probability associated with the occurrence under $H_0$ of a value as large as the observed value of $X^2$ for the degrees of freedom appropriate for the data. If the probability is less than or equal to $\alpha$, reject $H_0$.

Example: Horse-racing fans often maintain that in a race around a circular track significant advantages accrue to the horses in certain post positions. Any horse's post position is its assigned post in the starting line-up. Position 1 is closest to the rail on the inside of the track; position 8 is on the outside, farthest from the rail in an eight-horse race. We may test the effect of post position by analyzing the race results, given according to post position, for the first month of racing in the season at a particular circular track.

Wins accrued on a circular track by horses from eight post positions

Post position    1    2    3    4    5    6    7    8    Total
No. of wins     29   19   18   25   17   10   15   11     144
Expected        18   18   18   18   18   18   18   18     144
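As a rough check of steps 1 to 4, the statistic for the post-position data can be computed with a few lines of Python. This is only a sketch: it assumes $H_0$ states that wins are equally likely from every post position, so no parameters are estimated from the data ($n_p = 0$), and the variable names are illustrative.

```python
# Observed wins by post position 1..8 (from the table above).
observed = [29, 19, 18, 25, 17, 10, 15, 11]

n_total = sum(observed)          # N, the number of independent observations
k = len(observed)                # number of categories
expected = [n_total / k] * k     # equal expected frequency under the assumed H0

# X^2 = sum over categories of (O_i - E_i)^2 / E_i
chi_square = sum((o - e) ** 2 / e for o, e in zip(observed, expected))

n_params_estimated = 0           # assumption: nothing estimated from the data
df = k - n_params_estimated - 1

print(f"N = {n_total}, expected per position = {expected[0]:.0f}")
print(f"X^2 = {chi_square:.2f} with df = {df}")
```

The resulting $X^2$ is then referred to Appendix Table C with the computed degrees of freedom, as in step 5.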

3. The Kolmogorov-Smirnov One-sample Test

Applications:
- Another test for goodness of fit.
- Concerned with the degree of agreement between the distribution of a set of sample values (observed scores) and some specified theoretical distribution.
- Determines whether the scores in a sample can reasonably be thought to have come from a population having the theoretical distribution.

Assumptions:
1. The distribution of the underlying variable being tested is continuous, as specified by the cumulative frequency distribution.
2. The variables being measured are on at least an ordinal scale.

Procedures:
1. Specify the theoretical cumulative distribution, i.e., the cumulative frequency distribution expected under $H_0$.
2. Arrange the observed scores into a cumulative distribution and convert the cumulative frequencies into cumulative relative frequencies $S_N(X_i)$. For each interval, find the expected cumulative relative frequency $F_0(X_i)$.
3. Compute the test statistic given by

   $$D = \max_i \left| F_0(X_i) - S_N(X_i) \right|$$

4. Refer to Appendix Table F to find the probability (two-tailed) associated with the occurrence under $H_0$ of values as large as the observed value of $D$. If that probability is equal to or less than $\alpha$, reject $H_0$.

Example: Data concerning the duration of strikes which began in 1965 in the United Kingdom were collected and analyzed, and predictions were made with the use of a mathematical model.

Max duration    Cumulative frequency      Relative cumulative frequency
(days)          Observed    Predicted     S_N(X)    F_0(X)    |F_0(X) - S_N(X)|
1-2             203         212.81        0.242     0.253     0.012
2-3             352         348.26        0.419     0.415     0.004
3-4             452         442.06        0.538     0.526     0.012
4-5             523         510.45        0.623     0.608     0.015
5-6             572         562.15        0.681     0.669     0.012
6-7             605         602.34        0.720     0.717     0.003
7-8             634         634.27        0.755     0.755     0.000
8-9             660         660.10        0.786     0.786     0.000
9-10            683         681.32        0.813     0.811     0.002
10-11           697         698.97        0.830     0.832     0.002
11-12           709         713.82        0.844     0.850     0.006
12-13           718         726.44        0.855     0.865     0.010
13-14           729         737.26        0.868     0.878     0.010
14-15           744         746.61        0.886     0.889     0.003
15-16           750         754.74        0.893     0.898     0.006
16-17           757         761.86        0.901     0.907     0.006
17-18           763         768.13        0.908     0.914     0.006
18-19           767         773.68        0.913     0.921     0.008
19-20           771         778.62        0.918     0.927     0.009
20-25           788         796.68        0.938     0.948     0.010
25-30           804         807.86        0.957     0.962     0.005
30-35           812         815.25        0.967     0.971     0.004
35-40           820         820.39        0.976     0.977     0.000
40-50           832         826.86        0.990     0.984     0.006
>50             840         840.01        1.000     1.000     0.000
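The statistic $D$ for the strike-duration data can be recomputed directly from the two cumulative frequency columns. The sketch below assumes $N = 840$ (the final observed cumulative frequency) and uses illustrative variable names; it is not a substitute for Appendix Table F.

```python
# Cumulative frequencies from the table above (25 duration classes).
cf_observed = [203, 352, 452, 523, 572, 605, 634, 660, 683, 697, 709, 718, 729,
               744, 750, 757, 763, 767, 771, 788, 804, 812, 820, 832, 840]
cf_predicted = [212.81, 348.26, 442.06, 510.45, 562.15, 602.34, 634.27, 660.10,
                681.32, 698.97, 713.82, 726.44, 737.26, 746.61, 754.74, 761.86,
                768.13, 773.68, 778.62, 796.68, 807.86, 815.25, 820.39, 826.86,
                840.01]

n_obs = cf_observed[-1]  # assumed sample size N: the last cumulative frequency

# Relative cumulative frequencies S_N(X) and F_0(X).
s_n = [cf / n_obs for cf in cf_observed]
f_0 = [cf / n_obs for cf in cf_predicted]

# D is the largest absolute difference between the two step functions.
differences = [abs(f - s) for f, s in zip(f_0, s_n)]
d_stat = max(differences)
d_index = differences.index(d_stat)  # 0-based row at which the maximum occurs

print(f"D = {d_stat:.4f} at table row {d_index + 1}")
```

The maximum occurs in the 4-5 day class, matching the 0.015 entry in the last column of the table.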

Power: The Kolmogorov-Smirnov test is more powerful than the chi-square goodness-of-fit test when $N$ is small.

4. Test for Distributional Symmetry

Applications:
- The test is used to determine whether the sample was drawn from a symmetrical distribution with an unknown median.
- The test involves the examination of subsets of three observations (triples) to determine the likelihood that the distribution is skewed to the left or to the right.

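Assumptions:
1. The variables are measured on at least an ordinal scale.
2. The test is reasonably good for samples of roughly 20 or more observations.

Procedures:
1. For each subset of size 3 in the sequence of observations, determine whether it is a right triple, a left triple, or neither. The coding is given as follows:
   Right triple (the middle observation is closer to the smallest than to the largest): $+1/3$
   Left triple (the middle observation is closer to the largest than to the smallest): $-1/3$
   Neither (the middle observation is equidistant from the other two): $0$
2. Calculate the quantities defined as follows:
   $T$ = (number of right triples) $-$ (number of left triples),
   $B_i$ = (number of right triples involving $X_i$) $-$ (number of left triples involving $X_i$), for each observation $X_i$, and
   $B_{jk}$ = (number of right triples involving both $X_j$ and $X_k$) $-$ (number of left triples involving both $X_j$ and $X_k$), for each pair $j < k$.
3. Compute the test statistic $z = T / \sigma_T$, where

   $$\sigma_T^2 = \frac{(N-3)(N-4)}{(N-1)(N-2)} \sum_{i=1}^{N} B_i^2 + \frac{N-3}{N-4} \sum_{j<k} B_{jk}^2 + \frac{N(N-1)(N-2)}{6} - \left[ 1 - \frac{(N-3)(N-4)(N-5)}{N(N-1)(N-2)} \right] T^2$$

4. The significance of $z$ may be found by use of Appendix Table A.

Rejection region: for a two-tailed test, reject $H_0$ if $z \geq z_{\alpha/2}$ or $z \leq -z_{\alpha/2}$.

Example: For purposes of illustrating the test, the data for the saltiness judgment for one level of salt concentration are analyzed: 13.53, 28.42, 48.11, 48.64, 51.40, 51.91, 67.98, 79.13, and 103.05. Test the null hypothesis that the distribution of saltiness judgments is symmetric against the alternative that the distribution of saltiness judgments is asymmetric at significance level $\alpha$.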
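Steps 1 and 2 are tedious by hand because every triple must be classified. The Python sketch below counts right and left triples for the saltiness data and forms $T$ and the $B_i$; it assumes the usual sign-based coding of a triple (the sum of three signs, which is $+1$ for a right triple, $-1$ for a left triple, and $0$ otherwise), and it stops short of the variance and the $z$ statistic. The helper names are hypothetical.

```python
from itertools import combinations

def sign(value):
    """Return +1, 0, or -1 according to the sign of value."""
    return (value > 0) - (value < 0)

def triple_code(a, b, c):
    """+1 for a right triple, -1 for a left triple, 0 for neither.

    The three arguments of sign() always sum to zero, so the result is
    always one of -1, 0, +1 (that is, three times the 1/3-coded value).
    """
    return sign(a + b - 2 * c) + sign(a + c - 2 * b) + sign(b + c - 2 * a)

def triple_counts(data):
    """Return (right, left, neither, T, B), where B[i] is B_i for observation i."""
    right = left = neither = 0
    b = [0] * len(data)
    for i, j, k in combinations(range(len(data)), 3):
        code = triple_code(data[i], data[j], data[k])
        if code > 0:
            right += 1
        elif code < 0:
            left += 1
        else:
            neither += 1
        for index in (i, j, k):
            b[index] += code  # each triple contributes to all three members
    return right, left, neither, right - left, b

saltiness = [13.53, 28.42, 48.11, 48.64, 51.40, 51.91, 67.98, 79.13, 103.05]
right, left, neither, t_stat, b_values = triple_counts(saltiness)
print(f"right = {right}, left = {left}, neither = {neither}, T = {t_stat}")
print("B_i =", b_values)
```

Since every triple is counted once for each of its three members, the $B_i$ always sum to $3T$, which gives a quick arithmetic check before computing $\sigma_T^2$.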

5. The One-Sample Runs Test of Randomness

Applications:
- Used to test the hypothesis that the sample taken from a population is random or independent, based on the order or sequence in which the scores were obtained.
- The technique is based on the number of runs which a sample exhibits.

A run is a succession of identical symbols which are followed and preceded by different symbols or by no symbols at all.

Procedures:
1. Arrange the $m$ and $n$ observations in their order of occurrence, where
   $m$ = number of elements of one kind,
   $n$ = number of elements of the other kind, and
   $N = m + n$ = number of binary events.
2. Count the number of runs $r$.
3. a. If $m$ and $n$ are both 20 or less, use Table G. Table G gives the critical values of $r$ for a two-tailed test with $\alpha = 0.05$. For a two-tailed test, $H_0$ is rejected if $r$ is less than or equal to the lower critical value or greater than or equal to the upper critical value. For a one-tailed test with $\alpha = 0.025$, $H_0$ is rejected if $r$ is less than or equal to the lower critical value for $H_1$: few runs observed, or if $r$ is greater than or equal to the upper critical value for $H_1$: many runs observed.
   b. If either $m$ or $n$ is larger than 20, use

      $$z = \frac{r + h - \dfrac{2mn}{N} - 1}{\sqrt{\dfrac{2mn(2mn - N)}{N^2 (N - 1)}}}$$

      where $h = +0.5$ for $r < 2mn/N + 1$ and $h = -0.5$ for $r > 2mn/N + 1$. Use Table A for the critical values of $z$.

Example 1: The data below show the aggression scores of a sample of 24 children. Aggression scores in order of occurrence: 31, 23, 36, 43, 51, 44, 12, 26, 43, 75, 2, 3, 15, 18, 78, 24, 13, 27, 86, 61, 13, 7, 6, 8. Test the null hypothesis that the aggression scores occur randomly above and below the median throughout the experiment against the alternative that the aggression scores are not random at significance level $\alpha$.

Example 2: The data below show the order of 30 males (M) and 20 females (F) in a queue at a theater box office: M,F,M,F,M,M,M,F,F,M,F,M,F,M,F,M,M,M,M,F,M,F,M,F,M,M,F,F,F,M,F,M,F,M,F,M,M,F,M,M,F,M,M,M,M,F,M,F,M,M. Test the null hypothesis that the order of males and females in the queue is random against the alternative that the order of males and females in the queue is not random at significance level $\alpha$.
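Counting runs and evaluating the large-sample formula can be sketched in Python for both examples. The sketch assumes that scores in Example 1 are coded as above or below the sample median (no score equals the median here) and that the large-sample $z$ with the $h = \pm 0.5$ correction is appropriate for Example 2 because $m = 30 > 20$. The function names are illustrative only.

```python
from math import sqrt
from statistics import median

def count_runs(symbols):
    """Number of runs in a non-empty sequence of symbols."""
    runs = 1
    for previous, current in zip(symbols, symbols[1:]):
        if current != previous:
            runs += 1
    return runs

def runs_z(symbols):
    """Large-sample z for the runs test on a sequence with exactly two symbol kinds."""
    kinds = sorted(set(symbols))
    m = symbols.count(kinds[0])
    n = symbols.count(kinds[1])
    total = m + n
    r = count_runs(symbols)
    centre = 2 * m * n / total + 1
    h = 0.5 if r < centre else -0.5          # continuity correction
    spread = sqrt(2 * m * n * (2 * m * n - total) / (total ** 2 * (total - 1)))
    return (r + h - centre) / spread

# Example 1: code each aggression score as above (+) or below (-) the median.
scores = [31, 23, 36, 43, 51, 44, 12, 26, 43, 75, 2, 3,
          15, 18, 78, 24, 13, 27, 86, 61, 13, 7, 6, 8]
cut = median(scores)
signs = ["+" if s > cut else "-" for s in scores]
r_scores = count_runs(signs)
print(f"Example 1: median = {cut}, m = {signs.count('+')}, "
      f"n = {signs.count('-')}, r = {r_scores}")

# Example 2: the queue, written in blocks of ten for easier proofreading.
queue = list("MFMFMMMFFM" "FMFMFMMMMF" "MFMFMMFFFM" "FMFMFMMFMM" "FMMMMFMFMM")
r_queue = count_runs(queue)
z_queue = runs_z(queue)
print(f"Example 2: r = {r_queue}, z = {z_queue:.2f}")
```

For Example 1 both $m$ and $n$ are 12, so the run count is referred to Table G; for Example 2 the computed $z$ is referred to Table A.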

6. The Change-Point Test

Applications: Useful when one wishes to test the hypothesis that there has been a change in the distribution of a sequence of events.

A. Method for Binomial Variables: appropriate when the data are binary and are observations of some binomial process.

Procedures:
1. Code each of the $N$ observations as 1 or 0 for success or failure, respectively.
2. Calculate the total number of successes, $m$, in the $N$ observations. Let $n = N - m$.
3. Calculate the test statistic given by

   $$D_{m,n} = \max_j \left| \frac{N}{mn} \left( S_j - \frac{jm}{N} \right) \right|$$

   where $S_j$ is the cumulative number of successes up to and including observation $j$.
4. For small samples (i.e., $m$ and $n$ are both less than or equal to 25), use Table LII, and for large samples use Table LIII, to determine whether $H_0$ is rejected.

Note: If $D_{m,n}$ is greater than or equal to the tabled critical value, reject $H_0$.

Example: In a study of the effect of a change in payoff in a two-choice probability learning task, the payoff or reward given to a subject was changed (or not changed). The hypothesis was that a change in payoff for correct responses would affect the level of responding by the subject. The experimenter wished to determine whether there was a change in the parameter of the binary sequence of responses over the last 240 trials. If there was a change for those subjects who experienced a change in payoff, then it might be concluded that the change in payoff induced a change in response level. To illustrate the test, the response sequences for two subjects will be analyzed. Subject A received 10 cents for each correct response throughout the experiment. Subject B received 10 cents until trial 120, after which the payoff was reduced to 1 cent for each correct response. The data are summarized below.

Data for two subjects in probability learning experiment

                                   Success    Failure    Total
Subject A (no change in payoff)    178        62         240
Subject B (change in payoff)       167        73         240

Test the following hypotheses at significance level $\alpha$:
Ho: There is no change point in the sequence of responses over the last 240 trials.
H1: There is a change point in the sequence of responses over the last 240 trials.
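The statistic in step 3 needs the full trial-by-trial sequence, which the summary table does not give. The Python sketch below therefore uses a short made-up sequence purely to show the arithmetic; the data, the function name `change_point_binomial`, and the assumption that the sequence contains at least one success and one failure are all illustrative, not part of the module's example.

```python
def change_point_binomial(responses):
    """Return (D, j) for a 0/1 sequence: the maximum of
    |N/(m*n) * (S_j - j*m/N)| and the position j where it occurs.

    Assumes the sequence contains at least one success and one failure.
    """
    total = len(responses)
    m = sum(responses)          # number of successes
    n = total - m               # number of failures
    best_d, best_j = 0.0, None
    successes_so_far = 0
    for j, outcome in enumerate(responses, start=1):
        successes_so_far += outcome
        d = abs(total / (m * n) * (successes_so_far - j * m / total))
        if d > best_d:
            best_d, best_j = d, j
    return best_d, best_j

# Hypothetical sequence: mostly failures early, mostly successes late.
hypothetical = [0, 0, 1, 0, 0, 0, 1, 1, 1, 0, 1, 1]
d_stat, change_at = change_point_binomial(hypothetical)
print(f"D = {d_stat:.3f}, largest discrepancy after trial {change_at}")
```

With the real 240-trial sequences for Subjects A and B, the same function would give the $D_{m,n}$ values to compare with the tabled critical values.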

B. Method for Continuous Variables

Steps:
1. Rank the observations in the sequence of $N$ observations from 1 to $N$.
2. Calculate the sum of the ranks $W_j$ for each point $j$ in the sequence of observations.
3. For each point $j$ in the sequence, use $|2W_j - j(N+1)|$ to calculate the difference between the observed and predicted sum of ranks. $K_{m,n} = \max_j |2W_j - j(N+1)|$ is the maximum, and it divides the sequence into the $m$ observations before the change and the $n$ observations after the change.
4. Depending upon the values of $m$ and $n$, the method for testing varies.
   a. Small samples. At the point at which the maximum occurs, use the values $W$, $m$, and $n$ to enter Appendix Table J to determine whether to reject the null hypothesis that there is no change in the sequence in favor of $H_1$.
   b. Large samples ($m > 10$ or $n > 10$). Use the observed values of $W$ and $m$ to calculate the value of $z$ using

      $$z = \frac{W + h - m(N+1)/2}{\sqrt{mn(N+1)/12}}$$

      where $h = +0.5$ if $W < m(N+1)/2$ and $h = -0.5$ if $W > m(N+1)/2$. If the observed value of $z$ exceeds the critical value of $z$ found in the normal table, reject $H_0$.

Example: Study of the effects of amphetamine on neuronal activity. Two researchers were measuring the firing rate of neurons in the caudate nucleus as a function of time after injection of various isomers of amphetamine. The researchers wanted to know whether there was a change in firing rate during the time that the measurements were being taken. If a change occurred, it would be evidence for action of the drug at the site where the measurements were made. Let $\alpha$ be the chosen significance level.
Ho: There is no change in the neuronal firing rate as a function of time.
H1: There is a change in the neuronal firing rate as a function of time.
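The ranking and cumulative-sum steps can be sketched in Python as follows. The firing-rate measurements are not listed in this module, so the sequence below is invented for illustration only; the helper names are hypothetical, ties are given average ranks (an assumption), and the $z$ formula is the large-sample version with the $h = \pm 0.5$ correction.

```python
from math import sqrt

def average_ranks(values):
    """Ranks from 1..N, with tied values sharing the average of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    start = 0
    while start < len(order):
        end = start
        while end + 1 < len(order) and values[order[end + 1]] == values[order[start]]:
            end += 1
        shared = (start + end) / 2 + 1
        for position in range(start, end + 1):
            ranks[order[position]] = shared
        start = end + 1
    return ranks

def change_point_ranks(values):
    """Return (K, m, W): the maximum of |2*W_j - j*(N+1)| over j = 1..N-1,
    the point m at which it occurs, and the rank sum W of the first m values."""
    total = len(values)
    ranks = average_ranks(values)
    best_k, best_m, best_w = -1.0, None, None
    rank_sum = 0.0
    for j in range(1, total):
        rank_sum += ranks[j - 1]
        k = abs(2 * rank_sum - j * (total + 1))
        if k > best_k:
            best_k, best_m, best_w = k, j, rank_sum
    return best_k, best_m, best_w

def rank_sum_z(w, m, total):
    """Large-sample z for the rank sum W of the first m of N observations."""
    n = total - m
    centre = m * (total + 1) / 2
    h = 0.5 if w < centre else -0.5
    return (w + h - centre) / sqrt(m * n * (total + 1) / 12)

# Hypothetical firing rates: lower early, higher late.
hypothetical_rates = [4.1, 3.8, 4.4, 4.0, 3.9, 6.2, 5.8, 6.5, 6.1, 5.9, 6.4, 6.0]
k_stat, m_point, w_sum = change_point_ranks(hypothetical_rates)
z_value = rank_sum_z(w_sum, m_point, len(hypothetical_rates))
print(f"K = {k_stat}, m = {m_point}, W = {w_sum}, z = {z_value:.3f}")
```

For small $m$ and $n$ the values $W$, $m$, and $n$ would be taken to Appendix Table J instead of using $z$.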



In this module you have read about some of the nonparametric methods for one-sample cases. You should now be able to identify when each test is used and to carry out the tests with statistical software such as SPSS, SAS, and RStudio.


1. Siegel, Sidney, and Castellan, N. John Jr., Nonparametric Statistics for the Behavioral Sciences, 2nd Edition, McGraw-Hill, 1988.
2. Gibbons, J. D., Nonparametric Statistical Inference, 2nd Edition, Marcel Dekker, Inc., New York and Basel, 1985.