Vassarstats - Subchapter 15a - The Friedman Test

©Richard Lowry, 1999-2017
All rights reserved.
Subchapter 15a.
The Friedman Test for 3 or More Correlated Samples
Assumptions of one-way ANOVA for correlated samples (see text of Chapter 15 for details)
~ equal-interval scale of measurement
~ independence of measures within each group
~ normal distribution of source population(s)
~ equal variances among groups
~ homogeneity of covariance

We have noted several times that the analysis of variance is quite robust with respect to the violation
of its assumptions, providing that the k groups are all of the same size. In the correlated-samples
ANOVA this provision is always satisfied, since the number of measures in each of the groups is
necessarily equal to the number of subjects in the repeated measures design, or to the number of
matched sets in the randomized blocks design.
Still, there are certain kinds of correlated-samples situations where the violation of one or more
assumptions might be so thoroughgoing as to cast doubt on any result produced by an analysis of
variance. In cases of this sort, a useful non-parametric alternative can be found in a rank-based procedure
known as the Friedman Test.
There are two kinds of correlated-samples situations where the advisability of the non-parametric alternative
would be fairly obvious. The first would be the case where the k measures for each subject start out as
mere rank-orderings.
E.g.: To assess the likely results of an upcoming election, the 30 members of a presumably
representative "focus group" of eligible voters are each asked to rank the 3 candidates, A, B, and C, in the
order of their preference (1=most preferred, 3=least preferred).
And the second would be the case where these measures start out as mere ratings.
E.g.: The members of the "focus group" are instead asked to rate candidates on a 10-point scale
(1=lowest rating, 10=highest).
In both of these situations the assumption of an equal-interval scale of measurement is clearly not met.
There's a good chance that the assumption of a normal distribution of the source population(s) would also
not be met. Other cases where the equal-interval assumption will be thoroughly violated include those in
which the scale of measurement is intrinsically non-linear: for example, the decibel scale of sound intensity,
the Richter scale of earthquake intensity, or any logarithmic scale.
I will illustrate the Friedman test with a rating-scale example that is close to my amateur violinist's
heart. The venerable auction house of Snootly & Snobs will soon be putting three fine 17th- and
18th-century violins, A, B, and C, up for bidding. A certain musical arts foundation, wishing to
determine which of these instruments to add to its collection, arranges to have them played by
each of 10 concert violinists. The players are blindfolded, so that they cannot tell which violin is
which; and each plays the violins in a randomly determined sequence (BCA, ACB, etc.).

They are not informed that the instruments are classic masterworks; all they know is that they are
playing three different violins. After each violin is played, the player rates the instrument on a
10-point scale of overall excellence (1=lowest, 10=highest). The players are told that they can also
give fractional ratings, such as 6.2 or 4.5, if they wish. The results are shown in the following
table. For the sake of consistency, the n=10 players are listed as "subjects."

                  Violin
subjects      A      B      C
    1        9.0    7.0    6.0
    2        9.5    6.5    8.0
    3        5.0    7.0    4.0
    4        7.5    7.5    6.0
    5        9.5    5.0    7.0
    6        7.5    8.0    6.5
    7        8.0    6.0    6.0
    8        7.0    6.5    4.0
    9        8.5    7.0    6.5
   10        6.0    7.0    3.0
¶Logic and Procedure
The Friedman test begins by rank-ordering the measures for each subject. For the present example
we will assign the rank of "3" to the largest of a subject's three measures, "2" to the intermediate of
the three, and "1" to the smallest. Thus for subject 1, the largest measure is in column A, the next
largest in column B, and the smallest in column C; so the sequence of ranks across the row for
subject 1 is 3,2,1. For subject 2 it is 3,1,2. And so forth. (The guidelines for assigning tied ranks are
described in Subchapter 11a in connection with the Mann-Whitney test.)

             Original Measures       Ranked Measures
subjects      A      B      C         A      B      C
    1        9.0    7.0    6.0        3      2      1
    2        9.5    6.5    8.0        3      1      2
    3        5.0    7.0    4.0        2      3      1
    4        7.5    7.5    6.0        2.5    2.5    1
    5        9.5    5.0    7.0        3      1      2
    6        7.5    8.0    6.5        2      3      1
    7        8.0    6.0    6.0        3      1.5    1.5
    8        7.0    6.5    4.0        3      2      1
    9        8.5    7.0    6.5        3      2      1
   10        6.0    7.0    3.0        2      3      1

The null hypothesis in this scenario is that the three violins do not differ with respect to
whatever it is that blindfolded expert players would judge to be the overall excellence of an
instrument. This would entail that each of the six possible sequences of A,B,C ranks
    1,2,3
    1,3,2
    2,1,3
    2,3,1
    3,1,2
    3,2,1
is equally likely, hence that the three columns will tend to include a random jumble of 1's, 2's,
and 3's, in approximately equal proportions. In this case, the sums and the means of the columns
would also tend to come out approximately the same.

              Ranked Measures
subjects      A       B       C
    1         3       2       1
    2         3       1       2
    3         2       3       1
    4         2.5     2.5     1
    5         3       1       2
    6         2       3       1
    7         3       1.5     1.5
    8         3       2       1
    9         3       2       1
   10         2       3       1
 sums        26.5    21.0    12.5
means        2.65    2.10    1.25
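The ranking step described above can be sketched in a few lines of Python. This is only an illustration, not the procedure used by the VassarStats page: the helper name rank_row is hypothetical, and tied measures are assumed to share the mean of the rank positions they occupy, as in the table for subjects 4 and 7.

```python
# Ratings from the violin example: one row per subject, columns A, B, C.
ratings = [
    [9.0, 7.0, 6.0],
    [9.5, 6.5, 8.0],
    [5.0, 7.0, 4.0],
    [7.5, 7.5, 6.0],
    [9.5, 5.0, 7.0],
    [7.5, 8.0, 6.5],
    [8.0, 6.0, 6.0],
    [7.0, 6.5, 4.0],
    [8.5, 7.0, 6.5],
    [6.0, 7.0, 3.0],
]


def rank_row(values):
    """Rank one subject's measures (1 = smallest); ties share the mean rank.

    Hypothetical helper written for this illustration only.
    """
    ordered = sorted(values)
    ranks = []
    for v in values:
        first = ordered.index(v) + 1            # lowest rank position held by v
        count = ordered.count(v)                # number of measures tied at v
        ranks.append(first + (count - 1) / 2)   # mean of the tied positions
    return ranks


ranked = [rank_row(row) for row in ratings]

# zip(*ranked) transposes the rows into the three columns A, B, C.
column_sums = [sum(col) for col in zip(*ranked)]
column_means = [s / len(ranked) for s in column_sums]

print(ranked[3])        # subject 4, whose ratings for A and B are tied
print(column_sums)      # rank sums for columns A, B, C
print(column_means)     # rank means for columns A, B, C
```

The column sums and means printed here should match the last two rows of the ranked table.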
In most respects you will find the logic of the Friedman test quite
similar to that of the Kruskal-Wallis test examined in Subchapter 14a.
For any particular value of k (the number of measures per subject),
the mean of the ranks for any particular one of the n subjects is
(k+1)/2.
Thus for k=3, as in the present example, it is 4/2=2; for k=4, it would
be 5/2=2.5; and so on. On the null hypothesis, this would also be the
expected value of the mean for each of the k columns. Similarly, the
expected value for each of the column sums would be this amount
multiplied by the number of subjects: n(k+1)/2. For the present example, with n=10, it would be
(10)(4)/2=20.
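One way to see where these expected values come from is to enumerate the equally likely rank sequences directly. The sketch below assumes no tied ranks and uses illustrative variable names; it averages the rank that lands in each column over all orderings of 1..k.

```python
from itertools import permutations

k = 3    # measures per subject
n = 10   # number of subjects

# Under the null hypothesis every ordering of the ranks 1..k is equally likely.
sequences = list(permutations(range(1, k + 1)))

# Average rank falling in each column across the equally likely sequences.
expected_mean = [
    sum(seq[j] for seq in sequences) / len(sequences) for j in range(k)
]
# Expected column sum is the expected mean multiplied by the number of subjects.
expected_sum = [n * m for m in expected_mean]

print(len(sequences))   # number of possible sequences
print(expected_mean)    # each entry equals (k + 1) / 2
print(expected_sum)     # each entry equals n * (k + 1) / 2
```

Changing k to 4 gives 24 sequences and an expected column mean of 2.5, in line with the text.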
The following items of symbolic notation are the same ones we used in connection with the Kruskal-Wallis
test:
T_A = the sum of the n ranks in column A
M_A = the mean of the n ranks in column A
T_B = the sum of the n ranks in column B
M_B = the mean of the n ranks in column B
T_C = the sum of the n ranks in column C
M_C = the mean of the n ranks in column C

T_all = the sum of the nk ranks in all columns combined
[In all cases equal to nk(k+1)/2]
M_all = the mean of the nk ranks in all columns combined
[In all cases equal to (k+1)/2]
∙The Measure of Aggregate Group Differences
Also the same as in the Kruskal-Wallis test is the measure of the aggregate degree to which the k group
means differ. It is the between-groups sum of squared deviates defined in Subchapter 14a as SSbg(R), the
"(R)" serving as a reminder that this particular version of SSbg is based on ranks. The following table
summarizes the values needed for the calculation of SSbg(R).
             A       B       C      All
counts      10      10      10      30
sums       26.5    21.0    12.5    60.0
means      2.65    2.10    1.25    2.0

n = 10 [subjects]
k = 3 [measures per subject]
nk = 30
As in all other cases heretofore examined, the squared deviate for any particular group mean is equal to the
squared difference between that group mean and the mean of the overall array of data, multiplied by the
number of observations on which the group mean is based. Thus, for each of our current three groups
A: $10(2.65 - 2.0)^2 = 4.225$
B: $10(2.10 - 2.0)^2 = 0.100$
C: $10(1.25 - 2.0)^2 = 5.625$
$SS_{bg(R)} = 9.950$
Once again we can write the conceptual formula for SSbg(R) as

$$SS_{bg(R)} = \sum \left[ n_g (M_g - M_{all})^2 \right]$$

As usual, the subscript "g" means "any particular group."

Except now, since each group is necessarily of the same size, it can be reduced to the simpler form

$$SS_{bg(R)} = n \sum (M_g - M_{all})^2$$

where n = number of subjects.
For the same reason, the computational formula (less susceptible to rounding errors) can take the simpler
form

$$SS_{bg(R)} = \frac{\sum (T_g)^2}{n} - \frac{(T_{all})^2}{nk}$$

where n = number of subjects and k = measures per subject.
For the present example, with n=10, k=3, and values of T_g and T_all as indicated above, this would come out
as

$$SS_{bg(R)} = \frac{(26.5)^2 + (21.0)^2 + (12.5)^2}{10} - \frac{(60)^2}{30} = 9.95$$
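As a check on the arithmetic, the conceptual and computational forms can be evaluated side by side. The following is a minimal sketch using the rank sums from the table above; the variable names are illustrative only.

```python
n, k = 10, 3                        # subjects, measures per subject
rank_sums = [26.5, 21.0, 12.5]      # T_A, T_B, T_C from the ranked table

total = sum(rank_sums)              # T_all
grand_mean = total / (n * k)        # M_all

# Conceptual form: n times the sum of squared deviations of the group means.
ss_conceptual = n * sum((t / n - grand_mean) ** 2 for t in rank_sums)

# Computational form: (sum of squared rank sums) / n minus T_all^2 / (n k).
ss_computational = sum(t ** 2 for t in rank_sums) / n - total ** 2 / (n * k)

print(round(ss_conceptual, 3), round(ss_computational, 3))
```

Both forms agree up to floating-point rounding, which is why the printed values are rounded.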
∙The Sampling Distribution of SSbg(R)
When we examined the Kruskal-Wallis test in Subchapter 14a, we saw that SSbg(R) can be converted into
the measure designated as H, which can then be referred to the sampling distribution of chi-square for
df=k-1. The same is true of the Friedman test; the only difference is in the details of the conversion. For
the Friedman test, the resulting measure is spoken of simply as a value of chi-square and takes the form

$$\chi^2 = \frac{SS_{bg(R)}}{k(k+1)/12}$$

which for the present example comes out as

$$\chi^2 = \frac{9.95}{3(3+1)/12} = \frac{9.95}{1} = 9.95$$
When k is equal to 3, the application of this "conversion" formula is merely pro forma; for in this case the
denominator of the ratio will always come down to 3(4)/12=1, so the resulting value of chi-square will always
be equal to SSbg(R). This, however, will not be so when k is something other than 3. With k=4 the
denominator will be 4(5)/12=1.67; with k=5 it will be 5(6)/12=2.5; and so on.
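The conversion is short enough to express as a one-line function. The sketch below is illustrative only; the function name friedman_chi_square is hypothetical and not taken from any library.

```python
def friedman_chi_square(ss_bg_r, k):
    """Convert SSbg(R) to chi-square by dividing by k(k+1)/12.

    Hypothetical helper for illustration; k is the number of measures per subject.
    """
    return ss_bg_r / (k * (k + 1) / 12)


# Denominator of the conversion for several values of k.
for k in (3, 4, 5):
    print(k, round(k * (k + 1) / 12, 2))

# Violin example: SSbg(R) = 9.95 with k = 3, so the denominator is 1.
chi_square = friedman_chi_square(9.95, 3)
print(chi_square)
```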
The following graph is borrowed once again from Chapter 8. As you can see, the observed value of
chi-square=9.95, when referred to the appropriate sampling distribution of chi-square, is significant
beyond the .01 level.
[Graph: Theoretical Sampling Distribution of Chi-Square for df=2]
Our musical arts foundation can therefore conclude with considerable confidence
that the observed differences among the mean rankings for the three violins reflect
something more than mere random variability, something more than mere chance
coincidence among the judgments of the expert players.
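For readers who want a number to go with the graph, the special case df=2 has a simple closed form: the upper-tail probability of chi-square is exp(-x/2). The sketch below relies on that identity, which holds only for df=2; other values of k would need a general chi-square routine.

```python
import math

chi_square = 9.95   # observed value from the violin example, df = k - 1 = 2

# Upper-tail probability of chi-square; this closed form is valid only for df = 2.
p_value = math.exp(-chi_square / 2)

# Value of chi-square that cuts off the upper .01 tail when df = 2.
critical_01 = -2 * math.log(0.01)

print(round(p_value, 4))
print(round(critical_01, 2))
print(chi_square > critical_01)
```

The observed value exceeds the .01 cutoff, which agrees with the statement that the result is significant beyond the .01 level.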
∙An Alternative Computational Formula
Textbook accounts of the Friedman test usually give a different computational formula for chi-square. Its
advantage is that it can be (slightly) more convenient to use. Its disadvantage is that it does not give you the
faintest idea of just what the measure is measuring. But here it is anyway, just in case you ever need to
recognize it.

$$\chi^2 = \frac{12}{nk(k+1)} \sum (T_g)^2 - 3n(k+1)$$
As you can see in connection with our present example, the result comes out quite the same either way.

$$\chi^2 = \frac{12}{(10)(3)(4)} \left[ (26.5)^2 + (21.0)^2 + (12.5)^2 \right] - (3)(10)(4) = (0.1 \times 1299.5) - 120 = 9.95$$
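The agreement between the two routes can also be checked numerically. This is a minimal sketch with illustrative variable names, using the rank sums from the violin example.

```python
n, k = 10, 3
rank_sums = [26.5, 21.0, 12.5]               # T_A, T_B, T_C
sum_sq = sum(t ** 2 for t in rank_sums)      # sum of squared rank sums

# Textbook shortcut: 12 / (n k (k+1)) * sum(T_g^2) - 3 n (k+1).
chi_alt = 12 / (n * k * (k + 1)) * sum_sq - 3 * n * (k + 1)

# Route through SSbg(R), then divide by k(k+1)/12.
ss_bg_r = sum_sq / n - sum(rank_sums) ** 2 / (n * k)
chi_ss = ss_bg_r / (k * (k + 1) / 12)

print(round(chi_alt, 3), round(chi_ss, 3))
```

The two must agree because T_all is always nk(k+1)/2, so the term (T_all)^2/(nk) divided by k(k+1)/12 reduces to 3n(k+1).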
The VassarStats web site has a page that will perform all steps of the Friedman test, including the
rank-ordering of the raw measures.
End of Subchapter 15a.