Somers' D: Data Setup

Somers’ dBA
Somers’ dBA is a measure of the strength and direction of association between an ordinal dependent
variable and an ordinal independent variable that accounts for ties on the dependent variable. It is
appropriate when the researcher wants to distinguish between a dependent variable B and an
independent variable A; Goodman and Kruskal’s gamma, by contrast, makes no distinction
between the two ordinal variables. It takes on values from -1 (all pairs “disagree”) to 1 (all pairs “agree”).
$$d_{BA} = \frac{\#\text{ of agreements} - \#\text{ of disagreements}}{\#\text{ of pairs not tied on } A} = \frac{2\,[\#(+) - \#(-)]}{N^2 - \sum_{j=1}^{k} C_j^2}$$
where N is the total number of observations and Cj is the marginal frequency of the jth value of the
variable A, i.e. the jth column total, for j = 1, 2, …, k.
Data Setup
                          Independent variable (A)
Dependent variable (B)    A1     A2     …     Ak     Total
B1                        n11    n12    …     n1k    R1
B2                        n21    n22    …     n2k    R2
⋮                         ⋮      ⋮            ⋮      ⋮
Br                        nr1    nr2    …     nrk    Rr
Total                     C1     C2     …     Ck     N
The number of rows need not equal the number of columns, i.e. r may differ from k.
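When the data arrive as one (A, B) pair per respondent rather than as cell counts, the table has to be tallied first. The following is a minimal sketch in Python, not part of the original handout: the function name `contingency_table` and the sample observations are made up for illustration, and it assumes A is coded 1..k and B is coded 1..r in their ordinal order.

```python
from collections import Counter


def contingency_table(pairs, r, k):
    """Tally (a, b) pairs into an r x k table: rows = B levels, columns = A levels.

    Assumes a is coded 1..k and b is coded 1..r in ordinal order.
    """
    counts = Counter(pairs)
    return [[counts[(a, b)] for a in range(1, k + 1)] for b in range(1, r + 1)]


# Hypothetical raw observations (a, b), for illustration only.
observations = [(1, 1), (1, 1), (2, 1), (2, 2), (3, 2), (3, 3), (3, 3)]
table = contingency_table(observations, r=3, k=3)

row_totals = [sum(row) for row in table]        # R_1, ..., R_r
col_totals = [sum(col) for col in zip(*table)]  # C_1, ..., C_k
n_total = sum(row_totals)                       # N

print(table, row_totals, col_totals, n_total)
```

The row and column totals correspond to the Ri and Cj margins of the layout above.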
Assumptions
➢ The researcher has one dependent variable and one independent variable, both of which are
measured on an ordinal scale.
➢ There needs to be a monotonic relationship between the dependent and independent variable.
A monotonic relationship exists when the variables increase in value together; or as one
variable value increases, the other variable value decreases.
Procedure
1. Construct the r × k contingency table.
2. Count the number of agreeing pairs, i.e. #(+). Start with the top-left cell n11: add the values of all
cells below and to the right of it, and multiply that sum by n11. Repeat this process for every cell nij.
The sum of these products is the total number of agreeing (concordant) pairs.
3. Count the number of disagreeing pairs, i.e. #(-). The process mirrors the counting of agreeing
pairs, except that we start from the top-right cell, i.e. n1k, and move to the left. For each cell nij,
add the values of all cells below and to the left of it, and multiply that sum by nij. Repeat this
process for every cell. The sum of these products is the total number of disagreeing (discordant)
pairs.
4. Compute $\sum_{j=1}^{k} C_j^2$.
5. Plug in these values to the formula for dBA.
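The five steps above can be sketched in Python as follows. This is an illustrative sketch rather than the handout's own code: the function names and the 2 × 2 toy table are assumptions made for the example, and the table is taken to be a list of rows with the dependent variable B on the rows and the independent variable A on the columns.

```python
def concordant_discordant(table):
    """Return (#(+), #(-)) for an r x k table given as a list of rows."""
    r, k = len(table), len(table[0])
    agree = disagree = 0
    for i in range(r):
        for j in range(k):
            # Cells strictly below and to the right of (i, j).
            lower_right = sum(table[p][q] for p in range(i + 1, r) for q in range(j + 1, k))
            # Cells strictly below and to the left of (i, j).
            lower_left = sum(table[p][q] for p in range(i + 1, r) for q in range(j))
            agree += table[i][j] * lower_right
            disagree += table[i][j] * lower_left
    return agree, disagree


def somers_d_ba(table):
    """Somers' d with rows = dependent B and columns = independent A."""
    agree, disagree = concordant_discordant(table)
    n = sum(sum(row) for row in table)
    col_totals = [sum(col) for col in zip(*table)]
    return 2 * (agree - disagree) / (n ** 2 - sum(c ** 2 for c in col_totals))


toy = [[3, 1],
       [1, 3]]  # made-up 2 x 2 table, for illustration only

print(concordant_discordant(toy))  # (9, 1)
print(somers_d_ba(toy))            # 0.5
```

For the toy table, #(+) = 3·3 = 9, #(-) = 1·1 = 1, and the denominator is 8² − (4² + 4²) = 32, so d = 2(9 − 1)/32 = 0.5.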
Remarks
1. It’s important to define which variable is independent and which is dependent, because computing
Somers’ d for (A,B) and for (B,A) generally yields different results, i.e. dBA ≠ dAB.
Let’s say you wanted to know whether customer satisfaction (on a scale of 1 to 5) was
dependent on how friendly your sales staff were (on a scale of 1 to 3). If you switch the
independent and dependent variables around, you’ll be measuring how friendliness of
your sales staff was affected by customer satisfaction. That may be interesting
information, but it isn’t the relationship you’re interested in.
2. The strength of association between variables can be assessed by examining the absolute
value of Somers’ d.
3. Somers’ dBA = 1 if and only if there are no disagreements in order and each row has at most
one nonzero cell. The appearance of such contingency table would have the nonzero cells
descending from upper left to lower right like a staircase. Similarly, dBA = -1 if the nonzero cells
ascend from lower left to upper right.
4. Somers’ dBA = 0 if the variables are independent. The converse is not true.
5. Under the null hypothesis, the standardized statistic Z = dBA / √Var(dBA) is approximately standard Normal, i.e. Z ~ N(0,1).
6. Two versions of Somers’ d exist: asymmetric and symmetric. The one being discussed, the
asymmetric version, is by far the most popular.
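Remarks 1 and 3 can be checked numerically. The sketch below is illustrative only: the two tables are made up so that each row has at most one nonzero cell, and `somers_d` is a hypothetical helper that applies the counting procedure with rows as the dependent variable.

```python
def somers_d(table):
    """Somers' d with rows as the dependent variable and columns as the independent one."""
    r, k = len(table), len(table[0])
    agree = disagree = 0
    for i in range(r):
        for j in range(k):
            below = table[i + 1:]
            agree += table[i][j] * sum(sum(row[j + 1:]) for row in below)
            disagree += table[i][j] * sum(sum(row[:j]) for row in below)
    n = sum(map(sum, table))
    col_sq = sum(sum(col) ** 2 for col in zip(*table))
    return 2 * (agree - disagree) / (n ** 2 - col_sq)


def transpose(table):
    """Swap the roles of the row and column variables."""
    return [list(col) for col in zip(*table)]


# Made-up tables: nonzero cells descend (staircase) or ascend (reverse).
staircase = [[5, 0, 0],
             [0, 4, 0],
             [0, 3, 0]]
reverse = [[0, 0, 5],
           [0, 4, 0],
           [3, 0, 0]]

d_ba = somers_d(staircase)             # rows dependent
d_ab = somers_d(transpose(staircase))  # columns dependent
d_rev = somers_d(reverse)

print(d_ba, d_ab, d_rev)
```

In the staircase table, dBA reaches 1 while dAB does not, because after transposing one row holds two nonzero cells; this is one way to see that the two directions need not agree.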
Example 29.2.1
Segal (1969) investigated the association between work satisfaction and perceived powerlessness
among a sample of Chilean physicians in a large hospital in Santiago. He believed that work
satisfactions would vary according to the perceived powerlessness of physicians. Powerlessness was
defined by Segal as “the feeling that one does not have control over the way his work is defined and
organized.” (Segal, 1969, p. 196.) A Guttman scale was used to render an ordinal measure of this
variable. Work satisfactions were measured similarly by appropriate items revealing one’s degree of
contentment with the type of work performed. A specific relationship was believed to exist between
these variables as follows. As one’s perceived powerlessness tended to increase, one’s satisfaction
with work would tend to decrease. Therefore, if Segal’s theorizing were sound, we would expect to find
a fairly strong, inverse association between these variables. Based on Segal’s ideas, some hypothetical
data have been arranged into a cross-tabulated form and presented in the table.
To assess Segal’s claim, we solve for Somers’ d.
Perceived Powerlessness and Work Satisfaction for 150 Chilean Physicians
                          Physician’s Perceived Powerlessness (A)
Work Satisfaction (B)     Low     Medium     High     Totals
High                      25      18         10       53
Medium                    12      13         15       40
Low                       8       17         32       57
Totals                    45      48         57       N = 150
Solution by hand:
Solving for dBA, first, we compute for #(+) and #(-).
#(+) =
#(-) =
Next, compute for $\sum_{j=1}^{k} C_j^2$.

$\sum_{j=1}^{3} C_j^2 =$
Plug these in to the formula

$$d_{BA} = \frac{2\,[\#(+) - \#(-)]}{N^2 - \sum_{j=1}^{3} C_j^2}$$
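The hand computation can be cross-checked with a short script. This is a sketch that applies the counting procedure to the table exactly as printed; it is not the handout's own worked solution. Note one assumption about interpretation: the rows run from High to Low satisfaction while the columns run from Low to High powerlessness, so the sign of the result has to be read against that ordering.

```python
# Rows: Work Satisfaction (B) = High, Medium, Low.
# Columns: Perceived Powerlessness (A) = Low, Medium, High.
table = [[25, 18, 10],
         [12, 13, 15],
         [8, 17, 32]]

r, k = len(table), len(table[0])
agree = disagree = 0
for i in range(r):
    for j in range(k):
        below = table[i + 1:]
        # Lower-right cells count toward #(+), lower-left cells toward #(-).
        agree += table[i][j] * sum(sum(row[j + 1:]) for row in below)
        disagree += table[i][j] * sum(sum(row[:j]) for row in below)

n = sum(map(sum, table))
col_totals = [sum(col) for col in zip(*table)]
sum_c_sq = sum(c ** 2 for c in col_totals)
d_ba = 2 * (agree - disagree) / (n ** 2 - sum_c_sq)

print("#(+) =", agree, " #(-) =", disagree)
print("sum of C_j^2 =", sum_c_sq)
print("d_BA =", round(d_ba, 4))
```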
Note that if we transpose the data, i.e. switch the columns and rows so that Work Satisfaction (B)
is treated as the independent variable and Physician’s Perceived Powerlessness (A) is treated as
the dependent variable, we can also compute for Somers’ dAB. In this case, the new data setup
would be as follows.
                                            Work Satisfaction (B)
Physician’s Perceived Powerlessness (A)     Low     Medium     High     Totals
High                                        32      15         10       57
Medium                                      17      13         18       48
Low                                         8       12         25       45
Totals                                      57      40         53       N = 150
Solving for dAB, we compute for #(+) and #(-).
#(+) =
#(-) =
Next, compute for $\sum_{j=1}^{k} C_j^2$.

$\sum_{j=1}^{3} C_j^2 =$
Plug these in to the formula

$$d_{AB} = \frac{2\,[\#(+) - \#(-)]}{N^2 - \sum_{j=1}^{3} C_j^2}$$
=
=
Hence, powerlessness influences work satisfaction levels slightly more than work satisfaction
influences powerlessness. However, based on these d values, it seems that neither variable is the
better predictor.
Note also that the numerators for dBA and dAB are equal: the total number of agreements
and the total number of disagreements do not change when the roles of the independent and
dependent variables are swapped.
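The equality of the numerators can be checked directly from the two tables as printed. The sketch below is illustrative; `counts_and_denominator` is a hypothetical helper, and the second table is the transposed setup shown above rather than a mechanical transpose.

```python
def counts_and_denominator(table):
    """Return (#(+), #(-), N^2 - sum of squared column totals) for a list-of-rows table."""
    r, k = len(table), len(table[0])
    agree = disagree = 0
    for i in range(r):
        for j in range(k):
            below = table[i + 1:]
            agree += table[i][j] * sum(sum(row[j + 1:]) for row in below)
            disagree += table[i][j] * sum(sum(row[:j]) for row in below)
    n = sum(map(sum, table))
    denom = n ** 2 - sum(sum(col) ** 2 for col in zip(*table))
    return agree, disagree, denom


# Rows = Work Satisfaction (B), columns = Powerlessness (A).
table_ba = [[25, 18, 10],
            [12, 13, 15],
            [8, 17, 32]]
# Rows = Powerlessness (A), columns = Work Satisfaction (B).
table_ab = [[32, 15, 10],
            [17, 13, 18],
            [8, 12, 25]]

agree_ba, disagree_ba, denom_ba = counts_and_denominator(table_ba)
agree_ab, disagree_ab, denom_ab = counts_and_denominator(table_ab)

print(agree_ba == agree_ab, disagree_ba == disagree_ab)  # numerators match
print(denom_ba, denom_ab)                                # denominators differ
```

Only the denominator changes between the two directions, because it depends on the marginal totals of whichever variable is treated as independent.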
Solution by R:
We will input the following codes in the script editor.
These codes would yield the output below.
Solution by SPSS:
Below would be our data input.
This will yield the following output.
Somers’ d is presented in the “Powerlessness Dependent” row of the “Value” column and is -.109. The
“Approximate Significance” column shows that the p-value is 0.108, which is not less than, say, 0.05.
Therefore, the association between the ordinal dependent variable “Powerlessness” and the ordinal
independent variable “Work Satisfaction” is not statistically significant at the 5% level of
significance.
Steps in Testing for the Significance of Somers’ dBA

1. State the null and alternative hypotheses, H0 and Ha respectively.
H0: ΔBA = 0 vs. Ha: ΔBA > 0
vs. Ha: ΔBA < 0
vs. Ha: ΔBA ≠ 0
where
$$\Delta_{BA} = \frac{P(A \text{ and } B \text{ agree in order}) - P(A \text{ and } B \text{ disagree in order})}{P(\text{a pair of observations is not tied on } A)}$$
is an asymmetric index of association between two variables A and B. Note that this unknown
parameter ΔBA is what is estimated by dBA.
2. Solve for Somers’ dBA using the procedure above.
3. Since the null distribution of dBA is very complicated to derive, we make use of the
approximation of its distribution. Hence, we have the statistic below.
$$Z = \frac{d_{BA}}{\sqrt{\mathrm{Var}(d_{BA})}}$$
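where
$$\mathrm{Var}(d_{BA}) = \frac{4 \sum_{i=1}^{r} \sum_{j=1}^{k} n_{ij}\,(N_{ij}^{+} + M_{ij}^{+} - N_{ij}^{-} - M_{ij}^{-})^2}{\left(N^2 - \sum_{j=1}^{k} C_j^2\right)^2}$$
and $N_{ij}^{+}$, $M_{ij}^{+}$, $N_{ij}^{-}$, $M_{ij}^{-}$ are sums of frequencies such that
$$M_{ij}^{+} = \sum_{p=1}^{i-1} \sum_{q=1}^{j-1} n_{pq}$$
$$M_{ij}^{-} = \sum_{p=1}^{i-1} \sum_{q=j+1}^{k} n_{pq}$$
$$N_{ij}^{+} = \sum_{p=i+1}^{r} \sum_{q=j+1}^{k} n_{pq}$$
$$N_{ij}^{-} = \sum_{p=i+1}^{r} \sum_{q=1}^{j-1} n_{pq}$$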
Note that this test is not an exact test, so this can be used only when N is sufficiently large.
4. Specify the rejection region. The corresponding rejection rules for the different Ha’s are as follows:
Ha: ΔBA > 0 Reject H0 if Z ≥ zα
Ha: ΔBA < 0 Reject H0 if Z ≤ -zα
Ha: ΔBA ≠ 0 Reject H0 if |Z| ≥ zα/2
Use Table A.1.
5. Decide and interpret.
6. Compute for the p-value.
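The testing steps can be sketched in Python as follows. This is a hedged illustration, not the handout's own code: `somers_d_test` is a hypothetical helper, the toy table is made up (and far too small for the normal approximation to be trusted in practice), and the p-value uses `statistics.NormalDist` from the Python standard library in place of Table A.1.

```python
from math import sqrt
from statistics import NormalDist


def somers_d_test(table):
    """Return (d_BA, Var(d_BA), Z) for a table with rows = B and columns = A."""
    r, k = len(table), len(table[0])
    n = sum(map(sum, table))
    denom = n ** 2 - sum(sum(col) ** 2 for col in zip(*table))

    def block(rows, cols):
        # Sum of the cell counts over the given row and column index ranges.
        return sum(table[p][q] for p in rows for q in cols)

    signed_sum = 0   # sum of n_ij * (N+ + M+ - N- - M-), equal to 2[#(+) - #(-)]
    squared_sum = 0  # sum of n_ij * (N+ + M+ - N- - M-)^2
    for i in range(r):
        for j in range(k):
            n_plus = block(range(i + 1, r), range(j + 1, k))  # below, right
            m_plus = block(range(i), range(j))                # above, left
            n_minus = block(range(i + 1, r), range(j))        # below, left
            m_minus = block(range(i), range(j + 1, k))        # above, right
            diff = n_plus + m_plus - n_minus - m_minus
            signed_sum += table[i][j] * diff
            squared_sum += table[i][j] * diff ** 2

    d = signed_sum / denom
    var = 4 * squared_sum / denom ** 2
    return d, var, d / sqrt(var)


toy = [[3, 1],
       [1, 3]]  # made-up table, for illustration only
d, var, z = somers_d_test(toy)
p_upper = 1 - NormalDist().cdf(z)  # p-value for Ha: Delta_BA > 0

print(d, var, round(z, 3), round(p_upper, 3))
```

For the alternatives ΔBA < 0 and ΔBA ≠ 0, the p-value would be `NormalDist().cdf(z)` and `2 * (1 - NormalDist().cdf(abs(z)))` respectively.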
Example 29.2.2
With the development of bar-code scanners for use in supermarkets and many other stores, there has
been a trend toward the omission of price markings on individual items. Retailers are keenly interested
in not having to mark individual prices. Two of the most important reasons are (1) the labor savings
resulting from not having to mark each item and (2) the ability to reprice items quickly in response to
cost changes, special sales, etc. On the other hand, shoppers have become accustomed to having
prices marked on individual items. Advantages of unit pricing which shoppers cite include the ability (1)
to compare easily prices for different brands of a particular product; (2) to review the total cost of items
in a market basket; and (3) to ensure correct charges in the checkout lane. If retailers want to move
toward omission of price markings, market specialists argue that public relations campaigns must be
mounted to educate the public about the advantages of such omissions. To have an effective
campaign, it is important to know current attitudes and what kind of shopper has the most
resistance to omission of item pricing. In a study of shoppers in a large midwestern city, attitudes
toward item price omission were obtained and related to a number of demographic variables such as
age, income, education, etc.
In a survey, the demographic variables can be thought of as independent variables and the response to
an attitude question is the dependent variable. One of the demographic variables was education, and
the researchers wanted to determine how education affected attitude. Since the variables education
and attitude are both ordinal variables and because we are primarily interested in the effect of
education on attitude, Somers’ dBA is an appropriate measure. The table below summarizes the
responses from N = 165 women shoppers.
                       Education (A)
Attitude (B)           Less than      High school    Some       College     Total
                       high school    graduate       college    graduate
Very bad to bad        22             39             19         8           88
No difference          6              8              6          14          34
Good to very good      5              16             12         10          43
Total                  33             63             37         32          165
To determine the association, Somers’ dBA will be calculated.
Solution:
First we need to determine the number of agreements and disagreements for the two variables.
#(+) =
#(-) =
Finally,
$$d_{BA} = \frac{2\,[\#(+) - \#(-)]}{N^2 - \sum_{j=1}^{4} C_j^2} \approx 0.189$$
On the basis of this analysis, we conclude that education has a small relation with attitude toward item
price omission. The table shows a trend that women with more education have more positive attitudes
toward item price omission, and women with less education have more negative attitudes. Whether this
trend will be significant will be discussed below.
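The value of dBA for the attitude survey can be reproduced with a short script. The sketch below is illustrative and simply applies the counting procedure to the table as printed; the variable names are assumptions made for the example. It reproduces the value 0.189 used in the test statistic for this example.

```python
# Rows: Attitude (B) = very bad to bad, no difference, good to very good.
# Columns: Education (A) = less than high school, high school graduate,
#          some college, college graduate.
table = [[22, 39, 19, 8],
         [6, 8, 6, 14],
         [5, 16, 12, 10]]

r, k = len(table), len(table[0])
agree = disagree = 0
for i in range(r):
    for j in range(k):
        below = table[i + 1:]
        # Lower-right cells count toward #(+), lower-left cells toward #(-).
        agree += table[i][j] * sum(sum(row[j + 1:]) for row in below)
        disagree += table[i][j] * sum(sum(row[:j]) for row in below)

n = sum(map(sum, table))
sum_c_sq = sum(sum(col) ** 2 for col in zip(*table))
d_ba = 2 * (agree - disagree) / (n ** 2 - sum_c_sq)

print("#(+) =", agree, " #(-) =", disagree)
print("sum of C_j^2 =", sum_c_sq)
print("d_BA =", round(d_ba, 3))  # 0.189
```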
Using the same data, is there sufficient evidence to conclude that the prediction of the attitude
of a random respondent improves given knowledge of his/her highest educational attainment?
Use α = 0.05.
To answer this question, we have to perform a test for significance. Since we are interested in knowing
whether the prediction for the dependent variable improves based on knowing a value of the
independent variable, we will test whether ΔBA > 0.
1. Stating the hypotheses, H0: ΔBA = 0 vs. Ha: ΔBA > 0.
2. We have already computed for dBA.
3. Compute for the test statistic.
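$$Z = \frac{d_{BA}}{\sqrt{\mathrm{Var}(d_{BA})}} = \frac{0.189}{\sqrt{\mathrm{Var}(d_{BA})}},$$
where
$$\mathrm{Var}(d_{BA}) = \frac{4 \sum_{i=1}^{r} \sum_{j=1}^{k} n_{ij}\,(N_{ij}^{+} + M_{ij}^{+} - N_{ij}^{-} - M_{ij}^{-})^2}{\left(N^2 - \sum_{j=1}^{k} C_j^2\right)^2}$$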
$$= \frac{4 \sum_{i=1}^{3} \sum_{j=1}^{4} n_{ij}\,(N_{ij}^{+} + M_{ij}^{+} - N_{ij}^{-} - M_{ij}^{-})^2}{\left(165^2 - \sum_{j=1}^{4} C_j^2\right)^2}$$
Aside,
n11 = 22:   N11+ =      N11- =      M11+ =      M11- =
n12 = 39:   N12+ =      N12- =      M12+ =      M12- =
n13 = 19:   N13+ =      N13- =      M13+ =      M13- =
n14 = 8:    N14+ =      N14- =      M14+ =      M14- =
n21 = 6:    N21+ =      N21- =      M21+ =      M21- =
n22 = 8:    N22+ =      N22- =      M22+ =      M22- =
n23 = 6:    N23+ =      N23- =      M23+ =      M23- =
n24 = 14:   N24+ =      N24- =      M24+ =      M24- =
n31 = 5:    N31+ =      N31- =      M31+ =      M31- =
n32 = 16:   N32+ =      N32- =      M32+ =      M32- =
n33 = 12:   N33+ =      N33- =      M33+ =      M33- =
n34 = 10:   N34+ =      N34- =      M34+ =      M34- =
$\sum_{i=1}^{3} \sum_{j=1}^{4} n_{ij}\,(N_{ij}^{+} + M_{ij}^{+} - N_{ij}^{-} - M_{ij}^{-})^2 =$
$\sum_{j=1}^{4} C_j^2 =$
Var(dBA) =
Hence, $Z = \dfrac{d_{BA}}{\sqrt{\mathrm{Var}(d_{BA})}} =$
4. Critical region: Reject H0 if Z ≥ zα = z0.05 = 1.645.
5. Since Z = ≥ z0.05 = 1.645, we reject H0. So at α=0.05, we have sufficient evidence to
conclude that the prediction for the attitude of a random respondent improves based on the
knowledge on his/her highest educational attainment.
6. P-value: 𝑃(𝑍 ≥ )
Table A.1
𝑃(𝑍 ≥ )∈( , )
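The quantities in the aside, the variance, and the test statistic can be cross-checked with a script. The following is a sketch under the same table layout (rows = attitude, columns = education); the variable names are assumptions made for the example, and `statistics.NormalDist` from the Python standard library stands in for Table A.1 when computing the upper-tail p-value.

```python
from math import sqrt
from statistics import NormalDist

table = [[22, 39, 19, 8],
         [6, 8, 6, 14],
         [5, 16, 12, 10]]
r, k = len(table), len(table[0])
n = sum(map(sum, table))
denom = n ** 2 - sum(sum(col) ** 2 for col in zip(*table))


def block(rows, cols):
    # Sum of the cell counts over the given row and column index ranges.
    return sum(table[p][q] for p in rows for q in cols)


signed_sum = squared_sum = 0
for i in range(r):
    for j in range(k):
        n_plus = block(range(i + 1, r), range(j + 1, k))  # N_ij+
        m_plus = block(range(i), range(j))                # M_ij+
        n_minus = block(range(i + 1, r), range(j))        # N_ij-
        m_minus = block(range(i), range(j + 1, k))        # M_ij-
        diff = n_plus + m_plus - n_minus - m_minus
        print(f"n{i + 1}{j + 1} = {table[i][j]}: N+ = {n_plus}, N- = {n_minus}, "
              f"M+ = {m_plus}, M- = {m_minus}")
        signed_sum += table[i][j] * diff
        squared_sum += table[i][j] * diff ** 2

d_ba = signed_sum / denom
var_d = 4 * squared_sum / denom ** 2
z = d_ba / sqrt(var_d)
p_value = 1 - NormalDist().cdf(z)  # upper tail, for Ha: Delta_BA > 0

print("Var(d_BA) =", var_d, " Z =", z, " p-value =", p_value)
```

The resulting Z is then compared with z0.05 = 1.645 as in step 4.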
Note: The formula for the variance of dBA, i.e.
$$\mathrm{Var}(d_{BA}) = \frac{4 \sum_{i=1}^{r} \sum_{j=1}^{k} n_{ij}\,(N_{ij}^{+} + M_{ij}^{+} - N_{ij}^{-} - M_{ij}^{-})^2}{\left(N^2 - \sum_{j=1}^{k} C_j^2\right)^2},$$
simplifies to
$$\mathrm{Var}(d_{BA}) = \frac{4\,(r^2 - 1)(k + 1)}{9\,N\,r^2\,(k - 1)}$$
if the researcher can assume that the sampling has been from
a population with a uniform distribution over all cells in the contingency table.
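The simplified variance depends only on N, r, and k, so it is easy to evaluate. The sketch below is illustrative: `simplified_var` is a hypothetical helper name, and plugging in the dimensions of the attitude survey (N = 165, r = 3, k = 4) only shows the arithmetic; whether the uniform-distribution assumption is reasonable for those data is not established here.

```python
def simplified_var(n, r, k):
    """Var(d_BA) under the uniform-over-cells assumption: 4(r^2 - 1)(k + 1) / (9 N r^2 (k - 1))."""
    return 4 * (r ** 2 - 1) * (k + 1) / (9 * n * r ** 2 * (k - 1))


# Dimensions of the attitude survey table, used for illustration only.
var_uniform = simplified_var(n=165, r=3, k=4)
print(var_uniform)
```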
Exercise:
Using the data from the attitude survey in Example 29.2.2, perform a two-tailed test for
significance. Verify that the test would lead to the rejection of the null hypothesis.
REFERENCE:
Siegel, Sidney, and Castellan, N. John. Nonparametric Statistics for the Behavioral Sciences (Second
Edition). New York: McGraw-Hill, 1988.