Lecture 24: Ordinal Logistic Regression

(Text Section 8.4)
We have been considering log-linear models for cases where the response variable is multino-
mial. The categories that form the response types are not necessarily ordered (e.g. beetle
data, where “alive” and “dead” don’t have a natural order). However, in some cases, the
response types have a natural order, e.g.
1. Miner/lung disease data, where a natural ordering is normal, mild, severe
2. Cracker/bloating data, where a natural ordering is none, low, medium, high.
There are two types of ordered data:

1. Grouped continuous data
• The responses are originally measured on a continuous scale, but are grouped into
ordered categories and are reported as such.
• E.g., age is a continuous variable, but may be reported as categorical (“18-24”,
“25-34”, etc.)
2. Assessed ordered data
• The responses are collected and reported as ordered categories.
• E.g., Likert scale observations (survey respondents are asked to rate the level at
which they agree or disagree with a given statement on a scale from 1 to 5, say)
Models for ordered categorical data are often more parsimonious than models for general
multinomial data, and are desirable for this reason. However, before choosing such a model,
it is important to consider the following factors:
1. The ordering of the categories may be subjective, and hence results may not be uni-
versally interpretable.
2. Some responses can be ordered (e.g. colour preference, using wavelength), but probably
shouldn’t be.
3. Some responses should be ordered in some circumstances but not others. E.g., it
might be sensible to order political party preferences by ideology (from left-wing to
right-wing: NDP, Liberal, Conservative), but not by region of origin (from West to
East).
If any of the above factors applies, it is likely better to fit the general multinomial model
than to impose an ordering on the response categories.
Assuming we choose to treat the responses as ordinal, regardless of the type of data we have
(grouped continuous or assessed ordered), it may be helpful to think of a latent (unobserved)
continuous variable from which the response category arises. For example, in the miner
example, the underlying continuous response variable might be “disease severity”. This
response would be difficult to measure, so instead we consider three categories of severity
(normal, mild, severe).
Formally, define $Z_i$ as a continuous random variable, $i = 1, \ldots, n$, and $\pi_{ij}$ as the
probability of response $j$, $j = 1, \ldots, J$, for covariate category $i$. Then the cutpoints
$C_{i1}, \ldots, C_{i,J-1}$ of the distribution of $Z_i$ are defined, for each $i$, by
\[
\begin{aligned}
P(Z_i \le C_{i1}) &= \pi_{i1}, \\
P(C_{i,j-1} < Z_i \le C_{ij}) &= \pi_{ij}, \quad j = 2, \ldots, J-1, \\
P(Z_i > C_{i,J-1}) &= \pi_{iJ}.
\end{aligned}
\]
Therefore, we can think of the response categories as corresponding to a collection of
thresholds of $Z_i$.
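The latent-variable picture can be made concrete with a small numerical sketch. It assumes, purely for illustration, that the latent variable follows a standard logistic distribution and that there are J = 3 categories; the cutpoint values and the function names are hypothetical and do not come from the miner data.

```python
import math

def logistic_cdf(z):
    """CDF of the standard logistic distribution (an assumed choice for Z)."""
    return 1.0 / (1.0 + math.exp(-z))

def category_probs(cutpoints, cdf=logistic_cdf):
    """Turn ordered cutpoints C_1 < ... < C_{J-1} into pi_1, ..., pi_J."""
    # Cumulative probabilities P(Z <= C_j), closed off with P(Z < infinity) = 1.
    cum = [cdf(c) for c in cutpoints] + [1.0]
    probs = []
    prev = 0.0
    for p in cum:
        probs.append(p - prev)  # probability mass between consecutive cutpoints
        prev = p
    return probs

# Hypothetical cutpoints for three ordered categories (e.g. normal, mild, severe).
cutpoints = [-0.5, 1.0]
probs = category_probs(cutpoints)
print(probs)
```

Each category probability is the mass of the latent distribution between two consecutive cutpoints, so the probabilities are positive and sum to one whenever the cutpoints are strictly increasing.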
To model ordinal data, we typically consider the cumulative probabilities,
$\pi^*_{ik} = \sum_{j=1}^{k} \pi_{ij}$, $k = 1, \ldots, J-1$. Note: Because $0 < \pi_{ij} < 1$, our model
must have the property that $\pi^*_{ij} > \pi^*_{i,j-1}$!
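The cumulative logit model is defined by
\[
\log\left(\frac{\pi_{i1} + \cdots + \pi_{ij}}{\pi_{i,j+1} + \cdots + \pi_{iJ}}\right)
  = \beta_{0j} - \beta_{1j} x_{i1} - \cdots - \beta_{p-1,j} x_{i,p-1},
\]
that is,
\[
\log\left(\frac{\pi^*_{ij}}{1 - \pi^*_{ij}}\right)
  = \beta_{0j} - \beta_{1j} x_{i1} - \cdots - \beta_{p-1,j} x_{i,p-1}.
\]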
NOTES:
1. We specify the coefficients $\beta_{1j}, \ldots, \beta_{p-1,j}$ with negative signs in order to be
consistent with the model specification used by S-PLUS.
2. This is not a GLM since we have a multivariate response (as in the usual multinomial
model). In addition, we are modelling $\pi^*_{ij}$, not $\pi_{ij}$!
The most popular form of this model (which we will focus on exclusively in this course) is
the proportional odds model. In this model, the linear predictor $x'\beta_j$ is restricted so that
the intercept $\beta_{0j}$ may depend on $j$, but the effects of the other predictor variables are
constant across response categories:
\[
\log\left(\frac{\pi^*_{ij}}{1 - \pi^*_{ij}}\right)
  = \beta_{0j} - \beta_1 x_{i1} - \cdots - \beta_{p-1} x_{i,p-1} \equiv \eta_{ij}, \tag{1}
\]
where $\beta_{01} < \beta_{02} < \cdots < \beta_{0,J-1}$ to ensure that $\pi^*_{ij} \ge \pi^*_{i,j-1}$. Note that this
model is specified only for $j < J$, since $\pi^*_{iJ} \equiv 1$.
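The name "proportional odds" can be read directly off (1). The short derivation below compares two covariate categories; the second index $i'$ is notation introduced here only for illustration.

```latex
% Difference of the linear predictors in (1) for covariate categories i and i':
% the intercept \beta_{0j} cancels.
\eta_{ij} - \eta_{i'j}
  = -\beta_1 (x_{i1} - x_{i'1}) - \cdots - \beta_{p-1} (x_{i,p-1} - x_{i',p-1}).

% Exponentiating gives the ratio of cumulative odds:
\frac{\pi^*_{ij} / (1 - \pi^*_{ij})}{\pi^*_{i'j} / (1 - \pi^*_{i'j})}
  = \exp\left\{ -\sum_{m=1}^{p-1} \beta_m (x_{im} - x_{i'm}) \right\}.
```

The right-hand side does not involve $j$, so the odds of falling in category $j$ or below for the two covariate categories are in the same ratio at every cutpoint.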
[Figure: cumulative probability curves. Vertical axis: Cumulative Probability, 0.0 to 1.0; horizontal axis: −4 to 4.]
The interpretation of this model is that the functions relating the cumulative probabilities
to the linear predictor have the same shape, but are simply shifted to the right or left. For
example, in the plot, the curves, from right to left, are the functions
\[
\pi^*_{ij} = \frac{1}{1 + e^{-\eta_{ij}}},
\]
where $\eta_{ij} = \beta_{0j} + 1.2 x_i$, for response categories $j = 1, \ldots, 4$. Here we've chosen $\beta_{0j} = j$.
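The curves in the plot can be tabulated with a few lines of Python. This is a minimal sketch using the values stated above (intercepts 1 to 4 and slope 1.2); the function names and the grid of x values are chosen here for illustration.

```python
import math

def cumulative_prob(eta):
    """Inverse logit: maps the linear predictor to a cumulative probability."""
    return 1.0 / (1.0 + math.exp(-eta))

def curves(x, intercepts=(1, 2, 3, 4), slope=1.2):
    """Cumulative probabilities for j = 1..4 at a single covariate value x."""
    return [cumulative_prob(b0 + slope * x) for b0 in intercepts]

# Evaluate the four curves on a coarse grid matching the plot's horizontal axis.
for x in (-4, -2, 0, 2, 4):
    print(x, [round(p, 3) for p in curves(x)])
```

At every x the four values increase with j, and the curve for category j is the curve for category 1 shifted left by (j − 1)/1.2 units, which is the "same shape, shifted" behaviour described above.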
The proportional odds model is clearly restrictive, but it is parsimonious, easy to fit, and
relatively easy to interpret. For these reasons, it is commonly used, at least as a preliminary
model.
To fit the proportional odds model in S-PLUS, we use the polr function. To access this
function, we first need to load the MASS library (see the File/Load Library menu option).
The polr function works by computing πij , j = 1, . . . , J, based on (1) and then finding the
MLEs of the corresponding multinomial model.
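The two steps just described (category probabilities from (1), then a multinomial likelihood) can be sketched as follows. This is an illustration of the idea only, not the code that polr actually runs; the data, parameter values, and function names are hypothetical, and no maximization is performed.

```python
import math

def expit(eta):
    """Inverse logit."""
    return 1.0 / (1.0 + math.exp(-eta))

def category_probs(intercepts, beta, x):
    """pi_1..pi_J under model (1): logit(pi*_j) = beta_0j - sum_m beta_m * x_m."""
    shift = sum(b * v for b, v in zip(beta, x))
    cum = [expit(b0 - shift) for b0 in intercepts] + [1.0]  # pi*_J = 1
    # Differences of consecutive cumulative probabilities give the pi_j.
    return [cum[0]] + [cum[j] - cum[j - 1] for j in range(1, len(cum))]

def log_likelihood(intercepts, beta, groups):
    """Multinomial log-likelihood, up to an additive constant.

    groups is a list of (x, counts) pairs; counts[j] is the number of
    responses in category j + 1 for that covariate pattern.
    """
    total = 0.0
    for x, counts in groups:
        probs = category_probs(intercepts, beta, x)
        total += sum(n * math.log(p) for n, p in zip(counts, probs))
    return total

# Hypothetical data: one binary covariate, J = 3 ordered response categories.
groups = [((0,), [20, 15, 5]), ((1,), [10, 15, 15])]
ll = log_likelihood([-0.2, 1.3], [0.9], groups)
print(-2 * ll)
```

A fitting routine would search over the intercepts and slopes to maximize this log-likelihood while keeping the intercepts in increasing order.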
Example: Car Preferences
NOTE: There are errors in Table 8.4 and some confusing aspects about the description on
p. 139; see errata on website.
In a study about car preferences, a number of people in six different groups (two sexes
and three age groups) were questioned about their views on air conditioning and power
steering. They were asked to rank these features as “Not important”, “Important”, or “Very
important”. The question of interest is whether the importance of these features differs
among age groups and/or between sexes. Therefore, we will treat the importance ranking as
the response, and the row totals for each covariate group (defined by each unique combination
of sex and age group) as fixed.
An initial model that we might consider fitting to these data is the proportional odds model
\[
\log\left(\frac{\pi^*_{jk\ell}}{1 - \pi^*_{jk\ell}}\right) = \mu_j - \beta_k - \gamma_\ell,
\]
$j = 1, 2$, where $\pi^*_{jk\ell}$ is the probability that an individual will choose response $j$ or below
given that they are in sex category $k$ ($k = 1$ for women and $k = 2$ for men) and age category
$\ell$ ($\ell = 1, 2, 3$). Here $\mu_1 < \mu_2$, $\beta_k$ is the effect of the $k$th sex category, and $\gamma_\ell$ is the
effect of the $\ell$th age category. We use the treatment contrasts, so $\beta_1 = \gamma_1 = 0$. NOTE: This
model is the same as in (1), but specified using ANOVA rather than regression notation.
There is no anova function associated with the polr method in S-PLUS. However, we can
answer the question of interest by looking at the summary.
NOTE: The Intercepts given by S-PLUS are the values $\hat\mu_1$ and $\hat\mu_2$. Specifically,
\[
\hat\mu_1 = \texttt{U|I} = 0.0435, \qquad \hat\mu_2 = \texttt{I|V} = 1.6550.
\]
To test the effect of Age on the importance of a/c and power steering, we can fit the
proportional odds model without the Age factor. We can then compare the change in deviance to a
$\chi^2_2$ distribution (since there are 2 parameters associated with Age). Age is highly significant.
Similarly, we can fit the model without Sex, and compare the change in deviance to a $\chi^2_1$
distribution (since there is 1 parameter associated with Sex). Sex is also highly significant.
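The change-in-deviance comparison can be sketched numerically. The deviance values below are hypothetical placeholders, not the values from the car preference fit, and the helper names are chosen here for illustration. The tail probabilities use closed forms that hold only for 1 and 2 degrees of freedom, which are the two cases needed.

```python
import math

def chi2_sf(x, df):
    """Upper-tail probability P(X > x) for a chi-squared variable, df = 1 or 2 only."""
    if df == 1:
        # X is the square of a standard normal, so P(X > x) = P(|Z| > sqrt(x)).
        return math.erfc(math.sqrt(x / 2.0))
    if df == 2:
        # Chi-squared with 2 df is exponential with mean 2.
        return math.exp(-x / 2.0)
    raise ValueError("closed form implemented only for df = 1 or 2")

def lr_test(dev_reduced, dev_full, df):
    """Return the change in deviance and its chi-squared p-value."""
    change = dev_reduced - dev_full
    return change, chi2_sf(change, df)

# Hypothetical deviances (-2 * maximized log-likelihood); NOT the car data values.
dev_full = 580.0
dev_no_age = 610.0  # model without Age: 2 fewer parameters
dev_no_sex = 595.0  # model without Sex: 1 fewer parameter

print("Age:", lr_test(dev_no_age, dev_full, df=2))
print("Sex:", lr_test(dev_no_sex, dev_full, df=1))
```

With the real fits, the two reduced-model deviances would be read from the corresponding polr summaries and compared with the full-model deviance in the same way.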
IMPORTANT: The residual deviance provided in the polr summary output is NOT
the usual residual deviance provided in the glm summary output. Rather, it is −2 times
the maximized log-likelihood. Therefore, we can use it to compute the change in deviance
between the full and reduced models, but not to check the GOF using a deviance test. For
this reason, we will use only informal, graphical methods to assess whether the proportional
odds assumption is reasonable.
