Schneider

Proceedings
Quality of life scoring systems.

H.P.G. Schneider, M.D, PhD, FRAM
Department of Obstetrics and Gynecology, University of Muenster, Germany
Corresponding Author: Department of Obstetrics and Gynecology, University of Muenster,
Von-Esmarch-Str. 56, ZMBE, D-48149 Muenster, phone +49/2 51/83-5 57 10, fax +49/2 51/83-5 57 11,
e-mail HPG.Schneider@uni-muenster.de

Psychometry and the construction of scales and subscales
The standard method for collecting information on the prevalence and
severity of complaints has been a checklist of symptoms. A symptom is
defined as "an indication of a disease or a disorder noticed by the
patient himself. A presenting symptom is one that leads a patient to
consult a doctor" [1]. Symptoms represent a subjective expression or
manifestation of some underlying physical, psychological or social
dysfunction. Symptoms are, in effect, evidence of dis-ease [2].
Particular knowledge of symptoms and their effect on the daily lives of
women will assist the care-giver in providing competent care together
with long-standing professional assistance during the ageing process.

Reliable and valid measures of multi-symptom conditions generally come
in the form of scales and subscales, developed on the basis of
principles of test construction and scaling [3]. In the field of
psychology, the techniques developed to construct such measures became
known as psychometrics. The first experimental psychology laboratory
was founded by Wilhelm Wundt at the University of Leipzig in 1879. The
aim was to establish the general principles of psychological
expression. Because individual expression varies widely, however,
measures were required that were sensitive enough to distinguish
between subjects and between the various items under investigation.
This led to attempts to construct "scales". By definition, scales are
instruments which measure phenomena on a continuum using ordinal
scaling [2]. Scales measuring more complex human characteristics, such
as intelligence or personality traits, invariably consist of a number
of items which are summated to give an overall score for each person.
A number of symptoms may thus yield a total score which reflects the
degree of severity of a condition along a graded continuum for each
individual. Moreover, each symptom is usually rated in terms of its
frequency of occurrence or severity.
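The summation of rated items into a total score can be sketched in a few lines of Python. This is a minimal illustration only: the five ratings, the 0 to 3 severity range and the function name `summated_score` are assumptions made for the example and are not taken from any of the instruments discussed here.

```python
# Hypothetical example: one respondent rates five symptoms on a 0-3 severity scale.
def summated_score(ratings, min_rating=0, max_rating=3):
    """Sum item ratings into one total score after checking each rating's range."""
    for rating in ratings:
        if not min_rating <= rating <= max_rating:
            raise ValueError(f"rating {rating} outside {min_rating}..{max_rating}")
    return sum(ratings)

ratings = [2, 0, 3, 1, 1]  # illustrative values only
total = summated_score(ratings)
print(total)  # 7
```

The total places the respondent on a graded continuum from 0 (no complaints) to 15 (all five symptoms rated as most severe).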
Factor analysis is a multivariate mathematical technique traditionally
used in psychometrics to construct measures of psychological and
behavioral characteristics, such as intellectual abilities or
personality traits [2]. In theory, it addresses the problem of how to
analyze the structure of the inter-relationships (correlations) among a
large number of variables (test scores, questionnaire responses,
behavior, symptoms) by identifying a set of underlying dimensions known
as factors. The overall objective of factor analysis is data
summarization and data reduction.
A central aim of factor analysis is the
orderly simplification of a number of
interrelated measures. Factor analysis aims
to order and give structure to observed
variables and, by virtue of that, allows for the
construction of instruments in the form of
scales and subscales.
The relationship between a symptom and a factor is measured by a
correlation coefficient known as a factor loading. On that basis, an
instrument can be constructed which consists of several separate
subscales; it will measure different aspects of the symptom picture,
based on the way symptoms cluster together with factors and on the size
of the factor loadings. The resulting scale yields a symptom profile
for each subject [2]. By identifying symptoms which cluster together or
form groups of factors, one may be able to delineate facets of the
symptom picture and identify those symptoms that are an essential part
of a syndrome and those which are not.
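To make the notion of a factor loading concrete, the following sketch extracts the first principal component of a small correlation matrix by power iteration and scales it into loadings. It is an illustration under stated assumptions, not the procedure of any cited study: the three symptoms, the correlation values and the function name `first_factor_loadings` are invented, and a full factor analysis would extract several factors and usually rotate them.

```python
import math

# Hypothetical correlation matrix for three symptoms (illustrative values only):
# hot flushes and night sweats correlate strongly; headaches are unrelated to both.
SYMPTOMS = ["hot flushes", "night sweats", "headaches"]
CORR = [
    [1.0, 0.8, 0.0],
    [0.8, 1.0, 0.0],
    [0.0, 0.0, 1.0],
]

def first_factor_loadings(corr, iterations=200):
    """Loadings on the first principal component, found by power iteration.

    loading_i = eigenvector_i * sqrt(eigenvalue), which equals the
    correlation between symptom i and the component.
    """
    k = len(corr)
    vector = [1.0 / math.sqrt(k)] * k
    eigenvalue = 0.0
    for _ in range(iterations):
        product = [sum(corr[i][j] * vector[j] for j in range(k)) for i in range(k)]
        eigenvalue = math.sqrt(sum(x * x for x in product))
        vector = [x / eigenvalue for x in product]
    return [x * math.sqrt(eigenvalue) for x in vector]

loadings = first_factor_loadings(CORR)
for name, value in zip(SYMPTOMS, loadings):
    print(f"{name}: {value:.2f}")
```

In this constructed example the two vasomotor symptoms load at about 0.95 on the first component and headaches load at about zero, which is the kind of clustering that would assign the first two symptoms to a common subscale.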
Scales for measuring a complex phenomenon or multifaceted syndrome are
generally made up of a number of subscales, each measuring a different
facet of the syndrome. Summating symptoms from apparently different
domains is very often meaningless. Greene, in his methodological
evaluation, compared this to adding a person's height and waist
measurement to give an overall measure of "size". Such a measure would
fail to distinguish tall, thin people from small, obese people, because
both would tend to have a similar overall "size" score [2]. Similarly,
the common practice of reporting symptoms individually is bound to fail
because such a measure would not assess a condition comprehensively.
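Greene's "size" analogy can be written out directly. The measurements below are hypothetical and chosen only so that the two totals coincide; `size_score` is an illustrative name.

```python
# Hypothetical measurements, chosen only to illustrate the analogy.
tall_thin = {"height_cm": 190, "waist_cm": 75}
short_obese = {"height_cm": 155, "waist_cm": 110}

def size_score(person):
    """Summate both measurements into a single 'size' score."""
    return person["height_cm"] + person["waist_cm"]

print(size_score(tall_thin), size_score(short_obese))  # 265 265
print(tall_thin == short_obese)                        # False
```

Both people receive the same total although their profiles differ, which is why separate subscale scores are preferred over one summated score across unrelated domains.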

General scales and condition-specific scales are the two types of
instrument for measuring human characteristics or conditions. When
questionnaires are developed, they focus either on generic or on
disease- and treatment-specific aspects. Different generic scales show
many similarities, as they assess the ability of patients to cope with
their condition physically, emotionally and socially as well as their
general performance at work and in daily life [4]. The most commonly
used generic measures are the Sickness Impact Profile [5], the
Nottingham Health Profile [6], the Quality of Well-Being Scale [7] and
the Short Form (SF)-36 Health Survey [8]. They all cover the
multidimensional aspects of quality of life over a wide range of health
problems. These scales may be less responsive to treatment-induced
changes and could be considered lengthy and time-consuming.

Disease-specific measures, on the other hand, are more likely to be
responsive and to make sense to clinicians as well as to patients. They
relate to concepts and domains specific to patient populations,
diagnostic groups or diseases. One of the very first was the Women's
Health Questionnaire (WHQ) [9]. It was developed to assess a wide range
of physical and emotional symptoms in order to study possible health
changes in mid-aged women. The WHQ consists of 36 items grouped into
nine domains. Self-reported symptoms are scored on a five-point Likert
Scale [10] (table 1).

Table 1: Psychometric Response Scales [10]

After the questionnaire is completed, each item may be analyzed separately, or item responses may
be summed to create a score for a group of items.
Traditionally a five-point scale is used; many psychometricians advocate seven- or nine-point scales.
The Likert Scale is a bipolar scaling method, measuring either a positive or a negative response to a statement.
A typical test item in a Likert Scale is a statement.
The respondent is asked to indicate a degree of agreement with the statement.

Example statement: "Ice cream is good for breakfast"
Strongly disagree
Disagree
Neither agree nor disagree
Agree
Strongly agree

Particular health questionnaires may address psychiatric problems, such
as the Beck Depression Inventory [11]. This index was designed to
assess clinical depression in psychiatric patients and proved to be
much less sensitive to change than other, non-psychiatric measures.
Although the depressed mood experienced by climacteric women may be no
less severe than that of psychiatric patients, it has different origins
and may therefore arise in a different context than psychiatric
depression. Other test systems cover pain scores, sleep disturbances,
sexual dysfunction, and mental and cognitive function.

Health-related quality of life
The World Health Organization definition of health as a "state of
complete physical, mental, and social well-being and not merely absence
of disease or infirmity" has remained unchanged since 1948 [12].
Although mortality was previously the measure of choice to reflect
population health, the importance of "non-fatal" health outcomes (i.e.,
functioning and disability in various aspects of life) has recently
been recognized. National mortality statistics, reported on the basis
of the International Classification of Diseases (ICD) system, were
useful for tracking life expectancy and causes of death but failed to
reflect health status among the living population. This led to the
development of the International Classification of Impairments,
Disabilities, and Handicaps (ICIDH) [13] to classify the consequences
of diseases. "Impairment" refers to any loss or abnormality of
psychological, physiological, or anatomical structure or function at
the tissue, organ, or whole body system level (e.g. reduced muscle
strength). "Functional limitation" refers to any restriction or
inability (resulting from an impairment) to perform an activity in the
manner or within the range considered normal for a human being (e.g.
limited ability to walk). "Disability" has been subclassified into four
categories: physical, mental, social, and emotional disability.
Disability represents any restriction or limitation in the fulfillment
of a person's normal (depending on age, gender, and social and cultural
factors) socially defined roles and tasks at work, school, or
recreation, or in personal care [14]. Recently, the WHO has created a
revised ICIDH model (ICIDH-2) titled International Classification of
Functioning, Disability and Health (ICF), in which the domains include
impairment, activity and participation [15]. The ICF has been accepted
by 191 countries as the international standard to describe and measure
health and disability. WHO estimates that as many as 500 million
healthy life years are lost each year due to disability associated with
health conditions. This is more than half the years that are lost
annually due to premature death. The ICF provides a common metric for
this immense problem.
While traditional health indicators are based on the mortality rates of
populations, the ICF shifts the focus to "life", i.e. how people live
with their health conditions and how these can be improved to achieve a
productive, fulfilling life. It has implications for medical practice;
for law and social policy to improve access and treatments; and for the
protection of the rights of individuals and groups. The ICF takes into
account the social aspects of disability and provides a mechanism to
document the impact of the social and physical environments on a
person's functioning (figure 1).


Figure 1: WHO International Classification of Functioning and Disability
Interactions between the components of ICF

This has been illustrated by the example of a person with a serious
disability who finds it difficult to work in a particular building
because it provides no ramps or elevators; the ICF identifies the
needed focus of intervention, i.e. that the building should include
those facilities and not that the person be forced out of the job
because of an inability to work. The ICF thereby puts all diseases and
health conditions on an equal footing irrespective of their cause. A
person may not be able to attend work because of a cold or angina, but
also because of depression. This neutral approach puts mental disorders
on a par with physical illness and has contributed to the recognition
and documentation of the world-wide burden of depressive disorders,
which are currently the leading cause, world-wide, of life years lost
to disability. The nine domains with their qualifiers and the scoring
system are listed in tables 2 and 3. Validation studies are under way
to ensure that the ICF is applicable across cultures, age groups and
genders so as to collect reliable and comparable data on health
outcomes of individuals and populations.

[Figure 1 diagram components: Health Condition (Disorder or Disease); Body Functions and Structures; Activities; Participation; Environmental Factors; Personal Factors]

Table 2: WHO International Classification of Functioning and Disability [15]

Domains*                                           Qualifiers
                                                   Performance   Capacity
d1  Learning and applying knowledge
d2  General tasks and demands
d3  Communication
d4  Mobility
d5  Self-care
d6  Domestic life
d7  Interpersonal interactions and relationships
d8  Major life areas
d9  Community, social and civic life

* Domains cover the full range of life areas. The component can be used to denote activities or
participation or both. The domains of this component are qualified by the two qualifiers of
performance and capacity.

Table 3: Scoring System of the International Classification of Functioning and Disability [15]

0   No problem         (none, absent, negligible, ...)     0-4 %
1   Mild problem       (slight, low, ...)                  5-24 %
2   Moderate problem   (medium, fair, ...)                 25-49 %
3   Severe problem     (high, extreme, ...)                50-95 %
4   Complete problem   (total, ...)                        96-100 %

Broad ranges of percentages are provided for those cases in which calibrated assessment
instruments or other standards are available to quantify the impairment, capacity limitation,
performance problem or barrier.
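The percentage ranges of table 3 amount to a simple lookup. The sketch below maps a percentage to a qualifier; the function name `icf_qualifier` is hypothetical, and the handling of fractional values that fall between two listed ranges (for example 4.5 %) is an assumption of this example, not something the table specifies.

```python
# Upper bound of each percentage range in table 3, with qualifier and label.
ICF_RANGES = [
    (4, 0, "No problem"),
    (24, 1, "Mild problem"),
    (49, 2, "Moderate problem"),
    (95, 3, "Severe problem"),
    (100, 4, "Complete problem"),
]

def icf_qualifier(percent):
    """Return (qualifier, label) for a problem magnitude given in percent.

    Assumption: a value between two listed ranges is assigned to the
    higher range, because the first upper bound it does not exceed wins.
    """
    if not 0 <= percent <= 100:
        raise ValueError("percent must lie between 0 and 100")
    for upper, qualifier, label in ICF_RANGES:
        if percent <= upper:
            return qualifier, label

print(icf_qualifier(30))  # (2, 'Moderate problem')
```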

Health-related quality of life refers to the effects of an individual's
physical state on all aspects of psycho-social functioning. Generally
speaking, quality of life may also be defined as "the extent to which
our hopes and ambitions are matched by experience" [16]. Recently,
awareness of quality of life in relation to aging has been growing.
Quality of life is a subjective parameter, and direct questioning is
therefore a simple and appropriate way of gathering information about
how patients feel and function. Accordingly, measures of quality of
life (QOL) attempt to gauge the effect ill health has across a number
of physical, psychological and social parameters.

Quality of life and ageing
The years in which a woman passes from the reproductive stage of life
to the postmenopausal years form a period marked by waning ovarian
function, best referred to as the climacteric. The Massachusetts
Women's Health Study has shown that women express either positive or
neutral feelings about menopause, with the exception of those who
experience surgical menopause [17]. By that token, the majority of
women feel healthy and happy and do not seek contact with physicians.
Medical intervention at this point of life should rather be regarded as
an opportunity to provide and reinforce a program of preventive
healthcare. The issues of preventive healthcare for women include
family planning, cessation of smoking, control of body weight and
alcohol consumption, prevention of heart disease and osteoporosis,
maintenance of mental well-being (including sexuality), cancer
screening, and treatment of neurological problems.
Chronic disease in an ageing population is incremental in nature. The
best health strategy would be to change the rate at which illness
develops and thus postpone clinical illness; if it is postponed long
enough, it might in the end be prevented effectively. This postponement
of illness has been termed "compression of morbidity" [18,19]. The
target is to lead a relatively healthy life and to compress illness
into a short period of time just before death. Thus, disease is not
necessarily best treated by medication or surgery, but by prevention
or, more accurately, by postponement.
Improvement of quality of life is a primary purpose of health
promotion. This can be achieved by preventive health programs, which
have a greater impact on morbidity than on mortality [19]. The aim is
maximal vigor in life rather than acceptance of linear senescence. Some
linear decline is unavoidable, but the slope can be changed by effort
and practice.

How to assess quality of life in ageing and climacteric women
One example of a simple symptom inventory, constructed without any
attempt to standardize it or to apply psychometric methodology, is the
Kupperman Index [20,21]. This questionnaire focused primarily on
symptomatic relief. It was based on the physician's summary of the
severity of climacteric complaints, scored by a weighting index, rather
than on women's own assessment of their perceived symptoms. Some
decades later, the time had come to develop more specific symptom lists
or other questionnaires as instruments to measure changes, and to
validate them in a scientific manner. Psychometric methods were applied
more frequently in the 1950s and 1960s; this knowledge, however, was
largely restricted to psychology and social science and not yet common
in medicine. Test theory and test construction also developed rapidly
in the 1960s, spreading to the medical field.
It was during this period that social scientists had to acknowledge the
differences between "objective hard data" and "subjective soft data" as
different degrees of proof. In particular, awareness grew that
subjectively perceived quality of life best serves to describe
treatment benefits.

Instruments were used to develop a scale and to evaluate its basic
properties, such as its dimensions (domains). This requires, for
example, analyzing the structure of a construct such as menopausal
complaints. By analyzing the intercorrelations of all symptom
combinations, it was found that symptoms cluster into "factors", which
allow variation to be assessed. Factor analysis distinguishes the
"domains or subscales" of a complex construct such as menopause.
Among clinicians and researchers, there is increasing recognition of
the role of patient-reported data as an outcome measure for clinical
and drug research. Health authorities support this growing interest. As
a result, multiple attempts have been undertaken at a state-of-the-art
development of health-related quality of life scales applicable to
women in their menopausal transition.
There are four criteria by which scales would qualify as standardized
or disease-specific (adapted from [2]):
1. They have been constructed on the basis of a factor analysis.
2. They consist of several subscales, each measuring a different aspect
of a specific symptomatology.
3. The scales possess sound psychometric properties.
4. They have been standardized using adequate populations of women.
With these criteria being fulfilled, a
series of instruments currently dominates
international practice. Although some of
them do not necessarily meet the criterion
of primarily being considered health-related
quality of life (HRQoL) instruments, they are
listed because of their extensive clinical
usage and the large amount of statistical
information collected.
The following scales are introduced
according to their chronological order of
construction. They are short-listed in
table 4.

Table 4: Standardized Menopause-Specific QOL Scales*

Name of scale                          Number of  Rating  Scoring              Number of   Reliability
                                       items      points                       subscales   of subscales
                                                                               (domains)
Greene Climacteric Scale               21         4       Likert Scale         4           0.83-0.87
Women's Health Questionnaire           32         2       Present / Absent     8           0.78-0.96
Qualifemme                             15         6       VAS 100 mm           4           0.84-0.98
Menopause-Specific QOL Questionnaire   16         7       Likert Scale         4           0.55-0.85
Menopausal Symptom List                25         6       Frequency, Severity  3           0.73-0.83
Menopause Rating Scale                 11         5       Likert Scale         3           0.74-0.82
Menopause Quality of Life Scale        48         6       Likert Scale & VAS   7           0.69-0.91
Utian QOL Scale                        23         5       Likert Scale         4           0.73-0.84

* For more details see text.

Greene Climacteric Scale
This was the first properly analyzed climacteric symptom scale. In
1976, J. G. Greene developed his original 30-item self-administered
scale [22]. It was derived from an earlier study by Neugarten and
Kraines [23]. Based on endocrine and emotional factors underlying the
etiology and dynamics of menopause, Greene investigated the
relationship between menopausal symptoms. Factor analysis of
climacteric symptoms established independent domains such as vasomotor
and physical. The original 50 women aged 40 to 55 years were scored on
a four-point Likert scale (0 to 3). The results were inter-correlated
using product-moment coefficients, and the resulting matrix was
submitted to principal component analysis. The final scale yielded
three independent symptom groups or factors, equivalent to subscales:
psychological, somatic and vasomotor symptoms. Items with factor
loadings greater than +0.40 on one factor and less than 0.30 on the
other two factors were included in the questionnaire, so that 21 items
from the initial list of 30 were retained in the scale. Items with a
factor loading above +0.50 were given a weighting factor of 2.
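The selection rule described above can be expressed as a short filter. This is a sketch under stated assumptions: the loadings are invented for illustration and are not Greene's published values, the function name `select_items` is hypothetical, and a weight of 1 for retained items at or below 0.50 is an assumption, since the text states only the weighting factor of 2.

```python
FACTORS = ["psychological", "somatic", "vasomotor"]

# Hypothetical loadings per item on the three factors (not Greene's data).
LOADINGS = {
    "hot flushes":  [0.05, 0.10, 0.72],
    "irritability": [0.45, 0.12, 0.08],
    "headaches":    [0.42, 0.38, 0.10],  # cross-loads on the somatic factor
    "dizziness":    [0.15, 0.35, 0.20],  # no loading above 0.40
}

def select_items(loadings, include_above=0.40, others_below=0.30, double_above=0.50):
    """Keep items loading > 0.40 on one factor and < 0.30 on the others.

    Returns {item: (factor name, weight)}; weight 2 for loadings above
    0.50, otherwise 1 (the weight of 1 is an assumption of this sketch).
    """
    selected = {}
    for item, row in loadings.items():
        for index, value in enumerate(row):
            others = [v for other, v in enumerate(row) if other != index]
            if value > include_above and all(v < others_below for v in others):
                weight = 2 if value > double_above else 1
                selected[item] = (FACTORS[index], weight)
                break
    return selected

print(select_items(LOADINGS))
# {'hot flushes': ('vasomotor', 2), 'irritability': ('psychological', 1)}
```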
Gerald Greene's tool represents a pioneering piece of work. While the
original scale was never designed to be a genuine HRQoL instrument as
defined today, it was the first to apply quantitative techniques to
questionnaire construction and marked the beginning of the use of
factor analysis in clinical studies with "patient-reported" outcomes as
endpoints in the field of women's health. Since then, factor analysis
has been applied around the world to generate new menopausal scales.
Later, Gerald Greene tried to reconcile the findings of seven other
factor analytic studies and to meet the demand for a "communal and
comprehensive measure" of climacteric symptoms; this revised tool was
based on a sample of 200 rather than 50 women. It was published in 1998
[24] and looked at the optimum number of factors or domains to be
established, with resultant "communal" scales of psychological, somatic
and vasomotor symptoms. By selecting only symptoms found to have a
factor loading of more than 0.35 in three or more studies, he also
determined which symptoms should be
included.
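The cross-study criterion (a loading above 0.35 in at least three studies) is essentially a counting rule. The following sketch assumes each study is summarized as a mapping from symptom to its highest loading; the study data and the name `communal_symptoms` are hypothetical.

```python
# Hypothetical per-study summaries: symptom -> highest factor loading.
STUDIES = [
    {"hot flushes": 0.70, "fatigue": 0.36, "palpitations": 0.20},
    {"hot flushes": 0.65, "fatigue": 0.41, "palpitations": 0.38},
    {"hot flushes": 0.58, "fatigue": 0.30, "palpitations": 0.37},
    {"hot flushes": 0.61, "fatigue": 0.44},
]

def communal_symptoms(studies, threshold=0.35, min_studies=3):
    """Return symptoms whose loading exceeds the threshold in enough studies."""
    counts = {}
    for study in studies:
        for symptom, loading in study.items():
            if loading > threshold:
                counts[symptom] = counts.get(symptom, 0) + 1
    return sorted(s for s, n in counts.items() if n >= min_studies)

print(communal_symptoms(STUDIES))  # ['fatigue', 'hot flushes']
```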
This revision therefore replaced four items of the original 1976 scale
with four new ones. The wording of four other symptoms was changed. An
additional item on loss of sexual interest was added, and the
psychological symptoms domain was broken down into an anxiety and a
depressed mood scale. The result is a 21-item, four-level
questionnaire. This "standardized" Greene Climacteric Scale of 1998 was
employed in a trial of Kliogest [25].

Women's Health Questionnaire
The Women's Health Questionnaire (WHQ), developed by Myra Hunter, is a
self-administered questionnaire which measures physical and emotional
experience and functioning of women aged 45 to 65 years [9]. It was
designed specifically to study possible changes in perceptions of
health and well-being during the menopausal transition. The
questionnaire was initially developed in UK English and is composed of
36 items. Of those, 35 items investigate nine domains providing scale
scores: depressed mood, somatic symptoms, memory/concentration,
vasomotor symptoms, anxiety/fear, sexual behavior, sleep problems,
menstrual symptoms and attractiveness.
The WHQ has been used in both epidemiological and intervention studies.
It was employed in the Adelphi Women's Health Program in 1998, with
subsequent publications [26,27,28]. A double-blind, randomized,
placebo-controlled multi-centre clinical study was performed in 1995
[29]. This trial examined the HRQoL response to transdermal estradiol
or placebo patches in 223 Swedish postmenopausal volunteers with mild
to severe climacteric symptoms at baseline. Recently, the structure of
the WHQ was examined in a UK sample; a revised model was developed and
verified for use in multi-center, international studies [30]. The
revised WHQ comprises 23 items investigating six domains. The
cross-sectional psychometric properties of the 23-item WHQ were good
and better than those of the 36-item version. The 23-item WHQ was
assessed with multi-national data to evaluate the cross-cultural
equivalence of linguistically adapted versions. Reproducibility and
responsiveness still need to be documented.

Qualifemme
The Qualifemme questionnaire was developed in France to measure the
impact of menopausal hormone deficiency on a woman's quality of life.
The first version consisted of 32 items derived from several other
validated and accepted HRQoL instruments. These items were translated
and linguistically validated for use in France [31]. The Qualifemme is
scored using a visual analogue scale. Item weighting was carried out by
a group of menopause experts on the basis of their clinical experience.
The original investigation comprised a subject pool of 351 women aged
51 to 68. A principal component analysis identified five domains with
32 items: general (9), psychological (12), vasomotor (2), urogenital
(6), and a final domain covering pain and problems with hair and skin
(3). Internal consistency was demonstrated by a Cronbach's alpha
coefficient of 0.87. Subsequently, a reduction process removed 17 items
from the original instrument and resulted in the current 15-item
questionnaire. This reduction did not alter the instrument's
psychometric quality [32]. Internal consistency (Cronbach's alpha) was
0.73.
The Qualifemme was applied in a multi-centre trial in France. HRQoL was
compared before and after sequential versus continuous combined
application of 17β-oestradiol percutaneous gel and nomegestrol acetate
in 141 postmenopausal women from 36 centers during the years 1996 and
1998. The global quality of life score increased by 44.6 % in those on
sequential treatment and by 38.3 % in the continuous-combined treatment
group [33]. From this experience, Qualifemme appears to be a valid
instrument; it also attempts to capture the side effects of menopausal
hormone therapy, such as androgenic skin effects.
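Cronbach's alpha, the internal consistency coefficient quoted above, can be computed from item-level responses with the standard formula alpha = k/(k-1) * (1 - sum of item variances / variance of total scores). The sketch below uses invented responses, not Qualifemme data, and the function name `cronbach_alpha` is illustrative.

```python
from statistics import variance

def cronbach_alpha(rows):
    """Cronbach's alpha for a list of respondents, each a list of item scores."""
    k = len(rows[0])
    item_variances = [variance([row[i] for row in rows]) for i in range(k)]
    total_variance = variance([sum(row) for row in rows])
    return k / (k - 1) * (1 - sum(item_variances) / total_variance)

# Hypothetical responses of four women to three items (illustrative only).
responses = [
    [3, 4, 3],
    [1, 2, 1],
    [4, 4, 5],
    [2, 1, 2],
]
alpha = cronbach_alpha(responses)
print(round(alpha, 2))
```

Values close to 1 indicate that the items vary together and can reasonably be summed into one subscale score.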

Menopause-Specific Quality of Life Questionnaire
The Menopause-Specific Quality of Life Questionnaire (MENQOL) was
developed by a group of researchers from Canada during the mid-1990s
[34]. A list of postmenopausal symptoms was established by
extrapolation from the menopause and quality of life literature, from
quality of life questionnaires and from the investigators' clinical
experience. The resulting list comprised 106 items. The five original
domains (physical, vasomotor, psychosocial, sexual, and working life)
were reduced to four upon completion of the study by omission of the
domain "working life".
The final 32-item menopause-specific
HRQoL instrument encompasses four
subscales (physical, vasomotor,
psychosocial and sexual) plus one overall
HRQoL item. Each domain is scored
separately within a possible range from 1
(not experiencing a problem) to 8
(extremely bothered). The mean of the item scores within a
subscale serves as the overall subscale
score. As with the WHQ, no overall score
can be obtained from this questionnaire, as
the relative contribution of each domain to
an overall score is unknown.
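Subscale scoring by item means, as described above, can be sketched as follows. The item identifiers, the item-to-domain mapping and the responses are hypothetical and do not reproduce the MENQOL item list; only the 1 to 8 range and the use of the mean follow the text.

```python
from statistics import mean

# Hypothetical mapping of item identifiers to two of the four domains.
DOMAINS = {
    "vasomotor": ["v1", "v2"],
    "sexual": ["s1", "s2"],
}

def subscale_scores(responses, domains):
    """Mean item score per domain; items are rated from 1 to 8."""
    for item, score in responses.items():
        if not 1 <= score <= 8:
            raise ValueError(f"{item}: score {score} outside 1..8")
    return {
        domain: mean(responses[item] for item in items)
        for domain, items in domains.items()
    }

responses = {"v1": 6, "v2": 8, "s1": 1, "s2": 2}  # illustrative values
scores = subscale_scores(responses, DOMAINS)
print(scores["vasomotor"], scores["sexual"])
```

No total across domains is computed, in line with the statement that the relative contribution of each domain to an overall score is unknown.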
Internal consistency (Cronbach's alpha) ranged from 0.81 to 0.89.
Construct validity coefficients ranged between 0.40 and 0.65
(evaluative) and between 0.28 and 0.60 (discriminative). They were
determined in a randomized, parallel-group trial of conjugated versus
transdermal estrogen, both supplemented sequentially with MPA. While
all domains improved during treatment, there were significant
differences between groups [35].
Discriminative power was poor in the vasomotor domain and good in the
other domains. Evaluative performance was fair in the vasomotor and
libido subscales and poor in the global subscale. The lack of factor
analysis is another shortcoming, as it leaves the correlation patterns
among the data unexplored. Like most of the other instruments, MENQOL
does not address the full picture of potential side effects of
menopausal hormone therapy [36].

Menopausal Symptom List
The Menopausal Symptom List (MSL) was developed in 1997 to measure the
severity of symptoms commonly associated with menopause. The
theoretical symptom checklist was sent to 40 women aged 45 to 55 years
living in Australia. Following two principal component analyses, 25
significant items emerged in three domains, labeled psychological,
vaso-somatic, and general somatic [37]. The psychological domain
combines the anxiety and depression subscales of the Greene Climacteric
Scale and the Women's Health Questionnaire. The vaso-somatic subscale,
besides two vasomotor symptoms, also includes other somatic symptoms
for reasons not quite apparent. The items are scored on a six-point
Likert scale of both frequency and severity.
The MSL is a symptom inventory in
terms of the selection, wording of items and
its scoring. Validation experience is limited.

Menopause Rating Scale
The first version of the Menopause Rating Scale has been used since
1992 [38,39]. It was initially developed to provide the physician with
a tool to document specific climacteric symptoms and their changes
during treatment and was seen as an improvement over the commonly
applied Kupperman Index.
A critical assessment of this new scale, however, disclosed
methodological deficiencies, which both in theory and practice limited
its use. Accordingly, the original physician-based scale was improved
as follows:
1. Application of the scale in a representative sample of women after
questionnaire revision.
2. Revision of the questionnaire such that women complete it
themselves: first, because self-assessment is more sensitive, and
second, because a self-administered questionnaire would not limit
future application.
3. Modification of the wording of items to a simple form appropriate
for laypersons.
4. Proper psychometric evaluation of the revised scale based on a
representative sample and development of simple-to-use standardized
items with clear dimensions.
5. Classification of the severity of complaints based on a normal
population sample.
6. Provision of normative data, representative for the climacteric age
in the female population.
This new MRS questionnaire was standardized in early 1996 using a
representative random sample of 689 German women aged 40 to 60 years
[40]. The revision mainly concerned the layout and some adjustments to
the number, structure, and wording of items; these were made to support
applicability as a self-administered questionnaire. The MRS was
formally standardized following up-to-date psychometric rules. Factor
analysis of the standardized eleven-item version yielded three domains:
a psychological, a somato-vegetative, and a urogenital dimension.
Scoring is based on a 5-point Likert scale ranging from no symptoms to
mild, moderate, marked or severe complaints.
A follow-up investigation was performed from August to October 1997 in
306 women from the original study. The retest reliability of scores
between the two time points was evaluated using Pearson's correlation
coefficient. The results of the follow-up survey demonstrate stability
in the individual scores. The total score and the scores of the three
defined dimensions show significant agreement, as demonstrated using
statistics [41].
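Retest reliability of this kind reduces to correlating each woman's score at the first survey with her score at follow-up. The sketch below computes Pearson's correlation coefficient in plain Python; the two score lists are invented for illustration and are not MRS data, and `pearson_r` is an illustrative name.

```python
import math

def pearson_r(first, second):
    """Pearson product-moment correlation between two equally long lists."""
    n = len(first)
    mean_first = sum(first) / n
    mean_second = sum(second) / n
    covariation = sum((a - mean_first) * (b - mean_second)
                      for a, b in zip(first, second))
    spread_first = sum((a - mean_first) ** 2 for a in first)
    spread_second = sum((b - mean_second) ** 2 for b in second)
    return covariation / math.sqrt(spread_first * spread_second)

# Hypothetical total scores of five women at baseline and at follow-up.
baseline = [12, 5, 20, 9, 15]
retest = [11, 6, 18, 10, 16]
r = pearson_r(baseline, retest)
print(round(r, 2))
```

A coefficient near 1 indicates that women keep their relative position between the two surveys, which is what stability of individual scores means here.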
The validity of the MRS to measure HRQoL in postmenopausal women was
determined by comparing the instrument to both the Kupperman Index
[20,21] and the SF-36 [42]. The Kupperman Index introduced weighting
factors based on "prevalence and consequence" as perceived by its
developer. Thus, the assignment of such weighting factors is not
explicit and merges distinct concepts into one coefficient. This rather
simple symptom questionnaire of the late 1950s was never subjected to
quantitative research or psychometric validation. The MRS proved to be
a much more sound and accurate instrument than the Kupperman Index; the
differences between the scores could easily be explained by the domains
resulting from factor analysis. There was, however, a high degree of
association between both instruments, as documented by Kendall's τ-b
coefficient and Pearson's correlation coefficient [42]. More important
still were the results of comparing the MRS to the SF-36. The
psychological and somato-vegetative MRS subscales did not correlate
equally well across all eight domains of the SF-36. However, the
pattern of correlation was understandable, as the highest degree of
correlation occurred in the domains of the SF-36 that are most relevant
to women during the menopausal transition [2]. Thus, the MRS is a
reliable, well-defined instrument for measuring the impact of
climacteric symptoms on quality of life [43,44]. It should be regarded
as a brief and compact instrument, easy to complete and to score, and
suitable for routine controls. It covers the key complaints of women
during and after menopause. This type of scale is not designed to
tailor specific therapies to the needs of each individual woman.
The need for cross-nationally and
cross-culturally valid, reliable, and responsive
HRQoL instruments has never been as
great as today. Linguistic validation of the
MRS met with an excellent international
response and acceptance. The first
translation was into English [45]. Other
translations followed [46], and the following
versions are currently available: Brazilian,
Bulgarian, Belgian-French, Belgian-Dutch,
Chilean, Chinese, Croatian, English,
French, German, Greek, Indonesian,
Mexican/Argentinean, Polish, Spanish,
Swedish, Romanian, Russian, South
African English, South African Afrikaans,
Turkish, Ukrainian (Russia), and Ukrainian
(Ukraine). Some of these
versions are available in published form [46],
and all, including the unpublished ones, can be
downloaded in PDF format from the internet
(see reference 43 and
www.menopause-rating-scale.info).

Menopausal Quality of Life Scale
The Menopausal Quality of Life Scale
(MQOL) was developed in 2000 [47]. It was
intended as a condition-specific
questionnaire that examines the effects of
menopause on HRQoL as well as the
impact of employment, age, and medical
history; in addition, cross-sectional
information on differences in HRQoL was
obtained in a community-based sample of
women consequent to a self-rated change
in menopausal status. The effects of
hormone replacement therapy in the early
postmenopause were investigated. Based
on interviews of 32 and later another 29
women, a pilot questionnaire was
developed containing 63 items divided into
seven domains: energy, sleep,
appetite, cognition, feelings, interactions,
and symptom impact. Each of these items
is rated on a six-point Likert scale.
The 99 returned questionnaires were used for
psychometric analysis, which resulted in a
48-item questionnaire plus a global
HRQoL question to rate overall quality
of life.
Oblimin rotation was applied in a second
analysis, resulting in a seven-factor
hierarchical structure that accounted for
57 % of the data variance. This structure
proved unstable across sub-samples.
Therefore, the MQOL questionnaire was
given an overall score instead of separate
subscale scores for each of the seven domains [21].
Strong correlations between domains were
demonstrated. Consequently, the global
quality of life index was disregarded as a
single factor; all the items were given
the same importance and were added
to form a total score. Attempts to circumvent or
mask the psychometric shortcomings in the
empirical foundation of this questionnaire's
construction have been unsuccessful.

Utian Quality of Life Score
The Utian Quality of Life Score (UQOL)
is a modification of the original Utian
questionnaire from the 1970s [48]. It was
developed from the old questionnaire
designed to assess the sense of well-being
of participants in a treatment study
comparing estrogen to placebo [49]. The
UQOL is focused on general quality of life
rather than on menopause-specific QOL.
Factor analysis was applied in two
stages. The 23 items are rated on a
five-point Likert scale and form four subscales
(occupational, health, emotional, and
sexual).
A field study was conducted on 327
women aged 46 to 65, recruited from
eleven separate communities throughout
the east and mid-west of the United States.
The resulting 23-item instrument was then
administered to a second sample of 270
menopausal women and subsequently
re-administered to determine test-retest
reliability. The SF-36 was administered
concurrently to determine scale validity.
The UQOL can measure the severity of
QOL burden. However, only limited data on
reliability and validity are as yet available.
Because the UQOL contains few menopausal
symptom-specific items, it may need to be
applied in parallel with a more menopausal
symptom-related scale in the setting where
such scales are most widely used, which
is the menopausal transition.

Menopausal hormone therapy
and QOL
In 2002, Hogervorst et al. [50]
systematically reviewed the effect of
menopausal hormone therapy (MHT) on
cognitive function. Their study included
fifteen publications with a total of 566
postmenopausal women. This
meta-analysis did not report any favorable effect
of MHT on cognitive functions (verbal
measures, spatial measures, speed of
reading or memory). Randomized data
consistently report that hormone therapy
improves quality of life only when quality of
life is impaired by climacteric
symptoms. When symptoms are not
present, hormone therapy does not improve
quality of life, nor would it do so in elderly
women. This would explain why
estrogen plus progestin in the WHI resulted
in no significant effects on general health,
vitality, mental health, depressive
symptoms, or sexual satisfaction. The use
of estrogen plus progestin in this large
study was associated with a statistically
significant but small and not clinically
meaningful benefit in terms of sleep
disturbance, physical functioning, and
bodily pain after one year [51]. The
postmenopausal women in the WHI had a
mean age of 63 years with a range of 50 to
79 years.
An open, uncontrolled post-marketing
study of over 9000 women with pre- and
post-treatment MRS data was
organized to evaluate the capacity of the
scale to measure the health-related effects
of hormone treatment independently of the
severity of complaints at baseline. Hormone
therapy consisted of 2 mg
estradiol valerate given continuously with 1 mg
cyproterone acetate added sequentially
(Climen). The mean age was 49.8 years
(SD 6.4); about half of the women
participating were still perimenopausal
(51.9 %), the others already in the
postmenopausal period (48.1 %). The
mean body mass index was 24.7 (SD 3.7).
The absolute improvement of symptoms
during treatment averaged 9.3 points of the MRS
total score. Did treatment
effects relate to the severity of complaints
at baseline? The answer is documented in
figure 2. The relative improvement of
complaints or quality of life increases with
the degree of severity of symptoms at
baseline. Importantly, the MRS
also detects a positive treatment effect in
women with few complaints [52].
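The distinction between absolute and relative improvement can be made concrete with a small sketch. It assumes that relative improvement is the reduction of the total score expressed as a percentage of the baseline score, and that higher scores mean more severe complaints; the exact definition used in the cited study is not given here. The function name and the example scores are hypothetical.

```python
def relative_improvement(baseline, after):
    """Score reduction as a percentage of the baseline score.

    Assumes that higher scores mean more severe complaints, so a lower
    score after treatment counts as improvement.
    """
    if baseline <= 0:
        raise ValueError("relative change is undefined for a baseline of zero")
    return 100.0 * (baseline - after) / baseline


# Hypothetical (baseline, after-treatment) total scores for two women
examples = {"mild complaints": (8, 6), "severe complaints": (20, 10)}

changes = {
    label: relative_improvement(baseline, after)
    for label, (baseline, after) in examples.items()
}

for label, change in changes.items():
    print(f"{label}: {change:.1f} % relative improvement")
```

In this invented example the woman with mild complaints improves by 2 points (25 %) and the woman with severe complaints by 10 points (50 %). A relative measure allows the two to be compared even though their baselines differ.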
The MRS-assisted assessments of
menopausal hormone therapy and the
meta-analysis of Hogervorst et al. both
help explain why the WHI investigation,
with menopausal complaints as an exclusion
criterion, did not produce major benefits in
terms of quality of life outcomes, except for
small benefits in terms of vasomotor
symptoms and sleep disturbance [53]. This
may be considered another piece of
evidence for the general experience that
a study, however well designed and large
it may be, will never provide answers to
a problem that it was not designed to address.

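Figure 2: HRT: Relative Change of the MRS
Mean Values (SD) in Four Categories of Severity at Baseline

Therapeutic improvement (%) by baseline total score:
no / little symptoms 10.8 ± 10.6; mild symptoms 32.2 ± 9.8;
moderate symptoms 43.9 ± 11.8; severe symptoms 55.1 ± 13.8

Reproduced from Schneider HPG et al. [43]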

Practical considerations
Researchers have been criticized for their
failure to use appropriate measures of health-
related quality of life in the evaluation of the
impact of any intervention through assessment
of patient outcome. Trials may either neglect
outcomes other than conventional clinical,
laboratory and radiological measures or may
use limited, inappropriate, or poorly validated
indicators as surrogates of the patient's own
experience. The recent enthusiasm for the
potential of questionnaires to provide accurate
evidence of outcomes from the patient's
perspective has created numerous reports,
although it is not clear how well developed the
applied methods are and whether they are
available across the full range of health
problems. British authors from the Institute of
Health Sciences at Oxford [54] have undertaken
an extensive review to describe the extent to
which patient-assessed outcome measures
have been developed and applied, and to
examine whether such instruments are
available for all aspects of clinical research.
They collected 3,921 reports, of which 46 %
were disease- or population-specific, another
22 % were generic, 18 % were
dimension-specific, 10 % were utility measures, and 1 % were
individualized measures. Between 1990 and 1999,
the number of new reports of development and
evaluation rose from 144 to 650 per year. Over
30 % of evaluations concerned cancer,
rheumatology and musculo-skeletal disorders,
and older people's health. The generic
measures SF-36, Sickness Impact Profile,
and Nottingham Health Profile accounted for
16 % of the reports. The authors were not
surprised to find evidence of a lack of
consistency in the selection of measures for
clinical trials, which hinders comparison
between studies. In a study of 67 clinical trials,
48 were found to use 62 different existing
measures and 13 reported new measures.
For routine application in clinical practice or
in clinical trials, it is essential that the
instruments employed are simple and
comparatively short. The majority of patients or
test persons welcome the opportunity to report
how symptoms and their subsequent treatment
affect daily life. Psychometrically evaluated
questionnaires allow uniform administration
and unbiased quantification of data as the
response options are predetermined and thus
equal for all respondents. A core set of
questionnaires would allow the comparison of
study results in patient populations. This is why
such widely used and excellently validated
instruments have been introduced in this
report.
Certain difficulties, however, introduce bias
into the interpretation of data. These include
the circumstances of some interviewed
individuals, particularly older ones, who may
have difficulty with reading or writing or may be
exposed to less experienced interviewers. The
expenses involved in gathering quality of life
data may also create divergence.
Standardization, compatibility, eradication of
possible bias, and economy are therefore
important variables for the validity of any type
of quality of life assessment. The application of
health-related quality of life instruments
requires the same scrutiny and attention as the
measurement of physiological outcomes.
Random and representative samples of the
population should be investigated in sufficient
numbers and over prolonged periods of time.
In terms of statistics, quality of life is, by
definition, an assessment of multiple variables.
The use of many measures and multiple
statistical tests reduces the statistical power of
the analysis. Health-related quality of life
certainly is a multi-dimensional concept.
Whether or not the aggregation of several
dimensions into a summary index is
appropriate remains open to continuing
debate. A summary score may falsely suggest
improvement in one vital area and conceal
deterioration in another. Indices, however, are
practical and are a convenient method of
information transfer.
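The cost of multiple testing mentioned above can be illustrated with a short calculation. The sketch assumes independent tests, a conventional significance level of 0.05, and eight outcome domains (as in the SF-36); it uses the Bonferroni correction, which is one common adjustment and not necessarily the one applied in the studies cited here. The function names are hypothetical.

```python
def familywise_error_rate(alpha, n_tests):
    """Probability of at least one false-positive result among n_tests
    independent tests, each carried out at significance level alpha."""
    return 1.0 - (1.0 - alpha) ** n_tests


def bonferroni_alpha(alpha, n_tests):
    """Per-test significance level that keeps the family-wise error
    rate at or below alpha (Bonferroni correction)."""
    return alpha / n_tests


# Assumed example: eight domains, each tested at the 5 % level
fwer_8 = familywise_error_rate(0.05, 8)
per_test = bonferroni_alpha(0.05, 8)

print(f"uncorrected family-wise error rate: {fwer_8:.3f}")
print(f"Bonferroni per-test level: {per_test:.5f}")
```

Without correction, the chance of at least one spurious finding across eight tests rises to about one in three. With correction, each domain must pass a much stricter threshold, which is the loss of statistical power referred to above and one practical argument for a summary index.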
In a larger representative Berlin study [55],
important factors for the understanding of
well-being in menopausal women were found
to be women's self-confidence, the quality of
their partner relationship, the re-orientation
process initiated by menopause, and their
psychosocial condition. Employment is
considered to be a protective factor. The
experience of relief from several physical and
psychosocial conditions has to be considered
in the assessment of well-being in menopausal
women. Another important example of the
application of HRQoL instruments is the
finding that the prevalence of individual menopausal
symptoms differs among ethnic groups of
Asian women [56]. Within each ethnic group, the
percentage of women reporting items of the
MENQOL varied substantially (table 5) [57].
Therefore, it may be inappropriate to utilize the
same QOL measuring instrument across
continents, and perhaps not even across
regional ethnicities, unless linguistic and
cultural adaptation is provided.
Hypoactive sexual desire disorder causes
marked distress or interpersonal difficulty with
severe impact on quality of life. In addition to
the menopause-related questionnaires and
inventories, which more or less consider
sexual behavior as a separate domain, a more
specific evaluation instrument has emerged [58]. The
aspects of sexuality and quality of life have
been the subject of another report during this
Workshop.
Human beings are social individuals. If one
changes the health status or quality of life of
an ageing person, the partner might also be
affected, sometimes strongly and with positive
or negative interaction. This is rarely
considered in the development of tools to
measure treatment effects. Interdisciplinary
consensus can also help to determine the
most suitable measure for a particular
application. Researchers should undertake
comprehensive literature searches to ascertain
whether any suitable measure is available
before they decide to develop a new one.

Table 5: PAM Study: Baseline Domain Scores by Ethnic Group [56]

MENQOL (29) domain scores, mean (S.D.)

Ethnic origin  No. of women  Vasomotor    Psychosocial  Physical     Sexual
Chinese        249           3.13 (1.67)  2.84 (1.37)   3.21 (1.15)  4.04 (2.20)
Filipino       199           3.17 (1.60)  3.33 (1.41)   3.20 (1.23)  3.03 (2.03)
Indonesian     60            2.28 (0.87)  2.40 (0.68)   2.66 (0.63)  2.63 (1.18)
Korean         97            2.21 (1.40)  3.06 (1.46)   3.29 (1.24)  3.55 (2.29)
Malay          24            3.02 (1.56)  2.78 (1.11)   2.93 (1.08)  3.14 (1.78)
Pakistani      60            4.96 (2.41)  4.24 (1.64)   4.84 (1.61)  2.90 (1.70)
Taiwanese      81            2.29 (1.39)  2.37 (1.32)   2.84 (1.23)  2.11 (1.32)
Thai           150           2.87 (1.61)  3.10 (1.22)   3.28 (1.08)  2.89 (1.90)
Vietnamese     100           5.71 (1.59)  5.96 (1.48)   5.39 (1.20)  6.55 (1.67)

References:
1. Martin EA. The Oxford Medical Dictionary. Oxford: Oxford University Press, 1994
2. Greene JG. Measuring the symptoms dimension of quality of life: General and
menopause-specific scales and their subscale structure. In Schneider HPG, ed. Hormone
Replacement Therapy and Quality of Life. Carnforth, New York: Parthenon Publishing, 2002:35-43
3. Peck D, Shapiro C. Measuring Human Problems: A Practical Guide. Chichester: Wiley, 1990
4. Fitzpatrick R, Fletcher A, Gore S, et al. Quality of life measures in health care. I: Applications
and issues in assessment. Br Med J 1992;305:1074-1077
5. Bergner M. Development, use and testing of the Sickness Impact Profile. In Walker S, Rosser
M, eds. Quality of life assessment: Key issues in the 1990s. Dordrecht: Kluwer Academic
Press, 1993:201-209
6. Hunt SM, McKenna SP, McEwen J, et al. The Nottingham Health Profile: Subjective health
and medical consultations. Soc Sci Med 1981;15A:221-229
7. Kaplan RM, Anderson JP, Ganiats T. The Quality of Wellbeing Scale: Rationale for a single
quality of life index. In Walker S, Rosser M, eds. Quality of life assessment: Key issues in the
1990s. Dordrecht: Kluwer Academic Press, 1993:65 ff
8. McHorney CA, Ware JE, Raczek AE. The MOS 36-item short-form health status survey
(SF-36): II. Psychometric and clinical tests of validity in measuring physical and mental health
constructs. Med Care 1993;31:247-263
9. Hunter M. The Women's Health Questionnaire (WHQ): a measure of mid-aged women's
perceptions of their emotional and physical health. Psychol Health 1992;7:45-54
10. Likert R. A technique for the measurement of attitudes. Arch Psychol 1932;140:55
11. Beck AT, Ward CH, Mendelson M, et al. An inventory for measuring depression. Arch Gen
Psychiatry 1961;4:561-571
12. World Health Organization. Preamble to the Constitution of the World Health Organization.
International Health Conference, New York, N. Y., June 19 - July 22, 1946: Report of the U. S.
Delegation, Including the Final Act and Related Documents, Department of State publication
2703, Conference Series 91. New York: WHO, 1946
13. World Health Organization. The International Classification of Impairments, Disabilities and
Handicaps. Geneva: WHO, 1980
14. Woodhouse LJ, Mukherjee A, Shalet SM, et al. The influence of growth hormone status on
physical impairments, functional limitations, and health-related quality of life in adults. Endocr
Rev 2006;27:287-317
15. World Health Organization. International Classification of Functioning, Disability, and Health.
Geneva: WHO, 2001
16. Calman KC. Quality of life in cancer patients: an hypothesis. J Med Ethics 1984;10:124-127
17. McKinlay SM, Brambilla DJ, Posner JG. The normal menopause transition. Maturitas
1992;14:103-115
18. Fries JF. Aging, natural death and the compression of morbidity. N Engl J Med
1980;303:130-135
19. Fries JF, Green LW, Levine S. Health promotion and the compression of morbidity. Lancet
1989;1:481-483
20. Kupperman HS, Blatt MHG, Wiesbader H, et al. Comparative clinical evaluation of estrogen
preparations by the menopausal and amenorrhoea indices. J Clin Endocrinol 1953;13:688-703
21. Kupperman HS, Wetchler BB, Blatt MHG. Contemporary therapy of the menopausal
syndrome. JAMA 1959;171:1627-1637
22. Greene JG. A factor analytic study of climacteric symptoms. J Psychosom Res
1976;20:425-430
23. Neugarten BL, Kraines RJ. Menopausal symptoms in women of various ages. Psychosom Med
1965;27:266-273
24. Greene JG. Constructing a standard climacteric scale. Maturitas 1998;29:25-31
25. Ulrich LG, Barlow DH, Sturdee DW, et al. for the UK continuous combined HRT study
investigators. Quality of life and patient preference for sequential versus continuous combined
HRT: the UK Kliofem multicenter study experience. Int J Gynaecol Obstet 1997;59
(Suppl 1):11-17
26. Zöllner Y, Piercy J, Alt J. Mental Health Aspects of Peri- and Post-Menopausal Women.
Attitudes, quality of life, and the role of HRT (Poster). Arch Women's Mental Health 2001;3
(Suppl 2):68
27. Zöllner Y, Kay S, Abetz L, et al. La qualité de vie sexuelle des européennes. Gyn Info
2001;51:9-11
28. Piercy J, Zöllner Y, Kay S, et al. Quality of life in postmenopausal women in five European
countries (Poster). Val Health 2001;4:168
29. Karlberg J, Mattsson LA, Wiklund I. A quality of life perspective on who benefits from estradiol
replacement therapy. Acta Obstet Gynecol Scand 1995;74:367-372
30. Girod I, de la Loge C, Keininger D, et al. Development of a revised version of the Women's
Health Questionnaire. Climacteric 2006;9:4-12
31. Le Floch JP, Colau JCI, Zartarian M. Validation d'une méthode d'évaluation de la qualité de
vie en ménopause. Refs en Gynécol Obstétr 1994;2:179-188
32. Le Floch JP, Colau JCI, Zartarian M, et al. Réduction d'un questionnaire d'évaluation de la
qualité de vie en ménopause. Contracept Fertil Sex 1996;24:238-245
33. Le Floch JP, Chevalier T, Gelas B, et al. Quality of life improvement and hormonal
replacement therapy: comparison of sequential versus continuous combined schedules with
17β estradiol percutaneous gel and nomegestrol acetate. Menopause Rev 1999;4:87-96
34. Hilditch JR, Lewis J, Peter A, et al. A menopause-specific quality of life questionnaire:
Development and psychometric properties. Maturitas 1996;24:161-175
35. Hilditch JR, Lewis JE, Ross AH, et al. A comparison of the effects of oral conjugated equine
estrogen and transdermal estradiol-17β combined with an oral progestin on the quality of life
in postmenopausal women. Maturitas 1996;24:177-184
36. Zöllner YF, Acquadro C, Schaefer M. Literature review of instruments to assess health-related
quality of life during and after menopause. Qual Life Res 2005;14:309-327
37. Perz JM. Development of the menopause symptom list: A factor analytic study of menopause
associated symptoms. Women Health 1997;25:53-69
38. Hauser GA, Huber IC, Keller PJ, et al. Evaluation der klimakterischen Beschwerden
(Menopause Rating Scale [MRS]). Zentralbl Gynakol 1994;116:16-23
39. Schneider HPG, Doeren M. Traits for long-term acceptance of hormone replacement therapy:
results of a representative German survey. Eur Menopause J 1996;3:94-98
40. Potthoff P, Heinemann LAJ, Schneider HPG, et al. Menopause-Rating Skala (MRS):
Methodische Standardisierung in der deutschen Bevölkerung. Zentralbl Gynakol
2000;122:280-286
41. Schneider HPG, Heinemann LAJ, Rosemeier HP, et al. The Menopause Rating Scale (MRS):
Reliability of scores of menopausal complaints. Climacteric 2000;3:59-64
42. Schneider HPG, Heinemann LAJ, Rosemeier HP, et al. The Menopause Rating Scale (MRS):
Comparison with Kupperman index and quality-of-life scale SF-36. Climacteric 2000;3:50-58
43. Schneider HPG, Schultz-Zehden B, Rosemeier HP, et al. Assessing well-being in menopausal
women. In Studd J, ed. The Management of the Menopause: The Millennium Review 2000.
New York, London: Parthenon Publishing, 2000:11-19
44. Wiklund I. Methods of assessing the impact of climacteric complaints on quality of life.
Maturitas 1998;29:41-50
45. Schneider HPG, Heinemann LAJ, Thiele K. The Menopause Rating Scale (MRS): Cultural and
linguistic validation into English. Life Med Sc Online 2002;3:DOI:10.1072/LO0305326
46. Heinemann LAJ, Potthoff P, Schneider HPG. International versions of the Menopause Rating
Scale (MRS). Health Qual Life Outcomes 2003;1:28 http://www.hqlo.com/articles/browse.asp
47. Jacobs P, Hyland ME, Ley A. Self rated menopausal status and quality of life in women aged
40-63 years. Br J Health Psych 2000;5:395-411
48. Utian WH. The mental tonic effect of oestrogens administered to oophorectomised females. S
Afr Med J 1972;46:1079-1082
49. Utian WH, Janata JW, Kingsberg SA, et al. The Utian Quality of Life (UQOL) Scale:
development and validation of an instrument to quantify quality of life through and beyond
menopause. Menopause 2002;9:402-410
50. Hogervorst E, Yaffe K, Richards M, et al. Hormone replacement therapy for cognitive function
in postmenopausal women. Cochrane Database Syst Rev 2002;CD003122
51. Shumaker SA, Legault C, Rapp SR, et al.; WHIMS Investigators. Estrogen plus progestin and
the incidence of dementia and mild cognitive impairment in postmenopausal women: the
Women's Health Initiative Memory Study: a randomized controlled trial. JAMA
2003;289:2651-2662
52. Heinemann LAJ, DoMinh T, Strelow F, et al. The Menopause Rating Scale (MRS) as outcome
measure for hormone treatment? A validation study. Health Qual Life Outcomes 2004;2:67
53. Hays J, Ockene JK, Brunner RL, et al. Effects of estrogen plus progestin on health-related
quality of life. N Engl J Med 2003;348:1839-1854
54. Garratt A, Schmidt L, Mackintosh A, et al. Quality of life measurement: bibliographic study of
patient assessed health outcome measures. BMJ 2002;324:1417-1421
55. Schultz-Zehden B. FrauenGesundheit in und nach den Wechseljahren. Die 1000
Frauenstudie. Gladenbach: Verlag Kempkes, 1998
56. Haines CJ, Xing SM, Park KH, et al. Prevalence of menopausal symptoms in different ethnic
groups of Asian women and responsiveness to therapy with three doses of conjugated
estrogens/medroxyprogesterone acetate: the Pan-Asia Menopause (PAM) study. Maturitas
2005;52:264-276
57. Limpaphayom KK, Darmasetiawan MS, Hussain RI, et al. Differential prevalence of
quality-of-life categories (domains) in Asian women and changes after therapy with three doses of
conjugated estrogens/medroxyprogesterone acetate: the Pan-Asia Menopause (PAM) study.
Climacteric 2006;9:204-214
58. Derogatis L, Rust J, Golombok S, et al. Validation of the Profile of Female Sexual Function
(PFSF) in surgically and naturally menopausal women. J Sex Marital Ther 2004;30:25-36
