
Statistical methods have great value in medicine and should be of interest to practicing physicians

Statistics in the Practice of Medicine
Robert G. Hoffmann, PhD, Gainesville, Fla.

From the J. Hillis Miller Health Center, University of Florida.
Downloaded From: https://jamanetwork.com/ by a The Ohio State University Health Sciences Library User on 04/06/2019

PRACTICING PHYSICIANS and statisticians may not seem to have many mutual interests, but perhaps they have more in common than is apparent at first. The purpose of this article is to describe a few statistical tools and to illustrate how these tools may be useful in the practice of medicine.

As a starting point, consider a fairly frequently encountered medical problem, diabetes mellitus. Some physicians believe that diabetics are best maintained at slightly elevated blood-glucose levels, but others feel that they should be maintained within limits of clinical normality. Which group is right, or are they both right? And what is clinical normality and how is it determined? Suppose we consider the question of clinical normality first, partly because it is relatively easy to discuss and partly because it is a portion of the problem of maintaining diabetics in control.

Normal Values

According to one textbook,1 the normal range for level of blood glucose after fasting (which is just called glucose after this) is 70-110 mg/100 ml of whole blood. This may not be true for a given laboratory, because methods for determining glucose concentrations differ among various laboratories. Suppose we decide to see whether this range is reasonable for a specific laboratory. The first step is to establish clinical criteria for normality and admit subjects to the study only if they meet the clinical criteria.

How many normal subjects would be required? There is no way to answer this question without first specifying the accuracy to which the normal limits are wanted, and an estimate of normal subject variability is also needed. Information is available about some tests in the literature,2 so the literature data will be used. The results of testing 60 clinically normal subjects are shown in Table 1.

Table 1. Frequency Distribution of 60 Normal Subjects by Blood Glucose

(1) Blood Glucose,   (2) Tally              (3) Subjects,   (4) Cumulative   (5) Cumulative
    Mg/100 Ml                                   No.             Frequency        %
70-74                |                       1               1                1.7
75-79                ||||| ||                7               8                13.3
80-84                ||||| ||||| |           11              19               31.7
85-89                ||||| ||||| ||||| |     16              35               58.3
90-94                ||||| ||||| ||          12              47               78.3
95-99                ||||                    4               51               85.0
100-104              ||||                    4               55               91.7
105-109              ||                      2               57               95.0
110-114              ||                      2               59               98.3
115-119              |                       1               60               100.0
Total Subjects: 60

Table 1 was formed simply by setting up the glucose scale shown in column 1; tallying the glucose values from the list supplied by the laboratory (column 2); and adding up the tally marks as shown in column 3. The cumulative frequency shown in column 4 is obtained by successively adding numbers of subjects beginning at the top of the table. The cumulative per cent in column 5 is
the cumulative frequency converted to a per cent of total subjects. For the moment ignore these cumulative values.

The discussion of normal values was started partially as a means of illustrating statistical techniques and how they may be useful in the practice of medicine. With the information displayed in the first three columns of Table 1 we are ready to begin discussion of analogies between medical and statistical problems and approaches to them.

Apart from blood glucose level, how is a diagnosis of diabetes made? From the clinical point of view, it is a pattern of responses made by a patient, with the pattern consisting of certain signs and symptoms exhibited by the patient. Statisticians look for "response patterns" too, but they look for them in groups of measurements such as the glucose tests shown in Table 1. The pattern is shown by the numbers of subjects having given glucose values. Note, for example, that the "most popular" glucose value is in the range 85-89 mg, with 16 subjects. Values become increasingly less popular the farther they lie from this range on either side.

The pattern exhibited by the normal glucose subjects is most easily seen in chart form; Fig 1 exhibits the data graphically. For a given value, the number of subjects is shown by the bar height above the value. Patterns such as this are called "frequency distributions" because the 60 subjects have been distributed according to their glucose values, and the bar heights correspond to the frequency.

These laboratory results may now be compared with the textbook range mentioned previously, 70-110 mg. Even a gross appraisal of the situation reveals that differences exist between the data and the textbook range. None of the values for the normal subjects reached the lower limit of the textbook range, but several exceeded the upper limit. It looks as though it would be best to establish a range from the data. To help do this some statistical tools will be used, but before they are discussed, consider again what is meant when we say "normal range."

With unlimited time and facilities millions of normal subjects could be tested for glucose. The result should be a distribution similar to that shown in Fig 1. But it is not practical to test millions of normal subjects, so we must make best use of the information we have: the 60 subjects. To do this, an assumption must be made. The hypothetical millions of normal subjects would produce a smooth, symmetrical distribution curve similar to the one seen in Fig 1. The only way to test the reasonableness of this assumption is to actually test several hundred normal subjects and see what the distribution looks like; but apparently no one has done this.

The smooth curve shown in Fig 1 is actually a mathematical expression which is usually referred to as the "normal distribution." As far as medicine is concerned, this is an unhappy name because it is only a mathematical expression. Happily, however, it has also been named after a famous mathematician, Gauss, so it will be referred to as the gaussian distribution.

The gaussian distribution is of great importance in statistics and the applications being discussed, but in an article only a few of its features can be described. The gaussian distribution shown in Fig 1 was fitted (by certain mathematical procedures) to the 60 normal blood sugar values.

There are two reasons why a mathematical expression is sought to describe the glucose distribution: first, to summarize the 60 values in a convenient form and, second, to use this summary to estimate what would be obtained with a much larger series of measurements. Remember, what is sought is a clinically normal range using the information obtained in testing 60 normal subjects.

One feature of the gaussian distribution is that it never touches the bottom of the chart (abscissa) on which it is plotted. From the practical point of view, this means it would admit clinically normal subjects with very high and very low glucose values, provided a large number of subjects was examined. This means that to use the gaussian distribution to establish the normal range, its tails must be chopped off. The two chopping points will define the normal range. The problem at the moment is how to define and then determine these two points. To solve it, knowledge of a few mathematical properties of the gaussian distribution is needed.

The distribution is a bell-shaped curve (as shown in Fig 1) which is determined by two quantities, the estimates of which are computed from the measurements. These are the mean and the standard deviation. Here is a little about these two quantities.

Mean estimate
Symbolic expression: $\bar{x} = \frac{\sum x}{n}$
How computed: Add up all values and divide by number of values.

Standard deviation estimate
Symbolic expression: $s = \sqrt{\frac{\sum (x - \bar{x})^2}{n - 1}}$
How computed: Subtract mean from each value, square each difference, sum the resulting squares, divide by one less than the number of values, and, finally, take the square root.

From the practical point of view, the mean locates the curve. If, for example, 100 mg is added to each of the 60 glucose values, the curve would just slide up the glucose scale by 100 mg.

The standard deviation describes how "fat" or "skinny" a given normal distribution is. Note that if all the values were the same as the mean, the standard deviation would be zero.

It is not obvious from what has been said up to
this point, but it is possible to plot a gaussian distribution so that it has a mean of zero and a standard deviation of unity. These are called unit gaussian curves, and the unit of measurement is the standard deviation. This is a very important point because when a standard deviation is computed from a set of measurements, the units are the same as the measurements. Another feature of the gaussian curve is that about 95% of the area enclosed by the curve is contained within the range defined by both adding and subtracting two standard deviations to and from the mean. The two standard deviations can be used to chop off the tails of the gaussian curve, although any multiples of standard deviations could be taken as the chopping points. We are now ready to define the normal range from the 60 normal subjects.

Fig 1. Frequency distribution of 60 clinically normal subjects by fasting blood glucose. The bell-shaped curve is a gaussian distribution fitted to the normal subjects' values. (Horizontal axis: blood glucose, mg/100 ml.)

Fig 2. Cumulative frequency distribution of 60 clinically normal subjects by fasting blood glucose, plotted on normal probability paper. The straight line is a theoretical gaussian distribution. (Horizontal axis: blood glucose, mg/100 ml, end points of intervals.)

Fig 3. Two theoretical gaussian distributions summed to simulate a composite distribution. The two gaussian components represent "clinically normal" and "sick" subjects. (Horizontal axis: arbitrary scale.)

Fig 4. Two theoretical distributions and their sum, representing a composite, on probability paper. One component represents "clinically normal," the other "sick" subjects. (Horizontal axis: arbitrary scale.)

Normal limits are arbitrarily defined to be the points which enclose 95% of the values obtained by testing clinically normal subjects. The mean and standard deviation for the 60 normal subjects are:

Mean: $\bar{x} = 89.8$ mg/100 ml of blood
Standard deviation: $s = 9.6$ mg/100 ml of blood
Upper limit: $\bar{x} + 2s = 89.8 + 19.2 = 109$ mg/100 ml of blood
Lower limit: $\bar{x} - 2s = 89.8 - 19.2 = 71$ mg/100 ml of blood
The computed limits are fairly close to the textbook values.
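The arithmetic behind these limits can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name `normal_limits` and the five values in `sample` are made up for the example, and only the mean of 89.8 and standard deviation of 9.6 come from the article.

```python
from statistics import NormalDist, mean, stdev


def normal_limits(center, sd, k=2.0):
    """Return (lower, upper) limits lying k standard deviations from the mean."""
    return center - k * sd, center + k * sd


# Summary values quoted in the article for the 60 normal subjects.
x_bar = 89.8  # mean, mg/100 ml
s = 9.6       # standard deviation, mg/100 ml

lower, upper = normal_limits(x_bar, s)
print(f"normal range: {lower:.1f} to {upper:.1f} mg/100 ml")

# Share of a unit gaussian curve's area lying within two standard deviations.
unit = NormalDist(mu=0.0, sigma=1.0)
coverage = unit.cdf(2.0) - unit.cdf(-2.0)
print(f"area within 2 SD: {coverage:.3f}")

# With raw measurements, the two estimates come from the formulas in the text.
# These five values are hypothetical and serve only to show the calls.
sample = [82.0, 88.0, 91.0, 95.0, 104.0]
print(mean(sample), stdev(sample))  # stdev divides by n - 1
```

Rounded to whole milligrams, the computed limits are the 71 and 109 given above, and the area within two standard deviations comes out slightly above the "about 95%" quoted in the text.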
What has just been done may seem like a great deal of trouble to obtain normal limits, and the assumptions could certainly be questioned. There is this about what has been done, however, that is well worth keeping in mind: Given the same set of 60 glucose measurements, anyone, anywhere, anytime would obtain the same normal range if he made the same assumptions and went through the same set of computations. Assumptions must be made unless definite information exists, and in a problem of this kind, we will always be faced with some uncertainty.

One practical difficulty with obtaining normal limits as has just been done is the fact that a group of clinically normal subjects had to be obtained and tested. To do this for large numbers of subjects is not possible for most laboratories. What is about to be described is a method for obtaining normal value information without testing any selected group of clinically normal subjects. We will simply capitalize on the fact that most of the patient specimens sent to laboratories for analysis are normal with respect to many tests. To do this a simple statistical technique is needed, which is now described.
The gaussian, or any other, distribution can be plotted in cumulative form. Refer again to Table 1 for the glucose values tabulated in cumulative form. In graphic form, the curve is "S" shaped, rising to a maximum of the total number of measurements. For a given glucose value the curve shows the number of subjects having that value or a lower one.

The glucose tests shown in Fig 1 were plotted on ordinary graph paper, using an arithmetic scale for the glucose values. The next illustration, Fig 2, shows the glucose data plotted on a special graph paper called normal (gaussian) probability paper. The cumulative form of the distribution is used. Note that the glucose scale is unchanged from the previous plot (Fig 1) but the number of subjects (shown as a percentage of the total) along the side is not an arithmetic scale. This special graph paper serves the useful purpose of "straightening out" a cumulative gaussian distribution. It forms a straight line. This feature allows a given set of data to be plotted on probability paper and a normal distribution "fitted" to it. A ruler is used to draw a straight line through the points. This is called an "eye fit." The points around the 50% point (side scale) are given the greatest weight.

This is all that is needed in the way of statistical techniques to solve the problem of obtaining normal values from clinical laboratory data.

Before actual laboratory data are considered, however, charts which have been prepared to show how mixtures of gaussian distributions look when plotted on probability paper should be considered.

Fig 5. Cumulative distribution of 449 tests of fasting blood glucose. These values represent routine tests of patients' specimens. They are plotted on normal probability paper and a gaussian curve is eye-fitted to the "clinically normal" component. (Horizontal axis: blood glucose, mg/100 ml.)

Fig 6. Frequency distribution of 449 tests of fasting blood glucose. These values represent routine tests of patients' specimens. A gaussian curve has been fitted to the "clinically normal" component. (Horizontal axis: blood glucose, mg/100 ml; the curve is labeled "gaussian estimate of the clinically normal component.")

Laboratory data consist of specimens from healthy as well as sick persons. Gaussian distributions can be used to describe the values from sick as well as healthy persons, but, starting with the values as they come from the laboratories, the distributions are mixed.

Fig 3 shows two gaussian distributions plotted on regular graph paper computed to simulate a
mixed or composite distribution. The two gaussian distributions are also shown separately in the figure. The one on the left represents clinically normal subjects and the one on the right, the sick subjects. They were computed so that out of every group of ten subjects, eight were normal and two were sick. The standard deviation for the sick subjects was also chosen to be five times the magnitude of the standard deviation of the normal subjects.

Fig 4 shows these distributions plotted on probability paper. This figure deserves careful study. Note that the composite distribution can be described by not one, but two straight lines. Each of these two lines corresponds to one of the two distributions shown in Fig 3. Note how closely the normal component of the composite corresponds to the normals plotted separately. Probability paper has separated the two components for us.

The separation is not perfect because the slope of the line which would be used to estimate the normals is not quite as steep as it should be. This means that a graphic estimate of the standard deviation, on which the limits of clinical normality are based, will be a little too broad, but if the example is reasonable, the bias is not very large. The bias in the sick component is much greater, but estimating this component is not part of the immediate goal.

This statistical technique may now be put to work on a real set of data. Fig 5 shows a probability plot of 449 fasting blood-glucose values from the laboratory.* A straight line has been fitted by eye to the component which represents clinically normal subjects. At the 2.5% and 97.5% points (side scale) horizontal lines have been projected to the line representing the clinically normal component. At their intersections, lines have been dropped to the glucose scale at the bottom of the chart. These mark the clinically normal range, which is 55-120 mg. Previously (Fig 4) it was shown that the normal range would be a little too broad, so it might be better to take the 5% and 95% points as the chopping points. If this were done, then the limits would be 60-115 mg, a range that coincides fairly closely with the one quoted from the textbook. A mathematically more sophisticated method exists3 for dissecting mixed distributions, but it is a little too intricate to discuss here.

The blood glucose data are plotted in the form of a regular frequency distribution in Fig 6. The clinically normal component is shown as a gaussian distribution. The standard deviation for the normal component was estimated from the cumulative distribution in Fig 5. Note that the gaussian estimate is a little too wide but not grossly so. The height of the normal component was chosen so as to coincide with the height of the distribution of laboratory values.

The discussion of uses of statistics in the practice of medicine was introduced by considering a question concerning the treatment of diabetics. One portion of the problem involved normal values. We have seen that with the aid of some simple statistical techniques a basis for normal limits can be defined reasonably exactly. A method has been suggested for obtaining normal limits without testing any special group of clinically normal subjects. This statistical technique can be used for obtaining any normal values in medicine where a group of measurements is available and the mathematical assumptions are reasonable.

The blood glucose data used in the illustrations were selected from a large body of laboratory data involving many tests and from several laboratories. These data were chosen so that reasonable agreement was obtained when they were compared with textbook data. (In contrast to the normal range of 60-115 mg obtained from the laboratory data illustrated, data from another laboratory yielded a range of 88-135 mg. The same statistical procedure was used for both sets of data.) Anyone who uses these techniques will not necessarily find such good agreement. If disagreement is substantial, which is right, the textbook or the estimates using the statistical techniques? Perhaps it would be more appropriate to ask, "Which is the best normal range?" No range is likely to be wholly right or wrong. With regard to the normal range based on current clinical values, this feature might be considered: The estimate is based on the local situation, and this is important. The estimates are based on values as they are being reported locally, whether they agree with textbooks or not.

Fig 7. Frequency distributions of 500 tests of fasting blood glucose. One distribution represents laboratory work in a general hospital and the other represents work done in a nearby private laboratory. (Horizontal axis: blood glucose, mg/100 ml.)

There is another point about the normal range problem which might be considered. Regardless of the source of the range, it is assumed that the laboratory testing procedure is stable. At present, the stability of many laboratory procedures could be improved. Pathologists are at work on the problem by means of their quality control programs, but much remains to be done. More will be said
about quality control programs later on in this manuscript.

How do we know whether two groups of measurements differ from one another? The normal ranges just discussed differed somewhat from one another, but all were based on the examination of limited numbers of measurements. Problems where limited numbers of measurements are available are those where statistical techniques can be helpful, so they will be discussed next.

For example, if some patient specimens were tested in a private laboratory, it would be pertinent to inquire as to whether the private laboratory's results coincide with those obtained in a nearby hospital's laboratory. For given laboratories, a good means of obtaining information is to send aliquots of a "standard sample" to both laboratories and compare the reported results. All of this takes extra time and work, however, so the clinical values will again be used as a source of information. From the statistical point of view, this will be an example of comparisons among groups.

Comparisons Among Groups: I

The problem is to determine whether a hospital laboratory is reporting values comparable to those reported by a private laboratory. To obtain information pertinent to the problem, 500 consecutively determined blood glucose values obtained after fasting were copied from the records of the hospital and the private laboratory. In the hospital laboratory, the glucose oxidase method was in use and in the private laboratory the Folin-Wu method was being used. Frequency distributions were tabulated, and the results are shown in Fig 7.

At this point in the preparation of the paper, work was stopped. The glucose distributions are so obviously different that no statistical methods are needed to detect the difference, as might have been expected from the different testing methods.

The private laboratory is located across the street from the hospital, so a single group of physicians will send patients to either laboratory. Surely the physicians were aware that glucose results were different for the two laboratories, but the difference did not help the physicians in the care of their patients. Was there something that could be done to help the physicians without disturbing the laboratories to any great extent? A method was found, which is about to be described. This is not exactly conventional statistics, so we digress from our description of comparisons among groups. What is about to be done is to change the units of measurement, which for glucose are milligrams per 100 ml of blood, to some new units of measurement. These new units are arbitrarily called "normal quotient values."

Fig 8. Frequency distributions of 500 tests by fasting blood glucose. One distribution represents values from a general hospital laboratory and the other, values from a private laboratory. The measurement scale (mg/100 ml of blood) has been transformed to "normal quotient values." (Horizontal axis: normal quotient values.)

Table 2. Blood Urea Nitrogen Test Results (Mg/100 Ml)

First Period   Second Period
15             14
10             17
12             12
16             16
20             22
13             13
18             18
22             11
14             15
21             16

Table 3. Blood Urea Nitrogen Averages and Variances for Two Different Periods

           First Period   Second Period
Average    16.1           15.4
Variance   16.3           10.3

Normal Quotient Values

Before the proposed units of measurement are described, a few words about the medical measurement problem seem to be in order. There are certainly hundreds, if not thousands, of different kinds of measurements made in medicine.4 Pediatricians are concerned with growth rates; radiologists are concerned with organ sizes; gynecologists are concerned with pelvic measurements; and almost everyone is concerned with clinical laboratory measurements. Many kinds of measurement scales are in use because of the diverse nature of the things that are measured.

What is proposed here is a single measurement scale that could be substituted for many of the existing measurement scales. The proposed scale is based on the standard deviation of clinically normal subjects. In other words, it is based on the biological rock of clinical normality. Transforming any measurement scale to one based on the standard deviation of normal subjects is all that is required. First, determine the standard deviation of the normal subjects as described earlier and, second, divide each original scale value by the standard deviation just computed. The units of measurement are now standard deviations. This is a somewhat awkwardly narrow scale because of the large number of subjects which are included over a range of one standard deviation, so each new
scale value is now multiplied by 5. Only one more step is needed in the transformation.

To the new scale values just computed add (or subtract) a constant determined so that the mid-point of the range of clinically normal subjects is 100. The number 100 was chosen because it is easy to remember, has three significant figures in it, and contains no decimal points. This completes the transformation, and the resulting units are called "normal quotient values." In symbolic notation, this is all that has been done:

$y = \frac{5x}{\sigma} + k$

where $y$ is the normal quotient value;
$x$ is a value on the original measurement scale;
$\sigma$ is the standard deviation of normal subjects;
$k$ is a constant chosen so that the mid-point of the range of normal quotient units is 100.

In actual practice, the scale transformation need be computed only once. In clinical laboratories no additional work would be required to report, in normal quotient units, results of testing patients' specimens, after the initial transformation had been computed. Of course a transformation would be required for each test procedure.

The results of transforming the two glucose distributions shown in Fig 7 are seen in Fig 8. Note how similar the two distributions are now. The mid-point of clinically normal subjects is 100 units; the clinically normal range is 90-110 units (± 2 standard deviations) and subjects with values outside the normal range can be judged in terms of deviations of healthy subjects.

Although the distributions seen in Fig 8 are for blood glucose, similar computations could be made for any reasonably large group of measurements. The resulting distributions might not be the same, but the mid-point of the normal subjects would be 100. The range of clinical normality would also be 90-110 normal quotient units. If the private and hospital laboratories were reporting their results in normal quotient units, the physicians' problems about differences between laboratories would no longer exist.

So much for normal quotient units for the present. Other potential uses for them will be mentioned later in connection with a few remarks about electronic computers. For the moment, however, a return to the problem of comparisons among groups is in order.

Fig 9. Simulated distributions of "healthy" and "sick" subjects. An area of uncertainty exists where the two distributions overlap each other. (Labels: healthy subjects, sick subjects, false negative tests, false positive tests, upper limit of normal range. Horizontal axis: measurement scale.)

Fig 10. Frequency distribution of 500 tests by blood urea nitrogen value. These represent results of testing patients' specimens. A normal range obtained from a textbook (8-16 mg) and a computed normal range (5-25 mg) are shown. (Horizontal axis: B.U.N., mg/100 ml.)

Comparisons Among Groups: II

Consider another laboratory problem which arises when a change in testing procedure is made. The data presented in Table 2 came from a hospital laboratory when the blood urea nitrogen procedure was changed from the urease-Nessler method (first period) to the Autoanalyzer (second period). The question the laboratory director wants answered is this: Did the biochemical procedure change affect the blood urea nitrogen test results? We will assume that the two sets of 10 measurements are random samples of normal values from the two periods.

Table 4. Null Hypothesis Tests for Averages and Variances of Blood Urea Nitrogen for Two Periods

           First Period   Second Period   Null Hypothesis Probability
Average    16.1           15.4            0.56
Variance   16.3           10.3            0.25

The laboratory director's question, "Did the biochemical procedure change affect the test results?" cannot be answered without being more specific. The type of change that may have occurred must be specified. There are two things which could have happened. First is a change in level of test results as reflected by the averages for the two periods. Second is a change in the spread or scatter of the test results as reflected in their variance. The variance is the square of the standard deviation. The averages and variances are presented in Table 3. Neither the averages nor the variances are the same for the two periods. Have real changes occurred, or are the differences just due to the particular subjects who happened to be tested during the two periods? This is a typical elementary statistics problem. Information pertinent to the question is obtained by performing what are called "tests for statistical significance." This is what they mean.
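Before turning to the tests themselves, the averages and variances quoted in Table 3 can be checked directly from the Table 2 values. The following is a minimal sketch using Python's standard `statistics` module; the helper name `summarize` is made up for illustration, and the variance is assumed to use the same "one less than the number of values" divisor described earlier for the standard deviation.

```python
from statistics import mean, stdev, variance

# Blood urea nitrogen results (mg/100 ml) as listed in Table 2.
first_period = [15, 10, 12, 16, 20, 13, 18, 22, 14, 21]
second_period = [14, 17, 12, 16, 22, 13, 18, 11, 15, 16]


def summarize(values):
    """Return (average, variance), with the variance dividing by n - 1."""
    return mean(values), variance(values)


avg_1, var_1 = summarize(first_period)
avg_2, var_2 = summarize(second_period)

print(f"first period:  average {avg_1:.1f}, variance {var_1:.1f}")
print(f"second period: average {avg_2:.1f}, variance {var_2:.1f}")

# The variance is the square of the standard deviation.
assert abs(stdev(first_period) ** 2 - var_1) < 1e-9
```

Rounded to one decimal place, the results agree with Table 3: averages of 16.1 and 15.4, and variances of 16.3 and 10.3.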
We hypothesize that there was no change in BUN values when the testing procedure changed. This is called the "null hypothesis," and here is what is done to see whether it is true. We compute the probability of obtaining differences at least as large as those observed if the null hypothesis were true. If the resulting probability is very small then grave doubts exist that the null hypothesis is true and we proceed to reject it; otherwise we accept it. This is something like betting on a horse race. Before a bet is placed we would want to know what the odds are that a given horse would win. Most people would not want to bet much money on a horse where the odds were 20 or 100 to 1 against his winning. The "horse" in our laboratory problem is the null hypothesis, which says that no change occurred in BUN levels with the change in testing procedure. I do not know how odds are determined for horse races, but in statistics they are based on the observed measurements.

There are separate null hypothesis tests for the averages and the variances. For details of how they are computed, reference should be made to a book on statistics. The F test, named after the late Sir Ronald Fisher, will be used for the variances and Student's t test for the averages. Results of the computations yield the probabilities shown in Table 4. The averages and variances are included for easy reference. The null hypothesis probabilities shown in Table 4 are about 1 in 2 for the averages and 1 in 4 for the variances. If this were a horse race, the odds of winning would be good with the null hypothesis. In other words, the probabilities provide no evidence that the BUN values changed when the change from the urease-Nessler method to the Autoanalyzer was made. The results apply of course only to this one laboratory.

The practicing physician would be interested to know that no change occurred in BUN levels with the change to the Autoanalyzer. If the laboratory was reporting normal quotient values, no change at all would be expected.

We have discussed the simplest kind of comparisons among groups, because there were only two groups. Statisticians must be prepared to compute probabilities where any number of groups and any numbers of subjects are involved. Anyone who is interested in pursuing the matter further should do some reading in "The Design of Experiments" chapters in statistics books. Statistical problems involving groups and the null hypothesis just described are of greatest interest in medical research rather than practice. There are, however, many kinds of "groups" problems. The one described very briefly below is probably of greatest interest to practicing physicians. Physicians would call it a problem in diagnosis and statisticians would call it a discrimination problem.

Fig 11. A hypothetical "illness likelihood curve." Height of the curve indicates for each point on the measurement scale the percentage of subjects who are sick. (Vertical axis: per cent of subjects who are sick. Horizontal axis: measurement scale, with the upper limit of the normal range marked.)

Discrimination Among Groups

In earlier discussions of normal limits, distributions of values from clinical laboratories were regarded as composite distributions. A composite distribution is shown again in Fig 9. For the moment, ignore the vertical lines enclosing areas identified by a, b, and c. The problem now is to determine how effectively whatever is being measured separates the healthy from the sick subjects. It is obvious from study of Fig 9 that the point on the measurement scale which will misclassify the fewest subjects, ie, give the smallest number of false positive and false negative subjects, is the point where the two distributions cross each other. This is shown in Fig 9 as the upper limit of the normal range. An assumption is made, however, that the consequences of misclassifying a patient (as positive or as negative) are equal. If this is not true, then a different point must be chosen as the upper limit of the clinically normal range. It should also be apparent that the problem is complex from many points of view. For example, the relative numbers of healthy and sick subjects must be considered and their numbers will depend on local circumstances. Perhaps the complexity of the problem is in part responsible for the remarkably little work done on it. To my knowledge, essentially
Perhaps this is not a fair way of illustrating sta¬ one paper has been written on the subject.5 Refer¬
tistical methods, but far more laboratory data were ence is made to the simpler types of discrimination
collected than the two groups of 10 measurements problems. Attention has and is being given to the
just described. In fact about 150 BUN values for more complex problems. The problems and meth¬
each of two periods were obtained. A plot of their ods of approach to them are being described under
frequency distributions (not shown here) reveals the general headings of "Uses of Computers in
nothing which could be interpreted as a change in Medicine" and "Discriminant Functions."
levels. The point here is that the test for statistically No matter how complex the problem, however,
significant differences is confirmed when larger it is worth noting that obtaining information about
numbers of values are available. some local problems is very easy. Consider a speci-

Downloaded From: https://jamanetwork.com/ by a The Ohio State University Health Sciences Library User on 04/06/2019
relative proportions of healthy and sick subjects.
It would be easy, however, to keep track of these
proportions for many measurements. In the next
section, a method is described for doing this as
X =
22.6 well as for doing other things at the same time.
S 3.6
The usefulness of "illness likelihood" curves re¬
=

mains to be tested, however, so the discussion is


continued in terms of normal limits.
Regardless of how normal limits are determined,
it is assumed that the laboratory testing procedure
-i-1-r i-r-1-r
14- 16- 18 24- 26- 28- 30- is stable. This means that a blood specimen which
NUMBER OF PLUS TESTS has 18 mg of urea nitrogen in it is always reported
from the laboratory as containing approximately 18
Fig 1 2.—Frequency distribution of 33 "number plus" points by num¬
ber of plus tests. Each number plus point represents, out of 50 con¬ mg. The control of laboratory accuracy is a difficult
secutive routine BUN tests, the number having 15 mg/100 ml or and ever-present problem. Statistical methods can
more of BUN. contribute toward a solution of this problem. A
more complete description of some methods has
fie example. Fig 10 shows
a distribution of blood been accepted for publication elsewhere,6 so only
urea nitrogen for 2-months' period. This is a
a an outline of them is given here.
distribution of values resulting from testing pa¬
tients' specimens. In fact, some of these values Control of Laboratory Accuracy
were considered earlier in connection with the If limits of clinical normality are to have mean¬
discussion of comparisons among groups. The text¬ ing, the laboratory testing procedure must be
book range of clinical normality is shown on the stable. On the other hand, if the laboratory testing
figure as well as a normal range computed with the procedure is not stable, then the limits will shift.
aid of normal probability paper. Which normal This is a clue toward use of the clinical values for
range is best, ie, misclassifies the fewest patients? the purpose of laboratory control.
It should not be much work to review a few pa¬ Note the vertical line drawn through the BUN
tients' charts and see what happened in the areas distribution shown in Fig 10. Approximately 70% of
where the two ranges do not coincide. At least all BUN tests have values above this line, ie, are
some information would be obtained at no great 15 mg or more. If the laboratory testing procedure
amount of effort. There is nothing either to keep is stable, approximately 70% of the values will be
from establishing more than the two groups, normal 15 mg or more in consecutive groups of laboratory
and sick. tests. This is the basis of the control method.
The normal range could be dispensed with al¬ In order to easily identify tests, the values
together and what might be called an "Illness Like¬ falling to the right of the vertical line are arbi¬
lihood Curve" could be computed. The idea is a trarily called "plus tests." The vertical line is de¬
very simple one. Looking again at Fig 9 and noting termined by the highest point on the distribution
the areas of the curves identified as a, b, and c, (in statistical language called the mode). The con¬
one can see that if a patient was reported to have trol procedure consists of simply counting the
a test value in area a, he obviously would be number of "plus tests" in consecutive groups of 50
healthy as far as this test is concerned. Patients tests and plotting the "number plus" on a control
with values falling in area b, however would be chart. For details reference should be made to the
mostly healthy, but some of them would be sick. paper about the method.
In area c, the situation has changed completely Physicians might be a bit suspicious of a control
and all patients with test values this high or higher procedure that is based on testing patients' speci¬
would be sick. All that is done is to compute the mens. How do we know whether a group of speci¬
percentage of sick patients at each point on the mens might not be sent to the laboratory at one
measurement scale. In area a this is zero; for area b time and all of them have high nitrogen concen¬
it is about 10%; and for area c it is 100%. This trations? A partial answer to this question is seen
resulting curve will thus indicate an "illness likeli¬ by study of Fig 10. For example, only about one
hood" in terms of percentage; hence the name. Fig specimen in five has value of 30 mg of BUN
a
11 shows an hypothetical illness likelihood curve. or more, so it is very unlikely that large numbers
For any point on the measurement scale the likeli¬ of high values will come in at the same time. Prob¬
hood of a subject's being sick can be seen at a ably more convincing evidence is found in Fig 12.
glance. The point where the healthy and sick Fig 12 shows a frequency distribution of 33
curves (Fig 9) cross each other is the point where number plus points. Each unit in the distribution
a subject would have a 50% chance of
being sick. is the number of tests out of a group of 50 that
Many things would influence a given illness had a value of 15 mg of BUN or more. It thus
likelihood curve, the most important of which is the represents 1,650 consecutive BUN tests. Note that

The distribution in Fig 12 suggests the familiar bell-shaped gaussian distribution. The laboratory testing procedure as reflected by this distribution was reasonably stable over the period represented by these tests. Large numbers of high-valued specimens are not received at the same time and the proportion of healthy and sick subjects is reasonably stable.

Most practicing physicians are not directly concerned with operating clinical laboratories, but they are concerned about laboratory accuracy. They want information from the laboratory about normal values and they want assurance that the testing procedures are stable. In other words, they want information. Both producer and consumer of information should agree on how the information is produced and organized if misunderstandings are to be avoided. This brings up a major point of this paper: The greatest value of statistics to the practicing physician is organized information. Statistics offers the organizing methods, but the information must come from medicine itself.

Summary and Comment

A few features of some statistical methods have been described with emphasis on applications of interest to practicing physicians. The manner in which frequency distributions are formed, how theoretical distributions can be used as a base for limits of clinical normality, how measurements are used to separate healthy from sick patients, and how information pertaining to the control of clinical laboratory measurements is easily obtained from the results of testing patient specimens are all outlined. These are only a few examples of potential uses of statistics in the practice of medicine. Other examples appear in my statistics text.7

Common to all the above are essentially three things:

1. Statistical services must be available.
2. Information about specific problems must be collected.
3. Communication channels for the information must be established; otherwise it will be wasted.

For some problems, little time or effort is required to gather and organize information. The clinical laboratory should be able to provide distributions of laboratory test values, and the medical records department can perform a similar function using information obtained from patients' charts. Reference here is made to problems of local interest, for example those affecting a single hospital staff or a county medical society. The applications, however, are not so limited, as will be discussed presently. Large-scale applications, however, suggest mechanical means of handling information because of the extensive amount of work involved. Computers can be helpful for some applications suggested here. For example, a computer program is being devised to do the following:

Fed into the machine will be 500 or 1,000 consecutively determined laboratory measurements and the date on which each test was performed. First it will check the measurements for a shift in the laboratory testing procedure and it will note the results of its check. Second, it will tabulate a frequency distribution of the values. If a shift has occurred in the testing procedure, correction will be made for the shift before the distribution is tabulated. Third, it will separate the "clinically normal component" from the remainder of the distribution. The separation will include computing the means and standard deviations of all the resulting component distributions. Fourth, it will compute a "normal quotient scale." Finally it will compute an "illness likelihood curve."

All of this will be done automatically. Similar kinds of things could be done for other medical measurements.

The above uses of statistics are described in terms of local problems. There is no reason why some problems cannot be studied on a state or nation-wide basis. A few details would have to be omitted, but other information would be obtained. For example, if distributions of growth rates of children were submitted from each of 1,000 pediatricians scattered over the country, it soon would really be known how fast children are growing. Many of the statistical methods outlined here would be useful for this and other similar problems. Information that is now lying fallow in records would be organized and put to use.

It should be clear by this time that properly used statistical methods can provide a set of aids to the practicing physician. These aids would be abstracts of information organized by statistical methods which would allow the physician in a sense to share the experiences of hundreds or thousands of others. Organized information is certainly not new to medicine. A hospital chart is very carefully organized, but it describes only a single patient. The kinds of things discussed here involve only a single aspect of many patients. Each kind of information has its own uses and limitations. Each should complement the other.

The values in Fig 5 were furnished by John B. Henry, MD, Director of the Clinical Laboratory, J. Hillis Miller Health Center. An Autoanalyzer was used to test the specimens, which represent routine clinical work.

References

1. Cecil, R.L., and Loeb, R.F.: A Textbook of Medicine, 10th Ed, Philadelphia: W. B. Saunders Co., 1959.
2. Lozner, E.L., et al: Intravenous Glucose Tolerance Tests, J Clin Invest 20:507, 1941.
3. Hald, A.: Statistical Theory with Engineering Applications, New York: John Wiley and Sons, Inc., 1952.
4. Sunderman, F.W., and Boerner, F.: Normal Values in Clinical Medicine, Philadelphia: W. B. Saunders Co., 1950.
5. Dunn, J.E., Jr., and Greenhouse, S.W.: Cancer Diagnostic Tests, Principles and Criteria for Development and Evaluation, Federal Security Agency, Public Health Service, National Cancer Institute, Publication No. 9, Washington, D.C.: US Government Printing Office, 1950.
6. Hoffmann, R.G., and Waid, M.E.: The Number Plus Method of Quality Control of Laboratory Accuracy, Amer J Clin Path, to be published.
7. Hoffmann, R.G.: Statistics for Medical Students, Springfield, Ill.: Charles C Thomas, Publisher, 1963.
