NCSLI Measure
www.ncsli.org
TECHNICAL PAPERS
Authors
Jonathan Harben
The Bionetics Corporation
M/S: ISC-6175
Kennedy Space Center, FL 32899
jonathan.p.harben@nasa.gov
Paul Reese
Covidien, Inc.
815 Tek Drive
Crystal Lake, IL 60014
paul.reese@covidien.com
In scenarios #2, #3, and #4, this uncertainty makes it possible for the true value of
the measurand to be either in or out of tolerance. Consider scenario #3, where the UUT
was observed at 9.90 V, exactly at the lower
allowable tolerance limit. Under such conditions, barring any other information, there is
a 50 % probability that either an in-tolerance
or out-of-tolerance decision will be incorrect.
In fact, even for standards with the lowest
possible uncertainty, the probability of being
incorrect will remain at 50 % in scenario #3.
This concept of bench level risk is addressed
in several documents [9, 10, 11, 12].
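The 50 % figure for scenario #3 can be sketched numerically. In this minimal example the 10 V nominal value, the ±1 % tolerance, and the standard uncertainty are illustrative assumptions; with no prior knowledge of the UUT, a reading exactly on the limit splits the probability evenly:

```python
# Bench-level risk for a reading exactly on a tolerance limit.
# The 10 V nominal, +/-1 % tolerance, and u_std are assumed values.
from statistics import NormalDist

def out_of_tolerance_prob(observed, lower, upper, u_std):
    """P(true value lies outside [lower, upper]) given one reading with
    standard uncertainty u_std and no prior knowledge of the UUT."""
    true_value = NormalDist(mu=observed, sigma=u_std)
    return 1.0 - (true_value.cdf(upper) - true_value.cdf(lower))

# Observation exactly at the 9.90 V lower limit:
p = out_of_tolerance_prob(9.90, lower=9.90, upper=10.10, u_std=0.005)
print(round(p, 3))  # → 0.5
```

Note that shrinking `u_std` does not move `p` away from 0.5, which mirrors the point above: even the best standard cannot resolve a reading sitting exactly on the limit.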
The simple analysis of the individual
measurement results presented above is not
directly consistent with the intent of the
2 % rule in Z540.3, although it still has
application. Until now, our discussion has
dealt exclusively with bench level analysis
of measurement decision risk. That is, risk
was predicated only on knowledge of the
relationship between the UUT tolerance, the
measurement uncertainty, and the observed
measurement result made on-the-bench.
However, the computation of false accept
risk, for strict compliance with the 2 % rule
in Z540.3, does not depend on any particular
measurement, nor does it depend on its proximity to a given UUT tolerance limit. Instead,
the 2 % rule in Z540.3 addresses the risk at
the program level, prior to obtaining a measurement result. To understand both bench
level and program level false accept risk, the
intent underlying the 2 % rule and its relationship to TUR and EOPR must be examined.
3. The Answer to Two Different
Questions
False accept risk describes the overall probability of false acceptance when pass/fail
decisions are made. False accept risk can be
interpreted and analyzed at either the bench
level or the program level [4]. Both risk levels are described in ASME Technical Report
B89.7.4.1-2005 [13]. The ASME report refers
to bench level risk mitigation as controlling
(Footnote: Bayesian analysis can result in false accept risk
other than 50 % in such instances, where the a
priori in-tolerance probability (EOPR) of the UUT
is known in addition to the measurement result
and uncertainty.)
EOPR = (number of in-tolerance calibration results / total number of calibration results) × 100 %    (1)
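Eq. (1) can be sketched directly; the counts below are hypothetical and serve only to show the arithmetic:

```python
# A minimal sketch of Eq. (1): EOPR as the in-tolerance fraction of a
# calibration history (hypothetical counts).
def eopr_percent(in_tolerance, total):
    """End-of-period reliability as a percentage."""
    return 100.0 * in_tolerance / total

print(eopr_percent(89, 100))  # → 89.0
```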
Figure 2. Previous historical measurement data can influence future false accept risk.
Can a new laboratory open its doors for business
and meet the 2 % false accept requirement
of Z540.3 without EOPR data? The answer
is yes. However, the new laboratory must
employ bench level techniques, or techniques
such as boundary condition methods or
guardbanding. Such methods are described
later in this paper. This same logic would apply to an established laboratory that receives
a new, unique instrument to calibrate for the
first time. In the absence of historical data,
other appropriate techniques and/or bench
level methods must be employed.
If EOPR data or in-tolerance probability
is important for calculating risk, several other
questions are raised. For example, how good
must the estimate of EOPR be before program level methods can be used to address
false accept risk for a population of instruments? When is the collection of measurement data complete? What are the rules for
updating EOPR in light of new evidence?
Sharing or exchanging EOPR data between
different laboratories has even been proposed
with varying opinions. Acceptance of this
generally depends upon the consistency of
the calibration procedure used and the laboratory standards employed. The rules used
to establish EOPR data can be subjective
(for example, how many samples are available, are first-time calibrations counted, are
broken instruments included, are late calibrations included, and so on). Instruments can
be grouped together by various classifications, such as model number. For example,
reliability data for the M&TE model and
manufacturer level can be used to conservatively estimate the reliability of the M&TE
test point. This is addressed in compliance
Methods 1 and 2 of the Z540.3 Handbook [16].
At first, this definition appears to be similar to older definitions of TUR. The definition implies that if the numerator, associated
with the specification of the UUT, is a plus-or-minus (±) tolerance, the entire span of the
tolerance must be included. However, this is
countered by the requirement to multiply the
95 % expanded uncertainty of the measurement process in the denominator by a factor
of two. The confidence level associated with
the UUT tolerance is undefined. This quandary is not new, as assumptions about the level
of confidence associated with the UUT (numerator) have been made for decades.
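The span-over-twice-U95 reading of the definition can be sketched as follows; the tolerance and uncertainty values are assumed for illustration:

```python
# A sketch of the Z540.3 TUR: span of the UUT tolerance divided by
# twice the 95 % expanded uncertainty U95 (values here are assumed).
def tur(lower_limit, upper_limit, u95):
    """Test uncertainty ratio per the span-based definition."""
    return (upper_limit - lower_limit) / (2.0 * u95)

# A +/-0.1 V tolerance measured with U95 = 0.025 V gives a 4:1 ratio:
ratio = tur(-0.1, 0.1, u95=0.025)
```

For a symmetric ± tolerance this reduces to the familiar L / U95 form, since the span 2L and the factor of two cancel.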
There is, however, a distinct difference
between the TUR as defined in Z540.3 and
documented by several authors [18, 20, 21,
22]. However, the integrity of a TUR depends
upon the level of effort and honesty demonstrated by the manufacturer when assigning
accuracy specifications to their equipment.
It is important to know if the specifications
are conservative and reliable, or if they were
produced by a marketing department that was
motivated by other factors.
6. Understanding Program Level False
Accept Risk
Figure 5. Topographical contour map with tolerance limits (±L) and regions
of incorrect compliance decisions.
Measurements are a function of both the UUT error (ε_uut), characterized by the UUT
performance σ_uut, and the observed measurement e_obs with associated uncertainty
σ_std, where σ²_obs = σ²_uut + σ²_std. The relative likelihood of all possible measurement results is
represented by the two-dimensional surface
created by the joint probability distribution given by p(ε_uut, e_obs) = p(ε_uut) p(e_std). Figures 4 and 5 illustrate the concept of probability density of measurement and represent
the relative likelihood of possible measurement outcomes given the variables TUR and
EOPR. It is assumed that the measurement uncertainty and the UUT distribution follow a
normal (Gaussian) probability density function, yielding a bivariate normal distribution.
Figure 5 is a top-down perspective of Fig. 4.
The height, shape, and angle of the joint
probability distribution change as a function
of input variables TUR and EOPR. The dynamics of this are critical, as they define the
amount of risk for a given measurement scenario. The nine regions in Fig. 5 are defined
by two-sided symmetrical tolerance limits.
Risk is the probability of a measurement occurring in either the false accept regions or
the false reject regions. Computing an actual
numeric value for the probability (PFA or
PFR) involves integrating the joint probability density function over the appropriate two-dimensional surface areas (regions) defined
by the limits stated below. Incorrect (false)
acceptance decisions are made when |ε_uut| > L and −L ≤ e_obs ≤ L. In this case, the UUT
is truly out of tolerance, but is observed to
be in tolerance. Likewise, incorrect (false) reject decisions are made when |e_obs| > L and
−L ≤ ε_uut ≤ L, that is, where the UUT is observed to
be out of tolerance, but is truly in tolerance.
Integration over the entire joint probability
region will yield a value of 1, as would be
expected. This encompasses 100 % of the
volume under the surface of Fig. 4. When
the limits of integration are restricted to the
two false accept regions shown in Fig. 5, a
small portion of the total volume is computed
which represents the false accept risk as a
percentage of that total volume.
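The integration over the false accept regions described above can be sketched numerically. Normal distributions are assumed as in the text; the σ values in the usage line are illustrative, not taken from the paper:

```python
# Sketch: false accept risk as the probability that the true UUT error
# lies outside +/-L while the observed value falls inside +/-L.
from math import erf, exp, pi, sqrt

def norm_pdf(x):
    """Standard normal probability density."""
    return exp(-0.5 * x * x) / sqrt(2.0 * pi)

def norm_cdf(x):
    """Standard normal cumulative distribution."""
    return 0.5 * (1.0 + erf(x / sqrt(2.0)))

def pfa(L, sigma_uut, sigma_std, n=20000):
    """Integrate the joint density over the two false accept regions
    (midpoint rule over the UUT error beyond the upper limit)."""
    total, lo, hi = 0.0, L, L + 8.0 * sigma_uut
    h = (hi - lo) / n
    for i in range(n):
        e = lo + (i + 0.5) * h                    # true error beyond +L
        dens = norm_pdf(e / sigma_uut) / sigma_uut
        # probability the observation still lands inside the limits:
        inside = norm_cdf((L - e) / sigma_std) - norm_cdf((-L - e) / sigma_std)
        total += dens * inside * h
    return 2.0 * total                             # symmetric two-sided limits

risk = pfa(L=1.0, sigma_uut=0.5, sigma_std=0.2)    # illustrative sigmas
```

As expected from the discussion above, the computed risk shrinks toward zero as `sigma_std` (the measurement uncertainty) shrinks, since the observation then mirrors the true UUT error.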
In the ideal case, if the measurement uncertainty were zero, the probability of measurement errors e_std occurring would be zero.
The measurements would then perfectly reflect the behavior of the UUT, and the distribution of possible measurement results would
be limited to the distribution of actual UUT
errors. That is, p(e_obs) would equal p(ε_uut).
Figure 6. Surface plot of false accept risk as a function of TUR and EOPR.
Figure 7. Topographical contour map of false accept risk as a function of TUR and EOPR.
8. True Versus Observed EOPR
The difference between observed EOPR and true EOPR becomes larger as the measurement uncertainty increases and the TUR
drops. A low TUR can result in a significant
deviation between what is observed and what
is true regarding the reliability data [23, 28,
29, 30]. The reported or observed EOPR
from a calibration history includes all influences from the measurement process. In this
case, the standard deviation of the observed EOPR data, σ_obs, includes the contribution of the measurement uncertainty σ_std.
Figure 10. PFA assumes worst case TUR for true EOPR and observed EOPR.
Measurement uncertainty always hinders the quest for accurate data; it never helps. The true value of a single data point can be higher or lower than the measured value, and it is never known whether the measurement uncertainty contributed a positive error or a negative error. Therefore, it is not possible to remove the effect of measurement uncertainty from a single measurement result. However, EOPR data is a historical collection of many pass/fail compliance decisions that can be represented by a normal probability distribution with a standard deviation σ_obs. Sometimes the measurement uncertainty σ_std will contribute positive errors and sometimes it will contribute negative errors. If the mean of these errors is assumed to be zero, the effect of measurement uncertainty on a population of EOPR data can be removed as previously shown. The inverse normal function is used to estimate σ_obs from observed EOPR data [31]:

σ_obs = L / Φ⁻¹((1 + EOPR_obs) / 2) ,    (2)

where Φ⁻¹ represents the inverse normal distribution. EOPR is a numerical quantity arrived at by statistical means applied to empirical data, analogous to a Type A evaluation in the language of the GUM [32]. The data comes from repeated measurements made over time rather than from accepting manufacturers' claims at face value (analogous to Type B or heuristic evaluations). However, the influence of the measurement process is always present. This method of removing measurement uncertainty from the EOPR data is a best estimate of the true reliability which is sought through measurement.

9. Guardbanding
It is sometimes helpful to establish acceptance limits A at the time-of-test that are more stringent than the manufacturer's tolerance limits L. Acceptance limits are often called guardband limits or test limits. It is only necessary to implement acceptance limits A, which differ from the tolerance limits L, when the false accept risk is higher than desired or as part of a program to keep risk below a specified level. Acceptance limits may be chosen to mitigate risk at either the bench level or the program level. PFA calculations may be used to establish acceptance limits based on the mandated risk requirements. In most instances where guardbands are applied, the tolerance limits are temporarily tightened or reduced to create acceptance limits needed to meet a PFA goal. The subject of guardbanding is extensive, and novel approaches exist for establishing acceptance limits to mitigate risk even where EOPR data is not available [25]. In the simplified case of no guardbanding, the acceptance limits A are set equal to the tolerance limits L:

A = ±L .    (3)
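The inverse-normal estimate of Eq. (2) can be sketched as below. The guardband helper uses A = L − U95, one simple scheme chosen here purely for illustration; the paper discusses a range of approaches:

```python
# Sketch: Eq. (2) (observed standard deviation from observed EOPR) and
# one simple, illustrative guardband scheme, A = L - U95.
from statistics import NormalDist

def sigma_obs(L, eopr_observed):
    """Standard deviation consistent with symmetric +/-L tolerance limits
    and an observed in-tolerance probability eopr_observed (fraction)."""
    return L / NormalDist().inv_cdf((1.0 + eopr_observed) / 2.0)

def acceptance_limit(L, u95):
    """Guardbanded acceptance limit: tolerance reduced by U95
    (an assumed scheme, not the only one)."""
    return L - u95

s = sigma_obs(1.0, 0.89)        # roughly 0.63 for 89 % observed EOPR
a = acceptance_limit(1.0, 0.2)
```

With no guardbanding, `acceptance_limit(L, 0.0)` reduces to A = L, matching the simplified case in Eq. (3).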
Figure 11. Guardband multiplier for acceptable risk limits as a function of TUR.
Analyze EOPR data. This will most likely be done at the instrument level, as opposed to the test-point level, depending on
data collection methods. If the observed
EOPR data meets the required level of 89
%, then the 2 % PFA rule has been satisfied.
If this is not the case, then further
analysis is needed and the TUR must
be determined at each test point. If the
analysis reveals that the TUR is greater
than 4.6:1, no further action is necessary and the 2 % PFA rule has been met.
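The screening steps above can be sketched as a simple decision function, using the thresholds given in the text (89 % observed EOPR, 4.6:1 TUR):

```python
# Sketch of the screening logic above: observed EOPR of at least 89 %,
# or failing that a TUR above 4.6:1, satisfies the 2 % PFA rule;
# anything else requires further analysis (e.g. guardbanding).
def meets_2_percent_rule(observed_eopr_pct=None, tur=None):
    if observed_eopr_pct is not None and observed_eopr_pct >= 89.0:
        return True
    if tur is not None and tur > 4.6:
        return True
    return False   # further mitigation required

print(meets_2_percent_rule(observed_eopr_pct=92.0))  # → True
```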
Compliance with the 2 % rule can be accomplished by either calculating PFA and/or
limiting its probability to less than 2 % by the
methods presented above. If these methods
are not sufficient, alternative methods of mitigating PFA are available [16]. Of course, no
amount of effort on the part of the calibration
laboratory can force a UUT to comply with
unrealistic expectations of performance. In
some cases, contacting the manufacturer with
this evidence may result in the issuance of
revised specifications that are more realistic.
Assumptions, approximations, estimations, and uncertainty have always been part
of metrology, and no process can guarantee
that instruments will provide the desired accuracy, or function within their assigned tolerances during any particular application or
use. However, a well-managed calibration
process can provide confidence that an instrument will perform as expected and within
limits. This confidence can be quantified via
analysis of uncertainty, EOPR, and false accept risk. Reducing the number of assumptions
11. Acknowledgments
The authors thank the many people who contributed to our understanding of the subject matter presented here. Specifically, the contributions of Perry King (Bionetics), Scott Mimbs (NASA), and Jim
Wachter (Millennium Engineering and Integration) at Kennedy Space
Center were invaluable. Several graphics were generated using PTC's
Mathcad 14. Where numerical methods were more appropriate,
Microsoft Excel was used incorporating VBA functions developed
by Dr. Dennis Jackson of the Naval Surface Warfare Center in Corona,
California.
12. References
[1]
[2]
[3]
[4]
[5]
[6]
[7]
[8]
[9]
[10]
[11]
[12]
[13]