
EVIDENCE-BASED DIAGNOSTICS

History, Physical Examination, Laboratory Testing, and Emergency Department
Ultrasonography for the Diagnosis of Acute Cholecystitis
Ashika Jain, MD, Ninfa Mehta, MD, Michael Secko, MD, Joshua Schechter, MD,
Dimitri Papanagnou, MD, Shreya Pandya, MD, and Richard Sinert, DO

ABSTRACT
Background: Acute cholecystitis (AC) is a common differential for patients presenting to the emergency
department (ED) with abdominal pain. The diagnostic accuracy of history, physical examination, and bedside
laboratory tests for AC has not been quantitatively described.

Objectives: We performed a systematic review to determine the utility of history and physical examination (H&P),
laboratory studies, and ultrasonography (US) in diagnosing AC in the ED.

Methods: We searched medical literature from January 1965 to March 2016 in PubMed, Embase, and
SCOPUS using a strategy derived from the following formulation of our clinical question: patients: ED
patients suspected of AC; interventions: H&P, laboratory studies, and US findings commonly used to
diagnose AC; comparator: surgical pathology or definitive diagnostic radiologic study confirming AC; and
outcome: the operating characteristics of the investigations in diagnosing AC were calculated. Sensitivity,
specificity, and likelihood ratios (LRs) were calculated using Meta-DiSc with a random-effects model (95% CI).
Study quality and risks for bias were assessed using the Quality Assessment Tool for Diagnostic Accuracy
Studies.

Results: Separate PubMed, Embase, and SCOPUS searches retrieved studies for H&P (n = 734), laboratory
findings (n = 74), and US (n = 492). Three H&P studies met inclusion/exclusion criteria with AC prevalence of
7%-64%. Fever had sensitivity ranging from 31% to 62% and specificity from 37% to 74%; positive LR [LR+]
was 0.71-1.24, and negative LR [LR-] was 0.76-1.49. Jaundice sensitivity ranged from 11% to 14%, and
specificity from 86% to 99%; LR+ was 0.80-13.81, and LR- was 0.87-1.03. Murphy's sign sensitivity was 62%
(range = 53%-71%), and specificity was 96% (range = 95%-97%); LR+ was 15.64 (range = 11.48-21.31), and
LR- was 0.40 (range = 0.32-0.50). Right upper quadrant pain had sensitivity ranging from 56% to 93% and
specificity of 0% to 96%; LR+ ranged from 0.92 to 14.02, and LR- from 0.46 to 7.86. One laboratory study met
criteria with a 26% prevalence of AC. Elevated bilirubin had a sensitivity of 40% (range = 12%-74%) and
specificity of 93% (range = 77%-99%); LR+ was 5.80 (range = 1.25-26.99), and LR- was 0.64 (range =
0.39-1.08). Five US studies met criteria, with a prevalence of AC between 10% and 46%. US sensitivity was
86% (range = 78%-94%) and specificity was 71% (range = 66%-76%); LR+ was 3.23 (range = 1.74-6.00), and
LR- was 0.18 (range = 0.10-0.33).

From the Department of Emergency Medicine, SUNY-Downstate Medical Center (AJ, NM, MS, JS, SP, RS), Brooklyn, NY; and the Department of
Emergency Medicine, Thomas Jefferson University Hospital (DP), Philadelphia, PA.
Received May 27, 2016; revision received October 23, 2016; accepted November 2, 2016.
This study was presented at the Society for Academic Emergency Medicine Annual Meeting, Dallas, TX, May 16, 2014; the Society for Academic
Emergency Medicine Mid-Atlantic Regional Meeting, Washington, DC, February 22, 2014; and the Society for Academic Emergency Medicine
North-East Regional Conference, New Haven, CT, March 26, 2014.
The authors have no relevant financial information or potential conflicts to disclose.
Supervising Editor: Christopher R. Carpenter, MD.
Address for correspondence and reprints: Ninfa Mehta, MD; e-mail: ashikajainmd@gmail.com.
ACADEMIC EMERGENCY MEDICINE 2017;24:281-297.

© 2016 by the Society for Academic Emergency Medicine. ISSN 1069-6563
doi: 10.1111/acem.13132. PII ISSN 1069-6563583
282 Jain et al. A SYSTEMATIC REVIEW OF ED ULTRASOUND FOR CHOLECYSTITIS DIAGNOSIS

Conclusion: Variable disease prevalence, coupled with limited sample sizes, increases the risk of selection bias.
Individually, none of these investigations reliably rules out AC. A clinical decision rule that combines
evaluation of H&P, laboratory data, and US is more likely to achieve a correct diagnosis of AC.

Abdominal pain is one of the most common chief complaints encountered by emergency department (ED)
physicians. Many abdominal pain patients are found to have a benign etiology. However, the possibility of
acute cholecystitis (AC), which accounts for 3%-11% of hospital admissions¹⁻³ with a mortality rate of
0.8%,⁴ usually requires a comprehensive diagnostic evaluation using the modalities of history and physical
examination (H&P), laboratory results, and imaging studies. Many studies have individually described the
H&P,⁵,⁶ laboratory,⁷,⁸ and imaging⁹,¹⁰ findings consistent with AC. The Tokyo AC guidelines by Hirota
et al.¹¹ utilized an expert consensus methodology to integrate multiple diagnostic modalities to predict AC.
Hirota et al.¹¹ determined that a definitive diagnosis of AC could be obtained if the patient exhibited at
least one of the two local signs of inflammation (Murphy's sign or right upper quadrant [RUQ]
mass/pain/tenderness) plus one of the systemic signs of inflammation (fever, elevated C-reactive protein, or
elevated white blood cell count [WBC]). Then, if AC was clinically suspected, a definitive diagnosis of AC
would still depend on a radiology department study (ultrasound [US], computerized tomography [CT], magnetic
resonance imaging [MRI], or hepatobiliary iminodiacetic acid scan/cholescintigraphy [HIDA]). Recently an
updated version of the Tokyo AC guidelines (TG13)¹² was validated in a similar population to the derivation
cohort with a sensitivity of 87.6% and specificity of 77.7%.

Although the TG13 guidelines¹² integrate the diagnostic modalities for AC, they were derived by expert opinion
based on retrospective multicenter analysis and validated in the same population as the derivation cohort,
which may limit their generalizability to other patient populations. We decided to develop a similar
integrated approach to diagnosing AC but utilizing a systematic review/meta-analytic approach. The primary
objective of this systematic review is to determine the diagnostic test accuracy (sensitivity, specificity,
and likelihood ratios [LRs]) of H&P, laboratory data, and imaging studies to predict AC for ED patients.
Specifically, we were interested in finding elements of the H&P, laboratory data, and imaging studies
available to ED physicians at the point of care, which would allow expedited disposition of patients
suspected of AC without utilizing radiology department resources (US, CT, MRI, or HIDA). We limited our
population to ED patients suspected of AC and specifically limited sonography to that performed and
interpreted by ED physicians. While there are several variants of AC, this study does not address variants
such as emphysematous cholecystitis or acalculous cholecystitis, as they would require formal radiologic
evaluation for diagnosis and are therefore beyond the scope of point-of-care US (POCUS).

METHODS

Study Design
We conducted a systematic review of studies that examined the operating test characteristics of the
modalities used by emergency physicians to diagnose AC. The systematic review was conducted using the
Preferred Reporting Items for Systematic Review and Meta-analyses (PRISMA) guidelines.¹³

Search Strategy
The design and manuscript structure of this systematic review conform to the recommendations from the
Meta-analysis of Observational Studies in Epidemiology (MOOSE)¹⁴ statement. In conjunction with a medical
librarian, six investigators independently searched the medical literature from January 1965 to March 2016
in PubMed, Embase, and SCOPUS for the search terms diagnosis and cholecystitis. Diagnosis was searched under
MeSH headings diagnosis, diagnosis-related groups, delayed diagnosis, computer-assisted diagnosis, early
diagnosis, differential diagnosis, immunologic tests, ultrasonography, laboratory techniques and procedures,
or radiography. Cholecystitis was searched under MeSH headings blood, diagnosis, epidemiology, etiology,
history, microbiology, pathology, physiology, physiopathology, radiography, ultrasonography, and
cholecystitis. The two searches were combined and limited by human subjects, adults, and English language
articles. The PubMed, Embase, and SCOPUS searches were combined for the three separate search topics: H&P,
laboratory data, and US. Studies were included if they recruited adult patients in the ED who had a bedside
emergency US. Studies
ACADEMIC EMERGENCY MEDICINE March 2017, Vol. 24, No. 3 www.aemj.org 283

were included only if the patient had the criterion standard for final diagnosis, which was predetermined
to be pathology diagnosis or biliary scintigraphy. Narrative reviews, case reports, or studies focused on
children or therapy were not included.

Criteria for Considering Studies for This Review
Types of Participants. We included studies that recruited adult patients presenting to the ED with
abdominal pain or RUQ pain as their chief complaint. Studies that recruited patients who presented to an
urgent care setting were excluded, as these patients are substantively different than those who present to
the ED with true, emergent abdominal pain. Patients were not excluded based on comorbidities.

Types of Index Tests. We included studies that used H&P findings, laboratory tests, and US as index tests
for the diagnosis of AC. We only included abdominal sonographic studies if performed and interpreted by ED
physicians. While many EDs continue to order formal USs, there is a growing number of emergency physicians
who are US trained and rely solely on POCUS. Additionally, RUQ US is a core emergency US application per
ACEP¹⁵ guidelines. As a result, many emergency physicians do in fact make the diagnosis of AC based on
POCUS. Some institutions may continue to utilize formal US for a variety of other reasons. Nevertheless,
there is a substantial amount of evidence that has shown POCUS to be just as sensitive and specific as
formal US for diagnosis of AC.¹⁶⁻¹⁹ For this reason, formal US was excluded from the analysis.

Types of Reference Standard. We included studies that used as a reference standard a final diagnosis of AC
based on pathologic finding of AC at surgery or a positive biliary scintigraphy study.

Data Abstraction
Two or more authors for each index test category independently selected articles from the combined
PubMed/Embase search for full text review (H&P = 734; laboratory data = 74; US = 492). Each reviewer
independently selected potentially eligible studies before both authors agreed on the list of studies for
full text review. Differences in study selection were resolved by consensus. Having read the methods
sections of the full-text version of the studies potentially eligible for inclusion, each author then
applied the stated inclusion and exclusion criteria to determine which studies to include in our systematic
review. Differences were resolved by consensus after discussion and adjudication.

Data Analysis
Sensitivities, specificities, and LRs were calculated based on constructed two-by-two tables for each
included study. To compute meta-analysis summary estimates when more than one study assessed the same index
test, we combined test characteristic data using a random-effects model with Meta-DiSc software.²⁰
Interstudy heterogeneity was assessed for pooled estimates of sensitivity and specificity using the
DerSimonian-Laird random-effect model.²¹ Publication bias was not assessed because of the questionable
validity of this approach in diagnostic meta-analyses.²²

Quality Assessment
Two authors (NM, MS) used the Quality Assessment Tool for Diagnostic Accuracy Studies (QUADAS-2)²³ for
systematic reviews to evaluate the overall quality of evidence for the trials included. For the purposes of
this diagnostic systematic review, several considerations were established a priori to assess the quality
of individual trials. The ideal patient population would be those presenting to an ED with abdominal
complaints as mentioned.

The QUADAS-2 method assesses four categories of study design, including patient selection, index test,
reference standard, and flow and timing. Four domains were assessed for biases. 1) Patient selection: Were
the patients enrolled at random or consecutively? Were there inappropriate exclusions? Could the patients
included not be representative of all patients presenting to the ED with a clinical picture of AC? 2) Index
test: Was the history and physical examination obtained without knowledge of the results of criterion
standard test for AC? Were thresholds for vital signs predetermined for the study? Is there concern that the
way the history and physical were obtained would be different than done in clinical practice? 3) Reference
standard: Was the criterion standard test for AC obtained on every patient in the study? Were the
radiologists blinded to the clinical findings? 4) Flow and timing: Could the order of how the history and
physical and criterion standard test for AC were obtained and read have introduced bias? Studies would be
considered low risk of bias if all four

domains were rated no bias. All included studies used pathology as the reference standard, but if the
reference standard was not clearly defined, that portion of QUADAS-2 was at high risk for bias. Similarly,
if execution of the index test was not clearly defined, that portion of QUADAS-2 would also be at high risk
for bias. Each category is accompanied by a set of yes or no questions, and answering no to any question
places that portion of the study design at high risk for bias. An unweighted Cohen's kappa was calculated
to measure agreement. Statistical agreement between these two reviewers was assessed via a kappa analysis
using SPSS (v21.0). Two of the authors (NM and MS) individually rated the QUADAS-2 assessment with a kappa
of 0.87. A meeting was held between the two QUADAS-2 raters and the third author (RS), who adjudicated any
differences in QUADAS-2 rating.

Test-Treatment Threshold
The Pauker and Kassirer decision threshold model was used to develop a treatment algorithm.²⁴ This method
is based on considering six variables: false-negative and false-positive proportions, sensitivity,
specificity, risk of a diagnostic test, risk of treatment, and anticipated benefit of treatment. Estimates
of these variables were abstracted from our systematic review to derive theoretical test and treatment
thresholds for ED patients with AC diagnosed via bedside emergency US (Figure 2).

RESULTS

Description of Included Studies
The PubMed, Embase, and SCOPUS search identified 734 citations for H&P (see Figure 1A). For laboratory
studies, PubMed, Embase, and SCOPUS identified a total of 74 articles (see Figure 1B). For US, the PubMed,
Embase, and SCOPUS search identified a total of 492 articles (see Figure 1C). Reviewing the bibliographies
of the pertinent articles identified no additional studies. We excluded all studies of emphysematous and
acalculous cholecystitis, which present with an identifiable risk factor such as multisystem trauma, burns,
chronic debilitation, total parenteral feeding, or immunosuppression, not typical of the usual ED
presentation of AC.

We decided to remove all the retrospective studies⁵⁻⁷,²⁵,²⁶ from our review. Although the work by Lijmer
et al.²⁷ has shown that retrospective compared to prospective diagnostic studies, when corrected for
methodologic flaws, do not produce different results, our retrospective studies all had significant flaws.
All of our retrospective studies⁵⁻⁷,²⁵,²⁶ had issues related to reliability of their retrospectively
abstracted data. Gilbert et al.²⁸ have defined eight criteria for retrospective chart reviews to improve
accuracy and minimize inconsistencies in data acquisition: 1) training, 2) case selection, 3) definition of
variables, 4) abstraction forms, 5) meetings, 6) monitoring, 7) blinding, and 8) testing of inter-rater
agreement. All of our retrospective studies⁵⁻⁷,²⁵,²⁶ failed to document any of these methods to assure
unbiased data collection from their medical records. For these reasons, all retrospective studies were
excluded from the final data analysis.

As detailed in Figure 1A (H&P, n = 3), Figure 1B (laboratory studies, n = 1), and Figure 1C (US, n = 4), a
total of nine diagnostic studies were included in our final analysis. A full description of the reviewed
studies, including the study design, subject characteristics, variables assessed, criterion standard, and
AC prevalence, is included in the tables for each diagnostic modality: H&P (Table 1A), laboratory studies
(Table 1B), and US (Table 1C).

Prevalence
The combined population from the eight unique cohorts included in this review was 1,990, of which 297
patients were diagnosed with AC. The weighted prevalence of AC across all studies was 14.9% with a range of
7% to 64%. Our reviewed studies using RUQ pain or suspected AC (seven studies,⁸,¹⁰,²⁹⁻³⁴ n = 657 patients)
as their primary inclusion criteria had a higher prevalence of AC of 31% (range = 10%-64%) versus 7%
(range = 7%-46%) compared to those with abdominal pain (Eskelinen et al.,³⁵ n = 1,333 patients) as their
primary inclusion criteria. This clearly suggests a selection bias for AC in those studies using RUQ
pain¹⁰,²⁹,³⁰,³² or suspected AC compared to studies including all generalized abdominal pain⁸,³¹,³³⁻³⁵ as
their inclusion criteria.

H&P
All three studies²⁹,³⁰,³⁵ reviewed in the H&P section used a prospective observational methodology.
Inclusion criteria were not uniform across our reviewed studies. Acute abdominal pain was the inclusion
criterion for Eskelinen et al.,³⁵ whereas Schofield et al.²⁹ and Bednarz et al.³⁰ included only those

Figure 1. Consort diagram. (A) History and physical examination; (B) laboratory studies; (C) ultrasound studies.

Figure 1. continued.

Table 1A
Description of Reviewed Studies for H&P

| Study | Study Design | Subject Characteristics | Variables Assessed | Criterion Standard | Prevalence, % (95% CI) |
|---|---|---|---|---|---|
| Schofield et al., 1986²⁹ | Prospective observational study | Inclusion: RUQ pain. Exclusion: not stated. Sample size: 100. Mean age: 52 y. Range: 22-77 y. Gender: 70% (female) | Temperature > 37.5°C; mass; vomiting | Gallstones at surgery | 64 (54-73) |
| Bednarz et al., 1986³⁰ | Prospective observational study | Inclusion: suspected cholecystitis. Exclusion: not stated. Sample size: 70. Mean age: 56 y. Range: 19-92 y. Gender: 50% (female) | Temperature/fever (cutoff not specified); mass; jaundice; RUQ pain; RUQ tenderness; RUQ rebound | Gallstones at surgery (71%); clinical (29%) | 39 (28-50) |
| Eskelinen et al., 1993³⁵ | Prospective observational study | Inclusion: acute abdominal pain of less than 7 days duration. Exclusion: not stated. Sample size: 1,333. Mean age: 38 y. SD: 22.1 y. Gender: 52.3% (female) | Temperature > 37.1°C; mass; jaundice; RUQ pain; RUQ tenderness; RUQ rebound; Murphy's sign | Gallstones at surgery | 7 (6-9) |

H&P = history and physical examination; RUQ = right upper quadrant.

patients with RUQ pain or those suspected of AC. This difference in inclusion criteria explains the
significantly higher AC prevalence (64% and 39%) in the latter two studies compared to Eskelinen et al.³⁵
(7%). Sample size also varied considerably between Bednarz et al.³⁰ (n = 70) and Eskelinen et al.³⁵
(n = 1,333). Mean ages of included patients ranged from 38⁸ to 56³⁰ years. The proportion of female subjects
ranged from 50% in Bednarz et al.³⁰ to 70% in Schofield et al.²⁹

Table 1B
Description of Reviewed Studies for Laboratory Studies

| Study | Study Design | Subject Characteristics | Variables Assessed | Criterion Standard | Prevalence, % (95% CI) |
|---|---|---|---|---|---|
| Eikman et al., 1975⁸ | Prospective observational study | Inclusion: suspicion of cholecystitis. Exclusion: not stated. Sample size: 39. Mean age: 44 y. Range: 15-82 y. Gender: 67% (female) | Total bilirubin | Gallstones at surgery | 26 (14-41) |
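
The weighted prevalence quoted in the Prevalence section can be approximated from the sample sizes and rounded prevalences listed in Tables 1A, 1B, and 1C. The following is a minimal sketch, assuming that weighting is by sample size and that the rounded table percentages are adequate inputs; the variable and function names are illustrative and are not taken from the original analysis.

```python
# (study label, sample size, reported AC prevalence) transcribed from Tables 1A-1C.
# Prevalences are the rounded table values, so the case count is only approximate.
cohorts = [
    ("Schofield 1986", 100, 0.64),
    ("Bednarz 1986", 70, 0.39),
    ("Eskelinen 1993", 1333, 0.07),
    ("Eikman 1975", 39, 0.26),
    ("Kendall 2001", 109, 0.10),
    ("Rosen 2001", 116, 0.46),
    ("Summers 2010", 193, 0.14),
    ("Noble 2010", 30, 0.37),
]


def weighted_prevalence(rows):
    """Return (total sample size, expected AC cases, sample-size-weighted prevalence)."""
    total_n = sum(n for _, n, _ in rows)
    expected_cases = sum(n * p for _, n, p in rows)
    return total_n, expected_cases, expected_cases / total_n


total_n, expected_cases, prevalence = weighted_prevalence(cohorts)
print(f"n = {total_n}, approx. AC cases = {expected_cases:.0f}, "
      f"weighted prevalence = {prevalence:.1%}")
```

Under these assumptions the sketch reproduces the combined population of 1,990 and a weighted prevalence close to the reported 14.9%.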

Of the H&P variables in Table 1A, an elevated temperature (37.1-38°C) or fever was the most common AC sign
reported in all of the reviewed studies. All of our reviewed studies also contained mass as an assessed
variable. RUQ pain, tenderness, and rebound were evaluated only by Bednarz et al.³⁰ and Eskelinen et al.³⁵
Only Bednarz et al.³⁰ tested jaundice, while vomiting was only investigated by Schofield et al.²⁹ Studies by
Schofield et al.²⁹ and Eskelinen et al.³⁵ used the finding of gallstones at surgery as their criterion
standard for AC, and the study by Bednarz et al.³⁰ used both gallstones at surgery (71%) and a clinical
definition (29%).

Test Characteristics of H&P for AC
From Table 2A, we found that having only two to three studies per risk factor, coupled with the differences
in study populations between generalized abdominal pain and RUQ pain, resulted in such marked heterogeneity
that pooling of the data was not appropriate. We decided to report only point estimates and not pooled data
for any of our test characteristics. We grouped the variable fever for the three reviewed studies together
even though the study by Bednarz et al.³⁰ did not specify a temperature and the other two studies reported
very similar cutoffs of >37.5°C²⁹ and >37.1°C.³⁵ We found that fever had very poor test characteristics,
with sensitivities between 31%²⁹ and 62%,³⁵ specificities of 37%³⁰ to 74%,²⁹ positive LR (LR+) of 0.71³⁰ to
1.24,³⁵ and negative LR (LR-) of 0.76³⁵ to 1.49.³⁰ Between the Bednarz et al.³⁰ and Eskelinen et al.³⁵
studies, marked heterogeneity was noted for the variables RUQ pain, mass, and tenderness, most likely
secondary to biases documented in our QUADAS-2 analysis. Even with these biases, none of the characteristics
of RUQ mass, pain, or tenderness or the other risks such as RUQ rebound, jaundice, Murphy's sign, or
vomiting had sufficiently low LRs to significantly decrease the probability of AC.

QUADAS-2 Analysis for H&P for AC
All of our reviewers agreed 100% on the QUADAS-2 scoring for the three H&P studies we reviewed (Table 3A).
All reviewers found all three studies to have high risks of bias in reporting H&P test characteristics. The
study by Eskelinen et al.³⁵ suffered from differential verification bias. Differential verification bias,
also called double criterion standard bias, as described by Kohn et al.,³⁶ occurs when the results of the
index test determine different gold standards. Since the study by Eskelinen et al.³⁵ was a study of all
patients with abdominal pain, those patients who tested positive for any of the index tests commonly
associated with AC, such as RUQ findings (mass, pain, tenderness), jaundice, or Murphy's sign, were all
preferentially tested for the criterion standard for AC. Those patients without any of these signs of AC
were then tested for other differential diagnoses of acute abdominal pain, utilizing different criterion
standard tests. Only 10.1% of the study population of Eskelinen et al.³⁵ had AC, while 30.2% had acute
appendicitis, with 40% diagnosed as nonspecific abdominal pain and the remainder having nephrolithiasis,
dyspepsia, small bowel obstruction, and other less common etiologies of abdominal pain. Differential
verification bias in the case of the study by Eskelinen et al.³⁵ significantly and falsely raised the
specificity to greater than 95% for RUQ (mass, pain, tenderness), jaundice, and Murphy's sign, with a
smaller effect on sensitivity, with the result of inflating the LR+ (>13) for all these test
characteristics.

When we examine the test characteristic of RUQ pain in Table 2A, we see an excellent example of partial
verification bias in the study of Bednarz et al.³⁰ As also described by Kohn et al.,³⁶ partial verification
bias (also called verification, referral, ascertainment, or workup bias) occurs when patients positive for
the

Table 1C
Description of Reviewed Studies for US

| Study | Study Design | Subject Characteristics | Experience of Sonographers | Criterion Standard | Prevalence, % (95% CI) |
|---|---|---|---|---|---|
| Kendall and Shimp, 2001³² | Prospective observational study | Inclusion: RUQ pain; epigastric pain; jaundice; history of stones. Exclusion: non-ED visit; ascites; HIV. Sample size: 109. Mean age: 39 y. Range: 16-88 y. Gender: 79% (female) | Image acquisition: full-time ED physicians; 2-h didactics; 3 h supervised. Image interpretation: as above | Surgical pathology | 10 (6-17) |
| Rosen et al., 2001³³ | Prospective observational study | Inclusion: suspected cholecystitis; radiology US; age > 18 y. Exclusion: no radiology US. Sample size: 116. Mean age: 49 y. Range: not stated. Gender: 72% (female) | Image acquisition: 52% < 25 previous RUQ US. Image interpretation: as above | Surgical pathology | 46 (357) |
| Summers et al., 2010¹⁰ | Prospective observational study | Inclusion: suspected cholecystitis; age > 18 y. Exclusion: no follow-up, pathology; presentation 12 AM-8 AM. Sample size: 193. Median age: 36 y. Range: 18-87 y. Gender: 73% (female) | Image acquisition: variable experience (ED physicians, attendings, residents, and fellows). Image interpretation: as above | Surgical pathology | 14 (9-20) |
| Noble et al., 2010³⁴ | Prospective observational study | Inclusion: suspected cholecystitis; age > 18 y. Exclusion: no RDMS EP on shift. Sample size: 30. Mean age: not stated. Range: not stated. Gender: not stated | Image acquisition: not stated. Image interpretation: as above | Surgical pathology | 37 (22-56) |

EP = emergency physician; RDMS = registered diagnostic medical sonographer; RUQ = right upper quadrant; US = ultrasound.

index test (RUQ pain) are more likely (study's inclusion criteria) to receive the criterion standard test
(hepatobiliary scintigraphy) than patients without RUQ pain, who would not be included in the study by
Bednarz et al.³⁰ This causes the sensitivity to be falsely raised (93%) while deceptively depressing the
specificity (0%).

Laboratory Findings
Only the study by Eikman et al.⁸ met our inclusion criteria for review in the laboratory findings section
in Table 1B. Studies by Brewer et al.³¹ and Potts et al.⁷ were both retrospective studies and were excluded
from our review for the already stated reason of failure to abide by the quality criteria established by
Gilbert et al.²⁸ In addition, the study by Potts et al.⁷ only enrolled patients > 80 years old,
significantly limiting the generalizability of their observations.

Eikman et al.⁸ included patients with abdominal pain suspected of AC who were tested by hepatobiliary
scintigraphy. Although the main focus of the study by Eikman et al.⁸ was hepatobiliary scintigraphy, it
reported sufficient data on AC to permit evaluation of the test characteristics of total bilirubin. Eikman
et al.⁸ defined their criterion standard by the presence of gallstones at surgery, with an AC prevalence of
26%.

Test Characteristics of Laboratory Findings for AC
From Table 2B, we only report single study test characteristics for total bilirubin from the study by Eikman

Table 2A
Test Characteristics of H&P for AC

| H&P | Study | Sensitivity, % (95% CI) | Specificity, % (95% CI) | Positive LR (95% CI) | Negative LR (95% CI) |
|---|---|---|---|---|---|
| Fever or temperature > 37.1 or > 37.5°C | Bednarz et al., 1986³⁰ | 44 (26-65) | 37 (23-53) | 0.71 (0.44-1.14) | 1.49 (0.89-2.50) |
| | Schofield et al., 1986²⁹ | 31 (20-44) | 74 (56-87) | 1.18 (0.61-2.30) | 0.94 (0.72-1.21) |
| | Eskelinen et al., 1993³⁵ | 62 (53-71) | 50 (47-53) | 1.24 (1.07-1.44) | 0.76 (0.60-0.96) |
| RUQ mass | Bednarz et al., 1986³⁰ | 19 (6-38) | 72 (56-85) | 0.66 (0.26-1.68) | 1.13 (0.87-1.46) |
| | Schofield et al., 1986²⁹ | 15 (7-26) | 83 (67-94) | 0.87 (0.34-2.25) | 1.03 (0.86-1.23) |
| | Eskelinen et al., 1993³⁵ | 16 (10-24) | 99 (98-100) | 16.25 (8.14-32.44) | 0.85 (0.78-0.92) |
| RUQ pain | Bednarz et al., 1986³⁰ | 93 (76-99) | 0 (0-8) | 0.92 (0.82-1.04) | 7.86 (0.39-157.69) |
| | Eskelinen et al., 1993³⁵ | 56 (47-65) | 96 (95-97) | 14.02 (10.19-19.28) | 0.46 (0.38-0.56) |
| RUQ tenderness | Bednarz et al., 1986³⁰ | 37 (19-58) | 58 (42-73) | 0.89 (0.48-1.62) | 1.08 (0.74-1.59) |
| | Eskelinen et al., 1993³⁵ | 75 (66-82) | 95 (94-96) | 15.11 (11.57-19.73) | 0.26 (0.19-0.35) |
| RUQ rebound | Bednarz et al., 1986³⁰ | 37 (19-58) | 58 (42-73) | 0.89 (0.48-1.62) | 1.08 (0.72-1.59) |
| | Eskelinen et al., 1993³⁵ | 42 (34-50) | 53 (50-56) | 0.89 (0.72-1.09) | 1.10 (0.95-1.28) |
| Jaundice | Bednarz et al., 1986³⁰ | 11 (2-29) | 86 (72-95) | 0.80 (0.22-2.92) | 1.03 (0.86-1.27) |
| | Eskelinen et al., 1993³⁵ | 14 (8-21) | 99 (98-100) | 13.81 (6.75-28.25) | 0.87 (0.81-0.94) |
| Murphy's sign | Eskelinen et al., 1993³⁵ | 62 (53-71) | 96 (95-97) | 15.64 (11.48-21.31) | 0.40 (0.32-0.50) |
| Vomiting | Schofield et al., 1986²⁹ | 83 (71-91) | 56 (38-72) | 1.86 (1.27-2.73) | 0.31 (0.17-0.57) |

AC = acute cholecystitis; H&P = history and physical examination.
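
To show how the likelihood ratios in Table 2A translate into bedside probabilities, the sketch below applies the standard odds form of Bayes' theorem (post-test odds = pretest odds x LR). It is an illustration only: the 7% pretest probability is the AC prevalence reported for Eskelinen et al., the LRs are the Murphy's sign values from that single study, and the function name is hypothetical. Because that study is described as affected by differential verification bias, the resulting numbers should be read as arithmetic, not as validated estimates.

```python
def post_test_probability(pretest_probability: float, likelihood_ratio: float) -> float:
    """Convert a pretest probability to a post-test probability using an LR."""
    if not 0.0 < pretest_probability < 1.0:
        raise ValueError("pretest_probability must be strictly between 0 and 1")
    if likelihood_ratio < 0.0:
        raise ValueError("likelihood_ratio must be non-negative")
    pretest_odds = pretest_probability / (1.0 - pretest_probability)
    post_test_odds = pretest_odds * likelihood_ratio
    return post_test_odds / (1.0 + post_test_odds)


# Illustrative inputs taken from Tables 1A and 2A (Eskelinen et al.).
PRETEST_PROBABILITY = 0.07   # reported AC prevalence in that cohort
MURPHY_LR_POSITIVE = 15.64
MURPHY_LR_NEGATIVE = 0.40

probability_if_positive = post_test_probability(PRETEST_PROBABILITY, MURPHY_LR_POSITIVE)
probability_if_negative = post_test_probability(PRETEST_PROBABILITY, MURPHY_LR_NEGATIVE)
print(f"Murphy's sign present: {probability_if_positive:.1%}")
print(f"Murphy's sign absent:  {probability_if_negative:.1%}")
```

With these inputs, a positive sign raises the probability to roughly 54% and a negative sign lowers it only to about 3%, which is consistent with the observation that no single H&P finding reliably excludes AC.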

Table 2B
Test Characteristics of Laboratory Tests for AC

| Laboratory Test | Study | Sensitivity, % (95% CI) | Specificity, % (95% CI) | Positive LR (95% CI) | Negative LR (95% CI) |
|---|---|---|---|---|---|
| Bilirubin | Eikman 1975⁸ | 40 (12-74) | 93 (77-99) | 5.80 (1.25-26.99) | 0.64 (0.39-1.08) |

AC = acute cholecystitis; LR = likelihood ratio.
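
The Data Analysis section states that sensitivities, specificities, and LRs were computed from two-by-two tables. A minimal sketch of that calculation follows. The cell counts are an assumption: they are back-calculated from the reported sample size (39), prevalence (26%), sensitivity (40%), and specificity (93%) for Eikman et al. and are not taken from the original publication; the class name is illustrative.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TwoByTwo:
    """Counts from a diagnostic two-by-two table (index test vs criterion standard)."""
    true_positive: int
    false_positive: int
    false_negative: int
    true_negative: int

    @property
    def sensitivity(self) -> float:
        return self.true_positive / (self.true_positive + self.false_negative)

    @property
    def specificity(self) -> float:
        return self.true_negative / (self.true_negative + self.false_positive)

    @property
    def lr_positive(self) -> float:
        # Undefined (division by zero) when specificity is exactly 1.
        return self.sensitivity / (1.0 - self.specificity)

    @property
    def lr_negative(self) -> float:
        # Undefined (division by zero) when specificity is exactly 0.
        return (1.0 - self.sensitivity) / self.specificity


# Assumed counts, reconstructed from the summary statistics in Tables 1B and 2B.
bilirubin = TwoByTwo(true_positive=4, false_positive=2, false_negative=6, true_negative=27)

print(f"sensitivity = {bilirubin.sensitivity:.0%}, specificity = {bilirubin.specificity:.0%}")
print(f"LR+ = {bilirubin.lr_positive:.2f}, LR- = {bilirubin.lr_negative:.2f}")
```

These assumed counts reproduce the point estimates in Table 2B (LR+ of 5.80 and LR- of 0.64); the confidence intervals would need a separate interval method and are not computed here.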

Table 2C
Operating Characteristics of Bedside Sonography for AC

| Study | Sensitivity, % (95% CI) | Specificity, % (95% CI) | Positive LR (95% CI) | Negative LR (95% CI) |
|---|---|---|---|---|
| Rosen 2001³³ | 91 (77-98) | 66 (49-80) | 2.68 (1.73-4.15) | 0.13 (0.04-0.39) |
| Kendall 2001³² | 82 (48-98) | 85 (80-89) | 4.42 (3.68-7.98) | 0.21 (0.06-0.75) |
| Summers 2010¹⁰ | 87 (66-97) | 82 (74-88) | 4.72 (3.22-6.91) | 0.16 (0.06-0.46) |
| Noble 2010³⁴ | 82 (48-98) | 95 (74-100) | 15.55 (2.26-106.88) | 0.19 (0.06-0.68) |

AC = acute cholecystitis; LR = likelihood ratio.
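
The Data Analysis section names a DerSimonian-Laird random-effects model for pooling. The sketch below illustrates that estimator on the negative LRs in Table 2C. It is an approximation under stated assumptions, not a reproduction of the Meta-DiSc analysis: standard errors are recovered from the published 95% CIs on the log scale rather than from raw counts, only the four studies listed in the table are used, and all function names are hypothetical.

```python
import math


def log_lr_with_variance(point, lower, upper, z=1.96):
    """Return (log LR, variance), recovering the SE from a 95% CI on the log scale."""
    standard_error = (math.log(upper) - math.log(lower)) / (2 * z)
    return math.log(point), standard_error ** 2


def dersimonian_laird(effects, variances):
    """Pool effects with DerSimonian-Laird weights; returns (pooled, se, tau_squared)."""
    weights = [1.0 / v for v in variances]
    total_weight = sum(weights)
    fixed_mean = sum(w * y for w, y in zip(weights, effects)) / total_weight
    # Cochran's Q measures between-study dispersion around the fixed-effect mean.
    q_statistic = sum(w * (y - fixed_mean) ** 2 for w, y in zip(weights, effects))
    scaling = total_weight - sum(w * w for w in weights) / total_weight
    degrees_of_freedom = len(effects) - 1
    tau_squared = max(0.0, (q_statistic - degrees_of_freedom) / scaling) if scaling > 0 else 0.0
    random_weights = [1.0 / (v + tau_squared) for v in variances]
    pooled = sum(w * y for w, y in zip(random_weights, effects)) / sum(random_weights)
    pooled_se = math.sqrt(1.0 / sum(random_weights))
    return pooled, pooled_se, tau_squared


# Negative LR (point, lower, upper) for each US study, transcribed from Table 2C.
negative_lrs = [
    (0.13, 0.04, 0.39),  # Rosen 2001
    (0.21, 0.06, 0.75),  # Kendall 2001
    (0.16, 0.06, 0.46),  # Summers 2010
    (0.19, 0.06, 0.68),  # Noble 2010
]

pairs = [log_lr_with_variance(*row) for row in negative_lrs]
effects = [effect for effect, _ in pairs]
variances = [variance for _, variance in pairs]

pooled_log, pooled_se, tau_squared = dersimonian_laird(effects, variances)
pooled_lr = math.exp(pooled_log)
interval = (math.exp(pooled_log - 1.96 * pooled_se), math.exp(pooled_log + 1.96 * pooled_se))
print(f"pooled LR- = {pooled_lr:.2f} (95% CI {interval[0]:.2f}-{interval[1]:.2f}), "
      f"tau^2 = {tau_squared:.3f}")
```

Because the inputs are rounded summary values, the pooled estimate from this sketch should only be expected to land in the neighborhood of the summary LR- reported in the abstract, not to match it exactly.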

et al.⁸ The presence of an elevated bilirubin not surprisingly increased (LR+ = 5.80) the probability of AC.
Elevated bilirubin was defined as greater than 2.0 mg/100 mL. Total bilirubin was not sufficiently robust to
significantly decrease (LR- = 0.64) the probability of AC.

QUADAS-2 Analysis for Laboratory Findings for AC
All of our reviewers agreed 100% on the QUADAS-2 scoring for the single laboratory study we reviewed
(Table 3B). All reviewers found the study by Eikman et al.⁸ to have high risks of bias secondary to partial
verification bias. Since the study by Eikman et al.⁸ included only those patients suspected of AC, the index
test (an elevated total bilirubin) we chose to review was probably used as an inclusion criterion defining
at least a subset of patients suspected of AC. If the index test increased the probability of patients being
chosen to receive the diagnostic test (hepatobiliary scintigraphy), then the study is at risk for partial
verification bias. As was the case in the study by Bednarz

Table 3A
QUADAS: H&P

| Item | Schofield 1986²⁹ | Bednarz 1986³⁰ | Eskelinen 1993³⁵ |
|---|---|---|---|
| 1. Was the spectrum of patients described in the paper and was it chosen adequately? | Yes | Yes | Yes |
| 2. Were selection criteria described clearly? | Yes | Yes | Yes |
| 3. Is the reference standard likely to classify the target condition? | Yes | Yes | Yes |
| 4. Was there an abnormally long time period between the performance of the test under evaluation and the confirmation of the prospect diagnosis with the reference standard? | Yes, 48 h | Yes, unknown | No, unknown |
| 5. Did the whole sample, or a random selection of the sample, receive verification using a reference standard of diagnosis? | Yes, whole | Yes, whole | Yes, whole |
| 6. Did all patients receive the same reference standard regardless of the index test result? | Yes | Yes | Yes |
| 7. Were the results of the index test incorporated in the results of the reference standard? | Yes | Yes | Yes |
| 8. Was the execution of the index test described in sufficient detail to permit replication of the test? | Yes | Yes | Yes |
| 9. Was the execution of the reference standard described in sufficient detail to permit replication of the test? | Yes | Yes | Yes |
| 10. Were the index test results interpreted blind to the results of the reference standard? | Yes | Yes | Yes |
| 11. Were the reference standard results interpreted blind to the results of the index test? | No | No | No |
| 12. Were clinical data available when test results were interpreted? | Yes | Yes | Yes |
| 13. Were uninterpretable/indeterminate/intermediate results reported and included in the results? | No, not described | Yes | No |
| 14. Were reasons for dropout from the study reported? | No, all admitted | No | No |

H&P = history and physical examination; QUADAS = Quality Assessment Tool for Diagnostic Accuracy Studies.

Table 3B
QUADAS: Laboratory Data

Item | Eikman 1975 (ref. 8)
1. Was the spectrum of patients described in the paper and was it chosen adequately? | Yes
2. Were selection criteria described clearly? | No, vague description
3. Is the reference standard likely to classify the target condition? | Yes
4. Was there an abnormally long time period between the performance of the test under evaluation and the confirmation of the diagnosis with the reference standard? | Yes, 72 h
5. Did the whole sample, or a random selection of the sample, receive verification using a reference standard of diagnosis? | Yes, whole
6. Did all patients receive the same reference standard regardless of the index test result? | Yes
7. Were the results of the index test incorporated in the results of the reference standard? | Yes
8. Was the execution of the index test described in sufficient detail to permit replication of the test? | No
9. Was the execution of the reference standard described in sufficient detail to permit replication of the test? | Yes
10. Were the index test results interpreted blind to the results of the reference standard? | No
11. Were the reference standard results interpreted blind to the results of the index test? | Yes
12. Were clinical data available when test results were interpreted? | Yes
13. Were uninterpretable/indeterminate/intermediate results reported and included in the results? | Yes
14. Were reasons for dropout from the study reported? | No

QUADAS = Quality Assessment Tool for Diagnostic Accuracy Studies.

et al.30 in the H&P group who used RUQ pain as both an index test of AC and a study inclusion criterion.

ED Bedside US
Four studies, by Kendall and Shimp,32 Rosen et al.,33 Summers et al.,10 and Noble et al.,34 met our final inclusion criteria for review. Blaivas and Adhikari9 was removed from final analysis due to concerns of potential biases related to data entry as per the requirements set forth by Gilbert et al.28 Villar et al.37 used the same data set as Summers et al.10 and therefore was excluded from the final reviewed studies. Jang et al.19 was excluded due to significant overlap of data points and lack of reproducibility of the data. Studies by Schlager et al.38 and Jehle et al.39 were both
ACADEMIC EMERGENCY MEDICINE March 2017, Vol. 24, No. 3 www.aemj.org 291

Table 3C
QUADAS: US

Item | Noble 2010 (ref. 34) | Rosen 2001 (ref. 33) | Summers 2010 (ref. 10) | Kendall 2001 (ref. 32)
1. Was the spectrum of patients described in the paper and was it chosen adequately? | Yes | Yes | Yes | Yes
2. Were selection criteria described clearly? | Yes | Yes | Yes | Yes
3. Is the reference standard likely to classify the target condition? | Yes | Yes | Yes | Yes
4. Was there an abnormally long time period between the performance of the test under evaluation and the confirmation of the diagnosis with the reference standard? | No | No | No | No
5. Did the whole sample, or a random selection of the sample, receive verification using a reference standard of diagnosis? | Yes, whole | Yes, whole | Yes, whole | Yes, whole
6. Did all patients receive the same reference standard regardless of the index test result? | Yes | Yes | Yes | Yes
7. Were the results of the index test incorporated in the results of the reference standard? | No | No | No | No
8. Was the execution of the index test described in sufficient detail to permit replication of the test? | Yes | Yes | Yes | Yes
9. Was the execution of the reference standard described in sufficient detail to permit replication of the test? | Yes | Yes | Yes | Yes
10. Were the index test results interpreted blind to the results of the reference standard? | Yes | Yes | Yes | Yes
11. Were the reference standard results interpreted blind to the results of the index test? | Yes | Yes | Yes | Yes
12. Were clinical data available when test results were interpreted? | No | No | No | No
13. Were uninterpretable/indeterminate/intermediate results reported and included in the results? | No | No | Yes | No
14. Were reasons for dropout from the study reported? | No | Yes | Yes | Yes

QUADAS = Quality Assessment Tool for Diagnostic Accuracy Studies; US = ultrasound.

removed because they both did not differentiate the findings of gallstones from AC. All the reviewed US studies were prospective. All the reviewed studies used RUQ pain or suspected AC as inclusion criteria. In addition, Kendall and Shimp32 obtained RUQ USs in patients with epigastric pain, history of stones, and jaundice. Exclusion criteria were only defined by Kendall and Shimp32 as any patient with ascites or HIV+. Sample sizes varied from 30 (ref. 34) to 193 (ref. 10) subjects, with mean ages ranging from 36 (ref. 10) to 49 (ref. 33) years. Women predominated in all studies, from 72% (ref. 33) to 79% (ref. 32).

The experience of the ED sonographers varied greatly across the studies. Only Kendall and Shimp32 detailed the training of the ED sonographers used for their study. Rosen et al.33 documented that at least 48% of the ED sonographers in their study had performed more than 25 RUQ USs before the study. The studies by Summers et al.10 and Noble et al.34 failed to document the experience of their ED sonographers. All studies used the same sonographer to both acquire and interpret their studies. None of the studies utilized overreads by more experienced ED ultrasonographers or radiologists. None of the studies reported intra- or inter-rater reliability of their ED or radiology USs.

Test Characteristics of ED Bedside Sonography for AC
Marked heterogeneity (chi-square p > 0.05, I² = 75%) between the test characteristics precluded us from reporting pooled results. From Table 2C, we noted ranges for sensitivity for AC (82%–91%), specificity (66%–95%), LR+ (2.68–15.55), and LR− (0.13–0.21). The study by Noble et al.34 accounted for a substantial degree of the heterogeneity between test characteristics for sonography. Although Noble et al.34 used RUQ pain similar to the other reviewed studies, the primary hypothesis of this study was not the operating characteristics of sonography, but a comparison of the sonographic Murphy sign (SMS) before and after analgesia. Since Noble et al.34 only studied SMS and not the whole range of the sonographic findings (wall thickening, pericholecystic fluid, sludge, etc.) associated with AC, it is not surprising their test characteristics are outliers. A sensitivity analysis of the operating characteristics after removing Noble et al.34 still showed significant heterogeneity for specificity (chi-square p = 0.20, I² = 74%; range = 66%–84%) and LR+ (chi-square p = 0.24, I² = 73%; range = 2.7–5.4). After excluding Noble et al.,34 heterogeneity for sensitivity and LR− significantly decreased allowing

\[
T_{\text{testing}} = \frac{(P_{\text{pos/nd}} \times R_{\text{rx}}) + R_{\text{t}}}{(P_{\text{pos/nd}} \times R_{\text{rx}}) + (P_{\text{pos/d}} \times B_{\text{rx}})}
\]

Testing thresholds: U/S (4%), MRI (4%), HIDA (2%)

\[
T_{\text{treatment}} = \frac{(P_{\text{neg/nd}} \times R_{\text{rx}}) - R_{\text{t}}}{(P_{\text{neg/nd}} \times R_{\text{rx}}) + (P_{\text{neg/d}} \times B_{\text{rx}})}
\]

Treatment thresholds: U/S (46%), MRI (52%), HIDA (74%)

Where assumptions are based upon the summary estimates for probability treatments for cholecystitis*

$P_{\text{pos/nd}}$ = probability of a positive result in patients without disease = 1 − specificity = 1 − 0.90 = 0.10 (HIDA), 0.18 (MRI), 0.19 (U/S)*

$P_{\text{neg/nd}}$ = probability of a negative result in patients without disease = specificity = 0.90 (HIDA), 0.82 (MRI), 0.81 (U/S)*

$R_{\text{rx}}$ = risk of treatment in patients without disease = 0.168

$R_{\text{t}}$ = risk of diagnostic test = 0.0

$P_{\text{pos/d}}$ = probability of a positive result in patients with disease = sensitivity = 0.94 (HIDA), 0.86 (MRI), 0.82 (U/S)*

$P_{\text{neg/d}}$ = probability of a negative result in patients with disease = 1 − sensitivity = 1 − 0.94 = 0.06 (HIDA), 0.14 (MRI), 0.18 (U/S)*

$B_{\text{rx}}$ = benefit of treatment in patients with disease = 0.90

Figure 2. Test–treatment threshold formulas. *Kiewiet et al.40 AC = acute cholecystitis; HIDA = hepatobiliary iminodiacetic acid scan/cholescintigraphy; MRI = magnetic resonance imaging; U/S = ultrasound.

pooling of sensitivity (chi-square p = 0.67, I² = 0%; pooled = 88%; range = 75%–95%) and LR− (chi-square p = 0.84, I² = 0%; pooled = 0.16; range = 0.08–0.31).

QUADAS-2 Analysis for ED Bedside Sonography for AC
All of our reviewers agreed 100% on the QUADAS-2 scoring for the four ED US studies we reviewed (Table 3C). The study by Summers et al.10 may also have been influenced by partial verification bias, in that 189 ED USs were studied but only 125 of these patients were referred to radiology US. In the study by Summers et al.10 the results of the index test (ED US) partially determined workup (radiology US; partial verification bias). Of the 189 ED US examinations, 26 went to the operating room (24 positive for AC) and 163 were discharged to telephone follow-up, of whom 23 patients were unable to be contacted. Since we do not know the prevalence of AC in the patients lost to follow-up, the potential exists for follow-up bias. Partial verification bias may have also affected the study by Noble et al.,34 which only examined SMS, so positive SMS patients may have been preferentially referred to radiology US.

In the study by Rosen et al.,33 of the 116 patients enrolled in the ED US study, 40 (34%) were excluded from further analysis if the ED US had discordant results, either positive for gallstones with a negative SMS or vice versa. Since the results of the index test partially determined the etiology of findings requiring a continued workup, this study also was exposed to partial verification bias. Partial verification was not an issue with the study by Kendall and Shimp,32 as it appears that all patients who received an ED US also received a radiology US.

Test–Treatment Threshold Estimates
The test–treatment threshold model we developed was designed to aid physicians in efficiently and accurately ruling in or ruling out the diagnosis of AC. We built this model to investigate which element(s) of the H&P, laboratory findings, and ED US had sufficient discriminatory power given the pretest probability of AC to obviate a formal test by radiology: an official US, MRI, or HIDA scan. Given that these formal radiology department tests are not available 24 hours per day, 7 days a week, if element(s) of the H&P, laboratory studies, or ED US were sufficiently robust to rule in or rule out AC, patient disposition to either discharge or begin empiric AC therapy would be facilitated.

In Figure 2, we created three test–treatment threshold models (1, radiology US; 2, MRI; and 3, HIDA) to diagnose AC. The top half of Figure 2 describes the variables and calculations used to produce the test and treatment thresholds depicted in the graphic below for each of the three formal diagnostic tests (1, radiology US; 2, MRI; 3, HIDA) for AC. For each diagnostic modality a separate set of test and treatment thresholds was defined across a range of 0% to 100%. Based on the work of Pauker and Kassirer,24 these three models utilized the unique operating characteristics of each diagnostic modality while controlling for the risks of treatment of patients without AC (Rrx), the risk of the diagnostic test (Rt), and the benefit of treatment of patients with AC (Brx).

For each diagnostic modality we used individual operating characteristics coupled with the variables Rrx, Rt, and Brx to develop unique test–treatment thresholds. Since the operating characteristics for the three diagnostic modalities differ, their test–treatment thresholds also vary.

The test thresholds are depicted as the left-most open arrow for each diagnostic modality (radiology US and MRI 4%, nuclear 2%). If the posttest probability of the index test we choose from our systematic review (dashed vertical line) falls to the left of the test threshold, then further testing for AC is not warranted and an alternative diagnosis other than AC should be considered. The treatment thresholds are represented by the right-most open arrow for each diagnostic modality (radiology US 46%, MRI 52%, HIDA 74%). When the posttest probability of the index test is to the right of the treatment threshold, further testing is unnecessary and treatment should be initiated based only on the results of the index test. If the posttest probability of the index test falls between the test and treatment thresholds, then our analysis recommends continued testing with that diagnostic modality for AC. The horizontal dashed lines represent the ranges of posttest probabilities of a positive or negative ED US consistent with AC.

Our three models for the test–treatment thresholds for diagnosing/treating AC are based on the individual operating characteristics of radiology US, MRI, and nuclear scanning that were recently documented in a systematic review by Kiewiet et al.40 They reviewed 57 studies encompassing 5,859 patients and found HIDA with the highest sensitivity (94%) and specificity (90%), followed by MRI sensitivity (86%) and specificity (82%) and radiology US sensitivity (82%)/specificity (81%).

The definitive treatment, i.e., operative management, is always delayed while obtaining these radiology-based studies and delayed further during off-hours when these tests are not routinely available. While AC is an inflammatory process, secondary infection can occur due to cystic duct obstruction and bile stasis, leading to gangrene, sepsis, and gallbladder perforation.1,2,5–7,41,42 The possibility of coexistent bacterial infection in AC is the justification for starting empiric antibiotics, especially if delays of definitive diagnosis and surgical management are expected. Empiric antibiotic therapy should include activity against the most frequently associated pathogens: Escherichia coli, Enterococcus, and Klebsiella.41 We defined the risk of treatment without disease (Rrx) as the risk of antibiotics given for presumed AC to patients in whom radiology-based tests subsequently rule out AC. Patients without AC but exposed to antibiotics would suffer all the risks of antibiotics without any of their intended benefits. We defined Rrx as the overall risk of an adverse drug reaction, 16.8%.42

We judged the risk of the diagnostic tests (Rt) US, MRI, and HIDA as zero. Finally, the benefit of treatment (Brx) of patients with AC has never been, nor ever will be, tested by a randomized double-blinded placebo-controlled methodology; it would be unethical to study the spontaneous recovery rate of AC without antibiotics or surgery. Without evidence-based studies we used a conservative estimate of the benefit from treating AC (Brx) = 0.90.
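
The decision logic described in this section (convert a pretest probability to a posttest probability with a likelihood ratio, then compare it with the test and treatment thresholds) can be sketched in a few lines. This is an illustrative sketch only: the function names `posttest_probability` and `recommend` and the wording of the returned labels are hypothetical, the thresholds are the rounded values reported in the text, and the 31% pretest probability and the ED US likelihood ratios are the ones reported elsewhere in this review.

```python
def posttest_probability(pretest: float, likelihood_ratio: float) -> float:
    """Bayes' theorem in odds form: posttest odds = pretest odds x LR."""
    pretest_odds = pretest / (1.0 - pretest)
    posttest_odds = pretest_odds * likelihood_ratio
    return posttest_odds / (1.0 + posttest_odds)


def recommend(posttest: float, test_threshold: float, treatment_threshold: float) -> str:
    """Place a posttest probability relative to the two thresholds."""
    if posttest < test_threshold:
        return "consider alternative diagnosis"
    if posttest > treatment_threshold:
        return "treat"
    return "continue testing"


# Rounded (test, treatment) thresholds reported in the text for each formal modality.
THRESHOLDS = {
    "radiology US": (0.04, 0.46),
    "MRI": (0.04, 0.52),
    "HIDA": (0.02, 0.74),
}

if __name__ == "__main__":
    pretest = 0.31  # weighted prevalence of AC reported in this review
    for label, lr in [("positive ED US, LR+ 2.68", 2.68), ("negative ED US, LR- 0.13", 0.13)]:
        post = posttest_probability(pretest, lr)
        for modality, (t_test, t_treat) in THRESHOLDS.items():
            print(f"{label}: posttest {post:.1%}, {modality}: {recommend(post, t_test, t_treat)}")
```

Under these inputs a positive ED US with LR+ = 2.68 yields a posttest probability of about 55%, which lies above the radiology US and MRI treatment thresholds but below the HIDA threshold. The odds form is used because likelihood ratios multiply odds, not probabilities.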

Of the H&P, laboratory, and US characteristics in this review, only ED US had LRs large enough to change the probability of AC substantially, for either a positive test (LR+ = 2.68–4.72) or a negative test (LR− = 0.13–0.21), after excluding Noble et al.34 for the reasons explained above. We defined the pretest probability of AC by the weighted prevalence (31%) across the reviewed studies using RUQ pain as entry criteria. Applying Bayes' theorem to a pretest probability of AC of 31% with LR+ of 2.68–4.72 and LR− of 0.13–0.21 would lead to a range of posttest probabilities after a positive ED US for AC (55%–68%) or a negative US (6%–7%).

In Figure 2, in the case of a negative ED US for AC the posttest probability would only decrease from 31% to 6%–7%; our analysis would recommend further testing for all three diagnostic modalities (test thresholds: radiology US = 4%, MRI = 4%, and HIDA = 2%). Thus, a negative ED US is not sufficient to adequately rule out AC and continued testing is required before discharge of a patient suspected of AC. If the ED US is positive for AC, the posttest probability of AC would increase from 31% to 55%–68% and obviate further testing for the diagnostic modalities of radiology US (treatment threshold = 46%) and MRI (treatment threshold = 52%). The nuclear scan model, because of higher sensitivity (94%) and specificity (90%) compared to the other two modalities, recommends further testing (treatment threshold = 74%) even if the ED US was positive for AC.

DISCUSSION
This systematic review examines the utility of H&P, laboratory studies, and ED US for the diagnosis of AC in the ED population. We found eight studies that met inclusion criteria for H&P (n = 3), for laboratory studies (n = 1), and for ED US (n = 4) with varying quality based on the degree of bias.

Similar to the findings of the rational clinical examination by Trowbridge et al.,43 "Does This Patient Have Acute Cholecystitis?", we found with an updated search that no single H&P finding or laboratory test was sufficiently robust to rule out AC. This is not to suggest that H&P and laboratory findings are of no utility in the diagnosis of AC, as the presence of these variables is necessary to define the at-risk population in the first place. Ross et al.18 compared ED US to other imaging modalities for cholelithiasis; however, H&P and laboratory findings were not included in the primary review. Observational studies of AC patients from Cope's Early Diagnosis of the Acute Abdomen44 in 1921 to the most recent (2013) update of the Tokyo AC guidelines12 help us form our clinical gestalt of an AC patient. Unfortunately, these observational studies by their very nature suffer from verification bias, severely limiting their potential to find characteristics of patients suspected of AC that can substantially rule out AC. While the index tests for AC (RUQ pain/mass/tenderness/Murphy's sign/fever/WBC) are very common in AC patients, we only understand their true clinical utility when we examine the prevalence of AC in patients without these cardinal signs of AC. In our study, LR− for H&P ranged from 0.26 to 7.86, and LR− for an elevated WBC was 0.64, meaning that we were not able to use any of these findings to significantly decrease the posttest probability of AC and obviate further formal radiology testing.

From our ED US studies reviewed we were able to find studies with LRs (LR+ of 2.68 to 4.72 and LR− of 0.13 to 0.21) sufficiently robust to produce marked changes from pre- to posttest probabilities of AC. In comparing the performances of radiology to ED US, our reviewed studies found disparate results. Summers et al.10 found no significant differences in the operating characteristics of formal radiology US (LR+ = 5.7 [range = 3.3–9.8]; LR− = 0.20 [range = 0.08–0.5]) compared to ED US (LR+ = 4.7 [range = 3.2–6.9]; LR− = 0.16 [range = 0.06–0.46]) in detecting AC. Yet, the study by Rosen et al.33 found a kappa of only 0.46 for the agreement of ED and formal radiology US on AC. Kendall and Shimp32 only directly compared the ED US to formal radiology US for the detection of SMS and found the sensitivity of the ED US (82%) greater than that of the formal US (45%). Some of this heterogeneity in comparing the accuracy between radiology and ED US can be ascribed to the variations in the sonographic experience across the various studies. In addition, none of these studies included tests of inter-rater or intra-rater reliability of the ED sonographers.

Using our test–treatment threshold estimates (Figure 2) with a pretest probability of AC (weighted AC prevalence) of 31% and a positive ED US for AC, an ED physician could initiate empiric AC antibiotics and obtain a surgery consultation for admission, without the need for formal radiology department testing with US or MRI scans. A negative ED US would only decrease the posttest AC probability to 6%–7%, still just above the testing threshold (US and MRI 4%, HIDA 2%) for the three formal diagnostic modalities. The decision not to go on to further formal AC testing will be a clinical decision based on your own clinical judgment of either accepting a miss rate of AC between 6% and 7% or applying a lower pretest probability of AC than 31%, commensurate with your clinical experience. A change in the estimate of the pretest probability of AC to 25% would place a negative ED US posttest probability of AC (4%) at the cutoff for the test threshold for both radiology US and MRI (4%). If the pretest probability is as low as 15%, then a negative ED US could hit the lower limit of the test threshold (2%) for even the HIDA scan. On the other hand, considering that ED US and radiology US are equally accurate, as described, since the posttest probability of a negative ED US is 6%–7% and the test thresholds for the three modalities are 4% or lower, the difference between 6% and 4% is insignificant enough to obviate further testing as per clinical judgment, especially in the absence of an elevated bilirubin. In this case, perhaps an alternate diagnosis should be pursued.

With an increasing movement toward early intervention for patients with AC, the utility of ED US for early diagnosis will help avoid unnecessary delay in intervention, as well as reduce outpatient referrals and repeated ED visits, which overall is more cost-effective and improves patient suffering and experience.45

Implications for Future Research
Characteristics of disease have not been studied in the current era. Lifestyles and food habits have changed in the past 40 years. All of the studies in the H&P and laboratory sections were done before 2000. None of the studies evaluated the discriminatory power of different historical, physical examination, and laboratory findings in combination. High-quality diagnostic studies of AC in the future should combine H&P, laboratory data, and ED US with specific concerns for limiting bias and directly measuring the efficiencies obtained by ED POCUS, that is, the decreased time to achieve diagnosis. The outcome test should not be based on the index test. Additionally, future studies should report inter-rater reliability of the ED US interpretations given that US is a user-dependent modality. Combining features may lead to better operating characteristics and should be considered in future studies.

LIMITATIONS
Only three databases, PubMed, Embase, and SCOPUS, were used. Also notable is the paucity of H&P and laboratory studies. Despite a 4:1 female predisposition to AC, only one study by Irvin et al.3 examined this parameter. These studies did not include other factors that predispose patients to AC, such as race and diabetes. When examining many other parameters included in this study, there are few studies per parameter. For example, Singer et al.25 is the only study that met inclusion criteria and took history of gallstones into account. Furthermore, these studies are over 20 years old and may no longer represent current lifestyles and risk factors.

The diagnostic tools are associated with limitations as well. While the risk of diagnostic tests such as US, MRI, and HIDA is estimated as zero, these tests do have some quantifiable risk. MRI and HIDA require patients to leave the ED, which holds some risk as well. While not a direct or immediate risk, imaging studies often find results unrelated to the purpose of the investigation; these incidentalomas may lead to further testing and expose patients to further risk.

Ultrasound is a user-dependent modality. Many studies that include USs as diagnostic criteria include USs predominantly performed by emergency physicians who are very experienced and motivated to perform ultrasonography. This does not represent actual practice where bedside US is performed by the practitioner caring for the patient, who may or may not be a registered sonographer.

CONCLUSION
Right upper quadrant complaints are frequently encountered among ED visits. The decision to pursue and treat these complaints is multifactorial. The primary objective of this systematic review is to assess the impact of history and physical examination, laboratory testing, and ultrasound variables on making the diagnosis of acute cholecystitis. We found that there is no one parameter that would suggest an immediate cause to treat and pursue surgical intervention. Individually, the many variations of complaints as well as clinical findings do not reliably rule out acute cholecystitis, thereby necessitating a clinical decision rule to include multiple parameters (history and physical, laboratory data, and sonographic imaging) to achieve a correct diagnosis of acute cholecystitis.

References
1. Hastings RS, Powers RD. Abdominal pain in the ED: a 35 year retrospective. Am J Emerg Med 2011;29:711–6.

2. Miettinen P, Pasanen P, Lahtinen J, Alhava E. Acute abdominal pain in adults. Ann Chir Gynaecol 1996;85:5–9.
3. Irvin TT. Abdominal pain: a surgical audit of 1190 emergency admissions. Br J Surg 1989;76:1121–5.
4. Udekwu PO, Sullivan WG. Contemporary experience with cholecystectomy: establishing benchmarks two decades after the introduction of laparoscopic cholecystectomy. Am Surg 2013;79:1253–7.
5. Adedeji OA, McAdam WA. Murphy's sign, acute cholecystitis and elderly people. J R Coll Surg Edinb 1996;41:88–9.
6. Navarro Fernandez JA, Tarraga Lopez PJ, Rodriguez Montes JA, Lopez Cara MA. Validity of tests performed to diagnose acute abdominal pain in patients admitted at an emergency department. Rev Esp Enferm Dig 2009;101:610–8.
7. Potts FE 4th, Vukov LF. Utility of fever and leukocytosis in acute surgical abdomens in octogenarians and beyond. J Gerontol A Biol Sci Med Sci 1999;54:M55–8.
8. Eikman EA, Cameron JL, Colman M, Natarajan TK, Dugal P, Wagner HN Jr. A test for patency of the cystic duct in acute cholecystitis. Ann Intern Med 1975;82:318–22.
9. Blaivas M, Adhikari S. Diagnostic utility of cholescintigraphy in emergency department patients with suspected acute cholecystitis: comparison with bedside RUQ ultrasonography. J Emerg Med 2007;33:47–52.
10. Summers SM, Scruggs W, Menchine MD, et al. A prospective evaluation of emergency department bedside ultrasonography for the detection of acute cholecystitis. Ann Emerg Med 2010;56:114–22.
11. Hirota M, Takada T, Kawarada Y, et al. Diagnostic criteria and severity assessment of acute cholecystitis: Tokyo Guidelines. J Hepatobiliary Pancreat Surg 2007;14:78–82.
12. Kiriyama S, Takada T, Strasberg SM, et al. TG13 guidelines for diagnosis and severity grading of acute cholangitis (with videos). J Hepatobiliary Pancreat Sci 2013;20:24–34.
13. Shamseer L, Moher D, Clarke M, et al. Preferred reporting items for systematic review and meta-analysis protocols (PRISMA-P) 2015: elaboration and explanation. BMJ 2015;349:g7647.
14. Stroup DF, Berlin JA, Morton SC, et al. Meta-analysis of observational studies in epidemiology: a proposal for reporting. Meta-analysis Of Observational Studies in Epidemiology (MOOSE) group. JAMA 2000;283:2008–12.
15. American College of Emergency Physicians. Emergency ultrasound guidelines. Ann Emerg Med 2009;53:550–70.
16. Miller A, Pepe P, Brockman C, et al. ED Ultrasound in hepatobiliary disease. J Emerg Med 2006;30:69–74.
17. Woo MY, Taylor M, Loubani O, Bowra J, Atkinson P. My patient has got abdominal pain: identifying biliary problems. Ultrasound 2014;22:223–8.
18. Ross M, Brown M, McLaughlin K, et al. Emergency physicians in patients with suspected cholelithiasis. Acad Emerg Med 2011;18:227–35.
19. Jang TB, Ruggeri W, Kaji AH. The predictive value of specific emergency sonographic signs for cholecystitis. J Med Ultrasound 2013;21:29–31.
20. Zamora J, Abraira V, Muriel A, Khan K, Coomarasamy A. Meta-DiSc: a software for meta-analysis of test accuracy data. BMC Med Res Methodol 2006;6:31.
21. DerSimonian R, Laird N. Meta-analysis in clinical trials. Control Clin Trials 1986;7:177–88.
22. Deeks JJ, Macaskill P, Irwig L. The performance of tests of publication bias and other sample size effects in systematic reviews of diagnostic test accuracy was assessed. J Clin Epidemiol 2005;58:882–93.
23. Whiting PF, Rutjes AW, Westwood ME, et al. QUADAS-2: a revised tool for the quality assessment of diagnostic accuracy studies. Ann Intern Med 2011;155:529–36.
24. Pauker SG, Kassirer JP. The threshold approach to clinical decision making. N Engl J Med 1980;302:1109–17.
25. Singer AJ, McCracken G, Henry MC, Thode HC Jr, Cabahug CJ. Correlation among clinical, laboratory, and hepatobiliary scanning findings in patients with suspected acute cholecystitis. Ann Emerg Med 1996;28:267–72.
26. Staniland JR, Ditchburn J, De Dombal FT. Clinical presentation of acute abdomen: study of 600 patients. Br Med J 1972;3:393–8.
27. Lijmer JG, Mol BW, Heisterkamp S, et al. Empirical evidence of design-related bias in studies of diagnostic tests. JAMA 1999;282:1061–6.
28. Gilbert EH, Lowenstein SR, Koziol-McLain J, Barta DC, Steiner J. Chart reviews in emergency medicine research: where are the methods? Ann Emerg Med 1996;27:305–8.
29. Schofield PF, Hulton NR, Baildam AD. Is it acute cholecystitis? Ann R Coll Surg Engl 1986;68:14–6.
30. Bednarz GM, Kalff V, Kelly MJ. Hepatobiliary scintigraphy. Increasing the accuracy of the preoperative diagnosis of acute cholecystitis. Med J Aust 1986;145:316–8.
31. Brewer BJ, Golden GT, Hitch DC, Rudolf LE, Wangensteen SL. Abdominal pain. An analysis of 1,000 consecutive cases in a University Hospital emergency room. Am J Surg 1976;131:219–23.
32. Kendall JL, Shimp RJ. Performance and interpretation of focused right upper quadrant ultrasound by emergency physicians. J Emerg Med 2001;21:7–13.
33. Rosen CL, Brown DF, Chang Y, et al. Ultrasonography by emergency physicians in patients with suspected cholecystitis. Am J Emerg Med 2001;19:32–6.
34. Noble VE, Liteplo AS, Nelson BP, Thomas SH. The impact of analgesia on the diagnostic accuracy of the sonographic Murphy's sign. Eur J Emerg Med 2010;17:80–3.
35. Eskelinen M, Ikonen J, Lipponen P. Diagnostic approaches in acute cholecystitis; a prospective study of 1333 patients with acute abdominal pain. Theor Surg 1993;8:15–20.

36. Kohn MA, Carpenter CR, Newman TB. Understanding the direction of bias in studies of diagnostic test accuracy. Acad Emerg Med 2013;20:1194–206.
37. Villar J, Summers SM, Menchine MD, Fox JC, Wang R. The absence of gallstones on point-of-care ultrasound rules out acute cholecystitis. J Emerg Med 2015;49:475–80.
38. Schlager D, Lazzareschi G, Whitten D, Sanders AB. A prospective study of ultrasonography in the ED by emergency physicians. Am J Emerg Med 1994;12:185–9.
39. Jehle D, Davis E, Evans T, et al. Emergency department sonography by emergency physicians. Am J Emerg Med 1989;7:605–11.
40. Kiewiet JJ, Leeuwenburgh MM, Bipat S, Bossuyt PM, Stoker J, Boermeester MA. A systematic review and meta-analysis of diagnostic performance of imaging in acute cholecystitis. Radiology 2012;264:708–20.
41. Csendes A, Burdiles P, Maluenda F, Diaz JC, Csendes P, Mitru N. Simultaneous bacteriologic assessment of bile from gallbladder and common bile duct in control subjects and patients with gallstones and common duct stones. Arch Surg 1996;131:389–94.
42. Col N, Fanale J, Kronholm P. The role of medication noncompliance and adverse drug reactions in hospitalizations of the elderly. Arch Intern Med 1990;150:841–5.
43. Trowbridge RL, Rutkowski NK, Shojania KG. Does this patient have acute cholecystitis? JAMA 2003;289:80–6.
44. Silen W. Cope's Early Diagnosis of the Acute Abdomen. New York: Oxford University Press, 2010.
45. Kulvatunyou N, Joseph B, Gries L, et al. A prospective cohort study of 200 acute care gallbladder surgeries: the same disease but a different approach. J Trauma Acute Care Surg 2012;73:1039–45.
