Liang Qiao1, Bo Li1, Mei Long1, Xiao Wang1, Anrong Wang1 and Guonan Zhang2
Departments of 1Cancer Prevention and Treatment and 2Gynecologic Oncology, Sichuan Cancer Hospital and Institute and
Sichuan Cancer Prevention and Treatment Center, Chengdu, China
Abstract
The aim of this review was to provide an updated summary estimate of the accuracy of visual inspection with acetic acid (VIA) and with Lugol's iodine (VILI) in detecting cervical cancer and precancer. Studies on VIA/VILI accuracy were eligible if VIA/VILI was performed on asymptomatic women who all underwent confirmatory testing by histology, a combination of colposcopy and histology, or a combination of multiple screening tests, colposcopy and histology, to detect cervical intraepithelial neoplasia grade 2 or worse (CIN2+) or grade 3 or worse (CIN3+). A bivariate model was fitted to estimate the accuracy of VIA/VILI and provide estimates of heterogeneity. Subgroup analysis was used to investigate the source of heterogeneity. A total of 29 studies on VIA and 19 studies on VILI were finally included in the meta-analysis. The summary sensitivity and specificity of VIA for CIN2+ were 73.2% (95%CI: 66.5–80.0%) and 86.7% (95%CI: 82.9–90.4%), respectively, and those for VILI were 88.1% (95%CI: 81.5–94.7%) and 85.9% (95%CI: 81.7–90.0%), respectively. VIA and VILI were both more sensitive in detecting the more severe outcome, although there was a slight loss in specificity. Apparent heterogeneity existed in sensitivity and specificity for both VIA and VILI. High sensitivity of both VIA and VILI for CIN2+ was found when a combination of colposcopy and histology was used as disease confirmation. VIA, VILI, or even a combination of them in parallel, could be good options for cervical screening in low-resource settings. Significant differences in sensitivity between different gold standards might provide a proxy for optimization of ongoing cervical cancer screening programs.
Key words: bivariate model, cervical cancer, meta-analysis, screening, visual inspection.
using a reference standard of multiple screening tests (including HPV test) and blind biopsy even in the absence of colposcopically detected lesions were not included. In addition, the statistical approaches used in most of the previous meta-analyses did not take into account within-study sampling error and additional unexplained heterogeneity between studies. I² statistic was widely used to measure the extent of heterogeneity, which was not recommended by the Cochrane Diagnostic Test Accuracy (DTA) Working Group because it did not account for heterogeneity explained by phenomena such as positivity threshold effects, and will overestimate the degree of heterogeneity observed.9,10 Thus, we conducted a meta-analysis using a statistically rigorous hierarchical model and based upon more comprehensive studies with diverse reference standards for true disease verification to provide updated average estimates of the accuracy of VIA and VILI, respectively, in detecting cervical intraepithelial neoplasia grade 2 or worse (CIN2+) or CIN3 or worse (CIN3+) in asymptomatic women, as well as estimates of the heterogeneity, and investigated the source of heterogeneity.
Methods
Databases including PubMed, EMBASE, Cochrane Library, China National Knowledge Infrastructure (CNKI), VIP and WANFANG up to December 2013 were searched using the following: ((((visual inspection) AND (acetic acid)) OR VIA) OR (((visual inspection) AND (Lugol's iodine)) OR VILI)) AND (cervix dysplasia OR cervix neoplasms OR cervical intraepithelial neoplasia OR cervical neoplasia OR cervical cancer OR cervical carcinoma OR CIN) AND (screening). The reference lists of those retrieved articles were then checked to obtain relevant studies not identified in the database search.
The study selection criteria for eligibility were as follows: (i) a focus on the accuracy of VIA or VILI testing in apparently healthy, asymptomatic women; (ii) reference standard histology alone, or a combination of colposcopy and histology (abnormal colposcopy should have histologic confirmation), or combination of multiple screening tests, colposcopy and histology (abnormal colposcopy and/or positive on one or more screening tests [e.g. liquid-based cytology, HPV etc.] should have histologic confirmation); (iii) disease threshold CIN2 or CIN3; (iv) verification of the screening test using the reference standard so that sufficient data could be obtained to complete all four cells of a 2 × 2 table. No language, publication date, or publication status restrictions were imposed.
Among studies based on the same population, the one with the latest data, the most comprehensive and valid analysis was eventually retained. Studies with missing or unclear key information on test accuracy were excluded. Any reviews, comments, letters were also excluded.
Two investigators independently screened each record. The title and abstract of each citation were screened first, and the full text was screened second.
The Cochrane version of Quality Assessment of Diagnostic Accuracy Studies (QUADAS), which includes 11 items, was used to assess the quality of each included study.11 Each item related to a single aspect of quality could be judged as 'yes', 'no', or 'unclear'. The quality score for 'yes' was defined as 1, and that for 'no' or 'unclear' was defined as 0.
Two reviewers used a systematic review data extraction form with the following entries: first author's name, year of study publication, country, study period, age range of the study population, study design, size of study population, screener, place of screening, gold standard, disease threshold, and QUADAS indicators. The number of true positives (TP), false positives (FP), false negatives (FN), and true negatives (TN) were also extracted from each included study. The two reviewers reached consensus each time there was a discrepancy in data collection. If more than one study sample was included in a single report and data for each sample were provided separately, we treated the sample as if they had been presented in individual studies. Missing key information on study design or test accuracy was requested from the authors by email. If the authors did not respond, their studies were excluded from the meta-analysis.
The summary sensitivity and specificity for VIA or VILI were estimated directly using the bivariate model, which was a hierarchical model recommended by the Cochrane DTA Working Group.9 The bivariate approach fits a two-level model with independent binomial distributions for the TP and TN conditional on the sensitivity and specificity in each study, and a bivariate normal model for the logit transformations of sensitivity and specificity between studies. Given that the sensitivity and specificity in each study are assumed to have a bivariate normal distribution across studies, the possibility of correlation between them can be incorporated.12 Threshold effects could be explored if the covariance between logit sensitivity and specificity was estimated to be statistically significantly negative, which represented the trade-off in sensitivity and specificity as the test positivity threshold across studies varied.9
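As a minimal sketch of the per-study quantities that feed the bivariate model, the Python snippet below computes sensitivity and specificity from the four cells of a 2 × 2 table, with Wald 95% intervals formed on the logit scale. The function name `study_accuracy` and the counts in the usage lines are hypothetical illustrations, not data from any included study, and the sketch assumes all four cells are non-zero. It does not reproduce the hierarchical model itself, which pools these logit-scale quantities across studies.

```python
import math


def logit(p):
    """Log-odds of a proportion strictly between 0 and 1."""
    return math.log(p / (1.0 - p))


def expit(x):
    """Inverse of logit: maps a log-odds value back to a proportion."""
    return 1.0 / (1.0 + math.exp(-x))


def study_accuracy(tp, fp, fn, tn, z=1.96):
    """Sensitivity and specificity of one study with Wald intervals.

    Assumes every cell is non-zero (illustrative simplification).
    Returns {"sensitivity": (estimate, lower, upper), "specificity": (...)}.
    """
    def estimate(successes, failures):
        p = successes / (successes + failures)
        # Standard error of the log-odds of a binomial proportion.
        se = math.sqrt(1.0 / successes + 1.0 / failures)
        centre = logit(p)
        return p, expit(centre - z * se), expit(centre + z * se)

    return {
        "sensitivity": estimate(tp, fn),  # TP / (TP + FN)
        "specificity": estimate(tn, fp),  # TN / (TN + FP)
    }


# Hypothetical counts, for illustration only.
result = study_accuracy(tp=80, fp=150, fn=20, tn=850)
print(result["sensitivity"])
print(result["specificity"])
```

Working on the logit scale keeps the interval limits inside (0, 1), which is the same reason the bivariate model places its normal distribution on logit sensitivity and logit specificity.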
For the comparison of VIA and VILI, a binary covariate for test type was included in the model based on all available studies to investigate whether the expected sensitivity and/or specificity differed between the tests.
Individual and summary estimates of sensitivity and specificity and summary receiver operating characteristic (SROC) curve were plotted on an SROC graph. The 95% confidence region around the pooled estimates was included, as was 95% prediction region, which gave an indication of between-study heterogeneity.
Heterogeneity was evaluated statistically using the variance of logit transformed sensitivity and specificity and graphically by the prediction region in ROC space. Where heterogeneity is high, the value of the corresponding variance parameter is far away from 0, and the 95% prediction region is much larger than the 95% confidence region.10
Study level covariates of gold standard, screener, place of screening, size of study population, and the summary QUADAS score were added to the hierarchical model individually to explore subgroup heterogeneity for VIA and VILI. Size of study population was aggregated by median. Summary QUADAS score was divided into two categories of 11 and <11.
Forest plots and SROC plot were output by Review Manager (RevMan), version 5.2. The bivariate model was fitted using Proc NLMIXED in SAS, version 9.2.
Results
A total of 1990 records were identified using the search strategy after removing the duplicates, among which 1828 records were excluded on the basis of title or abstract. Of the 162 full texts assessed, 143 were excluded due to irrelevant content (n = 15), inappropriate population source (n = 18), inappropriate reference standard (n = 1), partial verification (n = 34), repetitive analysis or publication based on the same database (n = 17), missing or unclear key information on test accuracy (n = 8), and other literature types such as review, comment, letter and so on (n = 50). Finally, 19 articles were included in the quantitative synthesis. Of the 19 articles, one reported results respectively from 11 separate centers, and one reported two sets of results for test performed by doctor and nurse, respectively, in the same population. We treated those separate results as if they were obtained from different studies, therefore 29 studies on VIA, and 19 studies on VILI were included in the meta-analysis (Fig. 1). Finally, we performed meta-analysis of studies in which VIA was used as the screening test (CIN2 threshold, n = 29; CIN3 threshold, n = 16); and in which VILI was used as the screening test (CIN2 threshold, n = 19; CIN3 threshold, n = 12). The main characteristics of all integrated studies for VIA and VILI are listed in Table 1.
The forest plots show the variation of the sensitivity and specificity of VIA and VILI to detect CIN2+ or CIN3+ (Fig. 2). There appeared to be greater variability in estimated sensitivity than specificity across studies. Estimates of sensitivities of both tests for CIN2+ were made with less certainty than those for CIN3+.
The sensitivity and specificity for all tests and outcomes are summarized in Table 2. The summary sensitivity and specificity of VIA for CIN2+ and CIN3+ were 73.2% (95%CI: 66.5–80.0%) and 86.7% (95%CI: 82.9–90.4%), 80.6% (95%CI: 73.8–87.4%) and 85.3% (95%CI: 81.3–89.3%), respectively, and those for VILI were 88.1% (95%CI: 81.5–94.7%) and 85.9% (95%CI: 81.7–90.0%), 92.4% (95%CI: 86.4–98.3%) and 84.7% (95%CI: 81.2–88.1%), respectively. The two tests were both more sensitive in detecting more severe outcome, although there was a slight loss in specificity. For the outcome CIN2+, the summary sensitivity of VILI was statistically significantly higher than that of VIA (P = 0.003). In contrast, the overall specificity of VILI was not significantly different from that of VIA (P = 0.438). Similar differences in comparison of the two tests for the outcome CIN3+ were also found.
The variance of logit-transformed estimates in Table 2 and the size of the prediction region on the SROC plot in Figure 3 indicated that the magnitude of the heterogeneity was evident in both sensitivity and specificity for any test and any outcome. The P-values of covariance between logit sensitivity and specificity for all tests and outcomes indicated that no threshold effect existed (P > 0.05).
Subgroup analysis was performed for the outcome of CIN2+. It showed a statistical improvement in sensitivity, without loss of specificity as the true disease verification was done using a combination of colposcopy and histology compared with histology alone or a combination of multiple tests, colposcopy and histology for both VIA and VILI testing. VIA performed by physician and nurse had statistically significantly less sensitivity than that performed by the other three kinds of screeners. The statistically significant highest specificity for VIA was noted in the setting of hospital and primary health center, and the lowest in the setting of primary health center alone. The number of women screened and summary QUADAS score had no effect on the accuracy of VIA and VILI (Table 3).
Table 1 (Continued)
First author | Pub. year | Country | Period | Age range (years) | Study design | No. screened | Screener | Place of screening | Gold standard | Disease threshold | QUADAS score
Arbyn18 | 2008 | India (Trivandrum 2) | 1999–2003 | 25–64 | Cross-sectional | 4759 | Health worker | Field clinic | Colposcopy and histology | CIN2/3 | 11
Qiao19 | 2008 | China | 2007 | 30–54 | Cross-sectional | 2388 | Nurse | Hospital | Multiple tests, colposcopy and histology | CIN2/3 | 11
Li20 | 2008 | China | 2004 | 30–49 | Cross-sectional | 2432 | NA | NA | Multiple tests, colposcopy and histology | CIN2 | 9
Li21 | 2009 | China | 2004–2005 | 15–29 | Cross-sectional | 1819 | Physician | NA | Multiple tests, colposcopy and histology | CIN2 | 10
Murillo22 | 2010 | Colombia | 2007–2008 | 25–59 | Cross-sectional | 4957 | Nurse | NA | Multiple tests, colposcopy and histology | CIN2 | 10
Muwonge23 | 2010 | Angola | 2002–2006 | 25–59 | Cross-sectional | 8849 | Nurse | Hospital and primary health center | Colposcopy and histology | CIN2 | 11
Ngoma24 | 2010 | Tanzania | 2002–2007 | 25–59 | Cross-sectional | 10374 | Nurse | Hospital and primary health center | Colposcopy and histology | CIN2 | 11
Cremer25 | 2011 | El Salvador | 2007–2009 | 50+ | Cross-sectional | 578 | Physician and nurse | Hospital | Histology | CIN2 | 9
Fei26 | 2011 | China | 2009 | 25–65 | Cross-sectional | 859 | Physician | NA | Multiple tests, colposcopy and histology | CIN2/3 | 8
Dasgupta27 | 2012 | India | 2006–2009 | NA | Cross-sectional | 4873 | Physician | Hospital | Histology | CIN2 | 10
Deodhar28 | 2012 | India | 2006–2007 | 30–49 | Cross-sectional | 5519 | Nurse | Field clinic | Multiple tests, colposcopy and histology | CIN2/3 | 11
Table 1 (Continued)
First author | Pub. year | Country | Period | Age range (years) | Study design | No. screened | Screener | Place of screening | Gold standard | Disease threshold | QUADAS score
Colposcopy
and histology
Muwonge23 | 2010 | Angola | 2002–2006 | 25–59 | Cross-sectional | 8842 | Nurse | Hospital and primary health center | Colposcopy and histology | CIN2 | 11
Ngoma24 | 2010 | Tanzania | 2002–2007 | 25–59 | Cross-sectional | 10367 | Nurse | Hospital and primary health center | Colposcopy and histology | CIN2 | 11
Fei26 | 2011 | China | 2009 | 25–65 | Cross-sectional | 859 | Physician | NA | Multiple tests, colposcopy and histology | CIN2/3 | 8
Deodhar28 | 2012 | India | 2006–2007 | 30–49 | Cross-sectional | 5519 | Nurse | Field clinic | Multiple tests, colposcopy and histology | CIN2/3 | 11
Hu30 | 2012 | China | 2010 | 28–55 | Cross-sectional | 1100 | NA | Hospital | Multiple tests, colposcopy and histology | CIN2 | 9
CIN, cervical intraepithelial neoplasia; NA, not available; QUADAS, quality assessment of diagnostic accuracy studies; VIA, visual inspection after application of acetic acid; VILI, visual inspection after application of Lugol's iodine.
Figure 2 Coupled forest plots for visual inspection after application of acetic acid (VIA) and visual inspection after application of Lugol's iodine (VILI) for detecting cervical intraepithelial neoplasia (CIN)2+ or CIN3+.
further confirmed by the variance parameters quantitatively and by the prediction ellipse graphically, in which threshold effects did not exist (according to the test of covariance parameter). Considering the impact of subjectivity on the accuracy of VIA and VILI, relative factors were included to explore the apparent heterogeneity, in which screener and place of screening were proxies for screener competence, size of study population was considered as a proxy for accumulated experience of the assessors, and gold standard was related to the strength of
Table 2 Summary estimates of VIA and VILI for CIN2+ and CIN3+ using a bivariate random effects model
Test | Outcome | Sensitivity (95%CI) | Variance logit (sensitivity) | Specificity (95%CI) | Variance logit (specificity) | Covariance between logit sensitivity and specificity
VIA | CIN2+ | 0.732 (0.665–0.800) | 0.702 | 0.867 (0.829–0.904) | 0.709 | 0.212
VIA | CIN3+ | 0.806 (0.738–0.874) | 0.475 | 0.853 (0.813–0.893) | 0.356 | 0.143
VILI | CIN2+ | 0.881 (0.815–0.947) | 1.472 | 0.859 (0.817–0.900) | 0.495 | 0.048
VILI | CIN3+ | 0.924 (0.864–0.983) | 1.279 | 0.847 (0.812–0.881) | 0.165 | 0.033
VIA vs VILI in sensitivity for CIN2+ (P = 0.003); VIA vs VILI in sensitivity for CIN3+ (P = 0.011). CIN, cervical intraepithelial neoplasia; VIA, visual inspection after application of acetic acid; VILI, visual inspection after application of Lugol's iodine.
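To illustrate how a between-study variance on the logit scale translates into spread on the probability scale, the sketch below forms a rough 95% range for the sensitivity expected in a new study from the Table 2 values for VIA and CIN2+ (summary 0.732, variance of logit sensitivity 0.702). This is a deliberate simplification and an assumption of this example: it treats the summary estimate as known and looks at sensitivity alone, whereas the prediction region in Figure 3 is a joint region for sensitivity and specificity. The function name `approx_prediction_interval` is hypothetical.

```python
import math


def logit(p):
    """Log-odds of a proportion strictly between 0 and 1."""
    return math.log(p / (1.0 - p))


def expit(x):
    """Inverse of logit: maps a log-odds value back to a proportion."""
    return 1.0 / (1.0 + math.exp(-x))


def approx_prediction_interval(summary, var_logit, z=1.96):
    """Rough range for a new study's true value on the probability scale.

    Uses only the between-study variance of the logit-transformed
    accuracy; uncertainty in the summary estimate is ignored
    (an illustrative simplification, not the method used in the paper).
    """
    centre = logit(summary)
    half_width = z * math.sqrt(var_logit)
    return expit(centre - half_width), expit(centre + half_width)


# VIA sensitivity for CIN2+, values taken from Table 2.
low, high = approx_prediction_interval(0.732, 0.702)
print(f"approximate 95% prediction range: {low:.3f} to {high:.3f}")
```

Even under this simplification the range is far wider than the reported 95% confidence interval of 0.665–0.800, which is the pattern the Methods describe as a sign of high heterogeneity.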
Figure 3 Summary receiver operating characteristic curves of visual inspection after application of acetic acid (VIA) and visual inspection after application of Lugol's iodine (VILI) for detection of underlying (a) cervical intraepithelial neoplasia (CIN)2+ or (b) CIN3+. Each plot marks the summary point, the 95% confidence region, and (- - -) the 95% prediction region.
true disease verification. Also, summary QUADAS score, representing the whole screening methodology quality, was also included. We did not perform subgroup analysis for the outcome of CIN3+, however, because of the small number of included studies, and the homogeneity of the study level covariates among those studies. The source of heterogeneity for the outcome CIN3+ may be investigated through individual level covariates. Again, limited by the small number of included studies, we could not include all the covariates simultaneously in the hierarchical model for the outcome CIN2+. It was possible to include only one covariate at a time to perform the subgroup analysis.
For the outcome CIN2+, on subgroup analysis, disease confirmation using histology alone or a combination of multiple tests, colposcopy and histology had the conservative sensitivity of both VIA (61.3%) and VILI (63.5%) testing compared with the combination of colposcopy and histology (79.5% and 92.1%, respectively). Possible explanations for the high sensitivity in disease verification for the combination of colposcopy and histology could be the high correlation between the visual inspection methods and colposcopy. The sensitivity of visual inspection methods may be overestimated if colposcopically-directed biopsy and visual inspection miss similar small lesions35 that could be identified on histology alone or a combination of multiple tests, colposcopy and histology. Pretorius et al. also confirmed that the sensitivity of colposcopy-directed biopsy for CIN2+ was only 57% when multiple random biopsies were taken from all tested women.36 Even though VIA performed by both physician and nurse had a statistically significantly lower sensitivity than when performed by the other three kinds of screeners, of note,
only two studies reported this kind of screener, and the corresponding subgroup summary sensitivity had wide confidence interval (0.103–0.611), so the conclusion that there was significant difference in sensitivities between screeners should be made with caution. In contrast, there was no statistically significant difference in VIA and VILI performed by health worker, nurse, and physician. This suggests that trained health workers and nurses can be effective alternative to physicians for cervical cancer screening using VIA or VILI testing.37 In the present study, VIA had the statistically significantly highest specificity in the setting of hospital and primary health center, and the lowest in the primary health center alone. Even though there was no statistically significant difference in sensitivity of VIA between settings, the primary health center had the highest and the setting of hospital and primary health center had moderate sensitivity. It seems that VIA performed in the former is more likely to identify precancerous and cancerous lesions, and conducted in the latter is more likely to minimize overtreatment. Most of the covariates were not associated with the performance of VILI, perhaps because the number of studies of VILI was lower than that of VIA.
Conclusion
VIA and VILI have the advantage of correctly identifying cervical precancerous and cancerous lesions. Despite lower specificity, the two tests, even a combination of them in parallel, could be good options for cervical screening in low-resource settings. Significant difference in sensitivities of VIA/VILI between different gold standards might provide a proxy for optimization of ongoing cervical cancer screening programs. VIA and VILI need to be evaluated in more settings, and more detailed information is required in order to explore the source of heterogeneity.
Disclosure
None declared.
31. Li N, Ma CP, Sun LX et al. Evaluation on the visual inspection with Lugol's iodine in cervical cancer screening program. Zhonghua Liu Xing Bing Xue Za Zhi 2006; 27: 15–18 (in Chinese).
32. Reitsma JB, Glas AS, Rutjes AW, Scholten RJ, Bossuyt PM, Zwinderman AH. Bivariate analysis of sensitivity and specificity produces informative summary measures in diagnostic reviews. J Clin Epidemiol 2005; 58: 982–990.
33. Nanda K, McCrory DC, Myers ER et al. Accuracy of the Papanicolaou test in screening for and follow-up of cervical cytologic abnormalities: A systematic review. Ann Intern Med 2000; 132: 810–819.
34. Muwonge R, Walter SD, Wesley RS et al. Assessing the gain in diagnostic performance when two visual inspection methods are combined for cervical cancer prevention. J Med Screen 2007; 14: 144–150.
35. Pretorius RG, Kim RJ, Belinson JL, Elson P, Qiao YL. Inflation of sensitivity of cervical cancer screening tests secondary to correlated error in colposcopy. J Low Genit Tract Dis 2006; 10: 5–9.
36. Pretorius RG, Zhang WH, Belinson JL et al. Colposcopically directed biopsy, random cervical biopsy, and endocervical curettage in the diagnosis of cervical intraepithelial neoplasia II or worse. Am J Obstet Gynecol 2004; 191: 430–434.
37. Sherigar B, Dalal A, Durdi G, Pujar Y, Dhumale H. Cervical cancer screening by visual inspection with acetic acid: Interobserver variability between nurse and physician. Asian Pac J Cancer Prev 2010; 11: 619–622.