
ORIGINAL ARTICLE

Differences in endpoints between the Swedish W-E (two county) trial of mammographic screening and the Swedish overview: methodological consequences
L Holmberg, S W Duffy, A M F Yen, L Tabár, B Vitak, L Nyström and J Frisell
J Med Screen 2009;16:73–80 DOI: 10.1258/jms.2009.008103

See end of article for authors' affiliations

Correspondence to: Lars Holmberg, Research Oncology, 3rd floor Bermondsey Wing, Guy's Hospital, Division of Cancer Studies, Guy's Campus, King's College London, SE1 9RT London, UK; Lars.holmberg@kcl.ac.uk
Accepted for publication 11 March 2009

Objectives To characterize and quantify the differences in the number of cases and breast cancer deaths in the Swedish W-E Trial compared with the Swedish Overview Committee (OVC) summaries and to study methodological issues related to trials in secondary prevention.
Setting The study population of the W-E Trial of mammography screening was included in the first (W and E county) and the second (E-county) OVC summary of all Swedish randomized mammography screening trials. The OVC and the W-E Trial used different criteria for case definition and causes of death determination.
Method A Review Committee compared the original data files from W and E county and the first and second OVC. The reason for a discrepancy was determined individually for all non-concordant cases or breast cancer deaths.
Results Of the 2615 cases included by the W-E Trial or the OVC, there were 478 (18%) disagreements. Of the disagreements 82% were due to inclusion/exclusion criteria, and 18% to disagreement with respect to cause of death or vital status at ascertainment. For E-County, the OVC inclusion rules and register based determination of cause of death (second OVC) rather than individual case review (W-E Trial and 1st OVC) resulted in a reduction of the estimate of the effect of screening, but for W-County the difference between the original trial and the OVC was modest.
Conclusions The conclusion that invitation to mammography screening reduces breast cancer mortality remains robust. Disagreements were mainly due to study design issues, while disagreements about cause of death were a minority. When secondary research does not adhere to the protocols of the primary research projects, the consequences of such design differences should be investigated and reported. Register linkage of trials can add follow-up information. The precision of trials with modest size is enhanced by individual monitoring of case status and outcome status such as determination of cause of death.

INTRODUCTION
Mammographic screening reduces mortality from breast cancer in both randomized trials1,2 and in routine service screening.3,4 The Swedish W-E Trial was the first randomized trial to demonstrate a reduction in breast cancer mortality from screening with mammography alone,5 showing a 31% reduction in breast cancer mortality with invitation to screening. This reduction has remained consistent over long-term follow-up.6 In 1987 the Swedish Cancer Society set up an Overview Committee (OVC) to review all the randomized mammography trials in Sweden, the W-E Trial being one of them. The OVC performed two overviews (hereafter called the 1st and the 2nd OVC) by collecting data from all four Swedish mammography trials in a uniform way. However, between the 1st and 2nd OVC there was a difference in the methods of determining cause of death, using an endpoint committee in the 1st OVC and registries only in the 2nd.

Concern was expressed about differences between results reported for the W-E trial by the original trialists and those reported by the Swedish Overview, particularly with respect to numbers of breast cancer deaths.7 It has been pointed out that such differences are an inevitable consequence of the different case definition and determination of cause of death, and the different eligibility criteria of the Swedish Overview1,8–10 as compared with the W-E Trial. These differences, however, raise both particular and general methodological issues related to follow-up of large trials or sets of trials in secondary prevention. These questions include:
(1) What is the magnitude of these differences at an individual rather than aggregate level, and what proportions of the differences are due to inclusion/exclusion criteria of cases in the Swedish Overview and to cause of death determination?

Journal of Medical Screening 2009 Volume 16 Number 2 (www.jmedscreen.com)

(2) What are the reasons for individual differences between the original study and the overview with respect to breast cancer case definition/inclusion and cause of death?
(3) What are the implications of these kinds of differences for endpoint definition in future studies in primary or secondary prevention?

In this paper, we report on a complete audit of breast cancer cases and deaths in the Swedish W-E Trial, as defined by the original trial investigators and by the Swedish Overview. We report the numbers of disagreements at individual level and the reasons for these. We discuss their implications for interpretation of the overview and the original trial (up to the end of 1993 for W-county and to the end of 1996 for E-county) and for design and follow-up of future secondary prevention trials.

BACKGROUND
The Swedish W-E Trial was initiated in 1977 in Kopparberg county (now Dalarna, referred to as W-county hereafter) and in 1978 in Östergötland county (E-county). Small geographical clusters were randomized to invitation to screening (Active Study Population, ASP) or no invitation (Passive Study Population, PSP) within 7 strata in W-county and within 12 strata in E-county. The strata were chosen so that clusters within strata were socioeconomically homogeneous. In W-county, randomization was approximately in the ratio 2:1 for ASP:PSP. In E-county, roughly equal numbers were randomized to the two groups. Entry of strata to the trial was staggered to allow the mammography facilities to cope with the workload. Year of birth cohorts were included to give an approximate age range of 40–74. For example, for a stratum whose randomization date was 1977, years of birth 1903–1937 were included. In total, 77,080 women were randomized to the ASP and 55,985 to the PSP. Details of the age and county breakdown of the study population are given elsewhere.11,12

The screening regime was single-view mammography, on average every 24 months in women aged 40–49 and every 33 months in women aged 50–74. At the end of 1984 a significant 31% reduction in breast cancer mortality was observed in the ASP.5 The PSP was then invited to screening. The trial was closed immediately on completion of the first round of screening in the PSP, and all cases in both arms diagnosed up to and including the end of the first screen of the PSP were followed up for death from breast cancer. In W-county, according to the local trial's records, there were 694 breast cancer cases in the ASP and 359 in the PSP. In E-county, there were 732 breast cancer cases diagnosed in the ASP and 683 in the PSP.6 These cases included both in situ and invasive cancers diagnosed during the trial period.
The OVC defined breast cancer cases as women reported with an invasive breast cancer only (excluding women with cancer in situ) to the Swedish National Cancer Registry (NCR) during each trial's recruitment period, using the reporting date in the NCR as the date of diagnosis. Women with an invasive breast cancer reported to the NCR before the trial start were excluded from the study base, although women diagnosed before 1958, when the NCR was established, could not be excluded. The OVC also accepted a woman as a breast cancer case in the study when there was only a breast cancer death registered in the Swedish Causes of Death Register (CDR), even if she was not registered at the NCR. Thus, the diagnosis could have occurred before trial start, during the trial period or after the trial ended. Deaths from breast cancer were retrieved from the CDR to include all deaths in women with breast cancer as the underlying cause, according to the death certificate. Inclusion criteria in all analyses in the overview were based on the exact age at randomization, as opposed to the Two-County trial, where inclusion was determined on the basis of year of birth.

The OVC retrieved the original randomization file, based on the population register, from the IT co-ordinator responsible for data management in each of the counties. The files were linked by each woman's unique National Registration Number to the corresponding Regional Tumour Registry, which provides the data for the NCR, to obtain verification and date of diagnosis, and to the CDR to obtain date and cause of death. Importantly, the 1st OVC included specialists who, independently of the W-E Trial, determined the cause of death of the breast cancer patients based on case records.
The publications of the 1st OVC gave a relative risk estimate similar to that of the W-E Trial's local committee.2 In the 2nd OVC the decision was made to use the Swedish National Cancer and Death Registries (NCR and CDR) to determine cause of death instead of using the specialist committee, because the combined relative risk using the register data was similar to that of the 1st OVC.13 The 1st OVC conducted a computerized follow-up of both the W-county and E-county data to 31 December 1993 (the first evaluation round, and the last time the W-data were included), and the 2nd OVC continued data collection for E-county only until 31 December 1996 (the second evaluation round, and the last time the E-data were included).2

The four particularly important differences between the W-E study design and the criteria of the 1st and 2nd OVC were:
(1) The original trial defined inclusion and exclusion of women to the trial by year of birth and residence in the relevant geographical areas at the time of randomization. The 1st and 2nd OVC defined the population by year, month and day of birth.
(2) The end-point committees of the W-E trial and the 1st OVC determined individual patient outcome by reviewing all clinical records as identified in the original trial data and in NCR and CDR data. The 2nd OVC used NCR and CDR data only.
(3) The OVC included women as cases if the CDR reported a breast cancer death even if there was no report of a breast cancer diagnosis in the NCR. The W-E Trial included only those women who had a microscopically confirmed breast cancer diagnosed during the trial period.
(4) The W-E Trial included all breast cancer cases (in situ and invasive), whereas the 1st and 2nd OVC both considered only invasive breast cancer cases reported to the NCR, excluding all women reported to the NCR as having carcinoma in situ. The OVC also could not include cases that, through clerical error, had not been reported from the clinics to the NCR.
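Difference (1) can be made concrete with a minimal sketch that contrasts the two age rules. The 40–74 range is taken from the text; the exact cut-offs applied by the OVC, the function names and the example dates are illustrative assumptions, not the trial's or the OVC's actual code.

```python
from datetime import date

# Approximate age range stated in the text; the exact OVC cut-offs are an assumption.
MIN_AGE, MAX_AGE = 40, 74


def eligible_by_birth_year(birth_year: int, randomization_year: int) -> bool:
    """W-E style rule: age is the difference between calendar years only."""
    return MIN_AGE <= randomization_year - birth_year <= MAX_AGE


def exact_age(birth: date, on: date) -> int:
    """Age in completed years on a given date."""
    had_birthday = (on.month, on.day) >= (birth.month, birth.day)
    return on.year - birth.year - (0 if had_birthday else 1)


def eligible_by_exact_age(birth: date, randomization: date) -> bool:
    """Overview style rule: age from exact dates of birth and randomization."""
    return MIN_AGE <= exact_age(birth, randomization) <= MAX_AGE


def classify(birth: date, randomization: date) -> str:
    """Label a woman as concordant or as a category A style disagreement."""
    by_year = eligible_by_birth_year(birth.year, randomization.year)
    by_date = eligible_by_exact_age(birth, randomization)
    return "agree" if by_year == by_date else "category A disagreement"


# Hypothetical women in a stratum randomized in 1977 (birth cohorts 1903-1937).
print(classify(date(1937, 12, 15), date(1977, 3, 1)))   # young end of the range
print(classify(date(1902, 12, 20), date(1977, 6, 1)))   # old end of the range
print(classify(date(1920, 5, 5), date(1977, 6, 1)))     # mid-range
```

The first two hypothetical women fall on opposite sides of the two rules, which is how a difference in age definition can affect both ends of the age spectrum.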

METHODS
In 2006 the Swedish Cancer Society set up a Joint Review Committee (JRC) including members of the 1st and 2nd OVCs and the project leaders of the W-E trial to investigate the sources of disagreement between the results published by the trialists and the OVC (the 1st OVC for W-county and the 2nd OVC for E-county data). The lists of women with breast cancer according to the trialists and the OVC were compared. Where necessary, clinical records were retrieved. After the trialists and the OVC had independently investigated each case in the two lists, the JRC developed a classification scheme of the differences (Table 1). The records of breast cancer cases and deaths according to the local endpoint committee were compared with those of the OVC using the Swedish National Registration Numbers of the subjects for linkage. The deaths through 1993 were compared for W-county and through 1996 for E-county, as these dates were respectively the most recent Swedish Overview analyses to include each county.8,14 The JRC reviewed each disagreement between the two datasets with respect to either case definition or cause of death. The JRC determined the reasons for each individual disagreement. As a final result, the trialists also accepted some women as additional cancer cases in their trials on the basis of new information about migrated women and clerical errors. The original datasets with these additions are called the JRC conclusive dataset.

Paired significance tests between OVC and W-E endpoints were carried out using McNemar methods.15 Associations of the likelihood of disagreement with age, county and trial arm were assessed using the chi-squared test. Relative risks and 95% confidence intervals on these were calculated using Poisson regression.
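As an illustration of the paired comparison, the following minimal sketch applies a McNemar test to the W-county deaths included in both datasets, using the discordant counts from the cross-tabulation in the Results (Table 2): 20 deaths classed as breast cancer by the trial but as other causes by the 1st OVC, and 4 the other way round. Which exact McNemar variant and which pairs the authors used is not stated, so the exact binomial version and the function name here are assumptions.

```python
from math import comb


def mcnemar(b: int, c: int) -> tuple[float, float]:
    """McNemar test on discordant pair counts b and c.

    Returns the uncorrected chi-squared statistic (1 df) and the exact
    two-sided binomial p-value under the null that b and c are equally likely.
    """
    n = b + c
    if n == 0:
        raise ValueError("no discordant pairs")
    chi2 = (b - c) ** 2 / n
    k = min(b, c)
    tail = sum(comb(n, i) for i in range(k + 1)) / 2 ** n
    return chi2, min(1.0, 2 * tail)


# W-county, both arms combined (Table 2, 'Total' block):
# trial BCD / OVC DOC = 20, trial DOC / OVC BCD = 4.
chi2, p = mcnemar(20, 4)
print(f"chi2 = {chi2:.2f}, exact p = {p:.4f}")
```

Only the discordant pairs enter the test; the 216 deaths that both sources attribute to breast cancer and the 203 that both attribute to other causes carry no information about the direction of disagreement.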

RESULTS
According to the W-E trial records the total numbers of women for W-county were 38,589 in the ASP and 18,582 in the PSP; for E-county the numbers were 38,481 and 37,403. The corresponding figures in the OVC records were 38,562, 18,478, 38,405 and 37,145. These differences, of the order of less than 1%, were not influential in the estimation of the primary results.

Table 1 Classification of potential differences between the W-E trial and the overview

Category A. Difference in the definition of age (accounts for differences at both ends of the age spectrum)
  W-E trial: Age calculation was based on the year of randomization and the year of birth of the trial attendee.
  Overview: Age calculation was based on exact date of birth and randomization (day/month/year).

Category B. Definition of date of diagnosis (accounts for differences at trial start and at trial end)
  W-E trial: Date of operation. Women who had a screening or clinical diagnosis at the end of the trial, before closing date, but were operated on after the closing date, have been included.
  Overview: Date of notification to the NCR, which, according to registry principles, is the first notification to the NCR of a cancer (often the date of a positive cytology).

Category C. Difference in the principles of use of causes of death registry
  W-E trial: Included only cases diagnosed within the trial period.
  Overview: Included cases diagnosed within the trial period and breast cancer deaths registered in the CDR even if they were not registered at the NCR (thus the diagnosis could have occurred before trial start, during the trial period or after the trial ended).

Category D. Cases not retrievable from the NCR at the time of the overview
  W-E trial: Included all cases, including cancer in situ, during the trial period when there was clinical information available on a breast cancer, even if a case was not registered at the NCR due to administrative errors, and excluded all cases with a history of breast cancer before the study started.
  Overview: Included only invasive cases retrievable from the NCR or identified at the CDR at the time when the overview was conducted, but could not exclude those cases that were diagnosed before the start of the NCR in 1958.

Category G. Differences in the determination of cause of death
  W-E trial: Cause of death was determined by the local trial committee based on data available in patients' medical records.
  Overview: 1st OVC used an independent end-point committee; the 2nd OVC used cause of death data registered in the CDR.

Category I. Erroneous inclusion in the W-E database
  Explanation: Clerical error or incorrect national registration number.

Category K. Miscellaneous clerical errors and other reasons
  Explanation: Includes misspecification of eligibility or cause of death due to clerical error, erroneous registration in NCR or CDR or administrative loss of information. Also includes individual migration, where a subject received a breast cancer diagnosis outside the study areas, and was therefore in the overview but missing from the W-E database.

NCR, National cancer register; CDR, National cause of death register

W-county
Table 2 shows all cases included in either the local trial records for W-county or the 1st OVC records or both, with the endpoint in each data set cross-tabulated. Of the 1053 cases included in the local trial records, the OVC included 925 cases (88%). Conversely, of the 972 cases included in the OVC records, the local trial included 925 breast cancer cases (95%). Of the 443 deaths to 1993 included in both datasets, there were 24 (5%) disagreements regarding determination of cause of death (type G disagreement). Of the total 199 disagreements, whether with respect to case inclusion or to cause of death, 175 (88%) pertained to case inclusion rather than cause of death. For both the ASP and the PSP, the overview was less likely to classify a death as from breast cancer. The magnitude of this tendency did not differ significantly between ASP and PSP.

Table 3 shows the reasons for disagreement between the two breast cancer-case datasets, in the ASP and PSP separately. The largest group of disagreements in the ASP was type D, mainly due to women with screen-detected in situ lesions included in the W-E dataset but not included by the overview. In the PSP, most of the disagreements were of type B, relating to date of diagnosis. These disagreements mostly resulted from women in the PSP who were diagnosed at the first screen but, through delays in reporting, were not entered into the NCR until after closure of the trial. These women were considered by the OVC to have been diagnosed only at the reporting date to the register and were thus excluded in the OVC (see Table 1, category B). Disagreements with respect to death from breast cancer were mainly due to category G (47%; disagreement about cause of death) and C (29%; use of cause of death register without reference to date of diagnosis) in the ASP, and to G (disagreement about cause of death) and B (32%; definition of date of diagnosis) in the PSP.

Table 4 shows the breast cancer deaths and corresponding relative risks (RR) from the W-arm of the W-E trial, the OVC, and those derived after review of all information by the JRC and the resulting conclusive dataset (i.e. the original trial data plus correction for the clerical errors and cases lost to the trialists due to migration). The OVC result is more conservative than that of the original trial and the result based on the JRC conclusive dataset. All analyses show a significant mortality reduction in the ASP.

Table 2 W-county outcomes tabulated against overview outcomes (agreements lie on the diagonal)

                                  W-county outcome
Study group  1st OVC outcome  Incl, BCD  Incl, DOC  Incl, alive  Not incl  Total
PSP          Incl, BCD               95          0            0         4     99
PSP          Incl, DOC                8         50            0         0     58
PSP          Incl, alive              0          0          138         0    138
PSP          Not incl                 7          7           54         0     68
PSP          Total                  110         57          192         4    363
ASP          Incl, BCD              121          4            0        16    141
ASP          Incl, DOC               12        153            0        15    180
ASP          Incl, alive              0          0          344        12    356
ASP          Not incl                 2         10           48         0     60
ASP          Total                  135        167          392        43    737
Total        Incl, BCD              216          4            0        20    240
Total        Incl, DOC               20        203            0        15    238
Total        Incl, alive              0          0          482        12    494
Total        Not incl                 9         17          102         0    128
Total        Total                  245        224          584        47   1100

Incl, included; BCD, breast cancer death; DOC, death from other causes; PSP, passive study population, not invited; ASP, active study population, invited

Table 3 Categorized disagreements between W-county trial records and 1st OVC records

Number (%) of disagreements in cases
Disagreement category  ASP        PSP       Total
A                      14 (12)    3 (4)     17 (8)
B                      5 (4)      52 (65)   57 (29)
C                      13 (11)    2 (2)     15 (7)
D                      51 (43)    12 (15)   63 (32)
G                      16 (13)    8 (10)    24 (12)
I                      0 (0)      0 (0)     0 (0)
K                      20 (17)    3 (4)     23 (12)
Total                  119 (100)  80 (100)  199 (100)

Number (%) of disagreements for breast cancer death
Disagreement category  ASP        PSP       Total
A                      2 (6)      0 (0)     2 (4)
B                      0 (0)      6 (32)    6 (11)
C                      10 (29)    2 (11)    12 (23)
D                      2 (6)      1 (5)     3 (6)
G                      16 (47)    8 (42)    24 (45)
I                      0 (0)      0 (0)     0 (0)
K                      4 (12)     2 (11)    6 (11)
Total                  34 (100)   19 (100)  53 (100)

PSP, passive study population, not invited; ASP, active study population, invited
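The point estimates reported in Tables 4 and 7 can be approximated from the death counts and numbers of subjects alone. The sketch below computes a crude relative risk with a Wald interval on the log scale. This is a simplification assumed for illustration: the paper used Poisson regression, so the interval limits here need not match the published ones exactly, and the function name is hypothetical.

```python
from math import exp, sqrt


def crude_rr(deaths_asp: int, n_asp: int, deaths_psp: int, n_psp: int,
             z: float = 1.96) -> tuple[float, float, float]:
    """Crude relative risk (ASP vs PSP) with an approximate 95% Wald interval."""
    rr = (deaths_asp / n_asp) / (deaths_psp / n_psp)
    # Standard error of log(RR) for two independent cumulative risks.
    se_log = sqrt(1 / deaths_asp - 1 / n_asp + 1 / deaths_psp - 1 / n_psp)
    return rr, rr * exp(-z * se_log), rr * exp(z * se_log)


# (ASP deaths, ASP subjects, PSP deaths, PSP subjects) as printed in Tables 4 and 7.
ENDPOINTS = {
    ("W", "original trial"): (135, 38589, 110, 18582),
    ("W", "1st OVC"): (141, 38589, 99, 18582),
    ("W", "JRC conclusive"): (136, 38589, 111, 18582),
    ("E", "original trial"): (163, 38309, 200, 37403),
    ("E", "2nd OVC"): (175, 38309, 189, 37403),
    ("E", "JRC conclusive"): (162, 38309, 206, 37403),
}

for (county, source), counts in ENDPOINTS.items():
    rr, low, high = crude_rr(*counts)
    print(f"{county}-county, {source}: RR {rr:.2f} ({low:.2f}-{high:.2f})")
```

Under these assumptions the crude point estimates agree with the published relative risks to two decimals, and only the E-county 2nd OVC interval spans 1.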

Table 4 Trial mortality result for W-county from original local trial endpoint, 1st OVC endpoint and the JRC conclusive dataset

                                           ASP     PSP     RR (95% CI)
W original breast cancer deaths            135     110     0.59 (0.45–0.76)
OVC breast cancer deaths                   141     99      0.69 (0.53–0.90)
JRC conclusion breast cancer deaths for W  136     111     0.59 (0.45–0.76)
Number of subjects                         38,589  18,582

PSP, passive study population, not invited; ASP, active study population, invited

E-county
Table 5 shows cross-tabulation of the local trial endpoint records for E-county with the 2nd OVC records, for all women with breast cancer in either or both datasets. Of the 1415 women with breast cancer included in the local trial records, the 2nd OVC included 1298 (92%). Of the 1398 cases included in the 2nd OVC records, the local trial included 1298 (93%). Of the 655 deaths to 1996 included in both datasets, there were 53 (8%) disagreements. Of the total 279 disagreements, 217 (78%) pertained to case inclusion, 53 (19%) to cause of death and 9 (3%) to vital status at 31 December 1996. For both the ASP and the PSP, the 2nd OVC was less likely to classify a death as from breast cancer. This tendency was significantly stronger in the PSP (59% vs. 52%; P = 0.03).

The reasons for disagreements are shown in Table 6. As with W-county, the largest group, 40% of the disagreements in the ASP, are of type D, absence of trial cases from the NCR. For the PSP, however, similar proportions of disagreements were observed in categories D (19%), absence of the case from the NCR, G (22%), disagreement about cause of death, and K (26%), miscellaneous clerical errors and other reasons. With respect to breast cancer death, disagreements were dominated by category G (disagreement about cause of death) and C (use of the cause of death register without reference to date of diagnosis) in both the ASP and PSP.

Table 7 shows the E-county trial result with respect to breast cancer mortality using the original trial endpoint, the 2nd OVC endpoint and the conclusive endpoint after review of all sources by the JRC (i.e. the original trial data plus correction for the clerical errors and cases lost to the trialists due to migration). The trial endpoint and the JRC conclusive dataset both show a significant 20–23% reduction in mortality, whereas the 2nd OVC result shows a non-significant 10% reduction.

Table 5 E-county outcomes tabulated against 2nd OVC outcomes (agreements lie on the diagonal)

                                  E-county outcome
Study group  2nd OVC outcome  Incl, BCD  Incl, DOC  Incl, alive  Not incl  Total
PSP          Incl, BCD              164          4            2        19    189
PSP          Incl, DOC               27        128            3        20    178
PSP          Incl, alive              0          1          296        10    307
PSP          Not incl                 9          8           41         0     58
PSP          Total                  200        141          342        49    732
ASP          Incl, BCD              147          8            0        20    175
ASP          Incl, DOC               14        163            2        21    200
ASP          Incl, alive              0          1          338        10    349
ASP          Not incl                 2         13           44         0     59
ASP          Total                  163        185          384        51    783
Total        Incl, BCD              311         12            2        39    364
Total        Incl, DOC               41        291            5        41    378
Total        Incl, alive              0          2          634        20    656
Total        Not incl                11         21           85         0    117
Total        Total                  363        326          726       100   1515

Incl, included; BCD, breast cancer death; DOC, death of other causes; PSP, passive study population, not invited; ASP, active study population, invited

Table 6 Categorized disagreements between E-county trial records and 2nd OVC records

Number (%) of disagreements in cases
Disagreement category  ASP        PSP        Total
A                      21 (16)    17 (12)    38 (14)
B                      0 (0)      18 (12)    18 (6)
C                      15 (11)    10 (7)     25 (9)
D                      54 (40)    28 (19)    83 (30)
G                      22 (16)    31 (22)    52 (19)
I                      3 (2)      3 (2)      6 (2)
K                      20 (15)    37 (26)    57 (20)
Total                  135 (100)  144 (100)  279 (100)

Number (%) of disagreements for breast cancer death
Disagreement category  ASP        PSP        Total
A                      5 (12)     3 (5)      8 (7)
B                      0 (0)      4 (7)      4 (4)
C                      15 (34)    10 (16)    25 (24)
D                      1 (2)      4 (7)      5 (5)
G                      22 (50)    30 (49)    52 (49)
I                      1 (2)      0 (0)      1 (1)
K                      0 (0)      10 (16)    10 (10)
Total                  44 (100)   61 (100)   105 (100)

PSP, passive study population, not invited; ASP, active study population, invited

Table 7 Trial mortality result for E-county from original local trial endpoint, 2nd OVC endpoint and the JRC conclusive dataset

                                           ASP     PSP     RR (95% CI)
Original E breast cancer deaths            163     200     0.80 (0.64–0.98)
OVC breast cancer deaths                   175     189     0.90 (0.72–1.12)
JRC conclusion breast cancer deaths for E  162     206     0.77 (0.62–0.95)
Number of subjects                         38,309  37,403

PSP, passive study population, not invited; ASP, active study population, invited

Associations with disagreement
We also investigated whether study group (ASP/PSP), county or age were significantly related to the likelihood of disagreement about breast cancer death. In the 685 cases classified as breast cancer death by either the W and E local committees or the OVC or both, there was no significant association of study group with disagreement (P = 0.2). There was a higher proportion of disagreement in E-county than in W-county, but this did not attain statistical significance (P = 0.09). There was, however, a significant effect of patient age at the time of randomization on the probability of disagreement (P < 0.001). In both counties, the disagreement increased with age (Figure 1).

Figure 1 Percentage disagreement between W-E and 2nd OVC by age, in 604 cases classed as breast cancer deaths by one or both sources

DISCUSSION
In this study, the Swedish Cancer Society's Joint Review Committee (JRC) investigated disagreements between the breast cancer incidence and death data as recorded in the original Swedish Two-County Trial, based on individual patient records and determination of cause of death by an expert committee, and those in the 2nd OVC, based on the National Cancer Registry and Cause of Death Register. For the purposes of this study, we had full access to original W-E trial data, original data collected for the OVC, individual medical records, and register data from the regional tumour registries for the respective counties. The registration of a new diagnosis of breast cancer is mandatory by law in Sweden and the completeness of registration of breast cancer is over 98%.16 Thus, we were able to determine the reason for discrepancy in every individual case and no discrepancies were left unexplained.

Our main empirical finding is that, of the 2615 cases included by the W-E Trial or the OVC, there were 478 (18%) disagreements about inclusion/exclusion of women into the trial or determination of the cause of death. The vast majority of these pertained to a disagreement in inclusion/exclusion and not to disagreement in determination of cause of death. The disagreements were in the great majority of cases due to OVC study design decisions pertaining to issues such as definition of age and last date of inclusion into the study, and use of a register rather than clinical records for case definition and cause of death determination. Disagreement about whether a death included in both the W-E Trial and the OVC was from breast cancer or not was relatively rare. We also found that the likelihood of disagreement about the cause of death was not significantly affected by county or trial arm. Such disagreement was, however, significantly more likely in older patients.

These findings have implications both for the interpretation of screening effects and for methodological issues in overviewing original research. The combined results of the two counties showed a significant breast cancer mortality reduction associated with the offer of screening by any of the three endpoint criteria. Using the JRC conclusions, the combined RR was 0.69 (95% CI 0.58–0.83). Thus, the overall interpretation was not sensitive to these differences in design. In W-county, the result was significant by any of the three criteria, whereas in E-county, the result was significant using the original trial endpoint and the JRC conclusive review endpoint, but not statistically significant using the 2nd OVC endpoint.
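The county comparison of disagreement about breast cancer death can be sketched as a Pearson chi-squared test on a 2×2 table. The counts used here are derived from the published tables rather than stated in this form by the authors: 269 W-county and 416 E-county women were classed as breast cancer deaths by at least one source (685 in total), of whom 53 and 105 respectively were disagreements (Tables 3 and 6). Treating the authors' test as an uncorrected 2×2 chi-squared is an assumption, as is the function name.

```python
from math import erfc, sqrt


def chi2_2x2(a: int, b: int, c: int, d: int) -> tuple[float, float]:
    """Pearson chi-squared (no continuity correction) for the table [[a, b], [c, d]].

    Returns the statistic and its p-value on 1 degree of freedom.
    """
    n = a + b + c + d
    chi2 = n * (a * d - b * c) ** 2 / ((a + b) * (c + d) * (a + c) * (b + d))
    # For 1 df, P(X > x) = erfc(sqrt(x / 2)).
    return chi2, erfc(sqrt(chi2 / 2))


# Rows: county; columns: disagreement, agreement on breast cancer death.
w_disagree, w_agree = 53, 216    # W-county: 269 classed as BCD by either source
e_disagree, e_agree = 105, 311   # E-county: 416 classed as BCD by either source

chi2, p = chi2_2x2(w_disagree, w_agree, e_disagree, e_agree)
print(f"W: {w_disagree / (w_disagree + w_agree):.1%} disagreement, "
      f"E: {e_disagree / (e_disagree + e_agree):.1%} disagreement")
print(f"chi2 = {chi2:.2f}, p = {p:.3f}")
```

With these derived counts the sketch gives a p-value of roughly 0.09, in line with the value reported for the county comparison.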
The JRC conclusive result included some women with breast cancer previously missed by the trialists due to migration, but picked up by the NCR or CDR. The remit of the JRC was not to determine whether one or the other of the endpoints was correct. However, it is clear from the E-county results that a combination of differing cause of death determinations and inclusion/exclusion rules made a crucial difference to the primary result. It is highly relevant for the field of secondary prevention to understand how such modest disagreements cause such a difference to the outcome in a trial with a total of 133,065 subjects. The answer is that the disagreements only needed to impact on the small minorities of subjects classified as dying from breast cancer within the trial arm subgroups of one geographical stratum (E-county) within the larger trial. In the ASP of E-county, disagreements with respect to cause of death and eligibility for inclusion caused a loss of 16 and a gain of 28 breast cancer deaths, a net increase of 12 breast cancer deaths. In the PSP, there was a loss of 36 breast cancer deaths and a gain of 25, a net loss of 11 deaths (Table 5). Thus the 2nd OVC classification of eligibility for inclusion and cause of death gave a 7% higher death rate in the ASP and a 6% lower death rate in the PSP, sufficient to convert a statistically significant 20% reduction in mortality to a statistically non-significant 10% reduction. It should be noted that if the inclusion criteria had been identical and the only difference had been the disagreements over cause of death, the result in E-county would still have been rendered non-significant.

The effect of misclassification of exposure factors has been extensively studied in epidemiology,17–19 and when it is non-differential with respect to disease outcome, it tends to dilute estimated effects. Although less fully researched, the misclassification of outcome has also been shown to cause underestimation of exposure/outcome associations.20 Disagreement rates between OVC and W-E classifications were 18% in both counties. Discrepancies of this magnitude are suggestive of misclassification probabilities of 10%, and would be likely to lead to dilution of observed associations by approximately 33%.21 The differences between W-E and OVC are smaller than this for W-county and rather larger for E-county. That they are proportionally larger for E-county is likely to be due to the fact that disagreement rates were differential between trial arms. The implications of this are that, in general, the poorer the classification, the greater the potential for missing a true effect; that the presence of differential misclassification may increase the potential bias; and that the more thorough the classification effort, the more sensitive the comparison is likely to be.

All these circumstances underline the importance of using an expert panel for determining cause of death when the individual study units contain few events. Others have regarded the determinations of such an expert committee as the gold standard,22 even when they have concluded that national death register information is adequate in comparison.23 The OVC obtained results closer to those of the original trial when the 1st OVC used an expert endpoint committee.2 The finding that the disagreement about cause of death increased with age is also of general interest.
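The E-county gains and losses of breast cancer deaths quoted in this discussion (a loss of 16 and a gain of 28 in the ASP, a loss of 36 and a gain of 25 in the PSP) can be recomputed from the Table 5 cross-tabulation. The sketch below is only an arithmetic check; the data layout and function name are illustrative choices.

```python
# Table 5 counts. Outer key: study arm; inner key: E-county trial outcome;
# tuple: counts by 2nd OVC outcome in the order given by OUTCOMES.
OUTCOMES = ("BCD", "DOC", "alive", "not_incl")
TABLE5 = {
    "ASP": {
        "BCD": (147, 14, 0, 2),
        "DOC": (8, 163, 1, 13),
        "alive": (0, 2, 338, 44),
        "not_incl": (20, 21, 10, 0),
    },
    "PSP": {
        "BCD": (164, 27, 0, 9),
        "DOC": (4, 128, 1, 8),
        "alive": (2, 3, 296, 41),
        "not_incl": (19, 20, 10, 0),
    },
}


def bcd_shift(arm: str) -> dict[str, int]:
    """Breast cancer deaths (BCD) lost and gained when moving from trial to OVC endpoint."""
    rows = TABLE5[arm]
    agreed = rows["BCD"][0]                           # BCD in both sources
    trial_bcd = sum(rows["BCD"])                      # all trial BCD
    ovc_bcd = sum(rows[t][0] for t in OUTCOMES)       # all OVC BCD
    return {
        "trial_bcd": trial_bcd,
        "ovc_bcd": ovc_bcd,
        "loss": trial_bcd - agreed,
        "gain": ovc_bcd - agreed,
        "net": ovc_bcd - trial_bcd,
    }


for arm in ("ASP", "PSP"):
    s = bcd_shift(arm)
    change = s["ovc_bcd"] / s["trial_bcd"] - 1
    print(f"{arm}: {s['trial_bcd']} -> {s['ovc_bcd']} deaths "
          f"(loss {s['loss']}, gain {s['gain']}, net {s['net']:+d}, {change:+.1%})")
```

The opposite signs of the net change in the two arms are what move the relative risk towards 1 under the 2nd OVC endpoint.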
The increase in disagreement with age accords with the findings of the 1st OVC, where four clinicians not involved in the trials independently determined cause of death and the discordance at the initial review was 5%, 5%, 13% and 19% in women aged 40–49, 50–59, 60–69 and 70–74 years at randomization, respectively.13 This probably reflects an increasing difficulty in determining cause of death with age, for several reasons: a mixed clinical picture due to increasing co-morbidity, death occurring more often at home or in a nursing home without a clinical examination shortly before death, a very low probability of an autopsy, and increased uncertainty about the origin of any metastases if another malignancy has also been diagnosed during follow-up. With long-term follow-up, the information that the woman is a trial participant, and that determination of cause of death may therefore be important, may also be lost.

The results of the JRC review show that the disagreements were due to design differences between the clinical intervention trial approach employed in the original W-E Trial and the register-study design used by the OVC. This leads to a more general observation: design decisions in either an original study or a subsequent overview that may at first glance seem trivial, e.g. defining a date for the end of the trial, can influence basic and important study features such as the number of included subjects. Thus, design differences between original studies and overviews have to be taken into account when the overview does not adhere to the original designs, and it should be investigated whether the interpretation is sensitive to such design conflicts. An example here is the inclusion of women with in situ tumours as cases in the original study, contrasted with the decision to include only those registered with an invasive cancer in the 1st and the 2nd OVC. This decision made an especially large difference for the ASP. Thus, seemingly general deviations from the original study design may not be neutral to the evaluation of the randomized trial. In this case the decision contributed above all to the different number of cases reported in the original trial as compared with the 1st and 2nd OVC, but little to the evaluation of breast cancer mortality.

Is there a role for registry data in evaluation of primary or secondary interventions? It would definitely seem so where the research involves millions of person-years and large numbers of cause-specific deaths, as in large primary and secondary prevention studies, so that misclassifications are likely to be heavily outnumbered by reliable observations.3,4 For individual trials of smaller size, however, it is more reliable to have case status and cause of death determined individually by an expert committee.

Authors' affiliations
A M F Yen, Cancer Research UK Centre for Epidemiology, Mathematics and Statistics, Wolfson Institute of Preventive Medicine, London, UK
L Tabár, Professor of Radiology, University of Uppsala, School of Medicine, Department of Mammography, Falun Central Hospital, Falun, Sweden
B Vitak, Consultant Radiologist, Division of Radiological Sciences, Department of Medical and Health Sciences, Linköping University, Linköping, Sweden
L Nyström, Associate Professor of Epidemiology, Department of Public Health and Clinical Medicine, Umeå University, Umeå, Sweden
J Frisell, Professor of Surgery, Department of Molecular Medicine and Surgery, Unit of Breast Surgery, Karolinska Institute, Solna, Sweden

ACKNOWLEDGEMENTS
The study was supported by grants from the Swedish Cancer Society and the American Cancer Society. We thank Sherry Yueh-Hsia Chiu from the Institute of Preventive Medicine, Division of Biostatistics, College of Public Health at the National Taiwan University for excellent help and Robert Smith from the American Cancer Society for valuable discussions and advice.

Conflict of interest and contributions: The authors are associated with the W-E trial and the Overview as described in contributions and otherwise have no conflict of interest in relation to this work. Lars Holmberg, Stephen Duffy and Jan Frisell oversaw the comparison and coordinated the analyses. Lars Holmberg and Stephen Duffy drafted the report. Jan Frisell and Lars Holmberg were the principal investigators for the grants that supported the study. László Tabár and Bedrich Vitak were the principal investigators for the W and E trial parts, respectively, and provided all data for the W-E trial. Jan Frisell and Lennarth Nyström were the principal and the coordinating investigators for the Overview committee, respectively, and Lennarth Nyström provided the Overview data. Stephen Duffy and Amy Yen performed the statistical analyses. All authors had full access to the data, contributed to the comparison process and the interpretation of the analyses, and revised the manuscript for intellectual content. Lars Holmberg is the guarantor for the study.

CONCLUSION
The following points are suggested by the above results:

(1) The conclusion that invitation to mammography screening was associated with a significant breast cancer mortality reduction remains robust after a full examination of disagreements between the original Two-County Trial endpoints and those of the Swedish overview.

(2) Disagreements about actual cause of death were a minority of the overall disagreements and were common only for older cases; the majority of disagreements related to inclusion or exclusion.

(3) The use of the overview inclusion criteria and the national registry data for determination of breast cancer deaths led to a substantial change in the result for one of the two counties, illustrating that non-differential misclassification of the main endpoint tends to drive results towards the null. Thus, for trials of modest size it would appear more prudent to rely on trial logistics with close individual monitoring of case status, presence of covariates and outcome status, such as determination of cause of death based on all available clinical information.

(4) When secondary research does not adhere to the protocols of the primary research projects included, the consequences of such design differences should be investigated and reported. Seemingly trivial design decisions may have a significant impact on the result and are not always neutral to the randomized design.
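The tendency of non-differential endpoint misclassification to pull results towards the null can be illustrated with a small numerical sketch. The code below is not the model of reference 21 and does not attempt to reproduce the 33% dilution quoted in the Discussion; it simply applies the same assumed error rates to both trial arms and compares the true and observed relative risks. All input values (`control_risk`, `true_rr`, `error`) and both function names are hypothetical and chosen only for illustration; they are not trial data.

```python
def observed_risk(true_risk, sensitivity, specificity):
    """Risk of the *recorded* outcome when the true outcome is misclassified."""
    # True events recorded as events, plus true non-events recorded as events.
    return sensitivity * true_risk + (1.0 - specificity) * (1.0 - true_risk)


def observed_relative_risk(control_risk, true_rr, sensitivity, specificity):
    """Relative risk seen when the same error rates apply to both arms."""
    study_risk = control_risk * true_rr
    return (observed_risk(study_risk, sensitivity, specificity)
            / observed_risk(control_risk, sensitivity, specificity))


# Hypothetical inputs (assumptions for illustration, not W-E or OVC data):
control_risk = 0.30  # assumed true proportion of breast cancer deaths, control arm
true_rr = 0.70       # assumed true relative risk in the invited arm
error = 0.10         # assumed misclassification probability in each direction

rr_obs = observed_relative_risk(control_risk, true_rr, 1 - error, 1 - error)

# Share of the true relative reduction (1 - RR) that is lost to misclassification.
dilution = 1 - (1 - rr_obs) / (1 - true_rr)

print(f"true RR = {true_rr:.2f}, observed RR = {rr_obs:.3f}")
print(f"share of the true relative reduction lost: {dilution:.0%}")
```

Under these assumed inputs the observed relative risk is about 0.79 rather than 0.70, so the apparent benefit shrinks even though the errors are identical in both arms. The size of the dilution depends on the assumed baseline risk and error rates, so the figure should be read as an illustration of direction rather than magnitude.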

REFERENCES
1 Smith RA, Duffy SW, Gabe R, Tabar L, Yen AMF, Chen HHT. The randomized trials of breast cancer screening: what have we learned? Radiol Clin Nth Amer 2004;42:793–806
2 Nyström L, Rutqvist LE, Wall S, et al. Breast cancer screening with mammography: overview of the Swedish randomised trials. Lancet 1993;341:973–8
3 Swedish Organised Service Screening Evaluation Group. Reduction in breast cancer mortality from organised service screening with mammography: 1. Further confirmation with extended data. Cancer Epidemiol Biomarkers Prev 2006;15:45–51
4 Swedish Organised Service Screening Evaluation Group. Reduction in breast cancer mortality from organised service screening with mammography: 2. Validation with alternative analytic methods. Cancer Epidemiol Biomarkers Prev 2006;15:52–56
5 Tabár L, Fagerberg CJ, Gad A, et al. Reduction in mortality from breast cancer after mass screening with mammography. Randomised trial from the Breast Cancer Screening Working Group of the Swedish National Board of Health and Welfare. Lancet 1985;325:829–32
6 Tabar L, Vitak B, Chen HH, Duffy SW, Smith RA. The Swedish Two-County Trial twenty years later: updated mortality results and new insights from long term follow-up. Radiol Clin Nth Amer 2000;38:625–51
7 Gøtzsche PC, Olsen O. Is screening for breast cancer with mammography justifiable? Lancet 2000;355:129–33

...............

Authors' affiliations
L Holmberg, Professor of Cancer Epidemiology, King's College London, Medical School, Division of Cancer Studies, London, UK
S W Duffy, Professor of Breast Cancer Screening, Cancer Research UK Centre for Epidemiology, Mathematics and Statistics, Wolfson Institute of Preventive Medicine, London, UK

8 Nyström L, Andersson I, Bjurstam N, Frisell J, Nordenskjöld B, Rutqvist LE. Long-term effects of mammography screening: updated overview of the Swedish randomised trials. Lancet 2002;359:909–19
9 Freedman DA, Petitti DB, Robins JM. On the efficacy of screening for breast cancer. Int J Epidemiol 2004;33:43–55
10 Duffy SW. Interpretation of the breast screening trials: a commentary on the recent paper by Gøtzsche and Olsen. The Breast 2001;10:209–12
11 Tabar L, Fagerberg G, Duffy SW, Day NE, Gad A, Grontoft O. Update of the Swedish two-county program of mammographic screening for breast cancer. Radiol Clin Nth Amer 1992;30:187–210
12 Duffy SW, Tabar L, Vitak B, et al. The Swedish Two-County Trial of mammographic screening: cluster randomisation and endpoint evaluation. Ann Oncol 2003;39:1746–54
13 Nyström L, Larsson L-G, Rutqvist LE, et al. Determination of cause of death among breast cancer cases in the Swedish mammography screening trials: a comparison between official statistics and validation by an endpoint committee. Acta Oncol 1995;34:145–52
14 Larsson LG, Andersson I, Bjurstam N, et al. Updated overview of the Swedish randomised trials on breast cancer screening with mammography: age group 40–49 at randomisation. J Natl Cancer Inst Monogr 1997;22:57–61
15 McNemar Q. Note on the sampling error of the difference between correlated proportions or percentages. Psychometrika 1947;12:153–7
16 Barlow L, Westergren K, Holmberg L, Talbäck M. The completeness of the Swedish Cancer Register - a sample survey for year 1998. Acta Oncol 2009;48:27–33
17 Freedman LS, Midthune D, Carroll RJ, Kipnis V. A comparison of regression calibration, moment reconstruction and imputation for adjusting for covariate measurement error in regression. Stat Med 2008;27:5195–216
18 Wong MY, Day NE, Luan JA, Wareham NJ. Estimation of magnitude in gene-environment interactions in the presence of measurement error. Stat Med 2004;23:987–98
19 Bashir SA, Duffy SW. The correction of risk estimates for measurement error. Ann Epidemiol 1997;7:154–64
20 Duffy SW, Warwick J, Williams AR, et al. A simple model for potential use with a misclassified binary outcome in epidemiology. J Epidemiol Comm Hlth 2004;58:712–7
21 Duffy SW, Maximovitch DM, Day NE. External validation, repeat determination, precision of risk estimation in misclassified exposure data in epidemiology. J Epidemiol Comm Hlth 1992;46:620–24
22 Miller AB. Design of cancer screening trials/randomized trials for evaluation of cancer screening. World J Surg 2006;30:1152–62
23 Mäkinen T, Karhunen P, Aro J, Lahtela J, Määttänen L, Auvinen A. Assessment of causes of death in a prostate cancer screening trial. Int J Cancer 2008;122:413–17

Journal of Medical Screening 2009 Volume 16 Number 2 www.jmedscreen.com
