
Neuromuscular Disorders 16 (2006) 459–467
www.elsevier.com/locate/nmd

A systematic review of diagnostic studies in myasthenia gravis


Michael Benatar *
Department of Neurology, Emory University, 1365A Clifton Road NE, Atlanta, GA 30322, USA
Received 29 March 2006; received in revised form 5 May 2006; accepted 9 May 2006

Abstract
We performed a systematic review to identify studies that reported the accuracy of tests for the diagnosis of myasthenia gravis. We identified 20 studies of reasonable, although variable, methodological quality upon which to base estimates of the accuracy of the ice test, rest test, Tensilon test, acetylcholine receptor antibodies, repetitive nerve stimulation and single fiber electromyography for the diagnosis of myasthenia gravis. After examining inter-study heterogeneity for each diagnostic modality, we calculated pooled estimates of sensitivity and specificity as well as positive and negative likelihood ratios. Results are reported separately for ocular and generalized myasthenia. Studies that examined the performance of anti-acetylcholine receptor antibody testing and single fiber electromyography were generally of better quality than those that examined other diagnostic modalities. We suggest that caution should be exercised in the interpretation of the diagnostic performance of these tests given the methodological limitations of the studies upon which test performance is based.
© 2006 Elsevier B.V. All rights reserved.
Keywords: Myasthenia gravis; Diagnosis; Accuracy; Sensitivity; Specificity; Likelihood ratio

1. Introduction
The starting point for any clinical evaluation is the history and physical examination. The clinician uses the patient's presenting symptoms and signs to generate a differential diagnosis and to estimate the pre-test probability of a particular disease. Diagnostic tests are then used to refine or narrow the differential diagnosis. A useful diagnostic test is one that markedly increases or decreases the probability of a particular diagnosis. As clinicians we all use diagnostic tests on a regular basis, and typically the presumption is made that the diagnostic properties and clinical utility of these tests have been well established. Regrettably, this is not always the case, at least in part because the methodological quality of diagnostic studies has historically been somewhat neglected [1,2]. This is not to say that attempts have not been made to assemble criteria by which the methodological quality of diagnostic tests might be evaluated [1,3–7], but rather that these efforts are relatively recent compared to the studies that serve as the foundation for the use of many diagnostic tests. The consequence is that the results of many of the studies purporting to evaluate the performance of diagnostic tests may be unreliable. The use of tests for the diagnosis of myasthenia gravis is a case in point. Here, we report the results of a systematic review of the literature pertaining to the use of tests for the diagnosis of myasthenia gravis. We begin, however, with a general discussion of the methodological aspects of diagnostic studies in order to develop a framework within which to evaluate the methodological quality of the relevant literature.

2. Methodology of diagnostic tests
The essential feature of a diagnostic study is that the outcome of the test under consideration (the index test) is compared to a gold or reference standard that is used to define the presence or absence of disease. The index test may refer to any method used to acquire information about the disease in question, including symptoms, physical findings on examination or the results of some diagnostic test (laboratory study, imaging result, etc.). The diagnostic performance of this index test is evaluated by comparing it to the truth. Since the truth is seldom available, the next best option is to rely upon an appropriate reference standard.

* Tel.: +1 404 778 3267; fax: +1 404 778 3075. E-mail address: michael.benatar@emory.edu

0960-8966/$ - see front matter © 2006 Elsevier B.V. All rights reserved. doi:10.1016/j.nmd.2006.05.006


Ideally, the reference standard should represent the best available mechanism for determining the presence or absence of the disease in question. The reference standard should be applied independently of the index test. Independence in this sense comprises two elements. The first is that the person applying the reference standard should be masked or blinded to the results of the index test. Similarly, the investigator performing the index test should be masked or blinded to the results of the reference standard. Such blinding will ensure that the investigators do not differentially interpret the results of the index or reference test depending on the results of the other. The more subjective the reference standard or the index test, i.e. the more susceptible it is to different interpretations, the more important it is to maintain adequate blinding.

The second aspect of independence requires that the components of the reference standard should not overlap with the index test. For example, if the goal of a study is to evaluate the accuracy of single fiber electromyography (SFEMG) for the diagnosis of myasthenia gravis, then the reference standard should not incorporate or rely upon the results of SFEMG. The failure to ensure independence in this sense results in a form of systematic error known as incorporation bias. The implication of this form of bias is that the reference standard and index test are more likely to agree, with the result that estimates of both sensitivity and specificity are artificially inflated.

Since a diagnostic test is useful only insofar as it distinguishes between disorders that might otherwise be confused, the finding that a diagnostic test can distinguish between healthy controls and subjects severely affected by a disease process tells us nothing about the diagnostic utility of the test. The study population should be representative of the population in which the test will be used and should include subjects with the full spectrum of disease.
For example, some patients with myasthenia gravis harbor antibodies against the acetylcholine receptor and others do not (so-called seronegative myasthenia). If the study population included only those with acetylcholine receptor antibodies, the full spectrum of patients with myasthenia would not be represented. Failure to include patients who span the full spectrum of the disease represents a form of systematic error known as spectrum bias. The best way to ensure that the full spectrum of patients is included in the study is to recruit study subjects consecutively in a prospective fashion.

The selection of the study population is also relevant to estimates of the properties of the diagnostic test. In considering the utility of a diagnostic test it is common to think in terms of the test's sensitivity and specificity. Sensitivity (the proportion of subjects with the disease in whom the test is positive) and specificity (the proportion of subjects without the disease in whom the test is negative) are typically considered properties of the test itself. By contrast, the positive predictive value (the proportion of people with a positive test who actually have the disease) and the negative predictive value (the proportion of people with a negative test who do not have the disease in question) are influenced by the prevalence of the disease within the study population and are the relevant properties in any consideration of the clinical utility of a diagnostic test. To some extent this is true. However, both sensitivity and specificity are also affected by the choice of the study population. A common mistake is to include subjects who are already known to have the disease as well as a separate group of healthy controls (the term case–control study is sometimes used to describe such a design). A better strategy is to include in the study a series of patients who are referred for diagnostic evaluation for a particular disorder (the term cohort study is sometimes used to describe this approach). The diagnostic odds ratio (a summary measure of the performance of a diagnostic test) may be substantially overestimated when the case–control approach is used instead of the cohort design [2].

The optimal design of a study that aims to evaluate the accuracy of a diagnostic test, therefore, is a prospective blinded comparison of the index test and the reference standard in a consecutive series of patients from a relevant population. A relevant population of patients spans the full spectrum of disease and is representative of the population in which the test will be used.
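The definitions above can be illustrated with a short sketch. The function names and the example figures (a hypothetical test with sensitivity 0.90 and specificity 0.95) are assumptions chosen for illustration and are not drawn from any study in this review. The sketch shows how the predictive values shift with prevalence when sensitivity and specificity are held fixed.

```python
def accuracy_measures(tp, fp, fn, tn):
    """Return (sensitivity, specificity, PPV, NPV) from the four cells of a 2x2 table."""
    sensitivity = tp / (tp + fn)   # diseased subjects with a positive test
    specificity = tn / (tn + fp)   # non-diseased subjects with a negative test
    ppv = tp / (tp + fp)           # positive tests that are true positives
    npv = tn / (tn + fn)           # negative tests that are true negatives
    return sensitivity, specificity, ppv, npv


def predictive_values(sensitivity, specificity, prevalence):
    """Return (PPV, NPV) implied by a given sensitivity, specificity and prevalence."""
    true_pos = sensitivity * prevalence
    false_neg = (1 - sensitivity) * prevalence
    true_neg = specificity * (1 - prevalence)
    false_pos = (1 - specificity) * (1 - prevalence)
    return true_pos / (true_pos + false_pos), true_neg / (true_neg + false_neg)


# Hypothetical test (sensitivity 0.90, specificity 0.95) at two prevalences.
for prevalence in (0.05, 0.50):
    ppv, npv = predictive_values(0.90, 0.95, prevalence)
    print(f"prevalence={prevalence:.2f}  PPV={ppv:.2f}  NPV={npv:.2f}")
```

With these assumed figures the positive predictive value is much lower at the lower prevalence, even though sensitivity and specificity are unchanged.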

3. Methods
We searched the MEDLINE database using the term "myasthenia gravis" [MeSH] combined with the terms "Diagnosis" [MeSH] OR "Diagnostic Tests, Routine" [MeSH] OR "Sensitivity and Specificity" [MeSH] OR "ROC Curve" [MeSH]. Relevant articles were retrieved and further publications were identified by searching the references cited in these articles. The full text of 50 studies identified in this way was read. A decision was made regarding their suitability for inclusion in the review based on whether they met pre-specified inclusion criteria. The minimum criteria for inclusion were that the reference standard was explicitly defined and that sufficient data were reported to allow calculation of both sensitivity and specificity for the diagnostic test under investigation. The studies identified for inclusion were graded for their methodological quality based on five pre-specified domains:
(1) Prospective. Retrospective design was assumed in the absence of an explicit statement that data collection was prospective.
(2) Consecutive sample. Non-consecutive series was assumed unless it was explicitly stated that the sample was consecutive.

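Table 1
Methodological characteristics of diagnostic studies

| Test | Reference | Design | Prospective | Consecutive | Reference standard | Free from incorporation bias | Blinding |
|---|---|---|---|---|---|---|---|
| Ice test | Ertas | CC | Yes | No | Gestalt | Yes | Yes |
| Ice test | Ellis | CC | Yes | No | Gestalt | Yes | No |
| Ice test | Lertchavanakul | CC | Yes | No | Prostigmine test or EMG | Yes | No |
| Ice test | Kubis | CC | Yes | Yes | AChR-Ab or abnormal SFEMG | Yes | Yes |
| Ice test | Golnik | CC | Yes | No | AChR-Ab or positive Tensilon test | Yes | No |
| Ice test | Czaplinski | CC | Yes | No | AChR-Ab or RNS decrement | Yes | Yes |
| Ice test | Sethi | CC | No | No | Gestalt with AChR-Ab or decrement on RNS | Yes | Yes |
| Sleep test | Odel | CC | No | No | Tensilon test | Yes | No |
| Sleep test | Kubis | CC | Yes | Yes | See above | Yes | Yes |
| Anti-acetylcholine receptor antibodies | Padua | Cohort | No | Yes | Two of AChR-Ab, EDX and response to therapy | No | No |
| Anti-acetylcholine receptor antibodies | Nicholson | Cohort | No | No | Gestalt and positive AChR-Ab, Tensilon or EMG | No | No |
| Anti-acetylcholine receptor antibodies | Costa | Cohort | Yes | Yes | Gestalt with AChR-Ab and response to therapy | No | No |
| Anti-acetylcholine receptor antibodies | Howard | CC | No | No | EDX or Tensilon | Yes | No |
| Anti-acetylcholine receptor antibodies | Lefvert | CC | No | No | Gestalt, Tensilon and RNS | Yes | No |
| Anti-acetylcholine receptor antibodies | Limburg | CC | No | No | Gestalt and response to therapy | Yes | No |
| Anti-acetylcholine receptor antibodies | Lindstrom | CC | No | No | Gestalt, Tensilon test and RNS decrement | Yes | No |
| Tensilon test | Nicholson | Cohort | No | No | See above | No | No |
| Repetitive nerve stimulation | Padua | Cohort | No | Yes | See above | No | No |
| Repetitive nerve stimulation | Oey | Cohort | No | Yes | Gestalt with response to therapy | Yes | No |
| Repetitive nerve stimulation | Nicholson | Cohort | No | No | See above | No | No |
| Repetitive nerve stimulation | Costa | Cohort | Yes | Yes | See above | Yes | No |
| Repetitive nerve stimulation | Mier | CC | No | No | Gestalt and EDX | Yes | No |
| Repetitive nerve stimulation | Rubin | CC | No | No | Gestalt, AChR-Ab, SFEMG | Yes | No |
| Repetitive nerve stimulation | Kennett | CC | No | No | Gestalt, response to Tensilon or treatment; AChR-Ab | Yes | No |
| Single fiber electromyography | Benatar | Cohort | No | Yes | AChR-Ab or response to therapy | Yes | Yes |
| Single fiber electromyography | Rouseev | Cohort | No | Yes | Gestalt and response to therapy | Yes | No |
| Single fiber electromyography | Oey | Cohort | No | Yes | See above | Yes | No |
| Single fiber electromyography | Padua | Cohort | No | Yes | See above | No | No |
| Single fiber electromyography | Costa | Cohort | Yes | Yes | See above | Yes | No |
| Single fiber electromyography | Ukachoke | Cohort | No | Yes | Extra-ocular/eyelid weakness and response to therapy >3 months | Yes | No |

CC, case–control; see text for definitions of case–control and cohort study designs.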

(3) Study design. In the context of a diagnostic study, the terms case–control and cohort are used slightly differently from what is customary. The term cohort is used to describe a study that includes subjects who were considered to possibly have the disease under investigation. By contrast, the term case–control is used to describe a study that includes subjects known to have the disease and subjects known not to have the disease in question (e.g. healthy or disease controls).
(4) Blinding. Blinding was graded as present if the reference standard and index tests were performed and interpreted without knowledge of each other. In the absence of explicit indication that blinding was present, it was assumed to be absent.
(5) Independence of the reference standard. Independence was accepted if the reference standard was free from incorporation bias (i.e. the reference standard did not include or incorporate elements of the index test).

Each of these domains was scored in a binary fashion. Data on these methodological features as well as results were extracted using a paper form. Data were abstracted separately for the diagnosis of ocular and generalized myasthenia. Unless explicitly stated that patients had ocular myasthenia, the disease was assumed to be of the generalized form, even if ocular symptoms predominated.

Estimates of sensitivity, specificity and likelihood ratios were based on 2×2 tables, with a value of 0.5 added to each cell to accommodate cells with a value of zero in order to permit estimation of confidence intervals. Separate analyses were performed for studies evaluating the accuracy of tests for the diagnosis of ocular and generalized myasthenia. Similarly, separate analyses were performed for case–control and cohort studies given previous observations that case–control studies tend to over-estimate indices of diagnostic accuracy [2].

Pooled estimates of sensitivity and specificity were calculated for each diagnostic test, with each study weighted according to its sample size. The 95% confidence intervals for these pooled estimates were calculated as sensitivity or specificity ± 1.96 × √variance, where variance is calculated as (probability) × (1 − probability)/sample size.

Likelihood ratios (LR) were calculated where the positive LR = (sensitivity)/(1 − specificity) and the negative LR = (1 − sensitivity)/(specificity). The positive LR describes the likelihood of a positive test result in a myasthenic patient compared to a positive test result in a non-myasthenic patient. Similarly, the negative LR represents the likelihood of a negative test result in a myasthenic patient compared to a negative test result in a non-myasthenic patient. Likelihood ratios greater than 10 and less than 0.1 generate large and often definitive changes from pre-test to post-test probability of disease, LRs between 5–10 and 0.1–0.2 lead to moderate changes in pre-test to post-test probability and LRs between 2–5 and 0.2–0.5 result in small changes in probability. Likelihood ratios between 1–2 and 0.5–1 rarely alter pre-test probability.
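The calculations described above can be sketched as follows. This is a minimal illustration rather than the analysis code used in the review: the function names and the example counts are hypothetical, and two details are assumptions. The first is that the interval is truncated to the range 0 to 1. The second is which denominator serves as the "sample size" for each estimate, which is left to the caller.

```python
import math


def corrected_rates(tp, fp, fn, tn, correction=0.5):
    """Sensitivity and specificity after adding `correction` to every cell of the 2x2 table."""
    tp, fp, fn, tn = (cell + correction for cell in (tp, fp, fn, tn))
    return tp / (tp + fn), tn / (tn + fp)


def pooled_estimate(estimates, sample_sizes):
    """Average of per-study estimates, each weighted by its sample size."""
    total = sum(sample_sizes)
    return sum(p * n for p, n in zip(estimates, sample_sizes)) / total


def wald_interval(p, n, z=1.96):
    """p +/- z * sqrt(p * (1 - p) / n), truncated to [0, 1] (truncation is an assumption)."""
    half_width = z * math.sqrt(p * (1 - p) / n)
    return max(0.0, p - half_width), min(1.0, p + half_width)


def likelihood_ratios(sensitivity, specificity):
    """Positive LR = sens / (1 - spec); negative LR = (1 - sens) / spec."""
    return sensitivity / (1 - specificity), (1 - sensitivity) / specificity


# Hypothetical study with an empty false-positive cell.
sens, spec = corrected_rates(tp=18, fp=0, fn=2, tn=20)
lr_pos, lr_neg = likelihood_ratios(sens, spec)
print(f"sensitivity={sens:.2f} specificity={spec:.2f} LR+={lr_pos:.1f} LR-={lr_neg:.2f}")
```

Without the 0.5 correction the empty false-positive cell would give a specificity of exactly 1 and an undefined positive LR, which is why the correction is needed before ratios and intervals can be computed.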
The way to use an LR is to estimate the pre-test odds of disease (calculated as the probability of disease/[1 − probability of disease]) and to multiply the LR by the odds to yield the post-test odds, which can then be converted to a post-test probability (calculated as the odds of disease/[1 + odds of disease]).
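The conversion from pre-test to post-test probability can be written as a three-step sketch. The function name and the pre-test probability of 0.30 are assumptions chosen for illustration; the likelihood ratios of 22 and 0.57 are used only as example inputs.

```python
def post_test_probability(pre_test_probability, likelihood_ratio):
    """Convert a pre-test probability to a post-test probability using a likelihood ratio."""
    # Step 1: probability -> odds
    pre_test_odds = pre_test_probability / (1 - pre_test_probability)
    # Step 2: multiply the odds by the likelihood ratio
    post_test_odds = pre_test_odds * likelihood_ratio
    # Step 3: odds -> probability
    return post_test_odds / (1 + post_test_odds)


# Hypothetical pre-test probability of 0.30.
after_positive = post_test_probability(0.30, 22)    # positive result, LR+ = 22
after_negative = post_test_probability(0.30, 0.57)  # negative result, LR- = 0.57
print(f"after positive test: {after_positive:.2f}")
print(f"after negative test: {after_negative:.2f}")
```

An LR of 1 leaves the probability unchanged, which is a quick check that the conversion has been applied correctly.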

4. Results
We identified 20 studies that met our inclusion criteria [8–27]. These studies provided information about the Tensilon test [24], the ice test [14–17,19,26,28], the sleep or rest test [19,29], acetylcholine receptor antibody assays [11,12,18,20–22,24], repetitive nerve stimulation [10–12,23–25,27] and single fiber electromyography [8–13]. Fifteen studies contributed data relevant to the diagnosis of ocular myasthenia and 15 provided information about the diagnostic accuracy of these tests in generalized myasthenia. The methodological qualities of these studies are summarized in Table 1 and the results of these studies are presented in Tables 2a and 2b.

4.1. Ice test
Three studies described the accuracy of the ice test for the diagnosis of ocular myasthenia [15,16,28] and five studies reported the accuracy of this test in generalized myasthenia [14,16,17,19,26]. All seven studies employed a case–control study design [14–17,19,26,28]. The sensitivity and specificity (with 95% confidence intervals) of each individual study for the diagnosis of ocular and generalized myasthenia are summarized in Tables 2a and 2b. The pooled estimates of sensitivity were 0.94 for ocular myasthenia and 0.82 for generalized disease (Table 3). The pooled estimates of specificity were 0.97 for ocular and 0.96 for generalized myasthenia (Table 3). The likelihood ratios suggest that both positive and negative test results are likely to have a dramatic effect on the pre-test probability of disease (Table 3).

4.2. Rest test
A single case–control study of limited methodological quality reported a sensitivity of 0.99 and a specificity of 0.91 for the rest test in the diagnosis of ocular myasthenia [29] (Table 2a). A single case–control study, but with otherwise fairly good design, reported the accuracy of the sleep or rest test for the diagnosis of generalized MG [19]. Sensitivity was poor (0.50), but specificity was high (0.97) (Table 2b). The likelihood ratios suggest that both positive and negative test results are likely to significantly change the pre-test probability of ocular disease and that a positive (but not a negative) test result changes the probability of generalized myasthenia (Table 3).

4.3. Tensilon test
We identified a single study that examined the diagnostic accuracy of the Tensilon (edrophonium) test [24]. This cohort study of poor methodological quality reported sensitivities of 0.92 and 0.88 for ocular (Table 2a) and generalized myasthenia (Table 2b), respectively, as well as specificities of 0.97 for both forms of the disease. The likelihood ratios suggest that both positive and negative test results are extremely useful in that they lead to marked changes from pre-test to post-test probability of disease (Table 3).

4.4. Anti-acetylcholine receptor antibodies
There were seven studies that reported the diagnostic accuracy of the presence of anti-acetylcholine receptor antibodies, four of which employed a case–control design [18,20–22] and three of which were cohort studies [11,12,24]. The sensitivity and specificity (with 95% confidence intervals) of each individual study for the diagnosis of ocular and generalized myasthenia are

Table 2a
Accuracy of tests for the diagnosis of ocular myasthenia

| Test | Study | Number of methodology criteria met | Sensitivity (95% CI) | Specificity (95% CI) |
|---|---|---|---|---|
| Ice test | Ertas | 3 | 0.94 (0.84–1.00) | 0.97 (0.90–1.00) |
| Ice test | Ellis | 2 | 0.97 (0.91–1.00) | 0.97 (0.91–1.00) |
| Ice test | Lertchavanakul | 2 | 0.93 (0.85–1.00) | 0.98 (0.93–1.00) |
| Sleep/rest test | Odel | | 0.99 (0.96–1.00) | 0.91 (0.84–0.98) |
| Anti-acetylcholine receptor antibodies | Costa | 3 | 0.39 (0.25–0.52) | 0.98 (0.95–1.00) |
| Anti-acetylcholine receptor antibodies | Padua | 2 | 0.44 (0.30–0.59) | 0.95 (0.89–1.00) |
| Anti-acetylcholine receptor antibodies | Howard | 1 | 0.71 (0.67–0.75) | 0.99 (0.98–1.00) |
| Anti-acetylcholine receptor antibodies | Lefvert | 1 | 0.71 (0.65–0.76) | 1.00 (0.99–1.00) |
| Anti-acetylcholine receptor antibodies | Limburg | 1 | 0.50 (0.43–0.57) | 0.98 (0.96–1.00) |
| Anti-acetylcholine receptor antibodies | Nicholson | 1 | 0.48 (0.36–0.59) | 0.99 (0.97–1.00) |
| Repetitive nerve stimulation | Oey^a | 3 | 0.39 (0.23–0.56) | 0.97 (0.91–1.00) |
| Repetitive nerve stimulation | Costa^b | 3 | 0.33 (0.20–0.47) | 0.97 (0.92–1.00) |
| Repetitive nerve stimulation | Padua^c | 2 | 0.34 (0.20–0.48) | 0.89 (0.79–0.98) |
| Repetitive nerve stimulation | Nicholson^d | 1 | 0.32 (0.14–0.50) | 0.96 (0.89–1.00) |
| Repetitive nerve stimulation | Kennett^e | 1 | 0.11 (0.02–0.20) | 0.98 (0.94–1.00) |
| Single fiber electromyography | Benatar^f | 4 | 0.62 (0.46–0.77) | 0.96 (0.89–1.00) |
| Single fiber electromyography | Costa^g | 4 | 0.97 (0.92–1.00) | 0.98 (0.95–1.00) |
| Single fiber electromyography | Padua^g | 3 | 0.99 (0.95–1.00) | 0.85 (0.75–0.95) |
| Single fiber electromyography | Oey^h | 3 | 0.94 (0.87–1.00) | 0.93 (0.84–1.00) |
| Single fiber electromyography | Rouseev^i | 3 | 0.92 (0.91–1.00) | 0.66 (0.47–0.84) |
| Single fiber electromyography | Ukachoke^i | 3 | 0.83 (0.72–0.94) | 0.77 (0.65–0.89) |
| Tensilon test | Nicholson | | 0.92 (0.83–1.00) | 0.97 (0.91–1.00) |

a Single nerve–muscle pair studied (facial nerve–orbicularis oculi).
b Based on four studies (facial nerve–nasalis, accessory nerve–trapezius, radial nerve–anconeus and ulnar nerve–abductor digiti minimi).
c Single nerve–muscle pair studied (truncus primarius superior–abductor digiti minimi).
d The number and specific nerve–muscle pairs studied are not reported.
e Single nerve–muscle pair studied (radial nerve–anconeus).
f Volitional concentric needle electrode single fiber electromyography in frontalis.
g Volitional single fiber electrode single fiber electromyography in orbicularis oculi.
h Stimulated single fiber electrode SFEMG of orbicularis oculi.
i Volitional single fiber electrode SFEMG in the frontalis muscle.

Table 2b
Accuracy of tests for the diagnosis of generalized myasthenia

| Test | Study | Number of methodology criteria met | Sensitivity (95% CI) | Specificity (95% CI) |
|---|---|---|---|---|
| Ice test | Kubis | 4 | 0.79 (0.62–0.95) | 0.97 (0.90–1.00) |
| Ice test | Ertas | 3 | 0.92 (0.80–1.00) | 0.97 (0.90–1.00) |
| Ice test | Czaplinski | 3 | 0.92 (0.76–1.00) | 0.92 (0.76–1.00) |
| Ice test | Sethi | 2 | 0.77 (0.58–0.96) | 0.94 (0.83–1.00) |
| Ice test | Golnik | 2 | 0.79 (0.66–0.91) | 0.98 (0.93–1.00) |
| Sleep/rest test | Kubis | 4 | 0.50 (0.31–0.69) | 0.97 (0.90–1.00) |
| Anti-acetylcholine receptor antibodies | Costa | 3 | 0.98 (0.94–1.00) | 0.98 (0.95–1.00) |
| Anti-acetylcholine receptor antibodies | Lefvert | 1 | 0.87 (0.84–0.91) | 1.00 (0.99–1.00) |
| Anti-acetylcholine receptor antibodies | Limburg | 1 | 0.90 (0.87–0.93) | 0.98 (0.96–0.99) |
| Anti-acetylcholine receptor antibodies | Howard | 1 | 0.92 (0.90–0.94) | 0.99 (0.98–1.00) |
| Anti-acetylcholine receptor antibodies | Nicholson | 1 | 0.96 (0.92–1.00) | 0.99 (0.97–1.00) |
| Anti-acetylcholine receptor antibodies | Lindstrom | 1 | 0.87 (0.83–0.91) | 1.00 (1.00–1.00) |
| Repetitive nerve stimulation | Costa^a | 4 | 0.98 (0.94–1.00) | 0.95 (0.88–1.00) |
| Repetitive nerve stimulation | Nicholson^b | 1 | 0.62 (0.47–0.76) | 0.96 (0.91–1.00) |
| Repetitive nerve stimulation | Mier^c | 1 | 0.32 (0.16–0.49) | 0.97 (0.91–1.00) |
| Repetitive nerve stimulation | Rubin^d | 1 | 0.88 (0.82–0.94) | 0.98 (0.96–1.00) |
| Repetitive nerve stimulation | Kennett^e | 1 | 0.53 (0.40–0.66) | 0.98 (0.94–1.00) |
| Single fiber electromyography | Benatar^f | 4 | 0.75 (0.60–0.90) | 0.96 (0.88–1.00) |
| Single fiber electromyography | Costa^g | 4 | 0.98 (0.94–1.00) | 0.98 (0.95–1.00) |
| Tensilon test | Nicholson | 1 | 0.88 (0.78–0.97) | 0.97 (0.92–1.00) |

a Based on four studies (facial nerve–nasalis, accessory nerve–trapezius, radial nerve–anconeus and ulnar nerve–abductor digiti minimi).
b The number and specific nerve–muscle pairs studied are not reported.
c Based on studying the phrenic nerve–diaphragm.
d Single nerve–muscle pair studied (trigeminal–masseter).
e Single nerve–muscle pair studied (radial nerve–anconeus).
f Volitional concentric needle electrode single fiber electromyography in frontalis.
g Volitional single fiber electrode single fiber electromyography in orbicularis oculi.

summarized in Tables 2a and 2b. The pooled estimate of sensitivity for the diagnosis of ocular myasthenia was lower (0.44) for the higher methodological quality cohort studies [11,12,24] than for the less well-designed case–control studies (0.66) [18,20–22] (Table 2a). The pooled estimate of sensitivity for the diagnosis of generalized myasthenia was higher among the cohort studies (0.96) than among the case–control studies (0.90) (Table 2b). Specificity for both ocular and generalized disease was generally excellent (0.97–0.99) irrespective of study design. Likelihood ratios based on the cohort studies indicate that both positive and negative test results are extremely useful for the diagnosis of generalized myasthenia and that a positive test result is more helpful than a negative test result in ocular myasthenia (Table 3).

4.5. Repetitive nerve stimulation

We identified six studies that examined the diagnostic accuracy of repetitive nerve stimulation (RNS). These studies were extremely heterogeneous with respect to both the choice of individual nerve–muscle pairs and the number of nerve–muscle pairs examined. One study did not indicate which or how many nerve–muscle pairs were studied [24]. Other studies examined specific nerve–muscle pairs such as the trigeminal nerve–masseter [25], the phrenic nerve–diaphragm [23] or the radial nerve–anconeus [27]. Repetitive nerve stimulation was performed on four different nerve–muscle pairs in the study of the highest methodological quality [12].


Table 3
Summary estimates for the diagnosis of myasthenia gravis

Ocular myasthenia

| Test | Sensitivity | Specificity | LR+ | LR− |
|---|---|---|---|---|
| Ice test | 0.94 (0.90–0.99) | 0.97 (0.94–1.00) | 31 | 0.06 |
| Rest test^a | 0.99 (0.96–1.00) | 0.91 (0.84–0.98) | 11 | 0.01 |
| Tensilon test^a | 0.92 (0.83–1.00) | 0.97 (0.91–1.00) | 31 | 0.08 |
| Antibody (case–control) | 0.66 (0.63–0.69) | 0.99 (0.98–1.00) | 66 | 0.34 |
| Antibody (cohort) | 0.44 (0.37–0.52) | 0.98 (0.95–1.00) | 22 | 0.57 |
| RNS^b | 0.29 (0.22–0.36) | 0.94 (0.91–0.98) | 4.8 | 0.76 |
| SF-SFEMG (frontalis) | 0.86 (0.78–0.94) | 0.73 (0.63–0.83) | 3.2 | 0.19 |
| SF-SFEMG (o. oculi) | 0.97 (0.94–1.00) | 0.92 (0.88–0.97) | 12 | 0.03 |
| CN-SFEMG (frontalis)^a | 0.62 (0.46–0.77) | 0.96 (0.89–1.00) | 15.5 | 0.40 |

Generalized myasthenia

| Test | Sensitivity | Specificity | LR+ | LR− |
|---|---|---|---|---|
| Ice test | 0.82 (0.75–0.89) | 0.96 (0.93–1.00) | 20.5 | 0.19 |
| Rest test^a | 0.50 (0.31–0.69) | 0.97 (0.90–1.00) | 16.7 | 0.52 |
| Tensilon test^a | 0.88 (0.78–0.97) | 0.97 (0.92–1.00) | 29 | 0.12 |
| Antibody (case–control) | 0.90 (0.88–0.91) | 0.99 (0.98–1.00) | 90 | 0.10 |
| Antibody (cohort) | 0.96 (0.93–0.99) | 0.99 (0.97–1.00) | 96 | 0.04 |
| RNS^b | 0.79 (0.74–0.84) | 0.97 (0.95–0.99) | 26 | 0.22 |
| SF-SFEMG (EDC)^a | 0.98 (0.94–1.00) | 0.98 (0.95–1.00) | 49 | 0.02 |
| CN-SFEMG (frontalis)^a | 0.75 (0.60–0.90) | 0.96 (0.88–1.00) | 18.8 | 0.26 |

Numbers in bold are those LR that cause marked changes in probability of disease and are based on studies of high methodological quality.
a Statistics based on a single study.
b Summary estimate of questionable value given heterogeneity between studies.
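The likelihood ratios in Table 3 can be read against the interpretive thresholds stated in the Methods. The sketch below encodes those thresholds; the function name and the category labels ("large", "moderate", "small", "minimal") are shorthand introduced here for illustration and do not appear in the original text.

```python
def lr_impact(likelihood_ratio):
    """Classify the expected shift from pre-test to post-test probability."""
    if likelihood_ratio <= 0:
        raise ValueError("likelihood ratio must be positive")
    if likelihood_ratio > 10 or likelihood_ratio < 0.1:
        return "large"      # >10 or <0.1: large, often definitive change
    if likelihood_ratio >= 5 or likelihood_ratio <= 0.2:
        return "moderate"   # 5-10 or 0.1-0.2
    if likelihood_ratio >= 2 or likelihood_ratio <= 0.5:
        return "small"      # 2-5 or 0.2-0.5
    return "minimal"        # 1-2 or 0.5-1: rarely alters pre-test probability


# Ocular myasthenia, antibody testing in cohort studies (Table 3): LR+ 22, LR- 0.57.
print(lr_impact(22), lr_impact(0.57))
# Ocular myasthenia, RNS (Table 3): LR+ 4.8, LR- 0.76.
print(lr_impact(4.8), lr_impact(0.76))
```

Under this reading, a positive antibody result in suspected ocular myasthenia shifts the probability of disease substantially, whereas a negative result shifts it very little.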

Summary estimates of sensitivity and specificity were calculated for RNS (Table 3) but should be interpreted with caution given the heterogeneity of these studies. The results of the individual studies may be more meaningful (Tables 2a and 2b). Kennett and Fawcett reported a sensitivity of 0.11 for RNS of the radial nerve with recording from anconeus for the diagnosis of ocular myasthenia [27]. Padua and colleagues reported a sensitivity of 0.34 for the diagnosis of ocular myasthenia based on stimulation of the truncus primarius superior [11]. Interestingly, the sensitivity of RNS for the diagnosis of ocular myasthenia was no better in the study that examined four nerve–muscle pairs [12], although the sensitivity in this study was high (0.98) for the diagnosis of generalized myasthenia. Specificity was high for both ocular (0.97) and generalized (0.95) disease [12]. Repetitive stimulation of the phrenic nerve yielded low sensitivity (0.32), but high specificity (0.97) for generalized myasthenia [23]. Better sensitivity (0.88) for the diagnosis of generalized disease was reported for trigeminal nerve stimulation (with recording from the masseter muscle) with similarly high specificity (0.98) [25]. Kennett and Fawcett reported a sensitivity of 0.53 for the diagnosis of generalized myasthenia based on RNS of the radial nerve [27].

The one study in which RNS was performed on multiple nerve–muscle pairs attempted to correlate the location of clinical weakness with abnormalities on RNS of specific nerve–muscle pairs [12]. These investigators reported the highest sensitivity (0.89) for the diagnosis of what they termed axial disease based on RNS of the spinal accessory nerve–trapezius pair [12]. Given the small sample size upon which this estimate is based, the 95% confidence interval ranges from 0.68 to 1.0 (using the exact binomial method). RNS of the spinal accessory nerve with recording from trapezius also yielded the best sensitivity for the diagnosis of ocular disease, but this remained low (0.33). RNS with recording from nasalis and anconeus yielded the best sensitivity for the diagnosis of bulbar disease: 0.54 and 0.46, respectively [12]. The conclusion seems to be that there are insufficient data to reliably correlate the location of clinical weakness with abnormalities of RNS of specific nerve–muscle pairs. There are similarly insufficient data to consider a receiver-operating characteristic (ROC) curve analysis to examine the dependency of diagnostic accuracy on the number of nerve–muscle pairs examined with repetitive nerve stimulation.

4.6. Single fiber electromyography
The studies that examined the diagnostic utility of single fiber electromyography (SFEMG) were also heterogeneous. Most used a single-fiber needle electrode [9–13], but one used a concentric needle electrode [8]. The frontalis muscle was examined in some studies [8,9,13] whereas the orbicularis oculi muscle was examined in others [10–12]. Stimulated SFEMG was performed in one study [10] while most studies employed the volitional technique [8,9,11–13]. All of these studies employed a cohort design. In several studies [8,9,11,13], application of the reference standard yielded three groups of subjects: those with myasthenia, those without myasthenia and a third group in which the diagnosis was indeterminate. These studies required a sensitivity analysis to determine the impact on sensitivity and specificity of including the indeterminate group first among those with myasthenia and then among those without myasthenia.

Two studies examined the utility of single-fiber electrode SFEMG in the frontalis muscle in ocular myasthenia [9,13]. The first reported a sensitivity of 0.92 and a specificity of 0.66. Sensitivity analysis shows that diagnostic sensitivity may have been as low as 0.70 and specificity as low as 0.53 [9]. The second study reported a sensitivity of 0.83 and a specificity of 0.77, but the reference standard used in this study left more than half of the patients with an indeterminate diagnosis [13]. The results of SFEMG are not reported for this indeterminate group and so it is not possible to undertake a formal sensitivity analysis. The pooled estimates of sensitivity and specificity for single-fiber electrode SFEMG in the frontalis muscle for the diagnosis of ocular myasthenia are 0.86 and 0.73, respectively (Table 3).

Three studies examined the utility of single-fiber electrode SFEMG in the orbicularis oculi muscle [10–12]. The study by Padua and colleagues reported a sensitivity of 0.99 and a specificity of 0.85 [11], but the reference standard used yielded an indeterminate disease status in half of the patients included in this study. Sensitivity analysis shows that diagnostic sensitivity may have been as low as 0.56 and specificity as low as 0.81 [11]. The use of the results of SFEMG as part of the reference standard (incorporation bias) may also have contributed to an elevated estimate of the test's sensitivity. The study by Costa and colleagues reported a sensitivity of 0.97 and a specificity of 0.98 [12]. One potential problem with this study is that the index test (SFEMG) was not really used as a diagnostic test. Although the authors recruited a consecutive series of patients referred for the diagnosis of myasthenia, they used their reference standard to separate these patients into two groups (those with and those without myasthenia) and then used the results of SFEMG in the control group (those without myasthenia) to define the upper limit of normal for jitter. These normative data were then used to evaluate the sensitivity of the technique in the population of patients with myasthenia [12]. At the very least, this approach will have ensured a very high specificity for the test. Finally, the study by Oey and colleagues examined the utility of stimulated SFEMG in the orbicularis oculi muscle [10]. These authors reported a sensitivity of 0.94 and a specificity of 0.93, although the estimate of sensitivity may have been artificially inflated by the use of a relatively low threshold for defining jitter as abnormal. The pooled estimates of sensitivity and specificity for single-fiber electrode SFEMG of the orbicularis oculi muscle for the diagnosis of ocular myasthenia, therefore, are 0.97 and 0.92 (Table 3).

A single study examined the accuracy of concentric needle SFEMG of the frontalis muscle and reported a sensitivity of 0.62 and a specificity of 0.96 [8]. Sensitivity analysis indicates that diagnostic sensitivity may have been as low as 0.50 and specificity as low as 0.90 [8].

We identified only two studies that examined the accuracy of SFEMG for the diagnosis of generalized myasthenia [8,12]. The first used a concentric needle in the frontalis muscle and reported a sensitivity of 0.75 and a specificity of 0.96 [8]. Sensitivity analysis shows that diagnostic sensitivity may have been as low as 0.53 and specificity as low as 0.90 [8]. The second study, by Costa et al., used a single-fiber electrode in the extensor digitorum communis (EDC) muscle and reported a sensitivity and specificity of 0.98 [12]. The limitations of this study have already been discussed (see above). Pooled estimates of sensitivity and specificity were not calculated given that these two studies examined different muscles using different types of recording electrodes. The data in these studies were insufficiently described to permit a receiver-operating characteristic (ROC) curve analysis to evaluate the diagnostic impact of varying the threshold used to define jitter as abnormal.

The likelihood ratios based on single-fiber electrode SFEMG of the orbicularis oculi and extensor digitorum communis muscles in ocular and generalized myasthenia, respectively, indicate that these tests lead to marked changes from pre-test to post-test probability of disease. Concentric needle SFEMG is most helpful when jitter is abnormal, but negative test results lead to only modest changes in the pre-test probability of myasthenia (Table 3).
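The sensitivity analyses mentioned in this section, in which the indeterminate group is counted first among those with myasthenia and then among those without, can be sketched as follows. This is an illustration under stated assumptions, not the procedure used in any cited study: the function name, the dictionary keys and all counts are hypothetical.

```python
def indeterminate_bounds(tp, fn, tn, fp, indeterminate_pos, indeterminate_neg):
    """Sensitivity and specificity under two assignments of the indeterminate group.

    indeterminate_pos / indeterminate_neg are the numbers of subjects with an
    indeterminate reference diagnosis whose index test was positive / negative.
    """
    indeterminate = indeterminate_pos + indeterminate_neg
    # Scenario 1: every indeterminate subject is counted as having the disease.
    as_diseased = (
        (tp + indeterminate_pos) / (tp + fn + indeterminate),
        tn / (tn + fp),
    )
    # Scenario 2: every indeterminate subject is counted as not having the disease.
    as_not_diseased = (
        tp / (tp + fn),
        (tn + indeterminate_neg) / (tn + fp + indeterminate),
    )
    return {"as_diseased": as_diseased, "as_not_diseased": as_not_diseased}


# Hypothetical counts, for illustration only.
result = indeterminate_bounds(tp=18, fn=2, tn=27, fp=3,
                              indeterminate_pos=4, indeterminate_neg=6)
for scenario, (sens, spec) in result.items():
    print(f"{scenario}: sensitivity={sens:.2f} specificity={spec:.2f}")
```

The lowest sensitivity and the lowest specificity across the two scenarios give the kind of "as low as" figures quoted above. The analysis requires that index test results be reported for the indeterminate group, which is why it cannot be carried out when those results are missing.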

5. Discussion

We have systematically reviewed the literature pertaining to six commonly used tests for the diagnosis of myasthenia gravis. We identified only seven cohort studies [8–13,24] in which subjects were recruited on the basis of a clinical suspicion of myasthenia gravis. The remaining studies all employed a case–control design in which subjects already known to have myasthenia gravis were compared to a group of control subjects (either healthy subjects or those with other neurological or autoimmune diseases). A previous investigation of the quantitative effects of methodological aspects of study design on estimates of diagnostic accuracy found that studies including non-representative populations (e.g. case–control studies) showed the greatest tendency to overestimate both sensitivity and specificity [2]. In that investigation, the lack of blinded assessment of the index test and reference standard, retrospective data collection and non-consecutive recruitment of study subjects were found to have less impact on estimates of diagnostic accuracy once the issue of case selection (spectrum bias) had been overcome [2]. For this reason we have generally presented separate analyses for case–control and cohort studies, but have pooled studies that differed with respect to these other methodological issues.

All of the studies that examined the accuracy of the ice test and the sleep (rest) test employed a case–control design, suggesting that the available literature substantially overestimates the diagnostic utility of these tests. The most reasonable conclusion seems to be that we simply do not have reliable estimates of how these tests would perform under real clinical conditions. Perhaps surprisingly, we encountered only a single study that evaluated the diagnostic accuracy of the Tensilon test. Its authors reported high sensitivity (0.92 for ocular disease and 0.88 for generalized myasthenia) and high specificity (0.97 for both forms of myasthenia).

Studies that examined the accuracy of assays for anti-acetylcholine receptor antibodies were of mixed methodological quality. The better designed cohort studies indicated high sensitivity (0.96) for the diagnosis of generalized myasthenia, but relatively poor sensitivity (0.44) for the diagnosis of ocular myasthenia. This sensitivity for the diagnosis of ocular disease is lower than commonly believed. The specificity of this test for the diagnosis of any form of myasthenia is extremely good (0.98–0.99).

Studies of RNS were of very mixed methodological quality, and the marked heterogeneity between the studies in terms of study design, the number of nerve–muscle pairs studied and the specific choice of nerve–muscle pairs makes it extremely difficult to draw any conclusions with confidence about the diagnostic utility of this technique. The available data would seem to support the conclusion that the sensitivity of RNS for the diagnosis of ocular myasthenia is quite poor and that the diagnostic yield for generalized disease is better if multiple nerve–muscle pairs are studied [12].

Studies of the diagnostic accuracy of SFEMG were generally of better quality, given that they all employed a cohort design. Although other methodological issues raise some cause for concern, these studies paint a general picture of higher sensitivity for the diagnosis of ocular myasthenia when a single-fiber electrode is used to record jitter from the orbicularis oculi muscle (0.97) than when a single-fiber electrode is used to record from the frontalis muscle (0.86) or a concentric needle is used to record from the frontalis muscle (0.62). The data are more limited for generalized myasthenia.
The specificity of SFEMG for the diagnosis of generalized MG was high, whilst the specificity for ocular MG was more variable, ranging from 0.66 to 0.98.

The conclusion seems to be that further studies of the accuracy of each of these index tests are warranted. Perhaps not surprisingly, the presence of anti-acetylcholine receptor antibodies offers the highest specificity for the diagnosis of myasthenia gravis, underscoring the fact that the other tests (notably the electrophysiological studies) serve only to demonstrate the presence of a disorder of neuromuscular transmission; the clinical context is required for the diagnosis of myasthenia gravis. The most important message is that the study populations in which these tests are evaluated should mimic the population in which the test will eventually be used, which is to say that the study population should comprise a series of patients in whom the diagnosis of myasthenia gravis is being considered. It would be preferable for these studies to enroll subjects prospectively and in a consecutive fashion. The reference standard for the diagnosis of myasthenia should be explicitly stated and should not incorporate elements of the index test. The investigators performing the index test should be blinded to the results of the reference standard, and vice versa. Finally, given the regional variability of a disease such as myasthenia, it would be important for future studies to consider separately the sensitivity of RNS of different nerve–muscle pairs and of SFEMG of different muscles. Until such studies are performed, we are obliged to work with the data we have, but this review suggests that there is good reason to be less confident than we have been in the past about the performance of the tests we commonly use to diagnose myasthenia gravis.

References
[1] Bossuyt PM, Reitsma JB, Bruns DE, et al. Towards complete and accurate reporting of studies of diagnostic accuracy: the STARD initiative. Br Med J 2003;326(7379):41–4.
[2] Lijmer JG, Mol BW, Heisterkamp S, et al. Empirical evidence of design-related bias in studies of diagnostic tests. J Am Med Assoc 1999;282(11):1061–6.
[3] Greenhalgh T. How to read a paper. Papers that report diagnostic or screening tests. Br Med J 1997;315(7107):540–3.
[4] Mulrow CD, Linn WD, Gaul MK, Pugh JA. Assessing quality of a diagnostic test evaluation. J Gen Intern Med 1989;4(4):288–95.
[5] Literature review of the usefulness of repetitive nerve stimulation and single fiber EMG in the electrodiagnostic evaluation of patients with suspected myasthenia gravis or Lambert–Eaton myasthenic syndrome. Muscle Nerve 2001;24(9):1239–1247.
[6] Scherer K, Bedlack RS, Simel DL. Does this patient have myasthenia gravis? J Am Med Assoc 2005;293(15):1906–14.
[7] Edlund W, Gronseth G, So Y, Franklin G, for the Quality Standards Subcommittee and the Therapeutics and Technology Assessment Subcommittee. Clinical Practice Guideline Process Manual [online]. Available at: http://www.aan.com/professionals/practice/pdfs/2004_Guideline_Process.pdf. Accessed January 2006.
[8] Benatar M, Hammad M, Doss-Riney H. Concentric needle single fiber electromyography for the diagnosis of myasthenia gravis. Muscle Nerve 2006; Apr 26 [Epub ahead of print].
[9] Rouseev R, Ashby P, Basinski A, Sharpe JA. Single fiber EMG in the frontalis muscle in ocular myasthenia: specificity and sensitivity. Muscle Nerve 1992;15(3):399–403.
[10] Oey PL, Wieneke GH, Hoogenraad TU, van Huffelen AC. Ocular myasthenia gravis: the diagnostic yield of repetitive nerve stimulation and stimulated single fiber EMG of orbicularis oculi muscle and infrared reflection oculography. Muscle Nerve 1993;16(2):142–9.
[11] Padua L, Stalberg E, LoMonaco M, Evoli A, Batocchi A, Tonali P. SFEMG in ocular myasthenia gravis diagnosis. Clin Neurophysiol 2000;111(7):1203–7.
[12] Costa J, Evangelista T, Conceicao I, de Carvalho M. Repetitive nerve stimulation in myasthenia gravis – relative sensitivity of different muscles. Clin Neurophysiol 2004;115(12):2776–82.
[13] Ukachoke C, Ashby P, Basinski A, Sharpe JA. Usefulness of single fiber EMG for distinguishing neuromuscular from other causes of ocular muscle weakness. Can J Neurol Sci 1994;21(2):125–8.
[14] Czaplinski A, Steck AJ, Fuhr P. Ice pack test for myasthenia gravis. A simple, noninvasive and safe diagnostic method. J Neurol 2003;250(7):883–4.

[15] Ellis FD, Hoyt CS, Ellis FJ, Jeffery AR, Sondhi N. Extraocular muscle responses to orbital cooling (ice test) for ocular myasthenia gravis diagnosis. J Aapos 2000;4(5):271–81.
[16] Ertas M, Arac N, Kumral K, Tuncbay T. Ice test as a simple diagnostic aid for myasthenia gravis. Acta Neurol Scand 1994;89(3):227–9.
[17] Golnik KC, Pena R, Lee AG, Eggenberger ER. An ice test for the diagnosis of myasthenia gravis. Ophthalmology 1999;106(7):1282–6.
[18] Howard Jr FM, Lennon VA, Finley J, Matsumoto J, Elveback LR. Clinical correlations of antibodies that bind, block, or modulate human acetylcholine receptors in myasthenia gravis. Ann NY Acad Sci 1987;505:526–38.
[19] Kubis KC, Danesh-Meyer HV, Savino PJ, Sergott RC. The ice test versus the rest test in myasthenia gravis. Ophthalmology 2000;107(11):1995–8.
[20] Lefvert AK, Bergstrom K, Matell G, Osterman PO, Pirskanen R. Determination of acetylcholine receptor antibody in myasthenia gravis: clinical usefulness and pathogenetic implications. J Neurol Neurosurg Psychiatry 1978;41(5):394–403.
[21] Limburg PC, The TH, Hummel-Tappel E, Oosterhuis HJ. Anti-acetylcholine receptor antibodies in myasthenia gravis. Part 1. Relation to clinical parameters in 250 patients. J Neurol Sci 1983;58(3):357–70.


[22] Lindstrom JM, Seybold ME, Lennon VA, Whittingham S, Duane DD. Antibody to acetylcholine receptor in myasthenia gravis: prevalence, clinical correlates, and diagnostic value. 1975. Neurology 1998;51(4):933 (and 936 pages following).
[23] Mier A, Brophy C, Moxham J, Green M. Repetitive stimulation of phrenic nerves in myasthenia gravis. Thorax 1992;47(8):640–4.
[24] Nicholson GA, McLeod JG, Griffiths LR. Comparison of diagnostic tests in myasthenia gravis. Clin Exp Neurol 1983;19:45–9.
[25] Rubin DI, Harper CM, Auger RG. Trigeminal nerve repetitive stimulation in myasthenia gravis. Muscle Nerve 2004;29(4):591–6.
[26] Sethi KD, Rivner MH, Swift TR. Ice pack test for myasthenia gravis. Neurology 1987;37(8):1383–5.
[27] Kennett RP, Fawcett PR. Repetitive nerve stimulation of anconeus in the assessment of neuromuscular transmission disorders. Electroencephalogr Clin Neurophysiol 1993;89(3):170–6.
[28] Lertchavanakul A, Gamnerdsiri P, Hirunwiwatkul P. Ice test for ocular myasthenia gravis. J Med Assoc Thai 2001;84(Suppl. 1):S131–S6.
[29] Odel JG, Winterkorn JM, Behrens MM. The sleep test for myasthenia gravis. A safe alternative to Tensilon. J Clin Neuroophthalmol 1991;11(4):288–92.