Authors
Affiliations
1 Department of Human Performance, Minnesota State University, Mankato, Mankato, United States
2 Cardiopulmonary Rehabilitation, Havasu Regional Medical Center, Lake Havasu City, United States
3 Department of Sport and Exercise Science, University of Auckland, Auckland, New Zealand
Key words
Abstract
The verification bout has emerged as a technique for confirming true VO2max; however, its validity during a single visit is unknown. We evaluated 3 different graded exercise test (GXT) durations with severe intensity verification bouts. On 3 separate days, in counterbalanced order, 12 recreational-trained men completed short (9 ± 1 min), middle (11 ± 1 min), and long (13 ± 2 min) duration GXTs followed by exhaustive, sine wave verification bouts during the same visit. Intensities for verification were set at speeds equivalent to the end-GXT speed minus 2 stages. No differences (p > 0.05) in VO2max (mL/kg/min) were observed between short (49.1), middle (48.2), and long (48.8) protocols. In addition, no differences in verification bout duration occurred between protocols (3 ± 1 min). Validity of VO2max was strongest for the middle duration protocol (ICC = 0.97; typical error = 1 mL/kg/min; CV = 2 %). A small but significantly higher HRmax (~1–2 bpm) was observed for the long protocol. Maximum respiratory exchange ratios were inconsistent (ICC ranged 0.58–0.68). Our findings indicate GXT-verification bout testing during a single visit is a valid means of measuring true VO2max. The 10 min target for GXT duration was the optimum.
Introduction
Maximal volume of oxygen uptake (VO2max) represents the limit to which oxygen delivery and uptake within exercising muscles can occur [19]. VO2max represents maximal aerobic capacity, and its measurement is used to appraise health and performance as well as to develop exercise prescriptions. Hill and Lupton [12] introduced the aerobic capacity concept in 1923, although whether these researchers implied that a finite measure of VO2max was characterized by a plateau in VO2 relative to increased work rate has been questioned [24]. Indeed, contemporary researchers have suggested a true VO2max can be demonstrated without a plateau phenomenon [7, 29]. Traditionally, a graded exercise test (GXT) duration of 8–12 min was suggested as optimum [6]; however, this dogma has been questioned [20]. Howley et al. [15] originally summarized a host of secondary criteria for verifying attainment of VO2max obtained during a GXT. The most prevalent of these criteria actually reported in research studies are a respiratory exchange ratio (RER) above 1.10 and attainment within 10 bpm of age-predicted heart rate maximum (HRmax) [23]. Poole et al. [27] observed that such criteria were often surpassed well prior to attainment of VO2max and exercise intolerance and recommended abandonment of such secondary criteria. Instead, these authors recommended use of a verification bout.
The verification bout subsequent to a GXT has emerged as a technique for identifying true VO2max [21]. Some researchers have performed verification bouts on separate days [3, 27]; however, such protocols require 2 visits, which adds time and technician cost. Conversely, others [8, 22] have conducted verification bouts subsequent to a GXT during a single visit and have reported valid observations of true VO2max. Unfortunately, the authors of these studies neglected to report validity data that take into account individual subject variations, which can be problematic in measurement research [14]. A second concern is that most studies utilized supramaximal intensities for their verification bouts [3, 8, 11, 22]. Too high an intensity for the verification bout may result in too short an exercise duration for VO2 to achieve maximum consistently between subjects [22].
Therefore, we examined the consistency of VO2max values derived from 3 different GXT durations and subsequent submaximal, severe intensity [13] verification bouts. Our hypothesis was that the 10 min target duration (middle) would yield more consistent VO2max values in comparison to 8 min (short) or 14 min (long) target durations. We also examined the secondary criteria of RER above 1.10 and HRmax for each protocol, as these are the most commonly reported secondary criteria for confirming VO2max [23].
accepted after revision November 18, 2010
Bibliography
DOI http://dx.doi.org/10.1055/s-0030-1269914
Published online: January 26, 2011
Int J Sports Med 2011; 32: 266–270
Georg Thieme Verlag KG Stuttgart New York
ISSN 0172-4622
Correspondence
Dr. Robert W. Pettitt
Department of Human Performance
Minnesota State University Mankato
Mankato, MN 56001
Tel.: +1/507/389 1811
Fax: +1/507/389 5618
robert.pettitt@mnsu.edu
Kirkeberg JM et al. Protocols for Verifying VO2max. Int J Sports Med 2011; 32: 266–270
Table 1 Peak exercise durations (M ± SD) for the 3 protocols (short, middle, long).
Experimental procedures
Upon entering the laboratory, each subject's height and mass were measured directly. These data, along with the subject's response to a question on physical activity rating [9], scaled from 0 to 10, were inserted into a non-exercise VO2max prediction equation [16]. Treadmill speed required to evoke VO2max (Speak) at a 3 % grade was calculated by the following metabolic equations [1]:
[a] Speak (m/min) = (Relative VO2max − 3.5)/0.2
[b] 3 % Grade Speak = Speak − (Speak × 0.027)
Where 3.5 represents resting VO2 (mL/kg/min), 0.2 represents the additional O2 cost (mL/kg/min) for each m/min rise in running speed, and 0.027 represents the correction factor for grade.
In counterbalanced fashion, subjects completed one of 3 GXT protocols at a 3 % grade whereby speed began at 134 m/min and advanced to a specific speed (rounded to the nearest 2.68 m/min, the limit of the treadmill used). A 2 min warm-up of walking at 67 m/min preceded each protocol. Speed increments for each minute were calculated for short (8 min), middle (10 min), and long (14 min) target durations using:
[c] Increment = (3 % Grade Speak − 134)/(target duration)
Where 134 m/min is the starting speed and the speed evoking the target duration commences 1 min prior to the expected VO2 response. For example, if 10 min is the target duration for evoking the predicted VO2max, Speak would be set at 9 min.
A heart rate strap (Polar Instruments, Oulu, Finland) was affixed to the thorax and heart rate was recorded telemetrically by a metabolic analyzer (Parvomedics TrueOne, Sandy, UT). The analyzer also recorded expired VO2 and VCO2 on a breath-by-breath basis; however, the data were averaged to 30 s time samples, as 30 s time-sampling is the most commonly reported technique [28]. Flow and gas calibration was performed prior to each test as per the manufacturer's guidelines. Subjects were given strong verbal encouragement for each test. Termination of the GXT was determined by the subject dismounting with the aid of arm rails and straddling the treadmill belt at the point of intolerance. Speed was decreased from the end speed (Send) to an intensity of 0.50 × (Send − 134 m/min) + 134 m/min for a 3 min cool-down exercise period, as per our pilot work (refer to discussion for details on cool-down intensity). Following the cool-down period, the subject began an exhaustive, sine wave verification bout (i.e., constant speed). Intensity for the verification bout was equivalent to Send minus 2 stages, where stages were derived using equation c.
Prior to commencing the study, recovery durations and intensities, along with verification bout intensities, were examined in pilot testing. Although prior investigators [8, 11, 22] have used a supramaximal intensity for the verification bout, our pilot testing revealed our protocol was more suitable for permitting at least a 2 min verification bout subsequent to the 3 GXT durations tested. As subjects indicated their desire to terminate exercise with a thumb-up gesture, in both the GXT and verification bouts, they were asked to rate their level of perceived exertion (RPE, ranging 6–20) by pointing to a chart [5].
Statistical analyses
Separate one-way analyses of variance (ANOVA) with repeated measures and Bonferroni adjustment for post hoc analysis were used to test for duration differences. Separate 2-way ANOVAs with repeated measures were used to compare VO2max, RERmax, and HRmax between the 3 GXT durations and corresponding verification bouts. Relative validity between the GXT and verification bout for each physiological measurement was determined using intraclass correlations (ICC). Typical error (or standard error of measurement), expressed both as average variation and as coefficient of variation (CV), was determined using the techniques described by Hopkins [14]. Alpha for all inferential statistics was set at 0.05.
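The speed calculations described under Experimental procedures (equations a to c, the cool-down speed, and the verification bout speed) can be illustrated with a short Python sketch. It is an illustration only: the function names and the example inputs (a predicted VO2max of 49 mL/kg/min and an end speed of 220 m/min) are hypothetical. The increment divides by the target duration exactly as equation c is printed. Rounding to the treadmill's 2.68 m/min resolution is left out because the text does not specify which quantity is rounded.

```python
START_SPEED = 134.0  # GXT starting speed in m/min (from the protocol description)


def peak_speed_level(vo2max_pred):
    """Equation a: level-grade speed (m/min) predicted to evoke VO2max (mL/kg/min)."""
    return (vo2max_pred - 3.5) / 0.2


def peak_speed_graded(s_peak):
    """Equation b: speed (m/min) corrected for a 3 % grade."""
    return s_peak - s_peak * 0.027


def stage_increment(s_peak_graded, target_duration_min):
    """Equation c as printed: speed increase per 1 min stage (m/min)."""
    return (s_peak_graded - START_SPEED) / target_duration_min


def cooldown_speed(s_end):
    """Cool-down speed: 50 % of the difference between end speed and start speed, plus start speed."""
    return 0.50 * (s_end - START_SPEED) + START_SPEED


def verification_speed(s_end, increment):
    """Verification bout speed: end speed minus 2 stages."""
    return s_end - 2 * increment


# Hypothetical subject: predicted VO2max of 49 mL/kg/min, 10 min target duration.
s_level = peak_speed_level(49.0)
s_graded = peak_speed_graded(s_level)
inc = stage_increment(s_graded, 10)
print(f"graded peak speed {s_graded:.1f} m/min, increment {inc:.2f} m/min per stage")

# Hypothetical end speed of 220 m/min reached at intolerance.
print(f"cool-down {cooldown_speed(220.0):.1f} m/min, "
      f"verification {verification_speed(220.0, inc):.1f} m/min")
```

Keeping each equation in its own function makes it easy to swap in a different grade correction or starting speed without touching the rest of the calculation.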
[Fig. 1 panels: Short, Middle, Long; y-axis VO2 (mL/kg/min), 0 to 60; x-axis Time (min), 0 to 20]
Table 3 Intraclass correlation (ICC), typical error, and coefficient of variation (%) for the 3 duration GXTs and corresponding verification bouts.
Measure   Protocol   ICC    Typical error     Coefficient of variation (%)
VO2max    short      0.89   1.87 mL/kg/min    3.8
VO2max    middle     0.95   1.04 mL/kg/min    2.1
VO2max    long       0.91   1.44 mL/kg/min    3.0
HRmax     short      0.95   2.1 bpm           1.1
HRmax     middle     0.97   1.6 bpm           0.9
HRmax     long       0.96   1.4 bpm           0.8
RERmax    short      0.58   0.03 L/L          3.1
RERmax    middle     0.58   0.03 L/L          2.7
RERmax    long       0.68   0.02 L/L          1.8
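The absolute consistency statistics reported in Table 3 can be sketched in Python. The sketch assumes the common formulation of typical error as the standard deviation of the paired difference scores divided by the square root of 2, and it expresses CV simply as typical error relative to the grand mean. An analysis following Hopkins [14] on log-transformed data may give slightly different values. The function names and the paired values are hypothetical, not data from the study.

```python
import math
import statistics


def typical_error(test1, test2):
    """Typical error: SD of paired difference scores divided by sqrt(2)."""
    diffs = [b - a for a, b in zip(test1, test2)]
    return statistics.stdev(diffs) / math.sqrt(2)


def cv_percent(test1, test2):
    """Typical error expressed as a percentage of the grand mean (a simplification)."""
    grand_mean = statistics.mean(list(test1) + list(test2))
    return 100.0 * typical_error(test1, test2) / grand_mean


# Hypothetical VO2max values (mL/kg/min) from a GXT and its verification bout.
gxt = [50.0, 48.0, 52.0]
verification = [51.0, 47.0, 52.0]
print(f"typical error: {typical_error(gxt, verification):.2f} mL/kg/min")
print(f"CV: {cv_percent(gxt, verification):.2f} %")
```

Because both statistics work on the paired differences, they capture how much individual subjects vary between the two bouts, which a comparison of group means alone would not show.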
Discussion
Our study is the first to demonstrate high test-retest consistency between the GXT and verification bouts for same day testing using intensities below Send. Correlation coefficient values in our study exceed those reported by Astorino et al. (r = 0.89) [3], who performed their verification bouts on a separate day from the GXT. The CV for our middle duration (2.1 %) was less than the variability reported by Midgley et al. [22] (3.5 and 3.9 %) and Hawkins et al. [11] (3.2 and 4.3 %). Thus, equal if not superior verification bouts can be accomplished during a single testing session.
One criticism raised by Noakes [25] concerns researchers concluding that a true VO2max exists when grouped means between 2 tests do not differ significantly from each other. Specifically, he argued that one or more subjects may differ substantially on 2 tests without rendering the 2 sets of observations statistically different. The issue of how to judge whether individual subjects vary on repeated measures of a test was addressed eloquently by Hopkins [14]. He recommended absolute measures of consistency (i.e., typical error and CV) as the appropriate statistics to help the exercise scientist judge how robust a measurement technique is from a practical standpoint. In our middle duration protocol (10 min target GXT), variability was ~1 mL/kg/min (CV of 2 %) between the GXT and verification bout (Table 3). This variability is nearly one half that of the absolute variability values reported by prior studies using verification bouts. Thus, our findings are both significant and practically relevant.
Smaller increment GXTs (longer target durations) resulted in lower Send values, a finding consistent with others [2, 6, 30]. Despite using different intensities for the 3 constant-speed verification bouts, based on Send, no differences in verification durations were observed (Table 1). Such a finding refutes the possible notion that differences in GXT duration influence exercise tolerance during a verification bout that commences shortly following the termination of a GXT. Similar verification bout
Fig. 1 Representative mean VO2max (mL/kg/min) values during the short (50.8), middle (51.7), and long (50.3) protocols for a 23-year-old subject.
Table 2 Physiological variables (M ± SD) from the 3 protocols.
Variable             Protocol   GXT               Verification bout
VO2max (mL/kg/min)   short      49.24 ± 5.31      48.97 ± 5.95
VO2max (mL/kg/min)   middle     48.90 ± 5.08      47.45 ± 4.48
VO2max (mL/kg/min)   long       49.07 ± 4.70      48.44 ± 5.02
RERmax               short      1.11 ± 5.23       1.00 ± 4.99 c
RERmax               middle     1.10 ± 4.14       1.01 ± 4.59 c
RERmax               long       1.07 ± 3.13 b     1.00 ± 3.44 c
HRmax (bpm)          short      187.7 ± 9.6       185.3 ± 9.6
HRmax (bpm)          middle     186.5 ± 9.7       185.8 ± 9.2
HRmax (bpm)          long       188.8 ± 7.1 a     188.7 ± 6.6 a
a Higher than middle and short; b Lower than short and middle; c Lower than GXT (p < 0.05)
Secondary criteria
RER was lower in the verification bouts vs. the GXT protocols (main effect, p < 0.05) and was lower for the long vs. the short or middle durations (interaction, p < 0.05). As such, RERmax consistency between the GXT and verification bout was weak (Table 3). As summarized in Table 2, the long duration GXT evoked the highest HRmax values (p < 0.05) subsequent to both the GXT and the verification bouts. Nearly all subjects (11 of 12) achieved within 10 bpm of age-predicted HRmax, as estimated using 220 minus age. Most subjects achieved an RER above 1.10 in the short (7 of 12) and middle (9 of 12) GXT protocols; however, the frequency of this criterion decreased for the long (3 of 12) protocol.
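The two secondary criteria examined above can be expressed as simple checks. The Python sketch below assumes that "within 10 bpm" means reaching at least the age-predicted HRmax (220 minus age) minus 10 bpm. The function names and example values are hypothetical.

```python
def meets_rer_criterion(rer_max, threshold=1.10):
    """True when the maximum respiratory exchange ratio is above the threshold."""
    return rer_max > threshold


def meets_hr_criterion(hr_max, age_years, tolerance_bpm=10):
    """True when HRmax reaches at least (220 - age) minus the tolerance.

    Assumption: a measured HRmax above the prediction also satisfies the criterion.
    """
    predicted = 220 - age_years
    return hr_max >= predicted - tolerance_bpm


# Hypothetical 23-year-old subject: predicted HRmax is 197 bpm.
print(meets_hr_criterion(188, 23))   # 188 bpm is within 10 bpm of 197
print(meets_rer_criterion(1.07))     # below the 1.10 threshold
```

Checks of this kind are easy to apply, which partly explains their popularity, but the results above show how often the RER criterion was missed even when VO2max was verified.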
durations occurred conceivably because aggregate energy expenditures (i.e., aerobic contribution and depletion of anaerobic capacity) were similar between the 3 GXTs as a function of the power-duration relationship; however, total expenditure was not quantified directly in the present study.
In the present study, we observed lower CV measures for our GXT and verification bouts in comparison to prior groups [11, 22]. Less variability in our study may be explained by the fact that these research groups used supramaximal intensities for verification stages whereas our protocol was below Send. Two other groups [7, 29] have reported that submaximal loads successfully verified VO2max; however, neither reported absolute measures of variability, making comparisons difficult. Use of submaximal intensities may provide better verification measures of VO2max in that too high an intensity during the verification bout may prohibit attainment of a valid VO2max (e.g., verification bouts lasting only 1–2 min) [21]. Hill et al. [13] described that attainment of VO2max during constant-load, severe exercise conforms to the same hyperbolic relationship existing between power and exercise tolerance (i.e., intensities above the critical power (CP), which evoke a slow component of VO2 approaching maximum at a faster rate). Indeed, small increases in intensity above the CP increase the rates at which metabolites such as H+ and Pi accumulate within the exercising muscle [17]. Hill et al. [13] referred to this concern as the fourth exercise domain: an intensity above CP in which local fatigue results in too short an exercise duration for attaining VO2max. In an effort to prescribe a severe intensity for our verification bouts but not encroach on the fourth exercise domain (NB, a greater concern for our short target duration), we assigned our verification intensities using 2 stages preceding Send, a decision based on pilot work.
Such a protocol permitted our sample of recreational-trained men to exercise for a suitable duration in each verification bout (Table 1).
A second difference in our study, which may explain the lower CV measures between GXT and verification bouts in comparison to prior groups [11, 22], was our use of a custom GXT protocol. Hawkins et al. [11] used the same % grade increment for men and women; however, these authors did not report GXT durations, which would be expected to differ based on differences in aerobic capacity. Conversely, Midgley et al. [22] used a custom protocol that netted a GXT duration of ~12 min. Although prior data suggest GXT duration is less relevant for yielding similar VO2max values [20], the effect of GXT duration on a subsequent, same day verification bout has not been reported. At the onset of the study, we applied a regression technique [9, 16] we deemed appropriate for our sample of recreational-trained subjects to derive an expected GXT duration. The ability to validly prescribe GXT durations with non-exercise regression equations in highly trained athletes is an area in need of further analysis.
Also based on pilot work, we selected 50 % of the difference between starting speed and Send to serve as the intensity for our active recovery. We speculated this would minimize risk of syncope subsequent to the GXT, yet be a low enough intensity to eliminate reliance on the fast glycolytic energy pathway (i.e., lactate production). Indeed, cool-down exercise at intensities below threshold can augment clearance of blood lactate and increase hepatic production of blood glucose along with muscle glycogen synthesis [4]. Postexercise VO2 values preceding commencement of the verification bout decreased below 50 % of the difference between VO2max during the GXT and VO2 values at the start of the verification bout (see Fig. 1 for a representative example). Other investigators have used extremely low intensities for their cool-down exercise period [8, 22]. One reservation about using too low a cool-down intensity is that the verification bout begins too abruptly [22]. Conversely, use of a moderate intensity (below threshold) may help better maintain circulation of oxygen and regulatory hormones along with metabolic inertia within the engaged skeletal muscles [26].
Achievement of an RER above 1.10 has been suggested as a secondary criterion for verifying VO2max [15]. Poole et al. [27] recommended abandoning this criterion based on the observation that many of their subjects surpassed an RER of 1.10 well prior to attaining VO2max. In the present study, RERmax values for the verification bouts were lower than the GXTs for each protocol we examined. Rossiter et al. [29] reported similar trends for RER, when same-day GXTs were performed, for both their submaximal and supramaximal verification trials. The likely cause for lower RER values during same day verification is a diminished non-metabolic CO2 production (i.e., VO2 between the GXT and verification were similar, see Table 2). We also noted that RERmax fluctuated frequently between test days and was inconsistent between the GXTs and verification bouts (Table 3), and that many of our subjects failed to achieve the 1.10 criterion for their GXT. As such, our findings concur with Poole et al. that RER fails to serve as a valuable metric for VO2max testing.
Similar to the above 1.10 RER criterion, attaining within 10 bpm of age-predicted HRmax has also been suggested for verifying attainment of VO2max [15]; however, this criterion has also been criticized [27]. Midgley et al. [22] argued that age-predicted HRmax may inflate estimates in athletes who routinely experience training-induced decreases in HRmax.
Alternatively, these authors suggested a comparison of HRmax from the GXT and verification bout may serve as a more useful secondary criterion due to the improbability of voluntarily replicating 2 submaximal HRmax values from incremental and constant-load exercise bouts carried out to intolerance. Interestingly, our data actually demonstrate this is possible (Table 2). Specifically, we observed a small but significant increase in HRmax for the long duration GXT and subsequent verification bout, although most subjects achieved within 10 bpm of age-predicted HRmax for each protocol. Achievement of this slightly higher HRmax (~1–2 bpm) during the long protocol was consistent among subjects (Table 3) and occurred even though slightly lower HRmax values were observed between the shorter GXTs and verification bouts. We are confident subjects gave a maximal effort based on ratings of perceived exertion. Therefore, our only explanation for this finding is the difference in total duration between protocols. At very high intensities, HR is mediated by circulating catecholamine concentrations [18]. The long protocol perhaps permitted better humoral activation of cardiac pacing cells.
When target GXT durations were estimated for 8, 10, and 14 min, using equations a–c, subjects achieved VO2max in GXT durations comparable to those targets (Table 1). Should a practitioner desire to achieve GXT durations between 8 and 12 min, our recommendation is to select 10 min as the target. Although longer protocols (e.g., 13 min) may evoke higher and true HRmax values (Table 2), this comes at the expense of less consistent verification of VO2max (Table 3). Although longer duration GXTs may be equally valid [20], the spirit of the original study by Buchfuhrer et al. [6] was to demonstrate that a valid measure of VO2max may be gained in a shorter duration. Our findings indicate that 10 min is the optimum GXT duration to strive for when attempting GXT-verification bouts within the same session.
Such
References
1 ACSM's Guidelines for Exercise Testing and Prescription. Baltimore: Lippincott, Williams & Wilkins; 2010
2 Astorino TA, Rietschel JC, Tam PA, Taylor K, Johnson SM, Freedman TP, Sakarya CE. Reinvestigation of optimum duration of VO2max testing. J Exerc Physiol 2004; 7: 1–8
3 Astorino TA, White AC, Dalleck LC. Supramaximal testing to confirm attainment of VO2max in sedentary men and women. Int J Sports Med 2009; 30: 279–284
4 Belcastro AN, Bonen A. Lactic acid removal rates during controlled and uncontrolled recovery exercise. J Appl Physiol 1975; 39: 932–936
5 Borg GAV. Borg's Perceived Exertion and Pain Scales. Champaign, IL: Human Kinetics; 1998
6 Buchfuhrer MJ, Hansen JE, Robinson TE, Sue DY, Wasserman K, Whipp BJ. Optimizing the exercise protocol for cardiopulmonary assessment. J Appl Physiol 1983; 55: 1558–1564
a hypothesis, however, should be explored in women, in subjects with different fitness levels, and using different modes of exercise.
The non-significant mean differences between GXT and verification bouts (Table 2), combined with the high level of consistency in measurement (Table 3), raise the question: why administer a verification bout if it yields the same result? The answer is implied in the name of the technique: to verify the GXT result. Such a procedure is particularly useful when individuals fail to experience a plateau in VO2 despite an encouraged best effort, which is reportedly common [7]. Thus, concerning the verification bout, the more relevant question is what difference between the GXT and verification VO2 tests is acceptable for concluding true VO2max has been measured. Midgley et al. [21] suggested that a ~2 % difference would be acceptable; however, their recommendation was derived from the error of the metabolic device without consideration for normal biological variation. Thus, such a criterion may be too conservative. If 30 s sampling is used, we recommend that true VO2max in recreational-trained men be considered verified when GXT and verification measures differ by less than 3 %. Using typical errors of the present study, that criterion would approximate a relative VO2 criterion of less than 1.5 mL/kg/min. Clearly, such standards would vary for different sampling rates and different fitness levels. Thus, more research is needed in this regard.
In summary, the ability to measure true VO2max has implications for monitoring aerobic capacity in response to training and for prescribing exercise intensities [19]. We compared VO2max values between GXTs and severe intensity verification bouts carried out for 3 separate test durations in recreational-trained men. Although small but significantly higher HRmax values were observed in the long protocol, the 10 min target duration yielded the most consistent results for VO2max verification. The severe intensity verification bout (below Send) provides better consistency of VO2max measurements in comparison to previous studies utilizing supramaximal intensities (i.e., above Send). The verification bout subsequent to a GXT during a single visit is a promising technique for determining true VO2max. Therefore, future research on same day GXT-verification protocols is warranted.
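The acceptance criterion recommended in the closing discussion (GXT and verification VO2max differing by less than 3 %) can be sketched in Python. The text does not state which value serves as the reference for the percentage, so the sketch assumes the GXT value. The function names are hypothetical, and the example inputs are the middle-protocol group means from Table 2 plus one invented pair.

```python
def percent_difference(gxt_vo2max, verification_vo2max):
    """Absolute difference as a percentage of the GXT value (assumed reference)."""
    return abs(gxt_vo2max - verification_vo2max) / gxt_vo2max * 100.0


def is_verified(gxt_vo2max, verification_vo2max, threshold_percent=3.0):
    """True when the two measures differ by less than the threshold."""
    return percent_difference(gxt_vo2max, verification_vo2max) < threshold_percent


# Middle-protocol group means from Table 2 (mL/kg/min).
print(f"{percent_difference(48.90, 47.45):.2f} %", is_verified(48.90, 47.45))

# Invented pair that differs by 4 % and therefore fails the criterion.
print(f"{percent_difference(50.0, 48.0):.2f} %", is_verified(50.0, 48.0))
```

The threshold is a parameter so that the stricter ~2 % criterion suggested by Midgley et al. [21] can be tried by passing `threshold_percent=2.0`.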