American Journal of Respiratory and Critical Care Medicine

Resting pulmonary function and exercise variables are widely used to stage and monitor idiopathic interstitial pneumonia (IIP). However, the variability of exercise data (maximal exercise and the 6-minute walk test) has not been evaluated definitively. We have prospectively quantified the reproducibility of resting and exercise functional data in fibrotic IIP (idiopathic pulmonary fibrosis, fibrotic nonspecific interstitial pneumonia) and have evaluated interrelationships between variables. Thirty consecutive patients with fibrotic IIP underwent serial resting pulmonary function tests, 6-minute walk (n = 29), and maximal exercise (n = 24) at an interval of 1 week, with all testing performed in accordance with American Thoracic Society standards. Within-subject reproducibility was excellent for 6-minute walk distance (SD/mean = 4.2%) and clinically acceptable for resting pulmonary function indices and VO2max on maximal exercise testing. However, the amplitude of oxygen desaturation at the end of exercise was poorly reproducible in both 6-minute walk and maximal exercise testing (SD/mean > 25%). There was a highly significant relationship between VO2max on maximal exercise testing and 6-minute walk distance (rs = 0.78, p < 0.0001). In fibrotic IIP, the excellent reproducibility of the 6-minute walk distance is a major advantage in routine staging and monitoring, whereas maximal exercise variables are poorly reproducible.

Pulmonary function indices are central to the staging and monitoring of the two categories of idiopathic interstitial pneumonia (IIP) in which fibrosis predominates, hereafter termed fibrotic IIP: idiopathic pulmonary fibrosis (IPF) and fibrotic nonspecific interstitial pneumonia (NSIP). Resting pulmonary function tests provide invaluable prognostic information in fibrotic IIP, both at presentation (1–4) and with the evaluation of serial trends (5–8). The severity of oxygen desaturation at the end of maximal exercise testing is a major component of the old and new clinical-radiologic-physiologic (CRP) indices (1, 9). However, maximal treadmill testing is not always readily available and may be impracticable when there is advanced lung disease or concurrent cardiac disease. Thus, there is increasing interest in less aggressive field exercise testing in diffuse lung disease. The 6-minute walk test (6MWT), widely acknowledged as a valuable clinical tool in chronic obstructive pulmonary disease (COPD), is currently under evaluation in IPF and provides more accurate prognostic information than resting pulmonary function tests in that disease (10).

There is a surprising paucity of data on the reproducibility of exercise testing in fibrotic IIP. Without knowledge of reproducibility, it is difficult to determine, in individual cases, whether an apparent change in severity is significant or merely a reflection of measurement variation. Therefore, we have (1) compared the within-subject reproducibility of resting pulmonary function indices, incremental treadmill testing, and the 6MWT and (2) quantified interrelationships between these variables in 30 patients with fibrotic IIP. Some of the results of this study have been previously presented in the form of an abstract (11).

We recruited consecutive patients presenting to our unit between May 1998 and July 2000 with the clinical features of IPF, as defined by the American Thoracic Society (ATS)/European Respiratory Society consensus committee (12), and a clinical/high-resolution computed tomography (HRCT) diagnosis of fibrotic IIP.

Clinical criteria were as follows:

  1. Exclusion of other known causes of interstitial lung disease

  2. Restrictive ventilatory defect or isolated reduction in gas transfer

  3. Age older than 50 years

  4. Insidious, unexplained dyspnea on exertion

  5. Duration of illness for more than 3 months

  6. Bibasilar, inspiratory crackles

For HRCT criteria, appearances compatible with fibrotic IIP were required, as in recent cohorts (5, 13), based on HRCT observations in patients with a biopsy diagnosis of usual interstitial pneumonia or NSIP and the clinical features of IPF (14). HRCT abnormalities were predominantly basal/subpleural in distribution and comprised a mixture of reticular and ground-glass abnormalities, with traction bronchiectasis when ground-glass attenuation was prominent and no consolidation or nodules. Appearances were subcategorized (13, 14) as follows: (1) typical of IPF; (2) indeterminate, but suggestive of IPF; and (3) suggestive of fibrotic NSIP.

Exclusion criteria were as follows: clinical instability, a resting PaO2 of less than 7 kPa on air, and major comorbidity (e.g., ischemic heart disease, malignancy). Local ethics committee approval was obtained, and signed, informed consent was received from all patients.


HRCT sections of 1 mm were acquired at 10-mm intervals in the supine position. Scans were evaluated independently by two observers (D.M., A.U.W.). The extent of disease on computed tomography was scored as described previously (15, 16); this reproducible method has been applied to a large number of clinical studies of IPF (1). The extent of emphysema, if present, was scored to the nearest 5% using the same methodology.

Pulmonary Function Testing

Patients were evaluated twice 1 week apart, at the same time of day. Pulmonary function tests included FEV1, FVC, and total lung capacity (TLC) using a constant-volume body plethysmograph with diffusion capacity (DlCO) as measured by the single-breath technique (V6200 Autobox DL; SensorMedics, Yorba Linda, CA) performed to ATS standards (17, 18) and expressed as percent predicted (19–21). The first CRP score (9) and the recently defined composite physiologic index (1), derived from FEV1, FVC, and DlCO, were calculated.

Exercise Testing

Patients had no prior familiarity with exercise testing. Patients were exercised on room air, allowing at least 45 minutes between tests.

  1. The 6MWTs, as advocated by Guyatt and coworkers (22) and recent ATS guidelines (23), were given by an experienced operator (P.Y.) using standard verbal prompts. Measurements included resting and 6-minute oxygen saturation (SpO2), walk distance, the presence or absence of desaturation to 88% or lower at the end of the 6MWT, and pre- and post-modified Borg dyspnea scores. Cutaneous oximetry (Nellcor N-20PA; Puritan Bennett, Inc., Pleasanton, CA) was obtained with a finger probe.

  2. An incremental, maximal, symptom-limited treadmill test (Vmax 229; SensorMedics) was performed using a standardized protocol in accordance with the American Thoracic Society/American College of Chest Physicians (ATS/ACCP) statement (24). Testing was supervised by two experienced physicians (T.E., A.U.W.), and all patients were strongly encouraged verbally not to stop until a maximal effort had been made. In all cases, distressing dyspnea was evident at test termination. Measurements included SpO2, minute ventilation (Ve), maximum oxygen uptake (Vo2max), and pre- and post-modified Borg dyspnea scores.

FVC and DlCO were the primary resting variables. The primary exercise variables were Vo2max and 6MWT distance.

Statistical Analyses

Data are expressed as mean values (SDs). Measurement variation was quantified as the SD of differences between observations (25), or the weighted κ coefficient of agreement (Kw) when appropriate, and was illustrated in selected examples using Pearson's product moment correlation. Interrelationships between functional variables were evaluated using Spearman's rank correlation coefficient; mean data for Visits 1 and 2 were analyzed. A p value of less than 0.05 was taken to be statistically significant. Survival was evaluated using proportional hazards regression (STATA software; Computing Resource Center, Santa Monica, CA). The prognostic values of resting and exercise variables were examined.
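The reproducibility measure described above (SD of the differences between the two visits, also expressed as a percentage of the group mean) can be sketched as follows. This is a minimal illustration and not the authors' analysis code: the function name, the use of the sample SD (n − 1 denominator), and the walk distances are all assumptions made for the example.

```python
from statistics import fmean, stdev


def sd_diff_over_mean(visit1, visit2):
    """Return (SD of paired differences, that SD as a % of the group mean).

    The group mean is the mean of all measurements from both visits.
    Using the sample SD (n - 1 denominator) is an assumption.
    """
    if len(visit1) != len(visit2) or len(visit1) < 2:
        raise ValueError("need two equally long series from at least 2 patients")
    diffs = [second - first for first, second in zip(visit1, visit2)]
    sd_diff = stdev(diffs)
    group_mean = fmean(list(visit1) + list(visit2))
    return sd_diff, 100.0 * sd_diff / group_mean


# Hypothetical 6-minute walk distances (m) for three patients, two visits.
walk_visit1 = [400, 420, 380]
walk_visit2 = [410, 415, 395]
sd_diff, pct = sd_diff_over_mean(walk_visit1, walk_visit2)
print(f"SDdiff = {sd_diff:.1f} m, SD/mean = {pct:.1f}%")
```

A ratio below 10% corresponds to the threshold described in this report as clinically acceptable.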

As shown in Table 1, patients were predominantly male with a mean age of 73 years. The mean duration of dyspnea was 41 months, and 77% of patients were current or previous smokers. The HRCT findings are outlined in Table 1, with the majority assigned as typical of IPF or indeterminate but suggestive of IPF. Resting pulmonary function indices are summarized in Table 2; reproducibility was clinically acceptable (SD of differences/mean value [SD/mean] < 10%).

TABLE 1. Demographic data, smoking histories, and baseline computed tomographic features

| Patient Characteristics (n = 30) | Mean (SD) |
| --- | --- |
| Age, yr | 73 (8.5) |
| Male-to-female ratio | 24:6 |
| Smoking history | |
| Ex-smokers, no. | 23 (77%) |
| Nonsmokers, no. | 7 (23%) |
| Duration, mo | 41.1 (65.2) |
| CT findings | |
| Extent of IPF, % | 24.0 (12.8) |
| Presence of emphysema, no. | 13 (43%) |
| Appearances typical of IPF, no. | 17 (57%) |
| Appearances compatible with IPF, no. | 10 (33%) |
| Appearances typical of fibrotic NSIP, no. | 3 (10%) |

Definition of abbreviations: CT = computed tomography; IPF = idiopathic pulmonary fibrosis; NSIP = nonspecific interstitial pneumonia.

TABLE 2. Pulmonary function indices (mean of two measurements) and their reproducibility

| Pulmonary Function Indices | Mean (SD) | SDdiff | SDdiff/Mean Value (%) |
| --- | --- | --- | --- |
| Resting pulmonary function tests (n = 30) | | | |
| FEV1, % pred | 91.6 (20.5) | 6.64 | 7.2 |
| FVC, % pred | 81.2 (19.4) | 6.45 | 7.9 |
| TLC, % pred | 85.7 (17.2) | 4.71 | 5.5 |
| DLCO, % pred | 52.7 (16.8) | 4.87 | 9.2 |
| Resting gas exchange (n = 29) | | | |
| Arterial PO2, kPa | 10.7 (2.2) | NA* | NA* |
| Calculated A-a gradient, kPa | 2.8 (2.2) | NA* | NA* |
| 6-Minute walk (n = 29) | | | |
| Resting Borg dyspnea score | 0 (0–3.5) | Kw = 0.67 | NA |
| Post-Borg dyspnea score | 3.1 (1.6) | Kw = 0.79 | NA |
| Distance, m | 425.7 (143.3) | 17.9 | 4.2 |
| O2 desaturation | 8.8 (5.2) | 2.5 | 28.3 |
| Desaturation to 88% or lower | NA | K = 0.93 | NA |
| Maximal exercise testing (n = 24) | | | |
| VO2max, % pred | 60.9 (18.2) | 6.4 | 10.5 |
| Post-Borg dyspnea score | 3.8 (2.0) | Kw = 0.76 | NA |
| O2 desaturation | 8.0 (4.8) | 3.4 | 42.5 |
| O2 desaturation, adjusted for VO2max | 14.2 (9.5) | 7.2 | 50.7 |
| Composite indices | | | |
| CRP score (n = 23) | 25.1 (13.3) | 6.5 | 25.9 |
| CPI (n = 30) | 44.8 (13.4) | | |

*The reproducibility of resting arterial gas measurements was not evaluated.

The reproducibility of Borg dyspnea scores is stated as the weighted κ coefficient of agreement (Kw).

The reproducibility of desaturation to 88% or lower during the 6-minute walk is stated as the (nonweighted) κ coefficient of agreement (K).

Definition of abbreviations: CPI = composite physiologic index; CRP = clinical-radiologic-physiologic; DLCO = diffusion capacity of carbon monoxide; K = nonweighted κ coefficient of agreement; Kw = weighted κ coefficient of agreement; NA = not applicable; SDdiff = SD of differences between measurements; TLC = total lung capacity.

Indices are expressed as the SDdiff between the two measurements, given as absolute values and as percentages of the mean of the two measurements in the whole study group.

Reproducibility of Exercise Testing

Twenty-nine patients were available for 6MWT reproducibility analyses (one patient was not available for the second test). No patients declined the test; all patients completed it as per standard protocol. Within-subject reproducibility for 6MWT distance, illustrated in Figure 1, was higher than for any other variable in the present study (r = 0.98; SD/mean = 4.2%). By contrast, both measures of 6MWT desaturation were very poorly reproducible (Table 2). Measurement variation did not correlate significantly with disease severity (as judged by DlCO levels and the extent of disease on computed tomography).

Twenty-four patients completed the maximal exercise test protocol on two occasions. Four patients declined to perform maximal exercise testing at recruitment (nonspecific aversion to the test) and a further patient declined the second exercise test for the same reason. One patient developed transient ventricular ectopy during the first test, and a repeat test was not performed. No other adverse effects were observed; in all cases, patients reached symptom limitation with the test terminated by dyspnea.

Within-subject reproducibility for Vo2max verged on clinical acceptability (r = 0.88, SD/mean = 10.5%; Figure 2), although it was slightly lower than the reproducibility of resting variables. However, oxygen desaturation on exertion was associated with very major measurement variation, illustrated in Figure 3, and this increased further when corrected for measured Vo2max as a percentage of predicted Vo2 (r = 0.61, SD/mean = 50.7%).

Exercise variables were reevaluated in 17 patients with HRCT features typical of IPF. As shown in Table 3, reproducibility (good for 6MWT distance and desaturation to 88% or lower at the end of the 6MWT; poor for oxygen desaturation, for both 6MWT and maximal exertion) was virtually identical to reproducibility findings in the whole study population.

TABLE 3. Exercise variables and their reproducibility in patients with CT features typical of IPF

| Exercise Indices | Mean (SD) | SDdiff | SDdiff/Mean Value (%) |
| --- | --- | --- | --- |
| 6-Minute walk (n = 17) | | | |
| Resting Borg dyspnea score | 0 (0–3.5) | Kw = 0.73 | NA |
| Post-Borg dyspnea score | 3.2 (1.7) | Kw = 0.82 | NA |
| Distance, m | 398.9 (136.5) | 16.5 | 4.1 |
| O2 desaturation | 10.4 (5.2) | 2.6 | 25.0 |
| Desaturation to 88% or lower | NA | K = 1.00* | NA |
| Maximal exercise testing (n = 13) | | | |
| VO2max, % pred | 59.5 (21.1) | 7.2 | 12.1 |
| Post-Borg dyspnea score | 3.6 (2.5) | Kw = 0.81 | NA |
| O2 desaturation | 9.3 (4.6) | 2.8 | 30.1 |
| O2 desaturation, adjusted for VO2max | 17.1 (10.3) | | |

*The reproducibility of desaturation to 88% or lower during the 6-minute walk is stated as the (nonweighted) κ coefficient of agreement (K).

The reproducibility of Borg dyspnea scores is stated as the weighted κ coefficient of agreement (Kw).

Definition of abbreviations: CT = computed tomography; IPF = idiopathic pulmonary fibrosis; K = nonweighted κ coefficient of agreement; Kw = weighted κ coefficient of agreement; NA = not applicable; SDdiff = SD of differences between measurements.

Variables are expressed as the mean of two measurements, the SDdiff between the two measurements, given as absolute values and as percentages of the mean of the two measurements in 17 patients with high-resolution computed tomography appearances typical of idiopathic pulmonary fibrosis.

Reproducibility of the Composite Scores

Within-subject reproducibility of the composite physiologic index was clinically acceptable (SD/mean = 7.8%). By contrast, the physiologic component of the CRP score was associated with considerable measurement variation (SD/mean = 25.9%), reflecting the major contribution made by oxygen desaturation on maximal exercise testing to the CRP score.

We examined all variables for an order effect. Mean values for all resting variables declined slightly at Visit 2, but on paired t testing, this was statistically significant only for TLC (p < 0.05). For exercise variables, both Vo2max and 6MWT distance increased at the second visit, although only the 6MWT distance reached significance on paired t testing, 416.9 (141.1) m increasing to 434.6 (146.9) m (p < 0.005).

Correlations between Exercise Variables and Other Data

Highly significant correlations were observed between Vo2max, DlCO, and 6MWT distance (Tables 4 and 5; Figures 4–6). By contrast, there were no significant correlations between exercise variables and PaO2, lung volume indices, or extent of disease on HRCT. There was a strikingly positive correlation between the 6MWT distance and Vo2max (rs = 0.78; Figure 4) and significant positive correlations between the 6MWT distance and DlCO (rs = 0.61; Figure 5) and between Vo2max and DlCO (rs = 0.65; Figure 6). However, the 6MWT distance bore little relationship to oxygen desaturation on maximal exercise, both unadjusted (Figure 7) and adjusted for respiratory work (as in the physiologic component of the CRP score).

TABLE 4. The relationships between 6-minute walk test data and other variables (resting pulmonary function tests, maximal exercise data, extent of disease on computed tomography), quantified using Spearman's rank correlation coefficient

| Variable | 6MWD | 6MWT O2 Desaturation (at end of test) | 6MWT O2 Desaturation Adjusted for Walk Distance |
| --- | --- | --- | --- |
| CT extent of IPF | −0.12 | −0.05 | 0.02 |
| DLCO, % pred | −0.61* | −0.33 | −0.62 |
| FEV1, % pred | −0.16 | −0.20 | 0.01 |
| FVC, % pred | 0.06 | −0.11 | −0.04 |
| TLC, % pred | −0.07 | −0.04 | 0.06 |
| VO2max | See Table 4 | 0.29 | −0.53 |
| O2 desaturation on maximal exercise | See Table 4 | 0.57 | 0.43 |
| O2 desaturation adjusted for VO2max | See Table 4 | 0.65* | 0.64 |
| CRP | See Table 4 | 0.66 | 0.59 |

*p < 0.0001.

p < 0.05.

Definition of abbreviations: CPI = composite physiologic index; CRP = clinical-radiologic-physiologic; CT = computed tomography; DLCO = diffusion capacity of carbon monoxide; IPF = idiopathic pulmonary fibrosis; 6MWT = 6-minute walk test; 6MWD = 6-minute walk distance; TLC = total lung capacity.

Data are stated as r values.

TABLE 5. The relationships between incremental maximal treadmill testing against the extent of disease (computed tomography) and lung function indices, quantified using Spearman's rank correlation coefficient

| Variable | VO2max on Maximal Exercise Testing | O2 Desaturation on Maximal Exercise Testing Adjusted for VO2max | CRP Score |
| --- | --- | --- | --- |
| CT extent of disease | − | | |
| DLCO, % pred | 0.65* | −0.40 | −0.60 |
| FEV1, % pred | −0.10 | −0.13 | −0.38 |
| FVC, % pred | 0.13 | −0.23 | −0.46 |
| TLC, % pred | 0.04 | −0.29 | −0.49 |

*p < 0.0001.

p < 0.05.

Definition of abbreviations: CPI = composite physiologic index; CT = computed tomography; DLCO = diffusion capacity of carbon monoxide; 6MWD = 6-minute walk distance; TLC = total lung capacity.

Data are stated as r values.

Bivariate equations were constructed with the 6MWT distance as the dependent variable and Vo2max as one covariate; the other variables were examined as the second covariate in separate equations. This showed, for all analyses, that Vo2max was a very strong determinant of the 6MWT distance, with no other variable having a significant relationship with 6MWT distance after adjustment for Vo2max.
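For readers who wish to reproduce this kind of analysis, Spearman's rank correlation coefficient (rs), used for the interrelationships in this section, can be sketched as the Pearson correlation of the ranked data, with tied values given their average rank. The function names and the five paired observations below are hypothetical and are not study data.

```python
from math import sqrt


def average_ranks(values):
    """Rank values from 1..n, giving tied values the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1  # positions are 0-based, ranks 1-based
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def pearson(x, y):
    """Pearson product moment correlation of two equally long series."""
    mean_x, mean_y = sum(x) / len(x), sum(y) / len(y)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def spearman_rs(x, y):
    """Spearman's rs: Pearson correlation computed on the ranks."""
    return pearson(average_ranks(x), average_ranks(y))


# Hypothetical paired values: Vo2max (% predicted) and 6MWT distance (m).
vo2max_pct = [45, 60, 72, 55, 80]
walk_m = [300, 420, 450, 430, 510]
rs = spearman_rs(vo2max_pct, walk_m)
print(f"rs = {rs:.2f}")
```

Because only ranks enter the calculation, rs is insensitive to the scale of either variable, which suits indices measured in different units.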

Survival in Relation to Resting and Exercise Variables

Follow-up was complete to death or to 4 years in all cases. There were 18 deaths during a median follow-up of 28 months, and the 4-year survival was 39%. Outcome was not related to resting pulmonary function indices or the 6MWT distance. The amplitude of desaturation during the 6MWT was the single strongest predictor of mortality (hazards ratio = 1.14, 95% confidence intervals = 1.04, 1.24; p < 0.005). Oxygen desaturation to 88% or lower at the end of the 6MWT was also associated with a higher mortality (hazards ratio = 2.92, confidence intervals = 1.04, 8.22; p = 0.04), shown in Figure 8. Maximal exercise variables predictive of mortality included percentage of Vo2max (hazards ratio = 0.96, confidence intervals = 0.93, 1.00; p = 0.03) and oxygen desaturation adjusted for Vo2max (hazards ratio = 1.06, confidence intervals = 1.01, 1.11; p = 0.02); unadjusted oxygen desaturation was not significantly related to outcome.
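A hazards ratio for a continuous variable applies per unit of that variable, so its clinical weight depends on the size of the difference considered. The sketch below rescales the reported ratio for the amplitude of 6MWT desaturation to a larger difference. It assumes that the reported ratio is per single unit of desaturation and that the effect is log-linear, as in a standard proportional hazards model; neither the unit nor the 5-unit example is stated in the text, and the function name is hypothetical.

```python
def hazard_ratio_for_difference(hr_per_unit, units):
    """Hazards ratio for a difference of `units`, given the ratio per unit.

    Under a log-linear proportional hazards model the ratio for k units
    is exp(k * beta) = (ratio per unit) ** k. The same rescaling applies
    to each confidence limit.
    """
    return hr_per_unit ** units


# Reported: hazards ratio 1.14 (95% confidence intervals 1.04, 1.24).
# Assumed for illustration: a 5-unit greater fall in saturation.
hr_point = hazard_ratio_for_difference(1.14, 5)
hr_low = hazard_ratio_for_difference(1.04, 5)
hr_high = hazard_ratio_for_difference(1.24, 5)
print(f"{hr_point:.2f} ({hr_low:.2f}, {hr_high:.2f})")
```

On these assumptions, a ratio that looks modest per unit corresponds to an approximate doubling of hazard across a 5-unit difference.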

Despite the cardinal role of serial pulmonary function tests in monitoring fibrotic IIP, no prospective evaluation has been performed of the reproducibility of exercise indices in a sizeable cohort of patients. We report major intertest variation in indices of maximal exercise used in routine evaluation in many centers. By contrast, Vo2max was acceptably reproducible, and the 6MWT distance exhibited minimal intertest variation.

The measurement of pulmonary function and exercise indices is integral to the assessment and monitoring of patients with IPF and may also provide powerful prognostic information and facilitate treatment decisions, including timing of lung transplantation (3, 5, 26). Thus, within-subject reproducibility is a crucial consideration. Serial trends in indices with low “measurement noise” can be interpreted with greater confidence. However, data on the reproducibility of exercise testing in IPF are surprisingly limited. In a study of six patients with a variety of restrictive lung diseases, “maximal” incremental cycle ergometry was reported as reproducible, but measurement variation was evaluated at preselected Vo2 levels (i.e., at 40 and 70% of predicted maximum) and not at the end of maximum exercise (27).
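One way to make "measurement noise" concrete is to convert the SD of differences into the range within which most test-retest differences would be expected to fall. The sketch below applies the conventional multiplier of 1.96 to SDdiff values taken from Table 2. This calculation is not reported in the study: it assumes that between-visit differences are approximately normally distributed around zero, which ignores the small learning effect seen for the 6MWT distance, and the names used are hypothetical.

```python
Z_95 = 1.96  # normal deviate enclosing about 95% of differences


def repeatability_limit(sd_diff):
    """Approximate bound on the absolute test-retest difference (95%)."""
    return Z_95 * sd_diff


# SDdiff values as reported in Table 2.
sd_diff_table2 = {
    "6MWT distance, m": 17.9,
    "FVC, % pred": 6.45,
    "VO2max, % pred": 6.4,
}

limits = {
    name: round(repeatability_limit(sd), 1)
    for name, sd in sd_diff_table2.items()
}
for name, limit in limits.items():
    print(f"{name}: +/- {limit}")
```

Read this way, a change in walk distance smaller than roughly 35 m between two tests a week apart would be difficult to distinguish from measurement variation, under the stated assumptions.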

The prognostic value of 6MWT data, notably oxygen desaturation at end exercise rather than the 6MWT distance, was an important additional finding, confirming the observations of Lama and coworkers (10). In both studies, the total amplitude of desaturation had a slightly higher prognostic value than observed desaturation to 88% or less. However, as shown in Table 2, the latter variable was strikingly reproducible in the present study, whereas the total amplitude of desaturation was not. For the same reason, the presence or absence of desaturation to 88% or lower at the end of the 6MWT may be a preferable staging variable, and this applies equally to less reproducible maximal exercise variables, which were similarly predictive of mortality.
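The agreement statistic used for the dichotomous variable (desaturation to 88% or lower, yes or no) is the nonweighted κ coefficient, which corrects observed agreement between the two visits for the agreement expected by chance. A minimal sketch follows; the function name and the eight paired observations are hypothetical and are not study data.

```python
def cohen_kappa(first, second):
    """Nonweighted kappa for two sets of categorical ratings."""
    n = len(first)
    if n == 0 or n != len(second):
        raise ValueError("need two equally long, non-empty series")
    categories = set(first) | set(second)
    # Proportion of patients classified identically at both visits.
    observed = sum(a == b for a, b in zip(first, second)) / n
    # Agreement expected by chance from the marginal proportions.
    expected = sum(
        (first.count(c) / n) * (second.count(c) / n) for c in categories
    )
    if expected == 1.0:
        raise ValueError("kappa is undefined when chance agreement is 1")
    return (observed - expected) / (1 - expected)


# Hypothetical: did SpO2 fall to 88% or lower at the end of each walk?
visit1 = [True, True, False, False, True, False, True, False]
visit2 = [True, True, False, False, True, False, False, False]
kappa = cohen_kappa(visit1, visit2)
print(f"K = {kappa:.2f}")
```

A value of 1 indicates complete agreement and a value of 0 indicates agreement no better than chance.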

In COPD, studies of maximal cycle ergometry suggest variable reproducibility (3.5–29%), although intermeasurement variation may be minimized with familiarity (28–31). We noted acceptable reproducibility of both treadmill Vo2max and post-Borg dyspnea scores in our study population. These results, in particular the acceptable reproducibility of maximum Vo2, indicate that the level of effort and respiratory work in the second test was truly comparable to that of the first test. Thus, the often large differences in desaturation variables are not spurious (because of major variations in patient tolerance) but are likely to reflect the fact that exercise becomes maximal when the patient enters the steep part of the oxygen desaturation curve. It appears that minor differences in patient tolerance and apparently trivial delays in stopping the test may have a disproportionate effect on final saturation. Furthermore, similar variation in oxygen desaturation was evident at the end of the 6MWT, despite the striking reproducibility of the 6MWT distance. Measurement variation is likely to be further exacerbated by the inherent noise of cutaneous oximetry, for which the 95% confidence limits may be as wide as ± 4 to 5% (32). Exercise arterial gas measurements, advocated by some authors, are often unattractive to both patients and clinicians.

To our knowledge, the reproducibility of the 6MWT has not been systematically evaluated in IPF. The 6MWT is widely acknowledged as an objective measure of functional capacity in COPD (23), with established credibility as an interventional outcome measure, in pulmonary rehabilitation, prescription of ambulatory oxygen, and as a predictor of morbidity and mortality. However, mechanisms of dyspnea differ between COPD and restrictive lung disease, and therefore extrapolation from COPD to IPF is unwarranted.

Our results show that the 6MWT distance is highly reproducible in fibrotic IIP. A small learning (training) effect was less than generally reported for COPD (33). It seems likely that with IPF, as has been shown in severe COPD (34), exercise intensity achieved during a 6MWT may approach maximal exercise. Vo2max had a particularly strong positive correlation with 6MWT distance. Therefore, the 6MWT distance may be regarded as a surrogate marker for Vo2 in IPF, as previously demonstrated in COPD, cardiac failure, and end-stage lung disease (35–37). The 6MWT offers a number of advantages over maximal exercise testing. It does not require sophisticated equipment. It may correlate better with quality-of-life indices than maximal exercise testing (38) and is seldom associated with patient aversion. Five of 30 study participants declined to undergo maximal treadmill testing, whereas none expressed reservations about the 6MWT, despite a lack of familiarity with either test. Furthermore, in fibrotic IIP, maximal exercise testing is often contraindicated by the severity of disease or by concurrent ischemic heart disease. In a study of 68 patients with IPF, maximal exercise testing was performed as part of protocol, but was contraindicated in more than 30% of cases (39). By contrast, the 6MWT is widely performed with little morbidity in elderly patients with significant cardiac failure (40).

The high reproducibility of spirometric volumes and lower but acceptable reproducibility of gas transfer in the present study is in keeping with previous reports (27, 41). It has recently been reported that short-term trends in gas transfer (5) or FVC (6, 7) are the most accurate determinants of survival in fibrotic IIP, with FVC trends sometimes easier to interpret, because of lower variability. In this regard, the high reproducibility of the composite physiologic index is reassuring; this composite index corrects for concurrent emphysema, which may mask disease progression by causing FVC levels to be spuriously preserved (1). By contrast, the physiologic component of the old CRP score (9) was unacceptably variable, even though resting arterial gases (a small component of the CRP score) were not repeated but analyzed as identical results.

Our study, a prospective evaluation of consecutive patients meeting clinical criteria for IPF, was designed specifically to capture the spectrum of disease encountered in routine clinical practice. HRCT appearances were either typical of IPF or, in a large minority of cases, compatible with either IPF or fibrotic NSIP. In the latter scenario, usual interstitial pneumonia is found at biopsy in the majority of cases (13). On the basis of this finding, and the poor survival in our own population, we estimate that the proportion of patients with underlying histologic appearances of fibrotic NSIP was unlikely to have exceeded 20%, the proportion of NSIP in the population in which the prognostic value of the 6MWT was reported (10). We were reluctant to exclude patients with possible fibrotic NSIP as the outcome in this disorder has significant overlap with IPF (5, 42). More important, expert radiologic evaluation in distinguishing between usual interstitial pneumonia and NSIP on HRCT is not always available in routine clinical practice. However, the findings in the present study changed minimally when 13 patients with “intermediate” HRCT appearances were excluded from analysis.

Our study population differed from patients with IPF who underwent biopsy in another respect: concurrent emphysema was evident on HRCT in more than 40% of cases. This finding may reflect the relatively advanced age of our patient group, another aspect that is representative of routine clinical practice. The high prevalence of coexisting emphysema is likely to have contributed to a relative preservation of lung volumes in our patients (1); as a result, mean FVC levels are higher than in many studies of younger patients with fibrotic IIP. However, the reproducibility of resting and exercise functional variables was not linked to the functional severity of disease at baseline. Because a wide range of disease severity was studied, our findings can be applied in routine practice.

A more important limitation was the logistic need to separate maximal exercise testing by 1 week. Confounding factors that might, in theory, reduce reproducibility were minimized. Patients exercised at the same time of day, were clinically stable, with no changes in medication, and there was careful standardization of patient instructions, protocol, operator, equipment, and calibration. Despite this, it is theoretically possible that even at an interval of 1 week, progression of disease might have occurred in some cases. Furthermore, in two cases, a decline in spirometric volumes in excess of 10% was noted, despite the absence of symptomatic deterioration. However, removal of these cases did not materially improve the reproducibility of exercise variables and, in any case, progression of disease might be expected to influence all variables equally.

In designing this study, we aimed to reproduce the application of exercise testing in the routine evaluation of diffuse lung disease. Patient characteristics were typical of populations with fibrotic IIP managed outside major referral centers. The use of oximetry rather than arterial gases reflects clinical practicability. Similarly, the 6MWT and maximal exercise testing were performed on only two occasions. The performance of two practice walks has been advocated by some (43); however, our findings support the recent ATS guidelines for the 6MWT, which state clearly that only one practice walk is required (23). The reproducibility of the 6MWT distance appears to be exceedingly high in IPF, and highly unlikely to improve materially with repeated testing.


In fibrotic IIP, distance walked during the 6MWT is highly reproducible. Oxygen consumption on maximal exercise is moderately reproducible and correlates strikingly with the 6MWT distance. Oxygen desaturation parameters on exercise are associated with unacceptable measurement variation. These data indicate that, in the routine evaluation of fibrotic IIP, the 6MWT has major advantages over maximal exercise testing on reproducibility grounds.

The authors gratefully acknowledge clinical research assistance from Ms. S. Rudkin and W. Fergusson.

1. Wells AU, Desai SR, Rubens MB, Goh NSL, Cramer D, Nicholson AG, Colby TV, du Bois RM, Hansell DM. Idiopathic pulmonary fibrosis: a composite physiological index derived from disease extent observed by computed tomography. Am J Respir Crit Care Med 2003;167:962–969.
2. Schwartz DA, Helmers RA, Galvin JR, Van Fossen DS, Frees KL, Dayton CS, Burmeister LF, Hunninghake GW. Determinants of survival in idiopathic pulmonary fibrosis. Am J Respir Crit Care Med 1994;149:450–454.
3. Erbes R, Schaber T, Loddenkemper R. Lung function tests in patients with idiopathic pulmonary fibrosis: are they helpful for predicting outcome? Chest 1997;111:51–57.
4. King TE Jr, Schwarz MI, Brown K, Tooze JA, Colby TV, Waldron JA Jr, Flint A, Thurlbeck W, Cherniack RM. Idiopathic pulmonary fibrosis: relationship between histopathologic features and mortality. Am J Respir Crit Care Med 2001;164:1025–1032.
5. Latsi PI, du Bois RM, Nicholson AG, Colby TV, Bisirtzoglou D, Nikolakopoulou A, Veeraraghavan S, Hansell DM, Wells AU. Fibrotic idiopathic interstitial pneumonia: the prognostic value of longitudinal functional trends. Am J Respir Crit Care Med 2003;168:531–537.
6. Collard HR, King TE Jr, Bartelson BB, Vourlekis JS, Schwarz MI, Brown KK. Changes in clinical and physiologic variables predict survival in idiopathic pulmonary fibrosis. Am J Respir Crit Care Med 2003;168:538–542.
7. Flaherty KR, Mumford JA, Murray S, Kazerooni EA, Gross BH, Colby TV, Travis WD, Flint A, Toews GB, Lynch JP III, et al. Prognostic implications of physiologic and radiographic changes in idiopathic interstitial pneumonia. Am J Respir Crit Care Med 2003;168:543–548.
8. Noble PW, Morris DG. Time will tell: predicting survival in idiopathic interstitial pneumonia [editorial]. Am J Respir Crit Care Med 2003;168:510–511.
9. Watters LC, King TE, Schwarz MI, Waldron JA, Stanford RE, Cherniack RM. A clinical, radiographic, and physiological scoring system for the longitudinal assessment of patients with idiopathic pulmonary fibrosis. Am Rev Respir Dis 1986;133:97–103.
10. Lama VN, Flaherty KR, Toews GB, Colby TV, Travis WD, Long Q, Murray S, Kazerooni EA, Gross BH, Lynch JP III, et al. Prognostic value of desaturation during a 6-minute walk test in idiopathic interstitial pneumonia. Am J Respir Crit Care Med 2003;168:1084–1090.
11. Eaton T, Rudkin S, Milne D, Wells AU. The reproducibility of oxygen desaturation on maximal exercise testing in cryptogenic fibrosing alveolitis [abstract]. Am J Respir Crit Care Med 1999;159:A64.
12. American Thoracic Society/European Respiratory Society international multidisciplinary consensus classification of the idiopathic interstitial pneumonias. Am J Respir Crit Care Med 2002;165:277–304.
13. Flaherty KR, Thwaite EL, Kazerooni EA, Gross BH, Toews GB, Colby TV, Travis WD, Mumford JA, Murray S, Flint A, et al. Radiological versus histological diagnosis in UIP and NSIP: survival implications. Thorax 2003;58:143–148.
14. MacDonald SL, Rubens MB, Hansell DM, Copley SJ, Desai SR, du Bois RM, Nicholson AG, Colby TV, Wells AU. Nonspecific interstitial pneumonia and usual interstitial pneumonia: comparative appearances and diagnostic accuracy of high-resolution computed tomography. Radiology 2001;221:600–605.
15. Wells AU, Rubens MB, du Bois RM, Hansell DM. Serial CT in fibrosing alveolitis: prognostic significance of the initial pattern. AJR Am J Roentgenol 1993;161:1159–1165.
16. Wells AU, Hansell DM, Rubens MB, Cullinan P, Black CM, du Bois RM. The predictive value of appearances on thin section computed tomography in fibrosing alveolitis. Am Rev Respir Dis 1993;148:1076–1082.
17. American Thoracic Society. Standardization of spirometry: 1994 update. Am J Respir Crit Care Med 1995;152:1107–1136.
18. American Thoracic Society. Single-breath carbon monoxide diffusing capacity (transfer factor): recommendations for a standard technique—1995 update. Am J Respir Crit Care Med 1995;152:2185–2198.
19. Morris JF. Spirometry in the evaluation of pulmonary function: medical progress. West J Med 1976;125:110–111.
20. Goldman HI, Becklake MR. Respiratory function tests: normal values at median altitudes and the prediction of normal results. Am Rev Tuberc 1959;79:457–467.
21. Burrows B, Kasik JE, Niden AH, Barclay WR. Clinical usefulness of the single-breath pulmonary diffusing capacity test. Am Rev Respir Dis 1961;84:789–806.
22. Guyatt GH, Pugsley SO, Sullivan MJ, Thompson PJ, Berman LB, Jones NL, Fallen EL, Taylor DW. Effect of encouragement on walking test performance. Thorax 1984;39:818–822.
23. American Thoracic Society statement. Guidelines for the six-minute walk test. Am J Respir Crit Care Med 2002;166:111–117.
24. American Thoracic Society/American College of Chest Physicians statement on cardiopulmonary exercise testing. Am J Respir Crit Care Med 2003;167:211–277.
25. Chinn S. Statistics in respiratory medicine: repeatability and method comparison. Thorax 1991;46:454–456.
26. Mogulkoc N, Brutsche MH, Bishop PW, Greaves SM, Horrocks AW, Egan JJ. Pulmonary function in idiopathic fibrosis and referral for lung transplantation. Am J Respir Crit Care Med 2001;164:103–108.
27. Marciniuk DD, Watts RE, Gallagher CG. Reproducibility of incremental maximal cycle ergometer testing in patients with restrictive lung disease. Thorax 1993;48:894–898.
28. Brown SE, Fischer CE, Stansbury DW, Light RW. Reproducibility of VO2max in patients with chronic airflow obstruction. Am Rev Respir Dis 1985;131:435–438.
29. Cox NJM, Hendriks JCM, Binkhorst RA, Folgering HT, van Herwaarden CLA. Reproducibility of incremental maximal cycle ergometer tests in patients with mild to moderate obstructive lung diseases. Lung 1989;167:129–133.
30. Swinburn CR, Wakefield JM, Jones PW. Performance, ventilation and oxygen consumption in three different types of exercise test in patients with chronic obstructive lung disease. Thorax 1985;40:581–586.
31. Owens MW, Kinasewitz GT, Strain DS. Evaluating the effect of chronic therapy in patients with irreversible air-flow obstruction. Am Rev Respir Dis 1986;134:935–937.
32. Ries AL, Farrow JT, Clausen JL. Accuracy of two ear oximeters at rest and during exercise in pulmonary patients. Am Rev Respir Dis 1985;132:685–689.
33. Stevens D, Elpern E, Sharma K, Szidon P, Ankin M, Kesten S. Comparison of hallway and treadmill six-minute walk tests. Am J Respir Crit Care Med 1999;160:1540–1543.
34. Sciurba F, Criner GJ, Lee SM, Mohsenifar Z, Shade D, Slivka W, Wise RA. Six-minute walk distance in chronic obstructive pulmonary disease: reproducibility and effect of walking course layout and length. Am J Respir Crit Care Med 2003;167:1522–1527.
35. Wijkstra PJ, TenVergert EM, van der Mark TW, Postma DS, Van Altena R, Kraan J, Koeter GH. Relation of lung function, maximal inspiratory pressure, dyspnoea, and quality of life with exercise capacity in patients with chronic obstructive pulmonary disease. Thorax 1994;49:468–472.
36. Cahalin LP, Mathier MA, Semigran MJ, Dec GW, DiSalvo TG. The six-minute walk test predicts peak oxygen uptake and survival in patients with advanced heart failure. Chest 1996;110:325–332.
37. Cahalin L, Pappagianopoulos P, Prevost S, Wain J, Ginns L. The relationship of the 6-min walk test to maximal oxygen consumption in transplant candidates with end-stage lung disease. Chest 1995;108:452–459.
38. Guyatt GH, Thompson PJ, Berman LB, Sullivan MJ, Townsend M, Jones NL, Pugsley SO. How should we measure function in patients with chronic heart and lung disease? J Chronic Dis 1985;38:517–524.
39. Wells AU, King AD, Rubens MB, Cramer D, du Bois RM, Hansell DM. Lone cryptogenic fibrosing alveolitis: a functional-morphological correlation based on extent of disease on thin-section computed tomography. Am J Respir Crit Care Med 1997;155:1367–1375.
40. Bittner V, Weiner DH, Yusuf S, Rogers WJ, McIntyre KM, Bangdiwala SI, Kronenberg MW, Kostis JB, Kohn RM, Guillotte M, et al. Prediction of mortality and morbidity with a 6-minute walk test in patients with left ventricular dysfunction. JAMA 1993;270:1702–1707.
41. Wanger J, Irvin C. Comparability of pulmonary function results from 13 laboratories in a metropolitan area. Respir Care 1991;36:1375–1382.
42. Nicholson AG, Fulford LG, Colby TV, du Bois RM, Hansell DM, Wells AU. The relationship between individual histologic features and disease progression in idiopathic pulmonary fibrosis. Am J Respir Crit Care Med 2002;166:173–177.
43. Solway S, Brooks D, Lacasse Y, Thomas S. A qualitative systematic overview of the measurement properties of functional walk tests used in the cardiorespiratory domain. Chest 2001;119:256–270.
Correspondence and requests for reprints should be addressed to Athol U. Wells, M.D., Consultant Respiratory Physician, Interstitial Lung Disease Unit, Emmanuel Kaye Building, Manresa Road, Chelsea, London SW3 6LR, UK. E-mail:

