American Journal of Respiratory and Critical Care Medicine

Rationale: High-resolution computed tomography (HRCT) has been suggested as a potential outcome surrogate for cystic fibrosis (CF) lung disease. An important attribute of a valid outcome surrogate is that the surrogate reflects true clinical outcomes.

Objectives: We performed this study to validate HRCT, a proposed surrogate outcome measure for CF lung disease, against a true clinical outcome, the number of respiratory tract exacerbations occurring in 2 yr, and to assess the correlation of CT scores and pulmonary function tests (PFTs) with this clinical outcome.

Methods: CTs and PFTs were performed on 6- to 10-yr-old children at the beginning and end of a 2-yr study during which the number of exacerbations was recorded. Spearman correlations and Poisson models were used to assess the correlation of the number of exacerbations with baseline values and changes in PFTs and CT scores.

Measurements and Main Results: Nine of 61 subjects had a total of 22 respiratory tract exacerbations. At baseline, PFTs and three CT scores showed significant correlation with number of exacerbations, but no variable by itself predicted exacerbations with high accuracy. For change over the 2-yr period, three CT scores showed significant correlation with exacerbations, whereas no PFTs showed significant correlation.

Conclusion: This is the first study showing correlation between CT and a true clinical outcome. Change in CT scores correlates moderately well with the number of exacerbations. Poor correlation between change in FEV1 and exacerbations suggests that HRCT may be a more appropriate outcome surrogate for longitudinal studies of young children.

Outcome surrogates are short-term measures that reflect long-term outcomes. Using outcome surrogates can decrease both study duration and sample size. The use of outcome surrogates is well established in both research and in the regulatory process for new drugs. Drugs can be approved on the basis of changes in outcome surrogates alone (1). Outcome surrogates, however, must be validated for each disease process to ensure accurate results from their use (2). The development of valid outcome surrogates for cystic fibrosis (CF) lung disease has been recognized as an important need in CF research.

An essential component of the validation of an outcome surrogate is the demonstration that changes in the proposed surrogate correlate with true outcomes. This study was performed to evaluate the correlation between CF lung disease as assessed by high-resolution computed tomography (HRCT) scanning and a true outcome measure of morbidity: the number of respiratory tract exacerbations (RTEs) experienced over a 2-yr period. HRCT was also compared with pulmonary function tests (PFTs) in terms of how well each correlated with number of RTEs. Among PFTs, FEV1 is the only surrogate outcome measure currently accepted by the U.S. Food and Drug Administration for CF lung disease (3).

Some of the results of this study have been previously reported in the form of an abstract (4).

HRCTs were performed at the beginning and end of a multicenter 2-yr study of inhaled dornase alfa (Pulmozyme) (5). Entry criteria for this study included age between 6 and 10 yr at entry, a diagnosis of CF by pilocarpine iontophoresis or genotyping, and an FVC of 85% predicted or greater for age. During the 2-yr period, all RTEs were recorded. An RTE was defined as an event requiring hospitalization and intravenous antibiotics because of worsening respiratory signs and symptoms.

HRCT scans and PFTs were performed on the same day. PFTs were performed according to American Thoracic Society guidelines (6). PFT values were expressed as percent-predicted values from the Harvard Six Cities Study equations (7).

HRCT scans were obtained with inspiratory images at 10-mm intervals and four expiratory images at the following anatomic levels: 0.5 cm above the aortic arch, at the carina, halfway between the carina and 1 cm above the higher hemidiaphragm, and 1 cm above the higher hemidiaphragm. HRCT scans were scored independently by two thoracic radiologists.

The scoring system recorded the presence and severity of specific findings of CF lung disease in each lobe of the lung. Bronchiectasis severity was defined as the size of the dilated bronchi relative to an accompanying vessel. Mucous plugging was defined as the presence of bronchi with opacification of the lumen, centrilobular nodules, or peripheral branching structures. Peribronchial thickening was defined as a bronchial wall thickness greater than 2 mm in the hilar region, 1 mm in the central lung, or 0.5 mm in the peripheral lung. For these findings, a severity value was determined on the basis of the appearance. Parenchymal opacities, areas of ground glass opacity, and cysts or bullae were assessed for the overall volume of the lobe occupied by each abnormality. Air trapping was defined as well-circumscribed areas of decreased parenchymal density on expiratory images and was assessed either as subsegmental or as segmental or larger. For all findings, the severity value was multiplied by the extent of the abnormality to produce a score for that finding. These scores were added to produce an overall score. All scores were normalized to a scale of 0 to 100 with increasing score indicating increasing disease.
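The arithmetic of the scoring system described above can be sketched as follows. This is a minimal illustration, not the authors' scoring software: the `LobeGrade` type, the 0 to 3 grading ranges, and the number of lobes passed in are all assumptions made for the example, since the text does not give the grade ranges.

```python
from dataclasses import dataclass
from typing import Sequence


@dataclass(frozen=True)
class LobeGrade:
    """Grades for one finding in one lobe (ranges are hypothetical)."""
    severity: int  # assumed ordinal grade, 0 = finding absent
    extent: int    # assumed ordinal grade, 0 = none of the lobe involved


def finding_score(lobes: Sequence[LobeGrade],
                  max_severity: int = 3,
                  max_extent: int = 3) -> float:
    """Multiply severity by extent in each lobe, sum over lobes,
    and rescale the sum to 0-100 (higher means more disease)."""
    if not lobes:
        raise ValueError("at least one lobe is required")
    raw = sum(g.severity * g.extent for g in lobes)
    max_raw = len(lobes) * max_severity * max_extent
    return 100.0 * raw / max_raw


# Example: one mildly affected lobe out of six, the rest normal.
example = [LobeGrade(1, 2)] + [LobeGrade(0, 0)] * 5
print(round(finding_score(example), 1))
```

An overall score would, under the same assumptions, add the raw finding scores and rescale by the sum of their maxima.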

Baseline and end-study values for the PFT and HRCT measures were recorded. The change in values was obtained by subtracting the baseline value from the value at the end of the study.

Spearman rank correlations were calculated between number of RTEs and the HRCT and PFT values at baseline as well as the change in the values between the beginning and end of the study. Poisson models were used to determine whether age, sex, or treatment group (Pulmozyme vs. placebo) affected the correlation between number of RTEs and baseline HRCT and PFT scores. Standard linear models were used to determine whether age, sex, or treatment group affected the correlation between change in score and number of RTEs.
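For readers unfamiliar with the statistic, a Spearman rank correlation is a Pearson correlation computed on ranks, with tied values sharing their average rank. The sketch below is a generic standard-library implementation, not the software used in the study; the function names are chosen for this example. Ties matter here because most subjects in a study like this have the same RTE count of zero.

```python
from math import sqrt
from typing import List, Sequence


def average_ranks(values: Sequence[float]) -> List[float]:
    """Return 1-based ranks; tied values receive the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j to cover the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1  # mean of the 1-based positions i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


def spearman(x: Sequence[float], y: Sequence[float]) -> float:
    """Spearman rank correlation: Pearson correlation of the average ranks."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("x and y must have the same length of at least 2")
    rx, ry = average_ranks(x), average_ranks(y)
    n = len(rx)
    mx, my = sum(rx) / n, sum(ry) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    vx = sum((a - mx) ** 2 for a in rx)
    vy = sum((b - my) ** 2 for b in ry)
    if vx == 0 or vy == 0:
        raise ValueError("correlation is undefined when one variable is constant")
    return cov / sqrt(vx * vy)
```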

PFTs and HRCTs at the beginning and end of the study were available for 61 children. Age at the baseline HRCT scan ranged from 6 to 11 yr (mean, 8.3 yr; SD, 1.6 yr). There were 25 girls and 36 boys. Twenty-nine children received dornase alfa and 32 received placebo. There were no significant differences at baseline or in the 2-yr change in HRCT scores or PFTs between the two groups. Of the 61 subjects, nine experienced a total of 22 RTEs. Five subjects experienced one RTE, and one subject each experienced two, three, four, and eight RTEs. There was no difference in the number of exacerbations per subject between the two treatment groups (Wilcoxon rank sum test two-sided p value, 0.87). The two groups were therefore combined for analysis. PFT and HRCT scores at baseline and at the end of the study are shown in Table 1.

TABLE 1. Pulmonary function test values and high-resolution computed tomography scores (n = 61)

| Measure | Baseline, Mean (SE) | End Study, Mean (SE) | Change (End Study − Baseline), Mean (SE) | Two-sided p Value |
|---|---|---|---|---|
| FEV1 % predicted | 99.4 (2.1) | 97.4 (2.2) | −2.00 (1.66) | 0.23 |
| FVC % predicted | 106.0 (1.6) | 103.3 (1.8) | −2.71 (1.27) | 0.04 |
| FEF25–75 % predicted | 89.2 (3.7) | 89.9 (4.1) | 0.72 (3.98) | 0.86 |
| HRCT overall score | 7.4 (0.6) | 10.5 (0.7) | 3.17 (0.46) | < 0.0001 |
| Bronchiectasis score | 3.1 (0.6) | 5.2 (0.9) | 2.04 (0.56) | 0.001 |
| Mucous plugging score | 2.3 (0.6) | 2.3 (0.7) | −0.05 (0.37) | 0.90 |
| Peribronchial thickening score | 9.9 (0.6) | 11.7 (0.7) | 1.83 (0.63) | 0.01 |
| Parenchymal abnormalities score | 1.9 (0.5) | 3.5 (0.7) | 1.55 (0.68) | 0.03 |
| Hyperinflation (air trapping) score | 23.8 (2.6) | 38.1 (2.4) | 14.22 (2.44) | < 0.0001 |

Definition of abbreviation: HRCT = high-resolution computed tomography.
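The relationship between the columns of Table 1 can be illustrated with a short sketch. The text does not state which test produced the two-sided p values; the code below assumes a paired comparison in which the test statistic is the mean change divided by its standard error, and all function names are chosen for this example.

```python
from math import sqrt
from statistics import mean, stdev
from typing import List, Sequence, Tuple


def change_from_baseline(baseline: Sequence[float],
                         end_study: Sequence[float]) -> List[float]:
    """Per-subject change: end-of-study value minus baseline value."""
    if len(baseline) != len(end_study):
        raise ValueError("paired measurements are required")
    return [e - b for b, e in zip(baseline, end_study)]


def mean_and_se(changes: Sequence[float]) -> Tuple[float, float]:
    """Mean change and its standard error (sample SD divided by sqrt(n))."""
    return mean(changes), stdev(changes) / sqrt(len(changes))


def t_statistic(mean_change: float, se_change: float) -> float:
    """Statistic for a paired comparison (an assumed test, see lead-in)."""
    return mean_change / se_change


# Using the reported Table 1 summaries for the change columns:
print(round(t_statistic(-2.00, 1.66), 2))  # FEV1 % predicted
print(round(t_statistic(-2.71, 1.27), 2))  # FVC % predicted
print(round(t_statistic(3.17, 0.46), 2))   # HRCT overall score
```

Under this assumption, a ratio small in magnitude (as for FEV1) goes with a nonsignificant p value, and a large ratio (as for the HRCT overall score) goes with a very small one.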

Predicting Number of RTEs with Baseline HRCT Scores and PFTs

At baseline, percent-predicted values for the three PFT measures (FEV1, FVC, and forced expiratory flow, midexpiratory phase [FEF25–75]) correlated significantly with the number of RTEs (Table 2). The highest magnitude of correlation was with FEV1 percent predicted (r = −0.40, p = 0.002). At baseline, the HRCT overall score, bronchiectasis score, and mucous plugging score also correlated significantly with the number of RTEs. The highest correlation was with mucous plugging (r = 0.30, p = 0.02). Figure 1A shows number of RTEs versus baseline mucous plugging score. The variability in baseline mucous plugging score is large for the group of subjects who experienced no RTEs over the 2-yr period. Subgroup analysis of those subjects who experienced one or more RTEs showed significant correlation for number of RTEs only with baseline mucous plugging score (r = 0.77, p = 0.01) and baseline peribronchial thickening score (r = 0.72, p = 0.03; Table E2 in the online supplement).

TABLE 2. Spearman rank correlation with the number of respiratory tract exacerbations

| Variable Name | Estimate | 95% CI | p Value |
|---|---|---|---|
| Baseline (n = 61) | | | |
| HRCT overall score | 0.28 | (0.02, 0.50) | 0.03 |
| Bronchiectasis score | 0.28 | (0.02, 0.50) | 0.03 |
| Mucous plugging score | 0.30 | (0.04, 0.52) | 0.02 |
| Peribronchial score | 0.15 | (−0.11, 0.39) | 0.26 |
| Parenchyma score | 0.19 | (−0.07, 0.43) | 0.15 |
| Hyperinflation score | 0.20 | (−0.06, 0.44) | 0.12 |
| FEV1 % predicted | −0.40 | (−0.60, −0.16) | 0.002 |
| FVC % predicted | −0.31 | (−0.53, −0.06) | 0.02 |
| FEF25–75 % predicted | −0.36 | (−0.57, −0.11) | 0.004 |
| Two-year change from baseline (n = 61) | | | |
| HRCT overall score | 0.32 | (0.07, 0.53) | 0.01 |
| Bronchiectasis score | 0.35 | (0.10, 0.56) | 0.005 |
| Mucous plugging score | 0.15 | (−0.11, 0.39) | 0.25 |
| Peribronchial score | 0.17 | (−0.09, 0.41) | 0.18 |
| Parenchyma score | 0.26 | (0.00, 0.49) | 0.05 |
| Hyperinflation score | 0.07 | (−0.19, 0.32) | 0.57 |
| Relative change FEV1 % predicted | 0.20 | (−0.06, 0.44) | 0.13 |
| Relative change FVC % predicted | 0.07 | (−0.19, 0.32) | 0.62 |
| Relative change FEF25–75 % predicted | 0.14 | (−0.12, 0.38) | 0.29 |

Definition of abbreviations: CI = confidence interval; HRCT = high-resolution computed tomography.
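The 95% confidence intervals reported in Table 2 can be approximated with a Fisher z transformation. The text does not say how the intervals were computed; the sketch below assumes the transformation with a variance of 1.06/(n − 3), an adjustment sometimes used for Spearman coefficients, and the function name and defaults are chosen for this example. With n = 61, this assumption appears to reproduce the tabulated limits to two decimal places.

```python
from math import atanh, sqrt, tanh
from typing import Tuple


def spearman_ci(r: float, n: int,
                z_crit: float = 1.959964,
                variance_factor: float = 1.06) -> Tuple[float, float]:
    """Approximate confidence interval for a Spearman correlation.

    Assumption: Fisher z transform with variance variance_factor / (n - 3).
    z_crit defaults to the two-sided 95% normal quantile.
    """
    if n <= 3:
        raise ValueError("n must be greater than 3")
    if not -1.0 < r < 1.0:
        raise ValueError("r must lie strictly between -1 and 1")
    se = sqrt(variance_factor / (n - 3))
    z = atanh(r)
    return tanh(z - z_crit * se), tanh(z + z_crit * se)


# Baseline HRCT overall score and baseline FEV1 % predicted from Table 2:
for r in (0.28, -0.40):
    lo, hi = spearman_ci(r, 61)
    print(r, round(lo, 2), round(hi, 2))
```

Intervals that exclude zero, such as the one for baseline FEV1 percent predicted, correspond to the correlations reported as significant.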

Because no single covariate predicted number of RTEs with much accuracy, Poisson models that included age, sex, and treatment group, as well as HRCT and PFT scores as candidate predictor variables, were fit to attempt to improve prediction. The result of a stepwise fit yielded a model with much better predictive power. The Poisson model included baseline FEV1 percent predicted, baseline mucous plugging score, sex, baseline age, and sex by mucous plugging score interaction (mucous plugging was more important for predicting number of RTEs in females compared with males, given all other variables in the model). Figure 1B displays the observed versus predicted number of RTEs based on this Poisson model. Model results, including estimated coefficients, are given in the online supplement.
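The structure of a Poisson model with the terms listed above can be sketched as a log-linear predictor. This is only an illustration of the model form: the estimated coefficients are in the online supplement and are not reproduced here, so the caller must supply them, and the function name, dictionary keys, and 0/1 coding of sex are assumptions made for this example.

```python
from math import exp
from typing import Mapping


def expected_rtes(coef: Mapping[str, float],
                  fev1_pct: float,
                  mucus_score: float,
                  female: int,
                  age_yr: float) -> float:
    """Expected number of RTEs under a log-linear Poisson model.

    coef holds caller-supplied coefficients (hypothetical key names).
    female is coded 1 for girls and 0 for boys (an assumed coding).
    """
    linear_predictor = (
        coef["intercept"]
        + coef["fev1"] * fev1_pct
        + coef["mucus"] * mucus_score
        + coef["female"] * female
        + coef["age"] * age_yr
        # Interaction: lets the mucous plugging effect differ by sex.
        + coef["female_x_mucus"] * female * mucus_score
    )
    return exp(linear_predictor)
```

With a positive interaction coefficient, each additional point of mucous plugging score multiplies the expected count by exp(mucus + female_x_mucus) in girls but only by exp(mucus) in boys, which is the kind of sex difference the text describes.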

Predicting Change in HRCT Scores and PFTs with Number of RTEs

None of the PFT measures showed a significant correlation between the change over 2 yr and the number of RTEs (Table 2). In fact, all correlations involving change in PFTs were unexpectedly positive, which would indicate that someone who experiences more RTEs tends to improve in PFTs over time. On the other hand, the 2-yr change in overall, bronchiectasis, and parenchymal disease HRCT scores correlated significantly with number of RTEs. The highest correlation was with the bronchiectasis score (r = 0.35, p = 0.005; 95% confidence interval, 0.10–0.56). Figure 2A shows change in HRCT overall score versus number of RTEs and Figure 2B shows change in FEV1 percent predicted versus number of RTEs. On the basis of standard linear models with change from baseline as the response variable, the covariates sex, age, and treatment group were not significant predictors for change in HRCT overall score, bronchiectasis score, or FEV1 percent predicted. For the subjects who experienced one or more RTEs, only the HRCT overall score (r = 0.91, p = 0.001) and the bronchiectasis score (r = 0.83, p = 0.01) showed significant correlation with number of RTEs (Table E2).

PFTs, and specifically FEV1, are the measures most frequently used to assess the severity of lung disease in people with CF. In the Cystic Fibrosis Foundation's 2002 patient registry annual data report, FEV1 is the only measure used to describe disease severity (15). Two conferences on outcome measures for CF have concluded that only FEV1 is sufficiently well studied to be relied on as an outcome surrogate (3, 16).

Studies have shown, however, that PFTs are insensitive to the presence of early lung disease (9, 17) and that PFTs are limited in their ability to show the progression of lung disease (18, 19). In the present study, PFTs at baseline correlated with the number of RTEs (although the predictive power was poor), but the change in PFTs did not. HRCT scores also correlated with RTEs at baseline, and the change in HRCT scores also correlated with the number of RTEs. With intervention trials based on comparing the change in value over the course of the trial, this study suggests that HRCT scores may be a more useful outcome surrogate in a clinical trial.

All specific findings on HRCT progressed over the 2-yr period except for mucous plugging, which did not change (Table 1). The greatest change was seen in air trapping, but this change did not correlate with the number of exacerbations. The use of voluntary inspiratory and expiratory maneuvers rather than spirometer control of lung volume, and scoring by expert readers rather than computer, may explain the difference between our results and those of investigators who have suggested that air trapping may provide a useful outcome surrogate (20).

Both mucous plugging and peribronchial thickening have been shown to increase during RTEs (21–23), but in the current study, mucous plugging did not change, and the progression of peribronchial thickening when the subjects were in their usual state of health did not correlate with the number of RTEs. Mucous plugging and peribronchial thickening are reversible findings that may be more related to short-term changes than to long-term disease progression in these young children. Bronchiectasis, believed to represent a nonreversible change, progressed in the group overall (Table 1), and this progression correlated with number of RTEs (Table 2).

Outcome surrogates must be carefully validated to avoid misleading results (24, 25). In an article on the use of imaging biomarkers, Smith and colleagues (26) suggested that three criteria must be met: the presence of the imaging biomarker must be closely linked to the presence of the target disease; the detection or measurement of the biomarker must be accurate, reproducible, and feasible over time; and the changes over time in the biomarker must be closely linked to the true endpoint sought for the therapy being evaluated (26). In a separate article, Robert Temple (27) of the U.S. Food and Drug Administration described a hierarchy of four attributes of valid outcome surrogates: the surrogate must be biologically plausible, it must reflect the overall severity of the disease, it must improve rapidly with effective treatment, and it must be correlated with true outcomes rather than short-term measures of disease severity.

Many of these criteria have been addressed for CT scanning in CF. The biological plausibility of CT scanning as an outcome surrogate is supported by the initial validation of CT scanning criteria for bronchiectasis using direct comparison between HRCT images and gross pathologic sections (28). Further support for biological validity is provided by the use of CT scanning as the imaging modality of choice for evaluating the findings of CF lung disease, and by the fact that 95% of people with CF die of pulmonary complications. The link between severity of disease and HRCT is provided by studies showing that HRCT scores correlate with clinical scores (29, 30). The reproducibility of HRCT scoring in patients with CF is supported by a gene therapy study in which HRCT scores using the same system used in this study were obtained at baseline and Day 90. No significant changes were seen in the placebo group over this time in HRCT scores or PFTs (8).

This is the first study demonstrating that changes in HRCT scores correlate with true outcomes in CF. This addresses the final criterion of valid outcome surrogates: the link between the outcome surrogate and a true endpoint, and the correlation with true outcomes. RTEs are a major source of the clinical morbidity of CF (31, 32), and are commonly used as outcome measures in CF research (5, 33). The definition used in this study was intentionally restrictive with the requirement for intravenous drug administration. A study of azithromycin in CF used similar criteria to define a respiratory exacerbation (34). Including the use of oral or inhaled antibiotics in the criterion would have very likely resulted in a higher number of exacerbations. However, physicians vary widely in their indications for oral and inhaled antibiotics, limiting the value of this definition of an exacerbation. In addition, intravenous drug administration is in itself a source of morbidity, requiring either hospitalization or the placement of an indwelling catheter.

In this study, PFTs and HRCT scores obtained at the beginning of the trial correlated with number of RTEs, but the ability of any one of these variables to predict number of RTEs was poor. A Poisson model that included both baseline PFT and HRCT scores together with age and sex greatly improved the predictive power, but the model should be verified in future studies. To decrease study duration, it will also be necessary to show that HRCT score changes predict clinical outcome in addition to correlating over the same time period as shown in this study.

Change in HRCT scores over the 2-yr period correlated with number of RTEs, whereas change in PFTs did not. Significant correlation does not imply highly accurate prediction. Figure 2 shows that the variability of change in HRCT overall score and change in FEV1 percent predicted is large for the group of subjects who experienced no RTEs. However, the behavior of the group of subjects who experienced at least one RTE is what drives the significant correlation between number of RTEs and change in HRCT overall, bronchiectasis, and parenchymal disease scores. This moderate correlation likely reflects the multifactorial nature of CF disease severity and the small number of RTEs seen in these young children with mild to moderate lung disease. Nine of the 61 subjects experienced one or more RTEs. The subgroup analysis of the nine subjects who experienced one or more RTEs, however, also showed correlation with HRCT scores and not with PFTs. In future studies, the use of other endpoints that might provide a larger number of occurrences in young patients should be considered.

The risk of radiation must be considered when proposing the use of HRCT as an outcome surrogate. The radiation dose for HRCT has decreased since this study was performed. We are currently using 40 mA, rather than the 200 mA used in this study, a fivefold decrease in radiation exposure. Using this lower dose technique, an HRCT scan can be performed using the same amount of radiation as the average background radiation to which people in the United States are exposed every 3 mo (35).
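The fivefold figure follows from simple proportionality. The sketch below assumes that dose scales linearly with tube current when exposure time, tube voltage, and collimation are held fixed; that assumption and the function name are introduced for this example and are not stated in the text.

```python
def dose_reduction_factor(old_ma: float, new_ma: float) -> float:
    """Fold decrease in dose when only the tube current changes.

    Assumes dose is proportional to tube current with all other
    scan settings unchanged.
    """
    if old_ma <= 0 or new_ma <= 0:
        raise ValueError("tube currents must be positive")
    return old_ma / new_ma


print(dose_reduction_factor(200, 40))  # study protocol vs. current protocol
```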

This study provides further support for the use of HRCT as an outcome surrogate in patients with CF. A broad range of interventions for patients with CF is now available and under development. These interventions include newborn screening, antiinflammatory and antibiotic agents, and agents that impact the airway surface and mucus. Gene therapy offers the promise of a cure for CF. Continued progress in CF care requires the ability to assess these therapies. HRCT offers promise as a valuable tool in this evaluation.

1. U.S. Food and Drug Administration. New initiatives to speed access to new drugs. Rockville, MD: U.S. Food and Drug Administration; 1992.
2. De Gruttola VG, Clax P, DeMets DL, Downing GJ, Ellenberg SS, Friedman L, Gail MH, Prentice R, Wittes J, Zeger SL. Considerations in the evaluation of surrogate endpoints in clinical trials: summary of a National Institutes of Health workshop. Control Clin Trials 2001;22:485–502.
3. National Institutes of Health. Workshop on surrogate markers for CF. Bethesda, MD: National Institutes of Health; 1997.
4. Brody A, Klein J, Molina P, Campbell J, Millard S, Quan J. High-resolution CT correlates with the number of exacerbations in young children with CF. Pediatr Pulmonol 2003;36(S25):318–319.
5. Quan JM, Tiddens HA, Sy JP, McKenzie SG, Montgomery MD, Robinson PJ, Wohl ME, Konstan MW. A two-year randomized, placebo-controlled trial of dornase alfa in young patients with cystic fibrosis with mild lung function abnormalities. J Pediatr 2001;139:813–820.
6. American Thoracic Society. ATS statement: Snowbird workshop on standardization of spirometry. Am Rev Respir Dis 1979;119:831–838.
7. Wang X, Dockery DW, Wypij D, Fay ME, Ferris BG Jr. Pulmonary function between 6 and 18 years of age. Pediatr Pulmonol 1993;15:75–88.
8. Moss RB, Rodman D, Spencer LT, Aitken ML, Zeitlin PL, Waltz D, Milla C, Brody AS, Clancy JP, Ramsey B, et al. Repeated adeno-associated virus serotype 2 aerosol-mediated cystic fibrosis transmembrane regulator gene transfer to the lungs of patients with cystic fibrosis: a multicenter, double-blind, placebo-controlled trial. Chest 2004;125:509–521.
9. Brody AS. Early morphologic changes in the lungs of asymptomatic infants and young children with cystic fibrosis. J Pediatr 2004;144:145–146.
10. Brody AS, Kosorok MR, Li Z, Broderick LS, Foster JL, Laxova A, Bandla H, Farrell PM. Reproducibility of a scoring system for CT scanning in cystic fibrosis. Pediatr Pulmonol 2004;2004:299.
11. Zar J. Biostatistical analysis, 4th ed. Englewood, NJ: Prentice-Hall; 1999.
12. Everitt B, Rabe-Hesketh S. Analyzing medical data using S-PLUS. New York: Springer; 2001.
13. Venables WN, Ripley BD. Modern applied statistics with S-PLUS. New York: Springer; 1999.
14. Draper N, Smith H. Applied regression analysis, 3rd ed. New York: Wiley; 1998.
15. Cystic Fibrosis Foundation. Patient registry: 2002 annual report. Bethesda, MD: Cystic Fibrosis Foundation; 2003.
16. Ramsey BW, Boat TF. Outcome measures for clinical trials in cystic fibrosis: summary of a Cystic Fibrosis Foundation consensus conference. J Pediatr 1994;124:177–192.
17. Long FR, Williams RS, Castile RG. Structural airway abnormalities in infants and young children with cystic fibrosis. J Pediatr 2004;144:154–161.
18. Helbich TH, Heinz-Peer G, Fleischmann D, Wojnarowski C, Wunderbaldinger P, Huber S, Eichler I, Herold CJ. Evolution of CT findings in patients with cystic fibrosis. AJR Am J Roentgenol 1999;173:81–88.
19. de Jong PA, Nakano Y, Lequin MH, Mayo JR, Woods R, Pare PD, Tiddens HA. Progressive damage on high resolution computed tomography despite stable lung function in cystic fibrosis. Eur Respir J 2004;23:93–97.
20. Goris ML, Zhu HJ, Blankenberg F, Chan F, Robinson TE. An automated approach to quantitative air trapping measurements in mild cystic fibrosis. Chest 2003;123:1655–1663.
21. Shah RM, Sexauer W, Ostrum BJ, Fiel SB, Friedman AC. High-resolution CT in the acute exacerbation of cystic fibrosis: evaluation of acute findings, reversibility of those findings, and clinical correlation. AJR Am J Roentgenol 1997;169:375–380.
22. Brody AS, Molina PL, Klein JS, Rothman BS, Ramagopal M, Swartz DR. High-resolution computed tomography of the chest in children with cystic fibrosis: support for use as an outcome surrogate. Pediatr Radiol 1999;29:731–735.
23. Robinson TEKK, Newaskar M, Bhise P, Mishra N, Blankenberg FG, Northway WH, Leung AN, Moss RB. Spirometer-triggered high-resolution computed tomography (HRCT) scores in mild and moderate to severe CF lung disease. Pediatr Pulmonol 2001(Suppl);22:378.
24. Fleming TR, DeMets DL. Surrogate end points in clinical trials: are we being misled? Ann Intern Med 1996;125:605–613.
25. DeMets DL, Califf RM. Lessons learned from recent cardiovascular clinical trials: part I. Circulation 2002;106:746–751.
26. Smith JJ, Sorensen AG, Thrall JH. Biomarkers in imaging: realizing radiology's future. Radiology 2003;227:633–638.
27. Temple RJ. A regulator authority's opinion about surrogate endpoints. In: Nimmo WS, editor. Clinical measurements on drug evaluation. New York: Wiley; 1995.
28. Kang EY, Miller RR, Muller NL. Bronchiectasis: comparison of preoperative thin-section CT and pathologic findings in resected specimens. Radiology 1995;195:649–654.
29. Nathanson I, Conboy K, Murphy S, Afshani E, Kuhn JP. Ultrafast computerized tomography of the chest in cystic fibrosis: a new scoring system. Pediatr Pulmonol 1991;11:81–86.
30. Helbich TH, Heinz-Peer G, Eichler I, Wunderbaldinger P, Gotz M, Wojnarowski C, Brasch RC, Herold CJ. Cystic fibrosis: CT assessment of lung involvement in children and adults. Radiology 1999;213:537–544.
31. Davis PB, Drumm M, Konstan MW. Cystic fibrosis. Am J Respir Crit Care Med 1996;154:1229–1256.
32. MacLusky ILH. Cystic fibrosis. In: Chernick V, Boat TF, Kendig EL Jr, editors. Kendig's disorders of the respiratory tract in children. Philadelphia: WB Saunders; 1998. p. 838–882.
33. Konstan MW, Byard PJ, Hoppel CL, Davis PB. Effect of high-dose ibuprofen in patients with cystic fibrosis. N Engl J Med 1995;332:848–854.
34. Saiman L, Marshall BC, Mayer-Hamblett N, Burns JL, Quittner AL, Cibene DA, Coquillette S, Fieberg AY, Accurso FJ, Campbell PW III. Azithromycin in patients with cystic fibrosis chronically infected with Pseudomonas aeruginosa: a randomized controlled trial. JAMA 2003;290:1749–1756.
35. Lucaya J, Piqueras J, Garcia-Pena P, Enriquez G, Garcia-Macias M, Sotil J. Low-dose high-resolution CT of the chest in children and young adults: dose, cooperation, artifact incidence, and image quality. AJR Am J Roentgenol 2000;175:985–992.
Correspondence and requests for reprints should be addressed to Alan S. Brody, M.D., Professor of Radiology and Pediatrics, Department of Radiology, MLC-5031, Cincinnati Children's Hospital, 3333 Burnet Avenue, Cincinnati, OH 45229-3039. E-mail:

American Journal of Respiratory and Critical Care Medicine
172
9
