Studies estimating the rate of lung function decline in cystic fibrosis have been inconsistent regarding the methods used. How the methodology used impacts the validity of the results and comparability between studies is unknown.
The Cystic Fibrosis Foundation established a work group whose tasks were to examine the impact of differing approaches to estimating the rate of decline in lung function and to provide analysis guidelines.
We used a natural history cohort of 35,252 individuals with cystic fibrosis aged ⩾6 years in the Cystic Fibrosis Foundation Patient Registry (CFFPR), 2003–2016. Modeling strategies using linear and nonlinear forms of marginal and mixed-effects models, which have previously quantified the rate of forced expiratory volume in 1 second (FEV1) decline (percent predicted per year), were evaluated under clinically relevant scenarios of available lung function data. Scenarios varied by sample size (overall CFFPR, medium-sized cohort of 3,000 subjects, and small-sized cohort of 150), data collection/reporting frequency (encounter, quarterly, and annual), inclusion of FEV1 during pulmonary exacerbation, and follow-up length (<2 yr, 2–5 yr, entire duration).
Rate of FEV1 decline estimates (percent predicted per year) differed between linear marginal and mixed-effects models; overall cohort estimates (95% confidence interval) were 1.26 (1.24–1.29) and 1.40 (1.38–1.42), respectively. Marginal models consistently estimated less rapid lung function decline than mixed-effects models across scenarios, except for short-term follow-up (both were ∼1.4). Rate of decline estimates from nonlinear models diverged by age 30. Among mixed-effects models, nonlinear and stochastic terms fit best, except for short-term follow-up (<2 yr). Overall CFFPR analysis from a joint longitudinal-survival model implied that an increase in rate of decline of 1% predicted per year in FEV1 was associated with a 1.52-fold (52%) increase in the hazard of death/lung transplant, but the results exhibited immortal cohort bias.
Differences were as high as 0.5% predicted per year between rate of decline estimates, but we found estimates were robust to lung function data availability scenarios, except short-term follow-up and older age ranges. Inconsistencies among previous study results may be attributable to inherent differences in study design, inclusion criteria, or covariate adjustment. Results-based decision points reported herein will support researchers in selecting a strategy to model lung function decline most reflective of nuanced, study-specific goals.
Cystic fibrosis (CF) is characterized by progressive deterioration of lung function due to bronchiectasis, mucus plugging, infection, and inflammation. Given the association between lung function and both mortality and quality of life, understanding risk factors and patterns of lung function decline has informed our comprehension of CF as a disease. Coupling this understanding with the effectiveness of therapeutic options to slow or reverse disease progression has led to optimized care.
As CF registries have become more widespread and with advances in statistical techniques and software for implementation, there have been an increasing number of different analytic approaches that quantify lung function decline. Initially, summary statistics stratified by age were used to assess year-to-year changes (1). By summarizing all measures into a mean, this approach was easily comprehensible but neglected data complexity (2, 3). Later studies correlated year-to-year trends to study natural progression (4). Increasingly, methods emerged using linear regression, then marginal and mixed-effects models, and, most recently, joint models (5). Extensive literature using different methods has explored lung function and age-related decline in relation to demographic factors (6, 7), microbiology (8), therapeutic interventions (9–11), and mortality (12–17). Different conclusions across studies have been observed (18), and it is unclear whether they relate to data sources used and their characteristics, modeling approaches, or other factors.
Given the range of statistical approaches available and differences in data collection across studies and registries, we conducted a broad natural history study using the Cystic Fibrosis Foundation Patient Registry (CFFPR). We systematically compared different statistical methods and inclusion criteria to determine how these approaches impact our interpretation of lung function decline. We aimed to provide a framework of recommendations regarding analysis and modeling of longitudinal lung function decline by 1) identifying methods that yield reasonable and interpretable estimates of rate of decline in lung function over the life course, 2) characterizing how these methods perform under different data availability scenarios, and 3) disseminating recommendations regarding analysis and modeling of longitudinal lung function decline.
Individuals included had a valid CF diagnosis as reported to the CFFPR; received care at a CFF-accredited center between January 1, 2003 (when encounter-level data collection began), and December 31, 2016 (the most recently available data upon study initiation); were ⩾6 years old during the time frame; and consented to CFFPR participation (19). Data observed below age 6 were excluded. We excluded individuals who lacked pulmonary function testing (PFT) records, received a lung transplant before 2003, or were diagnosed with CF after age 18. This study was approved by the Cincinnati Children’s Hospital Medical Center Institutional Review Board.
The modeled outcome and marker of lung function was measured by forced expiratory volume in 1 second (FEV1), expressed as percent predicted (using Global Lung Function Initiative [GLI] reference equations [20]). Baseline was the first observed FEV1 during the time frame. Observations with FEV1 above 150% predicted or those taken after lung transplant were omitted/censored. Covariates included age at CF diagnosis, year of birth, and occurrence of acute pulmonary exacerbation (PEx). We limited the set of covariates to better focus on selection of analytic approach rather than covariate assumptions.
In addition to the overall analysis cohort, other lung function data availability scenarios were considered, including 1) smaller sample size, 2) different measurement frequencies, 3) duration of follow-up, and 4) exclusion of measures taken during PEx. Medium and small sample evaluations were performed by randomly selecting data on n = 3,000 and 150 individuals, corresponding to multisite/smaller registry, national, and center-level cohorts. The impact of measurement frequency, which may arise from different methods of lung function reporting or data availability, was assessed by aggregating encounter-level FEV1 percent predicted data from the overall cohort as quarterly and annual using summary statistics (mean and median), which mimic data collection frequency from other registries (e.g., the UK CF Registry) and approaches taken in prior studies assessing rate of FEV1 decline (21). To estimate lung function decline over short-term durations, follow-up was limited to <2 years and 2–5 years by randomly selecting segments of follow-up from patients in the overall analysis cohort. For the last alternative scenario, FEV1 observed during PEx, identified as measurements taken during or within 14 days before intravenous antibiotic treatment in the hospital, at home, or during a clinic visit where the physician indicated the presence of moderate or severe PEx, was excluded from the overall cohort.
To fit longitudinal FEV1 trajectories and estimate rate of decline, marginal and mixed-effects models were chosen as the two primary modeling strategies. Age (in years) was used as the time scale in each model, and an observation was included at each FEV1 measurement. Rate of FEV1 decline was reported as percent predicted per year with 95% confidence interval (CI). Analyses were performed using R version 4.1.3 (R Foundation for Statistical Computing). R code was embedded in an interactive application (available as supplemental material in the data supplement) with simulated CF data to illustrate applications. Equivalent SAS code for each model type is also provided (Section D in the data supplement).
Dependencies between FEV1 observations within the same individual over time are handled differently in marginal and mixed-effects models (5). In marginal models, parametric estimates for rate of decline and other terms are acquired through generalized estimating equations with a focus on the overall population decline over individual patient variation (3). The models typically assume exchangeability to account for repeated measurements of the same individual (22). A previous CFFPR study shows an example of using this assumption in marginal models to assess the effect of methicillin-resistant Staphylococcus aureus infection on the population-level rate of FEV1 decline (8). We used this approach and applied the “sandwich” estimator to generate robust standard errors for estimates of population-level average rate of decline (23).
Alternatively, mixed-effects models include both fixed effects for population-level estimates and random effects to estimate how each patient varies from the population. A brief, introductory comparison of marginal and mixed-effects models is provided (Section B.1 in the data supplement) and is available in existing literature (24, 25). We fit mixed-effects models under each scenario that include the random intercepts model first applied by Corey and colleagues (26); the random intercepts-and-slopes model (27) that has been applied in different forms by Konstan and colleagues (21); and random intercepts with an exponential (stochastic) correlation structure, originally applied by Taylor-Robinson and colleagues to the Danish CF Registry (28).
Joint models can be useful to correct estimates of the FEV1 longitudinal outcome while accounting for nonrandom dropout. In parallel, they can correct the survival outcome when accounting for the effect of an endogenous time-dependent covariate such as FEV1. These joint models of longitudinal outcomes (such as FEV1) and survival outcomes (such as lung transplant/death) combine a mixed-effects submodel for the longitudinal outcome (FEV1) with a survival submodel (for lung transplant/death). We fit a joint model of the longitudinal lung function process using the aforementioned mixed-effects model structures and added the survival process using a Weibull model for the survival submodel. This accounts for measurement error in FEV1 percent predicted as a marker of lung function while estimating hazard of death/lung transplant based on absolute and/or rate of decline as examined in prior CF studies (15, 16, 29). The joint model’s shared parameters have been used previously in CF studies to connect the survival outcome with the underlying value of lung function and its rate of change over time (14–16, 29). The first shared parameter quantifies the association between average FEV1 (percent predicted) and hazard of death/lung transplant, whereas the second shared parameter quantifies the association between rate of change in FEV1 (percent predicted per year) and hazard of death/lung transplant. We examined a newer, experimental approach to the joint model by changing the distribution assumption for the longitudinal submodel of FEV1 percent predicted. γ- and β-Distributions were considered in place of the normal distribution using the JMbayes2 R package (30). Results and R code with this procedure are provided in the data supplement (Sections C.10 and E in the data supplement).
To characterize lung function progression over time, we fit linear and nonlinear specifications under these four model types, referred to as 1) marginal, 2) random intercepts, 3) random intercepts and slopes, and 4) stochastic. The linear form included both an intercept to estimate initial lung function and slope reflecting a constant rate of change over time. Nonlinear progression was modeled using natural splines, which allowed flexible characterization of rate of lung function decline by taking the first derivative over time (31). The different spline structures examined, including explanation of nonlinear random effects and impact on correlations, are described in Section C.2 in the data supplement.
We examined the validity of modeling assumptions and influence of survivorship in sensitivity analyses. We fit separate models to the following subgroups: 1) individuals ages 6–30 years, 2) those aged 30+, and 3) all ages but restricted to F508del homozygotes. Models were` evaluated by adding year of birth as a covariate. We fit mixed-effects models with nonconstant variance in response to systematic patterns found in residuals (Section C.4 in the data supplement).
There were 35,252 individuals who met the overall cohort inclusion criteria. The analysis included 30,811 individuals after we excluded those who entered the CFFPR post-transplant (n = 580), those diagnosed after age 18 (n = 2,916), and those without lung function measurements (n = 945). Individuals included had an average of 8.3 years of follow-up and 39 FEV1 measurements (Table 1). Overall, we did not observe substantial differences in population characteristics or subgroups among different scenarios (differing sample sizes, data availability, length of follow-up, and PEx exclusion), except where expected from sampling strategy.
Scenario | n | Baseline Age | Age at Diagnosis | Baseline FEV1, % Predicted | Year of Birth | Follow-Up Time (yr) | No. of PFTs per Individual |
---|---|---|---|---|---|---|---|
Overall cohort | 30,811 | 14.6 (6.0, 74.6) | 2.2 (−0.6, 18.0) | 78.59 (8.81, 149.50) | 1991 (1937, 2010) | 8.3 (0.0, 14.0) | 39 (1, 277) |
Sample size | |||||||
Medium | 3,000 | 14.9 (6.0, 57.6) | 2.3 (−0.5, 18.0) | 78.39 (8.81, 142.00) | 1990 (1946, 2009) | 9.3 (0.2, 14.0) | 44 (2, 248) |
Small | 150 | 15.4 (6.0, 48.3) | 2.0 (0, 17.5) | 79.03 (17.94, 134.83) | 1990 (1955, 2010) | 9.6 (0.0, 14.0) | 41 (1, 134) |
Frequency of data availability† | |||||||
Quarterly | 30,811 | 14.7 (6.0, 74.8) | 2.2 (−0.6, 18.0) | 80.22 (8.81, 149.50) | 1991 (1937, 2010) | 8.3 (0.0, 14.0) | 25 (1, 57) |
Annual | 30,811 | 14.6 (6.0, 75.0) | 2.2 (−0.6, 18.0) | 78.92 (10.72, 147.80) | 1991 (1937, 2010) | 8.3 (0.0, 14.0) | 9 (1, 15) |
Length of follow-up | |||||||
2–5 yr | 26,692 | 17.8 (6.0, 73.4) | 2.2 (−0.6, 18.0) | 75.27 (8.81, 147.42) | 1991 (1937, 2009) | 3.8 (0.1, 5.0) | 20 (2, 161) |
<2 yr | 30,811 | 18.0 (6.0, 75.0) | 2.2 (−0.6, 18.0) | 74.26 (9.01, 147.80) | 1991 (1937, 2010) | 1.4 (0.0, 2.0) | 9 (1, 73) |
Exclude PEx events | 30,659 | 14.6 (6.0, 74.6) | 2.2 (−0.6, 18.0) | 79.59 (9.25, 149.50) | 1991 (1937, 2010) | 8.2 (0.0, 14.0) | 27 (1, 140) |
Linear forms of marginal and mixed effects models yielded statistically significantly different estimates of FEV1 rate of decline for the overall cohort, as determined by nonoverlapping 95% CIs. Estimates (percent predicted per year) were 1.26 (1.24–1.29), 1.40 (1.38–1.42), and 1.66 (1.656–1.664) for marginal, mixed-effects, and joint models, respectively (Figure 1). The stochastic model had the best overall fit of the mixed-effects models (Section C.3 in the data supplement). As a result, this covariance structure was used for all linear mixed-effects models presented. There are no measures of goodness of fit that can be used to compare the marginal and mixed-effects model types, but the linear mixed-effects models had higher precision (narrower 95% CI width) than marginal models. Furthermore, the two methodologies have different model assumptions, and, therefore, the selection of which model to use depends also on the nature of the research question and data.
Figure 2 graphically displays results of the marginal, three mixed-effects, and joint models with linear and nonlinear forms. Comparison of nonlinear marginal versus mixed-effects models indicated differences in the age of most rapid decline. In the marginal model, it was age 12 (annual rate of decline, ∼2.7% predicted/yr), whereas the mixed-effects model suggested it occurred at age 20 (∼2.5% predicted/yr) (Figure 2D). Joint models assumed to be linear over time estimated more rapid decline than the analogous mixed-effects and marginal models across the age span (Figures 2A and 2B). The nonlinear joint models suggested more modest decline in younger ages than the linear joint models but more extensive decline in adulthood ages (>20 yr old; Figures 2C and 2D); however, estimated lung function values fell below zero under the nonlinear joint model, suggesting unreliable results for older age ranges (Figure 2C). We investigated the issue with negative lung function values from this traditional joint model by assuming more properly bounded distributions in the longitudinal submodel. This approach mitigated issues with negative lung function values, but coefficient estimates lacked direct interpretability and therefore are presented in the data supplement (Section C.10 in the data supplement).
Models in which lung function decline was assumed to be linear over time produced reasonable estimates until age 50, after which they began to predict negative values for FEV1 percent predicted (Figure 2A). These three models suggested average annual rates of decline between 1.25% and 1.75% predicted per year (Figure 2B). Models assuming nonlinear lung function showed that change in decline was more nuanced with severest declines during adolescence and young adulthood and relatively milder declines observed after age 30 (see Sensitivity Analyses subsection for separate analyses of ages 6–30 and 30+). The marginal and mixed-effects models with nonlinear shapes predicted similar and realistic values of FEV1 percent predicted for all ages, but the joint models continued to show negative FEV1 percent predicted values and indicate more substantial decline in older ages (Figures 2C and 2D). The joint model indicated that less rapid decline in FEV1 (percent predicted per year) was associated with decreased hazard of death/lung transplant across all scenarios (Table 2). The results specifically implied that rate of decline in lung function is strongly associated with the risk for death/lung transplant across scenarios. For example, in the overall cohort, at any given time point, a decrease of 1% predicted in FEV1 corresponded to a 1.07-fold (7%) increased risk of death/lung transplant; a decline of 1% predicted per year in the slope of FEV1 was associated with a 1.52-fold (52%) increase in the risk of death/lung transplant. Although the joint model demonstrated stronger association between rate of lung function decline and mortality than with progression estimated by other model types, it had limited interpretation in older ages, owing to unrealistic lung function predictions, particularly for the nonlinear form.
Linear | Nonlinear | |||
---|---|---|---|---|
Underlying Value* | Rate† | Underlying Value* | Rate† | |
HR (95% CI) | HR (95% CI) | HR (95% CI) | HR (95% CI) | |
Scenario | ||||
Overall cohort | 0.93 (0.93, 0.93) | 0.66 (0.65, 0.67) | 0.93 (0.93, 0.93) | 0.67 (0.66, 0.68) |
Sample size | ||||
Medium | 0.92 (0.91, 0.92) | 0.69 (0.66, 0.72) | 0.91 (0.91, 0.92) | 0.69 (0.66, 0.73) |
Small | 0.88 (0.83, 0.92) | 0.65 (0.53, 0.80) | 0.88 (0.84, 0.92) | 0.66 (0.54, 0.80) |
Frequency of measurement | ||||
Quarterly | 0.92 (0.92, 0.92) | 0.72 (0.71, 0.73) | 0.92 (0.92, 0.92) | 0.72 (0.71, 0.73) |
Annual | 0.91 (0.91, 0.91) | 0.70 (0.69, 0.72) | 0.91 (0.91, 0.91) | 0.72 (0.71, 0.73) |
Length of follow-up | ||||
2–5 yr | 0.95 (0.95, 0.95) | 0.70 (0.70, 0.71) | 0.95 (0.95, 0.95) | 0.71 (0.70, 0.71) |
<2 yr | 0.93 (0.92, 0.93) | 0.63 (0.59, 0.67) | 0.93 (0.93, 0.93) | 0.77 (0.74, 0.80) |
Exclude PEx events | 0.93 (0.93, 0.94) | 0.68 (0.67, 0.69) | 0.93 (0.93, 0.94) | 0.69 (0.68, 0.70) |
Subcohort | ||||
Age 6–30 yr | 0.93 (0.93, 0.93) | 0.78 (0.77, 0.80) | 0.93 (0.93, 0.93) | 0.79 (0.78, 0.80) |
Age >30 yr | 0.96 (0.96, 0.97) | 0.74 (0.72, 0.76) | 0.96 (0.96, 0.97) | 0.72 (0.71, 0.74) |
F508del homozygotes | 0.93 (0.93, 0.93) | 0.67 (0.66, 0.69) | 0.93 (0.93, 0.93) | 0.68 (0.67, 0.69) |
Overall, similar patterns were observed regarding the impact of scenarios on rate of decline estimates (Figure 1). Across scenarios for the marginal model, rates of decline ranged from 1.26% to 1.40% predicted per year; for the mixed-effects model, most values ranged from 1.36% to 1.47% with an outlying estimate of 1.66% for 2–5 years of follow-up; joint model estimates ranged from 1.61% to 1.84% predicted per year. Figure 3 displays results for the nonlinear mixed models. (Results from other models can be found in the data supplement.) Nonlinear models showed overlapping curves for annual versus quarterly FEV1 measures, <2 versus 2–5 years of follow-up, and including/excluding PEx. With small sample sizes, estimates began to diverge at older ages, where the data are particularly sparse.
We separately modeled the following subgroups: 1) individuals aged 6–30 years, 2) those aged 30+, and 3) all ages who are homozygous for the F508del cystic fibrosis transmembrane conductance regulator (CFTR) mutation. Linear marginal modeling estimated more rapid decline than linear mixed-effects modeling in ages 6–30 (difference, ∼0.18% predicted/yr) but less severe decline in ages 30+ (difference, ∼0.48% predicted/yr) (Figure 4). Rate of FEV1 progression declined more rapidly in ages 6–30 than in those aged 30+ (Figure 5A). There was not a substantial difference in rates of decline between the entire population and F508del homozygotes (Figure 5B compared with Figures 2C and 2D). Results did not change when we accounted for age-related heteroscedasticity, adjusted for year of birth, or used time since baseline rather than age as the time scale (Sections C.5, C.6, and C.7 in the data supplement, respectively).
We systematically examined lung function rates of decline among individuals with CF across multiple modeling assumptions and strategies spanning various PFT data availability scenarios. Our findings generally indicate that differences in rate of decline estimates from prior studies are due not necessarily to statistical modeling strategy but rather to inherent differences in study design, inclusion criteria, or additional covariate adjustment, especially in estimating effects associated with a given exposure. As such, a universal analysis protocol is not recommended. In part, the appropriate model and assumptions depend on the research question. To support researchers in developing analytic plans, we have outlined decision points and provided rationale and direction at each juncture (Figure 6). These decision points, which are specific to objectives and methods for analysis of lung function decline, complement more general reporting guidelines from the EQUATOR Network (Enhancing the Quality and Transparency of health research Network), such as STROBE (Strengthening the Reporting of Observational Studies in Epidemiology) (32) and TRIPOD (Transparent Reporting of a multivariable prediction model for Individual Prognosis or Diagnosis) (33), for explanatory and prediction study goals, respectively. Combined use should facilitate more standardized reporting of estimates and results in future studies.
Three important elements to study natural FEV1 progression were assessed: the different models that can be used and were previously investigated, the structure of each model (e.g., linear vs. nonlinear), and the data that are available (e.g., long vs. short follow-up periods, collection frequency). Overall, the linear model results aligned with previous estimates ranging from 1.3% to 1.7% predicted per year (18). Statistically significant differences were observed between the marginal and various mixed-effects models, such as the difference for the overall cohort projected to 0.7% predicted for every 5 years of follow-up as compared with 0.14% predicted over 1 year of follow-up (Figure 1). There is no consensus on a minimal clinically important difference for FEV1 percent predicted as a CF clinical trial endpoint (34), making it difficult to discern clinical relevance of observed differences. Findings on the stochastic mixed-effects model corroborated results from prior studies indicating superior fit over long time periods compared with conventional covariance specifications (28, 31). Improvements in fit are likely due to more observations arising from longer, more frequently collected sequences of PFT data over the life course.
This analysis highlights the challenges regarding the decision of whether to use linear or nonlinear lung function decline. Although linear shape models produce a single, easily reported effect estimate that is comparable across studies, this analysis shows that the natural history of CF lung function over long periods does not follow a linear trajectory. Narrow widths of the CIs reported from linear results may stem from underestimating true variability due to nonlinear trajectories but surely also arise from the notable challenge of large sample sizes that shrink standard errors (35). Our study shows that trade-offs between simpler and more complex models are somewhat mitigated, given access to statistical software for advanced longitudinal analyses; however, study goal and data availability remain prominent considerations (Figure 6). Studies should account for nonlinearity across different age groups, particularly adolescents and young adults, where age and/or baseline lung function strata are often used for adjustments (21). Specifically, the linear mixed-effects model that includes all ages estimates a rate of decline of 1.40% predicted per year, differing substantially from estimates from models restricted to ages 6–30 (1.66% predicted/yr) and from models restricted to individuals aged 30 and older (0.85% predicted/yr). It is unclear whether this reflects a true change in the rate of decline as individuals age or is an artifact of survivor bias such that those aged 30+ remaining in the cohort are healthier. This finding is in contrast to results from Stanojevic and colleagues (20) and suggests that the previously observed increased rate of decline observed during adolescence was due to the use of the Wang and Hankinson reference equations (36, 37), because it was not observed with GLI reference equations. Liou and colleagues’ earlier work before GLI development corroborates worsening lung function during adolescence compared with other periods, regardless of reference equations used (1). For circumstances in which linear models are chosen, we found increased precision under linear mixed-effects models compared with marginal models (Figure 1), although linear mixed-effects models require different assumptions (Section B.1 in the data supplement).
In the joint models, baseline FEV1 percent predicted was not associated with hazard of death/lung transplant for any scenarios or subcohorts; however, there were consistent, strong associations with rate of decline. By contrast, two previous CFFPR cohort studies (12, 38) and several advanced joint modeling studies using random intercepts and slopes that were limited to a single U.S. CF center (14, 15, 29) have shown an association between baseline FEV1 percent predicted and survival but have not found the inclusion of FEV1 rate of decline (percent predicted per year) to impact survival. Observed differences could be due in part to different data availability scenarios, as we have observed in this study; baseline definitions; or covariate inclusion. Findings from the traditional joint models should be interpreted with caution in older age ranges, however, given the unrealistic estimates of pulmonary function falling below zero (Figure 2C). Despite our findings from late-breaking software advancements that appear to mitigate this issue, coefficients of the advanced joint models are not directly comparable to the linear mixed effects and generalized estimating equation models presented. Future research is needed for a more comprehensive investigation of joint models with different distributional assumptions and interpretation of their results. In addition, scenarios with less than 5 years of follow-up may not provide reasonable estimates with joint models. The importance and utility of joint models to examine influences on mortality risk may increase over time as more of the population reaches older ages with broad uptake of highly effective CFTR modulators (11, 39, 40); presumably, data will not be as sparse, allowing more realistic estimates.
Our findings support comparability of results between registries, as in the U.S. and U.K. CF lung function decline study (41), encouraging evidence that frequency of observed PFT data, length of follow-up, and the inclusion of FEV1 measures taken during PEx do not substantially impact estimated rate of lung function decline when study design, participant inclusion criteria, and covariates included in the models are the same. The variety of scenarios selected were chosen to reflect different data-reporting strategies used by CF registries worldwide. Registries such as the CFFPR have made it possible to report encounter-level PFT data rather than gathering once-yearly measurements, but PFT data availability is driven largely by care teams and specific needs/severity of individual patients. Lung function monitoring will likely proceed through more frequent remote collection, as evidenced by the coronavirus disease (COVID-19) pandemic (42). Modeling approaches from the present study could be adapted to account for these future sources of variability as modalities change. Similar to the U.K.-U.S. study of lung function decline, our findings also suggest that evolution of the CF care model from quarterly to less frequent clinical visits may not significantly impact the estimated rate of lung function decline if using mixed-effects models and certain assumptions are met (41). Autoregressive covariance structures could be considered in either marginal or mixed-effects models (43) but were not investigated in this study, because they require regularly spaced measurements, which would not hold for irregularly observed encounter data.
Using the first available PFT as a baseline could be subject to left-truncation bias, but findings were similar when adjusting for year of birth. The analysis cohort excluded late-diagnosed individuals, which may represent a milder CF phenotype. Inferences are subject to survivor bias, such as observed trends noted after ages 30 and 50 years, wherein the number of individuals with PFT data diminished to 7,121 and 713, respectively. Implicit bias may exist in the PEx definition employed for this study. Although PEx events are diagnosed primarily on the basis of drops in lung function, the decision for and intensity of subsequent treatment varies according to measured factors (e.g., baseline FEV1 percent predicted, extent of drop) (44, 45) and is subject to indication bias in observational studies (46).
This study was performed using data obtained before broad release of CFTR modulators. Modeling lung function decline after CFTR modulator initiation is an emerging area with an important but smaller number of studies with ivacaftor (9, 10), compared with pre-CFTR modulator studies of natural history. Findings from the present study, which assessed model choices and data availability for the broader population, are expected to inform future investigations of modulator effects, including from this work group. We did not examine alternative methods to account for missing data, such as censoring weights, which takes individualized propensity to drop out and uses this to “weight” estimated rate of decline (i.e., inverse probability weighting) (47). Instead, we implemented a joint longitudinal-survival model using existing software to account for dropout due to lung transplant/death. Population-level rate of decline estimated from the joint model may suffer from immortal cohort bias (48). In a joint model, estimated rate of decline is based on the model including estimated “true” lung function values after death, which produced unrealistic estimates for older ages in the CF cohort. However, individualized rates of decline estimated from mixed-effects models, including the joint model, may still hold under certain assumptions (49).
Advanced simulation studies could further distinguish marginal and mixed-effects models. Assumptions made in the data-generating mechanism need careful study because this will, in theory, favor one type of model over another. In the setting in which these models are equivalent, we performed simulations (Section B.1 in the data supplement); however, the “true” rate of lung function decline is unknown. We found that both types produce accurate results for rate of decline and other model parameters. Technical discussion on similarities and differences between the model types is provided elsewhere (3). In-sample predictive accuracy results were not cross-validated, but future studies focused on prediction would benefit from such procedures to assess model accuracy.
Our investigation yields a roadmap for modeling CF lung function decline, paving the way to apply this framework in investigations of the impact of interventions (e.g., universal newborn screening) and use of novel treatments (e.g., CFTR modulators) and treatment withdrawal approaches.
The authors thank the Cystic Fibrosis Foundation for use of CFFPR data to conduct this study and the patients, care providers, and clinic coordinators at U.S. CF centers for their contributions to the CFFPR. Requests to access the CFFPR analysis data used for this study may be sent to the Cystic Fibrosis Foundation for review/approval: [email protected].
1. | Liou TG, Elkin EP, Pasta DJ, Jacobs JR, Konstan MW, Morgan WJ, et al. Year-to-year changes in lung function in individuals with cystic fibrosis. J Cyst Fibros 2010;9:250–256. |
2. | Fitzmaurice G, Davidian M, Verbeke G, Molenberghs G, editors. Longitudinal data analysis. Boca Raton, FL: Chapman & Hall; 2008. |
3. | Fitzmaurice GM, Laird NM, Ware JH. Applied longitudinal analysis. 2nd ed. Hoboken, NJ: John Wiley & Sons; 2011. |
4. | Rosenfeld M, VanDevanter DR, Ren CL, Elkin EP, Pasta DJ, Konstan MW, et al.; Investigators of Coordinators of the Epidemiologic Study of Cystic Fibrosis. Decline in lung function does not predict future decline in lung function in cystic fibrosis patients. Pediatr Pulmonol 2015;50:856–862. |
5. | Szczesniak R, Heltshe SL, Stanojevic S, Mayer-Hamblett N. Use of FEV1 in cystic fibrosis epidemiologic studies and clinical trials: a statistical perspective for the clinical researcher. J Cyst Fibros 2017;16:318–326. |
6. | Konstan MW, Morgan WJ, Butler SM, Pasta DJ, Craib ML, Silva SJ, et al.; Scientific Advisory Group and the Investigators and Coordinators of the Epidemiologic Study of Cystic Fibrosis. Risk factors for rate of decline in forced expiratory volume in one second in children and adolescents with cystic fibrosis. J Pediatr 2007;151:134–139e1. |
7. | Konstan MW, Wagener JS, Vandevanter DR, Pasta DJ, Yegin A, Rasouliyan L, et al. Risk factors for rate of decline in FEV1 in adults with cystic fibrosis. J Cyst Fibros 2012;11:405–411. |
8. | Dasenbrook EC, Merlo CA, Diener-West M, Lechtzin N, Boyle MP. Persistent methicillin-resistant Staphylococcus aureus and rate of FEV1 decline in cystic fibrosis. Am J Respir Crit Care Med 2008;178:814–821. |
9. | Sawicki GS, McKone EF, Pasta DJ, Millar SJ, Wagener JS, Johnson CA, et al. Sustained benefit from ivacaftor demonstrated by combining clinical trial and cystic fibrosis patient registry data. Am J Respir Crit Care Med 2015;192:836–842. |
10. | Kawala CR, Ma X, Sykes J, Stanojevic S, Coriati A, Stephenson AL. Real-world use of ivacaftor in Canada: a retrospective analysis using the Canadian Cystic Fibrosis Registry. J Cyst Fibros 2021;20:1040–1045. |
11. | Keogh RH, Cosgriff R, Andrinopoulou ER, Brownlee KG, Carr SB, Diaz-Ordaz K, et al. Projecting the impact of triple CFTR modulator therapy on intravenous antibiotic requirements in cystic fibrosis using patient registry data combined with treatment effects from randomised trials. Thorax 2022;77:873–881. |
12. | Liou TG, Adler FR, Fitzsimmons SC, Cahill BC, Hibbs JR, Marshall BC. Predictive 5-year survivorship model of cystic fibrosis. Am J Epidemiol 2001;153:345–352. |
13. | Liou TG, Adler FR, Cox DR, Cahill BC. Lung transplantation and survival in children with cystic fibrosis. N Engl J Med 2007;357:2143–2152. [Published erratum appears in N Engl J Med 2008;359:e6.] |
14. | Piccorelli AV, Schluchter MD. Jointly modeling the relationship between longitudinal and survival data subject to left truncation with applications to cystic fibrosis. Stat Med 2012;31:3931–3945. |
15. | Schluchter MD, Piccorelli A. Shared parameter models for joint analysis of longitudinal and survival data with left truncation due to delayed entry—applications to cystic fibrosis. Stat Methods Med Res 2019;28:1489–1507. |
16. | Barrett J, Diggle P, Henderson R, Taylor-Robinson D. Joint modelling of repeated measurements and time-to-event outcomes: flexible model specification and exact likelihood inference. J R Stat Soc Series B Stat Methodol 2015;77:131–148. |
17. | Keogh RH, Seaman SR, Barrett JK, Taylor-Robinson D, Szczesniak R. Dynamic prediction of survival in cystic fibrosis: a landmarking analysis using UK patient registry data. Epidemiology 2019;30:29–37. |
18. | Harun SN, Wainwright C, Klein K, Hennig S. A systematic review of studies examining the rate of lung function decline in patients with cystic fibrosis. Paediatr Respir Rev 2016;20:55–66. |
19. | Knapp EA, Fink AK, Goss CH, Sewall A, Ostrenga J, Dowd C, et al. The Cystic Fibrosis Foundation Patient Registry. Design and methods of a national observational disease registry. Ann Am Thorac Soc 2016;13:1173–1179. |
20. | Stanojevic S, Bilton D, McDonald A, Stocks J, Aurora P, Prasad A, et al. Global Lung Function Initiative equations improve interpretation of FEV1 decline among patients with cystic fibrosis. Eur Respir J 2015;46:262–264. |
21. | Konstan MW, Schluchter MD, Xue W, Davis PB. Clinical use of ibuprofen is associated with slower FEV1 decline in children with cystic fibrosis. Am J Respir Crit Care Med 2007;176:1084–1089. |
22. | Cogen J, Emerson J, Sanders DB, Ren C, Schechter MS, Gibson RL, et al.; EPIC Study Group. Risk factors for lung function decline in a large cohort of young cystic fibrosis patients. Pediatr Pulmonol 2015;50:763–770. |
23. | Diggle PJ, Heagerty P, Liang KY, Zeger SL. Analysis of longitudinal data. Oxford: Oxford University Press; 2002. |
24. | Gardiner JC, Luo Z, Roman LA. Fixed effects, random effects and GEE: what are the differences? Stat Med 2009;28:221–239. |
25. | Hubbard AE, Ahern J, Fleischer NL, Van der Laan M, Lippman SA, Jewell N, et al. To GEE or not to GEE: comparing population average and mixed models for estimating the associations between neighborhood risk factors and health. Epidemiology 2010;21:467–474. |
26. | Corey M, Edwards L, Levison H, Knowles M. Longitudinal analysis of pulmonary function decline in patients with cystic fibrosis. J Pediatr 1997;131:809–814. |
27. | Laird NM, Ware JH. Random-effects models for longitudinal data. Biometrics 1982;38:963–974. |
28. | Taylor-Robinson D, Whitehead M, Diderichsen F, Olesen HV, Pressler T, Smyth RL, et al. Understanding the natural progression in %FEV1 decline in patients with cystic fibrosis: a longitudinal study. Thorax 2012;67:860–866. |
29. | Schluchter MD, Konstan MW, Davis PB. Jointly modelling the relationship between survival and pulmonary function in cystic fibrosis patients. Stat Med 2002;21:1271–1287. |
30. | Rizopoulos D. R Package ‘JMbayes2’: extended joint models for longitudinal and time-to-event data; 2022 [accessed 2022 Dec 28]. Available from: https://cran.r-project.org/web/packages/JMbayes2/. |
31. | Szczesniak RD, McPhail GL, Duan LL, Macaluso M, Amin RS, Clancy JP. A semiparametric approach to estimate rapid lung function decline in cystic fibrosis. Ann Epidemiol 2013;23:771–777. |
32. | Skrivankova VW, Richmond RC, Woolf BAR, Davies NM, Swanson SA, VanderWeele TJ, et al. Strengthening the Reporting of Observational Studies in Epidemiology Using Mendelian Randomisation (STROBE-MR): explanation and elaboration. BMJ 2021;375:n2233. |
33. | Collins GS, Reitsma JB, Altman DG, Moons KG. Transparent Reporting of a multivariable prediction model for Individual Prognosis or Diagnosis (TRIPOD): the TRIPOD statement. J Clin Epidemiol 2015;68:134–143. |
34. | Stanojevic S, Ratjen F. Physiologic endpoints for clinical studies for cystic fibrosis. J Cyst Fibros 2016;15:416–423. |
35. | Cox DR, Kartsonaki C, Keogh RH. Big data: some statistical issues. Stat Probab Lett 2018;136:111–115. |
36. | Wang X, Dockery DW, Wypij D, Fay ME, Ferris BG Jr. Pulmonary function between 6 and 18 years of age. Pediatr Pulmonol 1993;15:75–88. |
37. | Hankinson JL, Odencrantz JR, Fedan KB. Spirometric reference values from a sample of the general U.S. population. Am J Respir Crit Care Med 1999;159:179–187. |
38. | Liou TG, Kartsonaki C, Keogh RH, Adler FR. Evaluation of a five-year predicted survival model for cystic fibrosis in later time periods. Sci Rep 2020;10:6602. |
39. | Keogh RH, Szczesniak R, Taylor-Robinson D, Bilton D. Up-to-date and projected estimates of survival for people with cystic fibrosis using baseline characteristics: a longitudinal study using UK patient registry data. J Cyst Fibros 2018;17:218–227. |
40. | Rubin JL, O’Callaghan L, Pelligra C, Konstan MW, Ward A, Ishak JK, et al. Modeling long-term health outcomes of patients with cystic fibrosis homozygous for F508del-CFTR treated with lumacaftor/ivacaftor. Ther Adv Respir Dis 2019;13:1753466618820186. |
41. | Schlüter DK, Ostrenga JS, Carr SB, Fink AK, Faro A, Szczesniak RD, et al. Lung function in children with cystic fibrosis in the USA and UK: a comparative longitudinal analysis of national registry data. Thorax 2022;77:136–142. |
42. | Collaco JM, Albon D, Ostrenga JS, Flume P, Schechter MS, Cromwell EA. Factors associated with receiving CF care and use of telehealth in 2020 among persons with cystic fibrosis in the United States. J Cyst Fibros 2022;S1569-1993(22)01424-2. |
43. | Brockwell PJ, Davis RA. Time series: theory and methods. New York: Springer; 1998. |
44. | Wagener JS, Williams MJ, Millar SJ, Morgan WJ, Pasta DJ, Konstan MW. Pulmonary exacerbations and acute declines in lung function in patients with cystic fibrosis. J Cyst Fibros 2018;17:496–502. |
45. | Sanders DB, Ostrenga JS, Rosenfeld M, Fink AK, Schechter MS, Sawicki GS, et al. Predictors of pulmonary exacerbation treatment in cystic fibrosis. J Cyst Fibros 2020;19:407–414. |
46. | Schechter MS, VanDevanter DR, Pasta DJ, Short SA, Morgan WJ, Konstan MW; Scientific Advisory Group and the Investigators and Coordinators of the Epidemiologic Study of Cystic Fibrosis. Treatment setting and outcomes of cystic fibrosis pulmonary exacerbations. Ann Am Thorac Soc 2018;15:225–233. |
47. | Rouanet A, Helmer C, Dartigues JF, Jacqmin-Gadda H. Interpretation of mixed models and marginal models with cohort attrition due to death and drop-out. Stat Methods Med Res 2019;28:343–356. |
48. | Tyrer F, Bhaskaran K, Rutherford MJ. Immortal time bias for life-long conditions in retrospective observational studies using electronic health records. BMC Med Res Methodol 2022;22:86. |
49. | Neuhaus JM, McCulloch CE, Boylan RD. Analysis of longitudinal data from outcome-dependent visit processes: failure of proposed methods in realistic settings and potential improvements. Stat Med 2018;37:4457–4471. |
Supported by the
Author Contributions: Study concept and design: R.S., E.-R.A., W.S., A.K.F., G.F., and M.S.S. Acquisition of data: A.K.F., G.F., and B.M. Statistical analysis of data: R.S., E.-R.A., W.S., A.K.F., G.F., E. Gecili, and E. Ghulam. Interpretation of data analysis: all authors. Drafting of the manuscript: R.S., E.-R.A., W.S., A.K.F., G.F., E.C., and M.S.S. Critical revision of the manuscript for important intellectual content: all authors. Had full access to all the data in the study and take responsibility for the integrity of the data and the accuracy of the data analysis: R.S., E.-R.A., and W.S.
This article has a data supplement, which is accessible from this issue’s table of contents at www.atsjournals.org.
Author disclosures are available with the text of this article at www.atsjournals.org.