Rationale: Idiopathic pulmonary fibrosis (IPF) is a progressive lung disease for which novel therapies are needed. External controls (ECs) could enhance IPF trial efficiency, but the direct comparability of ECs versus concurrent controls is unknown.
Objectives: To develop IPF ECs by applying fit-for-purpose data standards to historical randomized clinical trial (RCT), multicenter registry (Pulmonary Fibrosis Foundation Patient Registry), and electronic health record (EHR) data and to evaluate endpoint comparability among ECs and the phase II RCT of BMS-986020.
Methods: After data curation, the rate of change in FVC from baseline to 26 weeks among participants receiving BMS-986020 600 mg twice daily was compared with the BMS-placebo arm and ECs using mixed-effects models with inverse probability weights.
Measurements and Main Results: At 26 weeks, the rates of change in FVC were −32.71 ml for BMS-986020 and −130.09 ml for BMS-placebo (difference, 97.4 ml; 95% confidence interval [CI], 24.6–170.2), replicating the original BMS-986020 RCT. RCT ECs showed treatment effect point estimates within the 95% CI of the original BMS-986020 RCT. Pulmonary Fibrosis Foundation Patient Registry ECs and EHR ECs experienced a slower rate of FVC decline compared with the BMS-placebo arm, resulting in treatment-effect point estimates outside of the 95% CI of the original BMS-986020 RCT.
Conclusions: IPF ECs generated from historical RCT placebo arms result in primary treatment effects comparable to those of the original clinical trial, whereas ECs from real-world data sources, including registry or EHR data, do not. RCT ECs may serve as a potentially useful supplement to future IPF RCTs.
Novel treatments for idiopathic pulmonary fibrosis (IPF) are urgently needed, but randomized controlled trials (RCTs) in IPF are costly and lengthy and can face ethical and recruitment challenges. Augmentation of RCT placebo arms with external controls (ECs) has the potential to improve IPF trial efficiency, but direct comparability of ECs to concurrent RCT controls is unknown.
To generate ECs and evaluate the comparability of endpoints between ECs and concurrent controls in IPF, this study applied rigorous approaches using conformance, completeness, and plausibility data standards to placebo arms of historical IPF RCTs, multicenter IPF registry data, and electronic health record data. The study demonstrated that ECs generated from RCT placebo arms matched the disease progression of concurrent controls adequately to result in comparable primary treatment effects, whereas ECs from the registry or electronic health record data did not. The findings highlight the complexity of constructing ECs to supplement concurrent controls in IPF, particularly with regard to surrogate endpoints such as FVC in real-world data sources. The results suggest that ECs generated using historical RCT placebo arms can provide supportive data to augment concurrent control arms in future IPF RCTs.
Idiopathic pulmonary fibrosis (IPF) is a rare, progressive lung disease with high mortality rates and limited treatment options. In 2014, the U.S. Food and Drug Administration (FDA) approved two IPF therapies based on demonstrated slowing of lung function decline (1, 2), but neither drug halts disease progression, and both carry significant adverse effects. Novel IPF treatments are urgently needed and are currently in development, although existing standard-of-care therapies create ethical challenges in designing trials with placebo arms. Furthermore, IPF randomized controlled trials (RCTs) are costly and lengthy and face recruitment challenges. Therefore, innovative approaches to IPF clinical trial design are needed. One approach of growing interest to regulatory authorities and clinicians is the use of external (i.e., synthetic) control arms to supplement clinical trial data (3–5).
External control (EC) arms are designed to mirror placebo control arms based on patient-level data from historical RCTs or real-world data (RWD) such as registries or electronic health records (EHRs). EC arms have the potential to augment concurrent placebo control arms and result in benefits such as reductions in patient enrollment time, cost, and the ethical challenges presented by placebo arms. In fact, ECs derived from historical RCTs were used in a recent phase II IPF trial to augment its concurrent control arm (6). Yet, the comparability of rigorously selected ECs to concurrent RCT control arms and the impact of ECs on treatment effect estimates are not well established, particularly for surrogate endpoints such as FVC, which is used in many IPF trials. Furthermore, a comparison of ECs generated across multiple categories of data sources using historical controls from prior RCTs, multicenter registries, or EHR data is lacking.
The aims of this study were to develop high-quality ECs in IPF from historical RCT and multiple RWD sources by applying fit-for-purpose data standards and to compare endpoints among ECs versus those of the BMS-986020 phase II RCT (clinicaltrials.gov ID NCT01766817). The BMS-986020 phase II RCT tested a novel lysophosphatidic acid receptor 1 antagonist in adults with IPF at doses of 600 mg once daily or twice daily (BID). The trial demonstrated a reduction in the rate of lung function decline in patients treated with BMS-986020 600 mg BID compared with placebo, but was stopped prematurely as a result of apparent hepatobiliary toxicity (7). Evaluation of appropriate circumstances under which external data may serve as ECs is needed to inform future IPF trial design.
Candidate data sources for ECs met the following criteria: 1) U.S. patients with IPF; 2) for historical RCTs, at least 40 patients receiving placebo; 3) longitudinal collection of outcomes of interest (i.e., FVC, hospitalization, death); and 4) longitudinal follow-up of ⩾26 weeks. RCT ECs were drawn from the placebo arms of three IPFNet historical RCTs (Anticoagulant Effectiveness in Idiopathic Pulmonary Fibrosis [ACE-IPF], clinicaltrials.gov ID NCT00957242, n = 73 [8]; Prednisone, Azathioprine, and N-Acetylcysteine: A Study that Evaluates Response in Idiopathic Pulmonary Fibrosis [PANTHER-IPF], NCT00650091, n = 131 [9]; and Sildenafil Trial of Exercise Performance in Idiopathic Pulmonary Fibrosis [STEP-IPF], NCT00517933, n = 91 [10]). RWD ECs were drawn from patients with IPF enrolled in the Pulmonary Fibrosis Foundation Patient Registry (PFF-PR, n = 874) (11) and EHR data of patients with IPF seen in the Duke University Health System (n = 2,419) (Figure 1). In the BMS-986020 RCT, randomized patients received BMS-986020 600 mg (once daily or BID) or placebo for 26 weeks. The BMS-986020 600 mg BID arm was chosen for evaluation because this regimen significantly slowed FVC decline compared with placebo in the RCT (7). Key data-source characteristics are provided in Table 1.

Figure 1. Summary of external control arm generation in PFF-PR (left), EHR (center), and randomized controlled trial (RCT; right) data sources, including evaluating fitness for purpose and assessing comparability to the BMS-986020 RCT. ACE-IPF = Anticoagulant Effectiveness in Idiopathic Pulmonary Fibrosis; EHR = electronic health record; ICD = International Classification of Diseases; ILD = interstitial lung disease; IPF = idiopathic pulmonary fibrosis; PANTHER-IPF = Prednisone, Azathioprine, and N-Acetylcysteine: A Study that Evaluates Response in Idiopathic Pulmonary Fibrosis; PFF-PR = Pulmonary Fibrosis Foundation Patient Registry; PFT = pulmonary function test; STEP-IPF = Sildenafil Trial of Exercise Performance in Idiopathic Pulmonary Fibrosis.
Source | Years | Treatments | Key Eligibility Criteria | Follow-Up | Concomitant Therapies |
---|---|---|---|---|---|
BMS-986020 | 2013–2016* | BMS-986020 600 mg/d (n = 48) vs. 600 mg BID (n = 48) vs. placebo (n = 47) | Age 40–80 yr, IPF diagnosed within past 6 yr, postbronchodilator FVC 45–90% predicted, DlCO 30–80%, able to walk ⩾150 m on 6MWT with demonstrated decrease of ⩾2% in oxygen saturation on exertion | 26 wk | Intermittent prednisone permitted |
ACE-IPF | 2009–2011† | Warfarin (n = 72) vs. placebo (n = 73) | Age 35–80 yr with progressive IPF defined as a history of 1) worsening of dyspnea or 2) physiologic deterioration defined as an absolute decline of FVC ⩾10% or DlCO ⩾15%, a reduction in arterial oxygen saturation ⩾5%, or radiographic progression | 48 wk | Daily prednisone permitted |
STEP-IPF | 2007–2009† | Sildenafil (n = 89) vs. placebo (n = 91) | Advanced IPF, defined by DlCO <35% predicted | Up to 28 wk, 12 wk randomized followed by 12 wk open label | Prednisone permitted if stable dose |
PANTHER-IPF | 2009–2011† | Azathioprine, prednisone, and NAC (n = 133)‡ vs. placebo (n = 131) | Age 35–80 yr, IPF diagnosed within past 4 yr, FVC ⩾50% predicted and DlCO ⩾30% of predicted | 60 wk | None |
Duke Health System | 2015–2019 | NA; real-world data (N = 2,419) | Age >40 yr, ICD-10 code J84.10, J84.112, or J84.89 without any diagnosis codes for exclusionary conditions | Up to 4.8 yr | Antifibrotic therapy available, prednisone |
PFF Patient Registry | March 2016 to March 2021 | NA; real-world data (N = 874) | Medium- or high-confidence IPF diagnosis at enrollment | Up to 4.7 yr with median 1.9 yr | Antifibrotic therapy available |
Recently defined verification checks, including conformance (of data structures compared with expected standards), completeness (frequency of missing data without reference to data values), and plausibility (believability of data values based on contextual knowledge) (12), were applied to inform EC creation and evaluate fitness for purpose (see online supplement and Tables E1–E3 in the online supplement).
Criteria specified by Pocock (3) were used to frame the assessment of each EC data source to the concurrent trial control arm (see Table E4). This included a comparison of eligibility criteria among RCTs and application of BMS-986020 RCT eligibility criteria, as feasible, to all ECs (see online supplement and Table E5). Standard-of-care treatments were also compared between external and concurrent controls (see online supplement).
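Applying trial eligibility criteria to an external data source amounts to a row-level filter. The following is a minimal sketch, assuming only the age, time-since-diagnosis, FVC, and DlCO criteria listed for the BMS-986020 trial in Table 1; the walk-test criteria are omitted, the bounds are assumed to be inclusive, and the `Patient` fields and example values are hypothetical rather than taken from the study data.

```python
from dataclasses import dataclass


@dataclass
class Patient:
    # Hypothetical field names; real sources would need mapping to these.
    age_yr: float
    years_since_dx: float
    fvc_pct_pred: float
    dlco_pct_pred: float


def meets_core_criteria(p: Patient) -> bool:
    """Check a subset of the BMS-986020 criteria from Table 1.

    Assumes inclusive bounds; 6-minute-walk criteria are not modeled.
    """
    return (
        40 <= p.age_yr <= 80
        and p.years_since_dx <= 6
        and 45 <= p.fvc_pct_pred <= 90
        and 30 <= p.dlco_pct_pred <= 80
    )


# Illustrative records only (not study data).
candidates = [
    Patient(68, 1.2, 70.0, 45.0),   # satisfies all four checks
    Patient(83, 0.5, 65.0, 40.0),   # older than 80 yr
    Patient(60, 2.0, 95.0, 50.0),   # FVC above 90% predicted
]
eligible = [p for p in candidates if meets_core_criteria(p)]
```

In practice, criteria that a source cannot support (for example, walk-test data in the EHR) would have to be dropped or approximated, which is why the criteria were applied "as feasible."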
The primary study outcome was the rate of change in FVC (in ml) from baseline to 26 weeks, consistent with the endpoint for the BMS-986020 RCT. Secondary endpoints included the rate of change in FVC% predicted from baseline to 26 weeks and the time to two composite measurements of disease progression: 1) ⩾10% decline in FVC% predicted, lung transplant, or death and 2) ⩾10% decline in FVC% predicted, hospitalization, lung transplant, or death. Each endpoint in the composite measure of disease progression was also evaluated individually. Details about endpoint definition and ascertainment in each data source are provided in the online supplement and Table E6.
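The first composite endpoint can be read as a time-to-first-event variable with censoring at last follow-up. The sketch below illustrates that reading; the function name, arguments, and the convention of ignoring events recorded after last follow-up are assumptions for illustration, not the ascertainment rules given in the online supplement.

```python
from typing import Optional, Tuple


def time_to_progression(
    fvc_decline_wk: Optional[float],
    transplant_wk: Optional[float],
    death_wk: Optional[float],
    last_followup_wk: float,
) -> Tuple[float, bool]:
    """Return (time in weeks, event observed) for the composite endpoint.

    Each *_wk argument is the week of the component event, or None if it
    never occurred. Patients with no component event are censored at
    last follow-up.
    """
    observed = [
        t
        for t in (fvc_decline_wk, transplant_wk, death_wk)
        if t is not None and t <= last_followup_wk
    ]
    if observed:
        return min(observed), True
    return last_followup_wk, False


# A >=10% FVC% predicted decline at week 20 precedes death at week 30.
example_event = time_to_progression(20.0, None, 30.0, 32.0)
# No component event: censored at week 26.
example_censored = time_to_progression(None, None, None, 26.0)
```

The second composite would add a hospitalization time to the tuple of candidate events in the same way.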
After applying BMS-986020 eligibility criteria to RCT and RWD ECs, the probability of enrollment in the BMS-986020 trial for patients in all ECs (i.e., balancing score) was estimated using a logistic regression model that included covariates previously associated with disease progression or death in patients with IPF (i.e., baseline data for age, sex, race [White vs. other], smoking status, IPF duration, FVC% predicted, DlCO% predicted, FEV1/FVC ratio, congestive heart failure, gastroesophageal reflux disease, and coronary artery disease). To increase patient homogeneity across data sources, extremely high or low balancing scores were exclusionary (13). Specifically, patients in ECs with balancing scores lower than the lowest score among BMS-986020 trial patients (i.e., lowest probability of being enrolled in the BMS-986020 trial) and patients in the BMS-986020 trial with balancing scores greater than the highest score in the ECs (i.e., highest probability of being enrolled in the BMS-986020 trial; see Figure E2) were excluded.
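The exclusion rule for extreme balancing scores can be sketched as follows. The scores are taken as given (the logistic regression that produces them is not shown), and the values are hypothetical.

```python
from typing import List, Sequence, Tuple


def trim_extreme_scores(
    trial_scores: Sequence[float], ec_scores: Sequence[float]
) -> Tuple[List[float], List[float]]:
    """Apply the two-sided exclusion described in the text.

    EC patients scoring below the lowest trial score are dropped, and
    trial patients scoring above the highest EC score are dropped.
    """
    lowest_trial = min(trial_scores)
    highest_ec = max(ec_scores)
    kept_ec = [s for s in ec_scores if s >= lowest_trial]
    kept_trial = [s for s in trial_scores if s <= highest_ec]
    return kept_trial, kept_ec


# Hypothetical balancing scores (estimated probability of trial enrollment).
trial = [0.35, 0.60, 0.92]
ec = [0.05, 0.40, 0.70]
kept_trial, kept_ec = trim_extreme_scores(trial, ec)
```

Here the EC patient with a score of 0.05 and the trial patient with a score of 0.92 fall outside the region of common support and are removed.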
For primary and secondary endpoints, inverse probability weighting (IPW) was applied to all pairs of BMS-986020 treatment arms and ECs. First, the probability of assignment to BMS-986020 treatment was estimated using a logistic regression model that included the baseline characteristics listed above. Inverse probability weights were calculated as 1/(probability of treatment assignment) for the BMS-986020 600 mg BID arm and 1/(1 − probability of treatment assignment) for the EC. Next, to estimate the effect of BMS-986020 600 mg BID on the rate of change in FVC and FVC% predicted, a mixed-effects model with random intercepts across patients was fit using the estimated inverse probability weights. For time-to-event outcomes, Cox proportional hazards models with the inverse probability weights were used to estimate hazard ratios (HRs) between BMS-986020 600 mg BID treatment and control. ECs were considered comparable to the BMS-placebo arm if the treatment effect point estimate generated using the EC was within the 95% confidence interval (CI) generated using the concurrent control arm.
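As a minimal illustration of the weighting step, the sketch below uses the standard IPW convention, in which a treated patient with estimated treatment probability p receives weight 1/p and a control patient receives weight 1/(1 − p). The function name and probabilities are hypothetical, and the downstream weighted mixed-effects and Cox models are not shown.

```python
def ipw_weight(p_treated: float, treated: bool) -> float:
    """Inverse probability weight under the standard convention.

    p_treated is the estimated probability of assignment to treatment
    (from a propensity model not shown here).
    """
    if not 0.0 < p_treated < 1.0:
        raise ValueError("probability must lie strictly between 0 and 1")
    return 1.0 / p_treated if treated else 1.0 / (1.0 - p_treated)


# Hypothetical patients with an estimated treatment probability of 0.25.
w_treated = ipw_weight(0.25, treated=True)    # 1 / 0.25
w_control = ipw_weight(0.25, treated=False)   # 1 / 0.75
```

A treated patient who resembled the controls (low p) is up-weighted, which is how the weighted arms come to have similar covariate distributions.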
As a sensitivity analysis, overlap weights, rather than an IPW approach, were applied (14), and results were compared for the outcomes of the rate of change in FVC and time to disease progression. As an additional sensitivity analysis, patients in the EHR EC who were exposed to antifibrotic therapy were excluded. To understand temporal trends for FVC, outcome and time were plotted. For the time-to-event outcomes, Kaplan-Meier curves were created. All analyses were conducted using the survey, nlme, and survival packages in R 4.1.3 and GLIMMIX in SAS 9.4.
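For contrast with IPW, the sketch below uses a common definition of overlap weights, in which treated patients are weighted by 1 − p and controls by p, where p is the estimated probability of treatment. That this exact form was used in the sensitivity analysis is an assumption; the function names and probabilities are hypothetical.

```python
def ipw_weight(p_treated: float, treated: bool) -> float:
    """Standard inverse probability weight (unbounded as p nears 0 or 1)."""
    return 1.0 / p_treated if treated else 1.0 / (1.0 - p_treated)


def overlap_weight(p_treated: float, treated: bool) -> float:
    """Overlap weight: 1 - p for treated, p for controls (bounded by 1)."""
    return 1.0 - p_treated if treated else p_treated


# A treated patient who looks very unlike other treated patients (p = 0.02)
# dominates under IPW but not under overlap weighting.
p = 0.02
ipw_extreme = ipw_weight(p, treated=True)
overlap_extreme = overlap_weight(p, treated=True)
```

Because overlap weights never exceed 1, a few patients with extreme scores cannot dominate the estimate, which is one plausible reason the two approaches can give different results in heterogeneous real-world cohorts.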
ECs from historical RCTs, PFF-PR, and EHR data were created by first applying verification checks to assess data reliability (see online supplement for full details). Next, the BMS-986020 trial eligibility criteria were applied to each data source to increase the comparability of the ECs and BMS-placebo arm (Figure 1 and see Table E8). This resulted in 36 patients in the ACE-IPF EC, 17 patients in the STEP-IPF EC, and 107 patients in the PANTHER-IPF EC. In the EHR and PFF-PR cohorts, 995 and 512 patients had postbaseline clinical follow-up and met BMS-986020 RCT eligibility criteria, respectively. Of these, 710 patients in the EHR EC and 413 patients in the PFF-PR EC had at least one follow-up pulmonary function test (PFT) within 35 weeks after baseline and were considered in the primary analysis. All 995 and 512 patients in the EHR and PFF-PR cohorts, respectively, were considered in disease progression analyses. Approximately half of the patients in the EHR EC (465 of 995, 46.7%) and 83.2% of patients in the PFF-PR EC (426 of 512) were exposed to antifibrotic therapy at baseline or during follow-up. All ECs and the BMS-placebo arm (n = 47) were compared with 48 patients in the BMS-986020 600 mg BID treatment arm.
At baseline, the BMS-986020 600 mg BID treatment arm was similar to the BMS-placebo arm and the ECs with regard to median age, FVC% predicted, and DlCO% predicted (Table 2). The BMS-986020 trial had a higher proportion of non-White participants than the ECs, and the EHR EC had a higher proportion of female patients than the BMS-986020 trial, RCT ECs, and PFF-PR EC. Supplemental oxygen use during a 6-minute-walk test was similar to the BMS-986020 600 mg BID arm (29.2%) in the PFF-PR EC (30.4%), but not in the BMS-placebo arm (17.0%) or the other ECs. After the exclusion of patients with extreme balancing scores, 40 of 710 patients from the EHR data source and 8 of 413 patients from the PFF-PR were excluded from the primary analysis, and 78 of 995 patients from the EHR and 12 of 512 from the PFF-PR were excluded from the disease progression analyses (see Table E9).
Characteristic | BMS-986020 600 mg BID Treatment* (n = 95) | BMS-986020 Placebo (n = 47) | EC: ACE-IPF (n = 36) | EC: STEP-IPF (n = 17) | EC: PANTHER-IPF (n = 107) | EC: EHR Requiring Follow-Up FVC ⩽35 wk (n = 710) | EC: PFF-PR Requiring Follow-Up FVC ⩽35 wk (n = 413) |
---|---|---|---|---|---|---|---|
Age, yr | 69.5 (63.0–73.5) | 69.0 (65.0–76.0) | 64.7 (58.0–70.5) | 69.9 (65.4–74.3) | 66.0 (60.0–73.0) | 69.3 (63.7–74.6) | 70.3 (65.7–74.9) |
Female sex | 27/96 (28.1%) | 14/47 (29.8%) | 7/36 (19.4%) | 5/17 (29.4%) | 25/107 (23.4%) | 292/710 (41.1%) | 108/413 (26.2%) |
White race | 71/96 (74.0%) | 30/47 (63.8%) | 34/36 (94.4%) | 17/17 (100%) | 103/107 (96.3%) | 629/706 (89.1%) | 389/405 (96.0%) |
Non-White race or Hispanic ethnicity | 50/96 (52.1%) | 28/47 (59.6%) | 4/36 (11.1%) | 0/17 | 8/107 (7.5%) | 85/706 (12.0%) | 32/403 (7.9%) |
Time since IPF diagnosis, yr | 1.1 (0.5–2.1) | 1.3 (0.7–3.1) | 0.8 (0.3–2.3) | 1.4 (0.7–1.9) | 0.7 (0.2–1.8) | 0 (0–1.1) | 1.3 (0.4–2.7) |
Diagnosis after chest CT scan | 91/96 (94.8%) | 47/47 (100%) | 35/36 (97.2%) | 8/17 (47.1%) | 99/107 (92.5%) | 532/710 (74.9%)† | 401/413 (97.1%)‡ |
Diagnosis after lung biopsy | 28/96 (29.2%) | 9/47 (19.1%) | 23/36 (63.9%) | 16/17 (94.1%) | 59/107 (55.1%) | 117/710 (16.5%)† | 152/413 (36.8%) |
Comorbidities | |||||||
Current or past smoking | 61/96 (63.5%) | 25/47 (53.2%) | 29/36 (80.6%) | 12/17 (70.6%) | 81/107 (75.7%) | 446/710 (62.8%) | 251/413 (60.8%) |
CHF | 2/96 (2.1%) | 2/47 (4.3%) | 0/36 | 0/17 | 1/107 (0.9%) | 39/710 (5.5%) | 11/413 (2.7%) |
GERD | 45/95 (47.4%) | 19/47 (40.4%) | 24/36 (66.7%) | 11/17 (64.7%) | 67/107 (62.6%) | 169/710 (23.8%) | 277/413 (67.1%) |
CAD | 16/96 (16.7%) | 6/47 (12.8%) | 4/36 (11.1%) | 3/17 (17.6%) | 24/107 (22.4%) | 105/710 (14.8%) | 97/413 (23.5%) |
FVC, L | 2.4 (2.0–3.0) | 2.5 (1.8–3.1) | 2.8 (2.1–3.4) | 2.5 (2.3–2.8) | 2.9 (2.3–3.4) | 2.4 (2.0–3.0) | 2.7 (2.2–3.2) |
FVC, % predicted | 67.5 (61.0–77.0) | 69.0 (59.4–79.9) | 72.1 (61.4–76.1) | 61.5 (53.3–71.7) | 70.7 (62.9–77.4) | 65.0 (57.0–75.0) | 68.9 (60.5–77.6) |
DlCO, ml/mm Hg/min | 11.0 (8.8–13.3) | 12.0 (9.0–15.3) | 12.9 (9.6–16.6) | 9.7 (8.1–10.7) | 12.6 (10.7–15.9) | 10.9 (8.8–13.1) | 12.8 (10.7–15.5) |
DlCO, % predicted | 40.4 (35.0–51.4) | 45.3 (37.4–52.6) | 43.7 (35.9–48.0) | 32.6 (31.6–33.7) | 43.2 (37.6–51.4) | 49.0 (41.0–59.0) | 43.8 (37.3–51.5) |
FEV1, L | 2.1 (1.7–2.5) | 2.2 (1.6–2.7) | 2.3 (1.9–2.7) | 2.0 (1.9–2.3) | 2.3 (2.0–2.8) | 2.0 (1.7–2.4) | 2.2 (1.9–2.7) |
FEV1/FVC, % | 85.8 (82.0–88.7) | 87.4 (82.4–90.2) | 80.9 (79.1–84.3) | 81.4 (78.6–87.9) | 83.6 (80.2–86.5) | 81.8 (78.0–85.3) | 82.9 (79.1–86.4) |
6MWT distance, m | 387 (299–463) | 396 (338–463) | 389 (303–428) | 380 (340–414) | 381 (333–439) | NA | 392 (320–470) |
Supplemental O2 use during walk test | 28/96 (29.2%) | 8/47 (17.0%) | 5/36 (13.9%) | 0/17 | 7/107 (6.5%) | NA | 79/260 (30.4%) |
SF-36 aggregate score | |||||||
Physical | 35.8 (31.3–41.7) | 39.7 (33.9–45.3) | 38.7 (31.3–45.1) | 37.2 (33.3–41.9) | 41.9 (34.2–47.2) | NA | NA |
Mental | 49.9 (41.8–56.3) | 50.8 (39.4–59.4) | 54.7 (48.3–58.0) | 53.8 (49.3–60.0) | 56.4 (51.1–61.2) | NA | NA |
Compared with patients in the BMS-placebo arm, patients treated with BMS-986020 600 mg BID experienced significantly slower semiannual rates of decline in FVC and FVC% predicted, replicating the original BMS-986020 RCT (Figure 2). Specifically, the rate of decline in FVC over 26 weeks was 32.71 ml in patients treated with BMS-986020 600 mg BID, versus 130.09 ml in concurrent controls (difference, 97.38 ml; 95% CI, 24.55–170.21 ml; Figure 3A and see Table E10). Similarly, the rate of decline in FVC% predicted over 26 weeks was 1.36% in patients treated with BMS-986020 600 mg BID, versus 3.95% in concurrent controls (difference, 2.59%; 95% CI, 0.35–4.82%; Figure 3B and see Table E11).
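The reported difference and interval can be checked for internal consistency. The sketch below recomputes the difference from the two arm-level rates and backs out an approximate standard error from the 95% CI; the normal approximation (z = 1.96) is an assumption, since the mixed-effects model may have used a t-based interval.

```python
from typing import Tuple


def summarize_ci(low: float, high: float, z: float = 1.96) -> Tuple[float, float]:
    """Return (midpoint, approximate standard error) of a symmetric CI."""
    midpoint = (low + high) / 2.0
    std_err = (high - low) / (2.0 * z)
    return midpoint, std_err


# Reported 26-week rates of change in FVC (ml).
rate_treated = -32.71
rate_placebo = -130.09
difference = rate_treated - rate_placebo

# Reported 95% CI for the difference (ml).
midpoint, std_err = summarize_ci(24.55, 170.21)
```

The midpoint of the interval reproduces the reported difference of 97.38 ml, and the implied standard error is roughly 37 ml under the stated approximation.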

Figure 2. FVC (A) and FVC% predicted (B) over time among the BMS-986020 600 mg twice daily treatment arm, BMS-986020 placebo arm, and external controls. Week 26 is indicated by the red dashed line. ACE-IPF = Anticoagulant Effectiveness in Idiopathic Pulmonary Fibrosis; EC = external control; EHR = electronic health record; IPF = idiopathic pulmonary fibrosis; PANTHER-IPF = Prednisone, Azathioprine, and N-Acetylcysteine: A Study that Evaluates Response in Idiopathic Pulmonary Fibrosis; PFF-PR = Pulmonary Fibrosis Foundation Patient Registry; STEP-IPF = Sildenafil Trial of Exercise Performance in Idiopathic Pulmonary Fibrosis.
Compared with RCT ECs, patients treated with BMS-986020 600 mg BID experienced a slower semiannual rate of decline in FVC and FVC% predicted, resulting in treatment effect point estimates within the 95% CI of the original BMS-986020 RCT (Figure 3 and see Tables E10 and E11). Specifically, an IPW mixed effects model demonstrated that the effect of treatment with BMS-986020 was comparable to that in the original RCT when the BMS-placebo group was replaced with the STEP-IPF EC (difference, 168.03 ml over 26 weeks; 95% CI, 46.27–289.79 ml), ACE-IPF EC (difference, 63.69 ml over 26 weeks; 95% CI, −49.27 to 176.65 ml), and PANTHER-IPF EC (difference, 44.52 ml over 26 weeks; 95% CI, −32.59 to 121.62 ml; Figure 3). Patients treated with BMS-986020 also experienced a slower rate of decline in FVC% predicted compared with the STEP-IPF EC, ACE-IPF EC, and PANTHER-IPF EC, resulting in treatment effect estimates comparable to those in the original BMS-986020 RCT (see Table E11). Results were similar using mixed effects models without IPW as well as with overlap weights, although the effect of treatment with BMS-986020 was slightly diminished (Figure E3 and see Tables E12 and E13).

Figure 3. Estimates of differences in the rate of change in FVC (A) and FVC% predicted (B) from baseline to 26 weeks between treatment with BMS-986020 600 mg twice daily and controls from the mixed effects models after inverse probability score weighting. Inverse probability weighting was not performed when BMS-placebo was used as the comparator. ACE-IPF = Anticoagulant Effectiveness in Idiopathic Pulmonary Fibrosis; CI = confidence interval; EC = external control; EHR = electronic health record; IPF = idiopathic pulmonary fibrosis; PANTHER-IPF = prednisone, azathioprine, and N-acetylcysteine: A Study that Evaluates Response in Idiopathic Pulmonary Fibrosis; PFF-PR = Pulmonary Fibrosis Foundation Patient Registry; RCT = randomized controlled trial; STEP-IPF = Sildenafil Trial of Exercise Performance in Idiopathic Pulmonary Fibrosis.
The semiannual rate of FVC decline in patients in the PFF-PR EC and EHR EC was slower than in patients in the RCT ECs and similar to patients treated with BMS-986020 600 mg BID (Figure 2). Thus, the effect of treatment with BMS-986020 600 mg BID was attenuated and not comparable to the original BMS-986020 RCT when the PFF-PR EC or EHR EC replaced the BMS-placebo arm (PFF-PR EC IPW difference, 5.10 ml over 26 weeks; 95% CI, −42.28 to 52.48 ml; EHR EC IPW difference, 11.63 ml over 26 weeks; 95% CI, −79.96 to 103.23 ml; Figure 3). The treatment effect remained outside the 95% CI of the original BMS-986020 RCT when evaluating the outcome of rate of decline in FVC% predicted using the PFF-PR EC or EHR EC (see Table E11). A sensitivity analysis including only EHR EC patients who were never exposed to antifibrotic therapy also demonstrated similar results (see Tables E10 and E11). Application of overlap weights rather than IPW resulted in a faster rate of decline in the PFF-PR EC and EHR EC, leading to a treatment effect that was within the 95% CI of the original BMS-986020 RCT with the EHR EC and closer to the original 95% CI with the PFF-PR EC (see Tables E12 and E13).
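The comparability criterion (an EC-based point estimate falling within the 95% CI obtained with the concurrent control arm) can be applied directly to the IPW differences reported above. The following sketch uses only those reported values; the variable and function names are illustrative.

```python
from typing import Dict, Tuple

# 95% CI (ml) for the treatment difference using the concurrent BMS-placebo arm.
ORIGINAL_CI: Tuple[float, float] = (24.55, 170.21)

# Reported IPW differences in 26-week FVC change (ml) for each EC.
ec_estimates: Dict[str, float] = {
    "STEP-IPF EC": 168.03,
    "ACE-IPF EC": 63.69,
    "PANTHER-IPF EC": 44.52,
    "PFF-PR EC": 5.10,
    "EHR EC": 11.63,
}


def is_comparable(estimate: float, ci: Tuple[float, float] = ORIGINAL_CI) -> bool:
    """True if the point estimate lies within the reference 95% CI."""
    low, high = ci
    return low <= estimate <= high


comparable = {name: is_comparable(value) for name, value in ec_estimates.items()}
```

Under this rule the three RCT ECs are classified as comparable and the two RWD ECs are not. The criterion considers only the point estimate, so it does not reflect the width of each EC's own interval.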
By 26 weeks, 5 of 17 (29.4%) patients in the STEP-IPF EC experienced a disease progression event, defined as a ⩾10% decline in FVC% predicted, lung transplant, or death (see Figure E4A). By 32 weeks, a disease progression event was experienced by 13 of 48 patients (27.0%) in the BMS-986020 600 mg BID arm, 12 of 47 (25.5%) in the BMS-placebo arm, 15 of 107 (14.0%) in the PANTHER-IPF EC, 4 of 36 (11.1%) in the ACE-IPF EC, 69 of 500 (13.8%) in the PFF-PR EC, and 183 of 917 (20.0%) in the EHR EC. No significant difference in time to the composite endpoint of disease progression was observed among patients treated with BMS-986020 600 mg BID versus BMS-placebo (HR, 0.90; 95% CI, 0.43–1.93; Table 3 and see Figures E4B and E4C). A comparable risk of disease progression was observed when treatment with BMS-986020 600 mg BID was compared with the STEP-IPF EC (IPW HR, 0.86; 95% CI, 0.29–2.54) and the EHR-EC (IPW HR, 1.75; 95% CI, 0.71–4.34). However, the risk of disease progression was outside of the 95% CI of the original BMS-986020 RCT when treatment with BMS-986020 600 mg BID was compared with the ACE-IPF EC (IPW HR, 3.09; 95% CI, 0.92–10.39), PANTHER-IPF EC (IPW HR, 2.16; 95% CI, 0.96–4.87), and PFF-PR EC (IPW HR, 2.03; 95% CI, 0.82–5.02). In sensitivity analyses using overlap weights, the comparability of disease progression risk improved with all ECs but remained outside the 95% CI of the original BMS-986020 RCT with the ACE-IPF EC (HR, 2.79; 95% CI, 0.75–10.40) and PANTHER-IPF EC (HR, 1.99; 95% CI, 0.87–4.56; see Table E14). Additional secondary endpoints, including a disease progression composite that included time to hospitalization, as well as each component of the composite evaluated individually, are described in the online supplement and Tables E15–E18.
Control Data Source | Pts. at Risk | Events | Hazard Ratio (95% CI), No Additional Adjustment: 600 mg BID (n = 48, 13 Events) vs. Control | Hazard Ratio (95% CI), IPW: 600 mg BID (n = 48, 13 Events) vs. Control |
---|---|---|---|---|
RCT | ||||
BMS-placebo | 47 | 14 | 0.90 (0.43–1.93) | – |
ACE-IPF EC | 36 | 7 | 2.16 (0.70–6.64) | 3.09 (0.92–10.39) |
STEP-IPF EC | 17 | 5 | 0.86 (0.31–2.42) | 0.86 (0.29–2.54) |
PANTHER-IPF EC | 107 | 41 | 2.41 (1.16–4.98) | 2.16 (0.96–4.87) |
RCT-ECs combined* | 160 | 53 | 2.02 (1.04–3.95) | 1.80 (0.83–3.91) |
RWD | ||||
EHR-EC | 917 | 509 | 1.42 (0.81–2.49) | 1.75 (0.71–4.34) |
EHR-EC without antifibrotics† | 472 | 229 | 1.57 (0.88–2.82) | 1.95 (0.76–5.01) |
PFF-PR EC | 500 | 308 | 2.28 (1.26–4.13) | 2.03 (0.82–5.02) |
During the past two decades, FDA approval has been granted to more than 45 new products using EC data, and the number of pharmaceutical submissions that use EC data has increased during the past several years (15–17). Prior studies in oncology have generated ECs with historical RCTs (18) or RWD (13, 19) and observed effect estimates similar to concurrent clinical trial control arms for the outcome of overall survival. However, the comparability of ECs to concurrent RCT control arms has not been established in IPF, nor have such methods been applied to studies that evaluate changes in lung function as a primary outcome. In this study, ECs were developed for the BMS-986020 phase II RCT after evaluation of data source fitness for purpose, application of BMS-986020 RCT eligibility criteria, and exclusion of patients with extreme balancing scores using historical IPF RCT placebo arms, multicenter U.S. registry data, and EHR data. We demonstrated primary treatment effects comparable to those of the original BMS-986020 RCT when using RCT ECs, but not RWD ECs, including PFF-PR ECs and EHR ECs.
A strength of this analysis is the evaluation of the comparability of ECs across multiple data sources, including historical RCT, multicenter registry, and EHR data. Patients in RCT ECs experienced rates of FVC decline comparable to those in the BMS-placebo arm, resulting in treatment effect estimates within the 95% CI of the original RCT. This finding was consistent regardless of the application of IPW or overlap weighting approaches that reflect the probability of treatment assignment. Although RCT ECs did not always achieve statistical significance, which would influence regulatory decision-making, this may be related to sample size, particularly in the ACE-IPF EC and STEP-IPF EC. Our results support the suitability of including RCT ECs in a recent phase II IPF trial of a new drug candidate (6) and highlight the potential for RCT ECs to augment concurrent control arms in future IPF trials.
Unlike the RCT ECs, patients in RWD ECs experienced slower rates of FVC decline compared with the BMS-placebo arm, resulting in an inability to reproduce the treatment effect of BMS-986020 observed in the original RCT. There are several potential reasons for the slow rate of FVC decline in the PFF-PR EC and the EHR EC. First, changes in treatment patterns have occurred over time. Although antifibrotic therapy was not permitted in the BMS-986020 RCT and was unavailable in other RCT ECs, a large proportion of patients in both RWD ECs were exposed to antifibrotic therapy. In fact, the rate of FVC decline in RWD ECs was comparable to that observed in control arms of recent IPF trials that have permitted antifibrotic therapy (20). However, the results were similar in a sensitivity analysis that excluded patients exposed to antifibrotic agents in the EHR EC. Second, irregular FVC assessments, which were determined by clinical status, as well as potential informative missingness of PFTs in RWD, may impact the comparability of ECs. Third, given the balance of many prognostic factors known to influence IPF progression between the RWD ECs and BMS-placebo arm, it is possible that unmeasured factors might confound the behavior of RWD ECs in IPF, including potential patient-, physician-, and center-level biases about the types of patients who are enrolled in clinical trials.
The possibility that RWD ECs may reflect the behavior of a wider range of patients than is enrolled in clinical trials is supported by our inclusion of two independent sources of RWD. The PFF-PR, in particular, has advantages over EHR data with regard to rigor of outcome assessments, clinical phenotypes, and lower rates of missingness. Despite the application of BMS-986020 RCT eligibility criteria and implementation of a requirement for PFTs within 35 weeks of the index baseline date, the behavior of patients in RWD ECs differed from that of patients enrolled in clinical trials. This is in line with a recent study that found that only a limited number of patients with IPF who were enrolled in registries met the eligibility criteria of previous IPF RCTs and that FVC decline was slower in patients who met eligibility criteria (21). Our results may have implications regarding the generalizability of RCT results in patients with IPF, and do suggest caution in the use of RWD ECs to supplement future RCTs in IPF.
An evaluation of the appropriate circumstances and the analytical framework under which external data may serve as ECs is needed to create a useful approach for future trials in IPF. For example, when evaluating RWD ECs, overlap weights produced effect estimates that were more comparable to the original BMS-986020 RCT, and this approach warrants further study. Our results provide useful guidance for future research involving ECs in IPF, outlined in Table E19, that is also aligned with recent FDA recommendations addressing the use of ECs (22). We propose that RCT ECs could be useful to supplement concurrent control arms. However, based on the present study, variability in PFT assessment schedules even among RCT protocols may decrease the comparability of outcomes evaluating the time to composite disease progression events that include a decline in FVC. Thus, comparison of the rates of FVC decline may be a more appropriate analytic framework when incorporating ECs. In addition, extending ongoing trial emulations in cardiovascular outcomes (23) to pulmonary fibrosis would help to better understand contexts in which RWD match RCT findings.
The generation of ECs and evaluation of endpoint comparability in this study should be interpreted in the context of some limitations. First, although our balancing scores and weighting approaches included physiologic variables from accepted IPF risk prediction models such as the GAP index (gender, age, physiology) (24) to create comparable EC groups, it is impossible to account for all clinical variables that may influence FVC decline. This is particularly relevant given the differences in data collection and availability across data sources, as well as the different years each data source spanned, particularly the RWD ECs. Second, the BMS-986020 trial included non-U.S. sites, and differences in ECs related to geographic regions are possible. Third, it is possible that the EHR EC may have included patients with pulmonary fibrosis secondary to etiologies other than IPF, although it was created based on a previously validated coding algorithm that identified patients with IPF in the EHR (25). Finally, not all endpoints, particularly hospitalization, were captured in a similar manner across data sources, which limits comparability.
In summary, ECs created from RCT placebo arms matched the disease progression of the BMS-986020 RCT placebo arm closely enough to yield treatment effect estimates within the 95% CI of the original RCT, whereas ECs created from RWD sources did not. The findings suggest that RCT ECs can provide supportive data to augment concurrent control arms, with potential gains in trial efficiency. Additional methodological advances and data exploration are needed to select comparable EC populations in RWD and to incorporate statistical approaches that address unmeasured biases that may confound the behavior of RWD ECs, so that regulatory-quality data can be generated to supplement IPF trials.
Medical writing assistance was provided by Amanda Martin, Ph.D., of Medical Expressions (Chicago, IL) and was funded by Bristol Myers Squibb. The authors thank all patients who participated in the Pulmonary Fibrosis Foundation (PFF) Patient Registry. We also thank the investigators, clinical research coordinators, and other staff at participating PFF Care Centers for providing clinical data; the PFF, which established and has maintained the Registry since 2016; and the many generous donors.
1. Richeldi L, du Bois RM, Raghu G, Azuma A, Brown KK, Costabel U, et al.; INPULSIS Trial Investigators. Efficacy and safety of nintedanib in idiopathic pulmonary fibrosis. N Engl J Med 2014;370:2071–2082.
2. King TE Jr, Bradford WZ, Castro-Bernardini S, Fagan EA, Glaspole I, Glassberg MK, et al.; ASCEND Study Group. A phase 3 trial of pirfenidone in patients with idiopathic pulmonary fibrosis. N Engl J Med 2014;370:2083–2092.
3. Pocock SJ. The combination of randomized and historical controls in clinical trials. J Chronic Dis 1976;29:175–188.
4. Cave A, Kurz X, Arlett P. Real-world data for regulatory decision making: challenges and possible solutions for Europe. Clin Pharmacol Ther 2019;106:36–39.
5. Sherman RE, Davies KM, Robb MA, Hunter NL, Califf RM. Accelerating development of scientific evidence for medical products within the existing US regulatory framework. Nat Rev Drug Discov 2017;16:297–298.
6. Richeldi L, Azuma A, Cottin V, Hesslinger C, Stowasser S, Valenzuela C, et al.; 1305-0013 Trial Investigators. Trial of a preferential phosphodiesterase 4B inhibitor for idiopathic pulmonary fibrosis. N Engl J Med 2022;386:2178–2187.
7. Palmer SM, Snyder L, Todd JL, Soule B, Christian R, Anstrom K, et al. Randomized, double-blind, placebo-controlled, phase 2 trial of BMS-986020, a lysophosphatidic acid receptor antagonist for the treatment of idiopathic pulmonary fibrosis. Chest 2018;154:1061–1069.
8. Noth I, Anstrom KJ, Calvert SB, de Andrade J, Flaherty KR, Glazer C, et al.; Idiopathic Pulmonary Fibrosis Clinical Research Network (IPFnet). A placebo-controlled randomized trial of warfarin in idiopathic pulmonary fibrosis. Am J Respir Crit Care Med 2012;186:88–95.
9. Idiopathic Pulmonary Fibrosis Clinical Research Network; Martinez FJ, de Andrade JA, Anstrom KJ, King TE Jr, Raghu G. Randomized trial of acetylcysteine in idiopathic pulmonary fibrosis. N Engl J Med 2014;370:2093–2101.
10. Zisman DA, Schwarz M, Anstrom KJ, Collard HR, Flaherty KR, Hunninghake GW; Idiopathic Pulmonary Fibrosis Clinical Research Network. A controlled trial of sildenafil in advanced idiopathic pulmonary fibrosis. N Engl J Med 2010;363:620–628.
11. Wang BR, Edwards R, Freiheit EA, Ma Y, Burg C, de Andrade J, et al. The Pulmonary Fibrosis Foundation patient registry. Rationale, design, and methods. Ann Am Thorac Soc 2020;17:1620–1628.
12. Mahendraratnam N, Silcox C, Mercon K, Kroetsch A, Romine M, Harrison N, et al. Determining real-world data’s fitness for use and the role of reliability. White paper. Washington, DC: Duke-Margolis Center for Health Policy; 2019 [accessed 2023 Apr 20]. Available from: https://healthpolicy.duke.edu/sites/default/files/2019-11/rwd_reliability.pdf.
13. Carrigan G, Whipple S, Capra WB, Taylor MD, Brown JS, Lu M, et al. Using electronic health records to derive control arms for early phase single-arm lung cancer trials: proof-of-concept in randomized controlled trials. Clin Pharmacol Ther 2020;107:369–377.
14. Li F, Morgan KL, Zaslavsky AM. Balancing covariates via propensity score weighting. J Am Stat Assoc 2018;113:390–400.
15. Jahanshahi M, Gregg K, Davis G, Ndu A, Miller V, Vockley J, et al. The use of external controls in FDA regulatory decision making. Ther Innov Regul Sci 2021;55:1019–1035.
16. Goring S, Taylor A, Müller K, Li TJJ, Korol EE, Levy AR, et al. Characteristics of non-randomised studies using comparisons with external controls submitted for regulatory approval in the USA and Europe: a systematic review. BMJ Open 2019;9:e024895.
17. Anderson M, Naci H, Morrison D, Osipenko L, Mossialos E. A review of NICE appraisals of pharmaceuticals 2000-2016 found variation in establishing comparative clinical effectiveness. J Clin Epidemiol 2019;105:50–59.
18. Berry D, Elashoff M, Blotner S, Davi R, Beineke P, Chandler M, et al. Creating a synthetic control arm from previous clinical trials: application to establishing early end points as indicators of overall survival in acute myeloid leukemia (AML). J Clin Oncol 2017;35:7021–7021.
19. Jia Z, Lilly MB, Koziol JA, Chen X, Xia XQ, Wang Y, et al. Generation of “virtual” control groups for single arm prostate cancer adjuvant trials. PLoS One 2014;9:e85010.
20. Martinez FJ, Yow E, Flaherty KR, Snyder LD, Durheim MT, Wisniewski SR, et al.; CleanUP-IPF Investigators of the Pulmonary Trials Cooperative. Effect of antimicrobial therapy on respiratory hospitalization or death in adults with idiopathic pulmonary fibrosis: the CleanUP-IPF randomized clinical trial. JAMA 2021;325:1841–1851.
21. Khor YH, Schulte M, Johannson KA, Marcoux V, Fisher JH, Assayag D, et al.; Austin ILD Registry and CARE-PF Investigators; ALLIANCE Study Group. Eligibility criteria from pharmaceutical randomised controlled trials of idiopathic pulmonary fibrosis: a registry-based study. Eur Respir J 2023;61:2202163.
22. U.S. Food and Drug Administration, Department of Health and Human Services. Considerations for the design and conduct of externally controlled trials for drug and biological products guidance for industry. 2023 [accessed 2023 Apr 20]. Available from: https://www.fda.gov/regulatory-information/search-fda-guidance-documents/considerations-design-and-conduct-externally-controlled-trials-drug-and-biological-products.
23. Franklin JM, Patorno E, Desai RJ, Glynn RJ, Martin D, Quinto K, et al. Emulating randomized clinical trials with nonrandomized real-world evidence studies: first results from the RCT DUPLICATE initiative. Circulation 2021;143:1002–1013.
24. Ley B, Ryerson CJ, Vittinghoff E, Ryu JH, Tomassetti S, Lee JS, et al. A multidimensional index and staging system for idiopathic pulmonary fibrosis. Ann Intern Med 2012;156:684–691.
25. Durheim MT, Judy J, Bender S, Baumer D, Lucas J, Robinson SB, et al. In-hospital mortality in patients with idiopathic pulmonary fibrosis: a US cohort study. Lung 2019;197:699–707.
* Co–senior authors.
Supported by Bristol Myers Squibb.
Author Contributions: All authors have directly accessed and verified the underlying data reported in the manuscript. All authors confirm that they had full access to all the data in the study and accept responsibility to submit for publication. A.C.S., L.D.S., H.H., S.R.S., A.S.L., Y.Q., R.L., H.Z., A.F., L.B., L.M.W., and S.M.P. contributed to study conception and design as well as data interpretation. A.C.S., L.D.S., S.R.S., L.M.W., and S.M.P. performed data acquisition. H.H., S.R.S., A.S.L., E.Y., R.L., H.Z., and L.M.W. performed data analysis.
Data sharing: The datasets analyzed during the current study are not publicly available, but are available from the corresponding author on reasonable request.
This article has an online supplement, which is accessible from this issue’s table of contents at www.atsjournals.org.
Originally Published in Press as DOI: 10.1164/rccm.202210-1947OC on June 29, 2023
Author disclosures are available with the text of this article at www.atsjournals.org.