American Journal of Respiratory and Critical Care Medicine

Rationale: Poor lung health in adult life may occur partly through suboptimal growth and development, as suggested by epidemiological evidence pointing to early life risk factors.

Objectives: To systematically investigate the effects of lung development genes on adult lung function.

Methods: Using UK Biobank data, we tested the association of 391 genes known to influence lung development with FVC and FEV1/FVC. We split the dataset into two random subsets of 207,616 and 138,411 individuals, using the larger subset to select the most promising signals and the smaller subset for replication.

Measurements and Main Results: We identified 55 genes, of which 36 (16 for FVC, 19 for FEV1/FVC, and one for both) had not been identified in the largest, most recent genome-wide study of lung function. Most of these 36 signals were intronic variants; expression data from blood and lung tissue showed that the majority affect the expression of the genes they lie within. Further testing of 34 of these 36 signals in the CHARGE and SpiroMeta consortia showed that 16 replicated after Bonferroni correction and another 12 replicated at nominal significance level. Of the 55 genes, 53 fell into four biological categories whose function is to regulate organ size and cell integrity (growth factors; transcriptional regulators; cell-to-cell adhesion; extracellular matrix), suggesting that these specific processes are important for adult lung health.

Conclusions: Our study demonstrates the importance of lung development genes in regulating adult lung function and influencing both restrictive and obstructive patterns. Further investigation of these developmental pathways could lead to druggable targets.

Scientific Knowledge on the Subject

Epidemiological studies on early life risk factors suggest that poor lung health in adult life may be partly due to suboptimal growth and development. Although the early environment has been implicated in the etiology of impaired lung function, there has been no systematic investigation of the role of genes known to play a vital role in lung development.

What This Study Adds to the Field

Our findings show a clear effect of lung development genes on adult lung function, influencing both restrictive and obstructive patterns. Further investigation of these developmental pathways could ultimately lead to druggable targets aimed at optimizing adult lung health and preventing chronic obstructive pulmonary disease.

Gaining a full understanding of the genetic and environmental causes of impaired lung function is important if we are to discover ways to prevent chronic obstructive pulmonary disease (COPD) and to optimize lung health. Furthermore, the public health benefits of improving lung function are far-reaching, given that poor lung function, especially a lower FVC, is a powerful predictor of increased mortality, in particular from cardiovascular disease, even in nonsmokers (1, 2).

A long-standing hypothesis states that low lung function and COPD in late adult life may occur partly through suboptimal growth and development, with failure to attain maximal lung capacity in young adulthood (3–6). There is substantial epidemiological and experimental evidence supporting the concept of the developmental origins of adult lung disease and impaired lung function (7). Epidemiological evidence includes tracking of lung function from early childhood to adulthood, which implicates environmental factors operating early in life (3); various prenatal, perinatal, and postnatal risk factors have been linked to impaired adult lung function, including maternal smoking, low birth weight, prematurity, and respiratory tract infections (4, 5).

Although the early environment has been implicated in the etiology of impaired lung function, there has been no systematic investigation of the role of genes known to play a vital role in lung development. Genetic variants affecting adult cross-sectional lung function have shown little or no effect on longitudinal lung function decline (8), and some of these variants have been identified in children as well as adults. These observations suggest that lung function at a given point in adulthood may be more influenced by genetic factors that affect the developmental trajectory of lung function rather than the rate of subsequent decline. Indeed, lung development gene variants have been identified in genome-wide association studies (GWASs) of lung function (9); some of these have been associated with infant lung function (10), and for others, there is evidence of differential expression during human fetal lung development (11). However, it is likely that other lung development gene variants genuinely associated with lung function may not have achieved the stringent genome-wide significance thresholds (typically 5 × 10−8) required to protect against false positive findings. Taking a complementary hypothesis-driven approach, here, we investigate 391 genes known to influence lung development for association with adult lung function, in particular FVC and the ratio of FEV1 to FVC (FEV1/FVC), using data from the large UK Biobank (UKB) dataset.

This article was previously published in preprint form (https://doi.org/10.1101/447367).

UKB Data

The UKB is a study of 502,543 volunteer participants aged 39–70 recruited from 22 study centers across the United Kingdom, which collected data on a large number of genetic and nongenetic risk factors for chronic disease and related disease traits (12, 13). We included in our analyses 346,027 individuals of self-reported white ethnicity with available good quality genetic and lung function data, as shown in Figure E1 in the online supplement. For lung function data, we used FVC and FEV1 “best measure,” as proposed in the UK BiLEVE (Biobank Lung Exome Variant Evaluation) study (14). Table E1 provides UKB data field numbers and web links for full descriptions of all variables used in the analyses.

For the genetic data, quality control and genotype imputation were performed by UKB, as previously described (13); we used the genetic dataset made available in July 2017.

Selection of Genes Related to Lung Development

The list of genes related to lung development was prepared by two experts (C.H.D. and M.H.), as previously described (15). An initial list of genes was compiled by each expert separately based on their knowledge from both human and experimental data, including orthologs of genes known to affect lung development in a variety of model organisms. The two lists were then compared, and the experts agreed on a common list. This list was further extended to include relevant additional genes identified based on pathway information from Kyoto Encyclopedia of Genes and Genomes (KEGG) (16) (relevant genes lying in the same pathways as those in the list) and literature data from Human Genome Epidemiology (HuGE) Navigator (17) (genes considered as associated with lung development in previous genetic association studies). For large gene families, when it was unclear which genes to select, we chose those with higher gene expression in fetal lungs, using information from BioGPS (Human U133A/GNF1H Gene Atlas database) (18).

From this list of 403 genes, after excluding genes on the X chromosome, we considered 391 genes (Table E2). Within these genes, 106,384 variants were available in the UKB after the exclusion of variants with minor allele frequency of <0.01 and imputation quality (info score) of <0.5.
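The variant filter described above can be sketched as follows. This is a minimal illustration, not the authors' pipeline: the `Variant` class, its field names, and the three example variants are hypothetical, and the sketch assumes that a variant was dropped if it failed either criterion and that minor allele frequency is the smaller of the allele frequency and its complement.

```python
from dataclasses import dataclass

MAF_MIN = 0.01   # minor allele frequency cutoff stated in the text
INFO_MIN = 0.5   # imputation quality (info score) cutoff stated in the text


@dataclass
class Variant:
    """Hypothetical container for one imputed variant."""
    rsid: str
    effect_allele_freq: float
    info: float


def minor_allele_freq(effect_allele_freq: float) -> float:
    # The minor allele is whichever allele is rarer.
    return min(effect_allele_freq, 1.0 - effect_allele_freq)


def passes_qc(v: Variant) -> bool:
    # Assumption: a variant is excluded if it fails either criterion.
    return minor_allele_freq(v.effect_allele_freq) >= MAF_MIN and v.info >= INFO_MIN


# Made-up variants, for illustration only.
variants = [
    Variant("rs_common", 0.82, 0.99),
    Variant("rs_rare", 0.995, 0.97),            # MAF = 0.005, excluded
    Variant("rs_poorly_imputed", 0.40, 0.31),   # info < 0.5, excluded
]
kept = [v.rsid for v in variants if passes_qc(v)]
print(kept)
```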

Association of Lung Development Genes with Adult Lung Function

We first considered which of the 391 genes were associated with adult lung function in the largest, most recent GWAS by Shrine and colleagues (9); 19 of them were reported either as novel signals or as replications of findings from previous studies (14, 19–27), and their results for FVC and FEV1/FVC in UKB (n = 346,027) are presented in Table E3.

To identify and replicate further associations in the remaining 372 genes, we randomly split the UKB dataset into two subsets of 60% (n = 207,616) and 40% (n = 138,411) of the total sample. The main characteristics of participants, including lung function, for the whole study sample and for the two subsets separately are summarized in Table E4. We used the larger subset (stage 1) to select the most promising signals and the smaller subset (stage 2) for replication. In stage 1, we tested all 98,255 variants in the 372 genes and, for each gene, took forward to replication the “best SNP” (i.e., the SNP with the lowest P value, provided that the P value was lower than an arbitrary screening threshold of 1 × 10−3). In stage 2, we tested all best SNPs and considered as replicated those associations with effect in the same direction as in stage 1 and a one-sided P value below a Bonferroni-corrected threshold (0.05 divided by the number of SNPs sent to replication: 102 SNPs, P < 4.9 × 10−4, for FVC; 113 SNPs, P < 4.4 × 10−4, for FEV1/FVC). The use of Bonferroni correction in stage 2, on which all our inferences are based, fully addresses the issue of multiple testing.
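The two-stage logic can be illustrated with a short sketch. It is not the authors' code: the `Association` class and function names are hypothetical, and the second stage 1 entry is a made-up variant included only to show the selection step. The ACTN3 estimates are those reported in Table 1.

```python
from dataclasses import dataclass
from typing import Dict, Iterable

SCREENING_P = 1e-3   # stage 1 screening threshold stated in the text
ALPHA = 0.05         # significance level divided by the number of SNPs in stage 2


@dataclass
class Association:
    """Hypothetical record of one SNP-trait association."""
    gene: str
    snp: str
    beta: float   # per-allele effect estimate
    p: float      # stage 1 P value, or one-sided P value in stage 2


def best_snp_per_gene(stage1: Iterable[Association]) -> Dict[str, Association]:
    """Keep, for each gene, the SNP with the lowest P value below the screening threshold."""
    best: Dict[str, Association] = {}
    for a in stage1:
        if a.p >= SCREENING_P:
            continue
        if a.gene not in best or a.p < best[a.gene].p:
            best[a.gene] = a
    return best


def bonferroni_threshold(n_snps: int) -> float:
    return ALPHA / n_snps


def is_replicated(stage1: Association, stage2: Association, threshold: float) -> bool:
    """Same direction of effect and one-sided stage 2 P value below the threshold."""
    same_direction = (stage1.beta > 0) == (stage2.beta > 0)
    return same_direction and stage2.p < threshold


fvc_threshold = bonferroni_threshold(102)     # SNPs sent to replication for FVC
ratio_threshold = bonferroni_threshold(113)   # SNPs sent to replication for FEV1/FVC

stage1 = [
    Association("ACTN3", "rs57127845", -12.3, 2.5e-7),         # Table 1, stage 1
    Association("ACTN3", "rs_hypothetical_1", -3.0, 2.0e-2),   # made-up, fails screening
]
selected = best_snp_per_gene(stage1)
stage2_actn3 = Association("ACTN3", "rs57127845", -14.7, 2.3e-7)   # Table 1, stage 2
actn3_replicated = is_replicated(selected["ACTN3"], stage2_actn3, fvc_threshold)

print(f"{fvc_threshold:.1e}", f"{ratio_threshold:.1e}", actn3_replicated)
```

Dividing 0.05 by 102 and by 113 gives the two thresholds quoted in the text when rounded to two significant figures.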

In both stage 1 and stage 2, we estimated the association of each variant with FVC and FEV1/FVC using linear mixed models as implemented in BOLT-LMM (28), accounting for cryptic relatedness and the fine-scale population structure that can be found within self-reported white ethnicity. The analyses assumed an additive genetic model and were adjusted for age, age², sex, height, smoking status (ever vs. never), genotyping array, and assessment center. Adjustment for height ensures the genetic effects on lung function are independent of body size.

For both FVC and FEV1/FVC, we evaluated whether our replicated SNP for a gene was in linkage disequilibrium (LD) (r² > 0.1) with the best SNP for a different gene, in which case we performed conditional analyses, mutually adjusting one for the other.

We performed the following three sets of secondary analyses on replicated SNPs: 1) we assessed their association with spirometrically defined COPD (defined as an FEV1/FVC below the lower limit of normal [LLN] based on the NHANES (National Health and Nutrition Examination Survey) III study equation for white ethnicity [29]), adjusting the models for the same variables as in the main analyses; 2) we repeated the main analyses stratified by smoking status; if lung development genes are largely influencing maximal level attained through lung growth, then we might expect stronger associations in nonsmokers; in contrast, if their influence on lung function is through increasing lung repair in response to insults such as smoking, which would affect lung function decline, then we might expect stronger associations in smokers; and 3) we repeated the main analyses stratifying participants below and above the median age of 58. If a lung development gene affects lung regeneration, we might expect a stronger effect in older people and vice versa, although the age range in the UKB (39–70 yr) limits the extent to which effect modification by age can be investigated in this dataset. To increase the statistical power of these secondary analyses, we performed them on the whole UKB sample (N = 346,027), which included 35,840 spirometrically defined COPD cases (FEV1/FVC < LLN) (10.4%) and 211,689 ever-smokers (61.2%).

Using results in individuals of European ancestry from the CHARGE (Cohorts for Heart and Aging Research in Genomic Epidemiology) and SpiroMeta consortia, we sought further replication of the signals that had not been reported by Shrine and colleagues (9) either as newly identified or as replicated from previous studies. For CHARGE, we used the results of a GWAS meta-analysis of 18 studies (26), and for SpiroMeta, we used the publicly available results of a GWAS meta-analysis of 22 studies downloaded from the GWAS catalog (https://www.ebi.ac.uk/gwas/publications/30804560). As in our analyses, all studies in CHARGE and SpiroMeta controlled for age, age², sex, height, and smoking status (with additional adjustments in CHARGE for height² and smoking pack-years as well as weight for FVC) as well as population stratification and, when necessary, family relatedness and center. We could not use a standard meta-analysis to combine the results from CHARGE and SpiroMeta because the latter used rank-based inverse normal transformation, and we therefore used Fisher’s meta-analysis of P values. Replication was defined as an effect in the same direction as that in the UKB, with a one-sided P value below the Bonferroni-corrected threshold. In Fisher’s meta-analysis, P values were inverted for estimates in the opposite direction.
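Fisher’s method for combining P values can be sketched as below. This is an illustration under stated assumptions rather than the authors' implementation: the function names are hypothetical, and “inverted” is read here as replacing a one-sided P value p by 1 − p when the estimate points in the opposite direction to the UKB effect. The inputs are the one-sided CHARGE and SpiroMeta P values reported in Table 4 for CLDN20 (both in the UKB direction) and ACTN3 (SpiroMeta in the opposite direction).

```python
import math
from typing import Sequence


def invert_if_opposite(p_one_sided: float, same_direction: bool) -> float:
    """Assumed reading of 'inverted': use 1 - p when the effect opposes the UKB direction."""
    return p_one_sided if same_direction else 1.0 - p_one_sided


def fisher_combined_p(p_values: Sequence[float]) -> float:
    """Fisher's method: X = -2 * sum(ln p) follows a chi-square with 2k df under the null.

    For an even number of degrees of freedom the chi-square survival function has
    a closed form, exp(-x/2) * sum_{i<k} (x/2)^i / i!, so only `math` is needed.
    """
    k = len(p_values)
    half_stat = -sum(math.log(p) for p in p_values)   # equals X / 2
    term = 1.0
    total = 1.0
    for i in range(1, k):
        term *= half_stat / i
        total += term
    return math.exp(-half_stat) * total


# CLDN20 (Table 4): both external estimates in the same direction as the UKB.
p_cldn20 = fisher_combined_p([6.4e-3, 0.011])

# ACTN3 (Table 4): the SpiroMeta direction is opposite to the UKB effect.
p_actn3 = fisher_combined_p([3.6e-3, invert_if_opposite(0.473, same_direction=False)])

print(f"{p_cldn20:.1e}", f"{p_actn3:.3f}")
```

With these inputs the combined values agree, to the precision reported, with the meta-analysis column of Table 4 for these two genes.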

Stage 1 results for all 98,255 SNPs in the 372 genes are reported in Table E5 for both FVC and FEV1/FVC. Taking to stage 2 the best SNP per gene, 102 SNPs with P values <1 × 10−3 were tested for replication for FVC, and 113 were tested for replication for FEV1/FVC; results for all SNPs tested for replication are reported in Table E6 for both traits.

In conditional analyses adjusting the effect of a replicated SNP for any other replicated SNP in LD with it, we identified three signals (two for FVC and one for FEV1/FVC) in which the effect disappeared, and these were dropped (Table E6).

We replicated signals in 42 genes (P value in stage 2 below the Bonferroni-corrected thresholds of P < 4.9 × 10−4 for FVC or P < 4.4 × 10−4 for FEV1/FVC). To assess whether these associations might be explained by neighboring genes previously associated with lung function, we repeated the analyses after adjusting for SNPs reported by Shrine and colleagues (9) that were in LD (r² > 0.1) with the 42 SNPs we had identified. After these conditional analyses, 36 signals remained as independent findings (Table E7), and all further analyses focused on them.

The results for these 36 genes are reported in Tables 1 and 2; of these, 16 were uniquely associated (replication in stage 2) with FVC, 19 were uniquely associated with FEV1/FVC, and only one signal was associated with both traits. In the secondary analysis testing the association with spirometrically defined COPD, 14 of the 36 genes showed a statistically significant association after Bonferroni correction (P < 1.4 × 10−3), and a further seven showed nominal statistical significance (P < 0.05), with the odds ratio for COPD always in a consistent direction with the effect on FEV1/FVC (lower or higher ratio) (Table 3). In the secondary analysis stratified by smoking, results for FVC and FEV1/FVC were broadly similar in smokers versus never-smokers (Figures E2 and E3), with no statistically significant interactions after Bonferroni correction. The same was observed for the analysis stratified by age, with results broadly similar and no significant interactions after Bonferroni correction (Figures E4 and E5).

Table 1. UKB Results from Stage 1 and Stage 2 (Internal Replication) for the 17 Signals for FVC*

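| Gene | SNP | Chr | BP | EA | EAF | Functional Consequence | Blood eQTL P Value | Lung eQTL P Value | Stage 1 β (n = 207,616) | Stage 1 SE | Stage 1 P Value | Stage 2 β (n = 138,411) | Stage 2 SE | Stage 2 P Value |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| ACTN3 | rs57127845 | 11 | 66,318,325 | C | 0.82 | Intron variant | 7.4 × 10−6 | 0.022 | −12.3 | 2.4 | 2.5 × 10−7 | −14.7 | 2.9 | 2.3 × 10−7 |
| ACTN4 | rs189809900 | 19 | 39,147,164 | A | 0.98 | Intron variant | 0.112 | 0.005 | −25.7 | 7.2 | 3.5 × 10−4 | −35.8 | 8.9 | 2.8 × 10−5 |
| CLDN20 | rs34268254 | 6 | 155,590,120 | TA | 0.39 | Intron variant | NA | NA | 10.2 | 1.9 | 4.5 × 10−8 | 9.0 | 2.3 | 4.6 × 10−5 |
| GSK3B | rs6805251 | 3 | 119,560,606 | T | 0.38 | Intron variant | 3.0 × 10−97 | 1.1 × 10−5 | 7.4 | 1.9 | 6.8 × 10−5 | 8.0 | 2.3 | 2.1 × 10−4 |
| HOXA1 | rs45571645 | 7 | 27,135,096 | G | 0.98 | Missense variant | 2.7 × 10−5 | 0.324 | 33.3 | 6.8 | 9.8 × 10−7 | 28.7 | 8.3 | 2.9 × 10−4 |
| HOXB4 | rs201603635 | 17 | 46,653,038 | T | 0.94 | 3′ UTR variant | NA | NA | −12.8 | 3.8 | 6.9 × 10−4 | −16.3 | 4.6 | 1.9 × 10−4 |
| KAT8 | rs138259061 | 16 | 31,136,066 | A | 0.64 | Intron variant | 3.5 × 10−70§ | 2.8 × 10−23§ | −8.7 | 1.9 | 4.1 × 10−6 | −7.7 | 2.3 | 4.5 × 10−4 |
| ITGB5 | rs17282078 | 3 | 124,481,760 | T | 0.87 | 3′ UTR variant | 3.2 × 10−6 | 0.326 | 9.1 | 2.7 | 7.4 × 10−4 | 15.7 | 3.3 | 9.0 × 10−7 |
| MMP24 | rs7280 | 20 | 33,864,484 | A | 0.58 | 3′ UTR variant | 2.2 × 10−36 | 0.499 | 10.3 | 1.8 | 2.7 × 10−8 | 9.7 | 2.3 | 9.5 × 10−6 |
| NCOR2 | rs72451021 | 12 | 124,811,393 | ACT | 0.89 | Intron variant | 6.3 × 10−11 | 0.429 | 12.8 | 2.9 | 1.2 × 10−5 | 12.8 | 3.6 | 1.9 × 10−4 |
| NR3C1 | rs72801051 | 5 | 142,685,670 | A | 0.84 | Intron variant | 1.1 × 10−8 | 0.352 | 12.1 | 2.4 | 6.9 × 10−7 | 10.7 | 3.0 | 1.6 × 10−4 |
| ROR2 | rs12684752 | 9 | 94,682,990 | T | 0.94 | Intron variant | 1.4 × 10−9 | 0.266 | −13.8 | 3.9 | 3.7 × 10−4 | −17.8 | 4.7 | 8.5 × 10−5 |
| RUNX1 | rs12483501 | 21 | 36,224,276 | T | 0.63 | Intron variant | 0.926 | NA | 8.9 | 2.0 | 9.3 × 10−6 | 10.1 | 2.5 | 2.1 × 10−5 |
| SERPINC1 | rs2227603 | 1 | 173,882,548 | A | 0.97 | Intron variant | 0.967 | 0.776 | −21.6 | 5.4 | 5.8 × 10−5 | −23.1 | 6.6 | 2.5 × 10−4 |
| SOX9 | rs796209434 | 17 | 70,122,505 | CT | 0.53 | 3′ UTR variant | NA | NA | 7.3 | 1.8 | 6.2 × 10−5 | 7.9 | 2.2 | 2.0 × 10−4 |
| WNT2B | rs351370 | 1 | 113,054,659 | C | 0.41 | Intron variant | 0.191 | 6.9 × 10−4 | 7.0 | 1.8 | 1.2 × 10−4 | 7.7 | 2.2 | 3.2 × 10−4 |
| WNT9A | rs35799012 | 1 | 228,133,322 | C | 0.83 | Intron variant | 0.322 | 0.041 | −7.9 | 2.4 | 9.1 × 10−4 | −13.5 | 2.9 | 2.2 × 10−6 |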

Definition of abbreviations: BP = base position (build GRCh37); Chr = chromosome; EA = effect allele; EAF = EA frequency; eQTL = expression quantitative trait loci; NA = not available; UKB = UK Biobank; UTR = untranslated region.

*FVC in milliliters.

Functional consequence: for SNPs with different consequences associated with different transcripts, we considered the most deleterious.

β: per-allele effect estimate.

§Expression data for proxy rs9936329 (r² = 0.95).

Expression data for proxy rs11057583 (r² = 1.0).

Table 2. UKB Results from Stage 1 and Stage 2 (Internal Replication) for the 20 Signals for FEV1/FVC*

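| Gene | SNP | Chr | BP | EA | EAF | Functional Consequence | Blood eQTL P Value | Lung eQTL P Value | Stage 1 β (n = 207,616) | Stage 1 SE | Stage 1 P Value | Stage 2 β (n = 138,411) | Stage 2 SE | Stage 2 P Value |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| CSNK2B | rs3117579 | 6 | 31,633,496 | G | 0.80 | 5′ UTR variant | 1.1 × 10−4 | 0.025 | 0.24 | 0.02 | 1.2 × 10−23 | 0.19 | 0.03 | 2.0 × 10−11 |
| CTNND1 | rs665058 | 11 | 57,579,166 | T | 0.56 | Intron variant | 5.5 × 10−12 | 0.460 | −0.08 | 0.02 | 1.0 × 10−5 | −0.12 | 0.02 | 1.6 × 10−7 |
| ELN | rs2528794 | 7 | 73,480,805 | G | 0.88 | Intron variant | NA | 0.413 | −0.13 | 0.03 | 7.5 × 10−6 | −0.13 | 0.04 | 1.7 × 10−4 |
| FARP2 | rs377324224 | 2 | 242,393,182 | TG | 0.63 | Intron variant | NA | NA | 0.08 | 0.02 | 9.5 × 10−5 | 0.09 | 0.02 | 1.1 × 10−4 |
| FGFR3 | rs3135877 | 4 | 1,804,276 | G | 0.96 | Intron variant | NA | 0.479 | −0.24 | 0.05 | 9.4 × 10−7 | −0.34 | 0.06 | 7.0 × 10−9 |
| FGFR4 | rs3135911 | 5 | 176,513,896 | C | 0.71 | 5′ UTR variant | 9.5 × 10−6 | 0.359 | 0.10 | 0.02 | 8.0 × 10−7 | 0.10 | 0.03 | 7.5 × 10−5 |
| GFI1 | rs150037086 | 1 | 92,952,080 | G | 0.31 | Intron variant | 1.6 × 10−18§ | NA | 0.11 | 0.02 | 1.8 × 10−7 | 0.11 | 0.03 | 8.0 × 10−6 |
| GJE1 | rs225607 | 6 | 142,455,130 | C | 0.54 | Missense variant | NA | NA | 0.10 | 0.02 | 1.1 × 10−7 | 0.13 | 0.02 | 2.1 × 10−8 |
| KAT7 | rs755736 | 17 | 47,891,904 | A | 0.34 | Intron variant | 3.5 × 10−4 | 0.439 | 0.07 | 0.02 | 5.4 × 10−4 | 0.09 | 0.02 | 1.1 × 10−4 |
| MAPRE1 | rs853854 | 20 | 31,420,757 | T | 0.48 | Intron variant | 3.0 × 10−34 | NA | −0.08 | 0.02 | 3.2 × 10−5 | −0.08 | 0.02 | 2.1 × 10−4 |
| NFATC3 | rs548092276 | 16 | 68,210,935 | C | 0.84 | Intron variant | NA | NA | −0.16 | 0.03 | 3.6 × 10−9 | −0.16 | 0.03 | 7.5 × 10−7 |
| PDGFB | rs2267406 | 22 | 39,633,749 | T | 0.25 | Intron variant | 9.8 × 10−184 | 0.396 | −0.09 | 0.02 | 2.1 × 10−5 | −0.12 | 0.03 | 1.1 × 10−5 |
| PPARD | rs2267666 | 6 | 35,370,728 | A | 0.24 | Intron variant | 0.002 | 5.9 × 10−4 | −0.13 | 0.02 | 5.2 × 10−9 | −0.10 | 0.03 | 7.0 × 10−5 |
| RARA | rs2715554 | 17 | 38,489,170 | A | 0.85 | Intron variant | 3.6 × 10−4 | 0.568 | 0.16 | 0.03 | 2.2 × 10−9 | 0.12 | 0.03 | 1.5 × 10−4 |
| RUNX3 | rs9438876 | 1 | 25,241,116 | A | 0.49 | Intron variant | 8.9 × 10−17 | 0.323 | 0.11 | 0.02 | 4.1 × 10−9 | 0.10 | 0.02 | 2.4 × 10−5 |
| SERPING1 | rs11229063 | 11 | 57,369,730 | G | 0.73 | Intron variant | 7.2 × 10−253 | 0.079 | 0.12 | 0.02 | 2.3 × 10−8 | 0.12 | 0.03 | 2.5 × 10−6 |
| SFRP2 | rs17030437 | 4 | 154,704,225 | C | 0.75 | Intron variant | 7.2 × 10−80 | 0.776 | 0.10 | 0.02 | 1.1 × 10−5 | 0.10 | 0.03 | 1.5 × 10−4 |
| SOX9 | rs796209434 | 17 | 70,122,505 | CT | 0.53 | 3′ UTR variant | NA | NA | −0.04 | 0.02 | 2.4 × 10−2 | −0.08 | 0.02 | 4.3 × 10−4 |
| TCF7L1 | rs4346385 | 2 | 85,504,989 | A | 0.29 | Intron variant | 0.002 | 0.006 | −0.08 | 0.02 | 1.6 × 10−4 | −0.09 | 0.03 | 3.1 × 10−4 |
| WNT7A | rs73151668 | 3 | 13,920,594 | G | 0.85 | Intron variant | 0.281 | 1.1 × 10−5 | −0.11 | 0.03 | 3.0 × 10−5 | −0.21 | 0.03 | 9.5 × 10−11 |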

For definition of abbreviations, see Table 1.

*FEV1/FVC expressed as a percentage.

Functional consequence: for SNPs with different consequences associated with different transcripts, we considered the most deleterious.

β: per-allele effect estimate.

§Expression data for proxy rs4565725 (r² = 0.84).

Table 3. Results for the Association of the 36 Signals with COPD

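| Gene | SNP | Chr | BP | EA | EAF | OR | 95% CI | P Value |
|---|---|---|---|---|---|---|---|---|
| ACTN3 | rs57127845 | 11 | 66,318,325 | C | 0.82 | 1.01 | 0.99–1.03 | 0.299 |
| ACTN4 | rs189809900 | 19 | 39,147,164 | A | 0.98 | 1.03 | 0.97–1.10 | 0.326 |
| CLDN20 | rs34268254 | 6 | 155,590,120 | TA | 0.39 | 0.99 | 0.97–1.00 | 0.105 |
| CSNK2B | rs3117579 | 6 | 31,633,496 | G | 0.80 | 0.95 | 0.93–0.97 | 1.1 × 10−6* |
| CTNND1 | rs665058 | 11 | 57,579,166 | T | 0.56 | 1.03 | 1.01–1.04 | 0.001* |
| ELN | rs2528794 | 7 | 73,480,805 | G | 0.88 | 1.05 | 1.02–1.08 | 2.2 × 10−4* |
| FARP2 | rs377324224 | 2 | 242,393,182 | TG | 0.63 | 0.98 | 0.96–1.00 | 0.011 |
| FGFR3 | rs3135877 | 4 | 1,804,276 | G | 0.96 | 1.11 | 1.06–1.16 | 1.3 × 10−6* |
| FGFR4 | rs3135911 | 5 | 176,513,896 | C | 0.71 | 0.96 | 0.94–0.97 | 4.0 × 10−7* |
| GFI1 | rs150037086 | 1 | 92,952,080 | G | 0.31 | 0.97 | 0.95–0.98 | 9.5 × 10−5* |
| GJE1 | rs225607 | 6 | 142,455,130 | C | 0.54 | 0.97 | 0.96–0.99 | 4.1 × 10−4* |
| GSK3B | rs6805251 | 3 | 119,560,606 | T | 0.38 | 0.99 | 0.98–1.01 | 0.447 |
| HOXA1 | rs45571645 | 7 | 27,135,096 | G | 0.98 | 0.98 | 0.92–1.04 | 0.743 |
| HOXB4 | rs201603635 | 17 | 46,653,038 | T | 0.94 | 1.01 | 0.97–1.04 | 0.729 |
| ITGB5 | rs17282078 | 3 | 124,481,760 | T | 0.87 | 1.00 | 0.98–1.03 | 0.759 |
| KAT7 | rs755736 | 17 | 47,891,904 | A | 0.34 | 0.98 | 0.96–1.00 | 0.018 |
| KAT8 | rs138259061 | 16 | 31,136,066 | A | 0.64 | 1.01 | 1.00–1.03 | 0.190 |
| MAPRE1 | rs853854 | 20 | 31,420,757 | T | 0.48 | 1.02 | 1.01–1.04 | 0.008 |
| MMP24 | rs7280 | 20 | 33,864,484 | A | 0.58 | 1.01 | 1.00–1.03 | 0.119 |
| NCOR2 | rs72451021 | 12 | 124,811,393 | ACT | 0.89 | 0.99 | 0.97–1.02 | 0.522 |
| NFATC3 | rs548092276 | 16 | 68,210,935 | C | 0.84 | 1.06 | 1.03–1.08 | 1.6 × 10−6* |
| NR3C1 | rs72801051 | 5 | 142,685,670 | A | 0.84 | 1.04 | 1.02–1.07 | 7.1 × 10−5* |
| PDGFB | rs2267406 | 22 | 39,633,749 | T | 0.25 | 1.03 | 1.01–1.05 | 0.006 |
| PPARD | rs2267666 | 6 | 35,370,728 | A | 0.24 | 1.05 | 1.03–1.07 | 1.7 × 10−7* |
| RARA | rs2715554 | 17 | 38,489,170 | A | 0.85 | 0.94 | 0.92–0.96 | 8.6 × 10−8* |
| ROR2 | rs12684752 | 9 | 94,682,990 | T | 0.94 | 1.00 | 0.97–1.04 | 0.824 |
| RUNX1 | rs12483501 | 21 | 36,224,276 | T | 0.63 | 0.98 | 0.97–1.00 | 0.044 |
| RUNX3 | rs9438876 | 1 | 25,241,116 | A | 0.49 | 0.97 | 0.95–0.98 | 4.1 × 10−5* |
| SERPINC1 | rs2227603 | 1 | 173,882,548 | A | 0.97 | 0.97 | 0.92–1.01 | 0.155 |
| SERPING1 | rs11229063 | 11 | 57,369,730 | G | 0.73 | 0.96 | 0.95–0.98 | 2.1 × 10−5* |
| SFRP2 | rs17030437 | 4 | 154,704,225 | C | 0.75 | 0.97 | 0.96–0.99 | 0.004 |
| SOX9 | rs796209434 | 17 | 70,122,505 | CT | 0.53 | 1.01 | 1.00–1.03 | 0.079 |
| TCF7L1 | rs4346385 | 2 | 85,504,989 | A | 0.29 | 1.03 | 1.01–1.05 | 0.003 |
| WNT2B | rs351370 | 1 | 113,054,659 | C | 0.41 | 1.00 | 0.98–1.02 | 0.944 |
| WNT7A | rs73151668 | 3 | 13,920,594 | G | 0.85 | 1.06 | 1.03–1.08 | 2.0 × 10−6* |
| WNT9A | rs35799012 | 1 | 228,133,322 | C | 0.83 | 1.00 | 0.98–1.02 | 0.749 |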

Definition of abbreviations: BP = base position (build GRCh37); Chr = chromosome; CI = confidence interval; COPD = chronic obstructive pulmonary disease; EA = effect allele; EAF = EA frequency; OR = per-allele odds ratio.

Analyses in the whole dataset: N = 346,027. In bold are results with P < 0.05.

*Statistically significant after Bonferroni correction (P < 1.4 × 10−3).
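The counts quoted in the text for the COPD analysis can be rechecked from the P values in Table 3. The sketch below assumes that the Bonferroni threshold of 1.4 × 10−3 corresponds to 0.05 divided by the 36 signals tested; this denominator is inferred from the reported value rather than stated in the text, and the variable names are illustrative.

```python
ALPHA = 0.05

# P values for association with spirometrically defined COPD, copied from Table 3.
copd_p = {
    "ACTN3": 0.299, "ACTN4": 0.326, "CLDN20": 0.105, "CSNK2B": 1.1e-6,
    "CTNND1": 0.001, "ELN": 2.2e-4, "FARP2": 0.011, "FGFR3": 1.3e-6,
    "FGFR4": 4.0e-7, "GFI1": 9.5e-5, "GJE1": 4.1e-4, "GSK3B": 0.447,
    "HOXA1": 0.743, "HOXB4": 0.729, "ITGB5": 0.759, "KAT7": 0.018,
    "KAT8": 0.190, "MAPRE1": 0.008, "MMP24": 0.119, "NCOR2": 0.522,
    "NFATC3": 1.6e-6, "NR3C1": 7.1e-5, "PDGFB": 0.006, "PPARD": 1.7e-7,
    "RARA": 8.6e-8, "ROR2": 0.824, "RUNX1": 0.044, "RUNX3": 4.1e-5,
    "SERPINC1": 0.155, "SERPING1": 2.1e-5, "SFRP2": 0.004, "SOX9": 0.079,
    "TCF7L1": 0.003, "WNT2B": 0.944, "WNT7A": 2.0e-6, "WNT9A": 0.749,
}

# Assumed denominator: one test per signal (36 in total).
threshold = ALPHA / len(copd_p)

bonferroni_significant = sorted(g for g, p in copd_p.items() if p < threshold)
nominal_only = sorted(g for g, p in copd_p.items() if threshold <= p < ALPHA)

print(f"threshold = {threshold:.1e}")
print(len(bonferroni_significant), "Bonferroni-significant:", bonferroni_significant)
print(len(nominal_only), "nominally significant only:", nominal_only)
```

Under this assumption the split is 14 genes below the corrected threshold and a further 7 below 0.05, matching the numbers given in the text.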

For the external replication in CHARGE and SpiroMeta, 34 of the 36 signals had available data for the SNP or a proxy (LD r² ≥ 0.8). The sample sizes varied across SNPs (Tables 4 and 5) from 108,318 to 143,612 in the meta-analysis of CHARGE and SpiroMeta. Overall, of these 34 variants, 16 variants replicated after Bonferroni correction, and another 12 variants replicated at the nominal level of significance (Tables 4 and 5).

Table 4. Results from External Replication in CHARGE and SpiroMeta for the 17 Signals for FVC

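| Gene | SNP | Chr | EA | EAF | UKB Stage 2 β (n = 138,411) | UKB Stage 2 SE | UKB Stage 2 P Value | CHARGE n | CHARGE β | CHARGE SE | CHARGE P Value | SpiroMeta n | SpiroMeta Direction | SpiroMeta P Value | Meta-Analysis CHARGE + SpiroMeta P Value |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| ACTN3 | rs57127845 | 11 | C | 0.82 | −14.7 | 2.9 | 2.3 × 10−7 | 60,507 | −11.2 | 4.2 | 3.6 × 10−3 | 75,423 | + | 0.473 | 0.014 |
| ACTN4 | rs189809900 | 19 | A | 0.98 | −35.8 | 8.9 | 2.8 × 10−5 | 36,112 | 4.7 | 21.3 | 0.414 | 81,081 | − | 0.232 | 0.407 |
| CLDN20 | rs34268254 | 6 | TA | 0.39 | 9.0 | 2.3 | 4.6 × 10−5 | 60,507 | 7.9 | 3.2 | 6.4 × 10−3 | 75,422 | + | 0.011 | 7.4 × 10−4* |
| GSK3B | rs6805251 | 3 | T | 0.38 | 8.0 | 2.3 | 2.1 × 10−4 | 60,506 | 4.9 | 3.2 | 0.064 | 74,551 | + | 0.083 | 0.033 |
| HOXA1 | rs45571645 | 7 | G | 0.98 | 28.7 | 8.3 | 2.9 × 10−4 | 58,929 | 2.3 | 12.0 | 0.425 | 82,863 | − | 0.481 | 0.559 |
| HOXB4 | rs201603635 | 17 | T | 0.94 | −16.3 | 4.6 | 1.9 × 10−4 | NA | NA | NA | NA | NA | NA | NA | NA |
| ITGB5 | rs17282078 | 3 | T | 0.87 | 15.7 | 3.3 | 9.0 × 10−7 | 60,508 | 13.5 | 4.7 | 2.0 × 10−3* | 75,422 | + | 0.365 | 6.1 × 10−3 |
| KAT8 | rs138259061 | 16 | A | 0.64 | −7.7 | 2.3 | 4.5 × 10−4 | 60,508§ | −6.2§ | 3.3§ | 0.029§ | 82,865 | − | 0.099 | 0.020 |
| MMP24 | rs7280 | 20 | A | 0.58 | 9.7 | 2.3 | 9.5 × 10−6 | 60,508 | 8.5 | 3.3 | 5.1 × 10−3 | 81,992 | + | 0.014 | 7.5 × 10−4* |
| NCOR2 | rs72451021 | 12 | ACT | 0.89 | 12.8 | 3.6 | 1.9 × 10−4 | 45,256 | 1.9 | 5.9 | 0.372 | 75,422 | + | 0.038 | 0.074 |
| NR3C1 | rs72801051 | 5 | A | 0.84 | 10.7 | 3.0 | 1.6 × 10−4 | 60,507 | 7.5 | 4.4 | 0.046 | 75,421 | + | 0.111 | 0.032 |
| ROR2 | rs12684752 | 9 | T | 0.94 | −17.8 | 4.7 | 8.5 × 10−5 | 60,506 | −2.5 | 6.6 | 0.353 | 75,421 | − | 0.398 | 0.415 |
| RUNX1 | rs12483501 | 21 | T | 0.63 | 10.1 | 2.5 | 2.1 × 10−5 | 60,508 | 4.0 | 3.9 | 0.150 | 81,992 | + | 0.026 | 0.026 |
| SERPINC1 | rs2227603 | 1 | A | 0.97 | −23.1 | 6.6 | 2.5 × 10−4 | 60,508 | −20.6 | 10.0 | 0.020 | 75,423 | − | 0.367 | 0.044 |
| SOX9 | rs796209434 | 17 | CT | 0.53 | 7.9 | 2.2 | 2.0 × 10−4 | 60,506 | 10.4 | 3.2 | 5.4 × 10−4* | 75,422 | + | 8.3 × 10−3 | 6.0 × 10−5* |
| WNT2B | rs351370 | 1 | C | 0.41 | 7.7 | 2.2 | 3.2 × 10−4 | 60,506 | 7.4 | 3.4 | 0.014 | 74,550 | + | 3.2 × 10−3 | 4.9 × 10−4* |
| WNT9A | rs35799012 | 1 | C | 0.83 | −13.5 | 2.9 | 2.2 × 10−6 | 60,508 | −9.4 | 4.7 | 0.024 | 74,552 | − | 2.8 × 10−3* | 7.1 × 10−4* |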

Definition of abbreviations: CHARGE = Cohorts for Heart and Aging Research in Genomic Epidemiology; Chr = chromosome; EA = effect allele; EAF = EA frequency; NA = not available; UKB = UK Biobank.

Analyses in the whole dataset (N = 346,027). For SpiroMeta, only the effect direction is reported because β is not interpretable (use of rank-based inverse normal transformation). In bold are external replication results with P < 0.05. β values are per-allele effect estimates.

*External replication results significant at Bonferroni (P < 3.1 × 10−3).

P value inverted in Fisher’s meta-analysis to reflect the effect in opposite direction (P values reported for UKB stage 2, CHARGE, and SpiroMeta are all one-sided, see text).

Proxy: rs13220615 (r² = 0.97).

§Proxy: rs1978485 (r² = 0.98).

Proxy: rs11057583 (r² = 1.0).

Proxy: rs1042678 (r² = 0.97).

Table 5. Results from External Replication in CHARGE and SpiroMeta for the 20 Signals for FEV1/FVC

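| Gene | SNP | Chr | EA | EAF | UKB Stage 2 β (n = 138,411) | UKB Stage 2 SE | UKB Stage 2 P Value | CHARGE n | CHARGE β | CHARGE SE | CHARGE P Value | SpiroMeta n | SpiroMeta Direction | SpiroMeta P Value | Meta-Analysis CHARGE + SpiroMeta P Value |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| CSNK2B | rs3117579 | 6 | G | 0.80 | 0.19 | 0.03 | 2.0 × 10−11 | NA | NA | NA | NA | 83,081 | + | 5.9 × 10−5* | |
| CTNND1 | rs665058 | 11 | T | 0.56 | −0.12 | 0.02 | 1.6 × 10−7 | 60,531 | −0.05 | 0.04 | 0.132 | 75,639 | − | 0.078 | 0.057 |
| ELN | rs2528794 | 7 | G | 0.88 | −0.13 | 0.04 | 1.7 × 10−4 | 58,707 | −0.27 | 0.07 | 7.0 × 10−5* | 74,767 | − | 0.016 | 1.6 × 10−5* |
| FARP2 | rs377324224 | 2 | TG | 0.63 | 0.09 | 0.02 | 1.1 × 10−4 | NA | NA | NA | NA | NA | NA | NA | NA |
| FGFR3 | rs3135877 | 4 | G | 0.96 | −0.34 | 0.06 | 7.0 × 10−9 | 39,004 | −0.46 | 0.13 | 1.6 × 10−4* | 69,559 | − | 0.120 | 2.3 × 10−4* |
| FGFR4 | rs3135911 | 5 | C | 0.71 | 0.10 | 0.03 | 7.5 × 10−5 | 51,019 | 0.21 | 0.05 | 1.9 × 10−5* | 75,639 | + | 1.7 × 10−3* | 5.9 × 10−7* |
| GFI1 | rs150037086 | 1 | G | 0.31 | 0.11 | 0.03 | 8.0 × 10−6 | 60,530 | 0.12 | 0.04 | 3.9 × 10−3 | 75,638 | + | 0.054 | 2.0 × 10−3* |
| GJE1 | rs225607 | 6 | C | 0.54 | 0.13 | 0.02 | 2.1 × 10−8 | 60,531 | 0.07 | 0.04 | 0.060 | 75,638 | + | 0.114 | 0.040 |
| KAT7 | rs755736 | 17 | A | 0.34 | 0.09 | 0.02 | 1.1 × 10−4 | 58,706 | 0.04 | 0.04 | 0.188 | 83,081 | + | 0.018 | 0.023 |
| MAPRE1 | rs853854 | 20 | T | 0.48 | −0.08 | 0.02 | 2.1 × 10−4 | 58,949 | −0.04 | 0.04 | 0.191 | 83,079 | − | 0.025 | 0.030 |
| NFATC3 | rs548092276 | 16 | C | 0.84 | −0.16 | 0.03 | 7.5 × 10−7 | 60,531§ | −0.16§ | 0.05§ | 1.7 × 10−3*§ | 75,639§ | −§ | 4.8 × 10−3§ | 1.0 × 10−4* |
| PDGFB | rs2267406 | 22 | T | 0.25 | −0.12 | 0.03 | 1.1 × 10−5 | 51,669 | −0.20 | 0.05 | 9.1 × 10−5* | 83,079 | − | 0.063 | 7.5 × 10−5* |
| PPARD | rs2267666 | 6 | A | 0.24 | −0.10 | 0.03 | 7.0 × 10−5 | 60,532 | −0.22 | 0.05 | 2.6 × 10−6* | 83,079 | − | 7.1 × 10−3 | 3.5 × 10−7* |
| RARA | rs2715554 | 17 | A | 0.85 | 0.12 | 0.03 | 1.5 × 10−4 | 37,587 | 0.04 | 0.08 | 0.285 | 83,080 | + | 0.017 | 0.030 |
| RUNX3 | rs9438876 | 1 | A | 0.49 | 0.10 | 0.02 | 2.4 × 10−5 | 58,679 | 0.08 | 0.05 | 0.041 | 82,209 | + | 0.027 | 8.6 × 10−3 |
| SERPING1 | rs11229063 | 11 | G | 0.73 | 0.12 | 0.03 | 2.5 × 10−6 | 60,529 | 0.11 | 0.05 | 0.010 | 75,638 | + | 0.018 | 1.7 × 10−3* |
| SFRP2 | rs17030437 | 4 | C | 0.75 | 0.10 | 0.03 | 1.5 × 10−4 | 60,503 | 0.19 | 0.05 | 1.3 × 10−5* | 75,638 | + | 0.143 | 2.6 × 10−5* |
| SOX9 | rs796209434 | 17 | CT | 0.53 | −0.08 | 0.02 | 4.3 × 10−4 | 60,529 | −0.08 | 0.04 | 0.022 | 75,637 | − | 2.1 × 10−3* | 5.1 × 10−4* |
| TCF7L1 | rs4346385 | 2 | A | 0.29 | −0.09 | 0.03 | 3.1 × 10−4 | 60,531 | −0.03 | 0.04 | 0.276 | 75,638 | − | 0.043 | 0.065 |
| WNT7A | rs73151668 | 3 | G | 0.85 | −0.21 | 0.03 | 9.5 × 10−11 | 60,530 | −0.24 | 0.06 | 4.6 × 10−5* | 82,210 | − | 9.3 × 10−4* | 7.7 × 10−7* |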

For definition of abbreviations, see Table 4.

Analyses in the whole dataset (N = 346,027). In bold are external replication results with P < 0.05. β values are per-allele effect estimates.

*External replication results significant at Bonferroni (P < 2.6 × 10−3).

Proxy: rs451643 (r² = 1).

Proxy: rs4565725 (r² = 0.84).

§Proxy: rs8048034 (r² = 0.80).

Proxy: rs1042678 (r² = 0.95).

To help interpret our findings, we grouped all 55 genes into biological categories based on their known function, as shown in Table 6; this information was derived from the National Center for Biotechnology Information (NCBI) Gene (www.ncbi.nlm.nih.gov/gene), Ensembl (https://www.ensembl.org), GeneCards (https://www.genecards.org), and Mouse Genome Informatics (http://www.informatics.jax.org) databases. Of the 55 genes, 53 fall into just four categories: growth factors, transcriptional regulators, cell-to-cell adhesion and cytoskeleton, and extracellular matrix (ECM). Genes encoding growth factors or their receptors form the best-represented category (n = 19), and within this group, Wnt-signaling genes (CSNK2B, DVL2, GSK3B, ROR2, SFRP2, TCF7L1, WNT2B, WNT7A, and WNT9A) are particularly prevalent. Genes encoding transcription factors are also highly represented (n = 17); within this category, we identified genes involved in vitamin A signaling, including the retinoic acid ligand–activated transcription factors (RARA and RARB), and glucocorticoid signaling genes, including the glucocorticoid receptor gene (NR3C1) as well as NCOR1 and its paralogue NCOR2, which modulate the activity of nuclear receptors, including RARs (retinoic acid receptors), PPARD, and the glucocorticoid receptor. Ten genes relate to cell-to-cell adhesion and the cytoskeleton, including three genes associated with actin microfilaments (ACTN3, ACTN4, and TNS1). Another seven genes relate to the ECM, including ELN, which encodes elastin.

Table 6. Gene Function and Associated Biological Categories for All the 55 Genes Identified for FVC, FEV1/FVC, or Both*

| Gene and Biological Category | Full Name | Function |
|---|---|---|
| Growth factors | | |
| CSNK2B | Casein kinase 2 β | Ubiquitous protein kinase that regulates metabolic pathways, signal transduction, transcription, translation, and replication |
| FGFR3 | Fibroblast growth factor receptor 3 | Encodes a tyrosine kinase and cell surface receptor for fibroblast growth factors |
| FGFR4 | Fibroblast growth factor receptor 4 | Encodes a tyrosine kinase and cell surface receptor for fibroblast growth factors |
| GSK3B | Glycogen synthase kinase 3 β | Encodes a serine-threonine kinase belonging to the glycogen synthase kinase subfamily |
| PDGFB | Platelet-derived growth factor subunit B | Encodes a member of the protein family comprised of PDGFs |
| ROR2 | Receptor tyrosine kinase like orphan receptor 2 | Encodes a receptor protein tyrosine kinase and a type I transmembrane protein that belongs to the ROR subfamily of cell surface receptors |
| SFRP2 | Secreted frizzled related protein 2 | Encodes a member of the SFRP family that acts as soluble modulators of Wnt signaling |
| TCF7L1 | Transcription factor 7–like 1 | Encodes a member of the T-cell factor/lymphoid enhancer factor family of transcription factors |
| WNT2B | Wnt family member 2B | Member of the WNT gene family |
| WNT7A | Wnt family member 7A | Member of the WNT gene family |
| WNT9A | Wnt family member 9A | Member of the WNT gene family |
| BMP4 | Bone morphogenetic protein 4 | Encodes a secreted ligand of the TGF-β (transforming growth factor β) superfamily of proteins |
| FGF10 | Fibroblast growth factor 10 | Encodes a member of the fibroblast growth factor family with roles in morphogenesis of epithelium, reepithelialization of wounds, hair development, and early lung organogenesis |
| FGF18 | Fibroblast growth factor 18 | Encodes a member of the fibroblast growth factor family with roles in cell growth, morphogenesis, and tissue repair and is particularly important in bone development |
| HHIP | Hedgehog interacting protein | Encodes a member of the HHIP family, which is a highly conserved, vertebrate-specific inhibitor of HH signaling |
| IGF1 | Insulin-like growth factor 1 | Encodes an insulin-like protein involved in mediating growth and development |
| KDR | Kinase insert domain receptor (vascular endothelial growth factor receptor 2) | Encodes one of the two receptors of the VEGF; this receptor functions as the main mediator of VEGF-induced endothelial proliferation, survival, migration, tubular morphogenesis, and sprouting |
| PTCH1 | Patched 1 | Encodes a member of the patched family of proteins and a component of the hedgehog signaling pathway |
| TGFB2 | Transforming growth factor β 2 | Encodes a secreted ligand of the TGF-β superfamily of proteins |
| Transcriptional regulators | | |
| GFI1 | Growth factor independent 1 transcriptional repressor | Encodes a nuclear zinc-finger protein that functions as a transcriptional repressor |
| HOXA1 | Homeobox A1 | Encodes a DNA-binding transcription factor involved in spatial patterning in development |
| HOXB4 | Homeobox B4 | Encodes a DNA-binding transcription factor involved in spatial patterning in development |
| KAT7 | Lysine acetyltransferase 7 | Encodes a protein that is part of the multimeric HBO1 complex and possesses histone H4-specific acetyltransferase activity; this activity regulates gene transcription (e.g., VEGFR2, by influencing chromatin conformation) |
| KAT8 | Lysine acetyltransferase 8 | Encodes a member of the MYST histone acetylase protein family; the encoded protein regulates gene transcription by influencing chromatin conformation |
| NCOR2 | Nuclear receptor corepressor 2 | Encodes a protein that regulates repression of thyroid-hormone and retinoic-acid receptors |
| NFATC3 | Nuclear factor of activated T cells 3 | Encodes a member of the nuclear factors of activated T cells family of transcription factors |
| NR3C1 | Nuclear receptor subfamily 3 group C member 1 | Encodes glucocorticoid receptor |
| PPARD | Peroxisome proliferator-activated receptor delta | Encodes a member of the PPAR family that is believed to function as an integrator of transcriptional repression and nuclear receptor signaling |
| RARA | Retinoic acid receptor α | Encodes the retinoic acid receptor α that acts as a ligand-activated transcription factor |
| RUNX1 | Runt-related transcription factor 1 | Encodes a member of the runt family of transcription factors that regulate hematopoiesis and skeletal development |
| RUNX3 | Runt-related transcription factor 3 | Encodes a member of the runt family of transcription factors that regulate hematopoiesis and skeletal development |
| SOX9 | SRY-box 9 | The protein encoded is an HMG box DNA-binding protein |
| GATA6 | GATA-binding protein 6 | Member of the GATA family of transcription factors that regulate cellular differentiation and organogenesis during embryonic development |
| NCOR1 | Nuclear receptor corepressor 1 | Encodes a protein that regulates repression of thyroid-hormone and retinoic-acid receptors |
| RARB | Retinoic acid receptor β | Encodes the retinoic acid receptor β that acts as a ligand-activated transcription factor |
| RUNX2 | Runt-related transcription factor 2 | Encodes a member of the runt family of transcription factors that regulate hematopoiesis and skeletal development |
| Cell-to-cell adhesion and cytoskeleton | | |
| ACTN3 | Actinin α 3 (gene/pseudogene) | Involved in crosslinking actin filaments, part of the cytoskeleton |
| ACTN4 | Actinin α 4 | Actin-binding protein, part of the cytoskeleton |
CLDN20Claudin 20Encodes a tight junction protein; important for cell polarity and regulating movement of molecules via the paracellular route. 
CTNND1Catenin delta 1Armadillo protein family, which function in adhesion between cells and signal transduction 
FARP2FERM, ARH/RhoGEF, and pleckstrin domain protein 2ρ guanidine exchange factor 
GJE1Gap junction protein epsilon 1Gap junction protein; Gap junctions are specialized intercellular connections that enable cell-to-cell communication 
MAPRE1Microtubule associated protein RP/EB family member 1Encodes a protein that localizes to microtubules, a dynamic network of filaments that form part of the cytoskeleton 
DSPDesmoplakinEncodes a protein component of functional desmosomes 
PARD3Par-3 family cell polarity regulatorEncodes a member of the PARD protein family that regulates cell polarity and cell-to-cell integrity 
TNS1Tensin 1Encodes for a protein that localizes to focal adhesions and crosslinks actin filaments 
Extracellular matrix   
ELNElastinEncodes a protein that is one of the two components of elastic fibers 
ITGB5Integrin subunit β 5Encodes the integrin β subunit 5 protein 
MMP24Matrix metallopeptidase 24Encodes a member of the peptidase M10 family of MMPs 
SERPINC1Serpin family C member 1Encodes a plasma protease inhibitor and a member of the serpin superfamily 
SERPING1Serpin family G member 1Encodes a highly glycosylated plasma protein involved in the regulation of the complement cascade 
ITGAVIntegrin subunit α VEncodes a member of the integrin α chain family 
MMP15Matrix metallopeptidase 15Encodes a member of the peptidase M10 family and membrane-type subfamily of MMPs 
Oxidative stress and endothelial dysfunction   
AGERAdvanced glycosylation end-product (AGE) specific receptorMultiligand receptor; role in chronic vascular injury 
Immune response and surfactant regulation   
SFTPDSurfactant protein DThe protein encoded is part of the innate immune response and has a role in surfactant regulation 

*Bold formatting indicates the 36 novel genes.

Functional Annotation and Gene Expression

Using the Ensembl variant effect predictor tool (https://www.ensembl.org/info/genome/variation/prediction/predicted_data.html#consequences) (30), we investigated the functional consequence of the 36 novel signals; Tables 1 and 2 show that most of them are intron variants.

We also assessed whether the 36 signals affect the expression of the genes they lie within. For gene expression in the blood, we used cis–expression quantitative trait loci (eQTL) data from the eQTLGen Consortium (https://www.eqtlgen.org/cis-eqtls.html) (31), which includes 37 datasets with a total of 31,684 individuals; for the 36 SNPs, the actual sample size varied from 8,269 to 31,684. For gene expression in lung tissue, we used data from the Genotype-Tissue Expression Portal (GTEx) (https://www.gtexportal.org/home/eqtls/tissue?tissueName=Lung), which includes lung tissue samples from 383 individuals, with actual sample sizes varying from 12 to 286 for our 36 SNPs. Tables 1 and 2 report the effects of the 36 signals on the expression of the gene they lie within. Of 27 SNPs with available data, 22 showed eQTL evidence in the blood, and 10 showed it in the lung tissue. For four signals (WNT2B, WNT7A, WNT9A, and ACTN4), we found evidence in the lung tissue, but not in the blood, despite the very small sample size of lung eQTL data.
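
The lookup described above, restricting eQTL results to the gene each SNP lies within and then counting SNPs with and without evidence, could be sketched roughly as follows. This is an illustration only, not the pipeline used in the study: the table layout, column names, SNP and gene identifiers, and the 0.05 evidence threshold are all assumptions, and neither eQTLGen nor GTEx necessarily distributes data in this form.

```python
import csv
import io

# Hypothetical excerpt of a cis-eQTL summary table. Column names and all
# values are invented for illustration; they are not eQTLGen or GTEx data.
EQTL_TSV = """snp\tgene\tpvalue\tn_samples
rs_example_1\tGENE_A\t1e-12\t31684
rs_example_1\tGENE_B\t0.20\t31684
rs_example_2\tGENE_C\t0.03\t8269
rs_example_3\tGENE_D\t0.60\t12000
"""

# Hypothetical mapping from each lead SNP to the gene it lies within.
HOST_GENE = {
    "rs_example_1": "GENE_A",
    "rs_example_2": "GENE_C",
    "rs_example_3": "GENE_D",
    "rs_example_4": "GENE_E",  # no eQTL row, so it counts as "no data available"
}


def host_gene_eqtls(tsv_text, host_gene, alpha=0.05):
    """Return {snp: (pvalue, n_samples, is_eqtl)} for SNP/host-gene pairs only.

    alpha is an assumed evidence threshold, not one stated in the article.
    """
    results = {}
    for row in csv.DictReader(io.StringIO(tsv_text), delimiter="\t"):
        if host_gene.get(row["snp"]) != row["gene"]:
            continue  # skip effects on genes other than the host gene
        p = float(row["pvalue"])
        results[row["snp"]] = (p, int(row["n_samples"]), p < alpha)
    return results


hits = host_gene_eqtls(EQTL_TSV, HOST_GENE)
with_data = len(hits)
with_evidence = sum(1 for _, _, is_eqtl in hits.values() if is_eqtl)
print(f"{with_evidence} of {with_data} SNPs with data show eQTL evidence")
```

Keeping the per-SNP sample size alongside each result matters here because, as noted above, the effective sample size differs considerably between SNPs and between blood and lung tissue.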

Discussion

Our study demonstrates the role of lung development genes in regulating adult lung function and provides further support for the developmental origins of both restrictive and obstructive impairment of adult lung function and spirometrically defined COPD. Overall, we identified 55 lung development–related genes associated with adult lung function; of these, 36 had not been reported in the largest and most recent GWAS of lung function (9), showing the value of our hypothesis-driven approach in complementing agnostic GWASs. Only 6 of the 36 signals could not be replicated in external populations from the CHARGE and SpiroMeta consortia; for three of them, this is not surprising, given the low allele frequency and, therefore, low power to detect realistic effect sizes despite the large replication sample size.
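
To illustrate why a low allele frequency limits replication power even in a large sample, the following sketch approximates the power of a two-sided additive test on a standardized trait, using a normal approximation in which the expected z-score is sqrt(2 N f (1 - f)) times the per-allele effect, where f is the minor allele frequency. The sample size and effect size are hypothetical and not taken from the study; the threshold of 0.05/34 assumes a Bonferroni correction across the 34 signals tested in CHARGE and SpiroMeta.

```python
from math import sqrt
from statistics import NormalDist


def replication_power(n, maf, beta_sd, alpha):
    """Approximate power of a two-sided, 1-df additive association test.

    n        sample size
    maf      minor allele frequency
    beta_sd  per-allele effect, in trait standard deviations
    alpha    significance threshold
    """
    std_normal = NormalDist()
    # Expected z-score: sqrt(N * trait variance explained by the SNP).
    expected_z = sqrt(n * 2 * maf * (1 - maf)) * abs(beta_sd)
    z_crit = std_normal.inv_cdf(1 - alpha / 2)
    return std_normal.cdf(expected_z - z_crit) + std_normal.cdf(-expected_z - z_crit)


# Illustrative inputs only (hypothetical, not values from the study).
N_REPLICATION = 100_000
BETA_SD = 0.02
ALPHA = 0.05 / 34  # assumed Bonferroni threshold for 34 tested signals

for maf in (0.01, 0.05, 0.30):
    power = replication_power(N_REPLICATION, maf, BETA_SD, ALPHA)
    print(f"MAF {maf:.2f}: power ~ {power:.2f}")
```

For a fixed per-allele effect, the variance explained scales with 2f(1 - f), so under these assumptions a rare variant needs a far larger sample than a common one to reach the same power.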

To further assess the novelty of the 36 genes, we searched the literature for any evidence of association with lung function and related outcomes, using PhenoScanner (32) and HuGE Navigator (17) and checking references of relevant papers. We found previous evidence for just four of the 36 genes. An intergenic variant annotated to NCOR2 (NCOR2/SCARB1 locus) was previously associated with adult FEV1 (26) but did not replicate in the study by Shrine and colleagues (9). NCOR2 was also associated with FVC in young adults but could only be replicated in children (15); the same study identified, but did not replicate, KAT8. SOX9 was associated with adult FEV1 in a study that included SNP by smoking interaction (33). NR3C1 was previously identified in a GWAS of spirometrically defined COPD (34); recently, an intergenic variant annotated to NR3C1 (NR3C1/ARHGAP26 locus) was also associated with FEV1/FVC in a methodological study incorporating functional genomics data to increase power in the GWAS (35). Interestingly, two additional genes were previously associated with asthma-related phenotypes, RUNX1 with pediatric asthma (36) and IgE concentrations (37) in two candidate-gene studies, and ITGB5 with airway hyperresponsiveness in individuals with asthma in a GWAS (38).

Among all 55 genes, the large majority show an association with either FVC or FEV1/FVC, but not both, which is not surprising, given that these parameters identify distinct patterns of lung function impairment. In population-based epidemiological studies, a low FVC is a marker of restriction, indicating small lung volumes, and is a strong predictor of all-cause mortality, even in the absence of chronic lung disease (1). Similarly, a low FEV1/FVC is an epidemiological marker of COPD, which is projected to become the third leading cause of death worldwide by 2020 (39). Knowledge of whether a lung development gene affects restriction, obstruction, or both links the development of lung structure with function and points to underlying mechanistic pathways that will inform future experimental follow-up studies.

Biological Interpretation

Our finding that 53 of the 55 genes identified in this study fall into four biological categories that regulate organ size and cell integrity indicates the particular importance of these processes for adult lung health. Growth factors, the best-represented gene category, are diffusible signaling proteins that exert a variety of biological responses important for organ generation, including proliferation, morphogenesis, and angiogenesis. They are also important for maintaining homeostasis in adulthood. Abnormal production of growth factors can lead to lung diseases; for example, perturbed angiogenic growth factors can lead to bronchopulmonary dysplasia (40), and overactive TGF-β signaling can lead to idiopathic pulmonary fibrosis (41). Within this group, Wnt-signaling genes are highly represented; in addition to being critically required for all stages of lung generation, the Wnt-signaling pathway has an important role in maintaining lung health by stimulating repair after injury (42, 43).

Genes encoding transcription factors are also well represented; these regulate the expression of multiple genes by binding to specific DNA sequences to activate or repress gene transcription. During development, transcriptional regulators control growth in a highly ordered spatiotemporal manner (44), the disruption of which can affect organ size, architecture, and function. Within this category, we identified genes involved in vitamin A and glucocorticoid signaling. Vitamin A signaling has an important role not only in lung development but also in adult lung structural homeostasis, with abnormal vitamin A signaling associated with histological emphysema, driven possibly via aberrant endothelial cell repair in patients with COPD (45, 46). Interestingly, we also identified transcription factors, such as the homeobox genes HOXA1 and HOXB4, which themselves are transcriptional targets of other genes that we identified, including RARA, RARB, WNT2B, WNT7B, and WNT9A.

Some of the genes identified relate to cell-to-cell adhesion and the cytoskeleton. Cell-to-cell adhesion is important to maintain tissue integrity; its breakdown and subsequent loss of epithelial barrier function is also frequently a component of lung disease (47). Three of the genes identified (ACTN3, ACTN4, and TNS1) act on the actin cytoskeleton, a network of intracellular fibers that are integral to both cell-to-cell and cell-to-ECM interactions and that are required to maintain cell integrity and movement (48).

Finally, we identified genes related to the ECM, which, in the lungs, provides not only a scaffold to support cells but also a source of biological signals and mechanical strength to maintain cell integrity and health through a bioactive environment interacting with surrounding cells (48, 49). Pathological changes to the ECM are a recognized hallmark of lung diseases, including asthma, COPD, and idiopathic pulmonary fibrosis, and current regenerative medicine strategies are exploring the efficacy of targeting the ECM as a possible avenue for the treatment of lung diseases (49). Included in this category is the gene encoding elastin (ELN); elastin is a major component of the ECM that not only links alveoli to the conducting airways but also is a key determinant of the elastic recoil in the lung. We speculate that the association of ELN with FEV1/FVC and COPD (FEV1/FVC < LLN) in our data might reflect an effect of this gene on elastic recoil and the risk of emphysema.

Strengths and Limitations

Despite the high heritability of lung function, genetic variants identified by agnostic GWASs still explain only a small proportion of its variability in the population (9). By using a hypothesis-driven approach, we have identified a substantial number of additional variants associated with lung function, especially polymorphisms with relatively low allele frequencies, which may not have reached strict genome-wide significance thresholds in previous GWASs. Although this suggests that focusing the analyses on many genes related to a pathophysiological process believed to affect the outcome is a promising approach, a practical issue is how to select the genes to be investigated. Our list of about 400 genes was previously prepared following a thorough process based on experts’ knowledge from animal and human studies, integrated with data from bioinformatic tools (15). However, we acknowledge that there is a degree of subjectivity involved in this method.

Epidemiological studies have linked the early life environment to adult lung function and COPD, and it is assumed that these associations are mediated through impacts on lung growth and development. By demonstrating clear associations of multiple lung development genes with adult lung function, we have provided more direct evidence that lung development plays a crucial role in adult lung health. Furthermore, in contrast to observational studies implicating the early environment, our genetic findings are unlikely to be affected by classical environmental and lifestyle confounders, and this strengthens causal inference. That said, given the cross-sectional nature of our study and the age of UKB participants, measured lung function will reflect a combination of the maximal level attained through growth and subsequent decline. We therefore cannot determine whether the implicated lung development genes are only influencing the former or whether they may also be influencing repair and, hence, combating insults such as smoking, which can cause accelerated decline later in life. The broadly similar results in smokers and nonsmokers do not favor one explanation over the other.

An obstructive pattern, indicated by a low FEV1/FVC ratio, can be caused by respiratory conditions other than COPD, including bronchiectasis, bronchiolitis, and cystic fibrosis, but these are uncommon in the general population. Asthma is more common, however, and can also result in a low prebronchodilator FEV1/FVC, which cannot exclude the presence of reversible obstruction. As post-bronchodilator lung function was not measured in the UKB, we performed sensitivity analyses excluding individuals with a self-reported doctor diagnosis of asthma, and these confirmed the results of the main analyses (data not shown).

Future Research

Further detailed investigation of our findings is required to identify the underlying causal variants and possible pathogenetic mechanisms. For some of the identified genes, there is experimental evidence of an ongoing role in adult lung homeostasis and repair through alveolar maintenance and regeneration after injury later in life, with potential implications for understanding the rapid decline of lung function and identifying future pharmacological targets. Longitudinal cohorts offer an opportunity to examine associations with lung function trajectories across the life course. If lung development genes are acting primarily on growth and development, we might expect to see stronger associations in children and young adults before lung function decline has commenced. Conversely, if they are acting primarily on repair, stronger effects might be seen on decline in older individuals. Extending the investigation of lung development genes to incorporate cross-sectional data on children, adolescents, and young adults from different studies would also help disentangle effects on lung growth from those on lung regeneration. However, such investigation would require very large sample sizes to ensure adequate power to detect signals with relatively small effects and/or low allele frequencies such as those that we have identified.

We have taken a conservative approach that only considered one best SNP per gene, but a gene may contain multiple independent signals. Similar to GWAS findings, the majority of our novel 36 signals are intronic variants, which might exert their effect by modifying the expression of other genes; however, most of them do affect the expression of the genes they lie within. These signals could be further investigated in relevant human cell lines or animal models, for example by using gene editing to delete a small region that includes the SNP identified, as recently done by Parker and colleagues (50).
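
The "one best SNP per gene" rule can be sketched as below, assuming that "best" means the smallest association P value within each gene (the selection criterion is not restated in this section). Gene names, SNP identifiers, and P values are invented for illustration.

```python
# Hypothetical association results as (gene, snp, p-value); all values invented.
ASSOCIATIONS = [
    ("GENE_A", "rs_a1", 3e-6),
    ("GENE_A", "rs_a2", 4e-9),
    ("GENE_B", "rs_b1", 2e-4),
    ("GENE_B", "rs_b2", 0.03),
    ("GENE_C", "rs_c1", 0.40),
]


def best_snp_per_gene(associations):
    """Keep, for each gene, the single SNP with the smallest p-value."""
    best = {}
    for gene, snp, p in associations:
        if gene not in best or p < best[gene][1]:
            best[gene] = (snp, p)
    return best


best = best_snp_per_gene(ASSOCIATIONS)
for gene, (snp, p) in sorted(best.items()):
    print(f"{gene}: {snp} (p = {p:g})")
```

By construction, this rule discards any secondary signal within a gene (here, rs_a1 in GENE_A), which is why the approach is conservative when a gene contains multiple independent signals.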

Finally, further research is needed to clarify whether the identified genes act on lung function independently or through gene–gene or gene–environment interactions. For example, NCOR2 might affect lung function through its effects on vitamin A metabolism via the RAR or alternatively through interaction with genes encoding non–nuclear receptor transcription factors like Foxp1, which are also important for lung development (51). Another possibility is a gene–environment interaction, in its effect on lung function, between lung development genes involved in vitamin A metabolism and vitamin A intake; for instance, the beneficial effect of prenatal vitamin A supplementation on offspring lung function (52) may be modified by vitamin A–related genes.

In conclusion, our findings show a clear effect of lung development genes on adult lung function, influencing both restrictive and obstructive patterns. Furthermore, they demonstrate how genetic knowledge of relevant biological processes can be used to help identify novel genetic associations for complex traits. Further investigation of these developmental pathways could ultimately lead to druggable targets, with the aim of optimizing adult lung health and preventing COPD.

This research was conducted using the UK Biobank resource (application number 19136). The authors thank the participants, field workers, and data managers in the UK Biobank for all their time and efforts.

1. Burney PG, Hooper R. Forced vital capacity, airway obstruction and survival in a general population sample from the USA. Thorax 2011;66:49–54.
2. Gupta RP, Strachan DP. Ventilatory function as a predictor of mortality in lifelong non-smokers: evidence from large British cohort studies. BMJ Open 2017;7:e015381.
3. Martinez FD. Early-life origins of chronic obstructive pulmonary disease. N Engl J Med 2016;375:871–878.
4. Melén E, Guerra S. Recent advances in understanding lung function development. F1000 Res 2017;6:726.
5. Stocks J, Hislop A, Sonnappa S. Early lung development: lifelong effect on respiratory health and disease. Lancet Respir Med 2013;1:728–742.
6. Lange P, Celli B, Agustí A, Boje Jensen G, Divo M, Faner R, et al. Lung-function trajectories leading to chronic obstructive pulmonary disease. N Engl J Med 2015;373:111–122.
7. Krauss-Etschmann S, Bush A, Bellusci S, Brusselle GG, Dahlén SE, Dehmel S, et al. Of flies, mice and men: a systematic approach to understanding the early life origins of chronic lung disease. Thorax 2013;68:380–384.
8. John C, Soler Artigas M, Hui J, Nielsen SF, Rafaels N, Paré PD, et al. Genetic variants affecting cross-sectional lung function in adults show little or no effect on longitudinal lung function decline. Thorax 2017;72:400–408.
9. Shrine N, Guyatt AL, Erzurumluoglu AM, Jackson VE, Hobbs BD, Melbourne CA, et al.; Understanding Society Scientific Group. New genetic signals for lung function highlight pathways and chronic obstructive pulmonary disease associations across multiple ancestries. Nat Genet 2019;51:481–493.
10. Collins SA, Lucas JS, Inskip HM, Godfrey KM, Roberts G, Holloway JW; Southampton Women’s Survey Study Group. HHIP, HDAC4, NCR3 and RARB polymorphisms affect fetal, childhood and adult lung function. Eur Respir J 2013;41:756–757.
11. Miller S, Melén E, Merid SK, Hall IP, Sayers I. Genes associated with polymorphic variants predicting lung function are differentially expressed during human lung development. Respir Res 2016;17:95.
12. Sudlow C, Gallacher J, Allen N, Beral V, Burton P, Danesh J, et al. UK biobank: an open access resource for identifying the causes of a wide range of complex diseases of middle and old age. PLoS Med 2015;12:e1001779.
13. Bycroft C, Freeman C, Petkova D, Band G, Elliott LT, Sharp K, et al. The UK Biobank resource with deep phenotyping and genomic data. Nature 2018;562:203–209.
14. Wain LV, Shrine N, Miller S, Jackson VE, Ntalla I, Soler Artigas M, et al.; UK Brain Expression Consortium (UKBEC); OxGSK Consortium. Novel insights into the genetics of smoking behaviour, lung function, and chronic obstructive pulmonary disease (UK BiLEVE): a genetic association study in UK Biobank. Lancet Respir Med 2015;3:769–781.
15. Minelli C, Dean CH, Hind M, Alves AC, Amaral AF, Siroux V, et al.; SpiroMeta consortium; CHARGE consortium. Association of forced vital capacity with the developmental gene NCOR2. PLoS One 2016;11:e0147388.
16. Kanehisa M, Goto S, Sato Y, Furumichi M, Tanabe M. KEGG for integration and interpretation of large-scale molecular data sets. Nucleic Acids Res 2012;40:D109–D114.
17. Yu W, Gwinn M, Clyne M, Yesupriya A, Khoury MJ. A navigator for human genome epidemiology. Nat Genet 2008;40:124–125.
18. Wu C, Orozco C, Boyer J, Leglise M, Goodale J, Batalov S, et al. BioGPS: an extensible and customizable portal for querying and organizing gene annotation resources. Genome Biol 2009;10:R130.
19. Repapi E, Sayers I, Wain LV, Burton PR, Johnson T, Obeidat M, et al.; Wellcome Trust Case Control Consortium; NSHD Respiratory Study Team. Genome-wide association study identifies five loci associated with lung function. Nat Genet 2010;42:36–44.
20. Wilk JB, Chen TH, Gottlieb DJ, Walter RE, Nagle MW, Brandler BJ, et al. A genome-wide association study of pulmonary function measures in the Framingham Heart Study. PLoS Genet 2009;5:e1000429.
21. Hancock DB, Eijgelsheim M, Wilk JB, Gharib SA, Loehr LR, Marciante KD, et al. Meta-analyses of genome-wide association studies identify multiple loci associated with pulmonary function. Nat Genet 2010;42:45–52.
22. Soler Artigas M, Loth DW, Wain LV, Gharib SA, Obeidat M, Tang W, et al. Genome-wide association and large-scale follow up identifies 16 new loci influencing lung function. Nat Genet 2011;43:1082–1090.
23. Cho MH, McDonald ML, Zhou X, Mattheisen M, Castaldi PJ, Hersh CP, et al.; NETT Genetics Investigators; ICGN Investigators; ECLIPSE Investigators; COPDGene Investigators. Risk loci for chronic obstructive pulmonary disease: a genome-wide association study and meta-analysis. Lancet Respir Med 2014;2:214–225.
24. Hobbs BD, de Jong K, Lamontagne M, Bossé Y, Shrine N, Artigas MS, et al.; COPDGene Investigators; ECLIPSE Investigators; LifeLines Investigators; SPIROMICS Research Group; International COPD Genetics Network Investigators; UK BiLEVE Investigators; International COPD Genetics Consortium. Genetic loci associated with chronic obstructive pulmonary disease overlap with loci for lung function and pulmonary fibrosis. Nat Genet 2017;49:426–432.
25. Jackson VE, Latourelle JC, Wain LV, Smith AV, Grove ML, Bartz TM, et al.; Understanding Society Scientific Group. Meta-analysis of exome array data identifies six novel genetic loci for lung function. Wellcome Open Res 2018;3:4.
26. Wyss AB, Sofer T, Lee MK, Terzikhan N, Nguyen JN, Lahousse L, et al. Multiethnic meta-analysis identifies ancestry-specific and cross-ancestry loci for pulmonary function. Nat Commun 2018;9:2976.
27. Soler Artigas M, Wain LV, Miller S, Kheirallah AK, Huffman JE, Ntalla I, et al.; UK BiLEVE. Sixteen new lung function signals identified through 1000 Genomes Project reference panel imputation. Nat Commun 2015;6:8658.
28. Loh PR, Tucker G, Bulik-Sullivan BK, Vilhjálmsson BJ, Finucane HK, Salem RM, et al. Efficient Bayesian mixed-model analysis increases association power in large cohorts. Nat Genet 2015;47:284–290.
29. Hankinson JL, Odencrantz JR, Fedan KB. Spirometric reference values from a sample of the general U.S. population. Am J Respir Crit Care Med 1999;159:179–187.
30. McLaren W, Pritchard B, Rios D, Chen Y, Flicek P, Cunningham F. Deriving the consequences of genomic variants with the ensembl API and SNP effect predictor. Bioinformatics 2010;26:2069–2070.
31. Võsa U, Claringbould A, Westra H-J, Bonder MJ, Deelen P, Zeng B, et al. Unraveling the polygenic architecture of complex traits using blood eQTL metaanalysis [preprint]. bioRxiv; 2018 [accessed 2019 Jul 9]. Available from: https://www.biorxiv.org/content/10.1101/447367v1.
32. Staley JR, Blackshaw J, Kamat MA, Ellis S, Surendran P, Sun BB, et al. PhenoScanner: a database of human genotype-phenotype associations. Bioinformatics 2016;32:3207–3209.
33. Hancock DB, Soler Artigas M, Gharib SA, Henry A, Manichaikul A, Ramasamy A, et al. Genome-wide joint meta-analysis of SNP and SNP-by-smoking interaction identifies novel loci for pulmonary function. PLoS Genet 2012;8:e1003098.
34. Schwabe K, Vacca G, Dück R, Gillissen A. Glucocorticoid receptor gene polymorphisms and potential association to chronic obstructive pulmonary disease susceptibility and severity. Eur J Med Res 2009;14:210–215.
35. Kichaev G, Bhatia G, Loh PR, Gazal S, Burch K, Freund MK, et al. Leveraging polygenic functional enrichment to improve GWAS power. Am J Hum Genet 2019;104:65–75.
36. Haley KJ, Lasky-Su J, Manoli SE, Smith LA, Shahsafaei A, Weiss ST, et al. RUNX transcription factors: association with pediatric asthma and modulated by maternal smoking. Am J Physiol Lung Cell Mol Physiol 2011;301:L693–L701.
37. Chae SC, Park BL, Park CS, Ryu HJ, Yang YS, Lee SO, et al. Putative association of RUNX1 polymorphisms with IgE levels in a Korean population. Exp Mol Med 2006;38:583–588.
38. Himes BE, Qiu W, Klanderman B, Ziniti J, Senter-Sylvia J, Szefler SJ, et al. ITGB5 and AGFG1 variants are associated with severity of airway responsiveness. BMC Med Genet 2013;14:86.
39. GBD 2016 Causes of Death Collaborators. Global, regional, and national age-sex specific mortality for 264 causes of death, 1980-2016: a systematic analysis for the Global Burden of Disease Study 2016. Lancet 2017;390:1151–1210.
40. Thébaud B. Angiogenesis in lung development, injury and repair: implications for chronic lung disease of prematurity. Neonatology 2007;91:291–297.
41. Saito A, Horie M, Nagase T. TGF-β signaling in lung health and disease. Int J Mol Sci 2018;19:2460.
42. Frank DB, Peng T, Zepp JA, Snitow M, Vincent TL, Penkala IJ, et al. Emergence of a wave of Wnt signaling that regulates lung alveologenesis by controlling epithelial self-renewal and differentiation. Cell Rep 2016;17:2312–2325.
43. Logan CY, Nusse R. The Wnt signaling pathway in development and disease. Annu Rev Cell Dev Biol 2004;20:781–810.
44. Costa RH, Kalinichenko VV, Lim L. Transcription factors in mouse lung development and function. Am J Physiol Lung Cell Mol Physiol 2001;280:L823–L838.
45. Ng-Blichfeldt JP, Alçada J, Montero MA, Dean CH, Griesenbach U, Griffiths MJ, et al. Deficient retinoid-driven angiogenesis may contribute to failure of adult human lung regeneration in emphysema. Thorax 2017;72:510–521.
46. Massaro D, Massaro GD. Lung development, lung function, and retinoids. N Engl J Med 2010;362:1829–1831.
47. Gon Y, Hashimoto S. Role of airway epithelial barrier dysfunction in pathogenesis of asthma. Allergol Int 2018;67:12–17.
48. Yu W, Datta A, Leroy P, O’Brien LE, Mak G, Jou TS, et al. Beta1-integrin orients epithelial polarity via Rac1 and laminin. Mol Biol Cell 2005;16:433–445.
49. Burgess JK, Mauad T, Tjin G, Karlsson JC, Westergren-Thorsson G. The extracellular matrix: the under-recognized element in lung disease? J Pathol 2016;240:397–409.
50. Parker MM, Hao Y, Guo F, Pham B, Chase R, Platig J, et al. Identification of an emphysema-associated genetic variant near TGFB2 with regulatory effects in lung fibroblasts. eLife 2019;8:e42720.
51. Mottis A, Mouchiroud L, Auwerx J. Emerging roles of the corepressors NCoR1 and SMRT in homeostasis. Genes Dev 2013;27:819–835.
52. Checkley W, West KP Jr, Wise RA, Baldwin MR, Wu L, LeClerq SC, et al. Maternal vitamin A supplementation and lung function in offspring. N Engl J Med 2010;362:1784–1794.
Correspondence and requests for reprints should be addressed to Cosetta Minelli, M.D., Ph.D., National Heart and Lung Institute, Imperial College London, Emmanuel Kaye Building, 1B Manresa Road, London SW3 6LR, UK. E-mail: .

*Co–first authors.

Co–senior authors.

M.P. was funded by the National Heart and Lung Institute Foundation. C.H.D. and M.H. are supported by the Royal Brompton and Harefield Hospitals Charity. M.H. is also supported by an award from Mr. and Mrs. Youssef Mansour. A.B.W. and S.J.L. are supported by the Intramural Research Program of the NIH, National Institute of Environmental Health Sciences NIH ZO1 ES43012. A.B.W. is also supported by contract no. HHSN273201600003I. Infrastructure for the CHARGE Consortium is supported by the NHLBI grant R01HL105756.

Author Contributions: L.P., M.P., M.H., C.H.D., and C.M. designed the study. L.P. and M.P. performed the statistical analyses. S.O.S., M.H., C.H.D., and C.M. wrote the manuscript. P.G.J.B. contributed to the interpretation of the data. A.B.W. and S.J.L. contributed to the replication of the findings. All authors contributed to and approved the final version of the manuscript.

This article has a related editorial.

This article has an online supplement, which is accessible from this issue’s table of contents at www.atsjournals.org.

Originally Published in Press as DOI: 10.1164/rccm.201912-2338OC on May 11, 2020

Author disclosures are available with the text of this article at www.atsjournals.org.

American Journal of Respiratory and Critical Care Medicine, Volume 202, Issue 6
