Genetic loci and prioritization of genes for kidney function decline derived from a meta-analysis of 62 longitudinal genome-wide association studies

Estimated glomerular filtration rate (eGFR) reflects kidney function. Progressive eGFR-decline can lead to kidney failure, necessitating dialysis or transplantation. Hundreds of loci from genome-wide association studies (GWAS) for eGFR help explain population cross section variability. Since the contribution of these or other loci to eGFR-decline remains largely unknown, we derived GWAS for annual eGFR-decline and meta-analyzed 62 longitudinal studies with eGFR assessed twice over time in all 343,339 individuals and in high-risk groups. We also explored different covariate adjustment. Twelve genome-wide significant independent variants for eGFR-decline unadjusted or adjusted for eGFR-baseline (11 novel, one known for this phenotype), including nine variants robustly associated across models were identified. All loci for eGFR-decline were known for cross-sectional eGFR and thus distinguished a subgroup of eGFR loci. Seven of the nine variants showed variant-by-age interaction on eGFR cross section (further about 350,000 individuals), which linked genetic associations for eGFR-decline with age-dependency of genetic cross-section associations. Clinically important were two to four-fold greater genetic effects on eGFR-decline in high-risk subgroups. Five variants associated also with chronic kidney disease progression mapped to genes with functional in-silico evidence (UMOD, SPATA7, GALNTL5, TPPP). An unfavorable versus favorable nine-variant genetic profile showed increased risk odds ratios of 1.35 for kidney failure (95% confidence intervals 1.03–1.77) and 1.27 for acute kidney injury (95% confidence intervals 1.08–1.50) in over 2000 cases each, with matched controls). Thus, we provide a large data resource, genetic loci, and prioritized genes for kidney function decline, which help inform drug development pipelines revealing important insights into the age-dependency of kidney function genetics.


Supplementary Note S7
Narrow-sense heritability

Supplementary Figures
Supplementary Figure

General approach for GWAS meta-analysis
An analysis plan and standardized scripts for phenotype generation and GWAS analyses were developed and implemented in all 61 CKDGen studies and UK Biobank. The 61 CKDGen studies consisted of 58 studies that were long-term partners of CKDGen ("old" CKDGen studies) and three studies that have joined CKDGen more recently allowing for more elaborate analyses (AugUR, HUNT and MGI; extended analysis plan, see below). Most studies were population-based and thus including individuals with specific kidney diseases according to the prevalence in the general population. Each study conducted GWAS analyses according to this pre-defined plan, separately by ancestry (if applicable). Ancestry was defined by genetic principal components or participants' self-report. For each study, phenotypic information and genome-wide summary statistics per SNP were transferred to the meta-analysis centers.
Each study had been conducted according to the declaration of Helsinki. The studies have been approved by each local ethics committee. All participants in all studies provided written informed consent.
Meta-analyses were conducted, significant variants identified and respective locus regions selected. A GWAS across all available studies was shown to be advantageous over conducting a discovery followed by a replication stage on selected variants S1,S2 . Therefore, rather than conducting a discovery GWAS in old CKDGen studies and a replication in recently joined CKDGen studies and UK Biobank, we included all studies into the GWAS meta-analysis on eGFR decline.

Phenotype definition
In each contributing study, serum creatinine was measured at least two times, utilizing two measurements at largest time distance (study-specific details in Supplementary Table S1).
When measurements were obtained by Jaffé assay (before 2009), creatinine measurements were calibrated (multiplying by 0.95 S3 ). Serum creatinine measured at baseline and follow-up was used to estimate eGFR at baseline and follow-up, respectively, according to the Chronic Kidney Disease Epidemiology Collaboration (CKD-EPI) equation S4 . This equation contains an age, sex, and ancestry term for a best fit of creatinine-based eGFR to measured GFR. At baseline and follow-up, eGFR was winsorized at 15 and 200 mL/min/1.73m². Annual eGFRdecline was defined as "-(eGFR at follow-up -eGFR at baseline) / number of years of followup"; thus, eGFR-decline is positive when eGFR is lower at follow-up compared to baseline and comparable across studies with different follow-up length.
In each study, eGFR-decline was analyzed overall and restricted to individuals with CKD or DM at baseline. CKD at baseline was defined as eGFR<60 mL/min/1.73m² at baseline.
In CKDGen, DM at baseline was defined as fasting plasma glucose ≥126 mg/dl (7.0 mmol/L) or diabetes therapy, or (fasting glucose unavailable) as non-fasting plasma glucose ≥200 mg/dl (11.0 mmol/L) or diabetes therapy, or (glucose unavailable) as self-reported diabetes. For UK Biobank, DM was defined as HbA1c≥48 mmol/mol or diabetes therapy.

Study-specific generation of outcome variables according different adjustment models
In each study, different models for the SNP-association with annual eGFR-decline as outcome were computed genome-wide: (i) adjusted for age, sex, and DM and applied to all individuals ("decline DM-adjusted"); (ii) adjusted for age and sex restricted to individuals with DM or CKD at baseline ("decline in DM", "decline in CKD"). In the recently joined CKDGen studies and UK Biobank, an extended suite of models was applied: additional analyses were (iii) adjusted for age and sex using all individuals ("decline"), (vi) adjusted for age, sex and eGFR baseline using all individuals (eGFR baseline on log-scale, ln(eGFR), "decline adjusted for baseline").
Further study-specific adjustments were applied (as applicable), including genetic principal components to account for population substructure.
These adjustments were implemented by generating residuals of annual eGFR-decline adjusted for the respective covariates and using these residuals as outcome in GWAS. This is a standard approach yielding comparable results to using the unadjusted phenotype as outcome in GWAS adjusting for the respective covariates. The utilized approach implies fewer covariates in GWAS being computationally more efficient. We standardized the creation of these outcome variables for GWAS by providing a centrally developed script, which also provided descriptive statistics on the study-specific phenotype.

Genotyping, imputation, and study-specific GWAS
In each study, genotyping was conducted using Affymetrix and Illumina arrays (Supplementary Table S2). Imputation was performed using 1000 Genomes S5 phase 1 or phase 3, the Haplotype Reference Consortium S6 v1.1 or customized reference panels, annotating all variants on the GRCh build 37 reference build; imputed genotypes were coded as allelic dosages and imputation quality was provided as IMPUTE2 S7 info score, MACH/minimac S8 RSQ or similar; quality control before and after imputation was conducted study-specifically (Supplementary Table S2).
In each study, GWAS analyses were conducted according to the centrally defined analysis plan. CKDGen studies included different ancestries (European, African American, East Asian, South Asian, and Hispanic) and contributed analyses ancestry-specific. Since most CKDGen studies individuals were European ancestry (94.90%), UK Biobank analyses focused on unrelated European ancestry individuals where two assessments of eGFR were available (n=15,442). For each GWAS, linear regression on the respective outcome variable was computed per SNP (modelled as allele dosages linearly) adjusted for principle components and other study-specific covariates as applicable (Supplementary Table S2).
This yielded three GWAS results for "old" and recently joined CKDGen studies (decline DMadjusted, decline in DM, decline in CKD) and two further GWAS results for recently joined CKDGen studies and UK Biobank (decline, decline adjusted for baseline). Summary statistics were collected and quality controlled centrally with GWAtoolbox S9 .

Study-specific summary statistics for decline adjusted for baseline
As noted above, GWAS results on eGFR-decline adjusted for eGFR-baseline was not available in all studies. GWAS meta-analyses logistics in so many studies are highly complex; it is not trivial to "add" analyses applying other models. However, there is mathematical help to facilitate covariate adjustment post-hoc, i.e. by formula, based on GWAS summary statistics unadjusted for eGFR-baseline and GWAS summary statistics for eGFR-baseline and studyspecific phenotype information S10 . We demonstrate how this works (Supplementary Note S1) and that it works in this setting by validation studies: we compared formula-derived summary statistics for baseline-adjusted decline with model-computed baseline-adjusted decline in a subset of studies (the recently joined CKDGen studies, UK Biobank, selected "old" CKDGen studies). For eGFR-decline adjusted for baseline in the following, we used formula-derived summary statistics for the "old" CKDGen studies and computed summary statistics for the recently joined studies and UK Biobank.

Meta-analyses of GWAS summary statistics
Before meta-analysis, we excluded, from each study file, multi-allelic variants, variants with a Minor Allele Count <10, and variants with an imputation quality <0.6 (R² from minimac S8 or info score from Impute S7 ). Per study, genomic control (GC) correction was applied when the GCfactor lambda was >1. We excluded a study for a specific analysis, when it contributed <100 individuals after quality control for this analysis.
Per model, we conducted a fixed-effects inverse-variance-weighted meta-analysis using metal S11 . To account for the sequential recruitment of studies, we meta-analyzed pervariant summary statistics across "old" CKDGen studies (GC-corrected) and across recently joined CKDGen studies plus UK Biobank (GC-corrected), and then meta-analyzed these two (again GC-corrected, Supplementary Figure S1). After meta-analysis, only variants present in ≥50% of GWAS files and minor allele frequency ≥1% were retained for further analyses.

Identification of associated loci
For our GWAS search, we selected genome-wide significant variants (P<5.00x10 -8 ) in the meta-analyzed summary statistics and identified independent locus lead variants by an iterative approach, as applied previously S12 : (i) from all genome-wide significant variants, we selected the variant with the smallest P-value as the first lead variant and defined this variant's locus region as lead variant ±500kB, (ii) omitting this identified region, we selected the next variant with the smallest P-value, and (iii) repeated this procedure until no further variant with P-value<5.00x10 -8 was observed. The MHC region (chr6:28.5-33.5MB) was considered a single locus. We checked for overlapping loci, but there were none.
For the candidate-based approach, we used the 265 lead variants previously reported for association with cross-sectional ln(eGFR) S12 , excluded the locus regions identified by the GWAS search, and, for the remaining candidate variants, judged significance at Bonferronicorrected level.
For identified variants, we evaluated ancestry-related heterogeneity using MR-MEGA v.0.1.5 (Meta-Regression of MultiEthnic Genetic Association S13 , including three principle components. We also conducted sensitivity analyses incorporating further models of covariate adjustment for identified eGFR-decline associations in a validation meta-analysis.

SNP-by-age interaction on cross-sectional eGFR
We investigated the lead SNPs identified for (creatinine-based) eGFR-decline for SNP-by-age interaction on cross-sectional eGFR (based on creatinine or cystatin C, eGFRcrea, eGFRcys).
For this we used data that was independent of the SNP identification step: unrelated European ancestry UK Biobank individuals with one eGFRcrea or eGFRcys assessment excluding the 15,442 individuals in the decline GWAS (yielding > 350,000 individuals).
For each SNP, we applied two linear regression models, one each for the outcome eGFRcrea or eGFRcys, using the covariates age, sex, SNP, SNP-by-age interaction term, and four principal components (age centered at 50 years). We modelled (i) the main age effect on the outcome allowing for non-linear effects (to avoid spurious effects from non-linear main age effect when modelling age linearly), (ii) the main SNP effect linearly per allele dosage, and (iii) for the SNP-by-age interaction effect, the SNP-effect was modelled linearly per allele dosage and the age effect was allowed to vary non-linearly (smooth function, varying coefficient model S14 , penalized thin-plate regression splines, mgcv-package in R S15 ). In a second analysis, the age effect in the SNP-by-age interaction was modelled linearly (i.e. linear effects for both SNP and age in the SNP-by-age term). We judged significance of the interaction at Bonferronicorrected level.

Genetic effect sizes and GRS analysis for eGFR-decline
We provide SNP-specific effect sizes on eGFR-decline in mL/min/1.73m 2 per year over all individuals and focused on individuals with DM at baseline or CKD at baseline. We provide cumulative effects by GRS analysis in the population-based study HUNT (19-90 years old, European ancestry, up to 21 years of follow-up, mean of age-/sex-adjusted residuals for eGFR-

GRS analyses for ESKD and AKI
We were interested in whether the GRS across the variants identified for eGFR-decline showed association with severe clinical endpoints, ESKD and AKI. For this, we used three case sets for ESKD and one case set for AKI as well as controls (eGFR>60 mL/min/1.73m²) from population-based studies frequency-matched with regard to age-group and sex as described previously S16  For each of these four case-control studies, we retrieved the respective SNPs and computed a weighted GRS across identified variants for each individual as described above.
We tested the quantitative GRS with ESKD or AKI. We applied a one-sided test, since we were only interested in this association when the GRS increased the odds of ESKD or AKI. We also compared individuals with high versus low GRS (≥95 th GRS percentile, ≤5 th percentile and ≥90 th versus ≤10 th GRS percentile, defined in UK Biobank individuals excluding individuals in eGFR-decline GWAS) and tested (one-sided) for increased odds of ESKD (meta-analysis across the three studies) or AKI. Associations are derived via logistic regression adjusted for matching variables age-groups and sex (for AKI additionally for the first two principal components).

Supplementary Note S1: Equivalence of DM-adjusted versus not DM-adjusted GWAS on eGFR-decline in the validation meta-analysis
In the recently joined studies (HUNT, MGI, AugUR) and UK Biobank, we had more adjustment models computed for GWAS on eGFR-decline, to better understand similarities and differences. In these, we compared the GWAS summary statistics for eGFR-decline adjusted for DM-status to GWAS without adjustment for DM-status (i.e. GWAS on age-and sexadjusted residuals and with and without adjustment for DM-status at baseline). In each study, we found precisely the same beta-estimates and standard errors (SE): (i) for the 265 SNPs identified previously for cross-sectional eGFR S12 , for which we had a prior hypothesis that these contained the SNPs associated with eGFR-decline, as well as (ii) genome-wide where most of the SNP-associations are under the Null (Supplementary Figure S4A).
We added further "old" CKDGen studies to substantiate these findings in further studies and in an expanded validation meta-analysis (n=103,970). Again, we found DM-adjusted and not DM-adjusted beta-estimates and SEs to be precisely the same (Supplementary Figure S4A). Of note, this validation meta-analysis included general population studies and studies of specific scope: hospital-based (MGI), focused on individuals aged 70+ years (AugUR), or focused on individuals with chronic kidney disease (GCKD).
Given this equivalence, we did not distinguish any more between results DMunadjusted or DM-adjusted. Here, we assume that we know the standard deviation of C and Y, sdC and sdY, respectively, the phenotypic correlation, (estimated as Pearson correlation coefficient between Y and C) and the genetic correlation between Y and C, !" # , %, (using all genetic effects for Y and C genome-wide for estimation as reasonable proxy). When is zero, the adjusted model SNP-effects, , are the same as the unadjusted model SNP-effects, .
Alternatively, when we have GWAS summary statistics from the adjusted model, We apply this on our example to summary statistics for annual eGFR-decline adjusted for eGFR-baseline (BL): given the beta-estimates for decline unadjusted for ln(eGFRcreaBL) (in fact, residuals adjusted for age, sex), +,-./+ , and the beta-estimates for ln(eGFRcreaBL) (i.e. residuals adjusted for age and sex), 01 , we can "adjust" results for BL using the formula, i.e., derive the beta-estimates for decline adjusted for BL (residuals adjusted for age and sex), Effect sizes here are given for the BL-lowering effect allele (which is usually the declineincreasing allele). The can also be written as . This shows that the effect size of decline adjusted for BL standardized to the scale of standardized effects (i.e. divided by () +,-./+ ) is the sum of (i) the (standardized) effect size of decline unadjusted (i.e. the vertical distance of this effect to the x-axis in a /() versus /() plane) and (ii) the vertical distance from the intersection point of the x-axis at /() (i.e. < 0 when the coding allele is the -lowering allele) to the phenotype correlation line, @(B) = * B , when the phenotype correlation is positive, like =0.33 in UK Biobank, i.e. to the point ( /() , 0.33* /() ). This also shows that +,-./+ =3> 9: < +,-./+ , since < 0, by definition.

Supplementary Note S3: Validation of the formula-derived association for eGFRdecline adjusted for eGFR-baseline
In the recently joined studies and UK Biobank, we had more adjustment models computed for GWAS on eGFR-decline, to better understand similarities and differences. In these, we compared the summary statistics for eGFR-decline adjusted for eGFR-baseline (i.e. age-and sex-adjusted residuals and additional adjusted for ln(eGFRcrea baseline)) with eGFR-decline unadjusted for eGFR-baseline (i.e. age-and sex-adjusted residuals) and found substantial differences (Supplementary Figure S4B). Thus, the two models, unadjusted and adjusted for eGFR-decline were considered further.
Generally, in GWAS meta-analysis, the number of GWAS models computed needs to be as parsimonious as possible to remain feasible. In each of the "old" CKDGen studies, we had GWAS summary statistics for eGFR-decline unadjusted for eGFR-baseline, GWAS summary statistics for cross-sectional eGFR, and study-specific phenotypic information. We knew that this enabled us to do the adjustment by formula S10,S18 (Supplementary Note S1).
For the "old" CKDGen studies, we thus derived GWAS summary statistics for eGFR-decline adjusted for eGFR-baseline applying this formula.
While the formula was established previously S10 , we validated that it worked in this setting using the recently joined CKDGen studies and UK Biobank, where we had the model "eGFR-decline adjusted for eGFR-baseline" computed: we also derived the SNP-associations for "eGFR-decline adjusted for eGFR-baseline" based on the formula for comparison in these studies for the purpose of validation. We found the formula to work very precisely per study: we observed equivalence in beta estimates and SEs when focused on the 265 SNPs identified previously for cross-sectional eGFR S12 , for which we had a prior hypothesis that these contained the SNPs associated with eGFR-decline, as well as genome-wide, where most SNP-associations were under the Null (Supplementary Figure S4C; e.g., in UK Biobank for the 265 variants: Pearson correlation coefficient r=1.00 for betas and SEs; maximum difference in beta=3.26x10 -2 , maximum differences in SEs =1.01x10 -3 ). We added further "old" CKDGen studies also to yield an expanded validation meta-analysis (n=103,970). Again, we found the formula to work precisely in each study and in the expanded validation meta-analysis (Supplementary Figure S4C).
The formula is mathematically derived and works perfectly when GWAS summary statistics for baseline eGFR are available. For studies with GWAS on cross-sectional eGFR, the sample size for cross-sectional eGFR is typical a bit larger than the sample size for eGFRbaseline for longitudinal studies (i.e. restricting to individuals in the follow-up). We evaluated the impact of using cross-sectional eGFR summary statistics rather than baseline eGFR summary statistics in the formula in three "old" CKDGen studies at the hand of the Regensburg meta-analysis center. There was no difference in SEs for the 265 variants or genome-wide, a slight difference for beta estimates of the 265 variants, and a larger (random, not biased) difference in betas genome-wide (Supplementary Figure S4D). This difference in genomewide SNP-estimates can be attributed to random noise in the per-variant estimates under the null hypothesis (considering most genome-wide SNPs as not associated with eGFR-decline).
We extended this validation experiment by three further studies, and found the same (Supplementary Figure S4D). In summary, we concluded that the formula-derived association estimates worked well in this setting for the 265 variants and also, with some more random noise, for the other genome-wide variants.
Of note, these validation meta-analyses included general population studies as well as studies of specific scope: hospital-based (MGI), focused on individuals aged 70+ years (AugUR), focused on individuals with chronic kidney disease (GCKD), or focused on individuals with DM (Diacore).

Supplementary Note S4: Graphical illustration of the relationship between SNP-effects
on eGFR-decline unadjusted and adjusted for eGFR-baseline. while the baseline-unadjusted decline effect, FG 1HIG /() FG 1HIG , is the vertical distance from symbol to X-axis, the baseline-adjusted decline effect, FG 1HIG_ _01 /() FG 1HIG , is the vertical distance from symbol to phenotype correlation line.

Supplementary Note S5: Comparison of the signals for eGFR-decline unadjusted and adjusted for eGFR-baseline and cross-sectional eGFR for the 11 identified loci
We compared the association signals for the 11 identified loci for eGFR-decline (unadjusted for eGFR-baseline) with signals for eGFR-decline adjusted for eGFR-baseline with signals for eGFR cross-sectional S12 in regional association plots (Supplementary Figure S5A- For the 4 variants identified for eGFR-decline unadjusted for eGFR-baseline, we found unadjusted eGFR-decline signals to coincide with adjusted eGFR-decline signals and with cross-sectional eGFR signals (Supplementary Figure S5A). Lead variants for unadjusted eGFR-decline (i.e. the variant with the smallest P-value for unadjusted eGFR-decline) were the same or highly correlated with the respective cross-sectional lead variants (r²=same, same, 1.00 and 0.93 for UMOD-PDILT (2), PRKAG2 and SPATA7, respectively).
Among the 5 lead variants identified by GWAS on eGFR-decline adjusted for eGFRbaseline with significant association for eGFR-decline unadjusted for eGFR-baseline (i.e. "genuine" eGFR-decline variants, Supplementary Figure S5B), all signals for decline adjusted coincided with respective signals for decline unadjusted, except for the TPPP locus (but there, the signal for decline unadjusted sharpened when including the studies with lower imputation quality and then coincided). Three of the 5 lead variants were the same as (FGF5) or highly correlated with (C15ORF54 and ACVR2B, R²=0.61 and 0.98) the respective lead variants for decline unadjusted. In the OVOL1 locus, the lead variant for decline adjusted (rs4930319) depicted the same association signal as for decline unadjusted, but was not highly correlated with the variant with the smallest P-value for decline unadjusted (R² with rs117829045=0.11) due to differing allele frequencies (MAF=0.11 and 0.33, respectively); the variants were suggested to be inherited via the same haplotypes (D'=1.00). Among the 5 variants, we found 3 signals for eGFR-decline adjusted for eGFR-baseline to coincide with the signal for cross-sectional eGFR (for FGF5, OVOL1, ACVR2B) and lead variants for decline adjusted as highly correlated with the respective lead variants for cross-sectional eGFR (r²= 0.95, 0.98, 0.96, respectively; Supplementary Figure S5B). In C15ORF54 and TPPP loci, the decline adjusted signal appeared to be a 2 nd signal for cross-sectional eGFR: the lead variant for decline adjusted were not correlated with the lead variant for cross-sectional eGFR (R²= 0.04 and 0.11). The lead variant for decline adjusted near TPPP depicted a cross-sectional signal 22kb distant from the reported cross-sectional lead variant with different allele frequencies (MAF=0.49 and 0.27, respectively; D'=0.57); of note, the lead variants for decline adjusted captured a 2 nd signal identified in the recently published cross-sectional eGFR analysis S19 and there the lead variants were exactly the same. The C15ORF54 lead variant for decline adjusted was highly correlated with a 2 nd signal for cross-sectional eGFR (rs28833881, r²=0.98).
For the 3 loci identified by eGFR-decline adjusted for eGFR-baseline without significant association with eGFR-decline unadjusted for eGFR-baseline (i.e., not a genuine eGFRdecline association), there was no signal for decline unadjusted (GATM, CPS1, SHROOM3; Supplementary Figure S5C). The lead variants for decline adjusted were the same or highly correlated with the respective cross-sectional eGFR lead variant (R²=0.98, same, 0.59).

Supplementary Note S6: Age-dependency of SNP-effects and main age effect on eGFR.
Before interpreting SNP-by-age interaction effects on cross-sectional eGFRcrea and eGFRcys, we evaluated the main age effect on eGFRcrea and eGFRcys (i.e. age and sex in the model). We found large main age effects, which were fairly linear:  Figure S6Z). We nevertheless allowed for non-linear main age effects in the SNP-by-age interaction analyses, since the main age effect was large and even a slight deviation from non-linearity can distort interaction effects if unaccounted.
We found the age-dependency of the SNP-effects on eGFRcrea and eGFRcys (i.e. age-effect in the interaction term) to be fairly linear when non-linear modelling of main age effect was applied (Supplementary Figure 6 SA,B,C). Of note, when the main age effect was modelled linearly, the SNP-effects on eGFRcrea and eGFRcys appeared to be non-linearly modified by age, which is a known problem in interaction analyses (data not shown); this supported the choice of the main age effect modelled non-linearily.

Supplementary Note S7: Narrow-sense heritability
We estimated SNP-based heritability (h 2 ) for eGFR-baseline and for eGFR-decline unadjusted and adjusted for eGFR-baseline using the genomic relatedness matrix restricted maximum likelihood (GREML) method as implemented in the GCTA software package (https://yanglab.westlake.edu.cn/software/gcta/#Overview). For this, we used individual participant data from UK Biobank for the ~15,000 unrelated individuals of European ancestry that had baseline and follow-up eGFR measurements available.
The small heritability for eGFR-decline in UK Biobank might derive from a large measurement error in eGFR-decline based on a study with only two measurements only 4 years apart. The larger heritability for eGFR-decline adjusted for eGFR-baseline compared to unadjusted for eGFR-baseline is reflective of the collider bias.
Supplementary Figure S1: Meta-analysis workflow. Shown is the meta-analysis workflow to capture the sequential recruitment and different suite of computed models (eGFR-decline unadjusted and adjusted for eGFR-baseline, "decline" and "decline adjusted"). In the first level, we conducted a metaanalysis of summary statistics across studies that were part of CKDGen since a long time ("old CKDGen studies", green boxes) and a meta-analysis across recently joined CKDGen studies ("new studies", blue boxes) and UK Biobank (orange box). In a second level, we meta-analyzed these two results. At each level, genomic-control (GC) correction was applied, when lambda was >1.00.
Supplementary Figure S2: Study-specific median annual eGFR-decline versus sample size, follow-up time and median age. Shown are, for each of the 62 studies, the studyspecific median of annual eGFR-decline versus (A) number of individuals, (B) time to followup, and (C) median age at baseline. Whiskers represent interquartile range.
Supplementary Figure S3: No influence of alternative adjustments for age on eGFRdecline in UK Biobank. We explored alternative adjustments for age in UK Biobank (n=15,442, age range 40-70 years): (A) residuals of eGFR-decline adjusted for age, sex, and ln(eGFR-baseline) versus residuals of eGFR-decline adjusted for age, sex and residuals (ln(eGFR-baseline) adjusted for age and sex) and (B) residuals of eGFR-decline adjusted for age_centered (i.e. centered at 50 years) and sex with residuals of eGFR-decline adjusted for age_centered, (age_centered)² and sex. Alternative adjustments did not change the GWAS phenotype.
Supplementary Figure S4A: No influence from adjusting SNP-associations for eGFR-decline for diabetes mellitus (DM). We compared SNPassociations for eGFR-decline with DM-adjustment with SNP-associations for eGFR-decline without adjustment for DM in recently joined CKDGen studies, UK Biobank, several "old CKDGen studies", and their meta-analysis (total=103,970; Supplementary Note S2). Columns 1&2 show betaestimates and standard errors (SE) among the 265 variants known for cross-sectional eGFR S12 , where we had a prior hypothesis that these might be associated with eGFR-decline. Columns 3&4 show betas and SEs genome-wide, where most SNP-associations are under the Null (i.e., not associated with eGFR-decline). Column 5 shows QQ-plots for P-values genome-wide. Coded allele is the cross-sectional eGFR-lowering allele, SNPs with minor allele frequency ≥0.05 are in green and with minor allele frequency <0.05 in orange. All SNPs have imputation quality>0.6 and MAC>10 for each study.

Supplementary Figure S4B: Differences between SNP-association for eGFR-decline unadjusted versus adjusted for eGFR-baseline
We compared SNP-associations for eGFR-decline adjusted for eGFR-baseline with SNP-associations for eGFR-decline unadjusted for eGFR-baseline in recently joined studies, UK Biobank, several "old CKDGen studies", and their meta-analysis (total=103,970). Columns 1&2 show beta-estimates and standard errors (SE) among the 265 variants known for cross-sectional eGFR S12 , where we had a prior hypothesis that these might be associated with eGFR-decline. Columns 3&4 show betas and SEs genome-wide, where most SNP-associations are under the Null (i.e., not associated with eGFR-decline). Column 5 shows QQ-plots for P-values genome-wide. Coded allele is the cross-sectional eGFR-lowering allele, SNPs with minor allele frequency ≥0.05 are in green and with minor allele frequency <0.05 in orange. All SNPs have imputation quality>0.6 and MAC>10 for all studies.
Supplementary Figure S4C: Validation of formula-derived adjustment for eGFR-baseline in eGFR-decline associations (part 1). We compared SNP-associations for eGFR-decline adjusted for eGFR-baseline by model with SNP-associations for eGFR-decline adjusted for eGFRbaseline by formula (using beta-estimates for eGFR-baseline) in recently joined studies, UK Biobank, several "old CKDGen studies", and their metaanalysis (total=103,970). Columns 1&2 show beta-estimates and standard errors (SE) among the 265 variants known for cross-sectional eGFR S12 , where we had a prior hypothesis that these might be associated with eGFR-decline. Columns 3&4 show betas and SEs genome-wide, where most SNP-associations are under the Null (i.e., not associated with eGFR-decline). Column 5 shows QQ-plots for P-values genome-wide. Coded allele is the cross-sectional eGFR-lowering allele, SNPs with minor allele frequency ≥0.05 are in green and with minor allele frequency <0.05 in orange. All SNPs have imputation quality>0.6 and MAC>10 for all studies.
Supplementary Figure S4D: Validation of formula-derived adjustment for eGFR-baseline in eGFR-decline associations (part 2). In "old CKDGen studies", sample sizes were typically larger for cross-sectional eGFR than for baseline eGFR (i.e. restricted to individuals in follow-up). We compared SNP-associations for eGFR-decline adjusted for eGFR-baseline by model with SNP-associations for eGFR-decline adjusted for eGFRbaseline by formula using beta-estimates for cross-sectional eGFR in six "old" CKDGen studies. Columns 1&2 show beta-estimates and standard errors (SE) among the 265 variants known for cross-sectional eGFR S12 , where we had a prior hypothesis that these might be associated with eGFRdecline. Columns 3&4 show betas and SEs genome-wide, where most SNP-associations are under the Null (i.e., not associated with eGFR-decline).
Column 5 shows QQ-plots for P-values genome-wide. Coded allele is the cross-sectional eGFR-lowering allele, SNPs with minor allele frequency ≥0.05 are in green and with minor allele frequency <0.05 in orange. All SNPs have imputation quality>0.6 and MAC>10 for all studies.
Supplementary Figure S5C: Regions of the 3 variants in 3 loci identified from GWAS for eGFR-decline adjusted for eGFR-baseline without significant association for eGFR-decline unadjusted for eGFR-baseline. Shown are regional association plots (1 st column) for cross-sectional eGFR S12 ("eGFRcrea", n up to 765,348), (2 nd and 3 rd column) for eGFR-decline unadjusted for eGFR-baseline ("decline"; n up to 343,339; blue dashed line P=0.05/263=1.90x10 -4 in 2 nd column and P=0.05 in 3 rd column), and (4 th column) for eGFR-decline adjusted for eGFR-baseline ("declineadj"; n up to 320,737). Highlighted are lead variants for cross-sectional eGFR S12 (1 st and 2 column) and declineadj lead variants (3 rd and 4 th column). Red lines indicate P=5.00x10 -8 . Signals for declineadj coincide with signals for cross-sectional eGFR; there is no association for decline (unadjusted) in these regions.
Supplementary Figure S6: Age-dependency of cross-sectional eGFR and age-dependency of SNP-effects on cross-sectional eGFR in UK Biobank. We conducted SNP-by-age interaction analyses on cross-sectional eGFRcrea and eGFRcys in individuals from UK Biobank that were independent from the GWAS (n=351,462; i.e. excluding the 15,442 individuals in the eGFR-decline GWAS) using linear regression with covariates sex, age, SNP, SNP-by-age and outcome eGFRcrea or eGFRcys. The SNP-effect was modelled as linear dosage effect (for main effect and in interaction term; i.e. additive genetic effect per allele). Age was centered at 50 years and modelled linearly as well as allowing for a smooth nonlinear change by age. For cross-sectional eGFRcrea (1 st row) and eGFRcys (2 nd row), we show the age-dependency (Z) of the main age effect on eGFRcrea and eGFRcys, (A) on the SNP-effects of the 4 variants identified for eGFR-decline (unadjusted for eGFR-baseline), (B) on the SNPeffects of the 5 variants identified for eGFR-decline adjusted for eGFR-baseline with significant association for eGFR-decline unadjusted for eGFRbaseline, and (C) on the SNP-effects of the 3 variants identified for eGFR-decline adjusted for eGFR-baseline without significant association for eGFR-decline unadjusted for eGFR-baseline. In A-C, the main age effect was modelled non-linearly (to avoid residual confounding) and the interaction effects modelling the age-dependency of the SNP-effect linearly (green lines) are the ones reported in Table 3.

34
Supplementary Figure S6 (     The 12 identified variants for eGFR-decline were associated with other kidney phenotypes, but not with DM-status. For the 12 identified variants, we show association results for eGFR based in cystatin C S19 ("eGFRcys", n up to 460,826), blood urea nitrogen S12 ("BUN", n up to 416,178), urine albumin-to-creatinine ratio S20 ("UACR", n up to 564,257), chronic kidney disease S12 ("CKD", n up to 625,219) and DM S21 (n up to 898,130) from published GWAS. Coded allele is the faster-decline allele (which is always the eGFR-lowering allele). Genome-wide significant P-values (P<5.00x10 -8 ) are stated in bold. SNPID=Variant identifier on GRCh37, Locus name=Nearest Gene, EA/OA=Effect allele / other allele, Beta and P=genetic effect coefficient of association and association P-value, OR=odds ratio, P=association P-value.

Supplementary Table S5: The 12 identified variants for eGFR-decline do not show heterogeneity between ancestries and FHS is not an influential study.
We conducted MR-regression to test for heterogeneity between ancestries S13 and the meta-analyses restricted to European or African American individuals (n=325,840 and 9,038, respectively; sample sizes for other ancestries were small). We also conducted a sensitivity meta-analysis excluding the FHS study (due to an initial uncertainty in the median eGFR-decline, n=2,925) and explored direction-consistency of genetic effects in FHS alone. Shown are the P-values for between-ancestry heterogeneity (P-anc-het) and beta-estimates in mL/min/1.73m² as well as P-values for the sensitivity analyses; significant P-values (Pdecline≤0.05/12=4.17x10 -3 ) are stated in bold. Among the 12 variants, there was no evidence for between-ancestry heterogeneity (P-anc-het≥0.05). Association estimates excluding FHS were similar to the original analysis estimates ( Table 1) and FHS-specific estimates were mostly directionally consistent. 360 SNPID=Variant identifier on GRCh37, Locus name=Nearest Gene, EA/OA=Effect allele / other allele, P-anc-het=P-value of the test for between ancestry heterogeneity, beta and P=genetic effect coefficient of association and association Pvalue. § Since the TPPP locus lead variant had imputation quality <0.6 in 45% of the studies (median 0.64), we analyzed this locus omitting the imputation quality filter in all studies.

Supplementary Table S6: No influence by DM-adjustment versus no DM-adjustment or by model-based versus formula-based adjustment for baseline eGFR (eGFR-BL) on the 12 variants' association with eGFR-decline.
We conducted a validation meta-analysis for the 12 identified variants for eGFR-decline (total n=103,970) to compare models with different covariate adjustment. Shown are beta-estimates and P-values for eGFR-decline DM-adjusted versus DM-unadjusted, and adjusted for eGFR-baseline by model as well as by formula (Supplementary Note S1); all models are age-and sex-adjusted. We found no impact by DM-adjustment, but by adjustment for eGFR-BL (when compared to "not adjusted for DM", which is unadjusted for eGFR-BL). For adjustment for eGFR-BL, we found the same association statistics when model-computed versus formula-derived.  Table S7: Associations of APOL1 risk variants with eGFR-decline in African American and European ancestry. While our data was derived primarily from persons of European ancestry, we explored variants in the APOL1 gene due to previous reports for chronic kidney disease progression in 8,500 African American individuals S22 . We conducted GWAS with the additive model for eGFR-decline unadjusted for eGFRbaseline restricted to African Americans (n up to 9,038) or to European ancestry (n up to 325,840). Shown are beta-estimates (in mL/min/1.73m²), standard errors (SE) and P-values. From 6 previously reported APOL1 risk variants (the 7 th , indel rs71785313, not analyzable here), none was associated with eGFR-decline in African American ancestry (P≥0.05). Interestingly, we detected two yet unreported SNPs near/in APOL1 suggestively associated with eGFR-decline with P=2.8x10 -05 and 3.10x10 -05 in African Americans (effect allele frequency=0.01; monomorphic in European), uncorrelated with the previously reported variants (r²<0.01).

ESTHER
The ESTHER study was funded by the Saarland state Ministry for Social Affairs, Health, Women and Family Affairs (Saarbrücken, Germany), the Baden-Württemberg state Ministry of Science, Research and Arts (Stuttgart, Germany), the Federal Ministry of Education and Research (Berlin, Germany) and the Federal Ministry of Family Affairs, Senior Citizens, Women and Youth (Berlin, Germany).

FHS
The Framingham Heart Study is supported by HHSN268201500001.

FINCAVAS
The Finnish Cardiovascular Study (FINCAVAS) has been financially supported by the Competitive Research Funding of the Tampere University Hospital (Grant 9M048 and 9N035), the Finnish Cultural Foundation, the Finnish Foundation for Cardiovascular Research, the Emil Aaltonen Foundation, Finland, the Tampere Tuberculosis Foundation, EU Horizon 2020 (grant 755320 for TAXINOMISIS; grant 848146 for To_Aition), and the Academy of Finland grant 322098. The authors thank the staff of the Department of Clinical Physiology for collecting the exercise test data.

MGI
The authors acknowledge the Michigan Genomics Initiative participants, Precision Health at the University of Michigan, the University of Michigan Medical School Central Biorepository, and the University of Michigan Advanced Genomics Core for providing data and specimen storage, management, processing, and distribution services, and the Center for Statistical Genetics in the Department of Biostatistics at the School of Public Health for genotype data curation, imputation, and management in support of the research reported in this publication. Funding was provided by the NIH (R35-HL135824).