The two-sample MR design
Two-sample Mendelian randomization (TSMR) rests on three core assumptions: (1) the single nucleotide polymorphisms (SNPs) used as instruments are associated with the exposure; (2) the SNPs are not associated with any potential confounders; and (3) the SNPs affect the outcome only through the exposure (Figure 1). SNPs associated with 25(OH)D levels were selected at genome-wide significance (p < 5×10⁻⁸) with a minor allele frequency > 0.01. We also calculated the F statistic of the SNPs to assess the risk of weak instrument bias (12).
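The selection thresholds above can be illustrated with a minimal Python sketch. The record layout (`rsid`, `p`, `maf`) and the three candidate SNPs are hypothetical and serve only to show the filter. The study itself was run in R, so this is not the authors' code.

```python
# Thresholds taken from the text: genome-wide significance and MAF cut-off.
P_THRESHOLD = 5e-8
MAF_THRESHOLD = 0.01


def select_instruments(snps):
    """Keep SNPs with p < 5e-8 and minor allele frequency > 0.01."""
    return [
        snp for snp in snps
        if snp["p"] < P_THRESHOLD and snp["maf"] > MAF_THRESHOLD
    ]


# Hypothetical candidates (not real study data).
candidates = [
    {"rsid": "snp_a", "p": 1e-12, "maf": 0.25},   # passes both filters
    {"rsid": "snp_b", "p": 3e-6, "maf": 0.30},    # fails the p-value filter
    {"rsid": "snp_c", "p": 2e-9, "maf": 0.005},   # fails the MAF filter
]

selected = select_instruments(candidates)
selected_ids = [snp["rsid"] for snp in selected]
print(selected_ids)
```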
Data for IV
We extracted instrumental variables (IVs) for vitamin D from a GWAS of 79366 European-ancestry individuals, shown in ST 1 (13). Four SNPs in genes with a direct role in vitamin D synthesis and metabolism were used: rs10741657 (CYP2R1), rs10745742 (AMDHD1), rs12785878 (NADSYN1-DHCR7), and rs17216707 (CYP24A1). Together they explain 2.84% of the variation in 25(OH)D levels, and the F statistic is 579.93. The IVs for BMI were the 10 lead SNPs reported in the largest European GWAS of obesity, published by Speliotes et al., shown in ST 2 (14). The IV for KIM-1 (Kidney Injury Molecule-1) was rs1039438 (Beta = 0.5, p = 7.81E-38), extracted from the study by Per-Henrik Groop, shown in ST 3 (15).
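The text does not state which formula was used for the F statistic. A common choice, assumed here, is F = R²(N − k − 1) / (k(1 − R²)), where R² is the proportion of exposure variance explained, N the sample size, and k the number of instruments. The Python sketch below applies that assumed formula to the figures given above (R² = 0.0284, N = 79366, k = 4).

```python
def f_statistic(r2, n, k):
    """F statistic for instrument strength.

    Assumed formula (not stated in the text):
    F = R^2 * (N - k - 1) / (k * (1 - R^2)).
    """
    return (r2 * (n - k - 1)) / (k * (1.0 - r2))


# Values reported in the text for the four vitamin D SNPs.
f_value = f_statistic(r2=0.0284, n=79366, k=4)
print(round(f_value, 2))
```

Under these assumptions the result agrees with the reported value of 579.93 to two decimal places, well above the conventional threshold of 10 for a weak instrument.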
GWAS summary data for diseases
The aim of this MR analysis was to clarify the causal relationship between serum 25(OH)D and diabetic nephropathy (DN). We therefore used two types of data: GWAS summary statistics for eGFR and UACR in individuals with diabetes, and GWAS summary data for the different stages of DN. To corroborate previous findings that the serum 25(OH)D level is associated with a decrease in eGFR (16), we also used GWAS summary data on eGFR. To test the validity of the IVs, we set up multiple sclerosis as a positive control and prostate cancer and breast cancer as negative controls (17). In addition, because KIM-1 and BMI have been linked to DN in previous studies, they were used as positive-control exposures.
GWAS summary data on two renal function traits, eGFR and UACR, were obtained from two GWAS analyses based on 133720 (18) and 54448 (19) European-ancestry individuals, respectively. eGFR was estimated with the four-variable Modification of Diet in Renal Disease (MDRD) Study equation. UACR was calculated as urinary albumin/urinary creatinine (mg/g). Both eGFR and UACR values were log-transformed.
The GWAS summary statistics for individuals with diabetes mellitus (DM) came from the two GWAS analyses mentioned above: the eGFR data included 11529 European-ancestry individuals, and the UACR data included 5825 European participants. DM was defined as fasting glucose ≥126 mg/dl, pharmacologic treatment for diabetes, or self-report.
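The two renal traits can be sketched in Python as follows. The coefficients are those of the commonly published four-variable MDRD equation. The leading constant (175 for standardized creatinine assays, 186 in the original equation), the natural-log base, and all function and parameter names are assumptions, since the text does not specify them.

```python
import math


def egfr_mdrd4(scr_mg_dl, age_years, female, black, coefficient=175.0):
    """Four-variable MDRD eGFR in ml/min per 1.73 m^2.

    The leading coefficient is an assumption: 175 is used with
    standardized creatinine, 186 with the original calibration.
    """
    egfr = coefficient * scr_mg_dl ** -1.154 * age_years ** -0.203
    if female:
        egfr *= 0.742
    if black:
        egfr *= 1.212
    return egfr


def uacr_mg_per_g(urine_albumin_mg_dl, urine_creatinine_mg_dl):
    """Urinary albumin-to-creatinine ratio in mg/g.

    mg/dl divided by mg/dl gives mg albumin per mg creatinine;
    multiplying by 1000 converts to mg per g.
    """
    return urine_albumin_mg_dl / urine_creatinine_mg_dl * 1000.0


# Hypothetical individuals (not study data).
egfr_male = egfr_mdrd4(scr_mg_dl=1.0, age_years=50, female=False, black=False)
egfr_female = egfr_mdrd4(scr_mg_dl=1.0, age_years=50, female=True, black=False)
uacr = uacr_mg_per_g(urine_albumin_mg_dl=3.0, urine_creatinine_mg_dl=100.0)

# Natural log is assumed for the transformation.
log_egfr = math.log(egfr_male)
log_uacr = math.log(uacr)
print(egfr_male, egfr_female, uacr, log_egfr, log_uacr)
```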
The GWAS summary statistics for early and late DN in type 1 diabetes (T1D) included 1820 European patients with early DN and 2495 European patients with late DN (19). T1D was defined as a diagnosis made by the attending physician, with age at diabetes onset <40 years and insulin treatment initiated within 1 year of diagnosis. Early DN was defined as at least 2 out of 3 consecutive measurements with AER ≥20 and <200 µg/min, AER ≥30 and <300 mg/24 h, or ACR ≥2.5/3.5 and <25/35 mg/mmol. Patients receiving dialysis treatment, with a kidney transplant, or with an eGFR ≤15 ml/min per 1.73 m² were defined as having late T1DN. The summary statistics were adjusted for sex, diabetes duration, and age at diabetes onset.
The GWAS summary statistics for early and late DN in type 2 diabetes (T2D) came from an analysis of 4227 European individuals with early T2DN and 3711 European patients with late T2DN (20). The “early DKD” phenotype is intended to identify variants that contribute to early dysfunction of the glomerular barrier, and the “late DKD” phenotype to identify variants that contribute to severe glomerular barrier dysfunction. The summary statistics were adjusted for sex, diabetes duration, and age at diabetes onset.
The GWAS summary statistics for prostate cancer comprised 2495 cases and 334644 controls and were generated by the Neale Lab (http://www.nealelab.is/uk-biobank) using Hail (https://hail.is/), with adjustment for the first 20 principal components, sex, age, age squared, the interaction between sex and age, and the interaction between sex and age squared.
The GWAS summary statistics for breast cancer were also from the Neale Lab, with 25865 cases and 283784 controls, and were adjusted for sex, age, age squared, the interaction between sex and age, and the interaction between sex and age squared.
The GWAS summary statistics for multiple sclerosis included 47429 cases and 68374 controls from the International Multiple Sclerosis Genetics Consortium (21).
Statistical analysis
We used the inverse variance-weighted (IVW) method to estimate the effect of 25(OH)D on DN, with the weighted-median and Wald ratio methods as supplements to IVW, shown in ST 4-5. The method was chosen according to the results of the heterogeneity and horizontal pleiotropy tests (22). Heterogeneity was assessed with Cochran’s Q statistic. Horizontal pleiotropy of the SNPs was evaluated with the MR-Egger intercept (10) and the MR-PRESSO method (23). Specifically, the MR-PRESSO outlier test corrects for horizontal pleiotropy by removing outlier SNPs.
TSMR analysis was performed using the R package “TwoSampleMR”, and MR-PRESSO was conducted with the R package “MRPRESSO”. All statistical analyses were performed in R version 4.1.2 (https://www.r-project.org/). Data visualization was performed in GraphPad Prism 9.
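As a rough illustration of the estimators named above, the following Python sketch computes a Wald ratio, a fixed-effect IVW estimate with first-order weights, Cochran's Q, and the MR-Egger intercept from per-SNP summary statistics. It is a simplified sketch under stated assumptions, not the implementation in the TwoSampleMR package. All input values are hypothetical, and the Egger standard error and the weighted-median estimator are omitted.

```python
import math


def wald_ratio(bx, by, se_y):
    """Single-SNP estimate by/bx with a first-order standard error."""
    return by / bx, se_y / abs(bx)


def ivw(bx, by, se_y):
    """Fixed-effect IVW estimate, its standard error, and Cochran's Q.

    Weights are bx^2 / se_y^2 (first-order approximation).
    """
    weights = [x * x / (s * s) for x, s in zip(bx, se_y)]
    ratios = [y / x for x, y in zip(bx, by)]
    total = sum(weights)
    beta = sum(w * r for w, r in zip(weights, ratios)) / total
    se = math.sqrt(1.0 / total)
    q = sum(w * (r - beta) ** 2 for w, r in zip(weights, ratios))
    return beta, se, q


def egger_intercept(bx, by, se_y):
    """Intercept and slope of the weighted regression of by on bx.

    SNPs are first oriented so that the exposure effect is positive;
    weights are 1 / se_y^2. A non-zero intercept suggests directional
    horizontal pleiotropy.
    """
    signs = [1.0 if x >= 0 else -1.0 for x in bx]
    x = [s * v for s, v in zip(signs, bx)]
    y = [s * v for s, v in zip(signs, by)]
    w = [1.0 / (s * s) for s in se_y]
    sw = sum(w)
    x_bar = sum(wi * xi for wi, xi in zip(w, x)) / sw
    y_bar = sum(wi * yi for wi, yi in zip(w, y)) / sw
    sxy = sum(wi * (xi - x_bar) * (yi - y_bar) for wi, xi, yi in zip(w, x, y))
    sxx = sum(wi * (xi - x_bar) ** 2 for wi, xi in zip(w, x))
    slope = sxy / sxx
    return y_bar - slope * x_bar, slope


# Hypothetical summary statistics for four SNPs with a true effect of 0.5
# and no pleiotropy (by = 0.5 * bx exactly).
bx = [0.10, 0.05, 0.08, 0.04]
by = [0.05, 0.025, 0.04, 0.02]
se_y = [0.01, 0.01, 0.02, 0.01]

beta_ivw, se_ivw, q_stat = ivw(bx, by, se_y)
intercept, slope = egger_intercept(bx, by, se_y)
beta_wald, se_wald = wald_ratio(bx[0], by[0], se_y[0])
print(beta_ivw, se_ivw, q_stat, intercept, slope, beta_wald, se_wald)
```

With these noise-free inputs every SNP gives the same ratio, so Q and the Egger intercept are both approximately zero. In real data, Q is compared with a chi-squared distribution on k − 1 degrees of freedom, where k is the number of SNPs.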