Causal relationships between serum matrix metalloproteinases and estrogen receptor-negative breast cancer: a bidirectional Mendelian randomization study

To clarify the causal effects between matrix metalloproteinases (MMPs) and estrogen receptor (ER)-negative breast cancer (BC), we investigated the bidirectional causal relationship between MMPs and ER-negative BC by Mendelian randomization (MR) analysis. Summary statistics for five MMPs were extracted from European participants in 13 cohorts. ER-negative BC data from one genome-wide association study of European ancestry were used as the experimental dataset, and another four ER-negative BC datasets were used as validation sets. The inverse variance weighted method was used for the main MR analysis, and sensitivity analyses were also conducted. A higher serum level of MMP-1 had a negative causal effect on the risk of ER-negative BC (odds ratio = 0.92, P = 0.0008), whereas ER-negative BC was not a cause of the serum MMP-1 level; both findings were supported by the validation sets. No causal effect in either direction was detected between the other four types of MMPs and ER-negative BC (P > 0.05). Sensitivity analyses indicated that these results were robust and without remarkable bias. To conclude, serum MMP-1 may be a protective factor against ER-negative BC. No reciprocal causality was found between the other kinds of MMPs and ER-negative BC. MMP-1 may serve as a biomarker for the risk of ER-negative BC.

Causal effect of ER-negative BC on serum MMP levels. To evaluate reverse causation, we planned to use the above five GWAS summary datasets of ER-negative BC. In the five BC GWAS datasets, no SNP needed to be removed for a potential association with confounders. Of the five GWAS summary datasets of ER-negative BC, the first (GWAS ID: ieu-a-1128) has 40 exposure-associated SNPs, the second (GWAS ID: ieu-a-1135) has 14 exposure-associated SNPs, the third (GWAS ID: ieu-a-1136) has seven SNPs, the fourth (GWAS ID: ieu-a-1137) has only two significantly related SNPs, and the last (GWAS ID: ieu-a-1166) has eight SNPs. Because the number of selected SNPs in the fourth dataset (GWAS ID: ieu-a-1137) was less than five, we used the other four datasets to investigate the potential causal effect of ER-negative BC on the serum levels of the five MMPs. Using the IVW method, none of the results derived from these datasets indicated causality from ER-negative BC to the serum levels of the five kinds of MMPs (for MMP-1, GWAS ID: ieu-a-1128: P = 0.63; GWAS ID: ieu-a-1135: …) (Tables 6, 7, 8, 9, 10). No remarkable heterogeneity was found by either the MR-Egger or the IVW method in the analysis of the causal effect of BC on MMP-1, -7, -10, or -12 (P > 0.05); heterogeneity was detected only for MMP-3 (GWAS data ieu-a-1128: MR-Egger: P = 0.03; IVW method: P = 0.01) (Table 7). For the analysis between ER-negative BC (GWAS data ieu-a-1128) and the five types of MMPs, three SNPs were removed for being palindromic with intermediate allele frequencies: rs2735846, rs62116991, and rs191981806. The results were therefore derived from the remaining 37 SNPs. Leave-one-out plots indicated that no single SNP in any of the four GWAS summary datasets of ER-negative BC had a large impact on the MR analysis (Figs. 6, 7, 8, 9, 10).
The F-statistics for the SNPs of all four ER-negative BC datasets were greater than the threshold of 10, suggesting strong IVs, which reduces bias in the IV estimates (Supplementary Tables S6, S7, S8, S9; the F-statistics for the analyses with the other four types of MMPs were the same as those for the analysis with MMP-1).
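As an illustration of the instrument-strength check reported above, the following is a minimal Python sketch, not the authors' code. It assumes the commonly used per-SNP approximations R² = 2 × EAF × (1 − EAF) × β² and F = R² × (N − 2) / (1 − R²). The function names, the SNP identifiers, and all numeric inputs are hypothetical.

```python
# Minimal sketch of a per-SNP instrument-strength check.
# Assumed formulas (common single-SNP approximations):
#   R^2 = 2 * EAF * (1 - EAF) * beta^2
#   F   = R^2 * (N - 2) / (1 - R^2)

def r_squared(eaf: float, beta: float) -> float:
    """Proportion of trait variance explained by one SNP."""
    return 2.0 * eaf * (1.0 - eaf) * beta ** 2


def f_statistic(r2: float, n_samples: int) -> float:
    """F statistic of a single instrument for a GWAS with n_samples individuals."""
    return r2 * (n_samples - 2) / (1.0 - r2)


# Hypothetical instruments: (label, effect allele frequency, effect size beta).
candidates = [
    ("snp_strong", 0.30, 0.10),
    ("snp_weak", 0.05, 0.01),
]
N_SAMPLES = 100_000  # hypothetical GWAS sample size
F_THRESHOLD = 10.0   # conventional cut-off for a strong instrument

strengths = {
    label: f_statistic(r_squared(eaf, beta), N_SAMPLES)
    for label, eaf, beta in candidates
}
# Keep only instruments whose F statistic exceeds the threshold.
strong_ivs = [label for label, f in strengths.items() if f > F_THRESHOLD]
print(strengths, strong_ivs)
```

In this toy input, the second SNP explains too little variance to pass the threshold of 10 and would be dropped, which mirrors the filtering rule stated in the Methods.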

Discussion
In our study, we found that serum MMP-1 is a protective factor against ER-negative BC. In other words, a reduction in serum MMP-1 concentration causally increased the risk of ER-negative BC. In the opposite direction, no causal effect of ER-negative BC on the serum MMP-1 level was found. Unlike MMP-1, the other four types of MMPs (MMP-3, MMP-7, MMP-10, and MMP-12) showed no causal relationship with ER-negative BC in either direction.
To the best of our knowledge, our study is the first to report on the bidirectional causality between serum MMP levels and BC. MMPs can stimulate tumor cell migration, invasion, and metastasis through proteolysis of the extracellular matrix 16,44. MMP-1 is an interstitial collagenase capable of degrading type I, II, and III collagens. Previous studies have found that exosomal MMP-1 in circulation and MMP-1 expressed on BC cells gave BC, especially triple-negative breast cancer (TNBC), the potential for distal metastasis (brain, lung, etc.) and led to poor disease-free survival 45,46. One study found that MMP-1 expression was significantly higher in TNBC tissue than in ER-positive and HER-2-positive BC tissue, and that MMP-1 expression was also higher in metastatic BC tissue than in non-metastatic BC tissue 47. Studies have also reported that MMP-1 or its specific polymorphisms contributed to the initiation and progression of BC, but the association between MMP-1 level and overall survival remained controversial 48–51. Moreover, certain studies found that specific genetic variants of MMP-1 did not affect the risk of BC 52–54. In contrast to the above results, one study suggested that the serum MMP-1 level was significantly lower in BC patients than in healthy controls (P < 0.0001) and that patients with a lower serum concentration of MMP-1 had a remarkably shorter 5-year survival 55. Another study demonstrated that stromal expression of MMP-1 was an independent prognostic factor for longer overall survival (hazard ratio = 0.528, P = 0.042) 56. Nevertheless, few studies have focused on the association between circulating or serum MMP-1 and each subtype of BC. In our study, the serum MMP-1 level had a causal effect on ER-negative BC: a higher serum level of MMP-1 led to a lower risk of ER-negative BC, suggesting a protective role of MMP-1 in ER-negative BC.
This result was supported not only by the IVW method but also by the weighted median and weighted mode methods, with the IVW method serving as the basis for all results in our study. Moreover, our results were considered robust because the selected GWAS summary data of MMP-1 and ER-negative BC had large sample sizes. Different types of sensitivity analysis also corroborated the strength and power of our results. The MR analysis of the causal effect of ER-negative BC on MMP-1 yielded no positive result. This suggested that a low serum level of MMP-1 caused ER-negative BC, rather than ER-negative BC causing a reduction in the MMP-1 level. The current research on MMP-1 in breast cancer, including the studies mentioned in our discussion, has reached inconsistent conclusions. Firstly, some studies only indicated that MMP-1 promoted carcinogenesis and metastasis of BC, while it remained unclear whether this applied to all subtypes of BC. Secondly, most MMP-related studies focused on tumoral or histological MMP expression, whereas our study specifically investigated the relationship between serum MMPs and ER-negative BC. Whether the findings on tissue expression also hold for serum MMPs should be further discussed. Lastly, studies have suggested that MMP-1 has several genetic variants (polymorphisms) and that different variants could affect the prognosis of each subtype of BC in different ways 57. In one study from the US in which most of the patients were Hispanic or non-Hispanic white, investigators found that not all polymorphisms of MMP-1 were associated with the prognosis of BC and that different gene sequences could lead to different clinical outcomes. The MMP-1 rs17293761 TT genotype was not a risk factor for more advanced breast tumors 57. Hence, our study not only corroborated the results of studies suggesting that MMP-1 is a protective factor but also put forward a new possibility for the relationship between serum MMP-1 and ER-negative BC.
Because research on the relationship between serum MMP-1 and BC is still lacking and the potential mechanism of this phenomenon is unclear, further exploration is needed to validate this result. For the other four types of MMPs, we did not find any causality between each of them and ER-negative BC. Both MMP-3 (stromelysin-1) and MMP-10 (stromelysin-2) degrade extracellular matrix (ECM) proteins including aggrecan, collagen types III and IV, and fibronectin 58. The former is expressed not only in cancer cells but also in normal cells (endothelial cells, epithelial cells, macrophages, and stromal fibroblasts), while the latter is detected only in abnormal tissue, including acute or chronic injury and cancer 59,60. One study suggested that the serum level of MMP-10 was significantly higher in BC patients than in healthy controls (P < 0.001). The median serum level of MMP-3 was significantly higher in advanced BC (stage III and IV) than in early-stage BC (stage I) (P = 0.018) 61. Another study drew a different conclusion, suggesting that the expression of MMP-10 was lower in BC tissue than in adjacent normal tissue 62. The aforementioned study published by Dr. Martha L Slattery from Utah, USA, also investigated the clinical significance of MMP-3 57. The results showed that MMP-3 was associated with breast cancer risk only in a subset of Native Americans, with merely borderline significance (P = 0.06). Regarding the relationship between MMP polymorphisms and tumor prognosis, two genetic variants of MMP-3 could drastically increase the risk of tumor progression and distant metastasis. Nevertheless, these results were based on a mixed population in which Hispanics and Native Americans predominated. Whether the results apply to other ethnicities should be further explored.
Regarding the association between MMP-3 and the prognosis of breast cancer, one study indicated that MMP-3 did not affect overall survival but that a higher level of cellular MMP-3 expression was associated with significantly poorer metastasis-free survival 63. To date, no studies on the relationship between MMP-10 and the prognosis of breast cancer are available. As a matrilysin, MMP-7 disrupts the structure of and degrades casein, collagen, elastin, fibronectin, gelatins, laminin, and proteoglycans 64. Among these, collagen IV, laminin, and proteoglycan are the major components of the basement membrane 65. Thus, the biological activity of MMP-7 plays a crucial role in local invasion, lymph-node metastasis, and distal metastasis of cancer cells 66. Studies have shown that serum MMP-7 was higher in BC patients than in the control group 67. Another study found that BC patients with bone metastasis had a higher serum level of MMP-7, suggesting a potential circulating biomarker for BC progression towards bone metastasis 68. In one study from Xi'an Jiaotong University, researchers showed that MMP-7 expression was higher in tissue from advanced breast cancer (larger focus, … risk of breast tumor and it did not reduce after the removal of the tumor 70. Currently, few studies have reported positive results or conclusions on an association or causal relationship between the tissue or serum level of MMP-12 and BC. Overall, no consensus has been reached on the causal effects between these four types of MMPs and BC. More intensive investigations in this field should be performed.
Despite the originality and robustness of our study, some limitations should be stated: (1) the individuals in this study are of European ancestry, so results derived from the selected SNPs cannot be directly extended to other ethnic groups; (2) the GWAS summary data currently do not contain sufficient IVs for other types of MMPs, so MR analysis between those MMPs and BC could not be conducted; (3) the numbers of SNPs for the five MMPs were relatively small, especially for MMP-3, MMP-7, and MMP-10, and a larger GWAS with more eligible IVs is needed to increase the power of the MR analysis.

Conclusions
To conclude, a low level of serum MMP-1 has a causal effect on a higher risk of ER-negative BC in the European population. In the reverse analysis, no causal effect of ER-negative BC on the level of serum MMP-1 was found. No evidence supported any causality between MMP-3, -7, -10, or -12 and ER-negative BC in individuals of European ancestry. More intensive research should be carried out to validate serum MMPs as potential biomarkers and therapeutic targets in ER-negative BC.

Methods
Study design. The selection of IVs from genetic variants in this MR analysis strictly met the three core assumptions of MR: (1) the genetic variants selected as IVs are strongly associated with the exposure; (2) the genetic variants are not associated with any confounding factors; (3) the genetic variants affect the outcome only through the exposure, not directly or through any other pathway (Fig. 11) 71. In our study, we selected summary-level data of five kinds of MMPs (MMP-1, -3, -7, -10, -12, each with no fewer than five SNPs) and ER-negative BC from an open database of published genome-wide association study (GWAS) summary datasets (https://gwas.mrcieu.ac.uk/) 72,73. We first collected genetic variants for each type of MMP in order to determine the causality from MMP to ER-negative BC; this is the main goal of our study. We then collected genetic variants robustly associated with ER-negative BC to assess the reverse causality from BC to MMPs. The design of the bidirectional MR study is summarized in Fig. 12.

Data sources and SNP selection for MMPs. Genetic variants of five kinds of MMPs (MMP-1, MMP-3, MMP-7, MMP-10, and MMP-12) were obtained from a meta-analysis of GWASs including 21,758 individuals from 13 cohorts of European ancestry 74. All five kinds of MMPs passed quality control and were normalized with rank-based inverse normal transformation and/or standardized to unit variance in order to control for unrelated variables among cohorts. Genetic associations between 20.3 million genetic variants (SNPs) and log-transformed MMPs were adjusted for population structure (age, sex, smoking status, oral contraceptive usage, blood cell counts, etc.) and study-specific parameters (OLINK plate, storage time, MDS component, etc.) 74. To meet the first assumption of MR, we extracted the IVs of the five types of MMPs at genome-wide significance (P < 5 × 10⁻⁸). Seventeen SNPs were significantly associated with MMP-1, 12 with MMP-3, seven with MMP-7, 11 with MMP-10, and 15 with MMP-12. A linkage disequilibrium (LD) test was then conducted to clump these SNPs for independence. All retained SNPs strongly and independently (R² < 0.01 within 5 Mb) predicted MMP levels in the published GWAS. Subsequently, we entered all the SNPs significantly associated with MMPs into the Phenoscanner database (V2) (http://www.phenoscanner.medschl.cam.ac.uk/) to determine whether any SNPs were associated with confounders (P < 5 × 10⁻⁸) 75,76. Any such SNPs were removed to reduce the possibility of pleiotropic effects.

Data sources and SNP selection for ER-negative BC.
Summary-level data on ER-negative BC were extracted from a GWAS of 127,442 individuals of European ancestry from the Breast Cancer Association Consortium (BCAC), combined with the Discovery, Biology and Risk of Inherited Variants in Breast Cancer Consortium (DRIVE), the iCOGS project, and data from other GWAS meta-analyses 77. These data were used as the experimental dataset to explore the potential causal effect between MMPs and ER-negative BC. We then used the other four datasets as validation datasets to test the conclusion drawn from the experimental dataset. The four datasets were all derived from individuals of European ancestry (OncoArray1, cases: 9655, controls: 45494; iCOGS, cases: 7333, controls: 42892; GWAS meta-analysis1, cases: 4480, controls: 17588; GWAS meta-analysis2, cases: 3611, controls: 18084) 77,78. As in the SNP selection for MMPs, SNPs potentially correlated with confounders were removed using the Phenoscanner database (P < 5 × 10⁻⁸). To further evaluate the robustness of the selected SNPs, the statistics F and R² were used in the SNP selection for both MMPs and ER-negative BC. The F statistic reflects the precision and magnitude of the genetic effect on the trait. Eq. (1) is:

$$F = \frac{R^{2}\,(N - 2)}{1 - R^{2}} \qquad (1)$$

N stands for the sample size of a given GWAS and R² is the proportion of the variance of the trait explained by the genetic variants (SNPs). Eq. (2) is:

$$R^{2} = 2 \times \mathrm{EAF} \times (1 - \mathrm{EAF}) \times \beta^{2} \qquad (2)$$

EAF is the effect allele frequency of the SNP and β is the estimated effect of the SNP on the trait. SNPs with F less than 10 were removed, and SNPs with F larger than 10 were considered sufficiently strong instruments 79,80.

Bidirectional Mendelian randomization analysis. Bidirectional two-sample MR was performed using the R package "TwoSampleMR".
The SNP information, β value (obtained by log-transformation of odds ratios [ORs]), standard error, P-value, and EAF of the selected exposure instruments were required by this package to harmonize exposure and outcome data and to investigate the direction of causality between MMPs and ER-negative BC using summary association data. In our study we did not look for proxies to replace SNPs that were not available in the outcome datasets. During data harmonization, we ensured that the effect of each selected SNP referred to the same allele in both the exposure and the outcome data. For palindromic SNPs, however, it was difficult to determine whether the effects referred to the same allele because the sequences are the same on both strands. As a result, palindromic SNPs were removed to eliminate the ambiguity as to whether the exposure and outcome GWAS refer to the same effect allele 43. In the core process of the MR analysis, we calculated the Wald ratio (i.e., β_outcome/β_exposure) for each SNP and then combined these SNP-specific Wald ratios via the inverse-variance-weighted (IVW) method, which estimates the causal effect of the genetically predicted exposure on the outcome 81,82. We report the effect estimates as ORs for the binary outcome (ER-negative BC) and as β for continuous outcomes (MMP levels). For the direction of causality from MMP to BC, the OR was interpreted as the risk of ER-negative BC (outcome) per unit increase in the serum level of a given type of MMP. Other methods in the MR analysis included MR-Egger, weighted median, simple mode, and weighted mode. A series of sensitivity analyses was performed, consisting of the weighted median (WM) method, MR-Egger, and Mendelian Randomization Pleiotropy RESidual Sum and Outlier (MR-PRESSO). The WM method estimates the causal effect by taking the median MR estimate and is suited to conditions in which multiple genetic variants are invalid or show pleiotropy 83.
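The palindromic-SNP filter, the Wald ratio, and the IVW combination described above can be sketched in a few lines of Python. This is an illustrative sketch under stated assumptions, not the TwoSampleMR implementation: it uses a fixed-effect IVW estimator with first-order weights (β_exposure² / SE_outcome²), it drops every A/T or C/G SNP regardless of allele frequency, and the SNP records, field names, and effect sizes are hypothetical.

```python
import math

# Assumption: fixed-effect IVW with first-order weights; real packages may use
# random-effects standard errors and frequency-aware palindromic handling.

def is_palindromic(effect_allele: str, other_allele: str) -> bool:
    """A/T and C/G SNPs read the same on both strands, so the strand is ambiguous."""
    pair = {effect_allele.upper(), other_allele.upper()}
    return pair == {"A", "T"} or pair == {"C", "G"}


def wald_ratio(beta_exposure: float, beta_outcome: float) -> float:
    """Per-SNP causal estimate: beta_outcome / beta_exposure."""
    return beta_outcome / beta_exposure


def ivw_estimate(snps: list) -> tuple:
    """Inverse-variance-weighted mean of the Wald ratios and its standard error."""
    weights = [s["beta_exp"] ** 2 / s["se_out"] ** 2 for s in snps]
    ratios = [wald_ratio(s["beta_exp"], s["beta_out"]) for s in snps]
    total_weight = sum(weights)
    beta = sum(w * r for w, r in zip(weights, ratios)) / total_weight
    return beta, math.sqrt(1.0 / total_weight)


# Hypothetical harmonized summary statistics (effects aligned to the same allele).
snps = [
    {"rsid": "rs_a", "ea": "A", "oa": "G", "beta_exp": 0.20, "beta_out": -0.020, "se_out": 0.010},
    {"rsid": "rs_b", "ea": "C", "oa": "T", "beta_exp": 0.10, "beta_out": -0.010, "se_out": 0.010},
    {"rsid": "rs_c", "ea": "A", "oa": "T", "beta_exp": 0.15, "beta_out": 0.030, "se_out": 0.010},
]

kept = [s for s in snps if not is_palindromic(s["ea"], s["oa"])]  # drops rs_c
beta_ivw, se_ivw = ivw_estimate(kept)
odds_ratio = math.exp(beta_ivw)  # OR per unit increase in the exposure
print(len(kept), beta_ivw, se_ivw, odds_ratio)
```

Exponentiating the IVW estimate gives the odds ratio per unit increase in the exposure, which is how a log-odds estimate for a binary outcome such as ER-negative BC is usually reported.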
The MR-Egger method not only provides a consistent estimate of the causal effect but also evaluates horizontal pleiotropy of the IVs, with a nonzero intercept suggesting that the IVW estimate is biased 84,85. MR-PRESSO is capable of detecting and correcting for potentially pleiotropic outliers (SNPs) in the reported results to avoid bias 86. Heterogeneity was quantified … 87. Furthermore, a leave-one-out analysis was conducted by removing each SNP in turn to test the stability and reliability of the MR results. Because multiple tests were performed in our analysis, Bonferroni correction was used to adjust the significance level. We therefore considered P-values below 0.002 (0.05/25) as strong evidence of associations. Results with P-values between 0.002 and 0.05 were regarded as suggestive associations 43. All statistical analyses were two-sided. All analyses were conducted using R software (4.2.0) with the R packages "TwoSampleMR" (version 0.5.6) and "MRPRESSO" (version 1.0). Reporting follows the STROBE-MR statement.

Publisher's note Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/.