Study on the Potential Biomarkers of Maternal Urine Metabolomics for Fetus with Congenital Heart Diseases Based on Modified Gas Chromatograph-Mass Spectrometer

Background There has been significant research on the genetic and environmental factors of congenital heart defects (CHDs), but few causes of teratogenicity, especially teratogenic mechanisms, can be clearly identified. Metabolomics has a potential advantage in researching the relationship between external factors and CHD. Objective To find and identify the urinary potential biomarkers of pregnancy (including in the second and third trimesters) for fetuses with CHD based on modified gas chromatograph-mass spectrometer (GC-MS), which could reveal the possibility of high-risk factors for CHD and lay the foundation for early intervention, treatment, and prevention. Methods Using a case-control design, we measured the urinary potential biomarkers of maternal urine metabolomics based on GC-MS in a population-based sample of women whose infants were diagnosed with CHD (70 case subjects) or were healthy (70 control subjects). SIMCA-P 13.0 software, principal component analysis (PCA), orthogonal partial least squares-discriminant analysis (OPLS-DA), Wilcoxon-Mann-Whitney test, and logistics regression were used to find significant potential biomarkers. Result The 3D score graph of the OPLS-DA showed that the CHD and control groups were fully separated. The fitting parameters were R2x=0.78 and R2y=0.69, and the forecast rate was Q2=0.61, indicating a high forecast ability. According to the ranking of VIPs from the OPLS-DA models, we found 34 potential metabolic markers with a VIP > 1, and after two pairwise rank sum tests, we found 20 significant potential biomarkers, which were further used in multifactor logistic regressions. Significant substances, including 4-hydroxybenzeneacetic acid (OR=4.74, 95% CI: 1.06-21.06), 5-trimethylsilyloxy-n-valeric acid (OR=15.78, 95% CI: 2.33-106.67), propanedioic acid (OR=5.37, 95% CI: 1.87-15.45), hydracrylic acid (OR=6.23, 95% CI: 1.07-36.21), and uric acid (OR=5.23, 95% CI: 1.23-22.32), were associated with CHD. Conclusion The major potential biomarkers in maternal urine associated with CHD were 4-hydroxybenzeneacetic acid, 5-trimethylsilyloxy-n-valeric acid, propanedioic acid, hydracrylic acid, and uric acid, respectively. These results indicated that the short chain fatty acids (SCFAs) and aromatic amino acid metabolism may be relevant with CHD.


Introduction
Congenital heart defects (CHDs) are the most common types of birth defects (BDs), accounting for nearly one-third of all BDs [1,2]. Only 15% of heart defects can be attributed to a known cause [3]. The causes of CHD are complicated and include environmental factors, genetic susceptibilities, and their interactions [4]. However, given the complicated causes of CHDs, there is little research on how multiple external influencing factors could induce CHDs in fetuses or how the pathogenesis of CHD correlates with external factors. Metabolomics is a new tool that could enlarge the effects of external environmental factors [5] and, thereby, indirectly deduce the mechanisms of how external environmental factors induce CHDs. Compared to genomics, which requires the disclosure of personal information and is expensive, metabolomics is lower cost and noninvasive [6]. Metabolomics have been used in the field of obstetrics and gynecology which may allow physicians to identify possible associated etiologies that affect the mother during pregnancy and lead to associated complications affecting the offspring [5,7]. Therefore, metabolomics research is an important approach for studying some mechanisms of CHD and is a future developmental trend [8].
Given the nature of the sample, the purpose of the experiment, and the chemical and physical properties of the metabolites, different analysis and identification techniques were selected.
Presently, the most commonly used separation methods include nuclear magnetic resonance (NMR), liquid chromatography-mass spectrometry (LC-MS), gas chromatography-mass spectrometry (GC-MS), etc. [9,10]. GC-MS is a mass spectrometry technology used for the separation and identification of volatile substances [11]. It is complemented by a relatively complete National Institute of Standards and Technology (NIST) database and high throughput, sensitivity, selectivity, and resolution capabilities [12]. Relative to LC-MS, GC-MS has certain advantages in the identification of metabolites [13]. The difficulties and key of GC-MS are sample pretreatment and processing. We selfdeveloped metabolomics methods for sample pretreatment (patent number: 2012101142462) on the basis of traditional GC-MS. This method optimizes the impurity removal and chemical derivatization process, improves the derivatization efficiency, and protects components that are not stable and easily volatilize at high temperatures. The GC-MS method can be used to analyze amino acids, organic acids, fatty acids, amides, nucleosides, sugars, and other small molecules at the same time, including in particular, semivolatile metabolites of SCFAs, such as hydracrylic acid, propanedioic acid, and so forth [14].
To our knowledge, until now, there has been only one study that included a comprehensive metabolomic analysis for the prediction of fetuses with CHD in the first trimester [15]. It is worth mentioning that it was reported that the maternal gut microbiota in late pregnancy and early lactation was stable [16]. However, it is difficult to find CHD cases in the first trimester using four-dimensional echocardiography which is the only diagnostic method available, excluding some very complicated cases. In the findings, we present the search for potential metabolic biomarkers during midterm pregnancy (including the second and third trimesters) based on advanced GC-MS and discuss the association between CHDs and the mother's metabolism in vivo.

Patient Recruitment and Sample
Collection. This study was approved by the Ethics Committee of Maternal and Child Care, Hospital of Hunan Province. Each recruited pregnant woman signed a written consent.
The pregnant women included in the study had no history of antibiotic use in the last three months and no history of abortion, diabetes, gestational diabetes, digestive system diseases, liver and renal disease, thyroid disease, chromosomal disorders, family history, or picky eating. The research period was the optimal time for screening for CHD, during gestational weeks 22-24 and 30-32. The patients were diagnosed with a CHD fetus by highly professional doctors with fourdimensional echocardiography. The controls were diagnosed as normal, and health was confirmed after delivery at the Maternal and Child Care Hospital of Hunan Province. Urine samples were collected in clean containers from both the last urination of the night and the first urination of the morning, and the exact time of collection was recorded. The urine from the night needed to be frozen in the refrigerator. The next morning, the first urination was sent to the test center within two hours, along with the urine from the previous night. The two urine samples from the same individual were combined into one sample. Each urine sample was enriched with filter paper and stored at -80 ∘ C immediately after collection until analysis.

Sample Pretreatment.
The samples were pretreated as described in the research of Xiong X Y [17]. Urine samples were briefly thawed at room temperature and centrifuged (at 3000 g) for 10 min, and 100 L urine samples (containing 2.5 mmol/L creatinine) were first treated with 30.0 L urease (1.2 U/ L) at 37 ∘ C for 30 min to remove interfering urea and then spiked with heptadecanoic acid (0.5 mg/mL, 50 L). Proteins, including the added urease, were precipitated with 800 L ethanol and removed after 15 min of centrifugation (12000 r/min). Forty microliters of 0.04 mol/L hydroxylamine hydrochloride and 60 L of 0.05 mol/L Ba(OH) 2 were added to the deproteinized solution, and the mixture was then incubated at room temperature for 15-20 min. Subsequently, the mixture solution was evaporated to dryness, and the compounds in the dried residue were converted into TMS derivatives with 100 L of MSTFA/TMCS (100:1) and analyzed by GC-MS (ZL 201210114246.2). Quality control (QC) samples, used for monitoring the repeatability and stability of the analytical method, were prepared by mixing all the urine samples and tested 5 times repeatedly.

GC-MS
Analysis. An Agilent GC-MS system (7890-5975C) was used to analyze the derivative samples. A sample (1 L) was injected with a split ratio of 20:1 into the GC and then separated with a fused silica HP-5 capillary column (60 m, 0.25 mm inside diameter, 0.25 m thickness of the inner liquid in the column). The injector temperature was set to 250 ∘ C. High purity nitrogen was used as a carrier gas at a constant flow rate of 1.5mL/min. The column temperature was initially kept at 60 ∘ C for 2 min, ramped up to 300 ∘ C at 5.5 ∘ C/min, and then held for 10 min. The interphase and ion source temperatures were 280 ∘ C and 230 ∘ C, respectively. Ions were generated by electronic impact (EI) at 70 eV. Masses were acquired from 50 to 1000 m/z. The drift of the retention time of each peak was minimized by locking heptadecanoic acid at 36.00 min using retention time locking technology (RTL, Agilent). GC-MS ChemStation software was used for auto-acquisition of GC total ion chromatograms (TICs) and fragmentation patterns. Each compound had a fragmentation pattern composed of a series of split molecular ions; the mass charge ratios and their abundance were compared with a standard mass chromatogram in the NIST (National Institute of Standards and Technology) mass spectra library by the ChemStation software. Peaks with a similarity index greater than 70% were assigned compound names, and compounds with a matching rate < 70% were excluded.

Statistical Analysis.
After GC-MS analysis, each sample was represented by a GC-MS TIC, and the ion peak areas of compounds were integrated. The results were expressed as ratios to the urinary creatinine concentration. Statistical analysis was used for the comparison of the metabolite levels to determine significant differences between the CHD and control groups.
Data were exported from the GC-MS ChemStation software, including all information regarding the TICs after the above process, into SIMCA-P 13.0. PCA and OPLS-DA were used to differentiate the samples. PCA scores are unsupervised as a basis for OPLS-DA which is supervised. R 2 and Q 2 provide a measure of model reliability of PCA and OPLS-DA [18]. The OPLS-DA model was assessed by the intercepts of R 2 and Q 2 in Permutation test to avoid the overfitting. The criteria for model validity include two conditions: (1) all Q 2 values on the permuted data set to the left are lower than the Q 2 values on the actual data set to the right and (2) the regression line (line joining the point of observed Q 2 to the centroid of a cluster of permuted Q 2 values) has a negative value of intercept on the y-axis [19]. Through the variable importance for the projection (VIP) of the OPLS-DA model, the potential biomarker was found with VIP >1. A larger VIP indicated a more important biomarker. The Wilcoxon-Mann-Whitney test was used to verify the significant difference in potential biomarkers between the CHD and control groups. Based on the Wilcoxon-Mann-Whitney test, logistic was used to screen the biomarkers with in =0.05 and out =0.1. The two-class criteria of all independent variables in the logistic model were confirmed through the ROC curve. P values of <0.05 were considered to be statistically significant.

Screening Potential Urinary
Biomarkers. Before analysis, some data treatments were made. Firstly, after NIST library searching and credibility screening, there were 220 metabolites. Secondly, delete the metabolites which are not present in the 80% of the population. Finally, 130 metabolites were retained which mainly were organic acid (49%), saccharides (26%), amino acid (13%), and so on ( Figure 1). Thirdly, these metabolites were parametric but not normal. After some transformations including log transformation, these data still not to be normal.
The separation between groups ( Figure 2) can be observed on the PCA 3D score, and the model fitting parameter was R 2 x=0.57 with a forecast rate of Q 2 =0.46. Although some samples were overlapped in the PCA score plot, samples showed separation pattern between groups, indicating that metabolite profiles of two groups were different.
OPLS-DA modeling yielded two principal components, and 3D score maps (Figure 3) showed the CHD and control groups were fully separated. Through a permutation test repeated 200 times, Q 2 and R 2 values were found to be higher than their original values, proving suitability and validity of this model (Figure 4). The fitting parameters were R 2 x=0.78 and R 2 y=0.69, and the forecast rate was Q 2 =0.61, indicating  a high forecast ability. According to the ranking of VIPs from the OPLS-DA models, we found 34 potential metabolic biomarkers with VIP >1 (Table 3), which mainly were organic acid (59%), saccharides (20%), amino acid (9%), and so on ( Figure 5). After the Wilcoxon-Mann-Whitney test, we found 20 significant potential biomarkers, which are listed in Table 3 (P<0.05). We also listed their diagnostic predictive value through the ROC curve, which is the two-class standard. If the value of the potential biomarker was more than the predictive value, it was judged to be "1." If it was less than the predictive value, it was judged to be "0" in the logistic model. The results of the logistic regression showed that the significant substances, including 4-hydroxybenzeneacetic acid (OR=4.74, 95% CI: 1.06-21.06), 5-trimethylsilyloxyn-valeric acid (OR=15.78, 95% CI: 2.33-106.67), propanedioic acid (OR=5.37, 95% CI: 1.87-15.45), hydracrylic acid (OR=6.23, 95% CI: 1.07-36.21), and uric acid (OR=5.23, 95% CI: 1.23-22.32), were associated with CHD ( Table 4). The five compounds were further qualitatively analyzed by comparing their retention time and fragment-ion of the chromatograms between the urine sample and the corresponding standards (Figures 6-11; see Supplementary Material). All of the data support the identification of the five compounds as 4-hydroxybenzeneacetic acid, 5-trimethylsilyloxy-n-valeric acid, propanedioic acid, hydracrylic acid, and uric acid, respectively.

Discussion
Our study showed that the separation between groups could be observed using PCA and OPLS-DA modeling. We found 34 potential metabolic markers with a VIP >1, and after two pairwise rank sum tests, we found 20 significant potential biomarkers. The significant substances were 4-hydroxybenzeneacetic acid, 5-trimethylsilyloxy-n-valeric acid, hydracrylic acid, propanedioic acid, and uric acid after further multifactor logistic regression. This result is similar to the research of Bahado-Singh RO, which concluded that abnormal lipid metabolism was a significant feature of CHD pregnancy [15]. The main meaning and the relative mechanism of these potential makers as follows.
In our study, 4-hydroxybenzeneacetic acid was higher in the CHD group than in the control group (OR=4.74,95% CI: 1.06-21.06). As reported, tyrosine is converted from phenylalanine under the action of intestinal bacteria [20], and tyrosine decarboxylation is further converted into 4hydroxybenzeneacetic acid [21]. It can be seen that the increase in 4-hydroxybenzeneacetic acid in the case group may be due to a disorder of microecological bacteria in the pregnant mother of the CHD fetus or the increasing of tyrosine.
5-Trimethylsilyloxy-n-valeric acid from enterogenous SCFAs was higher in the CHD group than in the control group (OR=15.78, 95% CI: 2.33-106.67). SCFAs mainly include acetic acid, propionic acid, butyric acid, pentanoic acid, and so on [22]. After traveling to the liver through the bloodstream, SCFAs participate in biotransformation reactions, such as oxidation, reduction, hydrolysis, and conjugation [23]. Thus, all kinds of SCFAs produce a large amount of intermediate or terminal metabolites, such as propionic acid, which produces acrylic acid, beta-hydroxypropionic acid, malonic acid, propionylglycine, etc. [23,24]. Valeric acid produces 5-trimethylsilyloxy-n-valeric acid, valerylglycine, and so on [23]. Without special dietary conditions, 5trimethylsilyloxy-n-valeric acid may be derived primarily from valeric acid via omega-oxidative metabolism (valeric acid is similar to the omega-oxidative metabolic pathway of valproate (VPA) [25]. The excessive concentration of 5trimethylsilyloxy-n-valeric acid in the urine of the case group    may indicate valproate metabolic disorders in the pregnant mother of the CHD fetus. When there are too many SCFAs, such as valeric acid, CHD can be induced by exerting physiological effects similar to VPA, such as deacetylase inhibitors [26,27], secondary demethylation enhancement [28], and folic acid antagonists [29].
Hydracrylic acid and propionic acid levels were higher in the CHD group than in the control group (OR=6.23, 95% CI: 1.07-36.21), (OR=5.37, 95% CI: 1.87-15.45). Hydracrylic acid and malonic acid are mainly derived from intestinal-derived propionic acid [30]. Propionic acid enters the epithelial cells of the intestine to provide energy, dehydrogenation into acrylic acid, addition of water to produce Hydracrylic acid, and then dehydrogenation, thiolysis reactions into malonate [31]. In the study, the increase in hydracrylic acid and propionic acid in the maternal urine of mothers with CHD fetuses may be caused by too much propionic acid in the intestinal microecology. Propionic acid, one of the most important components of SCFAs, is the major product of the fermentation of undigested carbohydrates, such as oligosaccharides, nonstarch polysaccharides, and resistant starches by Clostridium species IV and XIV [32]. When there are too much propionic acid, CHD can be induced by exerting physiological effects similar to VPA. In additional, this suggests that the CHD group may be associated with an excess of Clostridium genera IV and XIV in the thick-walled bacteria in the intestinal tract.
Uric acid was higher in the CHD case group than in the normal control group (OR=5.23, 95% CI: 1. 23-22.32). Uric acid is the final product of purine metabolism. Dysfunction of   purine metabolism, abnormal energy metabolism, and renal excretion of uric acid can cause either an increase (hyperuricemia) or decrease (hypouricemia) in plasma uric acid concentration [33]. More studies have shown that metabolic diseases, such as obesity, can affect the risk of hyperuricemia. Changes in intestinal flora may affect purine metabolism by altering certain body characteristics [33]. Animal experiments have also shown that probiotics can reduce the levels of serum uric acid in mice [34], mainly due to the reduction of serum endotoxin and the increase of xanthine oxidase activity caused by probiotic treatment, which in turn reduces uric acid production and decreases serum uric acid levels [35]. It can be speculated that the occurrence of CHD may be related to a disorder in purine metabolism caused by the imbalance of intestinal flora in the mother (the specific mechanism is not yet clear).
In our study, the pretreatment technology of GC-MS is simpler and produces a more accurate result, especially in semivolatile metabolites from SCFAs, such as hydracrylic acid, propanedioic acid, etc. In addition, several statistical methods were combined to seek out potential biomarkers in the maternal urine. This result could make up for the lack of ultrasounds. The estimated current prenatal ultrasound screening in developed countries detects only 30-50% of cases with CHD [36]. Many pregnant women do not receive an ultrasound in time, and the results are affected by pregnancy weeks, doctor's experience, apparatus, and obesity. Of course, our study has some limitations at the same time. First, for the case-control study, we could not make an inference of causal association between the significant biomarkers and CHD but it gave us some clue about the mechanism of CHD. Second, the samples size is limited; we could not make subgroup comparison. We will collect more samples and make subgroup comparison in the next step. Thirdly, although the model of OPLS-DA was validated, there may be some overoptimistic view of the separation between groups. So, the study was a discovery-phase worthy of further study.

Conclusions
In sum, the major potential biomarkers in maternal urine associated with CHD were 4-hydroxybenzeneacetic acid, 5trimethylsilyloxy-n-valeric acid, hydracrylic acid, propanedioic acid, and uric acid. The occurrence of CHD was modestly associated with the preternatural gut microbiota of pregnant mothers and the metabolism of SCFAs. These are new directions for CHD research and more energy for research is needed.

Data Availability
If someone needs the data, s/he could contact Donghua Xie by email "2210685350@qq.com". Figure 6: chromatograms of the five compounds in standard solution (black) and urine sample (red). Figure 7: electron-impact mass spectrum of hydracrylic acid TMS derivative. Figure 8: electron-impact mass spectrum of propanedioic acid TMS derivative. Figure 9: electron-impact mass spectrum of 5-trimethylsilyloxy-n-valeric acid TMS derivative. Figure 10: electron-impact mass spectrum of 4-hydroxybenzeneacetic acid TMS derivative. Figure 11:

Supplementary Materials
electron-impact mass spectrum of uric acid TMS derivative. (Supplementary Materials)