Molecular Characterization and Elucidation of Pathways to Identify Novel Therapeutic Targets in Pulmonary Arterial Hypertension

Background: Pulmonary arterial hypertension (PAH) is a life-threatening chronic cardiopulmonary disease. However, few studies have derived biomarkers from separate gene expression profiles in PAH. This study explored two microarray datasets by an integrative analysis to estimate the molecular signatures in PAH. Methods: Two microarray datasets (GSE53408 and GSE113439) were used to compare lung tissue transcriptomes of patients with PAH and controls and to estimate differentially expressed genes (DEGs). Based on the common DEGs of the two datasets, gene and protein overrepresentation analyses, protein–protein interactions (PPIs), DEG–transcription factor (TF) interactions, DEG–microRNA (miRNA) interactions, drug–target protein interactions, and protein subcellular localizations were analyzed. Results: We obtained 38 common DEGs for these two datasets. Integration of the transcriptome datasets with biomolecular interactions revealed hub genes (HSP90AA1, ANGPT2, HSPD1, HSPH1, TTN, SPP1, SMC4, EEA1, and DKC1), TFs (FOXC1, FOXL1, GATA2, YY1, and SRF), and miRNAs (hsa-mir-17-5p, hsa-mir-26b-5p, hsa-mir-122-5p, hsa-mir-20a-5p, and hsa-mir-106b-5p). Protein–drug interactions indicated that two compounds, namely, nedocromil and SNX-5422, interact with PAH candidate biomolecules. Moreover, the molecular signatures were mostly localized in the extracellular and nuclear areas. Conclusions: Several lung tissue-derived molecular signatures highlighted in this study might serve as novel evidence for elucidating the essential mechanisms of PAH. The potential drugs associated with these molecules could thus contribute to the development of diagnostic and therapeutic strategies to ameliorate PAH.


INTRODUCTION
Pulmonary arterial hypertension (PAH) is a rare vascular disease with an annual incidence of two cases per million (Peacock et al., 2007). PAH is defined by a mean pulmonary arterial pressure (mPAP) > 20 mmHg at rest, a pulmonary artery wedge pressure (PAWP) ≤ 15 mmHg, and a pulmonary vascular resistance (PVR) ≥ 3.0 Wood units (Gouyou et al., 2021). As an obliterative vasculopathy, PAH is characterized by high pulmonary arterial pressure, resulting in right ventricular failure and even death (Boucly et al., 2017; Tang et al., 2018). Over the past decades, the development of effective medical treatments and the application of combined therapy have significantly improved the prognosis of patients with PAH (Sitbon et al., 2014, 2016; Galie et al., 2015). Although current diagnostic and therapeutic strategies have ameliorated the abnormal hemodynamics and severe pulmonary vascular remodeling of PAH and alleviated the clinical symptoms, a number of patients still suffer from persistent symptoms and even right heart failure (Van De Veerdonk et al., 2011). Thus, investigating PAH biomarkers is essential not just for a better understanding of PAH development but also as a key step toward establishing promising novel treatment strategies (Kerkhof et al., 2019).
The correlation between PAH and poor cardiorespiratory outcomes justifies the necessity of early diagnosis and treatment of this disease (Farhadi et al., 2017). With the wide application of cardiac catheterization and echocardiography in clinical practice, evaluations of pulmonary artery pressure and right heart hemodynamics, as well as high-resolution image data, are no longer difficult to access, which significantly improves the differential diagnosis and assessment of patients with PAH (Claessen and La Gerche, 2017; Farhadi et al., 2017). Indeed, there has been tremendous progress in understanding the essential pathophysiology of PAH, including prognostic biomarkers and many promising therapy options. However, few studies have provided PAH-associated gene expression profiles, and a number of studies imply that investigating this type of clinical molecular biomarker and elucidating the fundamental mechanisms involved in PAH may be of considerable significance. This may be useful in developing a new scientifically based diagnostic modality and in performing targeted therapy in patients with PAH more precisely (Farhadi et al., 2017; Sullivan and Kass, 2019; Ma et al., 2020; Maremanda et al., 2020).
In recent years, bioinformatics analysis has been widely employed to investigate microarray data, estimate differentially expressed genes (DEGs), and carry out various downstream analyses (Kanwar, 2020). However, the limited sample size or high false-positive rate of a single microarray analysis might hinder the derivation of reliable conclusions. The present study retrieved two separate microarray datasets from the Gene Expression Omnibus (GEO) for further analyses. Common DEGs between patients with PAH and controls in these two datasets were screened to identify significant biomarkers. Potential differentially expressed genes and hub genes participating in PAH were estimated via Gene Ontology (GO) annotations, Kyoto Encyclopedia of Genes and Genomes (KEGG) pathway enrichment analyses, protein-protein interaction (PPI) network investigations, and protein subcellular localization. Eventually, a total of 38 DEGs and 9 hub genes were chosen as prospective diagnostic candidates and targeted biomarkers for PAH.

Identification of the Differentially Expressed Genes of Microarray Datasets From Lung Tissue of Patients With PAH
The high-throughput datasets of microarray gene expression in PAH were retrieved from the NCBI-GEO database (Barrett et al., 2013). The datasets were obtained from two separate studies that compared human lung tissues from normal individuals and patients with PAH on Affymetrix microarrays, and they have been deposited in the NCBI-GEO database under the accession numbers GSE53408 and GSE113439. GSE53408 contains samples of 16 individuals, including 11 controls and five patients with idiopathic pulmonary arterial hypertension (IPAH), taken from a study originally published by Zhao et al. (2014). GSE113439 encompasses 26 lung tissue samples (11 controls and 15 cases). The PAH group contains six patients with IPAH, four patients with PAH secondary to connective tissue disease (CTD), four patients with PAH secondary to congenital heart disease (CHD), and one patient with chronic thromboembolic pulmonary hypertension (CTEPH). This dataset was originally analyzed by Mura and colleagues to explore DEGs in PAH compared to the controls (Mura et al., 2019). To verify the results, GSE15197 (13 control and 18 PAH lung tissues) and GSE117261 (25 control and 58 PAH lung tissues) were also used in our research. First, we normalized the gene expression datasets by log2 transformation. Thereafter, these datasets were analyzed in GEO2R of NCBI, using the Limma package for hypothesis testing, and the false discovery rate was controlled by the Benjamini and Hochberg correction. A p < 0.01 was used as the cutoff criterion to select the significant DEGs. Jvenn was used to identify the common DEGs of the two datasets and to draw the Venn plot (Bardou et al., 2014).
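The screening step described above can be illustrated with a minimal sketch. It is not the GEO2R or Limma implementation: it assumes each dataset has already been reduced to a raw p-value per gene, it applies the 0.01 cutoff to Benjamini and Hochberg adjusted p-values (an assumption, since the text does not say whether the cutoff refers to raw or adjusted values), and all p-values and the names GENE_X and GENE_Y are toy placeholders rather than results from the study.

```python
def benjamini_hochberg(pvalues):
    """Return Benjamini-Hochberg adjusted p-values in the original order."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down so adjusted values stay monotone.
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, pvalues[idx] * m / rank)
        adjusted[idx] = running_min
    return adjusted


def significant_genes(results, cutoff=0.01):
    """results maps gene -> raw p-value; keep genes with adjusted p < cutoff."""
    genes = list(results)
    adjusted = benjamini_hochberg([results[g] for g in genes])
    return {g for g, q in zip(genes, adjusted) if q < cutoff}


# Toy p-values for illustration only (not taken from GSE53408 or GSE113439).
dataset_a = {"HSP90AA1": 0.0001, "ANGPT2": 0.0004, "TTN": 0.003, "GENE_X": 0.2}
dataset_b = {"HSP90AA1": 0.0002, "ANGPT2": 0.5, "TTN": 0.001, "GENE_Y": 0.0005}

# Common DEGs are the intersection of the per-dataset significant sets,
# which is what a two-set Venn diagram displays in its overlap.
common_degs = significant_genes(dataset_a) & significant_genes(dataset_b)
print(sorted(common_degs))
```

Intersecting the two significant sets only after each dataset has been corrected separately mirrors the order of steps in the text, where the Venn comparison follows the per-dataset testing.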

Gene Ontology and Gene Pathway Enrichment Analysis
NetworkAnalyst was used to visualize and analyze the topological network (Xia et al., 2015). The hub proteins were selected based on a topological index, i.e., degree > 15.
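The degree criterion can be sketched as follows. This is only an illustration of the threshold, not the NetworkAnalyst procedure: the edge list, the node names HUB_A, HUB_B, P0... and Q0..., and the helper functions are hypothetical.

```python
from collections import defaultdict


def node_degrees(edges):
    """Count distinct interaction partners per node in an undirected network."""
    neighbors = defaultdict(set)
    for a, b in edges:
        if a == b:
            continue  # ignore self-interactions
        neighbors[a].add(b)
        neighbors[b].add(a)
    return {node: len(partners) for node, partners in neighbors.items()}


def select_hubs(edges, min_degree=15):
    """Return nodes whose degree is strictly greater than min_degree."""
    degrees = node_degrees(edges)
    hubs = [node for node, d in degrees.items() if d > min_degree]
    return sorted(hubs, key=lambda node: (-degrees[node], node))


# Hypothetical toy network: HUB_A has 16 partners, HUB_B has 15.
toy_edges = [("HUB_A", f"P{i}") for i in range(16)]
toy_edges += [("HUB_B", f"Q{i}") for i in range(15)]

print(select_hubs(toy_edges))
```

Because the criterion is a strict inequality, a node with exactly 15 partners is not selected in this sketch.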

Differentially Expressed Genes-Transcription Factor (TF) Interaction Analysis
To investigate the TFs that regulate the DEGs at the transcriptional level, we identified the interactions between TFs and target genes with the JASPAR database and estimated the topological parameters using NetworkAnalyst (Xia et al., 2015; Khan et al., 2018).

Differentially Expressed Genes-miRNA Interaction Analysis
The regulatory miRNAs that control DEGs at the posttranscriptional level were evaluated by identifying the interactions between miRNAs and target genes with TarBase and miRTarBase and by examining the topological parameters using NetworkAnalyst (Sethupathy et al., 2006; Hsu et al., 2011; Xia et al., 2015).

Protein-Drug Interaction Analysis
The protein-drug interactions were estimated with the DrugBank database (version 5.1.8) via NetworkAnalyst, which could highlight potential drugs for the treatment of PAH (Wishart et al., 2018). The results suggested the interaction of HSP90AA1 (a heat shock protein) with a few drugs. Moreover, molecular docking analysis between HSP90AA1 and the drugs was performed on the three-dimensional crystal structure with the protein-ligand docking server SwissDock (Grosdidier et al., 2011; Biasini et al., 2014). The protein-drug interaction analysis also indicated the interaction of HSP90AA1 with nedocromil and SNX-5422 (Wishart et al., 2018).

Investigation of Protein Subcellular Localization
WoLF PSORT helped to determine the subcellular localization of a set of proteins encoded by DEGs (Horton et al., 2007). The subcellular localization of proteins was predicted based on their amino acid sequences. This method made predictions in the light of the known sorting signals, amino acid content, and functional motifs, collected from UniProt (Universal Protein) and GO database.
Gene-set enrichment analysis revealed the abundance of DEGs in biological processes, molecular functions, and cellular components. Table 1 summarizes the specific information. The enrichment analysis of molecular pathways indicated that the pathways involved in non-homologous end-joining, protein processing in the endoplasmic reticulum, tuberculosis, and PI3K-Akt signaling were altered (Table 2).
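Overrepresentation of a DEG list in a GO term or pathway is commonly scored with a one-sided hypergeometric test. The sketch below shows that calculation under stated assumptions: the text does not specify which statistic the enrichment tool used, and the background size, term size, and overlap in the usage lines are hypothetical numbers; only the DEG count of 38 comes from the study.

```python
from math import comb


def overrepresentation_pvalue(n_background, n_in_term, n_degs, n_overlap):
    """One-sided hypergeometric p-value: P(overlap >= n_overlap).

    n_background: genes in the background set
    n_in_term:    background genes annotated to the term or pathway
    n_degs:       genes in the DEG list
    n_overlap:    DEGs annotated to the term or pathway
    """
    total = comb(n_background, n_degs)
    upper = min(n_in_term, n_degs)
    tail = sum(
        comb(n_in_term, k) * comb(n_background - n_in_term, n_degs - k)
        for k in range(n_overlap, upper + 1)
    )
    return tail / total


# Hypothetical example: 38 DEGs, of which 5 fall in a 300-gene pathway,
# against a 20000-gene background.
p_value = overrepresentation_pvalue(20000, 300, 38, 5)
print(f"{p_value:.3g}")
```

In practice the resulting p-values would be corrected across all tested terms, for example with the same Benjamini and Hochberg procedure used for the DEGs.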

Proteomic Signatures in PAH
To identify central proteins, the protein-protein network of the common DEGs was constructed to offer deeper knowledge of the biological characterization of targeted proteins encoded by DEGs and of the estimation of drug targets (Figure 2). The hub proteins responsible for transmitting signal stimuli to other proteins in the networks were obtained from the topological examination of the PPI networks. Based on the topological metric, the present method estimated the hub proteins, which could serve as biomarkers and drug targets in PAH. Table 3 highlights nine central hub proteins (HSP90AA1, ANGPT2, HSPD1, HSPH1, TTN, SPP1, SMC4, EEA1, and DKC1). Two other independent studies also showed that these hub proteins play an important role in PAH (Supplementary Table 1).
The top 10 abundant Gene Ontology (GO) terms were tabulated.
Frontiers in Physiology | www.frontiersin.org

Protein-Drug Interactions
The protein-drug interaction network reported the relation of the HSP90AA1 protein with the drugs nedocromil and SNX-5422 (Figure 5, Table 5). Nedocromil is a pyranoquinolone derivative that can suppress the activation of inflammatory cells, such as eosinophils, neutrophils, macrophages, mast cells, monocytes, and platelets. SNX-5422 is a new synthetic Hsp90 inhibitor and an oral agent with strong efficacy and tolerability, and it has been considered a breakthrough treatment with applicability in a wide range of cancers. According to the statistical significance threshold of the protein-drug interactions and the potential influence of the targeted protein on PAH pathogenesis, several protein-drug interactions were screened, and molecular docking simulations were carried out to determine the binding affinities of the drugs with the targeted proteins (Table 5). The resultant energetic states and docking scores supported the thermodynamic feasibility of all these interactions.
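As a rough aid for reading docking energies, an estimated binding free energy can be converted into an approximate dissociation constant through the standard relation ΔG = RT ln(Kd), with a 1 M standard state. The sketch below assumes that the docking score is reported as an estimated ΔG in kcal/mol; the energy values used are hypothetical and are not the docking results of this study.

```python
import math

GAS_CONSTANT_KCAL = 1.987204e-3  # kcal / (mol * K)


def dissociation_constant(delta_g_kcal_per_mol, temperature_k=298.15):
    """Approximate Kd (in mol/L) from a binding free energy via dG = RT ln(Kd)."""
    return math.exp(delta_g_kcal_per_mol / (GAS_CONSTANT_KCAL * temperature_k))


# Hypothetical binding energies in kcal/mol, for illustration only.
for delta_g in (-6.0, -8.0):
    kd = dissociation_constant(delta_g)
    print(f"dG = {delta_g} kcal/mol -> Kd ~ {kd:.2e} M")
```

A more negative ΔG corresponds to a smaller Kd, that is, tighter predicted binding. Docking energies are estimates, so values derived this way are best treated as a relative ranking rather than as measured affinities.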

Protein Subcellular Localization
The WoLF PSORT software package was used to predict the subcellular localization of the proteins encoded by the 38 DEGs in PAH.
FIGURE 2 | Protein-protein interaction (PPI) network for the DEGs in pulmonary arterial hypertension (PAH). The nodes indicate the DEGs, while the edges indicate the interactions between different proteins. A medium confidence score was used to construct the PPI networks.
TABLE 3 | Summary of hub proteins identified from protein-protein interactions analysis of encoded differentially expressed genes in PAH disease.

HSP90AA1 | Heat shock protein | Influenced the progression of pulmonary disease (Deng et al., 2021).
ANGPT2 | Protein marker and mediator | Participated in the direct regulation of inflammation-related signal pathways in PAH (Zhong et al., 2018).
HSPD1 | Heat shock protein | Involved in pulmonary disease as differential expression gene (Maremanda et al., 2020).
HSPH1 | Heat shock protein | Interacted with STAT3 and enhanced its phosphorylation in acute lung injury.
TTN | TITIN protein | Served as a pathogenic gene associated with total anomalous pulmonary venous connection (Shi et al., 2018).
SMC4 | A core subunit of condensin complexes | Enriched in facilitating mitotic cell cycle process in PAH.
DKC1 | Dyskeratosis congenita 1 | Encoded the protein dyskerin and maintained telomeres in pulmonary disease (Khincha et al., 2014).

The proportions of these proteins in distinct subcellular compartments were computed, which indicated the localization of 89.5% of the proteins in extracellular areas, whereas the remaining 10.5% existed in the nuclear area. Notably, all hub proteins manifested the extracellular localization, including HSP90AA1, ANGPT2, HSPD1, HSPH1, TTN, SPP1, SMC4, EEA1, and DKC1.
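The reported proportions are consistent with a simple count over the 38 proteins. The sketch below assumes 34 extracellular and 4 nuclear predictions; these counts are inferred from the percentages and are not stated in the text, and the label strings are placeholders rather than WoLF PSORT output.

```python
from collections import Counter


def localization_percentages(labels):
    """Percentage of proteins per predicted compartment, to one decimal place."""
    counts = Counter(labels)
    total = sum(counts.values())
    return {site: round(100 * n / total, 1) for site, n in counts.items()}


# Assumed counts (34 + 4 = 38), inferred from the reported percentages.
predicted_sites = ["extracellular"] * 34 + ["nuclear"] * 4
print(localization_percentages(predicted_sites))
```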

DISCUSSION
The diagnosis of PAH currently relies on right heart catheterization, and specific biomarkers for PAH diagnosis remain an unmet need. The present research focused on the comprehensive analysis of the gene expression patterns in lung tissues of patients with PAH and the exploration of robust candidate molecular targets, which may act as potential biomarkers of PAH. Our study may, therefore, provide relevant information regarding the progression of PAH.
hsa-mir-106b-5p | MicroRNA 106 | Suppressed the migration of pulmonary artery smooth muscle cells in PAH.
With wide application in biomedical investigation, microarray datasets have become a major resource for elucidating biomarker candidates (Budinska et al., 2013; Marisa et al., 2013). The extensive contribution of microarray gene expression profiling has also been reflected in the investigation of DEGs in various diseases, including Alzheimer's disease and breast cancer (Nami and Wang, 2018; Rahman et al., 2020). Significant alterations in the profiles of 38 genes in two separate transcriptomic datasets were observed in the gene expression patterns of the lung tissues of patients with PAH. The enrichment analysis reported PAH-related molecular pathways, including protein processing in the endoplasmic reticulum and the PI3K-Akt signaling pathway (Table 2) (Li et al., 2016). Interactive analysis of PPI networks facilitates the identification of the proteins that play a key role in the pathophysiology of various diseases (Goh et al., 2007; Yang et al., 2009). The PPI analysis offers deeper knowledge of the biological characterization of targeted proteins encoded by DEGs and of the estimation of drug targets (Taz et al., 2021; Xu et al., 2021; Zhang T. et al., 2021). Based on the DEGs, the PPI networks detected a set of key hub proteins (Table 3), which may reflect the onset and development of vascular diseases. These hub proteins were involved in a number of biological and pathologic processes. As a hub protein, HSP90AA1 was differentially expressed in the lung tissues of patients with PAH and was involved in the inflammatory responses of the airway and the smooth muscle function of the bronchi, which suggested that HSP90AA1 may be a potential prognostic biomarker and drug target for the treatment of PAH (Deng et al., 2021). Consistent with our findings, ANGPT2 was upregulated in several inflammatory diseases and took part in the direct regulation of inflammation-related signal pathways in PAH (Zhong et al., 2018).
Differential expression of HSPD1, which is involved in mitochondrial biogenesis, was noted in patients suffering from pulmonary diseases compared with normal controls (Maremanda et al., 2020). Interaction of HSPH1 with STAT3 may enhance its phosphorylation, which exacerbated pulmonary inflammation in acute lung injury, though there is no such report relating to PAH. Furthermore, TTN, a pathogenic gene in total anomalous pulmonary venous connection, appeared to play a critical role in the genetic mechanism (Shi et al., 2018). It has been suggested that TTN isoform composition was unchanged in PAH cardiomyocytes but that TTN phosphorylation was significantly decreased in patients with PAH (Rain et al., 2013). Pulmonary myofibroblasts can be activated by highly proliferative SPP1 macrophages, which in turn contribute vitally to lung fibrosis (Morse et al., 2019). Previous studies have shown that SPP1 expression was significantly increased in PAH and played an important role by promoting pulmonary vascular smooth muscle cell (PVSMC) proliferation (Saker et al., 2016; Zeng et al., 2021). SMC4 manifested higher expression levels in patients with PAH. Moreover, SMC4 is a vital core subunit of condensin, which has an essential impact on mitotic chromosome condensation (Takemoto et al., 2004; Luo et al., 2020). It has been suggested that knockdown of SMC4 could inhibit Toll-like receptor-mediated production of several proinflammatory cytokines, including IL-6 and TNF-α, in macrophages. Interaction of EEA1 with platelet-derived growth factor receptor can mark early endosomes, which are required for the progression of secondary pulmonary alveolar septa. In addition, the internalization of TGF-R by clathrin-mediated endocytosis in EEA1-positive endosomes resulted in productive nuclear signaling via the interaction of TGF-R with Smad anchor for receptor activation (SARA) (Saker et al., 2016; Chrifi et al., 2019).
DKC1 encodes the protein dyskerin and affects several components of the telomere complex in pulmonary disease (Heiss et al., 1998; Khincha et al., 2014).
The interactions of DEGs with TFs and of DEGs with miRNAs were also examined to highlight the transcriptional or posttranscriptional regulators associated with the common DEGs. Table 4 lists a set of regulators related to DEGs in PAH, including TFs and miRNAs. The identified transcriptional regulatory TFs (FOXC1, FOXL1, GATA2, YY1, and SRF) that interacted with DEGs in PAH were in accordance with previous observations (Stankiewicz et al., 2009; Ding et al., 2017; Yang et al., 2019, 2020; Zhang L. et al., 2021). miRNAs are single-stranded non-coding RNAs that target transcripts to regulate gene expression (Caruso et al., 2017). As potential biomarkers, miRNAs may provide breakthrough strategies for the diagnosis and management of PAH (Caruso et al., 2017). Therefore, the miRNAs were explored as regulatory factors of target DEGs. The mir-17-5p identified in this study was involved in the proliferation of pulmonary vascular smooth muscle cells and could thus provide a potential novel treatment target for the control of PAH (Liu et al., 2018). In comparison with the control group, the lower concentration of miR-26b-5p in the lung tissue of patients with PAH signified its involvement in the remodeling process of PAH (Chouvarine et al., 2020). As a good biomarker for hypertension, mir-122-5p had a prominent diagnostic performance, and its dysregulation could contribute to the risk of PAH (Zhang et al., 2018). The mir-20a-5p-induced proliferation of pulmonary artery smooth muscle cells can exacerbate the development of PAH via targeting ATP-binding cassette subfamily A1 (Zhou et al., 2020). The mir-106b-5p played a key role in suppressing the migration of pulmonary artery smooth muscle cells, indicating that mir-106b-5p may act as a potential marker in PAH. These biomolecules may regulate target genes at transcriptional or posttranscriptional levels.
Given the importance of hub genes and their potential role in the pathogenic process in PAH, the interactions between target proteins and drugs were further studied in this research. Two drugs were identified from the interaction network according to the DrugBank database (Table 5). A previous study has suggested that nedocromil sodium could inhibit antigen-induced shrinkage of human lung parenchymal as well as bronchial strips (Napier et al., 1990). Another study supported the evaluation of SNX-5422 as a therapy for non-small-cell lung cancer (NSCLC), especially in cases where cancer was driven by c-Met amplification and by mutated epidermal growth factor receptor (EGFR) forms that were resistant to EGFR inhibitors (Rice et al., 2009). Several molecular modeling-based techniques are used in pharmaceutical research to evaluate complex biological systems, especially molecular docking methods, which are broadly applied in drug design to demonstrate the ligand conformation within the binding sites of target proteins. In molecular docking methods, the free energy is estimated by evaluating the critical phenomena that participate in the intermolecular recognition processes of ligand-receptor binding. Accordingly, the binding modes of the drugs with the target protein HSP90AA1 were screened, and the energetically stable conformations were obtained from the existing databases (Figure 5). Moreover, the protein subcellular localizations for DEGs indicated that extracellular and nuclear areas were the primary subcellular sites for the proteins encoded by the DEGs. This result supported the participation of these genes in PAH, extending from the nucleus to extracellular areas. For further drug selection to alleviate PAH, the protein subcellular localization may provide targeting sites for specific drugs.
The present study documented the relationship between drugs and putative PAH molecular biomarkers; however, the consequences of blocking these molecular targets could not be determined from this study and should be addressed in further investigation. Although the molecules are lung tissue-based and how they participate in pulmonary pathogenesis is still not known, clinical evidence has substantiated the effect of several of these identified drugs on PAH, and it may be helpful to determine their influence on lung tissues.

CONCLUSION
Integrative multi-omics analysis was adopted for the evaluation of the lung tissue-based transcriptomic profiles to identify the molecular signatures at protein levels (hub proteins, TFs) and RNA levels (mRNAs, miRNAs). Based on the transcriptome datasets, nine hub genes (HSP90AA1, ANGPT2, HSPD1, HSPH1, TTN, SPP1, SMC4, EEA1, and DKC1) were found in this study. Enrichment of pivotal hub genes was evident in pathways that participated in endoplasmic reticulum-related protein processing and PI3K-Akt signaling. We also screened TFs (FOXC1, FOXL1, GATA2, YY1, and SRF) and miRNAs (hsa-mir-17-5p, hsa-mir-26b-5p, hsa-mir-122-5p, hsa-mir-20a-5p, and hsa-mir-106b-5p) regulating the expression of DEGs in PAH. These biomolecules could be regarded as candidate system biomarkers at protein levels and RNA levels. Therefore, the potential molecular signatures obtained for PAH could be detected as transcripts in lung tissues, and these signatures warrant clinical analyses in patients with PAH to assess their utility. The availability of these biomolecules in pulmonary tissues supports the establishment of these biomarkers as a novel aspect of PAH development and progression.

DATA AVAILABILITY STATEMENT
The original contributions generated for the study are included in the article/Supplementary Material; further inquiries can be directed to the corresponding author/s.

AUTHOR CONTRIBUTIONS
DC, HZ, and WM conceived and designed the experiments. XY and TJ performed all experiments. XY, TW, and XC collected and analyzed the data. CG, HF, TJ, and FC drafted the manuscript. All authors have read and agreed to the published version of the manuscript.