Polymorphism of SERF2, the gene encoding a heat-resistant obscure (Hero) protein with chaperone activity, is a novel link in ischemic stroke

Background: Ischemic stroke (IS) is one of the most serious cardiovascular events and is associated with a high risk of death or disability. A growing body of evidence highlights molecular chaperones as especially important players in the pathogenesis of the disease. Since six small proteins called "Hero" have recently been identified as a novel class of chaperones, we aimed to evaluate whether SNP rs4644832 in the SERF2 gene, which encodes a member of the Hero proteins, is associated with the risk of IS.
Methods: A total of 1929 unrelated Russians (861 patients with IS and 1068 healthy individuals) from Central Russia were recruited into the study. Genotyping was done using a probe-based PCR approach. Statistical analysis was carried out in the whole group and stratified by age, gender and smoking status.
Results: Analysis of the link between rs4644832 SERF2 and IS showed that the G allele is a risk factor for IS only in females (OR = 1.29, 95% CI 1.02–1.64, Padj = 0.035). In addition, the analysis of associations between rs4644832 SERF2 and IS by smoking status revealed that this genetic variant is associated with an increased risk of IS exclusively in non-smoking individuals (OR = 1.26, 95% CI 1.01–1.56, P = 0.041).
Discussion: The sex and smoking interactions between the rs4644832 polymorphism and IS may be related to the impact of tobacco component metabolism and sex hormones on SERF2 expression.
Conclusion: The present study reveals a novel genetic association between the rs4644832 polymorphism and the risk of IS, suggesting that SERF2, a part of the protein quality control system, contributes to the pathogenesis of the disease.


Introduction
Ischemic stroke (IS) is a life-threatening condition that dramatically increases the risk of death and leads to disability in the adult population (Meschia and Brott, 2018). In the vast majority of cases, IS is caused by thrombosis following the rupture of an atherosclerotic plaque (Campbell et al., 2019). Studying the mechanisms underlying these events may help to improve approaches to the prediction, prevention and treatment of the disease (Polonikov et al., 2022; Bushueva et al., 2021; Malik et al., 2018; Vialykh et al., 2012; Bushueva et al., 2015b). Multiple molecular pathways drive the development and outcomes of atherosclerotic plaques in cerebral arteries (Libby, 2021). Foremost, these pathways are linked to cholesterol turnover, vascular wall inflammation and platelet aggregation (Rahman and Woollard, 2017; Koutsaliaris et al., 2022). Accordingly, most research has focused on specific molecular contributors such as lipoproteins, cytokines and their receptors, as well as proteins involved in coagulation and platelet dynamics.
IS is a multifactorial disease with multiple genes involved. Association studies of the last decades have revealed a broad spectrum of genetic markers associated with IS (Appunni et al., …).
Maintaining protein quality is one of the top-priority tasks for the cell. Accordingly, chaperones, the molecules that carry out protein quality control, are key factors in cellular homeostasis. A substantial body of evidence suggests that chaperones serve as important links in the genesis and course of IS (Hwang et al., 2019; Xu et al., 2012). For instance, an imbalance of the 70-kDa heat shock proteins was reported to impede endothelial function (Costa-Beber et al., 2022). Moreover, in preclinical models, modulators of several molecular targets have been tested in various cardiovascular disorders with promising results (Diteepeng et al., 2021).
We set out to investigate whether a recently discovered class of chaperones, the so-called "Hero proteins", may affect the risk of IS. In brief, Hero is a family of chaperones comprising six small heat-resistant proteins. These proteins were shown to protect enzymes from drying, organic solvents and other damaging factors in vitro (Tsuboyama et al., 2020). Moreover, some of these proteins also prevented pathological protein aggregation in neural cells during experimental neurodegeneration. Given the critical role of chaperones in atherosclerotic plaque development (Xu et al., 2012) and in the cellular response to hypoxia (Thiebaut et al., 2019), the Hero proteins may be considered probable contributors to the risk of IS.
We tested this hypothesis by analyzing the association between a polymorphism of the SERF2 gene, which encodes a member of the Hero proteins, and IS risk in the adult Russian population.

Material and methods
A total of 1929 unrelated Russians (861 patients with IS and 1068 healthy individuals) from Central Russia were recruited into the study (Table 1). The patients were enrolled in two periods: from the Regional Vascular Center of Kursk Regional Clinical Hospital between 2015 and 2017 and from the Neurology Clinics of Kursk Emergency Medicine Hospital between 2010 and 2012. All the patients were examined by qualified neurologists. The diagnosis of IS was made in the acute phase of stroke, according to the results of the neurological examination and computed tomography and/or magnetic resonance imaging of the brain. The patients were recruited consecutively. The following exclusion criteria were applied to the IS patients: hepatic or renal failure; endocrine, autoimmune, oncological, or other diseases that can cause an acute cerebrovascular event; intracerebral hemorrhage; hemodynamic or dissection-related stroke; and traumatic brain injury. All the patients with IS had a history of hypertension and received antihypertensive therapy. Because the studied cohort included patients with concomitant comorbid pathology, we conducted an additional differential analysis of the baseline and clinical characteristics in subgroups of patients, depending on the presence or absence of ischemic heart disease (IHD) and type 2 diabetes mellitus (T2DM) (Table S1).
The control group was compiled from healthy volunteers with no clinical evidence of cerebrovascular, cardiovascular, or other chronic diseases and with normal blood pressure without antihypertensive therapy. Healthy individuals were included in the control group if they had a systolic blood pressure less than 130 mm Hg and/or a diastolic blood pressure less than 85 mm Hg on at least 3 separate measurements. Control subjects were enrolled from Kursk hospitals during periodic medical examinations at public institutions and industrial enterprises of the Kursk region. This group was recruited from the same population and during the same period.
The following criteria were used for SNP selection: the SNP had to be a tag SNP, have a minor allele frequency of at least 0.05 in the European population, and have a high regulatory potential. According to the bioinformatic tool SNPinfo Web Server (https://snpinfo.niehs.nih.gov/snpinfo/snptag.html), which was used to select SNPs based on the reference haplotype structure of the Caucasian population (CEU) of the HapMap project, the only tag SNP in the SERF2 gene (small EDRK-rich factor 2, ID: 10169) is rs4644832.
This genetic variant is located in an intron. Several bioinformatic resources were used to assess the regulatory potential of this SNP and to justify its selection according to the study inclusion criteria. According to the "SNP Function Prediction" query of SNPinfo Web Server, this SNP possesses a high regulatory potential (RP = 0.403) and is located within a transcription factor binding site (https://snpinfo.niehs.nih.gov/snpinfo/snpfunc.html) (accessed on 8 October 2022) (Xu and Taylor, 2009). According to rSNPBase (http://rsnp.psych.ac.cn/index.do), rs4644832 is characterized by proximal and distal regulation of transcription and by RNA-binding protein-mediated regulation (accessed on 8 October 2022) (Guo and Wang, 2018). The RegulomeDB tool showed that rs4644832 SERF2 is characterized by a regulatory coefficient of 4 (TF binding + DNase peak) (http://regulome.stanford.edu/) (accessed on 8 October 2022) (Dong and Boyle, 2019). According to the data presented in the Ensembl genome browser, the average frequency of the minor G allele of this variant in European populations is 0.18 (https://www.ensembl.org/). Thus, rs4644832 SERF2 met the inclusion criteria and was selected for the study.

Genetic Analysis
DNA analysis was carried out at the Research Institute for Genetic and Molecular Epidemiology of Kursk State Medical University (Kursk, Russia). Approximately 5 mL of venous blood from the cubital vein of each participant was collected into EDTA-coated tubes and maintained at −20 °C until processed. Genomic DNA was extracted from thawed blood samples by the standard procedure of phenol/chloroform extraction and ethanol precipitation. Genotyping of the SNP was done using allele-specific probe-based real-time polymerase chain reaction assays according to the protocol designed in the Laboratory of Genomic Research, Research Institute for Genetic and Molecular Epidemiology. Primers and probes were designed using the Primer3 program online (http://primer3.ut.ee/) (Koressaar and Remm, 2007), selected, and then synthesized by the Syntol company (Moscow, Russia). Two primers were used for genotyping of the polymorphism, forward 5′-TTCCGTTCACCCTAAACACC-3′ and reverse 5′-AGGGTGGTCCCGTGAAGTAG-3′, as well as two allele-specific probes. Fig. S1 of the supplementary material shows the allelic discrimination plot for the SNP rs4644832 SERF2 assay designed for this study. To ensure quality control, 10% of DNA samples were genotyped in duplicate, blinded to the case-control status. The concordance rate was > 99%.
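The duplicate-genotyping quality check described above can be illustrated with a minimal Python sketch. It is not the authors' pipeline: the function name `concordance_rate` and the example genotype calls are hypothetical, and the sketch assumes that failed calls are recorded as `None` and skipped.

```python
def concordance_rate(first_calls, second_calls):
    """Fraction of duplicate sample pairs with identical genotype calls.

    Pairs in which either call is missing (None) are skipped; this
    handling of failed calls is an assumption, not taken from the paper.
    """
    if len(first_calls) != len(second_calls):
        raise ValueError("both runs must cover the same samples")
    compared = [
        (a, b) for a, b in zip(first_calls, second_calls)
        if a is not None and b is not None
    ]
    if not compared:
        raise ValueError("no comparable duplicate pairs")
    matches = sum(1 for a, b in compared if a == b)
    return matches / len(compared)


# Hypothetical calls for five re-genotyped samples (not study data).
original = ["AA", "AG", "GG", "AA", "AG"]
repeat = ["AA", "AG", "GG", "AA", "AA"]
rate = concordance_rate(original, repeat)
print(f"concordance: {rate:.1%}")
```

In a real run the two lists would hold the calls for the 10% of samples genotyped twice, and the result would be compared against the > 99% threshold reported in the text.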

Statistical and Bioinformatics Analysis
All statistical analyses were performed in R software v3.6.3. The distribution of quantitative data was tested for normality using the Shapiro-Wilk test ("normtest" package), and equality of variances was assessed using Levene's test ("lawstat" package). For quantitative variables, the significance of differences between groups was determined using the Wilcoxon-Mann-Whitney test for pairwise comparisons or the Kruskal-Wallis rank sum test for multiple comparisons. For categorical variables, the statistical significance of differences was evaluated by Pearson's chi-squared test with Yates's correction for continuity.
Compliance of the genotype distribution with Hardy-Weinberg equilibrium was assessed using Fisher's exact test. Genotype frequencies in the study groups and their associations with disease risk were analyzed using SNPStats software (https://www.snpstats.net/start.htm) (Sole et al., 2006). Associations of genotypes were analyzed under the additive model and adjusted for age, gender and smoking status.
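As an illustration of an exact test for Hardy-Weinberg equilibrium, the sketch below enumerates all heterozygote counts compatible with the observed allele counts and sums the probabilities of configurations no more likely than the observed one. This is the commonly used exact formulation for a biallelic SNP; whether it matches the implementation used by the authors is an assumption, and the function names are hypothetical.

```python
from math import exp, lgamma, log


def _log_prob_hets(n_het, n_minor, n_total):
    """Log-probability of n_het heterozygotes given the minor allele count
    and the number of genotyped individuals, under Hardy-Weinberg."""
    n_major = 2 * n_total - n_minor
    n_hom_minor = (n_minor - n_het) // 2
    n_hom_major = (n_major - n_het) // 2
    return (
        n_het * log(2)
        + lgamma(n_total + 1)
        - lgamma(n_hom_minor + 1)
        - lgamma(n_het + 1)
        - lgamma(n_hom_major + 1)
        + lgamma(n_minor + 1)
        + lgamma(n_major + 1)
        - lgamma(2 * n_total + 1)
    )


def hwe_exact_p(n_aa, n_ag, n_gg):
    """Two-sided exact P value for deviation from Hardy-Weinberg equilibrium."""
    n_total = n_aa + n_ag + n_gg
    if n_total == 0:
        raise ValueError("no genotypes supplied")
    n_minor = min(2 * n_aa + n_ag, 2 * n_gg + n_ag)
    log_obs = _log_prob_hets(n_ag, n_minor, n_total)
    p_value = 0.0
    # Heterozygote counts must share the parity of the minor allele count.
    for n_het in range(n_minor % 2, n_minor + 1, 2):
        log_p = _log_prob_hets(n_het, n_minor, n_total)
        if log_p <= log_obs + 1e-9:  # as likely as, or less likely than, observed
            p_value += exp(log_p)
    return min(1.0, p_value)


# Hypothetical genotype counts (AA, AG, GG), not taken from Table 2.
print(hwe_exact_p(25, 50, 25))
print(hwe_exact_p(50, 0, 50))
```

Working on the log scale with `lgamma` avoids overflow from the factorials at sample sizes such as those in this study.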
The following bioinformatics resources were used to analyze the functional effects of rs4644832 SERF2:
• The GTEx Portal (http://www.gtexportal.org/) was used to analyze the expression levels of the studied genes in brain, whole blood, and blood vessels, as well as to analyze the association of SNPs with expression quantitative trait loci (eQTLs) (accessed on 8 October 2022) (The GTEx Consortium, 2020).
• The bioinformatics resource HaploReg (v4.1) (http://archive.broadinstitute.org/mammals/haploreg/haploreg.php) was used to assess the association of rs4644832 SERF2 with histone modifications marking promoters and enhancers: mono-methylation at the 4th lysine residue of the histone H3 protein (H3K4me1), tri-methylation at the 4th lysine residue of the histone H3 protein (H3K4me3), acetylation at the 9th lysine residue of the histone H3 protein (H3K9ac), and acetylation at the 27th lysine residue of the histone H3 protein (H3K27ac).

Table 2
Results of the analysis of associations between rs4644832 (A/G) SERF2 and ischemic stroke risk. Statistically significant differences are marked in bold.

Table 3
Results of the analysis of associations between rs4644832 (A/G) SERF2 and ischemic stroke risk depending on sex and smoking status.

rs4644832 is associated with IS risk in females and non-smokers
The frequency of the minor allele G in Russians (0.156) was within the range of frequencies for this allele in European populations from the 1000 Genomes Project, Phase 3 (0.154-0.222; http://www.ensembl.org, accessed on 11 October 2022).
The genotype frequencies of rs4644832 SERF2 in the study groups are presented in Table 2. The distribution of genotype frequencies corresponded to Hardy-Weinberg equilibrium in the control group (P > 0.05). However, patients with IS showed an increase in observed heterozygosity (Ho = 0.3136) compared with expected heterozygosity (He = 0.2877); P < 0.01.
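The relationship between observed and expected heterozygosity can be sketched as follows. The sketch assumes the simple definition He = 2pq without a small-sample correction, which may differ from the estimator used by the authors; the function names and the example genotype counts are hypothetical. Only the value He = 0.2877 is taken from the text.

```python
from math import sqrt


def heterozygosity(n_aa, n_ag, n_gg):
    """Return (observed, expected) heterozygosity for a biallelic SNP."""
    n_total = n_aa + n_ag + n_gg
    if n_total == 0:
        raise ValueError("no genotypes supplied")
    freq_g = (2 * n_gg + n_ag) / (2 * n_total)  # G allele frequency
    observed = n_ag / n_total                   # Ho: share of heterozygotes
    expected = 2 * freq_g * (1 - freq_g)        # He = 2pq
    return observed, expected


def minor_allele_freq_from_he(expected):
    """Minor allele frequency implied by He = 2pq (smaller root)."""
    if not 0 <= expected <= 0.5:
        raise ValueError("He = 2pq cannot exceed 0.5")
    return (1 - sqrt(1 - 2 * expected)) / 2


# Hypothetical counts (AA, AG, GG), not the study's genotype table.
ho, he = heterozygosity(70, 27, 3)
print(f"Ho = {ho:.4f}, He = {he:.4f}")

# G allele frequency implied by the He reported for the IS group.
implied_maf = minor_allele_freq_from_he(0.2877)
print(f"implied G allele frequency: {implied_maf:.3f}")
```

An excess of Ho over He, as reported for the IS group, corresponds to more heterozygotes than the allele frequencies alone would predict.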
The analysis of the total sample did not reveal significant associations between rs4644832 SERF2 and IS risk. However, sex-stratified analysis demonstrated that the G allele is a risk factor for IS in females (OR = 1.29, 95% CI 1.02-1.64, P = 0.035). This genetic variant was also associated with an increased risk of IS in non-smoking individuals (OR = 1.26, 95% CI 1.01-1.56, P = 0.041) (Table 2). Because the association between the polymorphism and smoking status may be biased by the unequal proportions of smokers among males and females, we performed a separate analysis accounting for both factors. This analysis confirmed that smoking status has an independent effect: the association of the G allele with a higher risk of IS in non-smokers persisted even after the exclusion of males (Table 3).
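A reported odds ratio and its 95% confidence interval can be cross-checked against the reported P value by recovering the standard error on the log scale. The sketch below assumes a Wald-type interval that is symmetric around log(OR); this is an assumption about how the intervals were computed, and the function name is hypothetical. Because the published limits are rounded to two decimals, the recovered P values should only land near the reported ones.

```python
from math import erfc, log, sqrt


def wald_from_ci(odds_ratio, ci_low, ci_high, z_crit=1.959964):
    """Recover the log-scale SE, z statistic and two-sided P value
    from an odds ratio and its 95% Wald confidence interval."""
    if not 0 < ci_low < ci_high:
        raise ValueError("confidence limits must be positive and ordered")
    se = (log(ci_high) - log(ci_low)) / (2 * z_crit)
    z = log(odds_ratio) / se
    p_value = erfc(abs(z) / sqrt(2))  # two-sided normal tail probability
    return se, z, p_value


# Values reported in the text for females and for non-smokers.
for label, (or_, low, high) in {
    "females": (1.29, 1.02, 1.64),
    "non-smokers": (1.26, 1.01, 1.56),
}.items():
    se, z, p = wald_from_ci(or_, low, high)
    print(f"{label}: SE = {se:.3f}, z = {z:.2f}, P = {p:.3f}")
```

Both lower confidence limits lie just above 1, so both associations sit close to the conventional 0.05 significance threshold.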
Because patients with IS differed in clinical and laboratory characteristics depending on the presence of comorbid pathology (IHD/T2DM), the association results may be biased. Therefore, to assess the specificity of the marker for IS, we compared three subgroups of patients defined by the presence or absence of other diseases: stroke patients without IHD and T2DM, stroke patients with IHD, and stroke patients with T2DM (Tables S2-S4). Of note, we did not analyze stroke patients with both IHD and T2DM as a separate subgroup because of the small number of observations in this cohort. The association of rs4644832 with IS in non-smoking patients persisted after the exclusion of stroke patients with comorbid IHD and T2DM. However, the association of this genetic variant with IS in women was lost, possibly because excluding 207 patients reduced the power of the study. We did not find associations of rs4644832 SERF2 in the subgroups of patients with IHD and T2DM, which additionally supports the specific association of rs4644832 SERF2 with IS development. On the other hand, the number of stroke patients with comorbid IHD and T2DM was too small to draw any conclusions about associations with genetic markers.

Molecular correlates of rs4644832 SERF2
The SERF2 gene is expressed in brain tissues, blood vessels, and whole blood. In brain tissues, SERF2 expression levels measured as median (Me) TPM (Transcripts Per Million) vary from 56.54 to 164.7; in blood vessels, from 203.8 to 246.5; in whole blood, Me TPM = 153.5 (Fig. 1).
Among others, the list of SERF-regulated genes includes HYPK (Huntingtin Interacting Protein K), involved in negative regulation of apoptosis and protein stabilization; MAP1A (Microtubule Associated Protein 1A); and PDIA3 (a molecular chaperone that prevents the formation of protein aggregates) (Table S5).
The SERF2 protein is associated with 20 functional partners and 92 total links, the most significant of which are SERF1A, ZNF706 and SERF1B (Fig. 2). Together with SERF2, these proteins are involved in general biological processes such as protein destabilization (GO:0031648), Padj = 0.0002; regulation of stem cell population maintenance (GO:2000036), Padj = 0.02; and amyloid fibril formation (GO:1990000), Padj = 0.04 (https://maayanlab.cloud/Enrichr). Subsequent analysis revealed a significant effect of rs4644832 SERF2 on histone modifications. This genetic variant is located in a DNA region bound to histone H3 carrying mono-methylation at the 4th lysine residue (H3K4me1), which marks enhancers in Brain Hippocampus Middle, Brain Cingulate Gyrus and cells from peripheral blood, as well as tri-methylation at the 4th lysine residue (H3K4me3), which marks promoters in all brain tissues represented in HaploReg (v4.1) and in cells from peripheral blood. The effect of this histone mark is enhanced by acetylation of the lysine residue at N-terminal position 27 of the histone H3 protein (H3K27ac), marking enhancers in peripheral blood cells and all brain tissues, as well as by acetylation at the 9th lysine residue of the histone H3 protein (H3K9ac), marking promoters in blood cells and all brain tissues except Brain Hippocampus Middle (Table 4).
H3K4me1 - mono-methylation at the 4th lysine residue of the histone H3 protein; H3K4me3 - tri-methylation at the 4th lysine residue of the histone H3 protein; H3K9ac - acetylation at the 9th lysine residue of the histone H3 protein; H3K27ac - acetylation of the lysine residue at N-terminal position 27 of the histone H3 protein; Enh - histone modification in the enhancer region; Pro - histone modification at the promoter region.
According to the bioinformatic resource Cardiovascular Disease Knowledge Portal (CVDKP, https://cvd.hugeamp.org), which combines and analyzes the results of genetic associations of the largest consortiums for the study of CVD, the protective allele A is associated with a number of stroke-associated phenotypes (Table 5).

Discussion
Preserving proteome quality is one of the primary tasks for the cell. The recently described class of chaperones, Hero, provides important homeostatic functions, and its role in the pathogenesis of various pathologies should be taken into account. Hero proteins were discovered in 2020 as heat-resistant molecules responsible for the chaperone-like activity of supernatants collected from boiled lysates of S2 or HEK293T cell cultures (Tsuboyama et al., 2020). A thorough search for the protein-preserving components of these boiled supernatants identified six proteins that were shown to protect proteins under stress conditions and to prevent pathogenic aggregation of misfolded TDP-43 in cells. The authors also demonstrated that the hallmarks of this novel class of chaperones are low molecular weight, high polarity and charge, and a disordered structure. The most straightforward explanation of their chaperone activity is that, owing to these chemical properties, they are able to physically shield client proteins.
Since Hero proteins have been shown to be substantial regulators of proteostasis, we aimed to investigate whether their member SERF2 is involved in the pathogenesis of IS. Previously, SERF2 was reported to enhance polyglutamine, β-amyloid and α-synuclein aggregation and to […]. Moreover, brain-specific Serf2 knockout mice, though viable, appear to be more prone to deposition of amyloids and show modified fibril morphology (Stroo et al., 2021). Whole-body knockouts are perinatally lethal due to an apparently unrelated developmental issue (Cleverley et al., 2021). A number of works reported that SERF2 interacts with a wide spectrum of RNA molecules; however, the physiological significance or specificity of this interaction is unclear at this time (Sahoo and Bardwell, 2022).
Here, by providing genetic evidence that rs4644832 is associated with IS risk, we report that SERF2 is involved in the pathogenesis of stroke. The role of SERF2 in IS is probably related to some of its protective functions, because the disease-associated allele G was shown to decrease SERF2 expression. With respect to the main causes of IS, these protective effects are most likely related to the role of SERF2 in the cardiovascular system. This assumption is further supported by previous findings showing a correlation between the rs4644832 polymorphism and different cardiovascular phenotypes. For instance, CVDKP demonstrates that the protective A allele is associated with a decrease in diastolic blood pressure (Beta = −0.0152, P = 7.96 × 10⁻¹⁰) and systolic blood pressure (Beta = −0.0074, P = 0.001), as well as with a lower risk of arterial hypertension (OR = 0.9894, P = 0.02) (Table 5).
Although it seems difficult to separate the impact of rs4644832 on the risk of IS from its impact on other related cardiovascular diseases, we report that the genetic association remains significant even after excluding patients with IHD and T2DM. Nevertheless, all the patients had a confirmed history of arterial hypertension. Thus, bioinformatic analysis and our results indicate that this genetic variant may play a significant role in the development of hypertension, the most important factor contributing to IS risk. The inability to isolate the contribution of rs4644832 to arterial hypertension is one of the main limitations of the present study and should be addressed in further research. Another limitation of our research is the missing values of some clinical parameters in a certain number of patients (Table 1, Table S1) due to the retrospective type of the study.
Additionally, CVDKP shows that the rs4644832 A allele is protective against elevated heart rate (Beta = −0.0915, P = 2.04 × 10⁻⁴) and peripheral artery disease (OR = 0.95, P = 0.03). Elevated heart rate increases the risk of atrial fibrillation, which is another significant risk factor for the cardioembolic type of IS. Notably, CVDKP data also indicate that the A allele has a protective role in peripheral artery disease, a pathogenetically related cardiovascular pathology. All these associations are concordant with the relatively high levels of SERF2 expression in the cardiovascular system; however, the precise molecular avenues of SERF2 involvement in the listed phenotypes are yet to be understood in detail. Of the main mechanisms, chaperone-like or RNA-binding activity should be considered first.
The restriction of the association to women and non-smokers may theoretically reflect the loss of the protective effect of the A allele in males and smokers, who are more prone to cardiovascular diseases. However, previous data show that, in contrast to estrogens (Gertz et al., 2012), tobacco components (Takami et al., 2007; Jiang et al., 2017; Gonzalez-Rivera et al., 2020) and androgens (Bereketoglu et al., 2021) serve as positive regulators of SERF2 expression, hinting that these sex/smoking-gene relationships may still be explained by the cellular effects of SERF2 alone. Taken together, our data suggest that SERF2 has a protective role, because the factors increasing its expression (male sex, smoking, A allele) decrease the risk of IS. Moreover, our previous results showing a correlation of the rs2900262 polymorphism of another Hero gene, C9ORF16, with IS only in smoking individuals also match this tendency, because components of cigarette smoke, on the contrary, reduce the expression of C9ORF16 (Kobzeva et al., 2022).
Interestingly, in an attempt to bridge SERF2 function and IS, we found that none of the genes influenced by the eQTL effects of rs4644832 had previously been shown to correlate with cardiovascular phenotypes. However, some of them turned out to have a great impact on neurodegeneration and protein quality control. For instance, MAP1A, which is regulated by rs4644832, encodes a neuron-specific protein known as a causative factor in neurodegenerative diseases (https://www.genecards.org/cgi-bin/carddisp.pl?gene=MAP1A&keywords=MAP1A, Liu et al., 2015). SNP rs4644832 is also correlated with the expression of PDIA3, the gene encoding a protein of the endoplasmic reticulum that interacts with the lectin chaperones calreticulin and calnexin to modulate folding of newly synthesized glycoproteins (Tatusova et al., 2016). PDIA3 catalyzes the formation, isomerization, and reduction or oxidation of disulfide bonds (Bourdi et al., 1995; Gaucci et al., 2016), playing an important role in post-translational modification and folding of client proteins. Another gene, HYPK, is involved in negative regulation of apoptosis and in protein stabilization (Raychaudhuri et al., 2008). Moreover, HYPK has chaperone-like activity and is related to regulation of the heat shock response (Das and Bhattacharyya, 2016) and protein homeostasis (Ghosh and Ranjan, 2022). Altogether, these findings open up future perspectives for studying possible links between SERF2 and outcomes of IS, because proteostasis plays an especially important role in the response to ischemia-reperfusion.

Conclusion
The present study reveals a novel genetic association between rs4644832 SERF2 and the risk of IS in females and non-smokers. These data provide new insights into the involvement of Hero, a recently discovered class of chaperones, in both neurological and cardiovascular pathology. Further studies may help to uncover the details of its particular role in the pathogenesis of stroke.

Compliance with ethical standards
The study was conducted according to the guidelines of the Declaration of Helsinki, and was approved by the Ethical Review Committee of Kursk State Medical University, Russia (Protocol No. 12 from 7.12.2015). All the participants gave written informed consent before the enrollment in this study.

Patient consent for publication
Not applicable.

Consent for publication
All authors have read and accepted responsibility for the content of the manuscript.

Declaration of Competing Interest
The authors declare no conflict of interests.

Appendix A. Supporting information
Supplementary data associated with this article can be found in the online version at doi:10.1016/j.ibneur.2023.05.004.