Structural Changes in Proteins With Post-translational Modifications in Female Oncopathologies

Post-translational processing leads to conformational changes in protein structure that modulate molecular functions and alter the signature of metabolic transformations and immune responses. Some post-translational modifications (PTMs), such as phosphorylation and acetylation, are strongly related to oncogenic processes and malignancy. This study investigated PTM patterns in patients with gender-specific ovarian or breast cancer. Proteomic profiling and analysis of cancer-specific PTM patterns were performed using high-resolution UPLC-MS/MS. The structure, topology, and stability of motifs carrying PTMs associated with sex-specific cancers were analyzed using molecular dynamics modeling. We identified highly specific PTMs: 12 modified peptides from eight distinct proteins in patients with ovarian cancer and six modified peptides from three proteins in patients with breast cancer. We found that all identified PTMs were localized in compact and stable structural motifs and were exposed to the solvent environment outside the protein globule. PTMs increase the solvent-accessible surface area of the modified moiety and its active environment. The observed conformational changes are insufficient to trigger structural degradation and enhance protein elimination/clearance; however, they are sufficient to modulate protein activity significantly.

innate immune system (n=49, HSA-168249), hemostasis (n=37, HSA-109582), and regulation of post-translational phosphorylation (n=18, HSA-8957275). Proteins shared between the cancerous groups are also characterized as extracellular and regulate the immune response (n=10, HSA-168256), metabolism of proteins (n=10, HSA-392499), innate immune system (n=8, HSA-168249), and hemostasis (n=10, HSA-109582). Cancer-specific proteins are implicated in vesicle-mediated transport (n=11, HSA-5653656), activation of the immune system (n=35, HSA-168249, HSA-168256), and membrane trafficking (n=8, HSA-199991).


Introduction
Epithelial ovarian cancer accounts for up to 90% of all malignant ovarian neoplasms. Breast cancer most commonly affects the milk ducts and lobules (invasive ductal carcinoma); it can also originate from glandular tissue (invasive lobular carcinoma) and germ cells [1]. Breast cancer is the most prevalent type of tumor in women and the second most common cancer type overall [2]. Determining the molecular events (signatures) associated with cancer onset and progression is a major task because efficient early diagnostics are lacking and treatment strategies have had limited success. As a result, disability and mortality levels remain high [2].
The transformation of normal cells into neoplastic cells is accompanied by various endogenous molecular events, which are generally orchestrated by a comprehensive and dynamic network of post-translational modifications (PTMs) [3]. PTMs are thought to play a pivotal role in the maintenance of different biological processes; thus, disruption of PTM crosstalk may initiate a complex of oncogenic events. Several PTM moieties can be present on the protein surface and create a "PTM code." The proper "code" is important for the organization of intracellular signaling through interaction with various effectors. Therefore, different strategies and proteomic tools, including affinity-based separation, TiO2 enrichment, and the HDX technique, have recently been developed to decipher distinct PTMs and determine their localization on the protein surface. To date, up to 450 different PTMs have been distinguished and annotated. The most prevalent PTMs are phosphorylation, acetylation, methylation, and ubiquitination [4]. The less well-known citrullination and SUMOylation may be of particular interest because of their involvement in the response of CD4+ T cells to cellular stress and immune activation [5], sensing of the DNA damage response, and telomere maintenance [6].
The main source of PTM identification is mass spectrometry-based proteomics, supplemented by integrative interactomic and transcriptomic approaches. However, mass spectrometry data are highly redundant, and search engines are imperfect, whereas validation of the putatively identified PTMs by immunochemistry is limited. This leads to misidentifications and significantly obstructs the investigation of the exact role of specific PTMs in oncogenesis. Therefore, only a small fraction of PTMs has been well validated and related to particular types of cancer, including glycosylation of COX-2 in colorectal cancer [7], citrullination of fibronectin in renal cancer [8], phosphorylation of PKM2 in thyroid cancer [9], and deSUMOylation mediated by SENP1 in prostate cancer [10].
Dysregulation of the SUMOylation pathway has been reported to be associated with breast cancer via Mel-18 activity, which controls ESR1 expression and governs SENP1 activity [11,12]. Ubiquitination of POU class 5 homeobox 1 protein (OCT4) leads to decreased survival of breast cancer patients [13]. Despite the wide scrutiny of ovarian cancer tumorigenesis, little is known about the interplay of PTMs in this phenotype. According to a meta-analysis, most information is related to histone acetylation and ubiquitination [14]. Other studies have reported the specific role of TRIM71 ubiquitin ligase-mediated regulation of p53, which can retard ovarian cancer development [15]. Irregular N-glycosylation triggers the IRE1α-XBP1 pathway, which causes a stress response of the endoplasmic reticulum to unfolded proteins in T cells and enhances ovarian cancer tumorigenesis and malignancy [16]. In addition, repression of HDAC6, which deacetylates p53 on Lys120, suppresses ovarian tumorigenesis [17].
In this report, we highlight that breast and ovarian cancer-specific molecular events can be considered through the prism of post-translational modifications and the changes they induce in protein structure stability. Structural biology has provided a good understanding of motifs determined by the specific folding of secondary structures. Such motifs are typically tightly packed, are formed by neighboring segments of the polypeptide chain (helices and β-strands), and retain their characteristic spatial fold regardless of the homology of the proteins in which they are observed [18,19]. Owing to their high stability within the protein globule, helical pairs are the most attractive for structural analysis. Therefore, in silico structural analysis and modeling of PTM moieties can be performed on isolated stable helical pairs instead of whole protein molecules, which simplifies calculations without losing information.
In this study, we performed a structural analysis of the geometry and topology of motifs with PTM moieties accommodated in targeted peptides derived from cancer-associated proteins. We found that the cancer-specific serological proteome differed significantly from that of matched healthy volunteers. We identified 74 and 25 proteins specific for the ovarian cancer and breast cancer groups, respectively, and 50 proteins that were shared between the cancer groups but absent from the control group. In addition, 65 and 88 proteins in the ovarian cancer and breast cancer groups, respectively, differed significantly in abundance from those in the control group. We identified cancer-specific PTMs and conducted a structural analysis using AMBER software. Among the identified PTMs, 12 modified peptides were localized in eight proteins of ovarian cancer patients, whereas six modified peptides were localized in three proteins of breast cancer patients.
We suppose that most oncopathologies can be induced by uncharacterized PTMs even before detectable alterations in gene expression occur. Furthermore, based on the presented data, a particular PTM pattern may be characteristic of a certain cancer phenotype.

Proteomic analysis of plasma samples in ovarian and breast cancer
Identification of serological protein signatures that define ovarian cancer and breast cancer phenotypes was achieved by categorizing qualitative and quantitative features. The mean number of plasma protein identifications for the studied groups is shown in Figure 1A.
Symmetry comparative analysis showed a wide cluster of proteins (n=147) shared between all studied groups (including the control group), a smaller cluster shared by both types of cancer (n=50; breast cancer and ovarian cancer), and two clusters distinct for patients with either ovarian (n=25) or breast (n=74) cancer (Figure 1A).
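The clusters described above correspond to simple set operations on the lists of identified proteins. The following is a minimal sketch of that partitioning; the function name and the toy identifiers are hypothetical and are not taken from the study data.

```python
def partition_identifications(control, ovarian, breast):
    """Split protein identifiers into the clusters of a three-group comparison."""
    control, ovarian, breast = set(control), set(ovarian), set(breast)
    return {
        # found in every group, including controls
        "shared_all": control & ovarian & breast,
        # found in both cancer groups but not in controls
        "cancer_shared": (ovarian & breast) - control,
        # found in one cancer group only
        "ovarian_only": ovarian - breast - control,
        "breast_only": breast - ovarian - control,
    }


# Toy identifiers, invented for illustration only.
clusters = partition_identifications(
    control={"P1", "P2"},
    ovarian={"P1", "P2", "P3", "P4"},
    breast={"P1", "P2", "P3", "P5", "P6"},
)
print({name: sorted(ids) for name, ids in clusters.items()})
```

Proteins seen in the control group and only one cancer group fall outside these four clusters; a fuller comparison would report them separately.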
A comparative analysis of the protein content in the blood samples of the study participants revealed that 65 and 88 proteins (p<0.05) in the ovarian cancer and breast cancer groups, respectively, differed significantly in abundance from those in the control group (Supplementary Table S1). Furthermore, abundance-based principal component analysis (PCA), performed on n=147 proteins, separated all three studied groups satisfactorily in the projection onto the first two principal components, with PC1 explaining 11.87% of the variance and PC2 explaining 6.46% (Figure 1B). Differentially expressed proteins and their fold changes in patients with oncopathologies compared to the control group are listed in Supplementary Table S1.
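The "explained variance" of a principal component is the share of the total variance carried by the corresponding eigenvalue of the covariance matrix. The sketch below illustrates this for PC1 with a plain power iteration on a small abundance matrix (rows are samples, columns are proteins). It is an illustration under stated assumptions only: the function name and the toy matrices are hypothetical, and the software actually used for the PCA is not specified in the text.

```python
import math
import random


def first_component_variance_ratio(rows, iterations=200, seed=0):
    """Fraction of total variance captured by PC1 (power iteration on the covariance matrix)."""
    n, p = len(rows), len(rows[0])
    means = [sum(r[j] for r in rows) / n for j in range(p)]
    centered = [[r[j] - means[j] for j in range(p)] for r in rows]
    # Sample covariance matrix (p x p).
    cov = [
        [sum(c[i] * c[j] for c in centered) / (n - 1) for j in range(p)]
        for i in range(p)
    ]
    total_variance = sum(cov[j][j] for j in range(p))
    # Power iteration converges to the eigenvector with the largest eigenvalue.
    rng = random.Random(seed)
    v = [rng.random() + 0.1 for _ in range(p)]
    for _ in range(iterations):
        w = [sum(cov[i][j] * v[j] for j in range(p)) for i in range(p)]
        norm = math.sqrt(sum(x * x for x in w))
        v = [x / norm for x in w]
    # Rayleigh quotient of the unit vector v gives the leading eigenvalue.
    eigenvalue = sum(
        v[i] * sum(cov[i][j] * v[j] for j in range(p)) for i in range(p)
    )
    return eigenvalue / total_variance


# Toy abundance matrix (3 samples x 2 proteins), invented for illustration.
print(first_component_variance_ratio([[1, 2], [2, 4], [3, 6]]))
```

Low ratios such as those reported for PC1 and PC2 indicate that the variance is spread over many components, so group separation in the first two components reflects only part of the total variation.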

Post-translational modifications of proteins
Modification moieties were analyzed for the five most prevalent PTMs, including phosphorylation of serine (pS), threonine (pT), tyrosine (pY), N-terminal acetylation of lysine, and ubiquitination of lysine (Supplementary Materials Figure S1). After validation, the resulting PTMs were populated and extracted into a separate list of PTMs associated with and found explicitly in the considered oncopathologies (Table 1). We observed PTMs among 12 peptides derived from eight proteins in patients with ovarian cancer and six peptides, carrying modified moieties, from three proteins found in breast cancer patients.
There were also a few overlapping PTMs revealed in albumin and Ig heavy chain V-III region CAM. Sequence coverage for peptides carrying PTMs ranged from 12% to 79%, and at least three distinct and proteotypic peptides were identified with a high confidence rate for each protein (Table S1).

Structural analysis of proteins carrying PTMs
Several well-known structural motifs are characterized by specific spatial folding and geometry in structural biology. Such motifs are typically tightly molded and shaped by the adjacent segments of polypeptide chains, helices, and β-strands [18,19].
In this study, we investigated motifs formed by secondary structures and carrying different PTMs. Such motifs are represented as helical pairs formed by two consecutive helices bridged by irregular linkers of different lengths and conformations [18]. Recently, we demonstrated that modifying moieties are frequently observed in α-α-corners, α-α-hairpins, and L- and V-structures. The geometry rules and topological classification of the helical pairs are shown in Supplementary Materials Figure S2.
Based on the rules for recognizing helical pairs (Supplementary Materials Figure S1), we selected protein structures from the Protein Data Bank (PDB) matching the targeted peptides that carry empirically identified PTMs [18]. Each selection sampled from 1 to 172 protein structures that fit the recognition and specific polypeptide chain rules (Table 2, column "PDB structures").
A comparative analysis of the geometric features of the assayed protein structures was performed according to the following criteria:
1. Similarity of the defined helical pairs among selected polypeptide chains.
2. Type of spatial conformation for motifs shaped from selected secondary structures.
3. Solvent accessibility of the locally folded motif.
4. Influence of modified amino acid residues on protein structure stability.
Molecular dynamics experiments determined the stability of ovarian and breast cancer-specific protein structures bearing PTM moieties. Table 2 summarizes the structural analysis data of the selected proteins with PTM. The solvent-accessible surface area of the intact amino acid residue exposed to the solvent was always less than that of the modified residue (Table 2, column "Active environment"). Thus, the modifiable amino acid residues are continuously exposed to the solvent, and the modification process is associated with the enlargement of the solvent-accessible surface area. The surface area of the modified residue exceeded that of the intact (unmodified) amino acid residue by 50%.
The adjacent environment (neighboring amino acid residues) was evaluated as the change in the total surface area accessible to the surrounding solvent (data are presented in Table 2, column "Active environment"). The mean surface area of the modified environment frequently exceeds that of the intact (unmodified) environment. However, the difference is not as pronounced as that for the modified amino acid residue alone. In some cases, the PTM moiety can significantly increase the accessible surface area of the residue, while the total surface area of the exposed active environment is equal to or even less than that of the unmodified polypeptide chain. Hence, it can be assumed that the neighboring amino acid residues compensate for the increase in surface area caused by the mounted PTM.
The alterations in solvent-accessible surface area observed for the modified amino acid residue and its active environment are shown in Figure 2. The active environment is defined as the amino acids that surround the modified residue and are capable of changing the solvent-accessible surface area of a certain motif. We determined the active environment within each motif and identified the number of constituent amino acids and their spatial coordinates within the affected polypeptide chain.
Excluding four cases, the total solvent-accessible surface area of the modified residue and its active environment increased relative to that of the intact residue. In the four notable instances (peptides EQL-ac(K)-AVMDDFAAFVEK (ALBU), ac(K)-VPQVSTPTLVEVSR (ALBU), RHPYF-p(Y)-APELLFFAK (ALBU), and YF-ac(K)-PGMPFDLMVFVTNPDGSPAYR (CO3)), the solvent-accessible surface area of the modified residue is larger than that of the intact residue, but the solvent-accessible surface area of its active environment is smaller than that of the unmodified form.
The distribution of the protein population bearing the identified PTMs (Table 2) depends on the solvent-accessible area of the particular amino acid residue and its active environment before and after mounting the PTM. The modifying moiety increased the solvent exposure of the modified amino acid residue and its active environment (Figure 3). However, the summed solvent-accessible surface areas do not always exceed those before the modification and show wide scatter.
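The comparison described above reduces to a few differences between solvent-accessible surface areas (SASA) before and after modification. The sketch below shows one way to summarize a single site; the function name, the dictionary keys, and the numbers are hypothetical illustrations and are not values from Table 2.

```python
def summarize_site(residue_intact, residue_modified, env_intact, env_modified):
    """Summarize SASA changes (in Å²) for one modified residue and its active environment."""
    residue_delta = residue_modified - residue_intact
    env_delta = env_modified - env_intact
    return {
        # relative growth of the residue's own accessible area, in percent
        "residue_change_pct": 100.0 * residue_delta / residue_intact,
        # change contributed by the neighboring residues
        "environment_delta": env_delta,
        # net change for the residue plus its active environment
        "total_delta": residue_delta + env_delta,
        # True when neighbors lose exposure while the residue gains it
        "compensated": residue_delta > 0 and env_delta < 0,
    }


# Invented values: the residue gains 50 Å², the environment loses 20 Å².
site = summarize_site(100.0, 150.0, 400.0, 380.0)
print(site)
```

A site flagged as `compensated` corresponds to the situation noted in the text, where the modified residue becomes more exposed while its active environment becomes less exposed.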

Molecular dynamics simulation of protein molecules containing PTM associated with the development of ovarian and breast cancer
In this study, we analyzed the similarity and stability of the motifs and the influence of the PTM moiety on the local solvent accessibility of the modified polypeptide chain. The molecular dynamics results showed that the geometric and topological features of the motifs before and after mounting the PTM remained within acceptable ranges over the simulation time. Particular attention was paid to motifs with strongly interacting helices whose axes intersect, typically α-α-corners. In such motifs, the distance between helices is negligible (contacting helices), and the area and perimeter of the projection are distinct from zero. We did not scrutinize L- and V-structures because their helices do not intersect (d≠r, d<r); moreover, the area and perimeter for such motifs are close to zero. At the same time, including motifs comprising more than two helices extended the range of observable structures. The main criterion for inclusion in the MD simulation was actual contact between helices; the linker length and structure were not significant factors.
By employing these selection criteria, we expanded the number of motifs that can be sampled for the analysis. In the previous study, only helix pairs comprising two sequential helices were utilized. In this study, we investigated closely contacted helices (from 1 to 30 helices) with intersecting axes.
At the same time, the connection may vary in length and conformation and may include secondary structures (Table 2, column "Motifs"). We detected five PTM moieties in albumin (ALBU), three in IGHA1 and CO3, four in APOA2 and A1AT, and eight in APOA1 (Supplementary Materials Table S2 and Figure 4). The calculated coordinates for the detected PTMs, α- and θ-angles, minimal (r) and inter-planar (d) distances between helices, areas, perimeters, and standard deviations (sd, sr, sα, sθ, sS, and sP) for the PTM-containing motifs are presented in Supplementary Materials Table S2.
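As a rough illustration of two of the geometric descriptors listed above, the sketch below computes an inter-axial angle and an inter-planar distance for a pair of helix axes. It assumes that each axis is already available as a point and a direction vector, and that d is the distance between the two parallel planes that contain the axes. These definitions, the function names, and the toy coordinates are assumptions made for illustration; they are not the procedure used in this study.

```python
import math


def sub(a, b):
    return tuple(x - y for x, y in zip(a, b))


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def cross(a, b):
    return (
        a[1] * b[2] - a[2] * b[1],
        a[2] * b[0] - a[0] * b[2],
        a[0] * b[1] - a[1] * b[0],
    )


def norm(a):
    return math.sqrt(dot(a, a))


def axis_angle_deg(u, v):
    """Angle between two axis direction vectors, in degrees."""
    cos_t = dot(u, v) / (norm(u) * norm(v))
    return math.degrees(math.acos(max(-1.0, min(1.0, cos_t))))


def interplanar_distance(p1, u, p2, v):
    """Distance between the parallel planes containing axis (p1, u) and axis (p2, v)."""
    n = cross(u, v)
    if norm(n) < 1e-9:
        # Parallel axes: fall back to the point-to-line distance.
        return norm(cross(sub(p2, p1), u)) / norm(u)
    return abs(dot(sub(p2, p1), n)) / norm(n)


# Toy axes (Å): one along x through the origin, one along y lifted 10 Å in z.
angle = axis_angle_deg((1, 0, 0), (0, 1, 0))
distance = interplanar_distance((0, 0, 0), (1, 0, 0), (0, 0, 10), (0, 1, 0))
print(angle, distance)
```

In practice, the axis of each helix would first have to be fitted to atomic coordinates (for example, to the Cα atoms), which this sketch leaves out.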
The molecular dynamics results showed that during the set time (0.5 ns), the geometrical features of the studied motifs (intact and modified) were within the acceptable ranges. However, we identified conformational changes that distinguish modified from intact forms (Figure 4). The molecular dynamics experiment demonstrated that the stability of structural blocks directly depends on the strength of interactions between helices (as indicated by the values of d, r, S, P, etc.). It also demonstrated that the PTM moiety (mounting of the modification function) does not induce complete rupture between interacting helices and, consequently, does not disrupt the motif with the accommodated PTM function.
Molecular dynamics simulation revealed that mounting of the modifying moiety did not lead to the complete rupture of intramolecular bonds or to motif disruption. However, some protein molecules showed ambiguous behavior (Figure 4 and Supplementary Materials). Two distinct motifs containing the polypeptide chain EQL-ac(K)-AVMDDFAAFVEK, with acetylated lysine ac(K), should be noted. The first modified helix spanned residues 396-410 (400-414 in the intact molecule), and the second modified helix spanned residues 538-555 (542-559 in the intact molecule). The motif comprises seven helices, with five helices lying between the two contacting helices with intersecting axes. At the beginning of the simulation, the initial geometry was as follows: the minimal and inter-planar distances were equal at 11.7 Å, the area and polygonal perimeter were 110.2 Å² and 43 Å, respectively, and the axial angles were α = 55° and θ = −57°. In the course of the simulation, the distances (r and d) increased, indicating misalignment between the helices, while the axial angles, area, and perimeter decreased. This tendency holds for both modified and unmodified motifs; however, the modified structure undergoes a more pronounced alteration in conformational geometry (inter-planar and minimal distances were 12.9 Å and 15.6 Å; area and perimeter were S = 22.9 Å²

Discussion
Since PTMs change the physical and chemical properties of proteins, they regulate catalytic activity and molecular function; however, the newly acquired properties and functions of non-specifically modified proteins are unknown, as is whether the essential molecular property is completely lost. Further, such PTMs can accelerate DNA mutations through inappropriate regulation of gene expression and misdirected signaling pathways. Such non-specifically modified proteins are beneficial for cancer phenotypes and promote cancer onset, invasion, and progression.
Therefore, we suggest that finding such PTMs and modified proteins can be a central point for deeper insights into metabolic processes in cancer cells.
In this study, we attempted to specify PTM-induced conformational changes in proteins found in patients with ovarian or breast cancers. We suggest that such PTMs, whether mounted at a non-specific position or causing conformational instability, can change and modulate protein activities and functional properties, contributing meaningfully to oncogenesis. Evidence exists regarding the cautioning role for such PTMs in biological processes, which provides an excellent opportunity for the utility of cancer-specific PTM patterns.
Because cancer is typically accompanied by an immune response and acute inflammation, alpha-1-antitrypsin (A1AT) is one of the most investigated markers of the response to tumorigenesis. It is an essential element that protects tissues and cells against the activity of neutrophil elastase and proteinase-3, which increases during inflammation, and it participates in various metabolic activities by binding different ligands (e.g., cytokines, lipoproteins, lipids, plasmin, trypsin) with moderate to high reversible affinity [20]. Because of its pronounced conformational flexibility, A1AT can adopt different forms (polymeric, oxidized, cleaved) adapted to specific biological processes and ligand types [21,22].
Reports indicate that PTMs of A1AT induce conformational changes that modulate its structural stability and govern its catalytic activities [20]. Such PTMs enhance protein stability and sterically protect the molecule against proteolytic activity and conformation-induced aggregation and degradation, thus increasing the protein half-life [23-25]. The known A1AT PTMs play a central role in its immunomodulatory and catalytic activities. Glycosylation of K342 and V75 is essential for protein polymerization [26,27]. Recent studies showed that distinct glycosylation patterns are characteristic of patients with non-small-cell lung cancer and lung adenocarcinoma and may serve as a promising marker for non-invasive differentiation and early diagnostics [28]. Non-specific bi-antennary di-sialylated glycosylation and increased fucosylation of A1AT have been demonstrated in patients with aggressive forms of ovarian and breast cancer and presumably may contribute to rapid malignancy through partial inactivation of A1AT [29]. A high vulnerability of A1AT to modification has also been detected in non-cancer patients with rheumatoid arthritis. Non-specific carbamoylation of K359 was found in both seropositive and seronegative rheumatic patients and may enhance the autoimmune response, since the same amino acid residue can be targeted for citrullination [29]. In contrast, oxidation of M351 and M358 is responsible for binding of A1AT to the inhibitor when the protein manifests protease activity during inflammation [30], and S-nitrosylation of C232 plays a vital role in the regulation of apoptosis [31].
However, we discovered two novel PTMs (ac-K25 and ac-K125) that are distant from the known modifiable sites (Table 3). The propensity for PTM-induced conformational polymorphism emphasizes the comprehensive pattern of A1AT molecular functions. Thus, the two newly discovered PTMs may possess unknown functions and have a high potential for predicting therapy response. To date, there are no appropriate methods that use the presence of A1AT modification and the types of recognized PTMs to evaluate disease progression or management.
Albumin is one of the major serological proteins and is essential for the transportation of ligands with various properties (e.g., hormones, lipids, xenobiotics) and maintenance of osmotic blood pressure [43]. Albumin is highly attractive as a transporting tool in pharmaceutics because of its ability to bind and transfer a wide variety of functional molecules, including those of exogenous origin. Many albumin modifications with vague or well-defined roles have been reported, and they change its affinity. Glycation of C34 is the most scrutinized and best-known type of albumin PTM; it induces a conformational change and favors the binding capacity for various ligands, including warfarin, tolazamide, acetohexamide, and tolbutamide [44,45].
In contrast, S-nitrosylation of C34 is responsible for transporting anionic organic compounds and heavy cations and negatively regulates affinity for fatty acids [35,46]. The role of another type of albumin modification, cysteination of C34, has not yet been determined; however, it may diminish the binding affinity for bilirubin, tryptophan, warfarin, and diazepam [34].
Previously, we identified several types of albumin PTMs (acetylation and phosphorylation) in patients with colorectal cancer [18]. We hypothesized that albumin phosphorylation at the warfarin-binding site (C34) might be regulated by impairments of the cellular signaling network and contribute meaningfully to tumor growth and progression [18]. We suggest that the presently discovered unusual PTMs are associated with the regulation of the affinity for low-molecular-weight compounds because the indicated PTMs are localized in structural domain II (topological position 196-383) and domain III (topological position 384-585), close to the fatty acid (S342 and R348) and drug (R348-E450) binding sites [47].
Apolipoproteins (APO) bind lipids and act as ligands for cell surface receptors and co-factors for some enzymes [48]. APOA2 is a major component of high-density lipoprotein (HDL) particles and is involved in vesicle remodeling through direct interaction with APOA1. However, the possible role of the modified or intact form of APOA2 in oncogenesis is still unclear, since reports of its expression level in different cancer types are frequently contradictory (Table 3).
The complement system is believed to be part of the innate immune response. It can be initiated through three different pathways (classical, alternative, and lectin), distinguished by the manner of activation. While the classical pathway is triggered by binding to either circulating or surface-immobilized immune complexes, the lectin and alternative pathways are primed by pathogen-associated molecular patterns, which are well-recognized microbial antigens or their conserved motifs [49]. Regardless of the activation pathway, complement activation starts with cleavage of the C3 factor, which consists of 13 structural domains, followed by several consecutive conformational transformations that are essential to expose multiple interaction sites for various immune-stimulating effectors [50]. The proteolytic cleavage of C3 produces the C3b factor, whose conformational rearrangements uncover ligand-binding sites and arrange the specific thioester segments needed for covalent binding to the target antigen surface. Owing to its strong influence on angiogenesis, positive regulation of VEGF, cell migration, and extracellular matrix reorganization, C3 is frequently discussed beyond the immune system, in the context of the origin of malignancy and oncogenic properties [38]. The post-translational processing of the C3 factor involves cleavage of the tetra-arginine linker and generates an α-chain (110 kDa) and a β-chain (75 kDa) linked by a disulfide bond and containing well-recognized glycosylation sites (N917 and N63 in the α- and β-chains, respectively) [39]. Recently, phosphorylation of T1009 (pT1009) was detected as a new, as yet uncharacterized PTM, apparently produced after the processing of the cleaved C3 factor by kinases [40]. The presence of pT1009 was confirmed in patients with breast or ovarian cancer, but it was absent in the control group (Table 3).
The C3 factor can be phosphorylated at numerous sites, which is expected to affect its molecular properties and control its functional activity; however, the consequences of such processing are unclear because the role of C3 phosphorylation in cancer pathophysiology has received insufficient attention. Based on the obtained data, we suggest that phosphorylation at T1009 has high oncogenic potency. Increased fucosylation at N85 and sialylation at N939 make it possible to distinguish patients with colorectal cancer from those with adenoma when changes in protein level are not sufficient [41]. A specific glycoform pattern has been proposed as an early diagnostic tool for hepatocellular carcinoma, although the cancer-specific role of the detected glycosylation sites was not examined [51]. Global profiling of human serum revealed several new proposed glycans in patients with liver cancer [52], among which the glycan S1H5N was the most suitable for distinguishing cancer patients from healthy donors and may improve diagnostics. As shown by molecular dynamics simulation, the cancer-specific modification observed in our study (Ac-K283) induces a drastic conformational change while keeping the C3 factor stable, which probably reflects the acquisition of new activities and functional properties.
Changes in iron status have been found in the majority of patients with advanced cancer. It has been suggested that cancer-related anemia can be caused by chronic inflammation or can arise as an effect of treatment [53]. In this respect, changes in the expression of serotransferrin (TRFE) and its receptors might be targeted in anti-cancer therapy. It has been reported that cancer patients are characterized by distinct patterns of TRFE glycosylation, which may influence the binding capacity and restrict interaction with TRFE receptors [54]. High glycosylation levels at N432 and N630 have been shown specifically in patients with advanced pancreatic cancer and attributed to acute inflammation [42], thus inversely promoting the expression of HIF1α [55]. Although the versatile glycosylation sites in TRFE have been rigorously investigated in cancer phenotypes, their roles remain mostly speculative and are inferred from the iron-transporting function of the protein. In this study, we identified two new PTM sites, Ac-K331 and Ac-K453.
We found that, despite the newly attached PTMs, TRFE remains sufficiently stable; however, Ac-K453 is the site closest to the iron-binding Y445 residue, and the presence of the acetyl moiety may prevent tight binding of the second iron ion, thereby limiting the binding capacity of TRFE.

Conclusions
In this study, we reported PTMs found at non-specific localizations on molecular surfaces that have never been annotated previously and have not been detected in normal plasma or regular cells/tissues; therefore, we suggest that they are atypical. Presumably, such PTMs are characteristic of the cancer phenotype and can promote tumorigenesis. However, the majority of protein molecules maintain stability after accommodating a non-specifically localized PTM moiety, as shown by the molecular dynamics experiment.
We suggest a new approach for analyzing conformational changes and structural re-ordering caused by PTM mounting. The analysis targets a small part of the protein molecule (motif) and the active environment organized by adjacent amino acid residues. It includes secondary structural elements that are most affected after PTM of the target residue. We determined the main structural features of the studied motifs, including spatial coordinates, distances between helices, torsion angle, length, area, and polygonal perimeter of helices within the accounted motifs. It was demonstrated that the modifying moiety was always exposed outside the protein globule in the solvent environment. Following the molecular dynamics simulation analysis, we calculated the protein distribution depending on the accommodated PTM type, the resulting solvent-accessible area before and after mounting the PTM function, and the active environment, whose exposure to the surrounding solvent also changes. It has been established that the solvent-accessible area of the modified amino acid residue always exceeds that of the intact residue. However, the total accessible area of the active environment, combined with the modified residue, can be equal to or marginally smaller than that with the intact residue. The molecular dynamics simulation showed that the accommodated PTM moiety did not dramatically change the molecular stability and did not bring the protein close to the molten globule state.
Despite the high structural stability, the discovered molecular conformations might be irreversible and can contribute significantly to the control of protein functional activity. The elaborated approach can provide an excellent opportunity to evaluate the influence of PTM-caused functional flexibility on the onset and growth of tumors and improve the relevance of the currently under-evaluated PTMs.

Demography and Ethical Consideration
The study population comprised a group of patients with stage II-III breast cancer (n=24, aged 48±11 years) and a group of patients with stage II-III ovarian cancer (n=53, aged 52±12 years), who had been inpatients at the M.

Sample Preparation for MS Analysis
Following overnight fasting, peripheral blood samples (up to 5 mL) were collected in pre-chilled EDTA-2K+ vacuum tubes between 9 and 11 a.m. The collected blood samples were centrifuged for 10 min at 1,500 × g and 4 °C. The obtained plasma (supernatant) fraction was carefully transferred into clean cryotubes with a nominal volume of 2 mL.
Digestion was performed with trypsin (200 ng/µL in 30 mM acetic acid) in two stages: in the first stage, trypsin was added at a ratio of 1:50 (w/w) and the reaction was incubated for 3 h at 37 °C. In the second stage, the enzyme was added at a ratio of 1:100 (w/w), and the reaction was incubated at 37 °C for 12 h.

Mass Spectrometry Protein Registration
The analysis was conducted on a high-resolution Q Exactive-HF mass spectrometer (Thermo Scientific, Waltham, MA, USA) equipped with a nano-spray ionization (NSI) source (Thermo Scientific). Mass spectrometry parameters for data acquisition were selected according to the requirements of the Human Proteome Organization (HUPO Guidelines, bullet point 9, version 3.0.0, released October 15, 2019) for the minimal length of a detected peptide for consideration and justification of PE1 proteins (according to the UniProtKB classification).
Data acquisition was performed in positive ionization mode in the range of 420-1250 m/z for precursor ions (resolution R = 60 K) and in a range starting from a first recorded mass of 110 m/z for fragment ions (resolution R = 15 K). Precursor ions were accumulated for a maximum integration time of 15 ms, and fragment ions were accumulated for a maximum integration time of 85 ms. The top 20 precursor ions with a charge state between z = 2+ and z = 4+ were collected in the ion trap and pushed to the collision cell for fragmentation in higher-energy collisional dissociation mode. The activation energy was normalized at 27% for m/z = 524, z = 2+, and ramped within ±20% of the set value.
Analytical separation was performed using an Ultimate 3000 RSLC Nano UPLC system (Thermo Scientific). Samples (2 µg) were quantitatively loaded onto an Acclaim PepMap® enrichment column (5 × 0.3 mm, 300 Å pore size, 5 µm particle size) and washed at a flow rate of 20 µL/min for 4 min using a loading solvent (2.5% acetonitrile, 0.1% formic acid, and 0.03% acetic acid). Following the loading stage, peptides were separated on an Acclaim PepMap® analytical column (75 µm × 150 mm, 1.8 µm particle size, 60 Å pore size) in a linear gradient of mobile phases A (water with 0.1% formic acid and 0.03% acetic acid) and B (acetonitrile with 0.1% formic acid and 0.03% acetic acid) at a flow rate of 0.3 µL/min. The following elution scheme was applied: the gradient started at 2.5% of B for 3 min and was raised to 12% of B over the next 15 min, then to 37% of B over the next 27 min, and to 50% over the next 3 min. The gradient was then rapidly increased to 90% of B over 2 min and maintained for 8 min at a flow rate of 0.45 µL/min. Enrichment and analytical columns were equilibrated in the initial gradient conditions for the next 13 min at a flow rate of 0.3 µL/min before the following sample run.
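The elution scheme can be laid out as a cumulative timetable. The following minimal Python sketch transcribes the segment durations and mobile phase B percentages from the text; the names `SEGMENTS` and `timetable` are illustrative, and it is assumed that each segment begins where the previous one ends.

```python
# Gradient segments as (duration in min, %B at the end of the segment),
# transcribed from the elution scheme described in the text.
SEGMENTS = [
    (3, 2.5),    # initial hold
    (15, 12.0),  # first ramp
    (27, 37.0),  # main separation ramp
    (3, 50.0),   # final ramp
    (2, 90.0),   # rapid increase
    (8, 90.0),   # hold at 90% B
    (13, 2.5),   # re-equilibration at initial conditions
]

def timetable(segments):
    """Return a list of (start_min, end_min, percent_B_at_end) rows."""
    rows, start = [], 0
    for duration, percent_b in segments:
        rows.append((start, start + duration, percent_b))
        start += duration
    return rows

rows = timetable(SEGMENTS)
total_run_time = rows[-1][1]
for start, end, percent_b in rows:
    print(f"{start:>3}-{end:<3} min  ->  {percent_b:.1f}% B")
print("total:", total_run_time, "min")
```

Under these assumptions the segments sum to 71 min per run, including re-equilibration.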
Mass spectrometric measurements were performed using the equipment of "Human Proteome" Core Facility (IBMC, Moscow, Russia).

Protein Identification and Criteria Selection for Post-translational Modifications
Adapted peak lists were searched using the OMSSA (version 2.1.9, Proteomics Resource, Seattle, WA, USA) search engine against a concatenated target/decoy protein sequence database, UniProtKB (88,703 target sequences, with taxonomy restricted to Homo sapiens). The decoy sequences were populated by the reverse sequence algorithm of the SearchGUI engine (release 3.1.16, Compomics, Gent-Zwijnaarde, Belgium).
Peptides were parsed with a mass tolerance of 10.0 ppm for the MS1 (precursor) level and a tolerance of 0.01 Da for the MS2 (fragment ions) level. Trypsin was set as a specific protease, and a maximum of two missed (internal) cleavages were allowed. Modifications of acetyl (K), phospho (S), phospho (T), phospho (Y), and Gly-Gly (K) were selected as flexible. Peptides and proteins were identified using PeptideShaker version 1.16.11 (Compomics, Gent-Zwijnaarde, Belgium) and validated at a 1.0% false discovery rate estimated from the decoy hit distribution.
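The search settings above rest on three simple ideas: reversed decoy sequences, a relative (ppm) precursor tolerance, and a decoy-based false discovery rate. The Python sketch below illustrates them under stated assumptions. Plain string reversal and the decoy/target ratio are common textbook forms and are not necessarily the exact procedures implemented in SearchGUI or PeptideShaker. All function names and example values are hypothetical.

```python
def reverse_decoy(sequence):
    """Build a decoy by reversing a target sequence (simplest possible form)."""
    return sequence[::-1]

def within_ppm(observed_mz, theoretical_mz, tolerance_ppm=10.0):
    """True if the precursor mass error lies within the ppm tolerance."""
    error_ppm = (observed_mz - theoretical_mz) / theoretical_mz * 1e6
    return abs(error_ppm) <= tolerance_ppm

def decoy_fdr(n_target_hits, n_decoy_hits):
    """Estimate FDR as decoy hits / target hits (one common estimator)."""
    if n_target_hits == 0:
        return 0.0
    return n_decoy_hits / n_target_hits

print(reverse_decoy("PEPTIDEK"))     # hypothetical peptide sequence
print(within_ppm(524.005, 524.0))    # about 9.5 ppm error
print(within_ppm(524.006, 524.0))    # about 11.5 ppm error
print(decoy_fdr(1000, 10))           # hypothetical hit counts
```

At m/z 524, a 10 ppm window corresponds to roughly ±0.005 m/z, which is why the first example passes and the second does not.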
To eliminate probable false positive results due to the concatenated search of several PTMs, we retained only those results that fit the following requirements: (a) at least 98% confidence for peptide identification, (b) at least 80% of peptide sequence coverage by fragmentation spectra, and (c) at least 10 units of D-score for PTM probability. Furthermore, the extracted data were manually curated to avoid possible false identifications. During this step, we removed those spectra that did not contain the proper y/b fragment ion pairs that mark and locate the exact amino acid residue carrying the PTM.
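The three automated requirements (a) to (c) amount to a conjunctive filter over identification records. The sketch below shows one way to express it in Python. The `PtmHit` record, its field names, and the example values are hypothetical and do not reflect the PeptideShaker export format. The manual curation of y/b ion pairs is not modeled.

```python
from dataclasses import dataclass

@dataclass
class PtmHit:
    peptide: str
    confidence: float  # peptide identification confidence, %
    coverage: float    # sequence coverage by fragmentation spectra, %
    d_score: float     # D-score for PTM probability

def passes_curation(hit, min_conf=98.0, min_cov=80.0, min_d=10.0):
    """Apply requirements (a), (b), and (c) from the text to one record."""
    return (hit.confidence >= min_conf
            and hit.coverage >= min_cov
            and hit.d_score >= min_d)

# Hypothetical records for illustration only.
hits = [
    PtmHit("PEPTIDEK", confidence=99.1, coverage=85.0, d_score=12.3),
    PtmHit("SAMPLER", confidence=97.0, coverage=90.0, d_score=15.0),
]
kept = [h.peptide for h in hits if passes_curation(h)]
print(kept)
```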
To be considered relevant to the cancer phenotype and included in the structural and molecular dynamics analysis, the detected PTM moieties had to meet the following criteria: (a) the total set of PTM moieties must be detectable and identified in at least 50% of each cancer phenotype; (b) each PTM moiety should be identified in at least 20% of PTM-carrying subjects.
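The two prevalence criteria can be read in more than one way. The sketch below encodes one possible interpretation, which is an assumption rather than the authors' stated procedure: criterion (a) is taken to mean that at least 50% of the subjects in a group carry at least one of the PTMs, and criterion (b) that each retained PTM occurs in at least 20% of those carriers. The function name and the toy data are hypothetical.

```python
def relevant_ptms(ptms_by_subject, min_group_fraction=0.5,
                  min_carrier_fraction=0.2):
    """Return PTMs passing both prevalence criteria within one group.

    ptms_by_subject maps a subject ID to the set of PTMs found in that subject.
    """
    n_subjects = len(ptms_by_subject)
    carriers = [ptms for ptms in ptms_by_subject.values() if ptms]
    # Criterion (a): enough subjects in the group carry any PTM from the set.
    if n_subjects == 0 or len(carriers) / n_subjects < min_group_fraction:
        return set()
    # Criterion (b): each PTM must be frequent enough among the carriers.
    counts = {}
    for ptms in carriers:
        for ptm in ptms:
            counts[ptm] = counts.get(ptm, 0) + 1
    return {ptm for ptm, n in counts.items()
            if n / len(carriers) >= min_carrier_fraction}

# Toy group of 10 subjects: 5 carry ptm_A, 1 carries ptm_B, 4 carry none.
example = {f"s{i}": {"ptm_A"} for i in range(1, 6)}
example["s6"] = {"ptm_B"}
example.update({f"s{i}": set() for i in range(7, 11)})
selected = relevant_ptms(example)
print(selected)
```

In this toy group, 6 of 10 subjects are carriers, so criterion (a) holds. The PTM found in only 1 of 6 carriers falls below the 20% cut-off and is dropped.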

Analysis of Post-translational Protein Modifications
In the present study, proteins containing tryptic peptides with a certain type of modification were selected from the PDB [56]. The selected proteins belong to the class of alpha-helical and globular proteins. For each peptide, a sample was created to determine the conformational template. Furthermore, all motifs containing the tryptic peptides were selected from the database. Our previously developed method was used to recognize and select structural motifs [57][58][59].
The secondary structure of proteins was determined using the DSSP method of Kabsch and Sander [60]. Using the same program, the contact surfaces accessible to the solvent were determined. Definitions of important structural motif characteristics have been described in previous studies [57,59,61].
Visual analysis of the structures was performed using the RasMol molecular graphics program [62].
In total, the DSSP program distinguishes three types of helices: α-helix, π-helix, and 3₁₀-helix. The DSSP program also determines the beginning and end of each helix. A candidate for the desired structure (helical pair) is a protein region that contains two helices and the segment of the polypeptide chain between them, which is called the connection. For each helix of the structure, the axis of the cylinder around which the helix is wound was determined using the least squares method. The closer the helix is to ideal, the more precisely the axis of the cylinder is found.
The quality of the axis assessment was characterized by the value of the standard deviation. We selected those helices (and, accordingly, structures) for which the accuracy of the axis estimation satisfies a predetermined criterion. The two axes of the helices define the spatial structure completely. It is known that two parallel planes can be drawn through two non-intersecting straight lines in space such that the first axis belongs to the first plane and the second to the second plane. An axis lying in one plane can be projected onto the other plane. Thus, the spatial structure is fully described by the distance between the parallel planes and the projections of the axes of the helices onto the plane.
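The parallel-plane construction can be made concrete with elementary vector algebra. The Python sketch below assumes that each helix axis is already available as a point and a direction vector. It computes the distance between the two parallel planes and the angle between the axis directions, which equals the angle between their projections onto a common plane. The function names and the example axes are illustrative and are not taken from the cited method.

```python
import math

def cross(a, b):
    """Cross product of two 3D vectors."""
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])

def dot(a, b):
    """Dot product of two 3D vectors."""
    return sum(x * y for x, y in zip(a, b))

def interplanar_distance(p1, u1, p2, u2):
    """Distance between the parallel planes containing two non-parallel axes.

    Each axis is given by a point p and a direction u. The common normal of
    both planes is u1 x u2.
    """
    n = cross(u1, u2)
    norm = math.sqrt(dot(n, n))
    if norm < 1e-12:
        raise ValueError("axes are parallel; the plane pair is not unique")
    diff = tuple(b - a for a, b in zip(p1, p2))
    return abs(dot(diff, n)) / norm

def axis_angle_deg(u1, u2):
    """Angle in degrees between the two axis directions."""
    cos_a = dot(u1, u2) / math.sqrt(dot(u1, u1) * dot(u2, u2))
    return math.degrees(math.acos(max(-1.0, min(1.0, cos_a))))

# Hypothetical axes: one along x through the origin, one along y at z = 10 Å.
d = interplanar_distance((0, 0, 0), (1, 0, 0), (0, 0, 10), (0, 1, 0))
angle = axis_angle_deg((1, 0, 0), (0, 1, 0))
print(d, angle)
```

For infinite lines this inter-planar distance is also the shortest distance between them. For finite helix segments the minimal distance can be larger, which is why the two distances are treated as separate quantities.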
To establish the stability of structural motifs with and without the modification, a numerical molecular dynamics experiment was planned and carried out. Molecular dynamics simulations were performed using the AMBER software (version 11) [63]. To calculate the molecular trajectories for the selected proteins, the experiments were performed without consideration of the water environment, using the AMBER ff03 force field at 300 K [64]. The total energy of the considered system was first minimized at fixed atomic coordinates to set up the order of atomic interactions. The molecular system was then heated to the selected temperature (300 K), and molecular trajectories were recorded for 0.5 ns at intervals of 0.005 ns. The resulting molecular trajectories were visualized using the VMD (version 1.9.1) software [65]. The free energy for the binding profile of the molecular complexes was estimated by Born's method [66], and the distance between atoms was calculated using the CPPTRAJ software (a part of the AMBER (version 11) software). The spatial geometry of polypeptide chains was characterized according to the rules of Kabsch and Sander [60].
The following geometric features [18] of the assayed motifs with attached PTMs were recorded every 0.005 ns: minimal (r) and inter-planar (d) distances between helices, α and torsion (θ) angles between the axes of helices, and area (S) and perimeter (P) of the polygons formed by the intersection of the helix projections. Means and standard deviations were recorded and used to define the stability of the protein molecule relative to the initial intact condition. The geometric features were recorded for motifs that met the following criteria: (1) the presence of a PTM moiety; (2) the PTM must be localized on or close to the helix, or at least on the polypeptide chain lining the affected helices; (3) helices should be in tight contact, and the minimal and inter-planar distances must be equal (r = d ≤ 16 Å), while the area and the perimeter must be non-null and not close to zero [57,59-61].
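The per-frame bookkeeping described above can be sketched in a few lines of Python. The example assumes that the feature values have already been extracted from the trajectory. The tolerance used to decide that r and d are equal and the threshold for "not close to zero" are assumptions, since the text does not give numerical values for them. All names and numbers are hypothetical.

```python
from statistics import mean, stdev

def summarize(values):
    """Mean and sample standard deviation of one feature over all frames."""
    return mean(values), stdev(values)

def meets_contact_criteria(r, d, area, perimeter,
                           max_dist=16.0, tol=0.1, eps=1e-6):
    """Check criterion (3): r = d <= 16 Å with non-zero area and perimeter.

    tol (Å) and eps are assumed values, not taken from the text.
    """
    distances_agree = abs(r - d) <= tol
    return distances_agree and d <= max_dist and area > eps and perimeter > eps

# Hypothetical per-frame inter-planar distances (Å) for one motif.
d_frames = [9.0, 10.0, 11.0]
d_mean, d_sd = summarize(d_frames)
print(d_mean, d_sd)
print(meets_contact_criteria(r=10.0, d=10.0, area=25.0, perimeter=20.0))
```

With a 0.5 ns trajectory sampled every 0.005 ns, each feature would be summarized over about 100 frames rather than the three shown here.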

Statistical Analysis
Mass spectrometric intensities below the instrument-adjusted threshold were converted by OMSSA to null values and imputed with the minimal threshold value of 10^5 counts. Proteins were quantified using intensity values representing normalized summed peptide intensities that correlate with protein abundances.
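The imputation step is a simple floor replacement. The Python sketch below assumes that null values are represented as `None` or 0. The function name and the example intensities are hypothetical, and the normalization of summed peptide intensities is not modeled because the text does not specify it.

```python
MIN_INTENSITY = 1e5  # minimal threshold value stated in the text, in counts

def impute_minimum(intensities, floor=MIN_INTENSITY):
    """Replace null (None or 0) intensities with the minimal threshold value."""
    return [floor if not x else x for x in intensities]

raw = [3.2e6, 0, None, 7.5e5]  # hypothetical intensities for one protein
imputed = impute_minimum(raw)
print(imputed)
```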
Principal component analysis was performed for the set of proteins shared between the study groups. To assess the similarity of the identified proteomes, an upset plot was generated using the UpSetR package [67]. Proteins attributed to certain pathology groups and featuring an arbitrary fold change cut-off of >2 and a significance p-value of <0.05 (Wilcoxon test) were considered meaningfully different in quantity.
Statistical analyses were performed using an in-house script in R.
Significantly altered proteins were submitted for functional and pathway annotation analysis at a q-value threshold of q < 0.01 using the PANTHER Overrepresentation Test of the Gene Ontology toolset [68], with Bonferroni correction for multiple testing. The enriched terms were refined with a similarity coefficient of >0.7.

Table 1. The list of proteins carrying the defined PTMs and found specifically in patients with ovarian cancer or patients with breast cancer. Peptides are listed with the main accompanying mass-spectrometric characteristics: PSM, peptide-spectrum matches (the number of spectra matching the theoretical peptide sequence with a high score); b- and y-type fragment ions are the N- and C-terminal sequential fragment ions, respectively, generated by decay of the peptide (precursor ion). Complete results are available in the Supplemental Materials. Ac(K), acetylation of lysine; p(Y), p(T), p(S), phosphorylation of tyrosine, threonine, and serine, respectively; Seq., %, amino acid coverage for the identified peptides; b/y, the number of detected b- and y-type fragment ions.

The distribution of the protein population with the identified PTMs depending on the solvent-accessible area of a given amino acid residue and its active environment before and after attachment of the PTM. The OX axis indicates the surface area accessible to the surrounding solvent (in Å²); the OY axis indicates the number of affected protein molecules. The color code defines amino acids before modification (blue dashed line), amino acids after modification (pink dashed line), the active environment before modification (blue solid line), and the active environment after modification (pink solid line).