Proteomic and Bioinformatic Investigations of Heat-Treated Anisakis simplex Third-Stage Larvae

Anisakis simplex third-stage larvae are the main source of hidden allergens in marine fish products. Some Anisakis allergens are thermostable and, even when highly processed, could cause hypersensitivity reactions. However, the Anisakis proteome has not been studied under autoclaving conditions of 121 °C for 60 min, which is an important process in the food industry. The aim of the study was the identification and characterization of allergens, potential allergens, and other proteins of heat-treated A. simplex larvae. Liquid chromatography-tandem mass spectrometry (LC-MS/MS) was used to identify 470 proteins, including five allergens (Ani s 1, Ani s 2, Ani s 3, Ani s 4, and Ani s 5) and 13 potential allergens that were mainly homologs of Anisakis spp., Ascaris spp., and Acari allergens. Ani s 2, Ani s 3, Ani s 5, and three possible allergens were found among the top 25 most abundant proteins. The computational analysis allowed us to detect allergen epitopes, assign protein families and domains, and annotate the localization of proteins. The predicted 3D models of proteins revealed similarities between potential allergens and homologous allergens. Despite the partial degradation of heated A. simplex antigens, their immunoreactivity with anti-A. simplex IgG antibodies was confirmed using a Western blot. In conclusion, the identified epitopes of allergenic peptides highlighted that the occurrence of Anisakis proteins in thermally processed fish products could be a potential allergic hazard. Further studies are necessary to confirm the IgE immunoreactivity and thermostability of the identified proteins.


Introduction
Foodborne parasites are one of the most important causative agents of human infectious diseases, especially in less developed countries [1,2]. Climate change, new feeding habits, and the globalization of food supply chains may increase the worldwide incidence of some foodborne diseases [3,4]. Unfortunately, foodborne parasites remain neglected compared with bacterial and viral pathogens [3]. Therefore, studies in this field, such as epidemiological surveys [5,6], the development of novel diagnostic tools [7-9], drug discovery [10], or the investigation of pathogenicity [11], are particularly valuable.
Anisakis spp. are among the most important fish-borne parasites [2]. Live third-stage larvae (L3) of Anisakis simplex consumed with fish or seafood dishes can cause a human disease called anisakiasis. Over 20,000 cases of anisakiasis had been reported worldwide before 2010 [12]. Bao et al. [13] estimated that the total number of worldwide anisakidosis (almost all anisakiasis) cases up to December 2017 might be over 76,000. According to a report of the Orphanet (the portal of rare diseases and orphan [...]
[...] (Thermo Electron Corp., San Jose, CA, USA) working in the regime of data-dependent MS to MS/MS switch with higher-energy collisional dissociation (HCD) type peptide fragmentation. A blank run ensuring the absence of cross-contamination from previous samples preceded each analysis.
Mass spectrometric data were preprocessed with Mascot Distiller software (ver. 2.6; Matrix Science, London, UK; http://www.matrixscience.com/distiller.html) and analyzed with the Mascot search engine server (ver. 2.5; Matrix Science, London, UK; http://www.matrixscience.com/server.html) against the A. simplex reference proteome (20,786 sequences; proteome ID: UP000036680) obtained from the Universal Protein Resource (UniProt, http://www.uniprot.org/). To reduce mass errors, the peptide and fragment mass tolerance settings were established separately for individual LC-MS/MS runs after a measured mass recalibration, resulting in values of 5 ppm for the parent and 0.01 Da for the fragment ions in HCD MS/MS mode. Peptide sequences were searched using trypsin specificity, allowing one missed cleavage; the ion type was set as monoisotopic, and the protein mass as unrestricted. Beta-methylthiolation of cysteine was used as a fixed modification, whereas oxidation of methionine was set as a variable modification. The score threshold for all samples was set to 50 to match the highest score threshold computed by the Mascot software. Only proteins identified in all three biological replicates were accepted. Quantification of protein abundance was performed using the exponentially modified protein abundance index (emPAI) provided by Mascot.
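The acceptance rule described above (a score threshold of 50 and detection in all three biological replicates) can be illustrated as a set intersection. This is only a sketch under stated assumptions: it treats the threshold as a per-protein score, and the function name, data layout, accessions, and scores are hypothetical and are not taken from the deposited dataset.

```python
from typing import Dict, List, Set

SCORE_THRESHOLD = 50.0  # cut-off stated in the text


def proteins_in_all_replicates(
    replicates: List[Dict[str, float]],
    threshold: float = SCORE_THRESHOLD,
) -> Set[str]:
    """Return accessions whose score meets the threshold in every replicate."""
    if not replicates:
        return set()
    # One set of passing accessions per replicate.
    passing = [
        {acc for acc, score in rep.items() if score >= threshold}
        for rep in replicates
    ]
    # A protein is accepted only if it passes in all replicates.
    return set.intersection(*passing)


# Hypothetical accessions and scores for three biological replicates.
rep1 = {"P1": 120.0, "P2": 55.0, "P3": 48.0}
rep2 = {"P1": 98.0, "P2": 61.0, "P3": 77.0}
rep3 = {"P1": 101.0, "P2": 49.0}

accepted = proteins_in_all_replicates([rep1, rep2, rep3])
print(sorted(accepted))
```

In this toy input, only the protein that passes the threshold in every replicate is retained; a protein missing from, or scoring below 50 in, a single replicate is discarded.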
The mass spectrometry proteomics data were deposited to the ProteomeXchange Consortium via the PRIDE [45] partner repository with the dataset identifier PXD018059 and 10.6019/PXD018059.

Bioinformatic Analysis
The functional annotation of the identified proteins, including gene ontology (GO) and InterPro analyses, was performed using OmicsBox software (ver. 1.2.4; BioBam Bioinformatics SL, Valencia, Spain, https://www.biobam.com/omicsbox/) based on the Blast2GO annotation methodology [46]. Annotations were run with the default settings, as we previously described [23], and a list of annotations was filtered using nematode taxonomy to improve the prediction accuracy.
Experimentally verified epitopes in the peptides of Anisakis allergens were searched in the Immune Epitope Database (IEDB; last updated on June 14, 2020; https://www.iedb.org/). Allergens in which epitopes were not found in IEDB were subjected to in silico prediction for the detection of potential epitopes in the peptides of these allergens. Bioinformatic detection of potential epitopes was performed using DNASTAR Protean 3D software (ver. 17.0.2.1; DNASTAR, Madison, WI, USA). B-cell epitopes were predicted by applying a confidence threshold of 0.7. Possible epitopes of major histocompatibility complex class II (MHC II) molecules were detected using the default settings. The mapping of potential T-cell epitopes was performed by combining AMPHI and Rothbard-Taylor methods using the default settings.
The ExPASy Compute pI/Mw tool (https://web.expasy.org/compute_pi/) was applied for the calculation of the theoretical isoelectric point (pI) and molecular weight (Mw) of detected proteins.
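For orientation, a theoretical Mw can be approximated from a sequence by summing average residue masses and adding one water molecule. The sketch below is an assumption-laden illustration of that arithmetic, not a reimplementation of the ExPASy tool; the function name and the example sequences are hypothetical.

```python
# Average residue masses (Da) of the 20 standard amino acids.
AVERAGE_RESIDUE_MASS = {
    "G": 57.0519, "A": 71.0788, "S": 87.0782, "P": 97.1167, "V": 99.1326,
    "T": 101.1051, "C": 103.1388, "L": 113.1594, "I": 113.1594, "N": 114.1038,
    "D": 115.0886, "Q": 128.1307, "K": 128.1741, "E": 129.1155, "M": 131.1926,
    "H": 137.1411, "F": 147.1766, "R": 156.1875, "Y": 163.1760, "W": 186.2132,
}
WATER_MASS = 18.01528  # one water per chain (free N- and C-termini)


def molecular_weight(sequence: str) -> float:
    """Approximate average molecular weight (Da) of an unmodified peptide."""
    total = WATER_MASS
    for residue in sequence.upper():
        if residue not in AVERAGE_RESIDUE_MASS:
            raise ValueError(f"Unsupported residue: {residue!r}")
        total += AVERAGE_RESIDUE_MASS[residue]
    return total


# Hypothetical short sequences used only to show the calculation.
print(round(molecular_weight("G"), 3))
print(round(molecular_weight("GA"), 3))
```

The sketch ignores post-translational modifications and non-standard residues, so values for real proteins may differ from those returned by dedicated tools.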

Comparative SDS-PAGE and IgG-WB Analyses of A. simplex Antigens
SDS-PAGE and WB analyses were performed to investigate the influence of high temperature on the Anisakis antigen. Figure 1a shows the SDS-PAGE multiband profiles of native and heat-treated CR antigens of A. simplex. The band profile of the antigen heated for 60 min at 100 °C was similar to that of the native antigen; only a few high molecular mass bands (132-244 kDa) present in the native antigen were not visible in the heated antigen. The intensity of the bands was slightly lower than in the native antigen. The SDS-PAGE profile of the antigen autoclaved at 121 °C for 60 min was characterized by diffuse band patterns with high background and a reduced number of bands compared to the native or heated (at 100 °C) antigens. However, bands were visible at the following molecular weights: 16-18, 20, 24-26, 34, and about 60 kDa. The IntDen values of all three SDS-PAGE profiles were very similar and ranged from 7.25E+07 to 8.32E+07.
Figure 1. Colloidal Coomassie-stained SDS-PAGE analysis (color inversion mode) of the following three crude (CR) antigens of A. simplex: native, heated for 60 min at 100 °C, and autoclaved for 60 min at 121 °C (a). Western blot analysis of the reactivity of anti-A. simplex rabbit IgG antibodies against the following CR antigens of A. simplex: native, heated for 60 min at 100 °C, and autoclaved for 60 min at 121 °C (b). Pos.: membrane incubated with hyperimmune serum from a rabbit immunized with A. simplex CR antigen; Neg.: membrane incubated with rabbit preimmune serum. Molecular weight (Mw) estimations, as performed by Bio-1D software, are presented in kilodaltons (kDa). The integrated density (IntDen) was calculated by ImageJ software.
The WB profiles of Anisakis antigens are presented in Figure 1b. The profiles of both heated antigens were generally consistent with that of the native antigen. As in the SDS-PAGE profiles, the number and intensity of bands were reduced, and a background appeared. The background in the WB profile of the autoclaved antigen was slightly higher than in that of the antigen heated at 100 °C. The IntDen values of the WB profiles were very similar and ranged from 5.82E+07 to 6.46E+07.
The background in the SDS-PAGE and WB profiles of heated antigens was higher than in the native antigen, probably due to degradation. However, the band pattern confirmed that degradation was only partial. Furthermore, the epitopes of degraded proteins were probably not damaged.
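To make the IntDen comparison concrete, the sketch below computes an integrated density as the sum of pixel values in a region (equivalent to area multiplied by mean gray value) and expresses the reported extremes as a ratio. The pixel matrix and function names are hypothetical; only the four IntDen limits come from the text, and the text does not state which lane produced which value.

```python
from typing import List


def integrated_density(pixels: List[List[float]]) -> float:
    """Sum of all pixel values in a region (area x mean gray value)."""
    return sum(sum(row) for row in pixels)


def min_to_max_ratio(lowest: float, highest: float) -> float:
    """Ratio of the lowest to the highest integrated density in a set of lanes."""
    return lowest / highest


# Hypothetical 2 x 3 pixel region, used only to show the summation.
lane = [[0, 10, 20], [5, 15, 25]]
print(integrated_density(lane))

# Reported IntDen limits for the SDS-PAGE and WB profiles.
sds_ratio = min_to_max_ratio(7.25e7, 8.32e7)
wb_ratio = min_to_max_ratio(5.82e7, 6.46e7)
print(round(sds_ratio, 3), round(wb_ratio, 3))
```

Under these assumptions, the lowest SDS-PAGE value is roughly 87% of the highest and the lowest WB value roughly 90% of the highest, which is in line with the description of the profiles as very similar.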

Identification and Characterization of A. simplex Proteins
A total of 470 proteins were detected in all three biological replicates of shotgun LC-MS/MS analysis. The identification was performed with high confidence, as the peptide false discovery rate (FDR) calculated by Mascot was 0.99% in all cases. A detailed list of all identified proteins with UniProt IDs, protein names, gene names, and OmicsBox annotations is presented in Supplemental File S1.
Detected proteins were displayed on the 3D scatter plot (Figure 2) based on theoretical Mw, theoretical pI, and estimated relative protein abundance. The molecular weights of all proteins ranged from 4077 to 807,425 Da. An uncharacterized protein (UniProt ID: A0A0M3JJQ2) had the lowest Mw of all detected proteins, while twitchin (UniProt ID: A0A158PN23) had the highest Mw. The majority of proteins (n = 396) were in the Mw range from about 10 kDa to 94.19 kDa. The estimated pI values of the proteins were in the range of 4.15-10.8. The lowest pI value was calculated for troponin-like protein (UniProt ID: A0A0M3JU57), whereas the highest value was estimated for 60S ribosomal protein L8 (UniProt ID: A0A0M3K2K1). The pI values for the majority of the proteins were in one of the following two ranges: 4.15-7.19 (n = 320) and 8-9.59 (n = 101).
Figure 2. Three-dimensional scatter plot analysis of A. simplex proteins (n = 470) identified using LC-MS/MS. The proteins were distributed based on molecular weight (Mw), isoelectric point (pI), and the relative abundance of proteins. The Mw and pI values were calculated using the ExPASy Compute pI/Mw tool, while the relative abundance was estimated based on the average exponentially modified protein abundance index (emPAI) values (mean of three biological replicates) calculated by Mascot. The relative abundance is shown in log2 scale. Red cubes are allergens (n = 5), and cyan cubes are potential allergens (n = 13). Cubes (allergens and potential allergens) are labeled with UniProt IDs and have a vertical line for better pI and Mw visualization. Yellow spheres are the other identified proteins. The scatter plot was constructed using Teraplot software (ver. 1.4.06; Kylebank Software Ltd., Ayr, UK).
The protein abundance was estimated using emPAI, which provides an approximate relative quantification of proteins based on the number of observed peptides divided by the number of observable peptides. The emPAI values estimated by Mascot were in the range 0.02-201.76. Among all identified proteins, the lowest emPAI values were calculated for collagen alpha-1(IV) chain (emPAI = 0.02; UniProt ID: A0A158PPJ2), uncharacterized protein (emPAI = 0.03; UniProt ID: A0A158PP53), calcium-transporting ATPase (emPAI = 0.03; UniProt ID: A0A0M3JTY4), and uncharacterized protein (emPAI = 0.03; UniProt ID: A0A0M3K3G9). The most abundant proteins were the following: myosin essential light chain (emPAI = 201.76; UniProt ID: A0A0M3K6N3), tropomyosin (emPAI = 109.71; UniProt ID: A0A0M3KCE6), and DUF4440 domain-containing protein (emPAI = 81.05; UniProt ID: A0A0M3J349). Table 1 shows the top 25 most abundant proteins, with emPAI values in the range of 13.78-201.76. The Mw and pI values of the majority of highly abundant proteins were in the ranges 10-29.9 kDa (n = 21) and 4.5-6.39 (n = 20), respectively. The distribution of log2-transformed values of the emPAI of all proteins is illustrated in Figure 2. The average emPAI values were log2-transformed to normalize the data. As shown in Figure 2, most of the log2-transformed emPAI values (n = 413) were in the range from −4.2 to 2.99.
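The relationship between peptide counts, emPAI, and the log2 scale used in Figure 2 can be sketched as follows, assuming the standard definition emPAI = 10^(observed/observable) − 1. The function names and the replicate values in the example are hypothetical; Mascot's own estimate of the number of observable peptides is not reproduced here.

```python
import math
from typing import Sequence


def empai(n_observed: int, n_observable: int) -> float:
    """emPAI = 10 ** (observed / observable) - 1 (standard definition)."""
    if n_observable <= 0:
        raise ValueError("n_observable must be positive")
    return 10 ** (n_observed / n_observable) - 1


def log2_mean_empai(replicate_values: Sequence[float]) -> float:
    """log2 of the mean emPAI across replicates, as plotted on a log2 axis."""
    mean_value = sum(replicate_values) / len(replicate_values)
    return math.log2(mean_value)


# A protein with all observable peptides detected has emPAI = 9.
print(empai(10, 10))

# Hypothetical emPAI values from three replicates of one protein.
print(round(log2_mean_empai([1.5, 2.0, 2.5]), 2))

# log2 of the reported extremes of the emPAI range (0.02 and 201.76).
print(round(math.log2(0.02), 2), round(math.log2(201.76), 2))
```

On this scale, the reported emPAI extremes correspond to log2 values of about −5.64 and 7.66, so the interval from −4.2 to 2.99 that contains most proteins lies well inside the full range.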
InterPro analysis was performed to characterize and classify the detected proteins of A. simplex. Three hundred twenty different protein families and 203 domains were detected based on sequence analysis of all 470 proteins, and the most highly represented of them are displayed in Figure 3. A complete list of InterPro matches is presented in Supplemental File S1. The most abundant InterPro families (Figure 3a) were immunoglobulin-like fold (19 sequences), followed by the NAD(P)-binding domain superfamily (15 sequences), immunoglobulin-like domain superfamily (12 sequences), P-loop containing nucleoside triphosphate hydrolase (10 sequences), and EF-hand domain pair (9 sequences). Most of the identified protein families (n = 220) were represented by only one sequence. As shown in Figure 3b [...]
InterPro analysis of the 25 most abundant proteins allowed the identification of 10 families and 9 domains (see Table 1). Among them, the following protein families were reported slightly more often (two matches each): alpha-crystallin/heat shock protein, EF-hand domain pair, HSP20-like chaperone, intermediate filament rod domain coil 1B, and tropomyosin. The most abundant InterPro domain was the intermediate filament rod domain (three matches), followed by the alpha-crystallin/Hsp20 domain, domain of unknown function DUF148, EF-hand domain, and myosin tail (two matches each).
Gene ontology cellular component annotation was used to predict the localization of the detected proteins. Figure [...] GO:0043229) and organelle (14%; GO:0043226). Furthermore, sarcomere GO terms (8%; GO:0030017) were of higher abundance, while cytosol (3%; GO:0005829) and membrane-enclosed lumen (2%; GO:0031974) annotations were less represented than in the case of all proteins for cellular component annotations. Other differences in the distribution of cellular component annotations were smaller and are displayed in Figure 4.

Identification and Characterization of Detected A. simplex Allergens and Potential Allergens
Among the identified proteins, the following A. simplex allergens were found: Ani s 1, Ani s 2, Ani s 3, Ani s 4, and Ani s 5. We found 13 potential allergens using the AllergenOnline.org server. All detected allergens and potential allergens are listed in Tables 2 and 3, respectively. Table 4 shows all identified peptides of Ani s 1 and Ani s 5 together with the experimentally verified epitopes from IEDB that matched these peptides. B-cell and T-cell epitopes were found in all six peptides of Ani s 1, while B-cell epitopes were detected in 8 of 13 peptides of Ani s 5. Experimentally confirmed epitopes of Ani s 2, Ani s 3, and Ani s 4 were not found in IEDB. In silico predicted epitopes in Ani s 2, Ani s 3, and Ani s 4 peptides are shown in Table 5. Among all 74 Ani s 2 peptides, MHC II, T-cell, and B-cell epitopes were predicted in 15, 31, and 10 peptides, respectively. Ten and nine of all 34 Ani s 3 peptides were matched with T-cell and B-cell epitopes, respectively. T-cell epitopes were found in all five peptides of Ani s 4.
Additionally, the native tertiary structures of the identified allergens were modeled to obtain structural insights into the characteristics of the proteins. Figure 5 shows the native 3D structures of Ani s 1, Ani s 2, Ani s 3, and Ani s 4 modeled with high confidence (≥ 77% of residues modeled at > 90% confidence) using the Phyre2 server and the structure of Ani s 5 derived from RCSB PDB (ID: 2MAR). The displayed structures clearly differentiated various protein and allergen classes.
The largest group of potential allergens (five proteins) were homologous to A. simplex allergens (Ani s 2, Ani s 5, Ani s 8, Ani s 9, and Ani s troponin C). Four potential allergens were homologous to the following Acari (Tyrophagus putrescentiae, Dermatophagoides farinae, Dermatophagoides pteronyssinus) allergens: Tyr p 28, Der f 33, and Der f 33-like. Another two proteins were homologous to the Ascaris lumbricoides allergen Asc l 3. The last two potential allergens of A. simplex were homologous to the Olea europaea allergen Ole e 15 and the Aedes aegypti allergen Aed a 8. Three-dimensional native structures of the homologous allergens for potential allergens of A. simplex are shown in Figure 6 (≥ 85% of residues modeled at > 90% confidence), except for the Ani s 2 and Ani s 5 models, which are presented in Figure 5. Additionally, the alignment of 3D models of potential allergens and their homologous allergens was performed to visualize the similarities in protein structures (see Supplemental File S2). The displayed 3D models of allergens and potential allergens showed similarities among the proteins belonging to the same allergen classes. A. simplex allergens and homologs of potential allergens (see Tables 2 and 5) were classified into allergen families based on the AllFam database.
The following five AllFam families were detected among the identified allergens: animal Kunitz serine protease inhibitor (Ani s 1), myosin heavy chain (Ani s 2), tropomyosin (Ani s 3), cystatin (Ani s 4), and SXP/RAL-2 family (Ani s 5). In the case of homologs of potential allergens, six AllFam families were found. The most highly represented AllFam families of homologous allergens were heat shock protein 70 (Hsp70) and SXP/RAL-2 family (three matches for each), followed by tropomyosin (two matches). The least represented allergen families (one match for each) were the following: EF-hand family, myosin heavy chain, and tubulin/FtsZ family. Two homologous allergens (Ole e 15 and Der f 33-like) were not reported in the AllFam database.
The detected allergens were in the Mw/pI range of 10291-100461 Da/4.68-7.48, while potential allergens were in the range of 9756-101686 Da/4.15-9.04. The lowest Mw of the detected allergens and potential allergens was found for Ani s 4 and an uncharacterized protein (UniProt ID: A0A0M3J8H0), respectively. Ani s 2 (paramyosin) was the allergen with the highest calculated Mw, while the highest Mw value of potential allergens was found for paramyosin (UniProt ID: A0A158PP35). Ani s 3 (tropomyosin) and troponin-like protein (UniProt ID: A0A0M3JU57) had the lowest calculated pI values of allergens and potential allergens, respectively. Ani s 1 was the allergen with the highest pI, whereas the highest pI of potential allergens was calculated for the peptidyl-prolyl cis-trans isomerase (UniProt ID: A0A0M3J6G4).
The identified allergens were in the range of emPAI values from 2.53 (Ani s 1) to 62.69 (Ani s 3), while the potential allergens were in the range from 0.42 (tubulin alpha chain; UniProt ID: A0A0M3K821) to 109.71 (tropomyosin; UniProt ID: A0A0M3KCE6). In the top 25 most abundant proteins (see Table 1), the following allergens were found: Ani s 2, Ani s 3, and Ani s 5, and the following possible allergens: tropomyosin (UniProt ID: A0A0M3KCE6), SXP/RAL-2 family protein 2 isoform 1 (UniProt ID: A0A0M3KA05), and paramyosin (UniProt ID: A0A158PP35). The distribution of the detected allergens and potential allergens based on the Mw, pI, and emPAI values is displayed in Figure 2.
All five detected allergens were classified using InterPro analysis, whereas the following three potential allergens had no domain or family matches: uncharacterized protein (UniProt ID: A0A0M3J8H0), tubulin alpha chain (UniProt ID: A0A0M3K821), and tubulin alpha chain (UniProt ID: A0A0M3KAH2). Among the identified allergens, two InterPro families (pancreatic trypsin inhibitor Kunitz domain superfamily and tropomyosin) and four InterPro domains (pancreatic trypsin inhibitor Kunitz domain, myosin tail, cystatin domain, and domain of unknown function DUF148) were detected. Six InterPro families of potential allergens were found, of which the most abundant (three matches for each) were the heat shock protein 70 family and heat shock protein 70 kD peptide-binding domain superfamily. Less represented (two matches for each) were the following InterPro families: tropomyosin and heat shock protein 70 kD C-terminal domain superfamily, while the least abundant (one match for each) were EF-hand domain pair and cyclophilin-like domain superfamily. Five domains of all possible allergens were identified. The most abundant was the domain of unknown function DUF148 (two matches), and the other domains (one match for each) were the following: myosin tail, cyclophilin-type peptidyl-prolyl cis-trans isomerase domain, EF-hand domain, and endoplasmic reticulum chaperone binding immunoglobulin protein (BIP) nucleotide-binding domain.

Discussion
The sterilization process is widely used for preserving many types of food, including products containing the meat of sea fish. Studies evaluating the effects of such a process on the A. simplex proteins, which could contaminate fish products, are only fragmentary. Therefore, in the present work, we attempted to reduce the knowledge gap in this topic, focusing especially on the proteome of Anisakis. The current survey is the first proteome and allergome profiling of A. simplex under autoclaving conditions.

Electrophoretic and Immunological Investigations of the Influence of High Temperature on A. simplex Antigen
We performed a comparative analysis of SDS-PAGE and IgG-WB profiles of the following A. simplex antigens: (i) native antigen, (ii) antigen heated at 100 °C, and (iii) autoclaved antigen. We showed that antigen heating and autoclaving caused a reduction in the number and intensity of SDS-PAGE and IgG-WB bands. Furthermore, both heating processes caused the appearance of background in the IgG-WB profiles. Based on this experiment, we concluded that heating caused changes in the antigens but did not interfere with their ability to bind antibodies. Similar observations regarding the reduction of the number and intensity of bands in electrophoretic and WB profiles of autoclaved Anisakis antigen were reported by Carballeda-Sangiao et al. [51] and Klapper et al. [52].
We suppose that the background in the IgG-WB profiles of the heated antigens was the result of degradation and/or alteration of some of the Anisakis proteins. However, heated antigens were degraded only partially. Similarly to our study, several other investigations concerning the influence of autoclaving on antigens/allergens of tree nuts [53], shrimps [54], and lentils and chickpeas [55] reported smears appearing on the SDS-PAGE and WB profiles while antigenicity was maintained.
Based on the calculation of the IntDen values of the SDS-PAGE and IgG-WB profiles, we measured the protein amounts in the antigens and the signal intensity of the immunoreactivity of antigens, respectively. The IntDen values of SDS-PAGE, as well as the IgG-WB profiles, were similar for the native and both heated antigens; therefore, we supposed that the temperature conditions we used did not cause a drastic reduction of antigenicity.
The present study confirmed the immunoreactivity of the heated/autoclaved Anisakis antigens with anti-A. simplex IgG antibodies, which are linked to delayed IgG-based allergy. This type of hypersensitivity is associated with chronic urticaria, which is a frequent symptom of anisakiasis [56,57]. Detection of IgG antibodies specific to Anisakis is also useful in the serodiagnosis of anisakiasis [42]. The IgE immunoreactivity of heated Anisakis antigens was not investigated in this study, and further studies on this issue are needed. However, the IgE immunoreactivity and thermostability of autoclaved Ani s 1 and Ani s 4 have been shown by Carballeda-Sangiao et al. [51]. Results of other studies that investigated the influence of high temperature on the antigen of A. simplex [51,52,58-61] were consistent with our findings regarding the high thermal resistance of the Anisakis antigenic profile.

Identification and Label-Free Quantification of Proteins, Allergens, and Potential Allergens of A. simplex
Mass spectrometry allowed us to detect 470 proteins of A. simplex. Another published dataset of proteins derived from heat-treated Anisakis larvae consists of 146 proteins detected in nematode extract heated for 5 min at 110 °C [62]. There are no other thermal proteome profiles of A. simplex reported in the scientific literature. Comparing the total number of detected proteins (n = 470) from the present study with the total number of proteins of native Anisakis antigen from our previous investigation [23], we could see a 27% decrease in protein number after autoclaving. In the proteomic profiling of native antigens of A. simplex [23], we used a slightly different bioinformatic approach; nevertheless, this comparison indicated that autoclaving did not drastically reduce the number of proteins.
Among all the detected proteins, we identified peptides derived from the following five allergens of A. simplex: Ani s 1, Ani s 2, Ani s 3, Ani s 4, and Ani s 5. However, it was impossible to confirm the thermostability of the detected allergens since the identification was conducted by the gel-free LC-MS/MS approach, and, using this technology, only peptides could be detected. Nevertheless, epitopes, as well as possible epitopes, were found in the identified allergenic peptides. As mentioned, among the A. simplex allergens from the WHO/IUIS list, only the autoclaving resistance of Ani s 1 and Ani s 4 is known so far [51], and further thermostability studies of other Anisakis allergens are necessary. The total number of allergens detected in the autoclaved antigens (n = 5) was 64% lower compared to the number of allergens found in the native Anisakis antigen (n = 14) [23]. However, it should be emphasized that Ani s 1 and Ani s 2, both identified in the present study, are particularly harmful as they are major allergens, which cause hypersensitivity reactions in more than 50% of the allergic population [63]. Ani s 2 and Ani s 3 are also panallergens that are ubiquitously distributed with highly conserved sequences and structures, and, therefore, they are responsible for cross-reactions, even between phylogenetically distant and unrelated organisms [64-66].
In recent studies, novel allergens and novel potential allergens of Anisakis have been described. As the list of A. simplex allergens still seems to be incomplete, we performed bioinformatic screening of the detected proteins for possible allergenicity. We identified 13 potential allergens of Anisakis (see Table 5) using the AllergenOnline.org server, which is commonly used for such computational evaluations [67]. To increase the confidence of the allergenicity predictions, we applied a high level of identity between potential allergens and homologous allergens (70%) as the cut-off, given that a level of identity just above 50% already indicates the possibility of cross-reactions [68]. Most of the identified homologs of potential allergens were allergens of Anisakis (n = 5) and Ascaris (n = 2), which are phylogenetically closely related, as both nematodes belong to the same order of Ascaridida. Among the potential allergens, we also frequently detected homologs of Acari allergens (n = 4). The occurrence of cross-reactions of A. simplex antigen with Acari [69] and Ascaris [70,71] antigens was experimentally proven.
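The identity cut-off described above amounts to a simple filter over database hits. The sketch below illustrates it with invented data: the `Hit` type, the accession labels, and the identity values are hypothetical, and treating the 70% cut-off as inclusive is an assumption, since the text does not state how boundary values were handled.

```python
from typing import List, NamedTuple


class Hit(NamedTuple):
    """One hypothetical query-to-allergen match from a sequence search."""
    query: str
    allergen: str
    identity_percent: float


IDENTITY_CUTOFF = 70.0  # cut-off stated in the text


def filter_potential_allergens(
    hits: List[Hit], cutoff: float = IDENTITY_CUTOFF
) -> List[Hit]:
    """Keep hits whose identity to a known allergen meets the cut-off."""
    return [hit for hit in hits if hit.identity_percent >= cutoff]


# Invented example hits (not results from the study).
hits = [
    Hit("Q1", "Allergen A", 92.5),
    Hit("Q2", "Allergen B", 70.0),
    Hit("Q3", "Allergen C", 55.0),
]
kept = filter_potential_allergens(hits)
print([hit.query for hit in kept])
```

A hit at 55% identity would still fall above the 50% level associated with possible cross-reactions, but it is excluded here by the stricter 70% criterion.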
The following three potential allergens identified in this work were also detected by us in the native antigens of A. simplex [23]: paramyosin (UniProt ID: A0A158PP35), peptidyl-prolyl cis-trans isomerase (UniProt ID: A0A0M3J6G4), and 78 kDa glucose-regulated protein (UniProt ID: A0A0M3K5H6). We identified the potential Anisakis allergen SXP/RAL-2 family protein 2 isoform 1 (UniProt ID: A0A0M3KA05) in the native antigen of Contracaecum osculatum [23]. Except for these four potential allergens identified in our previous study, the other nine sequences were identified as potential allergens of Anisakis for the first time.
To determine the content of allergens and potential allergens in heat-treated A. simplex larvae, we measured the relative abundance of proteins using mass spectrometry label-free quantification. We analyzed the protein abundance calculated by Mascot software, and these data revealed that the 25 most abundant proteins included the following allergens: Ani s 2, Ani s 3, and Ani s 5, and the following potential allergens: tropomyosin (UniProt ID: A0A0M3KCE6), SXP/RAL-2 family protein 2 isoform 1 (UniProt ID: A0A0M3KA05), and paramyosin (UniProt ID: A0A158PP35). Comparing these results with the relative quantification of allergens found among the 25 most abundant proteins of the native antigen of Anisakis [23], we could conclude that autoclaving caused a slight reduction in the number of identified allergens. The following five allergens were found in the native antigens: Ani s 8, Ani s 2, Ani s 13, and Ani s 3 (two isoforms).

Computational Investigations of Detected Proteins, Allergens, and Potential Allergens of A. simplex
An especially important bioinformatic analysis was the detection of epitopes in allergenic peptides originating from the autoclaved antigen of A. simplex. This investigation confirmed the results of WB and showed that the epitopes of autoclaved peptides were not destroyed. Among all Anisakis allergens, T-cell/B-cell epitopes in Ani s 1 [48,49] and Ani s 5 [50] were experimentally verified, and we used these datasets via IEDB. Epitopes were identified in the majority (74%) of Ani s 1 and Ani s 5 peptides (see Table 3). MHC II, T-cell, and B-cell epitopes in Ani s 2, Ani s 3, and Ani s 4 were predicted for the first time. Probable epitopes were found in 61% of autoclaved allergenic peptides (Table 4) using algorithms of DNASTAR Protean 3D software. This software has been successfully used to predict epitopes in many other studies [72-74].
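The 74% figure is consistent with the peptide counts given in the Results (epitopes in six of six Ani s 1 peptides and in 8 of 13 Ani s 5 peptides). The sketch below reproduces that arithmetic and shows one simple way of matching peptides to known epitopes by exact substring overlap. The matching rule, function names, and toy sequences are assumptions made for illustration; the matching procedure actually used with IEDB is not described in the text.

```python
from typing import List


def peptides_with_epitope(peptides: List[str], epitopes: List[str]) -> List[str]:
    """Peptides that contain, or are contained in, at least one epitope."""
    return [
        peptide
        for peptide in peptides
        if any(epi in peptide or peptide in epi for epi in epitopes)
    ]


def coverage_percent(n_matched: int, n_total: int) -> float:
    """Share of peptides carrying at least one epitope, in percent."""
    return 100.0 * n_matched / n_total


# Toy sequences (hypothetical, not from the study).
toy_peptides = ["ACDEFGHIK", "LMNPQR"]
toy_epitopes = ["DEFG"]
print(peptides_with_epitope(toy_peptides, toy_epitopes))

# Counts reported in the Results: Ani s 1 (6 of 6) and Ani s 5 (8 of 13).
matched = 6 + 8
total = 6 + 13
print(round(coverage_percent(matched, total)))
```

Exact substring matching is deliberately strict; partial overlaps between a peptide and an epitope would need an explicit overlap rule, which is left out of this sketch.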
Previously, among all Anisakis allergens, only the 3D structure of Ani s 5 had been determined experimentally, by nuclear magnetic resonance [50]. Therefore, to acquire more insights into the nature of novel potential allergens, as well as all identified allergens, we predicted 3D models representing the native conformation of these proteins. The models were predicted with high confidence using an algorithm combining ab initio and homology-based prediction implemented in the Phyre2 server [47]. Pairs of protein models representing a potential allergen and its homolog were subjected to structural alignment within the PyMOL software. Structural alignment analysis confirmed the structural similarities of the potential allergens to their homologs.
Computational analyses showed that identified proteins, allergens, and potential allergens were very diverse in properties such as molecular weights and isoelectric points. High diversity also occurred at the level of protein classification into InterPro families, the identification of InterPro domains, and allergen assignment according to the AllFam database. For example, the most highly represented family and domain among all detected proteins, i.e., immunoglobulin-like fold (IPR013783) and immunoglobulin-like domain (IPR007110), were assigned to only 4% and 3% of sequences, respectively. This superfamily represents domains with an immunoglobulin-like (Ig-like) fold, and Ig-like domains are one of the most common protein modules found in proteins of different organisms [75]. Proteins with this fold vary in their cellular localization, amino acid sequence, and biological role [76]. EF-hand domains were relatively abundant among the identified proteins (nine sequences). Troponin-like protein (UniProt ID: A0A0M3JU57), a potential allergen detected in this study, and its homolog (Ani s troponin C) also contain EF-hand domains. The EF-hand allergen family is the second-largest group of allergens [77], which are detected in different organisms, such as parvalbumins in specific species of fish and fungus (Trichophyton violaceum) [78] or polcalcins in the pollen of trees, grasses, and weeds [79]. It has been found that the antigenic sites of parvalbumins in Trichophyton violaceum are located on both sides of the Ca²⁺-binding site of the first EF-hand domain and that parvalbumin proteins possess conserved amino acid motifs (cysteine, lysine, and arginine) [78]. Heat shock proteins were noteworthy among the highly abundant protein families that we detected: HSP20-like chaperones (eight sequences) were highly represented among all thermostable proteins, while Hsp70s were relatively more abundant (three sequences) in the group of potential allergens.
In our previous study, we found that heat shock proteins (HSPs) were also among the most abundant proteins in the native antigen of Anisakis [23]. This is not surprising because HSPs are extremely heterogeneous in nature, function mainly as molecular chaperones that help other proteins maintain their native structure, and are highly expressed, especially under stress [80][81][82]. HSPs are major immunodominant antigens in many parasite infections, and they play a key role in host-parasite interactions [83,84]. Allergens belonging to the HSP70 family are found in a heterogeneous range of sources. Among others, HSP70s are inhalant allergens of house dust mites, storage mites, biting midges, black flies, and cockroaches [20,85,86].
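The molecular weights and isoelectric points referred to earlier in this discussion can, in principle, be estimated directly from an amino acid sequence. The following is a minimal sketch and not the tool used in this study: the function names are hypothetical, the residue masses are standard average masses, and the pKa set (values commonly attributed to EMBOSS) is an assumption, so the resulting pI may differ from values produced by other software.

```python
# Average residue masses (Da); one water is added for the free termini.
AVERAGE_RESIDUE_MASS = {
    "G": 57.0519, "A": 71.0788, "S": 87.0782, "P": 97.1167, "V": 99.1326,
    "T": 101.1051, "C": 103.1388, "L": 113.1594, "I": 113.1594, "N": 114.1038,
    "D": 115.0886, "Q": 128.1307, "K": 128.1741, "E": 129.1155, "M": 131.1926,
    "H": 137.1411, "F": 147.1766, "R": 156.1875, "Y": 163.1760, "W": 186.2132,
}
WATER_MASS = 18.01528

# Assumed pKa values; other pKa sets give slightly different pI estimates.
POSITIVE_PKA = {"K": 10.8, "R": 12.5, "H": 6.5}
NEGATIVE_PKA = {"D": 3.9, "E": 4.1, "C": 8.5, "Y": 10.1}
N_TERM_PKA = 8.6
C_TERM_PKA = 3.6


def molecular_weight(sequence):
    """Average molecular weight (Da) of an unmodified peptide chain."""
    return sum(AVERAGE_RESIDUE_MASS[aa] for aa in sequence) + WATER_MASS


def net_charge(sequence, ph):
    """Net charge at a given pH from the Henderson-Hasselbalch equation."""
    positive = 1.0 / (1.0 + 10 ** (ph - N_TERM_PKA))
    negative = 1.0 / (1.0 + 10 ** (C_TERM_PKA - ph))
    for aa in sequence:
        if aa in POSITIVE_PKA:
            positive += 1.0 / (1.0 + 10 ** (ph - POSITIVE_PKA[aa]))
        elif aa in NEGATIVE_PKA:
            negative += 1.0 / (1.0 + 10 ** (NEGATIVE_PKA[aa] - ph))
    return positive - negative


def isoelectric_point(sequence, tolerance=1e-4):
    """Bisection search for the pH at which the net charge is zero."""
    low, high = 0.0, 14.0
    while high - low > tolerance:
        mid = (low + high) / 2.0
        if net_charge(sequence, mid) > 0:
            low = mid   # still positively charged: pI lies at higher pH
        else:
            high = mid  # negatively charged or neutral: pI lies at lower pH
    return (low + high) / 2.0


if __name__ == "__main__":
    toy_peptide = "ACDEFGHIK"  # hypothetical sequence, not from the dataset
    print(round(molecular_weight(toy_peptide), 2))
    print(round(isoelectric_point(toy_peptide), 2))
```

Because the net charge decreases monotonically with pH, bisection is sufficient here. Post-translational modifications and disulfide bonds are ignored in this sketch.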
AllFam classification is an effective way to characterize allergens and potential allergens. This was also the case in our study, in which, despite the large variety of detected allergens, the following classes were more common: Hsp70 (AF002), which is described above, SXP/RAL-2 family (AF137), tropomyosin (AF054), and myosin heavy chain (AF100).
The SXP/RAL-2 family includes only three allergens: Ani s 5, Ani s 7, and Ani s 8. In addition to the Ani s 5 allergen, in the present study we identified two potential allergens related to this family. Members of the SXP/RAL-2 family are characterized by the presence of a domain of unknown function, DUF148 [87]. This family of secreted proteins seems to be specific to nematodes, and several members have been reported in animal parasitic nematodes and in Caenorhabditis elegans [87][88][89]. The role of these proteins is unknown; however, it is known that the structure of Ani s 5 resembles that of calmodulin, although it binds Mg2+ instead of Ca2+ [50].
Among allergens from the tropomyosin class, we found Ani s 3 and two potential allergens that were homologs of Asc l 3. Tropomyosin has been identified as a minor inhalant allergen in arthropods (mites, cockroaches) and as a major food allergen in crustaceans and mollusks [90,91], while vertebrate tropomyosin seems to be non-allergenic [92]. Owing to its repetitive coiled-coil structure, which was visualized in our 3D models, tropomyosin retains its IgE-binding ability even after heat treatment or partial digestion [92,93]. Tropomyosin sequences are highly conserved, which frequently leads to hypersensitivity cross-reactions with allergens from phylogenetically distant organisms [94].
In the myosin heavy chain allergen class (AF100), we assigned Ani s 2 and one potential allergen homologous to Ani s 2. Both of these proteins were paramyosins. Paramyosin is a filamentous protein found in many invertebrates, including parasites. This protein may regulate the host's immune response by inhibiting the classical pathway of the complement cascade through inhibition of complement C1 function [95]. Paramyosin is involved in the immunological protection mechanism of parasites by acting as an Fc receptor and has been shown to induce hypersensitivity reactions in humans [96][97][98].
We performed cellular component GO annotation to predict the localization of the detected proteins in relation to cellular compartments and structures. These data allowed a deeper characterization of the proteins by providing context for understanding their function. The results of the cellular component annotations (see Figures 4 and 7) showed a large variation in the distribution of GO terms. Among all annotations, the most abundant (about one-third of annotations) were the following GO terms: organelle (GO:0043226), intracellular organelle (GO:0043229), and cytoplasm (GO:0005737). The GO term organelle denotes an organized structure of distinctive morphology and function; it includes the nucleus, mitochondria, vesicles, ribosomes, and the cytoskeleton, but excludes the plasma membrane. The GO term intracellular organelle denotes an organized structure of distinctive morphology and function occurring within the cell. The GO term cytoplasm denotes all of the contents of a cell, excluding the plasma membrane and nucleus but including other subcellular structures.
Our comparison of the cellular component annotations among all identified proteins, the 25 most abundant proteins, allergens, and potential allergens revealed some interesting differences. The GO term supramolecular complex (GO:0099080) was most represented among the 25 most abundant proteins (11% of sequences), followed by allergens (9% of sequences) and potential allergens (7% of sequences), and was least represented among all proteins (5% of sequences). The GO term supramolecular complex denotes a cellular component that consists of an indeterminate number of proteins or macromolecular complexes, organized into regular, higher-order structures such as polymers, sheets, networks, or fibers. Among the proteins belonging to this GO term, we also assigned the following allergens and potential allergens, which were filament proteins: tropomyosin (including Ani s 3), paramyosin (including Ani s 2), tubulin alpha chain, and troponin-like protein sequences. A slightly higher proportion of allergen and potential allergen sequences was assigned to the extracellular region (GO:0005576). The GO term extracellular region denotes the space external to the outermost structure of a cell; for cells without external protective or external encapsulating structures, this refers to the space outside of the plasma membrane. The following proteins were assigned to this GO term: tubulin alpha chain, SXP/RAL-2 family protein, 78 kDa glucose-regulated protein, DUF148 domain-containing protein, and heat shock 70 kDa protein cognate 1. These results are consistent with the fact that Ani s 1 and Ani s 5 are excretory-secretory allergens.
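The kind of comparison described above, i.e., the share of sequences carrying a given GO term in each protein group, can be sketched as follows. This is an illustrative example only: the function name, the sequence identifiers, and the toy annotations are hypothetical and do not reproduce the study's data, and the percentage is assumed to be computed per sequence rather than per annotation.

```python
def go_term_share(group_annotations, go_term):
    """Percentage of sequences in a group annotated with the given GO term.

    group_annotations maps a sequence identifier to the set of GO terms
    assigned to that sequence.
    """
    if not group_annotations:
        return 0.0
    hits = sum(1 for terms in group_annotations.values() if go_term in terms)
    return 100.0 * hits / len(group_annotations)


# Hypothetical toy annotations for two protein groups.
toy_groups = {
    "allergens": {
        "seq_a": {"GO:0099080", "GO:0005737"},
        "seq_b": {"GO:0005576"},
    },
    "all proteins": {
        "seq_a": {"GO:0099080", "GO:0005737"},
        "seq_b": {"GO:0005576"},
        "seq_c": {"GO:0043226", "GO:0043229"},
        "seq_d": {"GO:0005737"},
    },
}

if __name__ == "__main__":
    for group_name, annotations in toy_groups.items():
        share = go_term_share(annotations, "GO:0099080")
        print(f"{group_name}: {share:.1f}% supramolecular complex")
```

Expressing each group's share as a percentage of its own sequences makes groups of very different sizes, such as the few known allergens and the full set of detected proteins, directly comparable.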

Conclusions
This study provided novel data on the A. simplex proteome. Mass spectrometry analysis detected 470 proteins in heat-treated A. simplex larvae. Among the identified proteins, peptides of the following allergens were found: Ani s 1, Ani s 2, Ani s 3, Ani s 4, and Ani s 5. Known and in silico predicted epitopes were detected in peptides originating from these allergens using bioinformatics tools. Furthermore, thirteen potential allergens were detected, nine of which were identified for the first time. The identified proteins, allergens, and potential allergens were very diverse in terms of properties such as molecular weight, isoelectric point, tertiary structure, domain and family classification, and cellular component annotation. The reactivity of the autoclaved A. simplex antigen with anti-A. simplex IgG antibodies, which are relevant to delayed IgG-based allergy, was confirmed by Western blot (WB). The IgE-binding capacity and thermostability of the identified allergens and potential allergens were not tested in this study; therefore, further studies are needed to investigate these aspects.
Due to the presence of epitopes in allergenic peptides derived from the autoclaved antigen, thermally processed fish products that might contain A. simplex proteins could be a potential threat to sensitized consumers. These findings have implications for the fish processing industry and food safety authorities. It is necessary to search for more effective methods to reduce the allergenicity of food products contaminated by Anisakis. Removal of Anisakis allergens during the washing of fish muscle, as described by Olivares et al. [99], may provide a practical solution to this issue. Furthermore, an extensive examination of fish products for A. simplex allergens can improve the protection of Anisakis-allergic consumers. Hence, the implementation of diagnostic tools for the detection of A. simplex allergens is essential for food safety laboratories. Publicly deposited mass spectrometry data could be useful for future studies, such as the development of new diagnostic assays.