New applications of advanced instrumental techniques for the characterization of food allergenic proteins

Abstract Current approaches based on electrophoretic, chromatographic or immunochemical principles have allowed characterizing multiple allergens, mapping their epitopes, studying their mechanisms of action, developing detection and diagnostic methods and therapeutic strategies for the food and pharmaceutical industry. However, some of the common structural features related to the allergenic potential of food proteins remain unknown, or the pathological mechanism of food allergy is not yet fully understood. In addition, it is also necessary to evaluate new allergens from novel protein sources that may pose a new risk for consumers. Technological development has allowed the expansion of advanced technologies for which their whole potential has not been entirely exploited and could provide novel contributions to still unexplored molecular traits underlying both the structure of food allergens and the mechanisms through which they sensitize or elicit adverse responses in human subjects, as well as improving analytical techniques for their detection. This review presents cutting-edge instrumental techniques recently applied when studying structural and functional aspects of proteins, mechanism of action and interaction between biomolecules. We also exemplify their role in the food allergy research and discuss their new possible applications in several areas of the food allergy field.


Introduction
Although any food can trigger anaphylaxis, fourteen of them (milk, egg, peanuts, tree nuts, fish, soybean, cereals containing gluten, crustaceans, mollusks, lupin, sesame, mustard, celery and sulphites) must be declared on the label for safety reasons. However, aspects such as food processing and food matrix interactions should be considered in the allergenicity risk assessment. In addition, the use of new sources of dietary protein has increased in recent years and their potential allergenicity needs to be evaluated to protect consumers (Verhoeckx et al. 2016). An exhaustive allergenicity risk assessment is needed before marketing novel food sources to evaluate the cross-reactivity or de novo allergenicity of novel food proteins (Remington et al. 2018). However, the lack of cellular and animal model assays for the detection of de novo sensitization is one of the identified gaps in the current allergenicity risk assessment that needs further investigation.
Current approaches based on electrophoretic, chromatographic or immunochemical principles have allowed characterizing multiple allergens, mapping their epitopes, studying their mechanism of action, and developing detection and diagnostic methods for the food and pharmaceutical industry, as well as advancing therapeutic strategies. However, despite the high number of studies addressing the structure, digestibility and stability of food allergens against technological treatments, as well as the identification of the sequence of most of them (Aalberse 2000; Sirvent et al. 2009; Willison et al. 2011; Benedé et al. 2014; Villas-Boas et al. 2015; Benedé et al. 2015), no common biochemical properties have been associated with all allergenic food proteins, although a few of them are shared (Costa et al. 2020; Costa et al. 2021).
Naturally occurring allergens are usually proteins, whose molecular weight needs to be at least 3.5 kDa to cross-link two IgE molecules on the surface of mast cells and basophils (Bogh, Barkholt, and Madsen 2013). A molecular mass below 70 kDa is usually observed in food allergens (Vickery, Chin, and Burks 2011), although there is great variability in their sizes (Costa et al. 2020). Resistance to digestion and processing are regarded as common properties of food allergens. However, these characteristics do not provide sufficient information for allergenicity prediction because there are exceptions such as Ara h 2 from peanut (Koppelman et al. 2010) or Mal d 2 from apple (Smole et al. 2008). Processing has been used to modify the allergenicity of proteins. For example, in current practice roasted peanuts are usually used in oral food challenges to confirm the suspicion of peanut allergy, as well as in peanut oral immunotherapy, to reduce the adverse reactions caused by the use of raw peanuts. Oral immunotherapy using roasted peanut flour has been shown to be effective in desensitizing peanut-allergic children, although a high number of adverse events have still been described (Anagnostou et al. 2014). Boiled peanuts are less allergenic than roasted peanuts (Beyer et al. 2001) and they have also been proposed for use in oral immunotherapy, since boiling reduces IgE allergenicity while retaining T cell reactivity (Tao et al. 2016).
Susceptibility to proteolysis also depends on the food product and processing. In this respect, some epitopes may be destroyed and new epitopes may be formed (Thomas et al. 2007). In general, proteolysis potentially disrupts the protein structure, changing the structure of both the linear and the conformational epitopes present in food protein. However, the effect on allergenicity depends on the amino acid sequence and protein secondary structure (Thomas et al. 2007). In the case of peanuts, Ara h 2 hydrolysis has been demonstrated to change the secondary structure of the protein, without affecting its allergenicity due to the fact that linear epitopes are retained after proteolysis (Sathe, Teuber, and Roux 2005). In the case of milk, proteolysis reduces the number of epitopes that results in the reduction of allergenicity (Hochwallner et al. 2014). A high number of allergic children tolerated baked milk and egg due to the destruction of conformational epitopes, facilitating access to the digestive enzymes, and reducing intestinal absorption (Upton and Nowak-Wegrzyn 2018). However, the interactions between milk proteins and some components of the food matrix during heating seem to play the most important role in the reduction of allergenicity, limiting the accessibility of peptides to the immune system (Bavaro et al. 2019). In contrast, food matrix interactions may also increase the allergenicity of proteins, as it happens with roasting that increases allergenicity of Ara h 1 and Ara h 2 due to the neo-epitopes formation by Maillard Reaction (Maleki et al. 2000).
Stability toward proteolytic degradation and processing might differ between allergens depending also on structural properties. Amino acid composition of proteins determines physicochemical properties such as flexibility, solvent accessibility, solubility and electrostatic potential, which can influence their stability and their IgE binding capacity (Dall'Antonia et al. 2014; Lozano-Ojalvo, Benedé, and van Bilsen 2021). For this reason, the evaluation of amino acid sequence homology of new proteins with already identified allergens is a currently applied approach to assess allergenicity and predict the allergenic potential of proteins, although it has been widely criticized (Silvanovich et al. 2006; Goodman et al. 2008; Cressman and Ladics 2009). In addition, many allergens undergo post-translational modifications that may influence protein fold and structure and, therefore, have an impact on protein allergenicity (Pekar, Ret, and Untersmayr 2018). Binding of ligands also leads to changes in allergen structure and stability against processing and proteolytic degradation (Pekar, Ret, and Untersmayr 2018), and ligands can act as immune stimulators, thus inducing innate immune responses (Tordesillas et al. 2017). Therefore, the effect of the matrix in which allergens are usually consumed and the ability of the fragments generated upon gastrointestinal digestion to cross the intestinal barrier and interact with the gut-associated lymphoid tissue should also be evaluated (Lozano-Ojalvo, Benedé, and van Bilsen 2021). This will provide more data on how food processing and food matrix interactions modify the allergenicity of proteins. IgE-binding assays, applying sera from patients with an established specific IgE response to the target food, are also used to evaluate the allergenicity of new proteins derived from a known allergenic food, but they are useless when the new protein is derived from a non-allergenic food (Mazzucchelli et al. 2018).
Given the current situation, more research is needed to find biochemical features that allow determining whether a protein has the potential to be allergenic. In this regard, technological development has allowed the expansion of advanced technologies whose full potential has not yet been exploited. These technologies could provide novel insights into still unexplored molecular traits underlying both the structure of food allergens and the mechanisms through which they sensitize or elicit an adverse response in human subjects, as well as improve analytical techniques for their detection.
In this review, we summarize the novel instrumental techniques often applied when studying structural and functional aspects of proteins, mechanisms of action and interaction between biomolecules. We further describe these techniques and exemplify their current and prospective use in food allergy research. In addition, we discuss new possible applications of advanced analytical techniques in several areas of the food allergy field, including the structural characterization of allergens and mapping of epitopes, the study of post-translational modifications, allergen detection and pathogenesis, diagnosis and treatment of the disease.

New techniques to determine structural and functional aspects of allergens
Three-dimensional structure determination of allergens is essential for analyzing exposed surface areas and mapping conformational epitopes. Moreover, it determines physicochemical properties such as flexibility, solvent accessibility, solubility and electrostatic potential, which can influence their IgE-binding capacity (Dall'Antonia et al. 2014), stability against processing or enzymatic cleavage sites that are accessible to proteases (Abdullah et al. 2016). Protein tertiary structure may be investigated using several spectroscopic approaches. However, many molecular systems in cells that can interact with food allergens are either immobilized or motionally restricted, including membrane proteins, amyloid fibrils, and large molecular weight protein-protein or protein-nucleic acid complexes. Due to the lack of long-range order, these structures are resistant to crystallization and, therefore, create many difficulties in the application of high-resolution methods of X-ray crystallography, solution NMR or electron microscopy (Ladizhansky 2019). Therefore, new technological strategies have been developed to overcome this limitation.

X-ray free-electron laser crystallography
X-ray crystallography is the most powerful method to obtain information about the tertiary structure of proteins and protein complexes at the atomic level. Briefly, the method is based on the diffraction of X-rays by protein crystals, data collection and model building.
Most allergens are relatively small, stable and well-structured proteins and, therefore, they are perfectly suited for X-ray crystallography studies. Numerous structures of food allergens have been solved and deposited in the Protein Data Bank. However, there are still difficulties in the protein crystallography of large molecules because this technique requires large, well-ordered protein crystals that can diffract to high resolution with a limited X-ray dose (Gallat et al. 2014). Big molecules rarely form such crystals, but rather generate nanometer-sized crystals that are sensitive to radiation damage and have low diffraction capabilities.
X-ray free-electron lasers (FEL) opened new opportunities to overcome these drawbacks (Chapman et al. 2006). With an ultrashort and extremely bright coherent X-ray pulse, a single diffraction pattern may be recorded from a large macromolecule before the sample explodes and turns into a plasma. Sample damage by X-rays limits the resolution of structural studies on non-repetitive and non-reproducible structures such as individual biomolecules or cells. Cooling can slow sample deterioration, but cannot abolish damage-induced sample movement during the time needed for conventional measurements (Neutze et al. 2000). X-ray FEL are expected to permit high-resolution diffractive imaging of nanometer- to micrometer-sized objects without the need for crystalline periodicity in the sample. Therefore, all large food allergens (e.g. homotrimeric Ara h 1) and the complexes of food allergens with IgE could be analyzed by X-ray FEL.
So far, this new method has not found a wide application in the characterization of food allergens. A singular example of the application of this technology for resolving the structure of a food allergen is the case of Gal d 4 (hen's egg lysozyme). Its room-temperature crystal structures (Protein Data Bank ID: 3WUL) have been reported at a resolution better than 2 Å, using grease as a general carrier of protein microcrystals (Sugahara et al. 2015). The grease matrix-based approach should be applicable to a wide range of proteins, including single allergens and food allergen complexes in order to solve their structures at room temperature. Similar to X-ray crystallography, the method is applicable to purified proteins, and may be useful in risk assessment of novel foods, for which cross-reactivity of single protein to already known allergens is assessed based on homology and structural similarity at primary, secondary and tertiary levels (Mazzucchelli et al. 2018).

Solid-state nuclear magnetic resonance methods
Solid-state nuclear magnetic resonance (ssNMR) is an NMR spectroscopic technique applied to condensed phase systems, including membrane proteins. This technique does not require crystals or fast molecular tumbling and, therefore, is not limited by molecular size. ssNMR can provide information at the atomic level on insoluble systems, which are either disordered or lack long-range order (Sun et al. 2012; Wylie et al. 2016; Ladizhansky 2019). Structural and functional analyses of such systems are challenging, requiring new approaches to better characterize them. ssNMR is uniquely suited to the examination of membrane proteins in native environments and has the capability to elucidate complex protein mechanisms and structures (van der Wel 2017; Loquet et al. 2018; Ladizhansky 2019).
In the case of insoluble, high molecular weight, non-crystalline, and/or disordered food allergenic proteins (e.g. soybean hydrophobic seed protein), ssNMR can be used effectively for epitope characterization, including interactions between allergen and IgE, and especially hydrophobic natural ligands such as short-chain fatty acids, lipopolysaccharides, and phytosterols. ssNMR can provide information about the binding site and structure of protein-protein and protein-ligand interactions, accurate 3D structure determination, and dynamical behavior of insoluble and/or disordered allergenic proteins (Dall'Antonia et al. 2014). In addition, ssNMR can contribute to the prediction of cross-reactions between different food allergens, as in the case of cross-reactivity between birch pollen and apple allergens, to the study of the interaction of allergens with solid matrices such as certain polysaccharides, and to the determination of the effect of food processing on the structure of allergen proteins (Krushelnitsky, Reichert, and Saalwachter 2013). However, although ssNMR is a structure-based prediction tool able to provide promising results on the detection of allergen structures, it has not yet been applied in this field. The closest application of ssNMR in the food allergen field concerned the determination of the purity of chitin obtained from ground shrimp shells to ensure the complete absence of allergens in the final product (King et al. 2017).

Cryo-electron microscopy coupled to mass spectrometry
Another emerging structural technique that could contribute to IgE epitope mapping is cryo-electron microscopy (cryo-EM). In this technique, the proteins do not need to be crystallized, but instead are flash-frozen into vitrified ice, which allows working with much larger biomolecular complexes and much less sample than that required for crystallography (Mueller 2017). Therefore, it opens an opportunity to directly examine the structure of an IgE molecule in complex with an allergen. In fact, cryo-EM has been used to map polyclonal epitopes of HIV immunized rabbits (Bianchi et al. 2018). This technique has also allowed studying the interior of casein micelles and confirming the existence of inner cavities that are connected by wide channels (Trejo et al. 2011), which could explain their ability to retain macromolecules (Glab and Boratynski 2017). This singular structure allows casein micelles to act as carriers for other cow's milk allergens, which could protect them against gastrointestinal digestion and technological treatments (Wang et al. 2017).
However, cryo-EM shows some limitations in providing high-resolution structures and, therefore, complementary techniques have been sought to solve this problem (Schmidt and Urlaub 2017). Stimulated by the latest developments in both instrumentation and image-processing software, chemical cross-linking combined with mass spectrometry (CX-MS) has evolved rapidly in recent years. It consists of introducing covalent bonds between two amino acid residues in close proximity using a chemical cross-linking reagent. The cross-linked residues can be identified by mass spectrometry after proteolytic digestion, which allows identifying regions that are non-covalently bound (van der Waals and electrostatic interactions) within and between proteins. This could provide advanced knowledge regarding the three-dimensional conformation of food allergens.
Many food allergens show protein secondary structures, such as beta sheets, which impede cross-linking, as they are stabilized by hydrogen bonds and the reactive residues are no longer available for the crosslinking. Development of more unspecific and concomitantly cross-linkers could be an important direction of CX-MS-based structural research in the future (Schmidt and Urlaub 2017), expanding its application to the study of a large number of food allergenic proteins.

Bioinformatics
Structural similarity among proteins, as well as the correlation of their functions, are important to identify new allergenic proteins. In recent years, the development of bioinformatic tools to predict allergenicity has grown significantly and current methodologies use tools based not only on sequence identity, but also on the homology of functional motifs, which might retain the conformational epitope structure.
According to the guidelines from the Food and Agriculture Organization of the United Nations and the World Health Organization (FAO/WHO) to assess the potential allergenicity of genetically modified foods, cross-reactivity between a protein and a known allergen has to be considered if they have a sequence identity greater than 35% over a window of 80 amino acids, or if there are at least six contiguous exactly matching amino acids in both sequences. This report includes a variety of computational techniques mainly based on the comparison of protein sequences, although no indications on how these relationships were analyzed are given (Codex Alimentarius Commission 2001). Bioinformatic methods of allergenicity prediction can be grouped into three main categories: sequence-based approaches, motif-based approaches and support vector machine-based methods.
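The two sequence criteria above can be illustrated with a minimal Python sketch. It is not the procedure of any official tool: the function and class names are hypothetical, the window comparison is ungapped (real screening pipelines typically rely on gapped local alignment), and sequences shorter than the window are simply reported with an identity of 0. The two criteria are returned as separate flags so that the reader can combine them as the applicable guideline requires.

```python
from dataclasses import dataclass


@dataclass
class SequenceFlags:
    """Hypothetical container for the two sequence criteria."""
    best_window_identity: float  # highest identity fraction found in any window pair
    exceeds_35_percent: bool     # True if best_window_identity > 0.35
    shares_6mer: bool            # True if an exact run of 6 residues is shared


def shares_exact_kmer(query: str, allergen: str, k: int = 6) -> bool:
    """Return True if both sequences contain an identical run of k residues."""
    allergen_kmers = {allergen[i:i + k] for i in range(len(allergen) - k + 1)}
    return any(query[i:i + k] in allergen_kmers
               for i in range(len(query) - k + 1))


def best_window_identity(query: str, allergen: str, window: int = 80) -> float:
    """Highest identity over all ungapped window-versus-window comparisons.

    Assumption: a brute-force ungapped scan, used only to illustrate the
    sliding-window idea; it returns 0.0 when a sequence is shorter than
    the window.
    """
    if len(query) < window or len(allergen) < window:
        return 0.0
    best = 0
    for i in range(len(query) - window + 1):
        q = query[i:i + window]
        for j in range(len(allergen) - window + 1):
            a = allergen[j:j + window]
            matches = sum(1 for x, y in zip(q, a) if x == y)
            best = max(best, matches)
    return best / window


def screen_against_allergen(query: str, allergen: str) -> SequenceFlags:
    """Evaluate both criteria for one query/allergen pair."""
    identity = best_window_identity(query, allergen)
    return SequenceFlags(
        best_window_identity=identity,
        exceeds_35_percent=identity > 0.35,
        shares_6mer=shares_exact_kmer(query, allergen),
    )
```

In practice, such a check would be run against every entry of a curated allergen database rather than a single sequence.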
Sequence approaches are directly based on the requirements of WHO/FAO. They extract information from allergen databases and compare sequences using standard alignment tools (Stadler and Stadler 2003;Fiers et al. 2004;Goodman et al. 2016). However, the sequence identity between proteins does not always explain cross-reactivity. For example, Act d 11 from kiwifruit and Bet v 1 from birch pollen have a very low sequence identity, but both are recognized by the IgE of sera from birch pollen-allergic patients (Avino et al. 2011).
To increase the efficiency of these methods, motif-based approaches appeared. They take into account not only the sequence identity, but also the presence of characteristic motifs on the protein surface, such as B, T and IgE epitopes, hydrophobic residues or regions of structural similarity (Ivanciuc, Schein, and Braun 2003). These techniques help to predict cross-reactivity between allergens, though they have also been used for the identification of proteins with low allergenic potential in novel foods, which can be used for the development and validation of methods to assess proteins with allergenic potential.
More sophisticated tools use Support Vector Machine (SVM)-based methods, which are statistical techniques that use input vectors from either amino acid composition or scores of pair-wise sequence similarity with known allergens, followed by the use of machine learning approaches to predict allergenicity (Cui et al. 2007; Muh, Tong, and Tammi 2009). Studies assessing the performance of these bioinformatic procedures show that motif-based approaches and SVM-based methods increase the accuracy and specificity of predictions (Maurer-Stroh et al. 2019). However, despite the soundness of current programs, the number of false positives in allergenicity predictions is still too high and other variables should be taken into account. For example, a significantly large number of food allergens bind and transport lipids whose ability to modulate the immune response has been reported (Tordesillas et al. 2017; López-Fandiño 2020). Therefore, it seems advisable to include other characteristics that could help to assess the real likelihood of proteins being allergenic.
It should be taken into account that these methods are based on the similarity search of the sequence or 3D structure of proteins, so the complete sequence is required. This feature is one of their main limitations, since they are not suitable for the identification of new allergenic proteins or proteins with unknown sequence. Despite this, it is important to remark that the proteome of novel foods can be characterized by means of proteomic analysis. This is the case of some recent advances related to important crops whose consumption has widely increased over the last few years, such as the characterization of the soybean (Uniprot proteome ID UP000008827) or quinoa proteome (Capriotti et al. 2015). In this context, computational screening for potential allergenicity/IgE cross-reactivity is both fast and inexpensive in contrast with in vitro experiments. Therefore, despite their limitations and the need for further development, computational methods developed in the context of allergenicity prediction are a useful tool that can be used in combination with other techniques in order to improve risk assessment.

Methods to decipher the mechanism of action of allergens in food mixtures without structural or functional characterization
Deciphering the signals that emerge from the cellular microenvironments could provide insights on the initiation of the cellular response to food allergens. Any initiation event starts with an interaction between the molecules of study and the biomolecules from the microenvironment. Therefore, improving our understanding of cellular responses to novel food allergens could arise from the application of recent methodologies developed to identify protein targets, predicting the mechanisms of action from an unbiased perspective. The food ingredient would resemble the complexity of chemical mixtures, which are complex bioactive samples that have been recently evaluated in toxicology to predict the impact of the exposure to complex mixtures of pollutants (Legler et al. 2020).
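As an illustration of the first kind of input vector mentioned above, the following sketch computes the amino acid composition of a sequence as a 20-element vector. The function name and the decision to ignore non-standard residue codes are assumptions made for this example; the classifier itself is deliberately left out, since the text does not specify how the cited tools are trained.

```python
from typing import List

# The 20 standard amino acids, in a fixed order so that every
# sequence maps to a vector with the same layout.
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def amino_acid_composition(sequence: str) -> List[float]:
    """Return the fraction of each standard amino acid in the sequence.

    Assumption: characters outside the 20 standard residues
    (e.g. X, B, Z, gaps) are ignored rather than treated as errors.
    """
    residues = [aa for aa in sequence.upper() if aa in AMINO_ACIDS]
    if not residues:
        raise ValueError("sequence contains no standard amino acids")
    total = len(residues)
    return [residues.count(aa) / total for aa in AMINO_ACIDS]
```

A vector of this kind, computed for known allergens and non-allergens, could then be passed to any SVM implementation as training data.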
The current strategies to identify molecular targets have been focused on two directions: (1) phenotypic screening that provide direct observation of the cellular responses, but struggle with the identification of the molecular interactors (Moffat et al. 2017) and (2) screening that requires some structural information of the molecules, but may not be available as in the case of novel food allergens (Ziegler et al. 2013).
The first methodology that provided a more comprehensive identification of protein targets in a realistic cellular environment was the cellular thermal shift assay (CETSA) (Martinez Molina et al. 2013). This method measures the increase in the thermal stability of cellular proteins due to their interaction with a bioactive compound and identifies the targets by immunoblotting, which is the limitation of the method (Vedadi et al. 2006; Martinez Molina et al. 2013). The next development was a method called thermal proteome profiling (TPP), which is also based on thermal shift assays, but in which the targets are identified by quantitative mass spectrometry (MS) based on isobaric tandem mass tag (TMT) 10-plex reagents (Savitski et al. 2018). Soluble proteins from a cell lysate are incubated with or without the studied compound at a range of temperatures from 37 °C to 67 °C, followed by centrifugation to separate soluble folded proteins from the thermally unfolded ones. The supernatants are analyzed by MS to generate the melting curves of the target proteins (Savitski et al. 2014; Franken et al. 2015). The method has been applied to study drug-target interactions (Becher et al. 2016; Azimi et al. 2018), protein-substrate interactions in complex samples (Turkowsky et al. 2019) and protein degradation (Savitski et al. 2018). The main limitation of the TPP method was that the analyzed soluble proteome contained a mixture of a soluble and a vesicular fraction. Therefore, the TPP method could not guarantee a constant concentration of protein and compound at different temperatures, which is crucial to correctly calculate the melting point of the targets. This has recently been solved by the development of a new method called bioactive thermal proteome profiling (bTPP), which is based on a different postulate for protein solubility (Carrasco del Amor et al. 2019).
The new method includes precipitation of the microsomal fraction before the thermal shift assay to ensure that the concentration of the available protein and compound is constant at any temperature (Carrasco del Amor et al. 2019) (Figure 1). Moreover, this method does not require previous knowledge of the compound at the structural or functional level, and it is applicable to complex mixtures. Hence, it offers new opportunities to characterize novel food allergens that could interact in the cellular microenvironment, individually or integrated into complex and uncharacterized mixtures. Additional improvements have been achieved in this field with the method called proteome integral solubility alteration (PISA), which increases target specificity and throughput by measuring the integral under the melting curves of the protein targets (Gaetani et al. 2019). This approach eliminates the statistical uncertainties derived from fitting the thermal curves, which is the basis of the previous thermal proteome profiling methodologies.
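The difference between reading a melting point off a curve and integrating the whole curve can be sketched numerically. The example below is only an illustration under simplifying assumptions: the function names are hypothetical, the melting point is obtained by linear interpolation at a soluble fraction of 0.5 instead of the sigmoidal fit used in thermal proteome profiling, and the integral is computed with the trapezoidal rule from the same data points, whereas PISA obtains it experimentally.

```python
from typing import Optional, Sequence


def melting_temperature(temps: Sequence[float],
                        soluble_fraction: Sequence[float]) -> Optional[float]:
    """Temperature at which the soluble fraction drops below 0.5.

    Assumption: linear interpolation between the two neighbouring
    measurements; returns None if the curve never crosses 0.5.
    """
    points = list(zip(temps, soluble_fraction))
    for (t0, f0), (t1, f1) in zip(points, points[1:]):
        if f0 >= 0.5 > f1:
            return t0 + (f0 - 0.5) * (t1 - t0) / (f0 - f1)
    return None


def curve_integral(temps: Sequence[float],
                   soluble_fraction: Sequence[float]) -> float:
    """Area under the melting curve by the trapezoidal rule."""
    points = list(zip(temps, soluble_fraction))
    return sum((t1 - t0) * (f0 + f1) / 2
               for (t0, f0), (t1, f1) in zip(points, points[1:]))


# Made-up soluble fractions for one protein, without and with a compound.
temperatures = [37.0, 47.0, 57.0, 67.0]
control = [1.0, 0.8, 0.2, 0.0]
treated = [1.0, 0.9, 0.5, 0.1]

shift_in_tm = melting_temperature(temperatures, treated) - \
    melting_temperature(temperatures, control)
shift_in_area = curve_integral(temperatures, treated) - \
    curve_integral(temperatures, control)
```

Both quantities increase when the compound stabilizes the protein, but the integral does not depend on locating a single crossing point on the curve.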

Techniques to identify post-translational modifications of allergens
Biologically occurring post-translational modifications (PTM) are chemical alterations (induced by enzymes) that proteins may undergo after biosynthesis, being critically relevant in regulating their structure and function (Lanucara and Eyers 2013). Additionally, PTM often determine the interactions among proteins and establish the basis for several cellular signaling pathways, playing crucial roles in immune modulation, inflammation, host-pathogen interactions, and degenerative/proliferative disorders (Pascovici et al. 2018). The most common PTM include the specific cleavage of precursor proteins, the formation of disulfide bonds, and the covalent addition or removal of low-molecular-weight groups, which occur in the amino acid side chains and are generally catalyzed by enzymes (Khoury, Baliban, and Floudas 2011; Ravikiran and Mahalakshmi 2014). In particular, allergens can undergo single or multiple biological PTM, such as glycosylation, phosphorylation, acetylation, hydroxylation, and methylation (Pekar, Ret, and Untersmayr 2018; Costa et al. 2020, 2021). Process-induced PTM of allergens can also occur by chemical modifications during food processing, namely deamidation or glycation (Maillard reactions) (Le, Deeth, and Larsen 2017). Despite the terms often being used interchangeably, biological and process-induced PTM are two completely different concepts, because the first relates to PTM occurring during protein synthesis and the second results from food processing. Currently, mass spectrometry techniques (by means of bottom-up, top-down and middle-down strategies) are able to evaluate the status of protein modification, including the location of modification sites (Olsen and Mann 2013; Pascovici et al. 2018). The bottom-up MS approach consists of analyzing the peptides (0.8-3 kDa) resulting from previous enzymatic or chemical cleavage and it has been widely used for qualitative analysis of new PTM (Lanucara and Eyers 2013; Pandeswari and Sabareesh 2019).
This bottom-up MS approach is a very suitable technique to identify, locate and quantify modified peptides, although it has some limitations mainly concerning the combinatorial effect of several PTM. Working at a protein level, the top-down approach has been used to provide the number, position and type of PTM on a single polypeptide chain, offering an ideal analytical strategy for discovering the combinatorial effects of PTM. However, the requirements for its use are particularly challenging due to the size of the analytes (>10 kDa). The new middle-down approach has emerged as an alternative, also involving proteolysis, but in a restricted manner, producing longer proteolytic peptides (2.5-10 kDa) and a better sequence coverage (Pandeswari and Sabareesh 2019).
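The mass ranges quoted for the three strategies can be turned into a small helper that computes the monoisotopic mass of an unmodified peptide or protein and reports which strategy or strategies cover it. This is a rough sketch: the function names are hypothetical, the residue masses are standard monoisotopic values, and the range boundaries are taken directly from the text (0.8-3 kDa, 2.5-10 kDa and above 10 kDa), so the overlap between bottom-up and middle-down is reported as both.

```python
from typing import List

# Standard monoisotopic residue masses in daltons.
RESIDUE_MASS = {
    "G": 57.02146, "A": 71.03711, "S": 87.03203, "P": 97.05276,
    "V": 99.06841, "T": 101.04768, "C": 103.00919, "L": 113.08406,
    "I": 113.08406, "N": 114.04293, "D": 115.02694, "Q": 128.05858,
    "K": 128.09496, "E": 129.04259, "M": 131.04049, "H": 137.05891,
    "F": 147.06841, "R": 156.10111, "Y": 163.06333, "W": 186.07931,
}
WATER_MASS = 18.010565  # one water is added for the free N- and C-termini


def monoisotopic_mass(sequence: str) -> float:
    """Monoisotopic mass (Da) of an unmodified linear peptide."""
    return sum(RESIDUE_MASS[aa] for aa in sequence.upper()) + WATER_MASS


def strategies_for_mass(mass_da: float) -> List[str]:
    """MS strategies whose quoted analyte size range includes this mass."""
    mass_kda = mass_da / 1000.0
    strategies = []
    if 0.8 <= mass_kda <= 3.0:
        strategies.append("bottom-up")
    if 2.5 <= mass_kda <= 10.0:
        strategies.append("middle-down")
    if mass_kda > 10.0:
        strategies.append("top-down")
    return strategies
```

For example, `strategies_for_mass(monoisotopic_mass(peptide))` can be applied to each product of an in silico digestion to see which workflow its size falls into.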
Most of the MS-based methods established for PTM analysis often target a single modification type, such as phosphorylation or glycosylation, and one type of sample/tissue (Pascovici et al. 2018). However, there are numerous proteins/allergens including several PTM sites and/or two or more different PTM (e.g. milk caseins). Significant advances have been carried out in PTM analysis, establishing the basis for developing proficient PTM workflows. Those include the creation of efficient sample preparation protocols (accurate selection of lysis and digestion protocols), as well as strategies for PTM enhancement (effective purification methods based on affinity or immunoprecipitation enrichment), and highly sensitive MS platforms for PTM quantification. Nonetheless, such PTM workflows are not easy to establish because they normally require the use of large amounts of starting materials (mg of proteins/peptides) that, depending on the sample/tissue, might be difficult to obtain (Pascovici et al. 2018). Additionally, the analysis of some PTM (e.g. phosphorylation, glycosylation) is a hard and challenging task due to a number of reasons. For instance, the low amounts of phosphorylated proteins, the challenges in isolating selective phosphopeptides and the peculiar chemistry of each phosphorylated amino acid are frequently accountable for the difficulties in analyzing phosphorylation as a PTM (Yakubu, Nieves, and Weiss 2019). Besides, glycosylation and phosphorylation modification sites are often difficult to identify since they are labile and can be lost during the separation and fragmentation process (Couto et al. 2018).
Glycosylation is one of the most relevant PTM in allergens and can be identified by bottom-up and top-down approaches. Using the bottom-up approach, MALDI-TOF/MS has been applied to identify potential glycosylation sites of Cor a 11 (hazelnut) (Lauer et al. 2004). Similarly, the N-glycoforms of soybean allergenic glycoproteins have been identified and quantified. Phosphorylation, hydroxylation, acetylation and methylation are other important PTM associated with the allergenicity of proteins, which have been characterized by MS using the bottom-up approach. Barral et al. (2005) described the existence of a phosphorylated serine residue at the N-terminus of Ole e 10 (olive) by MALDI-TOF analysis. Li et al. (2009) characterized four isoforms of Ara h 2 (peanut) by LC-MS/MS and identified site-specific hydroxylation of proline residues in each isoform. Lipid transfer proteins (LTP) were purified from the pulp and peel of different peach genotypes and analyzed by MALDI-TOF-MS/MS, revealing the presence of two methylated sites on arginine residues in all LTP from peel (Larocca et al. 2013). Using a bottom-up MS approach, Rahman et al. (2010) analyzed the tropomyosin isolated from snow crab (Chionoecetes opilio) and demonstrated that the N-terminus is the site of acetylation. Glycosylation and phosphorylation sites in κ-caseins of cow's milk were characterized by two-dimensional gel electrophoresis coupled with MALDI-TOF-MS and nano ESI-MS/MS Alewood 2004, 2006). Applying middle-down and top-down approaches at the subunit and protein level of Sin a 1 (mustard), Hummel, Wigger, and Brockmeyer (2015) detected eight consensus isoforms. Although no specific PTM were identified, different modifications in the isoforms and phytic acid as a potential ligand for Sin a 1 were reported. Bromilow et al. (2017) combined two platforms to investigate the proteomic profile of wheat gluten: a QTOF instrument operated with a data-independent acquisition scheme incorporating ion mobility (DIA-IM-MS), and a data-dependent acquisition (DDA) workflow on a linear ion trap quadrupole (LTQ) instrument. As an example of process-induced PTM, the authors observed widespread glutamine deamidation in all the characterized types of gluten proteins.
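The PTM discussed above are detected by MS as characteristic mass differences between the modified and the unmodified peptide. As a minimal sketch of that principle (not a workflow taken from any of the cited studies), the Python snippet below matches an observed mass difference against the standard monoisotopic shifts of the modifications named in this section. The function name, the tolerance and the example peptide mass are hypothetical.

```python
# Monoisotopic mass shifts (Da) of the PTM named in the text (standard values).
PTM_MASS_SHIFTS = {
    "phosphorylation": 79.96633,  # +HPO3
    "acetylation": 42.01057,      # +C2H2O
    "methylation": 14.01565,      # +CH2
    "hydroxylation": 15.99491,    # +O
    "deamidation": 0.98402,       # -NH3 +H2O (e.g. Gln -> Glu)
}


def match_ptm(theoretical_mass, observed_mass, tol_da=0.01):
    """Return the PTM whose mass shift explains the observed mass difference.

    tol_da is an assumed absolute tolerance; real instruments are usually
    characterized by a relative (ppm) tolerance instead.
    """
    delta = observed_mass - theoretical_mass
    return [
        name
        for name, shift in PTM_MASS_SHIFTS.items()
        if abs(delta - shift) <= tol_da
    ]


# Hypothetical peptide mass, for illustration only.
unmodified = 1500.7000
print(match_ptm(unmodified, unmodified + 79.9663))
```

In practice a mass match alone only suggests a modification; locating the modified residue requires MS/MS fragment ions, which is where the lability of phosphate and glycan groups mentioned above becomes a limitation.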

ESI-MS, MS/MS and LC-MS/MS
Recent advances in MS-based approaches have shed some light on the composition and location of different PTM in several known allergens. Proteins with specific PTM have been related to increased allergenic potential, although the real impact of PTM on protein allergenicity is still unknown (Costa et al. 2020, 2021). However, it is also important to highlight that the application of MS approaches to detect PTM in novel food proteins might help establish some relation to their allergenic potential. Most of the studies (if not all) are performed on purified proteins, which limits the extrapolation of the actual impact of modified allergenic proteins within a food matrix. In terms of allergenicity risk assessment, the information retrieved from studies evaluating the PTM of specific proteins is still very limited, but integrating this knowledge with other allergen physicochemical properties might contribute to a better allergenicity risk assessment.

Mass spectrometry
According to the general mechanisms of IgE-mediated allergies, antigenic epitopes bind IgE non-covalently, forming IgE-antigen complexes that fit into a high-affinity receptor (FcεRIα) on the surface of mast cells and basophils. Crosslinking of FcεRIα-bound IgEs through multivalent allergens triggers the allergic cascade. Some structural aspects of the IgE-FcεRIα interaction also remain controversial (Hirano et al. 2018). The fragmentary knowledge about the actual antigenic determinants and their structural interaction with antibody isotypes or with receptors of immunocompetent cells is among the main factors preventing us from establishing what makes a protein a food allergen. The incomplete picture of such a complex interplay hinders the process of allergenicity risk assessment, especially for novel potential allergens, as well as the possibility of devising novel therapeutic strategies. Mass spectrometry (MS) is the core technology of several cutting-edge strategies that have been successfully devised to map protein domains involved in non-covalent interactions and are expected to replace or, at least, support the existing methods in the near future.
Hydrogen/deuterium exchange (HDX) is an emerging MS-based epitope mapping method, which exploits the labeling of the protein amide backbone with deuterium at variable rates, depending on the accessibility to solvent and on the local topology of the amino acid sequences. Antigens alone or antigen-antibody complexes are incubated in D2O, followed by proteolysis and analysis by LC-MS/MS. The domains involved in antibody binding are those that differ in deuterium uptake between the two conditions and can be identified by a shift in molecular weight. Recently, the conformational epitopes of Ana o 2, Pru du 6 (Willison et al. 2013) and Ana o 1 (Guan et al. 2015) recognized by monoclonal antibodies have been mapped using this technique. This information could be useful to predict the epitope structure against human IgE and, with some adjustments, this technique can in principle be extended to the study of epitopic regions involved in the binding to specific IgE populations (Matsuo, Yokooji, and Taogoshi 2015). Related to the HDX-MS method, chemical footprinting by oxidative labeling of amino acid residues not involved in the interaction, namely Fast Photochemical Oxidation of Proteins (FPOP)-MS, is an emerging strategy to elucidate the structural details of protein-protein and protein-ligand interactions, including the contacting epitope-paratope domains (Yan et al. 2014).
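The comparison at the heart of HDX-MS epitope mapping can be illustrated with a short sketch: for each peptide, the deuterium uptake measured for the antigen alone is compared with that measured for the antigen-antibody complex, and peptides that take up markedly less deuterium in the complex are flagged as candidate epitope regions. The peptide labels, uptake values and the 0.5 Da threshold below are hypothetical and serve only to show the logic.

```python
def protected_peptides(uptake_free, uptake_bound, min_diff_da=0.5):
    """Flag peptides whose deuterium uptake drops upon antibody binding.

    uptake_free / uptake_bound map a peptide label to its mean deuterium
    uptake (Da) for the antigen alone and for the antigen-antibody complex.
    min_diff_da is an assumed significance threshold.
    """
    protected = {}
    for peptide, free in uptake_free.items():
        if peptide not in uptake_bound:
            continue  # peptide not observed in both conditions
        diff = free - uptake_bound[peptide]
        if diff >= min_diff_da:
            protected[peptide] = round(diff, 3)
    return protected


# Hypothetical uptake values (Da) at a single labeling time point.
uptake_free = {"pep_12_25": 4.1, "pep_40_55": 3.0, "pep_90_104": 5.2}
uptake_bound = {"pep_12_25": 4.0, "pep_40_55": 1.6, "pep_90_104": 5.1}

print(protected_peptides(uptake_free, uptake_bound))
```

A real analysis would use replicate measurements at several labeling times and a statistical criterion rather than a fixed cut-off; reduced uptake can also reflect allosteric changes away from the binding interface.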
Polypeptide chains in close spatial proximity can be chemically cross-linked using a variety of commercially available bifunctional reagents, hooking specific amino acid residues (Leitner 2016). Upon proteolysis, cross-linked heterodimers are released from the former interacting protein domains and are identified by MS/MS. However, because the peptides involved in the cross-link are difficult to identify, this approach has not yet entered routine practice for epitope identification.
The recently optimized limited proteolysis coupled to MS (LiP-MS), which is a proteome-wide extension of a strategy developed more than 30 years ago (Jemmerson and Paterson 1986), addresses the functional analysis of protein interactions with either ligands or physiological biomolecular effectors (Schopper et al. 2017). The binding of a protein to a ligand or the interaction between proteins alters the structural conformation of the polypeptide, producing different sets of peptides upon limited proteolysis, which are further hydrolyzed by trypsin. By combining LC-MS analysis with software-assisted reconstruction, it is possible to identify the protein regions involved in binding events. Recently, a workflow for quantitative chemoproteomic analysis has been developed (Piazza et al. 2018). This method enables the identification of metabolite-protein interactions and the functional modulation of allergen immunogenicity/allergenicity directly in their native environment. In addition, MS also enables the profiling of the entire repertoire of major histocompatibility complex-binding peptides exposed on the surface of antigen-presenting cells (Bozzacco et al. 2011; Clement et al. 2016).
It is envisaged that the routine applications of these strategies to food allergen characterization will provide insights into the etiopathological mechanisms of the disease at a molecular level and will prime the development of adequate therapeutic approaches.

Label-free interaction analysis
Label-free detection methods utilize molecular biophysical properties such as the refractive index to monitor molecular interactions that are transduced as mechanical, electrical, or optical signals, acquiring direct information from native proteins and ligands (Ray, Mehta, and Srivastava 2010). They could replace current assays based on fluorescence, luminescence or radioactivity, which use specialized reagents labeled with detection moieties that can introduce artifacts. Any foreign molecule that is chemically or temporarily attached to the molecule of interest can potentially alter its intrinsic properties, changing its interaction capacity and, thus, generating false-positive and false-negative results. Moreover, label-based detection techniques require a previous labeling process combining synthesis and purification that usually has a low yield (Syahir et al. 2015). Label-free detection methods such as Surface Plasmon Resonance (SPR), microfluidics and biosensors are quantitative techniques that show in general higher sensitivity, lower cost, and greater ease of manufacture than traditional methods and can be applied to the analysis of allergens as contaminants in order to facilitate correct food allergen labeling.

Surface plasmon resonance
SPR refers to the electromagnetic response obtained when collective oscillations of free electrons (plasmons) occur on the surface of a metal. SPR takes place when polarized light strikes an electrically conducting surface, generally a gold layer, at the interface between two media with different refractive indices: the sensor surface containing the immobilized proteins and a sample containing a potential interacting partner in buffer solution (Sipova and Homola 2013) (Figure 2). During the interaction, the angle at which the reflected light reaches its minimum intensity is monitored; this angle shifts as molecules bind and dissociate. This technique allows studying the interactions between biomolecules in real time and can be performed on complex mixtures, such as cell culture supernatants or cell extracts, working equally well on clear and colored or opaque samples (Michel, Xiao, and Alameh 2017).
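Because the SPR signal follows binding and dissociation in real time, the resulting sensorgrams are commonly interpreted with a kinetic model. The sketch below assumes the simple 1:1 (Langmuir) interaction model, which is not discussed in the studies cited here and is used only for illustration; the rate constants, analyte concentration and maximum response are hypothetical values.

```python
import math


def equilibrium_dissociation_constant(ka, kd):
    """KD (M) from the association (M^-1 s^-1) and dissociation (s^-1) rates."""
    return kd / ka


def association(t, conc, ka, kd, rmax):
    """Response (RU) at time t (s) while analyte at conc (M) flows over the chip."""
    k_obs = ka * conc + kd
    r_eq = rmax * ka * conc / k_obs  # steady-state response at this concentration
    return r_eq * (1.0 - math.exp(-k_obs * t))


def dissociation(t, r0, kd):
    """Response (RU) at time t (s) after switching back to buffer."""
    return r0 * math.exp(-kd * t)


# Hypothetical antibody-allergen interaction.
ka, kd, rmax = 1e5, 1e-3, 100.0
kd_eq = equilibrium_dissociation_constant(ka, kd)
print(f"KD = {kd_eq:.1e} M")

# Simulated sensorgram: 300 s association at 10 nM, then 300 s dissociation.
r_end = association(300.0, 10e-9, ka, kd, rmax)
for t in (0, 100, 200, 300):
    print(t, round(association(t, 10e-9, ka, kd, rmax), 2),
          round(dissociation(t, r_end, kd), 2))
```

Under this model, an analyte concentration equal to KD gives a steady-state response of half the maximum, which is one way affinity is read from equilibrium data. Polyclonal sera or multivalent allergens generally deviate from 1:1 behavior, so the model should be taken as a first approximation.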
In the allergy field, SPR can be used to detect and measure the avidity of antibodies recognizing allergens (Chardin et al. 2014). The study of the influence of immunotherapy against the major birch pollen allergen, Bet v 1, on specific IgE, IgG1 and IgG4 affinities was one of the first applications of this technique, which made it possible to measure the IgG/IgE ratio in patients undergoing immunotherapy (Jakobsen et al. 2005).
It can also be useful to characterize polyclonal (Pol et al. 2007) and monoclonal (Laffer et al. 2008) anti-IgE to be used in immunotherapy against allergy and asthma. Antibody responses, induced in monkeys by recombinant IgE-derived immunotherapeutic protein against atopic allergies and asthma, were characterized by SPR (Pol et al. 2007). This technique was also used to assess the binding kinetics and affinity of a monoclonal anti-IgE and investigate its potential use for depletion of IgE and isolation of IgE-bearing cells from peripheral blood (Laffer et al. 2008). Another application of SPR has been the improvement of allergy diagnosis, reducing the detection limit in the determination of serum IgE in allergic patients (Hamilton, Saini, and MacGlashan 2012; Joshi et al. 2014) or visualizing the effect of various stimuli or inhibitors on basophils in a high throughput screening system (Yanase et al. 2012).
Owing to its high sensitivity, low cost and easy fabrication, SPR provides real-time methods to monitor the presence of allergenic ingredients in food matrices during processing (Zhou et al. 2019), although it should be noted that the ligand may not maintain its native configuration upon immobilization on the sensor chip surface. Simple and label-free SPR sensors have been developed to detect casein (Hiep et al. 2007), β-lactoglobulin (Ashley et al. 2018; Wu et al. 2016), ovalbumin (Pilolli, Visconti, and Monaci 2015; Lin et al. 2017), lysozyme (Ocana et al. 2015), Ara h 1 (Wu et al. 2016), parvalbumin (Lu, Ohshima, and Ushio 2004) and tropomyosin (Zhou et al. 2020). In addition, SPR biosensors allow the capture of multiple allergens in a single food sample for high-throughput multiplex analysis, as demonstrated by Billakanti et al. (2010), who detected five whey proteins in both raw and processed milk.

Microfluidics
Microfluidics, the science of fluids on the micro- and nanometer scale, has become an increasingly popular tool in biochemistry applications (Cho et al. 2011). The integration of microfluidic systems in other technologies, such as biosensors or microarrays, can be useful in the field of food allergy. A microfluidic ELISA-based optical sensor has been applied for rapid detection of Ara h 1, reducing the total assay time from hours to 15-20 min and decreasing the sample/reagent consumption compared to commercial ELISA kits, with superior sensitivity (Weng, Gaur, and Neethirajan 2016) (Figure 3). In addition, a microarray coupled to microfluidics has been reported for the in vitro serological detection of specific IgE for allergy diagnosis (Cretich et al. 2009).

Biosensors
The development of electrochemical biosensors for applications in food allergen analysis has increased in recent years. They are powerful tools to perform quick analyses with high sensitivity and selectivity, while allowing on-site analysis, although they generally require expensive instruments and skilled operators. They can be categorized as potentiometric, amperometric, voltammetric, conductometric or impedimetric sensors according to the electrochemical principle involved. The use of specific antibodies has allowed the development of immunosensors for the selective detection of food allergenic proteins from peanut (Singh et al. 2010; Alves et al. 2015; V. R. V. Montiel et al. 2016), milk (Montiel et al. 2015; Inaba, Kuramitz, and (Montiel et al. 2017). However, limitations arising from the use of antibodies, such as nonspecific binding or susceptibility to degradation, should be considered in immunosensor development.

Techniques to unravel the phenotype of single cells and novel immune pathways in food allergy
Beyond the technological advances made in the comprehensive investigation of allergen structure and characterization, the development of high-throughput "-omics" techniques (proteomics, transcriptomics, metabolomics, genomics, and epigenomics) has also enabled sophisticated approaches for studying biological pathways involved in food allergy through the characterization of single cells (Dhondalay et al. 2018). High-throughput proteomics has been used to study the allergenic response at single-cell resolution by applying mass cytometry or cytometry by time-of-flight (CyTOF). In contrast to fluorescence-based flow cytometry, which uses dye-labeled antibodies, CyTOF is an MS-based flow cytometry approach that uses isotopically purified heavy metal atoms as reporters (rather than fluorochromes), which are identified by TOF-MS, allowing the detection of molecules expressed by immune cells. Conventional fluorescent dye-based flow cytometry distinguishes up to 18 proteins at a single-cell level. However, CyTOF accurately differentiates more than 40 targets (theoretically up to 100 isotopes) with less cross-talk between detector channels and less signal overlap (Spitzer and Nolan 2016; Simoni et al. 2018). Based on this technology, imaging mass cytometry (IMC) has been recently applied to analyze tissue sections, as reviewed by Chang et al. (2017). Some studies have revealed the importance of CyTOF for research and clinical monitoring of patients with food allergies. High-dimensional analyses of immune biomarkers using mass cytometry have been employed to study the diversity of T-cell phenotypes in food allergic patients (Chinthrajah et al. 2018) and to identify discrete B-cell subsets in subjects with red meat allergy (Cox et al. 2019). In addition, Tordesillas et al. (2016) used CyTOF to describe a potential interaction between basophils and platelets after peanut exposure. Moreover, this technique has proven to be a robust approach to basophil activation testing that could improve the diagnosis and prognosis of food allergies (Mukai et al. 2017). CyTOF has also been applied to the investigation of a non-IgE mediated allergy, the food protein-induced enterocolitis syndrome (FPIES), and the results obtained have revealed a systemic immune activation following elicitation by foods (Goswami et al. 2017). In general, the current challenge in this field is the development of dimensionality reduction methods that allow examining, visualizing and presenting the vast amount of data generated by CyTOF.
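To make the idea of dimensionality reduction concrete, the following sketch applies the arcsinh transformation customarily used for mass cytometry intensities (a cofactor of 5 is assumed here) and then projects each cell onto the first principal component, computed by power iteration. It is a deliberately simple, dependency-free illustration on hypothetical three-marker data; the studies cited above rely on dedicated tools, and non-linear methods are usually preferred for real 40-marker panels.

```python
import math


def arcsinh_transform(events, cofactor=5.0):
    """Variance-stabilize raw marker intensities (one row per cell)."""
    return [[math.asinh(v / cofactor) for v in row] for row in events]


def first_principal_component(rows, n_iter=200):
    """Return (unit direction, per-cell scores) of the first principal component."""
    n, d = len(rows), len(rows[0])
    means = [sum(r[j] for r in rows) / n for j in range(d)]
    centered = [[r[j] - means[j] for j in range(d)] for r in rows]
    # Sample covariance matrix of the markers (d x d).
    cov = [[sum(c[i] * c[j] for c in centered) / (n - 1) for j in range(d)]
           for i in range(d)]
    # Power iteration; the uneven start vector avoids the most obvious
    # cases of being orthogonal to the leading eigenvector.
    vec = [1.0 / (j + 1) for j in range(d)]
    for _ in range(n_iter):
        new = [sum(cov[i][j] * vec[j] for j in range(d)) for i in range(d)]
        norm = math.sqrt(sum(x * x for x in new))
        if norm == 0.0:
            break  # no variance in the data
        vec = [x / norm for x in new]
    scores = [sum(c[j] * vec[j] for j in range(d)) for c in centered]
    return vec, scores


# Hypothetical ion counts for six cells and three markers:
# the last three cells are strongly positive for markers 1 and 2.
events = [[0, 1, 10], [2, 0, 12], [1, 2, 11],
          [200, 150, 10], [220, 160, 12], [210, 140, 11]]

direction, scores = first_principal_component(arcsinh_transform(events))
print([round(s, 2) for s in scores])
```

In this toy data set the single score separates the two cell populations, which is the kind of low-dimensional summary that makes high-parameter cytometry data inspectable.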
Besides proteomics, transcriptomics is now being applied to better understand the complex mechanisms involved in food allergic responses by measuring transcriptionally active genes expressed by the immune cells (Dhondalay et al. 2018). Microarrays (platforms with probes to measure a set of predefined genes) are commonly used for the characterization of RNA molecules. However, the next-generation platform for gene expression analysis is RNA sequencing (RNA-seq). Combined with high-throughput bioinformatics, this approach can cope with the growing volume of gene expression data. Compared to microarrays, RNA-seq has several advantages, including the detection of a higher percentage of differentially expressed genes and the ability to detect novel transcripts.
So far, it is considered that the genes expressed by immune cells are linked to allergic responses triggered by foods, although transcript levels are not always well correlated with released proteins (Dhondalay et al. 2018). Kosoy et al. (2016) reported the transcriptional profiling of peripheral blood mononuclear cells (PBMC) from individuals reactive to egg proteins, showing an increased expression of genes associated with allergic inflammation and higher secretion of specific immune mediators compared with PBMC from healthy donors. Using RNA-seq, Watson et al. (2017) identified variations in gene expression during acute allergic responses triggered by peanut. Moreover, this study allowed the recognition of key driver genes, biological processes and cell types associated with severe peanut allergic reactions. Recently, Croote et al. (2018) performed plate-based single-cell RNA-seq on IgE-producing B-cells isolated from PBMC of food-allergic subjects to clarify the gene expression and splicing patterns of these cells. According to this study, the IgE antibodies attained high affinity and unexpected cross-reactivity to peanut allergens (Ara h 2/Ara h 3), properties resulting from mutations in the variable regions of the heavy and light chains of IgE in allergic individuals (Croote et al. 2018). Additionally, transcriptomics has also been employed to better comprehend food allergens and the genes involved in their expression/regulation (Baar et al. 2014; Ramesh et al. 2016; Nugraha et al. 2018). Therefore, transcriptomics data have already highlighted differences in the gene expression patterns of allergic and non-allergic subjects, revealing a strong potential to detect novel immune processes in food allergy (Dhondalay et al. 2018).
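The comparison of expression patterns between allergic and non-allergic subjects ultimately rests on normalized read counts. As a minimal sketch, assuming one sample per group and invented gene labels and counts, the snippet below normalizes raw RNA-seq counts to counts per million (CPM) and computes a log2 fold change per gene. It is not the analysis used in the cited studies, which would involve biological replicates and dedicated statistical packages.

```python
import math


def counts_per_million(counts):
    """Scale raw read counts by library size (reads per million mapped reads)."""
    total = sum(counts.values())
    return {gene: 1e6 * c / total for gene, c in counts.items()}


def log2_fold_change(case_counts, control_counts, pseudocount=1.0):
    """log2(case / control) per gene on the CPM scale.

    The pseudocount (an assumed value) avoids division by zero for
    genes with no reads in one of the samples.
    """
    case_cpm = counts_per_million(case_counts)
    ctrl_cpm = counts_per_million(control_counts)
    return {
        gene: math.log2((case_cpm[gene] + pseudocount)
                        / (ctrl_cpm[gene] + pseudocount))
        for gene in case_cpm
        if gene in ctrl_cpm
    }


# Hypothetical raw counts for PBMC from an allergic and a healthy donor.
allergic = {"gene_A": 300, "gene_B": 700}
healthy = {"gene_A": 100, "gene_B": 900}

for gene, lfc in log2_fold_change(allergic, healthy).items():
    print(gene, round(lfc, 2))
```

A positive value indicates higher relative expression in the allergic sample; as noted above, such transcript-level differences do not necessarily translate into differences in secreted protein.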
Finally, a new technology that combines proteomics and transcriptomics has emerged. The cellular indexing of transcriptomes and epitopes by sequencing (CITE-seq) connects the measurement of specific protein markers with the unbiased genome-wide expression profiling. This approach uses oligonucleotide-labeled antibodies to integrate cellular proteins and transcriptome measurements into an efficient, single-cell readout (Stoeckius et al. 2017). CITE-seq has not been used in studies of the allergic responses yet, but its application may unravel pathways involved in the pathogenesis of food allergy.

Conclusions
Technological development has allowed the expansion of advanced technologies that could be applied in food allergy research to decipher the structure of food allergens and the mechanisms through which they sensitize or elicit adverse responses in some individuals, as well as to improve current analytical techniques for allergen detection (Table 1). Promising techniques such as X-ray FEL crystallography, ssNMR, cryo-EM/CX-MS or computational predictive methods could determine structural and functional aspects of allergens. In order to decipher the mechanisms of action of allergens without structural or functional characterization, the TPP approach could be useful. Novel MS-based approaches could help identify post-translational modifications of allergens and, together with label-free interaction analysis, detect interactions between allergens and other biomolecules. In addition, techniques such as CyTOF, IMC, RNA-seq and CITE-seq can contribute to unraveling the phenotype of single cells and novel immune pathways in food allergy.
Given the rapid pace of technological advances, new insights into food allergy research need to be considered. The different approaches collected in this review probably offer the most promising options to move forward in the food allergy field, to understand what turns a protein into an allergen, what underlies the allergic reaction, and how it can be abolished, in the not too distant future.

Disclosure statement
The authors declare that they have no conflicts of interest.

Funding
This review is based upon work from COST Action FA1402, supported by COST (European Cooperation in Science and Technology, www.cost.eu). S. B. acknowledges financial support from the Juan de la Cierva Program (MICIU, Spain). S. B. and E. M. acknowledge the financial support of the Spanish Ministry of Science and Innovation.