Plant lectins and their many roles: Carbohydrate-binding and beyond

Lectins are ubiquitous proteins that reversibly bind to specific carbohydrates and, thus, serve as readers of the sugar code. In photosynthetic organisms, lectin family proteins play important roles in capturing and releasing photosynthates via an endogenous lectin cycle. Often, lectin proteins consist of one or more lectin domains in combination with other types of domains. This structural diversity of lectins is the basis for their current classification, which is consistent with their diverse functions in cell signaling associated with growth and development, as well as in the plant’s response to biotic, symbiotic, and abiotic stimuli. Furthermore, the lectin family shows evolutionary expansion that has distinct clade-specific signatures. Although the function(s) of many plant lectin family genes are unknown, studies in the model plant Arabidopsis thaliana have provided insights into their diverse roles. Here, we have used a biocuration approach rooted in the critical review of scientific literature and information available in the public genomic databases to summarize the expression, localization, and known functions of lectins in Arabidopsis. A better understanding of the structure and function of lectins is expected to aid in improving agricultural productivity through the manipulation of candidate genes for breeding climate-resilient crops, or by regulating metabolic pathways through the application of plant growth regulators.


Introduction
Carbohydrates, some of the most versatile and abundant molecules in nature, are major sources of energy for living organisms, meeting both short- and long-term energy demands in the form of free and stored metabolites, such as sugars, starch, and glycogen. They are also constituents of structural components, such as the monosaccharides that form the backbone of nucleic acids (e.g., ribose and deoxyribose), lipopolysaccharides, glycosaminoglycans, and polysaccharides, such as cellulose, pectin, agar, carrageenan, and chitin, that provide shape and integrity to the cells. In addition, carbohydrates modify lipids and proteins, altering their structure, folding, localization, and function (Varki and Gagneux, 2017). Further, they are actively involved in immune recognition, and provide the basis for self vs. non-self recognition (Agostino et al., 2011; O'Neill et al., 2017).
The ability of carbohydrates to participate in diverse functions stems from their distinctive chemical characteristics. For example, even a simple C6 (hexose) sugar, such as glucose, which has four chiral centers in its open-chain form, can give rise to sixteen possible stereoisomers (2^n, where n is the number of chiral centers in the molecule). Similarly, a disaccharide, such as sucrose, with nine chiral centers, has 512 (2^9) possible isomeric forms. Moreover, longer chains of oligosaccharides introduce even greater biochemical diversity. Indeed, these macromolecules may serve as identification tags. However, for a highly specific identification system to be useful, living systems must be able to: 1) synthesize, modify, and degrade many of these carbohydrates in vivo; and 2) differentiate between the various structural codes that these carbohydrates generate. The former is achieved by glycosyltransferases and glycosidases, which are broadly grouped as carbohydrate-active enzymes (CAZy). In contrast, the latter is achieved either by glycan-binding proteins (GBPs) or by antibodies (see Fig. 1), depending on where they function (Henrissat et al., 2017).
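The 2^n rule quoted above can be checked with a few lines of Python. This is only a minimal sketch: the function name `max_stereoisomers` is illustrative, the count of four chiral centers for an open-chain hexose such as glucose is an assumption consistent with the sixteen isomers quoted, and the count of nine for sucrose is taken from the text. The rule gives an upper bound, since it ignores molecular symmetry.

```python
def max_stereoisomers(chiral_centers: int) -> int:
    """Upper bound on the number of stereoisomers: 2**n for n chiral centers."""
    if chiral_centers < 0:
        raise ValueError("number of chiral centers must be non-negative")
    return 2 ** chiral_centers


# Chiral-center counts: 4 is assumed for an open-chain hexose,
# 9 is the value given in the text for sucrose.
examples = {"glucose (open-chain hexose)": 4, "sucrose": 9}

for sugar, n in examples.items():
    print(f"{sugar}: n = {n}, 2^n = {max_stereoisomers(n)}")
```

Each additional chiral center doubles the number of possible forms, which is why longer oligosaccharides carry so much more coding capacity than single sugars.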
Glycan-binding proteins (GBPs) are classified into two broad groups, namely glycosaminoglycan (GAG)-binding proteins and lectins (Taylor et al., 2017). The GAG-binding proteins contain clusters of positively charged amino acids that allow them to recognize, via electrostatic interactions, patterns of negatively charged carboxylate and sulfate groups on polysaccharides, such as heparan, chondroitin, dermatan, and keratan sulfates (Taylor et al., 2017). Lectins, on the other hand, have shallow carbohydrate recognition domains (CRDs), which allow one or more terminal carbohydrate residues of a polysaccharide to fit in, primarily by way of hydrogen-bonded interactions (Loris et al., 1998; Sharma and Surolia, 1997; Sharon and Lis, 2001). Further, residues lining the CRD may also interact with the carbohydrate ligands through van der Waals interactions, ionic bonds, and CH-π interactions (with aromatic/hydrophobic residues) (Kiessling, 2018; Spiwok, 2017). Thus, unlike enzymes, lectins are proteins that have at least one non-catalytic domain capable of reversibly binding carbohydrates. This is a critical distinction that determines the premier role of lectins as readers of the glycocode, also known as the sugar code (Dedola et al., 2020; Nilsson, 2011). We emphasize that the rapidly growing fields of genomics, transcriptomics, and glycomics are extremely important for our understanding of these protein-carbohydrate interactions.
In this review, we provide a historical introduction to lectins, describe their classification, and summarize the current knowledge about the roles plant lectins play in modulating the availability of photosynthates and in cell signaling associated with various biological processes. First, we discuss the importance of the structure of the carbohydrate recognition domains in lectins and their interactions with carbohydrates. Then, we describe insights gained from high-throughput approaches in genomics, transcriptomics, proteomics, glycomics, and cell biology into the function of lectin family members in Arabidopsis thaliana and other plant species, including crops.

History of lectin research
In 1888, Peter Hermann Stillmark discovered ricin, a lectin, in extracts of castor beans (reviewed by Olsnes, 2004; Polito et al., 2019) and soon thereafter, another lectin, abrin, was discovered (Olsnes, 2004). Lectins were initially called agglutinins/haemagglutinins/phytoagglutinins because of their ability to agglutinate or cross-link animal blood cells (André et al., 2015). The idea that their selectivity was based on carbohydrate recognition followed much later, after Sumner and Howell (1936) had demonstrated the carbohydrate specificity of concanavalin A (Con A) and after Watkins and Morgan (1952) had discovered that the agglutination of the blood group "O" cells, by the lectin from eel serum, could be inhibited by simple sugars. After it was shown that most lectins have specificities that could distinguish between different human blood types, they were named lectins, based on the Latin word 'lego' (noun), meaning, chosen; or 'legere' (verb), which means 'to choose or select' (Boyd and Reguera, 1949; Renkonen, 1948; Boyd and Shapleigh, 1954).
The word "lectin" was already in use when the first mammalian lectin, the asialoglycoprotein receptor, was discovered (Ashwell and Morell, 1974). Later, many lectins were identified in bacteria, fungi, protozoans, invertebrates, and vertebrates (Kilpatrick, 2002; Nizet et al., 2017). With these findings came the realization that lectins are omnipresent in nature and play a crucial role in deciphering the glycocode of life. A timeline of the many discoveries in the field of lectin research is summarized in Table 1.

Fig. 1. Lectins belong to a heterogeneous group of carbohydrate-binding proteins. Carbohydrates are recognized and acted upon by three overarching groups of proteins, carbohydrate-active enzymes (CAZy), glycan-binding proteins (GBPs), and antibodies (Abs). The CAZy group includes enzymes, such as glycosyltransferases and glycosidases that are involved in the formation, modification, or degradation of glycosidic bonds. The GBPs are involved in the recognition of carbohydrates and these are subdivided into two families, the lectins, and the sulfated glycosaminoglycan (GAG)-binding proteins. Lectins are a group of structurally diverse proteins that reversibly bind to specific carbohydrates, and are further divided into 25 subfamilies. The distribution of lectin subfamily members is shown across all kingdoms of life. EUL: Euonymus europaeus lectins; Nictaba: Nicotiana tabacum agglutinin; ABA: Agaricus bisporus agglutinin; CRA: Chitinase V related agglutinin; GNA: Galanthus nivalis agglutinin; F-type: fucose-binding fold containing; C-type: calcium-dependent; PA14-like: PA14 domain containing proteins; P-type: mannose-6-phosphate receptors; and X-type: Xenopus laevis oocyte cortical granule lectin-like.

Classification of plant lectins
As several hundred lectins were discovered, the challenge of classifying them rationally emerged. One possible mode was to simply use the organism in which they were found: microbial lectins, animal lectins, and plant lectins. However, within each kingdom, various carbohydrate specificities were found, requiring a subclassification of these lectins. Several different criteria were examined for subgrouping of the lectins, including their sugar specificities. As crystal structures of the lectins became available, it soon became apparent that there were multiple molecular mechanisms by which two lectins from different plants recognize the same sugar; conversely, two lectins from the same plant were shown to bind different sugars (Abhinav and Vijayan, 2014). Moreover, using sugar specificities alone as a classification criterion for the extremely large group of plant lectins did not produce any kind of generalization about the biochemical and biophysical properties of the lectins themselves (Peumans and Van Damme, 1995). Another classification method considered the nature and organization of the CRDs that plant lectins possess (Tsaneva and Van Damme, 2020). In this method, lectins were classified as merolectins (lectins with a single CRD), hololectins (lectins with two or more identical CRDs), chimerolectins (those which possess at least one CRD and one other type of conserved domain with a different function), and superlectins (those containing two or more non-identical CRDs with specificities for structurally different sugars).
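The four CRD-based classes described above can be expressed as a small decision rule. The following Python sketch is illustrative only: the `Protein` record, the use of one text label per CRD to decide whether CRDs are "identical", and the rule that any non-CRD domain takes precedence are all assumptions made for this example, not part of the published scheme.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Protein:
    name: str
    crds: List[str]  # one label per CRD (hypothetical, e.g. its sugar specificity)
    other_domains: List[str] = field(default_factory=list)


def classify_lectin(protein: Protein) -> str:
    """Assign a CRD-based class following the definitions in the text."""
    if not protein.crds:
        return "not a lectin"
    if protein.other_domains:
        # At least one CRD plus another type of conserved domain.
        return "chimerolectin"
    if len(protein.crds) == 1:
        return "merolectin"
    if len(set(protein.crds)) == 1:
        # Two or more CRDs, all carrying the same label.
        return "hololectin"
    return "superlectin"


# Hypothetical example architectures.
examples = [
    Protein("single-CRD protein", ["Man"]),
    Protein("two identical CRDs", ["Man", "Man"]),
    Protein("CRD plus kinase", ["Man"], ["kinase"]),
    Protein("two different CRDs", ["Man", "Gal"]),
]
for p in examples:
    print(p.name, "->", classify_lectin(p))
```

The sketch also makes the limitation of this scheme visible: the class depends only on how many CRDs and other domains are present, not on how a given CRD recognizes its sugar.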
Table 1. Timeline showing the changing focus of research on plant lectins.

1995-Present
Production of recombinant lectins in heterologous systems (see a review by Lam and Ng, 2011) and development of several diagnostic tools, for example:
• Lectin-based biosensors for detection of microorganisms (Ertl and Mikkelsen, 2001)
• Glycan array for ligand profiling of lectins (Fukui et al., 2002)
• Lectin-based microarrays for glycan profiling (Patwa et al., 2006)
• Enzyme-linked lectin assay (ELLA) used for profiling biomarkers associated with several diseases (for a review, see Tsaneva and Van Damme, 2020)
• Lectin-based biosensors for clinical oncology (Silva, 2018)

2000
Arabidopsis thaliana genome sequence published (Arabidopsis Genome Initiative, 2000).

2001
A lectin-domain containing receptor kinase, the S-locus Receptor Kinase (SRK), was shown to act as a female determinant of self-incompatibility response in Brassica that interacts with its ligand in an allele-specific manner (Kachroo et al., 2001). Later SRK served as a prototype for the classification of a subclass of plant receptor-like kinases (see Shiu and Bleecker, 2001; Naithani et al., 2007; Naithani et al., 2021).

2011
Plant lectins as tools for controlling insect pests (Vandenborre et al., 2011).

2014
Autoradiography leads to the plant lectin cycle (Nonomura and Benson, 2014) and its link to the carbon reactions of photosynthesis (Nonomura et al., 2020).

2016
Elucidation of the crystal structure of SRK lectin domains (the extracellular region of SRK9 allele) in complex with its ligand SCR9 from Brassica rapa (Ma et al., 2016).

2017
Genome-wide screening for lectin domains in A. thaliana (Eggermont et al., 2017).

2001
Whole genome sequencing of more than 100 plant species. The genome-wide investigations in plants revealed that
• Many proteins containing lectin domains in combination with one or more other types of conserved domains (chimerolectins) are more abundant than proteins comprising only lectin domains (hololectins). For a recent review see Van Holle and Van Damme (2019)
• Lectins are present in many subcellular compartments (see Table 2)

2001-present
The analysis of individual genes, mutants, and functional genomics studies identified many endogenous roles of lectins in cell signaling associated with plant development, reproduction, plant defense, and abiotic stress tolerance (described in detail later in this review).

However, despite structural conservation, the identity of the glycan that a CRD binds is determined by finer molecular details of the binding site and its interactions with the sugar moiety (Fig. 2A). For example, the CRDs of all the legume lectins are made of what is called the 'jelly roll fold', and these are dependent on two divalent cations for carbohydrate-binding: a Ca2+ within the CRD, and a Mn2+ adjacent to it (Loris et al., 1998). In addition, a flexible loop in the CRD contributes to defining the actual structure of the sugar-binding pocket and determining its ligand specificity. Thus, some legume lectin monomers recognize primarily the equatorially oriented C4-hydroxyl of glucose (Glc)/mannose (Man), while others recognize the axially oriented C4-hydroxyl of galactose (Gal)/N-acetylgalactosamine (GalNAc).
For the same reason, legume lectin monomers may also produce fucose-specific or chitobiose-specific CRDs or even CRDs that bind complex carbohydrate structures. Many CRDs, including that of the Euonymus europaeus lectin (EUL) and the L-type lectins, have extended ligand binding sites (Fig. 2B) which enable oligosaccharides to bind to them with higher specificity and greater avidity than the monosaccharides (Agostino et al., 2015; Loris et al., 1998). Such extended binding sites, or subsites, normally show cooperativity in ligand binding, which explains the higher specificity and affinity of many lectins/CRDs for oligosaccharides. For example, ConA binds the trimannopyranoside Man-α1-3(Man-α1-6)-Man-α-Me with much greater affinity and specificity than Man or Glc monomers (Loris et al., 1998). Often, an additional hydrophobic patch, adjacent to the carbohydrate-binding site in a CRD, enables binding of the lipid tails of a glycolipid or other hydrophobic entities attached to the carbohydrate, thereby enhancing the interaction with sugar-bound entities (Fig. 2C) (Loris et al., 1998; Sharma and Surolia, 1997). A carbohydrate oligomer may be bound by two or more CRDs, resulting in the clustering of lectin receptors in the presence of the carbohydrate (Fig. 2D). A CRD bound to a domain with a different functionality (for example, to a kinase domain of receptor-like kinases) results in multifunctional proteins (Fig. 2E). At a higher level of organization, two or more CRDs or monomers may form homomeric or heteromeric complexes that are capable of multivalent ligand binding or cross-linking of carbohydrates on two different cells to produce carbohydrate aggregation, glycoprotein precipitation or cell agglutination (Fig. 2F) (Abhinav and Vijayan, 2014; Huwa et al., 2021).
Thus, a scheme based on the nature and organization of the CRDs neither enabled an understanding of the general principles of carbohydrate-binding, nor did it permit prediction of the biochemical and biophysical properties of newly identified lectins. Moreover, it did not allow us to search in genome databases for the identification of new lectins.
Evidently, carbohydrate-binding is just one of the many properties of a lectin (Barondes, 1988;Komath et al., 2006). Many plant lectins bind to plant growth regulators (PGRs), phytohormones, or to small peptides, in addition to saccharides. In fact, multi-domain multi-functional lectins have been found to be widely distributed across the plant kingdom (Etzler, 1985). Hence, the current, more widely accepted classification method is based on the sequence and the conserved three-dimensional structures of the lectin motif (Taylor et al., 2017;Tsaneva and Van Damme, 2020). In this system, each subfamily is named after the best characterized protein of this group. Indeed, such an approach permits a generalized classification of lectins into subfamilies from all the kingdoms of life. A summary of various lectin gene subfamilies and their distribution across all kingdoms of life is shown above in Fig. 1 (for examples, see also Table 2).

Expansion and diversification of lectin gene family in plant genomes
In general, lectins are abundant in the plant kingdom. To gain an insight about lectin domain coding genes across a broad spectrum of photoautotrophs, we searched for lectin motifs* in the Gramene knowledgebase (http://gramene.org). Gramene, a resource for comparative functional genomics in crops and model plant species, currently hosts fully sequenced genomes and gene pages for 93 species, and systems-level pathway networks for 106 species (Tello-Ruiz et al., 2021). For our analysis, we selected 48 species from the Gramene representing a broad spectrum of photoautotrophs for the analysis of the lectin family (see Fig. 3). We note that land plants contain a higher number of the lectin coding genes and show clade-specific variations, as well as show increases due to ploidy level.

Fig. 3. Variation in the total number of lectin coding genes (representing all subfamily members present within an organism) in 48 species of photoautotrophs ranging from unicellular algae to higher plants ordered according to their phylogenetic relationship (e.g., algae, bryophytes & lycophytes, basal angiosperms, monocots, and dicots). The image with statistics of lectin genes was generated using the Gramene knowledgebase (http://gramene.org) by searching for lectin motifs* (the wildcard symbol * is commonly used for broadening a search/retrieving similar words).
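The effect of a trailing wildcard in a search term such as lectin* can be illustrated with Python's standard `fnmatch` module. This is a sketch of wildcard matching in general and not of the Gramene search engine, whose implementation is not described here; the annotation strings and the function name `matches_wildcard` are hypothetical.

```python
from fnmatch import fnmatchcase


def matches_wildcard(text: str, pattern: str = "lectin*") -> bool:
    """Return True if any whitespace-separated word in text matches the pattern."""
    return any(fnmatchcase(word, pattern) for word in text.lower().split())


# Hypothetical domain annotations, for illustration only.
annotations = [
    "Lectin domain",
    "Legume lectins, beta chain",
    "Bulb-type lectin domain",
    "Protein kinase domain",
]

hits = [a for a in annotations if matches_wildcard(a)]
for hit in hits:
    print(hit)
```

With this pattern, both "lectin" and "lectins" are retrieved, while an annotation that does not contain a word beginning with "lectin" is skipped.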
In a previously published study, a comparison of 38 plant genomes revealed that subfamilies of lectins show species or clade-specific expansion (Van Holle and Van Damme, 2019). Agaricus bisporus agglutinin (ABA) is present only in the bryophytes (e.g., Marchantia polymorpha and Sphagnum fallax), but the cyanovirin domain is found also in lycophytes, and in ferns. The ABA and cyanovirin domains of fungal origin are absent in the higher plants and their presence in lower plants is attributed to horizontal gene transfer events from fungal ancestors (Van Holle and Van Damme, 2019). The amaranthin domains are only found in vascular plants (lycophytes, ferns, gymnosperms, and angiosperms), but are not ubiquitous. The Euonymus europaeus lectin (EUL) family members are found in most land plants, including the bryophytes (Van Holle and Van Damme, 2019). Interestingly, both stomata and the EUL domain show their first presence in the bryophyte lineage (during evolution), and a member of this family, ArathEULS3, has been shown to play a role in stomatal closure and drought response (Chater et al., 2017;Van Holle and Van Damme, 2019;Van Hove et al., 2015). The Galanthus nivalis agglutinin (GNA), Lysin-motif (LysM), jacalin, ricin-B, legume lectin, and malectin domains are present in all lineages of the tree of life; however, hevein and nictaba lectins are exclusively present in the eukaryotes (Van Holle and Van Damme, 2019). The jacalin and hevein families show noticeable expansion in the bryophytes. Often, lectin domain-containing proteins, for example, S-Domain Receptor-like Kinases (SDRLKs) are encoded by multi-copy genes that show significant tandem duplications in plant genomes (Myburg et al., 2014;Naithani et al., 2021).
Furthermore, as mentioned earlier, some lectin proteins contain one or more lectin domains. For example, the EUL family proteins exclusively contain one or two lectin domains. Both single and double EUL domain proteins are present in the bryophytes through monocot lineages. The Polypodiopsida, gymnosperms, and dicots exclusively harbor single EUL domain architectures (De Schutter et al., 2017; Fouquaert et al., 2009; Van Holle and Van Damme, 2019). However, most lectin proteins have a modular structure: they contain lectin domains in combination with one or more other types of domains. For example, many receptor-like kinases (RLKs) are composed of one or more lectin domains combined with other types of domains, such as glycoside hydrolase (GH), F-box, EGF-like, PAN_APPLE, transmembrane, and kinase domains (Myburg et al., 2014; Naithani et al., 2021). Also, many proteins share structural similarities to RLKs, but lack transmembrane and kinase domains (known as receptor-like proteins, RLPs). Still others have lectin domains with a kinase domain and have attributes of soluble kinases. It is noteworthy that different plant lectin subfamilies show distinct evolutionary trails shaped by full and partial gene duplications that correspond to gain/loss/recombination in protein domains. Altogether, there exists a rich repertoire of lectins with varied modular structures (Eggermont et al., 2017; Naithani et al., 2021; Shiu and Bleecker, 2001; Van Holle and Van Damme, 2019).
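The modular architectures described above (RLKs, RLPs, and soluble kinases) can be told apart from the presence or absence of a few domains. The Python sketch below is deliberately coarse and follows only the wording of this section; the domain names used as set members, the function name, and the returned labels are assumptions for illustration, and real annotation pipelines rely on far richer evidence.

```python
def classify_architecture(domains) -> str:
    """Coarse label from a collection of domain names (hypothetical vocabulary)."""
    domains = set(domains)
    if "lectin" not in domains:
        return "no lectin domain"
    has_tm = "transmembrane" in domains
    has_kinase = "kinase" in domains
    if has_tm and has_kinase:
        return "receptor-like kinase (RLK)"
    if has_kinase:
        # Lectin plus kinase, but no membrane anchor.
        return "soluble kinase"
    if not has_tm:
        # As described in the text: lacks both transmembrane and kinase domains.
        return "receptor-like protein (RLP)"
    return "other lectin-domain protein"


print(classify_architecture({"lectin", "PAN_APPLE", "transmembrane", "kinase"}))
print(classify_architecture({"lectin", "EGF-like"}))
print(classify_architecture({"lectin", "kinase"}))
```

Gain or loss of a single domain moves a protein from one label to another, which mirrors the domain gain/loss/recombination events mentioned in the text.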

Expression of lectin genes and subcellular location of lectin proteins
During the past twenty years, Arabidopsis thaliana has been the model of choice for functional genomics with the aim of elucidating the roles of plant genes. A survey of the Arabidopsis genome suggests the presence of 217 genes that are likely to encode proteins containing one or more lectin domains (Eggermont et al., 2017). We have examined, in detail, the published literature and the data available in various public databases, including Gramene (Tello-Ruiz et al., 2021), EMBL-EBI Gene Expression Atlas (https://www.ebi.ac.uk/gxa/home), UniProt (https://www.uniprot.org), TAIR (http://www.arabidopsis.org) and Gene Expression Omnibus (https://www.ncbi.nlm.nih.gov/geo) to gather information about the A. thaliana lectin gene family. We have found gene expression data for 215 out of 217 Arabidopsis lectin family genes; however, no evidence was found to support the expression of two genes, AT5G60320 and AT4G19800, under normal growth and developmental conditions. Of the 215 expressed genes, 112 genes show ubiquitous expression in all the organs and tissues of Arabidopsis, although they differ in their relative expression across various tissues and developmental stages. The remaining 103 genes show specific expression in one or more tissues or at specific developmental stages. Many lectins that show low ubiquitous expression under normal growth and developmental conditions show up- or down-regulation in response to biotic or abiotic stimuli (see Table 2).
Interestingly, most of the "classical" seed lectins, such as ricin, abrin, ConA, and SBA, that had been studied in detail in the initial phase of lectin research, were later discovered to be constitutively expressed and located in the vacuoles of plant cells (Lannoo and Van Damme, 2010); notably, the transport and accumulation of vacuolar lectins are key to crop growth and development (Nilsson, 2011; Sanmartín et al., 2007). Proteome datasets and in vivo visualization of the green fluorescent protein (GFP)-tagged proteins have provided extensive information about the subcellular locations of several Arabidopsis proteins. Therefore, we searched for the subcellular locations of all 217 Arabidopsis lectin family proteins in the SUBA database (https://suba.live), which serves as the central resource for Arabidopsis protein subcellular location information. We found that 100 out of 217 proteins are expected to reside in the plasma membrane, 54 in the apoplast/extracellular matrix, 41 in the cytosol, and the remaining in various other subcellular locations, including plastids, mitochondria, vacuoles, Golgi apparatus, nuclei, and microtubules. Moreover, only a small fraction of lectin domain-containing proteins (encoded by AT1G52040, AT3G12500, AT3G16400, AT3G16420, AT3G16460, AT3G16470, AT3G21630, AT4G21380, and AT5G03700) have the physico-chemical properties required to reside in the vacuole. A summary of gene expression and subcellular location of 60 lectin family members from Arabidopsis (for which biological function is known) is provided in Table 2.
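The gene counts reported in this section can be cross-checked with simple arithmetic. The short Python sketch below uses only the numbers quoted in the text; the variable names are illustrative, and the count of proteins in "other" locations assumes that each protein is assigned to exactly one primary location, which the text does not state explicitly.

```python
TOTAL_GENES = 217

# Expression: 215 genes with expression evidence, 112 of them ubiquitous.
expressed = 215
ubiquitous = 112
tissue_specific = expressed - ubiquitous
not_detected = TOTAL_GENES - expressed

# Predicted subcellular location (counts quoted from the text).
localization = {
    "plasma membrane": 100,
    "apoplast/extracellular matrix": 54,
    "cytosol": 41,
}
# Assumption: one primary location per protein.
other_locations = TOTAL_GENES - sum(localization.values())

print(f"tissue- or stage-specific genes: {tissue_specific}")
print(f"genes without expression evidence: {not_detected}")
for place, count in localization.items():
    print(f"{place}: {count} ({count / TOTAL_GENES:.1%})")
print(f"other locations: {other_locations} ({other_locations / TOTAL_GENES:.1%})")
```

Under that assumption, 22 proteins remain for plastids, mitochondria, vacuoles, Golgi apparatus, nuclei, and microtubules combined.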
The availability of whole-genome and transcriptome sequences, proteomes, and a range of algorithms now allows us to estimate the subcellular locations of lectins in plant cells, tissues, and organs. Large-scale analyses of lectins from many crops, including rice, maize, wheat, soybean, and flax, suggest that lectin family genes are expressed in all tissues and show a change in their expression in response to intrinsic developmental signals, as well as in response to biotic and abiotic stress conditions (Huang et al., 2013; De Schutter et al., 2017; Naithani et al., 2021; Petrova et al., 2021). Information on the subcellular location of the lectins encoded by major crop genomes is available at the compendium of crop Proteins with Annotated Locations (cropPAL) (https://crop-pal.org). Overall, lectin domain-containing proteins are found in almost all subcellular compartments of plant cells, including cytosol, plasma membrane, extracellular matrix, vacuoles, plastids, and mitochondria.

Roles of lectins in plant development, cell signaling, and stress response
In the early phase of research, lectins were isolated mostly from storage bodies, tubers, and seeds; the vacuolar seed lectins are one example (Lannoo and Van Damme, 2010). Thus, it was suggested that lectins might function as storage proteins with limited potential for participation in major biochemical and cellular processes. The only other role of lectins known then was their insecticidal or antimicrobial activity (see Macedo et al., 2015); the first of these reports appeared in 1976 when Irwin Liener and his group (Janzen et al., 1976) showed that black bean lectins were insecticidal against bruchid beetles. Several plant lectins have since been shown to have insecticidal and antimicrobial properties.
However, in recent decades, lectins have also been found to be localized in different subcellular compartments of various plant tissues, including leaves, stems, stem sap, roots, flowers, and fruits (see Table 2), and thus, suggest their wider roles in plants. Moreover, as discussed earlier, most lectin proteins have multiple domains besides the  (Pagnussat et al., 2005) and in Pi starvation response. Chitinase related agglutinin (CRA) AT4G19810 ubiquitous expression; high expression in roots and leaves; up-regulated in response to ABA, JA, NaCl, and to fungal and bacterial pathogens, and flagellin. The mRNA is mobile from cell-to-cell.

extracellular, apoplastic fluid
Chitinase C (ChiC) is a Class V chitinase of the glycoside hydrolase family 18. It acts as an exochitinase (predominantly cleaves a chitobiose (GlcNAc) 2 residue from the non-reducing end of a chitin oligosaccharide).

Legume-lectins (Concanavalin A-like lectin protein kinase family protein) AT1G15530
ubiquitous expression plasma membrane L-TYPE LECTIN RECEPTOR KINASE S.1 (LECRK-S.1) is involved in resistance response to the pathogenic oomycetes Phytophthora infestans and Phytophthora capsici and to the pathogenic bacteria Pseudomonas syringae . AT2G29250 roots, and seeds plasma membrane L-TYPE LECTIN RECEPTOR KINASE III.2 (LECRK-III.2) is involved in resistance response to the pathogenic oomycetes P. infestans and P. capsici . AT2G43700 ubiquitous expression plasma membrane It is involved in resistance response to the pathogenic oomycetes P. infestans and P. capsici and the pathogenic bacterium P. syringae . AT3G09190 ubiquitous expression extracellular matrix It regulates flowering time in Ws ecotype (Pouteau et al., 2008). AT3G15350 ubiquitous expression extracellular This is a class GT14 glycosyltransferase protein involved in the breakdown of lignocellulosic biomass into simple sugars/ saccharification (Ohtani et al., 2017). It is upregulated in response to ozone treatment (Xu et al., 2015) and after infection with Alternaria brassicicola ( Mukherjee et al., 2010 (Wan et al., 2008). It is also involved in resistance response to the pathogenic bacterium P. syringae . AT3G55550 expression in flowers, roots, and seeds plasma membrane L-TYPE LECTIN RECEPTOR KINASE S.4 (LECRK-S.4) is involved in resistance response to P. infestans, P. capsici and P. syringae . AT3G59700 ubiquitous expression; upregulated after infection with Fusarrium oxysporum (Zhu et al., 2013) extracellular matrix L-TYPE LECTIN RECEPTOR KINASE V.5 (LECRK-V.5/ LECRK1) confers resistance to P. infestans and P. capsici, but susceptibility to P. syringae . AT3G59740 expression in flowers, fruits, hypocotyls, callus, and roots plasma membrane L-TYPE LECTIN RECEPTOR KINASE V.7 (LECRK-V.7) is involved in resistance response to P. infestans, P. capsici, and P. syringae    . 
AT5G03350 ubiquitous expression extracellular; plastid SA-INDUCED LEGUME LECTIN-LIKE PROTEIN 1 (SAI-LLP1) plays a positive role in the SA-mediated effectortriggered immunity, but it is not involved in the autophagy process. Further, it promotes systemic acquired resistance rather than local immunity against P. syringae pv. tomato Avr-Rpm1 (Breitenbach et al., 2014). It may act in parallel with SA (Blanco et al., 2009;Breitenbach et al., 2014;Armijo et al., 2013). AT5G10530 expression data not found plasma membrane L-TYPE LECTIN RECEPTOR KINASE IX.1, (LECRK-IX.1) may promote hydrogen peroxide (H 2 O 2 ) production and cell death in response to the pathogenic oomycetes P. infestans and P. capsici (Wang et al., , 2015. AT5G42120 flowers, (carpels, stamens), seeds (endosperm, embryo), callus, roots, hypocotyls plasma membrane L-TYPE LECTIN RECEPTOR KINASE S.6 (LECRK-S.6) is involved in resistance response to the pathogenic oomycetes Phytophthora infestans, Phytophthora capsici, and to the pathogenic bacterium Pseudomonas syringae . AT5G55830 ubiquitous expression plasma membrane L-TYPE LECTIN RECEPTOR KINASE S.7 (LECRK-S.7) is involved in resistance response to the pathogenic oomycetes P. infestans and P. capsici . AT5G60270 ubiquitous expression plasma membrane L-TYPE LECTIN RECEPTOR KINASE I.7 (LECRK-I.7) is involved in resistance response to the pathogenic oomycetes P. infestans and P. capsici . AT5G60280 ubiquitous expression extracellular matrix L-TYPE LECTIN RECEPTOR KINASE I.8 (LECRK-I.8) binds NAD + and induces expression of disease resistance genes involved in resistance response to the pathogenic fungus A. brassicicola . 
AT5G60300 ubiquitous expression plasma membrane DOES NOT RESPOND TO NUCLEOTIDES 1 (DORN1)/ L-TYPE LECTIN RECEPTOR KINASE I.9 (LECRK-I.9) is a Receptor like Kinase which binds ATP with high affinity through its extracellular legume-lectin-like region leading to activation of intracellular calcium response and mitogen-activated protein kinase 3 and 6. It is likely to play a role in cell wall-plasma membrane adhesion, and resistance to the pathogenic oomycetes P. infestans and P. capsici (Bouwmeester et al., 2011Gouget et al., 2006;Wang et al., 2014Wang et al., , 2016Choi et al., 2014aChoi et al., , 2014b. AT5G65600 ubiquitous expression plasma membrane L-TYPE LECTIN RECEPTOR KINASE IX.2 (LECRK-IX.2) promotes H 2 O 2 production and programmed cell death. It is also involved in resistance response to P. infestans and P. capsici (Wang et al, , 2015(Wang et al, , 2016. Lysin motif (LysM) AT1G21880 callus, carpels, cotyledons, cultured plant cells, flowers, flower pedicels, hypocotyls, inflorescence meristem, leaves, leaf apex, leaf lamina base, petals, petioles, embryo, pollens, roots, seeds, sepals, shoots, and stamens extracellular; plasma membrane LYSM DOMAIN GPI-ANCHORED PROTEIN 1 (LYM1)/ LYSM-CONTAINING RECEPTOR PROTEIN 2 (LYP2). This is an ortholog of OsLYP4 and OsLYP6 and contains a C-terminal GPI anchor signal. It serves as a cell surface receptor for peptidoglycan elicitor signaling leading to innate immunity (Willmann et al., 2011). Induction of chitin-responsive genes by chitin treatment is not blocked in the mutant. AT1G51940 ubiquitous high expression plasma membrane LYSM-CONTAINING RECEPTOR-LIKE KINASE (LYK3) may recognize microbe-derived N-NAG-containing ligands, but it is not a major contributor to chitin signaling (Shinya et al., 2012). AT1G55000 ubiquitous high expression plasma membrane LysM domain-containing protein; component of SCF (ASK-cullin-F-box) E3 ubiquitin ligase complexes. It is involved in protein ubiquitination. 
AT1G77630 | ubiquitous high expression | plasma membrane | LYSM-CONTAINING RECEPTOR PROTEIN 3 (LYM3/LYP3) is a homolog of CERK1, but it is not a major contributor to chitin signaling (Shinya et al., 2012); it is likely to play a role in peptidoglycan sensing and immunity to bacterial infection.
AT2G17120 | ubiquitous high expression | plasma membrane | LYSM DOMAIN GPI-ANCHORED PROTEIN 2 (LYM2/LYP1) is a homolog of CERK1. It is predicted to have a minor role in the perception of the chitin oligosaccharide elicitor (Breitenbach et al., 2014). However, in a mutant of this gene, the induction of
is predicted to recognize microbe-derived N-NAG-containing ligands and to play a role in chitin elicitor signaling and/or in detecting PGNs during bacterial growth and leading to innate immunity. It is involved in resistance to the pathogenic fungus A. brassicicola and the bacterial pathogen P. syringae pv. tomato DC3000 (Wan et al., 2012).
AT2G33580 | ubiquitous high expression | plasma membrane | LYSM-CONTAINING RECEPTOR-LIKE KINASE 5 (LYK5) is predicted to recognize microbe-derived N-acetylglucosamine (NAG)-containing ligands.
AT3G01840 | ubiquitous expression | plasma membrane | LYSM-CONTAINING RECEPTOR-LIKE KINASE 2 (LYK2) recognizes microbe-derived N-NAG-containing ligands (predicted by similarity).
AT3G21630 | ubiquitous high expression | plasma membrane; vacuole | CHITIN ELICITOR RECEPTOR KINASE 1 (CERK1) receptor kinase recognizes microbe-derived N-NAG-containing ligands (e.g., chitin from the fungal cell wall and PGNs from the bacterial cell wall), and transduces signaling leading to innate immunity (Shinya et al., 2014).
Galanthus nivalis agglutinin (GNA)
AT1G11300 | ubiquitous expression; it is activated by mannitol produced by pathogens such as fungi and contributes to plant defense response whenever mannitol is present (Trontin et al., 2014) | plasma membrane | ENHANCED SHOOT GROWTH UNDER MANNITOL STRESS 1 (EGM1) is likely to contribute to plant defense response against mannitol-producing pathogens (Trontin et al., 2014).
AT1G11305 | ubiquitous expression; upregulated by mannitol produced by pathogens such as fungi | plasma membrane; extracellular matrix | ENHANCED SHOOT GROWTH UNDER MANNITOL STRESS 2 (EGM2) is likely to contribute to plant defense response whenever mannitol is present (Trontin et al., 2014).
AT1G11350 | leaves, cotyledons, flowers, hypocotyls, inflorescence meristems, petals, petioles, embryo, seeds, sepals, shoots, stamen, and stem; guard cells | plasma membrane | Calmodulin-binding receptor-like protein kinase (CBRLK1) functions as a negative regulator in plant defense responses. It binds calmodulin and then gets activated. CBRLK1 mutant and CBRLK1-overexpressing transgenic plants show enhanced and reduced resistance against a virulent bacterial pathogen, respectively, and this is correlated with increased and reduced induction of the pathogenesis-related gene PR1.
AT1G61380 | ubiquitous expression | plasma membrane | Lectin S-domain receptor kinase 29 (SD1-29) recognizes lipopolysaccharide (LPS) present in the cell walls of Pseudomonas and Xanthomonas species, and triggers innate immunity. Loss-of-function mutants are hypersusceptible to P. syringae (Ranf et al., 2015).
AT1G65790 | primarily expressed in the leaves, and roots; low expression in stems and flower buds; induced by wounding and Xanthomonas campestris pv. campestris | microtubules: GFP-tagging has shown that ARK1 is localized in the growing ends of microtubules in the cortex of the subapical region of growing root hairs | ARMADILLO-REPEAT KINESIN1 (ARK1) and its homologs ARK2 and ARK3 are plant-specific kinesin microtubule motor proteins involved in the cellular expansion, differentiation, and in determining the growth and polarity of the roots. It may be involved in the ABA-mediated signaling during germination (Tobias and Nasrallah, 1996; Samuel et al., 2008; Sun et al., 2020).
AT3G16030 | ubiquitous expression | plasma membrane | CALLUS EXPRESSION OF RBCS 101 (CES101)/RESISTANCE TO FUSARIUM OXYSPORUM 3 (RFO3) is a receptor-like kinase. It promotes the expression of genes involved in photosynthesis, at least in dedifferentiated calli (Niwa et al., 2006). Also, it may play a role in innate immunity.

carbohydrate-binding lectin domain(s), and accordingly they have diverse biological activities. We note here that lectin domains also interact with entities other than the sugars. For example, lectin domain-containing S-locus receptor kinase from Brassica, located in the plasma membrane of stigma epidermis cells, binds to a small S-locus cysteine-rich (SCR) peptide present on the pollen surface and their productive interaction activates the self-incompatibility response (Kachroo et al., 2001; Naithani et al., 2007, 2013). Other RLKs interact with RLPs, plant growth hormones, and cell wall components (lipopolysaccharides or chitin) of the microbes, and they may play roles in diverse processes. More specifically, lectin domain-containing proteins are involved in the negative regulation of abscisic acid (ABA) response during seed germination (Eggermont et al., 2017; Xin et al., 2009), cell wall development and expansion (Petrova et al., 2021), ABA-induced stomatal movement (Van Hove et al., 2015), pollen development and male sterility (Peng et al., 2020; Wan et al., 2008), pistil development and self-incompatibility response (Kachroo et al., 2001; Naithani et al., 2007; Rüdiger and Gabius, 2001; Tantikanjana et al., 2009). See Table 2 for different functions of Arabidopsis lectins.
In rice, lectin family members have been shown to be involved in leaf senescence, early crown root development, shaping panicle architecture, determining yield, abiotic stress tolerance, and immunity (Chen et al., 2006, 2013; De Schutter et al., 2017; Fan et al., 2018; Pan et al., 2020; Zou et al., 2015). In addition, lectin proteins present in the roots of the legumes interact with the lipo-chitooligosaccharides in Nod factors produced by rhizobia, playing critical roles in establishing symbiotic interactions with rhizobia (Cullimore et al., 2001; De Hoff et al., 2009). Notably, constitutively expressed lectins present in the vegetative tissues of plants have been shown to be involved in mRNA transport across long distances within the plant (Pallas and Gómez, 2013). Others, such as the Gal-specific thylakoid lectin, from cabbage leaves, have been reported to provide cryoprotection against freeze-thaw damage (Hincha et al., 1993; Sieg et al., 1996). A stromal thylakoid membrane lectin of Triticale was shown to be associated with the Photosystem-I chlorophyll molecules (Aleksidze et al., 2002). A similar lectin was reported from the unicellular alga Dunaliella salina, and was shown to be associated with its light-harvesting complex. Further, stromal thylakoid membrane lectins provide ligands to ribulose bisphosphate carboxylase oxygenase (Rubisco) (Kovalchuk et al., 2012).
Overall, evidence from Arabidopsis and several crops suggests that lectin domain-containing proteins play important roles in mechanisms associated with plant development, reproduction, abiotic stress tolerance, and pathogen resistance (De Hoff et al., 2009; Rüdiger and Gabius, 2001; Van Holle and Van Damme, 2019). Here, we have taken a biocuration approach to obtain information about the expression of the lectin gene family, the subcellular distribution of lectins, and their function(s).

The endogenous plant lectin cycle
It is well known that atmospheric CO2, as well as the CO2 present within the surrounding plant tissues, reaches the chloroplast stroma, the site of the Calvin-Benson-Bassham cycle (Sharkey, 2019), where it is converted into glucose (Nonomura et al., 2017). A large fraction of the photosynthate is then transported into vacuoles, major multifunctional organelles of the plant cell, where it may be modulated for later use, for example, in abiotic stress response (Marty, 1999; Pertl-Obermeyer et al., 2016). However, as soluble carbohydrates accumulate, osmoregulation becomes important. Hence, one process for the storage of these sugars in plants is to convert them to polysaccharides, such as starch in the chloroplasts (Shevela et al., 2018). When the plant requires sugars for energy, starch is enzymatically degraded and utilized as needed. Although stored polysaccharides may be used to meet the long-term energy needs of a plant, significant energy is invested in the enzymatic reactions that involve either chain lengthening (polysaccharide biosynthesis) or the cleavage of a monosaccharide (Nonomura et al., 2018a). An alternative strategy to deal with the problem of accumulation of soluble carbohydrates is to reversibly bind sugars to proteins and render them osmotically inert. Lectins are the macromolecules of choice for this function in plant cells (Nonomura et al., 2020). The key feature of this process is that the lectin-bound sugars are readily available for use, as they are reversibly bound. Whether the carbohydrate remains bound to the protein or is released depends on the direction in which the equilibrium shifts. Protein-carbohydrate interactions can be altered in many ways in the laboratory, such as by varying temperature, pH, or the concentration of the interacting molecules.
In the case of endogenous lectin-carbohydrate interactions within the plant, a high concentration of free carbohydrates formed during photosynthesis may shift the equilibrium toward glycosylation of lectins. On the other hand, as the concentration of the carbohydrate falls (for example, as a result of consumption of sugars in respiration), bound sugars would be displaced. The sugars thus freed from the lectin-carbohydrate complex are osmotically active.
Increasing concentrations of competing sugars that are not catabolized in plants and, in particular, those having comparable affinities for the lectins would promote the release of the bound sugar by chemically outcompeting the bound carbohydrate (Nonomura et al., 2020). Thus, the endogenous plant lectin cycle depends on such a mode of binding and release of glucose (Glc, or other active carbohydrate metabolites) to maintain a reasonably steady level of sugar for immediate energy needs (Nonomura et al., 2020). This catch and release mechanism of sugars from lectins, by a process that is strictly a matter of chemical competition and involves reversible binding, is cost-effective for plants (Dam et al., 2000).
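The binding and release described above can be illustrated with a standard single-site competitive binding model. The sketch below is only an illustration under stated assumptions and is not a model taken from the cited studies: it assumes one class of identical, independent binding sites, free ligand concentrations that are approximately equal to total concentrations, and hypothetical dissociation constants and concentrations (the function name glucose_occupancy and all numerical values are ours, chosen for illustration).

```python
def glucose_occupancy(glc_mM, kd_glc_mM, competitor_mM=0.0, kd_competitor_mM=1.0):
    """Equilibrium fraction of lectin sites occupied by glucose.

    Standard single-site competitive binding isotherm:
        theta_Glc = ([Glc]/Kd_Glc) / (1 + [Glc]/Kd_Glc + [C]/Kd_C)
    Assumes free ligand concentrations are close to total concentrations.
    """
    glc_term = glc_mM / kd_glc_mM
    competitor_term = competitor_mM / kd_competitor_mM
    return glc_term / (1.0 + glc_term + competitor_term)


if __name__ == "__main__":
    KD_GLC_MM = 1.0  # hypothetical dissociation constant for glucose
    for glc in (0.1, 1.0, 10.0):  # hypothetical glucose concentrations (mM)
        alone = glucose_occupancy(glc, KD_GLC_MM)
        # Competing sugar with an affinity comparable to that of glucose
        competed = glucose_occupancy(
            glc, KD_GLC_MM, competitor_mM=5.0, kd_competitor_mM=1.0
        )
        print(f"Glc {glc:5.1f} mM: bound fraction {alone:.2f} "
              f"(no competitor), {competed:.2f} (with competitor)")
```

In this simplified picture, raising the glucose concentration increases the bound fraction, whereas adding a competing sugar of comparable affinity lowers it at every glucose concentration, which corresponds to the release of bound sugar by chemical competition.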

Application of induced glycoregulation (lectin bypass) for improving crop productivity
One of the requirements for the endogenous plant lectin cycle to work is that the competing sugar must have a similar range of affinity for the lectin as Glc itself. This is because, in the presence of a high-affinity competing carbohydrate, Glc may be unable to compete for the carbohydrate-binding site of the lectin even when its concentration builds. Thus, while a fraction of the carbohydrate metabolites might then get stored for later use, a significant amount would be channeled toward the growth and productivity of the plant (Nonomura et al., 2020). Exogenous α-D-mannopyranosides appear to do just that, i.e., they bind tightly to the lectins in the treated crops and, thereby, suppress the endogenous lectin-carbohydrate interaction. This bypass of the plant lectin cycle leads to increased levels of soluble sugars that are then transported to all parts of the plant (Nonomura et al., 2018b). The surplus of photosynthates (sugars) in plants treated with exogenous α-D-mannopyranosides may then be directed to other metabolic reactions, some of which can contribute towards fruit flavor, stress resistance, and growth of the plant (Nonomura et al., 2020). Support for the binding of plant lectins to α-D-mannopyranosides came from a series of experiments involving labeled carbon atoms (Gout et al., 2000; Biel et al., 2010), by which it was determined that the labeled molecule was not catabolized and that a fraction of it was protein-bound. Because a sugar-binding protein was suspected to play some role in this process, the effects of methyl-β-D-glucoside (MeβGlc) and its α-anomer MeαGlc were examined in plants of the Brassicaceae family (Nonomura and Benson, 2013), showing that not only MeαGlc but also α-D-trimannopyranoside are better inducers of seed growth than MeβGlc. Also, adding MeβGlc to solutions of MeαGlc reduces the potency of the latter (Benson et al., 2009).
A key finding was that the enhancement of growth by these carbohydrates required the presence of Ca2+ and Mn2+ (Nonomura and Benson, 2014), which is a typical property of many lectins, particularly those from the legume lectin family, such as ConA. Indeed, Man-binding lectins are widely distributed in plants (Davidson and Stewart, 2004), and the observed growth enhancement by α-D-mannopyranosides was also found to occur in many other C3 plants, as well as in CAM and C4 plants (Nonomura and Benson, 2013). A consideration of these findings led to the hypothesis that Man-binding lectins may be involved in growth enhancement in response to α-D-mannopyranosides (Nonomura and Benson, 2014). As many classical seed lectins are localized in the vacuoles (which also serve as major sites for carbohydrate storage within the plant cell), the modulation of glycoregulation was proposed to be mediated in vacuoles, working in conjunction with the cell walls (Nonomura et al., 2020); however, the specific lectins involved in this bypass remain to be identified.
We note that treatment with indoleacetic acid (IAA) and kinetin had earlier been shown to enhance the beneficial effects of MeβGlc (Benson et al., 2009). Moreover, promotion of growth has been observed after the application of relatively high (mM) doses of sugar-conjugated plant growth regulators (SPGRs) (Nonomura et al., 2011), and these findings are consistent with the occurrence of high concentrations of SPGRs in plants and with the enzymes known to be involved in the sugar-linking process (Jakubowska and Kowalczyk, 2004). The above-described experiments have provided clues to the involvement of signaling mechanisms. Future experiments that combine α-D-mannopyranosides, other binding substrates, and signaling components with lectin mutant strains of plants are expected to provide new information on glycoregulation and its application in agriculture.

Concluding remarks
Lectins, the readers of the sugar code in living organisms, are highly diverse. They are distributed across all kingdoms of life. Genome sequences of photoautotrophs have revealed that lectin proteins generally have a modular structure, often consisting of one or more lectin domains in combination with other conserved protein domains. The structural diversity of lectins is now the basis for their most widely used classification system. In addition to binding sugar molecules, lectin domain-containing proteins recognize other specific ligands (e.g., small peptides, carbohydrates, lipids, small molecules, and ATP) that are generated within the plants during various developmental stages and the differentiation of organs, tissues, and cells, or that are produced by other biotic or abiotic components present in the external environment. In fact, hundreds of lectin domain-containing proteins may be present within a plant species, and these differ in their carbohydrate-binding specificities and binding affinities (suited to their function at a given cellular location or tissue).
In more recent decades, analyses of whole genomes and transcriptomes of hundreds of plants, algae, and photosynthetic bacteria have revealed that lectin proteins are ubiquitous and show clade-specific evolutionary expansions. Besides their ability to bind sugars reversibly, lectin domain-containing proteins can form dimers and interact with diverse signaling molecules. We now have much information about the varied expression profiles of lectin family members across various tissues, developmental stages, and in response to pathogens and abiotic stresses, suggesting their wider role in plant function and plant survival. Indeed, experimental studies involving mutants and transgenic plants expressing lectin family proteins have shown that they can impart pathogen resistance and tolerance to drought, salinity, heat, and submergence. However, much remains to be studied about their exact roles, their ligands, and other interactors. Thus, further manipulation of candidate genes of the lectin family associated with important agronomic traits is likely to pave the way for breeding climate-resilient crops for the future.
The highly specific binding of lectins with sugars provides an efficient mechanism for carbohydrate storage and its slow release as needed in different tissues (particularly in storage organs) while maintaining the osmotic balance within the cell and its various subcellular compartments. Thus, lectins also participate in osmoregulation/glycoregulation via the endogenous lectin cycle, which is important for maintaining cellular homeostasis. We note that much remains to be understood about the plant lectin cycle and its bypass. For instance, we do not yet know which and how many lectins participate in this process, or whether a single carbohydrate metabolite, a group of metabolites, SPGRs, or PGRs are reversibly bound to the lectins. Some information on the carbohydrate specificities and affinities is available; for example, α-D-mannopyranosides are preferred to the β-anomers, and trimannosides are preferred to the monomers. Yet much remains to be elucidated regarding the nature of lectin-carbohydrate interactions. Thus, future investigations must involve studying the effect of other carbohydrate ligands and their conjugates with plant growth regulators, such as IBA and kinetin. In the context of agricultural productivity, it remains to be determined whether breeding for the expression of specific lectins is feasible. Clearly, sugars from the aforementioned bypass go to growth points, flowers, and fruits. Hence, targeting lectins to manage photosynthesis in the field will certainly push genetic, cytological, and organismal investigations in the near future, with the goal of increasing crop productivity.

Author contributions
SSK, AN, SN, and GG contributed to manuscript writing, review, and approval of the final version. SSK, AN, and SN prepared the figures and the tables. SSK summarized the history of lectin research, structural-functional interactions between the lectins and the carbohydrates, and different carbohydrate specificities within a single family of lectins. AN and GG summarized the endogenous plant lectin cycle, the role of lectins in the storage of carbohydrates within plants, and the induced bypass leading to glycoregulation of lectins for improving crop productivity. SN analyzed the evolutionary expansion of lectin family genes and reviewed the current information about gene expression, subcellular location of the proteins and function of lectin family members from the model plant Arabidopsis, and the role of lectins in cell signaling.

Declaration of Competing Interest
The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.