The functional diversity of structural disorder in plant proteins

https://doi.org/10.1016/j.abb.2019.108229

Abstract

Structural disorder in proteins is a widespread feature distributed in all domains of life, particularly abundant in eukaryotes, including plants. In these organisms, intrinsically disordered proteins (IDPs) perform a diversity of functions, participating as integrators of signaling networks, in transcriptional and post-transcriptional regulation, in metabolic control, in stress responses and in the formation of biomolecular condensates by liquid-liquid phase separation. Their roles impact the perception, propagation and control of various developmental and environmental cues, as well as the plant defense against abiotic and biotic adverse conditions. In this review, we focus on primary processes to exhibit a broad perspective of the relevance of IDPs in plant cell functions. The information here might help to incorporate this knowledge into a more dynamic view of plant cells, as well as open more questions and promote new ideas for a better understanding of plant life.

Introduction

One cannot neglect that cells are complex and dynamic biological systems, whose interior, including that of their organelles, constitutes a highly crowded environment with a viscosity many times higher than that of water, and where generation of energy, gene expression, DNA replication, protein synthesis and many other essential processes take place. Moreover, cells in all organisms are subjected to a myriad of environmental conditions dependent on their type and/or function, which in turn is modulated by the developmental stage of the organism and by the multiple changes occurring in their surroundings, under either healthy or pathogenic situations. Hence, the characteristics of the cytoplasm are not only determined by factors such as pH, water availability, osmotic pressure, and ionic strength, but also by macromolecular crowding, given the high concentration of macromolecules in it [1,2]. All these elements affect the structure and function of individual proteins and nucleic acids, as well as the formation of macromolecular complexes and the structure/organization of the cytoplasm itself [3]. In addition to these considerations, we should not forget that biological macromolecules have evolved to be functional in this kind of environment, far from those conditions where most properties and action mechanisms of these macromolecules have been studied.

In this crowded cellular environment, proteins with globular or disordered structures, or hybrids showing ordered and disordered regions, have evolved. Most of the best characterized proteins show an ordered, globular or stable structure, whereas disordered proteins or regions, often referred to as intrinsically disordered proteins (IDPs) or intrinsically disordered regions (IDRs), are proteins or domains possessing high structural flexibility, able to present a range of functional conformational states as a consequence of a biased amino acid composition, where charged and hydrophilic amino acid residues are abundant compared to bulky hydrophobic ones and to cysteines, which are usually underrepresented [4,5].
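
The compositional bias described above can be illustrated with a small calculation. The following is a minimal sketch, not a disorder predictor: the residue groupings (`CHARGED`, `HYDROPHILIC`, `BULKY_HYDROPHOBIC`) and the function name `composition_profile` are illustrative assumptions, not definitions taken from the cited works.

```python
from collections import Counter

# Illustrative residue classes (assumptions made for this sketch only).
CHARGED = set("DEKR")
HYDROPHILIC = set("STNQ")            # polar, uncharged
BULKY_HYDROPHOBIC = set("ILVFWYM")


def composition_profile(sequence: str) -> dict:
    """Return the fraction of residues in each illustrative class."""
    seq = sequence.upper()
    if not seq:
        raise ValueError("sequence must not be empty")
    counts = Counter(seq)
    n = len(seq)

    def fraction(group):
        return sum(counts[aa] for aa in group) / n

    return {
        "charged": fraction(CHARGED),
        "hydrophilic": fraction(HYDROPHILIC),
        "bulky_hydrophobic": fraction(BULKY_HYDROPHOBIC),
        "cysteine": counts["C"] / n,
    }
```

A sequence with high "charged" and "hydrophilic" fractions and low "bulky_hydrophobic" and "cysteine" fractions would match the bias described in the text, although composition alone does not establish disorder.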

IDPs are widely distributed in all domains of life, and they are more represented in eukaryotic than in prokaryotic proteomes, suggesting that IDP and IDR abundance is related to organismal complexity [6,7]. Furthermore, although based on limited information, it has been suggested that structural protein disorder may be associated with the ability of some organisms to adjust to changing environments [8–10]. Among eukaryotes, plants represent complex biological systems with an extraordinary diversity of resources to deal with fluctuating environments given their sessile nature. This competence immediately suggests the existence of intricate networks of interaction between proteins, allowing a fast and efficient integration of essential processes with environmental cues. The necessity of such complex coordination points to the structural flexibility of IDPs, which enables them to participate in many combinatorial interactions that could adapt to the variations in the cellular milieu caused by developmental or environmental conditions. In this review, we focus on essential processes where the role of IDPs and IDRs represents an example of the relevance of their inherent plasticity for plant life.

The highly flexible nature of IDPs and IDRs allows them to present regions with particular transient structures, which constitute a central feature to target more than one partner; interestingly, some of these regions contain conserved sequence motifs [11]. The structural plasticity of these proteins, a consequence of their physicochemical properties, also permits a high interaction with their surroundings, enabling them to sense physical and chemical changes occurring in their vicinity, a perception that is exhibited through modifications of their structural conformation. This property leads to the concept of ‘fuzziness’, which refers to the formation of protein interactions where IDPs or IDRs exhibit different degrees of disorder upon binding [12,13]. However, because of the differences between IDP/IDR amino acid sequences, and between those of their interacting proteins, functional associations with their targets can occur through diverse mechanisms, such as conformational selection, but also via coupled folding and binding, a combination of both, or any other unknown mechanism [14]. It is important to consider that the potential of IDPs/IDRs for adopting a continuum of conformational states and transitions makes it complicated to establish the conformational space that particular IDPs/IDRs trace to reach the conformation(s) that will allow specific functions in a particular environment [15–17].

Although IDPs/IDRs have sometimes been referred to as sticky proteins, their versatility of interaction is far from promiscuity, as these proteins show selectivity, even when the association affinity could be low. Some of their weak interactions are related to order-to-disorder transitions of short regions, occurring during their association with their partner(s), allowing reversibility, which is essential in many regulatory processes, such as transcription and translation [16,18]. Bearing in mind that post-translational modifications affect the physicochemical properties of proteins, and therefore modulate their function, the structural flexibility of IDPs/IDRs offers the advantage of exposing one or multiple combinatorial modification sites, leading to a variety of kinetics and affinities, depending on the cellular stage and/or internal environment [18,19].

Our knowledge on the molecular mechanisms involved in signal transduction networks for different processes has greatly increased, but in recent years the evidence regarding the contribution of protein structural disorder to various signaling systems has shown its relevance and impact on perception, propagation and control of a diversity of developmental and environmental cues [16].

Because of the resourcefulness of IDPs/IDRs, it is not surprising that many of them may act as hubs in signaling networks able to interact with different and diverse partners, and as organizers in the formation of higher order complexes. In fact, evidence for this kind of function has been described for some of them, and it is just a matter of time to find out that protein disorder will be an inherent characteristic of protein networks [20,21]. Most of the examples of these functions have been described for animal proteins, and still very little is known regarding plant IDPs/IDRs [22].

The majority of information regarding plant IDPs/IDRs comes from the model plant Arabidopsis thaliana. Considering as IDPs those proteins that contain more than 50% disordered residues, genome-wide analysis indicates that its proteome contains approximately 30% IDPs, and that 51% of the proteins contain at least one IDR. According to this analysis, the functional categories where protein disorder is best represented are signaling, development, cell cycle and stress response [7]. Biological processes where transient interactions with multiple partners are involved seem to be highly represented, together with those implicated in perception of external stimuli and with their respective responses. It is remarkable that proteins associated with perception and signaling of light quality, as well as those involved in abiotic stress signaling and protein folding, present a high abundance of structural disorder [7,23,24].
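
The criterion mentioned above (a protein counted as an IDP when more than 50% of its residues are disordered) can be sketched as follows. This is a minimal illustration that assumes per-residue disorder scores between 0 and 1 are already available from some predictor. The score cutoff of 0.5 and the minimum IDR length of 30 residues are hypothetical parameters chosen for the example; the text does not state how the cited analysis defined them.

```python
def disorder_fraction(scores, cutoff=0.5):
    """Fraction of residues whose score is at or above the cutoff."""
    if not scores:
        raise ValueError("scores must not be empty")
    return sum(s >= cutoff for s in scores) / len(scores)


def longest_disordered_run(scores, cutoff=0.5):
    """Length of the longest contiguous stretch of disordered residues."""
    best = run = 0
    for s in scores:
        run = run + 1 if s >= cutoff else 0
        best = max(best, run)
    return best


def classify(scores, cutoff=0.5, idp_fraction=0.5, min_idr_len=30):
    """Label a protein as IDP and/or IDR-containing.

    idp_fraction follows the >50% criterion quoted in the text;
    cutoff and min_idr_len are assumptions made for this sketch.
    """
    fraction = disorder_fraction(scores, cutoff)
    return {
        "fraction": fraction,
        "is_idp": fraction > idp_fraction,
        "has_idr": longest_disordered_run(scores, cutoff) >= min_idr_len,
    }
```

Under these assumptions, a 100-residue protein with a single 40-residue disordered tail would be counted as IDR-containing but not as an IDP, which is why the two proteome percentages quoted above differ.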

Structural disorder plays a relevant role in the function of some proteins involved in transcriptional regulation, particularly eukaryotic transcription factors (TFs). This observation has led to the proposal that structural flexibility might be a widespread and relevant feature in TFs of different eukaryotic organisms (82–94% of TFs contain IDRs) [23,25–27]. Prediction analyses of disorder in TFs indicate that the presence of IDRs is particularly significant in these proteins; notably, disorder is highly abundant in their activation domains, which in turn are involved in the interaction with their partners (co-activators or co-repressors, among others). IDRs have also been found in those TF regions that participate in DNA recognition, facilitating movement along the DNA and modulating selectivity [26,28]. Similar structural features have been found in plant TFs by examination of the Arabidopsis proteome [9,29].

Among the best characterized plant proteins containing IDRs with recognized roles in transcriptional regulation and signaling are those known as GRAS [GAI (GIBBERELLIC ACID INSENSITIVE), RGA (REPRESSOR OF GAI), and SCR (SCARECROW)] proteins [30]. Typically, the GRAS protein family presents a variable amino-terminal region and a highly conserved carboxy-terminal segment, designated as the GRAS domain, that defines the family identity [31]. It has been shown that the amino-terminal domain is intrinsically disordered, a feature first discovered in a GAI protein of the DELLA (Asp-Glu-Leu-Leu-Ala) subfamily, and widespread in the entire GRAS protein family [32]. DELLA proteins, one of the best characterized sets in the GRAS family, are a paradigmatic example of the broad role of this protein family in plant life. DELLA proteins act as key negative regulators of gibberellic acid (GA) responses, such as those related to plant growth, where these proteins specifically interact with various partners, associations that result in crucial events for plant growth and development (e.g. seed germination, stem elongation, transition to flowering, etc.) [33]. Notably, DELLAs interact with PIFs (PHYTOCHROME INTERACTING FACTORs), which are bHLH (basic HELIX-LOOP-HELIX) TFs (PIF3 and PIF4), to preclude their binding to their corresponding promoters, therefore inhibiting the plant response to light. Also, the interaction between DELLAs and bHLH or MYC (Myelocytomatosis) TFs controls jasmonic acid (JA) signaling and fruit development, while the association of DELLA proteins with TFs implicated in growth control under adverse environments promotes growth repression through crosstalk mechanisms with ABA (Abscisic Acid) and ethylene signaling pathways, exhibiting in this way their function as integrators of developmental and environmental cues.
It is known that the disordered amino-terminal domain in DELLA proteins is a signal perception region where, despite its variability, relatively conserved motifs have been identified with putative binding sites for protein-protein interactions, such as the site for association of DELLA proteins with the GA-bound receptor GID1 (GA-INSENSITIVE DWARF1). The carboxy-terminal domain, in turn, plays a central role in their repressive activity (Fig. 1) [34,35].

Proteins in other GRAS subfamilies are involved in a wide number of key processes of plant growth, development and stress responses [30]. Examples of this are SCR and SHR (SHORTROOT) proteins, regulatory elements accomplishing critical functions for normal root development, by participating in the formation of the correct radial pattern for roots and shoots [36]. SCR additionally contributes to the appropriate development of stomata and ligule, by modulating asymmetric cell divisions. Intrinsic disorder in GRAS proteins also influences nodule development in legumes, represented by the NSP1 (NODULATION SIGNALLING PATHWAY 1) and NSP2 (NODULATION SIGNALLING PATHWAY 2), proteins participating in signaling the plant response to rhizobial Nod factors, by binding to the promoter of genes induced by Nod factors through their IDRs in their amino-terminal domains, which are also required to form NSP1-NSP2 activation complex [37,38]. Another example is the LISCL (Lilium longiflorum SCR-like) subfamily of transcriptional regulators, which exhibits a high functional heterogeneity. They act as activators and/or co-activators of genes through conserved motifs in their amino terminal domains. These functions impact different plant developmental processes, such as pollen development, adventitious root formation mediated by auxins, and the response to abiotic and biotic stimuli [39,40]. Other GRAS subfamily proteins, such as AtPAT1 (Arabidopsis PAT (Phytochrome A Signal Transduction)) and AtSCL13 (Arabidopsis SCR-like), are key participants in phytochrome light signaling triggering responses to different light conditions [41,42]. In addition, it has been reported that CIGR (CHITIN-INDUCIBLE GIBBERELLIN-RESPONSIVE) proteins operate as transcriptional regulators of the plant defense response, and as transducers of GA function, showing their roles in various signaling pathways [43]. 
Likewise, AtLAS (Arabidopsis LATERAL SUPPRESSOR) and HAM (HAIRY MERISTEM) subfamilies are required for axillary shoot establishment during vegetative growth, and for shoot meristem maintenance, respectively [44].

The briefly described examples of plant GRAS proteins illustrate not only their common involvement in transcriptional regulation through their GRAS domain, but also their competence to act as hubs via an assortment of IDRs, mostly located in their amino-terminal domains, performing fundamental functions by their coordinated interaction with different partners, integrating in this way developmental and environmental cues.

Light signal transduction mediated by cryptochromes (CRYs) represents an additional instance of the relevance of structural disorder in plant biology. These proteins participate in the repair of DNA damaged by UV light in plants and animals, and both protein families present a variable region in their carboxy-terminus, which has been demonstrated to be an IDR. CRYs constitute central players in photomorphogenic development and the circadian clock; likewise, they play leading regulatory roles in floral initiation, and in processes modulated by blue light, such as flowering induction, inhibition of hypocotyl elongation, and stomatal opening, as a result of the interaction of their carboxy-terminal IDRs with the E3 ubiquitin ligase COP1 (CONSTITUTIVE PHOTOMORPHOGENIC 1) [45–47]. Remarkably, in Drosophila it has been shown that this association depends on light stimuli, evocative of conformational changes promoted by physicochemical light effects, and indicating that the signaling mechanisms involved are a consequence of sophisticated protein interactions determined by structural plasticity [48]. Also, there is evidence indicating that in flies under darkness CRY carboxy-terminal domains associate with a region of these proteins known as PHR (PHOTOLYASE HOMOLOGY REGION); however, upon light stimulus, IDRs in this domain undergo structural rearrangements exposing COP1 binding sites, which results in the repression of COP1 activity [49]. Even though the CRY photoreception mechanism in Drosophila shows differences with that described for Arabidopsis, where the C-terminus is the domain responsible for phototransduction and the photolyase domain is the signaling region [50], in both CRY systems the carboxy-terminal domains play an important role in the signaling to downstream components.
Depending on the conditions, the structural changes in the carboxy-terminal IDRs may expose different sequences able to specifically recognize various effector partners of photomorphogenic processes. Short and rigid regions found in CRY IDRs have been associated with recognition sequences called MoRFs (Molecular Recognition Features), characterized by their propensity to gain structure upon binding to their specific partner, and potentially contributing to the specificity of partner recognition. In addition to these features, the different lengths detected in CRY carboxy-terminal domains may also be important for their ligand selectivity, and for their ability to influence a diversity of signaling networks [9,11,51].

The start of UV-B light signaling by the UV-B photoreceptor also encompasses IDRs. This photoreceptor, encoded by UVR8 (UV RESISTANCE LOCUS 8), is activated by UV-B absorption, which leads to the formation of the UVR8 monomer from its inactive homodimer. The active UVR8 monomer binds COP1 through 27 residues in its carboxy-terminal region (C27) and the central RCC1 domain, and this heterodimer is then able to be transported into the nucleus, activating COP1 downstream signaling, including HY5 gene expression [52–54]. The UVR8 three-dimensional structure was obtained some years ago by X-ray crystallography; however, the carboxy-terminal region was missing, so it was disregarded in subsequent functional studies [55,56]. Interestingly enough, a recent report has shown that the UVR8 C27 domain is intrinsically disordered and that in its basal state it adopts numerous random secondary structures [57]. This finding agrees with its role in protein-protein interactions, not only with COP1 but also with RUP1 and RUP2 (REPRESSORS OF UV-B PHOTOMORPHOGENESIS 1 and 2), proteins enabling the dimerization of the UVR8 monomer under dark conditions [54,58]. UVR8 can therefore be proposed as a molecular switch with the potential to associate with different partners to integrate the plant response to UV radiation with different processes throughout plant development [57].

Additional plant hub proteins involving IDRs and participating in critical plant processes are the FLZ (FCS-LIKE ZINC FINGER) proteins, which have been proposed as a platform for SnRK1 (SNF1-RELATED PROTEIN KINASE 1), and thus considered as regulatory factors of the plant response to starvation and stress [59,60]. Under these circumstances, the survival of an organism depends on timely and efficient mechanisms to control its energy status, which in eukaryotic cells involve the SnRK1 and TOR (TARGET OF RAPAMYCIN) kinases. When energy is limiting, SnRK1 is activated, leading to a series of phosphorylation reactions whose ultimate aim is the inhibition of TOR activity, which in this way hinders energy-consuming processes. These events are downregulated when energy levels are recovered, and growth can proceed normally [61,62]. SnRK1 and TOR activities are conserved across eukaryotes; however, their sequences have changed during evolution, leading to specific lineage variations [59]. In plants, the SnRK1-TOR signaling mechanism shows various differences compared to those described in yeast and animals, suggesting that their structural composition was selected to be compatible with plant life requirements [60]. In this context, SnRK1 is part of a connector system to translate internal and external cues, in close collaboration with FLZ proteins, which meet the characteristics to act as adaptors between SnRK1 and its effector proteins. Accordingly, protein-protein interaction assays identified RAPTOR (REGULATORY-ASSOCIATED PROTEIN OF mTOR), a regulatory component of the TOR complex, as an FLZ ligand [61,63]. The involvement of disorder became apparent when a recent study established the existence of IDRs in the amino- and carboxy-terminal regions of FLZ proteins as potential protein binding sites, which interestingly seem to be conserved in land plants.
IDRs in the FLZ amino-terminal region participate in their binding to specific SnRK1 subunits, possibly through very dynamic (fuzzy) interactions, and also in the formation of FLZ homo- and hetero-dimers. In agreement with their role as adaptors or scaffold proteins, enrichment of post-translational modification (PTM) sites has been predicted in FLZ IDRs, a characteristic that, together with the formation of homo- and hetero-dimers, foresees an expansion in FLZ interaction arrays [64].

An example of the participation of protein structural disorder in hormone signaling is found in brassinosteroid (BR) perception. BRs are steroid hormones involved in the control of plant growth and development, perceived by a receptor complex localized in the plasma membrane [65]. This receptor is formed by BRI1 (BRASSINOSTEROID INSENSITIVE 1) and BAK1 (BRI1-ASSOCIATED RECEPTOR KINASE 1), leucine-rich repeat receptor-like kinases (LRR-RLKs) that upon sensing BRs start a phosphorylation cascade, which in combination with phosphatases leads to the activation of TFs (BRI1-EMS SUPPRESSOR1 (BES1) and BRASSINAZOLE-RESISTANT1 (BZR1)), directly controlling the expression of target genes [66]. When BRs are at low levels, the signaling pathway is kept at a basal activity by autoinhibition of BRI1 via its carboxy-terminal region, or by the action of BKI1 (BRI1 KINASE INHIBITOR); as soon as BRs are perceived, inhibition is released by auto- or trans-phosphorylation through BRI1 activity. Once BKI1 is phosphorylated, it associates with 14-3-3 proteins to be released from the plasma membrane. BKI1 is an IDP that not only works by competitively inhibiting the trans-phosphorylation on the cytosolic domains of BAK1 and BRI1 in the receptor complex, but also specifically interacts with other proteins through linear motifs in its IDRs to modulate the effect of BRs on plant cell functions. It is presumed that these BKI1 activities are regulated by the modification of phosphorylation sites located in its IDRs [67,68]. Interestingly, BRI1 also shows structural disorder, particularly in the BR binding site confined to a 70-residue region of its extracellular domain, which upon hormone binding acquires an ordered conformation, allowing BAK1 association and trans-phosphorylation for the activation of the downstream signaling events [68].

Currently, the complexity and diversity of regulatory networks have been recognized as a usual characteristic of those systems that keep the different cell functions throughout the life of an organism under control, despite the multiple and miscellaneous changes to which it is exposed. As mentioned, the participation of IDPs and IDRs in these arrangements is fundamental for an efficient and suitable functional adjustment every time it is needed; however, this has to be combined with the different network structures [16,25,69]. In some transcriptional networks, hubs involve IDRs in TFs interacting with folded partners; but other transcriptional networks include ordered hubs binding to IDR-containing partners, optimizing in this way the orchestration of the cross-talk between different signaling pathways (Fig. 2) [18,70,71]. In plants, this last kind of networking is exemplified by the RADICAL-INDUCED CELL DEATH 1 (RCD1) hub protein. RCD1 contains various functional domains, where the different RCD1 interactions occur; these domains are an amino-terminal WWE domain, a poly (ADP-ribose) polymerase domain, and a carboxy-terminal RST (RCD1, SRO and TAF4) folded hub domain. RCD1 is a central protein able to interact with different TFs, participating in developmental processes, hormone signaling and in responses to hyper-oxidant conditions. To achieve these different functions, RCD1 interacts with the transcription activation domain (TAD) of a diverse set of TFs, such as NAC013 (NO APICAL MERISTEM, ATAF, CUP-SHAPED COTYLEDON), bHLH011, MYB91, DREB2A (DEHYDRATION RESPONSIVE ELEMENT BINDING PROTEIN 2A), STO (SALT TOLERANCE), COL9 (CONSTANS LIKE), COL10, PIF3 (PHYTOCHROME INTERACTING FACTOR), PIF5, PIF7, WRKY47 (containing the amino acid sequence WRKY), IDD5 (INDETERMINATE DOMAINS), and TGA2 (TGACGTCA cis-element-binding protein), all of them playing central roles in signaling processes involved in development, light and stress responses, among others [29,72].
Significantly, all TFs interacting with RCD1 contain IDRs, indicating that structural disorder is required for RCD1-TF interactions. In some cases, it is known that binding of these transcription factors to the RCD1-RST domain involves structural modifications in the interacting IDRs. Upon binding, some of them change from disordered to an α-helix conformation, as in the case of DREB2A, others transit from disordered to extended conformations, and there are cases where they form dynamic or fuzzy complexes [29,73]. Interestingly, it has been found that the RST interacting domain of RCD1 adopts a unique helical conformation sharing some features with other proteins playing crucial roles in transcriptional control. This group of ordered hub domains presents a structure called the α-α-hairpin motif. Given the features shared among hub proteins, it has been suggested that this motif could offer functional advantages as a malleable platform for an optimal conformational adaptation, allowing specificity among the diverse structural topologies of disordered interactors. The fold in this α-α-motif forms a super-secondary structure establishing two antiparallel α-helices, along a considerable length, and a loop, which determines the formation of a narrow angle between the connected helices. The RCD1-RST structure in its complex state with the DREB2A TF shows four α-helices flanked by disordered segments, which are connected by short loops. This four-helix conformation displays a hydrophobic surface with the characteristics of a possible ligand-binding site, representing a unique arrangement for a protein interacting region. The properties of α-α-hubs allow them to adopt different topologies, suggesting that they may be able to adapt to more transcriptional networks than other hub structures, not only to promote binding but also with the possibility of acting as blockers [74].

Regardless of the network structure, a common and essential element seems to be the participation of IDRs in the involved proteins, most of them identified as TFs [16,25,27]. Additional plant TFs in which IDRs play an important role in regulatory function belong mostly to the NAC and bZIP (BASIC REGION, LEUCINE ZIPPER) families [26,29].

The known NAC TFs contribute to the control of plant development (e.g. xylem formation and senescence), to hormone signaling (ABA signaling during germination), and to the regulation of responses to biotic and abiotic stresses [75,76]. NAC proteins contain a conserved amino-terminal region (the NAC domain), which is the DNA-binding domain (DBD) responsible for their binding to a consensus cis element CGT(GA), and which adopts a β-sheet conformation surrounded by α-helices, whereas their carboxy-terminal regions (TRDs, Transcriptional Regulatory Domains) are variable and intrinsically disordered with the potential for disorder-to-order transitions [76,77]. Despite their variability, TRDs show some conserved motifs, specific for NAC sub-groups, which participate in protein-protein interactions; this is the case of sequences in the TRDs of the ANAC019 (ARABIDOPSIS NAC DOMAIN CONTAINING PROTEIN 19), CUC1 and NAC013 TFs, required for ABA signaling during germination and seedling development, for promotion of adventitious shoot formation, and for transactivation functions related to senescence processes, respectively. Likewise, the TRD in ANAC012 from Arabidopsis participates in xylem development, while NAC005 from barley is also involved in senescence; however, the interactions involved seem to be different from those of NAC013, because they show different TRD motifs, supporting the idea that TF specificity is mediated by motifs present in the IDRs of their transactivation or regulatory domains [9,72,78].
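
To make the consensus cis element concrete, the sketch below scans a DNA sequence for the core written above as CGT(GA). It assumes that this notation means CGT followed by either G or A, which is an interpretation made for this example. The helper names (`reverse_complement`, `find_core_sites`) are hypothetical, and a match to a four-base core is only a candidate site, not evidence of NAC binding.

```python
import re

# Assumed reading of the consensus "CGT(GA)": CGT followed by G or A.
CORE_MOTIF = re.compile(r"CGT[GA]")
MOTIF_LEN = 4

_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(dna: str) -> str:
    """Return the reverse complement of an uppercase DNA string."""
    return dna.translate(_COMPLEMENT)[::-1]


def find_core_sites(dna: str):
    """List (start, strand, match) tuples for candidate core sites.

    start is the 0-based position on the forward strand of the
    leftmost base covered by the match, for both strands.
    """
    seq = dna.upper()
    n = len(seq)
    hits = [(m.start(), "+", m.group(0)) for m in CORE_MOTIF.finditer(seq)]
    for m in CORE_MOTIF.finditer(reverse_complement(seq)):
        # Map the reverse-strand coordinate back to the forward strand.
        hits.append((n - (m.start() + MOTIF_LEN), "-", m.group(0)))
    return sorted(hits)
```

Both strands are scanned because a short core of this kind can be read in either orientation. For instance, `find_core_sites("AACGTGTT")` reports one forward-strand hit starting at position 2.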

As for NAC proteins, the structural organization in the bZIP family TFs shows a DNA-binding domain (bZIP), rich in basic residues preferentially forming α-helical conformations, and an activation domain that is usually disordered. The Arabidopsis HY5 protein is a well-characterized bZIP TF acting as a positive regulator of plant photomorphogenesis on promoters containing G-box light responsive cis-elements. The carboxy-terminal domain of HY5 binds to DNA, whereas the amino-terminal region corresponds to the regulatory domain, with properties to associate with other proteins to specifically coordinate different light modulated processes [79–82]. Among the HY5 interacting signaling proteins are COP1, an effector protein modulating various light sensitive processes (e.g. hypocotyl elongation, flowering time), and CCA1 (CIRCADIAN CLOCK ASSOCIATED 1), a regulator of the circadian clock. HY5 is a highly disordered protein, where the amino-terminal domain contains more structural disorder than the carboxy-terminal domain, which is able to fold into α-helices but without acquiring a stable tertiary structure unless it is bound to DNA [83,84]. Although folding induced by binding has been experimentally shown for a NAC TF [72,76], evidence is still lacking for the HY5 amino-terminal TRD; however, the presence of sequences meeting MoRF characteristics, such as propensity for protein interactions and disorder-to-order transition, suggests this kind of mechanism. The presence of MoRFs has been predicted not only for various bZIP TRDs but also for a number of other plant TFs, upholding their ability to act as association hubs because of their structural flexibility [9,79,85,86].

Processes involved in post-transcriptional regulation of gene expression also show the contribution of the structural flexibility characteristic of IDPs. DCL1 (DICER-LIKE 1) is a ribonuclease that plays a relevant role in microRNA (miRNA) biogenesis. This protein binds double stranded RNA through two domains tandemly located at its carboxy-terminal region in order to process pri-miRNA precursors to generate mature and active miRNAs. It has been shown that these DCL1 domains are crucial for its activity, consequently affecting decisive functions in plant cells. Upon binding pri-miRNAs, DCL1 IDRs gain structure, showing the canonical dsRBD (double stranded RNA BINDING DOMAIN) fold. Experimental evidence suggests that the RNA binding recognition event involves a conformational selection mechanism and an induced adapting mechanism leading to the appropriate folding during the formation of the processing complex [87–89]. DCL1 structural analysis offers new elements to understand the relevance of its disordered nature in miRNA maturation.

Metabolism is a set of chemical reactions involved in maintaining the living state of organisms. The main actors that carry out these reactions are enzymes, which act as catalysts in numerous reactions in the cell. Formerly, the concept of lock-key was associated with this kind of proteins, conceived as entities structurally complementary to their substrates as a lock and its key. In this analogy, the functionality of these catalytic agents was strongly associated with the idea of a fixed structure; nevertheless, with the observation that substrate binding brings an extensive array of motion along with structural changes, it was understood that a certain degree of flexibility is required [90]. Conformational dynamics in enzymes is as important as other features and therefore it has been subjected to a selection pressure. This was demonstrated by a combination of nuclear magnetic resonance (NMR) and molecular dynamic analyses for the model pancreatic-type RNase superfamily. This study showed that flexibility is largely different among distinct phylogenetic branches; nevertheless, highly conserved within specific branches [91]. This conserved flexibility is partially related to the preservation of intrinsic disorder propension [92]. An analysis of structural disorder distribution in proteins found that disorder in enzymes is as frequent as in non-enzyme proteins, in terms of the length of the longest continuous disordered stretch; this evaluation also found that the best-represented enzyme categories correspond to those classified as transferases, hydrolases, and enzymes with multiple assigned functions [93]. These observations have positioned structural disorder as a relevant functional feature for enzymes. Conformational flexibility can be required to increase the enzyme dynamics, providing with regions able to regulate their own activity or to control associated processes. 
Despite the relationship between structural order and enzymatic activity, enzymes with a high content of structural disorder have also been found. This is the case of UreG, an enzyme considered a molten globule protein due to its high structural disorder content, showing significant secondary structure and a minor amount of tertiary structure. UreG is a GTP hydrolase, which complexes with UreF and UreD to assist nickel delivery into the nickel-dependent urease, a pathogenic factor in several bacteria and fungi [94]. More than 70 proteins with sequence similarity to bacterial UreG can be identified in plants, most of them in eudicots; the UreG orthologue from soybean has been characterized and shown to be an IDP [95]. Intrinsic disorder in this enzyme has been associated with a regulatory mechanism, where the degree of structural flexibility corresponds to a favorable location in its active site folding landscape, which seems to be necessary to control its catalytic activity by interacting with other proteins [96]. Similar observations have been reported for other enzymes, suggesting that in these cases, their functional active sites may require flexible conformations resulting from relatively weak molecular interactions, and that this structural plasticity could be essential for optimal enzyme activity [97].

In addition to enzymes, disorder is widely found in non-enzyme proteins with important roles in plant metabolism, acting as scaffolds, as PTM targets or as interaction sites to recognize multiple partners; hence, operating as metabolic hubs. For example, WRI1 (WRINKLED1) is a TF considered a hub protein involved in triacylglycerol (TAG) biosynthesis. TAG is a lipid required during seedling development, meeting the energy and carbon demand of this process [98]. The WRI1 TF belongs to the AP2 (APETALA2) family [99], and it is regulated by the central seed developmental regulators LEC1 and LEC2 (LEAFY COTYLEDON 1 and 2), and by MYB89 and FUS3 (FUSCA3) [[100], [101], [102]]; in turn, WRI1 controls different pathways of glycolysis and fatty acid biosynthesis [98,103]. WRI1 has three different IDRs (1–3), one of which (IDR3) is located at the carboxy-terminal region, facilitating under particular conditions the exposure of a PEST motif (proteolytic signal) and its phosphorylation, a modification that affects WRI1 accumulation, consequently disturbing plant oil biosynthesis. TAG biosynthesis requires DGAT1 (DIACYLGLYCEROL ACYLTRANSFERASE 1), an enzyme catalyzing the acyl-CoA-dependent acylation of sn-1,2-diacylglycerol to generate TAG. DGAT1 is a conserved protein in plants that carries out its function in the ER, where it is attached to the membrane through eight to ten predicted transmembrane segments located at the carboxy-terminal region, which also contains the highly conserved Asn/Asp and His residues presumably involved in acyltransferase activity [[104], [105], [106]]. The amino-terminal region of Brassica napus DGAT1 (BnDGAT1, residues 1-113), which localizes to the cytosol, contains an IDR and a small folded segment.
In the DGAT1 inactive form, CoA interacts with the allosteric site for acyl-CoA/CoA located in the folded segment of its amino-terminal region, while the activation of this enzyme is dictated by the acyl-CoA/CoA ratio and occurs by the displacement of CoA by acyl-CoA, which is recognized by the same folded segment, triggering a homotropic allosteric activation that is transmitted to the IDR, resulting in a highly active state. This positive cooperativity has been explained by the participation of the IDR in DGAT1 dimerization, which leads to optimal enzymatic activity. The presence of cytosolic IDRs in membrane proteins is not an unusual characteristic; it has been reported that IDRs are found in 50% of membrane proteins, suggesting a regulatory function for these regions, which are usually cytosolic [107].
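
As a hedged illustration only, positive cooperativity of the kind described above is commonly summarized in textbooks with a Hill-type rate law. The symbols used here (v, V_max, K_0.5, n_H, and [S] standing for the acyl-CoA concentration) are generic notation introduced for this sketch; they are assumptions and not parameters reported in the cited studies.

```latex
v \;=\; \frac{V_{\max}\,[S]^{\,n_H}}{K_{0.5}^{\,n_H} + [S]^{\,n_H}},
\qquad n_H > 1 \ \text{(positive cooperativity)}, \quad n_H = 1 \ \text{(no cooperativity)}
```

Under this description, a Hill coefficient n_H greater than 1 gives a sigmoidal dependence of activity on substrate concentration, which is the usual kinetic signature of homotropic allosteric activation.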

This is exemplified by the CSR (CLASS SPECIFIC REGION) in Physcomitrella patens CESAs (CELLULOSE SYNTHASES). Plant CESAs are differentiated from the bacterial ones by the presence of the plant-specific domains called PLANT-CONSERVED REGION (P-CR) and the disordered region CSR [108,109]. For the synthesis of the primary or secondary cell wall cellulose, plants use different CESA isoforms, each one with non-redundant functions [110]. CESA isoforms are organized differently in the membrane, forming a six-lobed oligomeric complex called a rosette; at the same time, rosettes are arranged in cellulose synthase complexes (CSCs), whose organization correlates with cellulose microfibril structure [111,112]. The CSR is found in a central cytosolic region of rosettes. For the Physcomitrella CSR, it is proposed that structural disorder serves as a “fly-casting mechanism” to explore the surroundings, while its MoRFs act as the interfaces between CESA isoforms. Because of the peripheral location of the CSR, it could play a regulatory function, considering that cellulose synthesis is subjected to fine control during the plant lifespan [113].

As autotrophic organisms, plants are able to convert inorganic carbon into organic compounds through photosynthesis. This metabolic process is divided into two stages: the first consists of the light-dependent production of ATP and NADPH, and the second involves the use of these energy-storage molecules to assimilate inorganic carbon. The processes involved in these two stages take place in the chloroplast, a plastid derived from a single or a double endosymbiotic event that resulted in the transfer of a large number of genes from the plastid to the nucleus, such that more than 98% of all chloroplastic proteins are translated by cytosolic ribosomes [114]. Among the transferred genes, those related to biosynthesis and metabolism are highly represented [115,116]. Analysis of plant proteomes shows that the abundance of structural disorder in organelles of endosymbiotic origin is similar to that found in archaea and bacteria, whereas nuclear proteins, including those of organelle origin, show a disorder proportion similar to that of eukaryotic proteomes. This fact might disclose a selective pressure over disordered regions, because nuclear proteins originally from bacteria acquired IDRs during evolution, suggesting an adaptive advantage under some conditions, such as adverse environments [117]. In the following examples, we describe how structural disorder in chloroplastic proteins is involved in photosynthesis, and how it plays a key role in the regulation of this vital plant metabolic process.

The most important light-absorbing components in photosynthesis are the chlorophylls. These pigments are always associated with specific proteins forming the Light Harvesting Complexes (LHCs). These proteinaceous components are nuclear encoded and translated in the cytosol; they are imported into chloroplasts via translocases and finally embedded in the thylakoid membranes with the contribution of the carboxy-terminal IDR of the membrane protein ALB3 (ALBINO 3) [118,119]. ALB3 protein belongs to a family of insertases that control the inclusion and assembly of membrane proteins [120]. For their membrane insertion, LHCs are recruited via the chloroplast signal recognition particles (cpSRP) [121], organized in a soluble transit complex (LHC-cpSRPs). ALB3 carboxy-terminal segment contains an IDR localized in the stroma, which contains motifs enriched in positively charged residues, and interacts via ‘folding upon binding’ with the transit complex. Once the LHC-cpSRPs is membrane recruited, it interacts with the membrane-bound receptor cpFtsY to be assembled into the thylakoid membrane [119]. The participation of an IDR in the formation of the ALB3 and the LHC-cpSRP complex exhibits the requirement of a highly dynamic association to facilitate its delivery to the membrane receptor.

LHCs together with chlorophylls and other light-absorbing pigments constitute the antenna systems, which are able to absorb photons and transmit this energy to a reaction center, where it is transformed into chemical energy. Plants organize their antenna molecules and their reaction centers in two distinct functional arrays called PHOTOSYSTEM I (PSI) and PHOTOSYSTEM II (PSII). The process of light to chemical energy transformation is coupled with the oxidation of water, resulting in the release of O2. This reaction takes place in the OXYGEN-EVOLVING COMPLEX (OEC), composed of a Mn4O5Ca cluster embedded in PSII [122]. Three water-soluble proteins regulate the integrity of the OEC: the 23 and 17 kDa proteins, which facilitate the retention of Ca2+ and Cl− ions at the sites where they control Mn2+ redox activity, and the 33 kDa protein or MANGANESE STABILIZING PROTEIN (MSP), whose function is to retain the four manganese atoms of the OEC, avoiding their release as Mn2+, and in this way preserving the function of PSII [123,124]. MSP physicochemical characteristics identify it as an IDP, consistent with the conformational flexibility required to facilitate rebinding of MSP to PSII. MSP assembly into PSII involves a two-step process: MSP contacting PSII, followed by an MSP structural reorganization that increases its binding affinity [125]. MSP structural plasticity represents an advantage to obtain the appropriate MSP-PSII association and dissociation rate constants, which could be fundamental to regulate the water oxidation rate.

The combined activities of the two plant photosystems move electrons from water to NADP+, producing a proton gradient to drive ATP and NADPH synthesis, required in the reduction process of carbon assimilation. During the carbon assimilation phase of photosynthesis, carbon fixation occurs by condensation of CO2 into a five-carbon acceptor molecule, ribulose 1,5-bisphosphate (RuBP), to form two molecules of 3-phosphoglycerate. This reaction, which starts the Calvin-Benson cycle, is carried out by Ribulose-1,5-bisphosphate carboxylase/oxygenase (Rubisco), a tightly regulated enzyme. This enzyme is activated by the carbamylation of a lysine in its active site, while the binding of its substrate modifies Rubisco conformation, inhibiting its activity. This conformational modification is reversed with the assistance of a specific ATP-dependent chaperone called Rubisco activase (RCA) [126]. Most plants encode two RCA isoforms, α- and β-isoforms of 45–46 kDa and 41–43 kDa, respectively [127], both containing IDRs in their carboxy-terminal domains [128]. In A. thaliana RCA, the IDRs include two cysteines involved in light-dependent regulation of this protein [129], in such a way that under dark conditions the cysteines form a disulfide bridge, allowing the disordered region to block the ATP binding region, a modification that causes RCA inhibition. When light is restored, the disulfide bridge is reduced by the action of thioredoxin f, releasing the carboxy-terminal end responsible for blocking the ATP binding region, thereby inducing RCA activation and, consequently, the restoration of Rubisco activity (Fig. 3A) [130]. In the second stage of the Calvin-Benson cycle, also known as the reduction stage, 3-phosphoglycerate is transformed into glyceraldehyde 3-phosphate by the action of the chloroplastic GLYCERALDEHYDE PHOSPHATE DEHYDROGENASE (GAPDH) [131].
In higher plants, this enzyme exists in different forms: as a heterotetramer made up of two GapA and two GapB subunits (A2B2), as a homotetramer constituted of four GapA subunits (A4), and as a hexadecamer (A8B8) [132]. Both subunits of GAPDH are similar except for their carboxy-terminal region, which in GapB is a flexible domain containing two pairs of cysteines that, as in the case of RCA, regulate GAPDH activity during the day-night transition [133]. Cysteines form a disulfide bridge under dark conditions, bringing the GAPDH carboxy-terminal region into its active site and autoinhibiting its activity. The dark-light transition reduces the bridge, moving away the inhibiting region and thereby enabling 3-phosphoglycerate transformation (Fig. 3B) [134]. The regulation of the third stage of the Calvin-Benson cycle, where RuBP is regenerated from triose phosphates, is coordinated by CP12 (CHLOROPLASTIC PROTEIN 12), a protein of about 80 amino acids sharing the last 30 residues with the GapB carboxy-terminal region. CP12 is a highly disordered protein that contains four cysteines able to form two intramolecular disulfides. This protein is a key component of the supra-molecular regulatory complex formed by GAPDH, PHOSPHO-RIBULOKINASE (PRK) and FRUCTOSE 1,6-BISPHOSPHATE aldolase (FBP aldolase) [135], and its structural flexibility is evident during the formation of the ternary complex GAPDH/CP12/PRK [[135], [136], [137]]. In this arrangement, GAPDH is inhibited under dark conditions, and CP12 forms a disulfide bridge that stabilizes an α-helical conformation that hampers NADPH entrance into the GAPDH catalytic site. Under light, the disulfide bridge is reduced, leading to CP12 free movement, which activates the enzymes in the complex, allowing the flux of carbon molecules to the Calvin-Benson cycle (Fig. 3C) [128]. For an extended review of the function and characterization of these three IDPs involved in the Calvin-Benson cycle, consult Thieulin-Pardo et al. [128].

Plant sessile nature and the constant, diverse and sometimes extreme environmental changes have led to the selection of numerous, complex and efficient responses, allowing them to withstand adverse conditions, such as those related to water or oxygen availability [138], temperature [139,140], salinity [141,142], and heavy metals [143,144], among others.

These defense responses involve different control mechanisms, such as transcriptional and post-transcriptional regulation at the genetic and epigenetic level [[145], [146], [147]], allowing activation and/or repression of gene expression; post-translational modifications modulating protein activities and localization [148,149]; and metabolic adjustment leading to an appropriate saving of energy [150], among others. The cross-talk between the different mechanisms constitutes networks that coordinate plant growth and development to ensure survival until better conditions arrive.

Among the genes whose expression is highly associated with environmental stress are those encoding LEA (LATE EMBRYOGENESIS ABUNDANT) proteins [[151], [152], [153]]. These proteins typically accumulate during the last stage of seed maturation, when the seed loses most of its water content (dehydration stage), leading the embryo into a dormant state, a condition that allows its dispersion and survival under extreme environments until it finds suitable conditions to germinate and develop. Most LEA proteins characterized to date show the physicochemical properties of IDPs, having a biased amino acid composition enriched in small, hydrophilic and charged residues [154,155]. LEA proteins can be considered a subgroup of a broader group of proteins, named hydrophilins, with similar physicochemical features, originally defined by a high glycine content (>6%), a high hydrophilicity index (>1.0), and a high level of structural disorder, whose accumulation is closely linked with stress conditions, particularly water deficit [156]. Hydrophilins have been found in all domains of life [156,157]; however, there are no distinctive sequence motifs that could be associated with the proteins considered as hydrophilins. Nonetheless, plant hydrophilins or LEA proteins can be classified into seven main groups according to their distinctive sequence motifs [152], although other classifications have been proposed [153,158,159]. Even though there are few examples to date, proteins showing motifs shared with some LEA protein groups have also been found in bacteria and animals [157,160,161].
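
The two numerical hydrophilin criteria mentioned above (glycine content >6% and hydrophilicity index >1.0) can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions: the hydrophilicity index is taken here to be the mean Kyte-Doolittle hydropathy of the sequence multiplied by -1, the function names are hypothetical, and the third criterion (a high level of structural disorder) is not evaluated because it requires a dedicated predictor or experimental data.

```python
# Kyte-Doolittle hydropathy scale for the 20 standard amino acids.
KYTE_DOOLITTLE = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def glycine_percent(seq: str) -> float:
    """Percentage of glycine residues in a protein sequence."""
    if not seq:
        raise ValueError("empty sequence")
    seq = seq.upper()
    return 100.0 * seq.count("G") / len(seq)


def hydrophilicity_index(seq: str) -> float:
    """Assumed definition: mean Kyte-Doolittle hydropathy times -1."""
    if not seq:
        raise ValueError("empty sequence")
    seq = seq.upper()
    return -sum(KYTE_DOOLITTLE[aa] for aa in seq) / len(seq)


def meets_hydrophilin_thresholds(seq: str,
                                 min_gly: float = 6.0,
                                 min_hydrophilicity: float = 1.0) -> bool:
    """Check only the two sequence-based thresholds (disorder is not tested)."""
    return (glycine_percent(seq) > min_gly
            and hydrophilicity_index(seq) > min_hydrophilicity)


if __name__ == "__main__":
    toy = "GKEEGKDDKS"  # made-up peptide, not a real LEA protein
    print(glycine_percent(toy), hydrophilicity_index(toy),
          meets_hydrophilin_thresholds(toy))
```

The toy peptide is invented for the example; in practice the functions would be applied to full-length sequences, and a sequence passing both thresholds would still need its disorder content assessed before being called a hydrophilin.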

To date, many studies have focused on the characterization of LEA protein properties and on the regulation of their expression, and some others on understanding their role in vitro and in vivo in different plant species [[162], [163], [164], [165], [166], [167], [168], [169], [170], [171], [172], [173]]. The information compiled so far confirms that LEA proteins are involved in the plant responses to water deficit and to ionic and osmotic stress, but also to heat and high heavy metal concentrations. Interestingly, some of them are also present during the normal development of various plant organs, suggesting that they may carry out a diversity of functions, possibly sensing the different milieus that distinct cell types might encounter during plant life. Consistent with the induction of LEA gene expression under stress conditions, most of them are under the control of ABA-dependent signaling [174,175]; however, the promoters of some LEA genes are rather complex, showing cis-elements responsive to other stress signaling pathways, and to additional hormones and various developmental cues [176,177].

Considering the number of genes encoding LEA proteins that have been reported (e.g. 51 in Arabidopsis [153]), very few have been structurally characterized. Even though some of them were described as proteins with α-helical conformation because of the propensity shown by in silico studies, circular dichroism analysis demonstrated that, in aqueous solution, they actually show high levels of structural disorder [[178], [179], [180], [181], [182], [183]]. However, it has been shown that these proteins are able to acquire a more stable conformation, particularly α-helix, when they are in solution under low water availability and/or high macromolecular crowding. The level of acquisition of α-helical conformation varies according to their amino acid sequence, to the availability of water molecules, and to the level of molecular crowding [179,180]. As expected, the largest differences are observed when the proteins are under extreme conditions, such as during seed dehydration (Fig. 4). These data agree with others obtained from experiments where LEA proteins were dehydrated to completeness, also leading to the gain of structural order [184], indicating that LEA protein structural dynamics could be modulated by cell water status, among other factors. Also, this information strongly suggests that in dry seeds these proteins might be in particular ordered conformations allowing homo- and hetero-oligomeric interactions, possibly involved in the formation of the glassy state, a common feature of dry seeds associated with the subsistence of the dormant embryo (Fig. 4A) [[185], [186], [187]]. The competence to gain structural order upon dehydration has also been observed in hydrophilins, some of them with similarity to LEA proteins, found in anhydrobiotic organisms; these proteins have also been suggested to participate in the formation of the glassy state in these organisms, allowing them to revive after being dehydrated for some time [[188], [189], [190]].

Exploring the action mechanism of these proteins has not been an easy task, in particular because LEA proteins do not share similarity with any other described protein. Their high abundance in dry seeds and their responsiveness to water deficit conditions led to the hypothesis that these proteins could be acting as protectors of other proteins or macromolecules under these stress conditions [155,191], motivating the establishment of in vitro protection assays using enzymes sensitive to partial dehydration or freeze-thaw treatments, and determining their activities upon treatments in the presence or absence of LEA proteins or any other hydrophilin [179,[192], [193], [194]]. These assays revealed the ability of LEA proteins from different families, and of bacterial, yeast and animal hydrophilins, to prevent the inactivation and subsequent aggregation of reporter enzymes caused by the stress treatments. This protective activity does not require the presence of ATP. Although there are some reports indicating that some LEA proteins are able not only to prevent enzyme denaturation but also to promote renaturation upon stress treatments, as happens for molecular chaperones [195], this is not a common activity for most LEA proteins tested so far. Because this in vitro protective activity has been detected using low molar ratios between LEA protein and reporter enzymes (1:1, 1:5), it has been proposed that their protective action occurs through the interaction of the LEA protein with its ‘target’ or ‘client’, similar to chaperone activity [155,191,193]. An additional suggestion considers the possibility that, given the charged and hydrophilic nature of these proteins, they may prevent denaturation of the stress-sensitive proteins by forming a shield around them, avoiding in this way the loss of their native conformation (Fig. 4B) [196].
Because all the experiments suggesting both action models have been performed in vitro, at this point, it is difficult to affirm whether the two proposed action mechanisms for LEA protein protective activity occur in vivo, or none of them, or even whether there exists an alternative one; however, this information constitutes a starting point to elucidate their function in vivo.

Additional functions have been attributed to LEA proteins, such as the detoxification of metals and other toxic ions such as sodium, because of their high content of charged residues and the demonstrated affinity for divalent cations like Cu2+, Fe2+, Zn2+, Co2+, Cd2+ and/or Ca2+ of LEA proteins from some families, particularly from group 2 (dehydrins), group 3, group 4, and group 7 (ASR) [[197], [198], [199], [200], [201], [202], [203]]. Their ability to bind Cu2+, Fe2+ or Fe3+ indicates that these proteins might counteract reactive oxygen species (ROS) accumulation, an outcome of different stressful conditions. Also, it is suggested that LEA proteins could act as buffers of some signaling ions (Ca2+) [204], or as accumulators of useful ones (Zn2+, Mn2+ and Mg2+) (Fig. 4C) [200]. For some LEA proteins (from groups LEA 7 or ASR, LEA 2 or dehydrins, LEA 3), it has been shown that metal binding induces disorder-to-order transitions in LEA protein conformation, mostly to α-helices [200]. These findings imply that LEA protein function(s) may also be modified and/or fine-tuned by the presence of ions in their surroundings. Although post-translational modifications, mostly phosphorylation, have been predicted by in silico analysis for many LEA proteins [166], this has been demonstrated only in a few cases. Phosphorylation has been shown to occur in LEA 2 proteins, where it is associated with their translocation to the nucleus [[205], [206], [207]], but also in LEA 1 proteins, which were also found to be modified by methylations, acetylations and deaminations [208]. LEA 3 proteins seem to experience deaminations and oxidations [208].
From this information, it is unavoidable to think that some other LEA proteins may undergo PTMs, depending on the cellular environment, and that these modifications can be implicated in conformational adjustments, turnover control, intracellular localization, partner recognition, and overall in attuning of LEA protein function(s) [208,209]. Another LEA protein conformational characteristic that can be modified by their ion binding and/or by PTMs is their quaternary structure; that is, the balance as needed between monomeric, homo-oligomeric or hetero-oligomeric arrangements (between LEA proteins, other partners, or both), events that have been little explored [208,209].

Noteworthy are the data showing that the group 7 LEA protein ASR1 shows zinc-dependent DNA binding, in addition to its chaperone-like activities. This DNA binding has been associated with ASR1 protein interaction with a DREB TF in the nucleus, and with its ability to recognize cis-elements in the promoter/enhancer region of VvHt1 (Vitis vinifera HEXOSE transporter 1). Furthermore, there is evidence showing that ASR1 plays a regulatory role in primary carbon metabolism, connecting ABA, Gibberellic Acid (GA) and glucose signaling, highlighting the multifunctionality of IDPs [210].

LEA proteins have been localized to different cellular compartments, mostly to the cytosol and nuclei; however, apparently, they also localize to chloroplasts, mitochondria, vacuoles, endoplasmic reticulum, peroxisomes, and the plasma membrane, where their functions should be required [152,208,211].

Upon stress, the plasma membrane is one of the first cellular components to be perturbed. For instance, fluctuations in temperature can modify membrane fluidity, which in turn can affect the formation of microdomains [212,213], further influencing downstream signaling, and therefore altering gene expression and protein accumulation levels [214,215]. Consequently, the integrity of the plasma membrane must be preserved. Because of the relevance of this structure, different mechanisms are involved in its protection. Some evidence suggests that LEA proteins may be involved in this task, particularly LEA proteins from groups 2 and 3. The group 2 LEA protein Lti30 from A. thaliana binds artificial lipid membranes composed of fatty acids that favor electrostatic interactions, suggesting that this binding might compensate for the loss of rigidity under stress [216]. Similarly, the A. thaliana group 3 LEA protein COR15A binds galactolipid membranes by electrostatic and hydrophobic interactions [217]. Furthermore, in some cases it has been shown that the interaction of LEA proteins with artificial membranes leads to conformational changes resulting in loss of disorder and gain of α-helices. An example of this occurs with LEA11 and LEA25, both A. thaliana group 3 LEA proteins (Fig. 4D) [218,219].

All these pieces of knowledge show not only the diversity of functions that LEA proteins could perform but also the multifariousness of factors that can participate to establish and control the appropriate LEA protein function according to the cellular identity and status.

It is worth mentioning that genome- and proteome-wide analyses show that, in addition to LEA proteins, other IDPs and proteins containing IDRs accumulate in response to environmental stress [220,221], indicating that structural disorder is a feature representing an adaptive advantage in organisms such as plants. Among these proteins are the so-called Heat Shock Proteins (HSPs), a group of proteins that accumulate during heat, cold, dehydration, salinity and oxidative stress [222], and are present in organisms of all three domains of life [223]. Most of these proteins function as molecular chaperones, aiding in the folding of proteins and in the ordered unfolding for further processing, preventing aggregation, and also as signaling molecules [222,224,225]. In plants, HSPs can be divided according to their approximate molecular mass into five classes: Hsp100, Hsp90, Hsp70, Hsp60 and small heat-shock proteins (sHSPs) [226]. sHSPs are characterized as ATP-independent chaperones that prevent the aggregation of proteins during heat stress in vitro [192], as is the case of LEA proteins during water deficit [179,193,227]. The ATP independence of these proteins represents an advantage, considering that during stress cell metabolism is compromised and ATP availability is rather limited. Interestingly, of the three structural domains in sHSPs, the carboxy- and amino-terminal domains, which contain a significant level of disorder, show low conservation, while the middle α-crystallin domain forms two β-sheets and is highly conserved in this family [228,229]. It is proposed that the structural organization of these proteins regulates the formation of oligomeric ensembles, which may contain up to 40 sHSP monomers from different sub-groups [229]. The complexity of these oligomers has been considered a strategy to regulate sHSP chaperone activity [230]; however, this has not been confirmed, such that their function remains elusive.
The sHSPs oligomerization is a dynamic process modulated by pH [231] and PTMs, particularly, by phosphorylation [[232], [233], [234]]. It is not surprising that the phosphorylation sites have been localized within the IDRs in their carboxy- and amino-terminal domains, and that these modifications influence their conformational organization and binding to other sHSPs subunits and to their target proteins, by exposing a diversity of interfaces, according to cellular conditions. This structural plasticity allows them to recognize and bind to a broad range of clients with heterogeneous structural characteristics.

Biotic stress is caused by pathogenic agents such as bacteria, fungi and viruses, or by herbivorous animals such as insects that feed on plants. The pathogen response in plants has been well characterized; it begins with the recognition of the pathogen, followed by the triggering of the immune response; then, the pathogenic agent is able to detect the immune effectors and counter the signaling, an action that is also recognized by the plant, establishing a second round of immune response. In this battle, the molecular actors involved, from pathogens and from plants, require adaptability and plasticity in order to survive, suggesting that protein structural disorder may play a role in this scene. For example, it has been reported that intrinsic disorder in viral proteins could play a role in the evolution of resistance pathways or could contribute to minimizing fitness penalties caused by resistance-breaking mutations [235]. Intrinsic disorder is also useful in terms of spreading and transducing the signals produced by the infection, due to the competence of IDPs to interact with multiple partners, producing different outputs depending on the infection status. This is the case of the well-conserved protein RIN4 (RPM1-interacting protein 4), an immune mediator between the first filter of defense, the PTI (PRR-triggered immunity), and NTI (NLR-triggered immunity), a second filter involving the recognition of pathogen effectors [236]. RIN4 acts as a pathogen signal hub; it is a protein tethered peripherally to the membrane [237], where its structured region undergoes different site-specific PTMs by the action of pathogen effectors, producing different outputs presumably mediated by its IDRs [238].

Besides RIN4, other membrane proteins play important roles in pathogen recognition signaling pathways, which are closely associated with the efficient regulation of plasma membrane organization [239]. These signaling and cellular responses are concentrated in specialized membrane sub-compartments called nanodomains [240]. One of the best characterized protein families associated with these nanodomains in plants is the REMORIN family (REM) [241]. REM proteins associate with the membrane through their carboxy-terminal region called REM-CA (REMORIN-Carboxy-terminal Anchor) [242], which contains a coiled-coil conformation involved in its homo-oligomerization [243]. It has been reported that the StREM1.3 protein (Solanum tuberosum REM from group 1b, homologue 3) controls cell-to-cell movement of Potato Virus X (PVX) in infected Nicotiana benthamiana leaves [244]. Viral proteins such as COAT PROTEIN and TRIPLE GENE BLOCK 1 elicit the defense response, activating kinases responsible for REM1.3 phosphorylation [245]. The phospho-status of the IDR located in the REM1.3 amino-terminal region triggers multiple responses: it orchestrates nanodomain organization and the deposition of the polysaccharide callose into plasmodesmata, restricting in this way PVX spreading.

In addition to the perception of conserved microbial features, plants release enzymes to break down the structural components of pathogens, as is the case of chitin, a β-1,4-linked polymer of N-acetylglucosamine, a common structural component in diverse plant pathogens, including fungi, insects and nematode eggs [246]. Chitinases hydrolyze chitin, and their overexpression confers a defense tool against pathogens [247]. Plant chitinase 2 belongs to a group of proteins acting on polymeric molecules; other examples are cellulose degradation by cellulases, or collagen breakdown by MMP9 (MATRIX METALLOPROTEINASE 9). This group of proteins is characterized by the presence of substrate binding domains connected by a disordered linker. They act in a processive way, performing multiple rounds of modification instead of releasing their substrates after the first conversion, thus increasing the efficiency of their activity and improving cell energetic economy. Owing to the small entropic barrier of the intrinsically disordered linker, it has been found by statistical-physical bioinformatics that these enzymes could act by the “monkey bar” mechanism [248,249], where the length and flexibility are conserved IDR attributes, impacting the processivity and activity of the enzyme [250].

The examples described above, together with those reviewed by Marin and Ott [251] and by Covarrubias et al. [252], show that structural plasticity is used to sense pathogen infections. This plasticity might be particularly useful for responding to rapidly evolving pathogens.

As described in the previous sections, IDPs and IDRs are enriched in proteins involved in signaling, metabolism, and response to stress. In recent years, IDPs/IDRs have been shown to be prevalent among proteins that exhibit liquid-liquid phase separation (LLPS). LLPS (sometimes called coacervation) is a physicochemical process in which two components of a homogeneous mixture demix from each other to form two distinct phases, each with a particular molecular composition [253]. As a result, liquid droplets concentrated in a particular component can form and co-exist as a suspension within another liquid [254]. LLPS is proposed to be the mechanism by which membraneless compartments are formed in cells [255]. Membraneless compartments (also named biomolecular condensates) are discrete structures within cells that concentrate certain types of macromolecules to perform specific cellular functions [256]. Some examples of membraneless compartments are the nucleolus, PML bodies, P-bodies, stress granules, centrosomes, pyrenoids, and Cajal bodies [257]. The liquid properties of membraneless compartments are relevant because they help to explain how macromolecules dynamically reorganize in response to fast changes in the cellular milieu.

Most knowledge regarding LLPS comes from studies in animal cells and yeast. In those cases, the IDPs/IDRs involved are sufficient to drive the formation of liquid droplets in vitro, so that they are currently considered key elements for this phenomenon to occur (the methods currently required to test for LLPS in vivo and in vitro are summarized by Cuevas-Velazquez and Dinneny [258], and Alberti et al. [259]). Some examples of proteins whose IDRs drive LLPS are LAF-1 (LETHAL AND FEMINIZING 1), DDX4 (DEAD-Box Helicase 4), HP1α (HETEROCHROMATIN PROTEIN 1), FUS (FUSED IN SARCOMA), BRD4 (BROMODOMAIN-CONTAINING PROTEIN 4), MED1 (MEDIATOR OF RNA POLYMERASE II TRANSCRIPTION SUBUNIT 1), TDP43 (TAR DNA-BINDING PROTEIN 43), hnRNPA1 (HETEROGENEOUS NUCLEAR RIBONUCLEOPROTEIN A1), and BuGZ (Bub3 INTERACTING GLEBS AND ZINC FINGER DOMAIN PROTEIN) [260–267]. In organisms from the green lineage, substantially fewer studies have focused on understanding the role of LLPS in the formation and regulation of membraneless compartments.

In this section, we will briefly describe the recent efforts to understand the role that LLPS plays in the regulation of cellular compartmentalization in photosynthetic organisms. In addition, we will propose new players that might exhibit LLPS in response to environmental perturbation. Finally, we will describe how LLPS could be an integrator of the different functions where IDPs are enriched in the plant kingdom.

A striking example of the role of IDR-mediated LLPS in the flowering time pathway of Arabidopsis was recently reported. FCA (FLOWERING TIME CONTROL PROTEIN) is an RNA-binding protein necessary for the alternative 3′-end processing of COOLAIR, the antisense transcript of the floral repressor FLC (FLOWERING LOCUS C). FCA has two RNA-binding domains and a WW protein-interaction domain. Apart from these domains, FCA is predicted to be highly disordered. FCA localizes to nuclear bodies in epidermal root tip cells. These bodies, named FCA bodies, are considered membraneless compartments formed through LLPS because they fulfill the three key features that define such compartments: (1) their components rapidly rearrange, as revealed by fluorescence recovery after photobleaching (FRAP); (2) the small punctate structures are transient, merging to form larger ones; and (3) the recombinant purified protein phase separates in vitro. In accordance with what has been observed in animal and yeast systems, the prion-like domain (PrLD) of the FCA protein, which is highly disordered, is necessary and sufficient for phase separation in vitro. In addition, FLL2 (FLX-LIKE 2) was found to be a positive regulator of the phase transition of FCA in vivo, because a semi-dominant mutation (E201K) in FLL2 impairs the formation of FCA bodies. Because the mutation in FLL2 is proposed to affect a salt bridge connecting two coiled-coil structures, protein-protein interactions mediated by this kind of structure could be an important mechanism driving LLPS. FLL2 is a fully disordered protein with a high probability of forming coiled-coil structures; it transiently interacts with FCA and co-localizes to the FCA bodies. Interestingly, other 3′-end processing factors (such as FPA, FY, and FIP1) interact with FCA and co-localize with FCA bodies, suggesting that one of the biological roles of this phase-separated ribonucleoprotein complex is to concentrate 3′-end processing factors to optimize polyadenylation at specific 3′-end sites [268].

LLPS of plant IDPs/IDRs is also involved in light perception, splicing and chromatin organization. For instance, the TZP protein (TANDEM ZINC-FINGER PLUS3), a key transcriptional regulator of plant growth, is able to interact with DNA, RNA and proteins through its PLUS3 and zinc finger domains, where IDRs have been identified. This protein functions as a signal integrator of light stimuli by associating with the PHYB photoreceptor and being recruited into nuclear microdomains, or photobodies, which presumably act as control sites of gene expression in response to light quality and photoperiod to attune photomorphogenesis and flowering time [269]. Likewise, a serine/arginine-rich (SR) protein involved in constitutive and alternative splicing of pre-messenger RNAs has been localized to the nucleus, where it is distributed between speckles and the nucleoplasm during interphase. It is proposed that the transit of SR between the speckles and the nucleoplasm may affect its activity, eventually influencing pre-messenger RNA splicing [270]. The involvement of LLPS in plant chromatin organization is illustrated by AGENET DOMAIN CONTAINING PROTEIN 1 (ADCP1), a plant-specific poly-Agenet protein containing three tandem Agenet modular domains. ADCP1 is a multivalent histone H3K9 methylation reader whose structural characterization reveals various flexible loops, and it has recently been implicated in mediating heterochromatin phase separation, in maintaining H3K9 and CHG/CHH DNA methylation, and in transposon silencing [271,272].

The membraneless compartments responsible for the carbon concentrating mechanism (CCM) in unicellular photosynthetic organisms might be the best characterized compartments with liquid-like properties in a photosynthetic organism (also see Launay et al. in this issue). This behavior was first reported in a study of the pyrenoid of Chlamydomonas reinhardtii that sought to understand how the pyrenoid is segregated during cell division [273]. The pyrenoid meets all the criteria that describe liquid-like membraneless compartments. Furthermore, a fully disordered protein named EPYC (ESSENTIAL PYRENOID COMPONENT) is proposed to be a scaffold for the loading of Rubisco into the pyrenoid. In agreement with the in vivo data, it was later shown that both EPYC and Rubisco are necessary and sufficient to exhibit LLPS in vitro [274]. Interestingly, EPYC alone shows phase separation only at very high concentrations (200 μM) in the presence of PEG. This result suggests that, in contrast to what is observed for several animal IDPs, disorder alone is not sufficient to induce homotypic phase separation in this system.

The LLPS-based mechanism is also observed in the functional homologue of the pyrenoid in cyanobacteria, the β-carboxysome [275]. Carboxysomes are compartments that promote efficient carbon fixation, in which Rubisco and carbonic anhydrase are recruited into a protein shell that prevents CO2 escape. The recruiter of this protein ensemble is CcmM (Carbon dioxide concentrating mechanism protein), a protein with multiple domains required to gather Rubisco into the complex [276]. CcmM also contains three to five domains with sequence similarity to the Rubisco small subunit (SSU-like domains), which interact with Rubisco through the binding sites used by the Rubisco small subunits (SSU), thereby interlinking adjacent enzymes [277]. In the cyanobacterial β-carboxysome, however, neither Rubisco nor the CcmM protein contains significant IDRs, as confirmed by the solved three-dimensional structure of the complex. Furthermore, disulfide bond formation in the SSU-like domains of CcmM increases the flexibility of the network. The absence of IDPs/IDRs in this particular LLPS system suggests that the driving force of this assemblage comes from the multivalent interactions formed between the different components of the complex. Multivalent interactions might also be important for the EPYC-Rubisco system in vivo, because EPYC has four repeats predicted to bind Rubisco [273]. In vitro, however, varying the valency of EPYC did not have a major effect on droplet formation, affecting only the concentration at which the system phase separates [274]. The results obtained from the study of the CCM compartments of unicellular photosynthetic organisms open new research avenues in the field. If protein disorder is dispensable, why is it so prevalent, and why does it seem to be a hallmark of LLPS in higher organisms? Is IDR-mediated LLPS a mechanism that evolved with multicellularity? Do IDPs/IDRs provide transient multivalent interactions important for maintaining the liquid properties of the compartment? Further research is needed to answer these fundamental questions of the emerging LLPS field.

As described in the previous section, LEA proteins are predicted to be highly disordered and many of them form homo-oligomers [180,278,279,280], making them candidates to undergo LLPS. So far, there are no reports of LEA proteins localizing to liquid-like compartments in plant cells. However, it has been proposed that, because of the ability of these proteins to modulate their structure in response to changes in the environment, such compartments could form under very specific stressful conditions [258]. The in vitro dehydration protective function of LEA proteins suggests that such compartments could function as shelters for labile proteins during environmental challenges, or as a mechanism to modulate their activities and/or their integrity. Given that LEA proteins mostly accumulate in dry seeds, they might experience a liquid-gel or liquid-solid phase transition [281], in accordance with the vitrified state of dry seeds. This is similar to what has been observed in tardigrades, where IDPs form non-crystalline amorphous solids in vitro [188]. This property seems to allow the animals to survive extreme dehydration, so IDP-mediated phase transitions could be a conserved mechanism of dehydration tolerance across life domains.

Concluding remarks

The discovery of structural disorder in proteins opened a new era in structural biology, and the abundance, functional relevance and diversity of disordered proteins have uncovered a new panorama and new prospects in cell biology. Plants, as organisms with limited mobility, represent biological systems in which the structural and functional plasticity of IDPs could have had a significant impact during their evolution. In this review, we have covered different functional categories where protein intrinsic disorder is prevalent.

Acknowledgements

This work was supported by a grant from Consejo Nacional de Ciencia y Tecnología - México (CONACyT-Mexico) (FC-1615) to AAC. PSR-P and DFR-L are supported by PhD fellowships from CONACyT-México. The figures in this manuscript were created with BioRender.com.

References (281)

  • B.B. Kragelund et al. Order by disorder in plant signaling. Trends Plant Sci. (2012)
  • X. Sun et al. N-terminal domains of DELLA proteins are intrinsically unstructured in the absence of interaction with GID1/gibberellic acid receptors. J. Biol. Chem. (2010)
  • Y. Helariutta et al. The SHORT-ROOT gene controls radial patterning of the Arabidopsis root through radial signaling. Cell (2000)
  • B.E. Czikkel et al. NtGRAS1, a novel stress-induced member of the GRAS family in tobacco, localizes to the nucleus. J. Plant Physiol. (2007)
  • K. Morohashi et al. Isolation and characterization of a novel GRAS gene that regulates meiosis-associated gene expression. J. Biol. Chem. (2003)
  • R.B. Day et al. Identification and characterization of two new members of the GRAS gene family in rice responsive to N-acetylchitooligosaccharide elicitor. Biochim. Biophys. Acta Gene Struct. Expr. (2003)
  • A.R. Cashmore. Cryptochromes: enabling plants and animals to determine circadian time. Cell (2003)
  • H.-Q. Yang et al. The C termini of Arabidopsis cryptochromes mediate a constitutive light response. Cell (2000)
  • R. Yin et al. How plants cope with UV-B: from perception to response. Curr. Opin. Plant Biol. (2017)
  • M. Nietzsche et al. A protein–protein interaction network linking the energy-sensor kinase SnRK1 to multiple signaling pathways in Arabidopsis thaliana. Curr. Plant Biol. (2016)
  • H. Guo et al. Mechanisms and networks for brassinosteroid regulated gene expression. Curr. Opin. Plant Biol. (2013)
  • J. Jiang et al. The intrinsically disordered protein BKI1 is essential for inhibiting BRI1 signaling in plants. Mol. Plant (2015)
  • H.J. Dyson et al. Role of intrinsic protein disorder in the function and interactions of the transcriptional coactivators CREB-binding protein (CBP) and p300. J. Biol. Chem. (2016)
  • L. Waters et al. Structural diversity in p160/CREB-binding protein coactivator complexes. J. Biol. Chem. (2006)
  • T. Kjaersgaard et al. Senescence-associated barley NAC (NAM, ATAF1,2, CUC) transcription factor interacts with radical-induced cell death 1 through a disordered regulatory domain. J. Biol. Chem. (2011)
  • C. O'Shea et al. Structures and short linear motif of disordered transcription factor regions provide clues to the interactome of the cellular hub protein radical-induced cell Death1. J. Biol. Chem. (2017)
  • A.N. Olsen et al. NAC transcription factors: structurally distinct, functionally diverse. Trends Plant Sci. (2005)
  • H.J. Kim et al. Regulatory network of NAC transcription factors in leaf senescence. Curr. Opin. Plant Biol. (2016)
  • M. Jakoby et al. bZIP transcription factors in Arabidopsis. Trends Plant Sci. (2002)
  • C. Bracken et al. Temperature dependence of intramolecular dynamics of the basic leucine zipper of GCN4: implications for the entropy of association with DNA. J. Mol. Biol. (1999)
  • T.E. Ellenberger et al. The GCN4 basic region leucine zipper binds DNA as a dimer of uninterrupted α Helices: crystal structure of the protein-DNA complex. Cell (1992)
  • C. Andronis et al. The clock protein CCA1 and the bZIP transcription factor HY5 physically interact to regulate gene expression in Arabidopsis. Mol. Plant (2008)
  • L.-H. Ang et al. Molecular interaction between COP1 and HY5 defines a regulatory switch for light control of Arabidopsis development. Mol. Cell (1998)
  • R.K. Das et al. N-terminal segments modulate the α-helical propensities of the intrinsically disordered basic regions of bZIP proteins. J. Mol. Biol. (2012)
  • S.E. Schauer et al. DICER-LIKE1: blind men and elephants in Arabidopsis development. Trends Plant Sci. (2002)
  • V.N. Uversky. Conserved functional dynamics: I like to move it, move it! Structure (2018)
  • B. Zambelli et al. UreG, a chaperone in the urease assembly process, is an intrinsically unstructured GTPase that specifically binds Zn2+. J. Biol. Chem. (2005)
  • C.J. Oldfield et al. Intrinsically disordered proteins and intrinsically disordered protein regions. Annu. Rev. Biochem. (2014)
  • J. Habchi et al. Introducing protein intrinsic disorder. Chem. Rev. (2014)
  • B. Xue et al. Orderly order in protein intrinsic disorder distribution: disorder in 3500 proteomes from viruses and the three domains of life. J. Biomol. Struct. Dyn. (2012)
  • E. Schad et al. The relationship between proteome size, structural disorder and organism complexity. Genome Biol. (2011)
  • A.L. Darling et al. Intrinsic disorder-based emergence in cellular biology: physiological and pathological liquid-liquid phase transitions in cells. Polymers (2019)
  • X. Sun et al. Multifarious roles of intrinsic disorder in proteins illustrate its broad impact on plant biology. Plant Cell (2013)
  • M.M. Babu et al. Versatility from protein disorder. Science (2012)
  • J. Yan et al. Molecular recognition features (MoRFs) in three domains of life. Mol. Biosyst. (2016)
  • M. Fuxreiter et al. Local structural disorder imparts plasticity on linear motifs. Bioinformatics (2007)
  • L. Mollica et al. Binding mechanisms of intrinsically disordered proteins: theory, simulation, and experiment. Front. Mol. Biosci. (2016)
  • P.A. Chong et al. A hidden competitive advantage of disorder. Nature (2017)
  • V. Csizmok et al. Dynamic protein interaction networks and new structural paradigms in signaling. Chem. Rev. (2016)
  • P.E. Wright et al. Intrinsically disordered proteins in cellular signalling and regulation. Nat. Rev. Mol. Cell Biol. (2015)