Structure/function relationships in the rosette cellulose synthesis complex illuminated by an evolutionary perspective

Cellulose microfibrils are a key component of plant cell walls, which in turn compose most of our renewable biomaterials. Consequently, there is considerable interest in understanding how cellulose microfibrils are made in living cells by the plant cellulose synthesis complex (CSC). This remarkable multi-subunit complex contains cellulose synthase (CESA) proteins, and it is often called a rosette due to its six-lobed shape. Each CSC moves within the plasma membrane as it spins a strong cellulose microfibril in its wake. To accomplish this biological manufacturing process, the CESAs harvest an activated sugar substrate from the cytoplasm for use in the polymerization of glucan chains. An elongating glucan is simultaneously translocated across the plasma membrane by each CESA, where the group of chains emanating from one CSC co-crystallizes into a cellulose microfibril that becomes part of the assembling cell wall. Here we review major advances in understanding CESA and CSC structure/function relationships since 2013, when ground-breaking insights about the structure of cellulose synthases in bacteria and plants were published. We additionally discuss: (a) the relationship of CSC substructure to the size of the fundamental cellulose fibril; (b) an evolutionary perspective on the driving force behind the existence of hetero-oligomeric CSCs that currently appear to dominate in land plants; and (c) how cellulose properties may be regulated by CESA and CSC activity. We also pose major questions that still remain in this rapidly changing and exciting research field.


Introduction
This review updates our current understanding of structure/function relationships in the rosette cellulose synthesis complex (CSC). This multimeric plasma membrane-bound protein complex synthesizes cellulose in all the land plants characterized so far, as well as in their closest algal relatives. The CSC is a nanoscale fibril-spinning machine, containing many cellulose synthase (CESA) proteins. This multi-protein complex carries out its functions in a narrow zone of the cell that includes the cortical cytoplasm and cytoskeletal elements, the plasma membrane, and the exoplasmic space beneath the mature cell wall (Liu et al. 2015). The CESAs acquire substrate from the cytoplasm, synthesize β-1,4-glucan, and export the polymers. The multiple glucan chains then co-crystallize into cellulose microfibrils near the plasma membrane surface as the CSC moves forward along a linear path. The formation of a strong cellulose microfibril via CSC activity is one of the most remarkable structural manufacturing processes in nature. Cellulose synthesis has equal importance to photosynthesis in supporting the plant lifestyle and the subsequent roles of plants in essential ecosystem cycles and human industry. As examples, plants are the basis of the food chain in every terrestrial biome, and cellulosic biomaterials have provided humans with fuel, renewable building materials, textiles, paper, and countless other products since civilization began.
We emphasize new results since 2013, which was a watershed year in the history of cellulose synthesis research. The pivotal factor in this revolutionary change was solving the structure of a bacterial cellulose synthase, BcsA from Rhodobacter sphaeroides, at the atomic level within a BcsA-B complex (Morgan et al. 2013). The prokaryotic cellulose synthases have been diverging along an independent evolutionary path from the plant CESAs for at least 1.6 billion years, as estimated from the time that a cellulose-synthesizing cyanobacterium is predicted to have been engulfed en route to becoming the plant chloroplast (Yoon et al. 2004). Nonetheless, the catalytic core, where the β-1,4-glucan chain is formed, is structurally similar in prokaryotes and plants. Although no plant CESA structure has been solved so far, the similarity with BcsA was confirmed by structural comparison of the modeled catalytic core of a plant CESA with the equivalent region in BcsA. It is now possible to infer likely structure/function relationships in plant CESAs that are shared with BcsA and to generate structurally informed hypotheses about unique aspects of plant CESAs for further experimental testing.
We take an operational approach to structure/function relationships in the rosette CSC and summarize the current state of knowledge about: the structural basis of CESA polymerization and translocation of β-1,4-glucan; the relationship of rosette CSC structure to the fundamental cellulose fibril; origins and roles of diverse CESAs and CSCs; and regulation of cellulose properties by CESA and CSC activity. We conclude with a summary of major unanswered questions about structure/function relationships in CESAs and rosette CSCs.

Structural basis of synthesis and translocation of β-1,4-glucan by CESA
While the work to solve the structure of the prokaryotic BcsA-B complex by X-ray crystallography was occurring (3.25 Å resolution; Morgan et al. 2013), others were working to produce a stable de novo model of 506 amino acids of GhCESA1 from cotton (Gossypium hirsutum), representing most of the large cytosolic (catalytic) domain (Sethaphong et al. 2013). [Note that CESA names are preceded by the genus and species abbreviation and include a number within the CESA family of that species. Most commonly, numbers in other species are consistent with their closest homolog in the model plant Arabidopsis thaliana. However, GhCESA1 was the first plant CESA identified (Arioli et al. 1998; Pear et al. 1996), and it is an ortholog of AtCESA8.] A rough initial homology model was assembled from predicted GhCESA1 protein fragments, which were modeled using other solved structures as templates. Then the assembled model was refined in silico, manually and by use of a suite of tools for de novo (or ab initio) protein structure modeling, until it reached quality parameters consistent with other protein structures solved by empirical methods at 2 Å resolution. The groups working on the BcsA-B structure and the GhCESA1 model met at the 2012 Plant Cell Wall Gordon Conference, where the catalytic regions of the two structures were immediately compared and found to be structurally similar, with 3.9 Å root-mean-square deviation. This structural similarity established the usefulness of the partial GhCESA1 model for broad insights and hypothesis generation about plant cellulose synthesis. For example, many of the mutations in CESAs that caused cellulose deficiency were in or near the catalytic site of the GhCESA1 model and could be interpreted within a structural context for the first time (Table S3 in Sethaphong et al. 2013).
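The root-mean-square deviation quoted above summarizes how far paired atoms lie from each other in two superimposed structures. The following is a minimal sketch of that calculation, assuming the structures are already aligned and the atoms are paired by index; the function name `rmsd` and the toy coordinates are hypothetical and are not taken from BcsA or the GhCESA1 model.

```python
import math


def rmsd(coords_a, coords_b):
    """Root-mean-square deviation between two lists of (x, y, z) points.

    Assumes the structures are already superimposed and that atoms are
    paired by list index; no alignment step is performed here.
    """
    if len(coords_a) != len(coords_b) or not coords_a:
        raise ValueError("coordinate lists must be non-empty and of equal length")
    total = 0.0
    for (xa, ya, za), (xb, yb, zb) in zip(coords_a, coords_b):
        total += (xa - xb) ** 2 + (ya - yb) ** 2 + (za - zb) ** 2
    return math.sqrt(total / len(coords_a))


# Hypothetical toy coordinates in angstroms (illustration only).
model = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (2.0, 0.0, 0.0)]
crystal = [(0.0, 0.0, 3.0), (1.0, 0.0, 3.0), (2.0, 0.0, 3.0)]
toy_rmsd = rmsd(model, crystal)  # every pair is offset by 3 angstroms along z
```

In practice a structural comparison would first superimpose the two coordinate sets; this sketch leaves that step out to keep the definition itself visible.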
The single catalytic site in BcsA and the GhCESA1 model was composed of one glycosyltransferase-A fold containing seven beta sheets and seven alpha helices. These secondary structure elements were arranged into one binding site for the substrate [uridine diphosphate-alpha-D-glucose (UDP-glucose)] and another binding site for the forming (or acceptor) glucan chain (McNamara et al. 2015). This structural similarity across Kingdoms occurred despite high overall sequence divergence between the plant and prokaryotic cellulose synthases. As cellulose polymerization occurs, UDP-glucose is cleaved to release energy along with the transfer of glucose to the growing chain. The structural co-alignment between the catalytic domain of BcsA and modeled GhCESA1 showed that key motifs were in similar locations (Table S3 in Sethaphong et al. 2013). These included the D, D, D, QxxRW motifs (or more specifically DDG, DCD, TED, QVLRW in GhCESA1) that are characteristic of processive Glycosyltransferase Family 2 enzymes, including cellulose synthases. Of these, the first two aspartates coordinate the UDP of UDP-glucose in the catalytic site (via a divalent cation for DCD), the aspartate in TED is the catalytic base, and the arginine of QVLRW also coordinates UDP while its tryptophan forms van der Waals interactions with the most recently added glucose in the acceptor chain. These functional predictions based on the crystal structure alone were further validated through hybrid quantum mechanical/molecular mechanical computational methods, which were used to generate and optimize models of the BcsA active site during three stages of the polymerization reaction: enzyme-substrate, transition state, and enzyme-product (Yang et al. 2015). The rate of the modeled polymerization reaction was about 8-27 glucoses per second, as compared to 90 residues per second observed in vitro for BcsA (Omadjela et al. 2013).
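To put the modeled and observed polymerization rates on a common footing, the short sketch below converts each rate into the time per glucose addition and the fold difference between observed and modeled values. It uses only the numbers quoted in the paragraph above; the constant and function names are illustrative assumptions.

```python
# Rates quoted in the text, in glucose residues per second.
MODELED_RATE_RANGE = (8.0, 27.0)  # modeled polymerization reaction
OBSERVED_RATE = 90.0              # in vitro rate reported for BcsA


def ms_per_addition(rate_per_s):
    """Average time per glucose addition, in milliseconds."""
    return 1000.0 / rate_per_s


def fold_difference(observed, modeled):
    """How many times faster the observed rate is than a modeled rate."""
    return observed / modeled


for modeled in MODELED_RATE_RANGE:
    print(
        f"modeled {modeled:g}/s: {ms_per_addition(modeled):.1f} ms per addition, "
        f"observed rate is {fold_difference(OBSERVED_RATE, modeled):.2f}x faster"
    )
print(f"observed {OBSERVED_RATE:g}/s: {ms_per_addition(OBSERVED_RATE):.1f} ms per addition")
```

On these figures the modeled reaction is slower than the in vitro measurement by roughly a factor of 3 to 11, which is one way to read how closely the computational estimate approaches the experimental rate.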
The BcsA crystal structure was part of a dimer that also included the non-catalytic, periplasmic BcsB protein, which interacted with BcsA via one transmembrane helix (TMH) and another periplasmic helix. Protein truncation experiments showed that only these two helices of BcsB were required for the in vitro activity of BcsA (Omadjela et al. 2013). No protein with a similar role to BcsB has been identified in plants, and none may exist. The absence of a BcsB equivalent in plants would be consistent with two purified CESAs, one from the moss Physcomitrella patens (PpCESA5) and one from hybrid aspen (PttCESA8), being active in vitro after reconstitution into proteoliposomes (Cho et al. 2017; Purushotham et al. 2016). A BcsB partner may be uniquely required in prokaryotes due to their multi-layered outer wall that must be traversed by the elongating glucan chains (either singly or in groups) before they are extruded outside the cell, where they may or may not form crystalline microfibrils in different bacterial species. In contrast, the glucan chains synthesized by plants form microfibrils near the surface of the plasma membrane, shortly after each chain passes through the TMH translocation channel that is predicted to exist in CESA (see below). Purified PttCESA8 and PpCESA5 were active in vitro without the addition of another priming molecule (Cho et al. 2017; Purushotham et al. 2016), arguing against the long-discussed necessity of a macromolecular primer for in vivo plant CESA activity. Possibly, a glucose monomer might bind and provide the initial acceptor for cellulose polymerization, as has been speculated for BcsA (McNamara et al. 2015).
Remarkably, 18 residues of the forming glucan chain also crystallized with BcsA, ten of which were in the translocation channel. This fortuitous outcome led to profound insights into cellulose translocation as well as polymerization. A glucan chain translocation channel was formed from cytosolic residues immediately above the catalytic site in conjunction with a pore about 8 Å wide formed by the TMHs that traversed the plasma membrane. A series of amino acids with both polar and non-polar side groups lines the TMH pore of BcsA, consistent with glucoses that are polymerized with β-1,4 linkage presenting different interaction interfaces as the chain translocates up the channel after each successive glucose addition. In addition, a continuous series of hydrogen bonds interacts with the glucoses in the channel (McNamara et al. 2015). Computational approaches have been used to demonstrate the feasibility of a 180° inversion in spatial orientation of each successive glucose immediately after it is added to the elongating chain (Knott et al. 2016; Yang et al. 2015). The models suggest that steric constraints imposed by aromatic residues at the entrance to the translocation channel cause the rotation of the terminal glucose of the acceptor chain. The rotation of each newly added glucose residue generates an 'in plane' conformation of the glucan chain that is favorable for traversing the membrane via the channel formed by TMHs. This explains how cellulose has a repeating unit of cellobiose despite the addition of glucose units one-by-one in the same orientation within one active site of each cellulose synthase. Both modeling studies concluded that the translocation process would not constitute a rate-limiting step for cellulose synthesis by BcsA (Knott et al. 2016; Yang et al. 2015). Limited conservation exists between the TMH domains of bacterial and plant cellulose synthases, making it reasonable to predict that glucan chain translocation in CESA occurs in a similar way. At the same time, the TMH region has much less structural conservation between Kingdoms than the catalytic domain (Slabaugh et al. 2014), presenting the need for further investigation of the structural basis for glucan chain translocation in CESA.

Relationship of rosette CSC structure to the fundamental cellulose fibril
The rosette CSC, which was first seen by freeze fracture transmission electron microscopy (FF-TEM) about 40 years ago (Giddings et al. 1980; Mueller and Brown 1980), was originally called a rosette 'terminal complex' due to its apparent association with the ends (termini) of the impressions of cellulose microfibrils in the plasma membrane. The 'rosette' descriptor refers to the six-lobed CSC structure, as viewed in FF-TEM replicas where the TMHs of multiple CESAs (Kimura et al. 1999) cross the plasma membrane (Fig. 1). Given that cellulose microfibrils are an essential strength component and scaffold for other polymers within cell wall structure, the size of the fibril made by one rosette CSC has been debated for many years. The size of the fundamental fibril depends on the number of simultaneously active CESAs within one rosette CSC. Although it is possible that not all CESAs in the CSC are active, there is no evidence for this occurring in nature so far.
The long-standing belief that each rosette CSC synthesizes a 36-chain cellulose fibril is now considered unlikely. This idea arose about thirty-five years ago from theoretical resonance between two hypotheses: (1) the approximately 4 nm wide microfibrils in the cell walls of the alga Spirogyra, which has rosette CSCs, were composed of 36 glucan chains (Herth 1983); and (2) a 3.5 nm square 'elementary fibril' of cellulose contained 40 chains, extrapolating from early cellulose structural information and TEM images (Mühlethaler 1967). An initial critique of the 36-chain hypothesis was based on calculations using the typical area occupied by one TMH and the assumption of eight TMHs within one CESA, supporting a maximum of 24 CESAs within the rosette CSC (Bowling and Brown 2008). New evidence summarized below suggests that one rosette CSC synthesizes an 18-chain fibril. Given that the term 'elementary fibril' has long been associated with a cellulose fibril containing 36-40 chains, we prefer the term 'fundamental fibril' to describe the fibrillar product of a single rosette CSC.
Fig. 1 Three views of rosette CSCs. a Cartoon of a rosette CSC embedded in the plasma membrane, based on a CESA model. A digital cut through the CSC and its surrounding membrane reveals the TMH region of one of the six trimeric lobes. The tops of the TMH regions barely protrude above the exterior plasma membrane surface (see also Haigler et al. 2014 for in situ images), and the modeled cytosolic regions are closely packed near the interior surface of the plasma membrane. b An FF-TEM image of a rosette CSC TMH region with six lobes, viewed top-down within the plasma membrane of a moss protonemal cell. In this example, the fracture process removed the exterior leaflet of the plasma membrane and the TMH regions were highlighted with shadowing metal to produce the replica that was viewed by transmission electron microscopy. The cytosolic regions were below the membrane when it was shadowed and are not visible. c A top-down, data-driven, schematic representation of a rosette CSC containing 18 CESAs. The TMH regions of the six lobes are represented by an image average of many FF-TEM images as shown in b. The cytosolic regions of each lobe are represented by the semi-transparent triangles, which were placed by hand and reflect the cross-sectional shape of a model derived from small angle X-ray scattering analysis of a trimer formed in solution after heterologous expression of a purified AtCESA1 cytosolic domain. The scale bar in c applies to b and c. Under the Creative Commons License (http://creativecommons.org/licenses/by/4.0/), this figure includes modifications of images originally published by the authors and their coworkers in Scientific Reports.

Number of CESAs in the CSC and implications for cell wall structure
As summarized and illustrated recently (Jarvis 2018), recent spectroscopic analysis is consistent with diverse cell walls most commonly containing microfibrils with 18-28 chains (Fernandes et al. 2011; Newman et al. 2013; Thomas et al. 2014; Turner and Kumar 2018; Wang et al. 2015). A study on primary cell wall (PCW) cellulose in mung bean hypocotyls by solid-state nuclear magnetic resonance (SSNMR) spectroscopy and wide angle X-ray scattering, together with computational simulations of diffractograms from different fibril sizes and aggregates, supported the existence of 18-chain fibrils and the possible coalescence of two of them to form a 36-chain unit. However, 24-chain fibrils could also fit the data (Newman et al. 2013). Many spectral interpretations, assumptions, and calculations must be made as part of this type of research, highlighting the need for other types of complementary data about the size of the fundamental fibril synthesized by one rosette CSC.
Recent studies focusing on CSC structure and composition support an 18 CESA model for the fundamental cellulose fibril. The demonstrations of 1:1:1 ratios for CESA isoforms involved in PCW and SCW formation (Gonneau et al. 2014; Hill et al. 2014) were logically correlated with three CESA isoforms within each of the six lobes of the rosette CSC, although an 18-, 24-, or 36-mer was equally feasible based on stoichiometry alone. Other kinds of structural analyses supported the 18-mer model. When the large cytosolic domain of a seed plant PCW CESA (AtCESA1) was heterologously expressed, it formed a trimer in solution as supported by modeling of small angle X-ray scattering data. The same structure had a triangular cross-section, as viewed by negative staining in TEM, and six of these shapes could be accommodated beneath the transmembrane regions of an image-averaged rosette CSC from P. patens protonema. Similarly, six trimeric lobes that had been assembled from a partial GhCESA1 model (including the TMH and catalytic domains, but excluding the N-terminal domain) fit best with the image-averaged rosette CSC. In silico analysis of free energies of various oligomers of this GhCESA1 model also favored a rosette CSC with 18 CESAs, three in each of six lobes.
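The stoichiometric argument above can be checked with a small enumeration. The sketch below lists CSC sizes that are compatible with six equal lobes and an overall 1:1:1 ratio of three isoforms, and it shows why stoichiometry alone cannot separate the 18-, 24-, and 36-mer options. The `lobe_balanced` flag, which marks sizes where each individual lobe could also hold the three isoforms in equal numbers, is an added illustrative assumption and not a criterion stated in the text.

```python
LOBES = 6      # lobes per rosette CSC
ISOFORMS = 3   # distinct CESA isoforms present in a 1:1:1 ratio


def candidate_csc_sizes(max_per_lobe=6):
    """Enumerate CSC sizes compatible with six equal lobes and a 1:1:1 ratio."""
    rows = []
    for per_lobe in range(1, max_per_lobe + 1):
        total = LOBES * per_lobe
        if total % ISOFORMS != 0:
            continue  # overall 1:1:1 ratio impossible for this size
        rows.append(
            {
                "per_lobe": per_lobe,
                "total": total,
                "copies_per_isoform": total // ISOFORMS,
                # Illustrative extra check: can every lobe itself be 1:1:1?
                "lobe_balanced": per_lobe % ISOFORMS == 0,
            }
        )
    return rows


sizes = candidate_csc_sizes()
for row in sizes:
    print(row)
```

Every multiple of six passes the overall ratio test, so the 1:1:1 data narrow nothing by themselves; the structural evidence for trimeric lobes is what singles out 18.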
The feasibility of an 18-chain fundamental cellulose fibril was shown through molecular dynamics simulations of β-1,4-glucan chains with ten cellobiose repeating units (Oehme et al. 2015b). This model was based on the cellulose Iβ crystal structure and included water. The approximately hexagonally shaped fibril model had six layers (with 2, 3, 4, 4, 3, and 2 chains, respectively, designated as a 234432 model). The 18-chain fibril was preferred over a 36-chain model in terms of consistency with the data from spectroscopic analysis of cell walls summarized earlier (Oehme et al. 2015b). When density functional theory calculations were used to analyze the feasibility of 18-chain fibrils with different shapes, the 234432 model was considered slightly less likely than a lower energy, 5-layered, 34443 model (Kubicki et al. 2018). A microfibril model with six layers of three chains each was considered to be unlikely. In the 34443 model, there was one 'core chain' that was mainly two residues below the surface, although, on one of four sides of the microfibril cross-section, only one chain shielded the 'core chain' from a cleft in the fibril surface. Interestingly, one chain protrudes outward on the opposite side of the fibril, allowing us to speculate that it could fit into the cleft of an adjacent fibril in an interaction that could facilitate fibril bundling. Two thirds of the chains are on the surface in both the 234432 and 34443 18-chain fibril models, leading to poor lateral chain order and a high association with water in silico (Oehme et al. 2015b). These features could confer potential to interact with other cellulose fibrils or cell wall matrix components in vivo, which could in turn confer flexibility to the assembly of cell walls with diverse biomechanical properties to serve many roles in plant structure and physiology (Jarvis 2018; Nixon et al. 2016).
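The layer notation used above (234432, 34443) lends itself to a quick count of total and surface chains. The sketch below is a simplified illustration under a stated assumption: every chain in the top and bottom layers counts as surface, and in each interior layer only the two end chains count as surface. This counting rule is an approximation introduced here and is not the method used in the cited simulations.

```python
def surface_and_total(layers):
    """Return (surface_chains, total_chains) for a stacked-layer fibril model.

    Simplifying assumption: all chains in the first and last layers are on
    the surface; in every interior layer only the two end chains are.
    """
    total = sum(layers)
    if len(layers) == 1:
        return total, total
    surface = layers[0] + layers[-1]
    for chains_in_layer in layers[1:-1]:
        surface += min(chains_in_layer, 2)
    return surface, total


models = {
    "234432": [2, 3, 4, 4, 3, 2],
    "34443": [3, 4, 4, 4, 3],
}
for name, layers in models.items():
    surface, total = surface_and_total(layers)
    print(f"{name}: {surface} of {total} chains on the surface")
```

Under this rule both 18-chain models place 12 of 18 chains on the surface, which agrees with the two-thirds figure given in the text.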

Formation of fundamental fibrils and macrofibrils
FF-TEM images of the plasma membrane surface during secondary cell wall (SCW) deposition suggest that cellulose microfibrils form just beyond the extrusion sites in the rosette CSC. Hemispherical domes at the ends of some fibrils may represent a pool of glucan immediately above the rosette CSC. This was supported by molecular dynamics simulations of six atomistic glucan chains, in which the chains formed a pool and then interacted in pairs before forming a unified six-chain fibril through hydrogen bonding and van der Waals interactions. Potentially, a pool of glucan from which fibrils are continuously drawn as the CSC moves forward would promote the continuity of fibril formation even if the 18 CESAs in the rosette CSC synthesize glucan chains at slightly different rates.
The assembly of CESA proteins into multimeric complexes is essential for the crystallization of microfibrils with multiple chains aligned in parallel without chain folding. The β-1,4 linkage produces a stiff and insoluble molecule at cellohexaose and above, favoring cellulose self-aggregation including through chain folding (Diotallevi and Mulder 2007; Fernandes et al. 2011; Taylor 1957; Umemura et al. 2004). Natural cellulose instead forms the cellulose I allomorph from extended, non-folded glucan chains as the multimeric CSCs move forward in the plasma membrane. Modeling showed that interactions with the membrane surface favored the coalescence of glucan chains aligned in parallel. This resonates with the observation that CESAs may be modified in ways that generate and/or foster their association with microdomains in the plasma membrane. Protein motif analysis of AtCESA1, 4, 6, 7, and 8 and biochemical experiments on AtCESA7 show that CESAs can be acylated, that is, have hydrophobic, long-chain, saturated fatty acids reversibly attached through covalent bonds. As many as 100 acyl groups could be covalently bound to CESAs and interact with the membrane around each CSC. More exploration is needed of the detailed mechanistic implications of this phenomenon, which could impact many levels of CSC assembly and function in vivo, as occurs for membrane-bound ion channels (Li and Qi 2017).
Larger cellulose fibrils can exist within plant cell walls even if the fundamental fibril synthesized by one rosette CSC has only 18 chains. These small fundamental fibrils may 'bundle', or associate along at least part of their length even though they do not merge into one crystalline core (Oehme et al. 2015a; Zhang et al. 2014). Atomic force microscopy images of hydrated cellulose microfibrils on the innermost layer of intact, minimally perturbed, onion epidermal cell walls are consistent with this possibility (Zhang et al. 2014), and the convergence regions may have special roles in cell wall polymer interactions and mechanics. The atomic force microscopy images revealed structures ranging from 3.5 nm wide, apparently single fibrils to 35 nm wide fibrillar aggregates, with about 60% single fibrils, 20% twinned fibrils, and less common higher order associations (Zhang et al. 2016). Correspondingly, irregular groups of two, three, or more rosette CSCs have been seen by FF-TEM in cells of land plants during SCW synthesis (Herth 1985; Schneider and Herth 1986), and cellulose microfibrils may also interact post-synthesis in the cell wall space. The use of Sum Frequency Generation spectroscopy to examine intact SCWs showed a stronger 2944 cm⁻¹ peak (carbon-hydrogen stretch region) relative to the weaker 3320 cm⁻¹ peak (hydroxyl stretch region), which was suggested to arise from closely packed, oppositely aligned, cellulose fibrils. The opposite alignment is consistent with the observation that rosette CSCs move close to each other in the plasma membrane, but in opposite directions, during both PCW and SCW cellulose synthesis (Paredez et al. 2006; Watanabe et al. 2015).
The formation of larger cellulose macrofibrils from closely associated fundamental fibrils may be assisted by cell wall matrix components and/or lignin through adhesion or dehydration (Donaldson 2007). There is also evidence to support fundamental fibrils merging before crystallization into larger cellulose crystallites, e.g. cotton fiber SCWs contain cellulose fibrils/crystallites with 4-5 nm overall dimensions that have been modeled as containing 46-52 chains (Fang and Catchmark 2014; Lee et al. 2015; Martinez-Sanz et al. 2017). Populus tension wood has similar 4.5 nm cellulose fibrils, whereas normal wood and opposite wood have smaller microfibrils (3.7-3.9 nm lateral dimensions, respectively) (Foston et al. 2011). A higher density of rosette CSCs could foster formation of larger cellulose fibrils through production of closely spaced fundamental fibrils that could interact before crystallization finalizes. Low density rosette CSCs have been seen in cotton fibers during PCW and SCW synthesis (Herth 1985), but cotton is not well-suited for FF-TEM due to challenges in mounting undamaged long fibers prior to freezing. However, FF-TEM revealed rosette CSC densities of 93-135 per μm² in tracheary elements engaged in SCW deposition (Schneider and Herth 1986), about ten times higher than in cells synthesizing PCWs (reviewed by Emons 1991; Herth 1985). Fluorescently labeled CESAs also occur at higher density in the plasma membrane (where they are presumed to exist within rosette CSCs) of differentiating tracheary elements monitored by live cell imaging (Li et al. 2016; Watanabe et al. 2015). An increased bias toward parallel movement of rosette CSCs during SCW synthesis, as compared to frequent bidirectional movement during PCW synthesis, would also promote the possibility of co-crystallization of the glucan products of more than one CSC and/or fibril bundling by surface interactions (Li et al. 2016; Watanabe et al. 2015). Currently we do not know what determines the directionality of CSC movement in the plasma membrane once the CSCs arrive at the general location of cellulose synthesis, which is established in interaction with cytoskeletal elements (Cosgrove 2014; Jarvis 2013, 2018; Kumar and Turner 2015; Li et al. 2014; McFarlane et al. 2014; Meents et al. 2018; Schneider et al. 2016; Slabaugh et al. 2014; Turner and Kumar 2018).
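The rosette CSC densities quoted above can be translated into a rough typical spacing between neighboring complexes, which helps in judging how readily adjacent fundamental fibrils might meet. The sketch below assumes, purely for illustration, that CSCs are spread evenly so that each occupies a square patch of membrane; real distributions are irregular, and the function name is hypothetical.

```python
import math


def mean_spacing_nm(density_per_um2):
    """Approximate center-to-center spacing (nm) for an evenly spread population.

    Assumes each complex occupies a square patch of area 1/density, so the
    spacing is the side length of that patch. 1 micrometer = 1000 nm.
    """
    if density_per_um2 <= 0:
        raise ValueError("density must be positive")
    return 1000.0 / math.sqrt(density_per_um2)


# SCW densities quoted in the text, plus values ten times lower as a
# stand-in for PCW-synthesizing cells ("about ten times" in the text).
for label, density in [("SCW low", 93.0), ("SCW high", 135.0),
                       ("PCW low", 9.3), ("PCW high", 13.5)]:
    print(f"{label}: {density:g} per um^2 -> ~{mean_spacing_nm(density):.0f} nm spacing")
```

Because spacing scales with the inverse square root of density, a tenfold rise in density shortens the typical spacing by a factor of about 3.2 under this assumption.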

Origins and roles of diverse CESAs and CSCs
Rosette CSCs occur in all land plants examined (reviewed in Emons 1991), and also in their closest relatives within the charophycean green algae. Phylogenetic relationships within this algal group have been difficult to resolve (Wickett et al. 2014), and CSC structure has not been determined for members of several charophycean lineages (Klebsormidiales, Chlorokybales, and Mesostigmatales). Furthermore, the unusual structures reported for Coleochaete scutata CSCs (Okuda and Brown 1992) may be misidentified based on strong resemblance to plasmodesmata (Willison 1976). However, available data are consistent with the rosette CSC first appearing in a charophycean green alga at least 630 million years ago (Morris et al. 2018), then being retained throughout the descent to land plants. Additionally, rosette CSCs exist in unicellular and filamentous desmids (e.g. Micrasterias and Spirogyra, respectively) and the giant-celled Charales. All of these organisms are expected to produce 18-chain fundamental fibrils from one rosette CSC, whereas thick microfibrils are produced by large CSCs in chlorophycean green algae (Tsekos 1999). The enhanced interaction potential of smaller fibrils, leading to greater cell wall diversity, may have contributed to the success of charophycean lineages in diverse aquatic and some terrestrial environments, as well as the ability of plants to colonize terrestrial environments (Harholt et al. 2016; Jarvis 2018; Nixon et al. 2016).

Diverse types of rosette CSCs and CESAs
Substantial evidence accumulated through genetic, biochemical, and imaging experiments supports the existence of two major types of CSCs in Arabidopsis, one responsible for the biosynthesis of PCW cellulose and another responsible for the biosynthesis of SCW cellulose (McFarlane et al. 2014; Meents et al. 2018). Results of these experiments are consistent with both types of CSCs being obligate hetero-oligomeric protein complexes. Both the PCW and SCW CSCs contain multiple copies of three non-interchangeable CESA isoforms, all of which are required for CSC assembly and delivery to the plasma membrane (Desprez et al. 2007; Persson et al. 2007; Taylor et al. 2003). In Arabidopsis, AtCESA1, 3 and a member of the AtCESA6-like class are required in PCW CSCs; and AtCESA4, 7 and 8 are required in SCW CSCs (reviewed by McFarlane et al. 2014). These six isoforms define six CESA sequence classes that correspond to six phylogenetic clades that have been strongly conserved in seed plants (Carroll and Specht 2011; Kumar et al. 2016).
The two major types of rosette CSCs participate in two distinct phases of plant morphogenesis. PCW CESAs are expressed in expanding organs (Hamann et al. 2004), and their mutations are either lethal or cause primary defects in cell expansion (reviewed by McFarlane et al. 2014). In contrast, SCW CESAs are expressed in differentiating xylem (reviewed in Taylor et al. 2004) and are tightly co-regulated with other genes involved in SCW deposition (Brown et al. 2005; Ruprecht et al. 2011). SCW CESA mutations produce the 'irregular xylem' phenotype in which the water conducting vessels have collapsed inward (Turner and Somerville 1997). This typical division of CSC activity evidently appeared and was stabilized early in the evolution of the seed plants, but the two major types of CSCs are also used in other ways. Promoter-reporter experiments supported the activity of PCW CESAs in the deposition of the thick cell walls of Arabidopsis trichomes (Betancur et al. 2011). Biochemical and gene suppression experiments show that both PCW and SCW CESAs are involved in the synthesis of thick cell walls in poplar wood (Song et al. 2010; Xi et al. 2017). Analysis of mutant phenotypes and live cell imaging showed that the cellulose in Arabidopsis seed mucilage is synthesized by a PCW type CSC in which CESA1 is replaced by its closest paralog CESA10 (Griffiths et al. 2015). These occurrences also highlight the diversity and complexity of plant cell walls beyond the traditional PCW and SCW categories, with highly specialized cell walls being assembled through use of a 'cell wall toolbox' (Betancur et al. 2011).

Origin of CESA and CSC diversity
Like seed plants, the moss P. patens uses different CESAs for PCW and SCW biosynthesis, and SCW biosynthesis in leaf midribs requires two distinct CESAs (Norris et al. 2017). Phylogenetic analysis supports independent evolution of this functional divergence in mosses and seed plants, which shared a common ancestor with a single CESA (Carroll and Specht 2011; Roberts and Bushoven 2007; Yin et al. 2009). This common ancestor also likely had rosette CSCs, which must have been homo-oligomeric, with multiple copies of only one CESA (Roberts and Bushoven 2007; Roberts et al. 2012). Gene duplication and sequence divergence within the euphyllophyte lineage (ferns and seed plants) produced the six CESA classes (or clades) now observed in seed plants. The conifers and angiosperms that have been examined have at least one member of each class (Carroll and Specht 2011; Jokipii-Lukkari et al. 2017; Kumar et al. 2016), and several of the classes also include fern CESAs (Yin et al. 2014).
Phylogenetic analyses show that the PCW and SCW CESAs of seed plants were separated before three clades diverged independently within both groups to generate heteromeric PCW and SCW CSCs (Roberts et al. 2012). Evolution of the specialized SCW CSCs in mosses represents a third independent origin of functionally specialized CESAs (Norris et al. 2017). This convergent evolution is consistent with strong selective advantage conferred through the uncoupling of transcriptional regulation of the different CESA isoforms (Norris et al. 2017). For example, diversification of cellular function is promoted by only specific cells accumulating the high densities of CSCs that support the aggregation of fundamental fibrils and SCW thickening after cell expansion ends (Li et al. 2016;Schneider and Herth 1986). An explanation was needed, however, for the multiple cases of independent evolution of hetero-oligomeric CSCs.
The most compelling explanation is found in a hypothesized evolutionary ratchet that drives protein systems toward complexity (Doolittle 2012). This process, a form of constructive neutral evolution (CNE), requires duplicate genes (or paralogs) to exist in the same genome, which is an outcome of the whole genome duplications that have frequently occurred during land plant evolution (Lang et al. 2018; Ren et al. 2018). Functional differentiation of duplicate genes often occurs by positive selection when a mutation affecting the expression pattern (sub-functionalization) or biochemical function (neo-functionalization) of one of the encoded proteins confers an adaptive advantage (Conant and Wolfe 2008). In contrast, the CNE hypothesis explains the evolution of hetero-oligomeric multi-subunit protein complexes without positive selection (Doolittle 2012; Finnigan et al. 2012). Figure 2 illustrates a hypothetical scenario for the evolution of obligate hetero-oligomeric CSCs by the evolutionary ratchet of CNE, combining the effects of gene duplications, divergence of the paralogs by accumulation of neutral mutations, and occasional mutations with differential effects on self- and non-self-interactions between CESAs.
The diagrams and accompanying text in the figure provide a stepwise illustration of how an initially homo-oligomeric CSC could have evolved through the CNE process into a hetero-oligomeric CSC with each of three CESA isomers in a fixed position.
Step one illustrates an ancestral homo-oligomeric CSC lobe containing only one CESA, and step two illustrates the co-existence of two CESA paralogs (encoding isomers A and B) after a genome duplication. Initially A and B would have been identical, and they remain interchangeable even as they accumulate neutral mutations that lead to them becoming distinct. Although some of the mutations occur at interfaces where the CESA isomers interact, they do not initially alter self- or non-self-interactions between the isomers. In the third step, a mutation blocks the self-interaction of isomer A. This is also a neutral mutation because it does not prevent formation of lobes containing only B or A + B. However, the mutation results in neofunctionalization of A such that it can now occupy only one position within a CSC lobe. Although deleterious mutations that block both self- and non-self-interactions are expected to be more common, these would be eliminated by selection because the lobe could not form. In step four, an additional genome duplication together with the continued accumulation of neutral mutations generates two pairs of isomers, A′/A″ and B′/B″, with the members of each pair initially able to function interchangeably within the CSC. Eventually, neutral mutations block the interaction of B′ with itself (step five) and B″ with itself (step six). At this point, B′ and B″ are neofunctionalized, but A′ and A″ cannot neofunctionalize in this way because this pair occupies only one position within the CSC due to the inability of their common ancestor (A) to self-interact. These isomers, as well as those arising from gene duplications subsequent to the evolution of the obligate hetero-oligomeric CSC, would be susceptible to subfunctionalization (divergence of expression patterns) as seen for the 6-like CESAs in Arabidopsis PCW CSCs. More commonly, duplicated genes are eliminated by selection.
Loss of either A′ or A″ would result in an obligate hetero-oligomeric CSC in which each position is occupied by a single CESA isomer, as in Arabidopsis SCW CSCs.
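The stepwise scenario can be made concrete with a small enumeration. The following Python sketch is an illustration only, built on simplifying assumptions that are not part of the original figure: a lobe is treated as a trimer in which every pair of subunits must be able to interact, and each interaction-altering mutation is represented as a blocked pair. The function names `pair` and `allowed_lobes`, and the labels A1/A2/B1/B2 (standing in for A′/A″/B′/B″), are hypothetical.

```python
from itertools import combinations, combinations_with_replacement


def pair(x, y):
    """Unordered pair of isomers; pair('A', 'A') denotes a self-interaction."""
    return frozenset((x, y))


def allowed_lobes(isomers, blocked):
    """List trimer compositions in which no pair of subunits is blocked.

    Assumption (illustrative): every pair of subunits in a lobe must interact.
    """
    lobes = []
    for lobe in combinations_with_replacement(sorted(isomers), 3):
        if all(pair(x, y) not in blocked for x, y in combinations(lobe, 2)):
            lobes.append(lobe)
    return lobes


# Step one: a single ancestral CESA forms a homo-oligomeric lobe.
step1 = allowed_lobes({"A"}, set())

# Step three: isomer A can no longer interact with itself.
step3 = allowed_lobes({"A", "B"}, {pair("A", "A")})

# Step six: A1 and A2 inherit the block from A, and B1 and B2 have each
# lost self-interaction.
blocked6 = {pair("A1", "A1"), pair("A2", "A2"), pair("A1", "A2"),
            pair("B1", "B1"), pair("B2", "B2")}
step6 = allowed_lobes({"A1", "A2", "B1", "B2"}, blocked6)

# Loss of A2 leaves one obligate hetero-trimeric composition.
final = allowed_lobes({"A1", "B1", "B2"}, blocked6)

print(step1)
print(step3)
print(step6)
print(final)
```

Under these assumptions, the set of permitted lobe compositions narrows from a single homo-trimer to lobes holding at most one A, and finally to hetero-trimers with one isomer per position, without any step requiring a gain of function.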
As a result of the hypothetical progression of the CNE process, the CESA isoforms might differ only in the position that they can occupy within the complex. Similar scenarios involving interaction-altering mutations in different interfaces could explain independent evolution of obligate hetero-oligomeric PCW and SCW CSCs in seed plants, which both have three unique positions. In contrast, the SCW CSCs of mosses have only two unique positions (Norris et al. 2017) as represented in step four of Fig. 2. Known gene duplications in the moss lineage are much more recent than those that generated the six seed plant CESA classes (Lang et al. 2018). Thus, the moss CESA family may be at an intermediate stage in the evolution of hetero-oligomeric CSCs. As the progression toward a hetero-oligomeric CSC occurs via changes in allowed interaction interfaces, the newly distinct CESA isomers could also evolve other differences, as will be discussed later.

Fig. 2 A hypothetical scenario for the evolution of obligate hetero-oligomeric CSCs from an ancestral homo-oligomeric CSC according to the constructive neutral evolution hypothesis (Doolittle 2012; Finnigan et al. 2012). The text in the figure describes a stepwise scenario for the evolution of complexity in the CESA family and in CSC structure, as illustrated in each accompanying diagram.

Implications of CESA diversity for CSC substructure and function
CSCs are often illustrated as having identical hetero-oligomeric lobes. However, an arrangement with two each of three different homo-trimeric lobes remains possible for CSCs containing 18 CESAs (Olek et al. 2014; Turner and Kumar 2018; Vandavasi et al. 2016). Although a role for CNE in the evolution of CSCs (Fig. 2) has not been tested experimentally, this hypothesis is helpful in thinking about these alternatives.
CNE is expected to have operated on CSC lobes containing a random mixture of identical isoforms following gene duplication, leading to CSCs with identical hetero-trimeric lobes. Evolution of CSCs with three different homo-trimeric lobes would require positive selection for lobes containing identical isoforms. This cannot be ruled out, but it is less parsimonious than the scenario illustrated in Fig. 2. Although originally interpreted in the context of the 36-mer CSC model, results from attempts to purify CSCs by tandem affinity chromatography from Arabidopsis lines expressing both His-CESA7 and STREP-CESA7 (Atanassov et al. 2009) are consistent with an 18-mer CSC with hetero-trimeric lobes. The ~440 kDa (4-mer) and ~700 kDa (6-mer) complexes that were observed may represent two associated lobes (6-mer) and single lobes associated with an additional CESA7 subunit (4-mer) as the smallest units with two CESA7 subunits that could be derived from a CSC with hetero-trimeric lobes.
Cellulose synthesis researchers have long assumed that regions that vary in sequence between isomers will be where interfaces occur within and between lobes of different types of CSCs (Carroll and Specht 2011; Hill et al. 2018a; Kumar et al. 2016; Vergara and Carpita 2001). Most sequence divergence across all CESAs occurs in four regions: (1) the N-terminus, which is truncated in some isoforms; (2) the hypervariable region between the Zn-binding domain and the first TMH; (3) the class-specific region (CSR) within the catalytic domain; and (4) the short C-terminus following the last TMH (Carroll and Specht 2011). Several "class-specific" sequence regions, which have greater amino acid diversity between versus within CESA classes, have been a particular focus for potential roles in CESA-CESA interactions (Carroll and Specht 2011; Hill et al. 2018a; Kumar et al. 2016; Vergara and Carpita 2001). However, the CESAs within CSCs are apparently tightly packed with many contact regions.
Whereas evolution of hetero-oligomeric CSCs involved critical mutations that altered the binding properties of some of these contact regions, others retained their original binding properties (Fig. 2). For example, in the moss CESAs, the CSRs are interchangeable between functional CESA classes despite being class-specific at the sequence level (Scavuzzo-Duggan et al. 2018).
The CSR is also intrinsically disordered, with two important implications: (1) it is inherently suitable as an interaction domain, although we do not currently know whether this occurs between or within lobes; and (2) due to relaxed selection, it is prone to become class-specific at the sequence level by accumulating mutations that do not alter its functional role in the CSC (Scavuzzo-Duggan et al. 2018). Indeed, the critical mutations that alter interactions may reside within regions that are otherwise highly conserved. We can also predict that class-specific interfaces differ in CSCs that evolved the hetero-oligomeric state independently. This prediction is supported by experimental evidence. Domain swap experiments in Arabidopsis (Hill et al. 2018a; Kumar et al. 2016; Wang et al. 2006) and P. patens (Scavuzzo-Duggan et al. 2018) have shown that the specific regions conferring functional class-specificity differ among the CESA isomers composing moss CSCs, seed plant PCW CSCs, or seed plant SCW CSCs.
To summarize and look to the future, it appears likely that the CESA-CESA interfaces within the CSCs of extant plants existed in the ancestral rosette and that they have become modified in different CSC lineages such that they contribute to the specific interactions between isoforms. By combining the results from domain swap experiments with computational modeling of interactions between CESA isomers, it should be possible to predict small sequence motifs or single residues that have high probabilities of participating in CESA-CESA interactions and test their function using targeted mutagenesis and complementation assays. We also need to understand why three CESA isoforms are required for assembly and function of seed plant CSCs in vivo whereas cellulose synthesis can be reconstituted from a single hybrid aspen CESA in vitro (Purushotham et al. 2016) and heterologously-expressed CESAs can homotrimerize  or homodimerize (Olek et al. 2014). Although more complex explanations may eventually be discovered, in vitro conditions may not constrain oligomerization in the same way as occurs in vivo. For example, less stable complexes may persist in vitro, while they would be removed by cellular protein quality control mechanisms in vivo. Interactions in the TMH domain may also more strongly restrict assembly as compared to only cytosolic interfaces in the partial CESAs.
In addition, CESAs may function in non-canonical ways beyond forming typical hetero-trimeric PCW and SCW CSCs, particularly in plants with expanded CESA gene families (poplar has 17 CESAs, Kumar et al. 2009) and phenotypes that depend on very high amounts of cellulose, e.g. wood. Larger CESA gene families arose through successive whole genome duplications, including ones that were specific to particular lineages (Ren et al. 2018). Afterwards, the interaction interfaces in CESA paralogs may have continued to evolve so that CSCs with non-canonical composition may now function in specialized plant tissues. This is an area of important future research, particularly coupled with parallel analysis of cellulose microfibril properties, as exemplified in recent work on wood formation in Populus tremula (aspen trees) (Zhang et al. 2018). Proteomic analysis showed that, during deposition of the cellulose-rich gelatinous layer in developing tension wood, PttCESA8b accounts for 78% of the total SCW CESAs. This observation is consistent with a role for both hetero-oligomeric SCW CSCs and homo-oligomeric PttCESA8b CSCs in biosynthesis of the gelatinous layer, which apparently contains wider-diameter cellulose microfibrils as compared to normal aspen wood (Zhang et al. 2018). Interestingly, cellulose fibril formation occurs in vitro from heterologously expressed PttCESA8, a single CESA isomer from hybrid aspen, after reconstitution into proteoliposomes (Purushotham et al. 2016). Although other possibilities have not been completely ruled out, available evidence suggests that PpCESA5 functions as a homo-oligomer (Li 2017), and this isomer also supports in vitro cellulose fibril formation when reconstituted into proteoliposomes (Cho et al. 2017). Therefore, cellulose fibril formation in vitro seems to occur most readily from isomers that may function as homo-oligomers in vivo.
This presents the challenge of generating similar outcomes in vitro from hetero-oligomeric CESA assemblies in order to provide an additional tool for analysis of the biochemical characteristics of canonical hetero-trimeric PCW and SCW CSCs.

Regulation of cellulose properties by CESA and CSC activity
Our view of how CESAs function to synthesize β-1,4-glucan and how the CSC generates a microfibril has been strongly influenced by the concept that rosette CSCs are hetero-oligomeric, as demonstrated by analysis of the effect of CESA mutations on typical primary and secondary walls of Arabidopsis. However, the likely existence of functional homo-oligomeric rosette CSCs in the common ancestor of seed plants and mosses indicates that the different subunits of hetero-oligomeric rosette CSCs do not have distinct essential roles in glucan chain elongation and that the distinct interfaces that now exist between different subunits are not essential for formation of the rosette morphology (Roberts and Bushoven 2007). Nonetheless, CESA isomers have evolved distinct roles, including non-interchangeable positions in typical heteromeric PCW and SCW CSCs (see above). Beyond this, the expanding family of CESA isomers has evolved regulatory differences. For example, when catalytically inactivated versions of each Arabidopsis SCW CESA were expressed in their cognate null mutant backgrounds, differences in the extent of rescue were observed, as would be consistent with unequal contributions of the different isomers to overall CSC activity. Our current insights into such changes are summarized below, including discussion of particular cellulose properties that are predicted to be modulated through CESA and CSC activity.

Evidence for differences between CESA isomers and CSCs
As a general principle, protein activities in living organisms are frequently regulated by a variety of post-translational modifications. One of the most common is the addition and removal of phosphate groups, which changes protein conformation and may either activate, inactivate, or modulate CESA activity (Chen et al. 2010). As reviewed elsewhere (Jones et al. 2016;Kumar and Turner 2015;Li et al. 2014;McFarlane et al. 2014;Speicher et al. 2018), there is evidence from analysis of cellular proteomes that AtCESA1, 3, 4, 5, and 7 isomers are phosphorylated at one or more sites within the N-terminus and/or catalytic domains. These phosphorylation sites are often conserved in the orthologs from other plant species (Carroll and Specht 2011), which is consistent with conserved regulatory potential of both PCW and SCW CSCs. Whether or not phosphorylation-dependent regulation actually occurs in particular cell types or developmental stages depends on the presence and activity of the enabling protein kinases and phosphatases, which are not yet identified for CESAs. Functional analysis of CESAs with changed phosphorylation potential by complementation of their cognate mutant lines provided early evidence that CSC velocity and movement direction are under regulatory control in the cell (Chen et al. 2010). Phosphorylation status can also affect the stability and degradation of CESA (Hill et al. 2018b;Taylor 2007).
The velocity and trajectory of CSC movement in the plasma membrane have been analyzed in an expanding set of cells in recent years, building on many analyses done in etiolated hypocotyls of Arabidopsis wild-type and mutant lines. For example, the PCW CSCs of the grass Brachypodium distachyon moved with a velocity distribution and mean rate (164-184 nm/min) similar to those of Arabidopsis hypocotyl PCW CSCs when analyzed in the same lab (Liu et al. 2017). Therefore, there may be conserved aspects of PCW CESA and CSC function in expanding plant cells, which is consistent in theory with the phosphorylation discussion above. The dynamic microtubule array has complex impacts on CESA movement rates that are still being explored mechanistically (Liu et al. 2017; Woodley et al. 2018). Other potential regulatory factors include structural differences between CESA isomers, post-translational modifications, interactions with other proteins, specialized plasma membrane domains, temperature, and/or cell expansion rate and direction.
Caution is appropriate when comparing CESA velocities in different cell types due to such variables and potential pleiotropic effects of mutations in experimental genotypes. Many of these variables were eliminated in observations of AtCESA1 (PCW CSCs) and AtCESA7 (SCW CSCs) moving at different rates in the same membrane of a differentiating tracheary element. The difference in average velocity was substantial, with AtCESA7 moving about 70% faster than AtCESA1 during two of three developmental stages assayed. The reduced velocity of AtCESA7 toward the end of SCW synthesis highlights the cellular regulation of this important aspect of CSC behavior. These interesting observations were made during the 'transition stage' of cell wall development that bridges PCW and SCW deposition in some cells such as the tracheary elements (Meents et al. 2018). Further supporting isomer-specific cellular regulation, the tagged AtCESA1 and AtCESA7 proteins were internalized independently even though they were delivered to the plasma membrane in the same vesicles. AtCESA1 was selectively removed at the end of the transition stage, while AtCESA7 continued to function in the plasma membrane for SCW cellulose synthesis. These differences in velocity and cellular trafficking observed concurrently between and within CESA isomers in one cell type are likely to be determined by differences in protein structure or post-translational modifications, but these remain to be fully elucidated.
Other recent evidence raises the possibility of functional variation between CSCs. Optimized FF-TEM sample preparation methods showed that the CSCs in moss protonema synthesizing primary walls had a range of diameters, with lobes often, but not always, appearing triangular. The images analyzed are of the TMH region of the membrane-embedded CSC, as commonly revealed by the FF-TEM method. Measurements of 324 CSCs showed an external diameter range of 17.6-25.6 nm (45% variation), with a mean of 21.4 ± 1.3 nm. These values derive from measurements based on hexagonal geometry, i.e. point-to-point on the exterior of the lobes, which is more precise than drawing an encompassing circle given the slight ellipsoidal shape of some rosette CSCs. The average lobe area within the CSCs was 39.9 ± 6.5 nm², based on measurements of the individual lobes of 50 randomly selected CSCs, with a range of 30.8-49.6 nm² (61% variation). Some of the lobes (30-70%) appeared triangular, with others appearing more like squares. These observations are expected to be close to reality, given the small grain size and light coating of platinum/carbon used to generate the replicas (see Fig. S3 in Nixon et al. 2016).
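The percentage variations quoted above can be reproduced with a short calculation. This minimal Python sketch assumes that "variation" means the width of the observed range divided by its minimum value, an interpretation that matches the reported figures but is not stated explicitly in the text. The function name `percent_variation` is hypothetical.

```python
def percent_variation(low, high):
    """Width of a range expressed as a percentage of its minimum.

    Assumption: this is how the quoted variation percentages were derived.
    """
    return 100.0 * (high - low) / low


# External CSC diameter range (nm) and individual lobe area range (nm^2).
diameter_var = percent_variation(17.6, 25.6)
lobe_area_var = percent_variation(30.8, 49.6)

print(round(diameter_var), round(lobe_area_var))  # 45 61
```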
As we speculated previously, these large differences in rosette CSC morphology, even within one cell type, could reflect functional variability, e.g. activated versus non-activated CSCs. For example, do other membrane-bound proteins tightly, but perhaps reversibly, associate with CSCs to cause changes in diameter and lobe area and shape? One strong candidate for such a role would be the endo-β-1,4-glucanase, KORRIGAN1, which has one TMH and co-localizes near or with PCW CESAs in Arabidopsis and impacts CESA velocity and cellulose accumulation (Vain et al. 2014). Variable CSC diameter in the TMH region and variable spacing between the TMH lobes could also arise from regulated changes in the conformation of flexible CESA regions on the periphery of the cytosolic catalytic core, e.g. the CSR, the plant-conserved region (P-CR), and/or the N-terminal regions (Scavuzzo-Duggan et al. 2018). Shape variations may also arise from stresses inherent in cellulose polymerization and microfibril production (Diotallevi and Mulder 2007; Turner and Kumar 2018). In the future, it will be worthwhile to compare CSC morphology in different organisms and developmental stages and after different experimental treatments. Consequently, we may gain a better understanding of how rosette CSC variations are regulated, with potential consequences for cellulose microfibril structure.

Degree of polymerization of cellulose
The degree of polymerization (DP) of cellulose in wild type plants varies. For example, solvated cellulose from cotton fiber PCWs versus SCWs had typical DP values of at least 617 or 6,170, respectively (Timpa and Triplett 1993). The apparent DP was 7500 in SCW-rich Arabidopsis stems subjected to ball milling and relatively gentle cell wall extraction and solubilization (Schneider et al. 2017). The DP varied between 2000 and 2500 in highly purified cellulose from three types of poplar wood (normal wood, tension wood, and opposite wood) (Foston et al. 2011), but these may be underestimates due to the extensive sample processing.
We know little about how cellulose DP is controlled, although the properties of CESAs as well as regulatory factors could have an impact. The length of one cellulose chain would logically be affected by the extent of enzyme processivity, or how long continuous polymerization proceeds before the acceptor chain is released from the active site causing chain termination. Control at this level would determine DP if the enzyme operated in isolation, but in the living plant other factors such as chain cleavage by membrane-anchored endo-1,4-β-glucanases that are required for normal cellulose synthesis (Mølhøj et al. 2001; Yu et al. 2013), CSC retrieval from the membrane under abiotic stress, or diminished carbon status of the plant (Ivakov et al. 2017) could potentially cause abrupt chain termination. There may also be CESA structural features regulating substrate access to the catalytic site, and implicitly the extent of glucan chain continuity. Part of the P-CR region from rice OsCESA8 (77 amino acids of 125 total) was crystallized, revealing two interacting alpha helices connected by an ordered loop (Rushton et al. 2017). This solved structure was consistent with computational models of the entire P-CR regions from the six major Arabidopsis CESA isoforms (Sethaphong et al. 2016). These two helices of the P-CR were modeled as existing above the catalytic site, where they could potentially be involved in regulating substrate access (Rushton et al. 2017). A still undescribed regulatory signal might cause the two helices of the P-CR to block substrate access to the catalytic site, thereby inducing chain termination and providing a theoretical means of DP control.
The existence of the translocation channel immediately above the active site stabilizes the elongating chain and promotes cellulose synthase processivity (Knott et al. 2016), but it remains to be determined whether inherent structural and/or regulatory differences between isomers help to control DP. If such isomer-specific differences exist, we may find common characteristics that: (a) limit DP in the three isoforms involved in PCW cellulose synthesis (e.g. AtCESA1, 3, and 6, or 6-like isomers); and (b) enhance DP in the three isoforms involved in SCW cellulose synthesis (e.g. AtCESA4, 7, and 8). This prediction assumes that three CESA isomers within each major type of CSC collectively synthesize one continuous cellulose microfibril. The observation that AtCESA1 (a PCW CESA) is retrieved selectively from the same membrane while AtCESA7 (a SCW CESA) continues to function demonstrates isomer-specific regulation of persistence time in the plasma membrane, which could result in different cellulose chain lengths synthesized by the two major types of CSCs.

Cellulose microfibril structure and interaction potential
Recently, the known allomorphs of native cellulose I have been expanded beyond I alpha and I beta that were first identified over three decades ago (Atalla and Vanderhart 1984). The combination of analysis of never-dried or rehydrated ¹³C-labeled PCWs by two-dimensional magic-angle-spinning SSNMR spectroscopy and density functional theory calculations led to the proposal of five interior and two surface cellulose conformations (named a-g). From the fibril surface inwards, these occur: in solvent-exposed regions, 'f and g'; in small regions with less hydration where cellulose interacts with hemicellulose, 'd'; between the surface and the core, 'a and b'; and within the dehydrated core, 'c'. The structural variation relates to different conformations of the CH₂OH group outside the five-carbon ring of glucose. These conformations were found in varying percentages in primary cell walls of diverse types, and conformations 'f, b and c' (on the surface, between the surface and the core, and within the dehydrated core, respectively) were found by similar analyses of dried Arabidopsis secondary walls (Dupree et al. 2015). Potentially, conformational changes in the CSC or changes in CSC velocity could influence the final conformations of the chains in the microfibril, which in turn might affect the potential for cellulose fibrils to interact with others or cell wall matrix polymers. The possible existence of such mechanistic links is supported by the subtle variation in the structure of cellulose microfibrils in cotton fibers (with predominantly SCWs) produced by three Gossypium species (Martinez-Sanz et al. 2017) and in three types of wood in one Populus species (Foston et al. 2011).

Cellulose crystallinity
The crystallinity of cellulose is also likely to be affected by CESA and CSC structure and behavior. This conclusion arises from analysis of CESA mutants, with hazards for interpretation arising from potential pleiotropic and/or compensatory effects of the primary mutations (Fujita et al. 2013). An Arabidopsis line (ixr1-2) with a T942I mutation in AtCESA3 was resistant to the cellulose synthesis inhibitor, isoxaben, and the stems had about 69% of the control crystalline cellulose content and a slight reduction in cellulose crystallite size (2.2 nm versus 2.34 nm in the control). Analysis by ¹³C magic-angle-spinning SSNMR spectroscopy showed that the cellulose was less crystalline, and it was enzymatically converted into glucose more efficiently. These changes were correlated with about 8% faster average CESA movement in hypocotyls (Harris et al. 2009, 2012; Scheible et al. 2001). The broad applicability of the results was supported by the expression of the ixr1-2 variant of AtCESA3 in tobacco, which reduced the cellulose content, hindered upright growth, and increased saccharification efficiency of the woody stems (Sahoo et al. 2013). Another mutation, A903V in AtCESA1 (aegeus, or ags1-2), conferred resistance to the cellulose-synthesis inhibitor, quinoxyphen, and the mutant plants had cellulose-related phenotypes similar to the ixr1-2 mutants, in association with about 16% faster average CSC movement. While there is uncertainty about where the ixr1-2 mutation lies in CESA structure due to ambiguity about the number of TMHs in CESA (Slabaugh et al. 2014), the ags1-2 mutation is within a TMH where it may directly impact glucan chain translocation (Morgan et al. 2013). There is increasing evidence that the TMH region, as collectively considered across multiple TMHs, is a 'hot spot' for regulation of cellulose biosynthesis and crystallization (Shim et al. 2018).
In Arabidopsis, the relationship between CSC velocity and cellulose crystallinity is not always the same (see Fujita et al. 2013 for a summary). However, substantially different growth temperatures between experiments could impact CSCs in unexpected ways, such as stability in the membrane (Hill et al. 2018b). For example, the analysis of the ixr1-2 and aegeus lines described above was done at 21°C, a typical growth temperature for Arabidopsis thaliana. When the any1 mutant line (D604N in AtCESA1) was grown at 29°C, the hypocotyls showed a 21% reduction in CSC velocity along with reduced cellulose crystallinity in the stems (17.3% versus 21% in the controls) (Fujita et al. 2013). Therefore, the results for the any1 line show an opposite correlation of CSC velocity and cellulose crystallinity as compared to the ixr1-2 and aegeus lines. Notwithstanding potential effects of growth temperature differences, this opposite correlation could arise from the different location of the any1 mutation, which is in a loop connecting core beta strand 5 of the catalytic region with an alpha helix likely to lie near the membrane. This alpha helix contains another quinoxyphen-resistant mutation (lycos, G620E in AtCESA1), which showed decreased cellulose crystallinity when grown at 21°C. In this case, the CSC velocity was not analyzed (Sethaphong et al. 2013;Slabaugh et al. 2014). Similarly, another isoxaben-resistant mutant (ixr1-6, S377F in AtCESA3) had reduced cellulose crystallinity, and the mutation site is within core beta strand 1 (Sethaphong et al. 2013). Therefore, the cytosolic region of CESA may also help to regulate cellulose crystallinity, potentially through impacts on the glucan chain as it exits the catalytic site or through extended impacts of amino acid substitutions on CESA structure (Sethaphong et al. 2013). Overall, the results invoke the potential to modify cellulose crystallinity in useful ways once structure/function relationships in CESAs are fully understood. 
Such useful alterations will likely be subtle, given that the crystalline nature of cellulose confers strength to growing plants and biomaterials derived from them.
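The contrasting relationships between CSC velocity and crystallinity in these mutants can be tabulated explicitly. The Python sketch below is only a bookkeeping aid that uses the values quoted above. It assumes that the ags1-2 phenotype, described as similar to ixr1-2, includes reduced crystallinity, and it ignores the difference in growth temperature (21°C versus 29°C for any1). The names `mutants` and `relation` are hypothetical.

```python
# (mutant, change in mean CSC velocity in %, crystallinity versus control)
# Assumption: ags1-2 is scored as "reduced" because its cellulose-related
# phenotypes are described as similar to those of ixr1-2.
mutants = [
    ("ixr1-2", +8, "reduced"),
    ("ags1-2", +16, "reduced"),
    ("any1", -21, "reduced"),
]


def relation(velocity_change, crystallinity):
    """Classify how velocity and crystallinity changed relative to each other."""
    faster = velocity_change > 0
    lower = crystallinity == "reduced"
    # "inverse": faster CSCs with lower crystallinity (or the converse);
    # "direct": both properties changed in the same direction.
    return "inverse" if faster == lower else "direct"


summary = {name: relation(dv, cryst) for name, dv, cryst in mutants}
print(summary)

# Relative loss of crystallinity in any1 stems (17.3% versus 21% in controls).
any1_relative_loss = round(100.0 * (21 - 17.3) / 21, 1)
print(any1_relative_loss)  # 17.6
```

The tabulation restates the point made in the text: ixr1-2 and ags1-2 pair faster CSC movement with lower crystallinity, whereas any1 pairs slower movement with lower crystallinity.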

Unanswered questions about structure/function relationships in CESA and the rosette CSC
The advances in understanding plant cellulose synthesis via the rosette CSC have accelerated in the structural research era that began in 2013 with the publication of two groundbreaking research efforts (Morgan et al. 2013; Sethaphong et al. 2013). These studies built upon the earlier identification of cellulose synthases in bacteria (Saxena et al. 1990; Wong et al. 1990) and plants (Arioli et al. 1998; Pear et al. 1996) and set the stage for a complete understanding of the historically challenging field of cellulose biosynthesis research (Delmer 1999).
These new structural insights about cellulose synthases and other results in the last five years have also generated many important and exciting questions about glucan chain synthesis and microfibril formation via the rosette CSC, as exemplified by the list below. Trans-disciplinary research bridging many areas of expertise and technical approaches will be key to answering these questions. Most of the questions are directly related to the content of this review, but a few point to exciting related areas that can be explored further in other reviews that we cited earlier. Echoing the sentiments of Dr. Deborah Delmer, a pioneer in this field who stimulated our interests and collaborated with both of us in cellulose synthesis research, we are "having fun along the way, and we welcome new travelers to share in the adventure" (Delmer 1987). Important questions about rosette CSC structure/function relationships include:
• What is the atomic structure of CESA and its homo-oligomeric and hetero-oligomeric assemblies?
• Where in the cell and how is the assembly of CSCs with rosette morphology controlled?
• Given that rosette CSCs occur in Golgi vesicles, how is cellulose synthesis typically prevented in the endomembrane system and activated at the plasma membrane?
• What was the structure of the ancestral CSC that directly preceded the rosette and how did its cellulose synthases differ from CESAs?
• Did rosette CSC complexity arise through constructive neutral evolution and does this explain multiple independent origins of hetero-oligomeric rosette CSCs?
• Do homo-oligomeric CSCs exist in seed plants, and, if yes, do they have an adaptive advantage in certain cells and/or any different outcomes for microfibril formation?
• Do homo-oligomeric CESA assemblies, or even rosette CSCs, form in in vitro synthesis systems and facilitate microfibril formation?
• What are the regulatory functions associated with the distinct structural features of different CESA isoforms?
• How is acylation of CESAs regulated in the cell, and how does it change CESA and CSC behavior?
• How is phosphorylation of CESAs regulated in the cell, and what more can we learn about its impacts on CESA conformation and CSC behavior?
• Are there other post-translational modifications of CESAs or other proteins associated with CSCs that have critical outcomes for cellulose synthesis?
• Do lobes of the CSC actually have variable sizes and shapes, and, if yes, what is the cause and outcome of this variation?
• Are different CSC diameters associated with different activity states? Could these differences be related to the activation of cellulose synthesis in the plasma membrane, given that rosette CSCs are not typically active in the endomembrane system?
• Do changes in conformation of flexible and disordered CESA regions on the periphery of the catalytic domain regulate CSC diameter?
• How do other plasma-membrane-bound or -associated proteins modulate CSC activity and/or cellulose crystallization? What are their precise roles in cellulose synthesis?
• How might differences between PCW versus SCW CESAs and CSCs help to regulate the degree of polymerization and cellulose microfibril formation in these two major cell wall types?
• Does variation in CESA and/or CSC structure affect the positions of the glucan chain extrusion sites in ways that can modulate chain coalescence and cellulose crystallinity?
• How might variation in the rate of polymerization and CSC velocity affect the crystallization of cellulose?
• Do the twins, triplets, or higher order aggregates of CSCs observed by FF-TEM have actual consequences for macrofibril formation? If yes, is increased density of CSCs sufficient to induce these associations, or are there other controls?
• Does the degree of bidirectional movement of CSCs have functional consequences for cell wall properties, and how is the directionality of movement regulated?
• Are cell wall properties changed when PCW and SCW CSCs operate together in the same membrane?
• Do 18-chain fundamental cellulose fibrils play a unique role in cell wall assembly and mechanical characteristics as compared to larger fibrils?