Proteomic Analysis of the Soybean Symbiosome Identifies New Symbiotic Proteins*

Legumes form a symbiosis with rhizobia in which the plant provides an energy source to the rhizobia bacteria that they use to fix atmospheric nitrogen. This nitrogen is provided to the legume plant, allowing it to grow without the addition of nitrogen fertilizer. As part of the symbiosis, the bacteria in the infected cells of a new root organ, the nodule, are surrounded by a plant-derived membrane, the symbiosome membrane, which becomes the interface between the symbionts. Fractions containing the symbiosome membrane (SM) and material from the lumen of the symbiosome (peribacteroid space or PBS) were isolated from soybean root nodules and analyzed using nongel proteomic techniques. Bicarbonate stripping and chloroform-methanol extraction of isolated SM were used to reduce complexity of the samples and enrich for hydrophobic integral membrane proteins. One hundred and ninety-seven proteins were identified as components of the SM, with an additional fifteen proteins identified from peripheral membrane and PBS protein fractions. Proteins involved in a range of cellular processes such as metabolism, protein folding and degradation, membrane trafficking, and solute transport were identified. These included a number of proteins previously localized to the SM, such as aquaglyceroporin nodulin 26, sulfate transporters, remorin, and Rab7 homologs. Among the proteome were a number of putative transporters for compounds such as sulfate, calcium, hydrogen ions, peptide/dicarboxylate, and nitrate, as well as transporters for which the substrate is not easy to predict. Analysis of the promoter activity for six genes encoding putative SM proteins showed nodule-specific expression, with five showing expression only in infected cells. Localization of two proteins was confirmed using GFP-fusion experiments. The data have been deposited to the ProteomeXchange with identifier PXD001132. This proteome will provide a rich resource for the study of the legume-rhizobium symbiosis.

Biological nitrogen fixation occurs through the activity of the enzyme nitrogenase, which is found only in certain prokaryotes, including those of the family Rhizobiaceae (termed rhizobia). The enzyme converts atmospheric N2 to ammonia, a biologically available form of nitrogen, but requires large amounts of ATP to fuel the conversion (1). Legumes, such as soybeans (Glycine max), are able to form an association with these nitrogen-fixing rhizobia. In this symbiotic relationship, N2 is fixed by the rhizobia and made available to the plant in exchange for organic acids and other nutrients. This mutually beneficial association occurs within specialized root organs termed nodules. Within the nodule infected cells, N2-fixing bacteroids (the symbiotic form of rhizobia) are enclosed in a plant-derived membrane to form organelle-like structures termed symbiosomes (2).
The symbiosome membrane (SM) originates from invaginated plasma membrane as the bacteria enter infected cells, but quickly becomes specialized as the symbiosis matures (3). Within symbiosomes of nodules, the rhizobia continue to multiply before differentiating into bacteroids in which symbiosis-related genes are induced (4). Symbiosomes thus result from the coordinated division of bacteria and growth of the surrounding SM, fed by the systems for endomembrane synthesis (5).
The SM surrounds one or more differentiated bacteroids, effectively excluding them from the plant cytosol. The region between the SM and bacteroids is termed the peribacteroid space (PBS). The SM is a physical barrier between the plant and the bacteroid and represents a regulation point for the movement of solutes between the symbionts, via an array of transporters and channels (4,6).
It is estimated that in a mature infected cell, the SM surface area is many times that of the plasma membrane, allowing it to encapsulate the multiplying bacteroids (7). The expanding SM requires the synthesis of lipids and proteins in the infected cell (7). The composition of the SM is thought to vary during nodule development and senescence, to facilitate the dynamic transport requirements of the symbionts (3). Targeting to the symbiosome has been linked to an N-terminal signal sequence for several proteins (8-10), but no conserved N-terminal signal has been identified for SM proteins.
The principal nutrient transfer across the SM is the exchange of a plant carbon energy source for nitrogen fixed by the bacteroid. This carbon source is derived from sucrose produced via photosynthesis, which is converted in the nodules to dicarboxylic acids (6). Dicarboxylates, probably malate, are then transported across the SM to the bacteroids (11). Although malate transport across the SM has been characterized biochemically (12,13), a transport protein has not yet been identified on the SM of any of the legumes studied.
The main product of nitrogen fixation in bacteroids is ammonia, the majority of which is thought to be protonated to ammonium in the acidic PBS (14). There are two routes proposed for transport of fixed-N across the SM: as NH3 through the aquaglyceroporin NOD26 (15,16) and as NH4+ through a monovalent cation channel (17). Although NOD26 is well described in soybean (16,18-21), the protein catalyzing monovalent cation transport has not been identified.
Several additional transport processes on the SM have been identified, including proteins for transport of iron, zinc, calcium, and sulfate (22-27). In addition, the movement of hydrogen ions has been reported through the activity of an H+-ATPase (28-30).
The SM is expected to contain many more proteins that facilitate the interaction between the plant host and bacteroids. Identification and characterization of SM proteins has been limited to date and a comprehensive description of the protein content of this membrane is lacking. Previous attempts to characterize the proteome of legume SMs have yielded modest results, with the main barriers to overcome being the lack of completed reference genomes with which to compare sequencing results, and the intrinsic hydrophobic nature of SM proteins hindering their identification. Two proteomic studies have been performed on the G. max : Bradyrhizobium japonicum SM. Both studies occurred prior to the release of the soybean genome and thus were limited in their success at identifying SM proteins (31,32). Proteomic studies of the SM in other legume-rhizobia symbioses (Lotus japonicus, Pisum sativum, and Medicago truncatula) have succeeded in identifying only a small number of SM proteins as they were done at a time when there was limited genomic information available for these legumes (33-36). In addition, all studies except Wienkoop and Saalbach (35) have relied on 2D-PAGE methodologies, which are known to hinder the subsequent detection by mass spectrometry of hydrophobic membrane proteins. Here, we report a more comprehensive sampling of SM proteins and also proteins from the PBS of soybean. Together, these proteomic analyses provide a valuable resource for future studies on the structure and function of the symbiosome in all legume-rhizobium symbioses.

EXPERIMENTAL PROCEDURES
Plant Growth and Protein Isolation-Soybeans (G. max cv. Stephens) were grown under natural light extended to 16 h day length with incandescent lighting in a temperature-controlled glasshouse (26°C day/20°C night). Plants were grown in washed river sand and seed-inoculated with B. japonicum in peat (Nodulaid Group H, Becker Underwood, NSW, Australia), and again at 5 days postsowing. Nodules were harvested from roots at 32 days postinoculation. Nitrogen-fixing ability of the mature nodules was confirmed using an acetylene reduction assay as described in (37). SM was isolated from mature nitrogen-fixing soybean nodules using previously established procedures that yield membrane that is generally free of contamination from other organelles (31,38). The SM protein fraction was further purified by either bicarbonate stripping (39) or chloroform-methanol extraction (40). Isolated SM protein pellets were suspended in 100 mM Na2CO3, then pelleted by ultracentrifugation to isolate stripped proteins. Following bicarbonate stripping, SM proteins were phenol extracted as described in Day et al. (38). For chloroform-methanol extraction, isolated SM proteins were suspended in 50 mM MOPS/NaOH, pH 7.5, with protease inhibitors (cOmplete Protease Inhibitor Mixture Tablets, Roche, Basel, Switzerland) and mixed with a 5:4 chloroform : methanol solution as described (40). After 30 min incubation on ice, soluble and insoluble proteins were recovered by diethyl ether precipitation and ultracentrifugation (86,000 rpm for 1 h). Isolated SM protein fractions were resuspended in 8 M urea/1% SDS buffer and stored at -20°C prior to proteomic analysis.
The peribacteroid space fraction was isolated during the SM isolation protocol following disruption of isolated intact symbiosomes (38). PBS proteins were concentrated using Nanosep® centrifugal devices (PALL Life Sciences, Long Island, NY), collected, and stored at -20°C.
For three biological replicates, sodium bicarbonate stripping removed peripheral proteins from the SM. To reduce the complexity of the SM preparations by further fractionation and to enhance the collection of more hydrophobic proteins, chloroform-methanol extraction was performed on a subsequent set of four biological replicates. These four biological replicates were also used to generate PBS samples. Proteins identified from sodium bicarbonate stripped and C:M extracted fractions are together referred to as the SM proteome. Proteins removed from the SM with bicarbonate stripping were analyzed as the SM peripheral proteome. PBS and SM peripheral proteins were concentrated using Nanosep® centrifugal devices (PALL Life Sciences) prior to proteomic analysis.
Western Blot Analysis-Ten micrograms of total nodule protein, nodule microsomal, SM (bicarbonate stripped and chloroform-methanol fractions), SM peripheral and PBS samples were separated by SDS-PAGE using Bio-Rad Mini-PROTEAN gel equipment. Separated proteins were transferred to PVDF membrane (Bio-Rad, Hercules, CA) for Western blotting, or stained with Coomassie brilliant blue G to visualize protein. Blots were stained with Ponceau then destained, blocked and probed with primary antibodies at appropriate dilutions (nodulin 26 1:1000, HDEL 1:200, and porin 1:1000). Nodulin 26 antibody was provided by Dan Roberts, Knoxville, TN (41), HDEL antibody was sourced from Santa Cruz Biotechnology, Inc. (Dallas, TX) and porin antibody was obtained from Dr. Tom Elthon, Lincoln, NE via Harvey Millar, Perth, WA (42). Blots were rinsed twice with TBST (Tris-Buffered Saline with 0.3% Tween 20) and incubated with secondary antibody conjugated to horseradish peroxidase (1/10,000 dilution, Promega, Madison, WI) followed by four washes in TBST. Immunoreactive proteins were visualized by chemiluminescence using Immun-Star WesternC Chemiluminescence Kit (Bio-Rad, Hercules, CA) as per the manufacturer's instructions and documented with the GelDoc Imager (UVP, Upland, CA).
Sample Preparation and LCMS/MS-Protein concentration of samples was determined using LavaPep Protein Quantification Kit (Gel Company, San Francisco, CA). Each biological sample was prepared and analyzed in triplicate by LCMS/MS. Ten micrograms of protein for each technical replicate was reduced with TCEP (tris(2-carboxyethyl)phosphine), alkylated with MMTS (methyl methanethiosulfonate) and digested with porcine trypsin (Promega) overnight at 37°C. Digested peptides were prepared for LCMS/MS by removing excess salts with an HLB SPE column (Waters) and excess detergent with an SCX Stage Tip (Thermo Fisher Scientific, Waltham, MA), according to the manufacturer's guidelines. Insoluble components were removed by centrifugation at 10,000 rpm for 10 min. Peptides were then resuspended in 0.1% formic acid.
Samples were separated by liquid chromatography (LC) and analyzed on an Analyst QSTAR ESI-QUAD-TOF mass spectrometer (Thermo Fisher Scientific, Waltham, MA). The LC component consisted of a 150 mm separation column (Zorbax Column 300SB C18), driven by an Agilent Technologies (Santa Clara, CA) 1100 series nano/capillary liquid chromatography system. Peptides were separated over two hours (5% Acetonitrile, 40% Acetonitrile) and eluted directly into the mass spectrometer. The mass spectrometer was run in positive ion mode and MS scans ran over a range of m/z 400-1500 and at four spectra per second. Precursor ions were selected for auto MS/MS at an absolute threshold of 500 and a relative threshold of 0.01, with a maximum of three precursors per cycle. Precursor charge-state selection and preference was set to 2+ and then 3+ and precursors selected by charge then abundance. Resulting MS spectra were opened with Analyst QS 2.0 software, and exported to MASCOT (Matrix Science, Boston, MA).
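The precursor selection rules described above can be illustrated with a minimal Python sketch. This is not the instrument software; the `Peak` class, the `select_precursors` function, and the example scan are hypothetical, and the relative threshold is assumed here to be a fraction of the most intense peak in the survey scan, which the text does not state.

```python
from dataclasses import dataclass


@dataclass
class Peak:
    mz: float         # mass-to-charge ratio
    intensity: float  # signal intensity (arbitrary units)
    charge: int       # assigned charge state


def select_precursors(peaks, abs_threshold=500.0, rel_threshold=0.01,
                      max_precursors=3, charge_preference=(2, 3),
                      mz_range=(400.0, 1500.0)):
    """Pick precursors for MS/MS from one survey scan (illustrative only)."""
    if not peaks:
        return []
    base_peak = max(p.intensity for p in peaks)
    # Assumption: the relative threshold is a fraction of the base peak.
    eligible = [
        p for p in peaks
        if mz_range[0] <= p.mz <= mz_range[1]
        and p.charge in charge_preference
        and p.intensity >= abs_threshold
        and p.intensity >= rel_threshold * base_peak
    ]
    # Rank by charge preference first (2+ before 3+), then by abundance.
    eligible.sort(key=lambda p: (charge_preference.index(p.charge), -p.intensity))
    return eligible[:max_precursors]


# Hypothetical survey scan, not data from this study.
scan = [
    Peak(450.2, 12000.0, 3),
    Peak(612.8, 8000.0, 2),
    Peak(701.4, 300.0, 2),    # below the absolute threshold
    Peak(830.5, 9500.0, 1),   # charge state not in the preference list
    Peak(955.1, 2000.0, 2),
    Peak(1100.7, 15000.0, 3),
]
selected = select_precursors(scan)
print([(p.mz, p.charge) for p in selected])
```

In this sketch the two 2+ ions are chosen before the most abundant 3+ ion, reflecting selection by charge and then abundance.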
Data Analysis-The soybean proteome derived from version 1.1 of the soybean genome (available at www.phytozome.net, 73,320 entries) was searched for peptide matches using MASCOT. Up to one missed tryptic cleavage was tolerated, variable modifications were Oxidation (M) and Carbamidomethyl (C); peptide and MS/MS tolerance was set as 0.2 Da and peptide charge was set at 2+ and 3+ monoisotopic.
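To make the search settings concrete, the following Python sketch shows what "up to one missed tryptic cleavage" and a 0.2 Da tolerance mean in practice. It is an illustration only and not the MASCOT implementation; the function names and the example sequence are hypothetical, and the common rule that trypsin cleaves after K or R except before P is assumed.

```python
def cleavage_fragments(sequence):
    """Split a protein after K or R, except when the next residue is P."""
    fragments, start = [], 0
    for i, residue in enumerate(sequence):
        next_residue = sequence[i + 1] if i + 1 < len(sequence) else ""
        if residue in "KR" and next_residue != "P":
            fragments.append(sequence[start:i + 1])
            start = i + 1
    if start < len(sequence):
        fragments.append(sequence[start:])
    return fragments


def tryptic_peptides(sequence, missed_cleavages=1):
    """List peptides containing at most `missed_cleavages` internal sites."""
    fragments = cleavage_fragments(sequence)
    peptides = []
    for i in range(len(fragments)):
        last = min(i + missed_cleavages, len(fragments) - 1)
        for j in range(i, last + 1):
            peptides.append("".join(fragments[i:j + 1]))
    return peptides


def within_tolerance(observed_mass, theoretical_mass, tolerance_da=0.2):
    """True if the two masses differ by no more than the tolerance (Da)."""
    return abs(observed_mass - theoretical_mass) <= tolerance_da


# Hypothetical sequence used only to show the digestion rule.
peptides = tryptic_peptides("MKTAYRPLKGG", missed_cleavages=1)
print(peptides)
```

The R at position 6 is followed by P and is therefore not treated as a cleavage site in this sketch.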
Three technical replicates were prepared for each sample, with multiple biological replicates analyzed for each sample type (sodium bicarbonate stripped SM: three biological replicates, C:M extracted SM: four biological replicates, SM Peripheral: two biological replicates and PBS: three biological replicates). Results were matched to the predicted soybean proteome (www.phytozome.net; 43) using MASCOT and visualized using Scaffold4 Proteome software (Proteome Software, Portland, OR). Significance thresholds were defined in Scaffold4 at 95% minimum peptide identification probability and 95% minimum protein identification probability. These probabilities are generated using the Peptide Prophet and Protein Prophet algorithms (44,45), which convert the statistical significance output of MASCOT into a discriminant score.
To be considered a significant match, proteins had a minimum of two distinct peptides observed in one or more biological replicates. Where multiple proteins were identified with the same peptides, one unique peptide was required for a protein match to be considered significant (along with one or more shared peptides). Percent coverage was calculated based on coverage of the complete protein sequence by matched peptide queries. The false discovery rate (FDR) was calculated by Scaffold4, based on the method of Kall et al. (46) using a reversed decoy database. The protein FDR was 0.1% and peptide FDR was 0.46% (from merged results).
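The acceptance rule, the coverage calculation, and a decoy-based FDR estimate can be sketched in Python as follows. This is a simplified illustration, not the Scaffold4 implementation: the function names, sequences, and counts are hypothetical, the unique peptide is assumed to be required in the same biological replicate as the shared peptides, and the FDR is approximated as decoy hits divided by target hits.

```python
def is_significant(peptides_by_replicate, shared_peptides=frozenset(), min_distinct=2):
    """Apply the two-peptide rule within any single biological replicate.

    peptides_by_replicate maps a replicate name to the set of peptide
    sequences matched to the protein. shared_peptides lists peptides that
    also match other proteins; at least one peptide must fall outside it.
    """
    for peptides in peptides_by_replicate.values():
        distinct = set(peptides)
        unique = distinct - set(shared_peptides)
        if len(distinct) >= min_distinct and unique:
            return True
    return False


def percent_coverage(protein_sequence, peptides):
    """Percentage of residues covered by at least one matched peptide."""
    covered = [False] * len(protein_sequence)
    for peptide in peptides:
        start = protein_sequence.find(peptide)
        while start != -1:
            for i in range(start, start + len(peptide)):
                covered[i] = True
            start = protein_sequence.find(peptide, start + 1)
    return 100.0 * sum(covered) / len(protein_sequence)


def decoy_fdr(n_target_hits, n_decoy_hits):
    """Simple decoy-based FDR estimate (assumed form: decoys / targets)."""
    return n_decoy_hits / n_target_hits if n_target_hits else 0.0


# Hypothetical 20-residue protein and matched peptides.
protein = "MKTAYRPLKGGSEEDLVNAK"
coverage = percent_coverage(protein, ["TAYRPLK", "GGSEEDLVNAK"])
print(coverage)  # 18 of 20 residues covered

accepted = is_significant({"rep1": {"TAYRPLK", "GGSEEDLVNAK"}, "rep2": {"TAYRPLK"}})
rejected = is_significant({"rep1": {"TAYRPLK", "GGSEEDLVNAK"}},
                          shared_peptides={"TAYRPLK", "GGSEEDLVNAK"})
print(accepted, rejected)
```

A protein whose peptides are all shared with other proteins is rejected in this sketch even when two peptides are observed, which mirrors the unique-peptide requirement stated above.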
Bioinformatics-Information on protein function was compiled from the G. max genome annotation (www.phytozome.net) and from top matches in NCBI (www.ncbi.nlm.nih.gov). Proteins were grouped according to functional classification by MapMan (47). Previous SM and PBS proteome data sets (31,33,34,36) were blasted against the soybean proteome (www.phytozome.net) to identify soybean homologs that were identified in this proteomic analysis.
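The homolog comparison described above can be sketched as a best-hit lookup over BLAST tabular output followed by an intersection with the set of proteins identified here. The sketch assumes the standard 12-column tabular format (query, subject, percent identity, alignment length, mismatches, gap opens, query start and end, subject start and end, E-value, bit score); the E-value cutoff, the function name, and all identifiers are hypothetical, since the text does not give the search parameters.

```python
import csv
import io


def best_hits(blast_tabular, max_evalue=1e-5):
    """Return the top-scoring subject per query from 12-column BLAST output."""
    best = {}
    reader = csv.reader(io.StringIO(blast_tabular), delimiter="\t")
    for row in reader:
        if not row or row[0].startswith("#"):
            continue
        query, subject = row[0], row[1]
        evalue, bitscore = float(row[10]), float(row[11])
        if evalue > max_evalue:  # assumed cutoff, not from the text
            continue
        if query not in best or bitscore > best[query][1]:
            best[query] = (subject, bitscore)
    return {query: subject for query, (subject, _) in best.items()}


# Hypothetical BLAST rows: published SM proteins queried against soybean.
rows = [
    ["query_1", "soy_A", "90.0", "100", "10", "0", "1", "100", "1", "100", "1e-50", "200"],
    ["query_1", "soy_B", "60.0", "100", "40", "0", "1", "100", "1", "100", "1e-20", "90"],
    ["query_2", "soy_C", "30.0", "50", "35", "0", "1", "50", "1", "50", "0.5", "25"],
]
blast_text = "\n".join("\t".join(row) for row in rows)

hits = best_hits(blast_text)
identified_here = {"soy_A", "soy_D"}  # hypothetical identified proteins
confirmed = {q: s for q, s in hits.items() if s in identified_here}
print(confirmed)
```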
Cloning, Constructs, and Transformation-Soybean Glyma11g34600.1 Table I. PCR products were cloned into pENTR or pDONR entry vectors using either TOPO cloning (Invitrogen) or Gateway Recombination (Invitrogen). The Gateway cloning system (Invitrogen) was used to create genetic constructs for promoter-GUS and GFP fusion. Entry clones were recombined into the following destination vectors using LR Clonase (Invitrogen): pKGW-GGRR for promoter-GUS fusion and pK7WGFLhc3-R, creating N-terminal GFP-X fusions driven by the nodule-specific leghemoglobin promoter (these vectors are modified from pKGWFS7 and pK7WGF2, respectively, obtained from Plant Systems Biology, Ghent University, Belgium; http://gateway.psb.ugent.be). Agrobacterium rhizogenes-based root transformation of G. max was performed according to Mohammadi-Dehcheshmeh et al. (57).
GUS Staining, Sectioning, and Microscopy-Transgenic nodules were collected, washed twice in 0.1 M sodium phosphate buffer (pH 7.2) and incubated in GUS buffer under vacuum at room temperature for 30 min to allow the buffer to replace oxygen in the tissue, and then at 37°C for 1 h. Hand sections were mounted on microscope slides and analyzed using a Leica M205FA stereo microscope.
Confocal imaging of GFP-fused proteins was done on transgenic hand-sectioned nodules using a Leica SP5 II confocal microscope. Sections were counterstained with FM4-64 (30 mg/ml).

RESULTS AND DISCUSSION
Purity of Symbiosome Membrane Preparations-To evaluate the enrichment of SM during the isolation and fractionation procedure and to assess the purity of samples, nodule total protein, nodule microsomal fraction, sodium bicarbonate stripped SM, C:M extracted SM, SM peripheral, and PBS protein fractions were Western blotted and probed with marker antibodies for proteins with different subcellular locations (Fig. 1B). Fig. 1A demonstrates the SM proteins resolved by one-dimensional SDS-PAGE. Nodulin 26 was used as a marker for the SM as it is a well characterized SM protein (18,19,21,58), whereas mitochondrial porin and HDEL are markers for mitochondria and endoplasmic reticulum (ER), respectively (59).
Nodulin 26 signal was observed in the microsomal fraction and both SM fractions (sodium bicarbonate stripped and C:M extracted) with highest intensity in the bicarbonate stripped sample. It was not observed in the PBS or SM peripheral fractions but a weak signal was detected in the total nodule preparation. A small number of nodulin 26 peptides were detected in peripheral samples by LCMS/MS, suggesting the proteomic analysis is more sensitive than Western blot analysis. Our immunoblot results showed similar enrichment of SM from the initial nodule extract as that seen by Catalano et al. (36) in preparations of SM from M. truncatula.
The mitochondrial porin (29 kDa) and HDEL (65 kDa) antibodies identified protein bands in the microsomal fraction and total nodule samples. No signal for antibody binding was observed in any of the SM or SM-related fractions. Together our results indicate enrichment from a total nodule homogenate to isolated SM that is relatively free from mitochondrial and ER contaminants, as determined previously via enzyme assays (31) for SM isolated using the same method.
We reviewed the data in the literature for localization of the proteins identified in our SM samples and also used bioinformatic programs to predict their subcellular localization (Table II). In general, the proteins identified were not given a localization in the prediction program, although a number were suggested to be directed to the ER or secretory pathway. Many of these proteins were predicted to contain signal peptides. Because there may be trafficking of proteins from the ER to the SM (see below), the ER/secretory pathway predictions may still allow targeting of proteins to the SM. Although the bioinformatic predictions of subcellular localization must be treated with caution, as the existence of symbiotic membranes is not built into these programs, it is of interest that few were suggested to be targeted to organelles. There is little information about how proteins are targeted to symbiosomes. Infected nodule cells, which contain symbiosomes, are a specialized cell type; thus, proteins with roles on the SM may have evolved from proteins with other cellular roles. It is possible that proteins normally localized in one organelle in other tissues may have been recruited to a new symbiotic role in infected cells of nodules or have dual roles in these cells. An example of this is the P-type ATPase (see below).
Possible Contaminants of the SM Preparation-As the SM is partially derived from the ER and Golgi (62), it might be expected that some proteins would be present in all membranes. For example, the soybean calnexin protein (Glyma06g17060.1) was identified in this proteomic analysis (Table II). ER lumen proteins calreticulin, BiP, and protein disulfide isomerases (PDI), identified here, have all been previously identified on the SM of other legumes (34,36). The infected cell is tightly packed with symbiosomes and has high requirements for protein synthesis; consequently, abundant ER proteins may adhere to the SM during isolation, although it is equally possible that these proteins are associated with the SM because of fusion of vesicles derived from the ER (see below). It will be important to further validate the localization of proposed SM proteins by other means in the future. Surprisingly, a number of soluble proteins were present in the SM fractions and we assume that these are contaminants that are in high abundance in nodules and that adhere to the SM during isolation of symbiosomes. These include a number of purine biosynthesis enzymes, uricase, and malate dehydrogenase. Purine biosynthesis is important in soybean nodules for assimilation of fixed nitrogen as ureides and the enzymes involved are localized to both plastids and mitochondria in infected cells of cowpea nodules (63). Uricase is also important for nitrogen assimilation and is localized in peroxisomes of uninfected nodule cells (64). The genes encoding these enzymes are expressed at high levels in nodule tissue (65,66) and the activity of many of the enzymes is high, because of the high requirements for nitrogen assimilation in nodules (67).
It is therefore likely that the peptides identified for these enzymes represent a relatively low level of contamination from plastids or mitochondria (63) or peroxisomes from uninfected cells (64) that, at least for mitochondria, is not detectable through the immunological analysis described above.
Leghemoglobins (Lb) are the most abundant plant proteins in nodules and are essential for successful biological nitrogen fixation (BNF) (70). Four Lb proteins were detected in our fractions (Glyma10g34280, Glyma20g33290, Glyma10g34260, and Glyma10g34290). Lbs are known to be cytosolic proteins in infected cells and are encoded by the highest expressed genes in nodules (65,66), so their presence in this and other SM proteomes (33,34) is likely to be caused by contamination, a function of their high abundance in infected cells.
We detected peptides for a malate dehydrogenase (Glyma12g19520) in PBS samples (27 spectra in three biological replicates). A nodule-enhanced form of malate dehydrogenase was identified in alfalfa, but its subcellular localization was not identified (68). The protein we found in the PBS is predicted using bioinformatic programs to be mitochondrial, but previous proteomic analyses have identified homologs in symbiosomes (33,34). We also found a small number of malate dehydrogenase peptides in the SM fractions (results not shown) but these were in low abundance and may reflect the transit of the enzyme to the PBS. Whether the nodule-enhanced form of malate dehydrogenase is mitochondrial or symbiosome localized requires further investigation.
Several rhizobial proteins also contaminated the SM fractions (Table III). Interestingly, several outer membrane rhizobial proteins were identified, suggesting perhaps that some bacteroid outer membranes rupture during SM preparation and contaminate the final SM sample. Proteins such as NifHD and FixA are abundant soluble proteins in bacteroids and clearly not localized on the SM. Again, it seems likely that some abundant soluble proteins have become associated with the SM during isolation. These may arise from symbiosomes and bacteroids damaged in the initial homogenization.

FIG. 1. One dimensional SDS-PAGE of SM proteins and Western blot analysis of nodulin 26, HDEL and porin in nodule fractions.
A, Ten micrograms of sodium bicarbonate stripped symbiosome membrane (SM) protein resolved on a 12% SDS-polyacrylamide gel and stained with Coomassie brilliant blue. B, Ten micrograms of protein from total nodule, microsomal, sodium bicarbonate stripped SM, C:M extracted SM, SM peripheral, and peribacteroid space (PBS) fractions were resolved on 12% SDS-polyacrylamide gels then transferred to PVDF membranes. Blots were blocked and probed with antibodies for either nodulin-26, HDEL or porin proteins.

Some of the integral membrane proteins identified, including a nucleotide transporter I (Glyma15g01420.1), are predicted to be localized to plastids, but they could also represent valid SM proteins if this function is also required in the symbiosome and dual localization or modification of an originally plastid function occurred. Mitochondrial substrate carrier family members were identified on the SM but this family is not exclusively localized on mitochondrial membranes (69) and the bioinformatic programs used were unable to predict a subcellular location for the proteins (Table II). Independent experimental validation of the SM localization will be required to resolve these questions.
Proposed Functions for Identified Proteins-In total, 197 proteins were identified in the SM proteome, with a further six proteins found only in the peripheral membrane protein fraction, eight proteins identified only in PBS samples, and one protein identified in both PBS and SM peripheral protein samples (Table II). In Table II proteins identified in the SM proteome were grouped according to their proposed functions within the cell (MapMan predictions) with the data for percentage coverage of the protein by identified peptides, the number of unique peptides identified and the sample in which they were identified. Localization, signal peptide and GPI-anchor predictions compiled with expression data from one of the two soybean transcriptomes (64) are also included (Table II). Selected proteins are discussed below.
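As a quick consistency check on the counts given above, the fractions can be tallied as follows. The variable names are illustrative; the numbers are those stated in the text.

```python
# Counts as stated in the text.
sm_proteome = 197          # proteins in the SM proteome
peripheral_only = 6        # found only in the SM peripheral fraction
pbs_only = 8               # found only in PBS samples
pbs_and_peripheral = 1     # found in both PBS and SM peripheral samples

# Proteins identified outside the SM proteome itself.
additional = peripheral_only + pbs_only + pbs_and_peripheral

# All proteins reported across the SM, SM peripheral, and PBS fractions.
total = sm_proteome + additional

print(additional, total)
```

The additional proteins sum to fifteen, giving 212 proteins across all fractions.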
Protein Folding and Degradation-Several proteins involved in protein assembly and degradation processes were identified on the SM in this study, including two members of the protein disulfide isomerase (PDI) family (Glyma04g42690 and Glyma06g12090). This family of proteins has ubiquitous expression across soybean tissues, is localized to the ER lumen in other tissues and is involved in the proper folding and quality control of storage proteins (71). Import of proteins into the symbiosome would likely require the same processes and, as the structure and composition of the SM is most closely related to the ER (3), these PDI proteins may have been co-opted for this role during the symbiosis. PDIs have previously been identified in all proteomic analyses of the SM (31,33,34,36).
Members of the protein degradation class feature strongly in the PBS but were also found in the SM proteome. Four members of the subtilase family were identified, three (Glyma17g14270, Glyma05g03760, Glyma14g06970) most clearly localized (based on number of peptides identified) in the PBS and one (Glyma19g44060) associated only with the SM. Many of these proteins are predicted to have GPI anchors (Table II) as expected for extracellular proteases. Because the SM has the same orientation as the plasma membrane, the inside of the symbiosome can be regarded as equivalent to the apoplast (3). The genes encoding all these proteins show high nodule-specific expression according to the soybean transcriptome (66). Subtilases are serine peptidases whose members may be involved in nonselective degradation of proteins or as proprotein convertases (72). They are involved in a range of processes including peptide hormone processing, plant interactions with microorganisms, seed germination and distribution of stomata (72). A number of subtilase genes are induced when L. japonicus is infected by mycorrhiza and rhizobia (73) and silencing of some of these genes reduced mycorrhizal colonization (74) (76). Two aspartate proteinases (Glyma15g41420, Glyma08g17680) were also detected in the PBS. The Phaseolus vulgaris ortholog of Glyma15g41420, Nodulin 41, was recently localized in uninfected cells (77) and the possibility that it is a contaminant in the PBS in this study cannot be ruled out. However, the closest Arabidopsis homolog, constitutive disease resistance 1, has an apoplastic localization (78), which is consistent with a PBS localization in nodules. Constitutive disease resistance 1 is thought to be involved in generating a peptide signal to induce defense responses (78). The identification of both subtilases and aspartate proteases in the PBS suggests an important role for these enzymes, perhaps in generating peptide signals. There is evidence for activity of nodule-specific cysteine-rich peptides in terminal differentiation of bacteroids in legumes such as M. truncatula (79) and although terminal differentiation does not occur in soybean (80), peptide signals may be involved in other processes including communication between the symbionts.

TABLE III
List of proteins identified in Bradyrhizobium japonicum by LCMS/MS from the soybean SM. A minimum of two unique peptides were identified in one or more biological samples for all proteins indicated. Symbiosome membrane (SM) (bicarbonate stripped and chloroform-methanol extractions pooled). Percentage coverage (%C) is the maximum percentage of a protein to which peptides have been mapped in a biological sample. Membrane topology has been predicted by three bioinformatic suites: (i) SOSUI, (ii) TMHMM and (iii) TopPred2, with number of predicted Trans-Membrane Domains indicated.
Outer-membrane immunogenic protein   7   7   1   2   1   gi|27379812
Outer-membrane immunogenic protein   12   9   2   0   0   gi|27379978
Outer-membrane immunogenic protein   41   11   0   1   0   gi|27382806
Outer-membrane immunogenic protein   59   7   0   1   0   gi|27382260
Peptidoglycan-associated lipoprotein 17 TolB gene product 5 7 0 0 1

Membrane Trafficking-Members of three subfamilies of the small GTPase Rab family were present in the SM proteome but because of their conserved amino acid sequences, the peptides identified could not be ascribed to a protein encoded by one particular soybean gene. The RabG (Rab7) peptides are present in proteins encoded by four different soybean genes (Table II). All these genes are expressed in nodules but Glyma12g04830 has the most nodule-enhanced expression (64). RabB (Rab2) peptides are present in proteins encoded by Glyma09g01950 and/or Glyma15g12880. RabE (Rab8) peptides are present in proteins encoded by six soybean genes (Table II). Of these, Glyma12g07070 and Glyma15g12880 have the highest expression in nodules, although both are also expressed in most other soybean tissues. Rabs are involved in vesicular transport within cells. Rab1 and Rab7 have previously been implicated in SM biogenesis in soybean (81) and Rab7 proteins were identified on the M. truncatula and L. japonicus SM (34). Rab7 is a marker for the late endosome/prevacuolar compartment (PVC) and tonoplast and is essential for PVC-to-vacuole trafficking and vacuole biogenesis (82). Although M. truncatula symbiosomes gain Rab7 (but not the early endosome marker Rab5) they do not develop into a lytic compartment because they do not acquire vacuolar SNAREs (soluble N-ethylmaleimide sensitive factor attachment protein receptor) until nodules start to senesce (83). Instead, a plasma membrane SNARE SYP132 is present on the SM from early in development. This suggests the involvement of an exocytosis-derived process in SM formation, which was proved by functional analysis of two VAMP72 homologs in M. truncatula nodules (84). Therefore, a unique identity of the SM could allow the membrane to intercept specific secretory traffic to the plasma membrane and specific endocytic/biosynthetic traffic toward the vacuole (83).
The presence of Rab8 and Rab2 small GTPases, which are thought to be involved in trafficking of vesicles from the Golgi and the ER, respectively, to the plasma membrane (85-87), further supports the idea of the SM as a chimeric membrane.
SNARE proteins such as syntaxins are also involved in vesicle fusion and we have identified a protein related to syntaxin 131 in the soybean SM (SYP131; Glyma13g38370, 13% peptide coverage). SYP131 is part of the clade that includes the Medicago SM syntaxin SYP132 (88, see below). These syntaxins are considered plasma membrane SNAREs in nonsymbiotic tissues (89) but MtSYP132 is localized to regions of the plasma membrane close to the infection thread and infection droplet membranes as well as on the SM (88,90). Whether GmSYP131 is localized on membranes other than the SM is not known, but the gene encoding it is expressed in other plant tissues and its expression is not enhanced significantly in nodules (66), suggesting that it may have a role on the plasma membrane in nonsymbiotic cells.
In the arbuscular mycorrhizal symbiosis, secretory vesicles normally targeted to the plasma membrane can be redirected to the periarbuscular membrane (derived from and contiguous with the plasma membrane) at a specific time in the symbiosis, to form the specialized symbiotic membrane (91). The periarbuscular membrane is analogous to the SM, and the presence of the Rab small GTPases and syntaxins suggests that a similar reorientation of the secretory system may be used to create the specialized membrane that is the SM. This might also explain how proteins with roles on both the plasma membrane and SM are targeted to the SM when required, although a particular targeting sequence is not obvious.
Transport-Nodulin 26 -Peptides corresponding to nodulin 26 (Glyma08g12650) were detected in all SM samples analyzed in this study, with up to 20% coverage of the protein (Table II). Spectra corresponding to this protein were the most abundant in the proteomic analysis, as expected for a dominant SM protein. Nodulin 26 was detected in the previous proteomic analysis of the soybean SM (31), but was not in SM proteomes from L. japonicus, M. truncatula, or pea (P. sativum). Nodulin 26 is exclusively localized to the SM and because of its prevalence is widely used as a marker for the membrane. Nodulin 26 was first identified as an integral membrane transporter of soybean SM (19) and is a member of the major intrinsic protein/aquaporin (MIP/AQP) channel family. It is estimated to constitute 10% of the protein content of the SM (21,58). Nodulin 26 acts as a multifunctional aquaglyceroporin, with Xenopus oocyte studies showing it can facilitate the movement of glycerol and formamide (18,21). Other studies have shown that it can also facilitate ammonia transport across the SM (16) and can act as a docking station for cytosolic glutamine synthetase (20). Glutamine synthetase (Glyma10g06810) was detected in both the SM and SM peripheral proteomes, and interestingly in the PBS proteome. Its detection in the PBS is unexpected as the C terminus of nodulin 26, to which glutamine synthetase binds, is cytosolic (92). The detection of both glutamine synthetase and nodulin 26 across all our samples, however, provides further support for their suggested roles in ammonia release from the symbiosome (16,20).
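The peptide coverage values quoted for proteins in this section can be illustrated with a short calculation. The following is a minimal sketch only, assuming coverage is defined as the percentage of residues in the protein sequence matched by at least one identified peptide; the function name and the toy sequence are hypothetical and are not taken from the search software used in this study.

```python
def sequence_coverage(protein: str, peptides: list[str]) -> float:
    """Percentage of residues in `protein` covered by at least one peptide.

    Assumption: a residue counts as covered if it lies inside any exact
    occurrence of any identified peptide; overlaps are counted once.
    """
    if not protein:
        return 0.0
    covered = [False] * len(protein)
    for pep in peptides:
        if not pep:
            continue
        start = protein.find(pep)
        while start != -1:
            # Mark every residue spanned by this occurrence of the peptide.
            for i in range(start, start + len(pep)):
                covered[i] = True
            start = protein.find(pep, start + 1)
    return 100.0 * sum(covered) / len(protein)


# Toy example (hypothetical 20-residue sequence, two overlapping peptides).
toy_protein = "ACDEFGHIKLMNPQRSTVWY"
toy_peptides = ["DEFG", "FGHI"]
coverage = sequence_coverage(toy_protein, toy_peptides)
```

In the toy example the two peptides overlap, so the six distinct residues they span are counted once. Peptides shared between homologous proteins, as for the Rab and H+-ATPase families discussed here, would raise the apparent coverage of every protein that contains them.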
Sulfate Transporters-Two putative sulfate transporter proteins were identified in the SM proteome (Glyma09g32110 and Glyma07g09710) with 8 and 6% coverage, respectively, from identified peptides (Table II). These proteins, classified as sulfate/bicarbonate/oxalate exchangers, are homologous to the L. japonicus SST1 protein. Sulfur is a component of the metallo-clusters of nitrogenase, essential for the reduction of nitrogen, and must be actively transported across membranes (23). LjSST1 was identified from a fix⁻ mutant in L. japonicus and complemented a yeast strain deficient in sulfate transport (23). LjSST1 is also one of the few transporters that has been previously identified on the SM through proteomic analysis (34). Krusell et al. (23) reported that LjSST1 expression is essential for symbiotic nitrogen fixation; knockout mutants grow normally in nonsymbiotic conditions but are unable to produce functioning nodules when inoculated with Mesorhizobium loti.
Transcriptome data show that expression of Glyma09g32110 and Glyma07g09710 in soybean is specific to nodule tissue, where they are highly expressed (65,66). Detection of peptides corresponding to the soybean homologs here provides evidence for a role in the symbiosis in soybean as well as L. japonicus. Studies using ³⁵SO₄²⁻ and isolated soybean symbiosomes failed to detect sulfate uptake (Day, unpublished data) and, in this context, it should be noted that some members of the SST family, though not phylogenetically close to these soybean candidates, can transport other metabolites in addition to sulfate, including molybdate (93). Molybdenum is an essential component of the nitrogenase enzyme and an SM molybdate transporter is yet to be identified.
Energization of the SM-Three related P-type H⁺-ATPases were identified in the soybean SM proteome (Glyma04g34370, Glyma06g20200, and Glyma19g02270) with 16%, 16%, and 15% peptide coverage, respectively. A number of other H⁺-ATPases share peptides with these proteins, so there may in fact be many different proteins that play this role on the SM. The soybean transcriptome suggests that at least 13 H⁺-ATPase genes are expressed in nodules. Of those with unique peptides, Glyma19g02270 and Glyma04g34370 show the highest nodule expression (66). However, expression levels in other tissues are similar to those in nodules, suggesting that the same proteins have this activity in symbiotic and nonsymbiotic tissues. This agrees with the data of Blumwald et al. (29), which suggest that the H⁺-ATPases on symbiosomes and the plasma membrane of uninfected soybean root cells were not immunologically distinct, although they saw some differences in activity. Because the activity of H⁺-ATPase on the SM presumably reflects the activity of a number of different proteins, the differences in activity might reflect the different combinations of H⁺-ATPase proteins on the SM and root plasma membrane. A P-type H⁺-ATPase was detected on the SM of soybean using specific antibody labeling (29) and found in the SM proteomes of L. japonicus and M. truncatula (34,36). P-type H⁺-ATPases are considered to have an important role in the development of the symbiotic association, both to acidify the symbiosome space to promote protonation of NH₃ and to energize the SM by establishing an electrochemical gradient across the membrane that is necessary for the secondary transport of other solutes (reviewed in 14). Interestingly, the related V-type ATPases are also in the SM proteome of pea and L. japonicus (33,34), but could not be detected by immunolocalization on the soybean SM (29). The absence of V-type ATPases in this study, together with the results of Fedorova et al. (29), suggests that soybeans may differ from other legumes in their SM ATPase requirements.
Calcium Transport-Three Ca²⁺-ATPases were identified in the SM proteome: Glyma09g06890, Glyma03g33240, and Glyma19g35960. It has been suggested that symbiosomes may behave as calcium stores in infected cells (61). Calcium uptake is an active (ATP-driven) process and an ATP-driven Ca²⁺-pump has been biochemically characterized on the SM of broad bean (61). As for the P-type H⁺-ATPases, the Ca²⁺-ATPases identified here are expressed broadly across soybean tissues (65,66), suggesting recruitment to a new role and location as part of the symbiosis.
Members of the NPF transport a range of nitrogen-based compounds (98). AtNPF6.3 (AtNRT1.1, CHL1), one of 53 proteins in the NPF of Arabidopsis, can transport nitrate (99) and auxin (100), as can the M. truncatula homolog MtNRT1.3 (101,102). In this context, the reported uptake of indole acetic acid by isolated soybean symbiosomes (103) may be relevant. NPF proteins with dual transport functions are implicated in nutrient sensing within the plant, in addition to high- and low-affinity nitrate uptake (100). Other members of the NPF in Arabidopsis transport glucosinolate defense compounds in seeds (104). In the nonlegume Alnus glutinosa, AgDCAT1 was localized to the symbiotic interface and shown to transport dicarboxylates when expressed in E. coli (105), though its closest homologs are characterized as nitrate transporters (e.g. AtNPF6.3). This suggests homology alone cannot be used to predict solute specificity in this family. Because the main transfer of carbon from plant host to bacteroid in the symbiosis is through the dicarboxylate malate (106), members of any transporter family capable of malate transport on the SM are of particular interest.
Transport of nitrogen-containing compounds is of interest in legumes, especially as nodule development is suppressed in the presence of nitrate (107). In all plants, nitrogen plays an important regulatory role, particularly in lateral root formation and nodulation. An NPF family member in M. truncatula (MtNPF1.7, previously called LATD/NIP), classified in the same subfamily as GmNPF1.2, is essential for the development and maintenance of lateral roots and release of rhizobia into the symbiosome (108-110). Heterologous expression experiments have suggested that MtNIP/LATD encodes a nitrate transporter, but its function in nodules could not be directly replaced by its Arabidopsis homolog NRT1.1 (111).
There is also recent evidence to suggest that bacteroids in the pea-Rhizobium leguminosarum symbiosis may be auxotrophs for branched-chain amino acids, relying on the plant host to provide these solutes (112). Transported peptides may serve as a source of these amino acids, rescuing the bacteroids from their branched-chain amino acid deficiency. Also identified on the SM was Glyma09g21070, a member of the cationic amino acid transporter (CAT) subfamily of the amino acid-polyamine-choline family of amino acid transporters. Expression of Glyma09g21070 appears to be nodule specific.
ATP-binding Cassette Family Transporters-Five proteins with homology to the ATP-binding cassette (ABC) superfamily were identified in the SM proteome: Glyma04g34140 (GmABCA2), Glyma04g34130 (GmABCA7), Glyma02g10530 (GmABCB20), Glyma08g07580 (GmABCG11), and Glyma10g34700 (GmABCG39). GmABCA2, GmABCA7, and GmABCG11 have expression that is high and relatively specific to nodule tissue, whereas GmABCB20 and GmABCG39 have a more diverse expression pattern across soybean tissues (65,66). ABC transporters can act as importers or exporters and are driven by ATP hydrolysis. There are 133 members of this family in Arabidopsis, distributed over eight subclasses, but only 22 members have been characterized functionally (reviewed in 113). Plant ABC transporters have been localized to a range of subcellular membranes such as those of the vacuoles, chloroplasts, mitochondria, ER, and peroxisomes, as well as to the plasma membrane. They fulfil a range of functions within the plant and roles have been established in the transport of hormones, lipids, metals, secondary metabolites, and xenobiotics (reviewed in 114). The first member of the ABCA subfamily characterized, AtABCA9, has recently been demonstrated to mediate the transport of fatty acids for lipid synthesis in the endoplasmic reticulum (115). A number of members of the ABCB subfamily are auxin efflux carriers (116), whereas AtABCB14 is a malate importer (117; as opposed to the protein expected to export malate out of the cytosol and into the symbiosome). Members of the ABCG subfamily in Arabidopsis have a number of different roles including transport of strigolactones in development of the plant-mycorrhizal symbiosis (118), transport of lipids and waxes involved in production of the cuticle and in vascular development (119-121), and cadmium and lead export to aid in cell detoxification (122,123). GmABCG11 is a half-sized transporter that would function as a dimer.
Its closest Arabidopsis homolog is WBC11 (AtABCG11), which forms both hetero- and homodimers in its role in transport of cuticular lipids and sterols (119-121). GmABCG39 is a full-size ABCG transporter with 82% similarity to AtABCG39 and AtABCG34. AtABCG39 is localized on the plasma membrane and mediates resistance to paraquat, although there is no direct evidence that it transports this compound (124).
Lipid Raft Proteins-Several band 7/flotillin-like type proteins were identified in this study (Glyma05g01360, Glyma06g06930, and Glyma19g02370). There is 62% coverage of peptides for Glyma06g06930 and 65% for Glyma05g01360 (Table II). The genes encoding both are expressed at high levels in nodule tissue with limited expression in the other tissues (Table II, 63,64). The proteins share a common motif, the SPFH (stomatin, prohibitin, flotillin, and HflK/C) domain. Flotillin-like proteins have previously been identified on the SM in pea as well as soybean (31,33) and play an important role in the infection process in legume-rhizobia symbioses (125). Glyma06g06930, the soybean homolog of M. truncatula FLOT4, contains a conserved flotillin domain, a subgroup of the band-7 like proteins. Flotillin domain proteins are lipid raft-associated. Lipid raft microdomains on plant membranes are dynamic, sterol- and lipid-rich protein assemblies that serve as centers for membrane trafficking and signaling events as they interact with a range of different proteins (126,127). FLOT4 is up-regulated in a strongly nod-factor dependent manner during early symbiotic events and has been localized to the infection thread membrane and the plasma membrane in root nodules (125). FLOT4 silenced plants form fewer nodules that do not fix nitrogen efficiently (125). Although a role for flotillin has been established in the infection thread process, this study suggests it has a continuing presence on the symbiosome membrane in soybean.
Two remorin proteins, Glyma08g01590 and Glyma05g37990, were identified in the SM proteome with 19 and 15% peptide coverage, respectively (Table II). The genes encoding both these proteins show nodule-specific expression (Table II). Remorin proteins are plant specific and are localized to lipid rafts on membranes (128). Remorins have been implicated in regulatory functions in the symbiosis and their localization to lipid rafts on the SM has been confirmed (60). They were identified on the SM in the L. japonicus and pea proteomes (33,34). Their identification here presents further evidence of a regulatory role for remorin proteins in the mature SM.
Other Proteins of Interest-Glyma11g31870 (GmYSL7), a member of the Yellow stripe-like (YSL) family that is part of the oligopeptide transporter family, was identified on the SM, with 4.3% peptide coverage of the protein. GmYSL7 was identified first in soybean nodules through a PCR-based approach (22) and the soybean transcriptome suggests nodule-specific expression (63,64). YSL proteins in dicots typically transport metals such as iron, copper, and manganese complexed with nicotianamine (NA) (reviewed in 129,130). However, the closest Arabidopsis homolog, YSL7, has recently been shown to transport the Pseudomonas virulence factor, Syringolin A, which is a peptide derivative, with transport of Syringolin A inhibited by tri- to octapeptides (131). Syringolin A has similar chemical properties, size and net charge to metal-NA complexes (131) that are the usual substrate for YSL transporters, but whether AtYSL7 can also transport metal-NA was not established.
Four proteins with a PLAC8 superfamily motif (Glyma09g31910, Glyma08g04830, Glyma05g37590, and Glyma08g01990) are found on the SM. Expression of Glyma09g31910 and Glyma08g04830 is extremely high and virtually specific to nodule tissue, whereas Glyma05g37590 and Glyma08g01990 are expressed over a range of tissue types, with Glyma05g37590 enhanced five times in nodules compared with roots (65,66). Glyma09g31910 and Glyma08g04830 are in a clade of the PLAC8 family, known as plant cadmium resistance (PCR) proteins and fruit weight 2.2-like (FWL). Of particular interest, given the requirement for metal transport into the symbiosome (132,133), is the reported role of two members of this clade from Arabidopsis, AtPCR1 and AtPCR2, that appear to be involved in the export of heavy metals from root cells (134,135). This would translate to an import of metal into the symbiosome and the presence of homologous proteins on the SM suggests a possible role in maintaining adequate nutrition for the isolated bacteroids through import of a variety of metal cations. Zinc transport across the SM is also mediated through the ZIP1 transporter in soybean (136), so the PLAC8 transporters may present an additional transport mechanism to aid in maintaining zinc homeostasis. Ferrous iron transport into isolated symbiosomes was inhibited by cadmium and copper, perhaps indicating that a system for the transport of all three metals exists on the SM (27). There is also evidence that PCR proteins, such as BjPCR1, can mediate calcium ion transport (137,138). Another role postulated for PLAC8 proteins is in regulating cell number and so fruit size (139,140). Whether this role is governed by metal transport as observed for AtPCR1 and 2 or Ca²⁺ transport as recorded for BjPCR1 is not known (141). In soybean, Glyma09g31910, named FWL1, was recently investigated (142).
Silencing of the gene resulted in decreased nodule numbers with structural aberrations and heterochromatin condensation in infected cells. Promoter-GUS analysis suggested expression was highest in the nodule epidermis and cortex. Our results suggest expression is almost exclusively in infected cells in nodules (Fig. 2, see below). Clearly, more work is needed to understand the role of this family in nodules.
Expression Analysis for Selected Genes-Analysis of the RNAseq data for soybean (64) shows that 11% of proteins localized to the symbiosome membrane in this study are encoded by genes that are specifically expressed in nodules. A further 10% show expression 10-fold higher in nodules than in any other tissue. Many of these specifically expressed genes fall into the transport and protein degradation categories, suggesting specific roles for these classes of proteins within the symbiosis. We investigated where the genes encoding six of the SM proteins were expressed using promoter-GUS fusions. All genes showed infected cell expression in nodules, as expected if the protein product is localized to the SM (Fig. 2). Glyma11g34613.1 (GmNPF5.24), Glyma09g31910.1, Glyma01g31910.2, and Glyma07g39320.1 had expression specifically in these cells, whereas Glyma09g21070.2, and probably Glyma11g34600.1 (GmNPF5.25), showed expression in both infected and uninfected cells (Fig. 2). This correlated well with the transcriptome data for soybean (64), which suggest nodule-specific expression for all genes except Glyma07g39320.1. Because symbiosomes are only present in infected cells, the specific infected cell expression for most of these genes supports the role of the protein product on the SM.
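The two expression classes described above (nodule-specific, and at least 10-fold higher in nodules than in any other tissue) can be expressed as a simple rule over per-tissue expression values. The following is a minimal sketch only: the detection floor, the tissue labels, the category names, and the example values are assumptions made for illustration and are not the criteria or data used with the RNAseq atlas in this study.

```python
def classify_nodule_expression(expr: dict[str, float],
                               floor: float = 1.0,
                               fold: float = 10.0) -> str:
    """Classify one gene from its per-tissue expression values.

    Assumptions (hypothetical, for illustration only):
      - `expr` maps tissue name -> normalized expression, with key "nodule";
      - values below `floor` are treated as "not expressed";
      - "nodule-enhanced" means nodule >= `fold` x the highest other tissue.
    """
    nodule = expr.get("nodule", 0.0)
    max_other = max((v for t, v in expr.items() if t != "nodule"), default=0.0)
    if nodule < floor:
        return "not expressed in nodule"
    if max_other < floor:
        return "nodule-specific"
    if nodule >= fold * max_other:
        return "nodule-enhanced"
    return "broadly expressed"


# Hypothetical genes with made-up expression values.
examples = {
    "geneA": {"nodule": 250.0, "root": 0.2, "leaf": 0.0},
    "geneB": {"nodule": 300.0, "root": 20.0, "leaf": 5.0},
    "geneC": {"nodule": 50.0, "root": 40.0, "leaf": 35.0},
}
classes = {gene: classify_nodule_expression(values)
           for gene, values in examples.items()}
```

Tallying the resulting classes over all genes encoding SM proteins would give percentages of the kind reported above; the exact figures depend on the chosen floor and on how shared peptides are assigned to genes.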
As many of the genes with specific expression have clear duplicated copies expressed in other tissues, it seems that there has been subfunctionalization and, at least, regulatory neofunctionalization for these genes since the two genome duplication events in soybean (143). Polyploidy in soybean has possibly allowed the specialization of particular genes to their role in the symbiosis, producing signals for infected cell-specific expression as seen for five of the six genes investigated above. This may have led to neofunctionalization in a functional sense to make the symbiosis more efficient and to produce specific targeting signals that allow these SM proteins to reach their final location in the cell. The data for cell-specific expression and subcellular localization will provide a basis for further study in this area.
Confirmation of Localization to the SM for GmNPF5.25 and 5.29 -To confirm localization of putative SM proteins we analyzed their subcellular localization in infected cells of soybean nodules. GmNPF5.25 and GmNPF5.29 were fused to the N terminus of green fluorescent protein (GFP). We generated transgenic roots that expressed the GFP fusion constructs. Confocal microscopy showed that GFP-tagged proteins are located on symbiosomes (Fig. 3). The pattern of labeling closely resembles previous labeling of the SM in soybean nodules (29). Similar results were obtained for the products of Glyma11g31870.1 and Glyma08g04160.4 (results not shown). Nodules were costained with the lipophilic dye FM4-64. FM4-64 staining allows visualization of membranes of infected cells. Analysis of fluorescence intensity in the region of interest clearly showed colocalization of GFP-tagged proteins with the SM (supplemental Fig. S1).
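One common way to quantify this kind of colocalization is to compare the GFP and FM4-64 intensity profiles sampled along the same line through the region of interest. The following is a minimal sketch only, assuming two equal-length intensity profiles and using the Pearson correlation coefficient as the measure; the text does not state which measure was used for supplemental Fig. S1, and the function name and intensity values are hypothetical.

```python
from math import sqrt


def pearson_correlation(xs: list[float], ys: list[float]) -> float:
    """Pearson correlation between two intensity profiles of equal length."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("profiles must have the same length (>= 2)")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("a constant profile has no defined correlation")
    return cov / sqrt(var_x * var_y)


# Made-up profiles across one symbiosome: both channels peak at the membrane.
gfp_profile = [1.0, 5.0, 9.0, 5.0, 1.0]
fm464_profile = [2.0, 10.0, 18.0, 10.0, 2.0]
r = pearson_correlation(gfp_profile, fm464_profile)
```

A coefficient near 1 indicates that the two signals rise and fall together along the profile, which is the pattern expected when a GFP fusion and the membrane dye label the same structure. A value near 0 or below would argue against colocalization.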
For GmNPF5.29 the localization to the SM using GFP fusion was strong validation of our proteomic results because this was one of the lowest confidence proteins among those identified in the SM proteome. We had identified only two peptides for this protein in the proteome, one that was shared with other NPF family members.
Concluding Remarks-This is the most comprehensive proteomic study to date of the symbiosome membrane and the contents of the soluble space enclosed within that membrane. It confirms some previous studies and extends them substantially to identify new proteins that are likely to be involved in the transport of solutes across the symbiosome membrane and, through this transport, the regulation of communication between the symbiotic partners. We have shown that a subset of the genes encoding members of the SM proteome are expressed in infected cells of nodules, often specifically, and shown that some of these localize to the SM, using GFP-fusion analysis. Our results pave the way for functional analysis of these proteins and the further elucidation of mechanisms underpinning the function of the symbiotic organelle.
Acknowledgments-We thank Dr. Ben Crossett (University of Sydney, Australia) for technical advice on proteomic aspects of this work, and Prof. Harvey Millar and Dr. Nicolas Taylor (University of Western Australia, Australia) for their technical assistance with our initial proteomic experiments. We acknowledge the facilities, and the scientific and technical assistance, of the Australian Microscopy and Microanalysis Research Facility at the Sydney Microscopy and Microanalysis facility, The University of Sydney. The mass spectrometry proteomic data have been deposited to the ProteomeXchange Consortium (144) via the PRIDE partner repository with the data set identifier PXD001132 and DOI 10.6019/PXD001132.