The Salivary Protein Repertoire of the Polyphagous Spider Mite Tetranychus urticae: A Quest for Effectors

The two-spotted spider mite Tetranychus urticae is an extremely polyphagous crop pest. Spider mites have an unparalleled potential to detoxify plant secondary metabolites, and they have recently been shown to attenuate or even suppress plant defenses. Salivary constituents, notably effectors, have been proposed to play an important role in manipulating plant defenses and might determine the outcome of plant-mite interactions. Here, the proteomic composition of saliva from T. urticae lines adapted to various host plants (bean, maize, soy, and tomato) was analyzed using a custom-developed feeding assay coupled with nano-LC tandem mass spectrometry. About 90 putative T. urticae salivary proteins were identified. Many are of unknown function and in numerous cases belong to multimembered gene families. RNAseq expression analysis revealed that many genes coding for these salivary proteins were highly expressed in the proterosoma, the mite body region that includes the salivary glands. A subset of genes encoding putative salivary proteins was selected for whole-mount in situ hybridization, and these genes were found to be expressed in the anterior and dorsal podocephalic glands. Strikingly, host plant dependent expression was evident for putative salivary proteins and was studied in further detail by microarray-based genome-wide expression profiling. This meta-analysis revealed for the first time the salivary protein repertoire of a phytophagous chelicerate. The availability of this salivary proteome will assist in unraveling the molecular interface between phytophagous mites and their host plants, and may ultimately facilitate the development of mite-resistant crops. Furthermore, the technique used in this study is a time- and resource-efficient method to examine the salivary protein composition of other small arthropods for which saliva or salivary glands cannot be isolated easily.

two-spotted spider mite, Tetranychus urticae, is the most polyphagous, having been reported on more than 1000 host plant species in more than 140 different families (1). However, not all these host plants are equally suitable to T. urticae, and host plant acceptance can even differ across mite populations (5-7).
Important factors determining host plant acceptance by the herbivore are plant defenses, including physical and molecular-chemical barriers that hamper herbivore feeding (8). Different herbivores can induce a different repertoire of defenses and these differential plant responses are set in motion via herbivore-specific signals, predominantly emanating from their saliva (9). Plant defenses are regulated by a set of phytohormones, primarily jasmonates (such as jasmonic acid (JA)) (10), salicylic acid (SA) (11,12), and ethylene (13). Hormonal interactions are believed to enable the plant to regulate and customize responses under variable biotic and abiotic stress conditions (14). Most spider mites induce a mixture of JA- and SA-defenses (15-19) while a role for ethylene remains elusive (20).
It is conceivable that some spider mites have evolved traits that enable them to resist (5,21,22), attenuate (23), or suppress JA- (5) and SA-related defenses (24) to maintain a high fitness (19). Although it is largely unknown which terminal plant defenses determine resistance or susceptibility to mites, negative correlations were found between mite fitness and several plant secondary metabolites (18,25-27). How plants detect spider mite feeding is poorly understood, but analyses of transcriptional networks have suggested the involvement of receptor-like kinases reminiscent of other plant-herbivore interactions (28). These receptors may be involved in the recognition of molecules (elicitors) released during the onset of the plant-pathogen or plant-herbivore interaction (29).
Many herbivore elicitors emanate from saliva or regurgitation fluids released on or in the plant during feeding (30). Reminiscent of phytopathogens (9,31,32), herbivores evolved additional salivary molecules to counter the induction of defenses (8,33,34). Such molecules, which enhance herbivore performance, were originally referred to as "effectors." Some plant varieties have, however, evolved the means to recognize these effectors, effectively turning them into elicitors that activate plant defense responses (32,35,36). Because of this context-dependence (37), a broader, inclusive definition of the term effector was suggested (35): effectors are pathogen- or herbivore-secreted proteins and small molecules that alter host-cell structure and function. Effectors are of high interest to the plant breeding industry because they can lead to the identification of resistance genes (R genes) (38) and susceptibility genes (S genes) (39). R genes code for immune receptors, which confer recognition of pathogen- or herbivore-derived effectors or of the modifications they inflict on a host protein, eventually resulting in the activation of host defenses (39). S genes, on the other hand, can be considered to be all plant genes that facilitate infection and support compatibility (39).
The vast majority of herbivore effectors emanate from saliva. Silencing salivary effectors in non-arthropod herbivores like nematodes has been shown to reduce their performance (40). Likewise, silencing salivary effectors in insects like aphids reduced their reproduction (41). These studies indicate that salivary components are key players in the plant-herbivore molecular battlefield, and hence their identification is a high priority. Salivary proteins can be inferred from genomic, transcriptomic and/or proteomic data using a combination of criteria. For example, combining temporal and spatial gene expression data with the predicted presence of an N-terminal signal peptide (SP) in the corresponding proteins results in lists of putative salivary proteins (e.g. (42,43)). For T. urticae, an annotated genome is available (44), but no salivary gland-specific transcriptome and/or proteome has been obtained yet. It is known that spider mites inject salivary substances into host plant leaves (45-47). However, the proteomic composition of these substances has yet to be elucidated. The generation of gland-specific transcriptomes and proteomes is hampered by the extremely small size of spider mites and the complex morphology of the glands (48) (T. urticae adults have a body length of 400-500 µm with an approximate salivary gland length of 50 µm). Salivation of several eriophyid mite species has been achieved by soaking adult mites in immersion oil (49), and of Varroa destructor mites by topical application of cholinomimetic agents (50). Protein sequences were not obtained in these studies, however. A successful approach for obtaining sufficient amounts of salivary secretions suitable for protein analysis from non-mite arthropods has been to collect secretions from artificial diets encapsulated by a membrane on which feeding has taken place. For example, using this approach, multiple proteins, in a range from 10 to 100, have been identified in the secreted saliva of aphids (51,52) and true bugs (53).
We developed a set-up for collecting salivary secretions of T. urticae from artificial diet and analyzed the proteomic composition of these secretions. Our approach involved T. urticae lines that were reared on distinct economically important host plants for more than five generations, a period during which adaptation usually occurs (54). By including lines adapted to different hosts, we aimed to discover a broader spectrum of salivary proteins. Mite salivary secretions were harvested using a custom-developed mite feeding assay and subsequently investigated by nano-LC-MS/MS analysis. Additionally, a transcriptome of the proterosoma, which harbors the salivary glands, was constructed to validate proteomic data. Evidence for the salivary origin of a selection of identified proteins was obtained by whole-mount in situ hybridizations (ISHs). Furthermore, to assess host-specificity of salivary gland products, we investigated the host-dependence of expression of genes coding for the identified putative salivary proteins. The results from this study lay the groundwork for an improved understanding of the molecular machinery behind induction or suppression of resistance during plant-mite interactions, and may open new opportunities for breeding mite-resistant plants.

Establishment of T. urticae Lines on Different Host Plants-The T. urticae London strain has been maintained under laboratory conditions on bean plants (Phaseolus vulgaris cv. "Prelude," Fabaceae) for many years. The genome of this London strain has been sequenced (44). Lines on alternative host plants were established by transferring ~250 adult female mites from the London strain on bean to new hosts. These new host plants were cotton (Gossypium hirsutum, Malvaceae), maize (Zea mays, cv. "Ronaldinio," Poaceae), soy (Glycine max, cv. "Merlin," Fabaceae), and tomato (Solanum lycopersicum, cv. "Moneymaker," Solanaceae). Three independent lines were generated for cotton and tomato, whereas four independent lines were obtained for maize and soy. The mite lines were maintained in a climatically controlled environment at 26°C with 60% RH, and a light/dark (L:D) photoperiod of 16:8 h. Mites were offered fresh plants as needed, and were used in experiments after five generations for all hosts, except tomato, where replicate lines derived from London were adapted and maintained on tomato for over 30 generations (23).
Collection of Artificial Diet Enriched with T. urticae Saliva-To collect saliva, spider mites were allowed to feed on an artificial diet. Briefly, a pocket-like invagination was made in stretched Parafilm® M (Sigma, Bornem, Belgium) using a custom-built vacuum device (see supplemental Fig. S1), consisting of a 96-well plate (plate thickness 4.2 mm, hole diameter 4.5 mm) fitting on a vacuum manifold plate (Analytical Research Systems, Micanopy, FL) connected to a vacuum pump (model N 035.1.2 A_.18, KNF Neuberger, Freiburg, Germany). Next, 70 µl sterile holidic artificial diet (1/30 diluted aphid diet, (55)) supplemented with the antibiotic rifampicin (0.05 mg/ml) was added, after which the pocket was sealed with packaging tape (Scotch Packaging Tape, Extra Strong, 3M, Diegem, Belgium). The parafilm, with the diet-filled hemisphere side directed upwards, was cut to size (~4 × 4 cm), mounted on the back of a small Petri dish (90 mm diameter), and placed in a large Petri dish (135 mm diameter) filled with water. Using cotton wool and paper tissue, a water barrier was created, confining the spider mites in close proximity to the diet hemisphere (Fig. 1). Thirty to 40 adult female mites were transferred to each feeding arena. These mites originated from replicate lines adapted to bean, maize, soy or tomato (mites adapted to cotton were not used for the collection of saliva). For each host plant-specific sample, mites originating from each replicated host plant line were pooled. Addition of a blue colorant (0.05 mg/ml erioglaucine, Sigma) to control diet hemispheres was used to verify spider mite feeding as assessed by staining of gut contents (Fig. 1). Petri dishes with feeding hemispheres were placed in an incubator (MLR-352H, Sanyo/Panasonic, Osaka, Japan) at 26°C with 60% RH and 16:8 L:D. After 24 h, the remaining content of the feeding hemispheres was collected using a Hamilton microsyringe under sterile conditions (Laminar Flow cabinet, ESCO, Singapore). Samples were stored at −80°C until enough sample (about 0.3 ml, 10-15 diet hemispheres/sample) was collected for nano-LC-MS/MS analysis. Feeding hemispheres that did not receive spider mites were treated identically and served as reference sample during nano-LC-MS/MS analysis.
Nano-LC-MS/MS Analysis-Twenty micrograms of total protein (Pierce BCA Protein Assay Kit, Thermo Scientific, Aalst, Belgium) of the sample was reduced using 1.25 µl of 500 mM tris(2-carboxyethyl)phosphine in a volume of 100 µl 100 mM TEAB, and incubated for 1 h at 55°C. Next, the samples were processed using the filter-aided sample preparation (FASP) procedure (FASP Protein digestion kit, Protein Discovery, Knoxville, TN) according to the manufacturer's instructions. In short, the samples were diluted in a urea buffer and processed on a FASP filter, alkylated with iodoacetamide and digested with trypsin (enzyme/protein ratio = 1:50) overnight. Liquid chromatography mass spectrometric analysis was performed on a Waters nanoAquity LC-Ultra system connected to a Thermo Scientific LTQ Velos Orbitrap mass spectrometer. The equivalent of 2 µg of total protein of the digested sample was dissolved in 20 µl of 2% acetonitrile in HPLC-grade water. 10 µl of the sample (1 µg) was loaded on the trapping column (Pepmap C18, 300 µm × 20 mm) (Dionex, Sunnyvale, California) with an isocratic flow of 2% acetonitrile in water with 0.1% formic acid at a flow rate of 5 µl/min. After 2 min, the column-switching valve was switched, placing the pre-column online with the analytical capillary column, a Pepmap C18, 3 µm, 75 µm × 150 mm nano column (Dionex). Separation was conducted using a linear gradient from 2% acetonitrile in water, 0.1% formic acid to 40% acetonitrile in water, 0.1% formic acid in 100 min. The flow rate was set at 400 nl/min. The LTQ Orbitrap Velos (Thermo Scientific) was set up in a data-dependent MS/MS mode where a full scan spectrum (350-5000 m/z, resolution 60,000) was followed by a maximum of ten CID tandem mass spectra (100 to 2000 m/z). Peptide ions were selected as the 20 most intense peaks of the MS1 scan. CID scans were acquired in the LTQ ion trap part of the mass spectrometer. The normalized collision energy used was 35% in CID. We applied a dynamic exclusion list of 45 s.
Protein identification-Prior to protein identification, calibration of the data was performed using the methods described by (56), correcting the systematic bias in mass measurement in the second replicate. Peak lists obtained from MS/MS spectra were identified using OMSSA version 2.1.9 (57), X!Tandem version X! Tandem Vengeance (2015.12.15.2) (58), and MS-GF+ version Beta (v10282) (59). The search was conducted using SearchGUI version 2.8.5 (60). Protein identification was conducted against a concatenated target/decoy version of the T. urticae protein database holding 17,907 target sequences (version of December 16th 2014, see supplemental Data S1) and the common Repository of Adventitious Proteins (cRAP) database (available at http://www.thegpm.org/crap/). The decoy sequences were created by reversing the target sequences in SearchGUI. The identification settings were as follows: Trypsin with a maximum of 2 missed cleavages; 10.0 ppm as MS1 and 0.5 Da as MS2 tolerances; variable modifications: carbamidomethyl c (+57.021464 Da), oxidation of m (+15.994915 Da), pyro-glu from n-term q (-17.026549 Da), acetylation of protein n-term (+42.010565 Da), pyro-cmc (-17.026549 Da) and pyro-glu from n-term e (-18.010565 Da). Peptides and proteins were inferred from the spectrum identification results using PeptideShaker version 1.10.2 (61) (see supplemental Table S1 and S2 for a detailed list of all features of the LC-MS/MS identifications at the protein and peptide level in the treatment samples (artificial diet with feeding mites), respectively, and supplemental Table S3 for a list of all features of the LC-MS/MS identifications at the protein level in the reference samples (artificial diet without feeding mites)). Peptide Spectrum Matches (PSMs), peptides and proteins were validated at a 1.0% False Discovery Rate (FDR) estimated using the decoy/hit distribution.
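The target/decoy strategy described above can be illustrated with a minimal Python sketch. It is not the SearchGUI or PeptideShaker implementation: the function names, the `_REVERSED` accession tag, and the simple decoys/targets FDR estimate are assumptions made for illustration only, and the actual tools use more elaborate scoring.

```python
from typing import Dict, List, Tuple


def build_target_decoy(targets: Dict[str, str], decoy_tag: str = "_REVERSED") -> Dict[str, str]:
    """Return the target sequences plus one reversed decoy per target.

    The accession tag is a hypothetical naming choice for this sketch.
    """
    database = dict(targets)
    for accession, sequence in targets.items():
        database[accession + decoy_tag] = sequence[::-1]
    return database


def accept_at_fdr(hits: List[Tuple[float, bool]], max_fdr: float = 0.01) -> int:
    """Count target hits accepted at the requested FDR.

    hits holds (score, is_decoy) pairs, where a higher score is a better match.
    The FDR above a score threshold is estimated as decoys / targets.
    """
    ranked = sorted(hits, key=lambda hit: hit[0], reverse=True)
    n_target = n_decoy = accepted = 0
    for _, is_decoy in ranked:
        if is_decoy:
            n_decoy += 1
        else:
            n_target += 1
        # Keep the largest set of targets whose estimated FDR is acceptable.
        if n_target and n_decoy / n_target <= max_fdr:
            accepted = n_target
    return accepted
```

With a 1.0% threshold, a single decoy among a few hundred targets is tolerated, whereas a decoy appearing among the top handful of hits stops acceptance at the targets ranked above it.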
The mass spectrometry data along with the identification results have been deposited to the ProteomeXchange Consortium (62) via the PRIDE partner repository (63) (www.ebi.ac.uk/pride) with the data set identifiers PXD003022 and 10.6019/PXD003022.
Experimental Design and Statistical Rationale-Two technical replicates of artificial diets enriched with saliva of mites maintained on four different host plant species were analyzed in this nano-LC-MS/MS study. The different host plants to which the mite lines have been adapted were bean, maize, soy and tomato, as described in a previous section on "Establishment of T. urticae Lines on Different Host Plants." Eight samples were analyzed in total, yielding a list of putative T. urticae salivary proteins. Care should be taken when comparing the proteomics data of the different host plant lines, as each host-specific data set is based on one pooled biological replicate (each analyzed in two technical replicates). However, we believe that our approach was sufficiently rigorous as host plant specific production of T. urticae salivary proteins was complemented by a gene expression analysis of mites maintained on the four different host plants (see "The Expression Profiles of Salivary Protein Genes Show Signs of Host Adaptation").
Proteomics Data Filtering and Protein Quantification-Proteins identified in the artificial diet were pooled into one data set, while retaining host plant specific information. In order to retain confident protein identifications, only proteins identified in at least one of the samples with a mean PSM value of at least two, calculated over both technical replicates per host plant, were retained in the final putative salivary protein list. An MS1-intensity based Top3 analysis was performed to derive protein abundance (64). First, the moFF (modest Feature Finder, https://github.com/compomics/moFF) tool was used to extract the MS1 intensities from the Thermo raw files for the validated PSMs matching to non-decoy (true) sequences (see supplemental Table S4). Afterward, the intensities of the three (or fewer) peptides with the highest intensities were averaged for every protein detected. Next, a normalized abundance factor (rTop3) was calculated by dividing the Top3 value by the sum of all Top3 values of the quantified proteins in each experiment (65), excluding contaminants. Additionally, the proteins were ranked according to their maximal rTop3 value, as calculated across each host-plant specific sample.
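The filtering and quantification steps above amount to simple arithmetic, sketched below in Python. The function names and input layout are hypothetical, and the sketch assumes that one MS1 intensity per peptide has already been extracted and that contaminants have been removed beforehand.

```python
from collections import defaultdict
from typing import Dict, Iterable, List, Tuple


def top3(peptide_intensities: Iterable[Tuple[str, float]]) -> Dict[str, float]:
    """Average the three (or fewer) most intense peptides of each protein."""
    by_protein: Dict[str, List[float]] = defaultdict(list)
    for protein, intensity in peptide_intensities:
        by_protein[protein].append(intensity)
    return {
        protein: sum(sorted(values, reverse=True)[:3]) / min(len(values), 3)
        for protein, values in by_protein.items()
    }


def rtop3(top3_values: Dict[str, float]) -> Dict[str, float]:
    """Divide each Top3 value by the sum of all Top3 values in the sample."""
    total = sum(top3_values.values())
    return {protein: value / total for protein, value in top3_values.items()}


def passes_psm_filter(psm_counts: Dict[str, Tuple[int, int]], min_mean: float = 2.0) -> bool:
    """True if, for at least one host plant sample, the mean PSM count over
    its two technical replicates reaches min_mean."""
    return any((rep1 + rep2) / 2 >= min_mean for rep1, rep2 in psm_counts.values())
```

Because rTop3 values sum to one within a sample, they can be compared across samples with different total protein loads, which is what makes ranking by the maximal rTop3 value meaningful.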
X-ray Submicron Computed Tomography-Adult female T. urticae specimens were fixed as described previously (66). Briefly, live mites were collected and then incubated in a 6:3:1 mixture of 80% ethanol, 38% formaldehyde and 100% acetic acid for 72 h. Specimens were washed with 70% ethanol and dehydrated through graded concentrations of 70 to 100% ethanol (5% steps) for 10 min per step. The 100% ethanol step was done overnight at room temperature (RT). The dehydrated specimens were critical point dried with CO2 (CPD 020, Bal-Tec, Balzers, Liechtenstein) and glued with their opisthosoma to a 0.5 mm carbon pencil lead (Staedtler, Bornem, Belgium). Spider mite specimens were scanned with Nanowood (67), an x-ray submicron computed tomography (CT) system developed at the Ghent University Centre for x-ray Tomography (UGCT). The sample was scanned with an open-type nanofocus x-ray tube, reaching a focal spot size < 1 µm. Samples were scanned for 1 h, resulting in scans with an isotropic voxel pitch of ~0.5 µm. Reconstructions were performed using Octopus Reconstruction, a tomography reconstruction package for parallel, cone-beam and helical geometry (68), licensed by InsideMatters (www.insidematters.eu). Filtering was performed using the single step phase-retrieval Paganin algorithm (69). After reconstruction, a noise removal anisotropic diffusion filter was applied using Octopus Analysis, formerly known as Morpho+ (70), also licensed by InsideMatters. All visualization was performed with Fiji (71).
RNAseq Expression Analysis of the Proterosoma of T. urticae Adult Females-RNA was extracted from intact adult T. urticae females and from dissected proterosomas of adult females using the Qiagen RNeasy RNA extraction kit (Qiagen, Venlo, The Netherlands). Three replicates of 100 intact adult females from the London strain were collected from bean plants. For dissection of the proterosoma, a dissection chamber was prepared by affixing a square piece of 12.7 mm width double sided Scotch tape (3M, Maplewood, MN) on the bottom of a 35 mm Petri plate and allowing the tape to cure for 24 h. Adult female mites were carefully attached to the tape with their ventral side downwards and covered with 1x phosphate buffered saline solution (PBS). The dorsal side of the hysterosoma was opened using microdissecting needles and contents (gut, ovaries, eggs, dorsal parts of exoskeleton) were removed with dissecting needles and a micropipette. Remaining tissue (proterosoma, first and second pair of legs, nervous mass, salivary glands) was gently lifted from the tape and transferred to Buffer RLT (RNeasy RNA extraction kit, Qiagen) using a micropipette. In total, 250 mites were dissected and dissected tissue was pooled into one sample. RNA from this sample, as well as three samples collected from whole adult females, was used for Illumina library construction and subsequent sequencing on a HiSeq instrument (Fasteris, Plan-Les-Ouates, Switzerland). Briefly, 100 bp strand-specific paired-end reads were generated for all samples. The paired-end strand-specific Illumina RNA-seq reads were aligned to the T. urticae reference genome (44) using the two-pass alignment mode of STAR 2.5.0b (72) with a maximum intron size of 20 kb (the results were splice-aware alignments made independently of the reference genome annotation). The resulting BAM files were subsequently sorted by read name using Samtools 1.2 (73). Read counts per gene, based on the reference annotation (version October 29, 2015), were then obtained using the default settings of HTSeq 0.6.0 (74) with the "STRANDED" flag set to "yes" and the "FEATURE" flag set to "exon." Differentially expressed genes between the proterosoma of T. urticae females and intact T. urticae females were determined using the DESeq2 (version 1.6.3, (75)) and Bioconductor (http://bioconductor.org/) R-packages. The "unfiltered DESeq2 results" settings (dds <- DESeq(dds, minReplicatesForReplace = Inf) and res <- results(dds, cooksCutoff = FALSE, independentFiltering = FALSE)) were used for differential expression analysis. Genes with a fold change (FC) higher than or equal to eight and a Benjamini-Hochberg adjusted p value less than or equal to 0.05 were considered differentially expressed (DE). All gene expression data have been uploaded to the Gene Expression Omnibus with accession number GSE81128.
(76,77). Briefly, RNA was extracted from T. urticae mites (London strain) feeding on bean or tomato plants (Total RNA Isolation Mini Kit, Agilent), treated with TURBO DNA-free™ Kit (ThermoScientific) to remove contaminating genomic DNA and used for cDNA synthesis using Maxima First Strand cDNA Synthesis Kit (ThermoScientific). Primers were designed using Primer3 (http://bioinfo.ut.ee/primer3/). A fragment with a preferred length of about 300 bp was amplified (primers used are in supplemental Table S5). PCR products were cloned into pGEM-T plasmids (Promega, Madison, WI) and transformed into E. coli. Plasmids from liquid cultures were purified, after which insert orientation and nucleotide sequence were determined by sequencing (LGC Genomics, Germany). A PCR was performed on the plasmids using pUC/M13 primers (supplemental Table S5). The PCR product, containing the insert flanked by T7 and SP6 promoter sites from the plasmid backbone, was checked by agarose gel electrophoresis and purified using E.Z.N.A. Cycle Pure Kit (Omega Biotek, Norcross, GA).
Depending on orientation, sense or antisense digoxigenin-labeled (DIG-labeled) probes were generated using T7 or SP6 RNA polymerase (Roche, Anderlecht, Belgium), using the pUC/M13 PCR product and DIG-UTPs (Roche) in the in vitro labeling reaction. Probes were then purified using SigmaSpin™ Sequencing Reaction Clean-Up Columns (Sigma), supplemented with hybridization buffer (50% formamide (Sigma), 4× SSC (Sigma), 1× Denhardt's solution (Sigma), 250 µg/ml tRNA (wheat germ type V, Sigma), 250 µg/ml ssDNA (boiled salmon sperm DNA, Sigma), 50 µg/ml heparin (sodium salt, Sigma), 0.1% Tween-20 (Sigma), 5% dextran sulfate (sodium salt, Sigma)), and stored at −20°C until used.
T. urticae nymphs and adults (London strain) of both sexes were collected from bean and tomato plants and fixed overnight in a 1:1 mixture of heptane and PTw (PBS with 0.1% Tween-20) containing 4% formaldehyde. The mites were then washed in methanol and gradually rehydrated in PTw, followed by sonication in a sonic cleansing bath and treatment with 5 mg/ml Proteinase K for 10 min. The mites were then refixed with 4% formaldehyde in PTw. After washing in PTw, mites were prehybridized in hybridization buffer for 1 h at 52°C. Hybridization buffer was refreshed and probe was added. The mites were then incubated overnight at 52°C. Washing occurred at 53°C with wash buffer composed of 50% formamide, 2× SSC and 0.1% Tween-20. After washing at room temperature with PBTw (PTw with 0.1% BSA, Sigma), the mites were incubated at RT for 2 h with a 1:1000 dilution of anti-digoxigenin-AP (Fab fragments, Roche) in PBTw. The mites were then washed with PTw and alkaline phosphatase (AP) buffer (100 mM Tris pH 9.5, 100 mM NaCl, 1 M MgCl2, 0.1% Tween-20). AP buffer containing the FastRed substrate (SIGMA-FAST™ Fast Red TR/Naphthol AS-MX tablets, Sigma) was added and mites were incubated at RT in the dark, until red staining was visible. Methanol was used to reduce background staining and the mites were eventually cleared in 70% glycerol in PTw (pH 8.5) after washing with pure PTw. The mites were then mounted on a microscopy glass for further microscopic investigation (Nikon A1R fluorescence confocal microscope; emission at 500-530 nm and acquisition at 488 nm for spider mite auto-fluorescence and emission at 570-620 nm and acquisition at 561.7 nm for FastRed signal). Z-stacks were created using 15 slices with 2-3 µm distance between slices. All images were processed with Fiji (71).
Transcriptome Analysis of T. urticae Maintained on Different Host Plants-For each host plant adapted mite line, three (tomato and cotton) or four (bean, maize and soy) biologically replicated RNA samples were obtained. Each RNA sample was extracted from 100-150 pooled female adult mites using the RNeasy extraction kit (Qiagen) and was subsequently treated with DNase (Turbo DNA-free kit, ThermoScientific). RNA quantity and integrity were measured using an Agilent TapeStation system. RNA samples were labeled with cyanine dyes following the Low Input Quick Amp Labeling Kit (Agilent Technologies, Diegem, Belgium), with 100 ng of total RNA as starting material. RNA samples from mites feeding on the reference bean host plant were labeled with cy3, whereas cy5-labeling was performed on all other samples. Samples were pooled per host plant transfer and hybridized to a custom-made Sureprint G3 8 × 60K array (Agilent Technologies, with a GEO platform number of GPL16890) following the standard procedure of the Gene Expression Hybridization Kit (Agilent Technologies). After washing procedures (Gene Expression Wash Buffer kit (Agilent Technologies)), raw data were extracted from the 8 × 60K slides using the GE2_107_Sep09 protocol of the Agilent Feature Extraction Software. The intraspot correlation coefficient per array and the metrics from the arrayQualityMetrics package per host plant line were assessed for optimal background correction and normalization procedures (78). Data were background corrected using the "normexp"-method and normalized by loess and Aquantile (79). Cyanine intensities were extracted from the processed RG-object and averaged per host plant. Using the normalized MA-object, differential expression was assessed for mites on cotton, maize, soy and tomato against the corresponding ancestral mite population living on bean by an empirical Bayes approach. T. urticae gene expression data have been uploaded to the Gene Expression Omnibus with accession number GSE80337.
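For readers unfamiliar with two-color arrays, the quantities stored in an MA-object can be illustrated as follows. This Python sketch only shows the standard definitions of the log-ratio M and the average log-intensity A for one spot; background correction, loess and Aquantile normalization, and the empirical Bayes step are deliberately omitted, and the function name is hypothetical.

```python
import math
from typing import Tuple


def ma_values(cy5: float, cy3: float) -> Tuple[float, float]:
    """Return (M, A) for one spot of a two-color array.

    cy5: intensity of the cy5 channel (here, mites on the new host plant).
    cy3: intensity of the cy3 channel (here, the bean reference).
    M is the log2 ratio cy5/cy3; A is the mean log2 intensity of both channels.
    """
    log_cy5 = math.log2(cy5)
    log_cy3 = math.log2(cy3)
    return log_cy5 - log_cy3, 0.5 * (log_cy5 + log_cy3)
```

Under the labeling scheme described above, a positive M for a gene indicates higher expression on the new host than on bean.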
T. urticae Protein Family Analyses-T. urticae proteins were assigned to OrthoMCL groups using the online OrthoMCL software tool (http://www.orthomcl.org/orthomcl/proteomeUpload.do) and the T. urticae proteome (version July 29, 2015, available via the Online Resource for Community Annotation of Eukaryotes (ORCAE) database) as query (80). Signal peptides were predicted with SignalP 4.1 (81) and protein subcellular localization was predicted using WoLF PSORT (organism type: "Animal") (82) at http://www.genscript.com/wolf-psort.html. Proteins with a sequence length less than 30 AA or not containing a start methionine (pseudogenes) were excluded from WoLF PSORT analysis. Transmembrane domains were predicted using TMHMM server 2.0 (http://www.cbs.dtu.dk/services/TMHMM/). A phylogenetic analysis was performed for proteins belonging to OrthoMCL group clusters Tu_MCL_35 and Tu_MCL_36. Except for tetur55g00110 (Tu_MCL_36), which is encoded by a pseudogene, proteins from each cluster were aligned using MUSCLE (83). Model selection was done with ProtTest 2.4 (84) and, according to the Akaike information criterion, WAG+G and WAG+G+F were optimal for the phylogenetic reconstruction of Tu_MCL_35 and Tu_MCL_36 proteins, respectively. Finally, for each alignment a maximum likelihood analysis was performed using Treefinder (version 2011) (85), bootstrapping with 1000 pseudoreplicates (LR-ELW). The resulting trees were midpoint rooted and edited with MEGA 6.0 software (86).
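The exclusion rule applied before the WoLF PSORT analysis can be written as a small pre-filter. The Python sketch below is illustrative only: the function names are hypothetical and sequences are assumed to be plain one-letter amino acid strings keyed by accession.

```python
from typing import Dict


def eligible_for_localization(sequence: str, min_length: int = 30) -> bool:
    """True if the protein is at least min_length residues long and
    begins with a start methionine."""
    return len(sequence) >= min_length and sequence.startswith("M")


def select_eligible(proteome: Dict[str, str]) -> Dict[str, str]:
    """Keep only the proteins that pass the length and start-methionine rule."""
    return {acc: seq for acc, seq in proteome.items() if eligible_for_localization(seq)}
```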
Evaluation of T. urticae Salivary Proteins Using Available Databases-T. urticae salivary proteins identified by nano-LC-MS/MS were used as query in a BLASTp search (E-value threshold 1.0 E-5) against the proteome of the non-phytophagous American house dust mite, Dermatophagoides farinae (87). Furthermore, expression levels of genes coding for putative T. urticae salivary proteins were compared between feeding (mobile) T. urticae stages (larva, nymph and adult) and the embryo stage (44), while a nano-LC-nano-ESI-QTOF MS/MS proteomic analysis of mite faeces (88) was screened for the presence of putative salivary proteins of T. urticae. A set of differentially expressed genes in diapausing T. urticae females (89) was also investigated for genes coding for putative T. urticae salivary proteins. Finally, an expression data set of T. urticae genes across different time points of host plant transfer to tomato (23) was mined for putative T. urticae salivary genes.

T. urticae Secretes Proteins in Artificial Diet Which Can Reliably Be Identified Through Mass Spectrometry Analysis-Spider mite lines propagated for at least 5 generations on different host plants (i.e. bean, maize, soy and tomato) were allowed to feed on an artificial diet encapsulated in parafilm (Fig. 1). Mites fed on the artificial diet within hours, as assessed by blue staining of control mites fed an erioglaucine-supplemented diet. After 24 h, the artificial diet was collected for each plant-adapted mite line and for reference samples (artificial diet without feeding mites), and the protein content was analyzed using nano-LC-MS/MS. The detected salivary proteins from all host plant adapted mites were pooled into one data set. Ninety-five proteins (some belonging to 12 "protein inference groups" containing proteins identified by shared peptides) had a mean PSM of at least 2 and were retained as putative T. urticae salivary proteins (Table I, supplemental Table S6). Subsequently, we used the normalized (relative) abundance factor rTop3, based on MS1 intensity, for abundance ranking of these putative salivary proteins. This rTop3 value has been shown to correlate with the mole fraction of the protein of interest (65,90). Proteins with a high rTop3 factor are therefore assumed to be more prominent in T. urticae saliva. The majority (81%) of the putative T. urticae salivary proteins had a SP for secretion as predicted by SignalP. Only four (tetur03g08030, tetur10g00090, tetur10g00100, tetur22g00260) out of the 95 putative salivary proteins were predicted to have a transmembrane domain (Table I), whereas 76 (80%) were predicted to have an extracellular localization (Table I). The OrthoMCL analysis grouped 13,558 T. urticae proteins into 6397 ortholog groups. The majority of these groups already existed in the OrthoMCL database (http://www.orthomcl.org/orthomcl/), whereas 401 ortholog groups (group names starting with "Tu_MCL") were specific for T. urticae (supplemental Table S7). From the 95 putative T. urticae salivary proteins detected by nano-LC-MS/MS, 63 proteins could be assigned to an existing OrthoMCL group, 22 proteins were grouped into a T. urticae specific OrthoMCL group, and 10 proteins could not be included in any group (Table I, supplemental Table S6).
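The retention and ranking steps described above can be sketched in a few lines of Python. This is only an illustration, not the pipeline used in this study: the input layout, the function names, and the toy values are hypothetical, and the rTop3 definition used here (mean MS1 intensity of the three most intense peptides of a protein, divided by the sum of these means over all retained proteins) is an assumption.

```python
from statistics import mean

def retain_by_mean_psm(psm_counts, min_mean_psm=2):
    """Keep proteins whose mean PSM count across replicates is >= min_mean_psm."""
    return {p for p, counts in psm_counts.items() if mean(counts) >= min_mean_psm}

def rtop3(peptide_intensities):
    """Relative Top3 factor per protein (assumed definition, see the text above)."""
    top3 = {
        protein: mean(sorted(intensities, reverse=True)[:3])
        for protein, intensities in peptide_intensities.items()
        if intensities
    }
    total = sum(top3.values())
    return {protein: value / total for protein, value in top3.items()}

# Hypothetical toy data: PSM counts per replicate and MS1 peptide intensities.
psm_counts = {"protA": [3, 2, 4], "protB": [1, 0, 2], "protC": [2, 2, 2]}
peptides = {"protA": [9.0, 6.0, 3.0, 1.0], "protB": [5.0], "protC": [4.0, 2.0]}

kept = retain_by_mean_psm(psm_counts)
factors = rtop3({p: peptides[p] for p in kept})
# Proteins ordered from most to least abundant according to the rTop3 factor.
ranking = sorted(factors, key=factors.get, reverse=True)
```

In this toy example protB is dropped because its mean PSM is below 2, and the remaining proteins are ranked by their share of the summed Top3 intensity.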
RNAseq Analysis of T. urticae Proterosomas Supports the Salivary Origin of Many Identified Proteins-RNA was extracted from 100 intact adult female T. urticae mites with 3-fold biological replication and from 250 dissected T. urticae proterosomas (see Experimental Procedures). For all T. urticae genes the number of mapped RNAseq reads can be found in supplemental Table S8. Using the RNAseq data and the DESeq2 software, we performed a differential expression analysis (Benjamini-Hochberg adjusted p value ≤ 0.05 and |FC| ≥ 8) (supplemental Fig. S2, supplemental Table S9) between dissected T. urticae proterosomas and intact females: 1800 T. urticae genes showed a significantly higher expression in the proterosoma compared with intact females, whereas 1268 had a significantly lower expression. About 20% (591 genes) of the genes with a significantly different expression in the proterosoma coded for extracellular proteins, with 455 out of 591 genes having a significantly higher expression and 136 having a significantly lower expression in the proterosoma (Fig. 2, supplemental Table S9). Fifty-three percent of the top 100 (ranked based on decreasing log2 FC values) of the genes coding for extracellular proteins with significantly higher expression in the proterosoma coded for proteins with unknown function, whereas genes coding for serine proteases (8 genes), neuropeptides (6 genes) or homologues of "salivary gland peptides" of ticks (5 genes) were also observed (supplemental Table S9).
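The selection criteria used above (adjusted p value and fold change cutoffs) can be illustrated with a small, self-contained Python sketch. It is not DESeq2 and does not reproduce its statistical model; it only shows a Benjamini-Hochberg adjustment of a list of raw p values followed by thresholding. The gene records and function names are hypothetical, and the fold change cutoff is interpreted here as an 8-fold difference in either direction (an absolute log2 FC of at least 3), which is an assumption.

```python
import math

def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted p values in the input order."""
    n = len(p_values)
    order = sorted(range(n), key=lambda i: p_values[i])
    adjusted = [0.0] * n
    running_min = 1.0
    # Walk from the largest p value down, enforcing monotonicity.
    for rank in range(n, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, p_values[i] * n / rank)
        adjusted[i] = running_min
    return adjusted

def classify(genes, alpha=0.05, min_abs_fc=8.0):
    """Split genes into sets with higher or lower expression in the proterosoma."""
    padj = benjamini_hochberg([g["p"] for g in genes])
    threshold = math.log2(min_abs_fc)
    higher, lower = [], []
    for gene, q in zip(genes, padj):
        if q <= alpha and abs(gene["log2fc"]) >= threshold:
            (higher if gene["log2fc"] > 0 else lower).append(gene["id"])
    return higher, lower

# Hypothetical records: log2 FC (proterosoma vs. intact females) and raw p value.
genes = [
    {"id": "geneA", "log2fc": 4.2, "p": 0.0001},
    {"id": "geneB", "log2fc": -3.5, "p": 0.002},
    {"id": "geneC", "log2fc": 1.0, "p": 0.0005},
    {"id": "geneD", "log2fc": 5.0, "p": 0.4},
]
higher, lower = classify(genes)
```

Here geneC is significant but fails the fold change cutoff, and geneD passes the fold change cutoff but is not significant, so only geneA and geneB are retained.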
Evaluation of the Salivary Protein Repertoire With Previously Published Data Sets-Fifty of the 95 proteins (53%) identified by nano-LC-MS/MS shared homology (BLASTp with E-value threshold 1.0 E-5) with proteins of the most closely related nonphytophagous acariform mite for which proteomic resources were available, the American house dust mite D. farinae (Table I, supplemental Table S6). These were mainly proteins with a predicted function in digestion (26 out of 50), like glycoside hydrolases, beta-galactosidase/mannosidase, prolylcarboxypeptidase, proteinase inhibitors, and serine proteases. T. urticae proteins with an unknown function did not share homology with D. farinae proteins. A previously published data set of the T. urticae transcriptome across different developmental stages (44) was mined for the gene expression levels of T. urticae putative salivary genes. The overall average expression for all T. urticae genes (calculated as "reads per kb of transcript per million mapped reads": RPKM) across all the feeding stages (larva, nymph, adult) relative to the overall average expression in the embryo stage was 2.21-fold higher (44). Seventy-five out of 95 T. urticae putative salivary genes had a ratio higher than this average, whereas 28 (30%) had an expression ratio of more than 8 (Table I). Moreover, although only three putative T. urticae salivary protein genes were expressed at lower levels in nondiapausing as compared with diapausing adults, we found that 26 genes were more strongly expressed in nondiapausing adults that are actively feeding ((89), Table I, supplemental Table S6). Finally, 15 putative T. urticae salivary proteins were also detected in a proteome analysis of T. urticae faeces (Table I). These included cathepsins (tetur09g04400, tetur25g00650, and tetur12g01860), actins (tetur03g09480, tetur09g05350), a lipocalin (tetur04g05980), beta-galactosidase (tetur07g07380), fructose-bisphosphate aldolase (tetur07g03440), glutathione S-transferases (tetur01g02510, tetur03g07920), vitellogenins (tetur39g00810, tetur43g00010), alpha-2-macroglobulin (tetur18g03030), and two proteins with unknown function (tetur07g00160 and tetur29g01360).
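The expression ratio used in this comparison can be made concrete with a short Python sketch. The RPKM formula follows the definition quoted in the text; how the feeding stages were averaged is not spelled out here, so the simple mean over larva, nymph, and adult is an assumption, and the function names and example values are hypothetical.

```python
from statistics import mean

def rpkm(read_count, transcript_length_bp, total_mapped_reads):
    """Reads per kb of transcript per million mapped reads."""
    return read_count / (transcript_length_bp / 1_000) / (total_mapped_reads / 1_000_000)

def feeding_to_embryo_ratio(stage_rpkm, feeding_stages=("larva", "nymph", "adult")):
    """Mean RPKM over the feeding stages divided by the embryo RPKM."""
    return mean(stage_rpkm[s] for s in feeding_stages) / stage_rpkm["embryo"]

# Hypothetical RPKM values for one gene across developmental stages.
example = {"embryo": 2.0, "larva": 30.0, "nymph": 40.0, "adult": 50.0}
ratio = feeding_to_embryo_ratio(example)

# 2.21 is the genome-wide average ratio reported in the text.
above_genome_average = ratio > 2.21
```

A gene with this profile would count among those with a ratio above the genome-wide average and also among those with a ratio of more than 8.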

ISH of Salivary Protein Genes Combined with Morphological Analysis Distinguishes Anterior and Dorsal Podocephalic Glands-The spatial expression pattern of 15 putative T. urticae salivary protein genes was evaluated using whole-mount in situ hybridization (ISH). These genes were selected out of the nano-LC-MS/MS data set based on the possession of a SP for secretion and a relatively high expression in the proterosoma (Table I). Furthermore, we endeavored to include genes from as many different gene families as possible. To ensure that the observed ISH signal is linked to the genuine morphological structure, T. urticae internal anatomy was investigated by means of x-ray submicron computed tomography. This imaging technique resulted in high-resolution three-dimensional pictures, providing additional insights into spider mite internal anatomy beyond those reported previously (48,91,92), and allowed more accurate interpretation of the ISH signal observed by bright field and fluorescence confocal microscopy. Of the 15 genes, 14 were expressed specifically in either the anterior or dorsal podocephalic glands (Fig. 3). None of the genes we tested showed expression in both the anterior and dorsal glands, and staining was not observed when using sense control probes. We did not detect gene expression in a third pair of glands annotated previously as being salivary, and known as the coxal organs (48). Tetur13g00600 was also tested, yet no clear specific staining was visible in any gland or tissue.
The […]fied as well, 7 were found only when feeding on maize, 5 only when feeding on soy, and 11 only when feeding on tomato. No proteins were unique for the reference line on bean (Fig. 4A, supplemental Table S6). Host plant specificity of putative T. urticae salivary proteins was further illustrated by means of a heat map depicting the mean rTop3 factor of a selection of candidate salivary proteins (maximum rTop3 value higher than the 30th percentile of all maximum rTop3 values) for the different investigated host plants (Fig. 4B). The most apparent case of host-specific salivary proteins was tetur55g00040/tetur28g01330 (WTSP1, belonging to OrthoMCL group Tu_MCL_36), proteins with an unknown function, which were uniquely and abundantly identified from saliva of the tomato-adapted mites. Based on our clustering analysis, the salivary proteomic repertoire of T. urticae feeding on bean plants seemed most similar to the repertoire of soy-adapted mites (both plant species belong to the Fabaceae).
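The Venn-type comparison summarized above (Fig. 4A) amounts to simple set operations on the proteins retained per host plant line. The following Python sketch shows one way to derive host-specific and shared proteins; the protein labels and the function name are hypothetical placeholders, not data from this study.

```python
def host_specific(detected):
    """Map each host to the proteins detected only in mites adapted to that host."""
    unique = {}
    for host, proteins in detected.items():
        # Union of everything detected on the other hosts.
        others = set().union(*(p for h, p in detected.items() if h != host))
        unique[host] = proteins - others
    return unique

# Hypothetical sets of proteins passing the PSM cutoff in each host plant line.
detected = {
    "bean": {"p1", "p2"},
    "maize": {"p1", "p3"},
    "soy": {"p1", "p2", "p4"},
    "tomato": {"p1", "p5", "p6"},
}
unique = host_specific(detected)
shared_by_all = set.intersection(*detected.values())
```

In this toy input no protein is unique to bean, mirroring the situation described for the reference line.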
To validate these host-specific findings based on individual biological replicates, as well as to compare in further detail the expression level of genes coding for T. urticae putative salivary proteins across host plant species, we performed a genome-wide expression analysis of the mite lines that were used in proteomic experiments (i.e. T. urticae adapted to maize, soy or tomato relative to the reference line on bean) using an Agilent gene expression microarray. Additionally, a line adapted to cotton for 5 generations was investigated. We determined the absolute expression levels using normalized cyanine 3 (cy3) intensity values but also calculated the expression levels relative to the mites from bean as a benchmark (see Experimental Procedures for details, Fig. 5A and 5B). We confirmed what we detected previously when comparing the rTop3 values of the proteomics data across different host plant lines (Fig. 4): the salivary composition of T. urticae is host plant dependent. For example, in the tomato-adapted mite lines several salivary protein genes were highly expressed relative to bean-adapted mites: tetur32g00050 (Tu_MCL_35), tetur28g01330 and tetur55g00040 (Tu_MCL_36) and tetur31g00830 (Tu_MCL_25). In contrast, tetur03g03670, tetur03g10093, tetur03g03700, tetur03g03680, tetur03g03690, and tetur03g03730 (OrthoMCL group OG5_144177) were expressed at much lower levels in mites feeding from all host plants except for soy, compared with bean. The absolute expression level of the latter six genes was very high after feeding on bean or soy (cy3 intensity levels were within the top 1% of highly expressed genes in mites on either bean or soy) and relatively low after feeding on the nonleguminous host plants under study (fold changes between mites on bean and mites on tomato, maize or cotton varied between 7 and 289).

FIG. 4. Overview of nano-LC-MS/MS identified putative T. urticae salivary proteins. A, Venn diagram depicting overlap between putative T. urticae salivary proteins secreted by mites adapted to different host plants (bean, maize, soy, tomato). Only those salivary proteins with a mean PSM of at least two in at least one of the T. urticae host plant adapted lines were used for comparison (see Table I and supplemental Table S6). B, Heat map of mean rTop3 values of putative T. urticae salivary proteins secreted by mites adapted to different host plants (bean, maize, soy, tomato). Only those salivary proteins (and "protein inference groups") with a mean PSM of at least two in at least one of the T. urticae host plant adapted lines and with a maximum rTop3 value higher than the 30th percentile of maximum rTop3 values were used for comparison (see Table I and supplemental Table S6). The Euclidean distance metric and Ward's method were used for clustering of both rows and columns. All putative salivary proteins for which the corresponding genes were shown to be expressed in the salivary glands by ISH (Fig. 3) are shaded gray.
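The relative expression values discussed for the microarray data are log2 ratios against the bean reference. As a hedged illustration (the intensity values and the function name below are hypothetical, and normalization is assumed to have been done beforehand), the conversion can be written as:

```python
import math

def log2_fold_changes(intensity, reference="bean"):
    """log2 ratio of each host's normalized intensity to the reference host."""
    ref = intensity[reference]
    return {h: math.log2(v / ref) for h, v in intensity.items() if h != reference}

# Hypothetical normalized cy3 intensities for one gene on five host plants.
gene = {"bean": 5600.0, "soy": 5600.0, "maize": 700.0, "tomato": 175.0, "cotton": 800.0}
lfc = log2_fold_changes(gene)
```

On this scale, the 7- to 289-fold differences mentioned above correspond to absolute log2 FC values of roughly 2.8 to 8.2.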

Identification of Salivary Gland Proteins Secreted by the Spider Mite T. urticae-The spider mite T. urticae is well known for its ability to feed on an extraordinarily wide range of plant species, even though its feeding activities induce plant defenses that can negatively affect mite fitness (5,16,93). However, some arthropod populations, including T. urticae populations, have adapted to suppress these defenses (19,23,24,94), suggesting that effectors in their saliva are crucial determinants for success in colonizing host plants. To gain insight into the salivary proteome of T. urticae, spider mites were allowed to feed on small hemispheres filled with artificial diet. Using this setup, mites can secrete saliva into the diet, which in turn can be used for nano-LC-MS/MS analysis. By combining this approach with extensive follow-up analyses, we aimed to obtain as accurate a set as possible of proteins that are actually injected by T. urticae into host plants. In the past, this kind of approach was also used successfully to identify aphid salivary proteins (95-97). T. urticae gene expression profiles are known to be highly host plant dependent (23,98). By using spider mites reared on four different host plant species for a minimum of five generations, we maximized the chances of capturing a greater variety of salivary proteins. In addition, this allowed us to study the influence of the host plant on spider mite salivary protein composition.
By design, our method is suitable to collect saliva from small numbers of potentially tiny arthropods. In recent studies on aphid saliva (96,99,100), tens of thousands of aphids were needed to collect saliva in sufficient quantities for protein identification. Despite the smaller size of T. urticae (0.4-0.5 mm) compared with aphids (> 1 mm), we managed to collect sufficient amounts of saliva for nano-LC-MS/MS analysis using only about 500 mites per sample. Limited volumes of diet were sufficient for the analysis via state-of-the-art mass spectrometry, making the analysis of salivary proteomes of small herbivorous arthropods time- and resource-efficient where feeding on an artificial diet is feasible.

FIG. 5. Heatmap of expression levels of putative T. urticae salivary protein encoding genes and their up- or down-regulation in mites adapted to different host plants. A, Heatmap of cyanine intensities of putative T. urticae salivary protein encoding genes. The Euclidean distance metric and Ward's method were used for clustering of both rows and columns. For 92 out of 95 putative T. urticae salivary protein genes expression data was available. B, Heatmap of log2 FCs of putative salivary protein genes in mites adapted to soy, maize, cotton or tomato as compared with mites adapted to bean. Genes are sorted based on their order in panel A. Genes that were shown to be expressed in the salivary glands by ISH (Fig. 3) are shaded gray. A circle, triangle, filled square or empty square indicates whether a T. urticae gene belongs to either OrthoMCL group OG5_144177, Tu_MCL_25, Tu_MCL_36 or Tu_MCL_35, respectively.
Salivary proteins need to be secreted by the salivary glands into the salivary duct before they can be delivered into a host plant via the saliva. The majority of the 95 putative T. urticae salivary proteins identified by nano-LC-MS/MS possess a SP and were predicted to be localized extracellularly, consistent with secretion. However, the lack of a SP for 18 out of the 95 proteins does not exclude that these proteins are secreted, as secretory pathways independent of the canonical endoplasmic-reticulum-Golgi network may exist (101). Additionally, SPs can be missed by gene prediction programs during genome annotation. In fact, both reasons have been proposed to explain why many putative arthropod salivary proteins lack a predicted SP (e.g. (96)).
Although detection by LC-MS is one of the most robust methods to identify secreted salivary proteins, the list in Table I should not be considered exhaustive. A number of factors determine detection by nano-LC-MS/MS, including the adhesion to the material of the collection device during sample preparation, the presence of trypsin cleavage sites (102), the quality of MS/MS spectra, the type of chemical or posttranslational modifications that were accounted for in the database search, and the protein hydrophobicity influencing the fractionation (103). We validated proteomic findings with additional expression data sets. Previously, transcriptome analyses of salivary glands of insects have been reported (e.g. (104-106)). Current technology, however, does not allow straightforward generation of specific spider mite salivary transcriptomes as spider mite glands are merely 50 μm in diameter. We therefore isolated spider mite proterosomas, the anterior body region that includes the salivary glands next to other tissues such as the nervous mass and the silk glands (Fig. 3). In this study, the salivary glands were confirmed to be localized in the proterosoma using x-ray tomography. In a comparison between transcriptomes of T. urticae intact females and T. urticae proterosomas, candidate genes for salivary gland expression are expected to be more highly expressed in the latter. Our analysis revealed more than 400 genes with a significantly higher expression in the proterosoma. Reassuringly, (Table I, supplemental Table S6). The majority of the identified putative T. urticae salivary proteins are predicted to be secreted, are encoded by genes of which the expression is higher in the body region containing the salivary glands, and are expressed specifically in feeding spider mites.
Genes Encoding Putative Salivary Proteins Are Expressed in the Salivary Glands-Despite multiple lines of evidence for salivary origins, the proteins we identified could still originate from another source. Of particular note, the catalogue of putative salivary proteins did show some overlap (15 out of 95), with proteins identified from spider mite fecal extracts (88). However, with ISH we showed that for genes encoding two of these proteins (tetur07g00160 and tetur29g01360) expression is specific to the salivary glands (Table I, Fig. 3). Hence, the proteins present in the faeces probably reflect a salivary origin. This indicates that secreted salivary proteins may end up in the digestive system during feeding, either directly or indirectly after ingesting cytoplasm from cells in which saliva had been injected. Interestingly, many putative T. urticae salivary proteins shared homology with D. farinae proteins that were predicted to have a function in digestion. ISH in T. urticae showed that some of the genes that code for presumptive gut proteins, for example tetur07g00150 (a serine protease) and tetur28g00360 (a beta-mannosidase), are actually expressed in the salivary glands (Table I). Hence, they may have a digestive function in the saliva instead of, or in addition to, a potential role in the gut. Nevertheless, some proteins we found in the artificial diet could be fecal contaminants. This is also suggested by the proterosomal transcriptome, as some of the proteins detected in the diet are not differentially expressed, whereas the digestive enzymes confirmed by ISH are highly expressed in proterosomas (Fig. 2, supplemental Table S9). Adult female spider mites were found to deposit eggs and silk on the diet membrane and together with the mite's cuticle these could be other potential sources of contamination, e.g. for vitellogenin (tetur39g00810, tetur43g00010 and tetur516g00020). 
Among the 15 genes selected for ISH, we included some coding for proteins with either high or low rTop3 values, but all were more highly expressed in the proterosoma and belonged to the most prominent proteins or protein families in the data set (Table I).
Of the 15 selected genes, 14 were found to be expressed in either the paired anterior or paired dorsal podocephalic glands. Both of these podocephalic salivary glands are predicted to be rich in proteins as assessed by the presence of ribosomes (107) and staining for proteins with methylene blue (48). Intriguingly, our data show that the anterior and dorsal podocephalic glands are each responsible for the production of a distinct subset of the salivary proteome. Such "division of labor" has been reported for the different secretory cell types in the salivary glands of aphids (108-110) and has been suggested for thrips as well (111). The regulatory mechanism by which these glands or cell types are "assigned" to produce particular proteins in spider mites, aphids, or thrips is not known. A functional distinction between anterior and dorsal podocephalic glands in spider mites was already predicted by Mothes and Seitz (48). The production of a serous secretion was attributed to dorsal glands, whereas the anterior glands were predicted to produce a mucous secretion (48). However, supporting evidence for these specific predictions is not provided by our study.
For one of the 15 genes selected for ISH, expression in the salivary glands could not be confirmed, nor did we observe staining in other tissues. This does not exclude that this gene encodes a salivary protein, as ISH "failure" may have several causes including poor probe design or expression below the detection limit.
The Saliva of T. urticae Comprises a Mixture of Different Protein Families-Among the different putative T. urticae salivary proteins we identified, several were carbohydrate and protein degrading enzymes including cathepsins, serine proteases, glycoside hydrolases, beta-galactosidases, and beta-mannosidases. Genes coding for serine proteases and beta-mannosidases also showed a higher expression in the proterosoma (Table I, supplemental Table S9). These catabolic proteins could have a digestive function during the feeding process before ingestion. For example, polysaccharide-digesting enzymes present in the saliva of the hemipteran herbivore Homalodisca vitripennis have been proposed to play a role in the degradation of cell wall material, thereby facilitating stylet penetration (112). Predigestion of plant material has been suggested to be a property of spider mite saliva (47-49) but there is no empirical evidence to support this hypothesis. Recently, the spider mite consumption rate was estimated at a single mesophyll cell per 10 min (113), which is much lower than reported previously (3). Hence, such a low consumption rate might allow predigestion of plant cells with spider mite salivary enzymes. However, secreted digestive enzymes could have a function in the mite itself, e.g. in the esophagus, after ingestion. Like caterpillar feeding, spider mite feeding also induces the plant to produce defensive proteins like serine protease inhibitors (PI-I and PI-II) (5,15,29,93). The production of salivary serine proteases may be a means to compensate for inactivation of gut serine proteases. The mite itself also produces salivary proteinase inhibitors, as we detected several cystatins (cysteine protease inhibitors: tetur06g01060, tetur06g06630, tetur09g03620, tetur09g03650, and tetur09g03670) secreted in the diet. Cystatins can play important roles in plant-pathogen interactions.
The maize pathogen Ustilago maydis manipulates expression of a plant cystatin via its effector protein pep1, causing inactivation of defensive Cys proteases thereby making the plant susceptible to infection (114). In addition, Phytophthora infestans produces secreted serine protease inhibitors and cystatin-like effectors (115) to target key extracellular defensive proteases of its hosts and a host plant shift of this pathogen was attributed to a single amino acid change in one of these cystatin effectors (116). This illustrates how decisive effectors can be for determining host plant compatibility of a pest and it may suggest that unrelated organisms might have evolved effector proteins with similar functions (i.e. similar plant targets).
We identified a chitinase (tetur01g11910) among the T. urticae salivary proteins as well. Chitinase was also found in saliva of an aphid species (Diuraphis noxia) feeding on wheat (Triticum aestivum) and it was suggested that this enzyme might inhibit secondary fungal infections at the feeding site by hydrolyzing fungal chitin-rich cell walls (51). We found the chitinase only in the saliva of mite lines adapted to bean and maize (Table I, supplemental Table S6, Fig. 2), suggesting there might be a host plant specific role for these proteins. However, salivary chitinases are not uncommon and were found, for example, in saliva of humans (117), remipedes (118), octopuses (119), nematodes (120,121), and insects (51) and were suggested to play a defensive role against chitinous pathogens. Finally, among characterized proteins, a superoxide dismutase (SOD, tetur26g02320), which targets reactive oxygen species that are often induced in plants upon damage, was also detected. Notably, SODs have also been identified in the saliva of a potato leafhopper (Empoasca fabae) and two aphids (D. noxia and Rhopalosiphum padi) (122).
Intriguingly, several of the most abundant proteins in the saliva are encoded by multi-membered gene families of unknown function. Most of these proteins were encoded by genes with proterosoma-enriched expression, and lacked homologs in D. farinae, a non-plant-feeding mite. One possibility, although speculative, is that the proliferation of these families facilitated host-range expansion in T. urticae. Polyphagous species are proposed to harbor a larger collection of salivary proteins because they are exposed to a greater diversity of selection pressures, i.e. host plants that vary in morphology and (defensive) physiology (52). Analogous gene family proliferations have been described for other proteins relevant for plant-spider mite interactions, such as detoxification enzymes (cytochrome P450s, carboxyl/choline esterases, and glutathione S-transferases), transporters (21,123,124), and digestive cysteine peptidases (125).
The Expression Level of Some Salivary Protein Genes is Host Plant Dependent-Secreting the full salivary protein repertoire, independent of the host, might maximize the possibility that some salivary proteins are perceived as elicitors by the plant. In addition, it is unlikely that all proteins will function optimally in all hosts as targets may differ across plants or may be absent. Therefore, it would be beneficial for polyphagous herbivores to alter the composition of their saliva according to the host plant identity. Differences between the salivary gland transcriptomes of two populations of Nilaparvata lugens, a rice specialist, maintained on either a resistant or susceptible rice variety, were proposed to be related to different virulence traits of these brown planthoppers (126,127). Furthermore, drastically different salivary protein profiles were found when the western tarnished plant bug Lygus hesperus was fed artificial diet, cotton or pinto bean (128). Hence, we investigated host plant specific secretion and expression of the mite's salivary genes. As suggested by the clustering of proteomics data (Fig. 4), transcriptome analysis revealed that the expression of T. urticae salivary protein genes is strongly influenced by the host plant species on which the mites had been feeding (Fig. 5B). For example, proteins tetur28g01330 and tetur55g00040 were uniquely and abundantly identified from diet fed upon by tomato-adapted mites, and expression of the corresponding genes was also extremely high when feeding on tomato, relative to mites feeding on the other tested plants (Fig. 5). Next to individual genes, we also studied OrthoMCL gene family groups. Tu_MCL_25, Tu_MCL_35, and Tu_MCL_36 (Table I, supplemental Table S7) all have members that are highly expressed when feeding on tomato, whereas the expression of other members of these groups was not influenced by the host plant (Fig. 5, Fig. 6).
Future experiments should establish whether this is because mites from the tomato-adapted lines had been maintained on tomato for many more generations than mites on the other hosts or because the specific allelochemicals of tomato pose digestive or defensive challenges. Members of the OrthoMCL group OG5_144177 (Table I, supplemental Table S7) were, relative to mites on bean, expressed at lower levels in mites feeding from maize, cotton, and tomato (Fig. 5B). When feeding on soy, however, these genes were expressed at slightly higher levels. A similar pattern can also be deduced from the proteomics data of mites adapted to bean and soy versus mites adapted to maize and tomato. Bean and soy are both legume species (Fabaceae) and mites feeding on these plants probably encounter analogous plant secondary compounds that select for or induce a similar repertoire of salivary proteins. The observation that different host plant species can differentially affect expression levels of genes coding for salivary proteins, such as effectors, has been reported for aphids as well (108,110,129). Elzinga et al. (129) suggested that the differential expression of salivary effector genes represents a strategy to avoid activation of defenses and to facilitate feeding.

FIG. 6. Expression profiles of genes coding for members of the expanded Tu_MCL_36 and Tu_MCL_35 protein families in T. urticae. Phylogenetic analyses of Tu_MCL_36 and Tu_MCL_35 proteins are shown next to heatmaps depicting relative gene expression of Tu_MCL_35 and Tu_MCL_36 genes in T. urticae subjected to different host plant regimes (log2 FCs, relative to the expression level on bean). The heatmap on the left represents log2 FCs of Tu_MCL_35 and Tu_MCL_36 genes in mites adapted to soy, maize, cotton or tomato compared with bean, whereas the heatmap on the right represents log2 FCs of Tu_MCL_35 and Tu_MCL_36 genes in T. urticae after host shifts between bean and tomato (B_24hT, mites from the London reference strain on bean transferred to tomato for 24 h; B_30GT, mites from the bean strain grown on tomato for 30 generations; B_30GT_2GB, mites from the bean strain grown on tomato for 30 generations and transferred back to bean for 2 generations (23)). Gray boxes indicate that for a specific gene no probes were included in the T. urticae microarray design, and hence expression could not be assayed.
In the scope of this discussion it is important to realize that only one T. urticae strain (London) was used. Although mites from this strain had been maintained on bean for many years, previous studies have shown that this population is not fully inbred (130) and is capable of extensive transcriptional plasticity upon transfer to new hosts (18,21,22,44), the latter of which has been further confirmed in this study. We therefore believe that this study has captured much of the repertoire of T. urticae's biologically relevant salivary proteins. However, as marked variation of genotypes exists between T. urticae populations and across spider mite species (5,19), the use of additional T. urticae strains or different spider mite species may deliver additional salivary proteins. Identification via shotgun proteomics will, however, be less straightforward without reference genomes for these strains or species.
Prediction of Effector Proteins-Confidently identifying salivary proteins as effectors is not straightforward. Arthropod effector proteins are characterized by the presence of an N-terminal SP, directing them to the secretory pathway, and their encoding genes show signatures of rapid evolution (poor sequence similarity with other genera, multiple gene copies, and high rates of non-synonymous nucleotide mutations) (131-134). Because of the highly specific function of effectors, variable expression levels depending on the host plant could also be indicative of a context-dependent function (108). The presence of R genes, which turn effectors effectively into elicitors, can differ among plant species or varieties (135-137), as does the presence of S genes (39), the target of the effector. As such, the R and S gene composition of a host may determine the specific transcriptional response of a herbivore's secretome and thus its effector repertoire. Proteins of the OrthoMCL groups OG5_144177, Tu_MCL_25, Tu_MCL_35, and Tu_MCL_36 have characteristics attributed to (arthropod) effectors: (a) they are targeted to the saliva and are secreted during feeding, (b) they share no homology with proteins of non-phytophagous mite species, (c) they belong to multigene families (> 10 members) (Table I, supplemental Table S6, supplemental Table S7), and (d) the expression level of some of the gene family members has been shown to differ across host plants (Fig. 6). Intriguingly, salivary proteins from the OrthoMCL groups Tu_MCL_25 and Tu_MCL_211 (tetur31g01040 and tetur01g01000, respectively, see supplemental Table S6) were recently shown to act as effector proteins by Villarroel et al. (138). These proteins were identified using an in silico approach and were shown to enhance T. urticae reproductive performance when transiently expressed in N. benthamiana leaves (138). Future research is needed to unravel the function of these and other T. urticae salivary constituents documented here.

CONCLUSIONS

Despite the minute size of T. urticae, we managed to collect saliva for proteome analysis through an artificial diet system. Using nano-LC-MS/MS, 95 putative T. urticae salivary proteins were identified, indicating that these mites employ a complex protein mixture in their interaction with host plants. A proterosoma specific transcriptome is presented as a second discovery pipeline, and a considerable overlap with nano-LC-MS/MS data was observed. A selection of genes coding for putative salivary proteins was confirmed to be expressed in the salivary glands by whole-mount in situ hybridizations. The expression levels of a subset of genes were strongly dependent on feeding on different host plant species. Several proteins belong to protein families with as yet unknown functions, with some having structural and gene family features suggestive of roles as effector proteins. The differential expression levels of some of these putative salivary effector genes across host plants may be an indication of a host plant specific mode of action. The search for effectors and their possible targets is essential to our understanding of polyphagy and the evolution of the plant-mite interactions, but it is also of practical importance, as it would enable plant breeders to discover new R and S genes in order to develop mite-resistant crops.