Nanospray FAIMS Fractionation Provides Significant Increases in Proteome Coverage of Unfractionated Complex Protein Digests*

High-field asymmetric waveform ion mobility spectrometry (FAIMS) is an atmospheric pressure ion mobility technique that can be used to reduce sample complexity and increase dynamic range in tandem mass spectrometry experiments. FAIMS fractionates ions in the gas-phase according to characteristic differences in mobilities in electric fields of different strengths. Undesired ion species such as solvated clusters and singly charged chemical background ions can be prevented from reaching the mass analyzer, thus decreasing chemical noise. To date, there has been limited success using the commercially available Thermo Fisher FAIMS device with both standard ESI and nanoLC-MS. We have modified a Thermo Fisher electrospray source to accommodate a fused silica pulled tip capillary column for nanospray ionization, which will enable standard laboratories access to FAIMS technology. Our modified source allows easily obtainable stable spray at flow rates of 300 nL/min when coupled with FAIMS. The modified electrospray source allows the use of sheath gas, which provides a fivefold increase in signal obtained when nanoLC is coupled to FAIMS. In this work, nanoLC-FAIMS-MS and nanoLC-MS were compared by analyzing a tryptic digest of a 1:1 mixture of SILAC-labeled haploid and diploid yeast to demonstrate the performance of nanoLC-FAIMS-MS, at different compensation voltages, for post-column fractionation of complex protein digests. The effective dynamic range more than doubled when FAIMS was used. In total, 10,377 unique stripped peptides and 1649 unique proteins with SILAC ratios were identified from the combined nanoLC-FAIMS-MS experiments, compared with 6908 unique stripped peptides and 1003 unique proteins with SILAC ratios identified from the combined nanoLC-MS experiments. This work demonstrates how a commercially available FAIMS device can be combined with nanoLC to improve proteome coverage in shotgun and targeted type proteomics experiments.

Intelligent selection of peptide precursors for fragmentation is a performance-limiting factor in tandem mass spectrometric analyses of complex protein digests. When there is a large number of co-eluting species, as is routinely the case in whole proteome studies, the duty cycle (how many mass spectra can be obtained per unit time), mass resolving power (the ability to distinguish among precursor ions of similar mass), and dynamic range of the mass analyzer can limit the number of peptide identifications that can be obtained from a protein digest sample. A typical shotgun proteomics experiment employs data-dependent precursor selection wherein precursors are sequentially selected for fragmentation in order of decreasing intensity (1). The speed at which the mass analyzer can analyze all of the presented precursors determines whether low-intensity precursors will ever be selected for fragmentation. Dynamic exclusion of previously analyzed precursors extends the dynamic range of data-dependent analyses by forcing the ions into an exclusion list, but a sufficiently complex sample can still overwhelm the duty cycle (2). Precursor selection resolution determines the extent to which precursors with similar m/z can be independently selected for fragmentation. The lower the resolution capability, and the more complex the sample, the more likely it is that multiple precursors of similar mass may be simultaneously fragmented, resulting in chimeric spectra that can confound peptide spectrum matching search engines (3). Finally, for a low-intensity precursor to be selected for analysis, it must be distinguishable from the baseline chemical background noise.
Although improvements in the mass accuracy and duty cycle of mass analyzers continue to increase the number of peptides that can be identified in a given sample, the issue of sample complexity is still routinely addressed by fractionating the sample (typically by strong cation exchange (1), isoelectric focusing (4-6), or gel electrophoresis (7,8)) prior to chromatographic separation (typically reverse-phase liquid chromatography (9)). Although routine and effective, sample fractionation generally increases sample requirements and introduces the potential for sample loss and experimental error concomitant with additional sample handling. An approach that avoids extra manual sample handling is gas-phase fractionation (10), a technique wherein only precursors in a preselected range of m/z are selected for fragmentation. The m/z bins can be rationally designed according to empirical or theoretical knowledge of the typical m/z ratios of the sample (11), and can be as narrow and numerous as time and sample quantity allow. Although this technique increases the likelihood that a low-intensity precursor will be selected for fragmentation, it does not address the problem of resolving co-eluting analytes with similar m/z.
High-field asymmetric waveform ion mobility spectrometry (FAIMS) is an atmospheric pressure ion mobility separation technique that has been demonstrated to be compatible with electrospray ionization mass spectrometry (12). FAIMS separates gas-phase ions according to differences in their characteristic mobilities in high and low electric fields. FAIMS has been shown to improve analysis of peptides by mass spectrometry. Guevremont et al. described coupling electrospray ionization (ESI) to a FAIMS device in which gas-phase ions travel in the interstitial space between two concentric cylindrical electrodes in a direction parallel to the cylindrical axis (13-15). In this style of FAIMS, ions exit the device where the two electrodes terminate in concentric domes that provide a second point of focusing. Using only the separation properties of FAIMS, infusions of hemoglobin tryptic digest were analyzed by ESI-MS without previous separation by LC. Venne et al., using the FAIMS device developed by Guevremont and commercialized by Ionalytics, coupled nanoLC (600 nL/min) and nanospray ionization to FAIMS for analysis of protein digests (16). Coupled to a Waters Q-TOF, the FAIMS device gave a 20-40% increase in the signal and a roughly eightfold increase in signal-to-noise of infused [Glu-1] fibrinopeptide B. Ionalytics was acquired by Thermo Fisher in 2006 and the Ionalytics FAIMS device was modified and commercialized for use with Thermo Fisher LTQ and triple quadrupole instruments. The cylindrical electrode design was kept but the path of the ions was changed to traverse the interstitial space between the electrodes in a direction orthogonal to the cylindrical axis.
Canterbury et al. described coupling nanoLC and nanospray to a prototype version of the Thermo Fisher FAIMS device using a linear ion trap as a mass analyzer and demonstrated that collecting data while rotating through several compensation voltages (CV) over the course of a single experiment effectively increases the peak capacity of a single nanoLC-MS run (17). Because the Thermo Fisher FAIMS device integrates their Ion Max source design, which is designed for use only with their high-flow ESI-style probes, an after-market probe adaptor (New Objective) was required to couple nanospray to FAIMS. A loss of signal in excess of an order of magnitude was observed when coupling nanospray to FAIMS, but this drawback was offset by a fivefold increase in signal-to-noise.
In this work we describe a novel adaptation of a Thermo Fisher ESI probe that allows the user to easily couple nanospray to the Thermo Fisher FAIMS device, and we demonstrate that the addition of sheath gas, a feature currently unavailable in any other nanospray probe available for the Thermo Fisher system, significantly improves signal when using nanospray and FAIMS. To demonstrate the utility of nanoLC-FAIMS-MS for analysis of complex samples, we analyze an unfractionated, stable isotope labeling with amino acids in cell culture (SILAC)-labeled haploid/diploid Saccharomyces cerevisiae tryptic digest and show that the use of FAIMS significantly increases peptide discovery compared with nanoLC-MS without FAIMS.

EXPERIMENTAL PROCEDURES
Preparation of Haploid and Diploid Yeast Digest-Single colonies of BY4742 (haploid mating type α) and BY4743 (diploid) S. cerevisiae were grown overnight in a rich medium and used to inoculate two 200-ml cultures of yeast minimal media (0.17% yeast nitrogen base without ammonium sulfate or amino acids, 0.5% ammonium sulfate, 2% glucose) containing a full complement of amino acids minus arginine (R) or lysine (K). R and K amino acids were supplemented in one of two isotopic forms: normal (R0K0) in the BY4743 strain or heavy (R10K8) in the BY4742 strain. These cells were grown for eight generations to an OD600 of ~1.5. The cultures were then harvested by brief centrifugation and flash frozen in liquid nitrogen. Frozen pellets were disrupted by cryolysis using a Retsch ball mill grinder. The resulting grindates were brought up, by brief sonication with a probe tip sonicator, in 3 ml of a buffer containing phosphate buffered saline (pH 7.4), 50% trifluoroethanol, HALT phosphatase inhibitors (Thermo Fisher) and protease inhibitors (SigmaFAST Protease Inhibitor Tablets, Sigma). The lysate was cleared by centrifugation at 5000 × g for 5 min, and the supernatant was reduced with 10 mM dithiothreitol (DTT) at 55°C for 45 min and alkylated at room temperature for 60 min with 20 mM iodoacetamide in the dark; the excess iodoacetamide was then quenched by the addition of 10 mM DTT for 10 min at room temperature. Samples were diluted 1:10 with 100 mM ammonium bicarbonate and digested with trypsin (Promega, Madison, WI) for 16 h at 37°C. The trypsin-digested samples were adjusted to a pH of 3, desalted on C18 columns (Waters Sep-Pak 500 mg; loaded in 0.1% trifluoroacetic acid, eluted in 60% acetonitrile/0.1% trifluoroacetic acid), and dried down. Light- and heavy-labeled digests were mixed one-to-one prior to analysis.
Mass Spectrometer-Analyses were performed on an LTQ Velos Orbitrap (Thermo Fisher). MS1 data were collected over the range of 300 to 2000 Th in the Orbitrap (profile mode; resolution = 60,000 for infusion experiments and 30,000 for nanoLC-MS experiments) with automatic gain control with a target ion volume of 1 × 10^6 and a max fill time of 500 ms. MS2 data were collected in the LTQ-Velos (centroid mode with a target automatic gain control ion volume of 1 × 10^4 and a max fill time of 100 ms). The 20 most intense peaks from a preview scan of each full Orbitrap scan were selected (with a selection window of 2.0 Th) for collision-induced dissociation (CID) with wide-band activation. The minimum signal required for activation was set at 10 to ensure that the full dynamic range capability of each technique could be observed. Dynamic exclusion was enabled to exclude an observed precursor for 30 s after a single observation. The dynamic exclusion list size was set at the maximum 500 and the exclusion width was set at ±0.05 Th. Monoisotopic precursor selection was enabled. Charge state rejection was not enabled. The ion transfer capillary temperature was 275°C. The S-lens RF value was optimized at 45% with FAIMS and 65% without FAIMS.
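The data-dependent acquisition settings above (top 20 precursors, 30 s dynamic exclusion, ±0.05 Th exclusion width, 500-entry exclusion list, minimum signal of 10) can be illustrated with a minimal Python sketch. This is not the instrument firmware logic; the class and function names are hypothetical, and the rule of dropping the oldest entry when the list is full is an assumption made only for illustration.

```python
from dataclasses import dataclass, field

TOP_N = 20              # precursors fragmented per MS1 scan
EXCLUSION_S = 30.0      # seconds a precursor stays excluded
EXCLUSION_WIDTH = 0.05  # +/- Th around an excluded m/z
MAX_LIST_SIZE = 500     # maximum entries in the exclusion list
MIN_SIGNAL = 10.0       # minimum intensity required for activation


@dataclass
class ExclusionList:
    """Hypothetical container of (m/z, expiry time) pairs."""
    entries: list = field(default_factory=list)

    def purge(self, now):
        # Forget precursors whose exclusion period has elapsed.
        self.entries = [(mz, t) for mz, t in self.entries if t > now]

    def is_excluded(self, mz):
        return any(abs(mz - ex_mz) <= EXCLUSION_WIDTH
                   for ex_mz, _ in self.entries)

    def add(self, mz, now):
        if len(self.entries) >= MAX_LIST_SIZE:
            self.entries.pop(0)  # assumption: drop the oldest entry when full
        self.entries.append((mz, now + EXCLUSION_S))


def select_precursors(peaks, now, exclusion):
    """peaks: list of (mz, intensity); returns m/z values chosen for MS2."""
    exclusion.purge(now)
    chosen = []
    # Walk the peaks in order of decreasing intensity.
    for mz, intensity in sorted(peaks, key=lambda p: p[1], reverse=True):
        if len(chosen) == TOP_N:
            break
        if intensity < MIN_SIGNAL or exclusion.is_excluded(mz):
            continue
        chosen.append(mz)
        exclusion.add(mz, now)
    return chosen
```

In this sketch a precursor selected at time 0 s is skipped in every scan until 30 s have passed, which is what allows lower-intensity co-eluting species to be reached.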

FAIMS-The FAIMS device (Thermo Fisher) has been described previously (17). Briefly, the FAIMS electrodes consisted of two concentric cylinders (outer electrode I.D. = 18 mm, inner electrode O.D. = 13 mm) with a 2.5 mm gap and an effective length of 25 mm. Ions entered the FAIMS device orthogonal to the longitudinal axis of the cylinders and exited on the opposite side. Carrier gas was supplied in such a way that any gas in excess of the amount aspirated by the mass analyzer exited the FAIMS device through the ion entrance orifice. For the experiments described here, the dispersion voltage was -5000 V, the outer and inner electrode temperatures were 90°C and 70°C, respectively, the faceplate voltage was 1 kV and the spray voltage was 3.5 kV for a net spray voltage of 2.5 kV, and the bias voltage was 9 V. The FAIMS gas was 100% N2 supplied at 3.5 L/min.
Modified Nanospray Source-The FAIMS device as provided by Thermo Fisher incorporated their Ion Max source housing. In order to couple nanospray to FAIMS, a HESI-II heated ESI source was modified to accommodate a 360 µm O.D. fused silica capillary with a pulled tip for nanospray ionization (supplemental Fig. S1). The union where the metal ESI needle was coupled to the incoming eluent was bored out to a diameter of 1 mm and the existing Kel-F ferrule in the union was replaced with a similar Kel-F ferrule with an I.D. optimized for 360 µm O.D. tubing (Upchurch Scientific, WA, USA, part F-151). This modification allowed the metal ESI needle to be replaced with a pulled-tip capillary that could be adjusted to extend any desired length from the HESI probe body. Voltage was applied to the end of the capillary by a platinum electrode coupled to a T junction. The heating capability of the HESI-II probe was not utilized. For each LC run, 4 µl of sample containing 1 µg total protein was injected on the trap and rinsed with loading buffer (2% v/v acetonitrile and 0.2% v/v trifluoroacetic acid) at 0.005 ml/min as provided by an Agilent 1100 binary pump. Sample was separated by a linear gradient changing from 95% A (0.1% v/v formic acid in water) and 5% B (0.1% v/v formic acid in acetonitrile) to 65% A and 35% B in 60 min at 0.3 µl/min as provided by an Agilent 1100 nanopump enabled with an electronically controlled flow splitter. Infusion experiments were performed using an unpacked Picofrit column. A solution containing 500 fmol/µl each of human angiotensin-I and [Glu-1] fibrinopeptide in 30% v/v acetonitrile and 0.1% v/v formic acid was infused at 0.300 µl/min using a 100-µl syringe and the syringe pump on the mass spectrometer. The values for peak intensity and noise were determined from the average of the instrumental values of 10 scans as measured in the Orbitrap.
Data Analysis-MS/MS data were analyzed using the Trans Proteomic Pipeline version 4.4 (18). Thermo Fisher .RAW files were converted to mzXML format using MSConvert (ProteoWizard) and searched with X!Tandem (19) version 2009.10.01.1 with k-score plugin (20). Data were searched against a nonredundant S. cerevisiae reference protein database comprised of entries from the SGD, Ensembl, NCI, and GenBank databases. The database contained 13,618 entries: 6714 yeast; 94 common contaminants including human keratin as well as bovine and porcine trypsin; protein A; and decoys. Decoys were generated by randomly shuffling the residues between the N- and C-terminal residues of every potential tryptic peptide. Parent ions were searched with a 0.1 Da mass tolerance and fragment ions were searched with a 0.4 Da mass tolerance. Searching high-resolution precursors with a wide precursor mass tolerance improved probability models by employing the high mass accuracy binning model available in Peptide Prophet (18). Peptides were assumed to be tryptic (cleavage after K or R except when followed by P). Semi-tryptic peptides with up to two missed cleavages were allowed. The search parameters included a static modification of +57.021464 Da at C for carbamidomethylation by iodoacetamide and potential modifications of +15.994915 Da at M for oxidation as well as +8.014199 Da at K and +10.008269 Da at R for SILAC labeling. X!Tandem was set to search automatically for -17.026549 Da for deamidation at N-terminal Q and -18.010565 Da for loss of water at N-terminal E from formation of pyro-Glu as well as -17.026549 Da at N-terminal carbamidomethylated C for deamidation from formation of S-carbamoylmethylcysteine. Each LC-MS experiment was first analyzed separately in Peptide Prophet to assign probabilities to the peptide spectrum matches (PSM). Accurate mass binning was employed. Decoys and the nonparametric model option were used to improve PSM scoring.
The Peptide Prophet results of all the nanoLC-FAIMS-MS or nanoLC-MS experiments were combined with iProphet (21) to determine false discovery rates at the peptide level. Only peptides identified with iProphet probabilities corresponding to a false discovery rate of 1% were used for protein quantification. Ratios of light and heavy SILAC peptides were quantified using ASAPRatio (22) with a wavelet function (23). ASAPRatio parameters were set so that the background was assumed to be zero, the heavy and light peaks were quantified over the same scan range, and a mass tolerance of ±0.01 Da was used to select precursors for quantification. Protein identifications were determined with Protein Prophet (24). Proteins identified at Protein Prophet probabilities corresponding to a decoy-determined false discovery rate of 1% were taken for further analysis.
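The decoy construction described in the Data Analysis section (shuffling the residues between the N- and C-terminal residues of every potential tryptic peptide) might be sketched in Python as follows. This is an illustrative sketch rather than the script actually used to build the database; the function names and the fixed random seed are assumptions, and only fully cleaved peptides are considered.

```python
import random
import re


def tryptic_peptides(protein):
    """Cleave after K or R unless the next residue is P (no missed cleavages)."""
    return [p for p in re.split(r"(?<=[KR])(?!P)", protein) if p]


def shuffle_interior(peptide, rng):
    """Shuffle all residues except the first and last one."""
    if len(peptide) <= 3:
        return peptide  # nothing to rearrange
    interior = list(peptide[1:-1])
    rng.shuffle(interior)
    return peptide[0] + "".join(interior) + peptide[-1]


def decoy_protein(protein, seed=0):
    """Concatenate interior-shuffled tryptic peptides into a decoy sequence."""
    rng = random.Random(seed)  # hypothetical fixed seed for reproducibility
    return "".join(shuffle_interior(p, rng) for p in tryptic_peptides(protein))
```

Because the terminal residues stay in place, each decoy peptide keeps the length, composition, and precursor mass of its target counterpart, so decoys compete with targets under the same precursor mass tolerance.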
Analysis of Persistent Peaks-Experimental dynamic range and optimum CV were determined from PSM's that mapped to a persistent chromatographic peak. Peak mapping was performed with Hardklor (25). The peaks of each MS1 spectrum were represented by their centroid value. An observed peak was accepted as valid if the m/z value could be matched within 10 ppm in three or more consecutive spectra. A gap of up to two spectra was allowed to accommodate for portions of the profiles near the noise threshold. For each persistent peak, a profile was stored that included the retention time boundaries and the maximum signal intensity. Unique peptide ions (peptides with unique sequence, charge, and modifications) identified from PSM's were matched to persistent chromatographic peaks if the m/z values of the PSM and the persistent peak were within 5 ppm and their retention time profiles overlapped with a 1 min tolerance. For the PSM's identified by nanoLC-FAIMS-MS, the maximum and minimum CV for that PSM observation was recorded, and a CV profile was generated by storing the maximum intensity of the peak profile at each CV where a match occurred.
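The persistence criterion described above (an m/z matched within 10 ppm in three or more spectra, tolerating gaps of up to two spectra) can be sketched in plain Python. This is not the Hardklor implementation; the trace bookkeeping and function names are hypothetical, and counting three matched observations with bounded gaps is one possible reading of the criterion.

```python
PPM_TOL = 10.0    # m/z matching tolerance in ppm
MIN_SPECTRA = 3   # minimum number of spectra in which the peak is seen
MAX_GAP = 2       # consecutive spectra a peak may be missing


def within_ppm(mz_a, mz_b, tol_ppm=PPM_TOL):
    return abs(mz_a - mz_b) / mz_b * 1e6 <= tol_ppm


def persistent_peaks(spectra):
    """spectra: list of scans, each a list of (mz, intensity) centroids.

    Returns one dict per persistent peak with its scan boundaries and
    maximum intensity.
    """
    open_traces, finished = [], []
    for scan_idx, peaks in enumerate(spectra):
        unmatched = list(peaks)
        # Try to extend every open trace with a peak from this scan.
        for trace in open_traces:
            hit = next((p for p in unmatched
                        if within_ppm(p[0], trace["mz"])), None)
            if hit is not None:
                unmatched.remove(hit)
                trace["last_scan"] = scan_idx
                trace["n_obs"] += 1
                trace["max_intensity"] = max(trace["max_intensity"], hit[1])
        # Close traces that have been missing for more than MAX_GAP scans.
        still_open = []
        for trace in open_traces:
            if scan_idx - trace["last_scan"] > MAX_GAP:
                finished.append(trace)
            else:
                still_open.append(trace)
        open_traces = still_open
        # Every unmatched centroid starts a new candidate trace.
        for mz, intensity in unmatched:
            open_traces.append({"mz": mz, "first_scan": scan_idx,
                                "last_scan": scan_idx, "n_obs": 1,
                                "max_intensity": intensity})
    finished.extend(open_traces)
    return [t for t in finished if t["n_obs"] >= MIN_SPECTRA]
```

A PSM would then be assigned to a returned trace when its precursor m/z lies within 5 ppm of the trace m/z and its retention time falls within the trace boundaries plus the 1 min tolerance.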
Protein Network Data Analysis-A protein and gene interaction network for S. cerevisiae was generated as previously described (26). Briefly, genetic and physical interactions between proteins were obtained from the Saccharomyces Genome Database (SGD; http://downloads.yeastgenome.org/literature_curation/). Additional transcription factor information was obtained from YEASTRACT (http://yeastract.com). Networks were generated in Cytoscape (http://cytoscape.org) where nodes represented proteins labeled by the standard name and edges between nodes represented genetic, physical, or transcription factor interactions. Edges were weighted by the number of observations of each classification of interaction. Each node was assigned quantitative information computed using ASAPRatio as described above or from existing published SILAC (27) and microarray-based gene expression (28,29) data sets.
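The edge weighting described above (one edge per protein pair and interaction class, weighted by the number of observations) can be illustrated with a short Python sketch. The input tuple layout and the function name are hypothetical, and the example records are toy values rather than entries taken from SGD or YEASTRACT.

```python
from collections import Counter


def build_weighted_edges(observations):
    """Count observations per (protein pair, interaction class).

    observations: iterable of (protein_a, protein_b, kind) tuples, where
    kind is "genetic", "physical", or "transcription_factor".
    """
    weights = Counter()
    for protein_a, protein_b, kind in observations:
        # Sort the pair so that A-B and B-A map to the same edge.
        pair = tuple(sorted((protein_a, protein_b)))
        weights[(pair, kind)] += 1
    return weights


# Toy observations for illustration only
toy = [
    ("RHO1", "FKS1", "physical"),
    ("FKS1", "RHO1", "physical"),
    ("RHO1", "FKS1", "genetic"),
    ("RHO1", "SEC3", "physical"),
]
edges = build_weighted_edges(toy)
```

The resulting counts could be exported as an edge table with a weight column and loaded into a network viewer, where the weight maps to edge thickness.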
Experimental Design-Multiple analyses of 1 µg of SILAC-labeled yeast tryptic digest were performed with nanoLC-MS and nanoLC-FAIMS-MS. Forty replicate nanoLC-MS experiments were performed using Thermo Fisher's nanospray source, and 40 nanoLC-FAIMS-MS experiments were performed using the above-described modified HESI-II source. Each nanoLC-FAIMS-MS experiment was performed at a different CV from -1 V to -40 V (in random order).

RESULTS
Modified Electrospray Source-The Thermo Fisher FAIMS device incorporates their Ion Max housing, which is only compatible with their ESI-type probes. With a low-flow needle installed (Thermo Fisher part OPTON-53011) the lowest recommended flow rate is 1 µl/min and the dead volume of the spray needle is 0.67 µl (0.003 inch I.D., 148 mm length). In order to couple a pulled-tip capillary nanoLC column to the FAIMS device, we modified a Thermo Fisher HESI-II probe to accommodate a 360 µm O.D. capillary in place of the metal ESI needle (supplemental Fig. S1). Rather than seating a capillary of fixed length in place of the ESI needle, the union was modified so that the capillary passed through it, allowing us to adjust the length that the capillary extends from the probe body. This modification allowed us to place a capillary column in the probe, thus eliminating any post-separation dead volume, and to employ sheath gas, a feature unavailable on other nanospray source adaptors for the Ion Max source housing. We observed optimal ion signal with the probe placed at the depth demarcated "C" on the probe body and the capillary column extending 10 mm from the tip of the probe body such that the spray tip was several millimeters above and away from the entrance orifice of the FAIMS device. A low flow of sheath gas (5 arbitrary units) gave optimum signal. When infusing human angiotensin-I (500 fmol/µl) at a flow rate of 0.3 µl/min, the most abundant signal obtained with unaided nanospray and FAIMS was nearly 30-fold less than the signal obtained with nanospray and no FAIMS. The addition of sheath gas increased the signal nearly fivefold so that the signal with FAIMS was about one-sixth the magnitude of that obtained without FAIMS (Fig. 1).
FAIMS could be seen to confer a significant benefit despite signal loss; namely, reduced background signal and a fivefold improvement in signal-to-noise. Without FAIMS, any amount of sheath gas resulted in signal loss (data not shown).
FIG. 1. The FAIMS data were collected at a compensation voltage of -27 V, experimentally determined to optimize transmission of the triply charged 432.90 Th angiotensin ion. Spectra were acquired in the Orbitrap at a resolution of 60,000. The intensity of the 432.90 Th peak was 3.9 × 10^7 with the standard nanospray source, 1.4 × 10^6 with the unaided modified nanospray source and FAIMS, and 6.7 × 10^6 with the modified nanospray source with sheath gas and FAIMS. The signal-to-noise of the 432.90 Th peak was 2800 with the standard nanospray source and 14,000 with the modified nanospray source with or without sheath gas and FAIMS. The time to fill the trap to the preset ion volume was 2.5 ms with the standard nanospray source, 350 ms with the unaided nanospray source and FAIMS, and 69 ms with the modified nanospray source with sheath gas and FAIMS.
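The needle dead volume and the signal ratios quoted in this section follow from simple arithmetic, which the short Python sketch below reproduces. It assumes a cylindrical needle bore and uses only the dimensions and 432.90 Th peak intensities reported in the text; the variable and function names are illustrative.

```python
import math

INCH_TO_MM = 25.4


def tube_volume_ul(inner_diameter_mm, length_mm):
    """Volume of a cylindrical bore; 1 cubic millimeter equals 1 microliter."""
    return math.pi * (inner_diameter_mm / 2.0) ** 2 * length_mm


# Low-flow metal needle: 0.003 inch I.D., 148 mm long
needle_volume_ul = tube_volume_ul(0.003 * INCH_TO_MM, 148.0)

# Reported intensities of the 432.90 Th angiotensin peak
no_faims = 3.9e7       # standard nanospray source, no FAIMS
faims_unaided = 1.4e6  # modified source with FAIMS, no sheath gas
faims_sheath = 6.7e6   # modified source with FAIMS and sheath gas

loss_unaided = no_faims / faims_unaided          # "nearly 30-fold" loss
gain_from_sheath = faims_sheath / faims_unaided  # "nearly fivefold" gain
loss_with_sheath = no_faims / faims_sheath       # "about one-sixth" the signal

print(f"needle dead volume: {needle_volume_ul:.2f} ul")
print(f"loss without sheath gas: {loss_unaided:.1f}-fold")
print(f"gain from sheath gas: {gain_from_sheath:.1f}-fold")
print(f"loss with sheath gas: {loss_with_sheath:.1f}-fold")
```

At the 0.3 µl/min flow rate used here, a 0.67 µl dead volume would correspond to more than two minutes of post-column delay, which is why seating the column tip directly in the probe matters.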
NanoLC-FAIMS-MS of a Complex Protein Digest-We performed an exhaustive nanoLC-FAIMS-MS analysis of a tryptic digest of a 1:1 mixture of SILAC-labeled haploid and diploid yeast. (The data associated with this manuscript may be downloaded from the ProteomeCommons.org Tranche network using the hash shown in supplementary information.) Injections of 1 µg total protein were separated by reverse-phase chromatography over a 60 min linear gradient on a 75 µm I.D. capillary column pulled to a 15 µm I.D. tip that acted as a nanospray emitter. Analyses were performed either without FAIMS using a Thermo Fisher nanospray probe, or with FAIMS using our modified ESI probe with sheath gas. Forty replicate analyses were performed in both modes. In FAIMS mode, each replicate was performed at a different CV from -1 V to -40 V. The large number of non-FAIMS replicates was necessary to verify that any increase in peptide spectrum matches (PSM) afforded by FAIMS was not due to the number of replicates alone.
Nearly identical mass analyzer instrumental parameters were employed for both the FAIMS and non-FAIMS experiments: ion optics were optimized separately with and without FAIMS, but experimental parameters were kept the same. The threshold for CID activation was set extremely low in order to test the effective dynamic range of each type of experiment. Precursor charge rejection was not enabled in order that the charge states for optimal generation of PSM's from CID could be confirmed and so that the behavior of peptides of various charge states in FAIMS could be analyzed. Monoisotopic precursor selection was enabled, which requires the instrument to recognize isotopic distributions of the same precursor when performing dynamic exclusion; as a consequence, precursors for which charge state could not be determined by the mass analyzer were not selected for CID.
The Trans Proteomic Pipeline was used to assign confidence levels to the peptide and protein identifications obtained from an X!Tandem search of acquired mass spectra. Peptide Prophet analysis was performed on the combined search results of either all 40 nanoLC-MS or all 40 nanoLC-FAIMS-MS experiments, using X!Tandem's scoring algorithm as well as an accurate mass model and a decoy-generated model to assign false discovery rates to each PSM. iProphet was then used to combine the PSM's and assign false discovery rates to unique peptides based on the number of times a peptide was observed in multiple experiments and with multiple charge states and modifications. A modified version of ASAPRatio that employed wavelets to improve identification and integration of chromatographic peaks was used to determine ratios of light and heavy SILAC-labeled peptides. At a 1% false discovery rate, the combined nanoLC-MS experiments resulted in 336,984 PSM's corresponding to 16,221 unique peptide ions (peptides with unique sequence, modifications, and charge state), 11,792 unique peptides (peptides with unique sequence and modifications, disregarding charge state), and 6908 unique stripped peptides (peptides with unique sequence, disregarding modifications and charge state). The combined nanoLC-FAIMS-MS experiments resulted in 137,154 PSM's corresponding to 25,027 unique peptide ions (54% more than the nanoLC-MS experiments), 18,090 unique peptides (53% more than the nanoLC-MS experiments), and 10,377 unique stripped peptides (50% more than the nanoLC-MS experiments). Protein Prophet analysis of the combined nanoLC-MS experiments resulted in identification of 1003 unique yeast proteins with SILAC ratios (207 with identity inferred from a single peptide). The combined nanoLC-FAIMS-MS experiments produced 1649 unique protein identifications with SILAC ratios (452 with identity inferred from a single peptide), a 64% increase over the nanoLC-MS experiments (Fig. 2, supplemental Tables S1-S3).
For analysis of dynamic range and optimum CV, PSM's were matched to chromatographically persistent precursor peaks in order to reduce the likelihood of spurious matches to low-intensity peaks (17). 97.1% of unique peptide ions identified by nanoLC-FAIMS-MS and 98.5% of unique peptide ions identified from the nanoLC-MS experiment mapped to persistent peaks. The effective experimental dynamic range of the experiments was measured as the ratio of the apexes of the most intense and the least intense persistent peaks observed over the course of all the combined replicates for each type of experiment. The most and least intense persistent chromatographic peaks that resulted in a PSM in the nanoLC-MS experiments had magnitudes of 2.31 × 10^8 and 2.91 × 10^3, respectively, a dynamic range of 7.93 × 10^4. The most and least intense persistent precursor peaks observed in the nanoLC-FAIMS-MS experiment had magnitudes of 2.65 × 10^7 and 1.51 × 10^2, respectively, a dynamic range of 1.76 × 10^5, 2.2-fold greater than for the nanoLC-MS experiment. Representing the distribution of precursor ion intensities visually (Fig. 3), it can be seen that the mean magnitude of precursor ions in the nanoLC-FAIMS-MS experiments was about 10-fold lower than without FAIMS. The shape of the distributions suggests that much of the increase in PSM's afforded by the nanoLC-FAIMS-MS experiments came from relatively low-intensity precursors, a result that is consistent with the hypothesis that FAIMS increases PSM's by decreasing sample complexity and improving signal-to-noise.
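The effective dynamic range defined above is the quotient of the most and least intense persistent peak apexes. A minimal Python sketch using the apex magnitudes reported in the text is given below; the function name is illustrative, and the quotients agree with the reported values to within rounding of the published intensities.

```python
def dynamic_range(most_intense, least_intense):
    """Ratio of the largest to the smallest persistent peak apex."""
    return most_intense / least_intense


# Apex magnitudes reported for PSM-bearing persistent peaks
dr_no_faims = dynamic_range(2.31e8, 2.91e3)  # nanoLC-MS
dr_faims = dynamic_range(2.65e7, 1.51e2)     # nanoLC-FAIMS-MS
fold_gain = dr_faims / dr_no_faims

print(f"nanoLC-MS:       {dr_no_faims:.2e}")
print(f"nanoLC-FAIMS-MS: {dr_faims:.2e}")
print(f"fold gain:       {fold_gain:.1f}")
```

Note that the gain arises even though the most intense FAIMS peak is roughly tenfold weaker: the least intense identifiable peak drops by a larger factor, consistent with the reduced chemical background.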
Fractionation by FAIMS-Fractionation of complex samples is often desirable for LC-MS experiments. FAIMS can be used to fractionate peptide ions post-column by exploiting differences in ion mobility to allow only a desired subset of analyte ions to enter the mass analyzer. Fig. 4 shows the number of unique peptide ions (peptides with distinct m/z, including the same peptide identified at different charge states and with different modifications) identified at each CV. These charge-state-specific distributions show that FAIMS can be used to preselect populations of peptides with charge states optimal for a given fragmentation method. In this case, the CV's that result in the greatest number of PSM's from CID can be seen to consist primarily of +2 and +3 precursors while largely excluding +1 precursors. Of the 10,377 unique stripped peptides identified by the combined nanoLC-FAIMS-MS experiments, only 1.8% were uniquely identified at a charge state of +1 (that is, only identified from precursors with a charge state of +1 and no other charge state). 31% were uniquely identified at a charge state of +2, 29% were uniquely identified at a charge state of +3, and 1.1% were uniquely identified at a charge state of +4. The +2 and +3 precursors alone could account for 97% of the observed unique stripped peptides. For all charge states, peptides observed at more negative CV's had higher charge densities (supplemental Fig. S2), a feature that could be useful for alternative fragmentation techniques (e.g., electron transfer dissociation) that are better suited to high charge density precursors. Fractionation by FAIMS could be seen to confer an advantage over nanoLC-MS after as few as three replicates, and additional nanoLC-FAIMS-MS replicates added novel PSM's at a higher rate than the same number of nanoLC-MS replicates (supplemental Table S4).
The large number of replicates here was intended to thoroughly investigate the behavior of peptides in FAIMS, and these results can be used to design future experiments that require many fewer replicate injections. The single CV that resulted in the most unique PSMs, -15 V, isolated primarily +2 ions and was near the apex of the distribution of +2 ions as seen in Fig. 4. The optimal gap in CVs (ΔCV) for multiple nanoLC-FAIMS-MS experiments decreased as the number of replicates increased: the most unique PSMs were observed with a ΔCV = 6 V for any two replicates, ΔCV = 5 V for 3 to 8 replicates, ΔCV = 4 V for 9 or 10 replicates, ΔCV = 3 V for 11 to 14 replicates, ΔCV = 2 V for 15 to 19 replicates, and ΔCV = 1 V for 20 to 40 replicates. The average overlap in unique peptide ions identified (the percentage of unique peptide ions observed in an adjacent CV) was 72 ± 7% for ΔCV = 1 V, 63 ± 5% for ΔCV = 2 V, 54 ± 6% for ΔCV = 3 V, 45 ± 7% for ΔCV = 4 V, 38 ± 8% for ΔCV = 5 V, and 32 ± 8% for ΔCV = 6 V (Supplemental Table S5).
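The overlap and ΔCV comparisons above can be sketched in Python, given the set of unique peptide ions identified at each CV. This is an illustrative sketch, not the analysis script used for supplemental Tables S4 and S5: the function names are hypothetical, overlap is computed in one direction only (toward the more negative neighbor), and only evenly spaced CV sets are searched.

```python
from statistics import mean


def adjacent_overlap(ids_by_cv, delta_cv):
    """Mean percentage of ions at one CV also seen delta_cv volts more negative."""
    overlaps = []
    for cv, ions in ids_by_cv.items():
        neighbor = ids_by_cv.get(cv - delta_cv)
        if ions and neighbor is not None:
            overlaps.append(100.0 * len(ions & neighbor) / len(ions))
    return mean(overlaps)  # raises StatisticsError if no CV pair exists


def best_evenly_spaced(ids_by_cv, n_replicates, delta_cv):
    """Evenly spaced CV set of size n_replicates with the most unique ions.

    Returns (list of CVs, number of unique ions), or None if no set fits.
    """
    best = None
    for start in ids_by_cv:
        cvs = [start - i * delta_cv for i in range(n_replicates)]
        if all(cv in ids_by_cv for cv in cvs):
            total = len(set().union(*(ids_by_cv[cv] for cv in cvs)))
            if best is None or total > best[1]:
                best = (cvs, total)
    return best


# Toy identifications keyed by CV in volts (illustration only)
toy_ids = {
    -1: {"a", "b"},
    -2: {"b", "c"},
    -3: {"c", "d", "e"},
    -4: {"e"},
}
```

With real data, comparing the unique-ion totals returned by `best_evenly_spaced` across candidate ΔCV values for a fixed number of replicates would indicate which spacing to use when only a few injections are available.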
Improved Coverage of Protein Interaction Networks by FAIMS-To illustrate the advantages of increased protein identifications afforded by FAIMS, we examined KEGG (30) pathway coverage (supplemental Table S6) and physical interaction network coverage (supplemental Figs. S3-S7). Of 98 KEGG pathways comprised of 2967 proteins involved in metabolism, genetic information processing, environmental information processing, and cellular processes (all the pathways available for S. cerevisiae as of June 11, 2011 from the free KEGG web service, http://www.genome.jp/kegg), 1209 proteins contained in 94 pathways were quantified by the nanoLC-FAIMS-MS experiments leaving only four pathways (alpha-linolenic acid metabolism, other glycan degradation, biotin metabolism, and lipoic acid metabolism) without any identifying proteins. Only 863 proteins covering 85 pathways were identified by the nanoLC-MS experiments. As an example, KEGG pathways involving metabolism were identified by 1932 proteins using nanoLC-FAIMS-MS whereas they were only identified by 1392 proteins using nanoLC-MS, a drop of 30% in total identification (see supplemental Fig. S8). For networks for which any number of proteins were identified, the FAIMS experiments provided 48% coverage on average compared with 36% coverage by the non-FAIMS experiments.
... supplemental Table S4). For example, the two compensation voltages that combined to yield the most unique PSM's were -15 V and -21 V, a ΔCV of 6 V, whereas the 10 CV's that combined to yield the most PSM's were -4 V, -8 V, -12 V, . . . -40 V, a ΔCV of 4 V. As the number of combined replicates was increased, the ΔCV was decreased.
A physical interaction network comprised of all the proteins identified in all the experiments described here is presented in supplemental Fig. S3. Fig. 6 shows a sample protein interaction network extracted from the larger network comparing proteome coverage and quantitative results from the combined nanoLC-FAIMS-MS and nanoLC-MS experiments. Comparison with the de Godoy et al. study (27) as well as transcriptomics data are presented in the supplemental data (supplemental Fig. S4). This network centers around RHO1, a GTPase that is the master regulator of cell wall integrity (a system well characterized in a review by David Levin (31), which work is referenced here unless otherwise noted). RHO1, which we observed to be relatively more abundant in haploid yeast, activates β1,3-glucan synthase (GS) by a direct interaction with the integral membrane-spanning protein FKS1. FKS1 and its functionally redundant and structurally similar partner GSC2 (a.k.a. FKS2) make up the catalytic subunit of GS and are essential for cell wall biosynthesis. There is a large degree of sequence homology between FKS1 and GSC2 that makes unambiguous identification and quantification difficult. Though the two proteins are functionally similar, they are differently regulated: FKS1 is the predominantly expressed gene under normal growth conditions, whereas GSC2 expression is induced by, among other things, mating pheromone and cell wall stress. In the nanoLC-FAIMS-MS experiments, FKS1 was identified by 11 peptides, six of which are shared with GSC2. Only one peptide unique to GSC2 was identified from the nanoLC-FAIMS-MS experiments. In the nanoLC-MS experiments, FKS1 was identified by six peptides, four of which are shared with GSC2. No peptides unique to GSC2 were identified from the nanoLC-MS experiments.
RHO1 also plays an important role in membrane fusion by serving as an anchor for SEC3 (32,33), a component of the hetero-octameric exocyst protein complex. Three of the eight exocyst subunits were quantified from the nanoLC-FAIMS-MS experiments whereas none were identified from the nanoLC-MS experiments. The exocyst complex also interacts with the t-SNARE proteins SSO1, SSO2, and SEC9, all of which are necessary for fusion of secretory vesicles. SEC9 was observed by the nanoLC-FAIMS-MS experiments to be relatively up-regulated in diploid yeast, but was not observed in the nanoLC-MS experiments. SSO1 and SSO2 are functionally redundant and exhibit a large degree of sequence homology. In the nanoLC-FAIMS-MS experiments, SSO2 was identified by four unique peptides, one of which is shared with SSO1. Because no corroborating peptides were observed for SSO1, it was not considered to have been observed in the nanoLC-FAIMS-MS experiments according to the parsimony rules of Protein Prophet. In the nanoLC-MS experiments, SSO1 was identified by a single peptide unique to SSO1. The identification of SSO1 was assigned a high Protein Prophet probability because the peptide was identified at high confidence in 14 different nanoLC-MS replicates. One peptide unique to SSO2 was identified in two separate nanoLC-MS experiments, but the protein inference did not pass Protein Prophet stringency requirements because of weak Peptide Prophet probability.
FIG. 6. RHO1 interaction network. Node color and size represent the direction and magnitude of regulation: smaller red nodes indicate proteins more abundant in diploid yeast, larger green nodes indicate proteins more abundant in haploid yeast, yellow nodes indicate proteins near equal in both, and gray nodes indicate proteins that were not observed. Better network coverage was obtained from the nanoLC-FAIMS-MS results. Edge thickness indicates the strength of evidence for the interaction, with thicker lines representing more independent experiments demonstrating the interaction. Red edges represent genetic interactions, black edges represent physical interactions, and blue edges represent transcription factor interactions.
DISCUSSION
We have demonstrated how FAIMS can be used as a postcolumn fractionation technique prior to MS analysis to improve peptide discovery from complex protein digests. To take advantage of the reduced sample requirements of nanoLC (34), it was necessary to construct a modified ESI source compatible with the Thermo Fisher FAIMS device. We modified a commercially available Thermo Fisher ESI source to be compatible with a capillary column (75 µm I.D.) pulled to a tip (15 µm) for nanospray. Not only does this simple and inexpensive modification allow users to easily couple nanospray to FAIMS, it also gives the user access to sheath gas, a feature unavailable with other nanospray sources compatible with the Thermo Fisher FAIMS device. Maintaining sheath gas during nanospray provided a ~fivefold improvement in signal compared with nanospray FAIMS without sheath gas. The addition of sheath gas helps to offset the significant signal loss observed when nanospray is coupled to FAIMS. The exact mechanism by which sheath gas improves ion transmission when nanospray is coupled to FAIMS cannot be stated with certainty and will require further experimentation to elucidate. It is thought that the sheath gas provides improved desolvation, which is especially important for FAIMS. In the absence of the FAIMS device, the nanospray tip would be placed millimeters away from the heated ion transfer capillary at the entrance of the mass analyzer, and this capillary would be heated to temperatures in excess of 275°C in order to ensure complete desolvation of ions.
With the FAIMS device installed just prior to the heated ion transfer capillary, efficient desolvation is important because solvated "quasi-molecular" ions have markedly different properties from their desolvated counterparts and thus travel through the FAIMS device differently, resulting in significant signal loss (35). Thermo Fisher offers a heated electrospray probe (HESI-II) to aid in desolvation of ions prior to FAIMS, but to our knowledge no such solution previously existed for coupling a nanospray probe to the FAIMS device. Given the signal improvement obtained when coupling nanospray to FAIMS and the ease with which it is implemented, the work presented here provides a significant advance in highly sensitive inline sample fractionation, in an automated approach applicable to a multitude of sample types.
For FAIMS to confer any benefit as a fractionation technique, it was necessary to perform multiple injections at various CVs. In the experimental design described here, the CV was held constant for each nanoLC-FAIMS-MS replicate. No single nanoLC-FAIMS-MS replicate yielded as many unique peptide identifications as a single nanoLC-MS experiment, but combining the results of multiple nanoLC-FAIMS-MS experiments at different CVs conferred an advantage over replicate nanoLC-MS experiments with as few as three experiments. The FAIMS device and software provided by Thermo Fisher enable the user to program methods in which spectra are collected at multiple CVs over the course of a single chromatographic separation, effectively increasing the theoretical peak capacity of the experiment (17). However, the practical utility of this approach is limited by duty cycle. In its current configuration, the Thermo Fisher FAIMS device requires ~100 ms of lag time when switching between CVs. Because of this lag time, combined with the fundamental limits in duty cycle of the mass analyzer, we were unable to design a CV scanning experiment that resulted in more PSMs from a single nanoLC-FAIMS-MS experiment than from a single nanoLC-MS experiment. Future improvements in this FAIMS device (36) or others under development (37, 38) may decrease this lag time and make CV scanning experiments more viable on a chromatographic timescale (including peak widths of ~15 s afforded by nano UPLC applications). With the current configuration, however, we have demonstrated that scanning CVs is not an absolute requirement for existing FAIMS technology to improve peptide discovery in a nanoLC-FAIMS-MS experiment.
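The duty-cycle argument can be made concrete with a rough time budget. The sketch below is a simplified model, not a description of the instrument method: the ~100 ms switching lag and ~15 s peak width come from the text, while the per-CV dwell time (dwell_s), the function name, and the assumption that one lag is paid per CV step are hypothetical.

```python
def cv_scan_budget(n_cvs, dwell_s, lag_s=0.100, peak_width_s=15.0):
    """Rough time budget for cycling through n_cvs CVs during one LC peak.

    lag_s and peak_width_s follow the values quoted in the text; dwell_s
    (acquisition time spent at each CV) is a hypothetical, user-chosen value.
    """
    if n_cvs < 1 or dwell_s <= 0:
        raise ValueError("need at least one CV and a positive dwell time")
    # No switching penalty is incurred when the CV is held constant.
    switch_s = lag_s if n_cvs > 1 else 0.0
    cycle_s = n_cvs * (dwell_s + switch_s)  # one full pass over all CVs
    return {
        "cycle_s": cycle_s,
        "lag_fraction": switch_s / (dwell_s + switch_s),  # time lost to switching
        "passes_per_peak": peak_width_s / cycle_s,        # visits to each CV per peak
    }


budget = cv_scan_budget(n_cvs=5, dwell_s=1.0)
print(budget)
```

Under these assumed settings (five CVs, 1 s at each), roughly 9% of the time is spent switching and each CV is revisited fewer than three times across a 15 s peak, which illustrates why holding the CV constant per injection was the more practical design here.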
To demonstrate the advantages of increased proteome coverage afforded by FAIMS, we used both nanoLC-MS and nanoLC-FAIMS-MS to analyze a 1:1 mixture of haploid (mating type α) and diploid yeast that had been differentially labeled using SILAC technology. Comparing the 40 combined CV-stepped nanoLC-FAIMS-MS experiments against the 40 combined nanoLC-MS experiments, fractionation by FAIMS yielded 50% more unique stripped peptide identifications and 64% more unique quantified protein identifications. The large number of nanoLC-FAIMS-MS replicates was performed in order to cover the range of CVs at which tryptic peptides could be observed, and an equal number of nanoLC-MS replicates was necessary to demonstrate that any increase in PSMs afforded by FAIMS was not simply the result of replicate injections. It is not necessary to perform this many replicate injections in order to gain a practical benefit; with the appropriate selection of CVs, fractionation by FAIMS conferred an advantage over unaided nanoLC-MS with as few as three replicate injections (Fig. 5). The experimental approach described here provides a starting point for designing future nanoLC-FAIMS-MS experiments.
The model system studied here, SILAC-labeled haploid and diploid yeast, was also used by de Godoy et al. (27) in a study that reported near-complete coverage of the yeast proteome by LC-MS employing numerous and various sample fractionation techniques. While the results described here do not approach the 4033 quantified proteins reported in that study, it is worth noting that the level of proteome coverage reported in the de Godoy study was obtained with 492 two-hour gradients performed on fractions obtained from either isoelectric focusing or gel electrophoresis and only defined 66% of the complete yeast proteome. By comparison, the 1677 quantified proteins reported here were identified from 40 one-hour gradients (requiring a total of 71 h of instrument time) and no prior sample fractionation, a reduction of roughly 25-fold in gradient time. With the increasingly faster scanning speeds of high-resolution mass spectrometers, the notion of whole proteome analysis in a discrete set of analyses is achievable. The recent work of Thakur et al. (39) described the ability to further increase proteome coverage by using very long gradients of up to 8 h coupled to the LTQ-Velos-Orbitrap, allowing the identification of a total of 2990 proteins from yeast. In comparison, the identification of 1677 SILAC-quantified proteins using short one-hour gradients provides a favorable mass spectrometry platform for whole proteome identification. By further adopting long gradients and discrete FAIMS CV settings for whole cell lysate analysis, the ability to provide information on almost all proteins present in a sample is achievable.
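The time comparison can be reproduced with simple arithmetic. In the sketch below, the gradient counts, gradient lengths, protein counts, and the 71 h instrument-time figure are taken from the paragraph above; the variable names are illustrative, and the total instrument time of the de Godoy study is not stated in the text, so only its gradient time is used.

```python
# Gradient time in hours, from the run counts and gradient lengths quoted above.
godoy_gradient_h = 492 * 2   # 492 two-hour gradients
faims_gradient_h = 40 * 1    # 40 one-hour gradients
faims_instrument_h = 71      # total instrument time reported for this work

# Fold reduction in time relative to the earlier study's gradient time.
fold_vs_gradient = godoy_gradient_h / faims_gradient_h
fold_vs_instrument = godoy_gradient_h / faims_instrument_h

# Quantified proteins per hour of gradient time.
godoy_rate = 4033 / godoy_gradient_h
faims_rate = 1677 / faims_gradient_h

print(f"gradient-time reduction: {fold_vs_gradient:.1f}-fold")
print(f"reduction against 71 h of instrument time: {fold_vs_instrument:.1f}-fold")
print(f"proteins per gradient hour: {godoy_rate:.1f} vs {faims_rate:.1f}")
```

The fold reduction depends on which time basis is chosen: it is close to 25-fold when gradient time is compared with gradient time, and smaller when the full 71 h of instrument time for this work is used in the denominator.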
In conclusion, we have demonstrated how nanoLC can easily be coupled to existing commercially available FAIMS technology by making simple modifications to a Thermo Fisher ESI probe. Our modified probe enables augmentation of nanospray with sheath gas, which we have demonstrated significantly ameliorates the signal loss initially observed when coupling nanospray to FAIMS. We have demonstrated that FAIMS can be used to address concerns of sample complexity in proteomics experiments by improving dynamic range, allowing charge-based selection of precursors, and selectively decreasing the number of unique ions populating the ion trap. This work presents nano-FAIMS as a viable technology for augmenting existing mass spectrometry instrumentation in the proteomics laboratory, with great potential to fractionate very complex samples automatically, without the manual intervention and time-consuming wet chemistry techniques that invariably lead to sample loss.