Interactive Peptide Spectral Annotator: A Versatile Web-based Tool for Proteomic Applications*

Here we present IPSA, an innovative web-based spectrum annotator that visualizes and characterizes peptide tandem mass spectra. A tool for the scientific community, IPSA can visualize peptides collected using a wide variety of experimental and instrumental configurations. Annotated spectra are customizable via a selection of interactive features and can be exported as editable scalable vector graphics to aid in the production of publication-quality figures. Single spectra can be analyzed through provided web forms, whereas data for multiple peptide spectral matches can be uploaded using the Proteomics Standards Initiative file formats mzTab, mzIdentML, and mzML. Alternatively, peptide identifications and spectral data can be provided using generic file formats. IPSA supports annotating spectra collected using negative-mode ionization and facilitates the characterization of experimental MS/MS performance through the optional export of fragment ion statistics from one to many peptide spectral matches. This resource is made freely accessible at http://interactivepeptidespectralannotator.com, whereas the source code and user guides are available at https://github.com/coongroup/IPSA for private hosting or custom implementations.


In Brief
We have developed an innovative web-based spectrum annotator which visualizes and characterizes peptide tandem mass spectra. The generated annotated spectra are fully interactive and customizable, and any spectrum can be exported as a scalable vector graphic for publication-ready figures. All uploaded data can optionally be batch processed to rapidly extract ion statistics for method development. Our platform additionally supports the annotation of spectra collected in the negative mode, providing a much-needed resource for the proteomics community.

Highlights
• Generation of interactive annotated spectra using scalable vector graphics.
• Upload data using generic or standard file formats mzTab, mzIdentML, or mzML.
• Annotation support for peptides collected using negative-mode ionization.
• Batch processing of uploaded identifications to report detailed ion statistics.
Molecular & Cellular Proteomics 18: S193-S201, 2019. DOI: 10.1074/mcp.TIR118.001209.
Tandem mass spectrometry (MS/MS)¹ is the centerpiece of modern proteome analysis. Advances in instrument design and acquisition software have enabled collection of well over 100,000 MS/MS scans in less than an hour of analysis (1-9). Researchers have developed a wide variety of search algorithms and related computational tools to rapidly translate this large volume of experimental data to peptide spectral matches (PSMs), where peptide sequences are assigned to spectra to identify the proteins present in a sample (10-16). An important component of this process is matching expected product ions to those observed in the experimental spectra. Annotation of spectra in this sense usually involves labeling observed m/z features with matched fragment ion designations (e.g. a/x-, b/y-, or c/z-type product ions) derived from the reported peptide sequence. Expert manual annotation is valuable but highly time-consuming, and it is unfeasible for the large volume of spectra generated in modern proteomic experiments.
Proteomic field guidelines have increasingly emphasized the importance of providing access to annotated MS/MS spectra for publication, which allows others to inspect reported PSMs and validate their assignment to a given sequence (17-20). Many software tools have been created to aid researchers annotating individual PSMs contained in bulk datasets. Most such tools are downloadable and often integrated directly into data-analysis suites, although a handful have been developed as web browser-based platforms (21-23). Lorikeet (https://uwpr.github.io/Lorikeet/) is a well-established web-based spectral annotator which has been integrated into several online mass spectrometry resources to visualize routine shotgun and cross-linked proteomics data (24-27). However, Lorikeet does not render generated annotated spectra in scalable vector graphics (SVG) format, limiting the flexibility of exported visualizations with regard to figure creation.
Although powerful for the platforms for which they were designed, many of these tools are inseparable from their respective analytical pipelines; data visualization in MaxQuant is only available following processing with the integrated Andromeda search engine, for example. Their purview is therefore limited, and facile spectral annotation is restricted to only those search algorithms packaged in a pipeline with a developed annotator. This restriction poses a problem for numerous applications, especially for alternative peptide fragmentation methods such as ultraviolet photodissociation (UVPD), collisionally supplemented electron-transfer dissociation (EThcD), or activated-ion electron-transfer dissociation (AI-ETD) (28-30). Over the course of several years, these methods can often be integrated into the established analytical pipelines adopted by the field. But flexible annotation tools are largely unavailable in the beginning stages of method development, arguably when they are needed most. For example, Lorikeet bundles annotation calculations directly with its spectrum viewer. This requires in-depth knowledge of Lorikeet's architecture to add functionality for new technologies. In contrast, separating the annotation process from the spectrum renderer lends itself to a more stable platform for spectral annotation, as the components can be maintained and implemented independently.
Here we present the Interactive Peptide Spectral Annotator (IPSA) to provide a standalone web platform for annotation and interpretation of peptide tandem mass spectra independent of instrumental platform, identification pipeline, and peptide fragmentation technique. IPSA provides flexibility to annotate spectra containing any of the six common peptide fragment ion types. Importantly, it can export annotated data in a tabular format, which enables the rapid compilation of fragment ion statistics for individual or multiple peptide tandem mass spectra, a useful tool in a wide range of proteomic experiments. We have also built in compatibility with spectra collected in the negative mode, providing a much-needed resource for the continued development of negative-mode proteomic approaches. Further, IPSA offers a platform for the generation and export of figure-ready annotated spectra in an editable format. In all, IPSA expands spectral annotation capabilities to all types of shotgun proteomic data regardless of how the data were collected or processed.

EXPERIMENTAL PROCEDURES
Software Development-IPSA is composed of two major components: a client-facing interactive web visualizer and a server-side data processor which handles the data processing required for spectral annotation. Client-side visualization software was developed using AngularJS. The D3.js library is leveraged to generate interactive annotated spectra using SVG from annotated data returned from the server after analysis (31).
Server-side software consists of a set of modular PHP scripts, which perform form validation, data processing and annotation, file upload handling, and data export. A MySQL database is incorporated to securely cache parsed peptide identifications and spectral information extracted from uploaded data. MySQL integration facilitates data storage and retrieval when annotation requests are submitted to the server.
Example Data Sets-Cell pellets of Saccharomyces cerevisiae (strain BY4742) containing ~1 × 10^8 cells were harvested from liquid culture by centrifugation (3000 × g, 3 min, 4°C). The supernatant was removed, and the cell pellet was resuspended in 8 M urea, 100 mM tris (pH 8.0). Methanol was added to 90% by volume, and the mixture was vortexed to lyse the cells and induce protein precipitation. The resulting solution was centrifuged (14,000 × g, 3 min) to form a protein pellet. The supernatant was removed, and the pellet was resuspended in 8 M urea, 100 mM tris (pH 8.0), 10 mM tris(2-carboxyethyl)phosphine, and 40 mM chloroacetamide. The solution was then diluted to 1.5 M urea with 50 mM tris. Trypsin (Promega, Madison, WI) was added (1:50 enzyme/protein) and was allowed to digest overnight (22°C). The resultant peptides were acidified (pH < 2.0) using 0.1% trifluoroacetic acid (TFA) and were desalted using polymeric reverse phase Strata-X columns. Columns were equilibrated using one bed volume of 100% acetonitrile (ACN), then one bed volume of 0.1% TFA. Peptides were loaded onto the column and washed with two bed volumes of 0.1% TFA. Peptides were eluted by an addition of 500 µl 40% ACN, 0.1% TFA followed by an addition of 650 µl 70% ACN, 0.1% TFA and were then dried and resuspended in 0.2% formic acid. Peptide concentration was determined using a Pierce quantitative colorimetric peptide assay (Thermo Fisher Scientific, Rockford, IL).
Low pH reverse-phase liquid chromatography was conducted using a Dionex UltiMate 3000 UPLC as described previously (1, 2). Eluting peptides were analyzed using a Q Exactive HF hybrid quadrupole Orbitrap mass spectrometer (Thermo Fisher Scientific, Bremen, Germany) and were fragmented by HCD at 25% normalized collisional energy. Survey scans were taken at a resolution of 60,000 at 200 m/z, whereas tandem mass spectra were collected using a resolution of 15,000 at 200 m/z. The resulting tandem mass spectra were searched using the Coon OMSSA Proteomic Analysis Software Suite (v1.4.1) (32,33). A precursor mass tolerance of ±150 ppm was used, whereas fragment ions were searched using a mass tolerance of ±0.01 Da. A maximum of 3 missed tryptic cleavages were permitted. Carbamidomethylation of cysteine was set as a fixed modification, whereas oxidation of methionine was set as a variable modification. Data were searched against a canonical and isoform Saccharomyces cerevisiae database (UniProt, June 10, 2016) concatenated with the reverse protein sequence for decoy generation. A 1% FDR threshold was used at the peptide level, using both e-value and precursor mass accuracy to filter results.
Additional peptide identifications and spectral data were acquired from the previous work of Riley et al. to demonstrate IPSA's ability to process PSMs fragmented using alternative dissociation techniques. These include ETD; collisionally supplemented ETD (ETcaD and EThcD); AI-ETD; AI-NETD; and AI-ETD with supplemental infrared photon irradiation post-reaction (AI-ETD+) (34,35).

RESULTS
Design of IPSA-IPSA was developed as a versatile web-based spectral analysis tool capable of individual or en masse annotation of PSMs generated from experiments that produce any of the six common peptide fragment ion types (Fig. 1). Single spectra can be annotated by entering peptide and spectral data into an intuitive web form, whereas multiple spectra can be uploaded directly to the website to be individually queried or batch processed. Single annotations are conducted using the metrics provided by the user through the web form and are returned client-side to generate an exportable, annotated spectrum. Exported spectra can easily be shared or integrated into figures. Because the individual interrogation of large numbers of PSMs can quickly become tedious, we added functionality to batch process all uploaded PSMs and export the annotations in a tabular format. This feature permits the rapid characterization of tens of thousands of tandem mass spectra.

¹ The abbreviations used are: MS/MS, tandem mass spectrometry; CSV, comma-separated value; MGF, mascot generic format; IPSA, interactive peptide spectral annotator; PSM, peptide spectral match; m/z, mass to charge ratio; ETD, electron transfer dissociation; EThcD, electron transfer and higher-energy collision dissociation; ETciD, electron transfer and collision-induced dissociation; AI-ETD, activated ion electron transfer dissociation; AI-NETD, activated ion negative electron transfer dissociation; UVPD, ultraviolet photodissociation; SVG, scalable vector graphic; ACN, acetonitrile; TFA, trifluoroacetic acid; PPM, parts per million; PTM, post-translational modification; FDR, false discovery rate; JSON, JavaScript Object Notation.
Single Spectrum Annotation-A single peptide spectrum can be annotated by providing the peptide's sequence, precursor charge, maximum allowed fragment charge, and spectral data to the user interface shown in Fig. 2A. Expected fragmentation patterns and neutral losses can be selected to specify which theoretical peptide fragment ions are generated during data processing (36,37). The mass tolerance for matching experimental features to theoretical fragment ions can be set in either ppm or Daltons. A relative intensity, raw intensity, or S/N (if supplied with spectral data) cutoff can be defined to ignore low-abundance or insignificant features during matching. Visualization colors can additionally be customized.
A predefined list of common protein post-translational modifications (PTMs) can be queried and selected using a searchable dropdown below the fragmentation options. Available PTMs for a peptide are intelligently filtered to only show PTMs relevant to the entered peptide sequence. If a desired PTM is not included in the predefined modification list, new PTMs can be defined and are stored locally in the user's web browser. The user can provide a new modification name, target site, and mass shift to create a custom PTM option.
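The sequence-aware filtering of PTM options can be illustrated with a short Python sketch. This is not IPSA's implementation: the `Modification` record, the `N-term` site code, and the four example entries are assumptions made for illustration, although the mass shifts are the standard monoisotopic values for these modifications.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Modification:
    name: str
    site: str          # one-letter residue code, or "N-term"/"C-term" (hypothetical codes)
    mass_shift: float  # monoisotopic mass shift in Da


def relevant_modifications(sequence, modifications):
    """Keep only modifications whose target site occurs in the peptide."""
    residues = set(sequence.upper())
    return [
        m for m in modifications
        if m.site in ("N-term", "C-term") or m.site in residues
    ]


# Hypothetical predefined list; a real list would be much longer.
PTMS = [
    Modification("Oxidation", "M", 15.994915),
    Modification("Carbamidomethyl", "C", 57.021464),
    Modification("Phospho", "S", 79.966331),
    Modification("Acetyl", "N-term", 42.010565),
]

available = relevant_modifications("KAEMESDLNNAADLFAGLGVAEEHPR", PTMS)
print([m.name for m in available])
```

For the example peptide, which contains methionine and serine but no cysteine, the carbamidomethyl option is filtered out while terminal modifications remain available.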
When the server receives an annotation request, data entered into the user interface are validated and sent for processing. The peptide sequence is parsed and assembled into an intact peptide in silico. Theoretical peptide fragment ions are created from the intact peptide using the fragmentation schema selected by the user. Each fragment is matched to an m/z peak within the specified mass tolerance. When multiple theoretical fragments map to the same experimental feature, only the theoretical fragment that matches with the smallest mass error is reported. Once annotation mapping has been finalized, annotated spectral data are formatted into JSON and returned to the client for visualization.
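The matching procedure described above can be sketched in Python as follows. This is a simplified illustration under stated assumptions, not the server's PHP code: it considers only b- and y-type ions of an unmodified peptide, uses a ppm tolerance, and assigns each theoretical fragment to its closest peak. The function names `theoretical_fragments` and `annotate` and the example peak list are hypothetical.

```python
PROTON = 1.007276   # mass of a proton, Da
WATER = 18.010565   # monoisotopic mass of H2O, Da

# Monoisotopic residue masses (Da) for the 20 standard amino acids.
RESIDUE_MASS = {
    "G": 57.02146, "A": 71.03711, "S": 87.03203, "P": 97.05276,
    "V": 99.06841, "T": 101.04768, "C": 103.00919, "L": 113.08406,
    "I": 113.08406, "N": 114.04293, "D": 115.02694, "Q": 128.05858,
    "K": 128.09496, "E": 129.04259, "M": 131.04049, "H": 137.05891,
    "F": 147.06841, "R": 156.10111, "Y": 163.06333, "W": 186.07931,
}


def theoretical_fragments(sequence, max_charge=1):
    """Return (label, m/z) pairs for b- and y-type ions of an unmodified peptide."""
    masses = [RESIDUE_MASS[aa] for aa in sequence]
    fragments = []
    for i in range(1, len(masses)):
        # Neutral masses before protonation: residue sum (b), residue sum + water (y).
        neutral = {"b": sum(masses[:i]), "y": sum(masses[-i:]) + WATER}
        for ion_type, mass in neutral.items():
            for z in range(1, max_charge + 1):
                fragments.append((f"{ion_type}{i}{'+' * z}", (mass + z * PROTON) / z))
    return fragments


def annotate(peaks_mz, fragments, tolerance_ppm):
    """Map peak index -> (label, error in ppm), keeping the smallest error per peak."""
    best = {}
    if not peaks_mz:
        return best
    for label, theo_mz in fragments:
        # Closest experimental peak to this theoretical fragment.
        idx = min(range(len(peaks_mz)), key=lambda k: abs(peaks_mz[k] - theo_mz))
        error_ppm = (peaks_mz[idx] - theo_mz) / theo_mz * 1e6
        if abs(error_ppm) > tolerance_ppm:
            continue
        # If another fragment already claimed this peak, keep the better match.
        if idx not in best or abs(error_ppm) < abs(best[idx][1]):
            best[idx] = (label, error_ppm)
    return best


peaks = [148.0604, 227.1026, 500.0]  # made-up m/z values for illustration
annotations = annotate(peaks, theoretical_fragments("PEPTIDE"), tolerance_ppm=10.0)
for idx, (label, err) in sorted(annotations.items()):
    print(f"{peaks[idx]:.4f}  {label}  {err:+.2f} ppm")
```

A Dalton tolerance would follow the same pattern, with the absolute difference between observed and theoretical m/z compared against the threshold in place of the ppm error.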
Immediately upon this return, IPSA generates the interactive annotated spectrum (Fig. 2B). This visualization consists of three portions: a peptide sequence marked with detected fragment ion locations and summary statistics, an interactive annotated spectrum, and an interactive scatterplot of the matched fragment-ion mass errors. The visualization supports many interactive features to help facilitate data interpretation. Both axes allow contextual zooming for deeper investigation of congested sections of annotated spectra, whereas tooltips provide exact values for any highlighted plotted experimental features. Highlighted fragments are mirrored in each section of the visualization to emphasize all aspects of the feature of interest. Additionally, annotation labels can be dragged to clearer locations to declutter busy regions. The generated visualization can be exported as an SVG file for figure creation as it appears on screen or in a tabular format at any time.

FIG. 1. IPSA Software Flowchart. Single spectra can be annotated by entering peptide and spectral data into provided web forms. Files containing multiple peptide identifications and spectra can be uploaded either in PSI-supported or generic text-based formats to be individually annotated or to be bulk processed for ion statistic extraction. Theoretical peptides are assembled in silico, fragmented, and matched to the experimental spectrum. The annotated experimental spectrum is then returned and visualized client-side. This visualization can be exported as an SVG image for figure generation or as a CSV file containing ion statistics for the single spectrum. Alternatively, ion statistics for all uploaded peptide spectral matches can be calculated and exported through bulk processing, returning two files containing summary and detailed metrics for each uploaded PSM.

FIG. 2. Spectrum Annotation. The peptide KAEMESDLNNAADLFAGLGVAEEHPR and its spectral information provide an example PSM. A, Peptide information, fragment ion characteristics, tolerances, and chart colors can be easily set using the provided form. B, The generated interactive visualization after server processing containing the peptide sequence marked with the locations of matched fragment ions, an annotated mass spectrum, and visualization of mass error in either parts-per-million or daltons for all matched fragment ions.
Bulk Data Upload-If many spectra need to be rapidly interrogated, IPSA provides functionality to serially process multiple PSMs by directly uploading files containing peptide identifications and spectral data to the server. Identifications can be provided either in the Proteomics Standards Initiative file formats mzTab or mzIdentML, or in a generic CSV format (18,19,38). Each row in the generic CSV lists a scan number, peptide sequence, precursor charge, and all PTM names and locations for each peptide identification. We chose this architecture for its simplicity; peptide identifications produced from a wide variety of search algorithms can easily be converted into this format. Spectral data can be uploaded as a Mascot Generic Format (MGF) or mzML file (12,39). Finally, a modifications file can be uploaded to link peptide modification names to their respective masses. We provide a set of example files on IPSA's file upload page to demonstrate how each of these files should be structured. MGF and mzML files can easily be generated from vendor or open file formats using conversion tools such as MSConvert (40).
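As an illustration of how a generic identification file of this kind might be read, the Python sketch below parses a small CSV into records. The column names (`scan`, `sequence`, `charge`, `modifications`), the `Name:position` encoding of PTMs, and the example rows are all assumptions made for this sketch; the example files on IPSA's upload page define the layout the server actually expects.

```python
import csv
import io
from dataclasses import dataclass


@dataclass
class Identification:
    scan: int
    sequence: str
    charge: int
    modifications: list  # (name, 1-based residue position) pairs


def parse_identifications(handle):
    """Read identifications from a CSV with the hypothetical columns used here."""
    identifications = []
    for row in csv.DictReader(handle):
        mods = []
        # Hypothetical encoding: "Name:position" entries separated by ";".
        for entry in filter(None, row["modifications"].split(";")):
            name, position = entry.rsplit(":", 1)
            mods.append((name.strip(), int(position)))
        identifications.append(
            Identification(
                scan=int(row["scan"]),
                sequence=row["sequence"].strip(),
                charge=int(row["charge"]),
                modifications=mods,
            )
        )
    return identifications


# Made-up rows for illustration only.
EXAMPLE = """scan,sequence,charge,modifications
1024,KAEMESDLNNAADLFAGLGVAEEHPR,3,Oxidation:4
2048,PEPTIDEK,2,
"""

ids = parse_identifications(io.StringIO(EXAMPLE))
for ident in ids:
    print(ident)
```

Keeping one identification per row means that output from most search engines can be converted with a simple column mapping, which is the simplicity argument made above.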
Data parsed from bulk identification and spectral data uploads are stored securely server-side in a MySQL database. On data upload, a unique identifier is assigned to the user's browser which is used to exclusively access the uploader's data. After data extraction, uploaded files are deleted to reduce server footprint. Only one data set can be stored at a time.
Negative Mode Annotation-Proteomic analyses are typically conducted using low-pH separations and positive-mode electrospray ionization to create peptide cations. This tendency leads to a systematic underrepresentation of acidic peptide species, which preferentially ionize as anions (41-43). High-pH separations using negative-mode ionization can be used to better study these acidic species, but the complexity of tandem mass spectra generated using traditional collision-based activation methods has precluded the widespread adoption of this mode. This spectral complexity arises in part from a multitude of neutral losses originating from precursor and fragment ions (44). Alternative fragmentation techniques such as UVPD or AI-NETD, producing a/x-, b/y-, c/z-type and a•/x-type product ions respectively, have recently demonstrated their utility in producing informative tandem mass spectra from peptide anions (35,43). However, many spectral annotators do not support these data types. IPSA is capable of annotating PSMs collected using negative-mode electrospray ionization. Fig. 3 demonstrates an IPSA-annotated spectrum of the triply deprotonated peptide LIPSDFILAAQSHNPIENK dissociated using AI-NETD (35).
Ion Statistics-Obtaining fragment ion statistics in an automated fashion for an entire mass spectrometry experiment is no trivial task. Fragment ion statistics can be greatly informative during method optimization and can be used to monitor MS/MS performance by providing information on what ion types (and in what amounts) are being generated. Additional informative metrics include the sequence coverage of all detected peptide fragments, fragment ion mass errors, and the percent of the total ion current (TIC) that can be explained by annotated fragment ions.
IPSA provides a unique utility among web-based spectral annotators to compute and export all detected fragment ions for an uploaded experiment in a tabular format. The server extracts the fragment ion series, mass tolerances, and any intensity threshold from the provided user interface and serially processes every uploaded peptide identification. The annotation results are continuously written to a set of two downloadable CSVs. The first file contains summary statistics for the matched fragment ions for each uploaded PSM. This file reports the number of matched fragment ions, unique peptide bonds broken, and the percent of the total ion current explained by matched fragment ions. The second file contains detailed information concerning every detected fragment ion for all uploaded identifications; more specifically, the raw intensity, theoretical m/z, experimental m/z, mass error, percent of base peak, and percent of the total ion current explained are reported.
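The per-PSM summary metrics named above can be computed as in the following Python sketch. It is an illustration under stated assumptions, not IPSA's export code: the `summarize` function, its input layout (a mapping from peak index to fragment label), and the example values are hypothetical. A bond is counted as broken at position i for an N-terminal fragment of length i, and at position n - j for a C-terminal fragment of length j in a peptide of length n.

```python
import re

N_TERMINAL_TYPES = set("abc")  # fragments containing the peptide N terminus


def summarize(sequence_length, intensities, matches):
    """Summarize one annotated spectrum.

    matches maps a peak index to a fragment label such as 'b2' or 'y5+'.
    """
    bonds = set()
    for label in matches.values():
        ion_type, number = re.match(r"([abcxyz])(\d+)", label).groups()
        number = int(number)
        # Convert the fragment length to a backbone bond position counted from the N terminus.
        bonds.add(number if ion_type in N_TERMINAL_TYPES else sequence_length - number)
    tic = sum(intensities)
    explained = sum(intensities[i] for i in matches)
    return {
        "matched_ions": len(matches),
        "bonds_broken": len(bonds),
        "percent_tic_explained": 100.0 * explained / tic if tic else 0.0,
    }


# Made-up intensities and matches for a 7-residue peptide.
summary = summarize(
    sequence_length=7,
    intensities=[100.0, 300.0, 50.0, 50.0],
    matches={0: "y5", 1: "b2", 2: "y1"},
)
print(summary)
```

In this example b2 and y5 report on the same backbone bond, so three matched ions correspond to only two unique bonds broken.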
A series of experiments were previously described by Riley et al. to examine the efficacy of ETD, ETcaD, EThcD, AI-ETD, and AI-ETD+ fragmentation on a liquid chromatography timescale (34). The authors found AI-ETD+ to be the optimal supplemental ETD fragmentation technique. Using the authors' reported peptide identifications and spectral data, we created a set of detailed comparisons similar to those made in the referenced manuscript using the ion statistics files directly exported from IPSA (Fig. 4). No further programming was required to extract these data or make this figure, and all data manipulation post-export was performed in a spreadsheet using basic arithmetical functions.
In summary, IPSA is capable of both cleanly annotating peptide spectra collected using a wide variety of dissociation techniques in both positive and negative mode and of exporting the generated annotated spectra in the editable SVG format. Additionally, IPSA allows the bulk analysis of detected fragment ions for any number of uploaded spectra, permitting in turn the deep interrogation of data without requiring programming experience.

DISCUSSION
Modern MS-based proteomics techniques are widely used to identify and characterize tens of thousands of peptides and proteins originating from a variety of biological samples. The annotation of the tandem mass spectra used to identify these species is an arduous task requiring extensive expertise. Our web-based and open-source peptide spectral annotator, IPSA, provides a resource for generating and investigating annotated spectra for peptide identifications to a wide research community. IPSA can generate customizable annotated peptide spectra using a clean and intuitive user interface, allowing researchers to export customizable, publication-ready annotated spectra as vector graphics to aid in figure creation. It can process MS/MS spectra from both anionic and cationic precursors, and it has built-in support to annotate fragment ions generated from a diverse assortment of dissociative techniques. Additionally, IPSA can extract fragment ion statistics from any number of peptide spectra and return results in a tabular format, giving researchers a deeper and more comprehensive view of their peptide analyses.
We chose to develop IPSA as an online platform to reach a wide audience of proteomics researchers: those with an Internet connection on a computer with a web browser. Web-based software also allowed us to use the flexibility of the well-established JavaScript visualization library D3.js while avoiding software compatibility issues and version control. Through IPSA, we aim to increase the approachability of spectral annotation to proteomics novices and experts alike.

FIG. 3. The annotated AI-NETD spectrum of the triply deprotonated peptide LIPSDFILAAQSHNPIENK generated using IPSA. The charge-reduced precursor was downscaled by a factor of 3. Unreacted precursor was cleaned from this spectrum.
The IPSA source code is freely available for inspection and download at https://github.com/coongroup/IPSA alongside additional guides regarding software usage. We recommend using an updated web browser to access IPSA at http://interactivepeptidespectralannotator.com as outdated browsers may not provide support for critical functions. IPSA can be easily installed on a private desktop or server using a prebuilt Docker image and instructions at https://hub.docker.com/r/dbrademan/ipsa, or IPSA's project files can be manually configured to operate on private web servers with full functionality. Additionally, the JavaScript file used to render the interactive visualization, IPSA.js, is configured to be used as an AngularJS directive. This directive can be attached to custom annotation scripts in many website environments, allowing the use of our software beyond that of the platform we described here.
Acknowledgments-We thank Kevin Schauer for providing feedback during IPSA's design.

DATA AVAILABILITY
Raw spectral data, peptide identifications, and protein databases have been deposited to the ProteomeXchange Consortium via the PRIDE (45) partner repository with the dataset identifier PXD011695.