Dichlorvos-induced formation of isopeptide crosslinks between proteins in SH-SY5Y cells

Chlorpyrifos oxon catalyzes the crosslinking of proteins via an isopeptide bond between lysine and glutamic acid or aspartic acid in studies with purified proteins. Our goal was to determine the crosslinking activity of the organophosphorus pesticide, dichlorvos. We developed a protocol for examining crosslinks in a complex protein mixture consisting of human SH-SY5Y cells exposed to 10 μM dichlorvos. The steps in our protocol included immunopurification of crosslinked peptides by binding to anti-isopeptide antibody 81D1C2, stringent washing of the immobilized complex, release of bound peptides from Protein G agarose with 50% acetonitrile 1% formic acid, liquid chromatography tandem mass spectrometry on an Orbitrap Fusion Lumos mass spectrometer, Protein Prospector searches of mass spectrometry data, and manual evaluation of candidate crosslinked dipeptides. We report a low quantity of dichlorvos-induced KD and KE crosslinked proteins in human SH-SY5Y cells exposed to dichlorvos. Cells not treated with dichlorvos had no detectable KD and KE crosslinked proteins. Proteins in the crosslink were low abundance proteins. In conclusion, we provide a protocol for testing complex protein mixtures for the presence of crosslinked proteins. Our protocol could be useful for testing the association between neurodegenerative disease and exposure to organophosphorus pesticides.


Introduction
Formation of γ-glutamyl-ε-lysine isopeptide bonds between the ε-amine of lysine and γ-glutamyl of glutamine is mediated by the transamidase activity of transglutaminase, a family of enzymes that includes fibrin stabilizing factor XIII [1].
The earliest method for detecting isopeptide crosslinks used extensive proteolysis followed by amino acid composition analysis [1,2] or high-pressure liquid chromatography [3] to identify the γ-glutamyl-ε-lysine product. More recently, isopeptide crosslinks were detected by anti-isopeptide antibodies in immunohistochemically stained brain sections and in Western blots. None of these methods was capable of identifying the peptides that gave rise to the reactive lysine and glutamine or the specific residues that were crosslinked. Using amino acid sequencing after reaction with a radiolabeled glutamine substrate and limited proteolysis provided a means for identifying the labeled protein/peptide. However, a variety of artifacts limits this technique to proteins that have already been shown to be substrates for transglutaminase [3]. Mass spectrometry offers a more robust means for studying isopeptide crosslinked peptides because it can identify the specific amino acids, peptides, and proteins involved. Nemes and coworkers demonstrated the power of mass spectrometry by identifying 2 proteins from brain cortex that were crosslinked to ubiquitin via lysine-glutamine isopeptide bonds and an intramolecular crosslink between Gln99 and Lys58 of α-synuclein [4]. The amount of isopeptide crosslinked peptides was so low that to detect them the preparations had to be highly enriched, by immunopurification with anti-isopeptide antibodies [4].
Three cell lysates were prepared for the plus dichlorvos cells and two for the minus dichlorvos cells. Each lysate was digested with trypsin, immunopurified with anti-isopeptide antibody 81D1C2, and subjected to liquid chromatography tandem mass spectrometry.

Trypsin digestion
Cell lysate supernatant containing 200 μg protein (15 μL) was diluted with 185 μL of 20 mM ammonium bicarbonate pH 8. Proteins were denatured in a boiling water bath for 3 min. The denatured proteins were digested with 4 μg trypsin (8 μL) at 37 °C for 16 h without reduction and alkylation. Trypsin was inactivated by heating the digest in a boiling water bath for 3 min.

Immunopurification of tryptic peptides linked through an isopeptide bond
The heat-treated digest was incubated with 8 μg (8 μL) of anti-isopeptide monoclonal antibody 81D1C2 at room temperature for 8 h to capture isopeptide crosslinked peptides. Antibody-peptide complexes were immobilized by adding 0.1 mL of a 1:1 suspension of Protein G agarose beads in PBS. The sample was rotated overnight at room temperature.
The beads and liquid were transferred to a 0.45 μm Durapore spin filter (Millipore UFC30HV00). Use of the spin filter maximized recovery because beads were not lost in the wash steps. Beads were washed 5 times with 0.4 mL of RIPA buffer (25 mM Tris-HCl pH 7.6, 1% NP-40, 1% sodium deoxycholate, 0.1% SDS, 140 mM NaCl), followed by 5 washes with water to remove salts and detergents. The flow through in each wash step was discarded.
The basket of washed beads was transferred to a new microfuge tube. Bound peptides were released from the washed beads by incubating the basket of beads with 0.1 mL of 50% acetonitrile, 1% formic acid for 0.5-1 h at room temperature. The released peptides were collected in the flow through by brief centrifugation. The extraction step was repeated twice. The combined flow through was dried by vacuum centrifugation.

Sample preparation for mass spectrometry
The dry sample was dissolved in 20 μL water. The sample was centrifuged for 30 min at 14,000×g and 4 °C. The top 10 μL were transferred to an autosampler vial.

Mass spectral data acquisition
Peptide separation was performed with a Thermo RSLC Ultimate 3000 ultra-high pressure liquid chromatography system (Thermo Scientific) at 36 °C. Solvent A was 0.1% formic acid in water, and solvent B was 0.1% formic acid in 80% acetonitrile. Peptides were loaded onto an Acclaim PepMap 100 C18 trap column (75 μm × 2 cm; Thermo Scientific cat# 165535) at a flow rate of 4 μL/min and washed with 98% solvent A/2% solvent B for 10 min. Then, they were transferred to a Thermo Easy-Spray PepMap RSLC C18 column (75 μm × 50 cm with 2 μm particles, Thermo Scientific cat# ES803) and separated at a flow rate of 300 nL/min using a gradient of 9-50% solvent B in 30 min, 50-99% solvent B in 40 min, hold at 99% solvent B for 10 min, 99-9% solvent B in 4 min, and hold at 9% solvent B for 16 min.
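The analytical gradient can be written out as a timetable to check the total run time and the solvent composition at any point. The following is a minimal sketch, assuming that the five segments run back to back after the trap-column wash and that each ramp is linear; the function and variable names are illustrative and are not part of the instrument method.

```python
# Gradient segments as (start %B, end %B, duration in min), taken from the text.
# Assumption: segments run consecutively and each ramp is linear.
SEGMENTS = [
    (9, 50, 30),   # 9-50% B in 30 min
    (50, 99, 40),  # 50-99% B in 40 min
    (99, 99, 10),  # hold at 99% B for 10 min
    (99, 9, 4),    # 99-9% B in 4 min
    (9, 9, 16),    # hold at 9% B for 16 min
]


def total_minutes(segments):
    """Total length of the gradient program in minutes."""
    return sum(duration for _, _, duration in segments)


def percent_b(t, segments):
    """Return %B at time t (min) by linear interpolation within a segment."""
    if t < 0:
        raise ValueError("time must be non-negative")
    elapsed = 0.0
    for start, end, duration in segments:
        if t <= elapsed + duration:
            fraction = (t - elapsed) / duration
            return start + fraction * (end - start)
        elapsed += duration
    raise ValueError("time is beyond the end of the gradient")
```

Under these assumptions the program sums to 100 min per injection, not counting the 10 min trap wash.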
Eluted peptides were sprayed directly into a Thermo Orbitrap Fusion Lumos Tribrid mass spectrometer (Thermo Scientific). Data were collected using data dependent acquisition. A survey full scan MS (from m/z 350-1800) was acquired in the Orbitrap with a resolution of 120,000. The AGC target (Automatic Gain Control for setting the ion population in the Orbitrap before collecting the MS) was set at 4 × 10⁵ and the ion filling time was set at 50 ms. The 25 most intense ions with charge state of 2-6 were isolated in a 3 s cycle and fragmented using high-energy collision induced dissociation with 35% normalized collision energy. Fragment ions were detected in the Orbitrap with a mass resolution of 30,000 at m/z 200. The AGC target for MS/MS was set at 5 × 10⁴, and dynamic exclusion was set at 30 s with a 10 ppm mass window. Data were reported in *.raw format.
The *.raw data files were converted to *.mgf files using MSConvert (ProteoWizard Tools from SourceForge).

Batch Tag Web search for crosslinked peptide candidates
The *.mgf files were subjected to a database search using the Batch Tag Web algorithm in Protein Prospector version 6.2.1. Searches were performed on the Protein Prospector website https://prospector.ucsf.edu (14 May 2022). Database search parameters included: database: SwissProt 2021.0618; species: Homo sapiens; enzyme: trypsin; missed cleavages: 3; expect calc method: none; protein N-term: unchecked; protein C-term: unchecked; uncleaved: checked; parent mass tolerance: 20 ppm; fragment mass tolerance: 30 ppm; precursor charge state: 2, 3, 4, 5; parent ion conversion: monoisotopic; modification defect: 0.0048 Da; instrument: ESI Q high res; link search type: user defined link; link aa: E, D, protein C-term → K, protein N-term; mod comp ion: K, D, and E; mod range: −18 to 3883 Da; bridge comp: H-2 O-1; mod uncleaved: checked; msms mass peaks: 80; msms max modifications: 2; variable modification: oxidation of methionine; fixed modification: none. This database search created a list of peptides that Protein Prospector considered to be crosslinked. The list of potentially crosslinked peptides, along with parameters indicating the level of confidence in the assignment, was displayed in Protein Prospector/Search Compare.
Note that using smaller mass tolerances for parent mass and fragment mass did not improve the detection of isopeptide crosslinked peptides.
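The bridge composition in the search parameters (loss of two hydrogens and one oxygen) means that a zero-length isopeptide crosslink weighs one water less than the two free peptides combined. The sketch below illustrates that arithmetic together with a parent mass tolerance check in ppm. It is not Protein Prospector code: the function names are hypothetical, and the residue masses are the standard monoisotopic values for unmodified amino acids.

```python
# Standard monoisotopic residue masses (Da) for the 20 common amino acids.
RESIDUE = {
    "G": 57.02146, "A": 71.03711, "S": 87.03203, "P": 97.05276,
    "V": 99.06841, "T": 101.04768, "C": 103.00919, "L": 113.08406,
    "I": 113.08406, "N": 114.04293, "D": 115.02694, "Q": 128.05858,
    "K": 128.09496, "E": 129.04259, "M": 131.04049, "H": 137.05891,
    "F": 147.06841, "R": 156.10111, "Y": 163.06333, "W": 186.07931,
}
WATER = 18.010565   # monoisotopic mass of H2O
PROTON = 1.007276   # mass of a proton


def peptide_mass(seq):
    """Neutral monoisotopic mass of an unmodified linear peptide."""
    return sum(RESIDUE[aa] for aa in seq) + WATER


def crosslink_mass(seq1, seq2):
    """Neutral mass of two peptides joined by one zero-length isopeptide bond.

    Forming the bond releases one water (bridge composition H-2 O-1).
    """
    return peptide_mass(seq1) + peptide_mass(seq2) - WATER


def mz(neutral_mass, charge):
    """m/z of the protonated ion at the given positive charge state."""
    return (neutral_mass + charge * PROTON) / charge


def within_ppm(observed, theoretical, tol_ppm=20.0):
    """True if the observed value lies within tol_ppm of the theoretical one."""
    return abs(observed - theoretical) / theoretical * 1e6 <= tol_ppm
```

For example, `mz(crosslink_mass("AVNKVKDTPGLGKVK", "KGVDIEISHR"), 4)` would give the theoretical m/z of the 4+ precursor for that peptide pair, which could then be compared with an observed precursor using `within_ppm`.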

Search Compare screening of crosslinked peptide candidates
To reduce the number of crosslink peptide candidates and aid in the identification of crosslinked peptides, the Search Compare list was screened using the Protein Prospector output parameters. Parameters indicating a crosslinked peptide were taken to be: charge state 3, 4, 5; Score >25; score difference >1; % matched intensity >45%; and at least 5 amino acids in each peptide. Choice of these parameters is empirical and was based on experience.
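The screening thresholds above can be expressed as a simple filter. This is a minimal sketch, assuming the Search Compare results have already been exported into records; the `Candidate` class and its field names are hypothetical and do not correspond to Protein Prospector column names.

```python
from dataclasses import dataclass


@dataclass
class Candidate:
    """One crosslink candidate row (hypothetical field names)."""
    peptide1: str        # sequence of the first peptide
    peptide2: str        # sequence of the second peptide
    charge: int          # precursor charge state
    score: float         # Protein Prospector score
    score_diff: float    # score difference
    pct_matched: float   # % matched intensity


def passes_screen(c):
    """Apply the empirical screening thresholds given in the text."""
    return (
        c.charge in (3, 4, 5)
        and c.score > 25
        and c.score_diff > 1
        and c.pct_matched > 45
        and len(c.peptide1) >= 5
        and len(c.peptide2) >= 5
    )


def screen(candidates):
    """Keep only the candidates that go forward to manual evaluation."""
    return [c for c in candidates if passes_screen(c)]
```

Candidates that pass this filter are not yet confirmed crosslinks; they are only the shortlist for manual evaluation.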

Manual evaluation of crosslinked peptide candidates
Ultimately, crosslinked peptides were confirmed by manual evaluation. Discussion of the manual evaluation that we employed requires an understanding of the following terms:

1.
A crosslink candidate is a pair of crosslinked peptides selected by Protein Prospector from a database search.

2.
A crosslink specific mass is a mass in the MS/MS spectrum from a crosslink candidate that includes residues from both peptides.

3.
A crosslink specific amino acid is an interval in the MS/MS spectrum from a crosslink candidate that is defined by two crosslink specific masses and corresponds to an amino acid that is part of the crosslink candidate sequence.

4.
A ladder sequence is a term used by Protein Prospector to describe neutral loss of amino acids from the N-terminus of the parent ion. Ladder sequencing losses can occur from both peptides in the same MS/MS spectrum.

5.
A peeling sequence is a term used by Protein Prospector to describe the neutral loss of an amino acid from the C-terminus of the parent ion. This is otherwise referred to as a [b(n-1) + 18] fragment [11]. Any C-terminal residue can be lost, provided that a basic residue such as arginine, lysine or histidine is present in the sequence [12].

6.
Mixed fragmentation refers to a series of crosslink specific masses that correspond to sequential removal of amino acids from both peptides in the crosslink candidate.

7.
Peptide rearrangement consists of the transfer of amino acid(s) from one terminus of a peptide to the other. Rearrangement sometimes was helpful in mixed fragmentation scenarios. Two mechanisms have been described to explain peptide rearrangements. One is protease induced cyclization and ring re-opening during proteolysis. The rearrangement is believed to require a missed endoproteinase cleavage site (i.e., an arginine or lysine for trypsinolysis) within 2 residues of either terminus [13]. The other mechanism for peptide rearrangement occurs in the mass spectrometer. Rearrangement occurs from linear b-ions by cyclization and subsequent ring opening. The cyclic form can open at various amide bonds, effectively shifting residues from one terminus to the other [14].

Manual evaluation criteria that support the presence of crosslinked peptides
For a crosslink candidate to be accepted as a crosslinked peptide there must be amino acid sequence support for both peptides and there must be at least one crosslink specific amino acid, defined by two crosslink specific ions. Sequence support consists of the following features.

1.
A series of non-crosslink specific masses in the MS/MS spectrum must correspond to an amino acid sequence from one or the other peptide in a crosslink candidate. Suitable sequences include an N-terminal sequence, a C-terminal sequence, or an internal fragment. Sequences must be at least 3 amino acids long (for example green AVNKV and blue KGV in Fig. 2 panel A and green REDLLIN in Fig. 3 upper panel).

2.
At least one crosslink specific amino acid is essential. A series of crosslink specific amino acids is frequently encountered.

2a)
Sometimes the entire series consists of crosslink specific amino acids from one peptide. This is taken as strong evidence for a crosslinked peptide (for example blue KSEA in Fig. 3 lower panel).

2b)
Sometimes a series of crosslink specific amino acids will be appended by a single amino acid that is not part of the crosslink candidate. Such a sequence is still accepted as support for a crosslinked peptide (for example an unassigned V residue is appended to blue KV in Fig. 7 upper panel).

2c)
Sometimes a series of crosslink specific amino acids will be appended by more than one amino acid that is not part of the crosslink candidate. Such a sequence is not accepted (for example unassigned KV is appended to blue V in Fig. 6 upper panel).

2d)
Sometimes the entire series consists of a mixture of crosslink specific amino acids from both peptides, or of a mixture of amino acids from both the N- and C-termini of one peptide. We refer to this as mixed fragmentation. This is taken as support for a crosslinked peptide (see Fig. 2, panels B and C, Fig. 6 lower panel, Fig. 7 lower panel for examples). Occasionally, mixed fragmentation can require rearrangement of a peptide sequence by shifting residues from the N-terminus to the C-terminus, or vice versa (see Fig. 2, panel B for an example).

2e)
Sometimes a sequence of amino acids can be identified that is unrelated to the crosslink candidate even though it may contain a few crosslink specific amino acids. Such a sequence is rejected as support for the crosslink (for example QFLELY in Fig. 5 lower panel).

3.
Neutral loss of amino acids from the parent ion. Neutral losses can be N-terminal amino acids (ladder sequence) from one peptide (for example blue VGK in Fig.  2 panel A, blue EALH in Fig. 3 lower panel, green VGK in Fig. 7 upper panel), C-terminal amino acids (peeling sequence) from one peptide, a combination of N-terminal and C-terminal amino acids from one peptide, or a mixture of N-terminal and C-terminal amino acids from both peptides. By definition, the amino acids that are neutral losses from the parent ion contain residues from both peptides and are therefore crosslink specific amino acids.
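The acceptance criteria above can be summarized as a small decision rule. The sketch below is a deliberately simplified encoding, not the procedure itself: the actual evaluation was manual and involved judgment (for example, on mixed fragmentation and rearrangement) that is not modeled here. The `Series` class, its fields, and the labels "A" and "B" for the two peptides are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Series:
    """One amino acid series read from an MS/MS spectrum (hypothetical model)."""
    peptide: str                  # "A" or "B": the peptide the series supports
    length: int                   # number of amino acids in the series
    crosslink_specific: bool      # True if defined by crosslink specific masses
    unassigned_appended: int = 0  # appended residues not in the candidate


def series_is_support(s):
    """Decide whether a single series counts as sequence support."""
    if s.unassigned_appended > 1:
        return False              # criterion 2c: more than one stray residue
    if s.crosslink_specific:
        return s.length >= 1      # criterion 2: one crosslink specific residue suffices
    return s.length >= 3          # criterion 1: at least 3 amino acids


def accept(series_list):
    """Accept a candidate only with support for both peptides plus at
    least one crosslink specific amino acid."""
    usable = [s for s in series_list if series_is_support(s)]
    both_supported = {"A", "B"} <= {s.peptide for s in usable}
    has_crosslink_specific = any(s.crosslink_specific for s in usable)
    return both_supported and has_crosslink_specific
```

In this encoding, a candidate whose only evidence for one peptide is a crosslink specific residue embedded in an unrelated sequence (criterion 2e) is rejected, because that series is discarded before support is counted.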

Isopeptide crosslinked peptides
Three separate preparations of dichlorvos-treated SH-SY5Y cells were analyzed for isopeptide crosslinks between lysine and glutamate or lysine and aspartate. Forty isopeptide crosslinks were confirmed by manual evaluation. The results are given in Table 1. Two separate preparations of SH-SY5Y cells were made without exposure to dichlorvos. No isopeptide crosslinks between lysine and glutamate or lysine and aspartate were detected in cells not treated with dichlorvos.
All data were obtained from the soluble fraction of the cell lysate; therefore, proteins in the pellet were not evaluated. Although the pellet could contain aggregated proteins that might be of interest, the complexity of the pellet made working with it unappealing. In our experience, aggregated proteins are soluble and remain in the cell lysate supernatant.

Peptide abundance
Only two crosslinked peptides appeared more than once. The crosslinked peptide AVNKVKDTPGLGK450VK/KGVDIE4391ISHR from kinesin-like protein KIF26B and hemicentin-1 appeared 5 times: twice in the first repetition (charge states 3 and 4), once in the second repetition, and twice in the third repetition (charge states 4 and 5). See Table 1.
The crosslinked peptide VTKKE97TLKAQK/TVGAAQLK2180PTLNQLKQTQK from oxysterol-binding protein-related protein 5 and serine/threonine-protein kinase WNK2 appeared twice, once in the first repetition and once in the second repetition.
We attribute this low degree of reproducibility to crosslinking between low abundance proteins. To test this assumption, we compared the crosslinked proteins to two databases where the cellular concentration of the proteins was determined, see Supplementary Material Table S2.
Beck and coworkers determined the protein abundance for 7311 proteins from the human osteosarcoma cell line U2OS [15]. Protein abundance ranged from <5 × 10² to 6.53 × 10⁶ copies per cell. Abundance determination was based on 144 heavy isotope labeled reference peptides. The abundance of individual proteins in our crosslinked pairs was similar, ranging from <5 × 10² to 4.42 × 10⁶ (see Supplementary Material Table S2). However, one member of each crosslinked pair was always from a low abundance protein (abundance less than 8 × 10³ or 0.1% of the maximum).
A total of 11,732 proteins were detected. Proteins were reported as iBAQ values (intensity based absolute quantitation). iBAQ values are obtained by summing the peak intensities for all peptides matching to a specific protein and dividing by the number of theoretically observable peptides in the sample. iBAQ values are reported as log10 [17]. Protein abundance in the database ranged from 2.41 to 9.13. The abundance of individual proteins for our crosslinked pairs spanned a similar range, 2.42 to 8.78 (see Supplementary Material  Table S2). However, one member of each crosslinked pair was always from a low abundance protein (abundance of less than 5 or 0.01% of the maximum).
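The "percent of the maximum" figures quoted for the two abundance databases are calculated differently, because one scale is linear (copies per cell) and the other is logarithmic (log10 iBAQ). The following sketch shows both conversions; the function names are illustrative.

```python
def fraction_of_max_linear(value, maximum):
    """Fraction of the maximum on a linear scale (e.g. copies per cell)."""
    return value / maximum


def fraction_of_max_log10(log_value, log_maximum):
    """Fraction of the maximum when both values are reported as log10."""
    return 10 ** (log_value - log_maximum)


# Thresholds and maxima quoted in the text.
copies_pct = 100 * fraction_of_max_linear(8e3, 6.53e6)  # copies per cell
ibaq_pct = 100 * fraction_of_max_log10(5, 9.13)         # log10 iBAQ values
```

Here `copies_pct` comes to roughly 0.12% and `ibaq_pct` to roughly 0.007%, which agree with the rounded values of 0.1% and 0.01% given above.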
The fact that most of the crosslinked peptides are from low abundance proteins illustrates one difficulty in obtaining isopeptide crosslinking data: low abundance proteins usually give low signals in the mass spectrometer. This also makes the frequency at which a crosslink occurs difficult to determine. Normally, we would use the number of times a given spectrum appears in the data (spectral count) as a measure of frequency, but most of the crosslinks in these data appear only once.
The first scenario is illustrated in Fig. 2. Either of these mixed fragmentation scenarios supports the same isopeptide crosslinked peptide.

Examples of MS/MS spectra for isopeptide crosslinked peptides
Assignment of the crosslink to K450-E4391 is based on the following logic. The crosslinks we are investigating are limited to KE and KD. The crosslinked peptides AVNKVKDTPGLGK450VK (green peptide)/KGVDIE4391ISHR (blue peptide) together contain one E, two D, and five K residues. K4386 in the smaller, blue peptide is a b1 ion and is not a crosslink specific ion. Because there are no other K residues in the blue peptide, K in the isopeptide crosslink cannot come from the blue peptide. Four K residues in the green peptide are potential partners. K441 and K443 are released from the green peptide in panel B to make crosslinked fragments at 1138.64 and 1025.06. Since the peptides remain crosslinked when K441 and K443 are gone, K441 and K443 cannot be crosslink partners. K452 is at the C-terminus, which was the cleavage site for trypsin. As a general rule, trypsin does not cleave at modified lysine residues; therefore, K452 cannot be a crosslink partner. This leaves K450 as the only possible lysine crosslink partner in the green peptide.
There is only one crosslinked feature in the spectrum, the green PDY fragment. The 6-amino acid, +2 sequence (QFLELY) is not consistent with the crosslink candidate. Unassigned amino acids appended to the blue E in peptide QFLELY disqualify E as support for the blue peptide. This removes support for the blue peptide. Without support for both peptides, the candidate crosslink in Fig. 5 is rejected. Details of the fragmentation pattern are described in the figure legend.
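The process of elimination used to assign the lysine partner to K450 in the green peptide AVNKVKDTPGLGK450VK can be sketched as a short filter over lysine positions. The code is only an illustration of that reasoning: the function names are hypothetical, and the numbering of the first residue as 438 is inferred here from the K450 label rather than stated in the text.

```python
def lysine_positions(seq, first_pos):
    """Protein-level positions of every lysine in a peptide."""
    return [first_pos + i for i, aa in enumerate(seq) if aa == "K"]


def remaining_partners(seq, first_pos, released, tryptic_c_term=True):
    """Lysines that can still be the crosslink partner after elimination.

    released: positions lost in fragments that remain crosslinked.
    tryptic_c_term: if True, exclude a C-terminal lysine, because trypsin
    generally does not cleave after a modified lysine.
    """
    last_pos = first_pos + len(seq) - 1
    survivors = []
    for pos in lysine_positions(seq, first_pos):
        if pos in released:
            continue                  # crosslink survives without this residue
        if tryptic_c_term and pos == last_pos:
            continue                  # lysine at the trypsin cleavage site
        survivors.append(pos)
    return survivors


GREEN = "AVNKVKDTPGLGKVK"   # first residue numbered 438 (inferred, see above)
partners = remaining_partners(GREEN, 438, released={441, 443})
```

The single lysine of the blue peptide (K4386) is excluded separately, because it is observed as an unmodified b1 ion.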

Isopeptide crosslinks induced by other organophosphates
In previous studies we have used the organophosphate pesticide chlorpyrifos oxon to induce isopeptide crosslinks in the pure proteins butyrylcholinesterase, casein, serum albumin and tubulin [5,6]. Schmidt et al. found that the organophosphorus nerve agent VX induced crosslinks in ubiquitin [7]. The results from the current work demonstrate that the organophosphate pesticide dichlorvos can also induce isopeptide crosslinks. From these observations it is tempting to suggest that any organophosphylate may be capable of inducing isopeptide crosslinks in a variety of proteins.

Mechanism for the reaction of organophosphates with lysine
In a previous study [6] we showed that the organophosphate chlorpyrifos oxon reacted with lysine residues in bovine casein, human serum albumin, mouse serum albumin, human butyrylcholinesterase, and porcine tubulin to form diethylphospho-adducts. Only selected lysine residues were labeled, suggesting that reactive lysines were activated. It was proposed that activation involved through-space, charge-charge interactions with nearby negatively charged residues. Consistent with this proposal, half of the reactive lysine residues were within two residues of an acidic residue in the linear sequence. Support for subsequent OP-induced isopeptide bond formation was the observation that 77% of the crosslinks detected involved a lysine that had been labeled with diethylphosphate.

Organophosphate induced isopeptide crosslinks and disease
Epidemiological evidence suggests that there is an increased incidence of Alzheimer's disease [18,19] and Parkinson's disease [20] in agricultural workers who are exposed to organophosphorus pesticides. A hallmark for both diseases is accumulation of aggregated protein. Previously, we demonstrated that chlorpyrifos oxon, the activated form of the pesticide chlorpyrifos, can crosslink proteins [6]. Of the proteins we have studied in vitro, tubulin is the most sensitive to reaction with chlorpyrifos oxon. Reaction leads to formation of aggregates [9]. Treatment of mice with nonlethal levels of chlorpyrifos resulted in disruption of microtubule structures in the brain [21]. In light of the critical role that microtubules play, disruption of their structure would be expected to presage neurological problems. More recently, we have shown that chlorpyrifos oxon can induce isopeptide dimerization of amyloid beta (1-42) [22]. The covalent amyloid beta dimer is considered to be the toxic form of amyloid beta that leads to protein aggregation and disease. It follows that exposure to organophosphorus pesticides could contribute to the progression of diseases associated with aggregation. At this point, we have no data on how common protein crosslinking is upon exposure to organophosphate levels encountered in the environment. However, because development of Alzheimer's and Parkinson's disease appears to require years, slow accumulation of organophosphorus-induced aggregates could be a causative factor.

Comparison of isopeptide crosslinks created by transglutaminase with those induced by organophosphates
Transglutaminase is well known for creating γ-glutamyl-ε-lysine isopeptide crosslinks between the γ-carboxamide group of glutamine on one protein and the ε-amino group of lysine on another. In 2007 Kang et al. reported that isopeptide bonds could form spontaneously between the side chains of lysine and asparagine on Spy0128 in the polymeric shaft of pili expressed by S. pyogenes [8,23]. More recently, it has been reported that organophosphylates can promote spontaneous isopeptide formation between the side chains of lysine and glutamate or lysine and aspartate [6,7]. The consequence of isopeptide bond formation by any of these methods is the creation of protein dimers and higher oligomers. Dimers and molecular aggregates created by transglutaminase are associated with neurodegenerative diseases [24,25].

The process of identifying isopeptide crosslinks
Identifying isopeptide crosslinks in mass spectral data is technically challenging and somewhat subjective. In the Methods section we outlined a procedure for selecting isopeptide crosslinked peptides from raw mass spectral data. The process begins with a database search of the mass spectral data against a SwissProt database using the Protein Prospector/Batch Tag

Conclusion
A protocol is presented for identifying zero-length isopeptide crosslinks in a complex protein mixture. Enrichment of tryptic peptides by binding to anti-isopeptide antibody is a key first step. Stringent washing of the immunopurified complex with a detergent-containing buffer minimizes the number of false positives. Searches of mass spectrometry data with Protein Prospector software provide a list of candidate dipeptide crosslinks. Manual evaluation of candidate crosslinked peptides is laborious but critical.
Our protocol identified isopeptide crosslinked proteins in cultured cells that had been exposed to 10 μM dichlorvos. Isopeptide crosslinks induced by organophosphate pesticides are distinct from crosslinks induced by transglutaminase. The chemically induced crosslinks are between lysine and glutamic acid or lysine and aspartic acid with release of a molecule of water. The transglutaminase induced crosslinks are between lysine and glutamine with release of a molecule of ammonia.
Protein aggregates in neurodegenerative diseases are thought to be produced in part by the action of transglutaminase. Our results suggest that exposure to organophosphorus pesticides may also be implicated.
Mechanism for organophosphate-induced isopeptide bond formation. The structure on the left shows organophosphate covalently attached to the epsilon amino group of lysine. Attack of the lysine amine on the Glu/Asp carbonyl carbon is catalyzed by a vicinal acidic group. The middle structure shows an intermediate between the organophosphate-modified lysine and the side chain of glutamic or aspartic acid. In the last step, the organophosphate is released and a covalent isopeptide bond forms between lysine and glutamic or aspartic acid. A nearby acidic residue stabilizes the crosslink. This mechanism is analogous to that for spontaneous formation of isopeptide bonds proposed by Kang et al. [8]. Mass spectrometry distinguishes chemically induced crosslinks from transglutaminase catalyzed crosslinks by identifying the crosslinked residues and the peptides in the crosslink. Reproduced by permission [9].
This emphasizes a 4-amino acid, +3, ladder sequence (HLAE) from the blue peptide, and a 4-amino acid, +2 y-ion crosslink specific sequence (AESK) from the blue peptide. Unlabeled, red masses mostly represent loss of water, amine, or CO.
b % match is the matched intensity of the assigned peaks compared to the observed peaks. Confidence in the assignment is associated with high % matched intensity.
c Crosslinked residues are suffixed by a subscripted number identifying the crosslinked residue.