Dataset on the comparative proteomic profiling of mouse saliva and serum from wild type versus the dystrophic mdx-4cv mouse model of dystrophinopathy

The comparative proteomic data presented in this article provide supporting information to the related research article "Proteomic identification of elevated saliva kallikrein levels in the mdx-4cv mouse model of Duchenne muscular dystrophy" (Murphy et al., 2018). Here we provide additional datasets on the comparative proteomic analysis of saliva and serum proteins and the mass spectrometric identification of kallikrein isoform Klk-1 in wild type versus mdx-4cv saliva specimens. The data article presents the systematic identification of the assessable saliva proteome and the differential presence of proteins in saliva versus serum samples. Representative mass spectrometric scans of unique peptides that were employed to identify the kallikrein isoform Klk-1 in wild type versus mdx-4cv saliva specimens are provided. The dataset contains typical saliva-associated marker proteins, including alpha-amylase and albumin, as well as distinct isoforms of cystatin, serpin, kallikrein, cathepsin, glutathione transferase, carbonic anhydrase, mucin, pyruvate kinase, and aldolase.


Specifications table
Value of the data
The proteomic data presented here provide an overview of biofluid changes in the mdx-4cv mouse model of X-linked muscular dystrophy.
These data provide comparative listings of proteins in saliva versus serum specimens, as well as their mass spectrometric identification.
The mass spectrometric data can serve as a pathobiochemical biofluid signature of the dystrophin-deficient mdx-4cv mouse.

Data
The data presented relate to the systematic survey of whole saliva using mass spectrometry-based proteomics of the mdx-4cv mouse model of Duchenne muscular dystrophy [1]. This accompanying article lists the proteomic identification of the total saliva protein population and the differential presence of protein species in saliva versus serum samples, as well as representative MS/MS scans of unique peptides that were used to identify the kallikrein isoform Klk-1 in wild type versus mdx-4cv saliva specimens. Table 1 lists the mass spectrometric profiling of the mouse saliva proteome. Listed are the protein name, gene name, the number of unique peptides, the number of total peptides, the relative molecular mass, and the estimated isoelectric point of the identified protein species. A set of typical marker proteins of whole saliva was identified, including alpha-amylase and albumin, as well as distinct isoforms of cystatin, serpin, kallikrein, cathepsin, glutathione transferase, carbonic anhydrase, mucin, pyruvate kinase, and aldolase [2][3][4][5]. The identified protein species in saliva were compared with the previously established serum proteome [6]. Fig. 1 shows a Venn diagram of the distribution of proteins that are shared between saliva and serum, and protein species that are uniquely associated with saliva versus serum samples. Tables 2 and 3 list the mass spectrometric identification of proteins that were found in saliva only or that are shared between serum and saliva, respectively. Table 2 lists 59 proteins found in wild-type saliva, but not serum, including carbonic anhydrase 6, BPI fold-containing family A members 1 and 2, cystatin 10, cardiomyopathy-associated protein 5, mucin-19, and desmoplakin. Table 3 lists 78 proteins found in both serum and saliva, including alpha-amylase, cathepsin D, serum albumin, and fructose-bisphosphate aldolase A, as well as kallikrein-1 and Klk1-related peptidases b1, b3, b4, b5, b8, b9, b11, b16, b21, b22, b24, b26, and b27.
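The saliva versus serum comparison summarized in the Venn diagram amounts to a three-way partition of two identifier sets. The following is a minimal sketch of that partition, not the procedure used by the authors. The protein lists are abbreviated from the names mentioned above, the serum-only entry is a hypothetical placeholder, and the function name `venn_partition` is an assumption introduced for illustration.

```python
# Abbreviated example lists; the complete lists are given in Tables 2 and 3.
saliva = {
    "carbonic anhydrase 6", "cystatin 10", "mucin-19", "desmoplakin",
    "alpha-amylase", "serum albumin", "cathepsin D", "kallikrein-1",
}
serum = {
    "alpha-amylase", "serum albumin", "cathepsin D", "kallikrein-1",
    "hypothetical serum protein A",  # placeholder, not taken from the dataset
}


def venn_partition(a, b):
    """Split two identifier sets into (only in a, shared, only in b)."""
    return a - b, a & b, b - a


saliva_only, shared, serum_only = venn_partition(saliva, serum)
print("saliva only:", sorted(saliva_only))
print("shared:", sorted(shared))
print("serum only:", sorted(serum_only))
```

In practice the comparison would be run on stable identifiers such as UniProt accession numbers rather than on protein names, since names can differ between search outputs.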
In addition to the MS/MS scans of the unique peptide NNFLEDEPSAQHR shown in the accompanying research

Experimental design, materials, and methods
Details of the methodological approach used in this study are available in [1,6].

Protein S100-A1 S100a1 1 1 10.5 4.50

Table 3 Mass spectrometry-based proteomic identification of proteins that are present in both saliva and serum from wild type mouse.

Protein S100-A9 S100a9
Q60854 Serpin B6 Serpinb6
Q8VEN2-2 Isoform 2 of Placenta-expressed transcript 1 protein Plet1
O70404 Vesicle-associated membrane protein 8 Vamp8

Sample collection and processing
For the proteomic profiling of easily assessable biofluids, saliva and serum specimens were obtained from 6-month-old dystrophic mdx-4cv and age-matched wild type C57BL/6 mice through the Bioresource Unit of the University of Bonn [6], where mice were kept under standard conditions according to German legislation on the use of animals in experimental research. Sample collection and preparation of protein extracts were carried out as previously described in detail [1,6]. The collected saliva and serum specimens were transported to Maynooth University on dry ice in accordance with the Department of Agriculture (animal by-product register number 2016/16 to the Department of Biology, National University of Ireland, Maynooth).

Mass spectrometric analysis of saliva and serum proteins
Serum samples were processed as previously described [6]. For the proteomic analysis of saliva samples, 30 μg of protein was processed by the filter-aided sample preparation (FASP) method, as described in detail by Wiśniewski et al. [7], using a trypsin to protein ratio of 1:25 (protease:protein). Following overnight digestion and elution of peptides from the spin filter, 2% trifluoroacetic acid (TFA) in 20% acetonitrile (ACN) was added to the filtrates (3:1 (v/v) dilution). Peptides were analyzed by label-free liquid chromatography mass spectrometry (LC-MS/MS) by a standardized method using an Ultimate 3000 NanoLC system (Dionex Corporation, Sunnyvale, CA, USA) coupled to a Q-Exactive mass spectrometer (Thermo Fisher Scientific) as previously described in detail [1,6,8,9].

Protein identification and quantification
Proteins present in the wild type and the mdx-4cv salivary and serum proteomes were initially identified using Proteome Discoverer 1.4 with the Sequest HT search algorithm (licence Thermo Scientific, registered trademark University of Washington, USA) against the UniProtKB/Swiss-Prot database, with 25,041 sequences for Mus musculus [1,6]. Identified saliva peptides were then filtered using minimum XCorr scores of 1.5, 2.0, 2.25, and 2.5 for charge states 1, 2, 3, and 4, respectively, with peptide probability set to high confidence. For quantitative analysis, samples were evaluated with MaxQuant software (version 1.6.1.0), and the Andromeda search engine was used to search the detected features against the UniProtKB/Swiss-Prot database for Mus musculus. The following search parameters were used: (i) first search peptide tolerance of 20 ppm, (ii) main search peptide tolerance of 4.5 ppm, (iii) cysteine carbamidomethylation set as a fixed modification, (iv) methionine oxidation set as a variable modification, (v) a maximum of two missed cleavage sites, and (vi) a minimum peptide length of seven amino acids. The false discovery rate (FDR) was set to 1% for both peptides and proteins using a target-decoy approach. Relative quantification was performed using the MaxLFQ algorithm [10]. The "proteinGroups.txt" file produced by MaxQuant was further analysed in Perseus (version 1.5.1.6).
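The charge-dependent XCorr filter described above can be illustrated with a short sketch. This is not the Proteome Discoverer implementation. The thresholds are those stated in the text, while the function name `passes_xcorr`, the peptide labels, and all scores are invented for illustration. Rejecting charge states outside 1 to 4 is an assumption, since the text does not state how they were handled.

```python
# Minimum XCorr per precursor charge state, as stated in the text.
MIN_XCORR = {1: 1.5, 2: 2.0, 3: 2.25, 4: 2.5}


def passes_xcorr(charge, xcorr, thresholds=MIN_XCORR):
    """Return True if a peptide-spectrum match meets the minimum XCorr.

    Charge states without a listed threshold are rejected (assumption).
    """
    minimum = thresholds.get(charge)
    return minimum is not None and xcorr >= minimum


# Hypothetical peptide-spectrum matches: (label, charge, XCorr).
psms = [
    ("PEPTIDE_A", 2, 3.10),  # above the 2.0 threshold for charge 2
    ("PEPTIDE_B", 2, 1.80),  # below the 2.0 threshold for charge 2
    ("PEPTIDE_C", 3, 2.25),  # exactly at the 2.25 threshold for charge 3
    ("PEPTIDE_D", 5, 4.00),  # charge state without a threshold
]

kept = [label for label, charge, xcorr in psms if passes_xcorr(charge, xcorr)]
print(kept)
```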
Proteins that matched the reverse database or a contaminants database, or that were only identified by site, were removed. The LFQ intensities were log2 transformed, and only proteins found in all eight replicates in at least one group were used for further analysis. Missing values were imputed with values that simulate signals from low-abundance peptides, drawn from a normal distribution specified by a downshift of 1.8 times the mean standard deviation of all measured values and a width of 0.3 times this standard deviation [11]. A two-sample t-test with a significance threshold of p < 0.05 was performed on the post-imputation data to identify statistically significant differentially abundant proteins.
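The log2 transformation, valid-value filter, and downshifted imputation described above can be sketched with the Python standard library. This is an illustration under stated assumptions, not the Perseus implementation: all function names are hypothetical, the intensities are invented, the toy groups have three replicates instead of eight to keep the example short, and the mean and standard deviation are pooled over all measured values rather than computed per sample.

```python
import math
import random
import statistics


def log2_or_none(intensity):
    """log2-transform an LFQ intensity; zero or missing becomes None."""
    return math.log2(intensity) if intensity and intensity > 0 else None


def valid_in_all_of_one_group(row, groups):
    """Keep a protein if every replicate of at least one group has a value."""
    return any(all(row[i] is not None for i in idx) for idx in groups)


def impute(matrix, downshift=1.8, width=0.3, seed=0):
    """Replace None with draws from N(mean - downshift * sd, width * sd)."""
    rng = random.Random(seed)
    observed = [v for row in matrix for v in row if v is not None]
    mean, sd = statistics.mean(observed), statistics.stdev(observed)
    return [
        [v if v is not None else rng.gauss(mean - downshift * sd, width * sd)
         for v in row]
        for row in matrix
    ]


# Toy data: columns 0-2 are one group, columns 3-5 the other (assumption).
groups = [(0, 1, 2), (3, 4, 5)]
raw = [
    [1024, 2048, 4096, 0, 0, 512],      # complete in the first group: kept
    [0, 256, 512, 128, 0, 64],          # incomplete in both groups: removed
    [2048, 0, 1024, 4096, 8192, 2048],  # complete in the second group: kept
]

logged = [[log2_or_none(v) for v in row] for row in raw]
filtered = [row for row in logged if valid_in_all_of_one_group(row, groups)]
imputed = impute(filtered)
```

The imputed matrix could then be passed to a two-sample t-test, for example `scipy.stats.ttest_ind` if SciPy is available, with proteins at p < 0.05 taken as differentially abundant.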

Fig. 2. Proteomic identification of kallikrein isoform Klk-1 in saliva from the wild type versus the mdx-4cv mouse model of Duchenne muscular dystrophy. Shown are representative MS/MS scans of the unique Klk-1 peptides LGSTCLASGWGSITPVK and VLNFNTWIR, which were identified and compared in wild type versus mdx-4cv saliva, respectively.