Simultaneous Quantification of Protein Expression and Modifications by Top-down Targeted Proteomics: A Case of the Sarcomeric Subproteome*

We have developed a straightforward and robust LC/MS-based top-down quantitative proteomics strategy for simultaneous quantification of protein modification and expression that can be directly compared with antibody-based quantitative strategies (i.e. Western blot). As demonstrated, this top-down targeted proteomics platform offers an excellent “antibody-independent” alternative for the accurate quantification of sarcomeric protein expression and PTMs concurrently in complex mixtures with high throughput and high reproducibility, and it is generally applicable to different species and various tissue types.

Highlights
A new strategy for simultaneous quantification of protein expression and modification.
This top-down LC/MS-based method shows high reproducibility and high throughput.
Quantification at the intact protein level with results comparable to Western blot.
This top-down proteomics method is applicable to different species and tissues.

Determining changes in protein expression and post-translational modifications (PTMs) is crucial for elucidating cellular signal transduction and disease mechanisms. Conventional antibody-based approaches have inherent problems such as the limited availability of high-quality antibodies and batch-to-batch variation. Top-down mass spectrometry (MS)-based proteomics has emerged as the most powerful method for characterization and quantification of protein modifications. Nevertheless, robust methods to simultaneously determine changes in protein expression and PTMs remain lacking. Herein, we have developed a straightforward and robust top-down liquid chromatography (LC)/MS-based targeted proteomics platform for simultaneous quantification of protein expression and PTMs with high throughput and high reproducibility.
We employed this method to analyze the sarcomeric subproteome from various muscle types of different species, which successfully revealed skeletal muscle heterogeneity and cardiac developmental changes in sarcomeric protein isoform expression and PTMs. As demonstrated, this targeted top-down proteomics platform offers an excellent 'antibody-independent' alternative for the accurate quantification of sarcomeric protein expression and PTMs concurrently in complex mixtures, which is generally applicable to different species and various tissue types.

In the post-genomic era, when the complexity of the proteome is increasingly recognized, determining changes in protein expression and post-translational modifications (PTMs) is crucial for the elucidation of cellular signal transduction and disease mechanisms (1)(2)(3)(4). Antibody-based assays such as Western blotting and enzyme-linked immunosorbent assay (ELISA) are commonly used in research and clinical laboratories to provide high-sensitivity quantification of protein expression and selected PTMs. However, there are many inherent problems in the antibody-based approaches, including the limited availability of high-quality antibodies, the batch-to-batch variation of the commercially available antibodies, the difficulty of developing antibodies specific to one protein isoform or PTM of interest, and the species dependence of antibodies (5)(6)(7). It is also largely impractical to use antibody-based approaches for large-scale quantification of expression and PTM changes of a large number of proteins (8,9).
Liquid chromatography (LC)/mass spectrometry (MS)-based proteomics technology has become the method of choice for the quantitative analysis of thousands of proteins in many biological systems (1-4, 10-14). Most of the quantitative proteomics studies were carried out using the conventional "bottom-up" approach, in which proteins are digested into small peptides prior to LC/MS analysis. Despite its high throughput and high sensitivity, the bottom-up approach introduces a "peptide-to-protein" inference problem, which complicates the identification and quantification steps (15)(16)(17). For example, bottom-up proteomics typically does not distinguish protein isoforms with high sequence homology and may yield inconsistent results in the determination of protein expression, especially in label-free quantification using multiple peptides from the same protein (17). Therefore, antibody-based assays are often necessary for the validation of the protein expression changes obtained from the bottom-up proteomics data set (12,18,19).
In contrast, the emerging "top-down" approach (20 -22) directly analyzes intact proteins, and thus, allows for the differentiation among myriad proteoforms (23) arising from the same gene because of alternative mRNA splicing and PTMs, or homologous protein isoforms produced by different genes (20), independent of the antibody used. The top-down MS approach is especially attractive for relative quantification of protein modifications because small modifying groups have much less effect on the physicochemical properties of intact protein compared to those of peptides (24 -30). Nevertheless, compared with the well-established quantitative strategies in the bottom-up approach (31,32), reproducible and accurate quantification of protein expression levels across multiple samples by top-down proteomics remains challenging.
Herein, we have developed a highly robust and reliable LC/MS platform for quantitative top-down targeted proteomic analyses of sarcomeric protein expression together with their modifications using reversed-phase chromatography (RPC) separation coupled with a high-resolution quadrupole time-of-flight (Q-TOF) mass spectrometer. First, we demonstrated high reproducibility and good linearity of this method for the quantification of protein expression levels. Subsequently, we employed this platform for the simultaneous quantification of protein expression and PTM changes in the sarcomeric subproteome from various muscle types of different species to demonstrate the general applicability and robustness of the method, which is independent of species and tissue type. A single LC/MS run per sample enables the quantification of expression and PTMs of major sarcomeric protein isoforms (<50 kDa), without the need for the antibodies used in Western blotting and with greatly simplified analytical procedures. Overall, this study opens great opportunities for top-down LC/MS-based proteomics in terms of simultaneous quantification of protein expression and PTMs in a subproteome, given an optimized chromatographic method and stable mass spectrometer performance, and it circumvents the availability, specificity, and species-dependence issues of the antibody-based approaches.

EXPERIMENTAL PROCEDURES
Chemicals-All reagents were purchased from Sigma-Aldrich Inc. (St. Louis, MO) unless noted otherwise. HPLC grade water, acetonitrile, and ethanol were purchased from Fisher Scientific (Fair Lawn, NJ).

Skeletal and Cardiac Muscle Samples-Male Fisher 344 x Brown Norway F1 hybrid rats (F344BN) aged 24 months (n = 6) were obtained from the National Institute on Aging colony maintained by Harlan Sprague-Dawley (Indianapolis, IN). The vastus lateralis (VL), vastus intermedius (VI) and soleus (SOL) muscles taken from one leg of each rat were flash frozen in liquid N2 and stored at −80 °C. Cardiac tissues from the ventricles of adult (n = 8) and fetal (n = 6) sheep were obtained from the Center for the Study of Fetal Programming at the University of Wyoming and stored at −80 °C for subsequent analysis.
Sarcomeric Proteome Extraction-The extraction of sarcomeric proteins from skeletal and cardiac tissues was described previously (24-26). Briefly, 5-20 mg of muscle tissue was homogenized in 100 μl HEPES extraction buffer (25 mM HEPES pH 7.5, 50 mM NaF, 2.5 mM EDTA, 1 mM PMSF, 1 mM Na3VO4), followed by the centrifugation at 16,000 rcf for 15 min at 4 °C, and the remaining pellet was further homogenized in 10 vol (μl/mg tissue) of TFA solution (1% TFA, 2 mM TCEP). The homogenate was centrifuged at 16,000 rcf for 30 min at 4 °C, and the supernatant was collected. Bradford protein assay was performed using bovine serum albumin for the linear curve to determine the total protein concentration of the extracts for protein normalization.
Tandem MS Analysis-Data-dependent automatic MS/MS was performed on the rat skeletal and sheep cardiac sarcomeric protein extracts. The top three most intense ions in each MS spectrum were selected and fragmented by collision-induced dissociation (CID) with a scan rate of 2 Hz for 10 spectra in 200-2000 m/z, with an active exclusion time set to 90 s. Offline CID and electron capture dissociation (ECD) (33) were also employed to assist the protein identification and characterization via LC/MS+ (24,34).
Western Blot Analysis-Sarcomeric protein extracts (2.5 μg of total proteins after optimization) from the rat skeletal muscles were resolved using a 12.5% SDS-polyacrylamide gel cast in-house. The proteins on the gel were transferred to a methanol-activated PVDF membrane overnight at 30 mA constant current at 4 °C, and the membrane was blocked using protein-free blocking solution at room temperature for 1 h. The PVDF membrane was cut as guided by the pre-stained molecular weight markers for probing individual proteins with various molecular weights. Primary antibodies for skeletal α-actin, α-tropomyosin, and troponin I were added at 1:1000, 1:500, and 1:5000 dilutions, respectively, in the protein-free blocking solution, and incubated overnight at 4 °C. The PVDF membranes were washed using TBS containing 0.1% Tween-20 and incubated with secondary antibodies conjugated with HRP (1:5000 dilution) in protein-free blocking solution for 2 h. Enhanced chemiluminescence reagents (GE Health) were mixed at a 1:1 ratio and added to each membrane, and the membranes were imaged using an Odyssey Fc imager (LI-COR, Lincoln, NE).
Experimental Design and Statistical Rationale-Quantitative top-down proteomics analysis was based on (1) six 24-month-old rats (biological replicates) to study the muscle fiber-specific heterogeneity, and (2) 6 fetal and 8 adult sheep (biological replicates) to investigate the expression and PTM changes in cardiac development. Based on the highly reproducible injection and extraction technical replicates shown in the results section, a single LC/MS run was carried out per biological replicate.
All LC/MS data were processed and analyzed using DataAnalysis software (Bruker Daltonics). All chromatograms shown were smoothed using a Gaussian algorithm with a smoothing width of 2.04 s. Mass spectra were deconvoluted using the Maximum Entropy (35) algorithm incorporated in the DataAnalysis software. The resolving power for Maximum Entropy deconvolution was set to 50,000 for proteins that were isotopically resolved. MS/MS (either online or offline) was performed to identify proteins and, subsequently, associated PTMs and sequence variations. Offline MS/MS results of rat skeletal muscles have been previously reported (36). Online MS/MS data of sheep cardiac sarcomeric proteins were output as an .msalign file from the DataAnalysis software for protein identification using MS-Align+ (37). The rat protein database (35,838 entries) and sheep protein database (27,368 entries) from Swiss-Prot (download date 07/18/2016) were used for the skeletal and cardiac samples, respectively. The search parameters were set as follows: shift number 3; mass tolerance 15 ppm; e-value cutoff 0.01. All protein identifications were validated manually using the MASH Suite Pro software (38).
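To make the 15 ppm mass tolerance concrete, the following minimal sketch shows how a relative mass error in parts per million is commonly computed. The function names and the example masses are hypothetical and are not taken from MS-Align+ or DataAnalysis; the search software may apply the tolerance differently.

```python
def ppm_error(observed_mass: float, theoretical_mass: float) -> float:
    """Relative mass error in parts per million (ppm)."""
    return (observed_mass - theoretical_mass) / theoretical_mass * 1e6


def within_tolerance(observed_mass: float, theoretical_mass: float,
                     tol_ppm: float = 15.0) -> bool:
    """True if the absolute mass error does not exceed tol_ppm."""
    return abs(ppm_error(observed_mass, theoretical_mass)) <= tol_ppm


if __name__ == "__main__":
    # Hypothetical masses in Da, for illustration only.
    print(within_tolerance(24000.24, 24000.00))  # about 10 ppm off
    print(within_tolerance(24000.48, 24000.00))  # about 20 ppm off
```

At a mass of 24 kDa, a 15 ppm window corresponds to roughly ±0.36 Da, which illustrates why the tolerance is stated in relative rather than absolute terms.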
To quantify protein expression across samples, the five most abundant charge-state ions (average ± 0.2 m/z) of all major proteoforms from the same protein were retrieved collectively as one extracted peak in the extracted ion chromatogram (EIC). The list of ions selected can be found in supplemental Table S1, supplemental Information. The area under the curve (AUC) was manually determined for each protein isoform using DataAnalysis. To quantify protein modifications, the relative abundances of specific modifications were calculated as their corresponding percentages among all the detected protein forms in the deconvoluted averaged mass spectra, as described previously (24,26,28,29,36). All masses reported are monoisotopic values for both intact and fragment ions.
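The two quantities described above can be illustrated with a short sketch. This is not the DataAnalysis workflow used in the study, where the AUC was determined manually. It is a simplified illustration that assumes each scan is available as a retention time plus a list of (m/z, intensity) pairs, that the EIC is the summed intensity within ±0.2 m/z of any selected ion, and that the area is obtained by trapezoidal integration. All function names and data are hypothetical.

```python
from typing import Dict, List, Sequence, Tuple

# One scan: (retention time in s, [(m/z, intensity), ...])
Scan = Tuple[float, List[Tuple[float, float]]]


def extracted_ion_chromatogram(scans: Sequence[Scan],
                               target_mzs: Sequence[float],
                               half_width: float = 0.2) -> List[Tuple[float, float]]:
    """Per scan, sum the intensity of peaks within +/- half_width m/z of any target ion.

    A peak that matches more than one target is counted once.
    """
    eic = []
    for rt, peaks in scans:
        total = sum(intensity for mz, intensity in peaks
                    if any(abs(mz - t) <= half_width for t in target_mzs))
        eic.append((rt, total))
    return eic


def area_under_curve(eic: Sequence[Tuple[float, float]]) -> float:
    """Trapezoidal integration of EIC intensity over retention time."""
    area = 0.0
    for (t0, y0), (t1, y1) in zip(eic, eic[1:]):
        area += 0.5 * (y0 + y1) * (t1 - t0)
    return area


def relative_abundance(intensities: Dict[str, float]) -> Dict[str, float]:
    """Percentage of each proteoform among all detected forms of one protein."""
    total = sum(intensities.values())
    return {name: 100.0 * value / total for name, value in intensities.items()}


if __name__ == "__main__":
    # Hypothetical three-scan example with one target ion at m/z 800.0.
    scans = [(0.0, [(800.0, 0.0)]),
             (1.0, [(800.1, 10.0), (900.0, 5.0)]),
             (2.0, [(800.0, 0.0)])]
    eic = extracted_ion_chromatogram(scans, [800.0])
    print(area_under_curve(eic))
    # Hypothetical deconvoluted intensities for un-, mono- and bis-phosphorylated forms.
    print(relative_abundance({"M2": 50.0, "pM2": 30.0, "ppM2": 20.0}))
```

Summing all proteoforms into one EIC gives an expression measure for the protein as a whole, whereas the percentage calculation resolves the individual modified forms within the same deconvoluted spectrum.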
All statistical data are presented as the mean ± standard error of the mean (S.E.M.). Student's t test was performed for between-group comparisons to evaluate the statistical significance of variance for the validation of the simultaneous quantification of protein expression and modification changes. Differences among means were considered significant at p < 0.05. All error bars shown in the figures are based on S.E.M.
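The summary statistics named above can be sketched as follows. The text does not state whether the t test was paired or unpaired, so this sketch assumes an unpaired two-sample test with pooled (equal) variance. The function names and input values are hypothetical. The p value would then be obtained from the t distribution with the returned degrees of freedom, for example with a statistics package such as SciPy.

```python
import math
import statistics
from typing import Sequence, Tuple


def mean_sem(values: Sequence[float]) -> Tuple[float, float]:
    """Mean and standard error of the mean (sample SD divided by sqrt(n))."""
    n = len(values)
    return statistics.mean(values), statistics.stdev(values) / math.sqrt(n)


def student_t(a: Sequence[float], b: Sequence[float]) -> Tuple[float, int]:
    """Unpaired Student's t statistic with pooled variance; returns (t, degrees of freedom)."""
    na, nb = len(a), len(b)
    df = na + nb - 2
    pooled_var = ((na - 1) * statistics.variance(a)
                  + (nb - 1) * statistics.variance(b)) / df
    t = (statistics.mean(a) - statistics.mean(b)) / math.sqrt(pooled_var * (1 / na + 1 / nb))
    return t, df


if __name__ == "__main__":
    # Hypothetical relative abundances for two groups.
    group_a = [1.0, 2.0, 3.0]
    group_b = [4.0, 5.0, 6.0]
    print(mean_sem(group_a))
    print(student_t(group_a, group_b))
```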

A Robust Top-down LC/MS Platform for Simultaneous Quantification of Sarcomeric Protein Expression and Modifications-To achieve reliable and accurate quantification of protein expression level across multiple samples by top-down MS, several criteria should be confirmed: reproducibility of the sample preparation protocol, robustness of the protein separation strategy, and linearity of the instrument response from the mass spectrometer. The extraction protocol employed in this study has been proven effective for enriching sarcomeric proteins from striated and cardiac muscle tissue (Fig. 1A) (24,39). Meanwhile, total protein normalization has yet to be utilized in top-down sarcomeric protein analysis. Here, we aim to establish reliable quantification of protein expression by normalizing samples to the same protein concentration.
Next, we assessed the reproducibility of this LC/MS method for separation and quantification using the rat skeletal sarcomeric subproteome. We chose the skeletal muscle system because of its highly heterogeneous nature and the co-existence of isoforms generated from multi-gene families together with the PTMs. A complete list of the sarcomeric protein isoforms and modifications analyzed can be found in supplemental Table S1. First, we performed three injection replicates for the same rat VL extract. The chromatograms from the different runs were nearly identical, with constant retention times and MS signal intensities for individual proteins (supplemental Fig. S1). Subsequently, we analyzed six extraction replicates from the same VL tissue sample and normalized the final concentration of all the samples prior to LC/MS analysis. With the same amount of total proteins injected, the base peak chromatograms (BPCs) were highly reproducible among different protein extracts, as shown in Fig. 2A.
The EICs of the representative proteins, fast skeletal troponin I (fsTnI), skeletal α tropomyosin (sα-Tpm), fast skeletal myosin light chain 2 (fsMLC2, also known as fast skeletal regulatory light chain) and skeletal α-actin (sα-Actin) showed consistent abundances as represented by their AUCs of the EICs (Fig. 2B) (coefficient of variation < 5%, supplemental Fig. S2A). This demonstrated the robustness of the sample preparation, protein separation and detection method, as well as the feasibility of using total protein normalization as an internal control.
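For reference, the coefficient of variation used as the reproducibility criterion above is the standard deviation expressed as a percentage of the mean. The sketch below assumes the sample standard deviation (n − 1 denominator), which the text does not specify, and uses hypothetical AUC values.

```python
import statistics
from typing import Sequence


def coefficient_of_variation(values: Sequence[float]) -> float:
    """Coefficient of variation in percent: 100 * sample SD / mean."""
    return 100.0 * statistics.stdev(values) / statistics.mean(values)


if __name__ == "__main__":
    # Hypothetical AUCs of one protein across three extraction replicates.
    print(coefficient_of_variation([98.0, 100.0, 102.0]))
```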
Following the assessment of reproducibility, we evaluated the linearity of the instrument response for multiple proteins extracted from rat VL, including fsTnI, fsMLC2, sα-Actin, sα-Tpm, and fast skeletal troponin C (fsTnC). With 0.19, 0.22, 0.28, 0.45, 0.56, 0.75 and 1.1 μg of total proteins injected for LC/MS analysis, the retention times of the representative proteins were highly consistent (Fig. 2C). The abundances (AUCs) of the individual proteins exhibit a linear correlation with the amount of total protein loaded into the LC/MS system over the 0.28-0.75 μg range. As shown in Fig. 2D, the standard curves for individual proteins do not always have an intercept at the origin, indicating that the fold-change of a protein obtained from an LC/MS dataset does not necessarily equal the fold-change in the actual amount of the protein in the sample. Thus, it is critical to establish a standard curve for any protein of interest to evaluate the instrument response for the correct evaluation of the fold-change in protein expression level between samples.

Figure legend (partial). ... of the two and three major proteoforms (i.e. M1, acM1; M2, pM2, ppM2), respectively. The relative abundances of modifications of M1 and M2 were quantified within the deconvoluted mass spectrum. "ac" represents acetylation; "p" and "pp" represent mono- and bis-phosphorylation, respectively.
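The consequence of a non-zero intercept can be illustrated numerically. The sketch below fits an ordinary least-squares line to a hypothetical standard curve (AUC versus μg loaded) and compares the raw AUC ratio with the ratio of amounts read back from the curve. The data points are invented for illustration and are not values from Fig. 2D.

```python
from typing import Sequence, Tuple


def fit_line(x: Sequence[float], y: Sequence[float]) -> Tuple[float, float]:
    """Ordinary least-squares fit y = slope * x + intercept."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def amount_from_auc(auc: float, slope: float, intercept: float) -> float:
    """Invert the standard curve to estimate the amount loaded."""
    return (auc - intercept) / slope


if __name__ == "__main__":
    # Hypothetical standard curve with a negative intercept: AUC = 100 * amount - 10.
    amounts = [0.28, 0.45, 0.56, 0.75]   # micrograms loaded
    aucs = [18.0, 35.0, 46.0, 65.0]      # arbitrary units
    slope, intercept = fit_line(amounts, aucs)

    raw_ratio = aucs[2] / aucs[0]
    corrected_ratio = (amount_from_auc(aucs[2], slope, intercept)
                       / amount_from_auc(aucs[0], slope, intercept))
    print(round(raw_ratio, 2), round(corrected_ratio, 2))
```

In this invented case the loaded amount doubles between the first and third points, yet the raw AUC ratio is noticeably larger than two. Reading the amounts back through the fitted curve recovers the true ratio, which is the motivation for establishing a standard curve per protein.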
Quantitative Analysis of Skeletal Muscle Protein Isoform Expression and Modifications-Skeletal muscles provide an excellent system to test the quantitative capabilities of our top-down LC/MS strategy because of high muscle-to-muscle heterogeneity in the expression of sarcomeric protein isoforms and various degrees of modifications. Given the importance of isoform switching in muscle physiology and pathophysiology (45)(46)(47)(48), we evaluated whether this top-down LC/MS method can be applied to quantify the expression of sarcomeric protein isoforms in skeletal muscles. We used three different muscle types, VL, VI, and SOL, which contain predominantly fast-twitch, a mix of fast- and slow-twitch, and predominantly slow-twitch muscle fibers, respectively (49,50).
Equal amounts of total protein (0.50 μg) of skeletal muscle protein extract from VL, VI, and SOL were loaded and separated by RPC (Fig. 3A and supplemental Fig. S3). As expected, the fast-skeletal isoforms of the contractile proteins, such as fsTnI and fsMLC2, are highly expressed in VL, and the slow-skeletal isoforms, such as ssTnI and ssMLC2, are the major isoforms in SOL, whereas VI contains both the fast- and slow-skeletal isoforms with various abundances, as shown in Fig. 3B and 3C. Similarly, sα-Tpm is mainly expressed in fast-twitch fibers whereas skeletal β tropomyosin (sβ-Tpm) is the predominant isoform in slow-twitch muscles, as shown in Fig. 3D (51). It is worth noting that both fsMLC2 and ssMLC2 are phosphorylated: in VL, mono-phosphorylated and bis-phosphorylated fsMLC2 are abundant modifications that constitute ~50% of the fsMLC2 isoform, whereas the mono-phosphorylated ssMLC2 is close to 10% of the ssMLC2 species in SOL. The un-phosphorylated and phosphorylated forms are extracted and integrated together for the fast- and slow-skeletal MLC2 isoforms (left panels in Fig. 3C). Although various isoforms of the same contractile protein family have a high degree of sequence homology (supplemental Fig. S4), we were able to achieve high-resolution separation of these isoforms, as shown in Fig. 3A-3D and supplemental Fig. S3. High-resolution protein separation is the basis for accurate quantification of the expression level of different isoforms, as co-eluting proteins from different genes with similar m/z typically result in a higher baseline for the EICs of the target protein family and, therefore, may lead to inaccurate estimation of protein expression level. Because the rat VI is a mixture of both fast- and slow-twitch fibers, variations of protein expression and PTMs are highly dependent on the ratio of fast- and slow-twitch fibers in the VI tissue. Thus, we employed this robust LC/MS method to quantify the expression of the major contractile protein isoforms (<50 kDa) in the rat VL and SOL muscles only (Fig. 3E). The expression of these isoforms is highly consistent with the type of muscles being analyzed (fast-twitch muscle versus slow-twitch muscle).

FIG. 3 (partial legend). B, EICs of fsTnI and ssTnI show differential expression of the proteins consistent with the muscle types. C, EICs of fsMLC2 and ssMLC2 show differential expression of the proteins consistent with the muscle types. D, EICs of sα-Tpm and sβ-Tpm show differential expression of the proteins consistent with the muscle types. Deconvoluted mass spectra show each protein in fast- and slow-twitch fibers with modifications. p and pp denote mono-phosphorylation and bis-phosphorylation, respectively. VL and SOL contain primarily fast- and slow-twitch muscle fibers, respectively, and VI contains both fast- and slow-twitch muscle fibers. The variations of protein expression and PTMs are highly dependent on the ratio of fast- and slow-twitch fibers in the VI tissue. E, Summary of the expression of different contractile protein isoforms in six biological replicates in the rat VL and SOL muscles. F, Summary of the modification changes within multiple sarcomeric protein isoforms in the rat VL and SOL muscles. *p < 0.05, **p < 0.01, ***p < 0.001.
Besides the different level of expression, the PTMs for each individual isoform are also altered in VL and SOL as shown in Fig. 3F. The phosphorylation levels of sα-Tpm and sβ-Tpm increased in the slow-twitch fibers (SOL > VL); for ssTnI and ssMLC2, the percentage of phosphorylation remained similar across the two muscle types; and a decreasing trend was observed in fsMLC2 phosphorylation in SOL compared with that in VL.
Sarcomeric Protein Isoform and Modification Changes During Cardiac Development-We next sought to demonstrate that our established LC/MS method is generally applicable to different species (i.e. rat versus sheep) and various types of muscles (i.e. skeletal versus cardiac). Sheep are increasingly recognized as phenotypically relevant models for human genetic diseases (52), but antibodies developed for rodents and humans typically do not recognize proteins from these different species because of the species-dependent nature of antibodies. Hence, there is an urgent need to develop an antibody-independent quantitative strategy. After establishing the reliable quantification of sarcomeric isoform expression in different skeletal muscles from rat, we employed this method to investigate the sarcomeric protein isoform and PTM changes in cardiac development using cardiac tissue from adult (n = 8) and fetal (n = 6) sheep. Supplemental Fig. S2B shows the extraction reproducibility for adult sheep cardiac tissue. The representative RPC chromatograms of the sarcomeric protein extracts are shown in supplemental Fig. S5 with the linearity assessment. Automatic MS/MS was performed to determine the isoforms of the sarcomeric proteins in the sheep cardiac tissue. The protein identification results are summarized in supplemental Table S1. We were able to identify the major sarcomeric proteins in the sheep cardiac tissue even though sheep-specific antibodies are often unavailable or unreliable.
We then quantified the expression of sarcomeric protein isoforms and observed increased expression of cTnI accompanied by decreased expression of ssTnI (Fig. 4A, 4B). This is consistent with previous reports that indicated decreased ssTnI expression and increased cTnI expression in human hearts during cardiac development (53,54), which demonstrates the accuracy of our LC/MS-based top-down proteomics platform for the quantification of protein isoforms. Although TnI isoform switching during development is well documented (53,54), the extent of cTnI phosphorylation in the developing heart is less clear. Interestingly, despite an increase in cTnI expression, cTnI phosphorylation was not significantly altered in the adult sheep cardiac tissue compared with fetal tissue, as shown in the bar graphs at the bottom of Fig. 4A. Because either isoform switching or PTM changes of TnI affect the pH-dependent myofibrillar Ca2+ activation of contractile force production (55,56), our LC/MS-based method permits the simultaneous quantification of TnI isoforms and PTMs to determine the contributions of isoform switching or PTM changes to the change of contractile properties in cardiac development and diseases. In contrast to most cardiac sarcomeric proteins, which exhibit increased protein expression in adult cardiac tissue compared with fetal cardiac tissue (Fig. 4C, 4D and supplemental Fig. S6), the change of muscle LIM protein (MLP) expression was not significant when the un-modified and mono-phosphorylated MLP were quantified collectively (Fig. 4E). However, the relative abundance of mono-phosphorylated MLP among total MLP is higher in adulthood compared with the fetal stage, indicating changes in certain signaling pathways in the developmental stage (Fig. 4E), which was not previously investigated.

Comparison Between Top-down LC/MS-based Quantification and Western Blot Analysis for Protein Expression-We further compared protein expression quantification using the top-down LC/MS-based method with the well-established Western blotting approach. For the identical samples that were analyzed by LC/MS (0.50 μg), we loaded 2.5 μg of total protein for SDS-PAGE and Western blotting. Biological replicates 1-5, each containing three muscle types, were employed to compare the performances of Western blotting and LC/MS, because of the limited number of lanes in the homemade gel. Skeletal muscle-specific antibodies were used to probe for sα-Actin, α-Tpm, ssTnI and fsTnI in different muscle types. Quantification of protein expression level by top-down LC/MS and Western blotting was highly consistent, as shown in Fig. 5A, for the four proteins investigated in a case-by-case manner for three representative rats. sα-Actin remained relatively constant in different muscle types without significant changes. α-Tpm was highly expressed in fast-twitch skeletal muscle (VL) and the expression was low in SOL, a slow-twitch muscle. In the mixed-fiber cases of VI, α-Tpm expression was higher in the VI#1 and VI#2 skeletal muscle protein extracts compared with VI#3, indicating that VI#1 and VI#2 contain more fast-twitch muscle fibers whereas VI#3 contains more slow-twitch fibers. Consistent with α-Tpm expression, ssTnI expression was higher in the VI#3 protein extract than in VI#1 and VI#2, whereas fsTnI expression was lower in VI#3. Based on five biological replicates in Fig. 5B, the trends of expression level changes were very similar between LC/MS and Western blotting quantification, although some discrepancies also existed in the relative abundances because of the different response factors (i.e. sensitivity, dynamic range, background level, and total protein-loading amount) for the two methods.

FIG. 4 (partial legend). ssTnI expression (no modification) was evident in fetal hearts but absent in adult hearts; cα-Tpm, cβ-Tpm, and cMLC2 expressions increased in adult sheep heart without modification changes. MLP expression was not altered, whereas the relative abundance of mono-phosphorylated MLP increased in adult heart compared with fetal heart. p and pp denote mono-phosphorylation and bis-phosphorylation, respectively. **p < 0.01, ***p < 0.001.

FIG. 5. Comparison of sarcomeric protein expression quantification by top-down LC/MS and Western blotting (WB) in rat skeletal muscle fibers. A, Case-by-case comparison of sα-Actin (green), sα-Tpm (red), ssTnI (dark blue) and fsTnI (light blue) by EIC and WB in VL, VI, and SOL fibers from three representative rats. B, Bar graphs showing high correlation between LC/MS (n = 5) and WB quantification (n = 5). Relative abundances were normalized to the highest mean of the corresponding protein from the three muscle types.
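The normalization used for Fig. 5B, in which relative abundances are scaled to the highest mean among the three muscle types, can be sketched as follows. This is one plausible reading of that description, assuming the group means (rather than individual replicates) are divided by the largest group mean. The function name and values are hypothetical.

```python
from typing import Dict, Sequence


def normalize_to_highest_mean(groups: Dict[str, Sequence[float]]) -> Dict[str, float]:
    """Scale each group's mean abundance by the largest group mean (top group becomes 1.0)."""
    means = {name: sum(values) / len(values) for name, values in groups.items()}
    highest = max(means.values())
    return {name: mean / highest for name, mean in means.items()}


if __name__ == "__main__":
    # Hypothetical AUCs (LC/MS) or band intensities (WB) for one protein.
    abundances = {"VL": [10.0, 12.0], "VI": [5.0, 7.0], "SOL": [1.0, 1.0]}
    print(normalize_to_highest_mean(abundances))
```

Applying the same scaling separately to the LC/MS and Western blot readouts places both on a common 0 to 1 scale, so that trends can be compared even though the two methods have different response factors.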
Sarcomeric Protein Identification and Characterization by MS/MS-One limitation for antibody-based approaches is that they can only be utilized to identify/confirm and quantify known proteins or PTMs (8,9). In contrast, MS/MS strategies in top-down proteomics provide unparalleled opportunity for identification, characterization, and quantification of both known and unknown proteins and PTMs (20-22). In this study, both online and offline MS/MS (36) were employed to identify protein isoforms and characterize their modifications. Generally, offline MS/MS is powerful in characterizing various isoforms and PTMs to achieve high coverage of amino acid residue cleavages (24,36,57) with the trade-off of time and labor, whereas online automatic data-dependent MS/MS is convenient in identifying the proteins as well as providing often partial characterization of the sequence variations and PTMs. For example, a polymorphism in cTnI has been observed in several sheep cardiac samples (second fetal sample shown in Fig. 4A), which provides a good opportunity to evaluate the capability of online automatic MS/MS in characterizing sarcomeric protein sequence variants and modifications. To localize the PTMs and sequence variation of cTnI after its identification by MS-Align+ (37), we carefully examined the automatic MS/MS data using MASH Suite Pro (38). Fig. 6A shows the two variants of the sheep bis-phosphorylated cTnI with removal of the N-terminal Met1, and Ala2 acetylation (42.01 Da). Notably, the two phosphates had minimum cleavage during CID for the b ions as demonstrated in Fig. 6B, although phosphorylation is generally considered as a labile modification (58). The two phosphorylated sites were then localized to Ser24, Ser25, or Thr33 based on the preservation of the phosphates on the fragment ions.
Previous work identified endogenous phosphorylation sites of human and swine cTnI as Ser24 and Ser25 by top-down MS (24,59,60), and therefore, sheep cTnI phosphorylation sites are likely localized to Ser24/25 based on sequence homology (supplemental Fig. S7). Regarding the sequence variation in cTnI, the accurate Δmass of −30.01 Da in the mass spectrum suggests that the polymorphism is likely either a Thr→Ala or a Ser→Gly variation, although variations with 2 or more amino acid changes that collectively result in a 30.01 Da difference are also possible. The fragment ion pairs with a Δmass of 30.01 Da and similar intensities were investigated to localize the sequence variation. Such pairs were consistently observed for y ions larger than y42, but did not exist for y20 and smaller ions (Fig. 6C and supplemental Fig. S8). Because there is no Ser residue within the C-terminal 20-42 amino acids, and only one Thr residue is present in this range, T183A (Fig. 6A) is most likely to be the polymorphism that results in the 30.01 Da difference in the sheep cTnI species. The online CID shows adequate performance in the identification and partial characterization of major sarcomeric proteins and provides supplementary information to protein expression and modification quantification. Offline protein characterization may still be essential if bond cleavages of most residues are required.
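The inference that a Δmass of −30.01 Da points to Thr→Ala or Ser→Gly can be checked with a short enumeration over standard monoisotopic residue masses. The sketch below is an illustration rather than part of the reported analysis. It considers only single substitutions among the 20 common amino acids, and the 0.01 Da matching window is an assumption chosen for the example.

```python
from typing import Dict, List, Tuple

# Standard monoisotopic residue masses (Da) of the 20 common amino acids.
RESIDUE_MASS: Dict[str, float] = {
    "G": 57.02146, "A": 71.03711, "S": 87.03203, "P": 97.05276,
    "V": 99.06841, "T": 101.04768, "C": 103.00919, "L": 113.08406,
    "I": 113.08406, "N": 114.04293, "D": 115.02694, "Q": 128.05858,
    "K": 128.09496, "E": 129.04259, "M": 131.04049, "H": 137.05891,
    "F": 147.06841, "R": 156.10111, "Y": 163.06333, "W": 186.07931,
}


def substitutions_matching(delta_mass: float, tol: float = 0.01) -> List[Tuple[str, str]]:
    """Single substitutions (old, new) whose mass shift is within tol of delta_mass."""
    hits = []
    for old, m_old in RESIDUE_MASS.items():
        for new, m_new in RESIDUE_MASS.items():
            if old != new and abs((m_new - m_old) - delta_mass) <= tol:
                hits.append((old, new))
    return sorted(hits)


if __name__ == "__main__":
    # Observed shift of the cTnI variant relative to the reference form.
    print(substitutions_matching(-30.01))
```

Both candidate substitutions correspond to the loss of CH2O (30.0106 Da), so the intact mass alone cannot distinguish them. This is why the fragment-ion pairs and the positions of Ser and Thr in the sequence are needed to localize the variation.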

DISCUSSION
Despite their high sensitivity for quantification of protein expression, antibody-based assays such as Western blotting and ELISA face significant challenges (5). A significant percentage of commercially available antibodies have serious issues such as poor quality, cross-reactivity, and batch-to-batch variation, which make it difficult to conduct reproducible Western blotting and ELISA (7). It is even harder to produce high-quality antibodies that can target a specific isoform and distinguish it from others with high sequence similarity, or that can target a specific PTM. Furthermore, antibodies generated against one species (e.g. human or mouse) may not recognize proteins in other species (e.g. pig or sheep). Given the increasing use of large-animal models to recapitulate human diseases, there is currently an urgent need to develop quantification strategies that are species-independent (8,9).
Quantitative proteomics strategies offer a solution to this problem (1)(2)(3)(4). However, the mainstream bottom-up approach quantifies peptides produced from protein digests rather than the proteins directly, which may lead to complications and inconsistency (there are cases in which different peptides from the same protein yielded different quantification results) (17). Top-down proteomics strategies can quantify intact proteins directly, providing an ultimate solution for determining changes in protein expression and PTMs. In the past, top-down proteomics has often been considered semi-quantitative, with un-modified and modified forms with small sequence variations quantified in the same mass spectrum, given the assumption that small modifications or sequence variations have negligible effects on the ionization and detection of closely related forms (20, 21, 24-26).
Compared with the well-developed quantitative strategies in the bottom-up approach (31,32), reproducible and accurate quantification of protein expression across multiple samples by top-down proteomics remains challenging (9). Substantial efforts have been made in top-down quantitative analysis across samples using labeling strategies, such as stable isotopic labeling by amino acids (61,62), isotope-coded protein labels (63), isobaric tags for relative and absolute quantitation (64), and tandem mass tags (65). However, incomplete incorporation of the labels can spread the intact protein signals, resulting in a detrimental reduction in the signal-to-noise ratio (S/N) of the analytes (62). Hence, label-free quantification methods for measuring protein expression, both targeted and untargeted, have recently been investigated, including the pioneering proof-of-principle work on targeted lipoprotein quantification in human serum (40), the large-scale label-free quantitative pipeline developed for the quantification of protein forms below 30 kDa in a complex system (66), and the spectral counting method for exosomal proteins from myeloid-derived suppressor cells (67). Although proven useful, the previous expression-level quantification methods by top-down proteomics did not consider all the criteria needed to yield reliable and accurate results, including reproducible extraction methods, robust high-resolution LC separations, and linear MS responses. More urgently, a direct comparison of the top-down LC/MS approach for protein quantification with Western blotting in a case-by-case manner has not been established.
Meanwhile, most quantitative top-down proteomics studies focus on the quantification of "proteoform expression" (66, 67) instead of "protein expression" between samples. Even though the quantification of individual proteoforms with different masses is conceptually less challenging and easier to automate, this approach does not account for families of related protein modifications that can have similar biological functions. For instance, cTnI has been implicated as a biomarker in acute myocardial infarction, also known as heart attack (68). cTnI is rapidly phosphorylated by protein kinase A in response to adrenergic stimulation, which produces positive inotropic effects and increases the rate of cardiac relaxation without altering protein expression (55,69). However, if cTnI quantification does not include all the modifications, the PTM changes could be mistaken for an alteration in protein expression. Therefore, closely related modifications in the same isoform family should be quantified as a whole when assessing protein expression. This idea also accords with antibody-based assays, in which proteins containing the same specific epitope are quantified collectively. On the other hand, because protein forms that differ by a single amino acid or PTM may have different activities, interacting partners, and/or kinetics (20,21,58,70,71), multiple related forms within a family should also be distinguished and quantified separately.
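The distinction between family-level expression and within-family modification abundance can be illustrated with a minimal sketch. The proteoform labels and intensity values below are hypothetical, and the function is an illustration of the summing idea described above rather than the data processing pipeline used in this study.

```python
def quantify_family(intensities):
    """Summarize one isoform family from its proteoform abundances.

    `intensities` maps a proteoform label to its measured abundance
    (arbitrary units). Returns (total, fractions): `total` is the summed
    abundance used as the expression readout, and `fractions` gives the
    relative abundance of each form within the family.
    """
    total = sum(intensities.values())
    if total <= 0:
        raise ValueError("family has no measurable signal")
    fractions = {name: value / total for name, value in intensities.items()}
    return total, fractions


if __name__ == "__main__":
    # Hypothetical abundances for un-, mono-, and bis-phosphorylated cTnI.
    control = {"cTnI": 60.0, "pcTnI": 30.0, "ppcTnI": 10.0}
    stimulated = {"cTnI": 20.0, "pcTnI": 30.0, "ppcTnI": 50.0}

    for label, sample in (("control", control), ("stimulated", stimulated)):
        total, fractions = quantify_family(sample)
        print(label, "expression:", total, "fractions:", fractions)
```

In this made-up example the summed family abundance is identical in both samples, so expression is unchanged, whereas the un-phosphorylated form alone drops threefold. Quantifying only that single proteoform would misreport a phosphorylation shift as a decrease in expression.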
Thus, the realization of the full potential of top-down proteomics requires a robust and reliable strategy for the quantification of protein expression and the relative abundance of modifications across multiple samples. In this study, we have demonstrated a robust top-down LC/MS-based targeted proteomics platform for simultaneous quantification of protein expression and PTMs. The advantages of this method include high reproducibility, good linearity for multiple proteins, and the capabilities for both expression and modification quantifications comparable to Western blotting, without the use of antibodies.
We have applied this method to quantify sarcomeric protein expression and modification abundances in multiple tissue extraction samples. Previous studies have shown that sarcomeric isoform switching plays an important role in skeletal muscle development, adaptation, and diseases (48,72). Among the three rat skeletal muscle types, differential sarcomeric protein isoform expression was observed by LC/MS and confirmed by Western blotting, which agrees with the fact that slow-twitch fibers (SOL) contain primarily slow contractile isoforms and fast-twitch fibers (VL) have mainly fast isoforms (49,50). In the heterogeneous muscle VI with mixed-twitch fibers, the expression of fast and slow isoforms depends on the type of fibers collected as well as individual variances. We have established a robust and well-characterized system for the quantification of sarcomeric protein isoform expression across multiple samples using an LC/MS-based top-down proteomics strategy. Regarding the quantification of PTMs, as shown in one of our previous studies, a decrease of phosphorylation in fsMLC2 was found to be associated with sarcopenia using a rat gastrocnemius muscle model (26). Our results demonstrate differential degrees of expression and modification for the same sarcomeric protein isoform in multiple muscle types, and therefore indicate the importance of muscle sample selection in disease model studies in order to assign causation between disease phenotypes and protein changes (both expression and PTMs).
In the case of sheep cardiac development, we observed increased expression of cTnI, cardiac troponin C, cα-Tpm, cβ-Tpm, cMLC1, cMLC2, and sα-Actin during cardiac development, whereas ssTnI was expressed only in fetal samples and was absent in adulthood. These findings are in agreement with previous studies of human hearts during cardiac development by Western and Northern blot analysis (53,54). In our previous quantitative top-down analysis of sarcomeric proteins in cardiac muscles, the quantification was limited to the relative abundances of modified versus un-modified forms within samples (24). In contrast, this top-down LC/MS approach enables sarcomeric protein expression quantification across samples with the consideration of all major protein forms in one isoform family. Because of the lack of antibodies specific for sheep proteins, Western blotting for these sarcomeric proteins was not possible. Therefore, our top-down LC/MS-based protein quantification platform offers an excellent alternative for the accurate quantification of protein expression levels, independent of target-protein-specific antibodies.

CONCLUSIONS
In summary, we have developed a reliable and robust top-down LC/MS strategy that allows for the quantification of protein expression simultaneously with corresponding modifications across multiple samples. On the premise of reproducible sample preparation, high-resolution protein separation, and linear instrument response, our method enables the quantification of major sarcomeric proteins from muscle extracts. This method accounts for the inter-relationship among un-modified and modified protein forms of the same isoform family, facilitating accurate quantification of protein expression. Moreover, given the impacts of protein PTMs and sequence variations on protein function, the quantification and characterization of modifications is also permitted. The differentiation between the regulation of protein expression and PTMs will significantly benefit the elucidation of cellular signal transduction and disease mechanisms. The major advantages of this method include high reproducibility, good linearity for multiple proteins, and the capabilities for both expression and modification quantifications comparable to Western blotting, without the use of antibodies. Essentially, with well-controlled sample preparation procedures, optimized and reproducible chromatographic strategies, robust MS methods, and automated data processing algorithms, many proteins can be quantified using this top-down LC/MS-based targeted proteomics method. This platform opens unique opportunities for top-down LC/MS approaches for simultaneous quantification and characterization of protein expression levels and PTMs in complex protein mixtures.
* Financial support was provided by NIH R01 HL109810 and HL096971 (to Y.G.). Y.G. would like to acknowledge NIH R01 GM117058, R01 GM125085, and high-end instrument grant S10OD018475. T.T. would like to acknowledge support from the NIH Chemistry-Biology Interface Training Program NIH T32GM008505. W.G. and S.F. would like to acknowledge NIFA-USDA Hatch grant (#1009266 to W.G.), and NIH HD070096-01A1 (to S.F.).