Analytical validation of protein biomarkers for risk of spontaneous preterm birth

Highlights
• Results for an assay validation detecting two spontaneous preterm birth biomarkers.
• Sample prep of immuno-depletion, tryptic digest, and mass spectrometric detection.
• Precision, accuracy, linearity, LOQ and analytical specificity determined.
• Method provides robust means of determining relative abundances of biomarkers.
• Allows for prediction of individual risk of spontaneous preterm birth.


Introduction
Preterm birth (PTB), defined as delivery at fewer than 37 weeks of gestation, is the leading cause of mortality and morbidity in neonates [1]. Worldwide, PTB impacts 15 million deliveries annually and results in over one million infant deaths [2,3]. Spontaneous onset of preterm birth (sPTB) represents a high percentage of all PTB cases [4]. Until recently, sPTB lacked an adequate prognostic test. The development of such a test was complicated by the variety of etiologic associations described for sPTB, including infection, inflammation, placental complications, and uterine distension [4]. Because of this complex etiology, an effective prognostic test must be able to interrogate multiple biological pathways.
Increasingly, proteomics is being used in clinical diagnostic testing as a predictor for a variety of complex diseases and conditions (e.g., spontaneous preterm birth [5], lung cancer [6], therapeutic targeting of breast cancer [7]). These conditions are often characterized by numerous and diverse interconnecting biological pathways, requiring systematic approaches for the development of comprehensive clinical diagnostic tests. Tandem mass spectrometry is not only capable of multiplexing assays, but it can also be rigorously validated on a wide variety of analytes, including proteins [8][9][10][11][12][13].
We applied a targeted proteomics workflow, coupled with highly multiplexed tandem mass spectrometry detection, to simultaneously monitor peptides from candidate signature proteins and quality control proteins in subsets of clinical study serum samples.Insulin-like growth factor-binding protein 4 (IBP4) and sex hormone-binding globulin (SHBG) were previously shown by our group to perform well as biomarkers for discriminating sPTB from term birth [5].Additionally, we utilized this technology to develop and clinically validate a bivariate protein biomarker assay for the qualitative prediction of individual risk of spontaneous preterm birth [5].
We previously validated the first-generation assay using an Agilent 6490 and a 30-min liquid chromatography gradient. This method established acceptable measures of analytical validation, including precision, carryover, limit of detection, and analytical specificity, and was used to assay blinded samples during a previously published clinical validation [5] and commercial samples later. As with any new assay, performance requirements change over time, either to improve analytical performance or reduce cost. To increase the laboratory's throughput and to enable integration of a current-generation mass spectrometer, a plan was developed to validate the method after migration to an Agilent 6495 triple quadrupole mass spectrometer with a 15-min liquid chromatography gradient. The validation plan required that this novel second-generation assay demonstrate analytical performance equivalent to that of the previously validated [5] first-generation method, prior to being placed into clinical practice. Validation of the second-generation clinical diagnostic assay for the IBP4 and SHBG signature proteins, which measured precision, alternative method comparison, linearity, limits of quantitation, carryover, analytical specificity, interference, and stability [14], is described herein.

Validation samples and quality control material
This work used samples derived from the Proteomic Assessment of Preterm Risk (PAPR) study, which enrolled 5501 pregnant women from 11 clinical sites and is broadly representative of the US population [15].Maternal serum was collected from patients between 17 and 28 weeks of gestation for the purposes of developing a second trimester serum-based test predictive of sPTB risk.Women with singleton pregnancies, aged 18 to 60 years, receiving prenatal care and capable of providing consent, were eligible for the study.Subjects pregnant with more than one fetus or those with a known or suspected fetal anomaly were excluded.Of the 5235 women who continued the study, pre-specified exclusions resulted in the removal of 326 subjects with medically-indicated preterm birth, 109 subjects who had been treated with progesterone, and 28 samples with pre-analytical issues (e.g., hemolysis).Of the remaining samples, 413 were used for analytical validation.The PAPR study followed the Good Clinical Practice guidelines issued by the International Conference on Harmonisation [16].
In addition to the PAPR samples, two pools of serum from female donors were purchased from Golden West Biologicals (Temecula, CA). One serum pool (QC1) was created from equal volumes of serum from 10 non-pregnant female donors. A second serum pool (QC2) was created from equal volumes of serum from 21 pregnant donors. The QC pools were a critical part of quality control for each batch, allowing for long-term trend analysis of assay performance. Non-pooled serum from individual donors was also purchased from Golden West Biologicals. Both the pooled materials and the single-donor samples consisted of a large number of identical single-use aliquots that were stored in the same conditions as clinical samples (−80 °C). The pooled materials and single-donor samples were collected using the same protocol used in the PAPR study.
Phosphate buffered saline served as a negative control sample and was purchased from Life Technologies (Carlsbad, CA).

Reagents and consumables
Protein Depletion Buffer A and B were supplied by Agilent Technologies (Santa Clara, CA). Acetonitrile and water were LC-MS grade and were purchased from ThermoFisher (Hampton, NH). Methanol was HPLC grade and was purchased from JT Baker. Dithiothreitol (DTT), iodoacetamide (IAA) and trifluoroacetic acid (TFA) were purchased from Sigma (St. Louis, MO). A custom order of solubilized Trypsin Gold was purchased from Promega (Madison, WI). Formic acid was purchased from ThermoFisher (Hampton, NH).
High purity stable isotope standards (SIS) were purchased from New England Peptides (Gardner, MA). The SIS were checked for a minimum purity of 95% by HPLC. A mass spectral analysis was also performed and yielded an acceptable measured mass within 0.1% of the calculated average molecular weight. An amino acid analysis was performed to verify the amino acid composition to within 20% of the theoretical concentration for each amino acid and to determine the molar yield of each peptide. Carboxy-terminal lysine and arginine residues of SIS peptides were uniformly labeled with 13C and 15N, resulting in a +8 or +10 amu mass shift, respectively. Individually synthesized peptides were used to create a pool of high purity SIS containing heavy-labeled analogues of signature and quality control peptides monitored in the PreTRM® assay. SIS peptides were used at a final concentration that approximated the abundance of the endogenous peptides, except in the linearity and limits of quantitation studies.

Initial workflow and assay development
Early assay development involved the creation of an MRM assay to perform relative determination of abundances of 242 proteins chosen because of their association in the literature with preterm birth and other pregnancy complications.Public and proprietary databases were used to identify up to five proteotypic peptides per protein based on previous detection in blood.The assay was further refined by supplementation with novel discoveries, recurrent literature curation, and trimming of the assay size using peptide correlation and analytical performance.Mass spectrometer settings were optimized for each peptide to provide the highest signal-to-noise, the highest precision, or a signal free of chromatographic anomalies.Work flow optimization considered both serum and plasma blood fractions, protein depletion strategies and materials, tryptic digestion conditions, solid phase extraction-based desalting methods, LC-MS/MS gradient separation, and HPLC column performance.Initial assay development utilized separate injections of synthetic unlabeled peptides to confirm analyte identity.Later phases of assay development utilized sample fortification with high purity heavy-labeled peptide standards to confirm analytical specificity.

Batch design
Except for the batch runs to determine linearity and limits of quantitation, a standardized plate batch design was adopted and used for this work. The batch design utilized replicates of the two serum quality control pools (i.e., QC1 and QC2). Two terminal replicates of phosphate buffered saline, named Process Blanks, served to monitor for routine carryover and cross contamination. A maximum of 24 clinical samples were assayed in a single batch, as shown in Table 1, representing a standardized batch design; run order was top to bottom, column 1 to column 4.

Protein depletion chromatography
Clinical serum samples and pooled serum quality control materials were retrieved from −80 °C storage and allowed to thaw while resting on crushed wet ice. Once thawed and mixed by inverting three times, 50 µL of a serum sample was added to a well of a 0.2 µm polypropylene filter plate (Captiva Filter Plate, Agilent Technologies, Santa Clara, CA) containing 150 µL of Protein Depletion Buffer A. The plate was mixed on a plate vortexer for approximately 30 s and then vacuum filtered into a polypropylene deep 96-well plate.
The sample filtrates were depleted of 14 high abundance proteins using a protein depletion system based on Agilent 1260 liquid chromatography instrumentation running ChemStation software (B.04.03 SP2). Sample filtrates in the deep well plate were maintained at 4 °C in an autosampler. A 100 µL injection of the sample filtrate was fractionated on a protein depletion column (Agilent MARS14, 4.6 × 100 mm) at 23 °C and a flow rate of 0.125 mL/min of Protein Depletion Buffer A. The flow-through fraction containing signature proteins was detected with a peak at 11.0 min by UV absorption at 280 nm and was collected in a polypropylene deep 96-well plate controlled at 4 °C (fraction collection time was 8.00-14.00 min, approximate volume of fraction was 0.8 mL). After the flow-through fraction was collected, the bound fraction was eluted to waste by pumping Protein Depletion Buffer B through the depletion column at 1 mL/min. Once the bound fraction was eluted from the column, as confirmed by detection of UV absorption at 20.5 min, the column was re-equilibrated with Protein Depletion Buffer A at 1 mL/min for 11 min. The flow rate was reduced to 0.125 mL/min of Protein Depletion Buffer A to condition the depletion column for the next cycle. The cycle time was 36 min.
Protein depletion was monitored by calculating the ratio of the chromatographic peak area of the proteins of interest (Peak A) to the chromatographic peak area of the depleted proteins (Peak B). A representative chromatogram is shown in Fig. 1.

Enzymatic digest
Serum samples that were depleted of high abundance proteins underwent a trypsin digestion to generate proteotypic peptides that served as surrogate analytes for the proteins. The depleted samples were fortified with acetonitrile to a final concentration of 5%. Dithiothreitol (DTT), from a single-use aliquot stored at −20 °C, was added to each sample to a concentration of 5 mM and mixed by pipetting; the samples were then incubated at 60 °C for 20 min in a water bath. After the DTT incubation, the sample plate was equilibrated to room temperature (the lab temperature is set to 23 °C year-round). Iodoacetamide (IAA), from a single-use aliquot stored at −20 °C, was added to a concentration of 5 mM, mixed by pipetting, and incubated at room temperature for 30 min in the dark. After the IAA incubation, trypsin was added to a concentration of 6.25 µg/mL and mixed by pipetting (approximate enzyme:substrate of 1:40). The samples were then incubated at 34.5 °C for 17 h. Digested samples were fortified with a pool of high purity stable isotope standards (SIS) [17,18]. The samples were mixed and then split into two equal 0.345 mL fractions. One fraction continued through the remainder of the work flow and the second was held at −80 °C in reserve for repeat analysis, as needed.

Solid phase extraction-based desalting
Samples were acidified by adding TFA to a final concentration of 1.5% (v/v), followed by solid phase extraction to remove buffer salts. A 96-well solid phase extraction plate (Empore, 3M, St. Paul, MN) fitted to a vacuum manifold was conditioned with 0.1 mL of methanol, then 0.2 mL of water, then 0.3 mL of 5% acetonitrile in water (v/v). The entire volume of the sample was added to the conditioned plate. The sample was washed with 0.3 mL of 5% acetonitrile in water (v/v). Peptides were eluted with 0.25 mL of 95% acetonitrile in water (v/v). The eluate was frozen at −80 °C and then lyophilized to dryness.

LC-MS/MS analysis
Lyophilized desalted samples were reconstituted in 25 µL of a 2% acetonitrile, 0.1% formic acid in water solution (v/v), which matched the initial LC gradient conditions. An injection volume of 10 µL of sample was separated on an Agilent 1290 UPLC system fitted with a Poroshell EC-C18 (2.1 × 50 mm, 2.7 µm) reversed-phase LC column maintained at 50 °C. The mobile phases used in the separation were: (A) 0.1% formic acid in water (v/v), and (B) 0.1% formic acid in acetonitrile (v/v). The flow rate was maintained at 0.4 mL/min. The LC gradient used the following linear steps (time, %B): 0.00 min, 2%; 0.25 min, 2%; 4.00 min, 9%; 12.50 min, 35%; 13.00 min, 95%; 14.00 min, 95%; 14.50 min, 2%; 15.00 min, end. There was a 2.50 min equilibration at initial conditions prior to the next injection. The LC eluent was diverted to waste for the initial 1.8 min and after 13.4 min post-injection to minimize MS source fouling. The cycle time between sample injections was 17.50 min. The reconstituted samples were maintained at 4 °C in the autosampler while awaiting analysis.
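The gradient program above can be written as a small lookup table, which is convenient for checking the mobile phase composition at any retention time. The following is only an illustrative sketch: the function names are hypothetical, the 15.00 min "end" step is assumed to hold at 2% B, and the inclusive handling of the divert-valve boundaries is an assumption.

```python
# Gradient steps taken from the text: (time in min, %B).
# The final 15.00 min "end" step is ASSUMED to remain at 2% B.
GRADIENT = [
    (0.00, 2.0), (0.25, 2.0), (4.00, 9.0), (12.50, 35.0),
    (13.00, 95.0), (14.00, 95.0), (14.50, 2.0), (15.00, 2.0),
]

def percent_b(t_min):
    """Return %B at t_min by linear interpolation between gradient steps."""
    if not GRADIENT[0][0] <= t_min <= GRADIENT[-1][0]:
        raise ValueError("time outside the 0-15 min gradient program")
    for (t0, b0), (t1, b1) in zip(GRADIENT, GRADIENT[1:]):
        if t0 <= t_min <= t1:
            return b0 + (b1 - b0) * (t_min - t0) / (t1 - t0)

def eluent_to_ms(t_min):
    """True when the eluent is sent to the MS source rather than to waste.

    The text diverts the eluent for the first 1.8 min and after 13.4 min;
    treating both boundaries as inclusive is an assumption.
    """
    return 1.8 <= t_min <= 13.4

if __name__ == "__main__":
    for t in (0.1, 4.0, 8.25, 13.5):
        print(f"{t:5.2f} min: {percent_b(t):5.1f} %B, to MS: {eluent_to_ms(t)}")
```

Together with the 2.50 min re-equilibration, the 15.00 min program accounts for the stated 17.50 min injection-to-injection cycle time.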
Peptides were monitored on an Agilent 6495 triple quadrupole mass spectrometer operating with a Jet Stream electrospray ionization source in positive ion mode (MassHunter B.08.00).Unit resolution was used for both the first and second mass filtering quadrupoles.Source parameters were: gas temperature at 150 °C, gas flow at 11 L/min, nebulizer pressure at 20 psi, sheath gas temperature at 350 °C, sheath gas flow at 11 L/min, capillary at 4000 V, high pressure funnel RF at 200 V, and low pressure funnel RF at 110 V.
Data was acquired with a dynamic MRM method using a 1 min retention time window for the signature analytes and 0.5 min window for all other peptides.Each peptide was monitored with two correlating transitions.In addition to the two signature peptides (Table 2), the method monitored a panel of quality control peptides, as well as a large number of peptides that are of research and development interest and not required for calculating sPTB risk.The total number of transitions monitored was 442 with the method yielding average dwell times for the IBP4 and SHBG signature peptides of 42 and 50 ms, respectively, and a maximum number of concurrent transitions of 60 (Appendix A).
Chromatographic peak areas of the quantitative transition were determined using MassHunter Quantitative Analysis software (Agilent, B.07.01). Representative chromatograms are shown in Fig. 2.

Risk of spontaneous preterm birth calculation
The laboratory information management system (Nautilus LIMS v9.2, ThermoFisher) calculated the response ratio for each signature peptide according to Eqs. (1) and (2) by dividing the chromatographic area counts of the signature peptide ($a_1$ and $a_2$) by the chromatographic area counts of the corresponding SIS ($b_1$ and $b_2$) for IBP4 and SHBG, respectively.
\[ R_{\mathrm{IBP4}} = \frac{a_1}{b_1} \tag{1} \]
\[ R_{\mathrm{SHBG}} = \frac{a_2}{b_2} \tag{2} \]
The LIMS subsequently determined the ratio of these peptide response ratios, which corresponded to the relative change in serum levels of the two marker proteins. The proteomic score, S, is the natural logarithm of the ratio of relative response ratios of IBP4 (numerator) to SHBG (denominator):
\[ S = \ln\left(\frac{R_{\mathrm{IBP4}}}{R_{\mathrm{SHBG}}}\right) = \ln\left(\frac{a_1/b_1}{a_2/b_2}\right) \tag{3} \]
The risk of delivery before 37 weeks of gestation is reported as a Bayesian posterior probability based on the patient's individual proteomic score, S [5]. The algorithm utilizes the relative abundances of the signature proteins and the patient's body mass index (BMI) to generate a qualitative risk score [5].
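The response ratio and proteomic score calculations described in this section can be sketched in a few lines. This is an illustration only: the function and argument names are hypothetical, and the conversion of S into a Bayesian posterior probability (including the BMI term) is described in [5] and is not reproduced here.

```python
import math

def response_ratio(endogenous_area, sis_area):
    """Peak area of the endogenous peptide divided by that of its SIS."""
    if endogenous_area <= 0 or sis_area <= 0:
        raise ValueError("peak areas must be positive")
    return endogenous_area / sis_area

def proteomic_score(ibp4_area, ibp4_sis_area, shbg_area, shbg_sis_area):
    """Natural log of the IBP4 response ratio over the SHBG response ratio."""
    r_ibp4 = response_ratio(ibp4_area, ibp4_sis_area)
    r_shbg = response_ratio(shbg_area, shbg_sis_area)
    return math.log(r_ibp4 / r_shbg)

if __name__ == "__main__":
    # Made-up area counts, for illustration only.
    print(proteomic_score(200.0, 100.0, 50.0, 100.0))
```

Because S depends only on a ratio of ratios, any factor that scales both endogenous peptide responses equally cancels out of the score.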

Precision
The precision of the assay was assessed using individual runs of the same sample sets on multiple instrumentation clusters, multiple lots of reagents, and multiple operators. Intra-batch precision (i.e., repeatability) and inter-batch precision (i.e., reproducibility) were determined by running a sample set (n = 6) repeatedly in twenty-one randomly ordered batches over a span of 34 days. The sample set comprised the samples shown in Table 3, with four replicates of each sample being run in each batch.
Precision acceptability was determined using the analyte response ratio %CV. Appropriate acceptance criteria were established by modeling changes in the response ratio of signature analytes and determining if those changes had a clinically relevant impact on the calculated risk of sPTB.
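As a rough illustration of the precision metric, the sketch below computes the response ratio %CV within each batch and across batches for one sample. The data layout and function names are hypothetical, and pooling all replicates for the inter-batch %CV is an assumption; the source does not state whether a variance-components model was used instead.

```python
import statistics

def percent_cv(values):
    """Coefficient of variation in percent (sample standard deviation / mean)."""
    return 100.0 * statistics.stdev(values) / statistics.mean(values)

def intra_batch_cvs(batches):
    """batches maps a batch ID to the replicate response ratios of one sample."""
    return {batch_id: percent_cv(reps) for batch_id, reps in batches.items()}

def inter_batch_cv(batches):
    """%CV over all replicates from all batches (pooling is an ASSUMPTION)."""
    pooled = [value for reps in batches.values() for value in reps]
    return percent_cv(pooled)

if __name__ == "__main__":
    # Made-up response ratios: four replicates in each of two batches.
    demo = {"batch01": [0.98, 1.02, 1.00, 1.01], "batch02": [1.05, 1.07, 1.03, 1.06]}
    print(intra_batch_cvs(demo))
    print(inter_batch_cv(demo))
```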

Alternative method comparison
Results of accuracy relative to clinical outcome have previously been published [5] using data that was generated by the first-generation validated method (data not shown). The analytical accuracy of the second-generation PreTRM assay (with Agilent 6495 detection and reduced LC run time) was assessed using an alternative method comparison to the first-generation method (with Agilent 6490 detection and a longer LC run time). The alternative method comparison involved assaying 413 individual PAPR samples in 21 batches with each batch split into two equivalent fractions after trypsin digestion (Section 2.2.4). One set of fractions was analyzed on two Agilent 6490s running the longer LC method. The other set was analyzed on three Agilent 6495s running the shorter LC method. Agilent 6495 derived signature analyte response ratios and proteomic scores versus the same data derived from the Agilent 6490s were compared using linear regression.
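A method comparison of this kind reduces to an ordinary least-squares fit of the second-generation results against the first-generation results for the same samples. The sketch below is a minimal pure-Python version with hypothetical names; the source does not specify the regression software or whether a different model (e.g., Deming regression) was considered, so ordinary least squares is an assumption.

```python
def linear_fit(x, y):
    """Ordinary least-squares fit of y on x; returns (slope, intercept, r_squared)."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("x and y must have equal length of at least 2")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    syy = sum((yi - mean_y) ** 2 for yi in y)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    if sxx == 0 or syy == 0:
        raise ValueError("x and y must each contain at least two distinct values")
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    r_squared = (sxy * sxy) / (sxx * syy)
    return slope, intercept, r_squared

if __name__ == "__main__":
    # Made-up proteomic scores for the same samples on the two platforms.
    first_gen = [-1.20, -0.80, -0.35, 0.10, 0.42]
    second_gen = [-1.18, -0.83, -0.30, 0.08, 0.45]
    print(linear_fit(first_gen, second_gen))
```

A slope near 1, an intercept near 0, and a high coefficient of determination would all point toward equivalent results between the two methods.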

Linearity and limits of quantitation
To complete the linearity and limits of quantitation experiments, a sufficient number of replicates of the healthy pregnant donor serum pool were processed through the trypsin digestion step and then pooled. The pool was fortified with high concentrations of SIS peptides and then serially diluted with additional volumes of the pooled, partially-processed serum sample. The replicates, n = 10 for each concentration level, were then continued through the remainder of the work flow and submitted for LC-MS/MS analysis. The resulting samples maintained a constant concentration of endogenous analytes and a wide range of concentrations of SIS. A reversed response ratio was calculated by dividing the chromatographic peak areas of the SIS response by the chromatographic peak areas of the endogenous peptide. Linearity and limits of quantitation were determined on three Agilent 6495 LC-MS/MS systems.

Analytical specificity
The ability of the assay to consistently measure the two signature analytes in maternal serum was assessed using transition ratios. The mass spectrometric method had two mass transitions for each analyte. If a serum sample contained an endogenous or exogenous substance that interfered with the detection of a transition, the ratio of the responses from the two transitions was abnormal. The transition ratio was calculated using Eq. (4).
\[ \text{Transition Ratio} = \frac{\text{Chromatographic Peak Area of Qualitative Transition}}{\text{Chromatographic Peak Area of Quantitative Transition}} \tag{4} \]

Endogenous interferent testing
The potential for common endogenous interferents to impact the ability of the assay to measure either of the signature peptides was evaluated by fortifying an intended use sample (QC2) with high concentrations of triglyceride-rich lipoprotein, hemolysate, protein (i.e., albumin and immunoglobulins), conjugated bilirubin, or unconjugated bilirubin. A commercially sourced endogenous interferent kit (Assurance Interference Test Kit, Sun Diagnostics, New Gloucester, ME) was used to fortify QC2 to create the test pool. The test pool was compared to a control pool that was created using QC2 diluted with the solvent system used to prepare the interferent stock. Equal numbers of test and control pool replicates (n = 7) were assayed using the complete work flow, and the proteomic score was calculated for all replicates. A two-tailed homoscedastic t-test was used to determine if the p-value of the proteomic score differences between the test and control pools exceeded a predefined ≥0.05 acceptance criteria.

Analyte stability
Individual serum samples (n = 12, Golden West Biologicals) were subjected to multiple freeze/thaw cycles to determine the impact on signature analytes as interpreted by the proteomic score. Serum samples from twelve healthy pregnant donors, received and stored at −80 °C, were subjected to three freeze/thaw cycles. A thaw cycle involved the retrieval of the sample from −80 °C storage and incubation at ambient temperature (i.e.,

Intra-Batch and Inter-Batch precision
The assay was designed to utilize relative abundances of IBP4 and SHBG to assign a risk for sPTB. The lowest reported risk of sPTB is <7.3%, which represents the prevalence rate of sPTB in the United States based on the data and methodology at the time of clinical validation [19]. The highest reported risk is >60% [5]. A confidence interval around risk of sPTB was generated during clinical validation [5]. To determine acceptable precision performance, two of the precision samples were used to model analyte response variance up to ±10%. One of the precision samples (15-6036, Table 3) was from a subject who had a risk of sPTB at the US prevalence rate (i.e., <7.3%) as previously determined multiple times (n = 9). The other precision sample (15-6138, Table 3) was from a subject who had a risk of sPTB that was over three times the US sPTB prevalence rate (i.e., 24%), also previously measured multiple times (n = 3). Surface plots for these two samples were generated with x- and y-axes representing modeled IBP4 and SHBG response ratios, respectively (Fig. 3). The sample mean analyte response ratio was set at zero on each axis (center of surface plots). The risk of sPTB, plotted on the z-axis, was calculated using combinations of modeled analyte response ratios and assessed as to whether the modeled values were outside of the risk confidence interval determined in clinical validation (dashed lines along the z-axis plane, [5]). For the sample at the US sPTB prevalence rate, all combinations of modeled analyte response ratios resulted in risks of sPTB that were within the established risk confidence interval (i.e., no risk of a false positive result for a lower risk sample). For the elevated risk sample, there was a change in the calculated risk that exceeded the risk confidence interval when the IBP4 response ratio was high in conjunction with a low SHBG response ratio, as indicated in the red portion of the plot. While such a situation cannot absolutely be ruled out, the most common analytical errors (e.g., digestion, sample or SIS handling) would change both signature analyte responses in the same direction and would be corrected by the relative protein response used to calculate the proteomic score. Nevertheless, the change in determined risk of sPTB did not affect the predictive value of the result; the elevated risk status was maintained (i.e., no risk of a false negative result for an elevated risk sample).
Based on the precision modeling results, a 20% CV for each signature analyte response ratio was deemed to not significantly affect the risk of sPTB determination. Therefore, 20% CV for analyte response ratio was used as the acceptance criterion for intra-batch precision and inter-batch precision.
After repeated analysis of six precision samples in 21 batches, four replicates of each sample per batch, the response ratio for each signature analyte was assessed for agreeable results. For intra-batch precision, the established acceptance criteria were that for each of the six precision samples tested, the signature analytes had response ratio %CVs within a batch that were ≤20% in 20 of the 21 batches (≥95%). Each of the precision samples had at least 20 of 21 batches with IBP4 and SHBG response ratio %CVs that were ≤20% (Table 4). Two data points were dropped from the calculations because of a sample handling error that resulted in the samples being combined into a single well. A third data point was dropped because of a trypsin digestion anomaly.
The same 21 batches were used to assess inter-batch precision. For inter-batch precision, the established acceptance criteria were that for each of the six precision samples, the signature analytes had response ratio %CVs across all 21 batches that were ≤20%. Each of the precision samples had IBP4 and SHBG response ratio %CVs that were ≤20% (Table 5).

Alternative method comparison
PAPR samples (n = 413) were used to verify that the second-generation alternative analytical method provided results that were accurate, relative to the first-generation reference method. Linear regression of each of the analyte response ratios, as well as the proteomic scores derived from the two models of LC-MS/MS systems, showed excellent correlation between the two methods (Figs. 4 and 5), providing evidence that the alternative method maintains accuracy relative to the reference method and, consequently, yields equivalent test results. The assaying of the 413 samples occurred over a 43-day period.

Linearity
Linearity samples were generated by maintaining the endogenous analyte response and fortifying the samples with increasing amounts of SIS. The SIS served as a surrogate for the detection of the endogenous analyte. The endogenous analyte response was kept constant and functioned to normalize the SIS response. The response ratios used for linearity determination were generated by dividing the SIS response by the endogenous analyte response, the reverse of normal usage. This method, referred to as a reverse calibration curve [20], was used as a tool to determine if the detector generated a proportional response for the two signature analytes across a range of analyte abundances that are clinically relevant. The results from one of three detectors tested are shown in Fig. 6 (the results from the other two detectors are shown in the supplementary data section). Both signature analytes had a linear detector response across a broad range that encompassed the range of responses obtained from a large number of PAPR samples (n = 413), which represents the intended use population (dashed lines). The three LC-MS/MS instruments tested had R² values that were >0.99 for IBP4 and SHBG.
Fig. 3. Modeling impact of analyte response variability on calculated risk of sPTB.

Limits of quantitation
The lower and upper limits of quantitation were determined utilizing the same sample sets as used for the linearity study, and also the same reverse response ratio methodology. The limits of quantitation were established at the lower and upper extremes of the linear range in which the reversed response ratios maintained an acceptable level of precision of ≤20% [21]. The lower and upper limits of quantitation for the IBP4 peptide, as measured by a reverse response ratio, were 0.040 and 210, respectively (Table 6). The lower and upper limits of quantitation for the SHBG peptide, as measured by a reverse response ratio, were 0.011 and 291, respectively (Table 6). Carryover was below detection limits for IBP4 and was not significant for SHBG.
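One way to picture the limit-of-quantitation determination is to scan the dilution levels from lowest to highest and keep the span over which replicate precision stays acceptable. The sketch below is a simplified illustration with hypothetical names and made-up data: it assumes the 20% figure is an upper bound on %CV, it uses the longest contiguous run of passing levels, and it omits the accompanying linearity check; the authors' exact procedure is not given in the text.

```python
import statistics

def percent_cv(values):
    """Coefficient of variation in percent (sample standard deviation / mean)."""
    return 100.0 * statistics.stdev(values) / statistics.mean(values)

def limits_of_quantitation(levels, max_cv=20.0):
    """Return (LLOQ, ULOQ) as mean reverse response ratios, or None.

    levels: replicate lists of reverse response ratios, ordered from the
    lowest to the highest SIS amount. The limits are taken from the first
    and last level of the longest contiguous run with %CV <= max_cv.
    """
    best = None   # (start_index, end_index) of the best run so far
    start = None  # start index of the run currently being scanned
    for i, reps in enumerate(list(levels) + [None]):  # None closes the last run
        passes = reps is not None and percent_cv(reps) <= max_cv
        if passes and start is None:
            start = i
        elif not passes and start is not None:
            if best is None or (i - 1 - start) > (best[1] - best[0]):
                best = (start, i - 1)
            start = None
    if best is None:
        return None
    return statistics.mean(levels[best[0]]), statistics.mean(levels[best[1]])

if __name__ == "__main__":
    demo = [
        [0.01, 0.03, 0.02],       # imprecise: below the lower limit
        [0.040, 0.041, 0.039],
        [1.0, 1.1, 0.9],
        [200.0, 210.0, 190.0],
        [300.0, 500.0, 100.0],    # imprecise: above the upper limit
    ]
    print(limits_of_quantitation(demo))
```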

Analytical specificity
The analytical specificity of the assay was attained from the high specificity intrinsic in LC-MS/MS-based detection.The retention time of each peptide was initially determined empirically on the analytical column.To confirm the measured signal was from the expected endogenous analyte, a heavy-labeled analogue with identical chemical properties to the endogenous peptide, but with a discernable signal because of the heavy labeling, was run simultaneously.At each peptide's determined retention time, the mass spectrometer was programmed to monitor two parentproduct ion m/z transitions for each peptide and the supporting heavy-labeled analogue.The signal ratio of the two transitions (i.e., transition ratio) was calculated.Retention time, m/z of parent and product ions, and matching transition ratios between the endogenous peptide analyte and the exogenous heavy-labeled analogue were used to confirm each signal was, indeed, from the expected endogenous analyte.
The signature peptide transition ratios measured in the 413 samples assayed over a 43-day period as part of the alternative method comparison did not have chromatographic anomalies (e.g., change in peak shape, significant increases in noise, split peaks), and only one sample had a transition ratio that was >±30% of the mean transition ratio (Fig. 7, [22]).
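The ±30% transition ratio check described above can be expressed as a short screening routine. This is a hedged sketch: the names are hypothetical, the ratio is taken as the qualitative transition area divided by the quantitative transition area, and the mean is computed over all supplied samples, which may differ from how the reference mean was established in practice.

```python
import statistics

def transition_ratio(qualitative_area, quantitative_area):
    """Peak area of the qualitative transition over the quantitative one."""
    return qualitative_area / quantitative_area

def flag_transition_ratios(ratios, tolerance=0.30):
    """Return indices of samples whose ratio deviates from the mean by more
    than the given fraction (0.30 corresponds to +/-30%)."""
    mean_ratio = statistics.mean(ratios)
    limit = tolerance * mean_ratio
    return [i for i, r in enumerate(ratios) if abs(r - mean_ratio) > limit]

if __name__ == "__main__":
    # Made-up ratios: nine consistent samples and one with an interference.
    demo = [0.50] * 9 + [1.00]
    print(flag_transition_ratios(demo))
```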

Endogenous interferent testing
At the concentrations tested, none of the endogenous interferents tested had a significant impact on the proteomic score calculated (Table 7). The concentrations of endogenous interferents tested were intentionally above the concentrations that would be expected in clinical samples to confirm robustness.

Analyte stability
The p-value of a paired t-test comparing the means of samples subjected to multiple freeze/thaw cycles with the means of the same samples that were not stressed was 0.23. Samples stressed with three freeze/thaw cycles were not significantly different from the non-stressed samples; therefore, the signature analytes were considered stable for up to three freeze/thaw cycles. The stability of the signature analytes in stored samples was established through the consistent performance of the quality control materials stored for >2 years at −80 °C.

Quality control and long-term performance
The assay described in this study contains many steps towards the preparation of a sample that is assayed by LC-MS/MS.Each step could have been affected by environmental, procedural, reagent quality and stability, and equipment performance issues.Accordingly, several QC metrics were developed and adopted to monitor the performance of the assay for individual batches of samples, and across multiple batches, in order to assess long-term performance.Acceptability of any variances detected was judged according to the commonly applied Westgard rules [23].
The quality of a batch was determined using multiple replicates of two pooled serum quality control samples (i.e., QC1 and QC2) which bookended the clinical samples.The performance of QC1 and QC2 during depletion was monitored through the ratio of the chromatographic peak areas of the flow-through and depleted fractions in the 21 batches that were used for alternative method comparison, as shown in Fig. 8.
The protein depletion column depleted 98.0-99.9% of 14 proteins with high concentrations in human blood. Since the assay measured the amount of protein that was not depleted in a serum sample, a small change in depletion efficiency would have resulted in large changes in response ratios. For example, a sample that had 99.0% of apolipoprotein A1 (APOA1) depleted would have retained 1.0% of the protein and so would have had a nearly ten-fold higher response than a sample that had 99.9% depleted (0.1% retained). A peptide from each of the 14 depleted proteins was assessed to support the efficiency of protein depletion. If the response for a depleted protein exceeded a protein-specific threshold (i.e., <96-98% depletion efficiency), the sample and/or batch was subjected to a supervisory quality review. The responses for the depleted proteins measured in the 413 clinical samples used in the alternative method comparison are shown in Fig. 9. An occasional sample had reduced depletion efficiency of APOA1 and transthyretin (TTHY). During endogenous interferent testing, samples burdened with high concentrations of triglyceride-rich lipoprotein had APOA1 concentrations that were higher than those of the less efficiently depleted samples, without a significant impact on assay performance.
Two peptides for beta 2-microglobulin with different trypsin kinetic profiles were monitored to support the quality of trypsin digestion. If the kinetics of trypsin digestion were changed for a sample or a batch, then the ratio of the chromatographic peak areas for these two peptides would fall outside of the acceptance range and the individual sample, or the batch, would be failed. The digestion quality control results for the 413 samples assayed as part of alternative method comparison are shown in Fig. 10. In this figure, the x-axis represents the alternative method comparison samples arranged by analysis date, and the y-axis represents the ratio obtained from the chromatographic peak areas of the two beta 2-microglobulin peptides (amino acid sequences VNHVTLSQPK and VEHSDLSFSK). The lower and upper acceptance thresholds are each marked with a dashed line.
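The arithmetic behind the depletion example, and the form of the digestion check, can be illustrated with a minimal Python sketch. The function names, the choice of which peptide area is the numerator, and the `lower`/`upper` acceptance limits are assumptions made for illustration; the text does not state the actual acceptance range.

```python
def residual_fraction(depletion_pct: float) -> float:
    """Fraction of a protein that remains in the sample after depletion."""
    return 1.0 - depletion_pct / 100.0


def fold_change(depletion_a_pct: float, depletion_b_pct: float) -> float:
    """Residual protein in sample A relative to sample B.

    Because the assay measures what is NOT depleted, the response scales
    with the residual fraction rather than with the depletion efficiency.
    """
    return residual_fraction(depletion_a_pct) / residual_fraction(depletion_b_pct)


def digestion_ratio_ok(area_peptide_1: float, area_peptide_2: float,
                       lower: float, upper: float) -> bool:
    """Check that the peak-area ratio of the two beta 2-microglobulin
    peptides falls inside an acceptance range.

    ASSUMPTION: `lower` and `upper` are hypothetical limits supplied by the
    caller; the numerator/denominator order is also assumed.
    """
    ratio = area_peptide_1 / area_peptide_2
    return lower <= ratio <= upper


# 99.0% vs 99.9% depletion: 1.0% residual vs 0.1% residual
print(fold_change(99.0, 99.9))  # approximately 10
```

The sketch makes explicit why a change of less than one percentage point in depletion efficiency matters: the measured quantity is the residual fraction, which changes by an order of magnitude.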
The quality of the assay was also determined through interrogation of the quality control samples. The proteomic score for each individual quality control sample and the batch mean for a quality control sample type were acceptable if they were within 2.5 standard deviations of the historical mean. If a quality control was ≥2.5 standard deviations from the historical mean, then a cascade of Westgard rules [23] was applied: an additional quality control sample that was ≥2.5 standard deviations from the historical mean resulted in batch failure; a maximum range of proteomic scores for a quality control sample type that was ≥5 standard deviations from the historical mean resulted in batch failure; ten consecutive replicates of a quality control sample type on one side of the historical mean [...]
The performance of QC1 and QC2 during the analysis of the 413 clinical samples that comprised the alternative method comparison is shown in Fig. 12. The proteomic scores for the two quality control samples were within 2.5 SD of the historical mean, except for one QC1 replicate, providing support for the quality of the clinical samples run within those batches.
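The rule cascade described above can be sketched in Python as follows. This is an illustrative reading of the text, not the authors' implementation: the function and parameter names are hypothetical, the batch-mean check is omitted for brevity, the range rule is interpreted as (maximum minus minimum) in units of the historical SD, and the consequence of the ten-consecutive rule is assumed to be batch failure because the source sentence is incomplete.

```python
from typing import List, Sequence


def z_scores(scores: Sequence[float], hist_mean: float, hist_sd: float) -> List[float]:
    """Express each proteomic score in SD units from the historical mean."""
    return [(s - hist_mean) / hist_sd for s in scores]


def batch_passes(scores: Sequence[float], hist_mean: float, hist_sd: float,
                 prior_z: Sequence[float] = ()) -> bool:
    """Evaluate replicates of one QC sample type for a single batch.

    `prior_z` holds z-scores of earlier replicates (oldest first) so that
    the consecutive-replicate rule can look across batches.
    """
    z = z_scores(scores, hist_mean, hist_sd)
    outliers = [v for v in z if abs(v) >= 2.5]

    # No replicate at or beyond 2.5 SD: acceptable, no cascade needed.
    if not outliers:
        return True

    # Cascade rule 1: an additional replicate at or beyond 2.5 SD fails the batch.
    if len(outliers) >= 2:
        return False

    # Cascade rule 2: a range of 5 SD or more fails the batch.
    # ASSUMPTION: range is taken as max minus min, in SD units.
    if max(z) - min(z) >= 5.0:
        return False

    # Cascade rule 3: ten consecutive replicates on one side of the mean.
    # ASSUMPTION: treated as batch failure; the source text is truncated here.
    recent = (list(prior_z) + z)[-10:]
    if len(recent) == 10 and (all(v > 0 for v in recent) or all(v < 0 for v in recent)):
        return False

    return True
```

Under this reading, a single replicate beyond 2.5 SD, such as the one QC1 replicate noted above, does not by itself fail a batch; it only triggers the additional checks.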

Conclusions
A targeted proteomic workflow coupled with LC-MS/MS-based detection of peptides for two protein biomarkers of sPTB was validated. The validated method has inter- and intra-batch precision that generates agreeable results over time and across multiple sets of instrumentation. The validated method is a novel second-generation method that produces accurate results relative to the reference first-generation method. The detector response for each of the signature peptides is linear across several orders of magnitude and encompasses the range of responses expected for the intended population. Limits of quantitation were determined, defining a range of analyte responses with acceptable performance. The level of analyte carryover in the method is insignificant. The method is resilient to the differences in sample matrix, high levels of endogenous interferents, and stress incurred during repeated freeze/thaw cycles and long-term storage. Lastly, quality metrics developed to monitor the assay enabled the assessment of individual batch process quality and long-term trend analyses. Together, these results demonstrate the acceptable performance and robustness of an analytical method that provides relative abundances of maternal serum proteins needed to generate a qualitative individualized determination of a woman's risk of sPTB. As such, the two analytical methods yield equivalent test results.
Additional opportunities exist for improvement of assay efficiency. These include a further reduction of sample complexity through affinity enrichment techniques, at either the protein or peptide level. Such simplification would potentially allow for even shorter LC-MS/MS run times with a commensurate increase in throughput.
The authors would like to acknowledge the contributions made during the initial phase of this work by Pascal Croteau, Jeff Flick, Laura McIntosh, Warren Porter, and Michael Schrim.

Carryover: Materials from a sample that remain in a system and are introduced into subsequent samples
IBP4: Insulin-like growth factor-binding protein 4
LC-MS/MS: An analytical chemistry technique that combines the physical separation capabilities of liquid chromatography (LC) with the mass analysis capabilities of tandem mass spectrometry (MS/MS)
Limit of Quantitation: A lower or upper limit of analyte abundance where expected assay performance is maintained
Linearity: The ability of an assay to generate results that are proportional to a given range of relative abundances of the analyte(s) being measured
Precision: The extent to which an assay can generate agreeable results on a sample set over time using multiple instrumentation clusters, multiple lots of reagents, and multiple operators
Protein Depletion: The specific depletion of high abundance proteins in serum using an immuno-affinity material
Proteotypic Peptide: A peptide that is unique to a protein
PTB: Preterm birth, any delivery that is at <37 weeks of gestation
Qualitative Transition: The transition used to support the trueness of measurement of the quantitative peak, typically the transition that provides the lower response
Quantitative Transition: The primary transition used for measuring an analyte, typically the transition that provides the higher response
Response Ratio: The ratio generated by dividing the area counts for the quantitative analyte transition by the area counts for the quantitative SIS transition
SHBG: Sex hormone-binding globulin
SIS: Stable isotope standard, used for normalizing the response of the light endogenous analyte
sPTB: Spontaneous preterm birth, a preterm birth that is not iatrogenic (or "indicated")
Transition Ratio: The ratio of the qualitative transition area count divided by the quantitative transition area count, multiplied by 100
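
The two ratio definitions in the list above translate directly into arithmetic. The following Python sketch restates them; the function and argument names are hypothetical and chosen only for illustration.

```python
def response_ratio(analyte_quant_area: float, sis_quant_area: float) -> float:
    """Quantitative analyte transition area divided by the quantitative
    SIS transition area (normalizes the endogenous analyte to the SIS)."""
    return analyte_quant_area / sis_quant_area


def transition_ratio(qual_area: float, quant_area: float) -> float:
    """Qualitative transition area divided by the quantitative transition
    area, multiplied by 100 (i.e., expressed as a percentage)."""
    return qual_area / quant_area * 100.0
```

Because the qualitative transition is typically the one with the lower response, the transition ratio would usually be expected to fall below 100.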

Table 1
Standard batch design.

Table 3
Precision samples.
[...] for approximately one hour. The subsequent freeze cycle was for at least 21 h at −80 °C. A single analytical batch was used to assay the stressed sample and a non-stressed aliquot of the same sample. The proteomic score was calculated and a paired t-test was used to determine if the mean of the 12 stressed samples was significantly different (p value 0.05) from the mean of the 12 unstressed samples.
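The paired comparison of stressed and unstressed aliquots can be sketched with the Python standard library. This is a minimal illustration under stated assumptions: the test is taken to be two-sided at a 0.05 significance level, the decision is made by comparing the t statistic against the tabulated critical value for 11 degrees of freedom (12 pairs), and all names are hypothetical.

```python
import math
from statistics import mean, stdev
from typing import Sequence

# Two-sided critical value of Student's t for alpha = 0.05 and 11 degrees
# of freedom (12 paired samples). ASSUMPTION: a two-sided test at 0.05.
T_CRIT_DF11 = 2.201


def paired_t_statistic(stressed: Sequence[float], unstressed: Sequence[float]) -> float:
    """t statistic for paired samples: mean difference over its standard error."""
    if len(stressed) != len(unstressed):
        raise ValueError("paired samples must have the same length")
    diffs = [s - u for s, u in zip(stressed, unstressed)]
    sd = stdev(diffs)  # sample standard deviation of the paired differences
    if sd == 0:
        raise ValueError("all paired differences are identical; t is undefined")
    return mean(diffs) / (sd / math.sqrt(len(diffs)))


def significantly_different(stressed: Sequence[float], unstressed: Sequence[float],
                            t_crit: float = T_CRIT_DF11) -> bool:
    """True when |t| exceeds the critical value for the chosen alpha and df."""
    return abs(paired_t_statistic(stressed, unstressed)) > t_crit
```

Pairing each stressed aliquot with the unstressed aliquot of the same sample removes between-sample variation from the comparison, so the test is sensitive only to the effect of the stress condition. For a different number of pairs, `t_crit` would need to be replaced with the critical value for the corresponding degrees of freedom.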

Table 5
Inter-batch precision results.

Table 6
Limits of quantitation.