Construction of à la carte QconCAT protein standards for multiplexed quantification of user-specified target proteins

QconCATs are quantitative concatamers for proteomic applications that yield stoichiometric quantities of sets of stable isotope-labelled internal standards. However, changing a QconCAT design, for example, to replace poorly performing peptide standards has been a protracted process. We report a new approach to the assembly and construction of QconCATs, based on synthetic biology precepts of biobricks, making use of loop assembly to construct larger entities from individual biobricks. The basic building block (a Qbrick) is a segment of DNA that encodes two or more quantification peptides for a single protein, readily held in a repository as a library resource. These Qbricks are then assembled in a one tube ligation reaction that enforces the order of assembly, to yield short QconCATs that are useable for small quantification products. However, the DNA context of the short construct also allows a second cycle of loop assembly such that five different short QconCATs can be assembled into a longer QconCAT in a second, single tube ligation. From a library of Qbricks, a bespoke QconCAT can be assembled quickly and efficiently in a form suitable for expression and labelling in vivo or in vitro. We refer to this approach as the ALACAT strategy as it permits à la carte design of quantification standards. ALACAT methodology is a major gain in flexibility of QconCAT implementation as it supports rapid editing and improvement of QconCATs and permits, for example, substitution of one peptide by another.


Background
Absolute quantification of proteins by mass spectrometry is typically based on the use of accurately quantified stable isotope labelled internal standards, usually peptides, as surrogates for the protein quantification. There are many ways to generate these labelled peptides, including direct chemical synthesis (AQUA peptides [1,2]), from full length labelled proteins (PSAQ [3][4][5]) or shorter epitopic fragments (PrEST [6,7]). An additional approach is the use of QconCAT technology [4,8,9]. QconCATs are multiplexed protein standards for proteomics, products of artificial genes designed to encode concatamers of peptides, wherein each peptide or more commonly, a pair of peptides, is chosen to act as mass spectrometric standard(s) for absolute quantification of multiple peptides. The initial publications on QconCATs [10][11][12] have received over 1000 citations and the methodology is well known and embedded in the community.
At a typical size of about 60-70 kDa, a QconCAT encodes approximately 50 tryptic peptides, permitting the quantification of around 25 proteins at a ratio of two peptides per protein. The genes are then expressed as recombinant proteins in bacteria grown in the presence of stable isotope-labelled (SIL) amino acids, usually lysine and arginine, ensuring a single labelling position for every standard tryptic peptide. Because the genes are designed de novo, it is feasible to introduce additional features, such as purification tags, sacrificial peptides to protect the QconCAT from exoproteolysis, and peptide sequences common to each QconCAT that act as a quantification standard, permitting absolute quantification of each standard within the proteomics workflow: in effect, an 'internally standardised standard'. We have demonstrated that the QconCAT approach can be used successfully in large scale proteome quantification studies and have reported the absolute quantification of approx. 1800 proteins in the Saccharomyces cerevisiae proteome [13], by far the largest absolute quantification study conducted to date. Because the peptides derived by complete excision from a QconCAT are present in equal quantities, QconCATs also have utility in the determination of subunit stoichiometry of multiprotein complexes, such as the proteinaceous bacterial metabolosomes for propanediol degradation in Salmonella [14].
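The equimolar release of peptides from a concatamer can be illustrated with a minimal in silico digestion sketch. It assumes idealised trypsin specificity (cleavage C-terminal to K or R, except before P) and complete digestion. The toy concatamer joins the glu fibrinopeptide and c-myc-derived peptides with a made-up third peptide (AVLDTPEK); it is a hypothetical illustration, not a real QconCAT sequence.

```python
import re
from collections import Counter


def tryptic_digest(protein: str) -> list[str]:
    """Idealised complete tryptic digest: cleave after K or R unless followed by P."""
    # Zero-width split points sit after K/R and not before P (needs Python 3.7+).
    return [pep for pep in re.split(r"(?<=[KR])(?!P)", protein) if pep]


# Hypothetical toy concatamer: two common peptides plus an invented third one.
concatamer = "EGVNDNEEGFFSAR" + "LISEEDLGGR" + "AVLDTPEK"
peptides = tryptic_digest(concatamer)
copies = Counter(peptides)

for peptide, n in copies.items():
    print(peptide, n)
```

Each peptide is present once per molecule of concatamer, which is why a completely digested QconCAT yields its peptides in a 1:1 ratio.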
Although QconCATs have been widely adopted, their broader deployment can be challenging. First, QconCAT expression requires skills in molecular biology and facilities for bacterial expression of heterologous proteins. We have addressed this in part through the introduction of cell-free synthesis of QconCATs, which brings added advantages of concurrent, single tube synthesis that we have extended to over 100 QconCATs simultaneously, a strategy we have dubbed MEERCAT [15,16]. Secondly, QconCATs cover a set of target proteins based on the needs of one research group, which may not always match the requirements of subsequent research groups. Thirdly, the choice of peptides is often obliged to be made without knowledge of the performance of these peptides in absolute quantification. Lastly, editing of any QconCAT, for example, the removal or addition of a single protein, has required complete resynthesis and expression of the gene.
To overcome these complications, we now introduce the concept of 'ALACATS' ('à la carte' QconCATs), the term reflecting the ability to design a QconCAT of any length that encodes peptides for a user-specified set of target proteins. ALACATS are assembled from 'Qbricks', oligonucleotides that encode (typically) two quantotypic peptides for a single target protein, together with short flanking peptides to recapitulate the correct primary sequence context, and thus normalise digestion rates. Each Qbrick (one for each target protein in the proteome) is a discrete entity, a double-stranded DNA construct that can be readily synthesised, stored, catalogued and accessed to enable the synthesis of an ALACAT to order. These are the fundamental building blocks in the ALACAT workflow.

Design, synthesis and assembly of Qbricks
A Qbrick ('quantification brick', a type of biobrick [17]) is defined as a short, double-stranded oligonucleotide that encodes two or more Qpeptides that are quantotypic for a single protein and is thus the smallest building block (Fig. 1). The Qbrick also encodes interspersed peptide sequences that recapitulate the primary sequence context of the two peptides, thus equalising digestion rates of standard and analyte. Each Qbrick has asymmetric overhangs at each end, creating sticky ended DNA molecules that permit assembly by a strategy called 'Loop Assembly', driven by sequential use of two Type IIS restriction endonucleases [18]. Different unique overhang sequences (A, B, C, D, E and F) flank the Qbricks. Five Qbricks are assembled in a single reaction, the 'odd' cycle [18]. Annealing during assembly maintains the reading frame through the QconCAT, adding two amino acids to the interspersed linker with no effect on the peptides generated from the QconCAT (Fig. 1a). These short QconCATs, assemblies of five Qbricks that encode 10 peptides, are perfectly useable when expressed as a five-target protein standard suitable for small, focused studies. A short QconCAT, containing 10 Qpeptides (from five Qbricks), interspersed linker peptides, quantification and purification peptides as well as suitable sacrificial sequences at either end, totals approximately 170-220 amino acids, of a size suitable for expression and deployment.
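The way unique overhangs enforce the order of assembly can be sketched in a few lines. This is a minimal illustration, not the cloning syntax used in this work: overhangs are represented only by their labels (A to F, as in the text), and the part names Q1 to Q5 are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Part:
    name: str
    left: str   # label of the upstream overhang
    right: str  # label of the downstream overhang


def assembly_order(parts, start="A", end="F"):
    """Chain parts so that each right overhang matches the next left overhang."""
    by_left = {}
    for part in parts:
        if part.left in by_left:
            raise ValueError(f"overhang {part.left} is used twice; order is ambiguous")
        by_left[part.left] = part
    order, current = [], start
    for _ in range(len(parts)):
        part = by_left.get(current)
        if part is None:
            raise ValueError(f"no part begins with overhang {current}")
        order.append(part.name)
        current = part.right
    if current != end:
        raise ValueError("assembly does not finish on the end overhang")
    return order


# Hypothetical Qbricks supplied in arbitrary order, as in a one tube reaction.
mixed = [Part("Q3", "C", "D"), Part("Q1", "A", "B"), Part("Q5", "E", "F"),
         Part("Q2", "B", "C"), Part("Q4", "D", "E")]
print(assembly_order(mixed))
```

Because every junction label occurs exactly once as a left overhang, only one linear order is possible, whatever the order in which the parts are mixed.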
For more wide-reaching quantification studies, individual short ALACATs are subsequently concatenated in a second reaction. The initial 5-Qbrick constructs are cloned into plasmids that introduce a second set of six overhanging linker sequences, distinct from those used in the 'odd' cycle. These linkers ('even cycle', α, β, γ, δ, ε, λ; Fig. 1b) allow assembly of the five short ALACATs into a complete, 'long ALACAT', capable of encoding Qpeptides for quantification of approximately 25 proteins. These QconCATs would be 75-90 kDa (relatively shorter because of the single instances of N-terminal and C-terminal features), typical for cell-free or bacterial expression. Of course, any variant, from two to five short ALACATs, encoding quantification standards for any number of proteins between 5 and approximately 30 is possible using this approach. This greatly expands the flexibility of the QconCAT approach. The sequences of the constructs used in this paper, and the cloning syntaxes, are provided in Additional file 1.
As proof of concept, we built an ALACAT from a series of Qbricks encoding standards for 25 human plasma proteins (Additional file 2: Table S1). We first assembled five short ALACATs and then, in turn, assembled these into a long ALACAT. Each short ALACAT, as well as the long ALACAT, was expressed independently in a wheat germ cell-free system (CFS), and yields of all were high (Fig. 2a). Typically, yields were of the order of 500 pmol, which is a substantial quantity for LC-MS/MS-based quantification (typically, a single LC-MS/MS run would require 50 fmol on column). The expressed short and long QconCATs were then digested with trypsin and analysed by LC-MS/MS (Fig. 2b). All peptides were readily detected (Fig. 2c), and whether derived from a short or long ALACAT, the relative peptide intensities remained the same (Fig. 2d). Further information on the analysis of the short and long ALACATs is provided in Additional file 2: Figures S1-S6.
Even after purification, QconCAT preparations also contained several proteins derived from the expression system. To assess these, we searched digests of a different set (C) of purified QconCATs against a wheat database (Triticum aestivum, UniProt UP000019116). The proteins were very consistent across six constructs (five short ALACATs and one long ALACAT), and the most abundant proteins were ribosomal proteins and seed storage proteins; other proteins were one to two orders of magnitude lower (Additional file 2: Figure S7). However, the QconCATs were the most abundant proteins on SDS-PAGE, and because they are deployed using specific MS assays, contaminant peptides would not be an issue.
Each QconCAT contained two peptides common to every construct: the glu fibrinopeptide (EGVNDNEEGFFSAR) that we have used previously for quantification of the QconCAT [8,13] and a second peptide derived from the common c-myc peptide (LISEEDLGGR) to give a tag for monitoring expression by western blotting if necessary. We were able to use these two peptides to assess the consistency of the intensity of the quantification peptide, whether in long or short ALACATs (Fig. 2c). The correlation was extremely high, confirming that the peptides were cleaved and released similarly, irrespective of the nature of the ALACAT.
We also demonstrated the synthesis of ALACATs in an E. coli cell-free system. The E. coli system couples transcription and translation in a single tube, which allows us to skip the in vitro transcription reaction required in the wheat cell-free system. In this study, we set up a small-scale reaction system using a microdialysis device (Fig. 3). All prepared ALACAT genes were successfully expressed in this system, and the efficiency of 13C/15N incorporation into their lysine and arginine residues was more than 99%.
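As an illustration of how a labelling efficiency of this kind can be expressed, the sketch below computes the heavy fraction from paired peak areas of the labelled and unlabelled forms of one peptide. The formula is a common convention and is assumed here for illustration; it is not taken from the methods of this paper, and the peak areas are invented placeholder values.

```python
def incorporation_efficiency(heavy_area: float, light_area: float) -> float:
    """Percentage of signal carried by the labelled (heavy) form of a peptide."""
    total = heavy_area + light_area
    if total <= 0:
        raise ValueError("peak areas must sum to a positive value")
    return 100.0 * heavy_area / total


# Placeholder peak areas (arbitrary units), not data from this study.
example = incorporation_efficiency(heavy_area=995_000, light_area=5_000)
print(f"{example:.1f}% heavy")
```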

Editability of ALACATs
One of the advantages of the ALACAT approach is the introduction of straightforward editability of the construct. Previously, there was no simple route to exchange one peptide for another without extensive resynthesis of the gene. However, with ALACATs, the editability simplifies the introduction of changes in the sequence and embedded peptides. This editing process can take place at two levels. First, individual peptides can be replaced in Qbricks, and a new short ALACAT could be constructed. The only new DNA required would be the sequence of the Qbrick. Alternatively, an entire short ALACAT could be exchanged, replacing multiple peptides in a single process. This might be of value, for example, in a multi-species construct, if some short ALACATs contained species-specific peptide sequences, and others contained sequences that were identical in both species. A simple switch from species A to species B would only require an exchange of the relevant short ALACAT in the one-step, even cycle ligation reaction. Further, about 10% of all traditionally synthesised QconCATs failed to express in bacteria [16] and the ability to quickly create a large set of rearrangements of different Qbricks or short ALACATs would be able to deliver a library of ALACATs, with equivalent function, that could be quickly screened for expression potential. Alternatively, this type of combinatorial synthesis could be used to explore adjacency and proximity effects. To test this possibility in extremis, we therefore initiated a 'one pot' combinatorial ligation of two families of Qbricks, or, in a separate experiment, two families of short ALACATs.
Fig. 1 Overall strategy for building block assembly of short and long ALACATs. The smallest unit of an ALACAT is a single double-stranded oligonucleotide encoding peptides (one, two, more) for quantification of a single target protein, flanked on either side by tripeptides that preserve the natural primary sequence context. These oligonucleotides include linker regions compatible with a Type IIS restriction enzyme (BsaI) that allows all five Qbricks to be assembled in the correct order in a single ligation reaction (a, odd cycle assembly) to form short ALACATs. In turn, these short ALACATs contain DNA sequences that are compatible with a second Type IIS restriction enzyme (SapI) and can be similarly assembled in a one tube reaction to create a long ALACAT (b, even cycle assembly). The vectors for the odd and even cycles both include in-frame fusions to glu-fibrinopeptide and c-myc encoding regions (quantification) and a hexahistidine tag (purification)
We tested both levels of editability using the two ALACAT series (B, for plasma proteins, and H, for analysis of the stoichiometry of a metabolic compartment; Additional file 1) described above. First, we demonstrated the ease of exchange of short ALACATs by building a combinatorial series of long ALACATs created from random introduction of appropriate short ALACATs, the 'even' cycle. Each position in the long ALACAT could contain a short ALACAT from either the B or H series. Rather than create one editing reaction to prove the swap of one short ALACAT for another, we took a different approach and set up a single reaction, in which we mixed ten short ALACATs derived from the two different families, prefixed B and H, such that B1 and H1 would share common SapI overhang sequences and similarly, the other four pairs (H2/B2 to H5/B5). Thus, short ALACATs H1 and B1 would represent a binary choice at position one.
In this assembly, a random ligation process would generate 2^5 = 32 possible long ALACATs.
To assess the equivalent combinatorial substitution of Qbricks, we performed essentially the same experiment in an 'odd' cycle but with two sets of Qbricks from the two families (B and H), again picking multiple clones from a single tube ligation reaction. To increase complexity, we created further potential by providing H4 and B5 with two assembly contexts (Fig. 5, Additional file 4). After assembly, multiple ALACAT clones were picked and sequenced. From this experiment, 26 unique short ALACATs were constructed, spread across 65 sequences that were sampled. Of these 65, seven were long variants of five Qbricks (made possible by our construction strategy) but the majority comprised assemblies of four Qbricks; a total of 19 combinations from a set of 24 possibilities were recovered. Further, 16 were unique, two were replicated once, three occurred thrice, two were four-fold, up to one assembly that was sequenced in 16 (approx. 25%) of the clones. It is possible that this bias reflected differences in the relative concentrations of the input DNA sequences, which would allow tuning of the system to preferred assemblies.
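The combinatorial space opened by a binary choice at each position can be enumerated directly. The sketch below lists every ordered assembly for two families (B and H) at five positions and summarises a set of sequenced clones against that list. The clone list is a hypothetical toy example, not the sequencing data reported here.

```python
from collections import Counter
from itertools import product


def enumerate_assemblies(families=("B", "H"), n_positions=5):
    """List every ordered assembly with one family choice per position."""
    return [
        "-".join(f"{family}{position}" for position, family in enumerate(choice, start=1))
        for choice in product(families, repeat=n_positions)
    ]


def summarise_clones(clones, possible):
    """Count distinct assemblies among sequenced clones and find the commonest."""
    if not clones:
        raise ValueError("no clones supplied")
    counts = Counter(clones)
    unexpected = set(counts) - set(possible)
    if unexpected:
        raise ValueError(f"assemblies outside the design space: {sorted(unexpected)}")
    top_name, top_count = counts.most_common(1)[0]
    return {
        "distinct": len(counts),
        "possible": len(possible),
        "most_frequent": top_name,
        "most_frequent_share": top_count / len(clones),
    }


possible = enumerate_assemblies()
# Hypothetical clone identities, for illustration only.
toy_clones = ["B1-H2-B3-H4-B5"] * 2 + ["H1-H2-H3-H4-H5", "B1-B2-B3-B4-B5"]
summary = summarise_clones(toy_clones, possible)
print(len(possible), summary)
```

A skew in `most_frequent_share` of the kind seen in the clone counts would be one simple indicator of unequal input DNA concentrations.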

Discussion
Although the QconCAT approach is well recognised, there are undoubtedly barriers to widespread adoption. The selection of peptides is an early commitment, followed by the synthesis of a gene, embedded in a suitable expression vector, and finally, expression and labelling by biosynthesis in vivo. This requires routine skills in molecular biology that may not be present in a typical proteomics team. Moreover, QconCAT expression in vivo is no different to expression of other heterologous proteins; sometimes, the expression fails. However, we have demonstrated that synthesis in vitro overcomes this issue-so far, every QconCAT we have made by cell-free synthesis has not only expressed well, but also incorporates a higher degree of labelling [16].
The ALACAT concept has multiple advantages over traditional QconCAT gene synthesis and expression. The synthesis and storage of individual Qbrick oligonucleotides is straightforward and in time, these Qbricks could be drawn from an ever-expanding library, stored at the point of synthesis. Once a set of Qbricks are available, assembly through the odd cycle is a single tube reaction, creating the possibility of expression in vitro of a short ALACAT; useful for a quick check of the suitability of the encoded peptides. When the short ALACATs have been evaluated, a second, single tube reaction leads to the even cycle assembly of the full length QconCAT. A primary advantage of the ALACAT approach is therefore that the clustering of Qpeptides into QconCATs becomes a late decision, driven by the interests of specific users and/or research programmes. If peptides are suboptimal for a particular mass spectrometric approach, often an unknown factor before the construct is made, it would be trivial to replace one Qbrick, build a new short ALACAT, and if required, subsequently assemble the new short ALACAT into the full length ALACAT, both steps being single-tube reactions.
Fig. 2 Construction of a human plasma protein QconCAT using the ALACAT strategy. A series of Qbricks were designed, each encoding two peptides for one of 25 plasma proteins. Groups of five were assembled in an odd cycle reaction to short ALACATs (301-305) that were expressed in vitro using wheat germ lysate and purified by virtue of their hexahistidine tag. In addition, the short ALACATs were assembled in an even cycle reaction into a single long ALACAT that was also expressed and purified (a). Each of the ALACATs was then digested and analysed by LC-MS/MS; all peptides were detected (c, infilled peptides are those visible by LC-MS/MS (b); different colours define the short ALACAT origins of the peptides in the long ALACAT). Using common peptides (glu fibrinopeptide and c-myc epitope) as normalisation controls, the peak areas for peptides in short ALACATs were compared to the areas of the same peptides in the long ALACATs (d)
Fig. 3 ALACAT synthesis by E. coli cell-free system. Experimental design for cell-free synthesis (a). Ten small ALACATs (belonging to two series; 'B' and 'H') and one long ALACAT were synthesized in a small-scale cell-free synthesis system using a microdialysis device (b). Lysine and arginine residues of the synthesized ALACATs were labelled with 13C/15N. b Representative SDS-PAGE separation images of the synthesized ALACATs. His-tag purified ALACATs were separated by NuPAGE 4-12% gradient gels and visualised by CBB staining. Asterisk: ALACAT band. c Selective reaction monitoring (SRM) analysis of ALACAT peptides. The efficiency of stable isotope incorporation was estimated by SRM for two tryptic digested peptides derived from one ALACAT
Further, ALACATs can be designed and delivered at any size (although we recommend an upper limit of 50 to 60 target proteins), according to the focus and depth of individual quantitative proteomics studies. For example, a single short ALACAT would be a rapidly generated resource for quantification of a few key proteins. To increase the target numbers, two or three short ALACATs could be combined to form highly efficient QconCATs of intermediate size. Of course, multiple ALACATs can be co-expressed in vivo, simplifying resource expansion and providing an efficiency and flexibility which, coupled with cell-free synthesis and MEERCAT, means that large scale absolute proteome quantifications are now eminently feasible, sustainable and modifiable. Many stages of the ALACAT workflow are suitable for delivery through laboratory automation, reducing the need for human intervention. The Qbrick approach means that it would be possible to create an ever-expanding resource of Qbrick DNA (in the form of double-stranded oligonucleotides) that could be assembled 'to order' in response to requests by any research group. There would be no reliance on prior clustering of peptides, and the assembly would be a simple additional step. Moreover, the ability to 'swap out' specific Qbricks without having to redesign and build the QconCAT from scratch means that problematic peptides will be rapidly expunged from the resource (Fig. 6). The advantages of having 'editable QconCATs' cannot be overstated. This added flexibility in standard design and optimisation, coupled with ever-increasing selectivity and sensitivity of LC-MS/MS platforms, makes absolute quantification of part, or even all, of a proteome increasingly feasible.
Finally, QconCATs, assemblies of peptides generated by proteolytic digestion, are a simple route to the generation of stoichiometric quantities of sets of peptides that can be used for purposes other than absolute quantification, such as instrument quality control or calibration of retention time index [19][20][21][22][23]. The combinatorial experiments described in this paper, for example, create the ability to build a large number of different combinations of peptides from a common library, and could be used in the understanding of local influences on ionisation, or even to test the emergent methods for prediction of precursor or product ion intensity [24][25][26][27][28][29]. The more straightforward the production methodology, the more likely tests of such predictive methods can be created.

Conclusions
QconCATs can now be assembled from libraries of Qbricks, where each Qbrick is a short oligonucleotide that encodes quantotypic peptides for a target protein.
The assembly of Qbricks in the ALACAT process can create QconCATs, at two or more peptides per target protein, in two successive one tube reactions, giving substantial time savings. The ALACAT approach allows for rapid replacement of one Qbrick for another, whether to introduce superior peptides or to replace one target protein with another. The ALACAT approach sets the stage for large libraries of Qbricks that can be assembled in a bespoke, à la carte fashion according to specific project goals.

Materials and reagents
All enzymes, competent cells and manual DNA purification kits were purchased from New England Biolabs (Hitchin, UK), all oligonucleotides were purchased desalted and lyophilised from Integrated DNA Technologies BVBA (Leuven, Belgium) or Eurofins Genomics (Ebersberg, Germany). All bacterial media and antibiotics were purchased from Formedium Ltd (Hunstanton, UK).

Production of pOdd and pEven acceptor vectors
Plasmid pEU01-MCS (CellFree Sciences, Ehime, Japan) was domesticated via site-directed mutagenesis to remove unwanted BsaI and SapI restriction sites. pOdd vectors were produced by inserting a lacZ cassette with SapI and BsaI sites as indicated with appropriate syntaxes flanked N-terminally by GluFib and Myc tag linkers and C-terminally with 6x His-Tag and stop codons. pEven vectors were produced similarly but the Amp R gene was exchanged for Spec R gene amplified from pGM134_1 and cloned via an NEBuilder (NEB, UK) reaction. All lacZ cassettes were synthesised by Twist Bioscience (San Francisco, USA) and cloned as single fragments into modified pEU01-MCS via NEBuilder, producing pOdd vectors pGM247_2 -6 and pEven vectors pGM247_8 -12.

Design of oligonucleotides and production of QBrick DNA Blocks
Fig. 4 Combinatorial assembly of short ALACATs into long ALACATs. To test the ease of editability and swapping of short ALACATs into long ALACATs, we created a ligation reaction in which there were two choices of short ALACAT at each of four or five positions (a). The reaction products were cloned and a total of 81 clones were selected for DNA sequencing, to establish which short ALACATs (from either the B or H series) were incorporated into the long ALACATs (b). There was an even representation of the short ALACATs across the entire structure (c) and the evenness of the creation of the products is evidenced by the split of the products through the first three positions (d)
QBrick peptide sequences were reverse translated using Geneious software (Biomatters Ltd), set up to use the Escherichia coli K12 codon usage table [30] and to avoid internal restriction sites of BsaI, SapI, BbsI and BsmBI and homopolymers of more than 5 nt. These were converted to overlapping oligonucleotides (Tann approx. 60°C). Required 5′ overhangs for BsaI or SapI recognition sequences and molecular syntaxes were then added. Pairs of overlapping oligonucleotides were mixed at 2.5 μM each (final conc.) in Q5 2x mastermix in a 20 μl total reaction. These were annealed and extended using the following thermocycler parameters: 98°C for 60 s followed by five cycles of 98°C for 10 s, 60°C for 30 s, 72°C for 15 s and a final incubation at 72°C for 60 s. These reactions were diluted 1:100 in water before being added to the cloning reactions below (approx. 25 fmol/μl).
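The two sequence constraints named in this section (no internal sites for the four Type IIS enzymes, no homopolymer longer than 5 nt) can be checked with a short screen such as the sketch below. It is an independent illustration and not the Geneious workflow used here; the function names and test sequences are hypothetical, and both strands are checked because Type IIS recognition sequences are not palindromic.

```python
import re

# Recognition sequences (5'->3') of the four Type IIS enzymes named in the text.
TYPE_IIS_SITES = {
    "BsaI": "GGTCTC",
    "SapI": "GCTCTTC",
    "BbsI": "GAAGAC",
    "BsmBI": "CGTCTC",
}

_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    return seq.translate(_COMPLEMENT)[::-1]


def screen_sequence(seq: str, max_homopolymer: int = 5) -> list[str]:
    """Return a list of design problems found in a coding sequence."""
    seq = seq.upper()
    problems = []
    for enzyme, site in TYPE_IIS_SITES.items():
        # A site on either strand would be cut during assembly.
        if site in seq or reverse_complement(site) in seq:
            problems.append(f"internal {enzyme} site")
    # One base followed by at least max_homopolymer copies of itself.
    if re.search(r"([ACGT])\1{%d,}" % max_homopolymer, seq):
        problems.append(f"homopolymer longer than {max_homopolymer} nt")
    return problems


print(screen_sequence("ATGGGTCTCATG"))             # contains GGTCTC
print(screen_sequence("ATGGCTG" + "A" * 6 + "CGT"))  # run of six A
```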

Cell-free expression of short and long ALACATs
For each ALACAT, 2 μg DNA in pEU-E01 vector (CellFree Sciences Co., Ltd, Japan) was used for a single expression reaction. Synthesis was completed at 240 μL scale using the WEPRO8240H full Expression kit (2BScientific Ltd, UK). A positive control (pEU-E01-DHFR, encoding the dihydrofolate reductase gene derived from E. coli) and a negative control (pEU-E01-MCS empty vector) were used, both supplied with the kit. Full kit instructions were followed, including preparation of WEPRO8240H aliquots and 2 x SUB AMIX reagent. The Transcription Mix for each expression was prepared with 20 U RNase inhibitor, 20 U SP6 RNA Polymerase, 50 nmol NTP mix and a 0.2 x dilution of Transcription Buffer. DNA for the ALACAT or controls, and nuclease-free water, were added to a final volume of 20 μL. The transcription reaction occurred over 6 h at 37°C and the resulting mRNA was stored briefly at room temperature before translation.
A 1 x SUB AMIX was prepared with a 0.5 x dilution of 2 x SUB AMIX into nuclease-free water and 60 nmol of each of the standard 20 amino acids (R, K, A, N, D, C, E, Q, G, H, I, L, M, F, P, S, T, W, Y, V), with arginine and lysine substituted by stable isotope-labelled [13C6,15N4] arginine and [13C6,15N2] lysine (CK Isotopes Ltd, UK). The Translation Mix for each expression was prepared with 12 nmol of each of the same standard 20 amino acids, including the 13C,15N-labelled Arg and Lys, combined with 0.8 μg creatine kinase, 10 μL WEPRO8240H wheat germ lysate, 0.5 x dilution of 2 x SUB AMIX, and 10 μL of mRNA for each ALACAT or standard. A 96-well plate was prepared with 200 μL of 1 x SUB AMIX in each well. The Translation Mixture was carefully pipetted beneath the SUB AMIX in each well to form a bilayer. The plate was sealed and incubated at 16°C for 16 h.
Fig. 5 Combinatorial assembly of Qbricks into short ALACATs. To test the ease of editability and swapping of Qbricks into short ALACATs, we created a ligation reaction in which there were two choices of Qbrick at each of four or five positions (a), encoding pairs of quantotypic peptides P1 and P2. The reaction products were cloned and a total of 65 clones were selected for DNA sequencing, to establish which Qbricks (from either the B or H series) were incorporated into the short ALACATs (b). There was a reasonably even representation of the Qbricks across the entire structure (c)
E. coli cell-free synthesis was performed using a Musaibou-Kun protein synthesis kit (Catalog #A183-0242, Taiyo Nippon Sanso Corporation, Tokyo, Japan). For ALACAT synthesis, an amino acid cocktail with lysine and arginine universally labelled with 13C and 15N (Catalog # A91-0128, Taiyo Nippon Sanso Corporation) was used. All synthetic reactions were performed using an Xpress micro-dialyzer MD100 with a molecular weight cut-off of 12-14 kDa (Scienova, Spitzweidenweg, Germany) inserted into a 2 mL microtube. Before synthesis, 825 μL of the outer solution was mixed with 75 μL amino acid cocktail and 100 μL distilled water, incubated at 30°C, and added to the outside of the dialysis unit at the start of synthesis. Then, 77.5 μL of the internal solution for synthesis was mixed with 10 μL template DNA (50 ng/μL), 7.5 μL amino acid cocktail, and 5 μL distilled water, and added to the dialysis unit. The synthesis reaction was carried out at 30°C for 18 h. After the synthesis was completed, all the solution in the dialysis unit was collected into a new tube.
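The volumes given for the E. coli cell-free reaction can be tallied with a few lines of arithmetic. The sketch below uses only the volumes and the template concentration stated in the protocol; the variable names are illustrative.

```python
# Volumes in microlitres, as stated in the protocol.
outer = {"outer solution": 825.0, "amino acid cocktail": 75.0, "distilled water": 100.0}
inner = {"internal solution": 77.5, "template DNA": 10.0,
         "amino acid cocktail": 7.5, "distilled water": 5.0}

outer_total_ul = sum(outer.values())
inner_total_ul = sum(inner.values())

# Template DNA is supplied at 50 ng/uL.
template_ng = inner["template DNA"] * 50.0

# Fraction of each compartment made up by the labelled amino acid cocktail.
cocktail_fraction_outer = outer["amino acid cocktail"] / outer_total_ul
cocktail_fraction_inner = inner["amino acid cocktail"] / inner_total_ul

print(outer_total_ul, inner_total_ul, template_ng)
print(cocktail_fraction_outer, cocktail_fraction_inner)
```

The totals come to 1000 μL outside and 100 μL inside the dialysis unit, with 500 ng of template, and the amino acid cocktail makes up the same 7.5% of the volume in both compartments.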

ALACAT purification
Note that the positive control used for expression does not have a hexa-histidine tag and therefore both controls were used as negative controls in this next stage of the protocol. The 240 μL contents of each individual well of the 96-well plate were transferred to a low binding tube (Biotix Inc., USA). This was then combined with 400 μL Bind Buffer pH 7.4 (20 mM sodium phosphate, 0.5 M sodium chloride, 20 mM imidazole, 6 M guanidine hydrochloride), and incubated at room temperature for 1 h using a rotor mixer, before the addition of 10 μL Ni Sepharose suspension (GE Healthcare Ltd, UK) and a further 1 h incubation. Centrifuge filters (Corning Costar Spin-X 0.45 μm pore size cellulose acetate membrane, Merck, UK) were washed once with 750 μL Bind Buffer and centrifuged, before the addition of the sample and Ni Sepharose, and further centrifugation; all centrifuge steps used 6000×g for 2 min at 4°C. This was followed by three further washes by centrifugation with Bind Buffer; two 400 μL washes and one 200 μL wash. Sample was eluted by centrifugation from the resin with two additions of 15 μL Elution Buffer pH 7.4 (20 mM sodium phosphate, 0.5 M sodium chloride, 1 M imidazole, 6 M guanidine hydrochloride); after each addition the resin and buffer were agitated to mix before centrifugation. The final 30 μL elution was transferred to a low binding tube for protein precipitation. To each tube, 600 μL HPLC grade methanol (Fisher Scientific Ltd, UK) was added and mixed well before the addition of 150 μL chloroform and 400 μL HPLC grade water (VWR International, UK) to precipitate proteins. Following centrifugation at 13,000×g for 3 min a bilayer was formed, the uppermost layer of which was carefully removed. A further 600 μL methanol was added and gently mixed by inversion. After a second centrifugation step the majority of the liquid was removed and discarded, with the remaining liquid allowed to evaporate.
The precipitate was resuspended in 30 μL 25 mM ammonium bicarbonate, with 0.1% (w/v) RapiGest™ SF surfactant (Waters, UK) and protease inhibitors (Roche cOmplete™, Mini, EDTA-free Protease Inhibitor Cocktail, Merck, UK). Before tryptic digestion, the protein concentration of each sample was estimated using a NanoDrop Spectrophotometer (ThermoFisher Scientific, UK). All uncropped images of gels used in this publication are presented in Additional file 2: Figure S8.

Tryptic digestion
For digestion, 0.5 μg protein for each was treated with 0.05% (w/v) RapiGest™ SF surfactant at 80°C for 10 min, reduced with 4 mM dithiothreitol (Melford Laboratories Ltd., UK) at 60°C for 10 min and subsequently alkylated with 14 mM iodoacetamide at room temperature for 30 min. Proteins were digested with 0.01 μg Trypsin Gold, Mass Spectrometry Grade (Promega, USA) at 37°C overnight. Digests were acidified by the addition of trifluoroacetic acid (Greyhound Chromatography and Allied Chemicals, UK) to a final concentration of 0.5% (v/v) and incubated at 37°C for 45 min before centrifugation at 13,000×g and 4°C to remove insoluble non-peptidic material.
Fig. 6 A model for the creation of an ALACAT resource. From a library of Qbricks (that may include redundant sequences to increase the choice of quantotypic peptides for specific proteins) assembly into short ALACATs would provide a test system to assess the suitability of peptides for quantification. Once the short ALACATs are optimised, sets of them could subsequently be assembled into long ALACATs. Long ALACAT DNA can be readily used to drive protein synthesis in the end user laboratory, labelled as appropriate
The nano-electrospray ionisation source was operated in positive polarity under the control of QExactive HF Tune (version 2.5.0.2042), with a spray voltage of 1.8 kV and a capillary temperature of 250°C. The mass spectrometer was operated in data-dependent acquisition mode. Full MS survey scans between m/z 350-2000 were acquired at a mass resolution of 60,000 (full width at half maximum at m/z 200). For MS, the automatic gain control target was set to 3e6, and the maximum injection time was 100 ms. The 16 most intense precursor ions with charge states of 2-5 were selected for MS/MS with an isolation window of 2 m/z units. Product ion spectra were recorded between m/z 200-2000 at a mass resolution of 30,000 (full width at half maximum at m/z 200). For MS/MS, the automatic gain control target was set to 1e5, and the maximum injection time was 45 ms. Higher-energy collisional dissociation was performed to fragment the selected precursor ions using a normalised collision energy of 30%. Dynamic exclusion was set to 30 s.
The efficiency of stable-isotope incorporation of ALACATs synthesized in the E. coli cell-free system was determined using LC-SRM analysis. Prior to SRM analysis, ALACAT protein B1 separated by SDS-PAGE (shown in Fig. 3b) was digested in-gel with trypsin and purified with a STAGE tip containing an Empore SDB-XC disc as described previously [16]. The purified peptide sample was dissolved in 20 μL of 0.1% (v/v) TFA solution, of which 1 μL was injected into an Eksigent nanoLC system coupled to a SCIEX QTRAP 5500 mass spectrometer. For LC separation, solvent A was 0.1% (v/v) formic acid, and solvent B was 0.1% (v/v) formic acid/80% (v/v) acetonitrile. The sample was desalted using a 200 μm i.d. × 0.5 mm cHiPLC trap column (SCIEX) at a flow rate of 5 μL/minute for 10 min using 0.1% (v/v) TFA and then transported to a fused-silica capillary column packed with C18 resin (12.5 cm × 75 μm i.d.; Nikkyo Technos) at a flow rate of 300 nL/minute according to the following gradient schedule: 0-15 min, 2-50% B; 15-18 min, 50-90% B; hold at 90% B for 6 min; and re-equilibrate at 2% B for 20 min. SRM analysis was conducted in positive ion mode with the following parameters: ion spray voltage = 2300 V; curtain gas = 20; ion source gas1 = 20; collision gas = 12; interface heater temperature = 150; entrance potential = 10; collision cell exit potential = 9; and Q1/Q3 = low resolution. Details of SRM transitions and quantitative data are accessible via the Panorama-Web server [31] (https://panoramaweb.org/alacat.url).
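The gradient schedule given for the SRM separation can be written as a small table of segments and interpolated to give the solvent B percentage at any time. This is a sketch for checking run length and composition only; it assumes linear ramps within each segment and an instantaneous return to 2% B at the start of re-equilibration, neither of which is stated explicitly in the text.

```python
# (start_min, end_min, start_%B, end_%B), transcribed from the schedule above.
GRADIENT = [
    (0.0, 15.0, 2.0, 50.0),    # 0-15 min, 2-50% B
    (15.0, 18.0, 50.0, 90.0),  # 15-18 min, 50-90% B
    (18.0, 24.0, 90.0, 90.0),  # hold at 90% B for 6 min
    (24.0, 44.0, 2.0, 2.0),    # re-equilibrate at 2% B for 20 min
]


def percent_b(t_min: float) -> float:
    """Solvent B percentage at time t, by linear interpolation within a segment."""
    for start, end, b_start, b_end in GRADIENT:
        if start <= t_min <= end:
            fraction = (t_min - start) / (end - start)
            return b_start + fraction * (b_end - b_start)
    raise ValueError("time is outside the gradient programme")


total_run_min = GRADIENT[-1][1]
print(total_run_min, percent_b(7.5), percent_b(20.0), percent_b(30.0))
```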
The raw MS data files were loaded into Thermo Proteome Discoverer v.1.4 (ThermoFisher Scientific, UK) and searched against a custom ALACATs database using Mascot v.2.7 (Matrix Science London, UK) with trypsin as the specified enzyme, one missed cleavage allowed, carbamidomethylation of cysteine, label