Design, synthesis, and functional testing of recombinant cell penetrating peptides

Cell-penetrating peptides (CPPs) are one of the most attractive DNA delivery systems currently in development. In this research, in silico CPP development was performed based on a literature study to look for peptides that induce endosome escape, bind DNA, and pass through cell membranes and/or nuclear membranes, with the final goal of creating a new CPP to be used as a DNA delivery system. We report herein the successful isolation of three candidate CPP molecules, all of which were successfully expressed and purified by NiNTA. One of the determinants of CPP success as a DNA carrier is the ability of the CPP to bind DNA and protect it from the effects of nucleases. The DNA binding test results show that all three CPPs can bind to DNA and protect it from the effects of serum nucleases. These three CPP candidates, designed in silico and synthesized in a prokaryotic system, are eligible candidates for further testing of their ability to deliver DNA in vitro and in vivo.


Introduction
Cell-penetrating peptides (CPPs) are peptides capable of delivering large molecules to cells, and studies into their capabilities began about 20 years ago with the well-known HIV-1 TAT peptide. A CPP is also known as a protein translocation domain, a membrane translocation sequence, or a Trojan peptide [1]. Peptides belonging to the CPP category are typically composed of 30-35 amino acids dominated by arginine and lysine, are positively charged or amphipathic, are easy to produce, and are non-toxic [2]. The main attraction of CPP development for delivery systems is the ability of CPPs to deliver molecules much larger than themselves to cells with high efficiency and low toxicity [1,3]. A CPP also has the ability to pass through nuclear membranes and mitochondrial membranes without damaging them [4].
CPPs can be used to deliver DNA into the cell nucleus such that the delivered DNA can be expressed for both vaccination and gene therapy purposes. They can be used to deliver nucleic acids both in vitro and in vivo and can also be engineered to bind DNA, condensing it and helping it to avoid entrapment in the endosome. CPPs are one of the DNA delivery systems that can overcome barriers at both the extracellular and intracellular levels [5]. This type of delivery system is interesting because it is relatively easy to construct, modify, and produce in large quantities using a prokaryotic expression system. In addition, the production process does not require laboratory facilities with a high biosafety level.
In this research, a combination of several different peptides with different abilities has been analyzed. Key peptides that were investigated include those that have been prepared as DNA-binding peptides, those with the ability to be recognized by the internalization system of cell membranes, and those containing the nuclear localization signal (NLS). Therefore, it was expected that the resulting CPPs would have the ability to bind to and change the conformation of DNA such that it would become more compact and not so easily degraded by cellular nucleases. These CPPs were expected to have the ability to internalize DNA within the cell and release it from the endosome. In addition, the CPPs were expected to penetrate into the nucleus of the cell where the DNA will initiate the desired protein expression process [6].

Preparation of CPPs in silico and bioinformatics analysis
The design of the amino acid sequences of the CPPs was based on a literature study of the function and composition of peptides with the ability to carry molecules through cell membranes or act on endosomal membranes, DNA-binding proteins, NLS proteins, and CPP proteins. The selection of fusion peptides (FPs) acting on the cell membrane was based on the mechanism of virus internalization into cells. In this study, FPs were selected from viruses that infect through fusion of the viral envelope with cell membranes, causing the genetic material of the virus to enter the cytoplasm. The selected peptides had to be the minimal amino acid sequence of the viral envelope that acts as an FP and must never have been reported to function as a CPP in previous studies. The selection of DNA-binding peptides was based on the analysis of certain HIV proteins that are thought to have the ability to bind DNA. The selected DNA-binding proteins had to meet two criteria. First, they had to show demonstrated binding to any nucleic acid sequence, not just a specific one. Second, they had to induce DNA conformational changes that were hypothesized to protect the DNA from the effects of DNase I and other nucleases. The selection of NLS peptides was based on the amino acid sequences of several HIV proteins that can serve as an NLS in non-dividing cells. This was necessary because most of the cells in the body are not actively dividing, and the peptides should be able to target all of these cells. Once peptide candidates had been obtained in silico based on their expected abilities to act as CPPs, bioinformatics analysis was performed using software downloaded from the search site.

Cloning
Preparation of the DNA sequences and optimization of the codons for the prokaryotic system were performed using the DNA 2.0 software. The ALMR, SIMR, and VPMR synthetic DNA fragments were synthesized by IDT Singapore through partner companies in Indonesia.

Protein expression
Three colonies of DH5α bacteria containing recombinant plasmids were plated from a stock culture and grown on LB media containing ampicillin and on replica plates. After overnight incubation at 37 °C, the bacteria were transferred to an enriched terrific broth medium (HiMedia) containing 100 μg/ml ampicillin at a bacteria-to-medium ratio of 1:10. The bacterial culture was incubated at 37 °C for two hours. Expression of the recombinant protein was induced by adding IPTG to a final concentration of 1 mM. Incubation was continued for four hours, and every hour a 1 ml sample was taken for analysis by SDS-PAGE and staining with Page Blue (Thermo Scientific).
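As an illustration of the induction step, the volume of IPTG stock needed to reach a final concentration of 1 mM follows from C1V1 = C2V2. The Python sketch below is only a worked example: the stock concentration (1 M) and the culture volume (50 ml) are assumptions chosen for illustration and are not stated in the protocol, and the function name is hypothetical.

```python
def stock_volume_ul(stock_mM: float, final_mM: float, culture_ml: float) -> float:
    """Volume of stock (in microliters) needed to reach final_mM in the culture.

    Solves C1 * V1 = C2 * (V_culture + V1) for V1, so the added stock volume
    is counted as part of the final volume.
    """
    if stock_mM <= final_mM:
        raise ValueError("stock must be more concentrated than the target")
    culture_ul = culture_ml * 1000.0
    return final_mM * culture_ul / (stock_mM - final_mM)


# Assumed values for illustration only: 1 M (1000 mM) IPTG stock, 50 ml culture.
volume = stock_volume_ul(stock_mM=1000.0, final_mM=1.0, culture_ml=50.0)
print(f"Add {volume:.2f} ul of IPTG stock")
```

For a stock that is much more concentrated than the target, the result is close to the simple estimate of C2 x V_culture / C1.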

Purification of proteins
Purification of recombinant proteins was performed using nickel-bound agarose (NiNTA) under denaturing conditions. A denaturation buffer (100 mM NaH2PO4, 10 mM Tris-HCl, 500 mM NaCl, and 6 M guanidine hydrochloride) was added to the bacterial lysis supernatant, and NiNTA was added to the mixture at 1/25 the volume of the diluted lysis supernatant. Binding of the protein was carried out at 4 °C for two hours in a rotary shaker (BioRad) at 60 rpm. Subsequently, centrifugation was performed (3500 x g for 10 minutes, 4 °C). The NiNTA matrix was washed four times using buffer C (... 3) with a volume equal to the bacterial lysis volume. After the washing buffer was added, the NiNTA resin was incubated for five minutes on ice and centrifuged (3500 x g for 10 minutes, 4 °C). Protein elution was performed by adding a volume of buffer D (100 mM NaH2PO4, 10 mM Tris-HCl, 8 M urea, 250 mM imidazole, and 10% glycerol, pH 5.9) equal to the initial volume of NiNTA. After incubation for five minutes on ice and centrifugation (3500 x g for 10 minutes, 4 °C), the supernatant was transferred to a 1.5 ml tube and labeled as elution 1, and the second elution was carried out in the same fashion. The 3rd and 4th elutions were carried out using buffer E (100 mM NaH2PO4, 10 mM Tris-HCl, 8 M urea, 250 mM imidazole, and 10% glycerol, pH 4.5). The volume of buffer E was adjusted to match the initial volume of NiNTA, and after incubation for five minutes on ice, the resin was centrifuged (3500 x g for 10 minutes, 4 °C). This supernatant was transferred to a 1.5 ml tube and labeled as elution 3, and the 4th elution was carried out in the same fashion. The protein was then analyzed by SDS-PAGE.

DNA binding test
The DNA was dissolved in 100 mM HEPES buffer (pH 5.2). The CPP was added to the DNA solution based on the molar ratio of the two components, and the final reaction volume was 15 μl. After incubation at room temperature for 30 minutes, electrophoretic gel mobility assays were conducted by running the sample on a 0.7-0.8% agarose gel (0.7-0.8 grams agarose [Vivantis] in 1x Tris-Acetate-EDTA buffer). The DNA was stained using Red Gel reagents and visualized using a GelDoc UV lighting system (BioRad). The intensity of the DNA bands was measured using the software supplied with the GelDoc system (BioRad). The ability of the CPP to bind DNA was determined from a decrease in the intensity of the DNA band at the position of the control DNA containing no CPP and from a slowing of the migration rate of DNA in agarose. This is because the CPP binds to the DNA through ionic interactions between positive and negative charges, which reduces the net negative charge of the DNA and is seen as slower DNA migration in agarose [7]. The percentage of DNA bound to the polypeptide was calculated from the fluorescence intensity (F) of the DNA band relative to that of the control DNA containing no CPP. The resultant IC50 values indicate the CPP:DNA ratio at which 50% of the DNA in the sample is bound, determined from a curve of the percentage of DNA bound to peptide plotted against the CPP:DNA ratio [8].
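The quantification described above can be sketched in Python. This is a minimal illustration and not the authors' analysis: it assumes that the bound percentage is taken as 100 x (F_control - F_sample) / F_control, where F is the intensity of the free DNA band, and that the IC50 is read off by linear interpolation between neighboring points of the binding curve. The function names and all intensity values are hypothetical.

```python
def percent_bound(f_sample: float, f_control: float) -> float:
    """Percentage of DNA bound, from the free-band intensity of a sample
    relative to the CPP-free control (assumed formula, see lead-in)."""
    if f_control <= 0:
        raise ValueError("control intensity must be positive")
    return 100.0 * (f_control - f_sample) / f_control


def ic50_ratio(ratios, bound_percent):
    """CPP:DNA ratio at which 50% of the DNA is bound.

    Uses linear interpolation between consecutive points; `ratios` must be
    sorted in ascending order. Returns None if the curve never crosses 50%.
    """
    points = list(zip(ratios, bound_percent))
    for (r0, b0), (r1, b1) in zip(points, points[1:]):
        if b0 <= 50.0 <= b1 and b1 != b0:
            return r0 + (50.0 - b0) * (r1 - r0) / (b1 - b0)
    return None


# Made-up band intensities for illustration only.
control_intensity = 1000.0
ratios = [0.0, 0.5, 1.0, 1.5]
free_band = [1000.0, 700.0, 200.0, 0.0]

bound = [percent_bound(f, control_intensity) for f in free_band]
print("bound (%):", bound)
print("IC50 ratio:", ic50_ratio(ratios, bound))
```

A fitted sigmoidal curve could replace the linear interpolation when more ratio points are available; the interpolation is used here only to keep the sketch short.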

Measurement of CPP's ability to protect DNA from nuclease effects
The stability of the DNA within the CPP complex was tested against serum nuclease. The CPP:DNA complex was formed as described previously in a final volume of 30 μl. To confirm the formation of the CPP:DNA complex, 6 μl of the mixture was separated on an agarose gel. To the remaining 24 μl of the CPP:DNA complex, 22 μl of complete DMEM (DMEM containing 1% Penstrep antibiotic, 10% fetal bovine serum (FBS), and 2 mM L-glutamine) was added, and the solution was incubated at 37 °C for three hours. A total of 12 μl of the CPP:DNA complex containing complete DMEM was mixed with 4 μl of 4x DNA-loading buffer (0.24 M Tris-HCl, 0.24 M SDS, 40% glycerol, and 20% 2-mercaptoethanol, pH 6.8) and was again separated on a 0.8% agarose gel at 100 volts for 30 min to observe DNA degradation patterns. The CPP:DNA interactions were modeled to predict the binding mode. The ALMR:DNA and SIMR:DNA interactions were predicted using CCSB System Biology, the VPMR:DNA interaction was predicted using NPDock, and all data analysis was conducted with PyMOL.

Results
Peptide design
ALMR weighs 7.1 kDa, has a pI of 12.01, and is predicted to form part of a transmembrane domain. Its secondary structure consists of 73.8% of the amino acids forming a helical structure and 26.2% forming loops, meaning that the FP ASLV will form a helical structure, as will the DNA-binding domain, Rev. SIMR weighs 6 kDa and has a pI of 12.09; 41.2% of its amino acids will form a helical structure, 17.6% will form a strand, and 41.2% will form a loop. The amino acids chosen for the FP SIV form the strand and loop secondary structures. SIMR does not contain a transmembrane domain. It is a protein capable of binding DNA with a score of 1.58, and it has a nuclear localization score of 0.48 (Table 1). VPMR weighs 8.3 kDa and has a pI of 12.14. This peptide is predicted to bind DNA and to localize within the nucleus of eukaryotic cells, with predictive scores of 1.799 and 0.76, respectively. The secondary structure analysis shows a 25.4% helical peptide structure at the C-terminus and a loop at the N-terminus.
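Molecular weights such as those quoted above can be estimated directly from an amino acid sequence by summing average residue masses and adding one water molecule. The Python sketch below illustrates that calculation only; the tool actually used for the reported values is not named in the text, the function names are hypothetical, and the example sequence is an arbitrary placeholder rather than ALMR, SIMR, or VPMR.

```python
# Average residue masses in Da (amino acid minus one water).
AVERAGE_RESIDUE_MASS = {
    "G": 57.0519, "A": 71.0788, "S": 87.0782, "P": 97.1167, "V": 99.1326,
    "T": 101.1051, "C": 103.1388, "L": 113.1594, "I": 113.1594, "N": 114.1038,
    "D": 115.0886, "Q": 128.1307, "K": 128.1741, "E": 129.1155, "M": 131.1926,
    "H": 137.1411, "F": 147.1766, "R": 156.1875, "Y": 163.1760, "W": 186.2132,
}
WATER_MASS = 18.01528  # Da, added once for the free N- and C-termini


def molecular_weight_kda(sequence: str) -> float:
    """Approximate average molecular weight of a linear peptide in kDa."""
    seq = sequence.upper()
    if not seq:
        raise ValueError("empty sequence")
    unknown = set(seq) - set(AVERAGE_RESIDUE_MASS)
    if unknown:
        raise ValueError(f"unknown residues: {sorted(unknown)}")
    return (sum(AVERAGE_RESIDUE_MASS[aa] for aa in seq) + WATER_MASS) / 1000.0


def basic_fraction(sequence: str) -> float:
    """Fraction of residues that are lysine or arginine."""
    seq = sequence.upper()
    return (seq.count("K") + seq.count("R")) / len(seq)


# Placeholder sequence for illustration only (His tag plus basic residues).
placeholder = "MHHHHHHGSRKKRRQRRRAL"
print(f"{molecular_weight_kda(placeholder):.2f} kDa")
print(f"{basic_fraction(placeholder):.0%} Lys/Arg")
```

A high proportion of lysine and arginine, as in this placeholder, is what drives the strongly basic pI values reported for the three peptides.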

Cloning and recombinant CPP synthesis in the prokaryotic system
The presence of the nucleotide sequence encoding the CPP fragment was verified by colony PCR using the pQEF and pQER primer pair. The amplification results for the insertion show a DNA amplicon band of the expected size, 4700 bp, in the electrophoresis gel analysis (Figure 1). DNA amplicons containing the recombinant plasmid DNA are larger than the WT plasmids and therefore appear higher up on the gel image. The electrophoresis gel, shown in Figure 1D, indicates that all recombinant plasmids had a size of 4700 bp of DNA. A restriction enzyme analysis was conducted using the enzymes BamHI and HindIII, and the results were analyzed using two types of gel. The agarose gel was used to determine the size of the vector, while the acrylamide gel was used to determine the size of the insert DNA. The restriction analysis results show that the recombinant plasmids contain the expected insert DNA. The ALMR, SIMR, and VPMR peptides were successfully expressed in a prokaryotic expression system. The SDS-PAGE analysis showed an overexpressed protein band with the size expected for each peptide (Figure 2). CPP ALMR, SIMR, and VPMR weighed 7.1, 6.0, and 8.3 kDa, respectively, and all migrated below the 10 kDa marker (Figure 2). The three peptides were successfully purified in their denatured state, and the purity of the resulting peptides is shown in Figure 2C and 2D. The recombinant proteins were verified by western blotting using polyclonal antibodies against the Rev-Matrix (MV), and the results indicated reactivity of the MV polyclonal serum with ALMR, SIMR, and VPMR (Figure 2E).

DNA binding tests and the ability of CPPs to protect DNA from the effects of nucleases
The ability of the CPPs to bind DNA was assessed by the migration pattern of the DNA towards the positive pole of the electrophoresis device and by the DNA band intensity when compared to the control DNA in a gel shift analysis. The gel shift analysis showed that ALMR, SIMR, and VPMR had the ability to bind to plasmid DNA pcDNA3.1 eGFP. In this analysis, the higher the ratio of the electrical charge between the CPP and DNA, the higher the percentage of DNA that bound to the CPP. It can be observed from the results in Figures 3A, B, and C that more DNA is retained at the negative-pole side of the agarose gel when the relative amount of CPP to DNA is increased. This shift is not seen with the BSA sample used as a protein control in the last lane of each gel, nor with the recombinant eGFP protein, which was produced and purified in exactly the same fashion as the CPPs (Figure 4). The CPP to DNA ratio IC50 values for ALMR, SIMR, and VPMR are 0.68, 0.67, and 0.64, respectively. ALMR can bind to 100% of the 200 mg of DNA at a CPP to DNA ratio of about 1.5:1, while SIMR and VPMR do so at ratios of about 1:1 (Figure 3). The results of the CPP docking analysis indicate that all of the CPPs can interact with DNA. Each analysis shows a roughly twisted double-stranded DNA, and this double-stranded DNA can bind to more than one peptide at a time. The CPP binds to the DNA through the N- or C-terminus, sometimes both, and covers the DNA, protecting each strand from exposure to the external environment (Figure 4). The DNA interacts with the CPP through the phosphate backbone, and the CPP interactions occur through the histidine tag, the fusion peptides, and the HIV matrix and HIV Rev domains.
As described above, the DMEM exposure was performed in 10% FBS for three hours, and the results indicate that the three types of CPP that were tested can protect the DNA from the effects of the serum nuclease contained in FBS (Figure 5). In the gel electrophoresis analysis, it can be observed that DNA not contained within the CPP complex (CPP:DNA = 0:1) and DNA mixed with eGFP (non-DNA binding) are degraded by the serum nuclease into smaller fragments of DNA. The smaller DNA fragments migrate faster than the intact DNA and DMEM (Figure 5). The presence of DNA in the CPP complex was confirmed by adding SDS-PAGE buffers to the reaction mixtures. The electrophoresis gel analysis showed that plasmid DNA within the complex with ALMR, SIMR, and VPMR can be observed, whereas DNA mixed with eGFP or not contained in the ALMR, SIMR, and VPMR complexes was degraded (Figure 5).
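The charge ratio between a CPP and DNA referred to above can be estimated as the number of positive charges contributed by the peptide divided by the number of phosphate groups in the DNA. The Python sketch below is an illustration under stated assumptions and not the calculation used in this study: only lysine and arginine are counted as +1 (histidine and the termini are ignored), each nucleotide is taken to carry one phosphate with an average mass of about 330 g/mol, and the peptide sequence and amounts are arbitrary placeholders.

```python
NUCLEOTIDE_MASS = 330.0  # g/mol per nucleotide; approximate average (assumption)


def positive_charges(peptide: str) -> int:
    """Number of Lys (K) and Arg (R) residues, each counted as +1."""
    seq = peptide.upper()
    return seq.count("K") + seq.count("R")


def charge_ratio(peptide_nmol: float, peptide: str, dna_ng: float) -> float:
    """Ratio of peptide positive charges to DNA phosphate negative charges."""
    if dna_ng <= 0:
        raise ValueError("DNA amount must be positive")
    # ng divided by g/mol gives nmol, one phosphate per nucleotide.
    phosphate_nmol = dna_ng / NUCLEOTIDE_MASS
    return peptide_nmol * positive_charges(peptide) / phosphate_nmol


# Placeholder peptide and amounts, for illustration only.
placeholder = "GRKKRRQRRR"
print(positive_charges(placeholder))
print(round(charge_ratio(peptide_nmol=0.05, peptide=placeholder, dna_ng=200.0), 2))
```

Because the ratio scales linearly with the amount of peptide, doubling the peptide at a fixed DNA amount doubles the charge ratio.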

Discussion
Design and bioinformatics analysis of CPP sequences
In this study, CPP design was based on a literature study of each known CPP. Some peptides, such as VP22, have been shown to be CPPs in earlier research. The selection of other peptides, such as FP ASLV, FP SIV, FP Hepatitis B, HIV-1 Matrix, and HIV-1 Rev, was based on a literature study of peptides that are involved in the process of viral infection and have a natural ability to penetrate cells. VP22 is a viral protein of the herpes simplex virus. It is located between the envelope and matrix proteins and constitutes 50% of virion-composing proteins [9]. The VP22 protein is 38 kDa and is phosphorylated. It has the ability to bind to microtubules, causing cytoskeleton reorganization, and it undergoes intercellular trafficking, entering the cell nucleus when the nuclear membrane disintegrates in the early phase of mitosis and binding to chromatin [10,11]. The SIV fusion peptide belongs to the class I fusion peptide group. The SIV envelope is derived from a 160 kDa glycoprotein precursor. This large protein is cut into two parts, gp120 (120 kDa) and gp32 (32 kDa), during the infection process [12]. This cleavage exposes the SIV fusion peptide, which is hydrophobic and, when exposed to the cell membrane, penetrates into the hydrophobic lipid bilayer [12]. ASLV belongs to the retrovirus group, and its fusion peptide is a class I fusion peptide. The ASLV envelope consists of two subunits, a surface subunit (SU) that functions in receptor recognition and a transmembrane subunit (TM) that plays a role in the fusion with cell membranes [13]. The FP ASLV is not located at the N-terminal end of the transmembrane subunit, as is generally the case for class I fusion peptides, but lies in the central part of TM as a so-called internal fusion peptide (IFP) [14]. The IFP is hydrophobic and may form an α-helical structure [15].
Rev is one of the HIV-1 regulatory proteins, and it has the ability to enter the nucleus of a cell [16]. The NLS present in Rev can bind directly to proteins that actively translocate cargo into the nucleus, such as importin β, importin 5, importin 7, and transportin. Cai et al. (2010) demonstrated that amino acids 25-33 of the HIV-1 matrix play a role in DNA binding, and that this binding is independent of the nucleotide sequence [17,18].

CPP production in the prokaryote system
In this study, a prokaryotic expression system was used to produce the CPPs for testing. The success of protein expression in the prokaryotic system is influenced by many factors, including codon bias and GC content, the ability of the host cell RNA polymerase to recognize the transgene promoter, and the ability of the host cell ribosome to recognize the translation initiation region (TIR). Codon bias refers to the difference between the codons present in the synthetic gene and the codons predominantly used by the host cell, which arises because many amino acids can be encoded by more than one triplet. Synonymous codons mostly differ in the last base of the triplet, the wobble position. A mismatch at the wobble position between the tRNA and the mRNA causes the tRNA to pair imperfectly with the mRNA, meaning that protein translation can stop prematurely. The parameter used to measure codon bias is the codon adaptation index (CAI). Transcription of the transgene mRNA is determined by the ability of the RNA polymerase complex to recognize the promoter sequence present in the transgenic DNA. The promoter contained in pQE80L is a T5 promoter, which is derived from bacteriophage and has been shown to work well for expressing recombinant proteins in the prokaryotic system [19]. The T5 promoter is recognized by the RNA polymerase of E. coli and contains two lac operators that increase the binding of the lac repressor, so that recombinant proteins are not expressed in the absence of an inducer, in this case IPTG. The T5 promoter has the advantage that protein expression does not need to occur in a special strain of E. coli, as it does with the T7 promoter. The T7 promoter should be used in E. coli containing the gene encoding the specific RNA polymerase from bacteriophage, such as BL21 (DE3) and its derivatives [20]. The ability of the E. coli RNA polymerase to recognize the T5 promoter means that the CPP proteins, in this case ALMR, SIMR, and VPMR, can be expressed using E. coli DH5α cells, and expression of the recombinant VP22 heat shock protein 70 in E. coli DH5α has been reported by Nishikawa [21].
Purification of proteins in a native state is advantageous because the purified protein remains in a naturally functioning state. The CPP proteins expressed in this study were found in the cell lysate supernatants after sonication, indicating that they are soluble in their native state, but, unfortunately, these three proteins could not be purified in a native state. This is probably because the 6x histidine tag required for purification is inaccessible to NiNTA when the protein is in its native, folded state [19]. Purification under denaturing conditions is used if the protein in question cannot be purified in a native state. CPP ALMR, SIMR, and VPMR were all successfully purified in a denatured state using a guanidine hydrochloride buffer. When exposed to a denaturing environment such as guanidine and urea, the proteins lose their 3D structure, and the 6x histidine tag present in the recombinant proteins becomes more readily accessible to NiNTA. The denatured protein may undergo structural changes at the time of transfection into mammalian cells and may function normally [22]. Additionally, the denatured protein may return to its native form if its buffer is slowly replaced, via dialysis, by one more closely resembling physiological conditions, such as PBS.
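The codon adaptation index mentioned above is commonly defined as the geometric mean of the relative adaptiveness of each codon in a gene, where the relative adaptiveness of a codon is its usage frequency divided by that of the most frequently used synonymous codon. The Python sketch below illustrates this definition only. It does not reproduce the optimization performed by the DNA 2.0 software, the function names are hypothetical, and the codon usage table contains made-up numbers rather than measured E. coli frequencies.

```python
import math

# Toy codon usage table: fraction of each synonymous codon used by the host.
# These numbers are invented for illustration and are NOT real E. coli values.
TOY_USAGE = {
    "K": {"AAA": 0.75, "AAG": 0.25},
    "R": {"CGT": 0.40, "CGC": 0.40, "AGA": 0.10, "AGG": 0.10},
}


def relative_adaptiveness(usage):
    """Map each codon to (its frequency / frequency of the best synonymous codon)."""
    weights = {}
    for synonymous in usage.values():
        best = max(synonymous.values())
        for codon, freq in synonymous.items():
            weights[codon] = freq / best
    return weights


def codon_adaptation_index(dna, weights):
    """Geometric mean of the weights of all scorable codons in the sequence.

    Codons missing from the weight table (for example ATG or stop codons)
    are skipped; trailing bases that do not fill a codon are ignored.
    """
    dna = dna.upper()
    usable = len(dna) - len(dna) % 3
    codons = [dna[i:i + 3] for i in range(0, usable, 3)]
    scored = [weights[c] for c in codons if c in weights]
    if not scored:
        raise ValueError("no scorable codons in sequence")
    return math.exp(sum(math.log(w) for w in scored) / len(scored))


weights = relative_adaptiveness(TOY_USAGE)
print(codon_adaptation_index("AAACGT", weights))  # only preferred codons
print(codon_adaptation_index("AAGAGA", weights))  # only rare codons
```

A sequence built only from the preferred codons scores 1, and the index falls toward 0 as rarer codons are used, which is why codon optimization for the host aims to raise it.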

CPP binding analysis of DNA and CPP protection of DNA against the effects of serum nucleases
The ability of peptides to bind to and protect DNA from the effects of nuclease degradation is one of the requirements that a CPP must satisfy [23]. DNA and protein interactions occur through electrostatic bonds between the negative charges present in the DNA phosphate backbone and the cationic groups present in the CPP. The bond neutralizes the negative charge of the DNA, and this causes the DNA to become more compact [23]. This compaction decreases the migration rate of DNA toward the positive pole in an electrophoresis gel when compared to the migration rate of the control DNA (DNA not combined with the CPP). The DNA can become so shielded that no DNA bands are visible in the gel electrophoresis analysis. The CPP:DNA interaction can occur through electrostatic bonding, hydrogen bonding, and hydrophobic bonding. Electrostatic bonds are salt bridges independent of both protein and DNA structure [24] and are formed between negatively charged DNA phosphates and positively charged amino acids, such as arginine (R) and lysine (K), through their guanidinium and ammonium groups [25]. The strength of electrostatic bonding is influenced by water and salinity, with lower water or salt contents causing a stronger bond between DNA and protein, because both water and salt molecules can screen the charged groups of the protein and the DNA [26]. Hydrogen bonds occur between two adjacent molecules and can form between the R groups, the backbone amine groups, or the carbonyl groups of the amino acid chain and the DNA bases or the oxygen atoms present in the DNA phosphate [27]. Peptides having an α-helical structure can interact with the major groove of DNA through hydrogen bonds, which can be seen in the pattern of the VPMR interactions via the HIV matrix and the ALMR interactions through fusion peptides with the DNA bases.
Hydrophobic bonds occur between two hydrophobic molecules when they try to maintain their shape in water. As the number of water molecules that lie between the DNA and the proteins decreases, the DNA and proteins will interact through complementary structures of both molecules [26]. A plasmid is a double-stranded DNA with a helical structure, which is formed as a result of the two DNA strands twisting along the long axis of the DNA chain. The phosphate backbone is on the outside of the twist, while the sugars and bases are on the inside. Based on the CPP:DNA bond prediction analysis, more than one CPP can bind to the same or different DNA strands. CPP bonding to the phosphate backbone as well as to the bases contained in the groove will help to shorten the distance of the twist (turn) of the DNA. This causes the double strand of DNA to become more compact or condensed. In the CPP-protective test against the effect of serum nucleases, it was shown that the CPP protects the DNA from nuclease degradation. In the body, nucleases are found in both the serum and extracellular fluid, so the CPP would need to be able to protect the DNA. The CPP:DNA binding prediction results show that the CPP:DNA interaction occurs almost entirely along the DNA surface. This bonding pattern will place the DNA in a peptide "cage" that is difficult for the nucleases in the environment to access [28]. In addition, the steric hindrance of the larger protein molecules prevents the nuclease from even approaching the DNA.

Conclusion
In this study we have designed and synthesized novel CPPs with the ability to bind DNA and protect it from serum nucleases. We expect these CPPs to contribute to the design and synthesis of new, recombinant peptide CPPs for DNA shuttling through cell membranes.