FoldX as Protein Engineering Tool: Better Than Random Based Approaches?

Improving protein stability is an important goal for basic research as well as for clinical and industrial applications, but no commonly accepted and widely used strategy for efficient engineering is known. Besides random approaches such as error-prone PCR and physical techniques to stabilize proteins, e.g. immobilization, in silico approaches that enable target-oriented mutagenesis are gaining more attention. In this review, different algorithms for the prediction of beneficial mutation sites to enhance protein stability are summarized, and the advantages and disadvantages of FoldX are highlighted. The question of whether the prediction of mutation sites by the FoldX algorithm is more accurate than random-based approaches is addressed.


Introduction
Increasing protein stability is a desirable goal for different life science purposes, including the design of therapeutic proteins such as antibodies, human cell biology and biotechnology. Such improvements are expected to result in lower process costs and in enhanced long-term stability of the applied proteins. Enhanced protein stability can in general be achieved in various ways, e.g. by increasing thermostability, salt tolerance or tolerance towards organic solvents, and consequently involves different bioinformatics approaches. The application of proteins for medical and chemical purposes focuses on the fields of biosensors (e.g. blood sugar test strips [1]), biomedical drugs (e.g. antibodies against cancer cells [2]) and the synthesis of complex as well as chiral substances for the food industry (e.g. high fructose corn syrup [3]) and the pharmaceutical industry (e.g. sitagliptin [4]) [5]. Biosensors for medical use, which assist in diagnosing diseases such as breast cancer [6], diabetes [7] or infectious diseases [8], obviously have to be functional and reliable for a defined period of time. It also seems to be beneficial, for example, to obtain more thermostable antibodies for the treatment of cancer [9]. Furthermore, in the synthesis of drugs and pharmaceutically relevant intermediates, the applied enzymes have to remain active and functional over long batch times to prevent drastic increases in costs per unit of product [10][11][12][13]. For industrial enzymes, improved stability against heat, solvents and other relevant process parameters, e.g. acidic or basic pH, often becomes crucial [14]. In addition, improved thermostability of enzymes might prevent thermal inactivation and conformational changes at higher reaction temperatures, which could in turn help to raise turnover rates and substrate concentrations [15][16][17][18][19]. According to the Q10 rule of thumb, biological systems and enzymes tend to have a Q10 factor of 2, i.e. a temperature increase of about 10 K doubles the reaction rate [20,21]. Conversely, stabilization can also lead to more rigid enzymes, which are less active at the same temperature but show the same activity at elevated temperatures. This can be observed when enzymes from hyperthermophilic and mesophilic sources are compared with respect to their reaction rates [22]. A thermostabilized enzyme might be less active at a given temperature but remain active for longer at higher temperatures, which allows the Q10 rule to be applied provided that the activity can be maintained for longer time periods at elevated temperatures [23,24]. However, it is also possible that a thermostabilized enzyme is not impaired in activity at moderate temperatures and is even more active at higher temperatures [25][26][27][28][29]. Arnold et al. demonstrated that an enzyme can be developed simultaneously towards higher stability and activity [24,30,31,45].
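The Q10 rule of thumb mentioned above can be written as a one-line formula: the rate changes by a factor of Q10 raised to the power ΔT/10. The following is a minimal sketch; the function name and the default Q10 of 2 are illustrative choices taken from the rule of thumb, not measured values for any specific enzyme.

```python
def q10_rate_ratio(delta_t_kelvin: float, q10: float = 2.0) -> float:
    """Factor by which a reaction rate changes for a temperature shift.

    Uses the Q10 rule of thumb: rate_ratio = q10 ** (delta_T / 10 K).
    The default q10 of 2.0 is the rule-of-thumb value, not a measurement.
    """
    return q10 ** (delta_t_kelvin / 10.0)


if __name__ == "__main__":
    # Rate change for shifts of 10 K and 20 K, assuming the enzyme stays
    # folded and active at the higher temperature.
    for shift in (10.0, 20.0):
        print(f"+{shift:.0f} K -> rate x {q10_rate_ratio(shift):.1f}")
```

The estimate only holds as long as the enzyme remains active at the elevated temperature, which is exactly what thermostabilization aims to ensure.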
Protein denaturation and degradation due to both heat and solvents are based on the same protein unfolding processes. The most important forces for protein stability, and therefore the relevant targets for its improvement, are intramolecular interactions, i.e. disulfide bridges, ionic interactions, hydrogen bonds, hydrophobic interactions and core packing [22]. The rigidity and flexibility of proteins seem to be the key parameters [33], and both can be influenced by using immobilization techniques or enzyme engineering in order to extend the durability of proteins in application.
Computational and Structural Biotechnology Journal 16 (2018) 25-33
The well-known technique of protein immobilization is applied, for example, to antibodies to increase their thermostability [34,35]. Besides immobilization, directed evolution is an alternative approach to improve thermostability, but the existence of a robust high-throughput screening assay for the selected protein is an important prerequisite [11][36][37][38][39][40][41]. For enzymes, activity can be used as a parameter for functionality at elevated temperatures, but for non-catalytic proteins a more sophisticated assay or even protein purification is necessary. Furthermore, the number of protein variants that have to be created, e.g. by error-prone PCR or other techniques, is mostly about 10^3 to 10^5 or even higher. However, in the case of enzymes, selection can easily be performed by heating unpurified crude cell extracts [42]. Using this technique, the protein melting temperature Tm can be improved by far more than 10°C in the best screenings [42][43][44][45][46][47]. The artificial evolution approach can result in a 140-fold increase in long-term enzymatic activity, as demonstrated for the alkaline pectate lyase [48]. Evolutionary approaches can also be used for antibodies or antibody fragments [49]. For example, the protein melting temperature of a human antibody domain was improved by >10°C [50].
Directed evolution can be a successful strategy but is not always applicable, especially when a high-throughput screening is missing or when protein purification is necessary for stability measurements. Therefore, this mini-review focuses on protein/enzyme engineering for thermostabilization using structure-guided site-directed mutagenesis. This strategy helps to reduce the screening effort and also the costs, which are an issue in large screenings. Furthermore, we selected the popular FoldX algorithm and would like to answer the question: how powerful is FoldX for common protein stability improvements? One reason is that FoldX is a frequently used algorithm, and many studies on protein stabilization experiments are described in the literature. A second reason is the user-friendliness of FoldX, because it can easily be used as a plugin in the protein structure visualizer YASARA [51]. In contrast, other in silico approaches are command-line based and lack a graphical interface, which entails a larger workload for scientists who are not familiar with programming languages such as Python, Java or R.

Computational Approaches for Stability Engineering
Besides FoldX, several other algorithms for site-directed mutagenesis are known, which target different inter- and intra-protein interactions. One target is the introduction of artificial disulfide bridges into proteins. As a covalent bond, a disulfide bridge is a strong link that helps to stabilize the 3D structure within a protein chain or between monomers; it can raise the protein melting temperature (Tm) by up to 30°C and increase thermal stability by >40% at distinct temperature levels [52][53][54]. However, the introduction of disulfide bridges can also lower the Tm, by up to 2.4°C [55]. Starting with the protein structure as a basis for molecular dynamics simulations and energy calculations, amino acid positions can be selected that are potentially suitable for engineering disulfide bridges. However, these approaches require a profound understanding of different prediction and calculation software, often without graphical interfaces [56]. Two examples are FRESCO, an approach for the fast recognition of disulfide mutation sites, and the open access web tool "Disulfide by Design 2" (DD2), but only DD2 can easily be used with a graphical interface [57][58][59]. Using FRESCO, a temperature improvement of 35°C was achieved by combining single disulfide bridges. Jo et al. increased the Tm of the α-type carbonic anhydrase by 7.8°C by introducing a disulfide bond that was efficiently predicted by DD2 [60]. Despite these promising examples, it has to be mentioned that the extensive FRESCO strategy should not be understood as an end-user script, but rather as a blueprint for improving thermostability. Wijma et al. further improved FRESCO by integrating FoldX and Rosetta as additional energy calculation tools and combined these results with the Dynamic Disulfide Discovery algorithm, which is based on molecular dynamics simulations [57,145]. 
After in silico elimination of less stable variants, they expressed, tested and combined beneficial point mutations and disulfide bonds to obtain two variants with drastic Tm increases of 34.6 and 35.5°C, respectively. However, this strategy is very laborious, since many point mutations have to be tested and combined.
Besides the possible de novo design of disulfide bridges, further computational methods such as helix dipole stabilization or core repacking exist. Core repacking targets only the core region of proteins in order to increase hydrophobic interactions. Vlassi et al. showed that a reduction of hydrophobic interactions decreases protein stability [61], and computational tools like RosettaDesign and Monte Carlo simulations are used for the optimization process [62][63][64]. Adapted and automated RosettaDesign frameworks for repacking are available, but profound programming skills are needed to apply them [64]. In contrast, helix dipole stabilization methods improve the molecular interactions at the ends of helices, which can also result in a drastic increase in Tm of >30°C [65,66]. However, this strategy requires elaborate electrostatic calculations and molecular simulations to select mutation sites. Besides these strategies, consensus sequences derived from multiple sequence alignments can also help to improve protein stability. In so-called consensus-guided mutagenesis, sequences are compared according to their amino acid frequencies to elucidate consensus sequences. Replacing amino acid residues at certain positions with the most prevalent ones often results in highly beneficial energy improvements that stabilize proteins [67][68][69][70]. Huang et al. demonstrated that the consensus approach made it possible to improve the T50^15 (the temperature at which the enzyme activity is halved within 15 min) of the reductase CgKR1 by >10°C [71].
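The idea behind consensus-guided mutagenesis can be illustrated with a few lines of Python. This is a minimal sketch, assuming the homologous sequences are already aligned to equal length and use "-" as gap symbol; the function names and the toy alignment are hypothetical and do not reproduce the workflow of any cited study.

```python
from collections import Counter


def consensus_sequence(aligned_seqs, gap="-"):
    """Return the most frequent non-gap residue for each alignment column."""
    if not aligned_seqs:
        return ""
    length = len(aligned_seqs[0])
    if any(len(seq) != length for seq in aligned_seqs):
        raise ValueError("all aligned sequences must have the same length")
    consensus = []
    for column in zip(*aligned_seqs):
        counts = Counter(res for res in column if res != gap)
        # An all-gap column has no consensus residue; keep the gap symbol.
        consensus.append(counts.most_common(1)[0][0] if counts else gap)
    return "".join(consensus)


def consensus_mutations(target, aligned_seqs, gap="-"):
    """List (alignment column, wild type, consensus) where the target deviates."""
    cons = consensus_sequence(aligned_seqs, gap)
    return [
        (index + 1, wild_type, residue)
        for index, (wild_type, residue) in enumerate(zip(target, cons))
        if wild_type != gap and residue != gap and wild_type != residue
    ]


if __name__ == "__main__":
    # Toy alignment, invented for illustration only.
    alignment = ["MKVL", "MKIL", "MRIL", "MKIL"]
    print(consensus_sequence(alignment))
    print(consensus_mutations("MKVL", alignment))
```

Positions are reported as 1-based alignment columns, so they still have to be mapped back to the residue numbering of the target structure. Ties between equally frequent residues are resolved arbitrarily here; a real analysis would usually apply a frequency threshold instead.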

Un/folding Energy Algorithms
At least 22 standalone calculation tools are described for the prediction of beneficial single and multiple point mutation sites that reduce the Gibbs free energy of proteins. The broad diversity of these standalone tools was reviewed by Modarres et al.; besides the FoldX algorithm, other tools such as PoPMuSiC, CUPSAT, ZEMu, the iRDP web server or SDM were mentioned [72][73][74][75]. These calculation tools are structure or sequence dependent and use energy calculation functions or machine learning algorithms. Databases collecting changes in protein stability (e.g. Gibbs free energy changes and melting temperatures) are also available, such as ProTherm (others are e.g. MODEL and DSBASE), but it should be mentioned that 70% of the logged mutations are destabilizing, which leads to unintended biases [73,76]. Besides the more popular algorithms, others such as mCSM, BeAtMuSiC and ENCoM have been published, which use different calculation approaches [77][78][79]. Moreover, it is also possible to use crystallographic data obtained by X-ray analysis of the protein structure: the B-factor is an indicator of the flexibility of positions within the protein, and Reetz et al. used this factor to increase protein stability [80].

FoldX
Considering the diversity of available algorithms, it seems to be very difficult to choose an efficient tool for protein stabilization. In this review we concentrate on the force field algorithm FoldX, which we have used ourselves to create a more stable ω-transaminase [81]. The force field algorithm, which was originally created by Guerois et al., became popular as a web tool in 2005 through Schymkowitz et al. and has been refined up to the current version, FoldX 4.0 [82][83][84].
The software package FoldX includes different subroutines, e.g. RepairPDB, BuildModel, PrintNetworks, AnalyseComplex and Stability. For example, the repair function of FoldX minimizes the energy of a protein structure model by rearranging side chains, and the function BuildModel introduces mutations and optimizes the structure of the new protein variant. The energy function of FoldX is only able to calculate accurately the energy difference between the wild type and a variant of the protein [83].
FoldX is also able to calculate total energies of objects, but this function is only suitable for checking whether there is a problem with the structure. The total energy results are not able to predict experimental results [51,83]. The core function of FoldX, the empirical force field algorithm, is based on free energy (ΔG) terms and aims to calculate the change of ΔG in kcal mol⁻¹ (Eq. (1)). This equation includes terms for polar and hydrophobic desolvation and for the hydrogen bond energy (ΔGwb) of a protein interacting with the solvent and within the protein chain. Increased protein rigidity works against entropy and consequently results in entropy costs.
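Eq. (1) itself is not reproduced in this text. As a reading aid, the form in which the FoldX energy function is commonly written in the FoldX literature is sketched below. The term names are an assumption taken from that convention and should be checked against [83]; they are consistent with the weights a to l and the terms ΔG_wb and ΔG_kon named in the surrounding paragraphs.

```latex
\begin{equation}
\begin{aligned}
\Delta G ={}& a\,\Delta G_{\mathrm{vdw}} + b\,\Delta G_{\mathrm{solvH}} + c\,\Delta G_{\mathrm{solvP}}
            + d\,\Delta G_{\mathrm{wb}} + e\,\Delta G_{\mathrm{hbond}} \\
           &+ f\,\Delta G_{\mathrm{el}} + g\,\Delta G_{\mathrm{kon}}
            + h\,T\Delta S_{\mathrm{mc}} + k\,T\Delta S_{\mathrm{sc}} + l\,\Delta G_{\mathrm{clash}}
\end{aligned}
\tag{1}
\end{equation}
```

In this notation, the van der Waals, hydrophobic and polar desolvation, water bridge, hydrogen bond, electrostatic and interface (kon) contributions are followed by the entropy costs of fixing the main chain (mc) and the side chains (sc) and by a term for steric clashes.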
Furthermore, the energy algorithm also addresses the free energy change at the interfaces of oligomeric proteins. This is mainly the term ΔGkon, which calculates the electrostatic contribution of interactions at interfaces [83]. The parameters that are important for the energy calculation were determined in laboratory experiments, e.g. for amino acid residues, and explored on protein chains. Besides these distinct parameters, the letters a to l in the total energy equation represent the weights of the separate terms [83]. The algorithm works with optimal accuracy when the difference in hypothetical unfolding energy between the wild type and a mutated protein is determined. For this purpose, FoldX uses the 3D structure to calculate the hypothetical unfolding energy. The algorithm was first implemented as a freely available web server tool and is now commercially available software, which can be used free of charge for academic purposes. As a prerequisite, a highly resolved crystal structure is necessary to calculate the energy changes for site-directed mutagenesis experiments. Users can also automate the calculations, e.g. by using the programming language Python to calculate amino acid exchanges at every position of a whole protein [85,86]. Furthermore, FoldX shows very good performance with respect to calculation time, even on single-core computers. Compared to e.g. ZEMu, FoldX needs only half the time for calculating single-site mutations (calculated on a single processor), and it is faster than Rosetta-ddG [75,87]. As mentioned earlier, it can be used with a graphical user interface as a plugin in YASARA, which opens FoldX to a broad community of researchers.
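The automation mentioned above, i.e. exchanging every position for all other amino acids, can be prepared with a short script. The following is a minimal sketch that only generates the list of single substitutions. The string layout (wild-type residue, chain, residue number, mutant residue, terminated by a semicolon) is assumed to match the mutation lists read by the FoldX BuildModel command and should be verified against the FoldX documentation; the function name and the example sequence are hypothetical.

```python
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # the 20 standard residues, one-letter code


def saturation_mutations(sequence, chain="A", first_residue_number=1):
    """Yield every single substitution of `sequence` as a mutation string.

    Each string has the layout <wild type><chain><position><mutant>;
    which is assumed here to be the per-line format of a FoldX mutation list.
    Residue numbers are taken as consecutive, which is an assumption: real
    PDB files can contain gaps or insertion codes.
    """
    for offset, wild_type in enumerate(sequence):
        position = first_residue_number + offset
        for mutant in AMINO_ACIDS:
            if mutant != wild_type:
                yield f"{wild_type}{chain}{position}{mutant};"


if __name__ == "__main__":
    # Toy dipeptide: 2 positions x 19 alternatives = 38 variants.
    variants = list(saturation_mutations("MK"))
    print(len(variants), variants[:3])
```

For a protein of N residues this produces 19 × N variants, which illustrates why a subsequent filtering step is needed before any experiment is planned.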

FoldX-applications
FoldX has been applied in different stability studies, especially in protein design to predict whether distinct mutations are destabilizing. FoldX has thus proven beneficial for different approaches and is not strictly limited to a distinct function. Moreover, peptides, individual domains and multi-domain proteins can all be addressed [88,89]. The algorithm has been used to explain and predict stability improvements when designing solvent-stable enzymes. The group of U. Schwaneberg designed a laccase with improved resistance in ionic liquids, for use with poorly soluble lignin lysates, and with increased tolerance towards high salt molarity [90]. Besides protein energy calculations, it is also possible to calculate the energy changes of DNA-protein interactions [91]. Furthermore, FoldX is implemented in many approaches such as FireProt, FRESCO and TANGO, or is used in combination with Voronoia 1.0. Voronoia helps to engineer protein core packing and is based on energy calculations using FoldX as the force field algorithm [92,93]. The program FRESCO (Framework for Rapid Enzyme Stabilization by Computational libraries) joins Rosetta with FoldX energy calculations and combines single point mutations with disulfide predictions for drastic energy improvements of enzymes [57]. The direct alternative to FoldX is the Rosetta energy algorithm. It was shown that Rosetta predicts other possible mutation sites for energy improvements than FoldX: only 25% of all mutations were predicted by both algorithms for the same protein [57]. Additionally, the authors of this work excluded 52% of the selected mutations manually, e.g. hydrophobic mutations at surface-exposed sites and mutations to or from a proline residue. In the end, around 65% of the predicted mutation sites were calculated by FoldX, and 35% of all predicted sites were discarded. 
Voronoia in combination with FoldX helps to predict and to explain why hydrophobic interactions in the core region can have a huge impact on protein stability, as was demonstrated for the thermophilic lipase T1 [93]. Another approach is TANGO, which helps to predict the aggregation of proteins and, in combination with FoldX, is a powerful tool for investigating predicted mutations with regard to solubility, e.g. protective site-directed mutations for the Alzheimer's Aβ peptide [83,94,95]. Furthermore, FoldX can also support protein design. For engineering a zinc finger nuclease, FoldX was used as a prediction algorithm to detect whether the binding energy to a distinct DNA sequence was increased or decreased [96]. FoldX can also help to estimate protein-protein binding energies and the resulting stabilities of protein complexes. Szczepek et al. redesigned the interface between dimeric zinc finger nucleases using FoldX as a prediction tool [97]. After deeper in silico calculations, only 9.3% of the predicted variants were expressed and proved to be beneficial for stability [97]. Considering these and other experiments, the performance of FoldX should be critically evaluated.
Therefore, we gathered FoldX experiments and analyzed the available publications to determine whether FoldX was helpful for increasing protein stability (Table 1). In general, the number of standalone FoldX calculations for protein stability improvement in the literature is relatively low compared to approaches that use FoldX as an additional tool for stability calculation. Furthermore, FoldX is often used only as an algorithm to explain the impact of mutagenesis on protein stability or to predict protein-protein or protein-DNA binding. Therefore, in Table 1 only mutations with effects based on FoldX predictions are pointed out, even when the authors used additional calculation tools. When no pre-selection of distinct protein sites is indicated, a complete calculation of every position in the protein was performed. In this case, every amino acid was exchanged with the other 19 standard amino acid residues. This calculation setup very quickly results in high numbers of predicted variants. One criterion for excluding many variants is to set an energy threshold for ΔΔG, between −0.75 and −5 kcal mol⁻¹ for stabilizing mutations and >+1 kcal mol⁻¹ for destabilizing mutations, in accordance with the Gaussian distribution of FoldX predictions (SD for FoldX 1.78 kcal mol⁻¹ [95]) [98]. With this preselection, a large number of variants can be excluded. Furthermore, mutations near active sites, proline residue mutations or variants that seem to be critical for the protein structure can also be excluded. In addition to the manual exclusion of variants, MD simulations can be performed to exclude further variants. To indicate the degree of improvement, the protein melting temperature Tm or the activity half-life is frequently used. The largest positive changes in stability were reached for the T1 lipase, phosphotriesterase, flavin mononucleotide-based fluorescent protein and the haloalcohol dehalogenase, ranging from 8 up to 13°C using single-site mutations [99]. 
However, FoldX also allows the prediction of destabilizing mutations, which was performed very accurately for the thermoalkalophilic lipase with a decrease in Tm of 10°C. Notably, stabilizing predictions are useful for biotechnology and are therefore mentioned in studies with a biotechnological background, whereas destabilizing predictions seem to be more applicable to human disease studies [95]. Besides mere stability studies, protein design was also performed towards specific enzyme-DNA binding or antibody-antigen binding, which can reduce the size of antibody libraries for distinct antigen targets. Moreover, FoldX can also be used to adapt or to select
Table 1 Summary of different FoldX applications for single point mutations regarding stability and ligand binding. Tm is listed for changes in protein melting temperature. ΔΔG displays the change in free energy caused by mutation/design of proteins. "Criteria" describes the settings for experiments. "Cut-off" means that the authors excluded the indicated FoldX predictions (with a higher or lower ΔΔG) from further experiments. ΔΔG is defined as: ΔΔG = ΔG_fold(mutation) − ΔG_fold(wild type).
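The ΔΔG definition from the caption of Table 1 and the cut-off criteria described in the text can be combined into a small filter. This is a minimal sketch: the default thresholds of −0.75 and +1 kcal mol⁻¹ are taken from the text, where the stabilizing cut-off varies between −0.75 and −5 kcal mol⁻¹ depending on the study. The function names and the label "neutral" for values between the thresholds are illustrative assumptions.

```python
def ddg(dg_fold_mutant, dg_fold_wild_type):
    """ΔΔG = ΔG_fold(mutation) − ΔG_fold(wild type), in kcal/mol."""
    return dg_fold_mutant - dg_fold_wild_type


def classify(ddg_value, stabilizing_cutoff=-0.75, destabilizing_cutoff=1.0):
    """Sort a predicted ΔΔG into one of three classes.

    Negative ΔΔG means the variant is predicted to be more stable than
    the wild type. Values between the two cut-offs are treated as within
    the noise of the prediction and labeled "neutral" (an assumed label).
    """
    if ddg_value <= stabilizing_cutoff:
        return "stabilizing"
    if ddg_value > destabilizing_cutoff:
        return "destabilizing"
    return "neutral"


if __name__ == "__main__":
    # Invented folding energies (kcal/mol) for illustration only.
    wild_type = -1.0
    for name, mutant in {"variant 1": -3.0, "variant 2": -0.8, "variant 3": 0.5}.items():
        value = ddg(mutant, wild_type)
        print(name, round(value, 2), classify(value))
```

A stricter stabilizing cut-off reduces the number of variants to be tested at the price of possibly discarding true positives, which is the trade-off behind the range of thresholds reported in the literature.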

Accuracy of FoldX
From the FoldX studies summarized in Table 1 it can be deduced that the crystal structure quality is crucial for accurate calculations. From a benchmark test on myoglobin mutants, Kepp concluded that some protein stability prediction algorithms are extremely sensitive to crystal structure quality, whereas others are very robust [121]. It seems plausible that the relevant interactions occur on the scale of atomic resolution and that the crystal structure quality therefore has an important influence on energy calculations [107][121][122][123]. However, for the prediction algorithms PoPMuSiC, I-Mutant 3.0 and other tools, the influence of the crystal structure quality was only in the order of 0.2 kcal mol⁻¹ (standard deviation using different structure data of superoxide dismutase 1) [123]. According to Christensen et al., FoldX belongs to the more structure-sensitive methods, and Kepp suggested using only structures solved at near-atomic resolution [107,121]. With reference to Table 1, all cited studies were based on crystal structures with a resolution better than 3.3 Å and an average resolution of 1.87 Å, which is close to atomic resolution (1 Å is approximately the diameter of an atom plus the surrounding cloud of electrons). Furthermore, protein-protein interactions might also influence the prediction power. They are not addressed in some performance studies, such as that of Tokuriki et al., because only monomeric proteins were selected [124], but e.g. Pey et al. and Dourado and Flores showed that oligomers can also be used for calculations (using extra terms: ΔGkon for electrostatic interaction, ΔStr for translational and rotational entropy) [125,75]. The root-mean-square deviation (RMSD) in a dataset of protein complexes with known energy impacts was determined to be 1.55 kcal mol⁻¹ (for single mutants) [75]. In contrast, the algorithm ZEMu addresses such mutations at interfaces better than FoldX [75].
Based on experimental results, it can be concluded that the prediction of destabilizing mutations is more accurate than the prediction of stabilizing mutations. After pre-selection of experiments that aimed to increase stability, the approximate success rate for mutations predicted as stabilizing (according to their negative ΔΔG values) is only 29.4% (focusing on 13 single mutation experiments). For experiments that focused on detecting destabilizing mutations or simply on proving destabilizing events, the sample size is only five, but the average success rate is 69%. However, given the small sample size, a valid statement about success rates cannot be made. It is likely that many unsuccessful experiments were not published and that the real success rate might therefore be much lower. Khan et al. evaluated the performance of 11 protein stability predictors using a dataset containing >1700 mutations in 80 proteins, which were taken from the ProTherm database. It was shown that FoldX was among the three most reliable algorithms, predicting 86 true positives and 133 false positives for stabilization from 776 variants, which is a success rate of 64%. Only Dmutant and MultiMutate were comparably successful in predicting stabilization events [102].
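How such success rates are derived from counts can be made explicit with two standard measures. The following is a minimal sketch; the function names are illustrative, and only the counts of 86 true positives and 133 false positives are taken from the text. The share of correct calls among the variants predicted as stabilizing (precision) computed from these two counts is about 39%, so the 64% quoted above presumably refers to a different measure, such as an overall accuracy that also counts correctly predicted destabilizing variants. That reading is an assumption, not a statement from Khan et al.

```python
def precision(true_positives, false_positives):
    """Fraction of positive predictions that are correct."""
    predicted_positive = true_positives + false_positives
    if predicted_positive == 0:
        raise ValueError("no positive predictions to evaluate")
    return true_positives / predicted_positive


def accuracy(true_positives, true_negatives, false_positives, false_negatives):
    """Fraction of all predictions (positive and negative) that are correct."""
    total = true_positives + true_negatives + false_positives + false_negatives
    if total == 0:
        raise ValueError("no predictions to evaluate")
    return (true_positives + true_negatives) / total


if __name__ == "__main__":
    # Counts for stabilizing predictions as quoted in the text.
    print(f"precision = {precision(86, 133):.3f}")
```

The distinction matters for protein engineering, because the experimental effort depends on how many of the variants predicted as stabilizing turn out to be real hits.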
Compared to other results, this success rate might be higher than expected. As an example of a performance investigation of an adapted FoldX algorithm, laccase isoenzymes were used. The large calculation setup included 9424 FoldX predictions per isoenzyme using an adapted algorithm. These calculations were evaluated using molecular dynamics simulations, and different additional settings within FoldX were tested. As mentioned before, the authors remarked that FoldX needs high-resolution crystal structures of proteins and that FoldX performs well in predicting stability trends, but not with quantitative accuracy [75]. Using the deciphering protein (DPP) as an example, Kumar et al. showed on the basis of 54 DPP mutants how accurate FoldX predictions are compared to those of other tools. The study focused on destabilizing mutation events that were described in medical data sets of DPP and concluded that the R-value (correlation coefficient) was only 0.45 to 0.53. The resolution of the crystal structures in this study ranged between 1.07 and 1.93 Å [77]. Potapov et al. used a protein data set of 2156 variants in 59 proteins for their performance investigation. The crystal structure qualities were not reported. However, they concluded that 81.4% of Tm changes were qualitatively predicted correctly [127]. Furthermore, Potapov et al. titled their analysis of different protein stability tools "Assessing computational methods for predicting protein stability upon mutation: good on average but not in the details" and showed that FoldX has the potential to predict whether a certain mutation is stabilizing or destabilizing, but that its prediction power decreases when the calculated ΔΔG is correlated with the experimental ΔΔG or with stability parameters like Tm [127]. 
The correlation coefficient R obtained by plotting theoretical ΔΔG against experimental ΔΔG values from databases was only 0.5 (for negative and positive ΔΔG), but it also depends on the crystal structure and on the nature of the protein [127].
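The two statistics used in these benchmarks, the correlation coefficient R and the root-mean-square deviation between predicted and experimental ΔΔG, can be computed as follows. This is a minimal sketch using the standard Pearson formula; the function names and the example values are invented for illustration and are not taken from any of the cited data sets.

```python
import math


def pearson_r(predicted, experimental):
    """Pearson correlation coefficient between two equally long value lists."""
    n = len(predicted)
    if n != len(experimental) or n < 2:
        raise ValueError("need two lists of equal length with at least 2 values")
    mean_p = sum(predicted) / n
    mean_e = sum(experimental) / n
    cov = sum((p - mean_p) * (e - mean_e) for p, e in zip(predicted, experimental))
    var_p = sum((p - mean_p) ** 2 for p in predicted)
    var_e = sum((e - mean_e) ** 2 for e in experimental)
    if var_p == 0 or var_e == 0:
        raise ValueError("correlation is undefined for constant data")
    return cov / math.sqrt(var_p * var_e)


def rmsd(predicted, experimental):
    """Root-mean-square deviation between predictions and measurements."""
    n = len(predicted)
    if n != len(experimental) or n == 0:
        raise ValueError("need two non-empty lists of equal length")
    return math.sqrt(sum((p - e) ** 2 for p, e in zip(predicted, experimental)) / n)


if __name__ == "__main__":
    # Invented ΔΔG values in kcal/mol, for illustration only.
    calc = [-1.2, 0.4, 2.1, 3.0, -0.3]
    expt = [-0.5, 0.9, 1.0, 2.2, 0.6]
    print(f"R = {pearson_r(calc, expt):.2f}, RMSD = {rmsd(calc, expt):.2f} kcal/mol")
```

A predictor can rank mutations in the right order (high R) and still be off in absolute terms (high RMSD), which matches the observation that trends are predicted better than quantitative values.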
For better comparison, we summarized the statistical parameters given for the different algorithms, derived from Kumar et al. and other studies (as indicated), in Table 2, although we were not able to find a full set of data for every algorithm. For example, Kumar   human superoxide dismutase1 [73] which is involved in the motor neuron disease [131]. In this benchmark test, FoldX and PoPMuSiC performed best by far. FoldX showed a correlation coefficient R of 0.53 and a standard error of 1.1 kcal mol⁻¹ in this test, which was only slightly surpassed by PoPMuSiC [77]. In conclusion, the authors described FoldX as more sensitive and accurate for difficult mutation sites, but PoPMuSiC as more accurate across all kinds of mutations. They also demonstrated that FoldX can interpret patient data for dismutase diseases quite well, with an R of 0.45 [130]. In contrast, in an investigation of 582 mutants of seven proteins, R was 0.73 with a standard deviation of 1.02 kcal mol⁻¹ [134]. The best result was a correlation coefficient of 0.73 for a lysozyme structure [127], which increased to 0.74 when only hotspot areas were chosen for prediction. The standard deviation (1.37 kcal mol⁻¹) was in the same range as that of Broom et al. (1.78 kcal mol⁻¹) [95]. However, Tokuriki et al. calculated that the average ΔΔG for any protein is +0.9 kcal mol⁻¹, which clearly shows that the probability of destabilization events is much higher and implies that the number of theoretically stabilizing mutants is much lower [135]. Not only does the number of theoretically stabilizing mutations seem to be lower, but the correlation between predicted and real stabilization is also weaker than for destabilizing mutations [57,111]. In contrast, Khan et al. 
showed for human proteins that FoldX predicts more stability-increasing variants than destabilizing variants. This might be a hint that human proteins are relatively nonrigid and less thermostable compared to proteins from other sources, or that the distribution of ΔΔG over the frequency of stabilizing and destabilizing mutations simply depends on the protein [102]. Furthermore, the ΔΔG energies calculated by FoldX deviate from real ΔΔG measurements. The values can be recalculated using an empirical correction: ΔΔG_experimental = (ΔΔG_calculated + 0.078) / 1.14 [135,136]. Depending on the method used to evaluate FoldX, the accuracy is in the range from 0.38 to 0.80 [102,129]. Obviously, FoldX can predict positions that are important for stability, but the discrimination between different amino acid residues at one site is not very powerful, e.g. an exchange of lysine to glutamate did not result in any predicted change of ΔG, although a stabilization was observed experimentally [120,128]. The results summarized in Table 2 demonstrate that none of the algorithms is able to predict single mutation events reliably enough for trustworthy single-mutation protein designs. However, FoldX shows a good performance in most of the studies compared to other algorithms, but it is necessary to increase the number of experimentally tested mutations above 3 to achieve probable true positive results in protein engineering experiments. A general disadvantage of FoldX and other algorithms seems to be that hydrophobic interactions are often predicted, but at the expense of protein solubility [95].
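The empirical correction quoted above, ΔΔG_experimental = (ΔΔG_calculated + 0.078) / 1.14, is a simple linear rescaling. The following is a minimal sketch; the function and parameter names are illustrative, and the two constants are the ones given in the text [135,136].

```python
def to_experimental_scale(ddg_calculated, offset=0.078, slope=1.14):
    """Rescale a FoldX ΔΔG (kcal/mol) to the experimental scale.

    Implements ΔΔG_experimental = (ΔΔG_calculated + offset) / slope
    with the constants quoted in the text.
    """
    return (ddg_calculated + offset) / slope


if __name__ == "__main__":
    for calculated in (-2.0, -0.75, 1.0):
        corrected = to_experimental_scale(calculated)
        print(f"calculated {calculated:+.2f} -> corrected {corrected:+.2f} kcal/mol")
```

Because the slope is larger than 1, the correction pulls large calculated values towards zero, i.e. raw FoldX energies tend to overstate the size of the effect on this scale.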

The Next Generation of FoldX Based Predictions
Due to the low accuracy of all algorithms for stabilizing mutations, algorithms are often combined to find coincident predictions or to verify predictions with a second algorithm. A popular combination is FoldX and Rosetta-ddG to obtain more stabilizing mutation predictions. It was shown in different studies that FoldX and Rosetta-ddG predictions overlapped by only 12%, 15% or 25%, which means that a good coverage of beneficial mutations can only be achieved when more than one tool is used [57,87,105]. As a consequence of the low prediction accuracy, popular algorithms are continuously improved. Recently, a refinement of the Rosetta energy algorithm was reported with increased accuracy and faster calculation times. This also demonstrates the continuing importance of stability prediction in the field of protein engineering, but the authors stated that it is still far away from a final gold standard in the field of energy content prediction [137].
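Combining two predictors as described above comes down to simple set operations on their lists of predicted mutations. The following is a minimal sketch with invented mutation labels. The overlap is computed here as the shared fraction of all distinct predictions, which is an assumption, since the cited studies may define their overlap percentages differently.

```python
def consensus_predictions(predictions_a, predictions_b):
    """Mutations predicted as stabilizing by both tools (sorted for stable output)."""
    return sorted(set(predictions_a) & set(predictions_b))


def overlap_fraction(predictions_a, predictions_b):
    """Shared predictions divided by all distinct predictions of both tools."""
    set_a, set_b = set(predictions_a), set(predictions_b)
    union = set_a | set_b
    return len(set_a & set_b) / len(union) if union else 0.0


if __name__ == "__main__":
    # Hypothetical outputs of two predictors, for illustration only.
    tool_1 = {"A12V", "K45E", "G78A", "S90P"}
    tool_2 = {"K45E", "T33I"}
    print(consensus_predictions(tool_1, tool_2))
    print(f"overlap = {overlap_fraction(tool_1, tool_2):.0%}")
```

Taking the intersection favors precision, because only doubly supported mutations are tested, whereas taking the union favors coverage. A low overlap between the tools means that these two strategies lead to very different experimental libraries.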
A sophisticated approach is the freeware web tool FireProt [138]. The FireProt algorithm uses FoldX as a pre-filter to select beneficial mutations, which are subsequently verified in a second round using Rosetta-ddG. Only if Rosetta-ddG also predicts these mutations as putatively stabilizing are the amino acid exchanges realized experimentally. Furthermore, the algorithm uses a consensus analysis of the protein sequences to predict evolutionarily beneficial mutation sites with respect to stability. These selected sites are then evaluated for their suitability using FoldX. The algorithm is divided into three stages that use different methods for crosschecking the accuracy of the calculations, and it combines putatively beneficial mutations to gain further improvement of stability. The free web tool of FireProt allows even inexperienced users to perform protein energy calculations. Bednar et al. demonstrated the utility of this algorithm using two enzymes as examples and combined many mutation sites, with overall ΔTm improvements of 21°C and 24°C for the combination of all sites [87]. However, more studies are necessary to verify whether FireProt is useful. Furthermore, the core function of the FoldX algorithm does not simulate backbone movements of the protein; addressing this might be a way to improve FoldX [75]. The stability prediction tool of Goldenzweig et al. might be an alternative to the FireProt algorithm. Similar to FireProt, it combines information gained from sequence homology alignments with energy calculations using crystal structure data and Rosetta-ddG. Using human acetylcholinesterase, an improvement in stability (ΔTm = 20°C) was demonstrated and, simultaneously, the expression level in E. coli BL21 was increased. 
They hypothesized that putatively destabilizing mutations can be excluded from mutation libraries using homologous
Table 2 Summary of different algorithms evaluated in performance tests, considering prediction accuracy in comparison to experimentally investigated mutations and calculated statistical parameters. This table displays reported standard deviations of predicted true positives and true negatives. Accuracy is defined as the ratio of true positives and true negatives to the total number of predictions. R-values (correlation coefficients) describe how precisely the predicted energies fit the database values.

Conclusion
The performance of FoldX depends drastically on the quality of the crystal structure, and it is unclear whether the protein source has an influence on the accuracy of such algorithms. FoldX seems to be more accurate for the prediction of destabilizing mutations and less accurate for the prediction of stabilizing mutations, but in both cases it was shown that FoldX is clearly better than random approaches: e.g. Christensen et al. described FoldX as one of the most accurate single-site stability predictors, and Potapov et al. even described the accuracy of FoldX as impressive compared to other algorithms [122,127]. The natural success rate of random mutagenesis is only ~2%, which was surpassed by most experiments [95,140]. Therefore, FoldX seems to be a promising tool for protein design, but we agree with Thiltgen et al. that FoldX cannot serve as a gold standard for generally improving the stability of proteins. Moreover, using FoldX together with other algorithms for reciprocal control of calculation results, e.g. Rosetta-ddG or PoPMuSiC as a filter for true positive results, will most probably increase the accuracy and the success rate of thermostability engineering [141,87,95]. In general, the accuracy can be improved further when mutation outliers are eliminated or additional MD simulations are performed [83]. FoldX has been used successfully in different approaches (Table 1), ranging from enzyme stabilization to predictions of protein-protein interactions (especially for drug design) and the prediction of disease-associated mutant proteins, making FoldX a versatile tool for the life sciences [81][142][143][144]. The progress in protein stability prediction is striking; however, up to now no in silico calculation can fully replace experimental procedures, although the existing tools can significantly reduce the number of lab experiments.
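The practical meaning of the two success rates discussed in this review, about 2% for random mutagenesis and 29.4% for FoldX predictions of stabilizing mutations, can be illustrated with a simple probability estimate. The following is a minimal sketch that assumes every tested variant is an independent trial with the same success probability, which is a simplification. The function names and the 95% confidence level are illustrative choices.

```python
import math


def prob_at_least_one_hit(success_rate, n_variants):
    """Probability that at least one of n independent variants is stabilizing."""
    return 1.0 - (1.0 - success_rate) ** n_variants


def variants_needed(success_rate, confidence=0.95):
    """Smallest number of variants giving at least one hit with the given confidence."""
    if not 0.0 < success_rate < 1.0 or not 0.0 < confidence < 1.0:
        raise ValueError("success_rate and confidence must lie strictly between 0 and 1")
    return math.ceil(math.log(1.0 - confidence) / math.log(1.0 - success_rate))


if __name__ == "__main__":
    for label, rate in (("random mutagenesis", 0.02), ("FoldX prediction", 0.294)):
        print(f"{label}: {variants_needed(rate)} variants for a 95% chance of one hit")
```

Under these assumptions, a handful of FoldX-predicted variants gives the same chance of finding a stabilizing mutation as a random library that is more than an order of magnitude larger, which is the quantitative core of the claim that FoldX is better than random approaches.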