PreDisorder: ab initio sequence-based prediction of protein disordered regions

Background: Disordered regions are segments of the protein chain which do not adopt stable structures. Such segments are often of interest because they are closely related to protein expression and functionality. As such, protein disorder prediction is important for protein structure prediction, structure determination and function annotation.
Results: This paper presents our protein disorder prediction server, PreDisorder. It is based on our ab initio prediction method (MULTICOM-CMFR) which, along with our meta (or consensus) prediction method (MULTICOM), was recently ranked among the top disorder predictors in the eighth edition of the Critical Assessment of Techniques for Protein Structure Prediction (CASP8). We systematically benchmarked PreDisorder along with 26 other protein disorder predictors on the CASP8 data set and assessed its accuracy using a number of measures. The results show that it compares favourably with other ab initio methods and that its performance is comparable to that of the best meta and clustering methods.
Conclusion: PreDisorder is a fast and reliable server which can be used to predict protein disordered regions on a genomic scale. It is available at http://casp.rnet.missouri.edu/predisorder.html.


Background
While most regions of a protein adopt localized, stable structures, some segments of the protein chain do not. These are regions whose coordinates are hard to determine by experimental techniques or that simply do not fold into stable structures [1,2]. Such regions are known as disordered regions. Proteins with disordered regions are capable of binding to multiple partners and participating in various reactions and pathways [3][4][5]. Disordered regions can also cause poor expression of a protein, making it difficult to produce for crystallization or other purposes [6]. Consequently, the prediction of disordered regions in proteins has implications for protein production, structure prediction and determination, function annotation and cellular process recognition.
Measuring native disorder experimentally is time consuming and expensive, and thus computational approaches for the prediction of protein disordered regions have received considerable attention in recent years [7]. As a result, disorder prediction software packages and web services, together with their underlying methods, are quickly becoming valuable tools for protein structure prediction, structure determination and function annotation [8][9][10][11][12][13][14][15][16][17][18]. To stimulate further development of disorder prediction, CASP has dedicated a category to blindly benchmarking the current state of the art. Here we benchmark our ab initio and consensus (or meta) disorder predictors along with dozens of other predictors that participated in the CASP8 experiment. The good performance of our PreDisorder server makes it a valuable and accurate tool for protein structure prediction, structure determination and protein engineering.

Ab initio neural network method
Our server, PreDisorder, is based on our ab initio method that participated in CASP8 under the group name MULTICOM_CMFR. This is a machine learning approach using one-dimensional recursive neural networks. With this approach, a target protein sequence is first aligned against several template profiles using PSI-BLAST, which creates an input profile of the sequence. This profile, along with the predicted secondary structure and solvent accessibility, is fed into a 1D Recursive Neural Network (1D-RNN) that makes the disorder predictions [6]. More specifically, for each protein sequence the input is a one-dimensional array I whose length is the total number of residues in the sequence. Each element I_i of the array is a vector of 25 values representing residue i. Of these 25 values, 20 represent the frequencies of each amino acid at the corresponding position in the PSI-BLAST profile [19]. The other five are binary values that encode the predicted secondary structure (Helix, Strand or Coil) and solvent accessibility of the residue [20][21][22]. Based on the input I, the 1D-RNN produces an array of real numbers O, where the i-th element O_i is the probability that the i-th residue is disordered. A large curated dataset was randomly divided into ten subsets of approximately equal size, and the 1D-RNN was trained and tested on these subsets by ten-fold cross-validation [23]. Finally, the predicted disorder probabilities of the residues were rescaled so that the fraction of residues with disorder probability greater than or equal to 0.5 is close to the fraction of disordered residues in the training dataset [23]. Specifically, the scaling method first identified a probability threshold t (e.g. 0.1) for selecting predicted disordered residues such that the ratio (number of predicted disordered residues / total number of residues in the test dataset) equals the ratio of disordered residues in the training dataset (e.g. 5%). The predicted disorder probabilities x were then rescaled as x/t * 0.5 (if x <= t) or 0.5 + 0.5 * (x - t)/(1 - t) (if x > t).
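The rescaling step can be illustrated with a short Python sketch. It is not the server's actual code: the function names find_threshold and rescale are hypothetical, and choosing t as the k-th highest predicted probability is only one simple way to match the target ratio.

```python
def find_threshold(probs, target_ratio):
    """Pick t so that roughly target_ratio of the residues have probability >= t.

    Assumption: t is taken as the k-th highest raw probability, where k is the
    number of residues that should be called disordered (at least one).
    """
    if not probs:
        raise ValueError("probs must not be empty")
    ranked = sorted(probs, reverse=True)
    k = max(1, round(target_ratio * len(ranked)))
    return ranked[k - 1]


def rescale(x, t):
    """Piecewise-linear map sending [0, t] onto [0, 0.5] and (t, 1] onto (0.5, 1]."""
    if not 0.0 < t < 1.0:
        raise ValueError("t must lie strictly between 0 and 1")
    if x <= t:
        return x / t * 0.5
    return 0.5 + 0.5 * (x - t) / (1.0 - t)
```

With t = 0.1, a raw probability of 0.1 maps exactly onto the 0.5 decision threshold, so every residue at or above t is reported as disordered after rescaling.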

Meta method
A meta method is a consensus approach that makes predictions based on the output of other predictors. Similar ideas have been applied to many prediction problems, such as protein fold recognition, and have achieved much better performance than individual predictors. One example of this approach is 3D-Jury, an automated protein structure meta prediction system available through Meta Server, which generates meta-predictions from a variety of models produced by different methods [24][25][26]. Our new meta predictor MULTICOM makes predictions based on a consensus formed from other CASP8 disorder predictors. It removes a few very inaccurate disorder predictors and then averages the output of the remaining disorder predictors. Our simple averaging approach differs from other meta methods, which are based on consensus voting.
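The averaging step can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the MULTICOM implementation: the function name consensus is hypothetical, and the set of excluded predictors is supplied by the caller because the text does not specify how the inaccurate predictors were identified.

```python
def consensus(predictions, excluded=()):
    """Average per-residue disorder probabilities over the retained predictors.

    predictions: dict mapping predictor name -> list of probabilities,
                 one value per residue, all lists of equal length.
    excluded:    names of predictors to drop before averaging (assumed to be
                 chosen beforehand, e.g. from earlier benchmark results).
    """
    kept = [p for name, p in predictions.items() if name not in excluded]
    if not kept:
        raise ValueError("no predictors left to average")
    length = len(kept[0])
    if any(len(p) != length for p in kept):
        raise ValueError("all predictors must cover the same residues")
    # zip(*kept) yields, for each residue, the tuple of all retained scores.
    return [sum(column) / len(kept) for column in zip(*kept)]
```

Averaging keeps the output as a probability-like score per residue, whereas a voting scheme would first reduce each predictor's output to a binary call.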

Results and discussion
We evaluated 27 disorder predictors that participated in CASP8. Among these predictors were our ab initio predictor (MULTICOM-CMFR) and our meta predictor (MULTICOM). They were evaluated on 117 protein targets whose structures were available when our evaluation was conducted. These targets contain 25431 residues, and all the disorder predictions for them were downloaded from the CASP8 web site [27]. When evaluating the disorder predictions against the protein targets, target residues that did not have corresponding coordinates in the target's PDB file were considered to be disordered. The disorder annotations for the targets were curated by Dr. McGuffin [28]. Each residue in the target sequence is tagged with a binary label of "O" (order) or "D" (disorder). We evaluated the methods on all 117 targets and on two subsets (97 X-ray structures and 20 NMR structures). It is worth pointing out that our evaluation serves as a complementary, comparative benchmark of our methods. Readers should refer to the CASP8 assessment paper for the official assessment of disorder predictions [29].
In evaluating the disorder predictors, we considered a number of different, commonly used measurements of performance for binary classifiers. One such measurement was the ROC score. This value is the area under the Receiver Operating Characteristic (ROC) curve (AUC) and summarizes the performance of a classifier across all settings of its discrimination threshold. Ranking the predictors using ROC curves is a widely used method in bioinformatics and CASP competitions [7,30,31].
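The labelling rule described above, in which residues without coordinates are treated as disordered, can be sketched as follows. The function name label_residues and the representation of resolved residues as a collection of 1-based sequence positions are assumptions made for illustration; the curated annotations used in the evaluation were not generated by this code.

```python
def label_residues(sequence_length, resolved_positions):
    """Return a string of 'O'/'D' labels, one per residue.

    sequence_length:    number of residues in the target sequence.
    resolved_positions: 1-based positions that have coordinates in the
                        structure file (hypothetical input format).
    """
    resolved = set(resolved_positions)
    return "".join(
        "O" if position in resolved else "D"
        for position in range(1, sequence_length + 1)
    )
```

For a six-residue sequence in which only positions 2 to 5 are resolved, the two terminal residues are labelled "D" and the rest "O".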
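As an illustration of how a ROC score can be obtained from per-residue predictions, the sketch below computes the area under the ROC curve through the rank-sum (Mann-Whitney) identity: the AUC equals the probability that a randomly chosen disordered residue receives a higher score than a randomly chosen ordered one, with ties counted as one half. The function name roc_auc is hypothetical, and this is not necessarily the procedure used to produce the reported scores.

```python
def roc_auc(labels, scores):
    """Area under the ROC curve via the rank-sum identity.

    labels: 1 for disordered residues, 0 for ordered residues.
    scores: predicted disorder probabilities, same length as labels.
    """
    n_pos = sum(labels)
    n_neg = len(labels) - n_pos
    if n_pos == 0 or n_neg == 0:
        raise ValueError("need at least one residue of each class")
    order = sorted(range(len(scores)), key=lambda i: scores[i])
    rank_sum_pos = 0.0
    i = 0
    while i < len(order):
        # Find the block of tied scores starting at sorted position i.
        j = i
        while j + 1 < len(order) and scores[order[j + 1]] == scores[order[i]]:
            j += 1
        # Every member of the block gets the average of its 1-based ranks.
        avg_rank = (i + j) / 2.0 + 1.0
        for k in range(i, j + 1):
            if labels[order[k]] == 1:
                rank_sum_pos += avg_rank
        i = j + 1
    return (rank_sum_pos - n_pos * (n_pos + 1) / 2.0) / (n_pos * n_neg)
```

A predictor that ranks every disordered residue above every ordered one scores 1.0, while a predictor that assigns the same value to all residues scores 0.5.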
Another set of commonly used measurements for classifier systems are sensitivity and specificity. For each disorder predictor, we calculated the positive sensitivity (TP/(TP + FN)), positive specificity (TP/(TP + FP)), negative sensitivity (TN/(TN + FP)) and negative specificity (TN/(TN + FN)), where TP is the number of true positives (residues correctly identified as disordered), FP is the number of false positives (residues predicted as disordered, but experimentally ordered), TN is the number of true negatives (residues correctly identified as ordered) and FN is the number of false negatives (residues predicted as ordered, but experimentally disordered).
While in principle it is possible for a system to achieve high values for both positive and negative sensitivity, in practice this does not happen often. Usually, a sharp increase in one results in a decrease in the other. An extreme example would be a predictor which identifies all residues as disordered. Such a system would have a positive sensitivity of 100% and a negative sensitivity of 0%.
In an attempt to join several of these measurements into one, we considered the product of positive sensitivity and negative sensitivity and the harmonic mean, or F-measure, of the positive sensitivity and positive specificity [32].
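The sensitivity, specificity and combined measures discussed above can all be derived from the four confusion-matrix counts. The sketch below assumes the definitions positive sensitivity = TP/(TP + FN), positive specificity = TP/(TP + FP), negative sensitivity = TN/(TN + FP) and negative specificity = TN/(TN + FN), which agree with the extreme example of a predictor that calls every residue disordered. The function names and the convention of returning 0.0 for an undefined ratio are illustrative choices.

```python
def confusion_counts(labels, predicted):
    """Count TP, FP, TN, FN for 'D' (disorder) versus 'O' (order) labels."""
    tp = fp = tn = fn = 0
    for truth, call in zip(labels, predicted):
        if call == "D":
            if truth == "D":
                tp += 1
            else:
                fp += 1
        elif truth == "O":
            tn += 1
        else:
            fn += 1
    return tp, fp, tn, fn


def _ratio(numerator, denominator):
    # Convention for this sketch: an undefined ratio is reported as 0.0.
    return numerator / denominator if denominator else 0.0


def measures(tp, fp, tn, fn):
    """Return the per-class measures plus the two combined scores."""
    pos_sens = _ratio(tp, tp + fn)
    pos_spec = _ratio(tp, tp + fp)
    neg_sens = _ratio(tn, tn + fp)
    neg_spec = _ratio(tn, tn + fn)
    return {
        "positive_sensitivity": pos_sens,
        "positive_specificity": pos_spec,
        "negative_sensitivity": neg_sens,
        "negative_specificity": neg_spec,
        # Product of positive and negative sensitivity.
        "product": pos_sens * neg_sens,
        # Harmonic mean of positive sensitivity and positive specificity.
        "f_measure": _ratio(2 * pos_sens * pos_spec, pos_sens + pos_spec),
    }
```

Because the product is zero whenever either sensitivity is zero, it penalizes trivial predictors that label every residue with the same class.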
We also calculated a weighted score for each predictor. This is a measure which was introduced in CASP6 and is defined as

$$\mathrm{Score} = \frac{W_{disorder} \cdot TP - W_{order} \cdot FP + W_{order} \cdot TN - W_{disorder} \cdot FN}{W_{disorder} \cdot N_{disorder} + W_{order} \cdot N_{order}}$$

where N_disorder and N_order are the numbers of experimentally disordered and ordered residues, W_disorder is set to 92.63 and W_order to 7.37 [31]. As defined, this measure greatly rewards disordered residues that are correctly classified as disordered while heavily penalizing any disordered residue that is misclassified. Table 1 reports the ROC score, weighted score, positive sensitivity, positive specificity, negative sensitivity, negative specificity, product of positive sensitivity and negative sensitivity, and F-measure of all the disorder predictors. Table 1 also shows the total number of residues predicted by each predictor. For comparison, we repeated the evaluation for the "only xray" and the "only NMR" sets, and the results are shown in Table 2 and Table 3. Figure 1 shows the ROC curves for the predictors. The predictors are ordered by ROC score since the ROC measure is probably the most balanced measurement.
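A minimal sketch of the weighted score follows. It assumes the CASP6 form of the measure, (W_disorder * TP - W_order * FP + W_order * TN - W_disorder * FN) / (W_disorder * N_disorder + W_order * N_order), with N_disorder = TP + FN and N_order = TN + FP; the function name weighted_score is hypothetical and the formula should be checked against the CASP assessment papers before use.

```python
def weighted_score(tp, fp, tn, fn, w_disorder=92.63, w_order=7.37):
    """Weighted score under the assumed CASP6-style definition.

    Correct and incorrect calls on disordered residues are weighted by
    w_disorder, and calls on ordered residues by w_order, so the rarer
    disordered class dominates the score.
    """
    n_disorder = tp + fn  # experimentally disordered residues
    n_order = tn + fp     # experimentally ordered residues
    denominator = w_disorder * n_disorder + w_order * n_order
    if denominator == 0:
        raise ValueError("no residues to score")
    numerator = (w_disorder * tp - w_order * fp
                 + w_order * tn - w_disorder * fn)
    return numerator / denominator
```

Under this definition a perfect predictor scores 1 and a predictor that misclassifies every residue scores -1, while a predictor that calls every residue ordered receives a negative score even though most of its calls are correct.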
The CASP8 disorder prediction methods can be classified into four main categories [33]: meta, clustering, ab initio and hybrid methods. Ab initio methods include CBRC_poodle, disopred, OnD-CRF and our predictor MULTICOM_CMFR. Fais-server is a hybrid method that combines both ab initio predictions and homology-based template information. Both ab initio and hybrid methods usually exist as standalone packages, while meta methods rely on other predictors.
In examining the results, no one method appears to perform decisively better than the rest according to all the measures. Predictors from each of the three types of methods (ab initio, meta and clustering) are represented in the top seven when comparing the predictors only on the basis of ROC score, weighted score, specificity or sensitivity. The meta method MULTICOM, the clustering method DISOclust, the hybrid method Fais-server and the ab initio methods MULTICOM-CMFR and 3Dpro are among the top five in terms of ROC score. Other ab initio predictors such as mariner1 and Distill-Punch also performed well. Interestingly, our ab initio predictor MULTICOM-CMFR also ranks first in weighted score and in the product of positive and negative sensitivity. Being an ab initio method, it also has the benefit of being able to make predictions from an input sequence alone. The other types of methods need additional information such as output from other predictors (e.g. meta methods), tertiary structure models (clustering methods), or homologous structure templates (hybrid methods). Consequently, our PreDisorder server based on MULTICOM-CMFR is generally an accurate predictor that can be applied to the genome-scale annotation of protein disordered regions. Notably, given the limits on the predictability of intrinsically disordered residues from crystallographic experiments, both of our methods performed well on the X-ray targets shown in Table 3 [34]. Several methods (e.g., MULTICOM, DISOclust, Fais-server, MULTICOM-CMFR, 3Dpro, mariner1 and Distill-Punch) yield similarly good AUC scores (>= 0.846), suggesting that the accuracy of disorder predictions might be close to the limit [34].
Table caption (partial): CASP8 on the 20 NMR targets (T0437, T0460, T0462, T0464, T0466, T0467, T0468, T0469, T0471, T0472, T0473, T0474, T0475, T0476, T0480, T0482, T0484, T0492, T0498, T0499).
All of the predictors do quite well with respect to negative specificity and negative sensitivity. This is not too surprising, as most of the residues in a protein are ordered and hence the number of true negatives (TN) is very close both to the true negatives plus false positives (TN+FP) and to the true negatives plus false negatives (TN+FN).

Conclusion
This paper presents our disorder prediction web server, PreDisorder, and evaluates its performance against several other disorder predictors. We benchmarked MULTICOM-CMFR, the method employed by PreDisorder, and our meta method MULTICOM, along with several other protein disorder predictors on the 117 targets used in CASP8.
The results show that our method is among the best and provides reliable protein disordered region predictions. Therefore, our server (PreDisorder) is a useful tool for structural and functional genomics.
Figure 2. Example output from PreDisorder showing the probability of disorder for each residue in a sequence (CASP8 target T0470). The red curve represents predicted disorder probabilities. The green curve denotes real disorder annotations (1 = disorder; 0 = not disorder).

Availability and Requirements
The use of PreDisorder is straightforward and takes place through a simple input form. The form requires only three inputs: email address, target name and protein sequence. PreDisorder makes predictions in a very short time and sends the results back to users via email. Disorder prediction results include the user-defined target name, the author, any predictor remarks and the disorder predictions. These predictions are in CASP format and occupy several lines. Each line contains the residue code, an order or disorder assignment code and a number specifying the associated probability of disorder. We also return the results in graphical form, as seen in Figure 2. In this graph, users can visualize changes in the likelihood of disorder from residue to residue over the submitted sequence. The red curve shows our predicted probability of disorder for each residue in the target sequence, and the green curve shows the experimentally determined disorder annotation for the target. In addition, the blue line y = 0.5 marks the threshold we chose for classifying a residue as disordered.
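For users who want to post-process the emailed results, the per-residue lines described above can be read with a short Python sketch. The function name parse_disorder_lines is hypothetical, and the sketch assumes that each prediction line consists of three whitespace-separated fields (one-letter residue code, "O" or "D", probability); any other line, such as the target name, author or remarks, is skipped.

```python
def parse_disorder_lines(text):
    """Extract (residue, state, probability) tuples from a result file.

    Assumption: prediction lines look like 'M D 0.91'; every line that does
    not match this three-field pattern is treated as a header and ignored.
    """
    residues = []
    for line in text.splitlines():
        fields = line.split()
        if len(fields) != 3:
            continue
        code, state, value = fields
        if len(code) != 1 or state not in ("O", "D"):
            continue
        try:
            probability = float(value)
        except ValueError:
            continue
        residues.append((code, state, probability))
    return residues
```

The returned list can then be filtered, for example to collect the positions of all residues whose state is "D".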