Revisiting the “satisfaction of spatial restraints” approach of MODELLER for protein homology modeling

Giacomo Janson; Alessandro Grottesi; Marco Pietrosanto; Gabriele Ausiello; Giulia Guarguaglini; Alessandro Paiardini

doi:10.1371/journal.pcbi.1007219

Peer Review History

Original Submission June 25, 2019
1 Oct 2019
Decision Letter - Bert L. de Groot, Editor, Arne Elofsson, Editor

Dear Dr JANSON,

Thank you very much for submitting your manuscript 'Revisiting the “satisfaction of spatial restraints” approach of MODELLER for protein homology modeling' for review by PLOS Computational Biology. Your manuscript has been fully evaluated by the PLOS Computational Biology editorial team and in this case also by independent peer reviewers. The reviewers appreciated the attention to an important problem, but raised some substantial concerns about the manuscript as it currently stands. While your manuscript cannot be accepted in its present form, we are willing to consider a revised version in which the issues raised by both reviewers have been adequately addressed. We cannot, of course, promise publication at that time.

Please note while forming your response, if your article is accepted, you may have the opportunity to make the peer review history publicly available. The record will include editor decision letters (with reviews) and your responses to reviewer comments. If eligible, we will contact you to opt in or out.

Your revisions should address the specific points made by each reviewer. Please return the revised version within the next 60 days. If you anticipate any delay in its return, we ask that you let us know the expected resubmission date by email at ploscompbiol@plos.org. Revised manuscripts received beyond 60 days may require evaluation and peer review similar to that applied to newly submitted manuscripts.

In addition, when you are ready to resubmit, please be prepared to provide the following:

(1) A detailed list of your responses to the review comments and the changes you have made in the manuscript. We require a file of this nature before your manuscript is passed back to the editors.
(2) A copy of your manuscript with the changes highlighted (encouraged). We encourage authors, if possible, to show clearly where changes have been made to their manuscript, e.g. by highlighting text.
(3) A striking still image to accompany your article (optional). If the image is judged to be suitable by the editors, it may be featured on our website and might be chosen as the issue image for that month. These square, high-quality images should be accompanied by a short caption. Please note as well that there should be no copyright restrictions on the use of the image, so that it can be published under the Open-Access license and be subject only to appropriate attribution.

Before you resubmit your manuscript, please consult our Submission Checklist to ensure your manuscript is formatted correctly for PLOS Computational Biology: http://www.ploscompbiol.org/static/checklist.action. Some key points to remember are:

- Figures uploaded separately as TIFF or EPS files (if you wish, your figures may remain in your main manuscript file in addition).
- Supporting Information uploaded as separate files, titled Dataset, Figure, Table, Text, Protocol, Audio, or Video.
- Funding information in the 'Financial Disclosure' box in the online system.

While revising your submission, please upload your figure files to the Preflight Analysis and Conversion Engine (PACE) digital diagnostic tool, https://pacev2.apexcovantage.com. PACE helps ensure that figures meet PLOS requirements. To use PACE, you must first register as a user. Then, login and navigate to the UPLOAD tab, where you will find detailed instructions on how to use the tool. If you encounter any issues or have any questions when using PACE, please email us at figures@plos.org.

To enhance the reproducibility of your results, we recommend that you deposit your laboratory protocols in protocols.io, where a protocol can be assigned its own identifier (DOI) such that it can be cited independently in the future. For instructions see here.

We are sorry that we cannot be more positive about your manuscript at this stage, but if you have any concerns or questions, please do not hesitate to contact us.
Sincerely,

Bert L. de Groot
Associate Editor
PLOS Computational Biology

Arne Elofsson
Deputy Editor
PLOS Computational Biology

A link appears below if there are any accompanying review attachments. If you believe any reviews to be missing, please contact ploscompbiol@plos.org immediately: [LINK]

Reviewer's Responses to Questions

Comments to the Authors:
Please note here if the review is uploaded as an attachment.

Reviewer #1: The manuscript describes an analysis of the potential to improve homology modelling using “satisfaction of spatial restraints” in the widely used MODELLER package. If the structural divergence between target and template is optimally modelled, there is large room for improvement, in particular for multiple templates. Improvements (2% for single and 11% for multiple templates) are expected if the predicted structural divergence correlates >0.6 with the true divergence. In addition, the authors also investigate the possibility of including a statistical potential in the objective function of MODELLER and show that using the built-in DOPE statistical potential yields a small but consistent improvement in model quality. Code for the latter is provided in a git repo.

Overall this manuscript was a nice read and presents results that can be used to consistently improve modelling results. It also establishes an upper limit on the model quality that can be gained if the templates are used optimally. However, if I should provide some criticism, a large part of the analysis is done using information from the native structure when setting the local weights. If you know exactly which residues to move and to which degree, you should expect large improvements.

1) The authors provide some perturbation analysis by randomly changing some fraction of weights from their optimal value, thereby reducing the correlation from 1.0 to 0.0, and show that improvements are expected when the correlation is >0.6 (Fig 5). Looking at Fig S2, 0.6 correlation corresponds to changing only 30% (0.3) of the residues. Thus, 70% of the residues have their structural divergence at their optimal value and 30% are random. This scenario is highly unrealistic; 0.6 correlation doesn’t seem too high, but since correlation is strongly affected by a small number of random points buried among perfect predictions, the actual required prediction performance might be much higher. I suggest that the perturbation analysis is performed in a more rigorous way that samples distributions more likely to originate from a model quality assessment prediction method, where each estimate would have some uncertainty, or better, use a proper model quality assessment program, like ProQ3D or QMEAN, to get realistic estimates.

2) The best performance gain can potentially be obtained using multiple templates. However, again here, the fact that the authors are using the knowledge of which template is optimal obfuscates the true value of these results. We already know that if you are able to always pick the best model you would outperform any group in CASP; the problem is to pick the best model/template. Thus, it is crucial that the effect of errors in the estimates is investigated more thoroughly.

3) For multiple templates, how are the delta(d_ij) for different templates distributed through the target sequence? Is it different templates that dominate in different regions, or are they intertwined? I.e., is it effectively using more than one template for any given region, or is it more picking the best template for each region?

4) It is a bit unclear how the local weights are implemented. Do you provide a custom-made restraints file to Modeller? Or did you find any other API to interface with the Modeller functions? Anyway, I think it would be useful if, in addition to the code you already provide, you also provided code for running Modeller with the optimal weights (if the native structure is available), or with user-specified weights given a list of local predicted CA-CA distances.

Reviewer #2: This manuscript describes a study of MODELLER, a widely used software tool for protein homology modeling. Given that MODELLER is widely used, further study of this program and even a small improvement are always desirable. In this manuscript, the authors have studied the relationship between modeling quality and estimation of the difference of an inter-atom distance between target and template. The authors claim that 1) a more accurate estimation of the difference may improve modeling accuracy and 2) modeling quality may be increased by incorporating some statistical potentials into MODELLER. These findings are not very new, but the manuscript provides sufficient data and analysis to back them up, which to the best of my knowledge is not widely available in the literature. The authors have also released source code for the incorporation of DOPE and DFIRE into MODELLER, although this is not very new (similar code is available at the MODELLER website).

Some minor concerns:

1) Lines 46-47: it is fine to say that template-based modeling is the most popular, but I am not sure if it is fine to claim that the most successful approach is template-based modeling, since this method fails on the modeling of many membrane proteins. In particular, in the past 2-3 years template-free modeling has made very good progress and now its accuracy is comparable or even better as long as the target protein does not have very good templates. Further, currently the best template-free modeling also works well on membrane proteins.

2) Lines 69-71: some revision is needed here. In addition to algorithmic advances, the enlargement of both sequence and structure databases is also a very important factor for the improvement of HM.
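Reviewer #1's first point, that the same correlation can hide very different error structures, can be illustrated with a small numerical sketch. The Python below is not taken from the manuscript or the review. The synthetic exponential "divergence" values, the function names and both perturbation schemes are assumptions made purely for illustration. Scheme A overwrites 30% of the values with random ones and leaves the rest exact. Scheme B adds Gaussian error to every value, with the noise level chosen so that the expected correlation matches that of scheme A.

```python
import random


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / (var_x * var_y) ** 0.5


def replace_fraction(values, fraction, rng):
    """Scheme A: keep most values exact, overwrite a random fraction
    with values drawn at random from the same pool."""
    out = list(values)
    n_random = round(fraction * len(out))
    for i in rng.sample(range(len(out)), n_random):
        out[i] = rng.choice(values)
    return out


def add_gaussian_noise(values, sigma, rng):
    """Scheme B: every value carries an independent Gaussian error."""
    return [v + rng.gauss(0.0, sigma) for v in values]


def fraction_exact(true_values, estimates):
    """Share of estimates that equal the true value exactly."""
    hits = sum(t == e for t, e in zip(true_values, estimates))
    return hits / len(true_values)


rng = random.Random(0)

# Hypothetical "true" per-residue divergences (synthetic, arbitrary units).
true_values = [rng.expovariate(1.0) for _ in range(5000)]

# Scheme A: 30% of the estimates are random, 70% are perfect.
scheme_a = replace_fraction(true_values, 0.3, rng)
r_a = pearson(true_values, scheme_a)

# Scheme B: choose sigma so that the expected correlation matches r_a,
# using r = 1 / sqrt(1 + sigma^2 / var) for independent additive noise.
n = len(true_values)
mean = sum(true_values) / n
variance = sum((v - mean) ** 2 for v in true_values) / n
sigma = (variance * (1.0 / r_a ** 2 - 1.0)) ** 0.5
scheme_b = add_gaussian_noise(true_values, sigma, rng)
r_b = pearson(true_values, scheme_b)

print(f"scheme A: r = {r_a:.2f}, exact = {fraction_exact(true_values, scheme_a):.0%}")
print(f"scheme B: r = {r_b:.2f}, exact = {fraction_exact(true_values, scheme_b):.0%}")
```

Under these assumptions the two schemes give similar correlations, yet scheme A keeps at least 70% of the estimates exact by construction, while scheme B leaves essentially none exact. A correlation threshold derived from scheme A may therefore not carry over to a predictor whose errors look like scheme B, which is the concern raised in the review.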
Have all data underlying the figures and results presented in the manuscript been provided?
Large-scale datasets should be made available via a public repository as described in the PLOS Computational Biology data availability policy, and numerical data that underlies graphs or summary statistics should be provided in spreadsheet form as supporting information.

Reviewer #1: Yes
Reviewer #2: Yes

PLOS authors have the option to publish the peer review history of their article (what does this mean?). If published, this will include your full peer review and any attached files. If you choose “no”, your identity will remain anonymous but your review may still be made public.

Do you want your identity to be public for this peer review?
For information about this choice, including consent withdrawal, please see our Privacy Policy.

Reviewer #1: Yes: Björn Wallner
Reviewer #2: No

https://doi.org/10.1371/journal.pcbi.1007219.r001
Revision 1
13 Nov 2019
Decision Letter - Bert L. de Groot, Editor, Arne Elofsson, Editor

Dear Dr JANSON,

We are pleased to inform you that your manuscript 'Revisiting the “satisfaction of spatial restraints” approach of MODELLER for protein homology modeling' has been provisionally accepted for publication in PLOS Computational Biology.

Before your manuscript can be formally accepted you will need to complete some formatting changes, which you will receive in a follow up email. Please be aware that it may take several days for you to receive this email; during this time no action is required by you. Once you have received these formatting requests, please note that your manuscript will not be scheduled for publication until you have made the required changes.

In the meantime, please log into Editorial Manager at https://www.editorialmanager.com/pcompbiol/, click the "Update My Information" link at the top of the page, and update your user information to ensure an efficient production and billing process.

One of the goals of PLOS is to make science accessible to educators and the public. PLOS staff issue occasional press releases and make early versions of PLOS Computational Biology articles available to science writers and journalists. PLOS staff also collaborate with Communication and Public Information Offices and would be happy to work with the relevant people at your institution or funding agency. If your institution or funding agency is interested in promoting your findings, please ask them to coordinate their releases with PLOS (contact ploscompbiol@plos.org).

Thank you again for supporting Open Access publishing. We look forward to publishing your paper in PLOS Computational Biology.

Sincerely,

Bert L. de Groot
Associate Editor
PLOS Computational Biology

Arne Elofsson
Deputy Editor
PLOS Computational Biology

Reviewer's Responses to Questions

Comments to the Authors:
Please note here if the review is uploaded as an attachment.

Reviewer #1: All my comments have been adequately addressed.

Have all data underlying the figures and results presented in the manuscript been provided?
Large-scale datasets should be made available via a public repository as described in the PLOS Computational Biology data availability policy, and numerical data that underlies graphs or summary statistics should be provided in spreadsheet form as supporting information.

Reviewer #1: Yes

PLOS authors have the option to publish the peer review history of their article (what does this mean?). If published, this will include your full peer review and any attached files. If you choose “no”, your identity will remain anonymous but your review may still be made public.

Do you want your identity to be public for this peer review?
For information about this choice, including consent withdrawal, please see our Privacy Policy.

Reviewer #1: Yes: Björn Wallner

https://doi.org/10.1371/journal.pcbi.1007219.r002
Formally Accepted
10 Dec 2019
Acceptance Letter - Bert L. de Groot, Editor, Arne Elofsson, Editor

PCOMPBIOL-D-19-01048R1
Revisiting the “satisfaction of spatial restraints” approach of MODELLER for protein homology modeling

Dear Dr JANSON,

I am pleased to inform you that your manuscript has been formally accepted for publication in PLOS Computational Biology. Your manuscript is now with our production department and you will be notified of the publication date in due course.

The corresponding author will soon be receiving a typeset proof for review, to ensure errors have not been introduced during production. Please review the PDF proof of your manuscript carefully, as this is the last chance to correct any errors. Please note that major changes, or those which affect the scientific understanding of the work, will likely cause delays to the publication date of your manuscript.

Soon after your final files are uploaded, unless you have opted out, the early version of your manuscript will be published online. The date of the early version will be your article's publication date. The final article will be published to the same URL, and all versions of the paper will be accessible to readers.

Thank you again for supporting PLOS Computational Biology and open-access publishing. We are looking forward to publishing your work!

With kind regards,

Laura Mallard
PLOS Computational Biology | Carlyle House, Carlyle Road, Cambridge CB4 3DN | United Kingdom
ploscompbiol@plos.org | Phone +44 (0) 1223-442824 | ploscompbiol.org | @PLOSCompBiol

https://doi.org/10.1371/journal.pcbi.1007219.r003

Open letter on the publication of peer review reports

PLOS recognizes the benefits of transparency in the peer review process. Therefore, we enable the publication of all of the content of peer review and author responses alongside final, published articles. Reviewers remain anonymous, unless they choose to reveal their names.

We encourage other journals to join us in this initiative. We hope that our action inspires the community, including researchers, research funders, and research institutions, to recognize the benefits of published peer review reports for all parts of the research system.

Learn more at ASAPbio.