An easy-to-use and powerful tool for improved Rosetta comparative modeling of proteins

Period of Performance: 03/01/2016 - 08/31/2016


Phase 1 SBIR

Recipient Firm

Cyrus Biotechnology, Inc.
Seattle, WA 98109
Principal Investigator


DESCRIPTION (provided by applicant): In this project we aim to improve the state-of-the-art homology modeling pipeline in the Rosetta software package, strengthening its capabilities for modeling large proteins and for refining structures against near-atomic-resolution density data and sparse NMR data. We will also develop a graphical user interface (GUI) and an easy-to-use backend that allow a user to set up modeling tasks and access large-scale computing resources with minimal effort. Success in this project will facilitate drug design and other applications that require accurate computational models. The software will also provide accuracy estimates that inform the user's confidence in each output model. The software developed here establishes a framework in which both academic and commercial users, without an extensive computational background or the time to learn a complex new command-line tool, can interact with the Rosetta modeling software package via a GUI. The three overlapping areas to be investigated are:

1. Improving homology modeling methods. We will further develop the broken-chain kinematics system incorporated in RosettaCM and benchmark it against a large dataset collected from previous CASP and CAMEO experiments. A more hierarchical kinematics system will be developed for modeling with known contact information and tested on a dataset with sparse NMR data.

2. Graphical user interface (GUI). One of the challenges of using a modeling software package such as Rosetta is that it requires substantial prior training in computing, including basic Linux skills, software compilation, simple scripting, and tabular data manipulation. We will develop a powerful GUI so that interacting with the Rosetta software package becomes much easier, without sacrificing user control over key aspects of modeling.

3. Cloud computing. Large computing resources are necessary to achieve massive amounts of sampling during structure modeling, and more sampling often leads to more accurate results.
However, access to such computing resources is scarce, and they are expensive to deploy locally. By developing a deployment mechanism that runs on either a local cluster or in the cloud, we can ensure that tasks requiring large amounts of sampling are easily accessible to all scientific users.
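The reason large compute pools help is that structure-prediction sampling is embarrassingly parallel: many independent trajectories, each with its own random seed, can run on separate workers, and the best-scoring model is kept. The sketch below illustrates that fan-out pattern in plain Python; `sample_model` is a hypothetical stand-in for a single Rosetta sampling trajectory, not the real Rosetta API, and the scoring is a toy stochastic minimization that merely mimics Rosetta's lower-is-better energy convention.

```python
"""Illustrative best-of-N fan-out, the pattern a cluster or cloud backend
would use to distribute sampling trajectories. All names here are
hypothetical stand-ins, not Rosetta functions."""
import random
from multiprocessing import Pool


def sample_model(seed: int) -> tuple[float, int]:
    """Stand-in for one independent sampling trajectory.

    Returns a (score, seed) pair; lower scores are better, mirroring
    Rosetta's energy convention. Deterministic for a given seed."""
    rng = random.Random(seed)
    score = min(rng.gauss(0.0, 1.0) for _ in range(1000))
    return score, seed


def best_of_n(n_trajectories: int) -> tuple[float, int]:
    """Fan out n independent trajectories across worker processes and
    keep the lowest-scoring one. Larger n tends to find lower minima,
    which is why access to large compute pools improves accuracy."""
    with Pool() as pool:
        results = pool.map(sample_model, range(n_trajectories))
    return min(results)


if __name__ == "__main__":
    best_score, best_seed = best_of_n(100)
    print(f"best trajectory: seed={best_seed}, score={best_score:.3f}")
```

In a real deployment the worker pool would be a batch of cluster or cloud nodes rather than local processes, but the structure is the same: independent jobs, a shared seed range, and a single reduction step that selects the best-scoring model.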