CG-GRID: Computational Genetics Grid Resource for Interaction Discovery

Period of Performance: 05/01/2015 - 04/30/2016


Phase 2 STTR

Recipient Firm

Parabon Computation, Inc.
Reston, VA 20190
Principal Investigator


DESCRIPTION (provided by applicant): Advances in DNA sequencing technology have now made it practical and affordable to generate datasets containing millions of genetic attributes that can be tested for association with disease susceptibility. The computational complexity of searching for genetic interactions over such high-dimensional datasets imposes great challenges for genome-wide association studies (GWAS). In Phase I of this project, the Parabon research team, led by principal investigator Dr. Jason Moore of Dartmouth College Geisel School of Medicine, began addressing these bottlenecks by developing a distributed software service for analyzing gene-gene interactions over large GWAS datasets. In particular, the multifactor dimensionality reduction (MDR) algorithm was adapted for use in the Parabon(R) Crush genome mining application. MDR was augmented to employ Crush's opportunistic evolution search algorithm to enable deep, cloud-powered search, across thousands of compute nodes, to identify complex patterns of gene-gene interaction associated with human disease endpoints or forensically relevant traits. The resultant Crush-MDR Software as a Service (SaaS) application, which is available as an online cloud service or in-house enterprise application, was validated and shown to have excellent performance characteristics using simulated GWAS data, and then used to analyze a dataset from the Alzheimer's Disease Neuroimaging Initiative. In Phase II, the Parabon development team will extend the analytical capabilities of the Crush-MDR service and address other GWAS and next-generation sequencing (NGS) bottlenecks by enhancing its Parabon(R) Frontier(R) Compute Platform, a commercial cloud computing platform designed for high-performance computing (HPC) applications.
Our overall objective (which was derived from interactions with prospective customers) is to produce a Platform as a Service (PaaS) solution that will greatly accelerate bioinformatics research by providing a comprehensive set of cloud services that collectively address many common bioinformatics bottlenecks and barriers to collaboration.
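
For readers unfamiliar with MDR, the sketch below illustrates the general idea behind the interaction search described above. It is a simplified illustration only, not the Crush-MDR implementation: the function names (mdr_balanced_accuracy, best_pair) and the toy dataset are hypothetical, cross-validation is omitted, and an exhaustive pairwise scan stands in for Crush's evolutionary search. The last line shows why exhaustive scanning does not scale: one million attributes yield roughly 5 x 10^11 pairs before any higher-order combinations are considered.

```python
from collections import defaultdict
from itertools import combinations, product
from math import comb


def mdr_balanced_accuracy(genotypes, labels, snps):
    """Score one SNP combination with a simplified MDR step (no cross-validation)."""
    n_cases = sum(labels)
    n_controls = len(labels) - n_cases
    # Count controls (index 0) and cases (index 1) in each multi-locus genotype cell.
    cells = defaultdict(lambda: [0, 0])
    for row, label in zip(genotypes, labels):
        cells[tuple(row[s] for s in snps)][label] += 1
    true_pos = true_neg = 0
    for controls, cases in cells.values():
        # A cell is "high risk" when its case:control ratio exceeds the overall ratio.
        if cases * n_controls > controls * n_cases:
            true_pos += cases      # cases in high-risk cells are classified correctly
        else:
            true_neg += controls   # controls in low-risk cells are classified correctly
    return 0.5 * (true_pos / n_cases + true_neg / n_controls)


def best_pair(genotypes, labels):
    """Exhaustively score every SNP pair and return the best-scoring one."""
    n_snps = len(genotypes[0])
    return max(combinations(range(n_snps), 2),
               key=lambda pair: mdr_balanced_accuracy(genotypes, labels, pair))


# Hypothetical toy data: three binary attributes; disease status depends only on
# the interaction (XOR) of attributes 0 and 2, so neither has a marginal effect.
toy_genotypes = [list(g) for g in product([0, 1], repeat=3)]
toy_labels = [g[0] ^ g[2] for g in toy_genotypes]

top_pair = best_pair(toy_genotypes, toy_labels)
top_score = mdr_balanced_accuracy(toy_genotypes, toy_labels, top_pair)

# Number of pairwise tests an exhaustive scan would need for one million attributes.
pairs_for_million_snps = comb(1_000_000, 2)
```

In this toy case the interacting pair is recovered even though neither attribute is informative on its own, which is the kind of pattern a single-attribute association test would miss.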