Cost-effective high-quality genome assembly for non-model organisms

Period of Performance: 05/01/2017 - 04/30/2018


Phase 1 SBIR

Recipient Firm

End2end Genomics, LLC
Principal Investigator


Abstract

Thousands of genes are shared, many unchanged, among animals that have been diverging for hundreds of millions of years. It is this shared ancestry that allows for genetic animal models for human pathogens (from neural development to antibody production), one of the most powerful tools at our disposal to unravel the mysteries of our genome and its utility for health. These models are currently limited to a few species representing only a tiny fraction of the phenotypic diversity studied by scientists. A major barrier to genomic research in non-model species is the lack of a reference genome, the development of which is costly and requires expertise. Our goal is to allow the expansion of model species to represent all branches of the evolutionary tree by making high-quality de novo genome assembly inexpensive and available to all researchers. We propose to develop a "gold standard" test genome for de novo assembly using an F1 hybrid individual. We will develop an automated evaluation pipeline based on this "gold standard" to evaluate commodity and emerging commercial products for their utility in inexpensive genome assembly (Aim 1). In Phase I we will assess 10X Genomics linked reads, the most promising current technology, for automated de novo genome assembly. This will include essential evaluation and optimization of error rates, as well as benchmarking of required sequencing and computational resources (Aim 2).