Development of a kit for the targeted depletion of abundant sequences from DNA/RNA libraries

Period of Performance: 07/06/2016 - 12/20/2017


Phase 1 SBIR

Recipient Firm

Principal Investigator


DESCRIPTION (provided by applicant): Human metagenomics - the study of our body's microbial and viral communities through next generation sequencing - is transforming current paradigms of both health and disease. Direct sequencing of microbiome populations has enormous potential for use in the diagnosis and epidemiology of infectious disease and for furthering our understanding of the causes, treatment, and prevention of common chronic diseases, from obesity to autoimmune disorders. For this reason, the NIH has invested heavily in research projects (such as the Human Microbiome Project) to characterize the microbiome of healthy and diseased individuals across a spectrum of tissues. Translating this work into the clinic, however, requires overcoming a key hurdle: the low levels of microbial sequence and high levels of human sequence present in many clinical samples reduce the sensitivity and dramatically increase the cost of metagenomics approaches. IdentifyGenomics has a novel method for the targeted depletion of human sequences from sequencing libraries. Our Phase 1 hypothesis is that a CRISPR-Cas9 system can be used to systematically and robustly deplete human sequences from sequencing libraries to improve the sensitivity and cost-effectiveness of metagenomics analysis. Specific Aim 1 is to develop guide RNA panels targeted to mitochondrial DNA and ribosomal RNA and to use these to refine reaction conditions and guide RNA design principles. Specific Aim 2 is to develop a transcriptome-wide guide RNA panel to deplete RNAseq libraries and validate it against clinical samples. The expected outcome is a CRISPR-Cas9 method demonstrated to deplete 80% of targeted human sequences from a sequencing library in one hour, with less than 5% loss of non-human sequences. Phase II work will include improving depletion efficiencies towards 100% and extending the method to depletion of the whole human genome from DNA-derived sequencing libraries.
Our technology integrates directly into existing workflows, and our preliminary data show that we can deplete mtDNA from ATAC-seq libraries by 60% in under an hour. Extending this work to rRNA and then RNAseq will produce a suite of tools for the research and clinical community that will accelerate the adoption of metagenomic sequencing in clinical diagnostics. We have secured our intellectual property by filing patent applications on this technology and have identified leading academic partners to help us test and validate our methods on a range of samples. Finally, our executive leadership and board are well positioned in the community to help us commercialize this technology through existing kit manufacturers (including New England Biolabs) and leading sequencing providers (including Illumina).
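
As a rough illustration of what the stated Phase 1 targets (80% depletion of targeted human sequences, less than 5% loss of non-human sequences) could mean for sensitivity, the sketch below computes the non-human share of a library before and after depletion. It is a back-of-the-envelope calculation, not part of the proposed method: the function name and parameters are hypothetical, the 1% starting fraction is an illustrative value rather than a project result, and it assumes that every human read in the library is targeted by the guide RNA panel.

```python
def enrichment_after_depletion(nonhuman_fraction, depletion=0.80, nonhuman_loss=0.05):
    """Estimate the non-human read fraction after depletion (hypothetical helper).

    nonhuman_fraction: share of library reads that are non-human before depletion.
    depletion: share of human reads removed (0.80 is the stated Phase 1 target).
    nonhuman_loss: share of non-human reads lost (0.05 is the stated upper bound).
    Assumption: all human reads are targeted by the guide RNA panel.
    Returns (new non-human fraction, fold enrichment).
    """
    if not 0.0 < nonhuman_fraction < 1.0:
        raise ValueError("nonhuman_fraction must be strictly between 0 and 1")
    human_fraction = 1.0 - nonhuman_fraction
    human_left = human_fraction * (1.0 - depletion)          # human reads surviving
    nonhuman_left = nonhuman_fraction * (1.0 - nonhuman_loss)  # non-human reads surviving
    new_fraction = nonhuman_left / (human_left + nonhuman_left)
    return new_fraction, new_fraction / nonhuman_fraction


if __name__ == "__main__":
    # Illustrative input only: a sample in which 1% of reads are non-human.
    new_fraction, fold = enrichment_after_depletion(0.01)
    print(f"non-human fraction after depletion: {new_fraction:.3f}")
    print(f"fold enrichment: {fold:.2f}")
```

Under these assumptions, a library that starts at 1% non-human reads would end at roughly 4.6% non-human reads, so the same sequencing depth would yield about 4.6 times as many microbial reads.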