Integrated biochemical and bioinformatic technologies for accurate transcriptome-wide full-length RNA assembly.

Period of Performance: 04/01/2016 - 03/31/2017

$347K

Phase 1 STTR

Recipient Firm

Lucigen Corporation
Middleton, WI 53562
Principal Investigator

Abstract

DESCRIPTION (provided by applicant): The human transcriptome is significantly more complex than its cognate genome, due to the hundreds of thousands of possible isoforms, allele-specific expression, variable RNA editing, and differential expression patterns spanning cell types, developmental stages, and physiological stresses. Next-generation sequencing (NGS) platforms are fundamentally altering genetic and genomic research by providing massive amounts of data in a low-cost, high-throughput format. The main drawbacks of existing technologies are the short sequence read lengths they produce (Illumina) or their high error rate (PacBio). Identifying single nucleotide variations is problematic with long-read technology, and de novo assembly of most transcripts is compromised with short-read NGS technologies alone. Even with a high-quality human reference genome (which is a mosaic of the parental alleles), transcriptome sequencing and assembly remain a significant challenge. Haplotyping across an entire mRNA is critical for understanding the full extent of RNA editing and is not readily achieved without resorting to cloned DNA. New tools are clearly needed to bridge the gap between massively parallel short-read sequencing technologies and the need to assemble complete mRNA molecules. The SBIR Phase I of this grant proposes to develop short-read NGS technology that accurately sequences mRNAs along their entire length, regardless of size. This technology will enable the accurate assembly of complex transcriptomes without cDNA cloning and primer walking using Sanger sequencing-based strategies. The development of these tools could enable the de novo sequencing of daunting transcriptomes, significantly reduce the computational costs of transcriptome assembly, produce more complete and accurate catalogs of RNA-edited transcripts, and make personal transcriptome resequencing tractable.