Reverberation Mitigation of Speech

Period of Performance: 01/01/2015 - 12/31/2015

$150K

Phase 1 STTR

Recipient Firm

Voci Technologies Incorporated
PO Box 1668
Herndon, VA 20172

Principal Investigator

Research Institution

Carnegie Mellon University
5000 Forbes Avenue
Pittsburgh, PA 15213

Institution POC

Abstract

ABSTRACT: Despite decades of research, machine hearing remains far less robust than human hearing. The difference is especially noticeable in reverberant environments, which critically limit the general utility of Human Language Technology. Voci Technologies Incorporated (Voci) is the leading small business developing accelerated Human Language Technology based solutions. Voci is partnering with Richard M. Stern, a Voci advisor and Professor at Carnegie Mellon University (CMU), to develop a solution to the reverberant audio challenge. The Voci team will implement three signal processing techniques, recently developed at CMU and motivated by what is known about human auditory perception. The three techniques have been shown to reduce automatic speech transcription errors for reverberated speech. The techniques are: (1) the power-normalized cepstral coefficient (PNCC) representation; (2) suppression of slowly varying components and the falling edge (SSF); and (3) non-negative matrix factorization. All three techniques will be integrated into, and evaluated as a part of, Voci's commercially available automatic language identification, automatic speech transcription, and automatic speaker clustering applications. The project will result in a feasibility analysis, a technology demonstration, and the installation of the resulting experimental systems at AFRL. The resulting technology will also be targeted for inclusion in the Next Generation Voice Controlled Automotive Cockpit (NGVCAC) that Voci is developing in conjunction with VW/Audi.

BENEFIT: The intent of this effort is to produce a "dual use" capability that meets the needs of the US Air Force, the DoD, and commercial applications. Voci's business plan is to apply the technology developed under this STTR to the Next Generation Voice Controlled Automotive Cockpit (NGVCAC) that it is developing in conjunction with VW/Audi.
A critical consideration in any commercial product is that it is open and easy to integrate with other systems. The capabilities developed under this STTR will adhere to these architectural principles. Voci envisions that the powerful noise cancellation capabilities developed under the STTR will be embedded in existing Voci Automatic Speech Recognition (ASR) products, enhancing these systems' ability to adapt automatically to environmental noise in automotive, aircraft, and other closed environments, thereby improving the effectiveness and utility of ASR in these environments. The commercial market for speech recognition in automobiles exceeded $81M in 2011 and is expected to grow to $170M in 2019, representing a Compound Annual Growth Rate (CAGR) of approximately 9.7%.
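
The growth figure quoted above can be checked with a short calculation. The following is a minimal sketch, not part of the original award record: the function name `cagr` is hypothetical, and it assumes the standard compound-growth formula applied to the two market sizes and the eight-year span (2011 to 2019) given in the text.

```python
def cagr(start_value: float, end_value: float, years: int) -> float:
    """Compound annual growth rate as a fraction (0.10 means 10% per year).

    Assumes the standard definition: (end / start) ** (1 / years) - 1.
    """
    if start_value <= 0 or years <= 0:
        raise ValueError("start_value and years must be positive")
    return (end_value / start_value) ** (1.0 / years) - 1.0


# Market sizes in millions of USD, taken from the text above.
market_2011 = 81.0
market_2019 = 170.0
rate = cagr(market_2011, market_2019, 2019 - 2011)
print(f"Implied CAGR: {rate * 100:.1f}%")
```

Under these assumptions the implied rate rounds to the approximately 9.7% stated in the benefit statement.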