Speech Recognition Enhancement through Digital Signal Noise Processing (DSNP)

Period of Performance: 12/18/1998 - 06/18/1999


Phase 1 SBIR

Recipient Firm

Frontier Technology, Inc.
55 Castilian Drive
Goleta, CA 93117
Principal Investigator

Research Topics


We propose a novel approach to speech recognition enhancement that consists of three key components: (a) a decomposition of the noisy speech signal into a time-frequency representation; (b) a psychoacoustically motivated source separation algorithm that classifies each cell of the time-frequency representation as a speech cell or a noise cell; and (c) a very low complexity vector quantization algorithm that recovers the time-frequency representation of the noise-suppressed speech signal from the partially specified output of the previous step. Our approach can be viewed as a Psychoacoustic Noise Suppression (PSN) algorithm that enhances the speech signal to enable accurate speech recognition in noisy environments. The PSN algorithm has two key features that differentiate it from other speech enhancement systems: (1) a psychoacoustic model called IWAIF (Krishnamurthy and Feth, Ohio State University) that preserves the features of the speech signal crucial to successful recognition; and (2) a breakthrough vector search algorithm, the TNE, discovered by FTI in 1994. This very low complexity algorithm can search the database using partially corrupted or masked vectors, which is key to recovering the noise-suppressed speech signal.

BENEFITS: This effort will significantly advance the state of the art in the analysis and recognition of speech and other acoustic signals. The resulting technology will be useful in machine condition monitoring, medical diagnostics, and speech recognition in harsh environments.
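The three components described in the abstract can be illustrated with a minimal sketch. The abstract does not specify the IWAIF model or the TNE search, so the code below substitutes simple stand-ins: a fixed threshold against an assumed per-bin noise floor for step (b), and a brute-force masked nearest-neighbor search for step (c). All function names, parameters, and default values are hypothetical and chosen only for illustration.

```python
import numpy as np

def stft_magnitude(signal, frame_len=64, hop=32):
    """Step (a): magnitude time-frequency representation via a windowed FFT."""
    window = np.hanning(frame_len)
    n_frames = 1 + (len(signal) - frame_len) // hop
    frames = np.stack([signal[i * hop:i * hop + frame_len] * window
                       for i in range(n_frames)])
    # Result has shape (n_frames, frame_len // 2 + 1).
    return np.abs(np.fft.rfft(frames, axis=1))

def label_speech_cells(tf_mag, noise_floor, margin=2.0):
    """Step (b), stand-in only: a cell counts as speech when its magnitude
    exceeds the per-bin noise floor by a margin. This threshold is an
    assumption and is NOT the IWAIF psychoacoustic model."""
    return tf_mag > margin * noise_floor

def fill_from_codebook(frame, speech_mask, codebook):
    """Step (c), stand-in only: brute-force search for the codeword closest
    to the frame on the reliable (speech) cells, then use that codeword to
    fill in the masked cells. This is NOT the TNE algorithm."""
    if not speech_mask.any():
        # Assumption: with no reliable cells, treat the frame as pure noise.
        return np.zeros_like(frame)
    diff = codebook[:, speech_mask] - frame[speech_mask]
    best = np.argmin(np.sum(diff ** 2, axis=1))
    restored = codebook[best].copy()
    restored[speech_mask] = frame[speech_mask]  # keep the reliable cells as observed
    return restored
```

The brute-force search compares the frame against every codeword, so its cost grows linearly with the codebook size; the abstract's claim of very low complexity refers to the TNE, which this sketch does not reproduce.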