Signal Analysis

Improved feature extraction

The objective of signal analysis is to produce a parameterization of the speech signal that is suitable for automatic speech recognition. Ideally the speech waveform would be modelled directly, but today's modelling techniques cannot process the raw waveform optimally. Signal analysis therefore aims to separate the information relevant to the recognition task from irrelevant information (e.g. speaker or channel characteristics) and to reduce the amount of data presented to the speech recognizer.

Acoustic normalization
The complexity of automatic speech recognition tasks has increased dramatically in recent years. The focus has shifted from the transcription of clean read speech to spontaneous speech, recordings made under adverse acoustic conditions (over the telephone or in cars), and scenarios with a large mismatch between training and test conditions. In this context, there is a strong need for acoustic features that retain the information relevant for speech recognition and are robust against channel distortions, noise, and similar phenomena. One approach is to normalize the acoustic vectors and thereby remove irrelevant variation.
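
As an illustration of what normalizing the acoustic vectors can look like, the following minimal sketch applies per-dimension mean and variance normalization over one utterance. This particular technique is an assumption chosen for illustration; the text above does not say which normalization is meant. The function name normalize_features and the toy data are hypothetical. The sketch relies on the fact that a fixed linear channel shows up approximately as a constant additive offset in log-spectral or cepstral features, so subtracting the per-utterance mean removes it.

```python
import math


def normalize_features(frames, eps=1e-8):
    """Per-dimension mean and variance normalization over one utterance.

    frames: list of equal-length lists of floats, one acoustic vector per frame.
    eps: small constant that guards against division by zero.
    (Illustrative helper, not taken from the original text.)
    """
    if not frames:
        return []
    n = len(frames)
    dim = len(frames[0])
    # Mean and standard deviation of each feature dimension across all frames.
    means = [sum(f[d] for f in frames) / n for d in range(dim)]
    stds = [
        math.sqrt(sum((f[d] - means[d]) ** 2 for f in frames) / n)
        for d in range(dim)
    ]
    return [
        [(f[d] - means[d]) / (stds[d] + eps) for d in range(dim)]
        for f in frames
    ]


# Toy data: the same utterance with and without a constant per-dimension
# offset, a crude stand-in for a channel effect in the cepstral domain.
clean = [[1.0, 2.0], [3.0, 0.0], [5.0, 4.0]]
shifted = [[x + 10.0, y - 3.0] for x, y in clean]

norm_clean = normalize_features(clean)
norm_shifted = normalize_features(shifted)
```

In this toy case both versions map to the same normalized vectors, which is the sense in which the offset counts as irrelevant variation. Real systems may normalize over a sliding window or per speaker instead of per utterance.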

Noise robustness
In many practical applications, speech recognition systems have to work under adverse environmental conditions. Frequency distortions and noise introduced by the transmission channel are typical of telephone applications. Considerable amounts of varying background noise are a problem for all mobile applications, such as cellular phones or speech-controlled systems in cars. The error rates of speech recognition systems that use standard methods usually rise considerably under these conditions. Noise robustness can be increased by suppressing the contribution of the noise during acoustic feature extraction, by adapting the acoustic models to the current noise condition, or both.
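
One simple way to suppress the noise contribution during feature extraction is spectral subtraction, sketched below. This is an assumption chosen for illustration and not necessarily the method used here. The function name spectral_subtraction, its parameters, and the toy spectra are hypothetical. The sketch assumes that the first few frames of a recording contain noise only and that the noise is roughly stationary.

```python
def spectral_subtraction(power_frames, noise_frames=5, alpha=1.0, floor=0.01):
    """Subtract an average noise power spectrum from every frame.

    power_frames: list of power spectra (lists of non-negative floats).
    noise_frames: number of leading frames ASSUMED to contain noise only.
    alpha: over-subtraction factor applied to the noise estimate.
    floor: fraction of the noisy power kept as a lower bound, so that the
           result never becomes negative.
    (Illustrative helper, not taken from the original text.)
    """
    if not power_frames:
        return []
    k = min(noise_frames, len(power_frames))
    bins = len(power_frames[0])
    # Noise estimate: average power per frequency bin over the leading frames.
    noise = [sum(f[b] for f in power_frames[:k]) / k for b in range(bins)]
    return [
        [max(f[b] - alpha * noise[b], floor * f[b]) for b in range(bins)]
        for f in power_frames
    ]


# Toy example: two noise-only frames followed by one frame with speech
# energy in the first frequency bin.
noisy = [[1.0, 1.0], [1.0, 1.0], [5.0, 1.0]]
enhanced = spectral_subtraction(noisy, noise_frames=2)
```

The spectral floor is a common safeguard: without it, bins where the noise estimate exceeds the observed power would become negative and could not be passed to a subsequent logarithm. Adapting the acoustic models to the noise condition is a separate strategy and is not covered by this sketch.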