RWTH ASR - The RWTH Aachen University Speech Recognition System
RWTH ASR is a software package containing a speech recognition decoder together with tools for developing acoustic models for use in speech recognition systems. It has been developed by the Human Language Technology and Pattern Recognition Group at RWTH Aachen University since 2001. Speech recognition systems built with this framework have been applied successfully in several international research projects and the corresponding evaluations.
RWTH ASR consists of several libraries and tools written in C++. Currently, only Linux (x86 and x86-64) platforms are supported.
Features
- decoder for large vocabulary continuous speech recognition
  - word conditioned tree search (supporting across-word models)
  - HMM emission probability calculation optimized for MMX and SSE2
  - refined acoustic pruning using language model lookahead
  - word lattice generation
- feature extraction
  - a flexible framework for data processing: Flow
  - MFCC features
  - voicedness feature
  - vocal tract length normalization
  - support for several feature dimension reduction methods (e.g. LDA, PCA)
  - easy implementation of new features as well as easy integration of external features using Flow networks
- acoustic modeling
  - Gaussian mixture distributions for HMM emission probabilities
  - phoneme in triphone context (or shorter context)
  - across-word context dependency of phonemes
  - allophone parameter tying using phonetic decision trees (classification and regression trees, CART)
  - globally pooled diagonal covariance matrix (other types of covariance modelling are possible, but not fully tested)
- language modeling
  - support for language models in ARPA format
- speaker adaptation
  - Constrained MLLR (CMLLR, "feature space MLLR")
  - Unsupervised maximum likelihood linear regression mean adaptation (MLLR)
  - speaker / segment clustering using Bayesian Information Criterion (BIC) as stop criterion
- input / output formats
  - nearly all input and output data is in easily processable XML formats
  - converter tools for the generation of NIST file formats are included
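To illustrate the acoustic modeling items above, the following is a minimal Python sketch of an HMM emission log-probability computed from a Gaussian mixture whose densities all share one globally pooled diagonal covariance. It is not RWTH ASR code and does not reflect its API or its MMX/SSE2-optimized implementation; the function names and the parameter layout are assumptions made for illustration only.

```python
import math


def log_gaussian_diag(x, mean, var):
    """Log density of a Gaussian with diagonal covariance `var`, evaluated at x."""
    return -0.5 * sum(
        math.log(2.0 * math.pi * v) + (xi - m) ** 2 / v
        for xi, m, v in zip(x, mean, var)
    )


def emission_log_prob(x, weights, means, pooled_var):
    """Log-likelihood of feature vector x under a Gaussian mixture.

    Every mixture density uses the same (pooled) diagonal covariance,
    given as a list of per-dimension variances. Hypothetical helper,
    not part of RWTH ASR.
    """
    terms = [
        math.log(w) + log_gaussian_diag(x, m, pooled_var)
        for w, m in zip(weights, means)
    ]
    # Log-sum-exp over the mixture components for numerical stability.
    top = max(terms)
    return top + math.log(sum(math.exp(t - top) for t in terms))


if __name__ == "__main__":
    # Two-component mixture over 2-dimensional features (made-up values).
    score = emission_log_prob(
        x=[0.3, -1.2],
        weights=[0.6, 0.4],
        means=[[0.0, -1.0], [1.0, 0.5]],
        pooled_var=[1.0, 2.0],
    )
    print(score)
```

Because the variance vector is shared, the normalization term is the same for every density, so in practice only the weighted distances to the means differ between components.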
Documentation
The development of RWTH ASR is ongoing. A manual is available in the RWTH ASR Manual Wiki. Access to the wiki requires registration.
Publications about the theoretical foundations and methods used can be found in the publications page.
A short introduction is given in these slides.
Questions can be sent to rwthasr@i6.informatik.rwth-aachen.de.
Installation
RWTH ASR is available only in source form. See the included README for build instructions.
The following tools and libraries must be installed (Debian package names are given in parentheses):
- GCC >= 4.0 (gcc, g++)
- GNU Bison (bison)
- GNU Make (make)
- makedepend (xutils-dev)
- libxml2 (libxml2, libxml2-dev)
- libsndfile (libsndfile1, libsndfile1-dev)
- LAPACK (lapack3, lapack3-dev)
- BLAS (refblas3, refblas3-dev)
- GNU Fortran 77 Runtime Library (libg2c0, libg2c0-dev)
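Before starting the build, it can help to confirm that the command-line tools from the list above are on the PATH. The following Python sketch is not part of RWTH ASR; it checks only the executables (the function name is hypothetical) and does not verify versions or the presence of the libraries (libxml2, libsndfile, LAPACK, BLAS, libg2c), which the build itself would report.

```python
import shutil

# Executables named in the requirements list; libraries are not checked here.
REQUIRED_TOOLS = ["gcc", "g++", "bison", "make", "makedepend"]


def missing_tools(tools=REQUIRED_TOOLS):
    """Return the names in `tools` that cannot be found on the PATH."""
    return [tool for tool in tools if shutil.which(tool) is None]


if __name__ == "__main__":
    missing = missing_tools()
    if missing:
        print("Missing build tools: " + ", ".join(missing))
    else:
        print("All listed build tools were found.")
```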
Terms of Use
RWTH ASR is free software; it can be redistributed and/or modified under the terms of the RWTH ASR License. This license includes free usage for non-commercial purposes as long as any changes made to the original software are published under the terms of the same license. Other licenses can be requested.
Download
Remark: No acoustic or language models are included.
To download the software, you have to accept the license terms. Please fill out the form. The information submitted is for internal use only and will not be passed on to third parties.