|Automatic Sign Language Recognition (ASLR)|
Developing sign language applications for deaf people can be very important, as many of them, not being able to speak a language, are also unable to read or write a spoken language. Ideally, a translation system would make it possible to communicate with deaf people. Compared to speech commands, hand gestures are advantageous in noisy environments, in situations where speech commands would be disturbing, and for communicating quantitative information and spatial relationships.
A gesture is a form of non-verbal communication made with a part of the body and used instead of verbal communication (or in combination with it). Most people use gestures and body language in addition to words when they speak. A sign language is a language that uses gestures instead of sound to convey meaning, combining hand-shapes, orientation and movement of the hands, arms or body, facial expressions and lip-patterns. Contrary to popular belief, sign language is not international. As with spoken languages, sign languages vary from region to region, and they are not completely based on the spoken language of their country of origin.
Sign language is a visual language and consists of 3 major components:
- finger-spelling: used to spell words letter by letter
- word level sign vocabulary: used for the majority of communication
- non-manual features: facial expressions and tongue, mouth and body position
Similar to automatic speech recognition (ASR), in automatic sign language recognition (ASLR) we focus on automatically recognizing sign language videos as glosses, which can later be translated into written text by a statistical machine translation system (see a demo video).
|Benchmark Databases for Sign Language Recognition and Translation|
In the course of the diploma thesis Appearance-Based Gesture Recognition by Philippe Dreuw, a new database of fingerspelling letters of German Sign Language (Deutsche Gebärdensprache, DGS) was created. The RWTH Fingerspelling Database contains 35 gestures with video sequences for the signs A to Z and SCH, the German umlauts Ä, Ö, Ü, and the numbers 1 to 5. Five of the gestures contain inherent motion (J, Z, Ä, Ö and Ü). The recordings were made under non-uniform daylight lighting conditions, the background and the camera viewpoints are not constant, and the signers had no restrictions on their clothing while gesturing.
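The label set described above can be written down as a short sketch. The names `GESTURES`, `DYNAMIC`, and `is_dynamic` are illustrative assumptions and are not part of the database distribution; only the labels themselves come from the description.

```python
import string

# Label set as described in the text: A-Z, SCH, the German umlauts,
# and the numbers 1 to 5. All names here are hypothetical.
LETTERS = list(string.ascii_uppercase)        # 26 signs, A to Z
EXTRA = ["SCH", "Ä", "Ö", "Ü"]                # SCH plus the three umlauts
NUMBERS = [str(n) for n in range(1, 6)]       # numbers 1 to 5
GESTURES = LETTERS + EXTRA + NUMBERS

# Gestures that the text lists as containing inherent motion.
DYNAMIC = {"J", "Z", "Ä", "Ö", "Ü"}


def is_dynamic(label: str) -> bool:
    """Return True if the gesture is listed as containing inherent motion."""
    return label in DYNAMIC


# Remaining gestures, which the text does not list as containing motion.
STATIC = [g for g in GESTURES if not is_dynamic(g)]
```

Splitting the labels this way makes it easy to evaluate a recognizer separately on the gestures with and without inherent motion, if that distinction matters for an experiment.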
Sign language recognition databases used at our institute:
- RWTH German Fingerspelling Database: German sign language, fingerspelling, 1400 utterances, 35 dynamic gestures, 20 speakers
- RWTH-Phoenix Tagesschau: German sign language database, 95 German weather forecast records, 1353 sentences, 1225 signs, fully annotated, 11 speakers
- RWTH-BOSTON-10: American sign language database, containing 110 utterances of 10 American sign language words.
- RWTH-BOSTON-50: American sign language database, 483 utterances, 50 isolated signs, 83 pronunciations, 3 speakers
- RWTH-BOSTON-104: American sign language database, 201 sentences, 104 signs, continuous sign language, 3 speakers
- RWTH-BOSTON-400: American sign language database, 843 sentences, about 400 signs, continuous sign language, 5 speakers
- RWTH-BOSTON-Hands: hand tracking database, 1000 frames with annotated hand positions to evaluate hand tracking algorithms
- ATIS Corpus: Irish sign language database, 680 sentences, about 400 signs, continuous sign language, several speakers, with annotated hand and head positions to evaluate hand tracking algorithms
- Corpus NGT: An online corpus of video data from Sign Language of the Netherlands with annotations
- BSL Corpus Project
Philippe Dreuw. Last modified: Thu Oct 25 10:00:27 CEST 2007. Created: Wed Dec 22 18:04:32 CET 2004.