The RWTH-PHOENIX-Weather Database of German Sign Language is freely available for non-commercial use
J. Forster, C. Schmidt, T. Hoyoux, O. Koller, U. Zelle, J. Piater, and H. Ney. RWTH-PHOENIX-Weather: A Large Vocabulary Sign Language Recognition and Translation Corpus. In Language Resources and Evaluation (LREC), pages 3785-3789, Istanbul, Turkey, May 2012.
Over a period of three years (2009 - 2011), the daily news and weather forecast airings of the German public TV station PHOENIX featuring sign language interpretation were recorded. Currently, only the weather forecasts of a subset of 386 editions have been transcribed using gloss notation. The transcriptions were carried out by deaf and hard-of-hearing native speakers of German sign language. Additionally, the spoken German weather forecast has been transcribed in a semi-automatic fashion using the RASR speech recognition system. Moreover, an additional translation of the glosses into spoken German has been created to capture allowable translation variability.
The signing is recorded by a stationary color camera placed in front of the sign language interpreters. Interpreters wear dark clothes in front of an artificial grey background with color transition. All recorded videos are at 25 frames per second and the size of the frames is 210 by 260 pixels. Each frame shows the interpreter box only.
Due to legal constraints, RWTH cannot publish the original annotation files in the ELAN XML format or the recorded video sequences. Instead, XML files containing the ground-truth gloss annotation with corresponding ids, as well as the image sequences corresponding to these ids, are provided.
So far, the RWTH-PHOENIX-Weather database features seven (7) different signers, who are not equally represented in the data. All interpreters are right-dominant signers.
|Signer 01||Signer 02||Signer 03||Signer 04|
|Signer 05||Signer 06||Signer 07|
Data transcription is still ongoing, and updates to the database, its transcriptions and further annotations will be released in due time.
For additional information or requests with regard to the database please contact Jens Forster (forster -AT- cs.rwth-aachen.de) or Christoph Schmidt (schmidt -AT- cs.rwth-aachen.de).
Continuous Sign Language Recognition Setups
Signer Dependent Signer 03
A signer dependent training and recognition subset was defined during the SignSpeak project for the evaluation of sign language recognition. The training and recognition subset was first described in "RWTH-PHOENIX-Weather: A Large Vocabulary Sign Language Recognition and Translation Corpus" by Forster & Schmidt et al. in LREC 2012. In contrast to the full RWTH-PHOENIX-Weather database, the signer dependent subset focuses on the annotations for the dominant hand of the signer and does not take into account facial expressions or mouthings. Database statistics are given in the following table.
| ||Training||Recognition|
|number of signers||1||1|
|number of editions||41||31|
|duration in minutes||30.85||4.5|
|number of video frames||46,282||6,751|
|number of sentences||304||47|
|number of running glosses||3,309||487|
|number of singletons||90||-|
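The durations in the table follow from the frame counts and the recording rate of 25 frames per second stated above. The following minimal sketch reproduces that arithmetic; the function name `frames_to_minutes` is introduced here for illustration only and is not part of the database tooling.

```python
FRAME_RATE_FPS = 25  # frame rate stated in the database description


def frames_to_minutes(num_frames: int, fps: int = FRAME_RATE_FPS) -> float:
    """Convert a number of video frames into a duration in minutes."""
    return num_frames / fps / 60.0


# Frame counts taken from the table above.
for num_frames in (46_282, 6_751):
    print(f"{num_frames} frames = {frames_to_minutes(num_frames):.2f} min")
```

Rounded to two decimals, the two frame counts give 30.85 and 4.50 minutes, matching the table.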
Signer Dependent Signer 03 With Visual Variant Annotation
This data set covers the same image sequences as the previous data set and has the same split into training and recognition data, but contains additional explicit visual variant annotations. The variant annotation follows the gloss annotation and is marked by a starting # sign. Underscores (_) are used as delimiters if a variant is composed of several smaller variants.
Get the annotation data here.
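The variant convention described above can be sketched as a small parser. This is only an illustration under assumptions: it assumes the # sign is attached directly to the gloss without whitespace and that glosses themselves contain no # sign. The example tokens and variant labels are made up and are not taken from the corpus, and `split_gloss_variant` is a hypothetical helper name.

```python
from typing import List, Tuple


def split_gloss_variant(token: str) -> Tuple[str, List[str]]:
    """Split an annotation token into its gloss and its variant parts.

    Everything before the first '#' is treated as the gloss; the rest is
    the variant annotation, split at underscores into smaller variants.
    """
    gloss, separator, variant = token.partition("#")
    if not separator or not variant:
        return gloss, []  # no variant annotation present
    return gloss, variant.split("_")


# Made-up tokens for illustration; real variant labels may look different.
for token in ["REGEN", "WOLKE#A", "SONNE#A_B"]:
    print(split_gloss_variant(token))
```

Keeping the gloss and its variants separate makes it easy to map the variant-annotated data back to the plain gloss annotation of the previous setup.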
Signer 03 Cut-out Gloss Recognition Setups
During the SignSpeak project, RWTH annotated start and end boundaries of glosses signed by Signer 03 to allow for experiments in the context of "isolated" sign recognition. Please note that in the following two corpora glosses have been cut out from full sentences according to the ground-truth annotation. As such, both corpora do not contain perfectly isolated signs but do contain movement epenthesis effects. Both setups share the same image sequences but differ in the gloss annotation. The division into training and recognition sets is the same as in the continuous setup.
Please get the relevant image sequences here (1.9GB).
Cut-outs without Visual Variant Annotation
This setup contains gloss annotations of cut-out signs for the dominant hand of Signer 03. Please get the training and recognition annotation here.
Cut-outs with Visual Variant Annotation
This setup contains gloss annotations of cut-out signs for the dominant hand of Signer 03. Visual variant annotations of the signs are included. Please get the training and recognition here.
Sign Language Translation Setups
The RWTH-PHOENIX-Weather corpus contains bilingual data for translation experiments for the language pair German Sign Language <-> German. Currently, two setups are available, with more to follow in due time.
Full Bilingual Translation Setup
The full bilingual translation setup contains 2640 bilingual sentence pairs for German sign language glosses and written German subdivided into training, development and test set. Preprocessing in the form of adding category tags for numbers, weekdays and months has been applied to the data. If you are interested in the raw text data, please contact the current maintainer of the translation data Christoph Schmidt (schmidt -AT- cs.rwth-aachen.de). Setup statistics are detailed in the following table and in the LREC 2012 paper referenced on this website.
| || ||Glosses||German|
|Train||number of sentences||2162||2162|
|number of running words||20,713||26,585|
|Development||number of sentences||250||250|
|number of running words||2,573||3,293|
|out-of-vocabulary-words (percent running)||1.4%||1.9%|
|Test||number of sentences||228||228|
|number of running words||2,163||2,980|
|out-of-vocabulary-words (percent running)||1.0%||1.5%|
Please download the data files (compressed archive) here.
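The out-of-vocabulary rates in the table are given as a percentage of running words. A minimal sketch of that computation, assuming the usual definition (share of running words in the evaluation data that do not occur in the training vocabulary), could look as follows. The word lists are toy data and `oov_rate` is a hypothetical helper, not part of the released data files.

```python
from typing import Iterable, Set


def oov_rate(train_vocabulary: Set[str], running_words: Iterable[str]) -> float:
    """Return the percentage of running words not seen in training."""
    words = list(running_words)
    if not words:
        return 0.0  # avoid division by zero for empty input
    unseen = sum(1 for word in words if word not in train_vocabulary)
    return 100.0 * unseen / len(words)


# Toy data for illustration only.
train_vocab = {"MORGEN", "REGEN", "SONNE", "NORD"}
dev_glosses = ["MORGEN", "REGEN", "SUED", "SONNE", "REGEN"]
print(f"{oov_rate(train_vocab, dev_glosses):.1f}%")
```

Because the rate is computed over running words, an unseen word that occurs several times is counted once per occurrence.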
Signer 03 Only Translation Setup
The Signer 03 only translation setup is the data set used as the final element in the SignSpeak evaluation pipeline described in the Year 3 evaluation report. It covers the same data as the Signer 03 continuous recognition setup. The translation data is not preprocessed and contains full gloss annotations including visual variant annotations, mouthings, and facial expressions as well as raw German text including punctuation marks.
Please download the translation data here.
Available Annotations and Transcriptions
Hand and Face Tracking Groundtruth
In sign language, the majority of meaning is conveyed by the shape and movement of the hands as well as by facial expressions and mouthings. Therefore, it is crucial to reliably track the movement of the hands over time as well as the movement of the face. The groundtruth positions of both hands and the face have been annotated in a total of 39,712 images featuring all seven signers and representing 266 different sign language sentences. The groundtruth coordinates for the hands have been placed at the center of the palm, and the nose tip has been chosen as the face position.
Tracking groundtruth annotation is provided in a single XML file detailing the sentence id and, for each frame, the positions of the hands and the nose tip. The origin (0,0) of the chosen coordinate system is the top left corner of the image. Point 0 refers to the left hand, point 1 to the right hand and point 2 to the nose tip.
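The point numbering and coordinate convention above can be captured in a few lines. The sketch below deliberately does not parse the XML file, because its schema is not described here; it assumes the three points of a frame have already been read into a dictionary. It also assumes that the frame size of 210 by 260 pixels means width by height. The names `POINT_NAMES`, `label_points` and `is_inside_frame` are hypothetical.

```python
from typing import Dict, Tuple

# Assumption: "210 by 260 pixels" is read as width by height.
FRAME_WIDTH, FRAME_HEIGHT = 210, 260

# Point indices as described in the text above.
POINT_NAMES = {0: "left_hand", 1: "right_hand", 2: "nose_tip"}


def label_points(points: Dict[int, Tuple[int, int]]) -> Dict[str, Tuple[int, int]]:
    """Map point indices (0, 1, 2) to readable body-part names."""
    return {POINT_NAMES[index]: xy for index, xy in points.items()}


def is_inside_frame(x: int, y: int) -> bool:
    """Check a coordinate against the frame, with (0, 0) at the top left."""
    return 0 <= x < FRAME_WIDTH and 0 <= y < FRAME_HEIGHT


# Made-up coordinates for one frame, for illustration only.
frame_points = {0: (150, 200), 1: (60, 190), 2: (105, 70)}
for name, (x, y) in label_points(frame_points).items():
    print(name, (x, y), is_inside_frame(x, y))
```

The bounds check returns a flag instead of raising an error, since the description does not say how positions outside the visible frame are annotated.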
Groundtruth Labels for Active-Appearance Face Models
38 facial landmarks have been annotated for signers 01 to 07. In total, there are 369 images featuring such facial annotation, capturing the majority of the facial expressions present in the data.
Facial landmark annotation has been carried out by the Institute of Interactive and Intelligent Systems at the University of Innsbruck, Austria. The landmark annotations including rectified face images are available here.
Full Database Recognition
The full RWTH-PHOENIX-Weather database consists of 5,356 sentences, 45,760 running glosses and about 600k frames, and has a vocabulary of about 1,200 signs. Although the process of improving the annotations is still ongoing, you can access the image sequences of the whole database here (about 25 GB) and sentence-wise gloss annotations here.
Information about the annotation conventions can be found here in pdf format.
This work has been partly funded by the European Community's Seventh Framework Programme (FP7-ICT-2007-3. Cognitive Systems, Interaction, Robotics - STREP) under grant agreement no. 231424 - SignSpeak Project.