RWTH-PHOENIX-Weather

The RWTH-PHOENIX-Weather Database of German Sign Language is freely available for non-commercial use.

If you want to publish results achieved on this database, you should cite the following work:

J. Forster, C. Schmidt, T. Hoyoux, O. Koller, U. Zelle, J. Piater, and H. Ney. RWTH-PHOENIX-Weather: A Large Vocabulary Sign Language Recognition and Translation Corpus. In Language Resources and Evaluation (LREC), pages 3785-3789, Istanbul, Turkey, May 2012.

General Information

Over a period of three years (2009 - 2011), the daily news and weather forecast airings of the German public TV station PHOENIX featuring sign language interpretation were recorded. Currently, only the weather forecasts of a subset of 386 editions have been transcribed using gloss notation. The transcriptions were carried out by deaf and hard-of-hearing native speakers of German Sign Language. Additionally, the spoken German weather forecast has been transcribed in a semi-automatic fashion using the RASR speech recognition system. Moreover, an additional translation of the glosses into spoken German has been created to capture allowable translation variability.

The signing is recorded by a stationary color camera placed in front of the sign language interpreters. Interpreters wear dark clothes in front of an artificial grey background with color transition. All recorded videos are at 25 frames per second and the size of the frames is 210 by 260 pixels. Each frame shows the interpreter box only.

Due to legal constraints, RWTH cannot publish the original annotation files in the ELAN XML format or the recorded video sequences. Instead, XML files containing the ground-truth gloss annotation with corresponding ids are provided, together with the image sequences corresponding to these ids.

So far, the RWTH-PHOENIX-Weather database contains seven (7) different signers, who are not equally represented in the data. All interpreters are right-dominant signers.

Signer 01	Signer 02	Signer 03


Signer 05	Signer 06	Signer 07

Data transcription is still ongoing, and updates to the database, its transcriptions and further annotations will be released in due time.

For additional information or requests with regard to the database please contact Jens Forster (forster -AT- cs.rwth-aachen.de) or Christoph Schmidt (schmidt -AT- cs.rwth-aachen.de).

Continuous Sign Language Recognition Setups

Signer Dependent Signer 03

A signer-dependent training and recognition subset was defined during the SignSpeak project for the evaluation of sign language recognition. The training and recognition subset was first described in "RWTH-PHOENIX-Weather: A Large Vocabulary Sign Language Recognition and Translation Corpus" by Forster & Schmidt et al. in LREC 2012. In contrast to the full RWTH-PHOENIX-Weather database, the signer-dependent subset focuses on the annotations for the dominant hand of the signer and does not take into account facial expressions or mouthings. Database statistics are given in the following table.

	Training	Recognition
number of signers	1	1
number of editions	41	31
duration in minutes	30.85	4.5
number of video frames	46,282	6,751
number of sentences	304	47
number of running glosses	3,309	487
vocabulary size	266	-
number of singletons	90	-

The out-of-vocabulary rate of the recognition set is 1.6% of all running glosses in the recognition set. Neither the training nor the recognition set contains silence frames except for transitional movements. Ground-truth annotation is provided here in the same XML format as the overall RWTH-PHOENIX-Weather database, while the corresponding image sequences are available here (about 2.4 GB).
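The durations in the table above follow directly from the frame counts and the frame rate of 25 frames per second stated in the general information. The short sketch below reproduces that arithmetic; the function name `duration_minutes` is only illustrative and not part of any tool shipped with the database.

```python
FPS = 25  # frame rate stated for all recorded videos


def duration_minutes(num_frames: int, fps: int = FPS) -> float:
    """Convert a number of video frames into a duration in minutes."""
    return num_frames / fps / 60


# Frame counts taken from the Signer 03 statistics table.
train_minutes = duration_minutes(46_282)
recog_minutes = duration_minutes(6_751)

print(f"training:    {train_minutes:.2f} min")
print(f"recognition: {recog_minutes:.2f} min")
```

Rounded to two decimals, this gives 30.85 and 4.50 minutes, matching the table.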

Signer Dependent Signer 03 With Visual Variant Annotation

This data set covers the same image sequences as the previous data set and has the same split into training and recognition data, but contains additional explicit visual variant annotations. Variant annotation follows the gloss annotation and is marked by a starting # sign. Underscores (_) are used as delimiters if a variant is composed of several smaller variants.

Get the annotation data here.
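As a minimal sketch of how the variant convention described above could be handled, the following function splits one annotation token into its gloss and its variant parts. It assumes that a token has the form GLOSS#VARIANT or GLOSS#PART1_PART2 and that the gloss itself contains no # sign; the exact layout inside the xml files is not described here, and the function name and the example tokens are hypothetical.

```python
from typing import List, Tuple


def split_variant(token: str) -> Tuple[str, List[str]]:
    """Split an annotation token into (gloss, list of variant parts).

    Assumption: the variant follows the gloss after the first '#',
    and '_' separates the parts of a composed variant.
    """
    gloss, sep, variant = token.partition("#")
    if not sep or not variant:
        # No variant annotation attached to this gloss.
        return gloss, []
    return gloss, variant.split("_")


# Hypothetical tokens, not taken from the corpus.
print(split_variant("REGEN"))
print(split_variant("REGEN#V1"))
print(split_variant("REGEN#V1_V2"))
```

Stripping the variant part in this way recovers the plain gloss, which is useful when results on this setup are compared with results on the setup without variant annotation.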

Signer 03 Cut-out Gloss Recognition Setups

During the SignSpeak project, RWTH annotated start and end boundaries of glosses signed by Signer 03 to allow for experiments in the context of "isolated" sign recognition. Please note that in the following two corpora glosses have been cut out from full sentences according to the ground-truth annotation. As such, neither corpus contains perfectly isolated signs; both contain movement epenthesis effects. Both setups share the same image sequences but differ in the gloss annotation. The division into training and recognition sets is the same as in the continuous setup.

Please get the relevant image sequences here (1.9GB).

Cut-outs without Visual Variant Annotation

This setup contains gloss annotations of cut-out signs for the dominant hand of Signer 03. Please get the training and recognition annotation here.

Cut-outs with Visual Variant Annotation

This setup contains gloss annotations of cut-out signs for the dominant hand of Signer 03. Visual variant annotations of the signs are included. Please get the training and recognition annotation here.

Sign Language Translation Setups

The RWTH-PHOENIX-Weather corpus contains bilingual data for translation experiments for the language pair German Sign Language <-> German. Currently two setups are available with more to follow in due time.

Full Bilingual Translation Setup

The full bilingual translation setup contains 2640 bilingual sentence pairs of German Sign Language glosses and written German, subdivided into training, development and test sets. Preprocessing in the form of adding category tags for numbers, weekdays and months has been applied to the data. If you are interested in the raw text data, please contact the current maintainer of the translation data, Christoph Schmidt (schmidt -AT- cs.rwth-aachen.de). Setup statistics are detailed in the following table and in the LREC 2012 paper referenced on this website.

		Glosses	German
Train	number of sentences	2612	2612
	number of running words	20,713	26,585
	vocabulary size	768	1389
	singletons/vocabulary size	32.4%	36.4%
Development	number of sentences	250	250
	number of running words	2,573	3,293
	out-of-vocabulary-words (percent running)	1.4%	1.9%
Test	number of sentences	228	228
	number of running words	2,163	2,980
	out-of-vocabulary-words (percent running)	1.0%	1.5%

Please download the data files (compressed archive) here.
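For readers who want to recompute statistics of the kind listed in the table above on their own splits, the sketch below shows one way to do so. It assumes the usual definitions: a singleton is a word that occurs exactly once in the training data, and the out-of-vocabulary rate is the share of running words in a development or test set that never occur in the training data. The function names and the toy sentences are hypothetical and are not taken from the corpus.

```python
from collections import Counter
from typing import List, Set, Tuple

Sentence = List[str]


def vocabulary_stats(sentences: List[Sentence]) -> Tuple[Set[str], int, float]:
    """Return (vocabulary, number of singletons, singleton/vocabulary ratio)."""
    counts = Counter(token for sentence in sentences for token in sentence)
    singletons = sum(1 for count in counts.values() if count == 1)
    ratio = singletons / len(counts) if counts else 0.0
    return set(counts), singletons, ratio


def oov_rate(train_vocab: Set[str], sentences: List[Sentence]) -> float:
    """Fraction of running words that do not occur in the training vocabulary."""
    running = [token for sentence in sentences for token in sentence]
    if not running:
        return 0.0
    unseen = sum(1 for token in running if token not in train_vocab)
    return unseen / len(running)


# Toy gloss data, made up for illustration only.
train = [["MORGEN", "REGEN"], ["MORGEN", "SONNE", "NORD"]]
test = [["MORGEN", "WIND"], ["REGEN", "SONNE"]]

vocab, n_singletons, singleton_ratio = vocabulary_stats(train)
print(len(vocab), n_singletons, f"{singleton_ratio:.1%}")
print(f"{oov_rate(vocab, test):.1%}")
```

In the toy data, three of the four training words occur only once, and one of the four running test words (WIND) is unseen, so the script reports a singleton ratio of 75.0% and an out-of-vocabulary rate of 25.0%.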

Signer 03 Only Translation Setup

The Signer 03 only translation setup is the data set used as the final element in the SignSpeak evaluation pipeline described in the Year 3 evaluation report. It covers the same data as the Signer 03 continuous recognition setup. The translation data is not preprocessed and contains full gloss annotations including visual variant annotations, mouthings, and facial expressions, as well as raw German text including punctuation marks.

Please download the translation data here.

Available Annotations and Transcriptions

Hand and Face Tracking Groundtruth

In sign language, the majority of all meaning is conveyed by the shape and movement of the hands as well as by facial expressions and mouthings. Therefore, it is crucial to reliably track the movement of the hands over time as well as the movement of the face. The groundtruth positions of both hands and the face have been annotated in a total of 39,712 images featuring all seven signers and representing 266 different sign language sentences. The groundtruth coordinates for the hands have been placed at the center of the palm, and the nose tip has been chosen as the face position.

RWTH-PHOENIX-Weather Tracking Annotation Example 1

Tracking groundtruth annotation is provided in a single xml file detailing the sentence id and for each frame the position of the hands and the nose tip. The origin (0,0) of the chosen coordinate system is the top left corner of the image. Point 0 refers to the left hand, point 1 to the right hand and point 2 to the nose tip.

Get the groundtruth here and for convenience the corresponding image sequence here (about 1.8 GB).
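The point convention described above can be made explicit in code. The sketch below labels the three annotated points of one frame and checks whether a point lies inside the image. It works on coordinates that have already been read from the xml file, since the element and attribute names of that file are not described here. It also assumes that the stated frame size of 210 by 260 pixels means width by height; all names in the sketch are hypothetical.

```python
from typing import Dict, Sequence, Tuple

Point = Tuple[int, int]  # (x, y) with the origin (0,0) at the top left corner

# Assumption: "210 by 260 pixels" is read as width by height.
FRAME_WIDTH, FRAME_HEIGHT = 210, 260

# Point indices as described for the tracking groundtruth.
POINT_NAMES = {0: "left_hand", 1: "right_hand", 2: "nose_tip"}


def label_points(points: Sequence[Point]) -> Dict[str, Point]:
    """Map the three points of one frame to their names by index."""
    if len(points) != len(POINT_NAMES):
        raise ValueError(f"expected {len(POINT_NAMES)} points, got {len(points)}")
    return {POINT_NAMES[index]: point for index, point in enumerate(points)}


def in_frame(point: Point) -> bool:
    """Check whether a point lies inside the image."""
    x, y = point
    return 0 <= x < FRAME_WIDTH and 0 <= y < FRAME_HEIGHT


# Made-up coordinates for a single frame.
frame_points = [(60, 180), (150, 170), (105, 60)]
labelled = label_points(frame_points)
print(labelled["nose_tip"], all(in_frame(p) for p in frame_points))
```

Because the origin is the top left corner, smaller y values are closer to the top of the image, so in this made-up frame the nose tip lies above both hands.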

Groundtruth Labels for Active-Appearance Face Models

38 facial landmarks have been annotated for signers 01 to 07. In total, there are 369 images featuring such facial annotation, capturing the majority of the facial expressions present in the data.

RWTH-PHOENIX-Weather Face Annotation Example 1

RWTH-PHOENIX-Weather Face Annotation Example 2

Facial landmark annotation has been carried out by the Institute of Interactive and Intelligent Systems at the University of Innsbruck, Austria. The landmark annotations including rectified face images are available here.

Full Database Recognition

The full RWTH-PHOENIX-Weather database consists of 5356 sentences, 45760 running glosses and about 600k frames, and has a vocabulary of about 1200 signs. Although the process of improving the annotations is still ongoing, you can access sentence-wise gloss annotations here.

Information about the annotation conventions can be found here in pdf format.

Funding

This work has been partly funded by the European Community's Seventh Framework Programme (FP7-ICT-2007-3. Cognitive Systems, Interaction, Robotics - STREP) under grant agreement no. 231424 - SignSpeak Project.
