Jane - The RWTH Aachen University Statistical Machine Translation Toolkit

Jane is RWTH's open source statistical machine translation toolkit. Jane supports state-of-the-art techniques for phrase-based and hierarchical phrase-based machine translation. The toolkit implements many advanced features, such as forced alignment phrase training for the phrase-based model and several syntactic extensions for the hierarchical model.

RWTH has been developing Jane for several years, and it has been used successfully in numerous machine translation evaluations. It is written in C++ with special attention to clean code, extensibility, and efficiency. The toolkit is available under an open source non-commercial license. Currently, only Linux (x86 and x86-64) platforms are supported.


Key features of the Jane toolkit are:


The development of Jane is ongoing. A manual is available in PDF format. It is not yet complete, but it provides enough information for installing Jane, training systems, and producing translations.

Publications of the i6 chair about the theoretical foundations and methods used can be found on the publications page.

Example data

You can download the example data to follow the walkthrough in the manual:


Jane is available only in source form. See the corresponding chapter in the manual for build instructions.

A set of installed tools and libraries is required:

Additionally, Jane can use the following optional dependencies:

Terms of Use

Jane is open source software; it can be redistributed and/or modified under the terms of the RWTH Jane License. This license includes free usage for non-commercial purposes as long as any changes made to the original software are published under the terms of the same license. Other licenses can be requested.

Publications of results obtained through the use of original or modified versions of the software must cite the authors by referring to the following publications:

D. Vilar, D. Stein, M. Huck, and H. Ney. Jane: Open Source Hierarchical Translation, Extended with Reordering and Lexicon Models. In ACL 2010 Joint Fifth Workshop on Statistical Machine Translation and Metrics MATR (WMT 2010), pages 262-270, Uppsala, Sweden, July 2010.

for the hierarchical phrase-based decoder and

J. Wuebker, M. Huck, S. Peitz, M. Nuhn, M. Freitag, J. Peter, S. Mansour, and H. Ney. Jane 2: Open Source Phrase-based and Hierarchical Statistical Machine Translation. In International Conference on Computational Linguistics (CoLing), pages 483-491, Mumbai, India, December 2012.

for the phrase-based decoder and

M. Freitag, M. Huck, and H. Ney. Jane: Open Source Machine Translation System Combination. In Conference of the European Chapter of the Association for Computational Linguistics (EACL), Gothenburg, Sweden, April 2014.

for system combination.


