In machine translation, evaluation is necessary to compare system performance
and rate progress.
Unfortunately, automatically computable criteria such as the word error rate
(WER) depend fundamentally on the choice of sample translations.
Subjective, manually assigned criteria such as the subjective sentence error rate are very
useful for this task, but require laborious evaluation by human experts.
To facilitate evaluation as much as possible, and to give easy access to the collected evaluation
data, we developed a tool: EvalTrans.
It has been successfully applied in various national and international projects.
Recently, it has been used for the final comparative assessment of translation quality in the
European project EuTrans.
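To illustrate why WER depends on the choice of sample translations, the following is a minimal sketch, not EvalTrans code. It assumes WER is the word-level edit distance divided by the reference length, and that with several references the lowest value is taken; the function names `edit_distance` and `wer` are hypothetical.

```python
def edit_distance(hyp, ref):
    """Word-level Levenshtein distance between two token lists."""
    prev = list(range(len(ref) + 1))
    for i, h in enumerate(hyp, start=1):
        curr = [i]
        for j, r in enumerate(ref, start=1):
            cost = 0 if h == r else 1
            curr.append(min(prev[j] + 1,          # drop hypothesis word h
                            curr[j - 1] + 1,      # add reference word r
                            prev[j - 1] + cost))  # match or substitute
        prev = curr
    return prev[-1]


def wer(hypothesis, references):
    """Lowest normalised edit distance over all reference translations.

    Assumption: normalisation by the length of the respective reference.
    """
    hyp = hypothesis.split()
    rates = []
    for reference in references:
        ref = reference.split()
        rates.append(edit_distance(hyp, ref) / max(len(ref), 1))
    return min(rates)


hypothesis = "the house is small"
# The same hypothesis gets a different WER depending on the references used.
print(wer(hypothesis, ["the building is tiny"]))
print(wer(hypothesis, ["the building is tiny", "the house is small"]))
```

The first call compares against a single paraphrased reference and penalises two words; the second call adds a reference that happens to match the hypothesis, so the error drops to zero although the translation itself has not changed.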
Evaluation
Database
GUI
Structure
Main menu and the most important submenus. The most important pieces of database statistics are
shown in the main window.
List of target sentences assigned to a source sentence in the database. Here the sentences are sorted by
their score; some sentences have been selected for further operations.
Dialog for the manual evaluation of a source/target hypothesis sentence pair. The sentence pair is shown
at the top; below it is a list of the most similar target sentences in the database. Similarity
is indicated by the exclamation marks on the left; the exact differences can be shown in the white box
in the lower half. At the bottom are the information item error indicator and some
control buttons.
During an extrapolation session: Of the 147 hypothesis sentences, 42 were not found in the database
and have to be evaluated. As a quick check, scores for these sentences have been extrapolated;
in addition, sentence 4 has been re-evaluated manually. Score, WER, information item error, etc. are
listed for each hypothesis sentence and for the whole test/hypothesis corpus.
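The idea of extrapolating a score for a sentence that is not yet in the database can be sketched as follows. This is only an illustration under stated assumptions, not the method EvalTrans necessarily implements: it assumes that the score of the most similar evaluated target sentence, measured by word-level edit distance, is reused. The names `extrapolate_score` and `evaluated`, the example sentences, and the 0 to 10 score scale are hypothetical.

```python
def edit_distance(hyp, ref):
    """Word-level Levenshtein distance between two token lists."""
    prev = list(range(len(ref) + 1))
    for i, h in enumerate(hyp, start=1):
        curr = [i]
        for j, r in enumerate(ref, start=1):
            cost = 0 if h == r else 1
            curr.append(min(prev[j] + 1, curr[j - 1] + 1, prev[j - 1] + cost))
        prev = curr
    return prev[-1]


def extrapolate_score(hypothesis, evaluated):
    """Return (score, nearest sentence, distance) for a hypothesis.

    `evaluated` maps already evaluated target sentences to their scores.
    Assumption: an unseen hypothesis inherits the score of its nearest
    neighbour; a hypothesis already in the database keeps its own score.
    """
    if hypothesis in evaluated:
        return evaluated[hypothesis], hypothesis, 0
    hyp = hypothesis.split()
    nearest = min(evaluated, key=lambda s: edit_distance(hyp, s.split()))
    return evaluated[nearest], nearest, edit_distance(hyp, nearest.split())


# Hypothetical database entries for one source sentence (score scale 0 to 10).
evaluated = {
    "the house is small": 10,
    "house the small": 4,
    "a dog barks": 0,
}
score, nearest, distance = extrapolate_score("the house is tiny", evaluated)
print(score, nearest, distance)
```

A large distance to the nearest neighbour signals that the extrapolated score is unreliable, which is one reason to re-evaluate such sentences manually.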
Most dialogs have a "help" button which links directly to the corresponding section
of the hypertext online help.
System/Software requirements
To use EvalTrans, you need a system that supports the following software:
- Tcl/Tk 8.0 or higher.
- TclExpat 1.1 or higher. The source code is
required, as it has to be patched to work around the lack of Unicode support in Tcl 8.0;
higher Tcl versions have not been tested yet. TclXML should also work (more slowly), but has not been tested yet.
- BWidget ToolKit 1.2.1 or higher.
- A compiler that supports building shared libraries for Tcl/Tk (e.g. GCC).
Obtaining EvalTrans / Contact
Click here to download EvalTrans.
EvalTrans is registered in the Natural Language Software Registry (an initiative of the ACL). Click here
to visit the EvalTrans page there.
Contact us
if you have any questions or wish to use EvalTrans for your MT or other NLP projects.
Publications / Resources
Publications
- Sonja Nießen, Franz Josef Och, Gregor Leusch, Hermann Ney.
"An Evaluation Tool for Machine Translation: Fast Evaluation for MT Research".
In Proc. 2nd International Conference on Language Resources and Evaluation,
pp. 39-45,
Athens, Greece, May-June 2000.
Corrected version:
- Stephan Vogel, Sonja Nießen, Hermann Ney.
"Automatic Extrapolation of Human Assessment of Translation Quality".
In 2nd International Conference on Language Resources and Evaluation: Proceedings of the Workshop on Evaluation of Machine Translation,
pp. 35-39, Athens, Greece, May-June 2000.
Resources