In the winter term 2007/08 the
Lehrstuhl für Informatik 6 will host a seminar entitled:
Seminar "Speech Recognition and Language Processing"
Registration for the seminar:
Registration for the seminar is only possible online via the registration
page provided by the institute. A link can be found on the Computer
Science Department's homepage.
Prerequisites for participation in the seminar:
- Vordiplom or Bachelor degree
- Attendance of the lectures Pattern Recognition and Neural
Networks, Speech Recognition or Statistical Methods in Natural Language
Processing, or evidence of equivalent knowledge.
Seminar format and important dates:
The seminar generally takes place in block mode around the end of the
lecture period. Specific dates will be arranged; they will most
likely fall between the end of January and mid-February 2008.
- Proposals: initial proposals will be accepted up
until the start of the lecture period
(October 15th, 2007) at the Lehrstuhl für Informatik 6
office or by the relevant supervisor. At this time participants must
arrange an appointment with the relevant supervisor. Revised proposals
will be accepted up until two weeks
after the start of the lecture period.
- Report: must be submitted at least 1 month prior to the trial presentation date
to either the Lehrstuhl für Informatik 6 office or the relevant
supervisor.
- Presentation slides: must be submitted at least 1 week prior to the trial presentation date
to either the Lehrstuhl für Informatik 6 office or the relevant
supervisor.
- Trial presentations: at least 2 weeks prior to the
actual presentation date; refer to the section on topics.
- Seminar presentations: the exact dates and plan for
the presentation block (expected to be around end of January to mid
February 2008)
will be arranged and announced for the individual topics.
- Final (possibly corrected) reports and presentation slides:
must be submitted at the latest 2
weeks after the presentation date to either the Lehrstuhl für Informatik 6 office or the relevant supervisor.
- Compulsory attendance: in order to receive a
certificate participants must attend all presentation sessions.
- Ethical Guidelines: The Computer Science
Department of RWTH Aachen University has adopted ethical
guidelines for the authoring of academic work such as seminar
reports. Each student has to comply with these guidelines. As a
seminar participant, you therefore have to sign a declaration of
compliance, in which you assert that your work complies with the
guidelines, that all references used are properly cited, and that you
wrote the report independently. We ask you to download the guidelines
and to submit the signed declaration to your supervisor
together with your seminar report and talk.
Note: Deadlines are binding. Failure to meet deadlines can lead
to exclusion from the seminar.
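Because most of the deadlines above are defined relative to the trial presentation or the presentation date, it can help to work them out once the presentation date is known. The following is a minimal sketch, not an official tool of the Lehrstuhl: the function name `seminar_deadlines` is hypothetical, "1 month" is assumed to mean 30 days (the announcement does not define it), and the trial presentation is assumed to take place on the latest permitted day unless another date is given.

```python
from datetime import date, timedelta
from typing import Dict, Optional

# Assumption: "1 month" is approximated as 30 days.
ONE_MONTH = timedelta(days=30)


def seminar_deadlines(presentation: date,
                      trial: Optional[date] = None) -> Dict[str, date]:
    """Derive the latest submission dates from the presentation date.

    If no trial presentation date is given, the latest permitted one
    (2 weeks before the presentation) is assumed.
    """
    latest_trial = presentation - timedelta(weeks=2)
    if trial is None:
        trial = latest_trial
    if trial > latest_trial:
        raise ValueError(
            "trial presentation must be at least 2 weeks before the presentation"
        )
    return {
        "report": trial - ONE_MONTH,                 # 1 month before the trial
        "slides": trial - timedelta(weeks=1),        # 1 week before the trial
        "trial_presentation": trial,
        "presentation": presentation,
        "final_versions": presentation + timedelta(weeks=2),
    }


if __name__ == "__main__":
    # Hypothetical presentation date inside the announced block.
    for name, day in seminar_deadlines(date(2008, 2, 1)).items():
        print(f"{name:20s} {day.isoformat()}")
```

Under these assumptions, a presentation on February 1st, 2008 would imply a report deadline in mid-December 2007, so the report is due roughly six weeks before the presentation itself. The dates agreed with the supervisor always take precedence over such a calculation.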
Topics, relevant literature and participants:
Reordering (Sevim, supervisor: Maja Popovic)
- M. Collins, P. Koehn, I. Kucerova: Clause Restructuring for Statistical Machine Translation.
In Proc. of the 43rd Annual Meeting of the Association for Computational Linguistics (ACL), pp. 531-540, Ann Arbor, MI, June 2005.
- J.M. Crego, J.B. Marino: Integration of POS tag-based source reordering into SMT decoding by an extended search graph. In Proc. of the 7th Conf. of the Association for Machine Translation in the Americas (AMTA 06), pp. 29-36, Boston, MA, August 2006.
System combination (Chandramohan, supervisor: Evgeny Matusov)
- A. V. Rosti, N. F. Ayan, B. Xiang, S. Matsoukas, R. Schwartz, and B. Dorr: Combining Outputs from Multiple Machine Translation Systems. Proc. of NAACL-HLT 2007, Rochester, NY, USA, April 2007.
- S. Jayaraman and A. Lavie: Multi-Engine Machine Translation Guided by Explicit Word Matching.
10th Conference of the European Association for Machine Translation, pp. 143-152, Budapest, Hungary, 2005.
- Fei Huang and Kishore Papineni. Hierarchical System Combination for Machine Translation.
Proceedings of EMNLP 2007, pp. 277-286, Prague, June 2007.
Storing NLP counts in Bloom Filters (NN, supervisor: Gregor Leusch)
- D. Talbot and M. Osborne: Randomised Language Modelling for Statistical Machine Translation.
Proc. of ACL, pp. 512-519, Prague, Czech Republic, June 2007.
- Cohen, S. and Matias, Y.: Spectral bloom filters.
In Proceedings of the 2003 ACM SIGMOD international Conference on Management of Data (San Diego, California, June 09 - 12, 2003) SIGMOD '03. ACM Press, New York.
Syntax-based Translation (Königs, supervisor: David Vilar)
- Daniel Marcu, Wei Wang, Abdessamad Echihabi, and Kevin Knight: SPMT: Statistical Machine Translation with Syntactified Target Language Phrases . Proceedings of EMNLP-2006, pp. 44-52, Sydney, Australia, 2006.
- S. DeNeefe, K. Knight, W. Wang, D. Marcu: What Can Syntax-Based MT Learn from Phrase-Based MT?
Proc. EMNLP-CoNLL, 2007
Error Measures (Liu, supervisor: Daniel Stein)
- A. Lavie, A. Agarwal: METEOR: An Automatic Metric for MT Evaluation with High Levels of Correlation with Human Judgments. Proc. 2nd Workshop on MT, ACL, pp. 228-231, Prague, Czech Republic, June 2007.
- K. Papineni, S. Roukos, T. Ward and W. Zhu: BLEU: a Method for Automatic Evaluation of Machine Translation.
Proc. ACL, pp. 311-318, Philadelphia, PA, July 2002.
Alignments (NN, supervisor: Arne Mauser)
- A. Fraser, D. Marcu: Getting the structure right for word alignment: LEAF. Proc. ACL, Prague, Czech Republic, June 2007.
- J. DeNero, D. Klein: Tailoring Word Alignments to Syntactic Machine Translation
Proc. ACL, Prague, Czech Republic, June 2007.
Guidelines for the report and presentation:
The report (roughly 20 pages) and the presentation slides (between 20 and
30) must be prepared in LaTeX and submitted electronically in PDF format;
other formats will not be accepted. Each presentation consists of 45 minutes
of presentation time and 15 minutes of discussion. Document templates for
both the report and the presentation slides are provided below, along with
links to LaTeX documentation available online.
- Online LaTeX-Documentation:
- Guidelines for articles and presentation slides:
General:
- The aim of the seminar for the participants is to learn the
following:
- to tackle a topic and to expand knowledge
- to critically analyze the literature
- to hold a presentation
- Take notice of references
to other topics in the seminar and discuss topics with one
another!
- Take care to stay within your
own topic. To this end participants should be aware of the other
topics in the seminar. If applicable, cross-reference
other articles and presentations.
Specific:
- Important: As part of the introduction, a slide should
outline the most important literature used for the presentation. In
addition, the presentation should clearly indicate which literature the particular
elements of the presentation refer to.
- Participants are expected to seek out additional literature on their
topic. Assistance with the literature search is available at the
faculty's library. Access to literature is naturally also available at
the Lehrstuhl für Informatik 6 library.
- Notation/Mathematical
Formulas: consistent, correct notation
is essential. When necessary, differing notation from various
literature sources is to be modified or standardized in order to be
clear and consistent. The
lectures held by the Lehrstuhl für Informatik 6 should provide a
guide as to what appropriate notation should look like.
- Tables
must have titles (appearing above the table).
- Figures
must have captions (appearing below the figure).
- In the case that no adequate translation of an
English technical term is available, the term should be used unchanged.
- Articles and presentation slides can also be prepared in
English.
- Completeness:
acknowledge all literature and
sources.
- Referencing must conform to the standard
described in the article template.
- Examples should be used to illustrate points.
- Examples should be as complex as necessary but as simple
as possible.
- Slides should be used
as presentation aids and not to replace the role of the presenter;
specifically, slides should:
- illustrate important points and relationships;
- remind the audience (and the presenter) of important aspects
and considerations;
- give the audience an overview
of the presentation.
- Slides should not contain chunks of text or complicated
sentences; rather they should consist of succinct words and terms.
- Use illustrations
where appropriate - a picture says a thousand words!
- Abbreviations should be defined at first use, in the manner
demonstrated in the following example: "[...] at the
Rheinisch-Westfälische Technische Hochschule (RWTH) there are
[...]".
- Usage of fonts, typefaces and colors in presentation slides must
be consistent and appropriate. Such means should serve to clarify
points or relationships, not be applied needlessly or at random.
- Care should be taken when selecting fonts for presentation
slides (also within diagrams) to ensure legibility on a projector even
for those seated far from the screen.
Inquiries relating to organizational
aspects of the seminar should be directed to:
Dr. Ralf Schlüter
RWTH Aachen
Lehrstuhl für Informatik 6
Ahornstr. 55
52056 Aachen
Room 6125b (1st floor, E2)
Telephone: 0241 / 80 21 612
E-Mail: schlueter@cs.rwth-aachen.de