Seminar "Selected Topics in Machine Learning and Human Language Technology"
In the Winter Semester 2024/2025, the Lehrstuhl Informatik 6 will host a
seminar entitled "Selected Topics in Machine Learning and Human Language Technology" at the Master's level.
Registration for the seminar
Registration for the seminar is only possible online via the central registration page.
Prerequisites for Participation in the Seminar
- Bachelor's degree
- Attendance of the lectures Statistical Classification and Machine Learning, Automatic Speech Recognition, and/or Statistical Methods in Natural Language Processing, or evidence of equivalent knowledge, is highly recommended.
- Students who have successfully completed the above lectures are guaranteed a place in the seminar.
General Goals of the Seminar
The goal of the seminar is to autonomously acquire knowledge and critical comprehension
of an assigned topic, and present this topic both in writing and verbally.
This includes:
- Performing a literature review, starting from the initial references provided, to obtain an overview of the assigned seminar topic.
- Reading, comprehending and critically analyzing the assigned articles and those found during the literature review.
- Covering relevant publications within the scope of the assigned
topic and the seminar format.
- Describing the topic on this basis in a written report.
- Preparing slides and presenting the topic in an oral
presentation.
- Keeping to the ethical guidelines for the authoring of academic work; in particular, all work used to prepare the seminar report and presentation must be correctly cited.
Seminar Format and Important Dates
The seminar will start with a kick-off meeting on Monday, July 22nd 2024, shortly after the central registration for the seminars in the Computer Science Department. The details of the kick-off meeting will be communicated directly to the seminar participants selected in the central registration.
Please note the following deadlines during the seminar:
- Proposals: initial draft proposals will be accepted until 07.10.2024 by email to the seminar topic's supervisor. At this time, participants must arrange an appointment with the relevant supervisor. Revised proposals will be accepted until 14.10.2024.
- Article: PDF must be submitted by 11.11.2024 by email to the seminar topic's supervisor.
- Presentation slides: PDF must be submitted by 02.12.2024 by email to the seminar topic's supervisor.
- Trial presentations: must be completed by 10.01.2025.
- Seminar presentations: during the last 2 weeks of January, i.e., January 20th-31st, 2025.
- Final (possibly corrected) articles and presentation slides: PDF must be submitted by email to the seminar topic's supervisor no later than 4 weeks after the presentation date.
- Compulsory attendance: in order to pass, participants must attend all presentation sessions.
- Ethical Guidelines: The Computer Science Department of RWTH Aachen University has adopted ethical guidelines for the authoring of academic work, such as seminar reports. Each student has to comply with these guidelines. As a seminar participant, you have to sign a declaration of compliance, in which you assert that your work complies with the guidelines, that all references used are properly cited, and that you prepared the report independently. Please download the guidelines and submit the declaration, together with your seminar report and talk, to your supervisor. A German version of the declaration is also available and may be used instead.
Note: failure to comply with the ethical guidelines, failure to meet deadlines, absence without permission from compulsory sessions (presentations and preliminary meeting as announced by email to each participating student), or dropping out of the seminar more than 3 weeks after the preliminary meeting/topic distribution results in the grade 5.0/not appeared.
The deadline for de-registration from the seminar is 15.08.2024, i.e. within three weeks after the distribution of the topics. After this deadline, seminar participation is confirmed and will be graded.
Topics, Initial References Defining the Topics, Participants, and Supervisors
Selected topics from the following general areas of Machine Learning and Human
Language Technology will be offered:
- Automatic Speech Recognition;
- Machine Translation;
- Natural Language Understanding;
- Machine Learning.
Below you will find example topics. Note, however, that the topics are subject to change. The final topics will be presented in the kick-off meeting, which will be announced to the seminar participants selected in the central registration for the seminar.
- Automatic Speech Recognition Foundation Models (Student: XXX, Supervisor: Xu, Jingjing)
Initial References:
- Joint Speech and Text Model for Automatic Speech Recognition (Student: XXX, Supervisor: Rossenbach, Nick)
Initial References:
- Combination of Automatic Speech Recognition Architectures (Student: Nikolov, Supervisor: Berger, Simon)
Initial References:
- Learning from Human Preferences (Student: XXX, Supervisor: Thulke, David)
Initial References:
- Streaming Automatic Speech Recognition (Student: XXX, Supervisor: Hilmes, Benedikt)
Initial References:
- Differentiable Weighted Finite State Transducers (Student: XXX, Supervisor: Raissi, Tina)
Initial References:
- Automatic Speech Recognition Error Correction (Student: XXX, Supervisor: Yang, Zijian)
Initial References:
Article and Presentation Format
The roughly 15-page article and the slides for the presentation (between 20 and 30, including references and blank pages) should be prepared in LaTeX. Presentations consist of 30 to 40 minutes of presentation time and 15 minutes of discussion time. Document templates for both the article and the presentation slides are provided below, along with links to LaTeX documentation available online. The article and the slides must be submitted electronically in PDF format. Other formats will not be accepted.
- Online LaTeX-Documentation:
- Article Template: contains the template and all necessary files in zip format.
- New Presentation Slide Template: a zip file containing the template and all necessary graphics as well as the institute's style template.
Note: We deactivated the RWTH and i6 logos in this version of the template since the seminar content is produced by students outside of i6.
Detailed Guidelines:
- Take care to stay within your
own topic. To this end participants should be aware of the other topics in the
seminar. If applicable, cross-reference
other articles and presentations.
- Important: As part of the introduction, a slide should
outline the most important literature used for the
presentation. In addition, the presentation should clearly
indicate which literature
the particular elements of the presentation refer to.
- Take notice of references to other topics in the seminar
and discuss topics with one another!
- Participants are expected to seek out additional literature on their topic. Assistance with the literature search is available at the faculty's library. Access to literature is also available at the Lehrstuhl Informatik 6 library.
- Notation/Mathematical
Formulas: consistent, correct notation
is essential. When necessary, differing notation from various
literature sources is to be modified or standardized in order to be
clear and consistent. The
lectures held by the Lehrstuhl Informatik 6 should provide a
guide as to what appropriate notation should look like.
- Font sizes: as a general rule, fonts within tables, figures, etc. should not be (considerably) smaller than in the running text.
- Tables must have titles (appearing above the table).
- use meaningful notation, and avoid abstract column headings (like System: A, B, C).
- try to find succinct column headings; each column should contain only a single factor that is changed.
- omit redundant columns (i.e. columns with static entries); the corresponding general information should be given in the table caption.
- do not repeat redundant entries: if an entry in a column is the same as in the row above, omit it, at least if the entries in the preceding columns did not change either.
- optional: use \hline and \cline{N-M} to separate rows more clearly.
- include the necessary information in the caption, i.e. at least the problem covered and the task the results are generated on.
- Figures must have captions (appearing below the figure).
- The use of English is recommended throughout and is mandatory for the presentation slides. The article and the oral presentation may nevertheless be done in German.
- In the case that no adequate translation of an
English technical term is available, the term should be used unchanged.
- Completeness:
acknowledge all literature and
sources, thus following the ethical guidelines for the authoring of academic work.
- Referencing must conform to the standard
described in the article template.
- Examples should be used to illustrate points.
- Examples should be as complex as necessary but as simple
as possible.
- Slides should be used
as presentation aids and not to replace the role of the presenter;
specifically, slides should:
- illustrate important points and relationships;
- remind the audience (and the presenter) of important aspects
and considerations;
- give the audience an overview
of the presentation.
- Slides should not contain chunks of text or complicated
sentences; rather they should consist of succinct statements and use consistent terminology.
- Avoid numbered references; instead, give meaningful information about the reference, such as authors and year. For example, use [Strickland & Mourou 1985] instead of [1].
- Use illustrations
where appropriate - a picture says a thousand words!
- Abbreviations should be defined at the first usage in the manner
demonstrated in the following example: "[...] at the
Rheinisch-Westfälischen Technischen Hochschule (RWTH) there are
[...]".
- Usage of fonts, typefaces and colors in presentation slides must
be consistent and appropriate. Such means should serve to clarify
points or relationships, not be applied needlessly or at random.
- Care should be taken when selecting font sizes for presentation
slides (also within diagrams) to ensure legibility on a projector even
for those seated far from the screen.
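The table, figure, abbreviation and citation guidelines above can be illustrated with a minimal LaTeX sketch. It is not taken from the official article or slide templates, whose conventions take precedence. The column headings, the "hypothetical dev set" and the xx.x entries are placeholders invented for illustration, and the use of the natbib package for author-year citations is an assumption.

```latex
\documentclass{article}
% Assumption: natbib is used here only to obtain author-year citations;
% the seminar templates may prescribe a different mechanism.
\usepackage[square]{natbib}

\begin{document}

% Abbreviation defined at first use; author-year citation instead of [1].
The word error rate (WER) is reported for all systems.
An author-year citation looks like this: \citep{strickland1985}.

\begin{table}[ht]
  \centering
  % Table title goes ABOVE the table. Information that is the same for
  % every row belongs in the caption, not in a redundant column.
  \caption{WER [\%] on a hypothetical dev set. All systems use the same
           (placeholder) training data. Values are placeholders.}
  \begin{tabular}{llr}
    \hline
    Encoder   & Language model & WER [\%] \\
    \hline
    BLSTM     & 4-gram         & xx.x \\
    \cline{2-3}
              & LSTM           & xx.x \\  % repeated encoder entry omitted
    \hline
    Conformer & 4-gram         & xx.x \\
    \cline{2-3}
              & LSTM           & xx.x \\
    \hline
  \end{tabular}
\end{table}

\begin{figure}[ht]
  \centering
  % Empty framed box standing in for an actual graphic.
  \fbox{\rule{0pt}{3cm}\rule{6cm}{0pt}}
  % Figure caption goes BELOW the figure.
  \caption{Placeholder figure; the caption names what is shown and on
           which task.}
\end{figure}

% Hand-written bibliography entry in natbib's author-year \bibitem format.
\begin{thebibliography}{1}
\bibitem[Strickland and Mourou(1985)]{strickland1985}
D.~Strickland and G.~Mourou.
\newblock Compression of amplified chirped optical pulses.
\newblock \emph{Optics Communications}, 1985.
\end{thebibliography}

\end{document}
```

Compiling twice resolves the citation. Each column in the table varies a single factor (encoder or language model), and the repeated encoder name is left blank in the second row of each group, as recommended above.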
Some Tips:
Time management is crucial for a successful seminar:
- The draft of the article and the trial presentation slides sent before the corresponding deadlines must be complete (in particular, they must have the required number of pages).
In principle, supervisors will not give iterative feedback on updated versions submitted after the deadlines. Do not miss these opportunities!
Successful seminar articles/presentations typically:
- Define the task clearly upfront (what is the problem? what are the inputs and outputs of the system?).
- Give a short overview of the existing work on the topic.
- Provide detailed descriptions of the state-of-the-art approach(es) with mathematical definitions using correct notation.
- Include meaningful experimental results (extracted from the papers) with clear definitions of the datasets and the evaluation metrics.
While reading papers, it might be useful to keep the following questions in mind:
- Why is this paper relevant for my topic: is it a historical piece of work or the state-of-the-art method? Does the paper help me understand the topic better?
- Do I really understand the paper? Can I describe how the method works without doubt? Can I explain the nature/dimension of all quantities in equations? Is this paper self-contained, or should I read the cited papers to get more background?
- Is the paper content correct and consistent with other publications on the topic? If not: try to resolve discrepancies and broach such issues when discussing the paper in your article and presentation.
- Are the experiments meaningful and convincing? Can I clearly describe the experiment and the evaluation metric?
- How does this paper relate to other papers I have read? Is the same dataset used for evaluation?
Contact
Questions regarding the content of the assigned seminar topics
should be directed to the respective topic's supervisors.
General and administrative inquiries should be directed to:
Tina Raissi
RWTH Aachen University
Lehrstuhl Informatik 6
Mies-van-der-Rohe-Straße 55
52074 Aachen
Room 6125a
Tel: 0241 80 21630
E-Mail: raissi@cs.rwth-aachen.de