Seminar "Large Scale Language Models and Generative Pretrained Transformers"

In the Summer Semester 2023, the Lehrstuhl Informatik 6 will host a seminar entitled "Large Scale Language Models and Generative Pretrained Transformers" at the Bachelor level.

Registration for the seminar

Registration for the seminar is only possible online via the central registration page.

Prerequisites for Participation in the Seminar

General Goals of the Seminar

The goal of the seminar is to autonomously acquire knowledge and critical comprehension of an assigned topic, and present this topic both in writing and verbally.

This includes:

Seminar Format and Important Dates

The seminar will start with a kick-off meeting on 17.03.2023. Details will be communicated directly to the seminar participants selected in the central registration.

Please note the following deadlines during the seminar:

Note: failure to comply with the ethical guidelines, failure to meet deadlines, unexcused absence from compulsory sessions (presentations and the preliminary meeting, as announced by email to each participating student), or dropping out of the seminar more than three weeks after the preliminary meeting/topic distribution results in the grade 5.0/"not appeared".

The deadline for de-registration from the seminar is TBA, i.e. within three weeks after the distribution of the topics. After this deadline, seminar participation is confirmed and will be graded.



Topics, Initial References Defining the Topics, Participants, and Supervisors

In general, selected topics from the following general areas of Human Language Technology and Machine Learning will be offered. Below you will find exemplary topics. The final topics will be presented in the kick-off meeting, which will be announced to the seminar participants selected in the central registration.
  1. Principles of Language Modeling (1): Count-Based and Continuous Space (Student: Khan, Supervisor: Raissi)
    Initial References:

    This topic covers the background of language modeling before recurrence-based approaches.
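
    As a toy illustration of the count-based side of this topic, a bigram language model with add-one (Laplace) smoothing can be sketched in a few lines; the corpus and vocabulary below are made up purely for illustration:

```python
# Minimal sketch of a count-based bigram LM with add-one smoothing.
# Corpus and vocabulary are illustrative toy data.
from collections import Counter

corpus = "the cat sat on the mat the cat ran".split()
vocab = sorted(set(corpus))

unigrams = Counter(corpus)
bigrams = Counter(zip(corpus, corpus[1:]))

def bigram_prob(w_prev, w):
    """P(w | w_prev) with add-one smoothing over the vocabulary."""
    return (bigrams[(w_prev, w)] + 1) / (unigrams[w_prev] + len(vocab))

p = bigram_prob("the", "cat")  # smoothed conditional probability
```

    With smoothing, the conditional distribution still sums to one over the vocabulary, while unseen bigrams receive small non-zero probability.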

  2. Principles of Language Modeling (2): Recurrent NN based Language Models (Student: Pletschko, Supervisor: Raissi)
    Initial References:

    In this topic, the student will look into the principles of statistical language modeling using recurrent neural networks. Compared to count-based n-gram LMs or feed-forward networks with a fixed-length context, recurrent neural networks can capture long-range dependencies from the past and use them for the prediction of the next word. Moreover, LSTM-based LMs address the well-known vanishing gradient problem.
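
    A single step of an Elman-style recurrent LM can be sketched in plain Python; the tiny dimensions and weight values below are purely illustrative (a real model learns them by backpropagation through time):

```python
import math

# Toy sizes: vocabulary 3, hidden size 2; fixed illustrative weights.
E = [[0.1, 0.2], [0.0, -0.1], [0.3, 0.1]]   # word embeddings E[w]
W = [[0.5, -0.2], [0.1, 0.4]]               # recurrent weights
U = [[0.2, -0.1, 0.3], [0.0, 0.4, -0.2]]    # output projection

def step(h_prev, word_id):
    # h_t = tanh(E[w_t] + W h_{t-1}): the hidden state carries
    # information about the whole history, not a fixed-length window.
    h = [math.tanh(E[word_id][i] + sum(W[i][j] * h_prev[j] for j in range(2)))
         for i in range(2)]
    logits = [sum(h[i] * U[i][k] for i in range(2)) for k in range(3)]
    m = max(logits)
    exps = [math.exp(l - m) for l in logits]
    z = sum(exps)
    return h, [e / z for e in exps]   # next-word distribution (softmax)

h = [0.0, 0.0]
for w in [0, 2, 1]:        # a toy word-id sequence
    h, p_next = step(h, w)
```

    The recurrence is what lets the hidden state summarize an unbounded history; LSTMs replace the plain tanh update with gated cells to keep gradients from vanishing.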

  3. Extractive Question Answering (Student: Vogelbacher, Supervisor: Rossenbach)
    Initial References:

    Given a question and a paragraph of text, the task in extractive question answering is to highlight the answer to the question in the paragraph. The goal of this seminar topic is to give an overview of different approaches to solving this task, starting with task-specific architectures up to current approaches utilising large language models.
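
    The span-selection step common to many extractive QA models can be sketched as follows; the per-token start/end scores are hard-coded here for illustration (in practice a neural encoder such as BERT produces them):

```python
# Sketch of answer-span selection: pick the highest-scoring
# (start, end) pair with start <= end and a bounded span length.

def best_span(start_scores, end_scores, max_len=15):
    best, best_score = (0, 0), float("-inf")
    for i, s in enumerate(start_scores):
        for j in range(i, min(i + max_len, len(end_scores))):
            if s + end_scores[j] > best_score:
                best_score = s + end_scores[j]
                best = (i, j)
    return best

tokens = ["the", "seminar", "starts", "on", "17", "march", "2023"]
start = [0.1, 0.0, 0.2, 0.3, 2.5, 0.4, 0.2]   # illustrative scores
end   = [0.0, 0.1, 0.1, 0.2, 0.3, 0.5, 2.0]
i, j = best_span(start, end)
answer = " ".join(tokens[i:j + 1])   # the highlighted span
```

    The start <= end constraint and the length bound rule out degenerate spans that independent argmax over the two score vectors could otherwise produce.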

  4. Prompt Engineering (Student: Diepers, Supervisor: Thulke)
    Initial References:

    When it comes to adapting pretrained transformer models to downstream tasks, the conventional approach involves fine-tuning all model parameters. However, a surprisingly effective alternative is called prompting. This technique involves adapting the model by inputting a task description, optionally along with a set of examples, while keeping all parameters frozen. The purpose of this topic is to provide an overview of the advantages and disadvantages of this method, as well as explore potential extensions.
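
    A few-shot prompt of the kind described above might be assembled like this; the model parameters stay frozen, and the task, examples, and "Input:/Output:" formatting are illustrative assumptions rather than a fixed standard:

```python
# Sketch of few-shot prompt construction: the task is specified
# entirely in the input text, with no parameter updates.

def build_prompt(task, examples, query):
    lines = [task, ""]
    for x, y in examples:
        lines.append(f"Input: {x}")
        lines.append(f"Output: {y}")
        lines.append("")
    lines.append(f"Input: {query}")
    lines.append("Output:")          # the model completes from here
    return "\n".join(lines)

prompt = build_prompt(
    "Classify the sentiment as positive or negative.",
    [("great seminar", "positive"), ("poorly organised", "negative")],
    "very instructive talk",
)
```

    The resulting string is fed to the frozen model as ordinary input; adding or reordering examples changes behaviour without any gradient update, which is both the appeal and a known brittleness of prompting.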

  5. Large Language Models for Machine Translation (Student: Lu, Supervisor: Gao)
    Initial References:

    This topic discusses different approaches to how large language models can be utilised for Machine Translation (MT). These can include pre-training individual components or zero-/few-shot prompting of very large language models.



Article and Presentation Format

The article (roughly 20 pages) and the presentation slides (between 20 and 30) must be prepared in LaTeX and submitted electronically in PDF format; other formats will not be accepted. Presentations consist of 30 to 40 minutes of presentation time and 15 minutes of discussion time. Document templates for both the article and the presentation slides are provided below, along with links to LaTeX documentation available online.

Detailed Guidelines:

Some Tips:

Time management is crucial for a successful seminar:
Successful seminar articles/presentations typically:
While reading papers, it might be useful to keep the following questions in mind:

Contact

Questions regarding the content of the assigned seminar topics should be directed to the respective topic's supervisors.

General and administrative inquiries should be directed to:

Tina Raissi
RWTH Aachen University
Lehrstuhl Informatik 6
Theaterstrasse 35-39
52074 Aachen

Room 025
Tel: 0241 80 21630

E-Mail: raissi@cs.rwth-aachen.de