Seminar "Selected Topics in Human Language Technology and Pattern Recognition"

In the Winter Semester 2019/20, the Lehrstuhl Informatik 6 will host a seminar entitled "Selected Topics in Human Language Technology and Pattern Recognition" at both Bachelor and Master level.

Registration for the seminar

Registration for the seminar is only possible online via the central registration page.

Prerequisites for Participation in the Seminar

General Goals of the Seminar

The goal of the seminar is to autonomously acquire knowledge and critical comprehension of an assigned topic, and to present this topic both in writing and orally.

This includes:

Seminar Format and Important Dates

The seminar will start with a kick-off meeting, which will take place shortly after the central registration for the seminars in the Computer Science Department. The exact date of the kick-off meeting will be communicated directly to the seminar participants selected in the central registration.

Please note the following deadlines during the seminar:

Note: failure to meet deadlines, unexcused absence from compulsory sessions (presentations and the preliminary meeting, as announced by email to each participating student), or dropping out of the seminar later than 3 weeks after the preliminary meeting/topic distribution results in the grade 5.0 (not appeared).

The deadline for de-registration from the seminar is three weeks after the distribution of the topics during the kick-off meeting. After this deadline, seminar participation is confirmed and will be graded.



Topics, Initial References Defining the Topics, Participants, and Supervisors

In general, selected topics from the following general areas of Human Language Technology and Pattern Recognition will be offered. Below, you find exemplary topics; note, however, that topics are subject to change/updates. The final topics will be presented at the kick-off meeting, which will be announced to the seminar participants selected in the central registration for the seminar.
  1. Speaker Diarization

    1. Methods for Speaker Diarization (NN; Supervisor: Wilfried Michel)
      Initial References:

    2. Applications and Challenges for Speaker Diarization (NN; Supervisor: Wilfried Michel)
      Initial References:

  2. Speaker Separation


    1. Permutation-Invariant Training (NN; Supervisor: Tobias Menne)
      Initial References:

    2. Speaker-Dependent Speaker Separation (NN; Supervisor: Tobias Menne)
      Initial References:

  3. Natural Language Understanding

    1. Pre-training Language Representation Models for NLP (Choi; Supervisor: Kazuki Irie)
      Initial References:

    2. Natural Language Understanding (Zhan; Supervisor: Kazuki Irie)
      Presentation: Wed, Feb 26, 2020, 14h

      Initial References:

    3. Cross-lingual Word Embedding (Vanvinckenroye; Supervisor: Yunsu Kim)
      Presentation: Wed, Feb 26, 2020, 15h

      Initial References:

    4. Cross-lingual Sentence Embedding (NN; Supervisor: Yunsu Kim)
      Initial References:

  4. Sentiment Analysis

    1. Emotion Detection (Hugenroth; Supervisor: Eugen Beck)
      Initial References:

    2. Multimodal Sentiment Analysis (Mangel; Supervisor: Eugen Beck)
      Presentation: Wed, Feb 26, 2020, 16h

      Initial References:

  5. Language Identification

    1. State-of-the-Art Language Identification (Rompelberg; Supervisor: Markus Kitza)
      Presentation: Wed, Feb 26, 2020, 17h

      Initial References:

    2. Fusion-based Native Language Identification (NN; Supervisor: Markus Kitza)
      Initial References:

  6. Text-to-Speech

    1. Auto-regressive Models (NN; Supervisor: Yingbo Gao)
      Initial References:

    2. Inverse Autoregressive Flows (NN; Supervisor: Yingbo Gao)
      Initial References:

    3. End-to-end Text-to-speech (NN; Supervisor: Peter Vieting)
      Initial References:


  7. Speech-to-Text Translation

    1. End-to-end Speech-to-text Translation (Saeed; Supervisor: Parnia Bahar)
      Initial References:


  8. Reinforcement Learning

    1. Minimum Expected Loss Training (Gerstenberger; Supervisor: Albert Zeyer)
      Initial References:

    2. Modern Policy Learning Methods for Games (El Qoraichi; Supervisor: Albert Zeyer)
      Initial References:

    3. Memory Augmented Networks for Reinforcement Learning (Petrick; Supervisor: Christoph Lüscher)
      Presentation: Thu, Feb 27, 2020, 14h

      Initial References:

  9. Text Summarization

    1. Extractive Text Summarization (Swoboda; Supervisor: Jan Rosendahl)
      Presentation: Thu, Feb 27, 2020, 15h

      Initial References:

    2. Abstractive Text Summarization (with Deep Learning) (Wynands; Supervisor: Christian Herold)
      Presentation: Thu, Feb 27, 2020, 16h

      Initial References:

  10. Named Entity Recognition

    1. Named Entity Recognition (Becks; Supervisor: Weiyue Wang)
      Presentation: Thu, Feb 27, 2020, 17h

      Initial References:

    2. Entity Linking (Das; Supervisor: Weiyue Wang)
      Initial References:

  11. Constituency Parsing

    1. Neural Network-based Parsing (NN; Supervisor: Parnia Bahar)
      Initial References:
      • A. Vaswani, N. Shazeer, N. Parmar, J. Uszkoreit, L. Jones, A. N. Gomez, L. Kaiser, I. Polosukhin: "Attention Is All You Need," in Advances in Neural Information Processing Systems 30, Annual Conference on Neural Information Processing Systems (NIPS), Long Beach, CA, Dec. 2017, arXiv:1706.03762.
      • C. Dyer, A. Kuncoro, M. Ballesteros, N. A. Smith: "Recurrent Neural Network Grammars," in Proc. 15th Annual Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies (NAACL-HLT), San Diego, CA, Jun. 2016, arXiv:1602.07776.
      • M.-T. Luong, Q. V. Le, I. Sutskever, O. Vinyals, L. Kaiser: "Multi-task Sequence to Sequence Learning," in International Conference on Learning Representations (ICLR), San Juan, Puerto Rico, May 2016, arXiv:1511.06114.
Overview of Topics

  C2. Natural Language Understanding - Presentation: Wed, Feb 26, 2020, 14h
  C3. Cross-lingual Word Embedding - Presentation: Wed, Feb 26, 2020, 15h
  D2. Multimodal Sentiment Analysis - Presentation: Wed, Feb 26, 2020, 16h
  E1. State-of-the-Art Language Identification - Presentation: Wed, Feb 26, 2020, 17h
  H3. Memory Augmented Networks for Reinforcement Learning - Presentation: Thu, Feb 27, 2020, 14h
  I1. Extractive Text Summarization - Presentation: Thu, Feb 27, 2020, 15h
  I2. Abstractive Text Summarization (with Deep Learning) - Presentation: Thu, Feb 27, 2020, 16h
  J1. Named Entity Recognition - Presentation: Thu, Feb 27, 2020, 17h


Article and Presentation Format

The article (roughly 20 pages) and the presentation slides (between 20 and 30) should be prepared in LaTeX and submitted electronically in PDF format; other formats will not be accepted. Presentations consist of 30 to 40 minutes of presentation time plus 15 minutes of discussion time. Document templates for both the article and the presentation slides are provided below, along with links to LaTeX documentation available online.
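As a starting point, a minimal article skeleton might look like the following sketch. Note that the class, package choices, and placeholder names here are generic assumptions; the actual templates provided below should be preferred.

```latex
% Minimal seminar article skeleton (generic sketch; use the provided
% seminar template instead of the plain article class where available).
\documentclass[a4paper,11pt]{article}
\usepackage[utf8]{inputenc}
\usepackage{graphicx}   % figures
\usepackage{amsmath}    % equations
\usepackage{cite}       % citation handling

\title{Seminar Topic Title}
\author{Student Name \\ Supervisor: Supervisor Name}
\date{Winter Semester 2019/20}

\begin{document}
\maketitle

\begin{abstract}
Short summary of the topic and the main findings.
\end{abstract}

\section{Introduction}
% ...

\bibliographystyle{plain}
\bibliography{references}   % references.bib with the initial references
\end{document}
```

Compiling with pdflatex produces the required PDF directly.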

Detailed Guidelines:

Some Tips:

Time management is crucial for a successful seminar:
Successful seminar articles/presentations typically:
While reading papers, it might be useful to keep the following questions in mind:

Contact

Questions regarding the content of the assigned seminar topics should be directed to the respective topic's supervisor.

General and administrative inquiries should be directed to:

Parnia Bahar
RWTH Aachen University
Lehrstuhl Informatik 6
Ahornstr. 55
52074 Aachen

Room 6125b
Tel: 0241 80 21632

E-Mail: bahar@cs.rwth-aachen.de