Seminar "Selected Topics in Human Language Technology and Pattern Recognition"

In the Summer Term 2018 the Lehrstuhl Informatik 6 will host a seminar entitled "Selected Topics in Human Language Technology and Pattern Recognition".

Registration for the seminar

Registration for the seminar is only possible online via the central registration page from Friday, Jan. 19 to Friday, Feb. 02, 2018. A link can also be found on the Computer Science Department's homepage.

Prerequisites for participation in the seminar

Bachelor students: Einführung in das wissenschaftliche Arbeiten (Proseminar)
Master students: Bachelor degree
Attendance of the lectures Pattern Recognition and Neural Networks, Speech Recognition or Statistical Methods in Natural Language Processing, or evidence of equivalent knowledge is highly recommended.
For successful participants of the above lectures, seminar participation is guaranteed.

Seminar format and important dates

Please note the following deadlines:

Proposals: initial proposals will be accepted up until the start of the term's lecture period (April 09, 2018) by email to the seminar topic's supervisor. At this time, participants must arrange an appointment with the relevant supervisor. Revised proposals will be accepted up until two weeks after the start of the term.
Article: PDF must be submitted at least 1 month prior to the trial presentation date by email to the seminar topic's supervisor.
Presentation slides: PDF must be submitted at least 1 week prior to the trial presentation date by email to the seminar topic's supervisor.
Trial presentations: at least 2 weeks prior to the actual presentation date; refer to the topics section.
Seminar presentations: dates will be announced during the lecture period.
Final (possibly corrected) articles and presentation slides: PDF must be submitted at the latest 4 weeks after the presentation date by email to the seminar topic's supervisor.
Compulsory attendance: in order to pass, participants must attend all presentation sessions.
Ethical Guidelines: The Computer Science Department of RWTH Aachen University has adopted ethical guidelines for the authoring of academic work, such as seminar reports. Each student has to comply with these guidelines. In this regard, you, as a seminar participant, have to sign a declaration of compliance, in which you assert that your work complies with the guidelines, that all references used are properly cited, and that you wrote the report independently. We ask you to download the guidelines and to submit the declaration, together with your seminar report and talk, to your supervisor. A German version of the guidelines and a German version of the declaration are also available and may be used instead.

Note: failure to meet deadlines, absence without permission from compulsory sessions (presentations and preliminary meeting as announced by email to each participating student), or dropping out of the seminar more than 3 weeks after the preliminary meeting/topic distribution results in the grade 5.0/not appeared.
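
The deadline arithmetic listed above can be sketched in a few lines of Python. This is only an illustration, not an official calculator: the function names are hypothetical, the trial presentation is assumed to take place exactly 2 weeks before the presentation (the latest date the rules allow), and "1 month" is assumed to mean one calendar month. The dates agreed with the supervisor always take precedence.

```python
import calendar
from datetime import date, timedelta


def minus_one_month(d: date) -> date:
    """Same day in the previous calendar month, clamped to that month's last day.

    Assumption: "1 month prior" is read as one calendar month.
    """
    year, month = (d.year - 1, 12) if d.month == 1 else (d.year, d.month - 1)
    last_day = calendar.monthrange(year, month)[1]
    return date(year, month, min(d.day, last_day))


def seminar_deadlines(presentation: date) -> dict:
    """Latest dates implied by the rules above for a given presentation date."""
    # Assumption: the trial presentation is held exactly 2 weeks before the
    # presentation; an earlier trial moves article and slide deadlines earlier.
    trial = presentation - timedelta(weeks=2)
    return {
        "article_pdf": minus_one_month(trial),       # 1 month before the trial
        "slides_pdf": trial - timedelta(weeks=1),    # 1 week before the trial
        "trial_presentation": trial,
        "presentation": presentation,
        "final_versions": presentation + timedelta(weeks=4),
    }


if __name__ == "__main__":
    # Example: a presentation on 19.06 in the Summer Term 2018.
    for name, day in seminar_deadlines(date(2018, 6, 19)).items():
        print(f"{name}: {day.isoformat()}")
```

Under these assumptions, a presentation on 19.06.2018 gives a trial presentation on 05.06., an article deadline of 05.05., a slide deadline of 29.05., and a final submission deadline of 17.07.2018.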

Topics, relevant references and participants

Speaker Diarization
1. Applications and Challenges (Thull; Supervisor: Wilfried Michel)
  Presentation Date: 19.06
  Initial References:
  - T. L. Nwe, H. Sun, H. Li and S. Rahardja, "Speaker diarization in meeting audio," 2009 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), Taipei, 2009. doi: 10.1109/ICASSP.2009.4960523
  - K. Church, W. Zhu, J. Vopicka, J. Pelecanos, D. Dimitriadis and P. Fousek, "Speaker diarization: A perspective on challenges and opportunities from theory to practice," 2017 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), New Orleans, LA, 2017. doi: 10.1109/ICASSP.2017.7953098
Speaker Separation
1. Deep Clustering (Von Platen; Supervisor: Tobias Menne)
  Presentation Date: 19.06
  Initial References:
  - John R. Hershey, Zhuo Chen, Jonathan Le Roux, Shinji Watanabe: "Deep Clustering: Discriminative Embeddings for Segmentation and Separation," IEEE Int. Conf. on Acoustics, Speech and Signal Processing (ICASSP), Shanghai, China, March 20-25, 2016.
  - Zhuo Chen, Yi Luo, Nima Mesgarani: "Deep Attractor Network for Single-Microphone Speaker Separation," IEEE Int. Conf. on Acoustics, Speech and Signal Processing (ICASSP), New Orleans, LA, March 2017, pp. 246-250.
Speaker Identification
1. Speaker Recognition (Tang; Supervisor: Weiyue Wang)
  Presentation Date: 19.06
  Initial References:
  - A Novel Scheme for Speaker Recognition Using a Phonetically-Aware Deep Neural Network, ICASSP 2014, http://ieeexplore.ieee.org/stamp/stamp.jsp?tp=&arnumber=6853887
  - Deep Neural Network Approaches to Speaker and Language Recognition, IEEE Signal Processing Letters 2015, http://ieeexplore.ieee.org/stamp/stamp.jsp?tp=&arnumber=7080838
2. Named Entity Recognition (Petri; Supervisor: Weiyue Wang)
  Presentation Date: 21.06
  Initial References:
  - Neural Architectures for Named Entity Recognition, NAACL 2016, http://www.aclweb.org/anthology/N16-1030
  - End-to-end Sequence Labeling via Bi-directional LSTM-CNNs-CRF, ACL 2016, http://www.aclweb.org/anthology/P16-1101
Voice Activity Detection
1. Feature Selection (Bebawi; Supervisor: Christoph Lüscher)
  Presentation Date: 21.06
  Initial References:
  - Ruben Zazo, Tara N. Sainath, Gabor Simko, Carolina Parada, "Feature Learning with Raw-Waveform CLDNNs for Voice Activity Detection", Interspeech 2016. http://dx.doi.org/10.21437/Interspeech.2016-268
  - Elie Khoury, Matt Garland, "I-Vectors for Speech Activity Detection", Odyssey 2016. http://www.odyssey2016.org/papers/pdfs_stamped/79.pdf
  - Longbiao Wang, Khomdet Phapatanaburi, Zeyan Oo, Seiichi Nakagawa, Masahiro Iwahashi, Jianwu Dang, "PHASE AWARE DEEP NEURAL NETWORK FOR NOISE ROBUST VOICE ACTIVITY DETECTION", ICME 2017. https://doi.org/10.1109/ICME.2017.8019414
Word Embeddings and Natural Language Understanding
1. Word embeddings and their applications to natural language processing (Gerstenberger; Supervisor: Kazuki Irie)
  Presentation Date: 21.06
  Initial References:
  - J. Pennington, R. Socher, and C. D. Manning. "GloVe: Global Vectors for Word Representation," in Proc. Conf. on Empirical Methods in Natural Language Processing (EMNLP), pages 1532-1543, Doha, Qatar, October 2014. https://www.aclweb.org/anthology/D14-1162
  - M. Kusner, Y. Sun, N. Kolkin, and K. Weinberger, "From Word Embeddings To Document Distances," in Proc. Int. Conf. on Machine Learning (ICML), pages 957-966, Lille, France, July 2015. http://proceedings.mlr.press/v37/kusnerb15.pdf
Sentiment Analysis from Audio
1. Emotion detection (Hanbing; Supervisor: Eugen Beck)
  Presentation Date: 22.06
  Initial References:
  - G. Trigeorgis et al., "Adieu features? End-to-end speech emotion recognition using a deep convolutional recurrent network," 2016 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), Shanghai, 2016, pp. 5200-5204.
  - Ghosh, S., Laksana, E., Morency, L.P. and Scherer, S., 2016, September. Representation Learning for Speech Emotion Recognition. In INTERSPEECH (pp. 3603-3607).
2. Multimodal sentiment analysis (Meng; Supervisor: Eugen Beck)
  Presentation Date: 22.06
  Initial References:
  - Poria, S., Cambria, E. and Gelbukh, A., 2015. Deep convolutional neural network textual features and multiple kernel learning for utterance-level multimodal sentiment analysis. In Proceedings of the 2015 conference on empirical methods in natural language processing (pp. 2539-2544).
  - Mohammad Soleymani, David Garcia, Brendan Jou, Björn Schuller, Shih-Fu Chang, Maja Pantic, A survey of multimodal sentiment analysis, Image and Vision Computing, Volume 65, 2017, Pages 3-14.
Language Identification (1)
1. Language Identification (Gharbi; Supervisor: Markus Kitza)
  Presentation Date: 22.06
  Initial References:
  - Reviewing automatic language identification: http://ieeexplore.ieee.org/abstract/document/317925/
  - A covariance kernel for svm language recognition: http://ieeexplore.ieee.org/abstract/document/4518566/ (2008)
  - The MITLL NIST LRE 2009 language recognition system: http://ieeexplore.ieee.org/abstract/document/5495080/ (2009)
Speech Synthesis
1. Auto-regressive models (Gajjar; Supervisor: Albert Zeyer)
  Presentation Date: 25.06
  Initial References:
  - Efficient Neural Audio Synthesis, https://arxiv.org/abs/1802.08435
  - PixelCNN++, https://arxiv.org/abs/1701.05517
Speech Enhancement
1. Speech Enhancement for Human Listeners
  Initial References:
  - X. Xu, R. Flynn, and M. Russell, "Speech intelligibility and quality: A comparative study of speech enhancement algorithms," in 2017 28th Irish Signals and Systems Conference (ISSC), 2017, pp. 1-6. http://ieeexplore.ieee.org/document/7983599/
  - P. C. Loizou and G. Kim, "Reasons why Current Speech-Enhancement Algorithms do not Improve Speech Intelligibility and Suggested Solutions," IEEE Trans. Audio. Speech. Lang. Processing, vol. 19, no. 1, pp. 47-56, Jan. 2011. http://ieeexplore.ieee.org/document/5428850/
  - Y. Xu, J. Du, L. Dai, and C. Lee, "A Regression Approach to Speech Enhancement Based on Deep Neural Networks," IEEE Trans. Audio, Speech Lang. Process., vol. 23, no. 1, pp. 7-19, 2015. http://ieeexplore.ieee.org/document/6932438/
2. Speech Enhancement for ASR
  Initial References:
  - F. Weninger, H. Erdogan, S. Watanabe, E. Vincent, J. Roux, J. R. Hershey, and B. Schuller, "Speech Enhancement with LSTM Recurrent Neural Networks and Its Application to Noise-Robust ASR," in Proceedings of the 12th International Conference on Latent Variable Analysis and Signal Separation - Volume 9237, 2015, pp. 91-99. https://hal.inria.fr/hal-01163493/file/weninger_LVA15.pdf
  - T. Ochiai, S. Watanabe, and S. Katagiri, "Does speech enhancement work with end-to-end ASR objectives?: Experimental analysis of multichannel end-to-end ASR," in 2017 IEEE 27th International Workshop on Machine Learning for Signal Processing (MLSP), 2017, no. 26280063, pp. 1-6. http://ieeexplore.ieee.org/document/8168188/
Constituency Parsing
Text Summarization
1. Extractive Text Summarization (Friedrichs; Supervisor: Jan Rosendahl)
  Presentation Date: 25.06
  Initial References:
  - Text Summarization Techniques: A Brief Survey. Mehdi Allahyari, Seyedamin Pouriyeh et al. 2017. https://arxiv.org/abs/1707.02268
  - Automatic Text Summarization (book). Torres-Moreno, Juan-Manuel, 2014. http://onlinelibrary.wiley.com/book/10.1002/9781119004752 (RWTH Aachen Network)
2. Abstractive Text Summarization (with Deep Learning) (Ahmed; Supervisor: Julian Schamper)
  Presentation Date: 25.06
  Initial References:
  - Abstractive Text Summarization using Sequence-to-sequence RNNs and Beyond. Ramesh Nallapati, Bowen Zhou et al. 2016. CoNLL. http://www.aclweb.org/anthology/K16-1028
  - Get To The Point: Summarization with Pointer-Generator Networks. Abigail See, Peter J. Liu et al. 2017. ACL. http://aclweb.org/anthology/P17-1099
Sentiment Analysis of Text
1. Document/Sentence Level Sentiment Analysis (Tokarchuk; Supervisor: Yunsu Kim)
  Presentation Date: 26.06
  Initial References:
  - Z. Yang, D. Yang, C. Dyer, X. He, A. Smola, E. Hovy, "Hierarchical Attention Networks for Document Classification", NAACL-HLT 2016, http://www.aclweb.org/anthology/N16-1174
  - X. Wang, W. Jiang, Z. Luo, "Combination of Convolutional and Recurrent Neural Network for Sentiment Analysis of Short Texts", COLING 2016, http://www.aclweb.org/anthology/C16-1229
2. Aspect Level Sentiment Analysis (Petrov; Supervisor: Yunsu Kim)
  Presentation Date: 26.06
  Initial References:
  - Y. Wang, M. Huang, L. Zhao, X. Zhu, "Attention-based LSTM for Aspect-level Sentiment Classification", EMNLP 2016, https://aclweb.org/anthology/D16-1058
  - P. Chen, Z. Sun, L. Bing, W. Yang, "Recurrent Attention Network on Memory for Aspect Sentiment Analysis", EMNLP 2017, http://aclweb.org/anthology/D17-1047
Language Identification (2)
1. Language Identification with Deep Learning (Gokrani; Supervisor: Markus Kitza)
  Presentation Date: 26.06
  Initial References:
  - Automatic language identification using deep neural networks: http://ieeexplore.ieee.org/abstract/document/6854622/ (2014)
  - Convolutional ANN: Deep learning for spoken language identification: https://pdfs.semanticscholar.org/1b17/f0926b373ef49245a28fdddd3c9e90006e60.pdf (2009)

Guidelines for the article and presentation

The article and the slides should be prepared in LaTeX and submitted electronically as PDF files. Other formats will not be accepted.

Online LaTeX-Documentation:

Document Templates:

Article Template (51kB), contains the template and all necessary files in tar format (or here 10kB in zip format).
New Presentation Slide Template, a zip file containing the template and all necessary graphics as well as the institute’s style template. Note: We deactivated the RWTH and i6 logos in this version of the template since the seminar content is produced by students outside of i6.

Guidelines for articles and presentation slides:

General:

The aim of the seminar for the participants is to learn the following:

to tackle a topic and to expand knowledge
to critically analyze the literature
to hold a presentation

Take notice of references to other topics in the seminar and discuss topics with one another!
Take care to stay within your own topic. To this end participants should be aware of the other topics in the seminar. If applicable, cross-reference other articles and presentations.

Specific:

Important: As part of the introduction, a slide should outline the most important literature used for the presentation. In addition, the presentation should clearly indicate which literature the particular elements of the presentation refer to.
Participants are expected to seek out additional literature on their topic. Assistance with the literature search is available at the faculty’s library. Access to literature is naturally also available at the Lehrstuhl Informatik 6 library.
Notation/Mathematical Formulas: consistent, correct notation is essential. When necessary, differing notation from various literature sources is to be modified or standardized in order to be clear and consistent. The lectures held by the Lehrstuhl Informatik 6 should provide a guide as to what appropriate notation should look like.
Tables must have titles (appearing above the table).
Figures must have captions (appearing below the figure).
The use of English is recommended in general and mandatory for the presentation slides. The article and the oral presentation may nevertheless be in German.
In the case that no adequate translation of an English technical term is available, the term should be used unchanged.
Completeness: acknowledge all literature and sources.
Referencing must conform to the standard described in the article template.
Examples should be used to illustrate points.
Examples should be as complex as necessary but as simple as possible.
Slides should be used as presentation aids and not to replace the role of the presenter; specifically, slides should:

illustrate important points and relationships;
remind the audience (and the presenter) of important aspects and considerations;
give the audience an overview of the presentation.
Slides should not contain chunks of text or complicated sentences; rather they should consist of succinct words and terms.

Use illustrations where appropriate - a picture says a thousand words!
Abbreviations should be defined at the first usage in the manner demonstrated in the following example: "[...] at the Rheinisch-Westfälischen Technischen Hochschule (RWTH) there are [...]".
Usage of fonts, typefaces and colors in presentation slides must be consistent and appropriate. Such means should serve to clarify points or relationships, not be applied needlessly or at random.
Care should be taken when selecting fonts for presentation slides (also within diagrams) to ensure legibility on a projector even for those seated far from the screen.

Contact

Markus Kitza
RWTH Aachen University
Lehrstuhl Informatik 6
Ahornstr. 55
52074 Aachen

Room 6110

E-Mail: kitza@cs.rwth-aachen.de