Program
- Tuesday, July 17: 08:50-18:40 technical session,
- Wednesday, July 18: 08:50-12:45 technical session, 14:00-21:00 social event (*),
- Thursday, July 19: 08:50-18:00 technical session.
(*) Please note that the social event is a boat trip, so you cannot leave before it ends.
Tuesday, July 17
| | Chin-Hui Lee, Georgia Tech, Atlanta, USA: | A machine learning approach for enhancement and recognition of microphone array speech. [Slides] |
| 10:15 | Distant Speech Recognition 1 | |
| 10:15 | Nikos Stefanakis, FORTH, Heraklion, Greece: | Acoustic beamforming in front of a reflective plane. [Slides] |
| 10:35 | Maurizio Omologo, FBK, Trento, Italy: | Realistic impulse responses for distant-speech recognition. [Slides] |
| 10:55 | Patrick Naylor, Imperial College London, UK, and Marcel Katz, Nuance, Germany: | Robust statistical processing of TDOA estimates for distant speaker diarization. [Slides] |
| | Jon Barker, U Sheffield, UK: | The CHiME-5 challenge for far-field conversational speech recognition in domestic environments. [Slides] |
| | Reinhold Häb-Umbach, U Paderborn, Germany: | Neural network supported acoustic beamforming and source separation for ASR. [Slides] |
| 15:30 | Albert Zeyer and Ralf Schlüter, RWTH Aachen, Germany: | Attention-based ASR utilizing byte-pair encoding. [Slides] |
| 15:50 | Florian Metze, CMU, Pittsburgh, USA: | Grounded sequence to sequence transduction (multi-modal speech recognition). [Slides] |
| 16:40 | Speaker Separation and Localization | |
| 16:40 | Sunit Sivasankaran, INRIA, Nancy, France: | Keyword based speaker localization: Localizing a single speaker in multi-speaker environment. [Slides] |
| 17:00 | Mehdi Zohourian, U Bochum, Germany: | Source localization and separation for binaural hearing aids. [Slides] |
| 17:20 | Distant Speech Recognition 2 | |
| 17:20 | Wei Zhou, EML, Heidelberg, Germany: | The ASR system for the EML LISTEN demonstrator. [Slides] |
| 17:40 | Jozef Ivanécky, EML, Heidelberg, Germany: | Architecture of the EML LISTEN demonstrator. [Slides] |
| 18:00 | Gerasimos Potamianos, U of Thessaly, Greece: | Far-field ASR in Greek for domestic environment and child-robot-interaction applications. [Slides] |
| 18:20 | Pranay Dighe, IDIAP, Martigny, Switzerland: | Improving far-field ASR using low-rank and sparse models. [Slides] |
| 18:40 | End of Session | |
Wednesday, July 18
| | Jasha Droppo, Microsoft, Redmond, USA: | The Microsoft 2017 conversational speech recognition system. [Slides] |
| 10:15 | Current Topics in ASR 1 | |
| 10:15 | Tim Fingscheidt, TU Braunschweig, Germany: | Acoustic model fusion for phoneme recognition according to the turbo principle. [Slides] |
| 10:35 | Mathew Magimai-Doss, IDIAP, Martigny, Switzerland: | Learning to unlearn and relearn speech signal processing using neural networks: current and future perspectives. [Slides, updated July 19, 2018, 15:22h] |
| 10:55 | Markus Müller, KIT, Karlsruhe, Germany: | Neural modulation for multilingual speech recognition. [Slides] |
| | Satoshi Nakamura, NAIST, Nara, Japan: | Toward Machine Speech Chain with Semi-supervised Learning by ASR-TTS coupling and Next Generation Speech-to-speech Translation. [Slides] |
| 14:00 | Departure Bus Transfer to Social Event | |
Thursday, July 19
| 10:15 | Distant Speech Recognition 3 | |
| 10:15 | Igor Szoke, U Brno, Czech Republic: | Collection of re-transmitted data and impulse responses and remote ASR and speaker verification. [Slides] |
| 10:35 | Lukas Drude, U Paderborn, Germany: | Integrating neural network supported beamforming and dereverberation for distant speech recognition. [Slides] |
| 10:55 | Tobias Menne, RWTH Aachen, Germany: | Speaker-adaptive beamforming for ASR. [Slides] |
| | Alex Acero, Apple, Cupertino, USA: | The Deep Learning Revolution. [Slides (115MB)] |
| | Björn Hoffmeister, Amazon, Seattle, USA: | Building far-field speech recognition for Amazon Alexa: Challenges and solutions. [Slides] |
| 15:30 | Current Topics in ASR 2 | |
| 15:30 | Adrián Giménez and Joan Albert Silvestre, UPV, Valencia, Spain: | Speaker-adapted confidence measures for ASR using deep bidirectional recurrent neural networks. [Slides] |
| 15:50 | Alfons Juan, UPV, Valencia, Spain: | Multilingual videos for education. [Slides] |
| 16:40 | Speaker Separation, Localization and Diarization | |
| 16:40 | Dorothea Kolossa, U Bochum, Germany: | Exploiting structures of temporal causality for robust speaker localization in reverberant environments. [Slides] |
| 17:00 | Lauréline Pérotin, INRIA, Nancy, France: | Multichannel RNN-based separation of overlapping speech. [Slides] |
| 17:20 | Katerina Zmolikova, U Brno, Czech Republic: | Speaker aware neural network for speaker extraction from overlapping speech. [Slides] |
| 17:40 | Sanjeev Khudanpur, JHU, Baltimore, USA: | Experiences and lessons learned from the inaugural DIHARD challenge. [Slides] |
Zip-archive with all presentation slides (200MB, updated July 24, 2018, 13:28h)
Contact
For organizational matters, especially questions about travel and
accommodation/hotel booking, please contact our secretariat:
Anna Eva Andersen, Stephanie Jansen,
phone: +49 (241) 80-21601 or -21606
During the workshop, please call Volker Steinbiss, mobile: +49 (I79) 6679I8