LISTEN Project: Hands-Free Voice-Enabled Interface to Web Applications for Smart Home Environments

LISTEN Workshop / Summer School
July 17-19, 2018
Bonn, Germany

Program

  - Tuesday, July 17:    08:50-18:40 technical session,
  - Wednesday, July 18:  08:50-12:45 technical session,
                         14:00-21:00 social event (*),
  - Thursday, July 19:   08:50-18:00 technical session.
  (*) Please note that the social event is a boat trip, so you cannot leave partway through.
Tuesday, July 17
08:50 Welcome
09:00 Keynote 1
Chin-Hui Lee, GeorgiaTech, Atlanta, USA: A machine learning approach for enhancement and recognition of microphone array speech. [Slides]
10:15 Distant Speech Recognition 1
10:15 Nikos Stefanakis, FORTH, Heraklion, Greece: Acoustic beamforming in front of a reflective plane. [Slides]
10:35 Maurizio Omologo, FBK, Trento, Italy: Realistic impulse responses for distant-speech recognition. [Slides]
10:55 Patrick Naylor, Imperial College London, UK, and Marcel Katz, Nuance, Germany: Robust statistical processing of TDOA estimates for distant speaker diarization. [Slides]
11:15 Break
11:45 Keynote 2
Jon Barker, U Sheffield, UK: The CHiME-5 challenge for far-field conversational speech recognition in domestic environments. [Slides]
13:00 Lunch
14:15 Keynote 3
Reinhold Häb-Umbach, U Paderborn, Germany: Neural network supported acoustic beamforming and source separation for ASR. [Slides]
15:30 End-to-End ASR
15:30 Albert Zeyer and Ralf Schlüter, RWTH Aachen, Germany: Attention-based ASR utilizing byte-pair encoding. [Slides]
15:50 Florian Metze, CMU, Pittsburgh, USA: Grounded sequence to sequence transduction (multi-modal speech recognition). [Slides]
16:10 Break
16:40 Speaker Separation and Localization
16:40 Sunit Sivasankaran, INRIA, Nancy, France: Keyword based speaker localization: Localizing a single speaker in multi-speaker environment. [Slides]
17:00 Mehdi Zohourian, U Bochum, Germany: Source localization and separation for binaural hearing aids. [Slides]
17:20 Distant Speech Recognition 2
17:20 Wei Zhou, EML, Heidelberg, Germany: The ASR system for the EML LISTEN demonstrator. [Slides]
17:40 Jozef Ivanécky, EML, Heidelberg, Germany: Architecture of the EML LISTEN demonstrator. [Slides]
18:00 Gerasimos Potamianos, U of Thessaly, Greece: Far-field ASR in Greek for domestic environment and child-robot-interaction applications. [Slides]
18:20 Pranay Dighe, IDIAP, Martigny, Switzerland: Improving far-field ASR using low-rank and sparse models. [Slides]
18:40 End of Session
Wednesday, July 18
08:50 Announcements
09:00 Keynote 4
Jasha Droppo, Microsoft, Redmond, USA: The Microsoft 2017 conversational speech recognition system. [Slides]
10:15 Current Topics in ASR 1
10:15 Tim Fingscheidt, TU Braunschweig, Germany: Acoustic model fusion for phoneme recognition according to the turbo principle. [Slides]
10:35 Mathew Magimai-Doss, IDIAP, Martigny, Switzerland: Learning to unlearn and relearn speech signal processing using neural networks: current and future perspectives. [Slides, updated July 19, 2018, 15:22h]
10:55 Markus Müller, KIT, Karlsruhe, Germany: Neural modulation for multilingual speech recognition. [Slides]
11:15 Break
11:30 Keynote 5
Satoshi Nakamura, NAIST, Nara, Japan: Toward Machine Speech Chain with Semi-supervised Learning by ASR-TTS coupling and Next Generation Speech-to-speech Translation. [Slides]
12:45 Lunch
14:00 Departure Bus Transfer to Social Event
Thursday, July 19
08:50 Announcements
09:00 Keynote 6
Erik McDermott, Google, Mountain View, USA: (Towards) next generation acoustic models for automatic speech recognition. [Slides, updated July 24, 2018, 13:28h]
10:15 Distant Speech Recognition 3
10:15 Igor Szoke, U Brno, Czech Republic: Collection of re-transmitted data and impulse responses and remote ASR and speaker verification. [Slides]
10:35 Lukas Drude, U Paderborn, Germany: Integrating neural network supported beamforming and dereverberation for distant speech recognition. [Slides]
10:55 Tobias Menne, RWTH Aachen, Germany: Speaker-adaptive beamforming for ASR. [Slides]
11:15 Break
11:45 Keynote 7
Alex Acero, Apple, Cupertino, USA: The Deep Learning Revolution. [Slides (115MB)]
13:00 Lunch
14:15 Keynote 8
Björn Hoffmeister, Amazon, Seattle, USA: Building far-field speech recognition for Amazon Alexa: Challenges and solutions. [Slides]
15:30 Current Topics in ASR 2
15:30 Adrián Giménez and Joan Albert Silvestre, UPV, Valencia, Spain: Speaker-adapted confidence measures for ASR using deep bidirectional recurrent neural networks. [Slides]
15:50 Alfons Juan, UPV, Valencia, Spain: Multilingual videos for education. [Slides]
16:10 Break
16:40 Speaker Separation, Localization and Diarization
16:40 Dorothea Kolossa, U Bochum, Germany: Exploiting structures of temporal causality for robust speaker localization in reverberant environments. [Slides]
17:00 Lauréline Pérotin, INRIA, Nancy, France: Multichannel RNN-based separation of overlapping speech. [Slides]
17:20 Katerina Zmolikova, U Brno, Czech Republic: Speaker aware neural network for speaker extraction from overlapping speech. [Slides]
17:40 Sanjeev Khudanpur, JHU, Baltimore, USA: Experiences and lessons learned from the inaugural DIHARD challenge. [Slides]
18:00 End of Session

Zip-archive with all presentation slides (200MB, updated July 24, 2018, 13:28h)

Contact

For organizational issues, especially in case of questions w.r.t. travel and accommodation/hotel booking, please contact our secretariat:
Anna Eva Andersen, Stephanie Jansen, phone: +49 (241) 80-21601 or -21606

During the workshop, please call Volker Steinbiss, mobile: +49 (179) 667918

