Homepage of Albert Zeyer

2025

N. Bayoumi, R. Schmitt, T. Raissi, A. Zeyer, R. Schlüter, and H. Ney. A Comparative Analysis on ASR System Combination for Attention, CTC, Factored Hybrid, and Transducer Models. In IEEE ITG Conference on Speech Communication, September 2025. To appear.
J. Xu, Z. Yang, A. Zeyer, E. Beck, R. Schlüter, and H. Ney. Dynamic Acoustic Model Architecture Optimization in Training for ASR. In Interspeech, Rotterdam, The Netherlands, August 2025.
R. Schmitt, A. Zeyer, M. Zeineldeen, R. Schlüter, and H. Ney. The Conformer Encoder May Reverse the Time Dimension. In IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), Hyderabad, India, April 2025. Preprint arXiv:2501.04521.

2024

M. Zeineldeen, A. Zeyer, R. Schlüter, and H. Ney. Chunked Attention-based Encoder-Decoder Model for Streaming Speech Recognition. In IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), Seoul, Korea, April 2024. Preprint arXiv:2309.08436.

2023

A. Zeyer, R. Schmitt, W. Zhou, R. Schlüter, and H. Ney. Monotonic segmental attention for automatic speech recognition. In IEEE Spoken Language Technology Workshop (SLT), pages 229-236, Doha, Qatar, January 2023. Preprint arXiv:2210.14742.

2022

A. Zeyer. Neural Network based Modeling and Architectures for Automatic Speech Recognition and Machine Translation. PhD Thesis, Computer Science Department, RWTH Aachen University, Aachen, Germany, June 2022.
M. Gansen, J. Lou, F. Freye, T. Gemmeke, F. Merchant, A. Zeyer, M. Zeineldeen, R. Schlüter, and X. Fan. Discrete Steps towards Approximate Computing. In International Symposium on Quality Electronic Design (ISQED), pages 1-6, April 2022. DOI: 10.1109/ISQED54688.2022.9806215.

2021

A. Zeyer, A. Merboldt, W. Michel, R. Schlüter, and H. Ney. Librispeech Transducer Model with Internal Language Model Prior Correction. In Interspeech, pages 2052-2056, August 2021. arXiv:2104.03006.
W. Zhou, A. Zeyer, A. Merboldt, R. Schlüter, and H. Ney. Equivalence of Segmental and Neural Transducer Modeling: A Proof of Concept. In Interspeech, pages 2891-2895, August 2021. [slides].
M. Zeineldeen, A. Glushko, W. Michel, A. Zeyer, R. Schlüter, and H. Ney. Investigating Methods to Improve Language Model Integration for Attention-based Encoder-Decoder ASR Models. In Interspeech, pages 2856-2860, August 2021. [slides] [video].
A. Zeyer, R. Schlüter, and H. Ney. Why does CTC result in peaky behavior? May 2021. Preprint arXiv:2105.14849.
A. Zeyer, R. Schlüter, and H. Ney. A study of latent monotonic attention variants. March 2021. Preprint arXiv:2103.16710.

2020

M. Zeineldeen, A. Zeyer, W. Zhou, T. Ng, R. Schlüter, and H. Ney. A Systematic Comparison of Grapheme-based vs. Phoneme-based Label Units for Encoder-Decoder-Attention Models. November 2020. Preprint arXiv:2005.09336.
A. Zeyer, N. Rossenbach, P. Bahar, A. Merboldt, and R. Schlüter. Efficient and Flexible Implementation of Machine Learning for ASR and MT. In Interspeech, Shanghai, China, October 2020. Tutorial slides.
A. Zeyer, A. Merboldt, R. Schlüter, and H. Ney. A New Training Pipeline for an Improved Neural Transducer. In Interspeech, Shanghai, China, October 2020. [slides].
M. Zeineldeen, A. Zeyer, R. Schlüter, and H. Ney. Layer-normalized LSTM for Hybrid-HMM and End-to-End ASR. In IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), pages 7679-7683, Barcelona, Spain, May 2020. [slides].
N. Rossenbach, A. Zeyer, R. Schlüter, and H. Ney. Generating Synthetic Audio Data for Attention-Based Speech Recognition Systems. In IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), pages 7069-7073, Barcelona, Spain, May 2020.
P. Bahar, N. Makarov, A. Zeyer, R. Schlüter, and H. Ney. Exploring A Zero-Order Direct HMM based on Latent Attention for Automatic Speech Recognition. In IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), pages 7854-7858, Barcelona, Spain, May 2020.
V. Bozheniuk, A. Zeyer, R. Schlüter, and H. Ney. A comprehensive study of Residual CNNs for acoustic modeling in ASR. In IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), pages 7674-7678, Barcelona, Spain, May 2020.

2019

K. Irie, A. Zeyer, R. Schlüter, and H. Ney. Training Language Models for Long-Span Cross-Sentence Evaluation. In IEEE Automatic Speech Recognition and Understanding Workshop (ASRU), pages 419-426, Sentosa, Singapore, December 2019. [poster].
A. Zeyer, P. Bahar, K. Irie, R. Schlüter, and H. Ney. A comparison of Transformer and LSTM encoder decoder models for ASR. In IEEE Automatic Speech Recognition and Understanding Workshop (ASRU), pages 8-15, Sentosa, Singapore, December 2019.
P. Bahar, A. Zeyer, R. Schlüter, and H. Ney. On Using SpecAugment for End-to-End Speech Translation. In International Workshop on Spoken Language Translation (IWSLT), Hong Kong, China, November 2019.
K. Irie, A. Zeyer, R. Schlüter, and H. Ney. Language Modeling with Deep Transformers. In Interspeech, pages 3905-3909, Graz, Austria, September 2019. ISCA Best Student Paper Award. [slides].
C. Lüscher, E. Beck, K. Irie, M. Kitza, W. Michel, A. Zeyer, R. Schlüter, and H. Ney. RWTH ASR Systems for LibriSpeech: Hybrid vs Attention. In Interspeech, pages 231-235, Graz, Austria, September 2019.
A. Merboldt, A. Zeyer, R. Schlüter, and H. Ney. An analysis of local monotonic attention variants. In Interspeech, pages 1398-1402, Graz, Austria, September 2019.
M. A. Tahir, H. Huang, A. Zeyer, R. Schlüter, and H. Ney. Training of reduced-rank linear transformations for multi-layer polynomial acoustic features for speech recognition. Speech Communication, volume 110, pages 56-63, July 2019.
P. Bahar, A. Zeyer, R. Schlüter, and H. Ney. On using 2D sequence-to-sequence models for speech recognition. In IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), pages 5671-5675, Brighton, UK, May 2019.

2018

A. Zeyer, A. Merboldt, R. Schlüter, and H. Ney. A comprehensive analysis on attention models. In Interpretability and Robustness in Audio, Speech, and Language (IRASL) Workshop, Conference on Neural Information Processing Systems (NeurIPS) (NIPS IRASL), Montreal, Canada, December 2018.
E. Matusov, P. Wilken, P. Bahar, J. Schamper, P. Golik, A. Zeyer, J. A. Silvestre-Cerda, A. Martinez-Villaronga, H. Pesch, and J. Peter. Neural Speech Translation at AppTek. In Proceedings of the 15th International Workshop on Spoken Language Translation, pages 104-111, Bruges, Belgium, October 2018.
E. Beck, A. Zeyer, P. Doetsch, A. Merboldt, R. Schlüter, and H. Ney. Sequence Modeling and Alignment for LVCSR-Systems. In ITG Conference on Speech Communication, Oldenburg, October 2018.
A. Zeyer, K. Irie, R. Schlüter, and H. Ney. Improved training of end-to-end attention models for speech recognition. In Interspeech, Hyderabad, India, September 2018.
A. Zeyer, T. Alkhouli, and H. Ney. RETURNN as a Generic Flexible Neural Toolkit with Application to Translation and Speech Recognition. In Annual Meeting of the Assoc. for Computational Linguistics (ACL), Melbourne, Australia, July 2018.
R. Schlüter, P. Doetsch, P. Golik, M. Kitza, T. Menne, K. Irie, Z. Tüske, and A. Zeyer. Neuronale Netze in der automatischen Spracherkennung - ein Paradigmenwechsel? In 44. Jahrestagung für Akustik der Deutschen Gesellschaft für Akustik, pages 15-28, Munich, Germany, March 2018.

2017

A. Zeyer, E. Beck, R. Schlüter, and H. Ney. CTC in the Context of Generalized Full-Sum HMM Training. In Interspeech, pages 944-948, Stockholm, Sweden, August 2017.
P. Doetsch, A. Zeyer, P. Voigtlaender, I. Kulikov, R. Schlüter, and H. Ney. RETURNN: the RWTH extensible training framework for universal recurrent neural networks. In IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), pages 5345-5349, New Orleans, LA, USA, March 2017.
A. Zeyer, I. Kulikov, R. Schlüter, and H. Ney. Faster sequence training. In IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), pages 5285-5289, New Orleans, LA, USA, March 2017.
A. Zeyer, P. Doetsch, P. Voigtlaender, R. Schlüter, and H. Ney. A Comprehensive Study of Deep Bidirectional LSTM RNNs for Acoustic Modeling in Speech Recognition. In IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), pages 2462-2466, New Orleans, LA, USA, March 2017.

2016

P. Doetsch, A. Zeyer, and H. Ney. Bidirectional decoder networks for attention-based end-to-end offline handwriting recognition. In International Conference on Frontiers in Handwriting Recognition (ICFHR), pages 361-366, Shenzhen, China, October 2016.
M. Kitza, J. Heymann, A. Zeyer, R. Schlüter, and R. Häb-Umbach. Robust Online Multi-Channel Speech Recognition. In ITG Conference on Speech Communication, pages 347-351, Paderborn, Germany, October 2016.
A. Zeyer, R. Schlüter, and H. Ney. Towards online-recognition with deep bidirectional LSTM acoustic models. In Interspeech, pages 3424-3428, San Francisco, CA, USA, September 2016.
T. Menne, J. Heymann, A. Alexandridis, K. Irie, A. Zeyer, M. Kitza, P. Golik, I. Kulikov, L. Drude, R. Schlüter, H. Ney, R. Haeb-Umbach, and A. Mouchtaris. The RWTH/UPB/FORTH System Combination for the 4th CHiME Challenge Evaluation. In The 4th International Workshop on Speech Processing in Everyday Environments, pages 39-44, San Francisco, CA, USA, September 2016.
R. Schlüter, P. Doetsch, P. Golik, M. Kitza, T. Menne, K. Irie, Z. Tüske, and A. Zeyer. Automatic Speech Recognition Based on Neural Networks. In International Conference Speech and Computer (SPECOM), Lecture Notes in Computer Science, Subseries Lecture Notes in Artificial Intelligence, volume 9811, pages 3-17, Budapest, Hungary, August 2016.
