Personal

Welcome to the homepage of Wei Zhou.

I was a PhD student at the Machine Learning and Human Language Technology Group of RWTH Aachen University from January 2019 to December 2023.

I mainly focus on Automatic Speech Recognition and my research interests include

Sequence-to-sequence Modeling
End-to-end ASR
(Deep) Neural Networks
Search and Decoding
Acoustic Modeling

Other links:

Email: zhou@cs.rwth-aachen.de

Work Experience

01.2019 - 12.2023	PhD at the Machine Learning and Human Language Technology Group, RWTH Aachen University
01.2019 - 12.2023	Speech Scientist (part-time) at AppTek GmbH, Aachen
07.2015 - 12.2018	Speech Recognition Software Engineer at EML Speech Technology GmbH, Heidelberg

Teaching

Exercise Course

Automatic Speech Recognition, WS19/20 - WS21/22
Advanced Methods in Automatic Speech Recognition, SS20 - SS22

Seminar Supervision

Selected Topics in Human Language Technology and Pattern Recognition, SS19 - WS22/23

List of Publications

2024

Zijian Yang, Wei Zhou, Ralf Schlüter, and Hermann Ney. On the Relation between Internal Language Model and Sequence Discriminative Training for Neural Transducers. RWTH Aachen University, 2024. Submitted to ICASSP 2024.

2023

Z. Yang, W. Zhou, R. Schlüter, and H. Ney. Investigating the Effect of Language Models in Sequence Discriminative Training for Neural Transducers. In IEEE Automatic Speech Recognition and Understanding Workshop (ASRU), Taipei, Taiwan, December 2023.
W. Zhou, E. Beck, S. Berger, R. Schlüter, and H. Ney. RASR2: The RWTH ASR Toolkit for Generic Sequence-to-sequence Speech Recognition. In Interspeech, pages 4094-4098, Dublin, Ireland, August 2023. [slides].
W. Zhou, H. Wu, J. Xu, M. Zeineldeen, C. Lüscher, R. Schlüter, and H. Ney. Enhancing and Adversarial: Improve ASR with Speaker Labels. In IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), Rhodes, Greece, June 2023. [poster].
Z. Yang, W. Zhou, R. Schlüter, and H. Ney. Lattice-Free Sequence Discriminative Training for Phoneme-Based Neural Transducers. In IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), Rhodes, Greece, June 2023.
T. Raissi, W. Zhou, S. Berger, R. Schlüter, and H. Ney. HMM vs. CTC for Automatic Speech Recognition: Comparison Based on Full-Sum Training from Scratch. In IEEE Spoken Language Technology Workshop (SLT), Doha, Qatar, January 2023.
A. Zeyer, R. Schmitt, W. Zhou, R. Schlüter, and H. Ney. Monotonic segmental attention for automatic speech recognition. In IEEE Spoken Language Technology Workshop (SLT), pages 229-236, Doha, Qatar, January 2023. Preprint arXiv:2210.14742.

2022

W. Zhou, W. Michel, R. Schlüter, and H. Ney. Efficient Training of Neural Transducer for Speech Recognition. In Interspeech, pages 2058-2062, Incheon, Korea, September 2022. [poster].
W. Zhou, Z. Zheng, R. Schlüter, and H. Ney. On Language Model Integration for RNN Transducer based Speech Recognition. In IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), pages 8407-8411, Singapore, May 2022. [poster] [slides].

2021

Yu Qiao, Sourabh Zanwar, Rishab Bhattacharyya, Daniel Wiechmann, Wei Zhou, Elma Kerz, and Ralf Schlüter. Prediction of Listener Perception of Argumentative Speech in a Crowdsourced Dataset Using (Psycho-)Linguistic and Fluency Features. November 2021.
W. Zhou, A. Zeyer, A. Merboldt, R. Schlüter, and H. Ney. Equivalence of Segmental and Neural Transducer Modeling: A Proof of Concept. In Interspeech, pages 2891-2895, August 2021. [slides].
W. Zhou, M. Zeineldeen, Z. Zheng, R. Schlüter, and H. Ney. Acoustic Data-Driven Subword Modeling for End-to-End Speech Recognition. In Interspeech, pages 2886-2890, August 2021. [slides].
Y. Qiao, W. Zhou, E. Kerz, and R. Schlüter. The Impact of ASR on the Automatic Analysis of Linguistic Complexity and Sophistication in Spontaneous L2 Speech. In Interspeech, pages 4453-4457, August 2021.
W. Zhou, S. Berger, R. Schlüter, and H. Ney. Phoneme Based Neural Transducer for Large Vocabulary Speech Recognition. In IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), pages 5644-5648, June 2021. [poster].

2020

Mohammad Zeineldeen, Albert Zeyer, Wei Zhou, Thomas Ng, Ralf Schlüter, and Hermann Ney. A Systematic Comparison of Grapheme-based vs. Phoneme-based Label Units for Encoder-Decoder-Attention Models. November 2020. Preprint arXiv:2005.09336.
W. Zhou, R. Schlüter, and H. Ney. Robust Beam Search for Encoder-Decoder Attention Based Speech Recognition without Length Bias. In Interspeech, pages 1768-1772, Shanghai, China, October 2020. [slides].
W. Zhou, R. Schlüter, and H. Ney. Full-Sum Decoding for Hybrid HMM based Speech Recognition using LSTM Language Model. In IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), pages 7834-7838, Barcelona, Spain, May 2020.
W. Zhou, W. Michel, K. Irie, M. Kitza, R. Schlüter, and H. Ney. The RWTH ASR system for TED-LIUM release 2: Improving Hybrid HMM with SpecAugment. In IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), pages 7839-7843, Barcelona, Spain, May 2020.

2019

Eugen Beck, Wei Zhou, Ralf Schlüter, and Hermann Ney. LSTM Language Models for LVCSR in First-Pass Decoding and Lattice-Rescoring. July 2019. https://arxiv.org/abs/1907.01030.

Full list of publications of the chair.