Personal
Welcome to the homepage of Wei Zhou.
I was a PhD student at the Machine Learning and Human Language Technology Group of RWTH Aachen University from January 2019 to December 2023.
I mainly focus on Automatic Speech Recognition, and my research interests include:
- Sequence-to-sequence Modeling
- End-to-end ASR
- (Deep) Neural Networks
- Search and Decoding
- Acoustic Modeling
Other links:
Email: zhou@cs.rwth-aachen.de
Work Experience
Teaching
- Exercise Course
- Seminar Supervision
List of Publications
- J. Xu, W. Zhou, Z. Yang, E. Beck, and R. Schlüter. Dynamic Encoder Size Based on Data-Driven Layer-wise Pruning for Speech Recognition. In Interspeech, Kos, Greece, September 2024. Preprint arXiv:2407.18930.
- Z. Yang, W. Zhou, R. Schlüter, and H. Ney. On the Relation between Internal Language Model and Sequence Discriminative Training for Neural Transducers. In IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), Seoul, Korea, April 2024. Preprint arXiv:2309.14130.
- Z. Yang, W. Zhou, R. Schlüter, and H. Ney. Investigating the Effect of Language Models in Sequence Discriminative Training for Neural Transducers. In IEEE Automatic Speech Recognition and Understanding Workshop (ASRU), Taipei, Taiwan, December 2023.
- W. Zhou, E. Beck, S. Berger, R. Schlüter, and H. Ney. RASR2: The RWTH ASR Toolkit for Generic Sequence-to-sequence Speech Recognition. In Interspeech, pages 4094-4098, Dublin, Ireland, August 2023. [slides].
- W. Zhou, H. Wu, J. Xu, M. Zeineldeen, C. Lüscher, R. Schlüter, and H. Ney. Enhancing and Adversarial: Improve ASR with Speaker Labels. In IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), Rhodes, Greece, June 2023. [poster].
- Z. Yang, W. Zhou, R. Schlüter, and H. Ney. Lattice-Free Sequence Discriminative Training for Phoneme-Based Neural Transducers. In IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), Rhodes, Greece, June 2023.
- T. Raissi, W. Zhou, S. Berger, R. Schlüter, and H. Ney. HMM vs. CTC for Automatic Speech Recognition: Comparison Based on Full-Sum Training from Scratch. In IEEE Spoken Language Technology Workshop (SLT), Doha, Qatar, January 2023.
- A. Zeyer, R. Schmitt, W. Zhou, R. Schlüter, and H. Ney. Monotonic Segmental Attention for Automatic Speech Recognition. In IEEE Spoken Language Technology Workshop (SLT), pages 229-236, Doha, Qatar, January 2023. Preprint arXiv:2210.14742.
- W. Zhou, W. Michel, R. Schlüter, and H. Ney. Efficient Training of Neural Transducer for Speech Recognition. In Interspeech, pages 2058-2062, Incheon, Korea, September 2022. [poster].
- W. Zhou, Z. Zheng, R. Schlüter, and H. Ney. On Language Model Integration for RNN Transducer based Speech Recognition. In IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), pages 8407-8411, Singapore, May 2022. [poster] [slides].
- Y. Qiao, S. Zanwar, R. Bhattacharyya, D. Wiechmann, W. Zhou, E. Kerz, and R. Schlüter. Prediction of Listener Perception of Argumentative Speech in a Crowdsourced Dataset Using (Psycho-)Linguistic and Fluency Features. November 2021.
- W. Zhou, A. Zeyer, A. Merboldt, R. Schlüter, and H. Ney. Equivalence of Segmental and Neural Transducer Modeling: A Proof of Concept. In Interspeech, pages 2891-2895, August 2021. [slides].
- W. Zhou, M. Zeineldeen, Z. Zheng, R. Schlüter, and H. Ney. Acoustic Data-Driven Subword Modeling for End-to-End Speech Recognition. In Interspeech, pages 2886-2890, August 2021. [slides].
- Y. Qiao, W. Zhou, E. Kerz, and R. Schlüter. The Impact of ASR on the Automatic Analysis of Linguistic Complexity and Sophistication in Spontaneous L2 Speech. In Interspeech, pages 4453-4457, August 2021.
- W. Zhou, S. Berger, R. Schlüter, and H. Ney. Phoneme Based Neural Transducer for Large Vocabulary Speech Recognition. In IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), pages 5644-5648, June 2021. [poster].
- M. Zeineldeen, A. Zeyer, W. Zhou, T. Ng, R. Schlüter, and H. Ney. A Systematic Comparison of Grapheme-based vs. Phoneme-based Label Units for Encoder-Decoder-Attention Models. November 2020. Preprint arXiv:2005.09336.
- W. Zhou, R. Schlüter, and H. Ney. Robust Beam Search for Encoder-Decoder Attention Based Speech Recognition without Length Bias. In Interspeech, pages 1768-1772, Shanghai, China, October 2020. [slides].
- W. Zhou, R. Schlüter, and H. Ney. Full-Sum Decoding for Hybrid HMM based Speech Recognition using LSTM Language Model. In IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), pages 7834-7838, Barcelona, Spain, May 2020.
- W. Zhou, W. Michel, K. Irie, M. Kitza, R. Schlüter, and H. Ney. The RWTH ASR System for TED-LIUM Release 2: Improving Hybrid HMM with SpecAugment. In IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), pages 7839-7843, Barcelona, Spain, May 2020.
See the full list of publications of the chair.