David Thulke
Hi, I am a PhD student in the Human Language Technology Group at the Chair of Computer Science 6 of RWTH Aachen University, where I have been supervised by Prof. Dr.-Ing. Hermann Ney since January 2020. Additionally, I work as a language processing scientist at AppTek.
My personal research interests include:
- Retrieval Augmented Generation
- Pretraining of (Large) Language Models
- Named Entity Recognition
You can find me in room 6123 of our department, call me at +49 241 80 21625, or write an e-mail to <surname>@hltpr.rwth-aachen.de.
Publications
- David Thulke, Yingbo Gao, Petrus Pelser, Rein Brune, Rricha Jalota, Floris Fok, Michael Ramos, Ian van Wyk, Abdallah Nasir, Hayden Goldstein, Taylor Tragemann, Katie Nguyen, Ariana Fowler, Andrew Stanco, Jon Gabriel, Jordan Taylor, Dean Moro, Evgenii Tsymbalov, Juliette de Waal, Evgeny Matusov, Mudar Yaghi, Mohammad Shihadah, Hermann Ney, Christian Dugast, Jonathan Dotan, and Daniel Erasmus. ClimateGPT: Towards AI Synthesizing Interdisciplinary Research on Climate Change. Preprint arXiv:2401.09646, January 2024.
- V. A. K. Tran, D. Thulke, Y. Gao, C. Herold, and H. Ney. Does Joint Training Really Help Cascaded Speech Translation? In Conference on Empirical Methods in Natural Language Processing (EMNLP), Abu Dhabi, December 2022.
- B. Liao, D. Thulke, S. Hewavitharana, H. Ney, and C. Monz. Mask More and Mask Later: Efficient Pre-training of Masked Language Models by Disentangling the [MASK] Token. In Conference on Empirical Methods in Natural Language Processing (EMNLP), Abu Dhabi, December 2022.
- N. Daheim, D. Thulke, C. Dugast, and H. Ney. Controllable Factuality in Document-Grounded Dialog Systems Using a Noisy Channel Model. In Conference on Empirical Methods in Natural Language Processing (EMNLP), Abu Dhabi, December 2022.
- D. Thulke, N. Daheim, C. Dugast, and H. Ney. Adapting Document-Grounded Dialog Systems to Spoken Conversations using Data Augmentation and a Noisy Channel Model. In AAAI-22 10th Dialog System Technology Challenge (DSTC-10) Workshop, pages 9, Online, February 2022.
- Y. Gao, D. Thulke, A. Gerstenberger, K. V. Tran, R. Schlüter, and H. Ney. On Sampling-Based Training Criteria for Neural Language Modeling. In Interspeech, August 2021.
- N. Daheim, D. Thulke, C. Dugast, and H. Ney. Cascaded Span Extraction and Response Generation for Document-Grounded Dialog. In ACL-IJCNLP 2021 Workshop on Document-grounded Dialogue and Conversational QA, Online, August 2021.
- E. Tokarchuk, D. Thulke, W. Wang, C. Dugast, and H. Ney. Investigation on Data Adaptation Techniques for Neural Named Entity Recognition. In ACL-IJCNLP 2021 Student Research Workshop, Online, August 2021.
- D. Thulke, N. Daheim, C. Dugast, and H. Ney. Efficient Retrieval Augmented Generation from Unstructured Knowledge for Task-Oriented Dialog. In AAAI-21 9th Dialog System Technology Challenge (DSTC-9) Workshop, February 2021.
Full list of publications of the chair.
Invited Talks and Panels
- Keynote Speech - ClimateGPT: Towards Domain-Specific Large Language Models for Climate Change. At ClimateNLP: Natural Language Processing meets Climate Change, ACL 2024 Workshop, Bangkok, August 2024.
- Panel - Matchmaking for Climate Policy, Information and Finance AI Solutions. At Bonn AI and Climate Expert Meeting, Bonn, July 2024.
- Invited Talk - ClimateGPT and NLP for Climate. At Climate Analytics Acceleration Hub: Igniting Action & Finance with Innovation, Understanding Risk Global Forum, Himeji, June 2024.
- Panel - NLP for climate solutions: retrieval augmented generation and other strategies for managing climate information. At Accelerating Climate Change Action through Machine Learning, Applied Machine Learning Days (AMLD), Lausanne, March 2024.