Statistical Methods in Natural Language Processing
Automatic methods for natural language processing play an
important role in human-machine interaction applications
and in other tasks in artificial intelligence.
This course deals with statistical methods that have been
found most successful for many tasks in natural language
processing.
Contents
- Text and document classification including information retrieval
- Information extraction including tagging and semantic annotation
- Syntactic analysis and parsing
- Language modeling
- Machine translation of natural language
Lecture Notes (Access only permitted within the RWTH domain)
- Slides

Announcements
- 14.06.10: The following schedule modification takes place:
  - On Tuesday, 15.06.10, there will be an additional lecture in our seminar room (Room 6124), from 10:00 to 11:30.
- 07.06.10: Please note the following update to the lecture schedule:
  - On Wednesday, 09.06.10, there will be no lecture (DIES).
- 29.05.10: Please note the following update to the lecture schedule:
  - Both lectures next week will take place on the regular dates.
- 05.05.10: The following schedule modifications will take place:
  - On Wednesday, 19.05.10, there will be no lecture.
  - The announced additional lecture on Tuesday, 11.05.10, has been cancelled.
- 03.05.10: The following schedule modifications take place:
  - On Wednesday, 05.05.10, there will be no lecture.
  - On Tuesday, 04.05.10, there will be an additional lecture in our seminar room (Room 6124), from 10:00 to 11:30.
- 27.04.10: Starting Monday, May 3, the exercise session will take place in our seminar room (Room 6124).
Exercises
- 0. Exercise Sheet
- 1. Exercise Sheet (Submission: May 3, 2010)
Additional data: alice.txt
Example solution for exercise 4: ex01.src.tgz
- 2. Exercise Sheet (Submission: May 10, 2010)
- 3. Exercise Sheet (Submission: May 17, 2010)
Additional data: 20 Newsgroups Corpus, Spam Corpus
Example implementation of a multinomial classifier
- 4. Exercise Sheet (Submission: May 31, 2010)
- 5. Exercise Sheet (Submission: June 7, 2010)
Additional data: European Parliament Corpus
Example implementation of a bigram language model
- 6. Exercise Sheet (Submission: June 14, 2010)
- 7. Exercise Sheet (Submission: June 21, 2010)
Additional data: Wall Street Journal POS Corpus
- 8. Exercise Sheet (Submission: June 28, 2010)
- 9. Exercise Sheet (Submission: July 5, 2010)
- 10. Exercise Sheet (Submission: July 19, 2010)
Additional data: Translation data