Statistical Methods in Natural Language Processing
Automatic methods for natural language processing play an
important role in human-machine interaction applications
and in other artificial intelligence tasks.
This course deals with the statistical methods that have
proven most successful for many natural language
processing tasks.
Contents
- Text and document classification including information retrieval
- Information extraction including tagging and semantic annotation
- Syntactic analysis and parsing
- Language modeling
- Machine translation of natural language
Lecture Notes (Access only permitted within the RWTH domain)
- Slides

Calendar
Monday | Comments | Wednesday | Comments |
20.04 | Regular lecture | 22.04 | Regular lecture |
27.04 | Regular lecture | 29.04 | Regular lecture |
04.05 | Cancelled | 06.05 | Regular lecture |
11.05 | Regular lecture | 13.05 | Cancelled |
18.05 | Double lecture | 20.05 | Regular lecture |
25.05 | Double lecture | 27.05 | Regular lecture |
01.06 | Holiday | 03.06 | Holiday |
08.06 | Regular lecture | 10.06 | Cancelled (Dies) |
15.06 | Double lecture | 17.06 | Double lecture |
22.06 | Regular lecture | 24.06 | Regular lecture |
29.06 | Double lecture | 01.07 | Cancelled |
06.07 | Cancelled | 08.07 | Cancelled |
13.07 | Double lecture | 15.07 | Regular lecture |
20.07 | Cancelled | 22.07 | Regular lecture |
Announcements
- 15.06.09: On Wednesday, 17.06, we will have a double lecture. On Monday, 22.06, there will be a regular lecture (the calendar has been updated).
- 08.06.09: The schedule has been modified again. See the calendar above for an overview.
- 27.05.09: Because of the excursion week (1-7 June), exercise sheet 5 is due on 8 June.
- 12.05.09: The following schedule modifications will take place:
  - On Wednesday, 13.05.09, there will be no lecture.
  - On Monday, 18.05.09, there will be an additional lecture in our seminar room (Room 6124), from 11:30 to 12:30.
  - On Monday, 25.05.09, there will be an additional lecture in our seminar room (Room 6124), from 11:30 to 12:30.
  There will be no changes to the exercises on these dates.
- 29.04.09: Because Prof. Ney is travelling on official business next week, the schedule will change as follows:
  - On Monday, 04.05.09, there will be no lecture.
  - On Wednesday, 06.05.09, the lecture will be held by David Vilar.
  Note: The exercise sessions are not affected.
- 27.04.09: We were not able to discuss exercise sheet 0 in the exercise
hour. We will discuss it next week, but you can hand it in for some
extra points.
Exercises
- 0. Exercise Sheet
- 1. Exercise Sheet (Submission: 04 May 2009)
Additional data: alice.txt
Example solution for exercise 4: ex01.src.tgz
- 2. Exercise Sheet (Submission: 11 May 2009)
- 3. Exercise Sheet (Submission: 18 May 2009)
Additional data: 20 Newsgroups Corpus, Spam Corpus
An example implementation of a multinomial classifier can be found here.
- 4. Exercise Sheet (Submission: 25 May 2009)
- 5. Exercise Sheet (Submission: 8 June 2009)
Additional data: European Parliament Corpus
Example implementation of a bigram language model
- 6. Exercise Sheet (Submission: 15 June 2009)
- 7. Exercise Sheet (Submission: 22 June 2009)
Additional data: Wall Street Journal POS Corpus
Example implementation of a bigram-based POS tagger.
- 8. Exercise Sheet (Submission: 29 June 2009)
- 9. Exercise Sheet (Submission: 6 July 2009)
- 10. Exercise Sheet (Submission: 13 July 2009)
- 11. Exercise Sheet (Submission: 20 July 2009)
Additional data: Translation data
Last Modified
Wed Jul 15 19:33:38 CEST 2009