Natural Language Understanding

Klaus Macherey
An important part of a spoken dialogue system is the natural language understanding component. Its objective is to extract from an utterance all the information that is relevant for a specific application.

An interesting approach to natural language understanding can be derived from the field of statistical machine translation. Instead of defining several rules for parsing an input sentence, one can introduce different concepts describing the relevant meaning units for a given application. A concept is defined as the smallest unit of meaning that is relevant to a specific task. For each concept there is a list of associated attributes describing the values of that concept. In this context, the input sentence given in a natural language forms the source language, and the sequence of concepts forms the target language.
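
As a minimal sketch of this representation, the following Python fragment models a concept with its attributes and pairs a source sentence with a target concept sequence. The concept names (`@want`, `@origin`, `@destination`), the attribute key `city`, the example sentence, and the helper `words_per_concept` are hypothetical illustrations, not taken from the TABA annotation scheme.

```python
from dataclasses import dataclass, field


@dataclass
class Concept:
    """Smallest unit of meaning relevant to the task, with its attribute values."""
    name: str
    attributes: dict = field(default_factory=dict)


def words_per_concept(source, target, alignment):
    """Group source words under the concept each word is aligned to.

    alignment[i] is the index in `target` of the concept for source word i.
    """
    groups = [(concept.name, []) for concept in target]
    for word, concept_index in zip(source, alignment):
        groups[concept_index][1].append(word)
    return groups


# Source language: the words of the input sentence (invented example).
source = "ich will von Aachen nach Berlin fahren".split()

# Target language: a sequence of concepts (hypothetical concept inventory).
target = [
    Concept("@want"),
    Concept("@origin", {"city": "Aachen"}),
    Concept("@destination", {"city": "Berlin"}),
]

# One possible word-to-concept alignment, written by hand for illustration.
alignment = [0, 0, 1, 1, 2, 2, 0]

grouped = words_per_concept(source, target, alignment)
for name, words in grouped:
    print(name, words)
```

Here every source word is assigned to exactly one concept; in the statistical approach such an assignment is not written by hand but learned from annotated data.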

The statistical approach automatically learns the alignments between words and concepts from an annotated corpus during training. In the testing phase, the unknown sequence of concepts is determined for a given input sentence. One advantage of this approach is that it is easier to annotate a sentence with a sequence of concepts than to manually design the corresponding grammar rules.

Tests were conducted on the TABA corpus, a corpus from the domain of a German train timetable information system. In order to take word contexts and local reorderings of the source sentences into account, we used the so-called alignment templates approach, which has proven to be very effective in statistical machine translation. This approach yielded a concept error rate of 4.2% on the TABA corpus test set.
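
The text does not spell out how the concept error rate is computed. A minimal sketch, assuming it is defined analogously to the word error rate (substitutions, insertions, and deletions between the hypothesized and the reference concept sequence, divided by the number of reference concepts), could look as follows. The function names and the example concept sequences are hypothetical.

```python
def edit_distance(reference, hypothesis):
    """Levenshtein distance between two sequences of concept names."""
    previous = list(range(len(hypothesis) + 1))
    for i, ref_concept in enumerate(reference, start=1):
        current = [i]
        for j, hyp_concept in enumerate(hypothesis, start=1):
            current.append(min(
                previous[j] + 1,                                # deletion
                current[j - 1] + 1,                             # insertion
                previous[j - 1] + (ref_concept != hyp_concept)  # match or substitution
            ))
        previous = current
    return previous[-1]


def concept_error_rate(references, hypotheses):
    """Total edit errors divided by total number of reference concepts.

    Assumed definition, by analogy with the word error rate.
    """
    errors = sum(edit_distance(r, h) for r, h in zip(references, hypotheses))
    total = sum(len(r) for r in references)
    return errors / total


# Invented example: one concept is missing in the first hypothesis.
references = [["@origin", "@destination", "@date"], ["@yes"]]
hypotheses = [["@origin", "@date"], ["@yes"]]

cer = concept_error_rate(references, hypotheses)
print(f"{cer:.2%}")
```

Under this assumed definition, one deletion against four reference concepts gives an error rate of 25% for the invented example.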

Last modified November 5, 2001