DMC 2004 results RWTH Aachen

A group of three students submitted their results to the Data Mining Cup 2004.
(See also the article in the Computerzeitung (in German): Informatikernachwuchs forscht nahe an Trends.)
Their submissions took places 1, 3, and 5 out of 97 student submissions (note how small the difference between second and third place is):

The complete list is available on the contest website as a PDF document.

The winning method was the following (contact me for more information):
  1. data preprocessing with binning, using slightly different methods for the different types of features (resulting in 123 new features)
  2. estimation of "naive Bayes" posterior probabilities for each feature using relative frequencies (conditional probability of class c given the value of feature i)
  3. estimation of a log-linear model for the three classes using the maximum entropy approach (similar to multi-class logistic regression) with the feature posterior probabilities as new feature functions
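
The three steps above can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the code used for the submission: the quantile binning, the add-one smoothing (alpha), the plain gradient-descent training, and all function names are assumptions chosen for brevity, and the synthetic data merely stands in for the contest data. In particular, the sketch does not reproduce the 123 features mentioned in step 1.

```python
import numpy as np

def fit_bins(X, n_bins=6):
    # Step 1 (assumed variant): interior quantile edges per feature column.
    qs = np.linspace(0, 1, n_bins + 1)[1:-1]
    return [np.quantile(X[:, j], qs) for j in range(X.shape[1])]

def apply_bins(X, edges):
    # Map every value to a bin index in 0 .. n_bins-1.
    cols = [np.digitize(X[:, j], e) for j, e in enumerate(edges)]
    return np.stack(cols, axis=1)

def fit_posteriors(B, y, n_bins, n_classes, alpha=1.0):
    # Step 2: post[i, b, c] is the relative frequency of class c among
    # training samples whose feature i falls into bin b.
    # alpha is an assumed smoothing constant that avoids empty bins.
    n_feat = B.shape[1]
    counts = np.full((n_feat, n_bins, n_classes), alpha)
    for i in range(n_feat):
        np.add.at(counts[i], (B[:, i], y), 1.0)
    return counts / counts.sum(axis=2, keepdims=True)

def posterior_features(B, post):
    # Concatenate p(c | bin) over all features:
    # result has shape (n_samples, n_features * n_classes).
    n_feat = B.shape[1]
    return np.concatenate([post[i, B[:, i], :] for i in range(n_feat)], axis=1)

def softmax(Z):
    Z = Z - Z.max(axis=1, keepdims=True)  # numerical stability
    E = np.exp(Z)
    return E / E.sum(axis=1, keepdims=True)

def fit_loglinear(F, y, n_classes, lr=0.5, n_iter=500):
    # Step 3: log-linear (softmax) model on the posterior features,
    # trained by gradient descent on the mean negative log-likelihood.
    n, d = F.shape
    W = np.zeros((d, n_classes))
    b = np.zeros(n_classes)
    Y = np.eye(n_classes)[y]  # one-hot targets
    for _ in range(n_iter):
        G = (softmax(F @ W + b) - Y) / n
        W -= lr * (F.T @ G)
        b -= lr * G.sum(axis=0)
    return W, b

def predict_proba(F, W, b):
    return softmax(F @ W + b)

# Synthetic three-class data, standing in for the (unavailable) contest data.
rng = np.random.default_rng(0)
y = np.repeat(np.arange(3), 100)
X = rng.normal(size=(300, 4)) + 3.0 * y[:, None]

edges = fit_bins(X, n_bins=6)
B = apply_bins(X, edges)
post = fit_posteriors(B, y, n_bins=6, n_classes=3)
F = posterior_features(B, post)
W, b = fit_loglinear(F, y, n_classes=3)

# Accuracy on the training data itself, so it is optimistic.
print("training accuracy:", (predict_proba(F, W, b).argmax(axis=1) == y).mean())
```

In this sketch the per-feature posteriors of step 2 are simply the inputs of the model in step 3, so the log-linear model learns how much weight each feature's class estimate should receive instead of multiplying them with equal weight as naive Bayes would.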

We also submitted some other results as "non-students"; these did not enter the competition and took places 1, 2, and 4 out of 15 submissions:

Daniel Keysers
Last modified: Thu Apr 21 10:23:23 CEST 2005