In [9], a class-dependent weighted
dissimilarity measure for nearest neighbor classifiers was
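introduced. The squared distance is defined as
$$ d^2(x, \mu) = \sum_{d} \left( \frac{x_d - \mu_d}{\sigma_{k_\mu d}} \right)^2 $$
where $d$ denotes the dimension index and $k_\mu$ is the class the
reference vector $\mu$ belongs to. The parameters $\{\sigma_{kd}\}$ are
estimated with respect to a discriminative training criterion that
takes into account the out-of-class information and can be derived
from the minimum classification error criterion:
$$ \{\hat{\sigma}_{kd}\} = \operatorname*{argmin}_{\{\sigma_{kd}\}} \sum_{n} \frac{d(x_n, \mu^{=}_{x_n})}{d(x_n, \mu^{\neq}_{x_n})} \tag{4} $$
Here $\mu^{=}_{x_n}$ denotes the closest prototype of the same class as
the training sample $x_n$, and $\mu^{\neq}_{x_n}$ the closest prototype
of any competing class.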
In other words, the parameters are chosen to minimize the average
ratio of the distance to the closest prototype of the same class with
respect to the distance to the closest prototype of the competing
classes.
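The distance and the ratio criterion can be sketched in a few lines of Python. This is a minimal illustration and not the implementation used in [9]: the function names, the list-based data layout, and the mapping `sigma[k]` (one list of per-dimension weights per class) are assumptions made for this example. The sketch also assumes that the prototypes are distinct from the samples, as in the one-prototype case, and that no sample coincides with a prototype of a competing class (which would make the denominator zero).

```python
import math

def weighted_sq_dist(x, mu, sigma_k):
    """Squared distance sum_d ((x_d - mu_d) / sigma_kd)^2, using the
    weights sigma_k of the class the prototype mu belongs to."""
    return sum(((xd - md) / sd) ** 2 for xd, md, sd in zip(x, mu, sigma_k))

def criterion(samples, labels, protos, proto_labels, sigma):
    """Sum over all samples of the ratio
    d(x, nearest same-class prototype) / d(x, nearest other-class prototype)."""
    total = 0.0
    for x, k in zip(samples, labels):
        same = min(math.sqrt(weighted_sq_dist(x, m, sigma[c]))
                   for m, c in zip(protos, proto_labels) if c == k)
        other = min(math.sqrt(weighted_sq_dist(x, m, sigma[c]))
                    for m, c in zip(protos, proto_labels) if c != k)
        total += same / other
    return total

# Hypothetical toy data: one prototype per class, unit weights.
protos, proto_labels = [[0.0, 0.0], [2.0, 0.0]], [0, 1]
sigma = {0: [1.0, 1.0], 1: [1.0, 1.0]}
samples, labels = [[0.5, 0.0], [1.5, 0.0]], [0, 1]
value = criterion(samples, labels, protos, proto_labels, sigma)
```

A ratio below 1 for a sample means it lies closer to its own class than to any competing class, so smaller criterion values correspond to better-separated training data under the current weights.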
To minimize the criterion, a gradient descent approach is used, and a
leave-one-out error estimate with the weighted measure is computed at
each step of the gradient procedure. The algorithm selects the weights
with the best leave-one-out estimate rather than the weights with the
minimum criterion value. In the experiments, only the weights
$\{\sigma_{kd}\}$ were estimated according to the proposed criterion.
The reference vectors $\mu$ were chosen as the class means in the
one-prototype approach; in the multiple-prototype approach, the whole
training set was used.
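The selection step can be illustrated as follows for the multiple-prototype case, in which every training sample also serves as a reference. This is a sketch under stated assumptions: the names `leave_one_out_error` and `select_weights` are hypothetical, the gradient update itself is not shown, and `candidates` stands in for the weight sets visited during the descent.

```python
def weighted_sq_dist(x, mu, sigma_k):
    """Squared distance with the per-dimension weights of the prototype's class."""
    return sum(((xd - md) / sd) ** 2 for xd, md, sd in zip(x, mu, sigma_k))

def leave_one_out_error(samples, labels, sigma):
    """Nearest neighbor error rate where each sample is classified
    using all other samples as prototypes."""
    errors = 0
    for i, x in enumerate(samples):
        best_label, best_dist = None, float("inf")
        for j, (mu, c) in enumerate(zip(samples, labels)):
            if j == i:
                continue  # leave the sample itself out
            dist = weighted_sq_dist(x, mu, sigma[c])
            if dist < best_dist:
                best_dist, best_label = dist, c
        if best_label != labels[i]:
            errors += 1
    return errors / len(samples)

def select_weights(candidates, samples, labels):
    """Return the candidate weight set with the lowest leave-one-out error."""
    return min(candidates, key=lambda s: leave_one_out_error(samples, labels, s))

# Hypothetical toy data: dimension 0 is uninformative, dimension 1 separates the classes.
samples = [[0.0, 0.0], [4.0, 0.2], [0.1, 1.0], [4.1, 1.2]]
labels = [0, 0, 1, 1]
uniform = {0: [1.0, 1.0], 1: [1.0, 1.0]}
tuned = {0: [10.0, 0.1], 1: [10.0, 0.1]}
best = select_weights([uniform, tuned], samples, labels)
```

Selecting by the leave-one-out error rather than by the criterion value ties the choice of weights directly to the estimated classification performance.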
This approach, too, is closely related to Gaussian models. Consider
the use of one prototype per class. The distance measure is then a
class-dependent Mahalanobis distance with class-specific, diagonal
covariance matrices
$$ \Sigma_k = \operatorname{diag}(\sigma_{k1}^2, \ldots, \sigma_{kD}^2) $$
The decision rule is then equivalent to the use of single Gaussian
models in combination with an additional factor to compensate for the
missing normalization factor of the Gaussian. In the case of multiple
prototypes per class, the equivalence extends to mixtures of Gaussian
densities.
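The relation to a single Gaussian per class can be made explicit with a short derivation. The notation below is an assumption chosen for this sketch: $\mu_k$ is the prototype of class $k$, $\sigma_{kd}$ its weight in dimension $d$, $D$ the number of dimensions, and $c_k$ the compensating factor.

```latex
% Gaussian with mean \mu_k and diagonal covariance \Sigma_k = diag(\sigma_{k1}^2, ..., \sigma_{kD}^2)
\mathcal{N}(x \mid \mu_k, \Sigma_k)
  = \frac{1}{(2\pi)^{D/2} \prod_{d} \sigma_{kd}}
    \exp\!\left( -\frac{1}{2} \sum_{d} \left( \frac{x_d - \mu_{kd}}{\sigma_{kd}} \right)^{2} \right)

% Multiplying by c_k cancels the normalization factor
c_k = (2\pi)^{D/2} \prod_{d} \sigma_{kd}
\quad\Longrightarrow\quad
c_k \, \mathcal{N}(x \mid \mu_k, \Sigma_k) = \exp\!\left( -\tfrac{1}{2}\, d^2(x, \mu_k) \right)

% Hence both decision rules pick the same class
\operatorname*{argmax}_{k} \; c_k \, \mathcal{N}(x \mid \mu_k, \Sigma_k)
  = \operatorname*{argmin}_{k} \; d^2(x, \mu_k)
```

The factor $c_k$ depends on the class only through $\prod_d \sigma_{kd}$, so it vanishes from the decision when all classes share the same weights.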
Daniel Keysers
2004-03-10