Next: Conclusions Up: Classification of Medical Images Previous: Methods

Experimental results

Figure 1: One image from each of the six IRMA-1617 classes: `abdomen', `skull', `chest', `limbs', `breast', and `spine'.

Figure 2: Several images from class `chest' from the IRMA-1617 database.

The experiments were carried out on the IRMA database of RWTH Aachen University (IRMA: image retrieval in medical applications [1]), which comprises 1617 secondary digital medical radiographs from the six classes `abdomen', `skull', `chest', `limbs', `breast', and `spine'. The images were labeled by expert radiologists. One example from each class is shown in Figure 1. The task is difficult because of the large intra-class variability, as illustrated for the class `chest' in Figure 2. The images differ widely in size and were scaled to a common height of 32 pixels, preserving the aspect ratio. For each image, the gray values were normalized to span the full gray level range. Error rates were obtained with a leave-one-out approach, i.e., each image is classified in turn using the remaining 1616 images as training data. This ensures that the classifier has `never seen' the tested image and therefore yields a valid test error rate. Table 2 lists known error rates of other methods on the IRMA-1617 database together with the results of the present experiments.
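The evaluation protocol described above can be sketched as follows. This is a minimal illustration, not the code used for the experiments: it assumes that every image has already been converted to a feature vector of fixed length, and it uses a plain Euclidean 1-NN classifier as a stand-in for the distance measures compared in this section. The function names `normalize_gray` and `leave_one_out_error` are hypothetical.

```python
import numpy as np


def normalize_gray(image):
    """Linearly stretch the gray values of one image to span [0, 255]."""
    image = np.asarray(image, dtype=np.float64)
    lo, hi = image.min(), image.max()
    if hi == lo:
        # Constant image: no range to stretch (assumed convention: all zeros).
        return np.zeros_like(image)
    return (image - lo) / (hi - lo) * 255.0


def leave_one_out_error(features, labels):
    """Leave-one-out error rate of a Euclidean 1-NN classifier.

    features: array of shape (n, d), one fixed-length vector per image
              (an assumption of this sketch).
    labels:   array of shape (n,), the class label of each image.
    """
    features = np.asarray(features, dtype=np.float64)
    labels = np.asarray(labels)
    # Pairwise squared Euclidean distances between all images.
    sq = (features ** 2).sum(axis=1)
    dist = sq[:, None] + sq[None, :] - 2.0 * features @ features.T
    # Exclude each image from its own training set.
    np.fill_diagonal(dist, np.inf)
    nearest = dist.argmin(axis=1)
    return float(np.mean(labels[nearest] != labels))
```

Setting the diagonal of the distance matrix to infinity is what implements the "leave one out" step: an image can never be its own nearest neighbor, so each decision is based only on the remaining images.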


Table 2: Error rates (ER) for different methods on the IRMA-1617 corpus. NN: nearest neighbor; IDM: image distortion model; P2DHMM: pseudo 2-dimensional hidden Markov model; P2DHMDM: pseudo 2-dimensional hidden Markov distortion model.
reference   method                                   ER [%]
---------   --------------------------------------   ------
[2]         cooccurrence matrices                      29.0
[2]         Euclidean 1-NN                             15.8
[6]         local representations, thresholding         9.7
[2]         kernel densities, thresholding, IDM         9.0
[2]         + tangent distance                          8.0
this work   1-NN, gradients, thresholding
              - local image parts, IDM                  6.6
              - P2DHMM                                  5.7
              - P2DHMDM                                 5.3
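
The relative error reductions implied by Table 2 can be checked with a few lines of arithmetic. The sketch below copies the error rates from the table; the dictionary keys and the helper `relative_reduction` are names chosen for illustration only.

```python
# Error rates in percent, copied from Table 2.
error_rates = {
    "kernel densities, thresholding, IDM": 9.0,
    "+ tangent distance": 8.0,
    "local image parts, IDM": 6.6,
    "P2DHMM": 5.7,
    "P2DHMDM": 5.3,
}


def relative_reduction(baseline, improved):
    """Fraction of the baseline error rate removed by the improved method."""
    return (baseline - improved) / baseline


# Compare each method of this work with the previous best result.
previous_best = error_rates["+ tangent distance"]
for method in ("local image parts, IDM", "P2DHMM", "P2DHMDM"):
    rel = relative_reduction(previous_best, error_rates[method])
    print(f"{method}: {100 * rel:.1f}% relative reduction")
```

For the P2DHMDM the reduction relative to the previous best result of 8.0% is roughly 34%, i.e. about one third.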

Including local context in the form of image gradients (Sobel operator) and local image parts ($3 \times 3$ sub images) reduced the error rate obtained with the image distortion model and thresholding from 9.0% to 6.6%. Note that the feature vector associated with each pixel now has dimensionality $18 = 2 \cdot 3 \cdot 3$ instead of consisting of the single gray value of the pixel. Modeling local dependencies with the pseudo two-dimensional hidden Markov model, together with the local context information of the image gradient, reduced the error rate to 5.7%. Finally, allowing for additional deviations, which results in the P2DHMDM, reduced the error rate further to 5.3%. This is a relative improvement of about one third with respect to the previous best result of 8.0%, which included tangent distance.
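
To make the dimensionality $18 = 2 \cdot 3 \cdot 3$ concrete, the following sketch builds such a per-pixel feature vector. It is an illustration under stated assumptions, not the original implementation: the factor 2 is taken to be the horizontal and vertical Sobel responses, border pixels are handled by replicating the image edge, and the names `filter3x3` and `local_gradient_features` are hypothetical.

```python
import numpy as np

# Sobel kernels for the horizontal and vertical gradient.
SOBEL_X = np.array([[-1, 0, 1], [-2, 0, 2], [-1, 0, 1]], dtype=np.float64)
SOBEL_Y = SOBEL_X.T


def filter3x3(image, kernel):
    """Correlate an image with a 3x3 kernel, replicating border pixels."""
    h, w = image.shape
    padded = np.pad(image, 1, mode="edge")
    out = np.zeros((h, w))
    for dy in range(3):
        for dx in range(3):
            out += kernel[dy, dx] * padded[dy:dy + h, dx:dx + w]
    return out


def local_gradient_features(image):
    """Return an (h, w, 18) array: for every pixel, the 3x3 neighborhood
    of the horizontal gradient followed by that of the vertical gradient."""
    image = np.asarray(image, dtype=np.float64)
    h, w = image.shape
    parts = []
    for kernel in (SOBEL_X, SOBEL_Y):
        gradient = np.pad(filter3x3(image, kernel), 1, mode="edge")
        # Collect the nine shifted copies that form the 3x3 sub image.
        for dy in range(3):
            for dx in range(3):
                parts.append(gradient[dy:dy + h, dx:dx + w])
    return np.stack(parts, axis=-1)
```

For an image of height 32 and width `w`, `local_gradient_features` returns an array of shape `(32, w, 18)`, so a distance between two pixels compares 18 values instead of one gray value.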


Daniel Keysers 2004-03-10