Seminar "Selected Topics in Human Language Technology and Pattern Recognition"
In the summer term 2013 the Lehrstuhl für Informatik 6 will host a seminar entitled "Selected Topics in Human Language Technology and Pattern Recognition".
Registration for the seminar
Registration for the seminar is only possible online via the registration page provided by the Computer Science Department.
Prerequisites for participation in the seminar
- Bachelor students: Einführung in das wissenschaftliche Arbeiten (Proseminar)
- Master students: Bachelor degree
- Diploma students: Vordiplom
- Attendance of at least one of the lectures Pattern Recognition and Neural
Networks, Introduction to Statistical Classification, Automatic Speech Recognition, or Statistical Methods in Natural Language
Processing, or evidence of equivalent knowledge.
- For successful participants of the above lectures, the possibility of a seminar
talk is guaranteed.
Seminar format and important dates
The seminar generally takes place in block mode around the end of the
lecture period. Specific dates will be arranged; they will most
likely fall around the end of July or the beginning of August 2013.
- Proposals: initial proposals (the report's table of contents)
will be accepted by the individual supervisor until the start of the
term (April 1, 2013). Email submission is sufficient. At this
time participants must arrange an appointment with the individual
supervisor. Revised proposals will be accepted until two weeks after
the start of the term (April 14, 2013).
- Article: must be submitted to the individual supervisor in
electronic form (PDF) by June 7, 2013 and at least 1 month prior
to the trial presentation date.
- Presentation slides: must be submitted at least 1 week prior to the
trial presentation date to the individual supervisor in electronic
form (PDF).
- Trial presentations: at least 2 weeks prior to the
actual presentation date. Please contact your individual
supervisor to schedule your trial presentation.
- Seminar presentations: the exact dates and
schedule for the presentation block (expected to be around end of
July/beginning of August 2013) will be arranged and announced for
the individual topics.
- Final (possibly corrected) articles and presentation
slides: must be submitted to the individual supervisor in electronic
form (PDF) no later than 4 weeks after the presentation date.
- Compulsory attendance: in order to receive a
certificate participants must attend all presentation
sessions.
- Ethical Guidelines: The Computer Science Department
of RWTH Aachen University has adopted ethical
guidelines for the authoring of academic work such as seminar
reports. Each student has to comply with these guidelines. As a
seminar participant, you therefore have to sign a declaration of
compliance, in which you assert that your work complies with
the guidelines, that all references used are properly cited, and
that you wrote the report independently. We ask you to
download the guidelines and submit the declaration, together
with your seminar report and talk, to your individual supervisor.
A German version of the guidelines and a German version of the
declaration are also available and may be used instead.
Note: failure to meet deadlines, absence without permission
from compulsory sessions (presentations and the preliminary meeting, as
announced by email to each participating student), or dropping out of
the seminar more than 3 weeks after the kick-off meeting
(i.e. after March 11, 2013) results in the grade 5.0/not appeared.
Topics, relevant references and participants
Specific topics will be introduced at a preparatory meeting
in the seminar room at the Lehrstuhl für Informatik 6.
Selected topics from the following general areas of Human
Language Technology and Pattern Recognition will be offered:
- Automatic Speech Recognition;
- Machine Translation;
- Pattern Recognition.
Some possible topics, individual supervisors, and basic references:
- Open-vocabulary Handwriting (and Speech) Recognition (Matysiak; Supervisor: Michal Kozielski)
References:
-
Vertanen, K.: Combining open vocabulary recognition and word
confusion networks. Acoustics, Speech and Signal Processing, 2008.
(ICASSP 2008), pp.4325-4328.
-
Issam Bazzi, Richard M. Schwartz, John Makhoul: An Omnifont
Open-Vocabulary OCR System for English and Arabic. IEEE Trans. Pattern
Anal. Mach. Intell. 21(6): 495-504 (1999)
-
M. A. Basha Shaik, A. El-Desoky Mousa, R. Schlüter, and H. Ney: Hybrid
Language Models Using Mixed Types of Sub-lexical Units for Open
Vocabulary German LVCSR. In Interspeech, pages 1441-1444, Florence,
Italy, August 2011.
- Feature Extraction for Off-line Handwriting Recognition (Ketabdar; Supervisor: Michal Kozielski)
References:
-
John T. Favata, Geetha Srikantan: A multiple feature/resolution
approach to handprinted digit and character recognition. International
Journal of Imaging Systems and Technology. Volume 7, Issue 4, pages
304-311, Winter 1996
-
Marti, U.-V. and Bunke, H.: Using a statistical language model to
improve the performance of an HMM-based cursive handwriting recognition
system. In book Hidden Markov models, pages 65-90, 2002
-
José A. Rodriguez, Florent Perronnin: Local gradient histogram
features for word spotting in unconstrained handwritten documents; ICFHR
2008 (International Conference on Frontiers in Handwriting Recognition),
Montréal, Canada, 19-21 August, 2008
- The Vanishing Gradient Problem in Recurrent Neural Networks (Hauptmann; Supervisor: Patrick Dötsch)
References:
-
Bengio, Y.; Simard, P.; Frasconi, P.: Learning long-term
dependencies with gradient descent is difficult. IEEE Transactions on
Neural Networks, vol. 5, no. 2, pp. 157-166, Mar 1994
-
Sepp Hochreiter, Yoshua Bengio, Paolo Frasconi, Jürgen Schmidhuber:
Gradient Flow in Recurrent Nets: the Difficulty of Learning Long-Term
Dependencies. In book A field guide to dynamical recurrent neural
networks, pages 237-243, 2001.
-
Sepp Hochreiter, Jürgen Schmidhuber: Long Short-Term Memory. In
journal Neural Computation, volume 9, number 8, 1997, pages 1735-1780.
- Connectionist Temporal Classification (Voigtländer; Supervisor: Patrick Dötsch)
References:
-
Alex Graves, Santiago Fernández, Faustino Gomez, Jürgen Schmidhuber:
Connectionist temporal classification: labelling unsegmented sequence
data with recurrent neural networks; In ICML '06, pages 369-376, 2006.
-
Graves, A.; Liwicki, M.; Fernandez, S.; Bertolami, R.; Bunke, H.;
Schmidhuber, J.: A Novel Connectionist System for Unconstrained
Handwriting Recognition. Pattern Analysis and Machine Intelligence,
vol. 31, no. 5, pages 855-868, May 2009.
-
Alex Graves:
Supervised Sequence Labelling with Recurrent Neural Networks; Studies in
Computational Intelligence, volume 385, pages 1-131, 2012.
- Self-Taught Learning for Handwriting Recognition (Kim; Supervisor: Mehdi Hamdani)
References:
-
Rajat Raina, Alexis Battle, Honglak Lee, Benjamin Packer, Andrew Y.
Ng: Self-taught learning: transfer learning from unlabeled data; ICML
2007, pages 759-766.
-
Bastien et al.: Deep Self-Taught Learning for Handwritten Character
Recognition, in CoRR, 2010.
-
Bengio: Deep Learning of Representations for Unsupervised and Transfer
Learning; JMLR W&CP 27:17-36, 2012.
- Signer Adaptive Techniques for Sign Language Recognition (Habalov; Supervisor: Yannick Gweth)
References:
-
Gweth, Yannick and Plahl, Christian and Ney, Hermann: Enhanced
Continuous Sign Language Recognition using PCA and Neural Network
Features; CVPR 2012 Workshop on Gesture Recognition, 2012
-
von Agris, U. and Knorr, M. and Kraiss, K.-F.: The Significance of
Facial Features for Automatic Sign Language Recognition; in FG 2008
-
Shen, W.; Reynolds, D.: Improved GMM-based language recognition
using constrained MLLR transforms. IEEE International Conference on
Acoustics, Speech and Signal Processing (ICASSP 2008), pp. 4149-4152,
March 31-April 4, 2008
- Tracking with Body Pose Estimation in Sign Language (NN; Supervisor: Christian Oberdörfer)
References:
-
Buehler et al.: Upper Body Detection and Tracking in Extended Signing
Sequences, in IJCV 2011.
-
Pfister et al.: Automatic and Efficient Long Term Arm and Hand Tracking
for Continuous Sign Language TV Broadcast, in BMVC 2012.
-
Yang et al.: Articulated Pose Estimation with flexible
Mixtures-of-Parts, in CVPR 2011.
- Spotting Signs in Context (NN; Supervisor: Jens Forster)
References:
-
Yang, Sclaroff, Lee: Sign Language Spotting with a Threshold Model
Based on Conditional Random Fields; in IEEE Transactions on Pattern
Analysis and Machine Intelligence, vol. 31, no 7, 2009, pages 1264-1277
-
Cooper and Bowden: Learning Signs from Subtitles: A Weakly Supervised
Approach to Sign Language Recognition; in CVPR 2009, pages 2568-2574
-
Buehler, Everingham, Zisserman: Employing signed TV broadcasts for
automated learning of British Sign Language; Workshop on the
Representation and Processing of Sign Languages 2010, LREC 2010 Malta
- Handling Epenthesis in Sign Language Recognition (Thill; Supervisor: Jens Forster)
References:
-
Yang, Sarkar, Loeding: Handling Movement Epenthesis and Sign
Ambiguities in Continuous Sign Language Recognition using Nested Dynamic
Programming; IEEE Transactions on Pattern Analysis and Machine
Intelligence, vol. 32, no. 3, 2010, pages 462-477
-
Fang, Gao, Zhao: Large-Vocabulary Continuous Sign Language Recognition
Based on Transition-Movement Models; IEEE Transactions on Systems, Man,
and Cybernetics - Part A: Systems and Humans, vol. 37, no. 1, 2007
-
Kelly, McDonald, Markham: Recognizing Spatiotemporal Gestures and
Movement Epenthesis in Sign Language; in International Machine Vision
and Image Processing Conference 2009, pages 145-150
- Keypoint-based Object Matching (Takahashi; Supervisor: Harald Hanselmann)
References:
-
Li, H. and Huang, J. and Zhang, S. and Huang, X.: Optimal object
matching via convexification and composition, ICCV 2011, pages 33-40
-
Jiang, H. and Tian, T.P. and Sclaroff, S.: Scale and rotation
invariant matching using linearly augmented trees; CVPR 2011, pages 2473-2480
-
Lowe, D.G.: Distinctive image features from scale-invariant keypoints;
International journal of computer vision, vol 60, no 2, pages 91-110, 2004
- Multimodal Sign Language Recognition (NN; Supervisor: Yannick Gweth)
References:
-
von Agris, U. and Knorr, M. and Kraiss, K.-F.: The Significance of
Facial Features for Automatic Sign Language Recognition; in FG 2008
-
Kelly, Daniel: A Framework for Continuous Multimodal Sign Language Recognition. ICMI-MLMI 2009
-
Dan, Luan et al.: Human Gesture Analysis using Multimodal Features.
IEEE International Conference on Multimedia and Expo Workshops
Guidelines for the article and presentation
The roughly 20-page article and the slides for the presentation
(between 20 and 30) have to be prepared in LaTeX using the provided
templates and submitted electronically in PDF format. Other formats
will not be accepted. Presentations consist of 45 minutes of
presentation time and 15 minutes of discussion. Document templates for
both the article and the presentation slides are provided below, along
with links to LaTeX documentation available online.
- Online LaTeX-Documentation:
- Guidelines for articles and presentation slides:
General:
- The aim of the seminar for the participants is to learn the
following:
- to tackle a topic and to expand knowledge
- to critically analyze the literature
- to hold a presentation
- Take note of references to other topics in the seminar and discuss
topics with one another!
- Take care to stay within your
own topic. To this end participants should be aware of the other
topics in the seminar. If applicable, cross-reference
other articles and presentations.
Specific:
- Important: As part of the introduction, a slide should
outline the most important literature used for the presentation. In
addition, the presentation should clearly indicate which literature the particular
elements of the presentation refer to.
- Participants are expected to seek out additional literature on their
topic. Assistance with the literature search is available at the
faculty's library. Literature can also be accessed at
the Lehrstuhl für Informatik 6 library.
- Notation/Mathematical
Formulas: consistent, correct notation
is essential. When necessary, differing notation from various
literature sources is to be modified or standardized in order to be
clear and consistent. The
lectures held by the Lehrstuhl für Informatik 6 should provide a
guide as to what appropriate notation should look like.
- Tables must have titles (appearing above the table).
- Figures must have captions (appearing below the figure).
- In the case that no adequate translation of an
English technical term is available, the term should be used unchanged.
- Articles and presentation slides can also be prepared in
English.
- Completeness: acknowledge all literature and sources.
- Referencing must conform to the standard
described in the article template.
- Examples should be used to illustrate points.
- Examples should be as complex as necessary but as simple
as possible.
- Slides should be used
as presentation aids and not to replace the role of the presenter;
specifically, slides should:
- illustrate important points and relationships;
- remind the audience (and the presenter) of important aspects
and considerations;
- give the audience an overview
of the presentation.
- Slides should not contain chunks of text or complicated
sentences; rather they should consist of succinct words and terms.
- Use illustrations where appropriate: a picture is worth a thousand words!
- Abbreviations should be defined at the first usage in the manner
demonstrated in the following example: "[...] at the
Rheinisch-Westfälischen Technischen Hochschule (RWTH) there are
[...]".
- Usage of fonts, typefaces and colors in presentation slides must
be consistent and appropriate. Such means should serve to clarify
points or relationships, not be applied needlessly or at random.
- Care should be taken when selecting fonts for presentation
slides (also within diagrams) to ensure legibility on a projector even
for those seated far from the screen.
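The rules above on abbreviations, table titles, and figure captions can be sketched in LaTeX as follows. This is only a minimal illustration, not the seminar's template: it assumes the standard article class, and the abbreviation, table contents, and caption texts are placeholders invented for the example.

```latex
\documentclass{article}

\begin{document}

% Abbreviation defined at first use; later text can use the short form.
Hidden Markov models (HMMs) are widely used in recognition systems.
An HMM describes a sequence of observations.

\begin{table}[ht]
  \centering
  % \caption placed before the tabular puts the title above the table.
  \caption{Placeholder title for a results table.}
  \begin{tabular}{lr}
    \hline
    System                 & Error rate [\%] \\
    \hline
    Baseline (placeholder) & --              \\
    \hline
  \end{tabular}
\end{table}

\begin{figure}[ht]
  \centering
  % A filled box stands in for a real graphic so the sketch compiles
  % without any image file.
  \rule{4cm}{2cm}
  % \caption placed after the graphic puts the caption below the figure.
  \caption{Placeholder caption for a figure.}
\end{figure}

\end{document}
```

In LaTeX the position of \caption inside the float decides whether the text appears above or below the content, so the same command covers both rules. The provided templates may define their own spacing and reference style, which take precedence over this sketch.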
Contact
Inquiries should be directed to the respective supervisors or to:
Dr. Ralf Schlüter
RWTH Aachen
Lehrstuhl für Informatik 6
Ahornstr. 55
52056 Aachen
Room 6125b
Phone: 0241 / 80-21612
Email: schlueter@cs.rwth-aachen.de