JP van Oosten

Can Markov properties be learned by hidden Markov modelling algorithms?

Sep 21, 2010

Recently I received my MSc degree in Artificial Intelligence. The topic of my final project and thesis was hidden Markov models (HMMs). The main questions were whether the properties of Markov processes could be reliably learned by the current hidden Markov modelling algorithms and whether the Markov property is important in language and handwriting recognition.

Part of the thesis is an in-depth overview of the theory of hidden Markov models and the Baum-Welch algorithm, so if you are interested in HMMs in general, or in the conclusions of the research, please feel free to read my thesis: Can Markov properties be learned by hidden Markov modelling algorithms?

The abstract:

Hidden Markov models (HMMs) are a common classification technique for time series and sequences in areas such as speech recognition, bioinformatics and handwriting recognition. HMMs are used to model processes which behave according to the Markov property: the next state depends only on the current state, not on the states that came before it. Although HMMs are popular in handwriting recognition, there are some doubts about their usage in this field.
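To make the Markov property concrete, here is a minimal sketch of sampling from a small HMM. The two states, the two observation symbols, all probabilities and the names `TRANSITIONS`, `EMISSIONS`, `draw` and `sample_hmm` are made up for illustration and are not taken from the thesis. The point to notice is that the next state is drawn from a table indexed by the current state alone.

```python
import random

# Hypothetical two-state model; every number here is illustrative only.
TRANSITIONS = {
    "A": {"A": 0.9, "B": 0.1},
    "B": {"A": 0.3, "B": 0.7},
}
EMISSIONS = {
    "A": {"x": 0.8, "y": 0.2},
    "B": {"x": 0.1, "y": 0.9},
}


def draw(distribution, rng):
    """Draw one outcome from a {outcome: probability} mapping."""
    outcomes = list(distribution)
    weights = [distribution[o] for o in outcomes]
    return rng.choices(outcomes, weights=weights, k=1)[0]


def sample_hmm(length, start_state="A", seed=None):
    """Return (hidden states, observations), each a list of `length` items."""
    rng = random.Random(seed)
    state = start_state
    states, observations = [], []
    for _ in range(length):
        states.append(state)
        # The observation depends only on the current hidden state.
        observations.append(draw(EMISSIONS[state], rng))
        # Markov property: the next state is drawn using only the
        # current state, never the earlier history in `states`.
        state = draw(TRANSITIONS[state], rng)
    return states, observations


if __name__ == "__main__":
    hidden, observed = sample_hmm(20, seed=42)
    print("hidden:  ", "".join(hidden))
    print("observed:", "".join(observed))
```

In a recognition setting only the observed sequence would be available, and an algorithm such as Baum-Welch would have to estimate tables like the two above from it.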

A number of experiments have been performed with both artificial and natural data. The artificial data was specifically generated for this study, either by transforming flat-text dictionaries or by selecting observations probabilistically under predefined modelling conditions. The natural data is part of the collection from the Queen's Office (Kabinet der Koningin), and was used in studies on handwriting recognition. The experiments try to establish whether the properties of Markov processes can be successfully learned by hidden Markov modelling, as well as the importance of the Markov property in language in general and handwriting in particular.

One finding of this project is that not all Markov processes can be successfully modelled by state-of-the-art HMM algorithms, a conclusion that is strongly supported by a series of experiments with artificial data. Other experiments, with both artificial and natural data, show that removing the temporal aspects of a particular hidden Markov model can still lead to correct classification. These and other critical remarks will be explicated in this thesis.