Learning and Inference of Probabilistic Finite State Machines using MML and Applications to Classification Problem

2017-06-19T02:40:31Z (GMT) by VIDYA SAIKRISHNA
This thesis examines the problem of learning Probabilistic Finite State Machines (PFSMs) from text data and applies the learned models to text classification. PFSMs capture regularities and patterns in text data effectively, and this property is combined with the compression offered by the Minimum Message Length (MML) principle. Several approaches are developed and applied to two-class classification scenarios: classifying spam and non-spam emails in the Enron spam datasets, and predicting individuals in the Activities of Daily Living datasets. The approaches produce significant results and outperform existing classification methods.
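
As a rough illustration of the general idea of classifying by compression, the sketch below builds one prefix-tree automaton per class, estimates transition probabilities with add-one smoothing, and assigns a new token sequence to the class whose model encodes it in the fewest bits. It is a minimal sketch and not the method developed in the thesis. An MML message has two parts, the cost of stating the model and the cost of stating the data given the model; only the second part is computed here, and no search over model structure is attempted. The class name `PrefixTreePFSM`, the `<end>` marker, the function `classify` and the toy messages are all hypothetical and chosen purely for illustration.

```python
import math


class PrefixTreePFSM:
    """Prefix-tree automaton with add-one smoothed transition probabilities.

    Illustrative only: an assumed, simplified stand-in for a learned PFSM.
    """

    END = "<end>"  # hypothetical end-of-sequence marker

    def __init__(self, alphabet):
        # Symbols are assumed to come from this fixed alphabet.
        self.alphabet = list(alphabet) + [self.END]
        self.children = {0: {}}  # state -> {symbol: next state}
        self.counts = {0: {}}    # state -> {symbol: times emitted}

    def train(self, sequences):
        for seq in sequences:
            state = 0
            for sym in seq:
                self.counts[state][sym] = self.counts[state].get(sym, 0) + 1
                if sym not in self.children[state]:
                    new_state = len(self.children)
                    self.children[state][sym] = new_state
                    self.children[new_state] = {}
                    self.counts[new_state] = {}
                state = self.children[state][sym]
            self.counts[state][self.END] = self.counts[state].get(self.END, 0) + 1

    def code_length(self, seq):
        """Bits to encode seq given this model (data part of the message only)."""
        bits = 0.0
        state = 0
        for sym in list(seq) + [self.END]:
            if state is None:
                # The sequence has left the learned tree: fall back to uniform.
                p = 1.0 / len(self.alphabet)
            else:
                total = sum(self.counts[state].values())
                seen = self.counts[state].get(sym, 0)
                p = (seen + 1) / (total + len(self.alphabet))
            bits -= math.log2(p)
            if state is not None and sym != self.END:
                state = self.children[state].get(sym)
        return bits


def classify(seq, models):
    """Return the label whose model gives the shortest code length for seq."""
    return min(models, key=lambda label: models[label].code_length(seq))


# Toy, invented training data (not taken from the Enron datasets).
spam_msgs = [["win", "cash", "now"], ["win", "prize", "now"]]
ham_msgs = [["meeting", "at", "noon"], ["lunch", "at", "noon"]]
vocab = sorted({tok for msg in spam_msgs + ham_msgs for tok in msg})

models = {"spam": PrefixTreePFSM(vocab), "ham": PrefixTreePFSM(vocab)}
models["spam"].train(spam_msgs)
models["ham"].train(ham_msgs)

print(classify(["win", "cash", "now"], models))
print(classify(["lunch", "at", "noon"], models))
```

Comparing code lengths rather than raw probabilities keeps the decision rule in the same units as a message length, so a model-cost term could later be added to each class total without changing `classify`.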