Abstraction Augmented Markov Models

Description:

Article discussing abstraction augmented Markov models.

Creator (Author):
Caragea, Cornelia

University of North Texas; Iowa State University

Creator (Author):
Silvescu, Adrian

Iowa State University

Creator (Author):
Caragea, Doina

Kansas State University

Creator (Author):
Honavar, Vasant

Iowa State University

Publisher Info:
Place of Publication: [New York, New York]
Date(s):
  • Creation: December 2010

Degree:
Note:

© 2013 IEEE. Personal use of this material is permitted. Permission from IEEE must be obtained for all other uses, in any current or future media, including reprinting/republishing this material for advertising or promotional purposes, creating new collective works, for resale or redistribution to servers or lists, or reuse of any copyrighted component of this work in other works.

Note:

Abstract: High accuracy sequence classification often requires the use of higher order Markov models (MMs). However, the number of MM parameters increases exponentially with the range of direct dependencies between sequence elements, thereby increasing the risk of overfitting when the data set is limited in size. We present abstraction augmented Markov models (AAMMs) that effectively reduce the number of numeric parameters of kᵗʰ order MMs by successively grouping strings of length k (i.e., k-grams) into abstraction hierarchies. We evaluate AAMMs on three protein subcellular localization prediction tasks. The results of our experiments show that abstraction makes it possible to construct predictive models that use a significantly smaller number of features (by one to three orders of magnitude) as compared to MMs. AAMMs are competitive with and, in some cases, significantly outperform MMs. Moreover, the results show that AAMMs often perform significantly better than variable order Markov models, such as decomposed context tree weighting, prediction by partial match, and probabilistic suffix trees.
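
The sketch below is included only to illustrate the parameter-reduction idea summarized in the abstract; it is not the authors' implementation. It counts next-symbol occurrences for every k-gram and then groups k-grams by clustering their empirical next-symbol distributions, using a flat KMeans grouping as a stand-in for the abstraction hierarchies constructed in the paper. The function names, toy sequences, and choice of clustering are assumptions made for this example.

# Illustrative sketch only -- not the authors' implementation. A k-th order
# Markov model conditions each symbol on the preceding k-gram; an "abstraction"
# maps many k-grams onto far fewer groups, shrinking the number of
# conditional-probability parameters. The grouping rule below (clustering
# k-grams by their smoothed next-symbol distributions with scikit-learn's
# KMeans) is a simplified stand-in for the hierarchies learned in the paper.
from collections import Counter, defaultdict

import numpy as np
from sklearn.cluster import KMeans

ALPHABET = "ACDEFGHIKLMNPQRSTVWY"  # amino acid alphabet (protein sequences)
SYM_INDEX = {s: i for i, s in enumerate(ALPHABET)}


def kgram_next_counts(sequences, k):
    """Count next-symbol occurrences for every k-gram in the training sequences."""
    counts = defaultdict(Counter)
    for seq in sequences:
        for i in range(len(seq) - k):
            counts[seq[i:i + k]][seq[i + k]] += 1
    return counts


def build_abstraction(counts, num_abstractions):
    """Map each observed k-gram to one of `num_abstractions` groups by
    clustering its Laplace-smoothed next-symbol distribution."""
    kgrams = sorted(counts)
    dists = np.ones((len(kgrams), len(ALPHABET)))  # add-one smoothing
    for row, g in enumerate(kgrams):
        for sym, c in counts[g].items():
            dists[row, SYM_INDEX[sym]] += c
    dists /= dists.sum(axis=1, keepdims=True)
    labels = KMeans(n_clusters=num_abstractions, n_init=10,
                    random_state=0).fit_predict(dists)
    return dict(zip(kgrams, labels))


if __name__ == "__main__":
    train = ["ACDEFGHIKACDEFGHIK", "MNPQRSTVWYMNPQRSTV", "ACDEFGMNPQACDEFGMN"]
    k = 3
    counts = kgram_next_counts(train, k)
    abstraction = build_abstraction(counts, num_abstractions=4)

    # Parameter comparison: a plain k-th order MM keeps one next-symbol
    # distribution per observed k-gram; the abstracted model keeps one per group.
    print("k-grams observed:", len(counts))
    print("abstraction groups:", len(set(abstraction.values())))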

Physical Description:

10 p.: ill.

Language(s):
Subject(s):
Keyword(s): Markov models | abstraction | sequence classification
Source: Proceedings of the Tenth Institute of Electrical and Electronics Engineers (IEEE) International Conference on Data Mining, 2010, Sydney, Australia
Contributor(s):
Partner:
UNT College of Engineering
Collection:
UNT Scholarly Works
Identifier:
  • ARK: ark:/67531/metadc180962
Resource Type: Paper
Format: Text
Rights:
Access: Public