Kno.e.sis Publications

Mining Sequence Classifiers for Early Prediction

Zhengzheng Xing
Jian Pei
Guozhu Dong, Wright State University - Main Campus
Philip S. Yu

Document Type

Conference Proceeding

Publication Date

4-2008

Abstract

Supervised learning on sequence data, also known as sequence classification, has been well recognized as an important data mining task with many significant applications. Since temporal order is important in sequence data, in many critical applications of sequence classification such as medical diagnosis and disaster prediction, early prediction is a highly desirable feature of sequence classifiers. In early prediction, a sequence classifier should use a prefix of a sequence as short as possible to make a reasonably accurate prediction. To the best of our knowledge, early prediction on sequence data has not been studied systematically.

In this paper, we identify the novel problem of mining sequence classifiers for early prediction. We analyze the problem and the challenges. As the first attempt to tackle the problem, we propose two interesting methods. The sequential classification rule (SCR) method mines a set of sequential classification rules as a classifier. A so-called early-prediction utility is defined and used to select features and rules. The generalized sequential decision tree (GSDT) method adopts a divide-and-conquer strategy to generate a classification model. We conduct an extensive empirical evaluation on several real data sets. Interestingly, our two methods achieve accuracy comparable to that of the state-of-the-art methods, but typically need to use only very short prefixes of the sequences. The results clearly indicate that early prediction is highly feasible and effective.

Comments

Presented at the Society for Industrial and Applied Mathematics' International Conference on Data Mining, Atlanta, GA, April 24-26, 2008.

Repository Citation

Xing, Z., Pei, J., Dong, G., & Yu, P. S. (2008). Mining Sequence Classifiers for Early Prediction. Proceedings of the 2008 SIAM International Conference on Data Mining, 644-655.
https://corescholar.libraries.wright.edu/knoesis/390

DOI

10.1137/1.9781611972788.59
