Kno.e.sis Publications

Combining the Strength of Pattern Frequency and Distance for Classification

Jinyan Li
Kotagiri Ramamohanarao
Guozhu Dong, Wright State University - Main Campus

Document Type

Conference Proceeding

Publication Date

4-2001

Abstract

Supervised classification draws on many heuristics as the basis for induction algorithms, including decision trees, k-nearest neighbour (k-NN), pattern frequency, neural networks, and the Bayesian rule. In this paper, we propose a new instance-based induction algorithm which combines the strength of pattern frequency and distance. We define a neighbourhood of a test instance. If the neighbourhood contains training data, we use k-NN to make decisions. Otherwise, we examine the support (frequency) of certain types of subsets of the test instance, and calculate support summations for prediction. This scheme is intended to deal with outliers: when no training data is near a test instance, the distance measure is not a proper predictor for classification. We present an effective method to choose an “optimal” neighbourhood factor for a given data set, guided by part of the training data. We find that our algorithm maintains (and sometimes exceeds) the outstanding accuracy of k-NN on data sets containing purely continuous attributes, and that it greatly improves on the accuracy of k-NN on data sets containing a mixture of continuous and categorical attributes. In general, our method is much superior to C5.0.
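
The two-branch decision rule described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation. Several details are assumptions made only for illustration: the neighbourhood is modelled as a Euclidean ball with a hypothetical `radius` parameter (standing in for the paper's neighbourhood factor), the "certain types of subsets" are taken to be all combinations of up to two attribute-value pairs over the categorical attributes, and support is measured as the fraction of a class's training instances that contain the subset. The function names and the toy data are likewise hypothetical.

```python
from collections import Counter
from itertools import combinations
from math import dist

# Toy training set (made up): (numeric features, categorical features, label).
train = [
    ((0.0, 0.0), ("red", "small"), "A"),
    ((0.1, 0.2), ("red", "large"), "A"),
    ((0.2, 0.1), ("blue", "small"), "A"),
    ((5.0, 5.0), ("green", "large"), "B"),
    ((5.1, 4.9), ("green", "small"), "B"),
    ((4.9, 5.2), ("blue", "large"), "B"),
]


def knn_predict(train, x_num, k):
    """Majority vote among the k training instances nearest to x_num."""
    nearest = sorted(train, key=lambda row: dist(row[0], x_num))[:k]
    votes = Counter(label for _, _, label in nearest)
    return votes.most_common(1)[0][0]


def support_sum_predict(train, x_cat, max_size=2):
    """Pick the class with the largest summed support over item subsets.

    Assumption: the subsets are all combinations of up to max_size
    (attribute index, value) pairs of the test instance. The abstract
    does not say which pattern type the paper actually uses.
    """
    items = list(enumerate(x_cat))
    class_sizes = Counter(label for _, _, label in train)
    scores = Counter()
    for size in range(1, max_size + 1):
        for subset in combinations(items, size):
            for _, cat, label in train:
                if all(cat[i] == v for i, v in subset):
                    # Each match adds 1/|class|, so the total per subset
                    # is that subset's support within the class.
                    scores[label] += 1 / class_sizes[label]
    if not scores:
        # No subset occurs anywhere: fall back to the largest class.
        return class_sizes.most_common(1)[0][0]
    return max(scores, key=scores.get)


def classify(train, x_num, x_cat, radius, k=3):
    """Use k-NN if any training point lies within radius, else support sums."""
    has_neighbour = any(dist(num, x_num) <= radius for num, _, _ in train)
    if has_neighbour:
        return knn_predict(train, x_num, k)
    return support_sum_predict(train, x_cat)
```

On this toy data, a test instance at `(20.0, 20.0)` with categorical values `("red", "small")` has no training point within `radius=1.0`, so `classify` uses the support summation and returns `"A"`, whereas plain `knn_predict` on the same point returns `"B"` because the B cluster happens to be closer. This is the outlier situation the abstract describes, in which distance alone is a poor predictor.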

Comments

Presented at the 5th Pacific-Asia Conference on Knowledge Discovery and Data Mining (PAKDD), Hong Kong, April 16-18, 2001.

Repository Citation

Li, J., Ramamohanarao, K., & Dong, G. (2001). Combining the Strength of Pattern Frequency and Distance for Classification. Lecture Notes in Computer Science, 2035, 455-466.
https://corescholar.libraries.wright.edu/knoesis/420

DOI

10.1007/3-540-45357-1_48
