Genetic algorithms (GAs) are powerful tools for k-nearest neighbors (kNN) classification. Traditional kNN classifiers use Euclidean distance to assess neighbor similarity, though other measures may also be used. GAs can search for linear feature weights that improve kNN performance under both Euclidean distance and cosine similarity. GAs can also optimize additive feature offsets, searching for the best point of reference from which to assess angular similarity with the cosine measure. This poster explores weight and offset optimization for kNN with varying similarity measures: Euclidean distance (weights only), cosine similarity, and Pearson correlation. The use of offset optimization here represents a novel technique for enhancing Pearson/kNN classification performance. Experiments compare optimized and non-optimized classifiers on public domain datasets. While unoptimized Euclidean kNN often outperforms its cosine and Pearson counterparts, optimized Pearson and cosine kNN classifiers show equal or improved accuracy compared to weight-optimized Euclidean kNN.
Peterson, M. R., Doom, T. E., & Raymer, M. L. (2005). GA-Facilitated Classifier Optimization with Varying Similarity Measures. GECCO '05: Proceedings of the 2005 Conference on Genetic and Evolutionary Computation, 1549-1550.
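
To make the roles of weights and offsets concrete, the following is a minimal Python sketch, not the authors' implementation. It assumes features are transformed as `(x + offsets) * weights` (the order of the two operations is an assumption, as the abstract does not specify it), that Pearson correlation is computed as the cosine similarity of mean-centered feature vectors, and that classification is by simple majority vote. The names `apply_weights_offsets`, `similarity`, and `knn_predict` are hypothetical. The GA itself is omitted: it would evolve the `weights` and `offsets` vectors, presumably using classification accuracy as the fitness.

```python
import numpy as np

def apply_weights_offsets(X, weights, offsets):
    # Assumed transform: shift each feature by its offset, then scale by its weight.
    return (np.asarray(X, dtype=float) + offsets) * weights

def similarity(A, b, measure):
    """Similarity of each row of A to vector b (larger means more similar)."""
    if measure == "euclidean":
        # Negated distance, so that larger values still mean "closer".
        return -np.linalg.norm(A - b, axis=1)
    if measure == "pearson":
        # Pearson correlation = cosine similarity of mean-centered vectors.
        A = A - A.mean(axis=1, keepdims=True)
        b = b - b.mean()
    elif measure != "cosine":
        raise ValueError(f"unknown measure: {measure}")
    denom = np.linalg.norm(A, axis=1) * np.linalg.norm(b)
    # Guard against zero-length vectors (similarity is reported as 0 there).
    return (A @ b) / np.where(denom == 0, 1.0, denom)

def knn_predict(X_train, y_train, x, k, weights, offsets, measure="cosine"):
    A = apply_weights_offsets(X_train, weights, offsets)
    b = apply_weights_offsets(x, weights, offsets)
    sims = similarity(A, b, measure)
    nearest = np.argsort(-sims, kind="stable")[:k]  # k most similar neighbors
    labels, counts = np.unique(np.asarray(y_train)[nearest], return_counts=True)
    return labels[np.argmax(counts)]  # majority vote among the k neighbors
```

In this sketch an offset matters for the cosine measure because it moves the origin from which angles are measured: with training points `[4, 4]` and `[2, 3]` and query `[2, 2]`, the nearest neighbor by cosine similarity is `[4, 4]` with zero offsets but `[2, 3]` with offsets `[-3, -3]`. For Euclidean distance an offset applied to every point cancels out, which is consistent with the abstract optimizing weights only in that case.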