Document Type
Article
Publication Date
8-2012
Abstract
DNA microarrays (gene chips), frequently used in biological and medical studies, measure the expressions of thousands of genes per sample. Using microarray data to build accurate classifiers for diseases is an important task. This paper introduces an algorithm, called Committee of Decision Trees by Attribute Behavior Diversity (CABD), to build highly accurate ensembles of decision trees for such data. Since a committee's accuracy is greatly influenced by the diversity among its member classifiers, CABD uses two new ideas to "optimize" that diversity, namely (1) the concept of attribute behavior–based similarity between attributes, and (2) the concept of attribute usage diversity among trees. The ideas are effective for microarray data, since such data have many features and behavior similarity between genes can be high. Experiments on microarray data for six cancers show that CABD outperforms previous ensemble methods significantly and outperforms SVM, and show that the diversified features used by CABD's decision tree committee can be used to improve performance of other classifiers such as SVM. CABD has potential for other high-dimensional data, and its ideas may apply to ensembles of other classifier types.
Repository Citation
Han, Q., & Dong, G. (2012). Using Attribute Behavior Diversity to Build Accurate Decision Tree Committees for Microarray Data. Journal of Bioinformatics and Computational Biology, 10 (4), 1250005-1-1250005-14.
https://corescholar.libraries.wright.edu/knoesis/382
DOI
10.1142/S0219720012500059
Included in
Bioinformatics Commons, Communication Technology and New Media Commons, Databases and Information Systems Commons, OS and Networks Commons, Science and Technology Studies Commons
Comments
The attached PDF document is the unpublished, peer-reviewed version of this article. The final version of this article was published in the Journal of Bioinformatics and Computational Biology, 10, 4, 2012, 1250005-1-1250005-14, doi: 10.1142/S0219720012500059 © copyright World Scientific Publishing Company, http://www.worldscientific.com/worldscinet/jbcb.