Reducing the dimensionality of the problem space by detecting and removing variables that contribute little or nothing to classification can relieve both the computational load and the effort of acquiring instances, since every data attribute would otherwise be accessed each time. The approach to feature selection in this paper is based on the concept of coherent accumulation of data about class centers with respect to the coordinates of informative features. Features are ranked by the degree to which they exhibit random characteristics. The results are verified using the Nearest Neighbor classifier, which also helps to address feature irrelevance and redundancy, issues that ranking alone does not immediately resolve. Additionally, feature ranking methods from independent sources are included for direct comparison.
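To make the idea of ranking features by misclassification counts concrete, the following is a minimal sketch and not the authors' algorithm, which the abstract does not specify. It assumes a simplified stand-in: each feature is scored on its own by assigning every instance to the class whose center (taken here as the per-class mean) is nearest along that single coordinate, and counting the disagreements with the true label. The function names `misclassification_counts` and `rank_features` and the toy data are hypothetical.

```python
from collections import defaultdict


def misclassification_counts(X, y):
    """For each feature, count the instances whose nearest class center
    along that single coordinate belongs to a different class.

    Assumption: a class center is the per-class mean of the feature.
    """
    # Group row indices by class label.
    by_class = defaultdict(list)
    for i, label in enumerate(y):
        by_class[label].append(i)

    counts = []
    for j in range(len(X[0])):
        # One center per class for feature j.
        centers = {
            c: sum(X[i][j] for i in idx) / len(idx)
            for c, idx in by_class.items()
        }
        errors = 0
        for row, label in zip(X, y):
            nearest = min(centers, key=lambda c: abs(row[j] - centers[c]))
            if nearest != label:
                errors += 1
        counts.append(errors)
    return counts


def rank_features(X, y):
    """Return feature indices ordered from fewest to most misclassifications."""
    counts = misclassification_counts(X, y)
    return sorted(range(len(counts)), key=lambda j: counts[j])


# Toy data: feature 0 separates the classes, feature 1 does not.
X = [[0.0, 5.0], [0.2, 1.0], [1.0, 4.0], [1.1, 0.0]]
y = [0, 0, 1, 1]

print(misclassification_counts(X, y))  # [0, 2]
print(rank_features(X, y))             # [0, 1]
```

A score computed one feature at a time cannot detect redundancy between features, which is consistent with the abstract's remark that ranking alone does not settle irrelevance and redundancy and that a classifier is used for verification.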
Cite as: Bagirov, A., Yatsko, A., Stranieri, A. and Jelinek, H. (2011). Feature Selection using Misclassification Counts. In Proc. Australasian Data Mining Conference (AusDM 11) Ballarat, Australia. CRPIT, 121. Vamplew, P., Stranieri, A., Ong, K.-L., Christen, P. and Kennedy, P. J. Eds., ACS. 51-62