The data-mining-inspired problem of finding the critical, most useful features for classifying a data set, and of constructing rules to predict the class of future examples, is both interesting and important. It has applications in many areas, including microarray analysis, genomics, proteomics, pattern recognition, data compression and knowledge discovery. Expressed as k-Feature Set, it is also a formally hard problem. In this paper we present a method for coping with this hardness using sound reduction rules, a technique inspired by combinatorial optimisation and parameterized complexity. We apply our method to an interesting data set used to predict the winner of the popular vote in U.S. presidential elections. We demonstrate the power and flexibility of the reductions, especially when used in the context of the (α, β)k-Feature Set variant of the problem.
Cite as: Moscato, P., Mathieson, L., Mendes, A. and Berretta, R. (2005). The Electronic Primaries: Predicting the U.S. Presidency Using Feature Selection with Safe Data Reduction. In Proc. Twenty-Eighth Australasian Computer Science Conference (ACSC2005), Newcastle, Australia. CRPIT, 38. Estivill-Castro, V., Ed. ACS. 371-380.