Feature Space Transformation and Decision Results Interpretation

Li, J. and Ong, H.-L.

    Gene expression profiles and proteomic data are extremely high-dimensional. Although support vector machines can learn the inner relationships of such data well for classification, their non-linear kernel functions make it hard to explain the reasons for a prediction to non-specialists. We prefer rule-based methods because of their easy interpretability. In this paper, we first discuss feature space transformation. Each new feature (a rule) is a combination of multiple original features, provided that the new feature captures a large percentage of one class of data but does not occur in the other class. Described in terms of these new features, training or test data are clearly class-separable. Then we discuss a more sophisticated rule-based method, called PCL, for classification. PCL provides easily explainable classification scores that help us better understand the predictions and the test data themselves. Visualization is also used to enhance the understanding of the classifier output. We use rich examples to demonstrate our main points.
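The rule condition described in the abstract (high coverage of one class, no occurrence in the other) can be illustrated with a minimal sketch. This is not the authors' implementation: the item names such as "g1=high", the function names, and the `min_support` threshold of 0.5 are all hypothetical and chosen only for illustration, and the sketch assumes the expression values have already been discretized into items.

```python
from typing import FrozenSet, List, Sequence

# A sample is a set of discretized items (hypothetical names, e.g. "g1=high").
Sample = FrozenSet[str]


def support(rule: Sample, samples: Sequence[Sample]) -> float:
    """Fraction of samples that contain every item of the rule."""
    if not samples:
        return 0.0
    return sum(rule <= s for s in samples) / len(samples)


def is_valid_rule(rule: Sample, home: Sequence[Sample],
                  other: Sequence[Sample], min_support: float = 0.5) -> bool:
    """True if the rule covers a large share of its home class and
    never occurs in the other class (min_support is an assumed threshold)."""
    return support(rule, home) >= min_support and support(rule, other) == 0.0


def transform(sample: Sample, rules: Sequence[Sample]) -> List[int]:
    """Describe a sample in the new feature space: one 0/1 flag per rule."""
    return [int(rule <= sample) for rule in rules]


# Toy data, invented for illustration only.
tumor = [
    frozenset({"g1=high", "g2=low"}),
    frozenset({"g1=high", "g2=low", "g3=high"}),
    frozenset({"g1=high", "g3=high"}),
]
normal = [
    frozenset({"g1=low", "g2=low"}),
    frozenset({"g1=high", "g2=high"}),
]

combined = frozenset({"g1=high", "g2=low"})  # combination of two original features
single = frozenset({"g2=low"})               # also occurs in the normal class

print(is_valid_rule(combined, tumor, normal))
print(is_valid_rule(single, tumor, normal))
print(transform(tumor[0], [combined]), transform(normal[0], [combined]))
```

In this toy data the single item "g2=low" appears in both classes, whereas the combination with "g1=high" appears only in the tumor samples, which is why combining original features can yield class-separating new features.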
Cite as: Li, J. and Ong, H.-L. (2003). Feature Space Transformation and Decision Results Interpretation. In Proc. First Asia-Pacific Bioinformatics Conference (APBC2003), Adelaide, Australia. CRPIT, 19. Chen, Y.-P. P., Ed. ACS. 129-137.