Conferences in Research and Practice in Information Technology
  

Online Version - Last Updated - 20 Jan 2012

 

 

Kernel-based Principal Components Analysis on Large Telecommunication Data

Sato, T., Huang, B., Lefait, G., Kechadi, M-T. and Buckley, B.

    Linear Principal Components Analysis (LPCA) is known as a simple way to reduce feature dimensionality. An extension of LPCA, Kernel Principal Components Analysis (KPCA), outperforms LPCA when applied to non-linear data in a high-dimensional feature space. However, on large datasets with a high-dimensional input space, KPCA runs into memory problems and handles imbalanced classification poorly. This paper presents an approach that reduces the complexity of the KPCA training process by condensing the training set with sampling and clustering techniques as a pre-processing step. The experiments were carried out on a large real-world telecommunication dataset and assessed on a churn prediction task. They show that the proposed approach, when combined with clustering techniques, can efficiently reduce the feature dimension and outperforms standard PCA for customer churn prediction.
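    The idea of condensing the training set before KPCA can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not name the clustering algorithm, kernel, or parameters, so plain k-means, an RBF kernel, the value of gamma, the synthetic two-ring data, and all function names below are assumptions made for illustration. The point it shows is that the kernel matrix is built over k centroids (k x k) rather than over all n training points (n x n), and that the full dataset is projected afterwards. Only the top component is computed, approximately, by power iteration.

```python
import math
import random

def rbf(a, b, gamma):
    """RBF kernel between two points (gamma is an assumed hyperparameter)."""
    return math.exp(-gamma * sum((x - y) ** 2 for x, y in zip(a, b)))

def kmeans(points, k, iters=20, seed=0):
    """Plain Lloyd's k-means; the k centroids stand in for the full data."""
    centroids = random.Random(seed).sample(points, k)
    for _ in range(iters):
        groups = [[] for _ in range(k)]
        for p in points:
            nearest = min(range(k), key=lambda j: math.dist(p, centroids[j]))
            groups[nearest].append(p)
        # Keep the old centroid if a cluster ends up empty.
        centroids = [[sum(col) / len(g) for col in zip(*g)] if g else c
                     for g, c in zip(groups, centroids)]
    return centroids

def fit_kpca_top(train, gamma, iters=200, seed=0):
    """Approximate the top kernel principal component by power iteration."""
    n = len(train)
    K = [[rbf(a, b, gamma) for b in train] for a in train]
    row_mean = [sum(r) / n for r in K]
    all_mean = sum(row_mean) / n
    # Centre the kernel matrix in feature space.
    Kc = [[K[i][j] - row_mean[i] - row_mean[j] + all_mean for j in range(n)]
          for i in range(n)]
    rng = random.Random(seed)
    v = [rng.gauss(0.0, 1.0) for _ in range(n)]
    lam = 0.0
    for _ in range(iters):
        w = [sum(Kc[i][j] * v[j] for j in range(n)) for i in range(n)]
        lam = math.sqrt(sum(x * x for x in w))  # eigenvalue estimate
        v = [x / lam for x in w]
    # Scale so the component has unit norm in feature space.
    alpha = [x / math.sqrt(lam) for x in v]
    return alpha, row_mean, all_mean

def project(x, train, alpha, row_mean, all_mean, gamma):
    """Project a new point onto the component learned from `train`."""
    kx = [rbf(x, t, gamma) for t in train]
    kx_mean = sum(kx) / len(kx)
    return sum(a * (k - kx_mean - rm + all_mean)
               for a, k, rm in zip(alpha, kx, row_mean))

# Synthetic non-linear data: two concentric rings (illustrative only).
rng = random.Random(1)
data = []
for i in range(300):
    radius = 1.0 if i % 2 else 3.0
    angle = rng.uniform(0.0, 2.0 * math.pi)
    data.append([radius * math.cos(angle), radius * math.sin(angle)])

GAMMA = 0.5                       # assumed kernel width
centroids = kmeans(data, k=20)    # condense 300 points to 20
alpha, row_mean, all_mean = fit_kpca_top(centroids, GAMMA)
scores = [project(p, centroids, alpha, row_mean, all_mean, GAMMA) for p in data]
```

    In this sketch the kernel matrix holds 20 x 20 entries instead of 300 x 300, which is where the memory saving comes from; how much accuracy is lost by the condensation depends on the data and on the choice of k.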
Cite as: Sato, T., Huang, B., Lefait, G., Kechadi, M-T. and Buckley, B. (2009). Kernel-based Principal Components Analysis on Large Telecommunication Data. In Proc. Australasian Data Mining Conference (AusDM'09) Melbourne, Australia. CRPIT, 101. Kennedy P. J., Ong K. and Christen P. Eds., ACS. 109-116