Classifying Text Documents by Associating Terms With Text Categories

Zaiane, O.R. and Antonie, M.-L.

    Automatic text categorization has been an important application and research topic since the inception of digital documents. Today, text categorization is a necessity due to the very large number of text documents that we have to deal with daily. Many techniques and algorithms for automatic text categorization have been devised and proposed in the literature. However, there is still much room for improving the effectiveness of these classifiers, and new models need to be examined. We propose herein a new approach for automatic text categorization. This paper explores the use of association rule mining in building a text categorization system and proposes a new fast algorithm for building a text classifier. Our approach has the advantage of a very fast training phase, and the rules of the generated classifier are easy to understand and manually tuneable. Our investigation leads us to conclude that association rule mining is a good and promising strategy for efficient automatic text categorization.
Cite as: Zaiane, O.R. and Antonie, M.-L. (2002). Classifying Text Documents by Associating Terms With Text Categories. In Proc. Thirteenth Australasian Database Conference (ADC2002), Melbourne, Australia. CRPIT, 5. Zhou, X., Ed. ACS. 215-222.