Using Text Classification to Predict the Gene Knockout Behaviour of S. Cerevisiae
Caldon, P.
A naive Bayes classifier was used to analyze gene behavior from text data and was submitted as an entry to the 2002 KDD Cup, a data mining exercise to predict the behavior of the yeast S. cerevisiae. The solution was based on the multinomial event model for text classification (McCallum & Nigam 1998), with a feature selection mechanism added. Despite the simplicity of this model, it achieved performance close to that of the best entries in the competition, which used more sophisticated techniques. It appears that a seemingly minor effort in using prior knowledge to conflate the gene classes, together with the previously described effectiveness of the naive Bayes method, contributed to this success.
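As a rough illustration of the approach named in the abstract, the following minimal sketch trains a multinomial naive Bayes text classifier with Laplace smoothing after a feature selection step. It is not the author's implementation: the abstract does not specify the selection criterion, so ranking words by document frequency is an assumed stand-in, and the function names, toy documents, and class labels (`change`, `no_change`) are hypothetical.

```python
import math
from collections import Counter, defaultdict


def select_features(docs, k):
    """Keep the k words found in the most documents (assumed stand-in criterion)."""
    df = Counter()
    for words, _ in docs:
        df.update(set(words))
    # Descending document frequency, ties broken alphabetically for determinism.
    ranked = sorted(df, key=lambda w: (-df[w], w))
    return set(ranked[:k])


def train(docs, vocab):
    """Estimate log priors and Laplace-smoothed log word likelihoods per class."""
    class_docs = Counter()
    word_counts = defaultdict(Counter)
    for words, label in docs:
        class_docs[label] += 1
        word_counts[label].update(w for w in words if w in vocab)
    n_docs = sum(class_docs.values())
    model = {}
    for label in class_docs:
        total = sum(word_counts[label].values())
        log_prior = math.log(class_docs[label] / n_docs)
        log_like = {
            w: math.log((word_counts[label][w] + 1) / (total + len(vocab)))
            for w in vocab
        }
        model[label] = (log_prior, log_like)
    return model


def predict(model, words):
    """Return the class with the highest posterior log score."""
    def score(label):
        log_prior, log_like = model[label]
        # Words outside the selected vocabulary are ignored.
        return log_prior + sum(log_like[w] for w in words if w in log_like)
    return max(model, key=score)


# Hypothetical toy corpus: (words from text about a gene, class label).
docs = [
    (["growth", "arrest", "nucleus"], "change"),
    (["growth", "defect", "mitochondrion"], "change"),
    (["membrane", "transport", "nucleus"], "no_change"),
    (["membrane", "protein", "transport"], "no_change"),
]
vocab = select_features(docs, 4)
model = train(docs, vocab)
print(predict(model, ["growth", "defect"]))
print(predict(model, ["membrane", "transport"]))
```

Under the multinomial event model each word occurrence contributes one factor to the class likelihood, which is why the score sums a log likelihood per token rather than per distinct word.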
Cite as: Caldon, P. (2003). Using Text Classification to Predict the Gene Knockout Behaviour of S. Cerevisiae. In Proc. First Asia-Pacific Bioinformatics Conference (APBC2003), Adelaide, Australia. CRPIT, 19. Chen, Y.-P. P., Ed. ACS. 211-214.