Factors Influencing Robustness and Effectiveness of Conditional Random Fields in Active Learning Frameworks
Kholghi, M., Sitbon, L., Zuccon, G. and Nguyen, A.
Active learning approaches reduce the annotation cost
required by traditional supervised approaches to reach the
same effectiveness by actively selecting informative
instances during the learning phase. However,
effectiveness and robustness of the learnt models are
influenced by a number of factors. In this paper we
investigate the factors that affect the effectiveness, more
specifically in terms of stability and robustness, of active
learning models built using conditional random fields
(CRFs) for information extraction applications. Stability,
defined as a small variation in performance under small
variations of the training data or of the parameters, is a
major issue for machine learning models, and even more so
in the active learning framework, which aims to minimise
the amount of training data required. The factors we
investigate are a) the choice
of incremental vs. standard active learning, b) the feature
set used as a representation of the text (i.e., morphological
features, syntactic features, or semantic features) and c)
the Gaussian prior variance, an important CRF
parameter. Our empirical findings show that incremental
learning and the Gaussian prior variance lead to more
stable and robust models across iterations. Our study also
demonstrates that orthographical, morphological and
contextual features as a group of basic features play an
important role in learning effective models across all
iterations.
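
The abstract defines stability only in words and does not say which measure the authors use. A minimal sketch of one possible reading, assuming stability is summarised as the spread (population standard deviation) of a performance score over runs that differ by small perturbations; the function names and the setting labels are hypothetical and are not taken from the paper:

```python
from statistics import mean, pstdev


def stability_report(scores_by_setting):
    """Summarise how much a performance score varies across perturbed runs.

    scores_by_setting maps a setting label (hypothetical, e.g. a learning
    mode or a prior variance value) to a list of scores, one per run.
    A smaller spread is read here as a more stable setting.
    """
    report = {}
    for setting, scores in scores_by_setting.items():
        report[setting] = {
            "mean": mean(scores),                # average effectiveness
            "spread": pstdev(scores),            # assumed stability measure
            "range": max(scores) - min(scores),  # worst-case variation
        }
    return report


def most_stable(scores_by_setting):
    """Return the setting label whose scores have the smallest spread."""
    report = stability_report(scores_by_setting)
    return min(report, key=lambda setting: report[setting]["spread"])


if __name__ == "__main__":
    # Placeholder scores for illustration only; NOT results from the paper.
    placeholder = {
        "setting_a": [0.70, 0.71, 0.69],
        "setting_b": [0.60, 0.80, 0.72],
    }
    for label, summary in stability_report(placeholder).items():
        print(label, summary)
    print("smallest spread:", most_stable(placeholder))
```

Standard deviation is only one choice; the range, or the variation between consecutive active learning iterations, would fit the verbal definition equally well.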
Cite as: Kholghi, M., Sitbon, L., Zuccon, G. and Nguyen, A. (2014). Factors Influencing Robustness and Effectiveness of Conditional Random Fields in Active Learning Frameworks. In Proc. Twelfth Australasian Data Mining Conference (AusDM14) Brisbane, Australia. CRPIT, 158. Li, X., Liu, L., Ong, K.L. and Zhao, Y. Eds., ACS. 69-78