A Novel Framework Using Two Layers of Missing Value Imputation
Rahman, M.G. and Islam, M.Z.
In this study we present a novel framework that uses two
layers (steps) of imputation, namely the Early-Imputation
step and the Advanced-Imputation step. In the Early-Imputation
step we first impute the missing values (both numerical
and categorical) using existing techniques. The
main goal of this step is to carry out an initial imputation
and thereby refine the records having missing values so
that they can be used in the second layer of imputation
through an existing technique called DMI. The original
DMI ignores the records having missing values. Therefore,
we argue that if a data set has a large number of missing
values then the imputation accuracy of DMI may suffer
significantly, since it then ignores a large number of records.
In this study we present four versions of the framework
and compare them with three existing techniques on two
natural data sets that are publicly available. We use four
evaluation criteria and two statistical significance analyses.
Our experimental results indicate a clear superiority
of the proposed framework over the existing techniques.
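
The two-layer idea described above can be illustrated with a minimal sketch. This is not the authors' implementation and it does not implement DMI: the first layer uses column mean/mode as a placeholder for "existing techniques", and the second layer swaps in a simple nearest-neighbour estimate as a stand-in for DMI. All function names (`early_impute`, `advanced_impute`, `two_layer_impute`), the parameter `k`, and the unscaled distance measure are assumptions made for illustration. The point it shows is only the data flow: after the early step, records that originally had missing values are no longer discarded and can inform the second step.

```python
from collections import Counter
from statistics import mean


def _estimate(values, is_numeric):
    # Mean for numerical attributes, most frequent value for categorical ones.
    return mean(values) if is_numeric else Counter(values).most_common(1)[0][0]


def early_impute(records, numeric_cols):
    """Layer 1 (placeholder): fill every gap (None) with the column mean or mode.

    Assumes each column has at least one observed value.
    """
    fill = {}
    for c in range(len(records[0])):
        observed = [r[c] for r in records if r[c] is not None]
        fill[c] = _estimate(observed, c in numeric_cols)
    return [[fill[c] if v is None else v for c, v in enumerate(r)] for r in records]


def _distance(a, b, numeric_cols, skip):
    # Unscaled mixed distance over all attributes except the one being imputed.
    d = 0.0
    for c, (x, y) in enumerate(zip(a, b)):
        if c == skip:
            continue
        d += abs(x - y) if c in numeric_cols else float(x != y)
    return d


def advanced_impute(records, completed, numeric_cols, k=1):
    """Layer 2 (stand-in for DMI): re-estimate each originally missing value
    from the k nearest records of the early-imputed data set."""
    result = [list(r) for r in completed]
    for i, record in enumerate(records):
        for c, value in enumerate(record):
            if value is not None:
                continue  # only originally missing cells are re-estimated
            others = [j for j in range(len(completed)) if j != i]
            others.sort(key=lambda j: _distance(completed[i], completed[j], numeric_cols, c))
            neighbours = [completed[j][c] for j in others[:k]]
            result[i][c] = _estimate(neighbours, c in numeric_cols)
    return result


def two_layer_impute(records, numeric_cols, k=1):
    completed = early_impute(records, numeric_cols)  # Early-Imputation step
    return advanced_impute(records, completed, numeric_cols, k)  # Advanced-Imputation step
```

Here `records` is a list of equal-length rows with `None` marking a missing value, and `numeric_cols` is the set of column indices holding numerical attributes. Note that the second layer draws on every other record, including those that were incomplete before the early step.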
Cite as: Rahman, M.G. and Islam, M.Z. (2013). A Novel Framework Using Two Layers of Missing Value Imputation. In Proc. Eleventh Australasian Data Mining Conference (AusDM13), Canberra, Australia. CRPIT, 146. Christen, P., Kennedy, P., Liu, L., Ong, K.L., Stranieri, A. and Zhao, Y. Eds., ACS. 149-160