Priority Driven K-Anonymisation for Privacy Protection
Sun, X., Wang, H. and Li, J.
Given the threat of re-identification in our growing digital society, guaranteeing privacy while providing worthwhile data for knowledge discovery has become a difficult problem. k-anonymity is a major technique used to ensure privacy by generalizing and suppressing attributes, and it has been the focus of intense research in the last few years. However, data modification techniques like generalization may produce anonymous data that are unusable for medical studies because some attributes become too coarse-grained. In this paper, we propose a priority driven k-anonymisation that allows the degree of acceptable distortion to be specified for each attribute separately. We also define appropriate metrics to measure distance and information loss, which are suitable for both numerical and categorical attributes. Further, we formulate priority driven k-anonymisation as a k-nearest neighbor (KNN) clustering problem with the added constraint that each cluster contains at least k tuples. We develop an efficient algorithm for priority driven k-anonymisation. Experimental results show that the proposed technique causes significantly less distortion.
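The clustering formulation in the abstract can be illustrated with a minimal sketch. The abstract does not give the paper's actual metrics or algorithm, so everything here is an assumption: the per-attribute `weights` stand in for the priorities, numerical distance is taken as the absolute difference divided by the attribute's range, categorical distance is 0 or 1, and the greedy grouping in `greedy_k_clusters` is a hypothetical scheme, not the authors' method.

```python
def weighted_distance(a, b, ranges, weights):
    """Weighted distance between two tuples with mixed attributes.

    Assumed metric (not from the paper): numerical attributes use
    |x - y| / range, categorical (string) attributes use 0 if equal else 1.
    A larger weight means distortion on that attribute costs more.
    """
    total = 0.0
    for i, (x, y) in enumerate(zip(a, b)):
        if isinstance(x, str):
            d = 0.0 if x == y else 1.0
        else:
            d = abs(x - y) / ranges[i] if ranges[i] else 0.0
        total += weights[i] * d
    return total


def greedy_k_clusters(rows, weights, k):
    """Group row indices into clusters that each hold at least k tuples.

    Hypothetical greedy scheme: take the first unassigned row as a seed,
    attach its k - 1 nearest unassigned rows, and repeat. Fewer than k
    leftover rows are added to the cluster whose seed is nearest.
    """
    if k < 1 or len(rows) < k:
        raise ValueError("need k >= 1 and at least k rows")

    # Range of each numerical column, used to normalise distances.
    ranges = []
    for i in range(len(rows[0])):
        column = [row[i] for row in rows]
        if isinstance(column[0], str):
            ranges.append(0.0)
        else:
            ranges.append(max(column) - min(column))

    remaining = list(range(len(rows)))
    clusters = []
    while len(remaining) >= k:
        seed = remaining.pop(0)
        remaining.sort(
            key=lambda j: weighted_distance(rows[seed], rows[j], ranges, weights)
        )
        clusters.append([seed] + remaining[: k - 1])
        remaining = remaining[k - 1 :]

    for j in remaining:
        nearest = min(
            clusters,
            key=lambda c: weighted_distance(rows[c[0]], rows[j], ranges, weights),
        )
        nearest.append(j)
    return clusters


# Toy data: (age, occupation). Both attributes get equal priority here.
rows = [(25, "nurse"), (27, "nurse"), (60, "lawyer"), (62, "lawyer")]
clusters = greedy_k_clusters(rows, weights=[1.0, 1.0], k=2)
```

Each resulting cluster could then be generalized to a common value per attribute, so that every released tuple is indistinguishable from at least k - 1 others. Raising the weight of an attribute in this sketch makes the grouping prefer tuples that agree on it, which is one plausible reading of attribute priority.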
Cite as: Sun, X., Wang, H. and Li, J. (2008). Priority Driven K-Anonymisation for Privacy Protection. In Proc. Seventh Australasian Data Mining Conference (AusDM 2008), Glenelg, South Australia. CRPIT, 87. Roddick, J. F., Li, J., Christen, P. and Kennedy, P. J., Eds. ACS. 73-78.