This paper presents the P-level model, a concept hierarchy-based approach to privacy-preserving data collection for data mining. The P-level model allows data providers to divulge information at any chosen privacy level (P-level), on any attribute. Data collected at a higher P-level is divulged at a higher conceptual level and is therefore more private. Providing guarantees prior to release, such as satisfying k-anonymity (Samarati 2001; Sweeney 2002), can further protect the collected data set from privacy breaches caused by linking the released data set with external data sets. However, the data mining process, which involves the integration of various data values, can itself constitute a privacy breach if combinations of attributes at certain P-levels allow the inference of knowledge that exists at a lower P-level. This paper describes this P-level reduction phenomenon and proposes methods to identify and control its occurrence.
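As a rough illustration of divulging a value at a chosen P-level, the sketch below climbs a concept hierarchy one step per level. The hierarchy contents, the `PARENT` mapping, the `generalize` function, and the convention that P-level 0 is the raw value are all assumptions made for this example and are not taken from the paper.

```python
# Hypothetical concept hierarchy for a "location" attribute.
# Each concept maps to its parent; the root ("ANY") has no parent.
PARENT = {
    "Ballarat": "Victoria",
    "Melbourne": "Victoria",
    "Victoria": "Australia",
    "Australia": "ANY",
}


def generalize(value, p_level, parent=PARENT):
    """Return `value` generalized `p_level` steps up the hierarchy.

    Assumes P-level 0 means the raw value and each additional level
    is one step toward the root; climbing stops at the root.
    """
    for _ in range(p_level):
        if value not in parent:
            break  # already at the root concept
        value = parent[value]
    return value


if __name__ == "__main__":
    for level in range(4):
        print(level, generalize("Ballarat", level))
```

Under these assumptions, a provider who chooses a higher P-level for an attribute releases a more general concept, which is the sense in which a higher P-level gives more privacy.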
Cite as: Williams, A. and Barker, K. (2007). Controlling Inference: Avoiding P-level Reduction during Analysis. In Proc. Fifth Australasian Information Security Workshop (Privacy Enhancing Technologies) (AISW 2007), Ballarat, Australia. CRPIT, 68. Brankovic, L. and Steketee, C., Eds. ACS. 193-200.