A Differentially Private Decision Forest
Fletcher, S. and Islam, M.Z.
With the ubiquity of data collection in today’s society,
protecting each individual’s privacy is a growing concern.
Differential Privacy provides an enforceable definition
of privacy that allows data owners to promise
each individual that their presence in the dataset will
be almost undetectable. Data Mining techniques are
often used to discover knowledge in data; however,
these techniques are not differentially private by default.
In this paper, we propose a differentially private
decision forest algorithm that takes advantage of
a novel theorem for the local sensitivity of the Gini
Index. The Gini Index plays an important role in
building a decision forest, and the sensitivity of its
equation dictates how much noise needs to be added
to make the forest differentially private. We prove
that the Gini Index can have a substantially lower
sensitivity than that used in previous work, leading to
superior empirical results. We compare the prediction
accuracy of our decision forest to not only previous
work, but also to the popular Random Forest algorithm
to demonstrate how close our differentially private
algorithm can come to a completely non-private
forest.
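To make the abstract's point concrete, the sketch below shows how a sensitivity bound on the Gini Index translates into the scale of Laplace noise under epsilon-differential privacy. It is an illustrative assumption only: the sensitivity value of 0.5 and all function names are placeholders, not the local sensitivity theorem or algorithm from the paper.

import numpy as np

def gini_index(class_counts):
    """Gini Index of a node, given the count of records per class label."""
    total = sum(class_counts)
    if total == 0:
        return 0.0
    proportions = [c / total for c in class_counts]
    return 1.0 - sum(p * p for p in proportions)

def noisy_gini(class_counts, epsilon, sensitivity=0.5, rng=None):
    """Perturb the Gini Index with Laplace noise of scale sensitivity/epsilon.

    A tighter sensitivity bound (such as the local sensitivity derived in the
    paper) yields a smaller noise scale, and hence more reliable attribute
    scores for the same privacy budget epsilon.
    """
    rng = rng or np.random.default_rng()
    scale = sensitivity / epsilon
    return gini_index(class_counts) + rng.laplace(0.0, scale)

# Example: score one candidate split with privacy budget epsilon = 1.0
print(noisy_gini([30, 10], epsilon=1.0))

The key design point the abstract highlights is visible in the noise scale: halving the sensitivity halves the noise added to every split evaluation, which is where the claimed accuracy gains come from.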
Cite as: Fletcher, S. and Islam, M.Z. (2015). A Differentially Private Decision Forest. In Proc. Thirteenth Australasian Data Mining Conference (AusDM 2015), Sydney, Australia. CRPIT, 168. Ong, K.L., Zhao, Y., Stone, M.G. and Islam, M.Z. Eds., ACS. pp. 99-108.