Synthesizing differentially private datasets using random mixing

The goal of differentially private data publishing is to release a modified dataset that preserves privacy while still allowing for efficient learning. We propose a new data publishing algorithm in which the released dataset is formed by mixing l randomly chosen data points and then perturbing the mixture with additive noise. Our privacy analysis shows that as l increases, noise with smaller variance suffices to achieve a target privacy level. To quantify the usefulness of our algorithm, we use the accuracy of a predictive model trained on our synthetic dataset, which we call the utility of the dataset. By characterizing the utility of our dataset as a function of l, we show that one can learn both linear and nonlinear predictive models that yield reasonably good prediction accuracy. In particular, we show that there exists a sweet spot for l that maximizes the prediction accuracy given a required privacy level, or vice versa. We also demonstrate that, for a given target privacy level, our datasets can achieve higher utility than datasets generated with existing data publishing algorithms.
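The mix-then-perturb step described in the abstract can be illustrated with a minimal sketch. Several things here are assumptions rather than details taken from the paper: "mixing" is interpreted as averaging l distinct points sampled without replacement, the additive noise is taken to be Gaussian, and the function name mix_and_perturb and its parameters (num_outputs, sigma, seed) are hypothetical. The noise scale sigma is a free parameter and is not calibrated to any privacy level; the abstract does not give that calibration.

```python
import random


def mix_and_perturb(data, l, num_outputs, sigma, seed=None):
    """Return num_outputs synthetic points.

    Each point is the mean of l randomly chosen rows of `data`
    plus independent zero-mean Gaussian noise (assumed noise model).
    """
    if not 1 <= l <= len(data):
        raise ValueError("l must be between 1 and len(data)")
    rng = random.Random(seed)
    dim = len(data[0])
    synthetic = []
    for _ in range(num_outputs):
        # Assumption: choose l distinct points, without replacement.
        chosen = rng.sample(data, l)
        # Assumption: "mixing" means taking the coordinate-wise average.
        mixed = [sum(row[j] for row in chosen) / l for j in range(dim)]
        # sigma is NOT derived from a privacy budget in this sketch.
        synthetic.append([v + rng.gauss(0.0, sigma) for v in mixed])
    return synthetic


if __name__ == "__main__":
    points = [[0.0, 2.0], [2.0, 4.0], [4.0, 6.0], [6.0, 8.0]]
    for row in mix_and_perturb(points, l=2, num_outputs=3, sigma=0.1, seed=0):
        print(row)
```

Averaging over more points reduces how much any single record can influence a released point, which is consistent with the abstract's claim that larger l permits smaller noise variance; the sketch itself does not verify any privacy guarantee.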
Publisher
Proceedings of the IEEE International Symposium on Information Theory
Issue Date
2019-07-05
Language
English
Citation

Proceedings of the IEEE International Symposium on Information Theory (ISIT), pp. 542-546

DOI
10.1109/ISIT.2019.8849381
URI
http://hdl.handle.net/10203/269354
Appears in Collection
EE-Conference Papers (Academic Conference Papers)