Volume 18, No. 12
PrivEval: a tool for interactive evaluation of privacy metrics in synthetic data generation
Abstract
Synthetic data generation (SDG) is the process of generating a new synthetic dataset based on the statistical properties of an existing confidential dataset. Differential privacy is a property of an SDG mechanism that establishes how well protected the individuals whose sensitive data is part of the confidential dataset are when such data is shared. To make an SDG mechanism differentially private, noise is injected into the statistics learned from the dataset. The amount of noise injected determines a trade-off between privacy and utility. Privacy is then measured via a set of privacy metrics, each of which usually establishes a lower bound on only a few aspects of the privacy-utility trade-off. Therefore, it is not possible to assess privacy based on a single metric. To close this gap, we demonstrate PrivEval, a tool to assist users in evaluating the privacy properties of a synthetic dataset. PrivEval implements several privacy metrics and validates them both for a single user and for the overall dataset. In addition, PrivEval checks the assumptions behind each metric. Hence, PrivEval is a first step toward bridging the gap between privacy experts and the general public and making privacy estimation more transparent.
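As an illustration of the noise injection and the privacy-utility trade-off described in the abstract, the following minimal sketch adds Laplace noise to a histogram of counts. It is not taken from PrivEval: the function names, the example counts, and the choice of the Laplace mechanism with sensitivity 1 are assumptions made for illustration only.

```python
import random


def laplace_noise(scale, rng):
    # The difference of two i.i.d. exponentials with mean `scale`
    # follows a Laplace(0, scale) distribution.
    return rng.expovariate(1.0 / scale) - rng.expovariate(1.0 / scale)


def noisy_counts(counts, epsilon, rng):
    # Assumption: adding or removing one record changes a single count
    # by 1, so the sensitivity is 1 and the noise scale is 1 / epsilon.
    scale = 1.0 / epsilon
    return [c + laplace_noise(scale, rng) for c in counts]


def mean_abs_error(counts, epsilon, trials=2000, seed=0):
    # Average absolute deviation of the noisy counts from the true
    # counts; a simple (hypothetical) proxy for utility loss.
    rng = random.Random(seed)
    total = 0.0
    for _ in range(trials):
        noisy = noisy_counts(counts, epsilon, rng)
        total += sum(abs(n - c) for n, c in zip(noisy, counts)) / len(counts)
    return total / trials


if __name__ == "__main__":
    counts = [120, 45, 30, 5]  # hypothetical histogram of a sensitive attribute
    for eps in (0.1, 1.0, 10.0):
        print(f"epsilon={eps}: mean absolute error = {mean_abs_error(counts, eps):.2f}")
```

In this sketch a smaller epsilon (stronger privacy) yields a larger average error in the released statistics, while a larger epsilon yields a smaller error but a weaker guarantee.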
PVLDB is part of the VLDB Endowment Inc.