TableGAN-MCA: Evaluating Membership Collisions of GAN-Synthesized Tabular Data Releasing

Hu, A.; Xie, R.; Lu, Z.; Hu, A.; Xue, M.

doi:10.1145/3460120.3485251

dc.contributor.author	Hu, A.
dc.contributor.author	Xie, R.
dc.contributor.author	Lu, Z.
dc.contributor.author	Hu, A.
dc.contributor.author	Xue, M.
dc.contributor.conference	ACM SIGSAC Conference on Computer and Communications Security (15 Nov 2021 - 19 Nov 2021 : Virtual Online (Republic of Korea))
dc.date.issued	2021
dc.description.abstract	Publishing tables synthesized by Generative Adversarial Networks (GANs) lets people learn insights privately, without access to the private table. However, existing studies on Membership Inference (MI) attacks show promising results in disclosing the membership of the training datasets of GAN-synthesized tables. Unlike those works, which focus on discovering the membership of a given data point, in this paper we propose a novel Membership Collision Attack against GANs (TableGAN-MCA), which allows an adversary given only synthetic entries randomly sampled from a black-box generator to recover part of the GAN training data. In other words, a GAN-synthesized table that is immune to state-of-the-art MI attacks is still vulnerable to TableGAN-MCA. The success of TableGAN-MCA is boosted by the observation that GAN-synthesized tables potentially collide with the training data of the generator. Our experimental evaluations of TableGAN-MCA yield five main findings. First, TableGAN-MCA has a satisfying training data recovery rate on three commonly used real-world datasets against four generative models. Second, several factors, including the size of the GAN training data, the number of GAN training epochs and the number of synthetic samples available to the adversary, are positively correlated with the success of TableGAN-MCA. Third, highly frequent data points have high risks of being recovered by TableGAN-MCA. Fourth, some unique data are exposed to unexpectedly high recovery risks in TableGAN-MCA, which may be attributable to the GAN's generalization. Fifth, as expected, differential privacy, which does not consider the correlations between features, does not show a commendable mitigation effect against TableGAN-MCA. Finally, we propose two mitigation methods and show promising privacy and utility trade-offs when protecting against TableGAN-MCA.
dc.description.statementofresponsibility	Aoting Hu, Renjie Xie, Zhigang Lu, Aiqun Hu, Minhui Xue
dc.identifier.citation	Proceedings of the ACM Conference on Computer and Communications Security, 2021, pp.2096-2112
dc.identifier.doi	10.1145/3460120.3485251
dc.identifier.isbn	9781450384544
dc.identifier.issn	1543-7221
dc.identifier.orcid	Xue, M. [0000-0001-5411-5039] [0000-0002-9172-4252]
dc.identifier.uri	https://hdl.handle.net/2440/135640
dc.language.iso	en
dc.publisher	Association for Computing Machinery (ACM)
dc.publisher.place	New York, NY, United States
dc.relation.grant	http://purl.org/au-research/grants/arc/DP210102670
dc.rights	© 2021 Association for Computing Machinery.
dc.source.uri	https://dl.acm.org/doi/proceedings/10.1145/3460120
dc.subject	Security and privacy; Computing methodologies; Machine learning
dc.title	TableGAN-MCA: Evaluating Membership Collisions of GAN-Synthesized Tabular Data Releasing
dc.type	Conference paper
pubs.publication-status	Published

Collections

Computer Science publications
