Detection of Membership Inference Attacks on GAN Models

Ekramifard, Ala; Amintoosi, Haleh; Hosseini Seno, Seyed Amin

doi:10.22042/isecure.2024.461639.1131

Document Type: Research Article

Authors

Ala Ekramifard

Haleh Amintoosi

Seyed Amin Hosseini Seno

Computer Engineering Department, Faculty of Engineering, Ferdowsi University of Mashhad, Mashhad, Iran

Abstract

Generative Adversarial Networks (GANs) generate synthetic data whose distribution closely mirrors that of real datasets. This paper examines the privacy concerns associated with GANs, focusing on Membership Inference Attacks (MIAs), which aim to determine whether a specific record was used to train a model. Such attacks pose significant privacy risks, especially when the training data is sensitive. To counter them, we propose a novel detector model designed to identify and thwart MIAs against GANs. The detector operates as an additional layer of protection for Machine Learning as a Service (MLaaS) providers and uses outputs from both the discriminator and the generator to ascertain the membership status of data samples. We introduce two variants of the detector, supervised and unsupervised, chosen according to the availability of information from the discriminator. The supervised detector is trained on labeled data, while the unsupervised detector uses anomaly detection techniques. Our experimental evaluation spans various GAN architectures and datasets to test the robustness and generalizability of the approach. The paper also analyzes the impact of dataset size on the detector's effectiveness. By integrating the detector, MLaaS providers can strengthen privacy safeguards while balancing model utility and data protection.
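The abstract does not say which anomaly detection technique the unsupervised detector uses. As an illustration only, and not the authors' method, the general idea of flagging samples whose model-derived score is anomalous can be sketched with a simple z-score rule. The function names, the threshold, and all score values below are hypothetical.

```python
from statistics import mean, stdev


def fit_reference(scores):
    """Estimate the mean and sample standard deviation of scores
    computed on data known NOT to be in the training set (assumed available)."""
    mu, sigma = mean(scores), stdev(scores)
    if sigma == 0:
        raise ValueError("reference scores must not all be identical")
    return mu, sigma


def flag_likely_members(query_scores, reference, z_threshold=2.0):
    """Flag a query when its score lies more than z_threshold standard
    deviations above the reference mean (an assumed, illustrative rule)."""
    mu, sigma = reference
    return [(s - mu) / sigma > z_threshold for s in query_scores]


# Hypothetical per-sample scores derived from GAN outputs, e.g. a
# discriminator confidence; the values are made up for illustration.
holdout_scores = [0.48, 0.52, 0.50, 0.47, 0.53, 0.49, 0.51, 0.50]
reference = fit_reference(holdout_scores)

query_scores = [0.51, 0.93, 0.49, 0.88]
flags = flag_likely_members(query_scores, reference)
print(flags)
```

In this toy setup, queries scoring far above the held-out reference are marked as likely training members. A supervised variant would instead train a classifier on scores labeled as member or non-member.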

Keywords

Machine Learning

Privacy

Generative Adversarial Network

Membership Inference Attacks
