Model inversion attacks (MIAs) aim to recover private data from a target model's training set, posing a threat to the privacy of deep learning models. MIAs have primarily focused on the white-box scenario, in which the attacker has full access to the structure and parameters of the target model. In practice, however, deployed models are black-box: adversaries cannot easily obtain model-related parameters, and many models output only predicted labels. Existing black-box MIAs have concentrated on designing optimization strategies, while the generative model is simply migrated from the GANs used in white-box MIAs. To the best of our knowledge, our research is the first study of a feasible attack model in the label-only black-box scenario. In this paper, we develop a novel MIA method that uses a conditional diffusion model to recover precise samples of the target without any extra optimization, as long as the target model outputs labels. Two primary techniques are introduced to execute the attack. First, an auxiliary dataset relevant to the target model's task is selected, and the labels predicted by the target model serve as conditions to guide the training process. Second, target labels and random standard normally distributed noise are fed into the trained conditional diffusion model, which generates target samples under a pre-defined guidance strength. We then select the most robust and representative samples. Furthermore, we are the first to propose Learned Perceptual Image Patch Similarity (LPIPS) as an evaluation metric for MIA, enabling systematic quantitative and qualitative evaluation in terms of attack accuracy, realism, and similarity. Experimental results show that this method can generate data similar and accurate to the target without optimization, and that it outperforms the generators of previous approaches in the label-only scenario.
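To make the two attack steps concrete, here is a minimal PyTorch sketch of (1) relabeling a task-relevant auxiliary dataset with the target model's predicted labels and (2) sampling from the trained conditional diffusion model with a pre-defined guidance strength. It assumes a standard DDPM linear noise schedule and classifier-free-guidance-style conditioning; the names `target_model`, `aux_loader`, and `denoiser`, and all hyperparameters, are illustrative placeholders, not the paper's actual implementation.

```python
# Hypothetical sketch of the label-only attack pipeline (not the paper's code).
import torch

# Standard DDPM linear beta schedule (an assumption; the abstract does not fix one).
T = 1000
betas = torch.linspace(1e-4, 0.02, T)
alphas = 1.0 - betas
alpha_bars = torch.cumprod(alphas, dim=0)

@torch.no_grad()
def relabel_auxiliary_data(target_model, aux_loader):
    """Step 1: query the label-only target model on an auxiliary dataset;
    its predicted labels become the conditions for diffusion training."""
    images, labels = [], []
    for x, _ in aux_loader:                    # ground-truth labels are discarded
        pred = target_model(x).argmax(dim=1)   # label-only black-box access
        images.append(x)
        labels.append(pred)
    return torch.cat(images), torch.cat(labels)

@torch.no_grad()
def sample_with_guidance(denoiser, target_label, w, shape):
    """Step 2: generate target samples from standard normal noise.
    `denoiser(x, t, y)` predicts the noise; y=None selects the
    unconditional branch, and `w` is the pre-defined guidance strength."""
    x = torch.randn(shape)                                     # N(0, I) input
    y = torch.full((shape[0],), target_label, dtype=torch.long)
    for t in reversed(range(T)):
        tb = torch.full((shape[0],), t, dtype=torch.long)
        # Classifier-free guidance: blend conditional and unconditional noise.
        eps = (1 + w) * denoiser(x, tb, y) - w * denoiser(x, tb, None)
        # Standard DDPM reverse step using the guided noise estimate.
        coef = betas[t] / torch.sqrt(1.0 - alpha_bars[t])
        x = (x - coef * eps) / torch.sqrt(alphas[t])
        if t > 0:
            x = x + torch.sqrt(betas[t]) * torch.randn_like(x)
    return x
```

After sampling, the abstract's filtering step would keep only the most robust and representative generations, e.g. those the target model itself classifies as the target label with high consistency.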
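For the proposed LPIPS metric, the widely used `lpips` PyPI package can compute perceptual distance directly; lower values mean higher perceptual similarity. The pairing of generated samples with private reference images below is our assumption for illustration, not a protocol specified in the abstract.

```python
# Evaluation sketch using the `lpips` package (pip install lpips).
import torch
import lpips

loss_fn = lpips.LPIPS(net="alex")  # AlexNet backbone, the package's common choice

def lpips_distance(reconstructed, private_reference):
    """Both tensors: (N, 3, H, W), scaled to [-1, 1] as LPIPS expects.
    Returns the mean perceptual distance over the batch."""
    with torch.no_grad():
        return loss_fn(reconstructed, private_reference).mean().item()

# Hypothetical usage: compare generated samples against target-class exemplars.
# fake = sample_with_guidance(denoiser, target_label=3, w=2.0,
#                             shape=(8, 3, 64, 64)).clamp(-1, 1)
# print(lpips_distance(fake, real_batch))
```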