UIFV: Data Reconstruction Attack in Vertical Federated Learning

Jirui Yang, Peng Chen, Zhihui Lu, Qiang Duan, Yubing Bao · June 18, 2024

Summary

The paper investigates privacy risks in Vertical Federated Learning (VFL), with a focus on data reconstruction attacks. Traditional gradient-based methods are limited in VFL due to their dependence on model details. The authors introduce the Unified InverNet Framework (UIFV), a novel approach that uses intermediate feature data for data reconstruction without reliance on gradients or specific model structures. Experiments on four datasets (Bank, Adult, Credit, and CIFAR10) show that UIFV significantly outperforms existing techniques, revealing vulnerabilities in VFL systems. The study evaluates different attack scenarios, including Query Attack, Data Passive Attack, and Stealth Attack, and compares the method with state-of-the-art techniques. It also explores defense mechanisms, emphasizing the need for robust privacy protection in collaborative learning environments. The research highlights the trade-offs between privacy and utility in vertical federated learning.

Paper digest

What problem does the paper attempt to solve? Is this a new problem?

The paper addresses data reconstruction attacks in Vertical Federated Learning (VFL). In such an attack, an adversary reconstructs original training data by exploiting the internal parameters of the machine learning models used in a federated learning setting. The paper examines several approaches to reconstructing data in a VFL architecture, including model information-based reconstruction, feature-based methods, and shadow model training. Data reconstruction attacks in federated learning are not an entirely new problem; the paper's contribution is a new framework that makes such attacks more effective and stealthier.


What scientific hypothesis does this paper seek to validate?

The paper seeks to validate, through extensive experiments, that the proposed UIFV attack framework is effective in Vertical Federated Learning (VFL). The study measures the attack accuracy of the proposed method and compares it with state-of-the-art methods to show that it is more effective. It also aims to clarify the security implications of data reconstruction attacks in VFL, particularly with respect to model inversion attacks and feature inference attacks.


What new ideas, methods, or models does the paper propose? What are the characteristics and advantages compared to previous methods?

The paper covers the following ideas, methods, and models in the context of data reconstruction attacks in Vertical Federated Learning (VFL):

  • GRN and GIA Baselines: The paper discusses two existing methods that it later compares against. GRN (Generative Regression Network) aims to recover the passive party's data in VFL by using the passive party's model and parameters. GIA (Gradient Inference Attack), an improvement over GRN, builds a shadow model from a small amount of known auxiliary data and confidence scores, emulating the real passive party's model so that data can be reconstructed without direct access to it.
  • Defense Strategies: The paper discusses defenses against data reconstruction attacks, such as differential privacy and purification. Differential privacy adds noise to gradients during training to protect individual records, while purification reduces how much an attacker can infer from model outputs.
  • UIFV Attack Framework: The paper introduces the UIFV attack framework, which covers four data reconstruction attack scenarios in VFL. Extensive experiments show higher attack accuracy than state-of-the-art methods.
  • Experimental Results: The paper reports detailed results on UIFV's reconstruction quality, using Peak Signal-to-Noise Ratio (PSNR) and Structural Similarity Index (SSIM) to score image reconstructions. Across the scenarios tested, UIFV outperforms existing methods such as GRN and GIA.

Compared with previous data reconstruction attacks, the Unified InverNet Framework in Vertical Federated Learning (UIFV) has the following characteristics and advantages:

  • Novel Approach: UIFV departs from traditional attack strategies by using the intermediate features of the target model instead of gradients or model information. It trains an InverNet to extract original data from those intermediate features, which enables effective data reconstruction in various black-box scenarios.
  • Flexibility and Applicability: UIFV is a more flexible and effective means of data reconstruction in complex VFL environments. It overcomes the limitations of GRN and GIA, which require model access or impose specific model requirements, and it is designed to apply across various black-box attack scenarios.
  • Higher Attack Effectiveness: Extensive experiments show that UIFV achieves higher attack accuracy than state-of-the-art methods. It surpasses 96% accuracy in scenarios such as the Query Attack (QA). In the Data Passive Attack (DPA) scenario, UIFV achieved a PSNR of 25.61 and an SSIM of 0.89, significantly outperforming the compared methods.
  • Privacy Risks Awareness: The research highlights the significant privacy risks faced by VFL systems and the urgent need for robust defenses against data reconstruction attacks. Because UIFV targets attack scenarios that arise in practical VFL deployments, it exposes threats that such defenses must address.
  • Experimental Validation: Detailed experiments and ablation studies confirm UIFV's effectiveness. It preserves image quality in its reconstructions and outperforms existing methods such as GRN and GIA, as measured by PSNR and SSIM.
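PSNR, one of the two image metrics cited above, follows the standard definition 10 * log10(MAX^2 / MSE). The sketch below is a minimal illustration, assuming pixel values normalized to [0, 1]; the function name `psnr` and the toy pixel values are hypothetical and are not taken from the paper.

```python
import math

def psnr(original, reconstructed, max_value=1.0):
    """Peak Signal-to-Noise Ratio in dB between two equal-length pixel sequences."""
    if len(original) != len(reconstructed):
        raise ValueError("inputs must have the same number of pixels")
    mse = sum((a - b) ** 2 for a, b in zip(original, reconstructed)) / len(original)
    if mse == 0:
        return math.inf  # identical inputs have no reconstruction error
    return 10.0 * math.log10(max_value ** 2 / mse)

# Hypothetical 4-pixel "image" whose reconstruction is off by 0.1 at every pixel.
target = [0.0, 0.5, 1.0, 0.25]
attack_output = [0.1, 0.6, 0.9, 0.35]
score = psnr(target, attack_output)
print(f"PSNR: {score:.2f} dB")
```

A higher PSNR means the reconstruction is closer to the original, so from the defender's point of view a high value signals a more damaging attack.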

Does any related research exist? Who are the noteworthy researchers on this topic in this field? What is the key to the solution mentioned in the paper?

Related research exists on Vertical Federated Learning (VFL) and data reconstruction attacks. The researchers named here are the paper's own authors, Jirui Yang, Peng Chen, Zhihui Lu, Qiang Duan, and Yubing Bao, who have contributed to studies on privacy risks, data leakage, and data reconstruction in VFL.

The key to the solution is the Unified InverNet Framework (UIFV), a novel and flexible approach to exposing privacy risks in VFL. The framework uses the intermediate feature data exchanged between participants during the inference phase of VFL to reconstruct original data, without relying on gradients or specific model structures. UIFV significantly outperforms state-of-the-art techniques in attack precision, which underlines the need for stronger privacy protection in VFL systems.
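The core idea, learning to map intermediate features back to raw inputs, can be sketched in a few lines. This is a toy illustration under stated assumptions, not the paper's implementation: the attacker is assumed to have black-box query access to the passive party's bottom model and a small auxiliary dataset, the bottom model is a hypothetical affine map, and a linear model trained with plain SGD stands in for the InverNet. All names (`bottom_model`, `invernet`, `W`, `B`) are made up for the example.

```python
import random

random.seed(0)

# Hypothetical passive-party bottom model: a fixed affine map from 2 private
# features to 2 intermediate features. The attacker never reads W or B.
W = [[1.0, 0.5], [-0.5, 1.0]]
B = [0.1, -0.2]

def bottom_model(x):
    """Black-box query: private input -> intermediate features."""
    return [sum(w * xi for w, xi in zip(row, x)) + b for row, b in zip(W, B)]

# Auxiliary records the attacker is assumed to hold, paired with queried features.
aux_inputs = [[random.random(), random.random()] for _ in range(200)]
aux_pairs = [(bottom_model(x), x) for x in aux_inputs]

# Stand-in InverNet: x_hat = A h + c, trained with SGD on squared error.
A = [[0.0, 0.0], [0.0, 0.0]]
c = [0.0, 0.0]

def invernet(h):
    """Map intermediate features back to an estimate of the raw input."""
    return [sum(a * hi for a, hi in zip(row, h)) + ci for row, ci in zip(A, c)]

learning_rate = 0.1
for _ in range(100):  # epochs over the auxiliary pairs
    random.shuffle(aux_pairs)
    for h, x in aux_pairs:
        x_hat = invernet(h)
        for i in range(2):
            err = x_hat[i] - x[i]
            for j in range(2):
                A[i][j] -= learning_rate * err * h[j]
            c[i] -= learning_rate * err

# Attack phase: only the intermediate features of a target record are observed.
target = [0.3, 0.8]
observed = bottom_model(target)
reconstructed = invernet(observed)
max_error = max(abs(r - t) for r, t in zip(reconstructed, target))
print(reconstructed, max_error)
```

Because the toy bottom model is invertible and noise-free, the stand-in InverNet recovers the target almost exactly. A real bottom model is nonlinear and lossy, so a neural InverNet would only approximate the inverse.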


How were the experiments in the paper designed?

The experiments were designed to evaluate UIFV for data reconstruction attacks in VFL environments and to compare it with state-of-the-art methods. They include detailed reconstruction experiments on the CIFAR10 image dataset, scored with Peak Signal-to-Noise Ratio (PSNR) and Structural Similarity Index (SSIM). The authors also implemented representative data reconstruction defenses within the VFL framework, such as Differential Privacy and Gaussian noise addition, to test UIFV's robustness. These experiments span four attack scenarios and four datasets to measure how the defenses affect attack effectiveness.
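A Gaussian noise defense of the kind mentioned above can be illustrated as follows. This is a minimal sketch that assumes the noise is added to the intermediate features before they leave the passive party; the digest does not say where the paper applies it. The function name `add_gaussian_noise`, the feature values, and the noise scales are hypothetical.

```python
import random
import statistics

def add_gaussian_noise(features, sigma, rng):
    """Perturb each intermediate feature with zero-mean Gaussian noise."""
    return [f + rng.gauss(0.0, sigma) for f in features]

rng = random.Random(42)
clean = [0.85, 0.05, -0.30, 1.20]  # hypothetical intermediate features

for sigma in (0.0, 0.1, 1.0):
    noisy = add_gaussian_noise(clean, sigma, rng)
    distortion = statistics.fmean([(n - c) ** 2 for n, c in zip(noisy, clean)])
    print(f"sigma={sigma}: mean squared distortion = {distortion:.4f}")
```

A larger `sigma` gives an attacker less to work with, but it also degrades the features that the legitimate top model depends on, which is the privacy and utility trade-off the digest refers to.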


What is the dataset used for quantitative evaluation? Is the code open source?

The quantitative evaluation uses four datasets: Bank, Adult, Credit, and CIFAR10. They range from bank marketing analysis to image recognition, each with its own features and challenges. The provided context does not say whether the code used in the study is open source.


Do the experiments and results in the paper provide good support for the scientific hypotheses that need to be verified? Please analyze.

The experiments and results provide strong support for the hypotheses under test. The Unified InverNet Framework (UIFV) is a novel and flexible approach to data reconstruction in Vertical Federated Learning (VFL), and the experiments show that it significantly outperforms state-of-the-art techniques in attack precision. The study exposes severe privacy vulnerabilities in VFL systems, confirming the need for stronger privacy protection in VFL architectures.

The experimental results show high attack effectiveness, surpassing 96% in overall, discrete, and continuous accuracy on several datasets, including Bank and Adult. On the Bank dataset, UIFV reached a continuous accuracy of 96.4% ± 0.2. The CIFAR10 image experiments, scored with PSNR and SSIM, further confirm UIFV's effectiveness in data reconstruction.
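The digest does not define overall, discrete, and continuous accuracy. The sketch below shows one plausible way to compute such scores for tabular reconstructions. It assumes that a discrete feature counts as correct on an exact match and a normalized continuous feature counts as correct when it falls within a tolerance. Both rules, the tolerance value, and the function name `reconstruction_accuracy` are assumptions made for illustration, not the paper's definitions.

```python
def reconstruction_accuracy(originals, reconstructions, discrete_idx, tolerance=0.05):
    """Score reconstructed records per feature type (assumed definitions)."""
    discrete_hits = discrete_total = 0
    continuous_hits = continuous_total = 0
    for orig, recon in zip(originals, reconstructions):
        for j, (o, r) in enumerate(zip(orig, recon)):
            if j in discrete_idx:
                discrete_total += 1
                discrete_hits += int(o == r)  # exact category match
            else:
                continuous_total += 1
                continuous_hits += int(abs(o - r) <= tolerance)  # close enough
    total = discrete_total + continuous_total
    nan = float("nan")
    return {
        "overall": (discrete_hits + continuous_hits) / total if total else nan,
        "discrete": discrete_hits / discrete_total if discrete_total else nan,
        "continuous": continuous_hits / continuous_total if continuous_total else nan,
    }

# Hypothetical records: [category code, normalized age, normalized balance].
originals = [[1, 0.40, 0.10], [0, 0.75, 0.55]]
reconstructions = [[1, 0.42, 0.30], [1, 0.74, 0.56]]
scores = reconstruction_accuracy(originals, reconstructions, discrete_idx={0})
print(scores)
```

Reporting the two feature types separately makes it visible when an attack recovers categorical fields well but numeric fields poorly, or the reverse.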

Overall, the experiments provide concrete evidence for the paper's hypotheses. The results validate the effectiveness of UIFV for data reconstruction in VFL and highlight the critical privacy risks and vulnerabilities of current VFL systems.


What are the contributions of this paper?

The paper's author contribution statement credits the authors as follows:

  • Jirui Yang: Methodology and Writing - original draft.
  • Peng Chen: Methodology and Writing - review and editing.
  • Zhihui Lu: Methodology, Supervision, and Writing - review and editing.
  • Qiang Duan: Methodology and Writing - review and editing.
  • Yubing Bao: Formal analysis and Data curation.

What work can be continued in depth?

Research on data reconstruction attacks in vertical federated learning can be extended in several directions:

  • Exploring Different Attack Methods: Future studies can develop novel attack methods beyond the existing gradient-based, model information-based, and feature-based approaches.
  • Enhancing Reconstruction Accuracy: Researchers can study how the size of the auxiliary dataset affects reconstruction accuracy, since larger auxiliary datasets have been shown to improve it.
  • Evaluation on Various Datasets: Evaluating on a wider range of datasets than the Bank, Adult, Credit, and CIFAR10 datasets used in this study would give a more complete picture of how effective data reconstruction attacks are across data domains.
  • Security and Privacy Considerations: Future work can explore the security boundaries of data reconstruction methods in order to better protect sensitive data in vertical federated learning settings.
  • Performance Optimization: Researchers can refine the attack algorithms and methodologies to improve both accuracy and efficiency.

Outline

  • Introduction
      • Background
          • Evolution of Federated Learning
          • Challenges in Vertical Federated Learning (VFL)
      • Objective
          • To explore privacy risks in VFL, specifically data reconstruction attacks
          • Introduce the Unified InverNet Framework (UIFV) as a novel solution
  • Method
      • Data Collection
          • Selection of datasets: Bank, Adult, Credit, and CIFAR10
          • Data partitioning for VFL scenarios
      • Data Preprocessing
          • Preparation of intermediate feature data
          • Handling confidentiality in VFL settings
      • Unified InverNet Framework (UIFV)
          • Design
              • Independence from model gradients and structures
          • Performance Evaluation
              • Reconstruction accuracy compared to existing techniques
  • Attack Scenarios
      • Query Attack
          • Description and implementation
          • Impact on privacy in VFL
      • Data Passive Attack
          • Methodology and results
          • Demonstrating vulnerabilities in VFL systems
      • Stealth Attack
          • Stealthy nature and effectiveness
          • Comparison with previous attack methods
  • Defense Mechanisms
      • Exploration of countermeasures
      • Importance of robust privacy protection
  • Utility vs. Privacy Trade-offs
      • Analysis of the balance in VFL
      • Practical implications for collaborative learning environments
  • Conclusion
      • Summary of findings
      • Future research directions in privacy-enhanced VFL
