adap/flower

Add Flower Baseline: FedDebug

warisgill opened this issue · 4 comments

Paper

Waris Gill, Ali Anwar, Muhammad Ali Gulzar (2023). FedDebug: Systematic Debugging for Federated Learning Applications

Link

https://dl.acm.org/doi/pdf/10.1109/ICSE48619.2023.00053

Maybe give motivations about why the paper should be implemented as a baseline.

FedDebug introduces the first testing and debugging technique for Federated Learning (FL) applications. It addresses the challenge of identifying faulty client models during aggregation on the server, which has no test data. The paper proposes differential testing for FL: it localizes a faulty client by comparing neuron activations across client models, which eliminates the need for real-world test data. FedDebug is compatible with any variant of FedAvg and requires no changes to existing aggregation protocols. For more detail, please read the FedDebug paper.
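To make the idea of comparing neuron activations concrete, here is a minimal, simplified sketch. It is not the algorithm from the paper and not Flower code: the single-layer "models", the majority-vote scoring, and every name in it (`activation_pattern`, `localize_faulty_client`, the toy weights) are assumptions made purely for illustration. It feeds random inputs to each client model and flags the client whose neurons most often disagree with the majority.

```python
import random


def activation_pattern(weights, x):
    """Return 1/0 per neuron: does a single ReLU layer fire on input x?"""
    return [1 if sum(w * xi for w, xi in zip(row, x)) > 0 else 0 for row in weights]


def localize_faulty_client(client_weights, num_inputs=50, seed=0):
    """Score each client by how often its neurons disagree with the majority."""
    rng = random.Random(seed)
    n_in = len(client_weights[0][0])
    disagreement = [0] * len(client_weights)
    for _ in range(num_inputs):
        # Random synthetic input: no real test data is needed on the server.
        x = [rng.uniform(-1, 1) for _ in range(n_in)]
        patterns = [activation_pattern(w, x) for w in client_weights]
        for j in range(len(patterns[0])):
            votes = sum(p[j] for p in patterns)
            majority = 1 if 2 * votes > len(patterns) else 0
            for c, p in enumerate(patterns):
                if p[j] != majority:
                    disagreement[c] += 1
    suspect = max(range(len(client_weights)), key=lambda c: disagreement[c])
    return suspect, disagreement


# Toy setup (hypothetical): five clients share nearly identical weights,
# except client 2, whose weights are replaced with unrelated random values.
rng = random.Random(42)
base = [[rng.uniform(-1, 1) for _ in range(4)] for _ in range(8)]
clients = [[[w + rng.gauss(0, 0.01) for w in row] for row in base] for _ in range(5)]
clients[2] = [[rng.uniform(-1, 1) for _ in range(4)] for _ in range(8)]

suspect, scores = localize_faulty_client(clients)
print(suspect, scores)
```

Because the scoring only reads each client's weights, a check like this can sit alongside aggregation without modifying it, which matches the compatibility property described above. A real implementation would operate on full CNNs and follow the procedure in the paper.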

Is there something else you want to add?

FedDebug was presented at ICSE 2023, a prestigious software engineering conference. Dr. Nicholas Lane mentioned our paper in a tweet, suggesting that it be ported to Flwr. https://twitter.com/niclane7/status/1622865906874388484

We also published FedDefender, a tool for detecting backdoor attacks in Federated Learning. It is already implemented in Flwr (GitHub link), and I plan to open an issue to include FedDefender as a baseline as well. The advantage of these papers is that they require no changes to aggregation techniques and work with any existing fusion method.

Implementation

To implement this baseline, it is recommended to complete the following items in order:

For first time contributors

Prepare - understand the scope

  • Read the paper linked above
  • Decide which experiments you'd like to reproduce. The more the better!
  • Follow the steps outlined in Add a new Flower Baseline.
  • You can use other baselines that the community merged following those steps as a reference.

Verify your implementation

  • Follow the steps indicated in the EXTENDED_README.md that was created in your baseline directory
  • Ensure your code reproduces the results for the experiments you chose
  • Ensure your README.md can be followed by someone who is not familiar with your code. Are all step-by-step instructions clear?
  • Ensure the formatting and typing tests for your baseline run without errors.
  • Clone your repo into a new directory, follow the guide in your own README.md, and verify everything runs.

Hey @warisgill, thanks for opening the issue. Having FedDebug would be great! The process to create a baseline is simple: just follow the steps in the Contributing a new baseline section. Or, if you prefer a bit more guidance, you'll find a step-by-step guide in our documentation.

Usually baselines implement at least the "core results" of the paper, since those are often the most useful for people who want to reproduce the paper's results or compare them against other baselines. Which plots/tables are you thinking of reproducing with your baseline in Flower?

Hi @jafermarq, thank you so much. Table 2 and Figure 10 contain the core results of the paper. So, I will first implement the FedDebug technique in a generic way so that anyone can use it with any fusion technique and any CNN architecture. Then I will reproduce the Table 2 and Figure 10 results. If you have any suggestions, please share them with me. Thanks a lot.

Hi @warisgill, what you propose makes sense. Let me know when you'd like me to review your pull request :)

Hi @jafermarq, I have created the PR and closely followed the guidelines. Please let me know if there is anything I need to fix. Thank you so much.