Trusted-AI/adversarial-robustness-toolbox

Incorrect Documentation regarding attacks.poisoning

IsaiahHarvi opened this issue · 1 comments

Is your feature request related to a problem? Please describe.
The documentation for art.attacks.poisoning.SleeperAgentAttack states that poison returns the indices of the poisoned samples. However, it returns only x_train and y_train; there is no return value listing the indices of x_train that contain poisoned samples.

There is a get_poison_indices() function in the documentation. However, the docs give no explanation for it, and it seems redundant given the wording under the poison method.

Describe the solution you'd like
Update the documentation to be specific and include more information about the get_poison_indices() method.

Describe alternatives you've considered
My alternative is to return a third item from the poison method, best_indices_poison, while removing the get_poison_indices() method.

Ex:
return x_train, y_train, best_indices_poison

Additional context
https://adversarial-robustness-toolbox.readthedocs.io/en/latest/modules/attacks/poisoning.html#sleeper-agent-attack

Regarding the perceived discrepancy in the art.attacks.poisoning.SleeperAgentAttack documentation about returning the indices of poisoned samples: this may not be a problem in the functionality or the documentation itself, but rather a misunderstanding of the library's intended use and design.

The Adversarial Robustness Toolbox (ART) is designed to be modular and flexible, accommodating various use cases and methodologies within adversarial machine learning. The decision to separate the poison method from the get_poison_indices() method can be rationalized as follows:

Separation of Concerns: By design, the poison method focuses on the generation of poisoned samples, modifying x_train and y_train accordingly. This allows for a clear and focused functionality - generating poisoned data. The method's primary goal is to output poisoned data ready for training or analysis, not to track or manage indices of modifications.

Modularity: The get_poison_indices() function, although not explained in the documentation, likely serves a specialized purpose separate from the poisoning process itself. This could involve post-processing analysis, debugging, or specific research needs where knowing the exact indices of poisoned samples is crucial. Keeping this functionality separate enhances the toolbox's modularity, allowing users to opt in to additional functionality without complicating the core poisoning process.

Flexibility for Users: The current setup provides users with flexibility in how they handle poisoned data. Users who need the indices can call get_poison_indices() after poisoning, while those who don't need this information aren't forced to deal with an additional return value that might be irrelevant to their use case. This design choice respects the diverse needs of users and use cases.

Avoiding Redundancy and Confusion: Adding best_indices_poison as a third return value to the poison method, while removing get_poison_indices(), could streamline the process but at the cost of flexibility and modularity. It also introduces potential confusion for users who might not be interested in the indices, making the library seem more complex for simple poisoning tasks.
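The pattern described in the points above can be sketched with a toy class. This is only an illustration under stated assumptions: ToyPoisonAttack, its n_poison parameter, and the placeholder perturbation are hypothetical and are not ART code. The only things taken from this thread are that poison returns two values and that get_poison_indices() is called afterwards.

```python
import random


class ToyPoisonAttack:
    """Hypothetical stand-in for illustration; NOT ART's SleeperAgentAttack."""

    def __init__(self, n_poison, seed=0):
        self.n_poison = n_poison
        self._rng = random.Random(seed)
        self._indices = []

    def poison(self, x_train, y_train):
        # Choose which samples to modify and remember their positions.
        self._indices = sorted(self._rng.sample(range(len(x_train)), self.n_poison))
        x_out = list(x_train)
        for i in self._indices:
            x_out[i] = x_out[i] + 1.0  # placeholder perturbation, not a real attack
        # Only the data is returned; the indices stay on the object.
        return x_out, list(y_train)

    def get_poison_indices(self):
        # Separate accessor for callers that need the indices.
        return list(self._indices)


x_train = [0.0] * 10
y_train = [0, 1] * 5

attack = ToyPoisonAttack(n_poison=3)
x_poisoned, y_poisoned = attack.poison(x_train, y_train)
poison_indices = attack.get_poison_indices()

# The positions that changed are exactly the ones the accessor reports.
changed = [i for i, (a, b) in enumerate(zip(x_train, x_poisoned)) if a != b]
```

One consequence of this layout is that the accessor only reflects the most recent poison call, so it has to be read before the attack object is reused.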

Given these considerations, the current implementation and documentation structure might not be a defect but a deliberate design choice that caters to a broad range of use cases within the adversarial machine learning community. Improving the documentation to explain the role and usage of get_poison_indices() would still help clear up the misunderstanding without changing the library's architecture. This approach keeps the toolbox flexible and modular while addressing the need for clearer guidance on the available functionality.
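For users who prefer the three-value form proposed in the issue, a small wrapper on the caller's side is one possible option that leaves the library untouched. The sketch below is an assumption-laden illustration: poison_with_indices is a hypothetical helper, not part of ART, and _StubAttack exists only to exercise it. It assumes the attack object exposes poison (returning two values) and get_poison_indices(), as described in this thread, and passes all arguments through unchanged because the thread does not list them.

```python
def poison_with_indices(attack, *args, **kwargs):
    """Hypothetical helper: call attack.poison(...) and append the stored indices."""
    x_poisoned, y_poisoned = attack.poison(*args, **kwargs)
    return x_poisoned, y_poisoned, attack.get_poison_indices()


class _StubAttack:
    """Hypothetical stand-in used only to exercise the helper; not ART code."""

    def poison(self, x_train, y_train):
        self._indices = [1, 3]
        x_out = [v + 1.0 if i in self._indices else v for i, v in enumerate(x_train)]
        return x_out, list(y_train)

    def get_poison_indices(self):
        return list(self._indices)


x_p, y_p, idx = poison_with_indices(_StubAttack(), [0.0, 0.0, 0.0, 0.0], [0, 1, 0, 1])
```

Because the helper relies only on the two method names, it should work with any object that follows the same convention, provided poison really does return exactly two values.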