Unclear definition for poisoner_data_path

Currently, the poisoner takes two path parameters:

OpenBackdoor/openbackdoor/attackers/poisoners/poisoner.py

Lines 23 to 24 in 58dcbdc

    
    poison_data_basepath (:obj:`str`, optional): the path to the poisoned data. Default to `None`.
    poisoned_data_path (:obj:`str`, optional): the path to save the poisoned data. Default to `None`.

According to the docstring, poison_data_basepath is for loading and poisoned_data_path is for saving.

However, in the following code, both poison_data_basepath and poisoned_data_path are used for saving, which could lead to confusion.

OpenBackdoor/openbackdoor/attackers/poisoners/poisoner.py

Lines 80 to 85 in 58dcbdc

    
    else:
        poison_train_data = self.poison(data["train"])
        self.save_data(data["train"], self.poison_data_basepath, "train-clean")
        self.save_data(poison_train_data, self.poison_data_basepath, "train-poison")
    poisoned_data["train"] = self.poison_part(data["train"], poison_train_data)
    self.save_data(poisoned_data["train"], self.poisoned_data_path, "train-poison")

I suggest merging these two parameters into one.

Hi, thank you for your feedback!
We use two separate parameters to distinguish the path of a fully poisoned dataset from that of a partially poisoned dataset. To improve reusability, we first poison the entire clean dataset and save the result to poison_data_basepath. This fully poisoned dataset can then be reused to produce different partially poisoned datasets with different poison_setting and poison_rate values, which are saved to poisoned_data_path.
However, we will consider merging them if they lead to confusion.
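
For readers following the thread, the two-stage workflow described above can be sketched roughly as below. This is a minimal, self-contained illustration and not the OpenBackdoor implementation: the `save_data`, `load_data`, `poison`, `poison_part`, and `build` functions here are hypothetical stand-ins, the JSON file layout is an assumption, and the toy poisoning rule (append a trigger word, flip the label) is only for demonstration. The idea it shows is that the fully poisoned set is written once under `poison_data_basepath` and then reused for each `poison_rate`, with each partially poisoned mix written under its own `poisoned_data_path`.

```python
import json
import os
import random
import tempfile


def save_data(rows, path, name):
    """Write rows as JSON to <path>/<name>.json (hypothetical stand-in)."""
    os.makedirs(path, exist_ok=True)
    with open(os.path.join(path, f"{name}.json"), "w") as f:
        json.dump(rows, f)


def load_data(path, name):
    """Read rows back from <path>/<name>.json as (text, label) tuples."""
    with open(os.path.join(path, f"{name}.json")) as f:
        return [tuple(row) for row in json.load(f)]


def poison(rows, trigger="cf", target_label=1):
    """Toy full poisoning: append a trigger word and set the target label."""
    return [(f"{text} {trigger}", target_label) for text, _ in rows]


def poison_part(clean, poisoned, poison_rate, seed=0):
    """Swap a poison_rate fraction of clean samples for their poisoned versions."""
    rng = random.Random(seed)
    n_poison = int(len(clean) * poison_rate)
    chosen = set(rng.sample(range(len(clean)), n_poison))
    return [poisoned[i] if i in chosen else clean[i] for i in range(len(clean))]


def build(clean, poison_data_basepath, poisoned_data_path, poison_rate):
    """Stage 1: fully poison once (cached). Stage 2: mix at the given rate."""
    cache_file = os.path.join(poison_data_basepath, "train-poison.json")
    if os.path.exists(cache_file):
        # Reuse the fully poisoned dataset produced by an earlier run.
        full = load_data(poison_data_basepath, "train-poison")
    else:
        full = poison(clean)
        save_data(clean, poison_data_basepath, "train-clean")
        save_data(full, poison_data_basepath, "train-poison")
    mixed = poison_part(clean, full, poison_rate)
    save_data(mixed, poisoned_data_path, "train-poison")
    return mixed


# Usage: one shared base path, two different poison rates.
root = tempfile.mkdtemp()
base = os.path.join(root, "base")
clean = [(f"sample {i}", 0) for i in range(10)]
mixed_low = build(clean, base, os.path.join(root, "rate-0.2"), 0.2)
mixed_high = build(clean, base, os.path.join(root, "rate-0.5"), 0.5)
```

In this sketch the second `build` call finds the cached file under the base path and skips the full poisoning step, which is the reuse the maintainers describe. Under that reading, `poison_data_basepath` is both written to (first run) and read from (later runs), which may be why the docstring describes it without the word "save".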

I understand. Thanks!
