Can you tell me how the dataset should be organized in this code
Can you tell me how the dataset should be organized in this code? For example, I randomly selected 5,000 images from diffusiondb. How should I organize my reference images and prompt strings? Like this?
--datasets
--main
--diffusiondb
--image1
--image2
--attacked
Hi, thank you for your interest and for reaching out. Apologies for the delayed response.
For non-attacked images, which include both the original and watermarked images, we suggest the following directory structure. The dataset names diffusiondb and mscoco are used here as examples. The directory real contains the original, non-watermarked images, and <watermark_method> should be replaced with the name of your specific watermarking technique.
main
├── diffusiondb
│   ├── prompts.json
│   ├── real
│   └── <watermark_method>
└── mscoco
    ├── prompts.json
    ├── real
    └── <watermark_method>
For attacked images, please use a parallel structure, with one directory per combination of attack method, attack strength, and watermarking method:
attacked
├── diffusiondb
│   └── <attack_method>-<attack_strength>-<watermark_method>
└── mscoco
    └── <attack_method>-<attack_strength>-<watermark_method>
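If it helps, the two trees above can be created with a short script. This is only a rough sketch and not part of the repository: the function name build_layout and the example values my_watermark, blur, and 0.5 are placeholders I made up for illustration, and how the attack strength should be written in the directory name is an assumption. It creates empty directories only; prompts.json and the images still have to be placed by hand.

```python
import tempfile
from pathlib import Path


def build_layout(data_dir, datasets, watermark_methods, attacks):
    """Create the empty main/ and attacked/ directory skeleton.

    `attacks` is an iterable of (attack_method, attack_strength) pairs.
    All names are placeholders; substitute your own methods.
    Returns the list of directories that were created or already present.
    """
    data_dir = Path(data_dir)
    created = []
    for dataset in datasets:
        # main/<dataset>/real and main/<dataset>/<watermark_method>
        for sub in ["real", *watermark_methods]:
            path = data_dir / "main" / dataset / sub
            path.mkdir(parents=True, exist_ok=True)
            created.append(path)
        # attacked/<dataset>/<attack_method>-<attack_strength>-<watermark_method>
        for attack_method, strength in attacks:
            for wm in watermark_methods:
                name = f"{attack_method}-{strength}-{wm}"
                path = data_dir / "attacked" / dataset / name
                path.mkdir(parents=True, exist_ok=True)
                created.append(path)
    return created


if __name__ == "__main__":
    # Demo in a throwaway directory so nothing real is touched.
    with tempfile.TemporaryDirectory() as tmp:
        for p in build_layout(tmp, ["diffusiondb", "mscoco"],
                              ["my_watermark"], [("blur", "0.5")]):
            print(p.relative_to(tmp))
```

Pointing data_dir at the same location you use for DATA_DIR keeps main and attacked side by side under one parent.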
Please configure your .env file to specify the parent directory of both the main and attacked folders as follows:
DATA_DIR=/path/to/datasets
With this layout, the images and prompts for each dataset can be located relative to DATA_DIR.
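To illustrate how the paths relate to DATA_DIR, here is a small sketch. It is not the repository's own loading code: the helper name dataset_paths is hypothetical, and it assumes DATA_DIR is already present in the process environment (for example, exported in the shell or loaded from .env by whatever mechanism your setup uses).

```python
import os
from pathlib import Path


def dataset_paths(dataset, data_dir=None):
    """Return the expected locations for one dataset under DATA_DIR.

    Hypothetical helper: falls back to the DATA_DIR environment
    variable when `data_dir` is not given.
    """
    root = Path(data_dir if data_dir is not None else os.environ["DATA_DIR"])
    return {
        "prompts": root / "main" / dataset / "prompts.json",
        "real": root / "main" / dataset / "real",
        "attacked": root / "attacked" / dataset,
    }


if __name__ == "__main__":
    os.environ.setdefault("DATA_DIR", "/path/to/datasets")
    for key, path in dataset_paths("diffusiondb").items():
        print(key, path)
```

Checking that these paths exist before running anything is a quick way to confirm the folders are where they are expected to be.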
Thank you for your response! I really appreciate your work!