A simple image dataset fetcher
- *booru via Apkawa/booru-rs
- fetch images by query
- create image_name.txt with tags
Usage: image-dataset-fetcher [OPTIONS] --target <TARGET> <COMMAND>

Commands:
  booru  Fetch from booru
  help   Print this message or the help of the given subcommand(s)

Options:
  -p, --proxy <PROXY>    Verbosity log debug
  -t, --target <TARGET>  Path to store fetched dataset
  -h, --help             Print help
  -V, --version          Print version

$ image-dataset-fetcher help booru
Fetch from booru

Usage: image-dataset-fetcher --target <TARGET> booru [OPTIONS]

Options:
  -e, --engine <ENGINE>  Booru engine [default: danbooru]
  -u, --url <URL>        Booru url
  -q, --query <QUERY>    Query
  -l, --limit <LIMIT>    Limit [default: 100]
  -p, --pages <PAGES>    Pages [default: 1]
  -h, --help             Print help
  -V, --version          Print version

Example: fetch from danbooru
image-dataset-fetcher -t /path/to/dataset/ -p https://proxyhost:3143 booru -e danbooru -q "masabodo shiroko_(blue_archive)" --limit 10
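
To collect several queries into one dataset directory, the call can be wrapped in a small shell function. This is only a sketch: `fetch_queries` and the `FETCHER_BIN` override are hypothetical names, not part of the tool, and it uses only the options listed in the help output above.

```shell
# Hypothetical helper (not part of image-dataset-fetcher): run one booru
# fetch per query, storing everything under the same target directory.
# FETCHER_BIN is an assumed override hook, useful for dry runs.
fetch_queries() {
    fq_target=$1
    shift
    for fq_query in "$@"; do
        # Global options (-t) go before the subcommand, as in the usage line.
        "${FETCHER_BIN:-image-dataset-fetcher}" -t "$fq_target" \
            booru -e danbooru -q "$fq_query" -l 10 || return 1
    done
}

# Usage:
#   fetch_queries /path/to/dataset/ "masabodo shiroko_(blue_archive)"
# Dry run that only prints the arguments of each call:
#   FETCHER_BIN=echo fetch_queries /path/to/dataset/ "masabodo shiroko_(blue_archive)"
```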

TODO: add a command for dataset preparation; see the example below.
- Remove duplicates (keeping the first file of each duplicate set) via
apt install findimagedupes
findimagedupes -R -i 'VIEW(){ shift; for f in "$@";do echo "$f";done }' -- /path/to/dataset/ | grep -v '\.txt$' | xargs -r -d '\n' rm --
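
Deleting duplicate images leaves their tag files behind. A minimal sketch for cleaning those up follows. It assumes that a tag file is named either `NAME.txt` next to `NAME.<ext>`, or `NAME.<ext>.txt` next to `NAME.<ext>`; check which convention your dataset uses before running it. `remove_orphan_tags` is a hypothetical helper name.

```shell
# Hypothetical helper: delete every .txt tag file in a directory that has
# no image beside it. Assumed naming: NAME.txt or NAME.<ext>.txt.
remove_orphan_tags() {
    rot_dir=$1
    for rot_txt in "$rot_dir"/*.txt; do
        [ -e "$rot_txt" ] || continue        # glob matched nothing
        rot_stem=${rot_txt%.txt}
        # Convention NAME.<ext>.txt: the stem itself is the image.
        if [ -e "$rot_stem" ]; then
            continue
        fi
        # Convention NAME.txt: look for any non-.txt file with the same stem.
        rot_found=0
        for rot_file in "$rot_stem".*; do
            [ -e "$rot_file" ] || continue
            case $rot_file in
                *.txt) continue ;;
            esac
            rot_found=1
            break
        done
        if [ "$rot_found" -eq 0 ]; then
            rm -- "$rot_txt"
        fi
    done
}

# Usage: remove_orphan_tags /path/to/dataset
```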
- Resize all images and convert them to JPG via
apt install imagemagick
mogrify -resize 512x512 -format jpg -path /path/to/dataset_resized/ /path/to/dataset/
cp /path/to/dataset/*.txt /path/to/dataset_resized/
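
Since mogrify takes image files as arguments, one way to run this step is to select the images by extension, so that the .txt tag files are never handed to ImageMagick. The sketch below works under that assumption. `find_images` and `resize_dataset` are hypothetical helper names, and the extension list is a guess at what the fetcher downloads, so adjust it to your dataset.

```shell
# Hypothetical helper: list image files directly inside a directory.
# Extra arguments are appended to the find expression (e.g. -exec ...).
# The extension list is an assumption.
find_images() {
    fi_dir=$1
    shift
    find "$fi_dir" -maxdepth 1 -type f \
        \( -iname '*.jpg' -o -iname '*.jpeg' -o -iname '*.png' \
           -o -iname '*.webp' \) "$@"
}

# Hypothetical helper: write resized JPG copies into a separate directory
# and copy the tag files alongside them.
resize_dataset() {
    rd_src=$1
    rd_dst=$2
    mkdir -p "$rd_dst" || return 1           # mogrify -path needs an existing directory
    find_images "$rd_src" -exec mogrify -resize 512x512 -format jpg \
        -path "$rd_dst" {} + || return 1
    cp "$rd_src"/*.txt "$rd_dst"/
}

# Usage: resize_dataset /path/to/dataset /path/to/dataset_resized
```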
- Rename dataset_resized to the format img/{repeats}_{instance_prompt} {class_prompt} for kohya_ss. As an example:
  - repeats - 40
  - instance prompt - masabodo ? TODO check docs
  - class prompt - 1girl ? TODO check docs
  The folder name must be img/40_masabodo 1girl - it is the image folder for kohya_ss
  - repeats -