picobyte/stable-diffusion-webui-wd14-tagger

How to install a model manually

Coloured-glaze opened this issue · 32 comments

My server is not able to download the models from huggingface, so I need to install the models manually.
I would like to know where the model files should be placed, thank you very much.

[Screenshot 2023-07-21 202842]
[Screenshot 2023-07-21 202800]

It's in a subfolder of 'stable-diffusion-webui/models'. I cannot check right now where exactly, but see here.

For the WaifuDiffusionInterrogator, see this line. shared.models_path is 'stable-diffusion-webui/models'.

Edit: corrected path

When I rolled back to commit 66b7724 and used wd14-swinv2-v2, it successfully loaded the model.

[Screenshot 2023-07-21 212019]

Do you expect access to huggingface? (Edit: I mean to ask: is it possible on your device?) Does the rollback mean you were able to load an existing model from a past configuration? Part of the ML change #6 was a renaming, which may have this as a side effect; I'm not sure.

When I rolled back to commit 66b7724 and used wd14-swinv2-v2, it successfully loaded the model.

If you really need to go back that far to get it working again, I'd say the related changes occurred in this one. Was it not working in v1.0.0?

WSH032 commented

For some reason, Chinese users need a proxy to access huggingface, which means that most of the time hf_hub_download() cannot work properly.
Compared to the previous version, the download directory of the model has indeed changed, which means the model needs to be downloaded again.

WSH032 commented

OK, I see. I think I know why.

Do not specify the cache_dir parameter; leave it to the user to set it through environment variables.
Set local_dir to mdir:

hf_hub_download(
    repo_id=self.repo_id,
    filename=self.model_path,
    local_dir=mdir
)

OK, did that. The change is in f47d5da; please let me know if this fixed it for you.

WSH032 commented

@picobyte
Something is wrong. We need to set a different local_dir for different models.

I see what you mean: the downloads miss the top directory and are overwriting one another, also for my last few.

json_pp < model.json | tail -n 21
   {
      "model_path" : "~/stable-diffusion-webui/models/interrogators/models--SmilingWolf--wd-v1-4-vit-tagger-v2/snapshots/1f3f3e8ae769634e31e1ef696df11ec37493e4f2/model.onnx",
      "name" : "WD14 ViT v2",
      "tags_path" : "~/stable-diffusion-webui/models/interrogators/models--SmilingWolf--wd-v1-4-vit-tagger-v2/snapshots/1f3f3e8ae769634e31e1ef696df11ec37493e4f2/selected_tags.csv"
   },

   {
      "model_path" : "~/stable-diffusion-webui/models/interrogators/model.onnx",
      "name" : "WD14 ConvNeXTV2 v1",
      "tags_path" : "~/stable-diffusion-webui/models/interrogators/selected_tags.csv"
   }
]

It seems there used to be a 'models--' + repo_id directory, with slashes replaced by double dashes.
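
For illustration, a tiny hypothetical helper (not from the extension) that reproduces the naming scheme visible in the paths above:

def hf_cache_dir_name(repo_id: str) -> str:
    # HuggingFace cache layout: 'models--' + repo_id, with '/' -> '--'
    return 'models--' + repo_id.replace('/', '--')

# prints 'models--SmilingWolf--wd-v1-4-vit-tagger-v2'
print(hf_cache_dir_name('SmilingWolf/wd-v1-4-vit-tagger-v2'))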

WSH032 commented

Edit:
If we use cache_dir instead of local_dir, as before, everything will work fine, but the user cannot manually download a model and use it.
By explicitly specifying local_dir (the location where the file is stored), the user can place their own model file there instead of having hf_hub_download fetch it.


~/stable-diffusion-webui/models/interrogators/models--SmilingWolf--wd-v1-4-vit-tagger-v2/snapshots/1f3f3e8ae769634e31e1ef696df11ec37493e4f2/model.onnx

This one is from the huggingface cache_dir.


~/stable-diffusion-webui/models/interrogators/model.onnx

This is the file we need to use; you can get it by setting local_dir.

We can create the 'models--' + repo_id directory (slashes replaced with double dashes) in there manually if it's not already there.

WSH032 commented

Do you mean that, using cache_dir, a user who needs their own model can manually place it in the models--SmilingWolf--wd-v1-4-vit-tagger-v2/snapshots/1f3f3e8ae769634e31e1ef696df11ec37493e4f2/ directory?

Or do you mean using local_dir, but keeping the top directory name similar to models--SmilingWolf--wd-v1-4-vit-tagger-v2?

I meant this:

def download(self) -> None:
    dashed_part = 'models--' + '--'.join(self.repo_id.split('/'))
    mdir = Path(shared.models_path, 'interrogators', dashed_part)
    ...
    model_path = hf_hub_download(
        repo_id=self.repo_id,
        filename=self.model_path,
        local_dir=mdir)

but I have to check: mdir is also used for model.json.

I think what might work is:

def download(self) -> None:
    dashed_part = 'models--' + '--'.join(self.repo_id.split('/'))
    mdir = Path(shared.models_path, 'interrogators', dashed_part)
    local_dir = Path(shared.models_path, 'interrogators', dashed_part)
    ...
    model_path = hf_hub_download(
        repo_id=self.repo_id,
        filename=self.model_path,
        cache_dir=mdir,
        local_dir=local_dir)

WSH032 commented

But usually we don't specify cache_dir; we set it through the environment variable HF_HOME.

The code is feasible; it's just that the cache location is different, which doesn't affect normal operation.

WSH032 commented

By the way, it's better to access the model file through an explicit local_dir rather than through the return value of hf_hub_download().

Because the latter requires calling hf_hub_download, which means you have to be online. With the former, we can use os.path.exists to determine whether the model file exists without calling hf_hub_download().

As long as we provide the user with an explicit local_dir, they can download the model themselves. A sketch follows below.
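
A minimal sketch of that offline-first lookup; the function name resolve_model_file and the keep_updating flag are illustrative, not from the extension:

import os
from huggingface_hub import hf_hub_download

def resolve_model_file(repo_id: str, filename: str, local_dir: str,
                       keep_updating: bool = False) -> str:
    # If the file is already in local_dir (downloaded earlier, or placed
    # there manually by the user), no network access is needed at all.
    local_path = os.path.join(local_dir, filename)
    if os.path.exists(local_path) and not keep_updating:
        return local_path
    return hf_hub_download(repo_id=repo_id, filename=filename,
                           local_dir=local_dir)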

OK, I will fix that too, then.
Maybe we can add this in settings:

    shared.opts.add_option(
        key='tagger_hf_cache_dir',
        info=shared.OptionInfo(
            os.environ.get('HF_HOME', str(Path(shared.models_path, 'interrogators'))),
            label='HuggingFace cache directory',
            section=section,
        ),
    )
WSH032 commented

I agree, it's a good idea.
Referring to https://huggingface.co, there is another environment variable, HUGGINGFACE_HUB_CACHE.

OK, it's in this pull request. Note that I removed all local_dir usage. It should work with both environment variables, or default to Path(shared.models_path, 'interrogators'), but it will be configurable in Settings -> Tagger.
Comments or thoughts?

For me it seems to work, though I no longer get updates in the file models/interrogators/model.json. The question is whether this also works behind the proxy, or with only local dirs.

Hmm, from environment_variables, they are not exactly the same: HUGGINGFACE_HUB_CACHE defaults to "$HF_HOME/hub", so I think we only need HUGGINGFACE_HUB_CACHE; if the user has only set HF_HOME, that default should still cover them.
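
A sketch of that resolution order, assuming the extension default discussed above (default_cache_dir is an illustrative name):

import os
from pathlib import Path

def default_cache_dir(models_path: str) -> Path:
    # Prefer the specific variable, then the documented $HF_HOME/hub
    # default, then the extension's own folder.
    hub_cache = os.environ.get('HUGGINGFACE_HUB_CACHE')
    if hub_cache:
        return Path(hub_cache)
    hf_home = os.environ.get('HF_HOME')
    if hf_home:
        return Path(hf_home, 'hub')
    return Path(models_path, 'interrogators')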

WSH032 commented

Edit:
Toriato seems not to specify cache_dir; letting the environment variable specify it just lets users reuse previous caches. In other words, even if cache_dir changes it doesn't matter; it will just cause another download.


The situation is this: Chinese users can manually download the model from the huggingface web page, but they cannot use hf_hub_download().

So the key to the problem is to determine the model file location in advance; then users can manually place the file there, so there is no need to call hf_hub_download() again.

One point worth noting: even if hf_hub_download() does not download new files, it will still raise an error for Chinese users who have proxy problems.

local_dir specifies the location where the model is stored, which is the model file location I mentioned above; cache_dir is actually not important.

I did introduce an is_hf parameter to WaifuDiffusionInterrogator, here. If it is set to False, a local model can be used. I actually use that for two models, but have it hard-coded. It could become use_hf, probably with some code changes, to avoid the download. That would also require changes in the other Interrogator subclasses.
I think the interrogators should be read from a JSON file. Currently they are in the tagger/utils.py dict. For existing models, use_hf could be set to False in refresh_interrogators() (below the dict) to avoid the download; see the sketch below.

But then users won't get updates anymore.
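
Purely illustrative, what such a JSON-driven registry could look like (the file layout, keys, and load_interrogators are assumptions, not the extension's code):

import json
from pathlib import Path

def load_interrogators(json_path: Path, models_dir: Path) -> dict:
    # Each entry describes one interrogator: a display name, a repo_id,
    # and the model file expected under models_dir.
    with open(json_path, encoding='utf-8') as f:
        entries = json.load(f)
    interrogators = {}
    for entry in entries:
        # Skip the HuggingFace download when the file already exists.
        entry['use_hf'] = not (models_dir / entry['model_path']).exists()
        interrogators[entry['name']] = entry
    return interrogators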

WSH032 commented

I think the interrogators should be read from a JSON file. Currently they are in the tagger/utils.py dict. For existing models, use_hf could be set to False in refresh_interrogators() (below the dict) to avoid the download.

OK, I got it.
Forgive me for not noticing it before.
I believe that's a good idea too.

That's fine, thanks for helping me out; I'd have had a hard time understanding the problem otherwise. However, I'm a bit afraid that changes here might cause problems for others, so I'm also hesitant about this change.

Edit: there's the HF_HUB_OFFLINE boolean. We can ask the user to set that and use try..except; see _raise_if_offline_mode_is_enabled.
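
A sketch of that try..except fallback; download_or_local is an illustrative name, and catching ConnectionError broadly (which I believe the offline-mode error subclasses) avoids depending on huggingface_hub internals:

import os
from huggingface_hub import hf_hub_download

def download_or_local(repo_id: str, filename: str, local_dir: str) -> str:
    try:
        return hf_hub_download(repo_id=repo_id, filename=filename,
                               local_dir=local_dir)
    except (ConnectionError, OSError):
        # With HF_HUB_OFFLINE=1 (or a broken proxy) the hub call raises;
        # fall back to whatever the user placed in local_dir manually.
        local_path = os.path.join(local_dir, filename)
        if os.path.exists(local_path):
            return local_path
        raise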

WSH032 commented

tag_images_by_wd14_tagger.py

This is the implementation code I use for model downloads in image-deduplicate-cluster-webui.

I check whether the model file exists with os.path.exists; if it exists and the user did not request keep_updating, then hf_hub_download is never called (even on first use, which means no cache at all).


These are the model files:
[screenshots]


This is the checkbox for keep_updating (please ignore some English translation errors, as I only provided the Chinese UI):
[screenshot]


I think the interrogators should be read from a JSON file

Although it differs from my implementation, I think this is a good idea too.

I only need to manage a few models, but you need to manage many; that may get complicated, so I agree that JSON may be more suitable.

I'm not very familiar with the workflow of stable-diffusion-webui-wd14-tagger yet; once I have time, I'll try to figure it out.

I pushed a change on top: d252c2a, see the commit message. It unifies the HuggingFace download and gives more download tweak options. If you want to set local_dir instead of cache_dir, you can edit the Settings -> Tagger -> HuggingFace parameters. They can contain all supported parameters of [hf_hub_download](https://huggingface.co/docs/huggingface_hub/main/en/package_reference/file_download#huggingface_hub.hf_hub_download); any invalid argument will be ignored with a warning in the logs. repo_id and filename as well; they are overridden for the respective download.
As I mentioned, if the user sets HF_HUB_OFFLINE=1 as an environment variable, it will now fall back to the local_dir.
I made more changes, an .info json alongside the model, but it hasn't been tested enough for inclusion.

Alinyq commented

[screenshot]
You can specify a path yourself: modify the model-loading part of the code so it loads from the path you specify.

So where should I put the model? In which directory?

1. Modify the code to prevent network access: stable-diffusion-webui/extensions/stable-diffusion-webui-wd14-tagger/tagger/interrogator.py
2. Download the SmilingWolf models and place them under stable-diffusion-webui/models/TaggerOnnx; name the folders as in the screenshot below.
[Screenshot 2023-10-19, 3:34 PM]
3. The final directory looks like this; you can download multiple model repos and place each in its corresponding folder.
[Screenshot 2023-10-19, 3:33 PM]
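
Since the screenshots aren't reproduced here, the layout is presumably something like the following (the folder names are an assumption based on the SmilingWolf repo names and the wrap-up later in this thread):

models/TaggerOnnx/
├── wd-v1-4-vit-tagger-v2/
│   ├── model.onnx
│   └── selected_tags.csv
└── wd-v1-4-swinv2-tagger-v2/
    ├── model.onnx
    └── selected_tags.csv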

HtopH commented

(quoting the installation steps above)

Yes, it works, thanks.

(quoting the installation steps above)

It worked, thanks.

Please allow me to wrap it up now, since the thread has become kind of messy by this point:

  • Set the environment variable HF_HUB_OFFLINE=1 and change the parameter is_hf to False in extensions/stable-diffusion-webui-wd14-tagger/tagger/interrogator.py; the tagger will then look for the "files" in the "local_dir" (see the check below).
  • The "files" are model.onnx and selected_tags.csv, which we can get from huggingface.
  • The "local_dir" represents models/TaggerOnnx/wd...
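
To sanity-check the manual layout before launching, a small script along these lines could be used (the repo folder name is illustrative; substitute your model's folder):

from pathlib import Path

# Folder name under TaggerOnnx is an assumption; use your model's name.
local_dir = Path('models', 'TaggerOnnx', 'wd-v1-4-vit-tagger-v2')
for name in ('model.onnx', 'selected_tags.csv'):
    path = local_dir / name
    print(path, 'OK' if path.is_file() else 'MISSING')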

By doing so, these SmilingWolf/ models are good to use without network access to huggingface.
However, I still can't use the ML-danbooru models. I did put the files in the directories that utils.py indicates, as in this picture:
[screenshot]

Can someone help us with that, or can it not be done right now?