segmind/segmoe

Support local safetensors files

6DammK9 opened this issue · 1 comment

I may add a PR, but I have no time to code it.
As an A1111 / ComfyUI user, I find that segmoe doesn't support local safetensors files out of the box.
Instead of pip install segmoe (I ran into dependency hell even though I am already using conda), I suggest uninstalling segmoe and using the cloned source directly: put the segmoe package folder in ./segmoe, the same directory as a sample usage script (python train.py), and start modifying the script. A sketch of that layout is below.
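A sketch of the resulting layout (folder names are my assumption; the point is that import segmoe then resolves to the local folder, since the script's directory is searched before site-packages):

segmoe/     cloned package source from this repo (not pip-installed)
train.py    sample usage script; from segmoe import SegMoEPipeline picks up ./segmoe
models/     local .safetensors checkpoints, served over HTTP below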
I attempted to load the model with diffusers directly, but I had no idea how to do it.
I hope there will be a solution so that I no longer need to rely on CivitAI models (or on a dummy HTTP host serving the models).

For the HTTP host, npm install -g http-server will save your day.
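A minimal sketch, assuming the checkpoints sit in a models/ subfolder of the served root, so that the http://localhost:8080/models/... URLs below resolve:

npm install -g http-server
http-server . -p 8080

If you would rather avoid npm, python -m http.server 8080 run from the same directory should serve the files just as well.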

edit: wget is not present on Windows; install it first.
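One way to get wget, assuming the Chocolatey package manager is installed:

choco install wget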
edit2: Got it working. Code changes were required: mainly loading checkpoints with StableDiffusionXLPipeline.from_single_file and changing the model URL constant in main.py (shown below).

API_MODEL_URL_CIVITAI = "https://civitai.com/api/download/models/"
API_MODEL_URL = "http://localhost:8080/models/"
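A hedged sketch of the loading change (the helper name and its placement inside main.py are illustrative, not segmoe's actual internals):

from diffusers import StableDiffusionXLPipeline
import torch

# Illustrative: load a single monolithic .safetensors checkpoint directly,
# instead of a diffusers-format repository fetched from a hub.
def load_expert(checkpoint_path: str) -> StableDiffusionXLPipeline:
    return StableDiffusionXLPipeline.from_single_file(
        checkpoint_path,            # local path to the downloaded .safetensors
        torch_dtype=torch.float32,  # the merge step below runs on CPU in fp32
    )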

Config file x17-AstolfoMoE_a3p6.yaml:

base_model: http://localhost:8080/models/x17-AstolfoMix-x13te0x14te1.safetensors
num_experts: 2
moe_layers: all
num_experts_per_tok: 1
experts:
  - source_model: http://localhost:8080/models/_x14-ponyDiffusionV6XL_v6.safetensors
    positive_prompt: "xxx"
    negative_prompt: "xxx"
  - source_model: http://localhost:8080/models/_x08-animagineXLV3_v30.safetensors
    positive_prompt: "xxx"
    negative_prompt: "xxx"

And finally python not_train.py:

from segmoe import SegMoEPipeline
import torch

# Assembling the MoE OOMs on an RTX 3090 (24 GB); it would need 48 GB+ of VRAM,
# so run the merge on the CPU in fp32 instead.
pipeline = SegMoEPipeline("x17-AstolfoMoE_a3p6.yaml", device="cpu", torch_dtype=torch.float, variant="fp32")

# Write the assembled SegMoE model to disk for later inference.
pipeline.save_pretrained("segmoe_v0")

In the end it took 26 minutes and around 80 GB of RAM to "train" on an i9-7960X CPU.
"eval" (image generation) can use CUDA and is as fast as usual.

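A minimal inference sketch against the folder saved above, mirroring the usage shown in the segmoe README (the prompt strings are placeholders):

from segmoe import SegMoEPipeline
import torch

# Load the merged model saved by not_train.py and generate on the GPU.
pipeline = SegMoEPipeline("segmoe_v0", device="cuda", torch_dtype=torch.float16)

img = pipeline(
    prompt="xxx",            # placeholder, as in the config above
    negative_prompt="xxx",
    height=1024,
    width=1024,
    num_inference_steps=25,
    guidance_scale=7.5,
).images[0]
img.save("image.png")
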
Added support for this. Thanks for the suggestion!