Zeqiang-Lai/Anything2Image

App is not working fully yet?

SoftologyPro opened this issue · 13 comments

Start app
Click dog_audio.wav
Click Submit
Processes for a while and then shows <PIL.Image.Image image mode=RGB size=768x768 at 0x176753EF700> in the output
No errors shown on command line, no image created.

Sorry, this is a bug in 1.0.4 and it has been fixed in 1.0.5.

You can fix it by upgrading:

pip install anything2image --upgrade
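
To confirm which version is actually installed after the upgrade, a small check like the one below may help. It is only a sketch using the standard library's `importlib.metadata`; the helper name `installed_version` is made up for illustration and is not part of anything2image.

```python
from importlib.metadata import PackageNotFoundError, version


def installed_version(package: str):
    """Return the installed version string of `package`, or None if it is absent.

    Hypothetical helper for illustration; not part of anything2image.
    """
    try:
        return version(package)
    except PackageNotFoundError:
        return None


if __name__ == "__main__":
    # Run this with the same interpreter (venv) that launches the app.
    print("anything2image:", installed_version("anything2image"))
```

If this prints a version older than 1.0.5, or `None`, the upgrade probably went into a different environment than the one running the app.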

Still fails here. I set up a new venv and then installed the latest 1.0.5.

Also, why are there 2 Textbox areas? Shouldn't there be only 1 text, 1 audio and 1 image input, which are combined for the final image depending on what the user sets?

Do you still face the same issue?

The first text box acts as a prompt and can be left empty. The boxes below it, including the second text box, act as additional conditions, and one of them must be provided.

Therefore, for text-to-image, you have to enter the text in the second box and leave the first one empty.
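
The rule described above (optional prompt, plus a required condition) can be sketched as a small validation step. This is not the app's actual code: the function `check_inputs` and its parameter names are hypothetical, and accepting more than one condition at once is an assumption.

```python
def check_inputs(prompt="", text=None, audio=None, image=None):
    """Validate inputs following the rule: prompt optional, a condition required.

    Hypothetical helper for illustration; not the app's real API.
    Returns the prompt (possibly empty) and the names of the supplied conditions.
    """
    conditions = {"text": text, "audio": audio, "image": image}
    # A condition counts as supplied when it is neither None nor empty.
    provided = [name for name, value in conditions.items() if value]
    if not provided:
        raise ValueError("provide one of: text, audio, image")
    return prompt or "", provided
```

For text-to-image, that would correspond to `check_inputs(text="a dog")`, with the prompt left empty.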

Yes, same issue. I created a new Python environment and activated it.

pip install anything2image (which installed the latest 1.0.5)
python -m anything2image.app

Error - No module named 'torchaudio'

So I installed the GPU build of torch...

pip uninstall -y torch
pip install --no-cache-dir --ignore-installed --force-reinstall --no-warn-conflicts torch==1.13.1+cu116 torchvision==0.14.1+cu116 torchaudio==0.13.1+cu116 --extra-index-url https://download.pytorch.org/whl/cu116
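
Before starting the GUI, a quick check like the following can show whether the torch packages are visible to the current interpreter. It is a sketch that relies only on the standard library's `importlib.util.find_spec`; the helper name `missing_modules` is made up for illustration.

```python
from importlib.util import find_spec


def missing_modules(names):
    """Return the top-level module names that cannot be found by the interpreter.

    Hypothetical helper for illustration; it looks modules up without importing them.
    """
    return [name for name in names if find_spec(name) is None]


if __name__ == "__main__":
    # Run this inside the activated venv that will launch the app.
    missing = missing_modules(["torch", "torchvision", "torchaudio"])
    print("missing:", ", ".join(missing) if missing else "none")
```

If `torchaudio` shows up as missing here, the app will fail with the same "No module named 'torchaudio'" error.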

Then restart the GUI
python -m anything2image.app
Click dog_audio.wav
Click Submit
Process runs for around 12 seconds and then the text <PIL.Image.Image image mode=RGB size=768x768 at 0x16928B9CCD0> is displayed in the output area. No image shown.

There is a tmp.wav created, but no images.

Very sorry for the error. I guess there might be a problem with the PyPI package.

Could you try cloning the repo and installing locally via

pip install .

to see if the error still exists?

PS: no need to set up a new env

OK, tried this (I created the venv just to keep it isolated and new)

git clone https://github.com/Zeqiang-Lai/Anything2Image
cd Anything2Image
python -m venv .venv
.venv\scripts\activate
pip install .
python -m anything2image.app

ModuleNotFoundError: No module named 'torchaudio'

Install GPU torch again

pip uninstall -y torch
pip install --no-cache-dir --ignore-installed --force-reinstall --no-warn-conflicts torch==1.13.1+cu116 torchvision==0.14.1+cu116 torchaudio==0.13.1+cu116 --extra-index-url https://download.pytorch.org/whl/cu116

python -m anything2image.app

GUI starts
Click dog_audio.wav
Now I get a picture of a dog in the output area.
Clicking submit runs the process again and generates a new dog image.

Manual git clone and pip install . works.
pip install anything2image does not work.

Hope that helps you find the problem.
Also, one minor issue: please show the GUI URL as 127.0.0.1 rather than 0.0.0.0, because Windows won't open a 0.0.0.0 URL.
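
One way to handle this would be to keep listening on the wildcard address but print a loopback URL for the user to click. The sketch below only illustrates that idea; `browsable_url` is a hypothetical helper and the port number in the example is an arbitrary value, not taken from the app.

```python
def browsable_url(host: str, port: int) -> str:
    """Build a URL a browser can open from a server bind address.

    Hypothetical helper for illustration. 0.0.0.0 means "listen on all
    interfaces" and is not a destination, so it is shown as loopback instead.
    """
    if host == "0.0.0.0":
        host = "127.0.0.1"
    return f"http://{host}:{port}"


if __name__ == "__main__":
    # Example port only; use whatever port the GUI actually reports.
    print(browsable_url("0.0.0.0", 7860))
```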

Thanks @SoftologyPro. Based on your feedback, I updated the UI and released a new version. I hope this one works better.

That is so much better now, thank you. It works fine here.
Could you add settings for image size, strength and noise like in this repository? Then it would be perfect.
https://github.com/sail-sg/BindDiffusion

The settings for image size, strength, and noise were added in 1.1.0, which also introduces a scheduler option.

Thank you.
Strength is another useful setting too as that can influence how close to the initial image the generated images are.
One minor thing, the default is/was 768x768 and not 512x512.

Oops, you are right. I mixed it up with SD 1.x. Thanks for that.

Hi Zeqiang,
Does this repo work with sd1.5?

No, we rely on SD unCLIP for the embedding magic, which is what enables audio-to-image, etc.