[Guide] Using an AMD GPU
Chryseus opened this issue · 7 comments
Leaving this here for anyone wanting to use it with an AMD GPU.
Tested and working with Python 3.10.8 on Arch Linux with an AMD RX 6650 XT.
Installing
Install ROCm on your system. How you do this varies by distribution, so search for instructions specific to yours (for Windows or OSX I have no idea, sorry); Arch users go here.
Create python virtual environment
python -m venv venv
Activate environment
source venv/bin/activate
Make sure pip and wheel are the latest versions
pip install --upgrade pip wheel
Install lama-cleaner
pip install lama-cleaner
Install Torch ROCm 5.2 for AMD GPU support
pip install torch --extra-index-url https://download.pytorch.org/whl/rocm5.2/ --force-reinstall
Remove cached packages if desired
pip cache purge
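Optionally, verify that the ROCm build of Torch can actually see the GPU before launching. This is just a sanity check using standard Torch calls (on ROCm the card shows up through the CUDA API):
python -c "import torch; print(torch.cuda.is_available())"
It should print True; if it prints False, revisit the ROCm installation.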
Run
lama-cleaner --model=lama --device=cuda --port=8080
Launching next time
source venv/bin/activate
lama-cleaner --model=lama --device=cuda --port=8080
Navi 23
Navi 23 (gfx1032) is not currently supported by ROCm; fortunately, you can trick it into thinking it's Navi 21, and this works just fine. Add HSA_OVERRIDE_GFX_VERSION=10.3.0 just before you launch it. This may also apply to Navi 22 cards.
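For example:
HSA_OVERRIDE_GFX_VERSION=10.3.0 lama-cleaner --model=lama --device=cuda --port=8080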
Great! May I ask two questions:
- How is the performance? Could you share with us?
- Any chance to utilize multiple GPUs?
Performance should be very similar to what you would get with a comparable Nvidia GPU.
As for multiple GPUs, I'm not sure; Torch does support them, but I don't know whether Lama Cleaner does or how you would enable it.
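At the very least you should be able to choose which GPU is used: the ROCm runtime honours the HIP_VISIBLE_DEVICES environment variable, so something like the following (untested on my side) should pin it to the second card:
HIP_VISIBLE_DEVICES=1 lama-cleaner --model=lama --device=cuda --port=8080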
After following this guide I just get:
lama-cleaner: error: torch.cuda.is_available() is False, please use --device cpu or check your pytorch installation
Perhaps something has changed in the intervening months?
On Debian 12 with an RX 5700 XT.
I don't believe anything has changed other than ROCm 5.6 being the default version now. Make sure ROCm is properly installed for your distribution; you may have some trouble with Debian, although there are a few guides around, such as this.
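To rule out ROCm itself, you can check whether rocminfo (it ships with the ROCm packages) lists your GPU as an agent; if nothing shows up there, Torch has no chance either:
rocminfo | grep -i gfx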
Yeah, this happens to me too.
I'm new to this and that page is incomprehensible.
I've just installed it today, taking notes:
Installation for AMD GPUs on Linux
Tested on Debian 12 (bookworm) with an RX 6700 XT.
The machine has been used as a desktop for some time, so it has sudo installed and the non-free-firmware and non-free repositories enabled.
Otherwise it is a vanilla desktop install, if I recall correctly.
Prerequisites
amdgpu driver
Check as follows:
$ lshw -c video|grep driver
WARNING: you should run this program as super-user.
configuration: depth=32 driver=amdgpu latency=0 resolution=2560,1440
WARNING: output may be incomplete or inaccurate, you should run this program as super-user.
$
If there is a driver=amdgpu entry, then the driver is OK; move on to the next point.
If the correct driver is not installed, see here for Debian:
https://wiki.debian.org/AtiHowTo
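Alternatively, you can check whether the kernel module is loaded:
$ lsmod | grep amdgpu
If it prints an amdgpu line, the driver is loaded.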
Group membership.
Your user must be a member of the video and render groups.
If your username is e.g. jdoe, then you should see this:
$ getent group video
video:x:44:jdoe
$ getent group render
render:x:105:jdoe
$
If you are not, add your user to these groups:
$ sudo usermod -a -G render jdoe
$ sudo usermod -a -G video jdoe
and log out / log back in.
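After logging back in, you can quickly confirm the membership:
$ id -nG
The output should now include both video and render.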
Quick check for group membership
You can check whether the above steps worked by using some GPU monitoring software; it should work without sudo, with your usual user rights. For example, radeontop:
$ sudo apt install radeontop
$ radeontop
radeontop unknown, running on NAVY_FLOUNDER bus 0c, 120 samples/sec
Graphics pipe                  0,00%
Event Engine                   0,00%
Vertex Grouper + Tesselator    0,00%
Texture Addresser              0,00%
Shader Export                  0,00%
Sequencer Instruction Cache    0,00%
Shader Interpolator            0,00%
Scan Converter                 0,00%
Primitive Assembly             0,00%
Depth Block                    0,00%
Color Block                    0,00%
957M / 12233M VRAM             7,83%
61M / 7943M GTT                0,76%
0,10G / 1,00G Memory Clock     9,60%
0,01G / 2,72G Shader Clock     0,20%
Installation
Python, pip, venv
$ sudo apt install python3-venv python3-pip
Virtual environment
Create a virtual environment for the software somewhere, e.g. in ~/projects or ~/Documents:
$ python3 -m venv ~/projects/lama-cleaner
Activate the environment
$ source ~/projects/lama-cleaner/bin/activate
(lama-cleaner) $
Install torch
(lama-cleaner) $ pip install torch --index-url https://download.pytorch.org/whl/rocm5.6
This will take some time; pip will download ca. 1.5 GB here.
After installation you should see something similar in the console:
(lama-cleaner) $ pip install torch --index-url https://download.pytorch.org/whl/rocm5.6
Looking in indexes: https://download.pytorch.org/whl/rocm5.6
Collecting torch
Using cached https://download.pytorch.org/whl/rocm5.6/torch-2.1.1%2Brocm5.6-cp311-cp311-linux_x86_64.whl (1590.4 MB)
Collecting filelock
Using cached https://download.pytorch.org/whl/filelock-3.9.0-py3-none-any.whl (9.7 kB)
Collecting typing-extensions
Using cached https://download.pytorch.org/whl/typing_extensions-4.4.0-py3-none-any.whl (26 kB)
Collecting sympy
Using cached https://download.pytorch.org/whl/sympy-1.12-py3-none-any.whl (5.7 MB)
Collecting networkx
Using cached https://download.pytorch.org/whl/networkx-3.0-py3-none-any.whl (2.0 MB)
Collecting jinja2
Using cached https://download.pytorch.org/whl/Jinja2-3.1.2-py3-none-any.whl (133 kB)
Collecting fsspec
Using cached https://download.pytorch.org/whl/fsspec-2023.4.0-py3-none-any.whl (153 kB)
Collecting pytorch-triton-rocm==2.1.0
Using cached https://download.pytorch.org/whl/pytorch_triton_rocm-2.1.0-cp311-cp311-linux_x86_64.whl (195.6 MB)
Collecting MarkupSafe>=2.0
Using cached https://download.pytorch.org/whl/MarkupSafe-2.1.3-cp311-cp311-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (28 kB)
Collecting mpmath>=0.19
Using cached https://download.pytorch.org/whl/mpmath-1.3.0-py3-none-any.whl (536 kB)
Installing collected packages: mpmath, typing-extensions, sympy, networkx, MarkupSafe, fsspec, filelock, pytorch-triton-rocm,
jinja2, torch
Successfully installed MarkupSafe-2.1.3 filelock-3.9.0 fsspec-2023.4.0 jinja2-3.1.2 mpmath-1.3.0 networkx-3.0 pytorch-triton-rocm-2.1.0 sympy-1.12 torch-2.1.1+rocm5.6 typing-extensions-4.4.0
(lama-cleaner) $
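You can also confirm that you really got the ROCm build and that it sees the card; this is just a sanity check, not a required step (torch.version.hip is only set on ROCm builds):
(lama-cleaner) $ python -c "import torch; print(torch.version.hip, torch.cuda.is_available())"
On success it prints the HIP version and True.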
Install lama-cleaner
(lama-cleaner) $ pip install lama-cleaner
You should see a success message at the end:
Successfully installed Pillow-10.0.1 aiofiles-23.2.1 altair-5.1.2 annotated-types-0.6.0 antlr4-python3-runtime-4.9.3 anyio-3.7.1
attrs-23.1.0 bidict-0.22.1 certifi-2023.7.22 charset-normalizer-3.3.2 click-8.1.7 colorama-0.4.6 contourpy-1.2.0 controlnet-aux-0.0.3
cycler-0.12.1 diffusers-0.16.1 einops-0.7.0 fastapi-0.104.1 ffmpy-0.3.1 flask-2.2.3 flask-cors-4.0.0 flask-socketio-5.3.6
flaskwebgui-0.3.5 fonttools-4.44.3 fsspec-2023.10.0 gradio-4.4.0 gradio-client-0.7.0 h11-0.14.0 httpcore-1.0.2 httpx-0.25.1
huggingface-hub-0.19.4 idna-3.4 imageio-2.32.0 importlib-metadata-6.8.0 importlib-resources-6.1.1 itsdangerous-2.1.2
jsonschema-4.20.0 jsonschema-specifications-2023.11.1 kiwisolver-1.4.5 lama-cleaner-1.2.5 lazy_loader-0.3 loguru-0.7.2
markdown-it-py-3.0.0 matplotlib-3.8.1 mdurl-0.1.2 numpy-1.26.2 omegaconf-2.3.0 opencv-python-4.8.1.78 orjson-3.9.10
packaging-23.2 pandas-2.1.3 piexif-1.1.3 pydantic-2.5.1 pydantic-core-2.14.3 pydub-0.25.1 pygments-2.16.1 pyparsing-3.1.1
python-dateutil-2.8.2 python-engineio-4.8.0 python-multipart-0.0.6 python-socketio-5.10.0 pytz-2023.3.post1 pyyaml-6.0.1
referencing-0.31.0 regex-2023.10.3 requests-2.31.0 rich-13.7.0 rpds-py-0.13.0 safetensors-0.4.0 scikit-image-0.22.0 scipy-1.11.3 semantic-version-2.10.0 shellingham-1.5.4 simple-websocket-1.0.0 six-1.16.0 sniffio-1.3.0 starlette-0.27.0 tifffile-2023.9.26
timm-0.9.10 tokenizers-0.13.3 tomlkit-0.12.0 toolz-0.12.0 torchvision-0.16.1 tqdm-4.66.1 transformers-4.27.4 typer-0.9.0
typing-extensions-4.8.0 tzdata-2023.3 urllib3-2.1.0 uvicorn-0.24.0.post1 websockets-11.0.3 werkzeug-2.2.2 whichcraft-0.6.1
wsproto-1.2.0 yacs-0.1.8 zipp-3.17.0
(lama-cleaner) $
Check if it works
(lama-cleaner) $ lama-cleaner --model=lama --device=cuda --port=8080
- Platform: Linux-6.1.0-13-amd64-x86_64-with-glibc2.36
- Python version: 3.11.2
- torch: 2.1.1+rocm5.6
- torchvision: 0.16.1
- Pillow: 10.0.1
- diffusers: 0.16.1
- transformers: 4.27.4
- opencv-python: 4.8.1.78
- xformers: N/A
- accelerate: N/A
- lama-cleaner: 1.2.5
- rembg: N/A
- realesrgan: N/A
- gfpgan: N/A
Downloading: "https://github.com/Sanster/models/releases/download/add_big_lama/big-lama.pt" to /home/user/.cache/torch/hub/checkpoints/big-lama.pt
100%|█████████████████████████████████████████████████████████████████████████████████████████████| 196M/196M [00:23<00:00, 8.86MB/s]
2023-11-17 23:15:05.167 | INFO | lama_cleaner.helper:download_model:52 - Download model success, md5: e3aa4aaa15225a33ec84f9f4bc47e500
2023-11-17 23:15:05.167 | INFO | lama_cleaner.helper:load_jit_model:102 - Loading model from: /home/user/.cache/torch/hub/checkpoints/big-lama.pt
Running on http://127.0.0.1:8080
Press CTRL+C to quit
Non-Pro video card environment variables
The UI was visible, but Python failed with a segmentation fault when trying to perform image processing:
2023-11-17 23:16:44.376 | INFO | lama_cleaner.server:process:284 - Origin image shape: (512, 512, 3)
2023-11-17 23:16:44.376 | INFO | lama_cleaner.model.base:__call__:83 - hd_strategy: Crop
2023-11-17 23:16:44.377 | INFO | lama_cleaner.model.base:_pad_forward:61 - final forward pad size: (512, 512, 3)
Segmentation fault (core dumped)
An environment variable is required to override the GPU to a compatible Pro model:
$ export HSA_OVERRIDE_GFX_VERSION=10.3.0
HSA_OVERRIDE_GFX_VERSION=10.3.0 works for the RX 6700 XT.
For other cards different values may be necessary; this requires some searching.
E.g. the RX 5700 XT (Navi 10, RDNA 1.0) may require one more variable, according to comments here:
https://old.reddit.com/r/StableDiffusion/comments/ww436j/howto_stable_diffusion_on_an_amd_gpu/
export AMDGPU_TARGETS="gfx1010"
export HSA_OVERRIDE_GFX_VERSION=10.3.0
For newer RDNA 3 cards (7600 and maybe others) it should be something like:
export HSA_OVERRIDE_GFX_VERSION=11.0.0
And run again:
(lama-cleaner) $ lama-cleaner --model=lama --device=cuda --port=8080
Convenience script
To start faster in the future without manually activating the virtual environment and
exporting variables, use something like this:
HSA_OVERRIDE_GFX_VERSION=10.3.0 ~/projects/lama-cleaner/bin/lama-cleaner --model=lama --device=cuda --port=8080
Or add the following script to ~/bin, e.g. ~/bin/lama-cleaner:
#!/bin/sh
export HSA_OVERRIDE_GFX_VERSION=10.3.0
~/projects/lama-cleaner/bin/lama-cleaner --model=lama --device=cuda --port=8080 "$@"
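Make it executable, after which a plain lama-cleaner works from anywhere (assuming ~/bin is on your PATH):
$ chmod +x ~/bin/lama-cleaner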