Sanster/IOPaint

[Guide] Using an AMD GPU

Chryseus opened this issue · 7 comments

Leaving this here for anyone wanting to use it with an AMD GPU.
Tested working with Python 3.10.8 on Arch Linux with an AMD RX 6650 XT.

Installing
Install ROCm on your system. How you do this varies by distribution, so search for instructions for your specific one. For Windows or macOS I have no idea, sorry. Arch users go here.

Create python virtual environment
python -m venv venv

Activate environment
source venv/bin/activate

Make sure pip and wheel are the latest version
pip install --upgrade pip wheel

Install lama cleaner
pip install lama-cleaner

Install Torch ROCm 5.2 for AMD GPU support
pip install torch --extra-index-url https://download.pytorch.org/whl/rocm5.2/ --force

Remove cached packages if desired
pip cache purge

Run
lama-cleaner --model=lama --device=cuda --port=8080

Launching next time
source venv/bin/activate
lama-cleaner --model=lama --device=cuda --port=8080

Navi 23
Navi 23 (gfx1032) is not currently supported by ROCm. Fortunately, you can trick it into thinking it's Navi 21, and this works just fine: set HSA_OVERRIDE_GFX_VERSION=10.3.0 just before you launch it. This may also apply to Navi 22 cards.
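
If you would rather launch from Python than type the variable each time, something like the following should do it. This is only a sketch: build_env and launch are names made up for this example, and the command line is simply the one from the Run step above.

```python
import os
import subprocess


def build_env(gfx_override="10.3.0", base_env=None):
    """Return a copy of the environment with the ROCm override added."""
    env = dict(os.environ if base_env is None else base_env)
    env["HSA_OVERRIDE_GFX_VERSION"] = gfx_override
    return env


def launch(command=("lama-cleaner", "--model=lama", "--device=cuda", "--port=8080")):
    """Run the command with the override set and return its exit code."""
    # Only the child process sees the override; the parent environment is untouched.
    return subprocess.run(list(command), env=build_env(), check=False).returncode
```

Calling launch() from inside the activated virtual environment blocks until the server exits. Only the launched process sees the override, so nothing else in your session is affected.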

yi commented

Great! May I ask two questions:

  1. How is the performance? Could you share it with us?
  2. Any chance of utilizing multiple GPUs?

Performance should be very similar to what you would get with a comparable Nvidia GPU.
As for multiple GPUs, I'm not sure. Torch does support it, but I don't know whether Lama Cleaner does or how you would enable it.

After following this guide I just get:

lama-cleaner: error: torch.cuda.is_available() is False, please use --device cpu or check your pytorch installation

Perhaps something has changed in these many months?

On Debian 12
RX5700 XT

I don't believe anything has changed other than ROCm 5.6 now being the default version. Make sure ROCm is properly installed for your distribution. You may have some trouble with Debian, although there are a few guides around, such as this.
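
To narrow down the "torch.cuda.is_available() is False" error, it can help to ask torch directly what it was built with. A rough sketch, assuming it is run inside the same virtual environment as lama-cleaner; torch_gpu_report is a name made up here. torch.version.hip is a version string on ROCm builds and None otherwise, so a None there may mean the ROCm wheel is not the one actually installed.

```python
import importlib.util


def torch_gpu_report():
    """Collect basic facts about the torch build in this environment."""
    if importlib.util.find_spec("torch") is None:
        return {"torch_installed": False}
    import torch

    return {
        "torch_installed": True,
        "torch_version": torch.__version__,
        # None on CPU-only and CUDA builds, a version string on ROCm builds.
        "hip_version": getattr(torch.version, "hip", None),
        # ROCm builds report the GPU through the torch.cuda API as well.
        "gpu_available": torch.cuda.is_available(),
    }


for key, value in torch_gpu_report().items():
    print(f"{key}: {value}")
```

If hip_version shows a version but gpu_available is still False, the problem is more likely on the system side (ROCm install, driver, group membership, or a missing HSA_OVERRIDE_GFX_VERSION) than in the Python packages.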

Yeah this happens to me too.
I'm new to this and that page is incomprehensible.

aa956 commented

I've just installed it today taking notes:

Installation for AMD GPUs on Linux

Tested on Debian 12 (bookworm) with an RX 6700 XT.
The machine has been used as a desktop for some time, so it has sudo installed and the non-free-firmware and non-free repositories enabled.
Otherwise it is a vanilla desktop install, if I recall correctly.

Prerequisites

amdgpu driver

Check as follows:

$ lshw -c video|grep driver
WARNING: you should run this program as super-user.
       configuration: depth=32 driver=amdgpu latency=0 resolution=2560,1440
WARNING: output may be incomplete or inaccurate, you should run this program as super-user.
$ 

If the output contains driver=amdgpu, the driver is fine and you can move on to the next point.

If the correct driver is not installed, see here for Debian:
https://wiki.debian.org/AtiHowTo
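
For anyone who wants to script this check, here is a minimal Python sketch that pulls the driver= values out of lshw output like the sample above. The function name video_drivers is made up for illustration, and it only parses text you pass in; it does not run lshw itself.

```python
import re


def video_drivers(lshw_output):
    """Return every driver name found in `lshw -c video` output."""
    return re.findall(r"\bdriver=(\S+)", lshw_output)


# The configuration line from the lshw output shown above.
sample = "       configuration: depth=32 driver=amdgpu latency=0 resolution=2560,1440"
drivers = video_drivers(sample)
print("amdgpu driver in use" if "amdgpu" in drivers else f"unexpected drivers: {drivers}")
```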

Group membership.

Your user must be a member of the video and render groups.

If your username is e.g. jdoe, you should see this:

$ getent group video
video:x:44:jdoe
$ getent group render
render:x:105:jdoe
$ 

If you are not, add your user to these groups:

$ sudo usermod -a -G render jdoe
$ sudo usermod -a -G video jdoe

and log out and back in.
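
The same membership check can be done from Python for the current login session. A small sketch, using only the standard library; missing_groups is a hypothetical helper name. It looks at the groups of the running process, so it only reflects a usermod change after you have logged out and back in.

```python
import grp
import os


def missing_groups(required=("video", "render"), process_gids=None):
    """Return the required groups the current session does not belong to."""
    if process_gids is None:
        # Supplementary groups plus the effective primary group of this process.
        process_gids = set(os.getgroups()) | {os.getegid()}
    missing = []
    for name in required:
        try:
            gid = grp.getgrnam(name).gr_gid
        except KeyError:
            # The group does not exist on this system at all.
            missing.append(name)
            continue
        if gid not in process_gids:
            missing.append(name)
    return missing


print("missing groups:", missing_groups() or "none")
```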

Quick check for group membership

You can check whether the steps above worked by running some GPU monitoring software.
It should work without sudo, with your usual user rights.

For example, radeontop:

$ sudo apt install radeontop
$ radeontop
                                 radeontop unknown, running on NAVY_FLOUNDER bus 0c, 120 samples/sec                                 

                                             Graphics pipe   0,00% │
───────────────────────────────────────────────────────────────────┼─────────────────────────────────────────────────────────────────
                                              Event Engine   0,00% │

                               Vertex Grouper + Tesselator   0,00% │

                                         Texture Addresser   0,00% │

                                             Shader Export   0,00% │
                               Sequencer Instruction Cache   0,00% │
                                       Shader Interpolator   0,00% │

                                            Scan Converter   0,00% │
                                        Primitive Assembly   0,00% │

                                               Depth Block   0,00% │
                                               Color Block   0,00% │

                                        957M / 12233M VRAM   7,83% │█████
                                           61M / 7943M GTT   0,76% │
                                0,10G / 1,00G Memory Clock   9,60% │██████
                                0,01G / 2,72G Shader Clock   0,20% │

Installation

Python, pip, venv

$ sudo apt install python3-venv python3-pip

Virtual environment

Create virtual environment for the software somewhere, e.g. in ~/projects or in ~/Documents

$ python3 -m venv ~/projects/lama-cleaner

Activate the environment

$ source ~/projects/lama-cleaner/bin/activate
(lama-cleaner) $ 

Install torch

(lama-cleaner) $ pip install torch --index-url https://download.pytorch.org/whl/rocm5.6

This will take some time; pip will download about 1.5 GB here.

After installation you should see something similar in the console:

(lama-cleaner) $ pip install torch --index-url https://download.pytorch.org/whl/rocm5.6
Looking in indexes: https://download.pytorch.org/whl/rocm5.6
Collecting torch
  Using cached https://download.pytorch.org/whl/rocm5.6/torch-2.1.1%2Brocm5.6-cp311-cp311-linux_x86_64.whl (1590.4 MB)
Collecting filelock
  Using cached https://download.pytorch.org/whl/filelock-3.9.0-py3-none-any.whl (9.7 kB)
Collecting typing-extensions
  Using cached https://download.pytorch.org/whl/typing_extensions-4.4.0-py3-none-any.whl (26 kB)
Collecting sympy
  Using cached https://download.pytorch.org/whl/sympy-1.12-py3-none-any.whl (5.7 MB)
Collecting networkx
  Using cached https://download.pytorch.org/whl/networkx-3.0-py3-none-any.whl (2.0 MB)
Collecting jinja2
  Using cached https://download.pytorch.org/whl/Jinja2-3.1.2-py3-none-any.whl (133 kB)
Collecting fsspec
  Using cached https://download.pytorch.org/whl/fsspec-2023.4.0-py3-none-any.whl (153 kB)
Collecting pytorch-triton-rocm==2.1.0
  Using cached https://download.pytorch.org/whl/pytorch_triton_rocm-2.1.0-cp311-cp311-linux_x86_64.whl (195.6 MB)
Collecting MarkupSafe>=2.0
  Using cached https://download.pytorch.org/whl/MarkupSafe-2.1.3-cp311-cp311-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (28 kB)
Collecting mpmath>=0.19
  Using cached https://download.pytorch.org/whl/mpmath-1.3.0-py3-none-any.whl (536 kB)
Installing collected packages: mpmath, typing-extensions, sympy, networkx, MarkupSafe, fsspec, filelock, pytorch-triton-rocm,
jinja2, torch
Successfully installed MarkupSafe-2.1.3 filelock-3.9.0 fsspec-2023.4.0 jinja2-3.1.2 mpmath-1.3.0 networkx-3.0 pytorch-triton-rocm-2.1.0 sympy-1.12 torch-2.1.1+rocm5.6 typing-extensions-4.4.0
(lama-cleaner) $ 

Install lama-cleaner

(lama-cleaner) $ pip install lama-cleaner

You should see a success message at the end:

Successfully installed Pillow-10.0.1 aiofiles-23.2.1 altair-5.1.2 annotated-types-0.6.0 antlr4-python3-runtime-4.9.3 anyio-3.7.1
attrs-23.1.0 bidict-0.22.1 certifi-2023.7.22 charset-normalizer-3.3.2 click-8.1.7 colorama-0.4.6 contourpy-1.2.0 controlnet-aux-0.0.3
cycler-0.12.1 diffusers-0.16.1 einops-0.7.0 fastapi-0.104.1 ffmpy-0.3.1 flask-2.2.3 flask-cors-4.0.0 flask-socketio-5.3.6
flaskwebgui-0.3.5 fonttools-4.44.3 fsspec-2023.10.0 gradio-4.4.0 gradio-client-0.7.0 h11-0.14.0 httpcore-1.0.2 httpx-0.25.1
huggingface-hub-0.19.4 idna-3.4 imageio-2.32.0 importlib-metadata-6.8.0 importlib-resources-6.1.1 itsdangerous-2.1.2
jsonschema-4.20.0 jsonschema-specifications-2023.11.1 kiwisolver-1.4.5 lama-cleaner-1.2.5 lazy_loader-0.3 loguru-0.7.2
markdown-it-py-3.0.0 matplotlib-3.8.1 mdurl-0.1.2 numpy-1.26.2 omegaconf-2.3.0 opencv-python-4.8.1.78 orjson-3.9.10
packaging-23.2 pandas-2.1.3 piexif-1.1.3 pydantic-2.5.1 pydantic-core-2.14.3 pydub-0.25.1 pygments-2.16.1 pyparsing-3.1.1
python-dateutil-2.8.2 python-engineio-4.8.0 python-multipart-0.0.6 python-socketio-5.10.0 pytz-2023.3.post1 pyyaml-6.0.1
referencing-0.31.0 regex-2023.10.3 requests-2.31.0 rich-13.7.0 rpds-py-0.13.0 safetensors-0.4.0 scikit-image-0.22.0 scipy-1.11.3 semantic-version-2.10.0 shellingham-1.5.4 simple-websocket-1.0.0 six-1.16.0 sniffio-1.3.0 starlette-0.27.0 tifffile-2023.9.26
timm-0.9.10 tokenizers-0.13.3 tomlkit-0.12.0 toolz-0.12.0 torchvision-0.16.1 tqdm-4.66.1 transformers-4.27.4 typer-0.9.0
typing-extensions-4.8.0 tzdata-2023.3 urllib3-2.1.0 uvicorn-0.24.0.post1 websockets-11.0.3 werkzeug-2.2.2 whichcraft-0.6.1
wsproto-1.2.0 yacs-0.1.8 zipp-3.17.0
(lama-cleaner) $ 

Check if it works

(lama-cleaner) $ lama-cleaner --model=lama --device=cuda --port=8080
- Platform: Linux-6.1.0-13-amd64-x86_64-with-glibc2.36
- Python version: 3.11.2
- torch: 2.1.1+rocm5.6
- torchvision: 0.16.1
- Pillow: 10.0.1
- diffusers: 0.16.1
- transformers: 4.27.4
- opencv-python: 4.8.1.78
- xformers: N/A
- accelerate: N/A
- lama-cleaner: 1.2.5
- rembg: N/A
- realesrgan: N/A
- gfpgan: N/A

Downloading: "https://github.com/Sanster/models/releases/download/add_big_lama/big-lama.pt" to /home/user/.cache/torch/hub/checkpoints/big-lama.pt
100%|█████████████████████████████████████████████████████████████████████████████████████████████| 196M/196M [00:23<00:00, 8.86MB/s]
2023-11-17 23:15:05.167 | INFO     | lama_cleaner.helper:download_model:52 - Download model success, md5: e3aa4aaa15225a33ec84f9f4bc47e500
2023-11-17 23:15:05.167 | INFO     | lama_cleaner.helper:load_jit_model:102 - Loading model from: /home/user/.cache/torch/hub/checkpoints/big-lama.pt
Running on http://127.0.0.1:8080
Press CTRL+C to quit

Non-Pro video cards environment variables

The UI was visible, but Python failed with a segmentation fault when trying to process an image:

2023-11-17 23:16:44.376 | INFO     | lama_cleaner.server:process:284 - Origin image shape: (512, 512, 3)
2023-11-17 23:16:44.376 | INFO     | lama_cleaner.model.base:__call__:83 - hd_strategy: Crop
2023-11-17 23:16:44.377 | INFO     | lama_cleaner.model.base:_pad_forward:61 - final forward pad size: (512, 512, 3)
Segmentation fault (core dumped)

An environment variable is required to make the GPU report itself as a compatible Pro model:

$ export HSA_OVERRIDE_GFX_VERSION=10.3.0

HSA_OVERRIDE_GFX_VERSION=10.3.0 works for the RX 6700 XT.

Other cards may need other values, which requires some searching.

E.g. the RX 5700 XT (Navi 10, RDNA 1.0) may require one more variable according to the comments here:
https://old.reddit.com/r/StableDiffusion/comments/ww436j/howto_stable_diffusion_on_an_amd_gpu/

export AMDGPU_TARGETS="gfx1010"
export HSA_OVERRIDE_GFX_VERSION=10.3.0

For newer RDNA3 cards (7600 and maybe others) it should be something like:

export HSA_OVERRIDE_GFX_VERSION=11.0.0
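
To keep the values mentioned in this thread in one place, here is a small lookup sketch. The table holds only what people reported above (none of it is verified beyond that), and REPORTED_SETTINGS and export_lines are names invented for this example.

```python
# Values reported in this thread; treat them as starting points, not guarantees.
REPORTED_SETTINGS = {
    "RX 6650 XT": {"HSA_OVERRIDE_GFX_VERSION": "10.3.0"},
    "RX 6700 XT": {"HSA_OVERRIDE_GFX_VERSION": "10.3.0"},
    "RX 5700 XT": {"AMDGPU_TARGETS": "gfx1010", "HSA_OVERRIDE_GFX_VERSION": "10.3.0"},
    "RX 7600": {"HSA_OVERRIDE_GFX_VERSION": "11.0.0"},
}


def _normalize(card):
    """Ignore case and spacing so 'rx 5700xt' matches 'RX 5700 XT'."""
    return "".join(card.upper().split())


def export_lines(card):
    """Return shell export lines for a card, or an empty list if it is unknown."""
    wanted = _normalize(card)
    for name, settings in REPORTED_SETTINGS.items():
        if _normalize(name) == wanted:
            return [f"export {key}={value}" for key, value in settings.items()]
    return []


print("\n".join(export_lines("RX 5700XT")))
```

An empty result just means nobody in this thread reported a value for that card, not that the card cannot work.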

And run again:

(lama-cleaner) $ lama-cleaner --model=lama --device=cuda --port=8080

Convenience script

To start faster in the future without manually activating the virtual environment and
exporting variables, use something like this:

HSA_OVERRIDE_GFX_VERSION=10.3.0 ~/projects/lama-cleaner/bin/lama-cleaner --model=lama --device=cuda --port=8080

Or add the following script to ~/bin, e.g. as ~/bin/lama-cleaner:

#!/bin/sh
export HSA_OVERRIDE_GFX_VERSION=10.3.0
~/projects/lama-cleaner/bin/lama-cleaner --model=lama --device=cuda --port=8080 "$@"

import setuptools
from pathlib import Path

package_files = Path("iopaint/web_app").glob("**/*")
package_files = [str(it).replace("iopaint/", "") for it in package_files]
package_files += ["model/anytext/ocr_recog/ppocr_keys_v1.txt"]
package_files += ["model/anytext/anytext_sd15.yaml"]
package_files += ["model/original_sd_configs/sd_xl_base.yaml"]
package_files += ["model/original_sd_configs/sd_xl_refiner.yaml"]
package_files += ["model/original_sd_configs/v1-inference.yaml"]
package_files += ["model/original_sd_configs/v2-inference-v.yaml"]

with open("README.md", "r", encoding="utf-8") as fh:
    long_description = fh.read()

def load_requirements():
    requirements_file_name = "requirements.txt"
    requires = []
    with open(requirements_file_name) as f:
        for line in f:
            # Strip first so blank lines are skipped instead of added as empty strings
            line = line.strip()
            if line:
                requires.append(line)
    return requires

# https://setuptools.readthedocs.io/en/latest/setuptools.html#including-data-files

setuptools.setup(
    name="IOPaint",
    version="1.0.3",
    author="PanicByte",
    author_email="cwq1913@gmail.com",
    description="Image inpainting, outpainting tool powered by SOTA AI Model",
    long_description=long_description,
    long_description_content_type="text/markdown",
    url="https://github.com/Sanster/lama-cleaner",
    packages=setuptools.find_packages("."),
    package_data={"iopaint": package_files},
    install_requires=load_requirements(),
    python_requires=">=3.7",
    entry_points={"console_scripts": ["iopaint=iopaint:entry_point"]},
    classifiers=[
        "License :: OSI Approved :: Apache Software License",
        "Operating System :: OS Independent",
        "Programming Language :: Python :: 3",
        "Programming Language :: Python :: 3.7",
        "Programming Language :: Python :: 3.8",
        "Programming Language :: Python :: 3.9",
        "Programming Language :: Python :: 3.10",
        "Topic :: Scientific/Engineering :: Artificial Intelligence",
    ],
)

from iopaint import entry_point

if __name__ == "__main__":
    entry_point()

.DS_Store

*.py[co]

docs/index.md
site/

# PyCharm IDE

.idea

#!/usr/bin/env python
# coding: utf-8

"""
The approach taken is explained below. I decided to do it simply.
Initially I was considering parsing the data into some sort of
structure and then generating an appropriate README. I am still
considering doing it - but for now this should work. The only issue
I see is that it only sorts the entries at the lowest level, and that
the order of the top-level contents do not match the order of the actual
entries.

 This could be extended by having nested blocks, sorting them recursively 
 and flattening the end structure into a list of lines. Revision 2 maybe ^.^. 

"""

def sort_blocks():
    # First, we load the current README into memory
    with open('README.md', 'r') as read_me_file:
        read_me = read_me_file.read()

    # Separating the 'table of contents' from the contents (blocks)
    table_of_contents = ''.join(read_me.split('- - -')[0])
    blocks = ''.join(read_me.split('- - -')[1]).split('\n# ')
    for i in range(len(blocks)):
        if i == 0:
            blocks[i] = blocks[i] + '\n'
        else:
            blocks[i] = '# ' + blocks[i] + '\n'

    # Sorting the libraries
    inner_blocks = sorted(blocks[0].split('##'))
    for i in range(1, len(inner_blocks)):
        if inner_blocks[i][0] != '#':
            inner_blocks[i] = '##' + inner_blocks[i]
    inner_blocks = ''.join(inner_blocks)

    # Replacing the non-sorted libraries by the sorted ones and gathering all at the final_README file
    blocks[0] = inner_blocks
    final_README = table_of_contents + '- - -' + ''.join(blocks)

    with open('README.md', 'w+') as sorted_file:
        sorted_file.write(final_README)

def main():
    # First, we load the current README into memory as an array of lines
    with open('README.md', 'r') as read_me_file:
        read_me = read_me_file.readlines()

    # Then we cluster the lines together as blocks
    # Each block represents a collection of lines that should be sorted
    # This was done by assuming only links ([...](...)) are meant to be sorted
    # Clustering is done by indentation
    blocks = []
    last_indent = None
    for line in read_me:
        s_line = line.lstrip()
        indent = len(line) - len(s_line)

        if any([s_line.startswith(s) for s in ['* [', '- [']]):
            if indent == last_indent:
                blocks[-1].append(line)
            else:
                blocks.append([line])
            last_indent = indent
        else:
            blocks.append([line])
            last_indent = None

    with open('README.md', 'w+') as sorted_file:
        # Then all of the blocks are sorted individually
        blocks = [
            ''.join(sorted(block, key=str.lower)) for block in blocks
        ]
        # And the result is written back to README.md
        sorted_file.write(''.join(blocks))

    # Then we call the sorting method
    sort_blocks()

if __name__ == "__main__":
    main()
