tunib-ai/parallelformers

docker support

hyunwoongko opened this issue · 2 comments

We will use this thread to track problems with Docker containers, and we aim to solve them. Ultimately, the goal is to deploy the model in a Kubernetes environment. If you run into any problems in a Docker environment, please feel free to report them here. We will actively review and resolve them.

Running Flask in a Docker container with this module causes problems that do not produce any errors. Performance is no better than on CPU, and output from print() calls placed after parallelize() only appears when flush=True is passed. In rare cases, the process freezes. Running in IPython on the same machine works fine, with 10x speed-ups, which is awesome!
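For the missing output, here is a minimal sketch of one workaround. It assumes the cause is stdout buffering, which is typical when stdout is not a terminal (as in a container), though that has not been confirmed for this case. `log` is a hypothetical helper name, not part of parallelformers:

```python
import functools

# Hypothetical helper: a print() variant that always flushes, so output
# is written immediately even when stdout is block-buffered.
log = functools.partial(print, flush=True)

if __name__ == "__main__":
    # Use it anywhere print() was used after the parallelize() call.
    log("model parallelized")
```

Setting the environment variable PYTHONUNBUFFERED=1 in the image, or starting the interpreter with `python -u`, disables stdout buffering for the whole process and may make the explicit flush unnecessary.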

Some docker info:
FROM nvidia/cuda:11.4.2-cudnn8-devel-ubuntu20.04 as base
ENV PYTHON_VERSION 3.7.8
Using pipenv to install modules

docker run --gpus '"device=0,1"' --ipc=host .....ports, volume and docker......

parallelformers uses shared memory to transfer data to other processes.

I found that ALL errors in resource-limited environments such as Docker containers are related to the shared memory size.
So, if you want to use a larger model in a Docker container, please either REMOVE the shared memory limit with --ipc=host or INCREASE the container's shared memory size with --shm-size=?gb.
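To check how much shared memory a container actually has before loading a model, something like the following can be run inside it. This is only a sketch: the helper names are hypothetical, it assumes a Linux container where shared memory is mounted at /dev/shm, and the 64 MB threshold is Docker's default --shm-size, not a requirement stated by parallelformers.

```python
import os
import shutil

# Docker's default /dev/shm size when neither --shm-size nor --ipc=host is given.
DOCKER_DEFAULT_SHM_BYTES = 64 * 1024 * 1024


def shm_usage(path="/dev/shm"):
    """Return (total, free) in bytes for the filesystem mounted at `path`."""
    usage = shutil.disk_usage(path)
    return usage.total, usage.free


def shm_looks_limited(total_bytes, threshold=DOCKER_DEFAULT_SHM_BYTES):
    """Return True if shared memory is no larger than the threshold."""
    return total_bytes <= threshold


if __name__ == "__main__" and os.path.isdir("/dev/shm"):
    total, free = shm_usage()
    print(f"/dev/shm total: {total / 2**20:.0f} MiB, free: {free / 2**20:.0f} MiB")
    if shm_looks_limited(total):
        print("Shared memory looks limited; consider --ipc=host or a larger --shm-size.")
```

If the reported total is 64 MiB, the container is most likely running with the default limit.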