docker build crashes my machine
dustyatx opened this issue · 7 comments
Not sure if this is a problem with my machine or not, but I have a 32-core i9 machine with 128 GB of RAM, and this pegs all my cores at 100%. It looks like there is some substantial compiling going on. The second time it spikes, my machine crashes. I was trying to figure out how to limit the number of cores it uses. It might also spike memory usage up to 128 GB, but that could be an artifact of htop over a disconnected SSH session; hard to say.
This is not my area of expertise, but I haven't run into too many containers doing this kind of intensive build; normally it's just software installation and configuration.
I'm including the terminal logging that I captured.
dockerbuild_terminal_logging.txt
OK, just confirmed that it also crashes on a Google Cloud VM with 22 cores and 128 GB of RAM. Looks like this one ran out of RAM as well.
gc_docker_build_fail_22core_128.txt
Hi! I had the same issue. I changed the Dockerfile to set the `MAX_JOBS` environment variable:

```dockerfile
RUN rm -rf ./flash-attention/* && \
    pip uninstall flash_attn -y && \
    git clone https://github.com/Dao-AILab/flash-attention.git && \
    cd flash-attention/csrc/rotary && MAX_JOBS=4 python setup.py install && \
    cd ../layer_norm && MAX_JOBS=4 python setup.py install && \
    cd ../../ && MAX_JOBS=4 python setup.py install

RUN MAX_JOBS=4 pip install ninja tokenizers==0.14.1 einops transformers==4.34.1
```
I used 4, but I guess you can go for something higher (maybe 10).
@nedRad88 Ah, thank you!! It's been a long time since I worked with Docker. I was trying to pass MAX_JOBS in with the `docker build` command and it wouldn't work. Looking forward to giving this a try.
Yes! @nedRad88's suggestion should work 👍. `ninja` can use a lot of resources when building flash-attn if `MAX_JOBS` is not specified. Feel free to re-open if the issue persists.
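On the earlier point about passing MAX_JOBS on the `docker build` command line: `docker build` does not inherit the caller's shell environment, so the variable has to be declared inside the Dockerfile. A minimal sketch using a build argument (the default of 4 here is just an illustration, not a recommendation):

```dockerfile
# Build-time knob; override with: docker build --build-arg MAX_JOBS=2 .
ARG MAX_JOBS=4
# Persist it as an env var so every subsequent RUN step (and ninja) sees it
ENV MAX_JOBS=${MAX_JOBS}
```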
This workaround did not work for me. I updated the Dockerfile with MAX_JOBS set to 4 and I'm still running out of memory. Any other advice?
```dockerfile
FROM nvcr.io/nvidia/pytorch:23.06-py3

WORKDIR /workdir

RUN rm -rf ./flash-attention/* && \
    pip uninstall flash_attn -y && \
    git clone https://github.com/Dao-AILab/flash-attention.git && \
    cd flash-attention/csrc/rotary && MAX_JOBS=4 python setup.py install && \
    cd ../layer_norm && python setup.py install && \
    cd ../../ && python setup.py install

RUN MAX_JOBS=4 pip install ninja tokenizers==0.14.1 einops transformers==4.34.1
```
OK, I got it to build, but I had to rent a VM big enough to handle it; for some reason the MAX_JOBS setting didn't seem to do anything for me.
96 cores & 604 GB of RAM (300+ GB used at peak).
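On large machines the bottleneck is usually RAM rather than cores, since each parallel compile job can need several GB. A rough shell sketch for picking a cap (the 4-job ceiling is an assumption, not a flash-attention recommendation):

```shell
#!/bin/sh
# Cap parallel compile jobs: use the core count on small machines,
# but never more than 4 jobs to keep peak memory bounded.
cores=$(nproc)
jobs=$(( cores < 4 ? cores : 4 ))
echo "MAX_JOBS=$jobs"
```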
`MAX_JOBS` controls how many processes are launched in parallel for compilation. You could try adding it as an environment variable, or setting it for each flash_attn install step, e.g.:

```dockerfile
cd flash-attention/csrc/rotary && MAX_JOBS=4 python setup.py install && \
cd ../layer_norm && MAX_JOBS=4 python setup.py install && \
cd ../../ && MAX_JOBS=4 python setup.py install
```
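The environment-variable route can be done once in the Dockerfile, so every later `RUN` step inherits the cap instead of repeating it per command. A sketch, assuming the same build layout as above:

```dockerfile
# Set once; all subsequent RUN steps see MAX_JOBS=4
ENV MAX_JOBS=4

RUN git clone https://github.com/Dao-AILab/flash-attention.git && \
    cd flash-attention && python setup.py install
```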