awslabs/multi-model-server

mms-gpu - cuda error - No kernel image is available for execution on the device

kaushal-idx opened this issue · 0 comments

I am getting the CUDA error from the title and am not sure what is going wrong.

[screenshot of the error, not preserved]

My driver is 470.103.01 and the CUDA version on the host machine is 11.4.

[screenshot, not preserved]

I tried replicating the same setup in the mms-gpu Docker image, using this Dockerfile:

FROM nvidia/cuda:11.4.0-cudnn8-runtime-ubuntu20.04

ENV PYTHONUNBUFFERED TRUE

RUN apt-get update && \
    DEBIAN_FRONTEND=noninteractive apt-get install --no-install-recommends -y \
    fakeroot \
    ca-certificates \
    dpkg-dev \
    g++ \
    python3-dev \
    openjdk-8-jdk-headless \
    curl \
    vim \
    && rm -rf /var/lib/apt/lists/* \
    && cd /tmp \
    && curl -O https://bootstrap.pypa.io/pip/3.6/get-pip.py \
    && python3 get-pip.py


RUN update-alternatives --install /usr/bin/python python /usr/bin/python3 1
RUN update-alternatives --install /usr/local/bin/pip pip /usr/local/bin/pip3 1

RUN apt-get update && DEBIAN_FRONTEND=noninteractive apt-get install --no-install-recommends -y \
    ffmpeg libsm6 libxext6

RUN pip install multi-model-server \
    && pip install mxnet-cu92mkl==1.4.0

RUN useradd -m model-server \
    && mkdir -p /home/model-server/tmp
COPY --chown=model-server dockerd-entrypoint.sh /usr/local/bin/dockerd-entrypoint.sh
COPY --chown=model-server config.properties /home/model-server
COPY --chown=model-server extract_snapshot_details.py /home/model-server
COPY --chown=model-server get_snapshot.py /home/model-server
RUN chmod +x /usr/local/bin/dockerd-entrypoint.sh \
    && chown -R model-server /home/model-server
EXPOSE 8080 8081
USER model-server
WORKDIR /home/model-server
ENV TEMP=/home/model-server/tmp
ENV AWS_PROFILE=textract
COPY --chown=model-server requirements.txt .
ENV PATH="/home/model-server/.local/bin:${PATH}"
RUN pip install -r requirements.txt
RUN mkdir -p /home/model-server/model-store/logs
ENTRYPOINT ["/usr/local/bin/dockerd-entrypoint.sh"]
CMD ["serve"]

I am not sure what I am doing wrong, but I am getting the following error:

[screenshot of the error, not preserved]

Can you please help me solve this?
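
One thing that may be worth checking in the Dockerfile above is whether the CUDA version encoded in the MXNet wheel name matches the CUDA version of the base image. The following is a minimal sketch, assuming the `cuXY` suffix in MXNet pip package names denotes the CUDA toolkit the wheel was built against (so `cu92` means CUDA 9.2). The helper names `wheel_cuda_version` and `image_cuda_version` are hypothetical and exist only for this illustration. A mismatch is only a candidate cause of the error and may not be the actual one.

```python
import re


def wheel_cuda_version(package: str):
    """Return the (major, minor) CUDA version encoded in an MXNet wheel name, or None.

    Assumes the naming convention where the last digit after 'cu' is the
    minor version, e.g. 'cu92' -> (9, 2) and 'cu112' -> (11, 2).
    """
    match = re.search(r"-cu(\d+)", package)
    if match is None:
        return None
    digits = match.group(1)
    return int(digits[:-1]), int(digits[-1])


def image_cuda_version(image: str):
    """Return the (major, minor) CUDA version from an nvidia/cuda image tag, or None."""
    match = re.search(r":(\d+)\.(\d+)", image)
    if match is None:
        return None
    return int(match.group(1)), int(match.group(2))


# Values taken from the Dockerfile in this issue.
wheel = wheel_cuda_version("mxnet-cu92mkl==1.4.0")
image = image_cuda_version("nvidia/cuda:11.4.0-cudnn8-runtime-ubuntu20.04")

print("wheel built for CUDA:", wheel)
print("base image CUDA:", image)
print("versions match:", wheel == image)
```

For this Dockerfile the two versions differ (9.2 for the wheel, 11.4 for the image). A binary built for an older toolkit may not contain kernels for a newer GPU architecture, which is one common reason for "no kernel image is available for execution on the device", but the GPU model is not stated here, so this remains something to verify rather than a confirmed diagnosis.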