deepjavalibrary/djl-demo

Running Hugging Face model via DJL inside Inferentia environment fails with incompatible engine type

aamirbutt opened this issue · 3 comments

So, I have been trying to run a transformer model via DJL in an Inferentia environment. I am following this tutorial: https://github.com/deepjavalibrary/djl-demo/blob/master/huggingface/inferentia/README.md.

After following every step, when I run the following line, it gives me an engine incompatibility error:

djl-serving -m "bert_qa::Python:*=file:$HOME/source/djl-demo/huggingface/inferentia/question_answering"

The error is:

(myenv) ubuntu@ip-172-31-61-29:~/code/djl-demo/huggingface/inferentia/question_answering$ djl-serving -m "bert_qa:PyTorch:*=file:$HOME/code/djl-demo/huggingface/inferentia/question_answering"
[INFO ] -
Model server home: /usr/local/djl-serving-0.14.0
Current directory: /home/ubuntu/code/djl-demo/huggingface/inferentia/question_answering
Temp directory: /tmp
Number of CPUs: 4
Max heap size: 1918
Config file: /usr/local/djl-serving-0.14.0/conf/config.properties
Inference address: http://127.0.0.1:8080
Management address: http://127.0.0.1:8080
Model Store: /usr/local/djl-serving-0.14.0/models
Initial Models: bert_qa:PyTorch:*=file:/home/ubuntu/code/djl-demo/huggingface/inferentia/question_answering
Netty threads: 0
Maximum Request Size: 6553500
[INFO ] - Initializing model: bert_qa:PyTorch:*=file:/home/ubuntu/code/djl-demo/huggingface/inferentia/question_answering
[INFO ] - Loading model bert_qa on cpu().
[INFO ] - Model server stopped.
[ERROR] - Unexpected error
java.util.concurrent.CompletionException: ai.djl.repository.zoo.ModelNotFoundException: ModelZoo doesn't support specified engine: *
        at ai.djl.serving.models.ModelManager.lambda$registerWorkflow$0(ModelManager.java:143) ~[serving-0.14.0.jar:?]
        at java.util.concurrent.CompletableFuture$AsyncSupply.run(CompletableFuture.java:1771) ~[?:?]
        at java.util.concurrent.CompletableFuture$AsyncSupply.exec(CompletableFuture.java:1763) ~[?:?]
        at java.util.concurrent.ForkJoinTask.doExec(ForkJoinTask.java:290) ~[?:?]
        at java.util.concurrent.ForkJoinPool$WorkQueue.topLevelExec(ForkJoinPool.java:1016) ~[?:?]
        at java.util.concurrent.ForkJoinPool.scan(ForkJoinPool.java:1665) ~[?:?]
        at java.util.concurrent.ForkJoinPool.runWorker(ForkJoinPool.java:1598) ~[?:?]
        at java.util.concurrent.ForkJoinWorkerThread.run(ForkJoinWorkerThread.java:177) ~[?:?]
Caused by: ai.djl.repository.zoo.ModelNotFoundException: ModelZoo doesn't support specified engine: *
        at ai.djl.repository.zoo.Criteria.loadModel(Criteria.java:119) ~[api-0.14.0.jar:?]
        at ai.djl.serving.models.ModelManager.lambda$registerWorkflow$0(ModelManager.java:129) ~[serving-0.14.0.jar:?]
        ... 7 more
        

I have tried setting different engines in the serving.properties file, but the error remains the same. Any idea what could be the reason here?
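For context, "setting the engine" means dropping a serving.properties file into the model directory; a minimal sketch, assuming the PyTorch engine the tutorial targets:

```properties
# question_answering/serving.properties (illustrative values)
engine=PyTorch
```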

@aamirbutt
Sorry for the delay due to the holidays.

I'm not able to reproduce your issue. The message in your log is suspicious:

[INFO ] - Loading model bert_qa on cpu().

Are you running on an inf1 EC2 instance? Can you share the message returned by the following command:

curl http://169.254.169.254/latest/meta-data/instance-type

@aamirbutt
It looks like you missed a `:` in your command line:

(myenv) ubuntu@ip-172-31-61-29:~/code/djl-demo/huggingface/inferentia/question_answering$ djl-serving -m "bert_qa:PyTorch:*=file:$HOME/code/djl-demo/huggingface/inferentia/question_answering"

It should be:

djl-serving -m "bert_qa::PyTorch:*=file:$HOME/code/djl-demo/huggingface/inferentia/question_answering"
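If I read the `-m` syntax right, the spec before the `=` is `modelName:version:engine:device`, so dropping one colon shifts `PyTorch` into the version slot and `*` into the engine slot, which is exactly what the "ModelZoo doesn't support specified engine: *" error reports. A rough Python sketch of that split (illustrative only, not DJL's actual parser):

```python
def parse_model_arg(arg: str) -> dict:
    """Split a djl-serving -m argument of the (assumed) form
    name:version:engine:device=url into its parts."""
    # Split at the first '=' only; the URL itself may contain ':' (e.g. file:/...)
    spec, _, url = arg.partition("=")
    parts = spec.split(":")
    # Pad to four fields: name, version, engine, device
    parts += [""] * (4 - len(parts))
    name, version, engine, device = parts[:4]
    return {"name": name, "version": version, "engine": engine,
            "device": device, "url": url}

# Correct spec: empty version segment, engine parses as PyTorch
print(parse_model_arg("bert_qa::PyTorch:*=file:/tmp/qa")["engine"])   # PyTorch
# Missing colon: PyTorch becomes the version, '*' becomes the engine
print(parse_model_arg("bert_qa:PyTorch:*=file:/tmp/qa")["engine"])    # *
```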

Thanks for your help. I have been able to run this successfully now. The main issue was that my environment variables were not correctly set.

I have posted another question. Can you please help me with that, too? Thanks.
#200