Running Hugging Face model via DJL inside Inferentia environment fails with incompatible engine type
aamirbutt opened this issue · 3 comments
I have been trying to run a transformer model via DJL in an Inferentia environment, following this tutorial: https://github.com/deepjavalibrary/djl-demo/blob/master/huggingface/inferentia/README.md.
After following every step, running the following line gives me an engine incompatibility error:
djl-serving -m "bert_qa::Python:*=file:$HOME/source/djl-demo/huggingface/inferentia/question_answering"
The error is:
(myenv) ubuntu@ip-172-31-61-29:~/code/djl-demo/huggingface/inferentia/question_answering$ djl-serving -m "bert_qa:PyTorch:*=file:$HOME/code/djl-demo/huggingface/inferentia/question_answering"
[INFO ] -
Model server home: /usr/local/djl-serving-0.14.0
Current directory: /home/ubuntu/code/djl-demo/huggingface/inferentia/question_answering
Temp directory: /tmp
Number of CPUs: 4
Max heap size: 1918
Config file: /usr/local/djl-serving-0.14.0/conf/config.properties
Inference address: http://127.0.0.1:8080
Management address: http://127.0.0.1:8080
Model Store: /usr/local/djl-serving-0.14.0/models
Initial Models: bert_qa:PyTorch:*=file:/home/ubuntu/code/djl-demo/huggingface/inferentia/question_answering
Netty threads: 0
Maximum Request Size: 6553500
[INFO ] - Initializing model: bert_qa:PyTorch:*=file:/home/ubuntu/code/djl-demo/huggingface/inferentia/question_answering
[INFO ] - Loading model bert_qa on cpu().
[INFO ] - Model server stopped.
[ERROR] - Unexpected error
java.util.concurrent.CompletionException: ai.djl.repository.zoo.ModelNotFoundException: ModelZoo doesn't support specified engine: *
at ai.djl.serving.models.ModelManager.lambda$registerWorkflow$0(ModelManager.java:143) ~[serving-0.14.0.jar:?]
at java.util.concurrent.CompletableFuture$AsyncSupply.run(CompletableFuture.java:1771) ~[?:?]
at java.util.concurrent.CompletableFuture$AsyncSupply.exec(CompletableFuture.java:1763) ~[?:?]
at java.util.concurrent.ForkJoinTask.doExec(ForkJoinTask.java:290) ~[?:?]
at java.util.concurrent.ForkJoinPool$WorkQueue.topLevelExec(ForkJoinPool.java:1016) ~[?:?]
at java.util.concurrent.ForkJoinPool.scan(ForkJoinPool.java:1665) ~[?:?]
at java.util.concurrent.ForkJoinPool.runWorker(ForkJoinPool.java:1598) ~[?:?]
at java.util.concurrent.ForkJoinWorkerThread.run(ForkJoinWorkerThread.java:177) ~[?:?]
Caused by: ai.djl.repository.zoo.ModelNotFoundException: ModelZoo doesn't support specified engine: *
at ai.djl.repository.zoo.Criteria.loadModel(Criteria.java:119) ~[api-0.14.0.jar:?]
at ai.djl.serving.models.ModelManager.lambda$registerWorkflow$0(ModelManager.java:129) ~[serving-0.14.0.jar:?]
... 7 more
I have tried setting different engines in the serving.properties file, but the error remains the same. Any idea what the reason could be?
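For reference, this is the kind of engine override I mean, placed in the model directory (a minimal sketch; `engine` is the standard DJL Serving per-model option):

```properties
# question_answering/serving.properties — minimal sketch of pinning an engine
engine=PyTorch
```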
@aamirbutt
Sorry for the delay due to holidays.
I'm not able to reproduce your issue. The message in your log is suspicious:
[INFO ] - Loading model bert_qa on cpu().
Are you running on an inf1 EC2 instance? Can you share the output of the following command:
curl http://169.254.169.254/latest/meta-data/instance-type
@aamirbutt
It looks like you missed one `:` in your command line:
(myenv) ubuntu@ip-172-31-61-29:~/code/djl-demo/huggingface/inferentia/question_answering$ djl-serving -m "bert_qa:PyTorch:*=file:$HOME/code/djl-demo/huggingface/inferentia/question_answering"
It should be:
djl-serving -m "bert_qa::PyTorch:*=file:$HOME/code/djl-demo/huggingface/inferentia/question_answering"