Running Hugging Face model via DJL inside Inferentia environment fails with incompatible engine type
aamirbutt opened this issue · 3 comments
I have been trying to run a transformer model via DJL in an Inferentia environment, following this tutorial: https://github.com/deepjavalibrary/djl-demo/blob/master/huggingface/inferentia/README.md.
After following every step, running the following line gives me an engine incompatibility error:
djl-serving -m "bert_qa::Python:*=file:$HOME/source/djl-demo/huggingface/inferentia/question_answering"
The error is:
(myenv) ubuntu@ip-172-31-61-29:~/code/djl-demo/huggingface/inferentia/question_answering$ djl-serving -m "bert_qa:PyTorch:*=file:$HOME/code/djl-demo/huggingface/inferentia/question_answering"
[INFO ] -
Model server home: /usr/local/djl-serving-0.14.0
Current directory: /home/ubuntu/code/djl-demo/huggingface/inferentia/question_answering
Temp directory: /tmp
Number of CPUs: 4
Max heap size: 1918
Config file: /usr/local/djl-serving-0.14.0/conf/config.properties
Inference address: http://127.0.0.1:8080
Management address: http://127.0.0.1:8080
Model Store: /usr/local/djl-serving-0.14.0/models
Initial Models: bert_qa:PyTorch:*=file:/home/ubuntu/code/djl-demo/huggingface/inferentia/question_answering
Netty threads: 0
Maximum Request Size: 6553500
[INFO ] - Initializing model: bert_qa:PyTorch:*=file:/home/ubuntu/code/djl-demo/huggingface/inferentia/question_answering
[INFO ] - Loading model bert_qa on cpu().
[INFO ] - Model server stopped.
[ERROR] - Unexpected error
java.util.concurrent.CompletionException: ai.djl.repository.zoo.ModelNotFoundException: ModelZoo doesn't support specified engine: *
at ai.djl.serving.models.ModelManager.lambda$registerWorkflow$0(ModelManager.java:143) ~[serving-0.14.0.jar:?]
at java.util.concurrent.CompletableFuture$AsyncSupply.run(CompletableFuture.java:1771) ~[?:?]
at java.util.concurrent.CompletableFuture$AsyncSupply.exec(CompletableFuture.java:1763) ~[?:?]
at java.util.concurrent.ForkJoinTask.doExec(ForkJoinTask.java:290) ~[?:?]
at java.util.concurrent.ForkJoinPool$WorkQueue.topLevelExec(ForkJoinPool.java:1016) ~[?:?]
at java.util.concurrent.ForkJoinPool.scan(ForkJoinPool.java:1665) ~[?:?]
at java.util.concurrent.ForkJoinPool.runWorker(ForkJoinPool.java:1598) ~[?:?]
at java.util.concurrent.ForkJoinWorkerThread.run(ForkJoinWorkerThread.java:177) ~[?:?]
Caused by: ai.djl.repository.zoo.ModelNotFoundException: ModelZoo doesn't support specified engine: *
at ai.djl.repository.zoo.Criteria.loadModel(Criteria.java:119) ~[api-0.14.0.jar:?]
at ai.djl.serving.models.ModelManager.lambda$registerWorkflow$0(ModelManager.java:129) ~[serving-0.14.0.jar:?]
... 7 more
I have tried setting different engines in the serving.properties file, but the error remains the same. Any idea what the reason could be?
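For reference, this is the kind of engine override I mean, placed in the model directory (a minimal sketch; `engine` is the standard DJL Serving per-model option):

```properties
# question_answering/serving.properties — minimal sketch of pinning an engine
engine=PyTorch
```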
@aamirbutt
Sorry for the delay due to holidays.
I'm not able to reproduce your issue. The message in your log is suspicious:
[INFO ] - Loading model bert_qa on cpu().
Are you running on an inf1 EC2 instance? Can you share the output of the following command:
curl http://169.254.169.254/latest/meta-data/instance-type
@aamirbutt
It looks like you missed one `:` in your command line:
(myenv) ubuntu@ip-172-31-61-29:~/code/djl-demo/huggingface/inferentia/question_answering$ djl-serving -m "bert_qa:PyTorch:*=file:$HOME/code/djl-demo/huggingface/inferentia/question_answering"
It should be:
djl-serving -m "bert_qa::PyTorch:*=file:$HOME/code/djl-demo/huggingface/inferentia/question_answering"