--model-store directory not found: model_store
metempasa opened this issue · 8 comments
Hello, thanks for this project.
I used your default Dockerfile; it builds, but when I run the container I get this error: `--model-store directory not found: model_store`.
Do you have any idea?
Hello, and happy that you like it.
Hard to say without more information, but make sure you have your trained weights in the ressources/ folder.
I would say it looks like the command that converts the weights to TorchScript failed. Show me your logs if you want more help.
Cheers
That was all my bad, sorry.
Line 26 in 1fcbd7b
It should be `"--model-store", "model-store"` instead of `"--model-store", "model_store"`; with that change the problem is solved.
Thanks for your response.
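For illustration, the corrected Dockerfile line would look something like the sketch below. This is a hypothetical reconstruction: only the `--model-store` arguments are taken from this thread, and the rest of the command (model names, ports) is a placeholder that will differ in the actual repo.

```dockerfile
# Hypothetical sketch of the corrected CMD (line 26 of the Dockerfile).
# The directory argument must match the folder that actually exists in the image:
CMD ["torchserve", "--start", "--model-store", "model-store", "--models", "my_model=my_model.mar"]
```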
One last thing: could you please share an example POST body? I am a newbie at this multipart POST thing.
Have a nice day!
There is no JSON body in the request; it is a multipart form request:
key/value pairs where the keys are strings like "img_{i}" and each value is the byte array of an image.
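To make the multipart format concrete, here is a minimal stdlib-only sketch that builds such a body by hand (the part names `img_0`, `img_1`, ... follow the description above; the filenames and content type are assumptions, and the target endpoint is not specified in this thread, so sending is left as a comment):

```python
import io
import uuid

def build_multipart(images):
    """Build a multipart/form-data body with parts named img_0, img_1, ..."""
    boundary = uuid.uuid4().hex
    buf = io.BytesIO()
    for i, data in enumerate(images):
        # Each part: boundary line, headers, blank line, raw image bytes.
        buf.write(f"--{boundary}\r\n".encode())
        buf.write(
            f'Content-Disposition: form-data; name="img_{i}"; '
            f'filename="img_{i}.jpg"\r\n'.encode()
        )
        buf.write(b"Content-Type: application/octet-stream\r\n\r\n")
        buf.write(data)
        buf.write(b"\r\n")
    # Closing boundary terminates the body.
    buf.write(f"--{boundary}--\r\n".encode())
    return boundary, buf.getvalue()

boundary, body = build_multipart([b"\xff\xd8fake-jpeg-bytes"])
# POST `body` with header:
#   Content-Type: multipart/form-data; boundary=<boundary>
```

In practice you would not build this by hand: with the `requests` library, `requests.post(url, files={f"img_{i}": data for i, data in enumerate(images)})` produces an equivalent request.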
I changed my JAVA_HOME to a path that I am sure works, but the error is still the same.
@metempasa I am facing the same error when running torchserve, i.e.:
2021-02-15 17:04:15,545 [DEBUG] W-9000-my_model_name_0.1 org.pytorch.serve.wlm.WorkerThread - Backend worker monitoring thread interrupted or backend worker process died.
java.lang.InterruptedException
at java.base/java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.reportInterruptAfterWait(AbstractQueuedSynchronizer.java:2056)
at java.base/java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.awaitNanos(AbstractQueuedSynchronizer.java:2133)
at java.base/java.util.concurrent.ArrayBlockingQueue.poll(ArrayBlockingQueue.java:432)
at org.pytorch.serve.wlm.WorkerThread.run(WorkerThread.java:188)
at java.base/java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:515)
at java.base/java.util.concurrent.FutureTask.run(FutureTask.java:264)
at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1128)
at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:628)
at java.base/java.lang.Thread.run(Thread.java:834)
2021-02-15 17:04:15,546 [INFO ] W-9000-my_model_name_0.1-stdout org.pytorch.serve.wlm.WorkerLifeCycle - self.handle_connection(cl_socket)
2021-02-15 17:04:15,573 [INFO ] W-9000-my_model_name_0.1-stdout org.pytorch.serve.wlm.WorkerLifeCycle - File "C:\ProgramData\Miniconda3\Lib\site-packages\ts\model_service_worker.py", line 116, in handle_connection
2021-02-15 17:04:15,573 [WARN ] W-9000-my_model_name_0.1 org.pytorch.serve.wlm.BatchAggregator - Load model failed: my_model_name, error: Worker died.
OS: Windows 10
Python: 3.6
If you are running with a GPU, try to:
- check that you have nvidia-docker installed
- change the docker-compose config to force GPU usage (there is an open issue about this on the docker-compose GitHub)
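As a hedged illustration of the second point, recent versions of Compose can reserve a GPU via `deploy.resources.reservations.devices`; the service and image names below are placeholders, not taken from this repo:

```yaml
# Hypothetical sketch — requires nvidia-container-toolkit and a Compose
# version that supports device reservations; older setups used `runtime: nvidia`.
services:
  torchserve:
    image: my-torchserve-image   # placeholder image name
    deploy:
      resources:
        reservations:
          devices:
            - driver: nvidia
              count: 1
              capabilities: [gpu]
```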