[Feature request] Adding a checker to see if a custom endpoint is working properly
remyleone opened this issue · 1 comment
remyleone commented
I'm trying to run a model using the following command on my server:
```bash
docker run --gpus all --shm-size 1g -p 8080:80 -v /scratch/data:/data -e HUGGING_FACE_HUB_TOKEN=$HF_TOKEN -e HF_HUB_ENABLE_HF_TRANSFER=0 ghcr.io/huggingface/text-generation-inference:1.1.0 --model-id bigcode/starcoder
```
But when I configure the IP in my editor: http://XXXX:8080/generate
I would like to have a test from the editor that tells me whether or not the editor can successfully connect.
It could be exposed in the settings, or as a dedicated command from the LLM extension, to verify that everything is in place and show helpful messages when it is not.
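In the meantime, such a check can be run by hand. A minimal sketch, assuming TGI's standard /health and /generate routes and that XXXX is the server address:

```bash
# Liveness: TGI answers GET /health with 200 once the model is loaded
curl -sf http://XXXX:8080/health && echo "server is up"

# End-to-end: request a short completion from /generate
curl -s http://XXXX:8080/generate \
  -X POST \
  -H 'Content-Type: application/json' \
  -d '{"inputs": "def hello_world():", "parameters": {"max_new_tokens": 16}}'
```

A connection error or non-2xx status at either step is exactly the kind of condition the extension could surface with a helpful message.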
As a check on the server side, I'm running nvidia-smi to see whether additional GPU usage is happening when a request comes in.
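For example, with standard nvidia-smi query flags (the one-second loop interval is arbitrary):

```bash
# Poll GPU utilization and memory every second while sending a request
nvidia-smi --query-gpu=utilization.gpu,memory.used --format=csv,noheader -l 1
```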
github-actions commented
This issue is stale because it has been open for 30 days with no activity.