[BUG] Connection drops unexpectedly when running chatbot_gradio.py on a remote server

Question

Opened this issue 5 months ago · 2 comments

Hello, when I run chatbot_gradio.py on our lab's remote server (three 3090 GPUs), the connection to the server drops: the session either exits from an interface that had been running normally, or simply hangs. What could be causing this? Is the remote server not powerful enough to run this program?
The exact command I ran was:
python ./examples/chatbot_gradio.py --deepspeed configs/ds_config_chatbot.json --model_name_or_path YOUR-LLAMA --lora_model_path ./robin-7b --prompt_structure "A chat between a curious human and an artificial intelligence assistant. The assistant gives helpful, detailed, and polite answers to the human's questions.###Human: {input_text}###Assistant:" --end_string "#" --max_new_tokens 200

Answer 1 · 2024-07-24T02:07:48.000Z

Sorry, the command I posted above was wrong. The command I actually ran was:
python ./examples/chatbot_gradio.py --deepspeed configs/ds_config_chatbot.json --model_name_or_path output_models/pretrained_minigpt4_7b.pth --prompt_structure "A chat between a curious human and an artificial intelligence assistant. The assistant gives helpful, detailed, and polite answers to the human's questions.###Human: {input_text}###Assistant:" --end_string "#" --max_new_tokens 200

Answer 2 · 2024-08-12T15:26:34.000Z

Thanks for your interest in LMFlow! For 7B multi-modal models, inference on a 3090 may be slow, but it should not cause network connection problems. To work around the disconnection, we recommend first running the program inside a tmux session, where a dropped connection will not affect the running program. If the connection still drops, you can log in to the server again and, since the program is still running, inspect the RAM (CPU) and GPU memory usage, which can help locate the problem. Hope this helps 🙏
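
As a rough illustration of the "inspect RAM and GPU memory" step, here is a minimal Python sketch that could be run on the server after logging back in. It is not part of LMFlow. The function names (parse_gpu_memory, read_gpu_memory, read_host_memory) are made up for this example, and it assumes a Linux host with /proc/meminfo and, for the GPU part, an installed nvidia-smi. If either is missing it reports nothing rather than failing.

```python
import shutil
import subprocess


def parse_gpu_memory(csv_text):
    """Parse 'index, used, total' CSV lines (values in MiB) into a list of dicts."""
    gpus = []
    for line in csv_text.strip().splitlines():
        try:
            index, used, total = (int(field.strip()) for field in line.split(","))
        except ValueError:
            # Skip blank lines or lines with non-numeric fields such as "[N/A]".
            continue
        gpus.append({"index": index, "used_mib": used, "total_mib": total})
    return gpus


def read_gpu_memory():
    """Query nvidia-smi for per-GPU memory; return [] if it is missing or fails."""
    if shutil.which("nvidia-smi") is None:
        return []
    result = subprocess.run(
        [
            "nvidia-smi",
            "--query-gpu=index,memory.used,memory.total",
            "--format=csv,noheader,nounits",
        ],
        capture_output=True,
        text=True,
        check=False,
    )
    if result.returncode != 0:
        return []
    return parse_gpu_memory(result.stdout)


def read_host_memory(path="/proc/meminfo"):
    """Return (total_kib, available_kib) from /proc/meminfo, or None if unavailable."""
    fields = {}
    try:
        with open(path) as f:
            for line in f:
                key, _, rest = line.partition(":")
                fields[key] = int(rest.split()[0])
    except (OSError, ValueError, IndexError):
        return None
    total = fields.get("MemTotal")
    available = fields.get("MemAvailable")
    if total is None or available is None:
        return None
    return total, available


if __name__ == "__main__":
    host = read_host_memory()
    if host is not None:
        print(f"Host RAM: {host[1]} KiB available of {host[0]} KiB")
    for gpu in read_gpu_memory():
        print(f"GPU {gpu['index']}: {gpu['used_mib']} / {gpu['total_mib']} MiB used")
```

If host RAM or the memory of one GPU is close to its limit while the chatbot is loading or generating, that would point toward memory exhaustion on the server rather than a pure network problem. This is only one possible explanation and would need to be confirmed against the server's own logs.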