Hello, is there a recommended server configuration for training and inference?
willqq opened this issue · 6 comments
willqq commented
Hello, is there a recommended server configuration for training and inference?
soulteary commented
For inference, I tried running it locally: without any optimizations enabled, roughly 14 GB of VRAM is enough, so a V100, a 3090, or any other card with more than 14 GB will do (to be safe, aim for a full 16 GB?). @willqq
Fri Jul 21 15:36:56 2023
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 525.125.06 Driver Version: 525.125.06 CUDA Version: 12.0 |
|-------------------------------+----------------------+----------------------+
| GPU Name Persistence-M| Bus-Id Disp.A | Volatile Uncorr. ECC |
| Fan Temp Perf Pwr:Usage/Cap| Memory-Usage | GPU-Util Compute M. |
| | | MIG M. |
|===============================+======================+======================|
| 0 NVIDIA GeForce ... Off | 00000000:01:00.0 Off | Off |
| 31% 45C P2 69W / 450W | 14341MiB / 24564MiB | 0% Default |
| | | N/A |
+-------------------------------+----------------------+----------------------+
+-----------------------------------------------------------------------------+
| Processes: |
| GPU GI CI PID Type Process name GPU Memory |
| ID ID Usage |
|=============================================================================|
| 0 N/A N/A 1434 G /usr/lib/xorg/Xorg 167MiB |
| 0 N/A N/A 1673 G /usr/bin/gnome-shell 16MiB |
| 0 N/A N/A 4343 C python 14154MiB |
+-----------------------------------------------------------------------------+
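The ~14 GB figure lines up with a back-of-envelope estimate: a 7B-parameter model stored in fp16 needs about 2 bytes per parameter for the weights alone. A minimal sketch (the helper name is mine, not from this thread):

```python
# Back-of-envelope VRAM estimate for model weights alone.
# Activations and the KV cache add extra on top of this.
def weight_vram_gib(num_params: float, bytes_per_param: float) -> float:
    """Weight footprint in GiB at the given precision."""
    return num_params * bytes_per_param / 2**30

# 7B parameters at fp16 (2 bytes each) -> roughly 13 GiB,
# consistent with the 14341 MiB reported by nvidia-smi above.
print(f"fp16 weights: {weight_vram_gib(7e9, 2):.1f} GiB")
```

The gap between ~13 GiB of weights and the observed 14341 MiB is the usual runtime overhead (CUDA context, activations, cache).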
soulteary commented
Kudos to the project @shiyemin 👍
One more note: I tried a simple quantization of this model, and it runs in roughly 5 GB of VRAM. Model download:
https://huggingface.co/soulteary/Chinese-Llama-2-7b-4bit/tree/main
Companion write-up & blog tutorial, in case anyone needs them:
https://github.com/soulteary/docker-llama2-chat
cc @willqq
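The ~5 GB figure is also plausible from first principles: 4-bit weights for a 7B model take about a quarter of the fp16 footprint, and quantization constants plus runtime buffers account for the rest. A minimal sketch (the helper name and the commented loading call are illustrative assumptions, not code from this thread):

```python
# Rough footprint of quantized weights alone (assumption: bits/8 bytes
# per parameter; real usage is higher due to quantization constants,
# the CUDA context, activations, and the KV cache).
def quantized_vram_gib(num_params: float, bits: int) -> float:
    """Weight footprint in GiB at the given bit width."""
    return num_params * bits / 8 / 2**30

# 7B parameters at 4 bits -> roughly 3.3 GiB of weights, which with
# overhead lands near the ~5 GB reported above.
print(f"4-bit weights: {quantized_vram_gib(7e9, 4):.2f} GiB")

# One way to run 4-bit inference on the fly (sketch; requires
# transformers, bitsandbytes, and a CUDA GPU -- the exact loading code
# for the pre-quantized checkpoint is in the linked repo):
#
#   import torch
#   from transformers import AutoModelForCausalLM, BitsAndBytesConfig
#   cfg = BitsAndBytesConfig(load_in_4bit=True,
#                            bnb_4bit_compute_dtype=torch.float16)
#   model = AutoModelForCausalLM.from_pretrained(
#       "LinkSoul/Chinese-Llama-2-7b",
#       quantization_config=cfg, device_map="auto")
```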
shiyemin commented
Building on the code provided by @soulteary, LinkSoul has also released a 4-bit quantized version for everyone's convenience.
willqq commented
Thanks everyone for the patient answers.