Problem with deepspeed finetuning
Opened this issue · 1 comments
freQuensy23-coder commented
I've tried to train Mistral-7b-v0.1 on multiple GPUs using DeepSpeed.
I started with the example from the README:
```python
from xllm import Config
from xllm.datasets import GeneralDataset
from xllm.cli import cli_run_train

import deepspeed

print(deepspeed.__file__, deepspeed.__version__)

if __name__ == '__main__':
    train_data = ["Hello!"] * 100
    train_dataset = GeneralDataset.from_list(data=train_data)
    cli_run_train(config_cls=Config, train_dataset=train_dataset)
```
And launched it with:

```shell
deepspeed --num_gpus=4 main.py --deepspeed_stage 2 --apply_lora True
```
But it did not start; full traceback:
https://gist.github.com/freQuensy23-coder/3a2341d4642b19b07fd533ac62fbb6cb
Environment:
CUDA 11.7
torch==2.0.1
deepspeed==0.13.1
packaging
xllm
freQuensy23-coder commented
It looks like a problem on my side with the deepspeed installation:

AttributeError: 'DeepSpeedCPUAdam' object has no attribute 'ds_opt_adam'
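For readers hitting the same error: this `AttributeError` typically means DeepSpeed's CPU Adam C++ extension was never built (JIT compilation at runtime failed or was skipped, often because a matching CUDA toolkit / `nvcc` was not available at install time). A hedged sketch of a common remedy is to diagnose with `ds_report` and reinstall DeepSpeed with the op prebuilt; the exact flags and whether prebuilding succeeds depend on your CUDA setup:

```shell
# Inspect which DeepSpeed ops are installed/compatible
# (ds_report is a CLI that ships with the deepspeed package).
ds_report

# Reinstall DeepSpeed with the CPU Adam op compiled ahead of time
# instead of JIT-compiled on first use. DS_BUILD_CPU_ADAM=1 is a
# DeepSpeed build flag; a CUDA toolkit matching your torch build
# (here CUDA 11.7) must be on PATH for the build to succeed.
pip uninstall -y deepspeed
DS_BUILD_CPU_ADAM=1 pip install deepspeed --no-cache-dir
```

If prebuilding is not possible, ensuring `nvcc` from the same CUDA version as torch is on `PATH` usually lets the runtime JIT build succeed instead.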