Problem with deepspeed finetuning
Opened this issue · 1 comments
freQuensy23-coder commented
I've tried to train Mistral-7b-v0.1 on multiple GPUs using DeepSpeed.
I started with the example from the README:
```python
from xllm import Config
from xllm.datasets import GeneralDataset
from xllm.cli import cli_run_train

import deepspeed

print(deepspeed.__file__, deepspeed.__version__)

if __name__ == '__main__':
    train_data = ["Hello!"] * 100
    train_dataset = GeneralDataset.from_list(data=train_data)
    cli_run_train(config_cls=Config, train_dataset=train_dataset)
```
And launched it with:

```shell
deepspeed --num_gpus=4 main.py --deepspeed_stage 2 --apply_lora True
```
But it did not start; full traceback:
https://gist.github.com/freQuensy23-coder/3a2341d4642b19b07fd533ac62fbb6cb
Environment:
CUDA 11.7
torch==2.0.1
deepspeed==0.13.1
packaging
xllm
freQuensy23-coder commented
It looks like a problem on my side with the deepspeed installation:

AttributeError: 'DeepSpeedCPUAdam' object has no attribute 'ds_opt_adam'
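For readers hitting the same error: this `AttributeError` typically means DeepSpeed's CPU Adam C++ extension was never built (JIT compilation at runtime failed or was skipped, often because a matching CUDA toolkit / `nvcc` was not available at install time). A hedged sketch of a common remedy is to diagnose with `ds_report` and reinstall DeepSpeed with the op prebuilt; the exact flags and whether prebuilding succeeds depend on your CUDA setup:

```shell
# Inspect which DeepSpeed ops are installed/compatible
# (ds_report is a CLI that ships with the deepspeed package).
ds_report

# Reinstall DeepSpeed with the CPU Adam op compiled ahead of time
# instead of JIT-compiled on first use. DS_BUILD_CPU_ADAM=1 is a
# DeepSpeed build flag; a CUDA toolkit matching your torch build
# (here CUDA 11.7) must be on PATH for the build to succeed.
pip uninstall -y deepspeed
DS_BUILD_CPU_ADAM=1 pip install deepspeed --no-cache-dir
```

If prebuilding is not possible, ensuring `nvcc` from the same CUDA version as torch is on `PATH` usually lets the runtime JIT build succeed instead.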