project-baize/baize-chatbot

Weird Reported Memory Usage

Lyken17 opened this issue · 3 comments

I noticed in the current report:

  Training (with int8)
Baize-7B 26GB
Baize-13B 25GB
Baize-30B 42GB

The 13B model actually consumes less memory than the 7B one. Is it a typo?

Also, a question: is this really 1 GB shy of being able to run on a 4090? Or is this just the memory that happened to be used during training, and wouldn't actually prevent, say, a 24 GB VRAM device from running this?

Baize-13B 25GB
The 13B model actually consumes less memory than the 7B one. Is it a typo?

It's not! As we said in the README, the reported GPU memory usage is based on the default settings, where we use half of 7B's batch size for 13B.

Also, a question: is this really 1 GB shy of being able to run on a 4090? Or is this just the memory that happened to be used during training, and wouldn't actually prevent, say, a 24 GB VRAM device from running this?

No, you can definitely get it running on a 4090. Just change $BATCH_SIZE in python finetune.py 7b $BATCH_SIZE 0.0002 alpaca,stackoverflow,quora to a smaller value, and you're good to go!
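For reference, the command above takes the batch size as its second positional argument, after the model size. A hypothetical invocation with a reduced batch size, assuming the same argument order as the README, might look like:

```shell
# Hypothetical example: pick a batch size small enough for a 24 GB card.
# Argument order as in the command above: model size, batch size, learning rate, datasets.
python finetune.py 7b 16 0.0002 alpaca,stackoverflow,quora
```

The value 16 here is purely illustrative; you may need to experiment to find the largest batch size that fits in your GPU's memory.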