use Model Parallelism for GPT2?
yijinlee opened this issue · 4 comments
The hf-transformers GPT2 (and T5) models have a parallelize
method that uses model parallelism to spread the model across multiple GPUs, so that the combined GPU memory is enough for the larger GPT2 sizes. I've tried to hack around to see if I can make it work within blurr
for fastai, but unfortunately have not been successful. Pointers on how to get it working, and on how best to add this to the blurr
codebase as a PR, would be much appreciated.
Snippet of what I did:
model_cls = AutoModelForSequenceClassification
pretrained_model_name = "gpt2-medium"
hf_arch, hf_config, hf_tokenizer, hf_model = BLURR.get_hf_objects(pretrained_model_name, model_cls=model_cls)

hf_model.transformer.parallelize()  # the hf method
hf_model.transformer.model_parallel
# True
hf_model.transformer.device_map  # I have two GPUs
# {0: [0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11],
#  1: [12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23]}

# GPT2 has no pad token by default, so set one for batching
if hf_tokenizer.pad_token is None: hf_tokenizer.pad_token = '[PAD]'
hf_tokenizer.pad_token, hf_tokenizer.pad_token_id
hf_model.config.pad_token_id = hf_tokenizer.pad_token_id

blocks = (HF_TextBlock(hf_arch, hf_config, hf_tokenizer, hf_model), CategoryBlock)
dblock = DataBlock(blocks=blocks, get_x=ColReader('x'), get_y=ColReader('y'), splitter=RandomSplitter())
bs = 1
dls = dblock.dataloaders(df, bs=bs)

model = HF_BaseModelWrapper(hf_model)
learn = Learner(dls,
                model,
                opt_func=partial(Adam, decouple_wd=True),
                loss_func=CrossEntropyLossFlat(),
                metrics=[accuracy],
                cbs=[HF_BaseModelCallback],
                splitter=hf_splitter).to_fp16()
learn.freeze()
learn.fit_one_cycle(1, 1e-3)
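For context, the device_map printed above is just a balanced split of GPT2-medium's 24 transformer blocks across the available GPUs. A minimal sketch of that split logic (build_device_map is a hypothetical helper written for illustration, approximating what transformers computes when parallelize() is called with no explicit device_map):

```python
from math import ceil

def build_device_map(n_layers, n_gpus):
    """Assign each transformer block index to a GPU, as evenly as possible."""
    per_gpu = ceil(n_layers / n_gpus)
    layers = list(range(n_layers))
    return {gpu: layers[gpu * per_gpu:(gpu + 1) * per_gpu] for gpu in range(n_gpus)}

# GPT2-medium has 24 blocks; with two GPUs each hosts 12 of them.
print(build_device_map(24, 2))
# {0: [0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11], 1: [12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23]}
```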
The error I got was:
RuntimeError: Expected all tensors to be on the same device, but found at least two devices, cuda:1 and cuda:0! (when checking arugment for argument weight in method wrapper_native_layer_norm)
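That error suggests the batch tensors stayed on one GPU while the layer-norm weights of the blocks mapped to the other GPU live on cuda:1. With parallelize(), the inputs need to be on the model's first device (hf_model.transformer.first_device), so one thing to try is a fastai callback that moves each batch there before the forward pass. A framework-free sketch of the recursive "move everything to one device" step (move_to_device is a hypothetical helper, not part of blurr or fastai):

```python
def move_to_device(batch, device):
    """Recursively move tensors (anything with a .to method) in a batch
    to the given device; other values are returned untouched."""
    if hasattr(batch, "to"):
        return batch.to(device)
    if isinstance(batch, dict):
        return {k: move_to_device(v, device) for k, v in batch.items()}
    if isinstance(batch, (list, tuple)):
        return type(batch)(move_to_device(v, device) for v in batch)
    return batch

# In a fastai Callback this could run in before_batch, e.g. (untested):
#   self.learn.xb = move_to_device(self.xb, self.model.hf_model.transformer.first_device)
```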
A few links that I looked at for the parallelize
method:
huggingface/transformers#8696
https://huggingface.co/transformers/_modules/transformers/models/gpt2/modeling_gpt2.html
Thanks.
I'll take a look. I like it.
Is there anything I can help with on this? : )
Closing this out for now ... feel free to open if there are still issues.