RWKV split CPU & GPU results in high perplexity
3outeille opened this issue · 4 comments
System Info
Using the PR from #22797 (comment), I tried to evaluate perplexity on wikitext2 with the Hugging Face RWKV model but ran into weird behavior (gist to reproduce the bug: https://gist.github.com/3outeille/e74ec833ec2800a94325f8dad8e0da3d).
- When the model is fully loaded on CPU or on GPU, perplexity is fine
- When some blocks of RWKV are loaded on CPU and others on GPU, perplexity is very high
Any idea?
Who can help?
Information
- The official example scripts
- My own modified scripts
Tasks
- An officially supported task in the examples folder (such as GLUE/SQuAD, ...)
- My own task or dataset (give details below)
Reproduction
https://gist.github.com/3outeille/e74ec833ec2800a94325f8dad8e0da3d
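For context, here is a minimal sketch of the kind of split load that triggers the problem. The checkpoint name, dtype, and memory budget below are illustrative assumptions, not taken from the gist:

```python
import torch
from transformers import AutoTokenizer, RwkvForCausalLM

# Assumed checkpoint for illustration; the gist may use a different one.
model_id = "RWKV/rwkv-4-169m-pile"

tokenizer = AutoTokenizer.from_pretrained(model_id)

# A small GPU memory budget forces accelerate to place some RWKV blocks on GPU 0
# and the remaining ones on CPU, i.e. the "Split" configuration described above.
model = RwkvForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.float16,
    device_map="auto",
    max_memory={0: "300MB", "cpu": "8GB"},
)

# Shows which modules ended up on cuda:0 vs cpu.
print(model.hf_device_map)
```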
Expected behavior
- Full CPU ✔️:
nlls: tensor([2.0129, 2.3220, 2.3500])
Perplexity: 9.284077644348145
- Full GPU ✔️:
nlls: tensor([2.0137, 2.3223, 2.3496], device='cuda:0', dtype=torch.float16)
Perplexity: 9.2890625
- Split 🔴:
nlls: tensor([15.6641, 15.9141, 16.5469], device='cuda:0', dtype=torch.float16)
Perplexity: 9312564.0
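For reference, numbers like the ones above come out of a standard strided NLL loop over wikitext-2. Below is a sketch continuing from the loading snippet in the Reproduction section; the window size and stride are assumptions, not values taken from the gist:

```python
import torch
from datasets import load_dataset

# Continues from the loading sketch above (model and tokenizer already created).
test = load_dataset("wikitext", "wikitext-2-raw-v1", split="test")
encodings = tokenizer("\n\n".join(test["text"]), return_tensors="pt")

max_length, stride = 1024, 1024  # assumed window/stride
nlls = []
for begin in range(0, encodings.input_ids.size(1) - max_length + 1, stride):
    # Inputs go to the GPU; accelerate's hooks move activations between devices.
    input_ids = encodings.input_ids[:, begin : begin + max_length].to(0)
    with torch.no_grad():
        # With labels == input_ids the model shifts internally and returns the mean NLL.
        loss = model(input_ids, labels=input_ids).loss
    nlls.append(loss)

nlls = torch.stack(nlls)
print("nlls:", nlls)
print("Perplexity:", torch.exp(nlls.mean()).item())
```

The same loop produces sane NLLs for the full-CPU and full-GPU cases, so the blow-up only appears once blocks are spread across devices.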
@younesbelkada Any update?
Hi @3outeille
Sadly I didn't have time to check that out. Are you still facing the issue with the latest main branch of transformers & accelerate?
Hi @younesbelkada, I updated transformers & accelerate to the latest release versions as shown here: https://github.com/3outeille/hf_rwkv_bug/blob/master/requirements.txt and the bug is still there.
This issue has been automatically marked as stale because it has not had recent activity. If you think this still needs to be addressed please comment on this thread.
Please note that issues that do not follow the contributing guidelines are likely to be ignored.