When scanning using 'meta' tensors, handle torch calls to allocate new data.

Question

Closed this issue 6 months ago · 0 comments

JadenFiotto-Kaufman commented 8 months ago

Sometimes (in models like llama), running the model creates new tensors with calls such as torch.ones(...). These calls allocate a CPU tensor by default, which can conflict with the meta tensors used during scanning. We need to set the default device in _scan, e.g. with torch.set_default_device, or patch the allocation calls if need be.
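
A minimal sketch of the default-device idea, assuming PyTorch 2.0 or later, where torch.device can be used as a context manager (the scoped counterpart of torch.set_default_device). The names scan_on_meta and forward are hypothetical and only stand in for _scan and a model's forward pass; this is not the library's actual implementation.

```python
import torch


def scan_on_meta(fn, *args, **kwargs):
    """Hypothetical stand-in for _scan: run fn with 'meta' as the default device."""
    # Inside this block, factory calls without an explicit device
    # (torch.ones, torch.zeros, torch.arange, ...) allocate on 'meta'.
    with torch.device("meta"):
        return fn(*args, **kwargs)


def forward(x):
    """Hypothetical forward pass that allocates a new tensor mid-computation."""
    mask = torch.ones(x.shape[-1])  # no device argument, so the default applies
    return x * mask


x = torch.empty(2, 4, device="meta")
out = scan_on_meta(forward, x)
print(out.device, tuple(out.shape))
```

The context manager restores the previous default on exit, so the override cannot leak into later non-scanning runs. Calls that pass an explicit device (for example torch.ones(..., device="cpu")) are not affected by the default, which is the case where patching would still be needed.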