[Feature] nn.LazyLinear

Question

Opened this issue 4 months ago · 2 comments

What is the feature?

Using nn.LazyLinear raises an error in the _dump_init_info function of BaseModule. _dump_init_info writes the model's weights to the saved init info, but the weights of an nn.LazyLinear layer are not initialized until the first forward pass, so the write fails.
Could support for nn.LazyLinear be added?
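
A minimal sketch of the underlying behavior, using plain PyTorch only (no BaseModule involved). It shows that the weight of an nn.LazyLinear layer is an UninitializedParameter until the first forward pass. The helper `initialized_named_parameters` is hypothetical and only illustrates one way such parameters could be skipped; it is not the library's actual fix.

```python
import torch
from torch import nn
from torch.nn.parameter import UninitializedParameter


def initialized_named_parameters(module: nn.Module):
    """Hypothetical helper: list only parameters that already hold data."""
    return [
        (name, param)
        for name, param in module.named_parameters()
        if not isinstance(param, UninitializedParameter)
    ]


layer = nn.LazyLinear(out_features=4)

# Before the first forward pass the weight has no shape or data yet,
# so code that reads the weight values at this point cannot succeed.
lazy_before = isinstance(layer.weight, UninitializedParameter)
names_before = [name for name, _ in initialized_named_parameters(layer)]

# The first forward call infers in_features from the input (8 here)
# and materializes weight and bias.
layer(torch.zeros(2, 8))

lazy_after = isinstance(layer.weight, UninitializedParameter)
names_after = [name for name, _ in initialized_named_parameters(layer)]
weight_shape = tuple(layer.weight.shape)
```

Any code that inspects parameters at construction time would need a check of this kind, or would have to defer the inspection until after a forward pass.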

Any other context?

https://pytorch.org/docs/stable/generated/torch.nn.modules.lazy.LazyModuleMixin.html#torch.nn.modules.lazy.LazyModuleMixin
https://pytorch.org/docs/stable/generated/torch.nn.LazyLinear.html#torch.nn.LazyLinear

Answer 1 · 2024-02-04T15:54:51.000Z

Hi @holdjun, thanks for your feedback. We will fix it ASAP.

Answer 2 · 2024-02-07T02:51:33.000Z

Hi @holdjun, if you use lazy modules, what behavior do you expect when loading checkpoints (skipping them, or some other action)?