lucidrains/tab-transformer-pytorch

Missing Pretraining Code and Detailed Explanation


Description

Thank you for open-sourcing the ft-transformer repository! I’ve been exploring the project and encountered some challenges that I’d like to bring to your attention.

The documentation mentions that ft-transformer uses an Electra-style approach to train model.transformer. However, the repository does not include an implementation of this pretraining phase, and the description of the process is brief, which makes the pretraining stage difficult for users to reproduce.
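
For context, the snippet below is my best reconstruction of the README's constructor example from memory (argument names and values may not be exact); I include it only to pin down which sub-module the Electra-style pretraining would apply to.

```python
import torch
from tab_transformer_pytorch import FTTransformer

# Constructor arguments reproduced from my reading of the README; they may not be exact.
model = FTTransformer(
    categories = (10, 5, 6, 5, 8),   # number of unique values per categorical column
    num_continuous = 10,             # number of continuous columns
    dim = 32,
    dim_out = 1,
    depth = 6,
    heads = 8,
    attn_dropout = 0.1,
    ff_dropout = 0.1,
)

# The docs point at this attribute as the thing to pretrain in an Electra-like fashion,
# but the repository contains no code showing how.
encoder = model.transformer
```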

Problems

  1. Missing pretraining code: The repository currently doesn’t include any implementation of the pretraining phase, making it hard for users to perform end-to-end pretraining with ft-transformer.
  2. Insufficient documentation: The description of how Electra-style training is applied to model.transformer lacks detail; there is no concrete example or explanation of the method.

Suggestions

  • It would be helpful to include the pretraining code for Electra-style training on ft-transformer (e.g., the generator and discriminator implementations); a rough sketch of my current understanding is included below for reference.
  • Alternatively, a more detailed explanation in the documentation would be valuable, including:
    • The definition of the pretraining tasks.
    • The design of the loss function.
    • How the generator and discriminator interact.
    • Example code, pseudocode, or even a high-level flowchart (if the full implementation cannot be shared at this time).

This would help users better understand and reproduce the pretraining process, making the project more accessible and easier to use.
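
To make the request concrete, here is a minimal sketch of what I currently assume Electra-style pretraining on tabular categorical data looks like. Everything in it is my own guess: the nn.TransformerEncoder is only a stand-in for model.transformer, the random-replacement corrupt function takes the place of ELECTRA's small learned generator, and the names ReplacedFeatureDetector and corrupt are hypothetical, not from this repository.

```python
# Minimal sketch of Electra-style replaced-feature detection for tabular data.
# Assumptions: nn.TransformerEncoder stands in for model.transformer, and random
# replacement stands in for ELECTRA's learned generator. All names are illustrative.

import torch
import torch.nn as nn
import torch.nn.functional as F

class ReplacedFeatureDetector(nn.Module):
    def __init__(self, cardinalities, dim=32, depth=3, heads=4):
        super().__init__()
        # one embedding table per categorical column
        self.embeds = nn.ModuleList([nn.Embedding(n, dim) for n in cardinalities])
        layer = nn.TransformerEncoderLayer(d_model=dim, nhead=heads, batch_first=True)
        self.encoder = nn.TransformerEncoder(layer, num_layers=depth)  # stand-in for model.transformer
        self.detector = nn.Linear(dim, 1)  # per-column binary head: "was this value replaced?"

    def forward(self, x_categ):
        # x_categ: (batch, num_columns) of category indices
        tokens = torch.stack([emb(x_categ[:, i]) for i, emb in enumerate(self.embeds)], dim=1)
        encoded = self.encoder(tokens)              # (batch, num_columns, dim)
        return self.detector(encoded).squeeze(-1)   # (batch, num_columns) logits

def corrupt(x_categ, cardinalities, replace_prob=0.3):
    # "Generator" stand-in: swap a random subset of entries for random category ids.
    # ELECTRA proper samples replacements from a small learned generator instead.
    mask = torch.rand(x_categ.shape) < replace_prob
    random_vals = torch.stack(
        [torch.randint(0, n, (x_categ.size(0),)) for n in cardinalities], dim=1
    )
    return torch.where(mask, random_vals, x_categ), mask.float()

# toy training step on fake data
cardinalities = (10, 5, 6)
model = ReplacedFeatureDetector(cardinalities)
opt = torch.optim.Adam(model.parameters(), lr=3e-4)

x = torch.stack([torch.randint(0, n, (64,)) for n in cardinalities], dim=1)
x_corrupt, labels = corrupt(x, cardinalities)

logits = model(x_corrupt)
loss = F.binary_cross_entropy_with_logits(logits, labels)  # replaced-token detection loss
opt.zero_grad()
loss.backward()
opt.step()
```

If the intended approach instead uses a learned generator as in the original ELECTRA paper, a note on how its outputs feed the discriminator and how the two losses are weighted would be especially helpful. Even a short confirmation or correction of the scheme above in the README would make the pretraining stage reproducible.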

Environment Details

  • ft-transformer version: (please specify the version you are using)
  • Python version: (please specify your Python version)
  • Other dependencies: (if relevant, include version details of key dependencies)

Thanks

Thank you for your hard work and contribution! Looking forward to future updates.