Results from word_language_model should be more organized
NoahSchiro opened this issue · 1 comments
Is your feature request related to a problem? Please describe.
When we run word_language_model/main.py, we can choose among several model architectures for the task at hand. However, when training finishes and the weights are saved, they are written to a generic "model.pt" regardless of which model was selected.
Similarly, generate.py writes its output to a generic generated.txt file.
Users can override both names on the command line, but the defaults are generic.
Describe the solution
I think it would be beneficial (for the purposes of comparing models) to write out to a "transformer.pt" or "lstm.pt" so that they are separate files and analysis can be run on multiple models after training.
Similarly, the generated text file should default to a name derived from the model ([model]_gen.txt) instead of generated.txt.
All of this would also be better organized if put into a "results/" subdirectory within the word_language_model directory.
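To make the proposal concrete, here is a minimal sketch of how the defaults could be derived from the model name. It assumes the scripts expose the model choice as `--model` and the checkpoint path as `--save`; the helper names `default_paths` and `resolve_save_path`, the lowercased file stems, and the `RESULTS_DIR` constant are hypothetical and not part of the existing example.

```python
import argparse
from pathlib import Path

# Hypothetical output directory, relative to word_language_model/.
RESULTS_DIR = Path("results")


def default_paths(model_name, results_dir=RESULTS_DIR):
    """Return (checkpoint_path, generated_text_path) for a model name."""
    stem = model_name.lower()
    return results_dir / f"{stem}.pt", results_dir / f"{stem}_gen.txt"


def build_parser():
    # Assumption: flag names mirror the ones the example already uses.
    parser = argparse.ArgumentParser()
    parser.add_argument("--model", type=str, default="LSTM")
    # None means "derive the path from --model".
    parser.add_argument("--save", type=str, default=None)
    return parser


def resolve_save_path(args):
    """Prefer an explicit --save; otherwise fall back to results/[model].pt."""
    if args.save is not None:
        return Path(args.save)
    checkpoint, _ = default_paths(args.model)
    return checkpoint
```

With this approach an explicit `--save` still wins, so existing command lines keep working. The caller would need to create the directory before saving, for example with `path.parent.mkdir(parents=True, exist_ok=True)`.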
Describe alternative solutions
I am open to hearing other ideas, or an argument for why a generic name is preferred as the default. It might also be useful to distinguish between models of the same architecture with different hyper-parameters, though encoding those in the file name could produce very long names. A possible middle ground is to also write the text printed during training to a file, so there is a log of the batch size, how the loss changes over time, and so on.
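As a rough illustration of that middle ground, the sketch below mirrors training messages to both the console and a per-model log file using Python's standard `logging` module. The function name `make_training_logger`, the `results` default, and the `[model].log` naming are assumptions for illustration only.

```python
import logging
from pathlib import Path


def make_training_logger(model_name, results_dir="results"):
    """Return (logger, log_path); messages go to the console and to a file."""
    log_path = Path(results_dir) / f"{model_name.lower()}.log"
    log_path.parent.mkdir(parents=True, exist_ok=True)

    logger = logging.getLogger(f"train.{model_name.lower()}")
    logger.setLevel(logging.INFO)
    logger.propagate = False  # avoid duplicate lines via the root logger

    # Attach handlers only once per logger name; a repeated call with the
    # same model name reuses the handlers from the first call.
    if not logger.handlers:
        logger.addHandler(logging.StreamHandler())
        logger.addHandler(logging.FileHandler(log_path))
    return logger, log_path
```

Hyper-parameters could then be recorded once at startup, for example `logger.info("batch_size=%d lr=%s", 20, 20.0)`, which keeps the file name short while still making runs distinguishable.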
I can create a PR in short order if this is deemed a valid change by the package maintainers.
Hey there, I agree with your overall suggestion. I believe the file name should indicate only the model name; any other details can go in the TXT file or, if required, in a separate file. Also, please note that the model info is already stored in the .pt file.
I am happy to take a look and submit a PR implementing those suggestions.