win10ogod opened this issue a year ago · 0 comments
I found that the `dim` parameter affects the training loss, while `n_layers` affects the training speed. The smaller configuration finished training in 30 minutes. The configuration with more layers got the loss down to only 2, but it took 3 hours.
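For a rough sense of why both settings matter, here is a minimal sketch that counts parameters for a plain stack of square dense layers. This is an assumption for illustration only: the actual model architecture is not described in this issue, and `dense_stack_params` is a hypothetical helper, not part of any library. Under that assumption the parameter count grows linearly with `n_layers` and roughly quadratically with `dim`.

```python
def dense_stack_params(dim: int, n_layers: int) -> int:
    """Count weights plus biases for n_layers square dense layers of width dim.

    Hypothetical helper: assumes each layer is a dim x dim weight matrix
    plus a bias vector of length dim. The real model may differ.
    """
    return n_layers * (dim * dim + dim)


# Compare a baseline against doubling n_layers and doubling dim.
for dim, n_layers in [(256, 4), (256, 8), (512, 4)]:
    print(f"dim={dim} n_layers={n_layers} params={dense_stack_params(dim, n_layers)}")
```

Doubling `n_layers` doubles the count in this sketch, while doubling `dim` roughly quadruples it, so the two settings change model capacity and compute per step at different rates.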