danieldjohnson/biaxial-rnn-music-composition

Output tends to converge to being identical to one particular piece

streamliner18 opened this issue · 1 comment

When training on a small set of music (~30 pieces), I noticed that the network's output, especially later in the training process, tends to directly plagiarize one of the input pieces. The issue is particularly apparent when training on short EDM segments, which are essentially repeating chords. Is there a way to modify the network so that this problem is avoided?

Just for reference, I doubled the network from its original layer sizes in a blind attempt to produce better results. I'm not sure whether that is a root cause of this issue.

That sounds like an overfitting problem: you likely don't have enough data for the network to learn to generalize well and produce novel output. There are a few things you can try:

  • Get more training pieces. In my experiments I generally used more than 100 pieces at a time.
  • Train it for less time. This may produce lower-quality output but will generally avoid plagiarizing any particular piece.
  • Make the network smaller instead of larger. Smaller networks are more constrained in what they can represent and thus have a harder time "memorizing" inputs. (A rough sketch of the last two suggestions follows this list.)
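
As a minimal sketch of the second and third suggestions, something like the following could work, assuming the `Model` constructor and the `multi_training.loadPieces` / `trainPiece` helpers shown in this repository's README. The layer sizes and iteration count here are placeholder guesses, not tuned values, so adjust them for your dataset:

```python
import model
import multi_training

# Sketch only: shrink the network and shorten training to make memorization
# harder. The README example builds Model([300,300], [100,50], dropout=0.5);
# here both the time-axis and note-axis layer sizes are roughly halved.
pcs = multi_training.loadPieces("music")   # directory of training MIDI files
m = model.Model([150, 150], [50, 25], dropout=0.5)

# Fewer iterations than the README's 10000, so training stops before the
# network can reproduce any single piece note-for-note. Tune this by
# listening to intermediate samples rather than trusting a fixed number.
multi_training.trainPiece(m, pcs, 3000)
```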

Unfortunately there isn't any "magic bullet" that makes this problem go away for all datasets, so it might be good to just try a few of those and see if anything improves the results.