For this project, I used Kaggle to run my program:
Kaggle URL : https://www.kaggle.com/code/louisminguet/audio-generation-using-neural-net
In the tests I was able to carry out, the plain LSTM gave the least interesting results, because many notes are repeated.
With the LSTM using Embedding, fewer notes are repeated and the result is more convincing, but it remains approximate.
The GAN gives slightly more harmonious results than the LSTM using Embedding, but there is still room for improvement.
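The observation about repeated notes could be made measurable with a small helper. The sketch below is not part of the Kaggle notebook: it assumes a generated melody is available as a list of MIDI pitch numbers, and the function name `repetition_rate` is hypothetical. It computes the fraction of notes that are identical to the note just before them.

```python
def repetition_rate(notes):
    """Return the fraction of notes equal to the immediately preceding note.

    `notes` is assumed to be a sequence of comparable note values,
    for example MIDI pitch numbers (an assumption, not the notebook's format).
    """
    if len(notes) < 2:
        # With fewer than two notes there is nothing to compare.
        return 0.0
    # Count adjacent pairs where the same note is played twice in a row.
    repeats = sum(1 for prev, cur in zip(notes, notes[1:]) if prev == cur)
    return repeats / (len(notes) - 1)


if __name__ == "__main__":
    # Hypothetical sequences, for illustration only (not model output).
    repetitive = [60, 60, 60, 62, 62, 60]
    varied = [60, 62, 64, 65, 67, 69]
    print(repetition_rate(repetitive))
    print(repetition_rate(varied))
```

A lower value on the output of one model than another would support the impression that it repeats fewer notes, although this measure says nothing about how harmonious the result sounds.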
- LSTM
- LSTM using Embedding
- GAN
- Epoch : 70
- Dropout : 0.5
- Activation function : ReLU (dense: 3)
- Loss : MAE
- Optimizer : Adam
- Batch size : 256
- 128 notes
- Epoch : 200
- Embed size : 100
- Dropout : 0.3
- Activation function : ReLU (dense: 1)
- Optimizer : RMSprop
- 128 notes
- Epoch : 60