To generate all 5 stems from an audio file, I used Spleeter. Spleeter is Deezer's source separation library, written in Python on top of TensorFlow, and it ships with pretrained models. It makes it easy to train a source separation model (assuming you have a dataset of isolated sources) and provides already-trained, state-of-the-art models for several flavours of separation:
- Vocals (singing voice) / accompaniment separation (2 stems)
- Vocals / drums / bass / other separation (4 stems)
- Vocals / drums / bass / piano / other separation (5 stems)
You can easily generate all 5 stems without installing anything: the official Spleeter documentation provides a Google Colab notebook. Remember to add -p spleeter:5stems to the command:
spleeter separate -i audio_example.mp3 -p spleeter:5stems -o output
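If you want to drive the same command from a Python script, a minimal sketch could look like the one below. The helper names (build_separate_command, expected_stem_paths) are made up for illustration and are not part of Spleeter. It also assumes Spleeter's default output layout, where each stem is written as a .wav file inside a folder named after the input track; check your own output folder if your version or settings differ.

```python
from pathlib import Path

# The five stems produced by the spleeter:5stems model.
STEMS_5 = ("vocals", "drums", "bass", "piano", "other")


def build_separate_command(audio_path, output_dir="output", model="spleeter:5stems"):
    """Return the argument list for the CLI call shown above (hypothetical helper)."""
    return ["spleeter", "separate", "-i", str(audio_path), "-p", model, "-o", str(output_dir)]


def expected_stem_paths(audio_path, output_dir="output", stems=STEMS_5):
    """Map each stem name to where it is expected to land.

    Assumption: default layout <output_dir>/<track name>/<stem>.wav.
    """
    track_dir = Path(output_dir) / Path(audio_path).stem
    return {name: track_dir / f"{name}.wav" for name in stems}


if __name__ == "__main__":
    # Print the command; pass the list to subprocess.run(...) to actually execute it.
    print(" ".join(build_separate_command("audio_example.mp3")))
    for name, path in expected_stem_paths("audio_example.mp3").items():
        print(name, "->", path)
```

Keeping the command as a list makes it safe to hand to subprocess.run without shell quoting problems, and the path mapping gives you one place to look up each stem when loading them later.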
You can play with the psi values and the mouth_open multiplier. There are also more vectors to try out. **The "vocals" stem is tied to the mouth vectors, with bass and other parts of the music tied to other aspects of the image.**
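To give a feel for what "tying" a stem to a vector can mean, here is a rough sketch of one plausible approach, not necessarily the one used for the video: take the loudness of the vocals stem once per video frame, normalise it to the range 0 to 1, and scale it by the mouth_open multiplier. The function names, the fps value, and the default multiplier are all assumptions for illustration; the samples are taken to be a plain list of mono floats already loaded from the vocals stem.

```python
import math


def frame_rms(samples, sample_rate, fps):
    """Return one RMS loudness value per video frame for a mono signal."""
    hop = max(1, sample_rate // fps)  # audio samples per video frame (rounded down)
    envelope = []
    for start in range(0, len(samples), hop):
        chunk = samples[start:start + hop]
        envelope.append(math.sqrt(sum(s * s for s in chunk) / len(chunk)))
    return envelope


def normalize(values):
    """Scale values so the loudest frame is 1.0; silence stays all zeros."""
    peak = max(values, default=0.0)
    if peak == 0:
        return [0.0 for _ in values]
    return [v / peak for v in values]


def mouth_weights(vocal_samples, sample_rate, fps=24, mouth_open_multiplier=1.5):
    """Per-frame weight for a mouth direction (hypothetical mapping).

    Each weight would multiply the mouth vector before it is added to
    that frame's latent, so louder vocals open the mouth further.
    """
    envelope = normalize(frame_rms(vocal_samples, sample_rate, fps))
    return [mouth_open_multiplier * v for v in envelope]
```

The same envelope idea can be applied to the bass or drums stem with a different vector and multiplier, which is where experimenting with the parameters comes in.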
Here is the YouTube link to the audio-reactive faces that I generated.