Wanting to spark a dialogue on this subject
Opened this issue · 4 comments
I'm very interested in this project. Not knowing much about the process, I blindly tried to use GPT to solve this problem, but your project is much nicer and much further down the production line...
I ran a kick through it and it sounded a little bassy. I was thinking about starting a discussion on how to improve its output: the transients seem OK and it did pick up something that resembles a sine, so that's a good starting point for getting it to pick up the pitch envelope.
Do you have any thoughts on how to make it sensitive to that?
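To make "pitch envelope" concrete, here is a rough sketch of how one could measure it on a reference kick so it can be compared against the synth output. This is not how Syntheon works: it uses a plain zero-crossing estimate (one frequency value per cycle), and `synth_kick`, `pitch_envelope`, and all parameter values are made up for illustration.

```python
import math

def synth_kick(sr=16000, dur=0.3, f_start=200.0, f_end=50.0, tau=0.03):
    """Toy kick: a decaying sine whose pitch falls exponentially from f_start to f_end."""
    samples = []
    phase = 0.0
    for n in range(int(sr * dur)):
        t = n / sr
        freq = f_end + (f_start - f_end) * math.exp(-t / tau)
        samples.append(math.exp(-t / 0.1) * math.sin(phase))
        phase += 2.0 * math.pi * freq / sr  # accumulate phase so the sweep is continuous
    return samples

def pitch_envelope(samples, sr):
    """Return (time_s, freq_hz) pairs, one per cycle, from upward zero crossings."""
    crossings = [n for n in range(1, len(samples))
                 if samples[n - 1] < 0.0 <= samples[n]]
    # The distance between two consecutive upward crossings is one period.
    return [(b / sr, sr / (b - a)) for a, b in zip(crossings, crossings[1:])]

SR = 16000
envelope = pitch_envelope(synth_kick(sr=SR), SR)
for time_s, freq_hz in envelope[:3] + envelope[-3:]:
    print(f"{time_s:.3f} s  {freq_hz:.1f} Hz")
```

On a real drum sample the zero-crossing estimate would be noisy around the transient, so this is only meant as a quick way to see whether a sweep is present at all.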
OK, so I'm running a batch of sounds through it here. I've got 1,000 analog drum sounds; could we use these as a ground truth to test and improve the output for this kind of input? I can already see some things that appear to be lost in translation...
Here is an example: in many sounds, there appears to be a phasing issue at the end of the wavetable it generates.
In tracks with short envelopes, it appears to have trouble picking up the length of the envelope, making them too short.
Looking at my version, it really does appear that something is wrong with it, because I see this in the output:
C:\app\WPy64-310111\python-3.10.11.amd64\lib\site-packages\syntheon\inferencer\vital\models\preprocessor.py:127: FutureWarning: Pass sr=16000 as keyword args. From version 0.10 passing these as positional arguments will result in an error
x, sr = librosa.load(f, sampling_rate)
C:\app\WPy64-310111\python-3.10.11.amd64\lib\site-packages\librosa\core\convert.py:1332: RuntimeWarning: divide by zero encountered in log10
- 2 * np.log10(f_sq)
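The first warning says the sample rate should be passed as a keyword argument. A minimal sketch of that form is below; `load_audio` is a hypothetical wrapper name, and the default of 16000 is taken from the warning text, not from the Syntheon source.

```python
def load_audio(path, sampling_rate=16000):
    """Load an audio file with librosa, resampled to sampling_rate."""
    # Imported inside the function so this helper can be defined
    # even on a machine where librosa is not installed.
    import librosa

    # Keyword form: the warning above says positional use becomes an error in 0.10.
    x, sr = librosa.load(path, sr=sampling_rate)
    return x, sr
```

This only silences the FutureWarning; it does not address the divide-by-zero RuntimeWarning that follows it.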
@lanmower Thank you for your interest in this project! I am currently not focusing fully on Syntheon, but I'm happy to discuss improvements.
For the kick sounding "bassy", one option is to introduce filter modulation. Recent related research might make this possible.
For the shorter-than-expected envelopes: we rely on librosa.onset.onset_detect to detect onsets and cut out a one-shot sample for further analysis. The onset detection can go wrong, and when it does the one-shot usually ends up too short (which may also affect the resulting wavetable). One option is to migrate to another onset detection / transcription library (e.g. Essentia or BasicPitch), but each comes with its own inaccuracies, and some might not work well on drums.
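A rough sketch of the failure mode described above. The slicing rule here (first onset up to the second onset, else to the end) is an assumption for illustration and may not match what Syntheon actually does; `detect_onsets` and `cut_one_shot` are hypothetical names.

```python
def detect_onsets(y, sr):
    """Onset positions in samples, using librosa (imported lazily)."""
    import librosa
    return librosa.onset.onset_detect(y=y, sr=sr, units="samples", backtrack=True)

def cut_one_shot(y, onsets):
    """Slice y from the first onset up to the second onset, or to the end."""
    if len(onsets) == 0:
        return y  # nothing detected: keep the whole signal
    start = int(onsets[0])
    end = int(onsets[1]) if len(onsets) > 1 else len(y)
    return y[start:end]

# Stand-in "signal" so the effect is visible without any audio.
signal = list(range(10))
print(cut_one_shot(signal, [2]))     # [2, 3, 4, 5, 6, 7, 8, 9]
print(cut_one_shot(signal, [2, 6]))  # [2, 3, 4, 5]
```

With a single detected onset the whole tail is kept; a spurious second onset inside the decay cuts the one-shot early, which would look like an envelope that is too short.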
Happy to have a look at the analog drum sounds too.