Please update this so it works with the latest-generation DiffSinger models that have linguistic.onnx models
yakotoka opened this issue · 1 comments
yakotoka commented
It looks like newer-generation DiffSinger models now include a linguistic model that takes tokens, word divisions (word_div), and word durations (word_dur) as input. Its outputs are encoder_out and x_masks, which then feed into the duration.onnx model.
Example below (please tell me if the zeroes are needed in this example):
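import numpy as np

# linguistic_model is assumed to be an onnxruntime.InferenceSession for linguistic.onnx.
# onnxruntime expects NumPy arrays as inputs; the int64 dtype is an assumption here.
results = linguistic_model.run(None, {
    "tokens": np.array([[26, 1, 22, 35, 11]], dtype=np.int64),
    "word_div": np.array([[3, 2, 0, 0, 0]], dtype=np.int64),
    "word_dur": np.array([[48, 24, 0, 0, 0]], dtype=np.int64),
})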
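For reference, here is a minimal sketch of how these three inputs could be checked before calling the model. It rests on assumptions not confirmed in this thread: that word_div holds the number of tokens in each word (so its entries should sum to the token count), that word_dur holds one duration per word, and that trailing zeros are only padding that can be dropped for a single un-batched sequence. The helper name build_linguistic_inputs is hypothetical and not part of this project.

```python
def build_linguistic_inputs(tokens, word_div, word_dur):
    """Hypothetical helper: drop zero-padding and sanity-check the inputs.

    Assumes word_div[i] is the number of tokens in word i and
    word_dur[i] is the duration (in frames) of word i.
    """
    if len(word_div) != len(word_dur):
        raise ValueError("word_div and word_dur must have the same length")

    # Keep only real words; a word with zero tokens is treated as padding.
    words = [(div, dur) for div, dur in zip(word_div, word_dur) if div > 0]
    divs = [div for div, _ in words]
    durs = [dur for _, dur in words]

    # Under the assumption above, every token belongs to exactly one word.
    if sum(divs) != len(tokens):
        raise ValueError(
            f"word_div sums to {sum(divs)} but there are {len(tokens)} tokens"
        )

    # Add the batch dimension (batch size 1) as nested lists.
    return {
        "tokens": [list(tokens)],
        "word_div": [divs],
        "word_dur": [durs],
    }


inputs = build_linguistic_inputs(
    tokens=[26, 1, 22, 35, 11],
    word_div=[3, 2, 0, 0, 0],
    word_dur=[48, 24, 0, 0, 0],
)
```

With the values from the example, the two words cover 3 + 2 = 5 tokens, which matches the length of tokens. Each list would still need to be converted to a NumPy integer array before being passed to the session, and whether the exported model accepts the unpadded shapes is something to verify against the model's declared input shapes.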
Happy to get your thoughts, thank you!
yqzhishen commented
This project is now deprecated. You can use OpenUTAU for DiffSinger to synthesize with ONNX models. In any case, this is only a simple demo project, and you can easily extend it or even rewrite it.