Please update this so it works with the latest-generation DiffSinger models that include a linguistic.onnx model

Question

yakotoka opened this issue 2 years ago · 1 comments

It looks like newer-generation DiffSinger models now include a linguistic model (linguistic.onnx) that takes tokens, word divisions (word_div) and word durations (word_dur) as input. Its outputs, encoder_out and x_masks, are then fed to the duration.onnx model.

Example below (please tell me whether the zeroes are needed in this example):
results = linguistic_model.run(None, {
    "tokens": [[26, 1, 22, 35, 11]],
    "word_div": [[3, 2, 0, 0, 0]],
    "word_dur": [[48, 24, 0, 0, 0]]
})
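
One way to reason about the zeroes is to build the three inputs from a per-word description and check that they agree with each other. The sketch below assumes, without confirmation from this thread, that word_div and word_dur each hold one entry per word, in which case no trailing zeroes would be used. The helper name build_linguistic_inputs and the (phoneme ids, frames) pair layout are hypothetical, chosen only for illustration. Whether a particular exported model accepts this shape should be checked against its declared input shapes.

```python
def build_linguistic_inputs(words):
    """Build linguistic-model inputs from per-word data.

    words: list of (phoneme_token_ids, duration_in_frames) pairs, one per word.
    Assumption: word_div and word_dur are per-word sequences, not padded
    to the token length.
    """
    tokens, word_div, word_dur = [], [], []
    for phoneme_ids, frames in words:
        if not phoneme_ids:
            raise ValueError("each word needs at least one phoneme token")
        tokens.extend(phoneme_ids)         # flat phoneme sequence
        word_div.append(len(phoneme_ids))  # number of phonemes in this word
        word_dur.append(frames)            # duration of this word in frames
    # Wrap each list to add a leading batch dimension of 1.
    return {"tokens": [tokens], "word_div": [word_div], "word_dur": [word_dur]}


# The values from the example above, grouped into two words.
inputs = build_linguistic_inputs([([26, 1, 22], 48), ([35, 11], 24)])

# Consistency check: the word divisions must account for every token.
assert sum(inputs["word_div"][0]) == len(inputs["tokens"][0])

print(inputs)
# {'tokens': [[26, 1, 22, 35, 11]], 'word_div': [[3, 2]], 'word_dur': [[48, 24]]}

# Before calling linguistic_model.run, each value would typically be converted
# to a NumPy array, e.g. numpy.asarray(v, dtype=numpy.int64). The int64 dtype
# is an assumption here, not something stated in this thread.
```

If the model does turn out to need fixed-length inputs, the same helper could pad word_div and word_dur afterwards; the sum check would still hold because the padding entries are zero.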

Happy to get your thoughts, thank you!

Answer 1 · 2023-08-25T14:04:01.000Z

This project is now deprecated. You can use OpenUTAU for DiffSinger to synthesize with ONNX models. In any case, this is only a simple demo project, and you can easily extend it or even rewrite it.