[FR] Add one more splitting algorithm for testing
Closed this issue · 1 comments
HarikalarKutusu commented
nv: seNtences-first with unique Voices
- Sort unique sentences by recording count, then distribute them to test and dev, with the rest going to train (80-10-10% train/dev/test)
- One voice only in one split
- Ensure voice diversity 25-25-50% as in v1 algorithm
Most probably there will be cases where this algorithm fails to use the whole dataset, because it tries to enforce both sentence and voice diversity.
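
A rough sketch of how the first two rules could interact, to show why some recordings may end up unused. Everything here is an assumption made for illustration: the function name `split_nv`, the `(sentence, voice)` record layout, the ascending sort order, and the rule that a voice belongs to the first split it appears in. The 25-25-50% voice diversity rule from the v1 algorithm is left out, since its details are not described in this issue.

```python
from collections import Counter


def split_nv(records, test_frac=0.10, dev_frac=0.10):
    """Hypothetical sketch: records is a list of (sentence, voice) pairs,
    one pair per recording. Returns (splits, dropped)."""
    # Recording count per unique sentence.
    counts = Counter(sentence for sentence, _ in records)
    # Assumption: least-recorded sentences come first (ties broken by text).
    ordered = sorted(counts, key=lambda s: (counts[s], s))

    total = len(records)
    targets = {"test": test_frac * total, "dev": dev_frac * total}

    # Step 1: sentences-first. Fill test, then dev, and send the rest to train.
    sentence_split = {}
    filled = {"test": 0, "dev": 0, "train": 0}
    for s in ordered:
        if filled["test"] < targets["test"]:
            name = "test"
        elif filled["dev"] < targets["dev"]:
            name = "dev"
        else:
            name = "train"
        sentence_split[s] = name
        filled[name] += counts[s]

    # Step 2: unique voices. A voice is locked to the first split it lands in;
    # recordings that would put it into a second split are dropped.
    voice_split = {}
    splits = {"test": [], "dev": [], "train": []}
    dropped = []
    for sentence, voice in records:
        name = sentence_split[sentence]
        if voice_split.setdefault(voice, name) == name:
            splits[name].append((sentence, voice))
        else:
            dropped.append((sentence, voice))
    return splits, dropped
```

In this sketch, any voice that recorded sentences assigned to different splits loses some of its recordings to `dropped`, which is the case described above where the whole dataset cannot be used.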
HarikalarKutusu commented
We added two others (vw & vx), but will not add the one above.