Some questions about datasets
Opened this issue · 2 comments
fengshi-cherish commented
What's the difference between test-clean and test-clean large (same question about test-other)?
pkufool commented
No difference, just larger. We guarantee that the test subsets share no books or speakers with the training set, so we can't put them into the training set. We didn't want to waste this part of the data, so we released it too, in case someone wants to test their models on a larger test set.
fengshi-cherish commented
So I just need to download all the JSON files in run.sh instead of run_pipeline.sh? And does large.tar in run_pipeline.sh include large.json (from run.sh) and test_clean_large.json? Does test_clean have no overlap with test_clean_large?