Some question about datasets?

Question

Some question about datasets?

Opened this issue 6 months ago · 2 comments

What's the difference between test-clean and test-clean large (same question for test-other)?

Answer 1 · 2024-04-03T06:37:11.000Z

No difference, they are just larger. We guarantee that the test subsets share no books or speakers with the training set, so we can't put them into the training set. We didn't want to waste this part of the data, so we released it too, in case someone wants to test their models on a larger test set.
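If you want to check the no-overlap property yourself, here is a minimal sketch. It assumes each utterance has already been loaded into a dict, and the field names `speaker` and `book` are hypothetical, not the actual schema of the released JSON files, so adapt them to whatever the manifests contain.

```python
def shared_values(train_records, test_records, key):
    """Return the set of values for `key` that appear in both record lists."""
    train_values = {record[key] for record in train_records}
    test_values = {record[key] for record in test_records}
    return train_values & test_values


# Toy records for illustration only; "speaker" and "book" are assumed field names.
train = [
    {"speaker": "spk1", "book": "book1"},
    {"speaker": "spk2", "book": "book2"},
]
test = [
    {"speaker": "spk3", "book": "book3"},
]

for key in ("speaker", "book"):
    overlap = shared_values(train, test, key)
    # An empty set means the two subsets are disjoint on this field.
    print(key, "overlap:", sorted(overlap))
```

The same function can be run between any two subsets, for example a test set and its larger counterpart, once their records are loaded.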

Answer 2 · 2024-04-03T19:45:28.000Z

So I just need to download all the JSON files in run.sh instead of run_pipeline.sh? And does large.tar in run_pipeline.sh include both large.json (from run.sh) and test_clean_large.json? Does test_clean have no overlap with test_clean_large?