Releases of the individual datasets?
Closed this issue · 2 comments
Hi, I think it would be great if you could also prepare releases of the individual corpora (earnings21, earnings22).
People could then download only the part they care about instead of a zip of the whole repo (or cloning the whole repo with git).
Just for your consideration -- but I think you should do it :)
Hi, does the method in https://github.com/revdotcom/speech-datasets#how-to-check-out-only-a-single-dataset accomplish this for you?
Sorry for the late response. Those solutions could work, but I think the barrier to entry is quite high (they require a reasonably new git with sparse checkout, plus git-lfs). Also, I added recipes for earnings21 and earnings22 to lhotse (https://github.com/lhotse-speech/lhotse), and I just feel that being able to download a single zip file release would be great. But I don't know what GitHub's limits are for these things, so it might just not be possible.