common-voice/cv-dataset

Feature request: CSV

bulvara opened this issue · 3 comments

A CSV with just the most important data (locale, totalHrs, validHrs, and nr_of_voices) would be great.

I'm using cat $cv_corpus.json | jq -r '.locales | keys[] as $k | "\($k),\(.[$k] | .validHrs)"'

Nice one-liner, but unfortunately it is not cross-platform, and it depends on having a shell and jq available.
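For anyone who needs the same flattening without a shell or jq, here is a minimal Python sketch using only the standard library. It assumes the stats JSON has the shape the jq filter above implies: a top-level `locales` object keyed by locale code, with `validHrs` in each entry. That `totalHrs` sits at the same level, and that `users` is the key for the number of voices, are assumptions on my part; adjust `fields` if your file differs. The function name `locales_to_csv` is made up for this example.

```python
import csv
import io


def locales_to_csv(stats: dict, fields=("totalHrs", "validHrs", "users")) -> str:
    """Flatten stats["locales"] into CSV text, one row per locale.

    The default field names are assumptions about the stats file;
    pass a different tuple if your keys are named otherwise.
    """
    out = io.StringIO()
    # Force "\n" so the output is identical on Windows, macOS, and Linux.
    writer = csv.writer(out, lineterminator="\n")
    writer.writerow(("locale", *fields))
    # Sort locale codes, matching the ordering produced by jq's `keys`.
    for locale in sorted(stats["locales"]):
        entry = stats["locales"][locale]
        # Missing keys become empty cells instead of raising KeyError.
        writer.writerow([locale, *(entry.get(name, "") for name in fields)])
    return out.getvalue()
```

To use it, load the stats file with `json.load` and write the returned string to a `.csv` file. Going through the `csv` module rather than string concatenation means any value containing a comma or quote is escaped correctly.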

@bulvara, I implemented the following visualization web app. It is not a 1-to-1 replica of the data, but a flattened version of it. You can also download the tables as CSV files and the graphs as PNG images. I keep it up to date by following this repo.

https://cv-metadata-viewer.netlify.app/

Here is the related repo: https://github.com/HarikalarKutusu/cv-tbox-metadata-viewer

Any requests, bug reports, or PRs are welcome there.