This repo version-controls the ASR training data.

How to use DVC (Data Version Control)

First, install DVC (the Google Drive remote used below may also require the gdrive extra, installed with pip install "dvc[gdrive]"):

pip install dvc

To use this dataset in your project, add this repo as a git submodule from inside your project repo:

git submodule add https://github.com/Moumeneb1/try-dvc

This clones the data repo into your project and records which commit of it you are using, so you can track the specific version of the dataset used for your results. If you do not need that, just use git clone instead.
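To illustrate what the submodule records, here is a minimal sketch that uses two throwaway local repos instead of the real dataset repo. The names data-repo, project, and data, and the placeholder content of dataset.dvc, are assumptions made for the demo only. It shows that the project stores the exact commit of the data repo it was added at.

```shell
#!/bin/sh
# Sketch only: local throwaway repos stand in for the real dataset repo.
set -eu

work=$(mktemp -d)
cd "$work"

# Helper: run git with a demo identity so commits work on any machine.
g() {
    git -c user.name="Demo" -c user.email="demo@example.com" \
        -c commit.gpgsign=false "$@"
}

# Stand-in for the dataset repo (placeholder file, not a real .dvc file).
git init -q data-repo
echo "placeholder pointer" > data-repo/dataset.dvc
g -C data-repo add dataset.dvc
g -C data-repo commit -q -m "dataset v1.0"
data_commit=$(git -C data-repo rev-parse HEAD)

# Stand-in for your project repo; add the dataset repo as a submodule.
# protocol.file.allow is only needed because the demo clones a local path.
git init -q project
g -C project -c protocol.file.allow=always \
    submodule add "$work/data-repo" data >/dev/null 2>&1

# The project index now holds a gitlink entry pinning the data repo commit.
pinned_commit=$(git -C project ls-files -s data | awk '{print $2}')

echo "dataset repo HEAD: $data_commit"
echo "pinned in project: $pinned_commit"
```

The two printed hashes should match. A plain git clone gives you the same files, but your project would not record which commit of the dataset it depends on.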

We use git tags to track versions of our dataset. From inside the dataset repo, use git checkout to switch to a specific version:

git checkout tags/v1.0
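As a rough illustration of how tags select a dataset version, the sketch below builds a throwaway repo with two tagged commits and switches between them. The tag v2.0 and the file contents are made up for the demo, and the dataset.dvc written here is a placeholder rather than a real DVC pointer file.

```shell
#!/bin/sh
# Sketch only: a throwaway repo with two tagged "dataset versions".
set -eu

demo_dir=$(mktemp -d)
cd "$demo_dir"
git init -q

# Helper: commit with a demo identity so this works on any machine.
commit() {
    git -c user.name="Demo" -c user.email="demo@example.com" \
        -c commit.gpgsign=false commit -q -m "$1"
}

echo "pointer to dataset v1.0" > dataset.dvc
git add dataset.dvc
commit "dataset v1.0"
git tag v1.0

echo "pointer to dataset v2.0" > dataset.dvc
git add dataset.dvc
commit "dataset v2.0"
git tag v2.0

# List the available versions, then switch back to the first one.
git tag --list
git checkout -q tags/v1.0

checked_out=$(cat dataset.dvc)
echo "$checked_out"
```

After the checkout, dataset.dvc holds the v1.0 content again. In the real repo the same thing happens to the .dvc pointer file, which is what tells dvc pull which data to fetch.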

Now add the DVC remote:

dvc remote add -d storage gdrive://1c-tFhMoms1PhkN7J4NcS8OswpDJOGJds

DVC now points to the remote storage (where the data is actually stored), so you can pull the dataset you need:

dvc pull dataset.dvc
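The steps above can be collected into one script. The following is a sketch, not an official helper of this repo: the function names fetch_dataset and run and the DRY_RUN switch are hypothetical, and the script uses the plain git clone variant rather than a submodule. By default it only prints the commands; set DRY_RUN=0 to actually run them (this needs network access, git, and DVC with Google Drive support).

```shell
#!/bin/sh
# Sketch only: wraps the commands from this README in one function.
set -eu

# DRY_RUN=1 (default) prints each command instead of executing it.
DRY_RUN=${DRY_RUN:-1}

run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

# fetch_dataset TAG: clone the data repo, check out TAG, pull the data.
fetch_dataset() {
    tag=${1:-v1.0}
    run git clone https://github.com/Moumeneb1/try-dvc
    run cd try-dvc
    run git checkout "tags/$tag"
    run dvc remote add -d storage gdrive://1c-tFhMoms1PhkN7J4NcS8OswpDJOGJds
    run dvc pull dataset.dvc
}

fetch_dataset v1.0
```

Keeping the dry run as the default makes it easy to review the exact commands before anything is downloaded.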