dvc_tutorial

data version control tutorial

Apache License 2.0

Do it

  1. First, create a Git repository, then clone it to your local computer.

  2. Install DVC.

pip install dvc
  3. Initialize the DVC project.
dvc init
  4. Add remote storage (Google Drive), then list the remotes to confirm it was added.
pip install dvc-gdrive
dvc remote add -d mygdrive gdrive://<folderID>
dvc remote list
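
The `<folderID>` in the remote URL is the last path segment of the Google Drive folder's address as shown in the browser. A minimal sketch of deriving it in the shell follows; the folder URL and its `EXAMPLE_FOLDER_ID` are placeholders rather than a real folder, and the variable names are only illustrative.

```shell
# Hypothetical folder URL copied from the browser address bar (placeholder ID).
folder_url="https://drive.google.com/drive/folders/EXAMPLE_FOLDER_ID?usp=sharing"

# Drop any query string such as "?usp=sharing".
folder_id="${folder_url%%\?*}"
# Keep only the last path segment, which is the folder ID.
folder_id="${folder_id##*/}"

remote_url="gdrive://${folder_id}"
echo "$remote_url"

# The result can then be passed to DVC:
#   dvc remote add -d mygdrive "$remote_url"
```

The `dvc remote add` call is left as a comment so the snippet can be tried without touching an existing DVC configuration.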
  5. Add the first version of the data.
dvc get https://github.com/iterative/dataset-registry tutorials/versioning/data.zip
unzip data.zip && rm -f data.zip
dvc add data/
dvc push
git add data.dvc .gitignore .dvc/config
git commit -m "ADD first version of data/"
git tag -a "v1.0" -m "data v1.0, 1000 images"
git push --follow-tags
  6. Add a new version of the data.
dvc get https://github.com/iterative/dataset-registry tutorials/versioning/new-labels.zip
unzip new-labels.zip && rm -f new-labels.zip
dvc add data/

dvc diff
git diff data.dvc
git commit -am "New version of data/ with more training images"
git tag -a "v2.0" -m "data v2.0, 2000 images"
dvc push
git push --follow-tags
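
Once both versions are tagged, an earlier version of the data can be restored by checking out the matching `data.dvc` file and letting DVC sync the workspace. The sketch below wraps those two commands in a helper. `use_data_version` is a hypothetical name chosen for illustration, not a DVC command, and the sketch assumes the `v1.0` and `v2.0` tags from the steps above and that the data is still in the local DVC cache.

```shell
# Hypothetical helper: restore data/ as it was at a given Git tag.
use_data_version() {
    tag="$1"
    # Restore the data.dvc pointer file as committed at the given tag.
    git checkout "$tag" -- data.dvc || return 1
    # Sync data/ in the workspace with the restored pointer, using the DVC cache.
    dvc checkout data.dvc
}
```

For example, `use_data_version v1.0` should bring back the first version of data/, and `use_data_version v2.0` the newer one. On a fresh clone where the cache is empty, running `dvc pull` first would likely be needed to fetch the data from the remote.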