iterative/dvc

Is there a way to print the DVC data version of the current data file? I am experimenting with dvc.api, but couldn't find a function that prints the current version of the data (the Git tag) from Python code.

lahrims opened this issue · 4 comments

I am currently using DVC to track my data, and I usually add a Git tag to version the data file. I then manually check out a particular version of the data. Is there a way to print the data version from inside Python code?

I have tried dvc.api.exp_show(), but it returns the whole list. I also tried dvc.api.all_tags, with the same result.

What I want is to print only the current version of the data.

I see you asked the same here. Are you looking to print the current Git revision? The md5 of the file?

Sorry, I am new to DVC. I am looking to print the Git tag. Also, is there a way to check out a directory from Python? dvc.api.open() works for opening a data file, but I want to fetch a directory instead.

I am looking to print the git tag.

Since DVC relies on Git versioning here, you don't need DVC at all to print the current git tag or revision. You can use other Python libraries for Git. For example:

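from dulwich.repo import Repo
from dulwich.porcelain import describe

# Open the Git repository in the current directory
repo = Repo('.')
# describe() returns a string such as "v0.1" or "v0.1-5-gabcdefh"
print(describe(repo))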
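If installing an extra library is not an option, the same information can be read by calling the Git command line from Python. The following is only a minimal sketch: the helper name current_git_tag is hypothetical, and it assumes the git executable is on PATH and that the data version is a tag pointing exactly at the checked-out commit.

```python
import subprocess
from typing import Optional


def current_git_tag(repo_path: str = ".") -> Optional[str]:
    """Return a tag pointing exactly at HEAD, or None if HEAD is untagged.

    Hypothetical helper: shells out to `git describe`, so it assumes the
    git executable is available on PATH.
    """
    result = subprocess.run(
        # --tags also matches lightweight tags; --exact-match fails
        # unless a tag points directly at the current commit.
        ["git", "describe", "--tags", "--exact-match"],
        cwd=repo_path,
        capture_output=True,
        text=True,
    )
    if result.returncode != 0:
        return None
    return result.stdout.strip()


if __name__ == "__main__":
    print(current_git_tag())
```

Dropping --exact-match would make Git fall back to the nearest earlier tag plus a commit suffix, which is closer to what describe() from dulwich returns.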

Also, is there a way to check out a directory from Python? dvc.api.open() works for opening a data file, but I want to fetch a directory instead.

There are two ways you can manage DVC-tracked directories in Python:

  1. Use https://dvc.org/doc/api-reference/dvcfilesystem
  2. Use the undocumented internal Repo API, which mimics the command-line interface. For example:
from dvc.repo import Repo

# Open the DVC repository in the current working directory
repo = Repo()
# Mirrors `dvc checkout`; repo.fetch(), repo.pull(), and other commands work the same way
repo.checkout()
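For the first option, a rough sketch of downloading a whole DVC-tracked directory at a given Git tag with DVCFileSystem might look like the following. The function name download_tracked_dir, the repository URL, the tag v1.0, and the paths are all placeholders chosen for illustration. DVC is imported inside the function only so that the module can be loaded where DVC is not installed.

```python
def download_tracked_dir(repo_url: str, rev: str, remote_dir: str, local_dir: str) -> None:
    """Copy a DVC-tracked directory at a given Git revision to a local path.

    Hypothetical helper built on DVCFileSystem; every argument is supplied
    by the caller, nothing here comes from the original thread.
    """
    # Imported lazily so this module still loads without DVC installed.
    from dvc.api import DVCFileSystem

    # rev can be any Git revision: a tag, a branch, or a commit hash.
    fs = DVCFileSystem(repo_url, rev=rev)
    # recursive=True downloads the directory and everything below it.
    fs.get(remote_dir, local_dir, recursive=True)


if __name__ == "__main__":
    # Placeholder values: replace with your own repository, tag, and paths.
    download_tracked_dir("https://github.com/user/repo", "v1.0", "data", "data_v1.0")
```

Because rev is passed explicitly, the calling script always knows which data version it fetched, which avoids having to look up the current tag afterwards.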

Thanks for helping out.