iterative/dvc

dvc metrics diff --all: on same branch is empty

Opened this issue · 2 comments

Bug Report

Description

Hello,
we found that dvc metrics diff --alloutputs nothing, if a_revand b_rev refer to the same git commit. No matter if they are exactly the same or different types of references (HEAD vs branch_name, etc.)

Reproduce

  • setup any dvc stage with metrics.
  • run dvc metrics diff --all $(git rev-parse --abbrev-ref HEAD) HEAD

Expected

output metrics-diff table with all values with diff = 0.0

Environment information

Output of dvc doctor:

-------------------------
Platform: Python 3.8.10 on Linux-6.5.0-28-generic-x86_64-with-glibc2.34
Subprojects:
        dvc_data = 2.22.6
        dvc_objects = 1.4.9
        dvc_render = 0.7.0
        dvc_task = 0.4.0
        scmrepo = 1.6.0
Supports:
        http (aiohttp = 3.9.3, aiohttp-retry = 2.8.3),
        https (aiohttp = 3.9.3, aiohttp-retry = 2.8.3),
        s3 (s3fs = 2024.3.1, boto3 = 1.34.51)
Config:
        Global: /home/mtunkowitsch/.config/dvc
        System: /etc/xdg/xdg-ubuntu/dvc
Cache types: hardlink, symlink
Cache directory: ext4 on /dev/nvme0n1p6
Caches: local
Remotes: s3
Workspace directory: ext4 on /dev/nvme0n1p6
Repo: dvc, git
Repo.site_cache_dir: /var/tmp/dvc/repo/6a15ed68432f0e8e4dba7e407082545a

Additional Information (if any):

Running the debugger, we noticed that metrics/diff.py:diff expects the results of metrics.show() to have the exact rev keys extracted. However metrics/show.py:show uses the brancher to extract the revs to use.
However the brancher groups revs with the same sha and joins them.

This means that when we call dvc metrics diff --all main main the brancher would group main and main and return main,main.
Hence the repo.metrics.show() outputs all metrics with the key main,main and the repo.metrics.diff() doesn't find results for main

@MaximilianTunk Great research! Since you have already dug this far, do you think you would be able to contribute a fix?

@dberenbaum Thank you :) I'm interested in contributing a fix, but i have to admit it'll be my first one ever to an open source github project. I have a really nice colleague that is here to help any time if I need help or have questions, so fixing it should'nt be any problem. If I have any questions or updates about possible solutions, I'll add a comment to this thread!