`dvc data status` key error
mattangus opened this issue · 1 comment
mattangus commented
Bug Report
Running `dvc data status` gives me an error on one machine but not on another:
Error
$ dvc data status -v
2024-04-24 10:23:26,182 DEBUG: v3.50.0 (pip), CPython 3.12.2 on Linux-6.5.0-28-generic-x86_64-with-glibc2.35
2024-04-24 10:23:26,182 DEBUG: command: /home/matt/workspace/virtual_environments/py-3.12/bin/dvc data status -v
2024-04-24 10:23:26,494 ERROR: unexpected error - b'2c6373811567f2b2023f065fb5a333fdeefd54bb'
Traceback (most recent call last):
File "/home/matt/workspace/virtual_environments/py-3.12/lib/python3.12/site-packages/dvc/cli/__init__.py", line 211, in main
ret = cmd.do_run()
^^^^^^^^^^^^
File "/home/matt/workspace/virtual_environments/py-3.12/lib/python3.12/site-packages/dvc/cli/command.py", line 27, in do_run
return self.run()
^^^^^^^^^^
File "/home/matt/workspace/virtual_environments/py-3.12/lib/python3.12/site-packages/dvc/commands/data.py", line 110, in run
status = self.repo.data_status(
^^^^^^^^^^^^^^^^^^^^^^
File "/home/matt/workspace/virtual_environments/py-3.12/lib/python3.12/site-packages/dvc/repo/data.py", line 234, in status
git_info = _git_info(repo.scm, untracked_files=untracked_files)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/home/matt/workspace/virtual_environments/py-3.12/lib/python3.12/site-packages/dvc/repo/data.py", line 141, in _git_info
staged, unstaged, untracked = scm.status(untracked_files=untracked_files)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/home/matt/workspace/virtual_environments/py-3.12/lib/python3.12/site-packages/scmrepo/git/__init__.py", line 307, in _backend_func
result = func(*args, **kwargs)
^^^^^^^^^^^^^^^^^^^^^
File "/home/matt/workspace/virtual_environments/py-3.12/lib/python3.12/site-packages/scmrepo/git/backend/dulwich/__init__.py", line 880, in status
staged, unstaged, untracked = git_status(
^^^^^^^^^^^
File "/home/matt/workspace/virtual_environments/py-3.12/lib/python3.12/site-packages/dulwich/porcelain.py", line 1318, in status
tracked_changes = get_tree_changes(r)
^^^^^^^^^^^^^^^^^^^
File "/home/matt/workspace/virtual_environments/py-3.12/lib/python3.12/site-packages/dulwich/porcelain.py", line 1456, in get_tree_changes
for change in index.changes_from_tree(r.object_store, tree_id):
File "/home/matt/workspace/virtual_environments/py-3.12/lib/python3.12/site-packages/dulwich/index.py", line 553, in changes_from_tree
yield from changes_from_tree(
File "/home/matt/workspace/virtual_environments/py-3.12/lib/python3.12/site-packages/dulwich/index.py", line 657, in changes_from_tree
for name, mode, sha in iter_tree_contents(object_store, tree):
File "/home/matt/workspace/virtual_environments/py-3.12/lib/python3.12/site-packages/dulwich/object_store.py", line 1745, in iter_tree_contents
tree = store[entry.sha]
~~~~~^^^^^^^^^^^
File "/home/matt/workspace/virtual_environments/py-3.12/lib/python3.12/site-packages/dulwich/object_store.py", line 154, in __getitem__
type_num, uncomp = self.get_raw(sha1)
^^^^^^^^^^^^^^^^^^
File "/home/matt/workspace/virtual_environments/py-3.12/lib/python3.12/site-packages/dulwich/object_store.py", line 601, in get_raw
raise KeyError(hexsha)
KeyError: b'2c6373811567f2b2023f065fb5a333fdeefd54bb'
2024-04-24 10:23:26,517 DEBUG: link type reflink is not available ([Errno 95] no more link types left to try out)
2024-04-24 10:23:26,517 DEBUG: Removing '/home/matt/workspace/HA/OD-Stuff/.pQ5HNP36nqZbOfuABjeJDA.tmp'
2024-04-24 10:23:26,517 DEBUG: Removing '/home/matt/workspace/HA/OD-Stuff/.pQ5HNP36nqZbOfuABjeJDA.tmp'
2024-04-24 10:23:26,517 DEBUG: Removing '/home/matt/workspace/HA/OD-Stuff/.pQ5HNP36nqZbOfuABjeJDA.tmp'
2024-04-24 10:23:26,517 DEBUG: Removing '/home/matt/workspace/HA/OD-Stuff/ha_gym/.dvc/.cache/files/md5/.cOGbTVUVskCrA6vs0mD64Q.tmp'
2024-04-24 10:23:26,525 DEBUG: Version info for developers:
DVC version: 3.50.0 (pip)
-------------------------
Platform: Python 3.12.2 on Linux-6.5.0-28-generic-x86_64-with-glibc2.35
Subprojects:
dvc_data = 3.15.1
dvc_objects = 5.0.0
dvc_render = 1.0.1
dvc_task = 0.3.0
scmrepo = 3.1.0
Supports:
http (aiohttp = 3.9.3, aiohttp-retry = 2.8.3),
https (aiohttp = 3.9.3, aiohttp-retry = 2.8.3),
s3 (s3fs = 2024.2.0, boto3 = 1.34.34)
Config:
Global: /home/matt/.config/dvc
System: /etc/xdg/xdg-ubuntu/dvc
Cache types: hardlink, symlink
Cache directory: ext4 on /dev/nvme1n1p3
Caches: local
Remotes: s3
Workspace directory: ext4 on /dev/nvme1n1p3
Repo: dvc, git
Repo.site_cache_dir: /var/tmp/dvc/repo/b11a8fb5114eb46d6400fbaefadf5890
Having any troubles? Hit us up at https://dvc.org/support, we are always happy to help!
2024-04-24 10:23:26,527 DEBUG: Analytics is enabled.
2024-04-24 10:23:26,550 DEBUG: Trying to spawn ['daemon', 'analytics', '/tmp/tmp3i_1azl0', '-v']
2024-04-24 10:23:26,557 DEBUG: Spawned ['daemon', 'analytics', '/tmp/tmp3i_1azl0', '-v'] with pid 141995
Description
This seems to be related to the untracked changes in my working directory. However, the other machine, where this command works, also has many untracked changes. `dvc status` still works.
Reproduce
I'm not sure how to reproduce this issue.
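One way to narrow this down might be to check, on both machines, whether the object id from the `KeyError` is present in the Git object database at all. The following is only a rough sketch: the helper names `loose_object_path` and `describe_object` are hypothetical, and it assumes the standard Git layout in which a loose object lives at `objects/<first 2 hex chars>/<remaining chars>` and borrowed object stores are listed in `objects/info/alternates`. It does not look inside pack files.

```python
from pathlib import Path


def loose_object_path(git_dir, sha: str) -> Path:
    """Return where Git would store `sha` as a loose (unpacked) object."""
    return Path(git_dir) / "objects" / sha[:2] / sha[2:]


def describe_object(git_dir, sha: str) -> dict:
    """Collect basic facts about where `sha` could live (hypothetical helper).

    Only loose objects are checked directly. Pack files are listed but not
    parsed, so a packed object will still show "loose": False.
    """
    git_dir = Path(git_dir)
    pack_dir = git_dir / "objects" / "pack"
    alternates = git_dir / "objects" / "info" / "alternates"
    return {
        "loose": loose_object_path(git_dir, sha).is_file(),
        "pack_files": sorted(p.name for p in pack_dir.glob("*.pack")),
        "has_alternates": alternates.is_file(),
    }


if __name__ == "__main__":
    # Object id taken from the KeyError in the traceback above.
    print(describe_object(".git", "2c6373811567f2b2023f065fb5a333fdeefd54bb"))
```

Because packed objects are not inspected here, running `git cat-file -t 2c6373811567f2b2023f065fb5a333fdeefd54bb` in the repository is the more complete check. If Git can read the object while the sketch reports no loose copy, the object is probably packed or reached through an alternate.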
Expected
On the other machine, the same command outputs `No changes.`
Environment information
Output of `dvc doctor`:
$ dvc doctor
DVC version: 3.50.0 (pip)
-------------------------
Platform: Python 3.12.2 on Linux-6.5.0-28-generic-x86_64-with-glibc2.35
Subprojects:
dvc_data = 3.15.1
dvc_objects = 5.0.0
dvc_render = 1.0.1
dvc_task = 0.3.0
scmrepo = 3.1.0
Supports:
http (aiohttp = 3.9.3, aiohttp-retry = 2.8.3),
https (aiohttp = 3.9.3, aiohttp-retry = 2.8.3),
s3 (s3fs = 2024.2.0, boto3 = 1.34.34)
Config:
Global: /home/matt/.config/dvc
System: /etc/xdg/xdg-ubuntu/dvc
Cache types: hardlink, symlink
Cache directory: ext4 on /dev/nvme1n1p3
Caches: local
Remotes: s3
Workspace directory: ext4 on /dev/nvme1n1p3
Repo: dvc, git
Repo.site_cache_dir: /var/tmp/dvc/repo/b11a8fb5114eb46d6400fbaefadf5890
Additional Information (if any):
dberenbaum commented
Thanks for the report. Unfortunately, since it's not reproducible and the error comes from dulwich rather than dvc, there does not seem to be much we can do, so I am going to close this one.
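For context, the traceback ends in `iter_tree_contents`, which looks up `store[entry.sha]` while walking a tree, and `get_raw` raises `KeyError(hexsha)` when that id is not found. The toy model below illustrates that failure mode only. It is not dulwich's code: `ToyObjectStore`, `walk`, and the short ids are made up for illustration, and it assumes the missing object is a subtree referenced by a parent tree.

```python
from typing import Dict, Iterator, List, Tuple

# (name, is_directory, object id); a simplified stand-in for a Git tree entry.
Entry = Tuple[str, bool, str]


class ToyObjectStore:
    """Hypothetical in-memory store mapping tree ids to their entries."""

    def __init__(self) -> None:
        self._trees: Dict[str, List[Entry]] = {}

    def add_tree(self, sha: str, entries: List[Entry]) -> None:
        self._trees[sha] = entries

    def __getitem__(self, sha: str) -> List[Entry]:
        # A plain dict lookup raises KeyError(sha) when the id is unknown.
        return self._trees[sha]


def iter_tree_contents(
    store: ToyObjectStore, tree_sha: str, prefix: str = ""
) -> Iterator[Tuple[str, str]]:
    """Yield (path, blob id) pairs, descending into subtrees recursively."""
    for name, is_dir, sha in store[tree_sha]:
        if is_dir:
            yield from iter_tree_contents(store, sha, prefix + name + "/")
        else:
            yield prefix + name, sha


def walk(store: ToyObjectStore, root: str = "root"):
    """Walk the tree and report a missing object instead of crashing."""
    try:
        return list(iter_tree_contents(store, root))
    except KeyError as exc:
        return f"missing object: {exc.args[0]}"


store = ToyObjectStore()
# The root tree references a subtree "2c63738" that is never added to the store.
store.add_tree("root", [("README.md", False, "blob1"), ("data", True, "2c63738")])
print(walk(store))
```

In this model the walk fails as soon as it reaches the entry whose id is absent from the store, which matches the shape of the reported error. It says nothing about why the object would be missing on one machine and present on the other.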