Wrong symlinks when fetch / checkout is done from symlinked folder
thorstenwagner opened this issue · 10 comments
Assume the following:
/a_dataset.yaml # this is my root directory
/A/a.txt
Now a.txt was updated.
If I run the following commands from root /, everything works fine:
rm /A/a.txt
dud fetch
dud checkout
If I run the following commands from /A/, the symlink to a.txt is broken:
rm a.txt
dud fetch
dud checkout
Best,
Thorsten
Hi @thorstenwagner. I need more information before I can reproduce this.
/a_dataset.yaml # this is my root directory
Do you mean this is your system's root directory (the absolute path /), or your project's root directory (where the .dud directory lives)?
Now a.txt was updated.
Was it ever committed? How was it updated? Was it committed again after the update?
dud fetch
Is dud fetch necessary to reproduce this issue? If so, what's your remote config? Where's dud push in this scenario?
When I assume the simplest scenario, I can't reproduce this:
dud init
mkdir A
echo foo > A/a.txt
dud stage gen -o A/a.txt | tee a_dataset.yaml
dud stage add a_dataset.yaml
dud commit --copy
echo bar >> A/a.txt
dud commit
tree
rm A/a.txt
dud checkout
tree
cd A
rm a.txt
dud checkout
tree
Output:
Dud project initialized.
See .dud/config.yaml and .dud/rclone.conf to customize the project.
working-dir: .
outputs:
A/a.txt: {}
Added a_dataset.yaml to the index.
committing stage a_dataset.yaml
A/a.txt 4 B / 4 B 100% ?/s 1ms total
committing stage a_dataset.yaml
A/a.txt 8 B / 8 B 100% ?/s 1ms total
.
├── A
│ └── a.txt -> ../.dud/cache/ab/b4ca7eb554f159c4970bf8c7c723b724ff9e88cfeb5ee5eec6894f67bcd86b
├── a_dataset.yaml
└── run.sh
2 directories, 3 files
checking out stage a_dataset.yaml
A/a.txt 1 / 1 100% ?/s 0s total
.
├── A
│ └── a.txt -> ../.dud/cache/ab/b4ca7eb554f159c4970bf8c7c723b724ff9e88cfeb5ee5eec6894f67bcd86b
├── a_dataset.yaml
└── run.sh
2 directories, 3 files
checking out stage a_dataset.yaml
A/a.txt 1 / 1 100% ?/s 0s total
.
└── a.txt -> ../.dud/cache/ab/b4ca7eb554f159c4970bf8c7c723b724ff9e88cfeb5ee5eec6894f67bcd86b
1 directory, 1 file
It would be most helpful if you could provide a Bash script, like this one, which completely reproduces this issue.
/ is my project directory, not my system root :-) Let's see if I can make it reproducible somehow.
In principle, the file a.txt was updated successfully on a different computer. Therefore I only tried to fetch the updated data on the other computer.
To give you an example with the actual data:
I'm interested in the file gt.txt:
tomotwin_evaluation_dataset is my project directory. Now I delete the file gt.txt and then run dud fetch; dud checkout:
Looks good. Now I navigate to the dataset directory, delete gt.txt and run fetch+checkout:
As you can see, the symlink is broken now.
Kudos to @mstabrin, who made a reproducible example:
mkdir -p dudtest/mydud
ln -rs dudtest/mydud mydud
cd mydud
dud init
mkdir A
echo foo > A/a.txt
dud stage gen -o A/a.txt | tee a_dataset.yaml
dud stage add a_dataset.yaml
dud commit --copy
echo bar >> A/a.txt
dud commit
tree
rm A/a.txt
dud checkout
tree
cd A
rm a.txt
dud checkout
tree
@mstabrin is wondering: Why is the symlink relative to root? ^^
Thanks for the example that I can run and reproduce!
Having a project directory be a symlink is an unexpected use case. It's certainly not something I was planning to support, simply because I hadn't thought of it. I will poke at this a bit further, and if a fix is simple I will add it. But I can't guarantee support for a symlinked project directory. Can you help me understand the motivation behind this pattern?
My hunch is that Go's os.Getwd is the issue. From the Go docs (emphasis my own):
Getwd returns a rooted path name corresponding to the current directory. If the current directory can be reached via multiple paths (due to symbolic links), Getwd may return any one of them.
Consequently, I'm not sure how easy this will be to fix. I'll keep looking at it, though.
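To illustrate the ambiguity, here is a minimal shell sketch. It is not dud's actual code. The directory names mirror the repro script above, while the cache file name `blob` and the way the target of `bad.txt` is computed are assumptions made for illustration. It shows that a directory reached through a symlink has two valid names, and that a relative link target worked out from the logical name dangles, because symlink targets are resolved from the link's physical location.

```shell
#!/bin/sh
set -eu

# Scratch area with the same layout as the repro: mydud -> dudtest/mydud
work=$(mktemp -d)
cd "$work"
mkdir -p dudtest/mydud/A dudtest/mydud/.dud/cache
echo foo > dudtest/mydud/.dud/cache/blob   # "blob" is a hypothetical cache entry
ln -s dudtest/mydud mydud

# Enter the project through the symlink, as in the repro.
cd mydud/A

logical=$(pwd -L)    # keeps the symlink component: .../mydud/A
physical=$(pwd -P)   # symlinks resolved:           .../dudtest/mydud/A
echo "logical:  $logical"
echo "physical: $physical"

# Target computed from the physical layout: one level up, then into .dud.
ln -s ../.dud/cache/blob good.txt

# Target computed from the LOGICAL directory ($work/mydud/A) to the PHYSICAL
# cache ($work/dudtest/mydud/.dud/cache). On paper the path is right, but the
# kernel resolves it from $work/dudtest/mydud/A, so it points at
# $work/dudtest/dudtest/mydud/.dud/cache/blob, which does not exist.
ln -s ../../dudtest/mydud/.dud/cache/blob bad.txt

cat good.txt
[ -e bad.txt ] || echo "bad.txt is a dangling symlink"
```

If this mix of logical and physical paths is what happens inside `dud checkout` (an assumption, not something confirmed here), then resolving the working directory to its physical form before computing relative link targets would be one way to avoid it.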