Pruning results in different hashes for files with CRLF line endings
Verify canary release
- I verified that the issue exists in the latest Turborepo canary release.
Link to code that reproduces this issue
https://github.com/cjquines/turborepo-gitattributes/
Which canary version will you have in your reproduction?
turbo 2.3.4-canary.2
Environment information
CLI:
Version: 2.3.4-canary.2
Path to executable: /home/runner/work/turborepo-gitattributes/turborepo-gitattributes/node_modules/.pnpm/turbo-linux-64@2.3.4-canary.2/node_modules/turbo-linux-64/bin/turbo
Daemon status: Not running
Package manager: pnpm
Platform:
Architecture: x86_64
Operating system: linux
WSL: false
Available memory (MB): 14810
Available CPU cores: 4
Environment:
CI: Some(
"GitHub Actions",
)
Terminal (TERM): unknown
Terminal program (TERM_PROGRAM): unknown
Terminal program version (TERM_PROGRAM_VERSION): unknown
Shell (SHELL): unknown
stdin: true
Expected behavior
File hashes for README-dos-dos.md and README-unix-dos.md should be the same before and after running turbo prune.
Actual behavior
File hashes are different.
To Reproduce
Check out the repo and run ./test.sh. Observe that the hashes of some files are different.
Additional context
Might have to do with .gitattributes, but not sure. CRLF is always weird.
I see the issue: we're calling git_odb_hashfile, which has the disclaimer:
Similar functionality to git.git's git hash-object without the -w flag, however, with the --no-filters flag
--no-filters
Hash the contents as is, ignoring any input filter that would have been chosen by the attributes mechanism, including the end-of-line conversion. If the file is read from standard input then this is always implied, unless the --path option is given.
The files outside out/ have hashes that respect .gitattributes because those hashes were written to the object database when the files became part of a commit. The files inside out/ are hashed as-is, without filters, so their CRLF line endings change the result. I will look and see if we can switch to git_repository_hashfile, as that can respect .gitattributes. (Update: this doesn't exist as a binding in the library we use, so there might be a delay while I implement it.)
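To make the mismatch concrete, here is a minimal Python sketch of how a Git blob hash is computed (SHA-1 over a "blob <size>" header, a NUL byte, and the content). The normalize_eol helper is a simplified stand-in for the CRLF-to-LF conversion that .gitattributes can request on check-in; it is an assumption for illustration, not what Git or turbo actually run, and the sample file content is made up.

```python
import hashlib


def git_blob_hash(content: bytes) -> str:
    """SHA-1 of a Git blob: 'blob <size>', a NUL byte, then the content."""
    header = b"blob %d\x00" % len(content)
    return hashlib.sha1(header + content).hexdigest()


def normalize_eol(content: bytes) -> bytes:
    """Simplified stand-in (assumption) for Git's CRLF -> LF check-in conversion."""
    return content.replace(b"\r\n", b"\n")


# Hypothetical file content with CRLF line endings.
crlf_file = b"# README\r\nhello\r\n"

# Hashing the bytes as-is, as --no-filters would.
raw_hash = git_blob_hash(crlf_file)

# Hashing after end-of-line conversion, as a committed text file would be.
filtered_hash = git_blob_hash(normalize_eol(crlf_file))

print("as-is:   ", raw_hash)
print("filtered:", filtered_hash)
print("hashes differ:", raw_hash != filtered_hash)
```

The two digests differ only because the input bytes differ, which matches the behavior described above: the committed copy is hashed after conversion, while the pruned copy is hashed as-is.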
In the meantime, there are two workarounds you could try if this is blocking for you:
- Add the pruned output to a commit, e.g. git add out && git commit -m 'pruned'. This will get all of the hashes written to the object database with .gitattributes respected.
- Manually add the file hashes to the database: git hash-object -w out/apps/app-a/README-dos-dos.md
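If more than one file is affected, the second workaround could be scripted. The sketch below assumes every file under the pruned directory should be hashed; it only builds the git hash-object -w command lines and leaves running them to you. The helper name hash_object_commands is hypothetical, and the default out directory is taken from the paths above.

```python
import os
import shlex
from typing import List


def hash_object_commands(root: str) -> List[List[str]]:
    """Build one `git hash-object -w <path>` argv per file under root."""
    commands = []
    for dirpath, dirnames, filenames in os.walk(root):
        dirnames.sort()  # walk subdirectories in a stable order
        for name in sorted(filenames):
            path = os.path.join(dirpath, name)
            commands.append(["git", "hash-object", "-w", path])
    return commands


if __name__ == "__main__":
    # Print the commands for review instead of executing them.
    for argv in hash_object_commands("out"):
        print(" ".join(shlex.quote(arg) for arg in argv))
```

To actually write the objects, each argv could be passed to subprocess.run(argv, check=True) from the repository root, so that Git can find the relevant .gitattributes.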