[BUG] FetcherBase._tarxOptions removes files with identical inodes
Is there an existing issue for this?
- I have searched the existing issues
Current Behavior
_tarxOptions in FetcherBase specifies an extraction filter that removes any tar entry whose type matches Link. node-tar marks files that have more than one hardlink as being of type Link. This makes the behavior of tarballStream differ depending on whether one of the source files happens to share an inode with another source file. As a result, only one copy of each set of hardlinked files ends up in the target directory.
This is problematic for systems like Guix System where identical files may be deduplicated with hardlinks.
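To illustrate the mechanism, here is a minimal sketch that uses the system tar (GNU tar or bsdtar is assumed), not node-tar. It archives two paths that share an inode and counts the entries that are stored as links. The file names and the temporary directory are made up for the illustration; whether node-tar classifies entries the same way is what this report claims, not something this sketch proves.

```shell
# Sketch: a tar archive stores the second path of a hardlinked pair as a
# link entry rather than a regular file. Uses the system tar, NOT node-tar.
set -e

workdir=$(mktemp -d)
cd "$workdir"

echo 'console.log("hi")' > index.js
mkdir dist
ln index.js dist/index.js   # second name for the same inode

# Archive both paths explicitly so the order of entries is fixed.
tar -cf pkg.tar index.js dist/index.js

# In the verbose listing, a hardlink entry is shown as "<path> link to <target>".
listing=$(tar -tvf pkg.tar)
printf '%s\n' "$listing"

# Count the link entries; "|| true" keeps set -e from aborting on zero matches.
link_entries=$(printf '%s\n' "$listing" | grep -c ' link to ' || true)
echo "link entries: $link_entries"
```

A filter that drops every entry of a link type would discard the second path here, even though both paths are ordinary files from the user's point of view.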
Expected Behavior
The effective output of the tarballStream should be the same independent of whether the involved files share inodes.
Steps To Reproduce
mkdir pkgA
cat <<EOF >> pkgA/package.json
{
  "name": "pkgA",
  "version": "0.0.0",
  "description": "",
  "dependencies": {
    "pkgB": "../pkgB"
  },
  "author": "",
  "license": ""
}
EOF
mkdir pkgB
cat <<EOF >> pkgB/package.json
{
  "name": "pkgB",
  "version": "0.0.0",
  "description": "",
  "author": "",
  "license": ""
}
EOF
touch pkgB/index.js
mkdir pkgB/dist
# duplicate a file via hardlink
ln pkgB/index.js pkgB/dist/index.js
This is what this looks like:
$ tree
.
├── pkgA
│   └── package.json
└── pkgB
    ├── dist
    │   └── index.js
    ├── index.js
    └── package.json
4 directories, 4 files
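The tree output does not show that the two index.js paths are the same inode. A quick way to confirm it is sketched below. The helper names same_inode and list_hardlinked are hypothetical, and the -ef test operator is assumed to be available (it is in bash and dash).

```shell
# Sketch: helpers to check for hardlinked files in a source tree.

# Succeeds when both paths refer to the same device and inode.
same_inode() {
  [ "$1" -ef "$2" ]
}

# Lists regular files under a directory that have more than one hardlink.
list_hardlinked() {
  find "$1" -type f -links +1
}
```

From the directory that contains pkgA and pkgB, `same_inode pkgB/index.js pkgB/dist/index.js && echo "same inode"` should confirm the link, and `list_hardlinked pkgB` should list both paths.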
Now install pkgA and observe that index.js only appears once.
cd pkgA
npm install --offline --install-links=true
We only see dist/index.js, not its hardlinked alter ego:
$ tree node_modules
node_modules/
└── pkgB
    ├── dist
    │   └── index.js
    └── package.json
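For larger packages, comparing the two trees by eye is impractical. The following is a rough sketch of a helper that prints files present in the source package but absent from the installed copy. The name missing_after_install is hypothetical and not part of npm.

```shell
# Sketch: print files that exist in the source package directory but not
# in the installed copy. Paths are printed relative to the package root.
missing_after_install() {
  src=$1
  dst=$2
  src_list=$(mktemp)
  dst_list=$(mktemp)
  (cd "$src" && find . -type f | sort) > "$src_list"
  (cd "$dst" && find . -type f | sort) > "$dst_list"
  # comm -23 keeps only the lines unique to the first (source) list.
  comm -23 "$src_list" "$dst_list"
  rm -f "$src_list" "$dst_list"
}
```

Run from pkgA as `missing_after_install ../pkgB node_modules/pkgB`. With the layout shown above, this would be expected to print ./index.js.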
Environment
- npm: 9.5.1
- Node: v18.16.0
- OS: Guix System
- platform: x86_64
If you think that this is rather a bug report for node-tar, please do say so. Perhaps it should not label hardlinked files as Link.