
Submodules of git dependencies are not fetched

ntucker opened this issue · 25 comments

Do you want to request a feature or report a bug?

What is the current behavior?
Having a git dependency with a submodule results in the submodule being an empty directory

If the current behavior is a bug, please provide the steps to reproduce.

  1. Set up an npm package at a GitHub URL
  2. Add a submodule to that package
  3. In a local project, add a dependency in package.json that points to the GitHub URL of that package
  4. Run yarn install
  5. Check the node_modules directory for that package and see that the submodule folder is empty.

What is the expected behavior?
Submodule directories should have the contents (the cloning process should run submodule init and submodule update)
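
For reference, the expected behavior corresponds roughly to the following git commands. This is a minimal sketch, not Yarn's fetcher code; the function name clone_with_submodules and its arguments are hypothetical.

```shell
#!/bin/sh
# Clone a repository, then initialize and fetch its submodules (including
# nested ones), so that submodule directories are populated.
# clone_with_submodules is a hypothetical helper name used for illustration.
clone_with_submodules() {
  url=$1
  dest=$2
  git clone "$url" "$dest" &&
    git -C "$dest" submodule update --init --recursive
}
```

Usage would look like `clone_with_submodules <repository-url> <target-directory>`.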

Please mention your node.js, yarn and operating system version.
Node: 7.0.0
Yarn: 0.16.1
OS: Ubuntu 14.04

Here's the discussion for when they added this to npm: npm/npm#1876


Some npm packages won't work without this feature.

Moreover, the submodules get deleted if they are in their rightful place before running yarn.

I look forward to using yarn, but I cannot switch until this gets fixed :(


Any news on it?

While I hope this issue is getting some research invested into it, this placeholder workaround might serve as a temporary solution for all of us who are waiting.

In the package.json of the module being fetched, add the following script:
"postinstall": "grep url .gitmodules | sed 's/.*= //' | while read url; do git clone $url; done"

Qix- commented

So here's the synopsis of the problem:

The Git utilities included with Yarn seem to do a git clone and then git archive to archive the repository. I've modified the clone command to include --depth=1 --recursive, but the problem is with git archive - it's been identified that it doesn't archive submodules and thus will need to have a workaround.

I'll see what I can do.

Qix- commented

Also, I highly suggest that support for remote archiving is removed. Git has terrible support for it, and it would require a lot of extra code just to get submodules working. As it stands, there isn't any code that enables any of the features, and github.com is explicitly disabled for remote archiving.

I'll make a note of this in the eventual PR and ask the owners of Yarn if it should be removed. I agree with the comment in-source:

const supportsArchiveCache: {[key: string]: boolean} = map({
  'github.com': false, // not support, doubt they will ever support it

Also, the code for determining if the URI supports remote archiving is extremely hackish and nasty. It'd be a delight to see it gone IMO.

try {
  await Git.spawn(['archive', `--remote=${ref.repository}`, 'HEAD', Date.now() + '']);
  throw new Error();
} catch (err) {
  const supports = err.message.indexOf('did not match any files') >= 0;
  return (supportsArchiveCache[hostname] = supports);

For all of the extra code overhead and this lengthy check on what appears to be every fetch, I'm not convinced the tradeoff is worth it.

Without it, we can simply do a bulk copy (or even better, a bulk hard link - but I won't write that logic 😅) between the fetched and the target directory, which would fix this issue and simplify the fetching code a bit more - and probably make cloning from a repository much faster since it requires one less request.
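
To illustrate the bulk-copy idea, here is a rough shell sketch. It is not Yarn's code (the fetcher is JavaScript), and the function name copy_checkout is hypothetical. It assumes the fetched directory is a complete checkout with submodules already populated, and it drops git metadata so that only working-tree files reach the target.

```shell
#!/bin/sh
# Copy a fetched checkout (including populated submodule directories) into
# the target directory, then remove git metadata from the copy.
# copy_checkout is a hypothetical helper name used for illustration.
copy_checkout() {
  src=$1
  dest=$2
  mkdir -p "$dest" &&
    # "src/." copies the directory contents, dotfiles included.
    cp -R "$src"/. "$dest"/ &&
    # The top level has a .git directory; each submodule has a .git file.
    # -prune stops find from descending into entries it is about to delete.
    find "$dest" -name .git -prune -exec rm -rf {} +
}
```

Unlike git archive, a plain copy does not care whether a directory belongs to the parent repository or to a submodule.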

Is there any fix for this?

We are officially waiting on this too, because https://github.com/oracle/node-oracledb v2.0 will need this to work.

@pgkehle do you work at Oracle?

Hopefully some company here in this thread can allocate one developer to contribute this feature to yarn?

@brunolemos No, I do not. I work at NC State. But we are using Oracle servers, some of which still use the legacy LONG RAW type, and it's a need for what my department is developing. We do not have the bandwidth to look into making that install issue work. 😞

As a workaround, you could remove git dependencies from your package.json and use a postinstall hook script instead.
The script could either fall back to npm install just for the git dependencies, or it could use git directly to clone them into node_modules, then descend into each one and run yarn or npm install (if they themselves have dependencies).
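
A minimal sketch of the second variant (cloning with git directly), assuming a POSIX shell plus git and npm on the PATH. The function name install_git_dep and its arguments are hypothetical; the clone flags are the ones mentioned earlier in this thread.

```shell
#!/bin/sh
# Clone one git dependency, with its submodules, into node_modules/<name>,
# then install that package's own dependencies if it has a package.json.
# install_git_dep is a hypothetical helper name used for illustration.
install_git_dep() {
  url=$1
  name=$2
  dest="node_modules/$name"
  # Start from a clean directory so git clone does not refuse to run.
  rm -rf "$dest" &&
    git clone --depth=1 --recursive "$url" "$dest" &&
    (
      cd "$dest" || exit 1
      if [ -f package.json ]; then
        npm install
      fi
    )
}
```

A postinstall script would then call `install_git_dep <repository-url> <package-name>` once per git dependency that was removed from package.json.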

@jokeyrhyme yes, that is what we are doing currently. See oracle/node-oracledb#794 (comment), in case someone needs an example of how to do that.

What's the status? This still does not work :(

Qix- commented

The status is nobody has submitted a PR and the ticket is still open, @TheAifam5.

In package.json of a module being fetched, add a following script
"postinstall": "grep url .gitmodules | sed 's/.*= //' | while read url; do git clone $url; done"

When I handle the empty submodule folder with git clone, a permission error occurs and yarn add stops working.

What should I do?

I solved it, using a symbolic link.

├── repositoryC(submodule)
│     └── hogehoge
└── node_modules/
    └── repositoryB/
        └── repositoryC(symbolic link)
             └── (hogehoge)

This is only a temporary solution, but ... https://qiita.com/nitaking/items/3340955e80144ec90ad6

In my case, to install each submodule at the path given in .gitmodules, I needed to set postinstall in package.json like this:

"postinstall": "sed -n -e '/path/,/url/p' .gitmodules | sed 'N;s/\\n/\\$$$/' | while IFS= read -r line; do if [[ $line =~ (.*)\\$\\$\\$(.*) ]] ; then path=\"$(echo ${BASH_REMATCH[1]} | sed 's/.*= //')\"; url=\"$(echo ${BASH_REMATCH[2]} | sed 's/.*= //')\"; git clone $url $path; fi done",


To avoid the problem where later yarn add/install runs fail because the folders are not empty, use this instead:

"postinstall": "sed -n -e '/path/,/url/p' .gitmodules | sed 'N;s/\\n/\\$$$/' | while IFS= read -r line; do if [[ $line =~ (.*)\\$\\$\\$(.*) ]] ; then path=\"$(echo ${BASH_REMATCH[1]} | sed 's/.*= //')\"; url=\"$(echo ${BASH_REMATCH[2]} | sed 's/.*= //')\"; if [ -d $path/.git ] ; then start=$PWD; cd $path; git pull; cd $start; else git clone $url $path; fi fi done",

Any progress on this?

Qix- commented

@Shinobu1337 If there's nothing here, then no. Progress in the OSS community is pretty transparent.

EDIT (1.5 years later): not sure how what I said was subjective. Someone creating a PR would have it mentioned here, so if there's nothing here on this github issue then nothing has been done...

@reyalpsirc That did not work on my computer, so I made some improvements to your code. You need to manually replace xxxxx with the names of your packages that contain submodules:

"postinstall": "project_dir=$PWD; modules=( 'xxxxx' ); for module in \"${modules[@]}\"; do echo \"Notice: update module ${module}\"; cd \"./node_modules/${module}\"; if [[ -f .gitmodules ]]; then sed -n -e '/path/,/url/p' .gitmodules | sed 'N;s/\\n/\\$$$/' | while IFS= read -r line; do if [[ $line =~ (.*)\\$\\$\\$(.*) ]] ; then path=\"$(echo ${BASH_REMATCH[1]} | sed 's/.*= //')\"; url=\"$(echo ${BASH_REMATCH[2]} | sed 's/.*= //')\"; if [[ -d \"${path}/.git\" ]] ; then echo \"Notice: git pull submodule $(realpath ${path})\"; (cd \"${path}\" && git pull); else if [[ -z \"$(ls -A ${path})\" ]]; then echo \"Notice: git clone submodule $(realpath ${path})\"; rm -rf \"${path}\"; git clone $url $path; else echo \"Notice: $(realpath ${path}) submodule folder is not empty, try to remove it before clone.\"; fi; fi; else echo \"Notice: module ${module} does not contain submodules.\"; fi; done; else echo \"Notice: module ${module} does not contain submodules.\"; fi; cd \"${project_dir}\"; done;"

Tested on macOS bash.
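
The one-liners above are hard to read and depend on bash-only features. As a rough sketch, the same idea can be written as a standalone POSIX shell script. The function names list_submodules and sync_submodules are hypothetical, and the parser assumes every [submodule] section has exactly one path line and one url line. Like the one-liners, it clones whatever the remote's default branch points at, not the commit pinned by the parent repository.

```shell
#!/bin/sh
# Print "path<TAB>url" for each submodule section in a .gitmodules file.
# Assumes one "path =" and one "url =" line per [submodule] section.
list_submodules() {
  awk '
    /^\[submodule/          { path = ""; url = "" }
    /^[ \t]*path[ \t]*=/    { v = $0; sub(/^[^=]*=[ \t]*/, "", v); path = v }
    /^[ \t]*url[ \t]*=/     { v = $0; sub(/^[^=]*=[ \t]*/, "", v); url = v }
    path != "" && url != "" { printf "%s\t%s\n", path, url; path = ""; url = "" }
  ' "${1:-.gitmodules}"
}

# Clone each submodule into its declared path, or pull if it is already a
# clone. Run this from the directory that contains the .gitmodules file.
sync_submodules() {
  list_submodules "${1:-.gitmodules}" |
    while IFS="$(printf '\t')" read -r path url; do
      if [ -d "$path/.git" ]; then
        git -C "$path" pull
      else
        # git clone accepts an existing directory as long as it is empty.
        git clone "$url" "$path"
      fi
    done
}
```

To use it, save the functions in a script, add a final `sync_submodules` call, and point postinstall at that script. Git can also read the file itself with `git config --file .gitmodules`, which avoids hand-rolled parsing.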

Not fixed yet in 2020? Ok time to move on to something else

This is very disappointing, especially since npm fixed this years ago. Now I can't use any dependencies without adding a dummy package.json. napa would be a workaround if I didn't have an install step (cmake-js), since you can't portably chain install steps with npm. All the postinstall scripts here fail because they are not portable to Windows.