facebook/watchman

[Feature Request] Symlink support

JohnyDays opened this issue · 60 comments

Is there any performance / complexity reason for not supporting symlinks?

With the release of react-native, and the use of watchman in its packager, it's troubling that npm's (react-native's chosen package manager) endorsed method of developing local interdependent libraries with npm link is not supported, and having to commit to a remote repo and download on every change is obviously not ideal.

wez commented

Watchman recursively watches files that are contained within a filesystem tree.
Symlinks can point to an arbitrary location on the filesystem.
Changes to symlink targets that are outside the tree are not observable.

Why not simply resolve the symlink and watch its target? Because it isn't simple. Here are a handful of reasons that make this a difficult prospect. This is not an exhaustive list; it's just a few reasons I can rattle off the top of my head:

  • The target of the symlink may not exist at the time we want to establish the watch, and neither may any of the parent directories in the symlink path. The operating system will not allow us to watch a path that does not exist.
  • Any of the directory components in the target symlink path may themselves be a symlink and may change at an arbitrary time in their lifecycle in a way that is not detectable unless we watch every single directory component in the symlink target path.
  • The target of the symlink, or any of the directory components of the symlink target, may be a remote filesystem that doesn't support change notifications.
  • Each additional watched directory consumes more filesystem watch resources. There are finite limits on these, and not every system allows raising these limits.
  • Because the hypothetical watches that we'd establish for each symlink are distinct from the watch used for the root of the real tree, we lose certain ordering guarantees when processing change notifications from the kernel. We wouldn't be able to tell whether a change to the symlink target happened logically before or after changes that we observe in the main tree. This important property is used to ensure that filesystem changes have been observed up to the logical point in time of a query. More details on this can be found at https://facebook.github.io/watchman/docs/cookies.html
  • Filesystem changes are fraught with TOCTOU (time of check, time of use) issues already. Adding user-space resolution of symlinks increases the chances of more TOCTOU bugs dramatically; any component of the symlink path may change while we're processing information about them.

Correctly handling all of the above would be tremendously complex and still be error prone. As a result, it is unlikely that we'll ever add support for resolving and tracking symlink targets.
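
Two of the points above are easy to reproduce outside of Watchman. The following is a minimal Python sketch, not Watchman code: the directory names are made up, and it assumes a POSIX system where creating symlinks is allowed. It shows that a recursive walk of a root does not descend into a symlinked directory unless asked to, and that a symlink can point at a path that does not exist.

```python
import os
import tempfile

def files_under(root, follow=False):
    """Return regular files found by walking root, as paths relative to root."""
    found = []
    for dirpath, _dirnames, filenames in os.walk(root, followlinks=follow):
        for name in filenames:
            full = os.path.join(dirpath, name)
            if os.path.isfile(full):  # skips dangling symlinks
                found.append(os.path.relpath(full, root))
    return sorted(found)

base = tempfile.mkdtemp()
root = os.path.join(base, "project")
outside = os.path.join(base, "elsewhere")
os.makedirs(root)
os.makedirs(outside)
with open(os.path.join(outside, "lib.js"), "w") as f:
    f.write("module.exports = {};\n")

# A symlink to a directory outside the tree, and one to a path that does not exist.
os.symlink(outside, os.path.join(root, "linked"))
dangling = os.path.join(root, "dangling")
os.symlink(os.path.join(base, "missing"), dangling)

print(files_under(root))               # the walk stays inside the real tree
print(files_under(root, follow=True))  # only now is lib.js reached
print(os.path.lexists(dangling), os.path.exists(dangling))
```

The walk that does not follow links is the cheap, well-defined case; everything listed above concerns what happens once the second call's behaviour is wanted.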

wez commented

Taking a step back from this, can you describe what you're trying to do and why symlinks are important? I'm a casual and occasional node and npm user and am not familiar with npm link

I understand why this would be difficult, and I'll close the ticket and resign myself to using the above solution for developing local modules.

I'll attempt to explain the process behind npm link, with my use case as an example.

I am developing a module called react-native-waterfall, it implements a generic waterfall view.
I am at the same time developing, in a sister folder, a module called react-native-social-waterfall, and this module depends on the above module.
I am at the same time developing utility modules, which both of the above modules depend on.

npm requires modules from either:

  1. a globally qualified name, e.g. react-native-waterfall
  2. a path, e.g. ../react-native-waterfall
  3. a URL, e.g. https://github.com/facebook/watchman/

The second solution has 2 problems:

  1. Every time you change any file in a dependent module, you have to run npm install ../react-native-waterfall to reinstall the module.
  2. You are checking in code that doesn't mirror the final product, where you will be using globally qualified namespaces hosted on npm.

The third solution has 2 problems as well (both are slow):

  1. Every time you change any file in a dependent module, you have to commit it and push it to origin.
  2. You must then go back to your dependent module and run npm update to grab the latest commit.

npm link works the following way:

  1. You go into the depended-upon module and run npm link; this registers react-native-waterfall as a globally qualified name.
  2. You go into the dependent module and run npm link react-native-waterfall; this creates a symlink to the globally qualified name, which in turn points at the folder where you ran npm link.

This allows the following benefits:

  1. Your checked in code mirrors the final product.
  2. You do not have to do anything to update the local dependency files, meaning you can quickly test code without committing / running commands.
  3. You can even edit the depended upon module in your dependent folder, and the changes will be applied to the original module.
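
The two steps can be pictured as two symlinks chained together. Below is a rough Python sketch of the resulting layout. It is only an illustration: the `global_modules` directory is a made-up stand-in for npm's global folder, and this is not how npm itself is implemented.

```python
import os
import tempfile

base = tempfile.mkdtemp()
library = os.path.join(base, "react-native-waterfall")
app = os.path.join(base, "react-native-social-waterfall")
global_modules = os.path.join(base, "global_modules")  # hypothetical stand-in
os.makedirs(library)
os.makedirs(os.path.join(app, "node_modules"))
os.makedirs(global_modules)

# Step 1 ("npm link" inside the library): global name -> library folder.
global_link = os.path.join(global_modules, "react-native-waterfall")
os.symlink(library, global_link)

# Step 2 ("npm link react-native-waterfall" inside the app): local name -> global name.
local_link = os.path.join(app, "node_modules", "react-native-waterfall")
os.symlink(global_link, local_link)

# Writing through the app's node_modules writes into the library itself.
with open(os.path.join(local_link, "index.js"), "w") as f:
    f.write("// edited via the link\n")

print(os.path.realpath(local_link) == os.path.realpath(library))
print(os.path.exists(os.path.join(library, "index.js")))
```

Because the app only ever sees a link, a watcher rooted at the app folder never receives events for files that physically live in the library folder.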

wez commented

Thanks. How does watchman fit into this?

I am using watchman with the react-native packager, which automatically parses dependencies, transforms code and reloads the application when you change a file. However, I have to continuously restart the packager because it doesn't detect changes in the symlinked dependencies.

Idea:
I understand why the added complexity of implicitly following symlinks would cause too many troubles, both in implementation and in maintenance. So would it be possible to instead implement an explicit/manual symlink helper in the form of:

/dir1/
/dir2/

watch dir1
watch dir2 --link-to dir1

Any change to a file in dir2 would trigger an event for the analogous file in dir1.
This could easily be automated into an npm link workflow, along with some kind of unlink function.
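
The path translation this idea implies could look roughly like the following Python sketch. Note that `--link-to` is only the flag proposed in this comment, not an existing Watchman option, and `translate_event` is an illustrative helper name.

```python
import os

def translate_event(changed_path, linked_root, target_root):
    """Map a changed path under linked_root to the analogous path under target_root.

    Returns None when changed_path is not inside linked_root.
    """
    changed = os.path.normpath(changed_path)
    linked = os.path.normpath(linked_root)
    if os.path.commonpath([changed, linked]) != linked:
        return None
    relative = os.path.relpath(changed, linked)
    return os.path.normpath(os.path.join(target_root, relative))

print(translate_event("/dir2/src/index.js", "/dir2", "/dir1"))
print(translate_event("/other/index.js", "/dir2", "/dir1"))
```

The mapping itself is trivial; the hard parts described earlier in the thread (ordering across two watches, targets that move or disappear) are untouched by it.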

wez commented

Sorry to disappoint; it's too complex to build something to handle this that will work according to expectations.
It sounds like this is something that the react package manager should be handling, since it is the component that has special knowledge of the situation.

wez commented

I've been giving this some thought. I think full-blown symlink handling is too fraught with problems to be something we can commit to, but it doesn't mean that we can't help out in some of the easier or more common cases.

I'll collect some thoughts into a wiki page through the weekend and link to it from here.

Sounds good! Would be a great productivity boost for any npm-related consumers developing modules simultaneously, which seems to be a great deal of them nowadays

This came up a few times. I think we can handle it in the RN packager by detecting symlinks and starting a new watch on these symlinks.

There are also reports that the initial file system crawl (which queries watchman) is wrong if there are any linked directories. @wez: even if watchman doesn't follow symlinks, shouldn't it report them like any other file?

wez commented

@amasad I've been brainstorming with @bhamiltoncx and we have a plan for this. I'd say hold off from implementing anything in RN for the moment; we have some diffs in progress.

Any updates on this? Really makes developing modules a pain.

👍

+1

+1, unfortunately, as you know, RN is in a bit of a bind without symlinks, as it's a common practice to npm link local modules for development. I understand the challenges you describe in #105 (comment), but I wanted to chime in with a +1 for the pain point I'm currently experiencing.

For now I've thrown in a bash script to recursively copy all my local dependencies into my project's node_modules directory. It isn't a perfect solution, but it works. So hopefully others could find this workaround useful.
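
A workaround along those lines might look like the sketch below. It is written in Python rather than bash, it is not the script described above, and the function name and directory names are placeholders.

```python
import os
import shutil
import tempfile

def copy_local_dependency(source_dir, project_dir):
    """Copy source_dir into project_dir/node_modules/<name>, replacing any old copy."""
    name = os.path.basename(os.path.normpath(source_dir))
    destination = os.path.join(project_dir, "node_modules", name)
    if os.path.islink(destination):
        os.unlink(destination)       # drop a leftover npm link
    elif os.path.isdir(destination):
        shutil.rmtree(destination)   # drop the previous copy
    # copytree creates missing parent directories such as node_modules.
    shutil.copytree(source_dir, destination, ignore=shutil.ignore_patterns(".git"))
    return destination

base = tempfile.mkdtemp()
dependency = os.path.join(base, "my-lib")
project = os.path.join(base, "my-app")
os.makedirs(dependency)
os.makedirs(project)
with open(os.path.join(dependency, "index.js"), "w") as f:
    f.write("module.exports = {};\n")

print(copy_local_dependency(dependency, project))
```

Since the copy lives physically inside the project, the watcher sees it like any other file; the cost is that the function has to be re-run after every change to the dependency.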

Ehesp commented

+1

jbpin commented

+1

What are module developers doing in general to work around this? @ajwhite, your solution sounds reasonable. Do you run your script by hand or have you wired up watchman to run it for you?

I'm running them by hand currently. I'm sure there's a better way

Ehesp commented

I'm just opening a new project from the node modules directory and working on it that way... Sucks but it's the easiest way I'm finding without messing about with scripts and stuff.

Yeahhh I was doing that too @Ehesp, but then the module was growing a bit so I had to get it into a safer place.

@Ehesp can you manage such a project with git without messing around with git submodules? I guess since node_modules is in .gitignore at the top level that might be doable.

Ehesp commented

It's a bit stinky but it does work like you say, since you also need to install any dependencies at root project level (so npm i without saving it).

Yeah, watchman-make is pretty easy to set up to do these simple copies.

@chetstone thanks for the link! I've never really gotten my hands directly on watchman, it's always been just part of my tool chain boilerplate. This is helpful, appreciated.

jbpin commented

I wonder if a sinopia instance loaded as a Docker container on a Vagrant-provisioned virtual machine would solve this... It's a bit too heavy to manage. npm link is a way better direction.

corbt commented

@wez is this still on the roadmap for watchman? If not it would be good to know so that we can add support to the react-native packager.

wez commented

Sorry for letting this sit. At this time, we have no one looking at this issue because it is not a priority in our main use cases. I'd be happy to give guidance on any of the components below if someone would like to contribute pull requests to help bring us closer to having this done.

What we'd discussed as an implementation strategy was this:

  1. Add the target of a symlink as a metadata field in the state that watchman tracks per file
  2. Whenever a file of type symlink is changed, record readlink() in that slot
  3. Add link target as an available field for queries (https://facebook.github.io/watchman/docs/cmd/query.html#available-fields)
  4. Add a watch_symlinks: true option to .watchmanconfig. When set, each observed symlink in that watch will be implicitly watch-project'd (See #347)
  5. Add follow_symlinks: true option to the query engine. When set, the query engine behavior will change to be aware of symlinks. (See #349)
  6. We'd need to come up with some concept of global clock to meaningfully understand and reason about changes being made across multiple projects. (See #350)
  7. Subscriptions are tricky. (See #351)
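
Items 1 to 3 amount to storing the link target next to the other per-file metadata. Here is a toy Python model of that idea; the `FileState` class and its field names are illustrative only and do not mirror Watchman's internal structures.

```python
import os
import tempfile
from dataclasses import dataclass
from typing import Optional

@dataclass
class FileState:
    name: str
    is_symlink: bool
    symlink_target: Optional[str] = None  # filled from readlink() for symlinks

def observe(path):
    """Build the tracked state for one path, recording readlink() for symlinks."""
    is_link = os.path.islink(path)
    target = os.readlink(path) if is_link else None
    return FileState(name=os.path.basename(path), is_symlink=is_link, symlink_target=target)

base = tempfile.mkdtemp()
real = os.path.join(base, "real.txt")
link = os.path.join(base, "alias.txt")
open(real, "w").close()
os.symlink(real, link)

print(observe(real))
print(observe(link))
```

Recording the raw `readlink()` value, rather than a fully resolved path, keeps this step cheap and leaves resolution to whoever consumes the field.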

"Aware of symlinks" is rather difficult

  1. The initial query, while the lock is held on the initial root, collects the non-symlink results and separately records the set of symlink results.
  2. Before releasing the lock, any of the symlink results that point into the current root are translated such that they report the symlink name but the result of the target.
  3. After releasing the lock, any of the symlink results that point into different watches are then processed in a similar fashion, but with an awareness of the effective path of the symlink.
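
As a rough illustration of the split described in these steps, the Python sketch below separates a list of query results into plain entries, symlinks that resolve inside the current root, and symlinks that resolve elsewhere. Locking is left out entirely, and `partition_results` is a hypothetical helper, not part of Watchman's query engine.

```python
import os
import tempfile

def partition_results(root, names):
    """Split results into plain entries, in-root symlinks and out-of-root symlinks.

    In-root symlinks are reported as (symlink name, target path relative to root).
    """
    real_root = os.path.realpath(root)
    plain, in_root, out_of_root = [], [], []
    for name in names:
        path = os.path.join(root, name)
        if not os.path.islink(path):
            plain.append(name)
            continue
        target = os.path.realpath(path)
        if os.path.commonpath([target, real_root]) == real_root:
            in_root.append((name, os.path.relpath(target, real_root)))
        else:
            out_of_root.append((name, target))
    return plain, in_root, out_of_root

base = tempfile.mkdtemp()
root = os.path.join(base, "root")
outside = os.path.join(base, "outside")
os.makedirs(root)
os.makedirs(outside)
open(os.path.join(root, "a.js"), "w").close()
os.symlink(os.path.join(root, "a.js"), os.path.join(root, "inner"))
os.symlink(outside, os.path.join(root, "outer"))

plain, in_root, out_of_root = partition_results(root, ["a.js", "inner", "outer"])
print(plain, in_root, out_of_root)
```

The second group can be answered from state already held for the root; the third group is the one that needs a different watch and therefore different locks.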

Global clock is something that needs some thought

Right now we track an abstract clock per watched project. There are two components to the clock; the root number and the tick counter. Each time a change is observed, the tick counter is incremented. This is fundamentally how we perform time-based queries. The root number is bumped each time we recrawl a root and allows us to disambiguate certain classes of stale clock values when deciding which events a client has missed.

Since each watch ticks independently, and symlinks can point to many different projects, there are multiple different clocks. We'd need some basis for synchronizing these watches to be able to meaningfully and easily query across them.

One possibility is adding a global root number and tick counter that are bumped whenever there is a recrawl and whenever a change is observed, respectively. We'd then add a global clock field to the state that we track per file, and update this at the same time that we update the existing oclock field. We'd then need to add a global_since generator to the query engine to indicate that this is the query strategy that should be used.

See #350
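
To make the proposal concrete, here is a toy Python model of per-root clocks plus one shared clock. Only the name `global_since` comes from the text above; the classes and everything else are assumptions made for illustration, not Watchman's implementation.

```python
from dataclasses import dataclass, field

@dataclass
class Clock:
    root_number: int = 0  # bumped on every recrawl
    ticks: int = 0        # bumped on every observed change

@dataclass
class WatchedRoot:
    path: str
    clock: Clock = field(default_factory=Clock)
    # file name -> (per-root tick, global tick) at the time of the last change
    last_change: dict = field(default_factory=dict)

class GlobalClock:
    """Hypothetical clock shared by every watched root."""

    def __init__(self):
        self.clock = Clock()
        self.roots = {}

    def add_root(self, path):
        self.roots[path] = WatchedRoot(path)
        return self.roots[path]

    def record_change(self, root_path, name):
        root = self.roots[root_path]
        root.clock.ticks += 1
        self.clock.ticks += 1
        root.last_change[name] = (root.clock.ticks, self.clock.ticks)

    def record_recrawl(self, root_path):
        self.roots[root_path].clock.root_number += 1
        self.clock.root_number += 1

    def global_since(self, tick):
        """Return (root, name) pairs changed after the given global tick."""
        changed = []
        for root in self.roots.values():
            for name, (_local, global_tick) in root.last_change.items():
                if global_tick > tick:
                    changed.append((root.path, name))
        return sorted(changed)

g = GlobalClock()
g.add_root("/app")
g.add_root("/lib")
g.record_change("/app", "index.js")
checkpoint = g.clock.ticks
g.record_change("/lib", "util.js")
g.record_change("/app", "index.js")
print(g.global_since(checkpoint))
```

A single checkpoint value is enough to ask "what changed anywhere since then", which is exactly what the independent per-root tick counters cannot answer on their own.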

Subscriptions

A given watch may be the target of multiple symlinks from multiple projects. It is desirable that a change be propagated out to subscribers watching all references to that symlink. That means that when assessing subscriptions:

  1. We'd need to track the graph of subscription source and target
  2. When a file node changes, we'd need to walk the edges of that graph to compute the set of visible aliases
  3. For each of the aliased watches, for each subscription that is using 'follow_symlinks: true', we need to inject a change event with the aliased path. This requires coordination across multiple locks with multiple connected clients.

See #351
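
The graph walk in steps 1 and 2 can be sketched as follows in Python. `AliasGraph` is a made-up name, paths are assumed to be POSIX-style, and the locking and client coordination from step 3 are deliberately left out.

```python
import os
from collections import defaultdict

class AliasGraph:
    """Tracks which symlinks (in any watched root) point at which target directories."""

    def __init__(self):
        self._aliases = defaultdict(set)  # target dir -> {symlink path, ...}

    def add_symlink(self, link_path, target_dir):
        self._aliases[os.path.normpath(target_dir)].add(os.path.normpath(link_path))

    def visible_aliases(self, changed_path, limit=1000):
        """Return every path under which changed_path is visible.

        The limit guards against cyclic links producing ever longer paths.
        """
        seen = set()
        pending = [os.path.normpath(changed_path)]
        while pending and len(seen) < limit:
            current = pending.pop()
            if current in seen:
                continue
            seen.add(current)
            for target, links in self._aliases.items():
                if current == target or current.startswith(target + os.sep):
                    suffix = current[len(target):]
                    pending.extend(link + suffix for link in links)
        return sorted(seen)

graph = AliasGraph()
graph.add_symlink("/pack1/node_modules/pack2", "/pack2")
graph.add_symlink("/pack1/node_modules/pack3", "/pack3")
graph.add_symlink("/pack2/node_modules/pack3", "/pack3")

aliases = graph.visible_aliases("/pack3/index.js")
for alias in aliases:
    print(alias)
```

Each alias found this way would correspond to a change event that some subscriber with `follow_symlinks: true` expects to receive, so one physical change can fan out into several events.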

Caveats

  1. Not all of the symlink targets in a project will be watchable.
    The filesystems in which the targets reside may not support change notifications at all, or may be restricted from being watched by local policy (https://facebook.github.io/watchman/docs/config.html#enforce_root_files, https://facebook.github.io/watchman/docs/config.html#illegal_fstypes)
  2. There's plenty of scope for TOCTOU attacks; care needs to be taken to ensure that we can't be DoS'd into looping forever while trying to resolve symlinks.
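
The "looping forever" concern in the second caveat is usually handled with a hop limit. The following is a minimal Python sketch of that guard, assuming a POSIX system; `resolve_with_limit` and the limit of 40 are illustrative choices, and only the final path component is resolved.

```python
import os
import tempfile

def resolve_with_limit(path, max_hops=40):
    """Follow a chain of symlinks by hand, giving up after max_hops links."""
    current = path
    for _ in range(max_hops):
        if not os.path.islink(current):
            return current
        target = os.readlink(current)
        # Relative targets are interpreted relative to the link's own directory.
        current = os.path.normpath(os.path.join(os.path.dirname(current), target))
    raise OSError(f"too many levels of symbolic links: {path}")

base = tempfile.mkdtemp()
real = os.path.join(base, "real")
open(real, "w").close()
second = os.path.join(base, "second")
first = os.path.join(base, "first")
os.symlink("real", second)    # second -> real
os.symlink("second", first)   # first -> second -> real

loop_a = os.path.join(base, "loop_a")
loop_b = os.path.join(base, "loop_b")
os.symlink("loop_b", loop_a)  # loop_a -> loop_b
os.symlink("loop_a", loop_b)  # loop_b -> loop_a

print(resolve_with_limit(first))
try:
    resolve_with_limit(loop_a)
except OSError as error:
    print("gave up:", error)
```

A hop limit bounds the work per lookup, but it does nothing about the link changing between the check and the use, which is the TOCTOU half of the caveat.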

Partial Support

All-singing, all-dancing support is a lot of work, but isn't strictly needed if you can contrive to have all of the files residing physically under the same watched root:

/path/to/myproject/.watchmanconfig    (defines the project root)
/path/to/myproject/node_modules/      (contains dirs that have symlinks that all point
                                      to paths under /path/to/myproject/node_modules/)

In this case, all of the changes are taking place under the same root, so we wouldn't need to do any of the implicit watch-project stuff, wouldn't need a global clock and wouldn't need to coordinate subscriptions across multiple watches.

The scope of work would be reduced to managing the aliasing within the same root at query execution time.

You can create module aliases like:

/**
 * @providesModule mylibraryname
 */

at the top of the module directory index and include it like you would any other npm package:

import { SuperBadAssComponent } from 'mylibraryname';

dutzi commented

@wez, the partial support you mentioned would be very helpful!

It would help cases in which a team is working on multiple packages: pack1, pack2 and pack3, where pack1 is dependent on pack2 and pack3. And pack2 is dependent on pack3.

In a situation like this the ultimate workflow would be:

~/pack1
├── node_modules/pack2 -> ~/pack2
└── node_modules/pack3 -> ~/pack3

~/pack2
└── node_modules/pack3 -> ~/pack3

~/pack3

But this would also be sufficient:

~/pack1
├── node_modules/pack2 (git repo)
│   └── node_modules/pack3 -> ~/pack1/node_modules/pack3
└── node_modules/pack3 (git repo)

And definitely much better than the only structure we can use right now, having multiple clones of the same repo inside node_modules:

~/pack1
├── node_modules/pack2 (pack2 git repo)
│   └── node_modules/pack3 -> (pack3 git repo)
└── node_modules/pack3 (pack3 git repo)

Is this latest commit related to the 'partial support' mentioned above? Just curious what more needs to be done.

wez commented

I created some internal tasks to work towards implementing 'partial support' as outlined above for new hires at FB to pick up during bootcamp so that we can semi-passively make progress towards implementing this. If you are looking in from outside FB and want to see this move faster by submitting some code, please don't be afraid to reach out; I'd be happy to give some guidance.

Yeah, I'm interested! But don't have a ton of experience with C. I work with React Native and having this would really improve the dev experience for those of us writing modules for it, as it would allow testing them inside real apps while we develop.

dutzi commented

Hi all, being tired of how screwed up this situation is for us I wrote a simple tool that listens to changes in one folder and copies them to another folder. It's called WML (Watchman-Links, since it's based on watchman). Check it out here.

It's very simple to use:

# add the link to wml using `wml add <src> <dest>`
wml add ~/my-package ~/main-project/node_modules/my-package
# start watching for changes
wml start

@wez, after doing a ton of debugging and tracing this afternoon, I actually think that allowing watchman to support symlinks (in some fashion) or copy-pasting (wml and custom scripts) are really the only options for React Native.

Would you be able to elaborate on what needs to be done to support:

  • following symlinks to initially discover the correct paths
  • watching symlinked paths

I'm not very sharp with C but am motivated to see this work, so I'm willing to help so that this works out of the box with RN. Watchman is the only thing that is fast enough or can support the large number of files in an RN project.

wez commented

@ekryski I put a pretty detailed dump of what is needed in the long comment above. In terms of breaking this down into tasks, #347 is almost done and I just opened #349, #350 and #351.

#349 is in the FB Bootcamp onboarding pool of tasks, which is to say that they have no explicit priority and no guarantee of making progress--they're waiting for an appropriately interested person to come along and pick them up. I'll look at adding the others into the pool a bit later.

#347 corresponds to item (4) in my list above, and #349 to (5), #350 to (6) and #351 to (7).

I think RN really wants subscriptions to trigger pushing things down to the mobile device/simulator, which means that you need everything that we've spec'd so far.

So has this been picked up? Now we are trying to share some code between web and native react. Like redux reducers and stuff, business logic. But we thought the easiest way would be to make a "commons" folder outside web and native, to have this symlinked inside web and native.

But watchman doesn't find any of the files in the symlinked folder.

This way we could share business logic between web and native.

@Snorlock we use git submodule for that. Shared (smart, no view) components and logic are in a 'core' repo which is a submodule of both apps.

wez commented

@Snorlock: @farnz is working towards this, but it's not our highest priority.

OK, sounds good. We decided to use a wrapper for watchman that copies our files in when we run npm run start, by modifying the start script. So watchman for the commons folder and watchman for the native project run at the same time.

Symlinking is working for me. All you need to do is create a package.json in your shared directory with the same name that you want to use as your alias.
E.g.:

+-- apps
|   +-- _core
|       +-- bar.js
|       +-- package.json // {"name": "@core"}
|   +-- app1
|       +-- index.js
|       +-- package.json
|   +-- app2
|       +-- index.js
|       +-- package.json
+-- node_modules
+-- package.json

Run this in your app1 and app2 directory:

ln -s ../../_core ./node_modules/@core

Make sure you restart your packager, etc., and clear caches.

From this point on you can use it in your code, for example in your app1/index.js:
import bar from '@core/bar'

zubko commented

I've used the way by @gyurobenjamin, thanks, it works for now,
in my case I've also added a post install script to package.json, because node_modules isn't checked in to git repo.

"scripts": {
    ....
    "postinstall": "... the same ln command ..."
}

Might be obvious, but hard-linking works great if your symlink isn't pointing at a directory.
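
A small Python sketch of why a hard link sidesteps the problem for single files; the file names are invented, and it assumes both paths live on the same filesystem, since hard links cannot span filesystems.

```python
import os
import tempfile

base = tempfile.mkdtemp()
shared = os.path.join(base, "shared.js")
with open(shared, "w") as f:
    f.write("v1")

project = os.path.join(base, "project")
os.makedirs(project)
linked = os.path.join(project, "shared.js")
os.link(shared, linked)  # a second name for the same file, not a pointer to a path

# Appending through one name is visible through the other.
with open(shared, "a") as f:
    f.write("v2")

print(os.path.samefile(shared, linked), os.path.islink(linked))
with open(linked) as f:
    print(f.read())
```

One thing to watch for: tools that save by writing a new file and renaming it over the old one leave the two names pointing at different files, so the link silently stops tracking.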

@farnz are you still looking into this?

wez commented

No one is actively working on this issue at the moment

Hello, I am working on a lerna project which has a web app and a react native module which also supports web.

I now want to install and work in development from within a react native project app. I have been trying all the tools online and every configuration, but I am not able to set up a react native development environment because of this.

What's the solution in 2021?

life is what happens while developers expect to have symlink support on watchman...

Thanks for the tip :) I am now trying with wl, but despite the folder being watched, it does not copy the change. Any hints, perhaps?

I expect Watchman to support symlinks because this is a widely used feature of npm, yarn and all the other node package managers out there, so it is kind of expected and needed.

I am extracting a web app module so I can turn it compatible with react native, how should I configure a proper development environment within my native app if I can't use linking between the two?

Meanwhile, literally every single major and minor bundler for the Node ecosystem supports following symlinks. That this has been an open request for over 6 years, that the metro team has fallen this far behind ecosystem norms, and that the team has remained stubborn on this issue is absolutely mind blowing. Maintainers: you were wrong. Let's own the mistake and do right by the community.

Not sure if this workaround has been mentioned, but you can create a metro config file that explicitly adds a symlinked project to the project roots. I did it like this. In the code there's a comment with a link to the repo I stole it from.

7 years later, symlinked packages have long since been the standard for all JS package managers. This issue makes developing a react-native app in a monorepo unnecessarily difficult and error prone. If this issue is never going to be fixed can we please close this issue with an explanation of the rationale and possible workarounds?

Has anyone using yarn workspaces had success with options like nohoist or install --focus? Watchman lacking this feature is killing monorepo setups... 😞

gajus commented

The lack of symlink support was an unpleasant surprise while developing https://github.com/gajus/turbowatch/

> The lack of symlink support was an unpleasant surprise while developing https://github.com/gajus/turbowatch/

Funny to see you here! The lack of symlink support was why we couldn't consider turbowatch.

gajus commented

> Funny to see you here! The lack of symlink support was why we couldn't consider turbowatch.

Huge 🤦‍♂️ We only tested Turbowatch in the context of re-building packages and apps when changes in the workspace are detected, but failed to check whether it works when the goal is to detect when symlinked dependencies change. The lack of symlink support makes Watchman unusable for our use case (monorepo with linked dependencies).

For what it is worth, I am rewriting Turbowatch to allow choosing between using Watchman or chokidar as a backend. Expect an update in the next 12 hours.

gajus commented

https://github.com/gajus/turbowatch/releases/tag/v2.0.0 Turbowatch made a switch to chokidar. However, I kept the API such that we could revert to using Watchman, or maybe even support multiple backends.

gajus commented

@chadaustin @xavierd @fanzeyi (tagging recently active contributors) Is there a chance of this issue receiving attention?

@gajus Sorry. Likely no.

The harsh reality is that there is not really a case for symlink support internally (I recently chatted with folks about this), and we probably won't be motivated to implement this.

gajus commented

> @gajus Sorry. Likely no. The harsh reality is that there is not really a case for symlink support internally (I recently chatted with folks about this), and we probably won't be motivated to implement this.

Thank you for the response. Would you consider accepting a PR?

> Would you consider accepting a PR?

Certainly. Personally, I'd want to make Watchman easier to work on in OSS and easier to contribute to.