guard/listen

Monitor events for remote mounted filesystems

avit opened this issue · 23 comments

avit commented

As a follow-up to #53, getting callbacks for events on remote host mounts is not supported by inotify on linux, or fsevents on Mac. In fact, none of the protocols (e.g. smbfs/cifs, vboxfs, nfs) even forward events from the server; they must be polled.

Working in a VM seems to be a popular workflow and it would be nice if there was a way to capture file changes from the host OS, because polling sucks. Just throwing this here as an idea before I (read: #lazyweb) decide to take a stab at it:

  1. Add an adapter that listens for filesystem events on a TCP port
  2. Provide a notifier (guard plugin?) that can be run on the remote host for passing the events to the adapter's port

Thoughts?

Yeah sounds interesting, what about adding all that directly in Listen?

  1. Launch listen on the vm that passing the events on a TCP port:
Listen.to('vm_dir', sent: { tcp: 3333 })
  1. Add an adapter that listen on the TCP port, that can be used directly by Guard via Listen.
Listen.to(3333, remote: :tcp)

Any other/better DSL proposition are welcome :)

@guard/core-team thoughts?

Interesting indeed. Just as @thibaudgg said, It's better IMHO if Listen handles the entire process for a couple of reasons, but mainly for the serialization of the events.

As for the API, I don't see the reason for specifying the protocol. We could simply let the adapter handle that internally.

Here is my proposal for the API:

Server:

Listen.to('dir', forward_to: 3333)

# Remote machines
Listen.to('dir', forward_to: '192.168.1.64:3333')

Client

The Listen.to method already works with paths, so I don't think we should add the functionality of listing to ports to it. Most libraries I've looked at use the on method for listing on ports, so maybe we could use it as well:

Listen.on(3333) { ... }

# Remote machines
Listen.on('192.168.1.64:3333')
avit commented

Thanks for considering the idea! Does this API also need to handle path mapping? So maybe the to method is still appropriate for naming the base directory:

# server: fsevent paths are truncated to the relative project root when forwarding
Listen.to('/Users/andrew/Sites/project', forward_to: '33.33.33.10:3333')

# client: must see forwarded path as if it resides in the listening directory
# i.e. prepend directory in front of received event path
Listen.to('/vagrant', on: 3333)

Multiple listeners on different directories can just use different ports, yes? That would be the simplest solution.

I don't see the need to handle path mapping, could you provide an example please?

@Maher4Ever great API!

avit commented

A path on the server doesn't correspond to the same path where it's mounted on the client. e.g. on my Mac I keep a rails project in /Users/andrew/Sites/xyz. When I use Vagrant to set up a virtual machine for this project, the path gets mounted in the VM as /vagrant.

The tcp forwarder needs to remove the "/Users/andrew/Sites/xyz" path prefix, and only forward the relative path part (e.g. "spec/models/user_spec.rb")

The tcp listener needs to know what to do with the relative path, which means prefixing it with the "/vagrant" path.

I still think that using the Listen.to method is not the best thing to do for a couple of reasons. Listen.to already handles a lot of things (single paths, multiple paths, object vs block API). Using it would add to the complexity and could potentially confuse users.

Moreover, consider the following example:

Listen.to('/vagrant')
# vs.
Listen.to('/vagrant', on: 3333)

Isn't it a bit confusing? If you haven't read the docs, you probably won't know what the :on option does or that these two method calls are totally different.

Now consider the following:

Listen.to('/vagrant')

# vs.

Listen.on(3333, :base_directory => '/vagrant')
# or ...
Listen.on(3333, :prefix => '/vagrant')

The method calls have different signatures which indicates they are used for different reasons.

Multiple listeners could indeed by implemented by using multple ports, but how do suggest the API should look like? If you are planning on serializing the events before forwarding them, you could use the same port!
Also, how do you guys suggest handling the prefix when working with multiple paths?

@thibaudgg Thanks :).

avit commented

I don't disagree with your API, @Maher4Ever, I was just pointing out that the base directory might need to be given on both sides.

To me, Listen.to('/vagrant', on: 3333) is understood like I want to listen to events in this local directory (/vagrant), but using a different adapter: tcp client instead of inotify (equivalent to force_polling: true). It seems like the same method semantics, but it might not work for multiple directories, block params etc like you said.

(I don't have a use case for multiple directories, but I guess this should handle it too because local listener does? That's why I was originally thinking different directories would be separate listeners and ports.)

How will your :base_directory or :prefix parameter handle multiple directories? What if we do:

Listen.on(3333, :mapping => { "/Users/andrew/Sites/xyz" => "/vagrant", "/Library/Webserver/Documents" => "/var/www/htdocs"})
# or
Listen.on(3333)
      .mapping("/Users/andrew/Sites/xyz", "/vagrant")
      .mapping("/Library/WebServer/Documents", "/var/www/htdocs")

This frees the server from needing to strip any absolute paths, but it does bind the client to the directory structure from the server...

Although I like your approach in handling prefixes with the :mapping option, I guess there is something easier to do: Only allow forwarding the events of a single directory!

It's of course feasible to forward multiple directories events using a single port, but then handling the mapping as you said would couple the client and the server which I really don't like. We currently don't allow getting relative paths in events when watching multiple directories for that same reason.

To put it in code:

# Server
Listen.to('/home/user/dir1', forward_to: 3333)
Listen.to('/home/user/dir2', forward_to: 4444)

# Client
Listen.on(3333) # With no base_directory, the absolute path would be used
Listen.on(4444, base_directory: '/vagrant')

@thibaudgg If this gets implemented, do you think it's a good idea to raise exceptions for unsupported options passed to the Listen.to method? Now when someone passes the :relative_path option while watching multiple directories it simply gets ignored. That can result in confusing situations for the :forward_to option if the user doesn't read the docs.

+1 for only allow forwarding events of a single directory, much easier.

@Maher4Ever yeah +1 for raising a exceptions, or a least a warning message!

avit commented

Looks good to me too, this will work. Single directory would be simplest.

dwt commented

I like all of the ideas.

However since even a server client solution doesn't cover all use cases, I'd like to propose to also implement an override to force listen to go back to polling.

Perhaps an environment Variable? (As you often can't instruct listen directly if it is used as part of another library that doesn't expose any configuration for it.)

@dwt yeah having an environment variable to force polling is a good idea I think. Can you already make a pull request with that?

@avit Still taking a stab at this? I'd love to help out on this as this is the workflow I use now.

avit commented

I haven't looked at this in a while; stopped using Guard for now, actually. This should probably be linked here, it could be useful: https://github.com/thibaudgg/rb-fsevent

@avit Are you still using vagrant in your workflow? The spring gem now uses this gem to monitor file changes so I've been trying to get that to work with vagrant.

avit commented

@axsuul yes, still using vagrant but no guard / spring / zeus or similar auto-reloading magic, yet.

Has anyone made any progress on implementing this yet? Are there alternative solutions to polling?

I've just started work on TCP-forwarding for Listen 1.x as I cannot get the Guardfile to work correctly for Listen 2.x (master). Since Listen 2.x seems to be heavily reliant on Celluloid the feature should be easily portable.

As far as an ETA, I'm hoping to finalize something this week.

Hi @timkurvers, please could you retry with the master branch of Listen, it now use listen2 branch of Guard and should work fine.
I would preferably adding this feature directly to 2.0. Thanks!

Have just pushed an initial implementation to a feature/tcp-forwarding branch.

Usage example one end:

listener = Listen.to 'lib', forward_to: 4000 do |modified, added, removed| ... end
listener.start

Usage example other end:

listener = Listen.on 4000 do |modified, added, removed| ... end
listener.start

Bugs/features/issues remaining:

  • Missing specs: I will need help in this department as I'm not sure how to approach speccing certain things.
  • Relative paths/multiple directories: at present the paths broadcast are absolute, which makes it useless for usage with VMs. No warnings/exceptions are raised yet for unsupported scenarios.
  • Celluloid::IO's TCPSocket seems to have issues connecting to local addresses, hence the current workaround.
  • The broadcasting listener seems to fire off an added-report for an entire directory the first time any changes are detected. This might be a Listen 2.0 thing, @thibaudgg?
  • Is Listen::TCP::Listener the right way to go? It keeps the TCP forwarding functionality neatly encapsulated but it does override/hook into the base-class.

This is very much a work in progress, but I hope we'll be able to figure out the nitty-gritty together! 👊

Very nice, please open a pull request so we can have a discussion there.

Missing specs: I will need help in this department as I'm not sure how to approach speccing certain things.

Will discuss that in PR, no problem

Relative paths/multiple directories: at present the paths broadcast are absolute, which makes it useless for usage with VMs. No warnings/exceptions are raised yet for unsupported scenarios.

Absolute paths are the way to go for me, maybe we can add an option to replace some part of the path when used in a VM, though?

Celluloid::IO's TCPSocket seems to have issues connecting to local addresses, hence the current workaround.

Ok, please report it to Celluloid crew. I'm sure they would like to know about that.

The broadcasting listener seems to fire off an added-report for an entire directory the first time any changes are detected. This might be a Listen 2.0 thing.

Normally they are reported with a silence: true flag and should not be appear in the callback. (They are made for building the files record).

Is Listen::TCP::Listener the right way to go? It keeps the TCP forwarding functionality neatly encapsulated but it does override/hook into the base-class.

Look nice like that, but we could discuss that in the PR as well.

Thanks, looking forward to merging that!

e2 commented

Just FYI, guys: #258

(While I've read all the comments above, any feedback, hindsight would be appreciated).