facebook/watchman

thread 'tokio-runtime-worker' panicked at 'Watchman subscription error: Watchman error: Lost connection to watchman'

andrewhamon opened this issue · 1 comments

I am using watchman packaged with Nix:

version = "2023.01.30.00";

I am using watchman indirectly via the relay compiler in watch mode, i.e. I am running relay-compiler --watch relay.config.json

Everything works as expected for a short while, but eventually I run into this:

watchman client task failed: EOF on Watchman socket
thread 'tokio-runtime-worker' panicked at 'Watchman subscription error: Watchman error: Lost connection to watchman', /Users/runner/work/relay/relay/compiler/crates/relay-compiler/src/compiler.rs:146:33
stack backtrace:
   0:        0x104c46398 - <std::sys_common::backtrace::_print::DisplayBacktrace as core::fmt::Display>::fmt::h1543c132bc4e188c
   1:        0x104c63b74 - core::fmt::write::hda8e8eb84b49cbfc
   2:        0x104c3fe74 - std::io::Write::write_fmt::hb84c8996aec7120c
   3:        0x104c47ba4 - std::panicking::default_hook::{{closure}}::hdf06011cb093de6a
   4:        0x104c47908 - std::panicking::default_hook::hd7ceb942fff7b170
   5:        0x104c4803c - std::panicking::rust_panic_with_hook::h053d4067a63a6fcb
   6:        0x104c47f70 - std::panicking::begin_panic_handler::{{closure}}::hea9e6c546a23e8ff
   7:        0x104c46874 - std::sys_common::backtrace::__rust_end_short_backtrace::hd64e012cf32134c6
   8:        0x104c47cc8 - _rust_begin_unwind
   9:        0x104cce754 - core::panicking::panic_fmt::hbfde5533e1c0592e
  10:        0x1045d72a0 - <core::future::from_generator::GenFuture<T> as core::future::future::Future>::poll::h9f72396ad5d30de0
  11:        0x10455ac04 - tokio::runtime::task::core::Core<T,S>::poll::h76b173c3d8ecf5a6
  12:        0x10455cc80 - tokio::runtime::task::harness::Harness<T,S>::poll::h1059e4f55713c7ab
  13:        0x1049094a4 - tokio::runtime::scheduler::multi_thread::worker::Context::run_task::h94c241f67b235c34
  14:        0x104908698 - tokio::runtime::scheduler::multi_thread::worker::Context::run::h0beb3b2adfb4e39a
  15:        0x1049022f8 - tokio::macros::scoped_tls::ScopedKey<T>::set::h9b0d2cc51839cc6a
  16:        0x104908198 - tokio::runtime::scheduler::multi_thread::worker::run::h02fc32c96ea54b63
  17:        0x10490e0c4 - tokio::loom::std::unsafe_cell::UnsafeCell<T>::with_mut::h24100a2c96784975
  18:        0x1048fe57c - tokio::runtime::task::core::Core<T,S>::poll::h75273ba800309cf8
  19:        0x1048ffcc4 - tokio::runtime::task::harness::Harness<T,S>::poll::hab95a62476febe6d
  20:        0x1048f7634 - tokio::runtime::blocking::pool::Inner::run::h13f21ebf96a92aea
  21:        0x1048f1d20 - std::sys_common::backtrace::__rust_begin_short_backtrace::h11f35592722ed5c6
  22:        0x1048f29fc - core::ops::function::FnOnce::call_once{{vtable.shim}}::had28d6a12507d147
  23:        0x104c4c214 - std::sys::unix::thread::Thread::new::thread_start::h403ab16d5f453cd4
  24:        0x1a3c6e06c - __pthread_deallocate

When this happens, I see this output in the logs:

Terminating due to signal 15 Terminated generated by pid=1 uid=0  (0)
0   watchman                            0x0000000102141034 _ZL13crash_handleriP9__siginfoPv + 708
1   libsystem_platform.dylib            0x00000001a3c9c2a4 _sigtramp + 56
2   libsystem_pthread.dylib             0x00000001a3c70394 _pthread_join + 444
3   libc++.1.0.dylib                    0x0000000102f70f04 _ZNSt3__16thread4joinEv + 32
4   watchman                            0x000000010215dafc _Z16w_start_listenerv + 1548
5   watchman                            0x0000000102166f04 _ZL11run_serviceON8watchman11ProcessLock6HandleE + 424
6   watchman                            0x000000010216383c _ZL25run_service_in_foregroundv + 220
7   watchman                            0x0000000102163030 _ZL10inner_mainiPPc + 6404
8   watchman                            0x0000000102161558 main + 40
9   dyld                                0x00000001a3943e50 start + 2544
2023-03-20T00:46:46,304: [listener] waiting for 1 clients to terminate
2023-03-20T00:46:46,335: [sanitychecks] done with sanityCheckThread
2023-03-20T00:46:50,956: [listener] 1 roots were still live at exit
2023-03-20T00:46:50,956: [listener] Exiting from service with res=true
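
The first line of that log names the signal and the process that sent it. Below is a minimal sketch for pulling those fields out of such a line. The helper name `parse_termination` and the regular expression are illustrative assumptions, not part of watchman. Here the signal is 15 (SIGTERM) and the sender is pid 1 running as uid 0. On macOS pid 1 is launchd, so the daemon was asked to terminate by launchd rather than crashing on its own. The log does not say why launchd did that.

```python
import re
import signal

# Assumed pattern, matched against the termination line pasted above.
TERMINATION_RE = re.compile(
    r"Terminating due to signal (?P<signo>\d+) .*?"
    r"generated by pid=(?P<pid>\d+) uid=(?P<uid>\d+)"
)


def parse_termination(line):
    """Return (signal_name, sender_pid, sender_uid), or None if the line does not match."""
    match = TERMINATION_RE.search(line)
    if match is None:
        return None
    signo = int(match.group("signo"))
    try:
        # Map the number to its symbolic name, e.g. 15 -> SIGTERM.
        name = signal.Signals(signo).name
    except ValueError:
        # Unknown signal number on this platform: fall back to the raw value.
        name = f"signal {signo}"
    return name, int(match.group("pid")), int(match.group("uid"))


line = "Terminating due to signal 15 Terminated generated by pid=1 uid=0  (0)"
print(parse_termination(line))
```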

I think I know the proximate cause: I also had the Homebrew version of watchman installed, and it was interfering.

Here is what I observed:

  • After every crash, I noticed that the watchman binary from Homebrew was running.
  • Even if I killed this process, it would always come back. I'm not sure what kept launching watchman.
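
To check whether more than one watchman install is visible to the shell, something like the following could help. It is a rough sketch: `find_executables` is a hypothetical helper, and it only walks `PATH`, so it would miss a binary that some other service launches by absolute path.

```python
import os


def find_executables(name, path=None):
    """Return (path_entry, resolved_path) for every executable `name` on the search path."""
    if path is None:
        path = os.environ.get("PATH", "")
    found = []
    seen = set()
    for directory in path.split(os.pathsep):
        if not directory:
            continue
        candidate = os.path.join(directory, name)
        if os.path.isfile(candidate) and os.access(candidate, os.X_OK):
            # Resolve symlinks so two PATH entries pointing at the same
            # binary are reported only once.
            real = os.path.realpath(candidate)
            if real not in seen:
                seen.add(real)
                found.append((candidate, real))
    return found


if __name__ == "__main__":
    for candidate, real in find_executables("watchman"):
        print(f"{candidate} -> {real}")
```

More than one line of output would mean that tools may be picking up different watchman binaries depending on how they resolve the command.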

Once I uninstalled the Homebrew version, I could no longer reproduce this issue.

Is there any logic in watchman that kills other watchman processes? These two installs were configured with different statedirs, so I would expect them to be able to coexist just fine.

In any case, I don't need them to coexist - watchman was the very last holdout that I needed Homebrew for.