Recursive dir watching on Linux seems unreliable
andrewmoise opened this issue · 1 comments
The self tests don't pass on Linux (Debian bookworm, 12.5, with Go 1.19.8 on kernel 6.1.0-18). I started getting into this because it seems like create events for files in newly-created subdirectories in my application were sometimes getting dropped, but it was a little hard to come up with a minimal repro case for it, so I grabbed the source (main from 2024-03-03) and the self tests sometimes pass and sometimes not.
Here's debug output from a successful run:
2024/03/03 17:36:15.811494 [D] received: path="/home/moise/notify/testdata/1326860591/src/github.com/rjeczalik/fs/cmd/gotree/main.go", event=notify.Remove, sys=&{9 512 0 0} (i=3, j=0)
2024/03/03 17:36:15.972497 [D] dispatching notify.Create on "/tmp/TestRenameInRoot2861869223/001/bar"
2024/03/03 17:36:15.972670 [D] dispatching notify.Rename on "/tmp/TestRenameInRoot2861869223/001/foo"
2024/03/03 17:36:15.972698 [D] dispatching notify.Rename on "/tmp/TestRenameInRoot2861869223/001/foo"
2024/03/03 17:36:16.023304 [D] dispatching notify.Create on "/tmp/TestRenameInRoot2861869223/001/bar/file"
2024/03/03 17:36:16.023754 [D] Stop(0xc00107d9e0) error: <nil>
2024/03/03 17:36:16.024748 [D] dispatching notify.Create on "/tmp/TestRecreated357106433/001/folder"
2024/03/03 17:36:16.125288 [D] dispatching notify.Create on "/tmp/TestRecreated357106433/001/folder/file"
2024/03/03 17:36:16.125556 [D] dispatching notify.Remove on "/tmp/TestRecreated357106433/001/folder/file"
2024/03/03 17:36:16.125567 [D] dispatching notify.Remove on "/tmp/TestRecreated357106433/001/folder"
2024/03/03 17:36:16.125572 [D] dispatching notify.Remove on "/tmp/TestRecreated357106433/001/folder"
2024/03/03 17:36:16.125579 [D] dispatching notify.Write on "/tmp/TestRecreated357106433/001/folder/file"
2024/03/03 17:36:16.125602 [D] dispatching notify.Create on "/tmp/TestRecreated357106433/001/folder"
2024/03/03 17:36:16.226253 [D] dispatching notify.Create on "/tmp/TestRecreated357106433/001/folder/file"
2024/03/03 17:36:16.226561 [D] dispatching notify.Remove on "/tmp/TestRecreated357106433/001/folder/file"
2024/03/03 17:36:16.226574 [D] dispatching notify.Remove on "/tmp/TestRecreated357106433/001/folder"
2024/03/03 17:36:16.226611 [D] dispatching notify.Remove on "/tmp/TestRecreated357106433/001/folder"
2024/03/03 17:36:16.226622 [D] dispatching notify.Write on "/tmp/TestRecreated357106433/001/folder/file"
2024/03/03 17:36:16.226654 [D] dispatching notify.Create on "/tmp/TestRecreated357106433/001/folder"
2024/03/03 17:36:16.327541 [D] dispatching notify.Create on "/tmp/TestRecreated357106433/001/folder/file"
2024/03/03 17:36:16.327564 [D] dispatching notify.Write on "/tmp/TestRecreated357106433/001/folder/file"
2024/03/03 17:36:16.327664 [D] Stop(0xc000fc4ea0) error: <nil>
... and then, some from runs where the self tests failed:
2024/03/03 17:36:53.882575 [D] dispatching notify.Rename on "/tmp/TestRenameInRoot1459564899/001/foo"
2024/03/03 17:36:53.882699 [D] dispatching notify.Create on "/tmp/TestRenameInRoot1459564899/001/bar"
2024/03/03 17:36:53.882726 [D] dispatching notify.Rename on "/tmp/TestRenameInRoot1459564899/001/foo"
2024/03/03 17:36:53.933185 [D] dispatching notify.Create on "/tmp/TestRenameInRoot1459564899/001/bar/file"
2024/03/03 17:36:53.933269 [D] Stop(0xc000905020) error: <nil>
2024/03/03 17:36:53.934419 [D] dispatching notify.Create on "/tmp/TestRecreated139528610/001/folder"
2024/03/03 17:36:54.035003 [D] dispatching notify.Create on "/tmp/TestRecreated139528610/001/folder/file"
2024/03/03 17:36:54.035314 [D] dispatching notify.Remove on "/tmp/TestRecreated139528610/001/folder/file"
2024/03/03 17:36:54.035324 [D] dispatching notify.Remove on "/tmp/TestRecreated139528610/001/folder"
2024/03/03 17:36:54.035333 [D] dispatching notify.Write on "/tmp/TestRecreated139528610/001/folder/file"
2024/03/03 17:36:54.035348 [D] dispatching notify.Remove on "/tmp/TestRecreated139528610/001/folder"
2024/03/03 17:36:54.035375 [D] dispatching notify.Create on "/tmp/TestRecreated139528610/001/folder"
2024/03/03 17:36:54.136571 [D] dispatching notify.Create on "/tmp/TestRecreated139528610/001/folder/file"
2024/03/03 17:36:54.136596 [D] dispatching notify.Write on "/tmp/TestRecreated139528610/001/folder/file"
2024/03/03 17:36:54.137062 [D] dispatching notify.Remove on "/tmp/TestRecreated139528610/001/folder"
2024/03/03 17:36:54.137140 [D] dispatching notify.Remove on "/tmp/TestRecreated139528610/001/folder"
2024/03/03 17:36:54.137166 [D] dispatching notify.Remove on "/tmp/TestRecreated139528610/001/folder/file"
2024/03/03 17:36:54.137219 [D] dispatching notify.Create on "/tmp/TestRecreated139528610/001/folder"
2024/03/03 17:36:58.935406 [D] Stop(0xc0008f77a0) error: <nil>
--- FAIL: TestRecreated (5.00s)
notify_test.go:184: ######## First ########
notify_test.go:171: /tmp/TestRecreated139528610/001/folder notify.Create
notify_test.go:171: /tmp/TestRecreated139528610/001/folder/file notify.Create
notify_test.go:191: ######## Second ########
notify_test.go:171: /tmp/TestRecreated139528610/001/folder/file notify.Write
notify_test.go:171: /tmp/TestRecreated139528610/001/folder/file notify.Remove
notify_test.go:171: /tmp/TestRecreated139528610/001/folder notify.Remove
notify_test.go:171: /tmp/TestRecreated139528610/001/folder notify.Remove
notify_test.go:171: /tmp/TestRecreated139528610/001/folder notify.Create
notify_test.go:171: /tmp/TestRecreated139528610/001/folder/file notify.Write
notify_test.go:171: /tmp/TestRecreated139528610/001/folder/file notify.Create
notify_test.go:199: ######## Third ########
notify_test.go:171: /tmp/TestRecreated139528610/001/folder notify.Create
notify_test.go:171: /tmp/TestRecreated139528610/001/folder/file notify.Remove
notify_test.go:171: /tmp/TestRecreated139528610/001/folder notify.Remove
notify_test.go:171: /tmp/TestRecreated139528610/001/folder notify.Remove
notify_test.go:176: timed out before receiving event
2024/03/03 17:36:59.135079 [D] ExpectRecordedCalls: i=0
or
2024/03/03 17:26:10.598743 [D] dispatching notify.Rename on "/tmp/TestRenameInRoot227291980/001/foo"
2024/03/03 17:26:10.598840 [D] dispatching notify.Create on "/tmp/TestRenameInRoot227291980/001/bar"
2024/03/03 17:26:10.598848 [D] dispatching notify.Rename on "/tmp/TestRenameInRoot227291980/001/foo"
2024/03/03 17:26:10.649596 [D] dispatching notify.Create on "/tmp/TestRenameInRoot227291980/001/bar/file"
2024/03/03 17:26:10.649712 [D] Stop(0xc0001c60c0) error: <nil>
2024/03/03 17:26:10.650461 [D] dispatching notify.Create on "/tmp/TestRecreated81199395/001/folder"
2024/03/03 17:26:10.750864 [D] dispatching notify.Create on "/tmp/TestRecreated81199395/001/folder/file"
2024/03/03 17:26:10.750980 [D] dispatching notify.Write on "/tmp/TestRecreated81199395/001/folder/file"
2024/03/03 17:26:10.751393 [D] dispatching notify.Remove on "/tmp/TestRecreated81199395/001/folder/file"
2024/03/03 17:26:10.751417 [D] dispatching notify.Create on "/tmp/TestRecreated81199395/001/folder"
2024/03/03 17:26:10.751423 [D] dispatching notify.Remove on "/tmp/TestRecreated81199395/001/folder"
2024/03/03 17:26:10.751434 [D] dispatching notify.Remove on "/tmp/TestRecreated81199395/001/folder"
2024/03/03 17:26:10.751484 [D] dispatch did not reach leaf: Node /tmp/TestRecreated81199395/001/folder: file does not exist
2024/03/03 17:26:10.852158 [D] dispatching notify.Create on "/tmp/TestRecreated81199395/001/folder/file"
2024/03/03 17:26:10.852409 [D] dispatching notify.Write on "/tmp/TestRecreated81199395/001/folder/file"
2024/03/03 17:26:10.852420 [D] dispatching notify.Remove on "/tmp/TestRecreated81199395/001/folder/file"
2024/03/03 17:26:10.852425 [D] dispatching notify.Remove on "/tmp/TestRecreated81199395/001/folder"
2024/03/03 17:26:10.852430 [D] dispatching notify.Remove on "/tmp/TestRecreated81199395/001/folder"
2024/03/03 17:26:10.852436 [D] dispatching notify.Create on "/tmp/TestRecreated81199395/001/folder"
2024/03/03 17:26:15.651386 [D] Stop(0xc0001c63c0) error: <nil>
--- FAIL: TestRecreated (5.00s)
notify_test.go:184: ######## First ########
notify_test.go:171: /tmp/TestRecreated81199395/001/folder notify.Create
notify_test.go:171: /tmp/TestRecreated81199395/001/folder/file notify.Write
notify_test.go:171: /tmp/TestRecreated81199395/001/folder/file notify.Create
notify_test.go:191: ######## Second ########
notify_test.go:171: /tmp/TestRecreated81199395/001/folder notify.Remove
notify_test.go:171: /tmp/TestRecreated81199395/001/folder/file notify.Remove
notify_test.go:171: /tmp/TestRecreated81199395/001/folder notify.Create
notify_test.go:171: /tmp/TestRecreated81199395/001/folder notify.Remove
notify_test.go:171: /tmp/TestRecreated81199395/001/folder/file notify.Create
notify_test.go:199: ######## Third ########
notify_test.go:171: /tmp/TestRecreated81199395/001/folder notify.Create
notify_test.go:171: /tmp/TestRecreated81199395/001/folder/file notify.Write
notify_test.go:171: /tmp/TestRecreated81199395/001/folder/file notify.Remove
notify_test.go:171: /tmp/TestRecreated81199395/001/folder notify.Remove
notify_test.go:171: /tmp/TestRecreated81199395/001/folder notify.Remove
notify_test.go:176: timed out before receiving event
... but like I say it's not consistent. It seems to always fail within that same TestRecreated test, but not always at the same place, and sometimes it passes fine.
Let me know any other info that's useful to provide or anything else I can do.
Seeing the same, with Syncthing.
Can reproduce with i.e:
- Create folder FOLDER1
- Rename folder FOLDER1 to FOLDER2
- Create folder FOLDER1 > no events received from this folder or anything nested within.
Interestingly fsnotify does the same thing, with it's new recursive watcher. I'm running inotifywait in the background and it picks up all events fine.