cashapp/hermit

Hermit seems to often hang/deadlock

staktrace opened this issue · 3 comments

I'm often running things in different shells that happen to directly or indirectly invoke hermit. And I'm finding that frequently (multiple times a day) one or more of those hermit invocations will just hang. This includes things like cd commands where I'm moving in/out of hermit-managed directories with the hermit shell-hooks active.

Filing this issue to track, will dump more information here as I obtain it.

I had a hang just now and ran kill -ABRT <pid> on the pid listed by ps:

$ ps
  PID TTY           TIME CMD
  760 ttys000    0:02.08 -bash
42799 ttys000    0:00.04 bash /Users/kartikaya/zspace/cash-handy-scripts/kochiku-build-time-data/main.sh /Users/kartikaya/switch/documents/square/2022-q2-buildtimes/baseline true
47619 ttys000    0:00.01 bash /Users/kartikaya/zspace/cash-handy-scripts/kochiku-build-time-data/project_data.sh
47636 ttys000    0:00.03 bash /Users/kartikaya/zspace/cash-handy-scripts/kochiku-build-time-data/project_data.sh
48298 ttys000    0:00.00 bash /Users/kartikaya/zspace/cash-handy-scripts/kochiku-build-time-data/project_data.sh
48300 ttys000    2:32.00 /Users/kartikaya/Library/Caches/hermit/pkg/hermit@square/hermit --level=fatal exec /Users/kartikaya/zspace/cash/bin/jq -- -r .build_part.build_attempts[] | select(.state == "passed") | [.build_part_id, .started_at, .finished_a
48301 ttys000    0:00.00 sed -e s#^#5420445,compile,# -e s#$#,true#
...
$ kill -ABRT 48300

Over in the shell running the command:

fatal error: notesleep not on g0
panic during panic

goroutine 0 [idle]:
runtime: unexpected return pc for runtime.sigtramp called from 0x7ff81e449dfd
stack: frame={sp:0xc000009a60, fp:0xc000009ab0} stack=[0xc000002000,0xc00000a000)
0x000000c000009960:  0x0000000000000000  0x000000c0000099c0
0x000000c000009970:  0x0000000000000000  0x0000000000000000
0x000000c000009980:  0x000000c000000000  0x0000000000000000
0x000000c000009990:  0x0000000000000000  0x0000000000000000
0x000000c0000099a0:  0x000000c000009ee0  0x000000c000009f48
0x000000c0000099b0:  0x000000c000009a28  0x0000000001049811 <runtime.sigtrampgo+0x00000000000001b1>
0x000000c0000099c0:  0x000000c000000006  0x0000000001b57540
0x000000c0000099d0:  0x000000c0000099e8  0x0000000001b57160
0x000000c0000099e0:  0x0000000000000000  0x0000000000000000
0x000000c0000099f0:  0x0000000000000000  0x0000000000000000
0x000000c000009a00:  0x0000000000000000  0x0000000000000000
0x000000c000009a10:  0x0000000001b57160  0x000000c000009ee0
0x000000c000009a20:  0x000000c000009f48  0x000000c000009a50
0x000000c000009a30:  0x0000000001067e6e <runtime.sigtrampgo+0x000000000000002e>  0x0000000000000006
0x000000c000009a40:  0x000000c000009ee0  0x000000c000009f48
0x000000c000009a50:  0x000000c000009aa0  0x000000000106735c <runtime.sigtramp+0x000000000000003c>
0x000000c000009a60: <0x0000000000000006  0x000000c000009ee0
0x000000c000009a70:  0x000000c000009f48  0x000000c000009f48
0x000000c000009a80:  0x44dfef763d50f81f  0x0000000000000000
0x000000c000009a90:  0x0000350100003600  0x0000000000003500
0x000000c000009aa0:  0x000000c000009ab0 !0x00007ff81e449dfd
0x000000c000009ab0: >0x00007ff7bfefe260  0x0000000000000000
0x000000c000009ac0:  0x0000000000000000  0x0000000000000000
0x000000c000009ad0:  0x0000000000000000  0x0200013100000085
0x000000c000009ae0:  0x000000000317000f  0x0000000000000104
0x000000c000009af0:  0x00000000039d0600  0x00007ff7bfefe1b8
0x000000c000009b00:  0x0000000000003500  0x0000000001b578e0
0x000000c000009b10:  0x0000350100003600  0x00007ff7bfefe260
0x000000c000009b20:  0x00007ff7bfefe1b8  0x0000000000000000
0x000000c000009b30:  0x00000000000000a0  0x0000000000000000
0x000000c000009b40:  0x0000000000000246  0x0000000000000016
0x000000c000009b50:  0x0000000000000000  0x0000350100003600
0x000000c000009b60:  0x0000000000003500  0x00007ff81e3fa3ea
0x000000c000009b70:  0x0000000000000247  0x0000000000000007
0x000000c000009b80:  0x0000000000000000  0x0000000000000000
0x000000c000009b90:  0x0000000000000000  0x000000000000037f
0x000000c000009ba0:  0x0000000000000000  0x0000000000000000
runtime.throw({0x1634ab7, 0x0})
	/home/runner/.cache/hermit/pkg/go-1.17.7/src/runtime/panic.go:1198 +0x71
runtime.notesleep(0x0)
	/home/runner/.cache/hermit/pkg/go-1.17.7/src/runtime/lock_sema.go:169 +0x11a
runtime.(*rwmutex).rlock.func1()
	/home/runner/.cache/hermit/pkg/go-1.17.7/src/runtime/rwmutex.go:54 +0x78
runtime.(*rwmutex).rlock(0x0)
	/home/runner/.cache/hermit/pkg/go-1.17.7/src/runtime/rwmutex.go:41 +0x59
runtime.preemptM(0xc000601200)
	/home/runner/.cache/hermit/pkg/go-1.17.7/src/runtime/signal_unix.go:357 +0x29
runtime.preemptone(0x0)
	/home/runner/.cache/hermit/pkg/go-1.17.7/src/runtime/proc.go:5596 +0x5e
runtime.preemptall()
	/home/runner/.cache/hermit/pkg/go-1.17.7/src/runtime/proc.go:5558 +0x55
runtime.freezetheworld()
	/home/runner/.cache/hermit/pkg/go-1.17.7/src/runtime/proc.go:874 +0x73
runtime.startpanic_m()
	/home/runner/.cache/hermit/pkg/go-1.17.7/src/runtime/panic.go:1346 +0x16a
runtime.sighandler(0x6, 0x1b57540, 0xc0000099e8, 0x1b57160)
	/home/runner/.cache/hermit/pkg/go-1.17.7/src/runtime/signal_unix.go:620 +0x309
runtime.sigtrampgo(0x6, 0xc000009ee0, 0xc000009f48)
	/home/runner/.cache/hermit/pkg/go-1.17.7/src/runtime/signal_unix.go:460 +0x1b1
runtime.sigtrampgo(0x6, 0xc000009ee0, 0xc000009f48)
	<autogenerated>:1 +0x2e
runtime: unexpected return pc for runtime.sigtramp called from 0x7ff81e449dfd
stack: frame={sp:0xc000009a60, fp:0xc000009ab0} stack=[0xc000002000,0xc00000a000)
0x000000c000009960:  0x0000000000000000  0x000000c0000099c0
0x000000c000009970:  0x0000000000000000  0x0000000000000000
0x000000c000009980:  0x000000c000000000  0x0000000000000000
0x000000c000009990:  0x0000000000000000  0x0000000000000000
0x000000c0000099a0:  0x000000c000009ee0  0x000000c000009f48
0x000000c0000099b0:  0x000000c000009a28  0x0000000001049811 <runtime.sigtrampgo+0x00000000000001b1>
0x000000c0000099c0:  0x000000c000000006  0x0000000001b57540
0x000000c0000099d0:  0x000000c0000099e8  0x0000000001b57160
0x000000c0000099e0:  0x0000000000000000  0x0000000000000000
0x000000c0000099f0:  0x0000000000000000  0x0000000000000000
0x000000c000009a00:  0x0000000000000000  0x0000000000000000
0x000000c000009a10:  0x0000000001b57160  0x000000c000009ee0
0x000000c000009a20:  0x000000c000009f48  0x000000c000009a50
0x000000c000009a30:  0x0000000001067e6e <runtime.sigtrampgo+0x000000000000002e>  0x0000000000000006
0x000000c000009a40:  0x000000c000009ee0  0x000000c000009f48
0x000000c000009a50:  0x000000c000009aa0  0x000000000106735c <runtime.sigtramp+0x000000000000003c>
0x000000c000009a60: <0x0000000000000006  0x000000c000009ee0
0x000000c000009a70:  0x000000c000009f48  0x000000c000009f48
0x000000c000009a80:  0x44dfef763d50f81f  0x0000000000000000
0x000000c000009a90:  0x0000350100003600  0x0000000000003500
0x000000c000009aa0:  0x000000c000009ab0 !0x00007ff81e449dfd
0x000000c000009ab0: >0x00007ff7bfefe260  0x0000000000000000
0x000000c000009ac0:  0x0000000000000000  0x0000000000000000
0x000000c000009ad0:  0x0000000000000000  0x0200013100000085
0x000000c000009ae0:  0x000000000317000f  0x0000000000000104
0x000000c000009af0:  0x00000000039d0600  0x00007ff7bfefe1b8
0x000000c000009b00:  0x0000000000003500  0x0000000001b578e0
0x000000c000009b10:  0x0000350100003600  0x00007ff7bfefe260
0x000000c000009b20:  0x00007ff7bfefe1b8  0x0000000000000000
0x000000c000009b30:  0x00000000000000a0  0x0000000000000000
0x000000c000009b40:  0x0000000000000246  0x0000000000000016
0x000000c000009b50:  0x0000000000000000  0x0000350100003600
0x000000c000009b60:  0x0000000000003500  0x00007ff81e3fa3ea
0x000000c000009b70:  0x0000000000000247  0x0000000000000007
0x000000c000009b80:  0x0000000000000000  0x0000000000000000
0x000000c000009b90:  0x0000000000000000  0x000000000000037f
0x000000c000009ba0:  0x0000000000000000  0x0000000000000000
runtime.sigtramp()
	/home/runner/.cache/hermit/pkg/go-1.17.7/src/runtime/sys_darwin_amd64.s:227 +0x3c

goroutine 1 [running]:
	goroutine running on other thread; stack unavailable

goroutine 5 [syscall, 5 minutes]:
os/signal.signal_recv()
	/home/runner/.cache/hermit/pkg/go-1.17.7/src/runtime/sigqueue.go:166 +0x28
os/signal.loop()
	/home/runner/.cache/hermit/pkg/go-1.17.7/src/os/signal/signal_unix.go:24 +0x19
created by os/signal.Notify.func1.1
	/home/runner/.cache/hermit/pkg/go-1.17.7/src/os/signal/signal.go:151 +0x2c

goroutine 35 [chan receive, 5 minutes]:
github.com/cashapp/hermit/ui.New.func2()
	/home/runner/go/pkg/mod/github.com/cashapp/hermit@v0.18.3/ui/ui.go:90 +0x4d
created by github.com/cashapp/hermit/ui.New
	/home/runner/go/pkg/mod/github.com/cashapp/hermit@v0.18.3/ui/ui.go:89 +0x23d

I don't have a good enough understanding of hermit/go yet to analyze this.

From slack discussion this might be a result of golang/go#52226

Duplicate of #215