neo4j/neo4j-go-driver

Neo4j client waits forever for lock leading to goroutines getting blocked

DaChartreux opened this issue · 2 comments

Bug

In my production environment, I am using querying neo4J for some data in a Golang service. This service is deployed on multiple instances but on one instance, we found that the server stopped responding and the last log was just before executing the neo4j query. We added a few logs and added profiling, and this issue happened again, this time in another machine. When we checked the profile, we saw that the number of goroutines kept on increasing, and the graph shows this...

Neo4j lock issue (1)

Upon taking the entire dump of the goroutine's stack, it is evident that getOrUpdateTable method is waiting for the lock.

goroutine 4374599 [sync.Mutex.Lock, 157 minutes]:
sync.runtime_SemacquireMutex(0xc0068a48b8?, 0x18?, 0xc0068a4948?)
	/usr/local/go/src/runtime/sema.go:77 +0x25
sync.(*Mutex).lockSlow(0xc00027c528)
	/usr/local/go/src/sync/mutex.go:171 +0x15d
sync.(*Mutex).Lock(...)
	/usr/local/go/src/sync/mutex.go:90
github.com/neo4j/neo4j-go-driver/v5/neo4j/internal/router.(*Router).getOrUpdateTable(0xc00027c500, {0x1ca5630, 0xc008772dc0}, 0x189fe40?, {0x1996373, 0x5}, 0x7fc73be4e1d8?, {0x0, 0x0})
	/home/winlist/go/pkg/mod/github.com/neo4j/neo4j-go-driver/v5@v5.13.0/neo4j/internal/router/router.go:154 +0xb5
github.com/neo4j/neo4j-go-driver/v5/neo4j/internal/router.(*Router).GetOrUpdateWriters(0xc00027c500, {0x1ca5630, 0xc008772dc0}, 0x40df65?, {0x1996373, 0x5}, 0x1?, {0x0, 0x0})
	/home/winlist/go/pkg/mod/github.com/neo4j/neo4j-go-driver/v5@v5.13.0/neo4j/internal/router/router.go:256 +0x5e
github.com/neo4j/neo4j-go-driver/v5/neo4j.(*sessionWithContext).getOrUpdateServers(0xc006fa5e00, {0x1ca5630, 0xc008772dc0}, 0x5?)
	/home/winlist/go/pkg/mod/github.com/neo4j/neo4j-go-driver/v5@v5.13.0/neo4j/session_with_context.go:512 +0x125
github.com/neo4j/neo4j-go-driver/v5/neo4j.(*sessionWithContext).getConnection(0xc006fa5e00, {0x1ca5668?, 0xc009cb1030?}, 0x0, 0x40ff49?)
	/home/winlist/go/pkg/mod/github.com/neo4j/neo4j-go-driver/v5@v5.13.0/neo4j/session_with_context.go:543 +0x387
github.com/neo4j/neo4j-go-driver/v5/neo4j.(*sessionWithContext).executeTransactionFunction(0xc006fa5e00, {0x1ca5668, 0xc009cb1030}, 0x0, {0x5?, 0x0?}, 0xc006f455c0, 0xc0078084c0, 0xf8?, 0x3)
	/home/winlist/go/pkg/mod/github.com/neo4j/neo4j-go-driver/v5@v5.13.0/neo4j/session_with_context.go:443 +0x9f
github.com/neo4j/neo4j-go-driver/v5/neo4j.(*sessionWithContext).runRetriable(0xc006fa5e00, {0x1ca5668, 0xc009cb1030}, 0x1912760?, 0x27bc0a0?, 0x70?, 0x7fc73be4e1d8?, {0x0, 0x0, 0x0})
	/home/winlist/go/pkg/mod/github.com/neo4j/neo4j-go-driver/v5@v5.13.0/neo4j/session_with_context.go:424 +0x2c9
github.com/neo4j/neo4j-go-driver/v5/neo4j.(*sessionWithContext).executeQueryWrite(0xc0078084c0?, {0x1ca5668?, 0xc009cb1030?}, 0x410805?, {0x0?, 0x1912760?, 0xc0068a4f01?})
	/home/winlist/go/pkg/mod/github.com/neo4j/neo4j-go-driver/v5@v5.13.0/neo4j/session_with_context.go:383 +0x45
github.com/neo4j/neo4j-go-driver/v5/neo4j.ExecuteQuery[...]({0x1ca5668, 0xc009cb1030?}, {0x1ca9a30, 0xc0000f61c0}, {0x1a01231, 0x1ba}, 0xc009cf6030, 0x1a1f9b8, {0xc0068a52a8, 0x1, ...})
	/home/winlist/go/pkg/mod/github.com/neo4j/neo4j-go-driver/v5@v5.13.0/neo4j/driver_with_context.go:548 +0x4fe
...

This is how I am connecting with my neo4j instance

        logger.Info("Connecting to Neo4J", "config", neo4jConfig)
	ctx := context.Background()
	driver, err := neo4j.NewDriverWithContext(
		neo4jConfig.Host,
		neo4j.BasicAuth(
			neo4jConfig.UserName,
			neo4jConfig.Password,
			"",
		),
		func(c *neo4jconfig.Config) {
			c = neo4jconfig.
		},
	)

and this is how I query

	result, err := neo4j.ExecuteQuery(ctx, *l.neo4jDriver,
		"neo4j query here",
		map[string]any{
			"jpins":          jpins,
			"customerGrpIds": customerGrpIds,
		}, neo4j.EagerResultTransformer,
		neo4j.ExecuteQueryWithDatabase("neo4j"),
		neo4j.ExecuteQueryWithReadersRouting(),
	)
	if err != nil {
		...
	}
	...
  • Neo4j version: 4.4-aura/enterprise
  • Neo4j Mode: Single instance with 2 followers
  • Driver version: GoLang driver v5.13.0
  • Operating system: Ubuntu 16.04.4 LTS x86_64 on AWS

Steps to reproduce
I don't know why or when this happens, or how to reproduce this.

One thing I noticed is goroutine #3565146. It seems that this is stuck in chan send for 162 minutes, and is causing the blocking of the other goroutines waiting for the lock.

goroutine 3565146 [chan send, 162 minutes]: 1 times: [3565146]
github.com/neo4j/neo4j-go-driver/v5/neo4j/internal/router.(*Router).getOrUpdateTable(0xc00027c500, {0x1ca5630, 0xc002418050}, 0x189fe40?, {0x1996373, 0x5}, 0x7fc73be4ca68?, {0x0, 0x0})
	/home/winlist/go/pkg/mod/github.com/neo4j/neo4j-go-driver/v5@v5.13.0/neo4j/internal/router/router.go:186 +0x608
github.com/neo4j/neo4j-go-driver/v5/neo4j/internal/router.(*Router).GetOrUpdateWriters(0xc00027c500, {0x1ca5630, 0xc002418050}, 0x40df65?, {0x1996373, 0x5}, 0x1?, {0x0, 0x0})
	/home/winlist/go/pkg/mod/github.com/neo4j/neo4j-go-driver/v5@v5.13.0/neo4j/internal/router/router.go:256 +0x5e
github.com/neo4j/neo4j-go-driver/v5/neo4j.(*sessionWithContext).getOrUpdateServers(0xc001ea5040, {0x1ca5630, 0xc002418050}, 0x5?)
	/home/winlist/go/pkg/mod/github.com/neo4j/neo4j-go-driver/v5@v5.13.0/neo4j/session_with_context.go:512 +0x125
github.com/neo4j/neo4j-go-driver/v5/neo4j.(*sessionWithContext).getConnection(0xc001ea5040, {0x1ca5668?, 0xc000454150?}, 0x0, 0x40ff49?)
	/home/winlist/go/pkg/mod/github.com/neo4j/neo4j-go-driver/v5@v5.13.0/neo4j/session_with_context.go:543 +0x387
github.com/neo4j/neo4j-go-driver/v5/neo4j.(*sessionWithContext).executeTransactionFunction(0xc001ea5040, {0x1ca5668, 0xc000454150}, 0x0, {0x5?, 0x0?}, 0xc00064e300, 0xc0026d1100, 0x8?, 0x3)
	/home/winlist/go/pkg/mod/github.com/neo4j/neo4j-go-driver/v5@v5.13.0/neo4j/session_with_context.go:443 +0x9f
github.com/neo4j/neo4j-go-driver/v5/neo4j.(*sessionWithContext).runRetriable(0xc001ea5040, {0x1ca5668, 0xc000454150}, 0x1912760?, 0x27bc0a0?, 0xa8?, 0x7fc73be4ca68?, {0x0, 0x0, 0x0})
	/home/winlist/go/pkg/mod/github.com/neo4j/neo4j-go-driver/v5@v5.13.0/neo4j/session_with_context.go:424 +0x2c9
github.com/neo4j/neo4j-go-driver/v5/neo4j.(*sessionWithContext).executeQueryWrite(0xc0026d1100?, {0x1ca5668?, 0xc000454150?}, 0x410805?, {0x0?, 0x1912760?, 0xc00200da01?})
	/home/winlist/go/pkg/mod/github.com/neo4j/neo4j-go-driver/v5@v5.13.0/neo4j/session_with_context.go:383 +0x45
github.com/neo4j/neo4j-go-driver/v5/neo4j.ExecuteQuery[...]({0x1ca5668, 0xc000454150?}, {0x1ca9a30, 0xc0000f61c0}, {0x1a01231, 0x1ba}, 0xc001f7b6b0, 0x1a1f9b8, {0xc00200ddb8, 0x1, ...})
	/home/winlist/go/pkg/mod/github.com/neo4j/neo4j-go-driver/v5@v5.13.0/neo4j/driver_with_context.go:548 +0x4fe
...
github.com/gin-gonic/gin.(*Context).Next(...)
	/home/winlist/go/pkg/mod/github.com/gin-gonic/gin@v1.9.1/context.go:174
gopkg.in/DataDog/dd-trace-go.v1/contrib/gin-gonic/gin.Middleware.func1(0xc000a09000)
	/home/winlist/go/pkg/mod/gopkg.in/!data!dog/dd-trace-go.v1@v1.55.0/contrib/gin-gonic/gin/gintrace.go:67 +0x5a3
github.com/gin-gonic/gin.(*Context).Next(...)
	/home/winlist/go/pkg/mod/github.com/gin-gonic/gin@v1.9.1/context.go:174
github.com/gin-gonic/gin.CustomRecoveryWithWriter.func1(0xc000a09000)
	/home/winlist/go/pkg/mod/github.com/gin-gonic/gin@v1.9.1/recovery.go:102 +0x7a
github.com/gin-gonic/gin.(*Context).Next(...)
	/home/winlist/go/pkg/mod/github.com/gin-gonic/gin@v1.9.1/context.go:174
github.com/gin-gonic/gin.LoggerWithConfig.func1(0xc000a09000)
	/home/winlist/go/pkg/mod/github.com/gin-gonic/gin@v1.9.1/logger.go:240 +0xde
github.com/gin-gonic/gin.(*Context).Next(...)
	/home/winlist/go/pkg/mod/github.com/gin-gonic/gin@v1.9.1/context.go:174
github.com/gin-gonic/gin.(*Engine).handleHTTPRequest(0xc000103520, 0xc000a09000)
	/home/winlist/go/pkg/mod/github.com/gin-gonic/gin@v1.9.1/gin.go:620 +0x65b
github.com/gin-gonic/gin.(*Engine).ServeHTTP(0xc000103520, {0x1ca3ef8?, 0xc001b7e0e0}, 0xc0002b4200)
	/home/winlist/go/pkg/mod/github.com/gin-gonic/gin@v1.9.1/gin.go:576 +0x1dd
net/http.serverHandler.ServeHTTP({0x1ca1588?}, {0x1ca3ef8?, 0xc001b7e0e0?}, 0x6?)
	/usr/local/go/src/net/http/server.go:2938 +0x8e
net/http.(*conn).serve(0xc002301290, {0x1ca55f8, 0xc000e050e0})
	/usr/local/go/src/net/http/server.go:2009 +0x5f4
created by net/http.(*Server).Serve in goroutine 131
	/usr/local/go/src/net/http/server.go:3086 +0x5cb

I have opened a PR for this, #539