hashicorp/raft

raft.GetConfiguration() - runtime error: invalid memory address or nil pointer dereference

arpanps opened this issue · 5 comments

raft.GetConfiguration() funtion throwing runtime error: invalid memory address or nil pointer dereference panic when calling at server startup.

2022/11/16 16:50:35 http: panic serving [127.0.0.1:PORT](https://127.0.0.1:PORT/): runtime error: invalid memory address or nil pointer dereference
goroutine 217 [running]:
net/http.(*conn).serve.func1(0xc000502280)
	/usr/lib/go/src/net/http/server.go:1800 +0x142
panic(0x7f8c067fd940, 0x7f8c06e295e0)
	/usr/lib/go/src/runtime/panic.go:975 +0x3f7
[github.com/hashicorp/raft.(*Raft).getLatestConfiguration(0x0](https://github.com/hashicorp/raft.(*Raft).getLatestConfiguration(0x0), 0x1, 0xc0000a8ba0, 0xc00054d4d0)
	<Path>/github.com/hashicorp/raft/raft.go:1825 +0x13
[github.com/hashicorp/raft.(*Raft).GetConfiguration(0x0](https://github.com/hashicorp/raft.(*Raft).GetConfiguration(0x0), 0x65646f6e2f676e69, 0x7a73)

Do we need a nil check for latestConfiguration here ?

raft/raft.go

Line 1989 in 6b4e320

switch c := r.latestConfiguration.Load().(type) {

Works fine after server boot up.

banks commented

@arpanps thanks for the issue!

It's not super clear at first glance what's going on there. atomic.Value is not a pointer itself and it can never contain a nil either so Load() should never thow that. See this example that shows even if we never stored a value a switch like this on it's own won't panic: https://play.golang.com/p/AUISuCoquJ-

I don't really understand this part of your trace: getLatestConfiguration(0x0) I forget what those offsets refer to especually as that method has no arguments but is it possible that the r (raft instance reciever is actually nil in your case?

If you could share the code that caused this it could help a lot! What is "startup" do you mean you called it before the raft instance was even constructed? If so then I think the nil check has to be in your code - most Go types don't support methods being called on a nil pointer to their type unless there's a specific reason why that is desirable.

we have below code in a http GET method, getting configuration like below and returning it

if request.Method == "GET" {
		var response *ClusterInfoz
		if s.raft != nil {
			configFuture := s.raft.GetConfiguration()
			if err := configFuture.Error(); err != nil {
				http.Error(writer, fmt.Sprintf("Error fetching configuration: %v", err), http.StatusServiceUnavailable)
			} else {
				response = &ClusterInfoz{configFuture.Configuration().Servers}
				s.sendResponse(writer, request, response)
			}
		} else {
			response = &ClusterInfoz{[]raft.Server{}}
			s.sendResponse(writer, request, response)
		}
	} else {
		http.Error(writer, fmt.Sprintf("Unsupported operation: %v", request.Method), http.StatusMethodNotAllowed)
	}

Getting penic in

		configFuture := s.raft.GetConfiguration()

here s is not nil and raft is also not nil

banks commented

Hey @arpanps, can you also check which version of Raft you're using? Your stacktrace indicates raft.go:1825 is the panic line but the same line in main as you linked was actually line 1989. If you're using an older version before we uses atomic.Value for this that might explain things and I'd suggest to try upgrading?

We are on "1.1.2 (January 17th, 2020)" which has latestConfiguration atomic.Value in api.go:144

	// Holds a copy of the latest configuration which can be read
	// independently from main loop.
	latestConfiguration atomic.Value

Hi @arpanps,

Sorry, I think there's been too much change in raft lib for us to debug a panic from such an old version. Could you upgrade and re-open this issue (or file a new one) if you continue to observe panics with a current version?