tinkerbell/tink

[Tink Server] panic: runtime error: index out of range

jacobweinstock opened this issue · 1 comments

If the ReportActionStatus RPC is called with the following constraints the Tink server will panic with panic: runtime error: index out of range.

  • The workflow has reached its final action (status of that action does not matter).
  • ReportActionStatus, for the same workflowID and workerID, is called with an ActionStatus of pb.State_STATE_RUNNING
    &pb.WorkflowActionStatus{
      	WorkflowId:   workflowID,
      	TaskName:     "doesnt matter",
      	ActionName:   "doesnt matter either",
      	ActionStatus: pb.State_STATE_RUNNING, // This is the key
      	Seconds:      0,
      	Message:      "",
      	CreatedAt:    &timestamppb.Timestamp{},
      	WorkerId:     workerID,
      }

Tink server will panic with

panic: runtime error: index out of range [6] with length 6

goroutine 215 [running]:
github.com/tinkerbell/tink/grpc-server.(*server).ReportActionStatus(0xc0000b2000, 0xeddb80, 0xc0002bb710, 0xc0004226c0, 0xc0000b2000, 0x0, 0x0)
	/home/runner/work/tink/tink/grpc-server/tinkerbell.go:114 +0x100f
github.com/tinkerbell/tink/protos/workflow._WorkflowService_ReportActionStatus_Handler.func1(0xeddb80, 0xc0002bb710, 0xd26d40, 0xc0004226c0, 0xd7e6af, 0x8, 0x0, 0x0)
	/home/runner/work/tink/tink/protos/workflow/workflow.pb.go:2299 +0x86
go.opentelemetry.io/contrib/instrumentation/google.golang.org/grpc/otelgrpc.UnaryServerInterceptor.func1(0xeddb80, 0xc0002bb710, 0xd26d40, 0xc0004226c0, 0xc0004b5940, 0xc0004b5960, 0x0, 0x0, 0x0, 0x0)
	/home/runner/go/pkg/mod/go.opentelemetry.io/contrib/instrumentation/google.golang.org/grpc/otelgrpc@v0.22.0/interceptor.go:327 +0x649
github.com/grpc-ecosystem/go-grpc-middleware.ChainUnaryServer.func1.1.1(0xeddb80, 0xc0002bb5c0, 0xd26d40, 0xc0004226c0, 0x4e, 0xc00041ed70, 0x203000, 0xc0003b9500)
	/home/runner/go/pkg/mod/github.com/grpc-ecosystem/go-grpc-middleware@v1.2.2/chain.go:25 +0x63
github.com/grpc-ecosystem/go-grpc-prometheus.(*ServerMetrics).UnaryServerInterceptor.func1(0xeddb80, 0xc0002bb5c0, 0xd26d40, 0xc0004226c0, 0xc0004b5940, 0xc0004b5980, 0x98136a, 0xce9ea0, 0xc0004b59a0, 0xc0004b5940)
	/home/runner/go/pkg/mod/github.com/grpc-ecosystem/go-grpc-prometheus@v1.2.0/server_metrics.go:107 +0xad
github.com/grpc-ecosystem/go-grpc-middleware.ChainUnaryServer.func1.1.1(0xeddb80, 0xc0002bb5c0, 0xd26d40, 0xc0004226c0, 0xc0000338c0, 0x0, 0xc000621b30, 0x40cb48)
	/home/runner/go/pkg/mod/github.com/grpc-ecosystem/go-grpc-middleware@v1.2.2/chain.go:25 +0x63
github.com/grpc-ecosystem/go-grpc-middleware.ChainUnaryServer.func1(0xeddb80, 0xc0002bb5c0, 0xd26d40, 0xc0004226c0, 0xc0004b5940, 0xc0004b5960, 0xc000621ba0, 0x48f618, 0xd07dc0, 0xc0002bb5c0)
	/home/runner/go/pkg/mod/github.com/grpc-ecosystem/go-grpc-middleware@v1.2.2/chain.go:34 +0xd5
github.com/tinkerbell/tink/protos/workflow._WorkflowService_ReportActionStatus_Handler(0xd5fa20, 0xc0000b2000, 0xeddb80, 0xc0002bb5c0, 0xc0003c9ec0, 0xc0000a62a0, 0xeddb80, 0xc0002bb5c0, 0xc000066690, 0x6b)
	/home/runner/work/tink/tink/protos/workflow/workflow.pb.go:2301 +0x14b
google.golang.org/grpc.(*Server).processUnaryRPC(0xc0000c41c0, 0xee83c0, 0xc000432600, 0xc0000338c0, 0xc0000a67e0, 0x14da3b0, 0x0, 0x0, 0x0)
	/home/runner/go/pkg/mod/google.golang.org/grpc@v1.39.0/server.go:1292 +0x50a
google.golang.org/grpc.(*Server).handleStream(0xc0000c41c0, 0xee83c0, 0xc000432600, 0xc0000338c0, 0x0)
	/home/runner/go/pkg/mod/google.golang.org/grpc@v1.39.0/server.go:1617 +0xcfd
google.golang.org/grpc.(*Server).serveStreams.func1.2(0xc0005147f0, 0xc0000c41c0, 0xee83c0, 0xc000432600, 0xc0000338c0)
	/home/runner/go/pkg/mod/google.golang.org/grpc@v1.39.0/server.go:940 +0xa1
created by google.golang.org/grpc.(*Server).serveStreams.func1
	/home/runner/go/pkg/mod/google.golang.org/grpc@v1.39.0/server.go:938 +0x204

Expected Behaviour

Current Behaviour

Possible Solution

The offending code is here.
One possible solution is adding something like the following after line 108,

if actionIndex == wfContext.TotalNumberOfActions-1 {
    return nil, status.Errorf(codes.FailedPrecondition, "already at the end of the workflow")
}

Steps to Reproduce (for bugs)

  1. Follow the Sandbox with Vagrant and Virtualbox quickstart
  2. Get the worker ID to be used to reproduce, should be the only one in there. tink hardware list -q
  3. Get the workflow ID to be used to reproduce, should be the only one in there. tink workflow list -q
  4. Get the TLS cert from Tink curl localhost:42114/cert
  5. Make the call to ReportActionStatus
package main

import (
	"context"
	"crypto/x509"

	"github.com/tinkerbell/tink/protos/workflow"
	"google.golang.org/grpc"
	"google.golang.org/grpc/credentials"
	"google.golang.org/protobuf/types/known/timestamppb"
)

func main() {
	ctx := context.Background()
	gc, err := grpcClient(ctx)
	if err != nil {
		panic(err)
	}
	workflowClient := workflow.NewWorkflowServiceClient(gc)
	_, err = workflowClient.ReportActionStatus(ctx, &workflow.WorkflowActionStatus{
		WorkflowId:   "<workflow ID from step 3>",
		TaskName:     "doesnt matter",
		ActionName:   "also doesnt matter",
		ActionStatus: workflow.State_STATE_RUNNING,
		Seconds:      0,
		Message:      "",
		CreatedAt:    &timestamppb.Timestamp{},
		WorkerId:     "<worker ID from step 2>",
	})
	if err != nil {
		panic(err)
	}
}

// toCreds takes a byte string, assumed to be a tls cert, and creates a transport credential.
func toCreds(pemCerts []byte) credentials.TransportCredentials {
	cp := x509.NewCertPool()
	ok := cp.AppendCertsFromPEM(pemCerts)
	if !ok {
		return nil
	}
	return credentials.NewClientTLSFromCert(cp, "")
}

func grpcClient(ctx context.Context) (*grpc.ClientConn, error) {
	var cert = []byte(`<insert cert from curl command here>`)
	return grpc.DialContext(ctx, "localhost:42113", grpc.WithTransportCredentials(toCreds(cert)))
}
  1. Observe the panic from the Tink server logs docker logs -f compose_tink-server_1

Context

Your Environment

  • Operating System and version (e.g. Linux, Windows, MacOS):
    MacOS/Linux

  • How are you running Tinkerbell? Using Vagrant & VirtualBox, Vagrant & Libvirt, on Packet using Terraform, or give details:
    Sandbox with Vagrant and Virtualbox at commit e3f8d2809997583d164d595ce81b68b081c4a502
    Tink server at version quay.io/tinkerbell/tink:sha-3743d31e.

  • Link to your project or a code example to reproduce issue:

Created #576 with the suggested fix from @jacobweinstock