snowflakedb/gosnowflake

Segfault from s3_storage_client.go:96

Closed this issue · 6 comments

This issue happens every now and then. There are no reliable reproduction steps.
Stacktrace

panic: runtime error: invalid memory address or nil pointer dereference
[signal SIGSEGV: segmentation violation code=0x1 addr=0x98 pc=0x1245175]

goroutine 124222 [running]:
github.com/snowflakedb/gosnowflake.(*snowflakeS3Client).getFileHeader(0xbbd673242b7157ad?, 0xc0000fc380, {0xc004d90d20?, 0x0?})
        /home/joe/go/pkg/mod/github.com/snowflakedb/gosnowflake@v1.7.1/s3_storage_client.go:96 +0x115
github.com/snowflakedb/gosnowflake.(*remoteStorageUtil).uploadOneFile(0xc001bbe9d8?, 0xc0000fc380)
        /home/joe/go/pkg/mod/github.com/snowflakedb/gosnowflake@v1.7.1/storage_client.go:90 +0x248
github.com/snowflakedb/gosnowflake.(*remoteStorageUtil).uploadOneFileWithRetry(0xc007a1ef17?, 0xc0000fc380)
        /home/joe/go/pkg/mod/github.com/snowflakedb/gosnowflake@v1.7.1/storage_client.go:141 +0xd0
github.com/snowflakedb/gosnowflake.(*snowflakeFileTransferAgent).uploadOneFile(0xc005dca500, 0xc0000fc380)
        /home/joe/go/pkg/mod/github.com/snowflakedb/gosnowflake@v1.7.1/file_transfer_agent.go:871 +0x37c
github.com/snowflakedb/gosnowflake.(*snowflakeFileTransferAgent).uploadFilesParallel.func1(0x0, 0xc003ece210?)
        /home/joe/go/pkg/mod/github.com/snowflakedb/gosnowflake@v1.7.1/file_transfer_agent.go:731 +0x7d
created by github.com/snowflakedb/gosnowflake.(*snowflakeFileTransferAgent).uploadFilesParallel in goroutine 75
        /home/joe/go/pkg/mod/github.com/snowflakedb/gosnowflake@v1.7.1/file_transfer_agent.go:729 +0x25d
  1. What version of GO driver are you using?
    1.7.1

  2. What operating system and processor architecture are you using?
    MacOS

  3. What version of GO are you using?
    go version go1.20.4 darwin/arm64

  4. Server version:
    7.44.2

  5. What did you do?
  • Create temp stage
  • Upload local JSON files into temp stage
  • Copy data from temp stage to a regular table

  6. What did you expect to see?
    Successful upload

  7. Can you set logging to DEBUG and collect the logs?
    Will do next time.

Hi, and thank you for raising this issue!

I tried to reproduce it with a simple program which creates a temp stage and PUTs some small JSON files, but wasn't successful. I omitted COPY INTO because, per the stack trace, the issue seems to happen at the upload step.

Then I tried running this program a couple of times sequentially: still no error. Removing the unique temp stage name did not help. Running the same program sequentially in two processes: no luck. Then I introduced goroutines, in case that would make it fail, but it did not.

Right now I'm at two goroutines running in parallel in the background, but I still could not get the program to bail out with panic: runtime error: invalid memory address or nil pointer dereference, even after running the repro a couple hundred (thousand?) times.
I'm using

  • gosnowflake 1.7.1
  • go version go1.20.6 darwin/amd64 (I see you're using ARM; not sure if I can get a repro there, will try)

Upgrading aws-sdk-go-v2/feature/s3/manager from the default 1.11.59 to past 1.14.0 (where they introduced the breaking change) did not help in reproducing either:

$ go get -u github.com/aws/aws-sdk-go-v2/feature/s3/manager
[..]
go: upgraded github.com/aws/aws-sdk-go-v2/feature/s3/manager v1.11.59 => v1.15.7

little repro program i'm running:

$  cat testPut.go 
package main

import (
  "context"
  "database/sql"
  "flag"
  "fmt"
  "log"
//  "strconv"
//  "time"

  sf "github.com/snowflakedb/gosnowflake"
)

func doIt(c chan string) {
  if !flag.Parsed() {
    flag.Parse()
  }

  cfg := &sf.Config {
    User: "user",
    Password: "password",
    Account: "myaccount.eu-central-1",
    Warehouse: "COMPUTE_WH",
//    Tracing: "trace",
  }

  dsn, err := sf.DSN(cfg)
  if err != nil {
    log.Fatalf("failed to create DSN from Config: %v, err: %v", cfg, err)
  }

  db, err := sql.Open("snowflake", dsn)
  if err != nil {
    log.Fatalf("failed to connect. %v, err: %v", dsn, err)
  }
  defer db.Close()

  ctx := context.Background()
  conn, err := db.Conn(ctx)
  if err != nil {
    log.Fatalf("Failed to acquire connection. err: %v", err)
  }
  defer conn.Close()

  stageName := "test_db.gh1008.stage_gh1008_" //+ strconv.FormatInt(time.Now().UnixNano(), 10)
  // initially used UnixNano to make the stage name unique

  createStageQuery := "CREATE TEMP STAGE " + stageName
  putQuery := "PUT file:///path/to/json*.json @" + stageName + " AUTO_COMPRESS=false OVERWRITE=true PARALLEL=10"

  fmt.Printf("Creating stage: %v\n", createStageQuery)
  _, err = conn.ExecContext(ctx, createStageQuery)
  if err != nil {
    log.Fatalf("failed to run the query. %v, err: %v", createStageQuery, err)
  }

  fmt.Printf("PUTting files: %v\n", putQuery)
  _, err = conn.ExecContext(ctx, putQuery)
  if err != nil {
    log.Fatalf("failed to run the query. %v, err: %v", putQuery, err)
  }
  c <- "goroutine done"
}

func main() {
  c := make(chan string)
  go doIt(c)
  go doIt(c)
  // wait for both goroutines so neither PUT is cut short when main exits
  for i := 0; i < 2; i++ {
    fmt.Println(<-c)
  }
}

So if you could provide any further insights into how you're using the PUT or even a minimal viable repro program, that would be immensely helpful. Thank you in advance!

@sfc-gh-dszmolka thanks for looking into the issue.
I have not found a reliable repro yet. This issue happens very rarely, and I have only seen it twice.
Some random guess: is it possible that this branch returns false, in which case the code will continue executing even when err != nil?
https://github.com/snowflakedb/gosnowflake/blob/master/s3_storage_client.go#L80

Good pointer, thank you! I still did not manage to reproduce the issue, no matter what I did to the PUT / HEAD requests' responses, but the referenced part definitely looks suspicious, especially compared to gcs_storage_client and azure_storage_client, where there doesn't seem to be such a 'loose end'.
We'll take a look.

Since we don't have a reproducible scenario with which we could verify any kind of fix, I'm just hoping that adding error handling for the 'loose end' you pointed out will help.
The PR for that is in test under #1012

The test passed and the PR has been merged; it will be part of the next release of the driver, which can be expected towards mid-January.

released with gosnowflake 1.7.2