Driver hangs if user lacks permission for operation
mlh758 opened this issue · 5 comments
Version: github.com/vertica/vertica-sql-go v0.1.7
timeout, cancel := context.WithTimeout(context.Background(), 30*time.Second)
defer cancel()
_, execErr := connection.ExecContext(timeout, view_sql_string)
If the user does not have permission to do the operation, in our case it was CREATE OR REPLACE VIEW
, the driver will hang even with a timeout context passed in. The same sql passed into something like DBeaver errors out right away.
This seems similar to #48 although we aren't using a transaction.
Edit: This seems to happen for any DDL statement that returns an error. Dropping a table that does not exist can also trigger this.
I simplified my test case to:
db := dbConn()
defer db.Close()
ctx, cancel := context.WithTimeout(context.Background(), 10*time.Second)
defer cancel()
_, err := db.ExecContext(ctx, "drop view some_schema.some_view")
fmt.Println(err)
where I don't have permission to drop the view. I ran it a few times, and on one of my runs I saw this message:
Feb 7 13:47:39.563615 WARN connection: unhandled message: Error: [42501] Insufficient privileges on ALT_SEL_CAT, drop privileges needed
Which I believe should be coming from here. This should be caught in collectResults where I see that the error is ignored and it falls back into the for
loop.
"Hah", I thought, easy, return from the loop on an unhandled message. No such luck, I never received that unhandled message error again. Subsequent runs always fell into case 3 and returned an error. At that point QueryContextRaw
returns and it hangs afterwards.
Edit: I think future readers can probably ignore this trace...
I traced it a couple of times and saw these lines near the end where it then pretty much stopped doing things:
2020-02-07 14:34:23.630397 CST 0.1 MACH_STKHANDOFF 0 66 1f 0 2149d0 0 ct-vertica(58656)
2020-02-07 14:34:23.630397 CST 0.1(9982553.1) 9040002 c000048f88 0 0 4 2149d2 6 ct-vertica(58656)
2020-02-07 14:34:23.630397 CST 0.2 9030021 c000048f88 2 1 0 2149d2 6 ct-vertica(58656)
2020-02-07 14:34:23.630397 CST 0.1(0.1) 9030022 c000048f88 0 0 0 2149d2 6 ct-vertica(58656)
2020-02-07 14:34:23.630397 CST 0.6(9982554.5) BSC_psynch_cvwait 0 0 0 e520 2149d2 6 ct-vertica(58656)
2020-02-07 14:34:23.630401 CST 3.4 BSC_psynch_cvwait c000048f88 90100000a00 900 0 2149d2 6 ct-vertica(58656)
2020-02-07 14:34:23.630401 CST 0.5 9040001 c000048f88 0 a00 0 2149d2 6 ct-vertica(58656)
2020-02-07 14:34:23.630402 CST 0.7 903001d c000048f88 2 ffffff8064870340 0 2149d2 6 ct-vertica(58656)
2020-02-07 14:34:23.630402 CST 0.0(0.0) 903001e c000048f88 a00 900 901 2149d2 6 ct-vertica(58656)
2020-02-07 14:34:23.630402 CST 0.2 9040000 c000048f88 0 0 1 2149d2 6 ct-vertica(58656)
2020-02-07 14:34:23.630402 CST 0.1 903005c c000048f88 1 0 0 2149d2 6 ct-vertica(58656)
2020-02-07 14:34:23.630403 CST 0.2 MACH_WAIT a7e2b91da4b50957 2 0 0 2149d2 6 ct-vertica(58656)
2020-02-07 14:34:23.630403 CST 0.3 MACH_BLOCK 0 ffffff7f82b36bcb 0 0 2149d2 6 ct-vertica(58656)
2020-02-07 14:34:23.630403 CST 0.4 MT_InsCyc_CPU_CSwitch 10bc72af9449 fdf0e6bd20a 0 0 2149d2 6 ct-vertica(58656)
2020-02-07 14:34:23.630403 CST 0.1 MACH_STKHANDOFF 0 a0 1f 0 2149d2 6 ct-vertica(58656)
2020-02-07 14:34:23.630441 CST 37.3 MACH_DISPATCH a6 0 84 5 2149d0 10 ct-vertica(58656)
2020-02-07 14:34:23.630441 CST 0.1 CPUPM_URGENCY 0 2 2 0 2149d0 10 ct-vertica(58656)
2020-02-07 14:34:23.630441 CST 0.1 CPUPM_IDLE_EXIT1 0 4 532910 c350 2149d0 10 ct-vertica(58656)
2020-02-07 14:34:23.630441 CST 0.1 CPUPM_PST_QOS_APPLY 7 4 a65220 0 2149d0 10 ct-vertica(58656)
2020-02-07 14:34:23.630441 CST 0.0 CPUPM_PST_IDLE_EXIT 2ac9 4 53ec60 0 2149d0 10 ct-vertica(58656)
2020-02-07 14:34:23.630441 CST 0.0 CPUPM_PST_QOS_SWITCH 7 0 4 53ec60 2149d0 10 ct-vertica(58656)
2020-02-07 14:34:23.630441 CST 0.1 CPUPM_PST_QOS_CONT 1 0 989680 6 2149d0 10 ct-vertica(58656)
2020-02-07 14:34:23.630441 CST 0.0 CPUPM_DVFS_TRANSIT 0 0 989680 2d 2149d0 10 ct-vertica(58656)
2020-02-07 14:34:23.630441 CST 0.1 CPUPM_PST_RESOLVE 2d 2d 2 2d 2149d0 10 ct-vertica(58656)
2020-02-07 14:34:23.630441 CST 0.0 CPUPM_PST_RATELIMIT 0 3 1f 0 2149d0 10 ct-vertica(58656)
2020-02-07 14:34:23.630464 CST 23.1 TMR_set_tsc_deadline 35c3c40b6fe3 35c3c4a4048e 40449e8dd14a 4044a0194e8a 2149d0 10 ct-vertica(58656)
2020-02-07 14:34:23.630464 CST 0.0 TMR_TimerCallEnter a7e2b920cdc0e66f 0 35c3c4a40109 41 2149d0 10 ct-vertica(58656)
2020-02-07 14:34:23.630465 CST 0.3 TMR_set_tsc_deadline 35c3c40bcae1 35c3c4a40109 40449e8ebd26 4044a019456c 2149d0 10 ct-vertica(58656)
2020-02-07 14:34:23.630465 CST 0.0 DecrSet 983628 2 35c3c4a40109 0 2149d0 10 ct-vertica(58656)
2020-02-07 14:34:23.630465 CST 0.0(0.4) TMR_TimerCallEnter a7e2b920cdc0e66f 1 35c3c4a40109 0 2149d0 10 ct-vertica(58656)
2020-02-07 14:34:23.630465 CST 0.2(70.7) BSC_semwait_signal 3c 0 0 e520 2149d0 10 ct-vertica(58656)
2020-02-07 14:34:23.630466 CST 1.6 UTrap_DeviceNotAvail 7fdf 9096d84 0 100c5cb 2149d0 10 ct-vertica(58656)
2020-02-07 14:34:23.630468 CST 1.3 BSC_psynch_cvwait c000048788 70100000800 700 0 2149d0 10 ct-vertica(58656)
2020-02-07 14:34:23.630468 CST 0.3 9040001 c000048788 0 800 0 2149d0 10 ct-vertica(58656)
2020-02-07 14:34:23.630469 CST 0.6 903001d c000048788 2 ffffff806486ff68 0 2149d0 10 ct-vertica(58656)
2020-02-07 14:34:23.630469 CST 0.0(0.0) 903001e c000048788 800 700 701 2149d0 10 ct-vertica(58656)
2020-02-07 14:34:23.630469 CST 0.1 9040000 c000048788 0 0 1 2149d0 10 ct-vertica(58656)
2020-02-07 14:34:23.630469 CST 0.1 903005c c000048788 1 0 0 2149d0 10 ct-vertica(58656)
2020-02-07 14:34:23.630469 CST 0.2 MACH_WAIT a7e2b91da4b5057f 2 35d1bc5332e4 0 2149d0 10 ct-vertica(58656)
2020-02-07 14:34:23.630469 CST 0.1 TMR_TimerCallEnter a7e2b91d9f7e7927 0 35d1bc5332e4 10 2149d0 10 ct-vertica(58656)
2020-02-07 14:34:23.630469 CST 0.2(0.2) TMR_TimerCallEnter a7e2b91d9f7e7927 0 35d1bc9f7e24 117 2149d0 10 ct-vertica(58656)
2020-02-07 14:34:23.630470 CST 0.1 MACH_BLOCK 0 ffffff7f82b36bcb 0 0 2149d0 10 ct-vertica(58656)
2020-02-07 14:34:23.630470 CST 0.3 MT_InsCyc_CPU_CSwitch b7eecfa2e87 9ed2e9f951c 0 0 2149d0 10 ct-vertica(58656)
2020-02-07 14:34:23.630470 CST 0.1 MACH_STKHANDOFF 0 a6 1f 0 2149d0 10 ct-vertica(58656)
2020-02-07 14:34:31.822654 CST 8192183.8 MACH_DISPATCH dad 0 1 5 2149cf 0 ct-vertica(58656)
2020-02-07 14:34:31.822654 CST 0.2 CPUPM_URGENCY 0 2 2 0 2149cf 0 ct-vertica(58656)
2020-02-07 14:34:31.822654 CST 0.1 CPUPM_PST_QOS_SWITCH a 0 1 2457f4 2149cf 0 ct-vertica(58656)
2020-02-07 14:34:31.822654 CST 0.1 CPUPM_PST_QOS_CONT 1 0 205883 6 2149cf 0 ct-vertica(58656)
2020-02-07 14:34:31.822654 CST 0.1 CPUPM_DVFS_TRANSIT 0 0 205883 2d 2149cf 0 ct-vertica(58656)
2020-02-07 14:34:31.822655 CST 0.4 CPUPM_PST_RESOLVE 2d 2d 0 2d 2149cf 0 ct-vertica(58656)
2020-02-07 14:34:31.822655 CST 0.0 CPUPM_PST_RATELIMIT 0 3 b 0 2149cf 0 ct-vertica(58656)
2020-02-07 14:34:31.822655 CST 0.2 TMR_set_tsc_deadline 35c5ac56b18e 35c5ac77069a 404990364276 4049908a1091 2149cf 0 ct-vertica(58656)
2020-02-07 14:34:31.822655 CST 0.1 TMR_TimerCallEnter ffffff8000de9b10 0 35c5acef3ca5 41 2149cf 0 ct-vertica(58656)
2020-02-07 14:34:31.822655 CST 0.2 TMR_set_tsc_deadline 35c5ac56b2a4 35c5ac77069a 404990364502 4049908a104c 2149cf 0 ct-vertica(58656)
2020-02-07 14:34:31.822655 CST 0.1(0.2) TMR_TimerCallEnter ffffff8000de9b10 1 35c5acef3ca5 0 2149cf 0 ct-vertica(58656)
2020-02-07 14:34:31.822656 CST 0.6 INTERRUPT de ffffff800045e2d9 0 1 2149cf 0 ct-vertica(58656)
2020-02-07 14:34:31.822656 CST 0.4(0.4) INTERRUPT de 0 0 0 2149cf 0 ct-vertica(58656)
It seems like some code gets stuck waiting on a conditional lock but I really need to stick a debugger on this.
It appears to freeze in close when receiving a message. This is called from the defer at the end of execDC
in databse/sql.
A couple of questions:
- Would it be reasonable to set a read deadline on the underlying connection? It might be something that we set for certain operations and then unset so long running queries can wait as long as the user wants them to.
- The core problem seems to be waiting for a message that never comes from the server. Is the message saying no permission special in some way in that it means the server is no longer listening on this portal or connection?
@huebnerr Is there something special about the rollback severity? A response never comes after the close and flush messages are sent which locks the driver up. I think the fix would be to check if the error severity is ROLLBACK and set a flag on the statement to not wait for a response but I wanted to make sure that was sane before adding a test an opening a PR.