vertica/vertica-sql-go

Driver hangs if user lacks permission for operation

mlh758 opened this issue · 5 comments

Version: github.com/vertica/vertica-sql-go v0.1.7

timeout, cancel := context.WithTimeout(context.Background(), 30*time.Second)
defer cancel()
_, execErr := connection.ExecContext(timeout, view_sql_string)

If the user does not have permission to do the operation, in our case it was CREATE OR REPLACE VIEW, the driver will hang even with a timeout context passed in. The same sql passed into something like DBeaver errors out right away.

This seems similar to #48 although we aren't using a transaction.

Edit: This seems to happen for any DDL statement that returns an error. Dropping a table that does not exist can also trigger this.

I simplified my test case to:

db := dbConn()
defer db.Close()
ctx, cancel := context.WithTimeout(context.Background(), 10*time.Second)
defer cancel()
_, err := db.ExecContext(ctx, "drop view some_schema.some_view")
fmt.Println(err)

where I don't have permission to drop the view. I ran it a few times, and on one of my runs I saw this message:

Feb  7 13:47:39.563615 WARN connection: unhandled message: Error: [42501] Insufficient privileges on ALT_SEL_CAT, drop privileges needed

Which I believe should be coming from here. This should be caught in collectResults where I see that the error is ignored and it falls back into the for loop.

"Hah", I thought, easy, return from the loop on an unhandled message. No such luck, I never received that unhandled message error again. Subsequent runs always fell into case 3 and returned an error. At that point QueryContextRaw returns and it hangs afterwards.

Edit: I think future readers can probably ignore this trace...
I traced it a couple of times and saw these lines near the end where it then pretty much stopped doing things:

2020-02-07 14:34:23.630397 CST          0.1              MACH_STKHANDOFF                      0                66               1f               0                2149d0           0   ct-vertica(58656)  
2020-02-07 14:34:23.630397 CST          0.1(9982553.1)   9040002                              c000048f88       0                0                4                2149d2           6   ct-vertica(58656)  
2020-02-07 14:34:23.630397 CST          0.2              9030021                              c000048f88       2                1                0                2149d2           6   ct-vertica(58656)  
2020-02-07 14:34:23.630397 CST          0.1(0.1)         9030022                              c000048f88       0                0                0                2149d2           6   ct-vertica(58656)  
2020-02-07 14:34:23.630397 CST          0.6(9982554.5)   BSC_psynch_cvwait                    0                0                0                e520             2149d2           6   ct-vertica(58656)  
2020-02-07 14:34:23.630401 CST          3.4              BSC_psynch_cvwait                    c000048f88       90100000a00      900              0                2149d2           6   ct-vertica(58656)  
2020-02-07 14:34:23.630401 CST          0.5              9040001                              c000048f88       0                a00              0                2149d2           6   ct-vertica(58656)  
2020-02-07 14:34:23.630402 CST          0.7              903001d                              c000048f88       2                ffffff8064870340 0                2149d2           6   ct-vertica(58656)  
2020-02-07 14:34:23.630402 CST          0.0(0.0)         903001e                              c000048f88       a00              900              901              2149d2           6   ct-vertica(58656)  
2020-02-07 14:34:23.630402 CST          0.2              9040000                              c000048f88       0                0                1                2149d2           6   ct-vertica(58656)  
2020-02-07 14:34:23.630402 CST          0.1              903005c                              c000048f88       1                0                0                2149d2           6   ct-vertica(58656)  
2020-02-07 14:34:23.630403 CST          0.2              MACH_WAIT                            a7e2b91da4b50957 2                0                0                2149d2           6   ct-vertica(58656)  
2020-02-07 14:34:23.630403 CST          0.3              MACH_BLOCK                           0                ffffff7f82b36bcb 0                0                2149d2           6   ct-vertica(58656)  
2020-02-07 14:34:23.630403 CST          0.4              MT_InsCyc_CPU_CSwitch                10bc72af9449     fdf0e6bd20a      0                0                2149d2           6   ct-vertica(58656)  
2020-02-07 14:34:23.630403 CST          0.1              MACH_STKHANDOFF                      0                a0               1f               0                2149d2           6   ct-vertica(58656)  
2020-02-07 14:34:23.630441 CST         37.3              MACH_DISPATCH                        a6               0                84               5                2149d0           10  ct-vertica(58656)  
2020-02-07 14:34:23.630441 CST          0.1              CPUPM_URGENCY                        0                2                2                0                2149d0           10  ct-vertica(58656)  
2020-02-07 14:34:23.630441 CST          0.1              CPUPM_IDLE_EXIT1                     0                4                532910           c350             2149d0           10  ct-vertica(58656)  
2020-02-07 14:34:23.630441 CST          0.1              CPUPM_PST_QOS_APPLY                  7                4                a65220           0                2149d0           10  ct-vertica(58656)  
2020-02-07 14:34:23.630441 CST          0.0              CPUPM_PST_IDLE_EXIT                  2ac9             4                53ec60           0                2149d0           10  ct-vertica(58656)  
2020-02-07 14:34:23.630441 CST          0.0              CPUPM_PST_QOS_SWITCH                 7                0                4                53ec60           2149d0           10  ct-vertica(58656)  
2020-02-07 14:34:23.630441 CST          0.1              CPUPM_PST_QOS_CONT                   1                0                989680           6                2149d0           10  ct-vertica(58656)  
2020-02-07 14:34:23.630441 CST          0.0              CPUPM_DVFS_TRANSIT                   0                0                989680           2d               2149d0           10  ct-vertica(58656)  
2020-02-07 14:34:23.630441 CST          0.1              CPUPM_PST_RESOLVE                    2d               2d               2                2d               2149d0           10  ct-vertica(58656)  
2020-02-07 14:34:23.630441 CST          0.0              CPUPM_PST_RATELIMIT                  0                3                1f               0                2149d0           10  ct-vertica(58656)  
2020-02-07 14:34:23.630464 CST         23.1              TMR_set_tsc_deadline                 35c3c40b6fe3     35c3c4a4048e     40449e8dd14a     4044a0194e8a     2149d0           10  ct-vertica(58656)  
2020-02-07 14:34:23.630464 CST          0.0              TMR_TimerCallEnter                   a7e2b920cdc0e66f 0                35c3c4a40109     41               2149d0           10  ct-vertica(58656)  
2020-02-07 14:34:23.630465 CST          0.3              TMR_set_tsc_deadline                 35c3c40bcae1     35c3c4a40109     40449e8ebd26     4044a019456c     2149d0           10  ct-vertica(58656)  
2020-02-07 14:34:23.630465 CST          0.0              DecrSet                              983628           2                35c3c4a40109     0                2149d0           10  ct-vertica(58656)  
2020-02-07 14:34:23.630465 CST          0.0(0.4)         TMR_TimerCallEnter                   a7e2b920cdc0e66f 1                35c3c4a40109     0                2149d0           10  ct-vertica(58656)  
2020-02-07 14:34:23.630465 CST          0.2(70.7)        BSC_semwait_signal                   3c               0                0                e520             2149d0           10  ct-vertica(58656)  
2020-02-07 14:34:23.630466 CST          1.6              UTrap_DeviceNotAvail                 7fdf             9096d84          0                100c5cb          2149d0           10  ct-vertica(58656)  
2020-02-07 14:34:23.630468 CST          1.3              BSC_psynch_cvwait                    c000048788       70100000800      700              0                2149d0           10  ct-vertica(58656)  
2020-02-07 14:34:23.630468 CST          0.3              9040001                              c000048788       0                800              0                2149d0           10  ct-vertica(58656)  
2020-02-07 14:34:23.630469 CST          0.6              903001d                              c000048788       2                ffffff806486ff68 0                2149d0           10  ct-vertica(58656)  
2020-02-07 14:34:23.630469 CST          0.0(0.0)         903001e                              c000048788       800              700              701              2149d0           10  ct-vertica(58656)  
2020-02-07 14:34:23.630469 CST          0.1              9040000                              c000048788       0                0                1                2149d0           10  ct-vertica(58656)  
2020-02-07 14:34:23.630469 CST          0.1              903005c                              c000048788       1                0                0                2149d0           10  ct-vertica(58656)  
2020-02-07 14:34:23.630469 CST          0.2              MACH_WAIT                            a7e2b91da4b5057f 2                35d1bc5332e4     0                2149d0           10  ct-vertica(58656)  
2020-02-07 14:34:23.630469 CST          0.1              TMR_TimerCallEnter                   a7e2b91d9f7e7927 0                35d1bc5332e4     10               2149d0           10  ct-vertica(58656)  
2020-02-07 14:34:23.630469 CST          0.2(0.2)         TMR_TimerCallEnter                   a7e2b91d9f7e7927 0                35d1bc9f7e24     117              2149d0           10  ct-vertica(58656)  
2020-02-07 14:34:23.630470 CST          0.1              MACH_BLOCK                           0                ffffff7f82b36bcb 0                0                2149d0           10  ct-vertica(58656)  
2020-02-07 14:34:23.630470 CST          0.3              MT_InsCyc_CPU_CSwitch                b7eecfa2e87      9ed2e9f951c      0                0                2149d0           10  ct-vertica(58656)  
2020-02-07 14:34:23.630470 CST          0.1              MACH_STKHANDOFF                      0                a6               1f               0                2149d0           10  ct-vertica(58656)  
2020-02-07 14:34:31.822654 CST    8192183.8              MACH_DISPATCH                        dad              0                1                5                2149cf           0   ct-vertica(58656)  
2020-02-07 14:34:31.822654 CST          0.2              CPUPM_URGENCY                        0                2                2                0                2149cf           0   ct-vertica(58656)  
2020-02-07 14:34:31.822654 CST          0.1              CPUPM_PST_QOS_SWITCH                 a                0                1                2457f4           2149cf           0   ct-vertica(58656)  
2020-02-07 14:34:31.822654 CST          0.1              CPUPM_PST_QOS_CONT                   1                0                205883           6                2149cf           0   ct-vertica(58656)  
2020-02-07 14:34:31.822654 CST          0.1              CPUPM_DVFS_TRANSIT                   0                0                205883           2d               2149cf           0   ct-vertica(58656)  
2020-02-07 14:34:31.822655 CST          0.4              CPUPM_PST_RESOLVE                    2d               2d               0                2d               2149cf           0   ct-vertica(58656)  
2020-02-07 14:34:31.822655 CST          0.0              CPUPM_PST_RATELIMIT                  0                3                b                0                2149cf           0   ct-vertica(58656)  
2020-02-07 14:34:31.822655 CST          0.2              TMR_set_tsc_deadline                 35c5ac56b18e     35c5ac77069a     404990364276     4049908a1091     2149cf           0   ct-vertica(58656)  
2020-02-07 14:34:31.822655 CST          0.1              TMR_TimerCallEnter                   ffffff8000de9b10 0                35c5acef3ca5     41               2149cf           0   ct-vertica(58656)  
2020-02-07 14:34:31.822655 CST          0.2              TMR_set_tsc_deadline                 35c5ac56b2a4     35c5ac77069a     404990364502     4049908a104c     2149cf           0   ct-vertica(58656)  
2020-02-07 14:34:31.822655 CST          0.1(0.2)         TMR_TimerCallEnter                   ffffff8000de9b10 1                35c5acef3ca5     0                2149cf           0   ct-vertica(58656)  
2020-02-07 14:34:31.822656 CST          0.6              INTERRUPT                            de               ffffff800045e2d9 0                1                2149cf           0   ct-vertica(58656)  
2020-02-07 14:34:31.822656 CST          0.4(0.4)         INTERRUPT                            de               0                0                0                2149cf           0   ct-vertica(58656)  

It seems like some code gets stuck waiting on a conditional lock but I really need to stick a debugger on this.

It appears to freeze in close when receiving a message. This is called from the defer at the end of execDC in databse/sql.

A couple of questions:

  1. Would it be reasonable to set a read deadline on the underlying connection? It might be something that we set for certain operations and then unset so long running queries can wait as long as the user wants them to.
  2. The core problem seems to be waiting for a message that never comes from the server. Is the message saying no permission special in some way in that it means the server is no longer listening on this portal or connection?

The actual error message that precedes this:

image

@huebnerr Is there something special about the rollback severity? A response never comes after the close and flush messages are sent which locks the driver up. I think the fix would be to check if the error severity is ROLLBACK and set a flag on the statement to not wait for a response but I wanted to make sure that was sane before adding a test an opening a PR.