timveil-cockroach/oltpbench

CRDB: unable to run TPC-H

timveil opened this issue · 3 comments

15:29:42,503 (Worker.java:417) WARN  - PSQLException thrown when executing 'Q9/09' on 'TPCHWorker<000>' [Message='An I/O error occurred while sending to the backend.', ErrorCode='0', SQLState='08006']
org.postgresql.util.PSQLException: An I/O error occurred while sending to the backend.
	at org.postgresql.core.v3.QueryExecutorImpl.execute(QueryExecutorImpl.java:335)
	at org.postgresql.jdbc.PgStatement.executeInternal(PgStatement.java:441)
	at org.postgresql.jdbc.PgStatement.execute(PgStatement.java:365)
	at org.postgresql.jdbc.PgPreparedStatement.executeWithFlags(PgPreparedStatement.java:143)
	at org.postgresql.jdbc.PgPreparedStatement.executeQuery(PgPreparedStatement.java:106)
	at com.oltpbenchmark.benchmarks.tpch.procedures.GenericQuery.run(GenericQuery.java:54)
	at com.oltpbenchmark.benchmarks.tpch.TPCHWorker.executeWork(TPCHWorker.java:43)
	at com.oltpbenchmark.api.Worker.doWork(Worker.java:386)
	at com.oltpbenchmark.api.Worker.run(Worker.java:294)
	at java.lang.Thread.run(Thread.java:748)
Caused by: java.io.EOFException
	at org.postgresql.core.PGStream.receiveChar(PGStream.java:308)
	at org.postgresql.core.v3.QueryExecutorImpl.processResults(QueryExecutorImpl.java:1952)
	at org.postgresql.core.v3.QueryExecutorImpl.execute(QueryExecutorImpl.java:308)
	... 9 more
15:29:42,504 (Worker.java:511) DEBUG - TPCHWorker<000> Q9/09 Result: UNKNOWN
15:29:42,504 (ThreadBench.java:769) ERROR - Unexpected fatal, error in 'TPCHWorker<000>' when executing 'Q9/09' [COCKROACHDB]
java.lang.RuntimeException: Unexpected fatal, error in 'TPCHWorker<000>' when executing 'Q9/09' [COCKROACHDB]
	at com.oltpbenchmark.api.Worker.doWork(Worker.java:545)
	at com.oltpbenchmark.api.Worker.run(Worker.java:294)
	at java.lang.Thread.run(Thread.java:748)
Caused by: org.postgresql.util.PSQLException: This connection has been closed.
	at org.postgresql.jdbc.PgConnection.checkClosed(PgConnection.java:783)
	at org.postgresql.jdbc.PgConnection.rollback(PgConnection.java:791)
	at com.oltpbenchmark.api.Worker.doWork(Worker.java:428)
	... 2 more

this is the problematic query. executing it over CLI and it runs for minutes, does not finish.

ddl for related tables

SELECT
        nation, o_year, sum(amount) AS sum_profit
FROM
        (
                SELECT
                        n_name AS nation,
                        extract('year', o_orderdate) AS o_year,
                        l_extendedprice * (1 - l_discount)
                        - ps_supplycost * l_quantity
                                AS amount
                FROM
                        oltpbench.part,
                        oltpbench.supplier,
                        oltpbench.lineitem,
                        oltpbench.partsupp,
                        oltpbench.orders,
                        oltpbench.nation
                WHERE
                        s_suppkey = l_suppkey
                        AND ps_suppkey = l_suppkey
                        AND ps_partkey = l_partkey
                        AND p_partkey = l_partkey
                        AND o_orderkey = l_orderkey
                        AND s_nationkey = n_nationkey
                        AND p_name LIKE '%olive%'
        )
                AS profit
GROUP BY
        nation, o_year
ORDER BY
        nation, o_year DESC

explain without tpch indexes...

                      tree                     |    field    |                                description                                 
+----------------------------------------------+-------------+---------------------------------------------------------------------------+
  sort                                         |             |                                                                            
   │                                           | order       | +nation,-o_year                                                            
   └── group                                   |             |                                                                            
        │                                      | aggregate 0 | n_name                                                                     
        │                                      | aggregate 1 | o_year                                                                     
        │                                      | aggregate 2 | sum(amount)                                                                
        │                                      | group by    | @3,@1                                                                      
        └── render                             |             |                                                                            
             └── join                          |             |                                                                            
                  │                            | type        | inner                                                                      
                  │                            | equality    | (l_partkey) = (p_partkey)                                                  
                  ├── join                     |             |                                                                            
                  │    │                       | type        | inner                                                                      
                  │    │                       | equality    | (l_suppkey, n_nationkey) = (s_suppkey, s_nationkey)                        
                  │    ├── join                |             |                                                                            
                  │    │    │                  | type        | inner                                                                      
                  │    │    │                  | equality    | (ps_suppkey, ps_partkey, o_orderkey) = (l_suppkey, l_partkey, l_orderkey)  
                  │    │    ├── join           |             |                                                                            
                  │    │    │    │             | type        | cross                                                                      
                  │    │    │    ├── join      |             |                                                                            
                  │    │    │    │    │        | type        | cross                                                                      
                  │    │    │    │    ├── scan |             |                                                                            
                  │    │    │    │    │        | table       | orders@primary                                                             
                  │    │    │    │    │        | spans       | ALL                                                                        
                  │    │    │    │    └── scan |             |                                                                            
                  │    │    │    │             | table       | nation@primary                                                             
                  │    │    │    │             | spans       | ALL                                                                        
                  │    │    │    └── scan      |             |                                                                            
                  │    │    │                  | table       | partsupp@primary                                                           
                  │    │    │                  | spans       | ALL                                                                        
                  │    │    └── scan           |             |                                                                            
                  │    │                       | table       | lineitem@primary                                                           
                  │    │                       | spans       | ALL                                                                        
                  │    └── scan                |             |                                                                            
                  │                            | table       | supplier@primary                                                           
                  │                            | spans       | ALL                                                                        
                  └── scan                     |             |                                                                            
                                               | table       | part@primary                                                               
                                               | spans       | ALL                                                                        
                                               | filter      | p_name LIKE '%olive%'                                                      
(40 rows)

explain after indexes added...

                           tree                          |    field    |                                description                                 
+--------------------------------------------------------+-------------+---------------------------------------------------------------------------+
  sort                                                   |             |                                                                            
   │                                                     | order       | +nation,-o_year                                                            
   └── group                                             |             |                                                                            
        │                                                | aggregate 0 | n_name                                                                     
        │                                                | aggregate 1 | o_year                                                                     
        │                                                | aggregate 2 | sum(amount)                                                                
        │                                                | group by    | @3,@1                                                                      
        └── render                                       |             |                                                                            
             └── render                                  |             |                                                                            
                  └── lookup-join                        |             |                                                                            
                       │                                 | type        | inner                                                                      
                       │                                 | pred        | @18 LIKE '%olive%'                                                         
                       ├── lookup-join                   |             |                                                                            
                       │    │                            | type        | inner                                                                      
                       │    ├── join                     |             |                                                                            
                       │    │    │                       | type        | inner                                                                      
                       │    │    │                       | equality    | (l_suppkey, n_nationkey) = (s_suppkey, s_nationkey)                        
                       │    │    ├── join                |             |                                                                            
                       │    │    │    │                  | type        | inner                                                                      
                       │    │    │    │                  | equality    | (ps_suppkey, ps_partkey, o_orderkey) = (l_suppkey, l_partkey, l_orderkey)  
                       │    │    │    ├── join           |             |                                                                            
                       │    │    │    │    │             | type        | cross                                                                      
                       │    │    │    │    ├── join      |             |                                                                            
                       │    │    │    │    │    │        | type        | cross                                                                      
                       │    │    │    │    │    ├── scan |             |                                                                            
                       │    │    │    │    │    │        | table       | orders@primary                                                             
                       │    │    │    │    │    │        | spans       | ALL                                                                        
                       │    │    │    │    │    └── scan |             |                                                                            
                       │    │    │    │    │             | table       | nation@primary                                                             
                       │    │    │    │    │             | spans       | ALL                                                                        
                       │    │    │    │    └── scan      |             |                                                                            
                       │    │    │    │                  | table       | partsupp@primary                                                           
                       │    │    │    │                  | spans       | ALL                                                                        
                       │    │    │    └── scan           |             |                                                                            
                       │    │    │                       | table       | lineitem@primary                                                           
                       │    │    │                       | spans       | ALL                                                                        
                       │    │    └── scan                |             |                                                                            
                       │    │                            | table       | supplier@primary                                                           
                       │    │                            | spans       | ALL                                                                        
                       │    └── scan                     |             |                                                                            
                       │                                 | table       | part@p_pk                                                                  
                       └── scan                          |             |                                                                            
                                                         | table       | part@primary                                                               
(43 rows)

the closed/broken connection is actually caused by a 1m timeout configured in haproxy. that again means the underlying issue is query duration exacerbated by overly aggressive timeout in haproxy

Caused by: org.postgresql.util.PSQLException: This connection has been closed.
	at org.postgresql.jdbc.PgConnection.checkClosed(PgConnection.java:783)
	at org.postgresql.jdbc.PgConnection.rollback(PgConnection.java:791)
	at com.oltpbenchmark.api.Worker.doWork(Worker.java:428)
	... 2 more

resolved in v2.2.0-alpha

Also made the following configuration changes:

  • increased haproxy timeout settings to 1h to be safe
  • set -max-sql-memory=4GB on each node