Beuth-Erdelt/DBMS-Benchmarker

Feature request - initialisation SQL per connection (initSQL)

deenar opened this issue · 10 comments

deenar commented

It will be useful to have a feature like initSQL - which gives the ability to run an SQL statement exactly once, when the connection is created.

JDBC drivers have different features, some allow setting a default schema via the connection string, some don't. initSQL will give the ability to switch to the schema having the benchmarking dataset without modifying the queries. This is especially useful when one is benchmarking with datasets of different scaling sizes viz TPCH 1, TPCH 10 TPCH 100. Currently one is forced to create another user (or modify queries), when ability to switch schemas via JDBC connection string isnt supported.

`
[
{
'name': "MySQL TPCH10",
'alias': "Some DBMS",
'version': "CE 8.0.13",
'hostsystem': {'node': 'localhost'},
'info': "This is an example: MySQL on localhost",
'active': True,
'JDBC': {
'driver': "com.mysql.cj.jdbc.Driver",
'url': "jdbc:mysql://localhost:3306/database",
'auth': ["username", "password"],
'jar': "mysql-connector-java-8.0.13.jar",
'initSQL': "USE TPCH10";
},
},
]

`

Do you have an example where it is not possible to switch schemas via JDBC connection string?

deenar commented

The above example isn't ideal. Oracle is a good example. one can one change the schema using the following

ALTER SESSION SET CURRENT_SCHEMA=tpch10

https://stackoverflow.com/questions/2353594/default-schema-in-oracle-connection-url

Something similar for SQLServer too

https://stackoverflow.com/questions/3282665/possible-to-set-default-schema-from-connection-string

ALTER USER dbmsbmk WITH DEFAULT_SCHEMA = TPCH10;

I think you can set the database in an Oracle connection string, but you need to set up a service at first (I do not recall details): https://docs.oracle.com/middleware/12211/bip/BIPAD/GUID-FB2AEC3B-2178-48DF-8B9F-76ED2D6B5194.htm#BIPAD289

SQLServer allows you to set the database in the connection string via databaseName=: https://learn.microsoft.com/en-us/sql/connect/jdbc/building-the-connection-url?view=sql-server-ver16

deenar commented
  1. I am fairly certain, it does not work for Oracle. Service names are for resource management. I don't think one can set a default schema in a service https://oracle-base.com/articles/10g/dbms_service. You can set it in a login trigger, but that means each user has only one default schema.

  2. For SQL server you can set the database name in the connection string, but not the schema. This might be a bit less useful.

I see. Would a static parameter init_command per connection help, that is sent each time a connection is established?

deenar commented

Yes, that would be perfect. Some JDBC connection pools already support this feature https://tomcat.apache.org/tomcat-7.0-doc/jdbc-pool.html, something similar would be great. Just to clarify, the initSQL would be executed once each connection is established, before running any queries.

There now is a parameter init_SQL:

[
  {
    'name': "MySQL",
    'active': True,
    'JDBC': {
      'driver': "com.mysql.cj.jdbc.Driver",
      'url': "jdbc:mysql://localhost:3306/database",
      'auth': ["username", "password"],
      'jar': "mysql-connector-java-8.0.13.jar"
    },
    'init_SQL': "USE tpch",
  }
]

that will be evaluated starting with #117

You may want to check out the branch and test it?

deenar commented

Hi
I pulled the changes. They work with SQL that returns a result, but there is an issue with Oracle and its switch schema statement. It does not return any results and hence it fails on fetchResult. This was the mean use of this feature. Is there an alternative way to execute SQL that does not return results?

init_SQL alter session set current_schema=TPCH
Traceback (most recent call last):
  File ".\benchmark.py", line 160, in
    experiments.runBenchmarks()
  File "DBMS-Benchmarker\dbmsbenchmarker\benchmarker.py", line 1521, in runBenchmarks
    self.runBenchmarksQuery()
  File “DBMS-Benchmarker\dbmsbenchmarker\benchmarker.py", line 1438, in runBenchmarksQuery
    self.activeConnections[i].connect()
  File "DBMS-Benchmarker\dbmsbenchmarker\tools.py", line 625, in connect
    init_result = self.fetchResult()
  File "DBMS-Benchmarker\dbmsbenchmarker\tools.py", line 664, in fetchResult
    return self.cursor.fetchall()
  File "conda\envs\pi-machine\lib\site-packages\jaydebeapi_init_.py", line 593, in fetchall
    row = self.fetchone()
  File ".conda\envs\pi-machine\lib\site-packages\jaydebeapi_init_.py", line 558, in fetchone
    raise Error()
jaydebeapi.Error

Hi @deenar , thanks for testing!
DBMS-Benchmarker now does not fetch results from init_SQL in any case. I think it is not necessary anyway.

Please test again. init_SQL opens and closes a cursor, I am not sure if this is compatible with the solution you are looking for.

deenar commented

Thank you ver much. This works and allows to switch schemas in Oracle.