AbsaOSS/pramen

Add support for Spark JDBC sources with SQL inputs


Background

Currently, when input.table is specified, the Spark JDBC Data Source is used, but when input.sql is specified, the built-in JDBC Data Source (aka JDBC Native) is used. The JDBC Native implementation is quite limited in performance and data type support, but it can run queries that do not start with "SELECT".

However, JDBC Native is always selected for input.sql, even if the query starts with "SELECT".

Feature

Add support for using the Spark JDBC Data Source with SQL inputs that start with "SELECT" and do not contain "WHERE".

You can force the previous behavior (JDBC Native) by setting the following flag in the source definition:

pramen.sources = [
  {
    name = "jdbc1"
    factory.class = "za.co.absa.pramen.core.source.JdbcSource"
    jdbc {
      driver = "$driver"
      connection.string = "$url"
      user = "$user"
      password = "$password"
    }

    # This forces the use of JDBC Native
    use.jdbc.native = true
  }
]
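
The selection rule described above, including the use.jdbc.native override, could be sketched roughly as follows. This is only an illustration and not Pramen's actual implementation: the function name choose_jdbc_reader, the return values "spark" and "native", and the keyword-based matching are all assumptions made for this example.

```python
import re


def choose_jdbc_reader(sql: str, use_jdbc_native: bool = False) -> str:
    """Pick a reader for an input.sql query (hypothetical helper).

    Returns "spark" for the Spark JDBC Data Source or "native" for
    JDBC Native. The names and the matching rules are assumptions.
    """
    # The use.jdbc.native flag forces the previous behavior.
    if use_jdbc_native:
        return "native"

    normalized = sql.strip().lower()

    # The query must begin with the SELECT keyword...
    starts_with_select = re.match(r"select\b", normalized) is not None
    # ...and must not contain WHERE as a standalone word.
    has_where = re.search(r"\bwhere\b", normalized) is not None

    if starts_with_select and not has_where:
        return "spark"
    return "native"
```

Note that a plain keyword check like this is conservative: a query with the word "where" inside a string literal would also fall back to JDBC Native.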