[Bug] [spark-hive-connector] failed to set hive.metastore.uris if `spark.sql.hive.metastore.jars` is not set
FANNG1 opened this issue · 5 comments
Code of Conduct
- I agree to follow this project's Code of Conduct
Search before asking
- I have searched in the issues and found no similar issues.
Describe the bug
- start two Hive metastores: 127.0.0.1:9083 as hive1 and 127.0.0.1:19083 as hive2
- start the Spark SQL client, setting the default Hive metastore address to hive2 and the metastore address of `hive_catalog` to hive1:

./bin/spark-sql -v \
  --conf spark.sql.catalog.hive_catalog="org.apache.kyuubi.spark.connector.hive.HiveTableCatalog" \
  --conf spark.sql.catalog.hive_catalog.hive.metastore.uris=thrift://127.0.0.1:9083 \
  --conf spark.sql.catalog.hive_catalog.hive.metastore.port=9083 \
  --conf spark.hadoop.hive.metastore.uris=thrift://127.0.0.1:19083

- run Spark SQL statements: after `use hive_catalog`, `show databases` retrieves the databases from hive2, not hive1.
Affects Version(s)
1.8.1
Kyuubi Server Log Output
No response
Kyuubi Engine Log Output
No response
Kyuubi Server Configurations
No response
Kyuubi Engine Configurations
No response
Additional context
No response
Are you willing to submit PR?
- Yes. I would be willing to submit a PR with guidance from the Kyuubi community to fix.
- No. I cannot submit a PR at this time.
DON'T use `spark-sql` to test Hive-related stuff, there is a lot of trickiness inside. Is it reproducible with `spark-shell`?
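For reference, a minimal `spark-shell` check of the same scenario might look like the sketch below (the catalog name, connector class, and metastore URIs are taken from the report above; this is an illustration, not a verified reproduction):

```scala
// Launch with the same confs as the spark-sql repro above, e.g.:
//   ./bin/spark-shell \
//     --conf spark.sql.catalog.hive_catalog=org.apache.kyuubi.spark.connector.hive.HiveTableCatalog \
//     --conf spark.sql.catalog.hive_catalog.hive.metastore.uris=thrift://127.0.0.1:9083 \
//     --conf spark.hadoop.hive.metastore.uris=thrift://127.0.0.1:19083

// Then, inside the shell:
spark.sql("USE hive_catalog")
// If the catalog-level hive.metastore.uris is honored, this lists the databases of hive1
// (thrift://127.0.0.1:9083) rather than hive2.
spark.sql("SHOW DATABASES").show()
```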
The main reason is that if `spark.sql.hive.metastore.jars` is not specified, `HiveClientImpl` will use a shared `SessionState` to create a `Hive` client, and the shared `SessionState` is initialized first by the `spark_catalog` catalog, in which the Hive client points to hive2 in this case.
// `isolationOn` is false if `spark.sql.hive.metastore.jars` is `builtin` and the current
// SessionState is a CliSessionState (which is the case when running spark-sql)
def isCliSessionState(): Boolean = {
  val state = SessionState.get
  var temp: Class[_] = if (state != null) state.getClass else null
  var found = false
  while (temp != null && !found) {
    found = temp.getName == "org.apache.hadoop.hive.cli.CliSessionState"
    temp = temp.getSuperclass
  }
  found
}
// create or reuse the session state according to `clientLoader.isolationOn`
val state: SessionState = {
  if (clientLoader.isolationOn) {
    newState()
  } else {
    SessionState.get
  }
}
// get the HiveConf from the (possibly shared) session state
def conf: HiveConf = {
  val hiveConf = state.getConf
  hiveConf
}
// create a Hive client from the conf, or reuse the cached one
private def client: Hive = {
  if (clientLoader.cachedHive != null) {
    clientLoader.cachedHive.asInstanceOf[Hive]
  } else {
    val c = getHive(conf)
    clientLoader.cachedHive = c
    c
  }
}
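Putting the excerpts together, the `spark-sql` failure path is roughly as follows (an illustrative sketch, not Spark source; the config values come from the report above):

```scala
import org.apache.hadoop.hive.ql.session.SessionState

// spark-sql boots a CliSessionState whose HiveConf is built from spark.hadoop.*,
// i.e. hive.metastore.uris = thrift://127.0.0.1:19083 (hive2) in this report.
// With builtin metastore jars, that CliSessionState makes isolationOn false,
// so HiveClientImpl falls into the SessionState.get branch above instead of newState().
val reusedState = SessionState.get
val effectiveUris = reusedState.getConf.get("hive.metastore.uris")
// effectiveUris stays "thrift://127.0.0.1:19083" (hive2), so getHive(conf) connects to hive2
// and hive_catalog's thrift://127.0.0.1:9083 (hive1) never takes effect.
```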
> DON'T use `spark-sql` to test Hive-related stuff, there is a lot of trickiness inside. Is it reproducible with `spark-shell`?

It can't be reproduced with `spark-shell`.
That KSHC does not work well with `spark-sql` is a known issue; we don't have a plan to fix it on the Kyuubi side, because we treat it as a Spark-side issue.
Kyuubi is a full drop-in replacement for `spark-sql`:

`spark-sql` => `beeline` => Kyuubi => Spark driver (client or cluster mode)
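For completeness, an equivalent check through Kyuubi's JDBC front end might look like the sketch below (the host, the default frontend port 10009, and the anonymous user are assumptions; the catalog name comes from the report above):

```scala
import java.sql.DriverManager

// Connect to a Kyuubi server with the Hive JDBC driver (hive-jdbc must be on the classpath).
val conn = DriverManager.getConnection("jdbc:hive2://localhost:10009/", "anonymous", "")
val stmt = conn.createStatement()
stmt.execute("USE hive_catalog")
// Should list the databases of hive1 if the catalog-level metastore URI is honored.
val rs = stmt.executeQuery("SHOW DATABASES")
while (rs.next()) println(rs.getString(1))
rs.close(); stmt.close(); conn.close()
```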
Close as not planned