[Bug] function 'get_table_names' in the python client returns incorrect results
BruceWong96 opened this issue · 5 comments
Code of Conduct
- I agree to follow this project's Code of Conduct
Search before asking
- I have searched in the issues and found no similar issues.
Describe the bug
Hi,
I ran into the problem shown in the image above when connecting Kyuubi to Superset.
The table names and table structure are incorrect: the values returned appear to be the schema name rather than the table names.
This is not the expected result.
Affects Version(s)
master
Kyuubi Server Log Output
No response
Kyuubi Engine Log Output
No response
Kyuubi Server Configurations
default
Kyuubi Engine Configurations
No response
Additional context
No response
Are you willing to submit PR?
- Yes. I would be willing to submit a PR with guidance from the Kyuubi community to fix.
- No. I cannot submit a PR at this time.
Hello @BruceWong96,
Thanks for finding the time to report the issue!
We really appreciate the community's efforts to improve Apache Kyuubi.
After investigating, I found the bug and a solution.
The function get_table_names returns an incorrect value.
Here is my test code:
from sqlalchemy import text
from sqlalchemy.engine import create_engine

# Connect to Kyuubi through the hive:// SQLAlchemy dialect
engine = create_engine('hive://apache@10.183.4.93:10009/default')
with engine.connect() as con:
    result = con.execute(text("show tables in default"))
    rows = result.fetchall()
    print("show tables in default :")
    print(rows)
    print("-----------------------")
    # First column of each row
    print("result of row[0] :")
    for row in rows:
        print(row[0])
    print("-----------------------")
    # Second column of each row
    print("result of row[1] :")
    for row in rows:
        print(row[1])
Here are my test results:
show tables in default :
[('default', 'employees', False), ('default', 'student', False), ('default', 'student_scores', False)]
-----------------------
result of row[0] :
default
default
default
-----------------------
result of row[1] :
employees
student
student_scores
According to the tests above, row[0] holds the schema name and row[1] holds the table name, so the correct return statement is:
return [row[1] for row in connection.execute(text(query))]
I have since found something new.
When connecting to Hive directly, the following code is correct:
return [row[0] for row in connection.execute(text(query))]
This is because Hive returns single-column rows like the following:
show tables in default :
[('student',), ('student_scores',)]
When connecting to Kyuubi, the following code is needed instead:
return [row[1] for row in connection.execute(text(query))]
This is because Kyuubi returns three-column rows like the following:
show tables in default :
[('default', 'employees', False), ('default', 'student', False), ('default', 'student_scores', False)]
To handle this difference in return values, I modified the code; see the PR for details.
I also tested the change in Superset, and it works.
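For readers following along, here is a minimal sketch of one way to handle both row shapes. It is not the code from the PR. The helper name extract_table_names is hypothetical, and choosing the column by row length is an assumption based only on the two outputs shown above.

```python
from typing import Iterable, List, Sequence


def extract_table_names(rows: Iterable[Sequence]) -> List[str]:
    """Return table names from the rows of a "show tables" query.

    Assumption (taken from the outputs in this issue, not from the PR):
    - Hive returns one column per row: (table_name,)
    - Kyuubi returns three columns per row: (schema, table_name, flag)
    """
    names = []
    for row in rows:
        # Multi-column rows carry the table name in the second column;
        # single-column rows carry it in the first.
        names.append(row[1] if len(row) > 1 else row[0])
    return names


# Sample rows copied from the outputs reported in this issue.
hive_rows = [('student',), ('student_scores',)]
kyuubi_rows = [
    ('default', 'employees', False),
    ('default', 'student', False),
    ('default', 'student_scores', False),
]

print(extract_table_names(hive_rows))
print(extract_table_names(kyuubi_rows))
```

Inside a dialect, the same logic could be applied to the rows returned by connection.execute(text(query)), since SQLAlchemy result rows support len() and integer indexing.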