Unable to do JOIN between to iceberg based tables
sarialalem1 opened this issue · 4 comments
Example of what I'm trying to do:
select * from iceberg_scan('s3://<url1>.metadata.json') as t1 join iceberg_scan('s3://t2.metadata.json') as t2 on 1=1
When running the above I get this error:
Binder Error: Duplicate alias \"iceberg_scan_data\" in query!
Although I'm passing an alias for each iceberg_scan
operation, it's still assigning the same iceberg_scan_data
alias to both.
A suggestion would be to add the alias ass a parameter to the function.
You can use WITH clause instead of alias to define both t1 and t2.
Yea this is a side-effect of the way things are implemented right now. This will likely change in the future, but I need to get some refactors in DuckDB-side before I can fix that
@samansmink maybe this would be a good solution? #57
In addition to @harel-e suggestion of WITH
, you can also create views, which work fine:
sql = '''
CREATE SCHEMA IF NOT EXISTS tpc;
CREATE VIEW tpc.customer AS
SELECT * FROM iceberg_scan('/Users/mritchie712/blackbird/notebooks/iceduck_tpc/default.db/customer', allow_moved_paths = true);
CREATE VIEW tpc.lineitem AS
SELECT * FROM iceberg_scan('/Users/mritchie712/blackbird/notebooks/iceduck_tpc/default.db/lineitem', allow_moved_paths = true);
CREATE VIEW tpc.nation AS
SELECT * FROM iceberg_scan('/Users/mritchie712/blackbird/notebooks/iceduck_tpc/default.db/nation', allow_moved_paths = true);
CREATE VIEW tpc.orders AS
SELECT * FROM iceberg_scan('/Users/mritchie712/blackbird/notebooks/iceduck_tpc/default.db/orders', allow_moved_paths = true);
CREATE VIEW tpc.part AS
SELECT * FROM iceberg_scan('/Users/mritchie712/blackbird/notebooks/iceduck_tpc/default.db/part', allow_moved_paths = true);
CREATE VIEW tpc.partsupp AS
SELECT * FROM iceberg_scan('/Users/mritchie712/blackbird/notebooks/iceduck_tpc/default.db/partsupp', allow_moved_paths = true);
CREATE VIEW tpc.region AS
SELECT * FROM iceberg_scan('/Users/mritchie712/blackbird/notebooks/iceduck_tpc/default.db/region', allow_moved_paths = true);
CREATE VIEW tpc.supplier AS
SELECT * FROM iceberg_scan('/Users/mritchie712/blackbird/notebooks/iceduck_tpc/default.db/supplier', allow_moved_paths = true);
'''
# Execute the SQL
con.execute(sql)
sql = f"""
select *
from
tpc.customer as c inner join
tpc.orders as o on c.c_custkey = o.o_custkey
limit 20;
"""
res = con.execute(sql).fetchdf()
res