Cross join is treated as inner join
l1t1 opened this issue · 5 comments
l1t1 commented
### Checks
- I have checked that this issue has not already been reported.
- I have confirmed this bug exists on the latest version of the Polars CLI.
### Reproducible example
〉select count(*) from read_parquet('slow3.parquet');
┌────────┐
│ count  │
│ ---    │
│ u32    │
╞════════╡
│ 100000 │
└────────┘
〉select count(*) from read_parquet('slow3.parquet') t1,read_parquet('slow3.parquet') t2;
┌────────┐
│ count  │
│ ---    │
│ u32    │
╞════════╡
│ 100000 │
└────────┘
〉select count(*) from read_parquet('slow3.parquet') t1 cross join read_parquet('slow3.parquet') t2;
Error: cross joins would produce more rows than fits into 2^32; consider compiling with polars-big-idx feature, or set 'streaming'
### Issue description
The second query should return 10000000000 (100000 × 100000), but it returns 100000.
The third query should also return 10000000000, but it fails with the error shown above.
### Expected behavior
Both the second and the third query should return 10000000000.
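To make the expectation concrete, here is a minimal sketch that uses Python's built-in `sqlite3` as a stand-in reference engine (not Polars). The table names `t1` and `t2` and their tiny contents are hypothetical replacements for `slow3.parquet`. It shows that the comma form and the explicit `CROSS JOIN` form give the same count, and checks the row-count arithmetic behind the error message.

```python
import sqlite3

# Hypothetical stand-in tables; the real report reads slow3.parquet twice.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t1 (a INTEGER)")
conn.execute("CREATE TABLE t2 (a INTEGER)")
conn.executemany("INSERT INTO t1 VALUES (?)", [(i,) for i in range(3)])
conn.executemany("INSERT INTO t2 VALUES (?)", [(i,) for i in range(4)])

# Comma-separated FROM and explicit CROSS JOIN are the same operation.
implicit = conn.execute("SELECT count(*) FROM t1, t2").fetchone()[0]
explicit = conn.execute("SELECT count(*) FROM t1 CROSS JOIN t2").fetchone()[0]
print(implicit, explicit)  # both equal 3 * 4

# Row-count arithmetic for the report: 100000 rows joined with themselves.
n = 100_000
expected_rows = n * n
exceeds_limit = expected_rows > 2**32  # the limit named in the error message
print(expected_rows, exceeds_limit)
```

Under these assumptions, the expected count is well above 2^32, which would explain why the explicit `CROSS JOIN` raises an error instead of returning a `u32` count.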
### Installed version
0.6.0
stinodego commented
Could you make a minimal reproducible example, e.g. without reading parquet files? I tried reproducing this on the latest Polars main branch from Python but am unable to do so:
import polars as pl
df1 = pl.DataFrame({"a": [1, 1], "b": [3, 4]})
df2 = pl.DataFrame({"a": [1, 2], "c": [5, 6]})
result = df1.join(df2, how="cross")
print(result)
sql = pl.SQLContext({"df1": df1, "df2": df2})
result = sql.execute("select * from df1 cross join df2;", eager=True)
print(result) # same result
l1t1 commented
Using your example with a comma-separated join instead of `cross join`, compare the Polars result with DuckDB's:
>>> result = sql.execute("select * from df1, df2;", eager=True)
>>> print(result)
shape: (2, 2)
┌─────┬─────┐
│ a   ┆ b   │
│ --- ┆ --- │
│ i64 ┆ i64 │
╞═════╪═════╡
│ 1   ┆ 3   │
│ 1   ┆ 4   │
└─────┴─────┘
>>> import pandas as pd
>>> import duckdb as dd
>>> dd.sql("select * from df1, df2;")
┌───────┬───────┬───────┬───────┐
│   a   │   b   │   a   │   c   │
│ int64 │ int64 │ int64 │ int64 │
├───────┼───────┼───────┼───────┤
│     1 │     3 │     1 │     5 │
│     1 │     3 │     2 │     6 │
│     1 │     4 │     1 │     5 │
│     1 │     4 │     2 │     6 │
└───────┴───────┴───────┴───────┘
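For reference, the rows a comma join should produce can be derived without any SQL engine. This is a small sketch using only the standard library; the row tuples are copied from `df1` and `df2` in the example, and the variable names are mine.

```python
from itertools import product

# Rows of df1 (columns a, b) and df2 (columns a, c) from the example above.
df1_rows = [(1, 3), (1, 4)]
df2_rows = [(1, 5), (2, 6)]

# A cross join pairs every left row with every right row,
# so the result has len(df1_rows) * len(df2_rows) rows and all four columns.
expected = [left + right for left, right in product(df1_rows, df2_rows)]
for row in expected:
    print(row)
```

This gives four rows with four columns each, matching the DuckDB output and not the two-row, two-column Polars result.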
l1t1 commented
stinodego commented
Right, closing as a duplicate then.
l1t1 commented
It still returns the wrong result in version 10. 20.31