[BUG-REPORT] Dataframes with no columns raise errors for various operations
I'm able to create dataframes with zero columns, but displaying one (calling its repr) raises the following:
>>> import vaex
>>> df = vaex.from_dict({})
>>> df
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File ".../vaex/packages/vaex-core/vaex/dataframe.py", line 4221, in __repr__
    return self._head_and_tail_table(format='plain')
  File ".../vaex/packages/vaex-core/vaex/dataframe.py", line 3961, in _head_and_tail_table
    if N <= n:
TypeError: '<=' not supported between instances of 'NoneType' and 'int'
I'm not too familiar with Vaex, but I imagine this type of bug, where the code assumes at least one column, will pop up for various operations. For example, df.concat(df) also raises, although maybe that operation is nonsensical in the first place (interestingly, pandas.concat([pd.DataFrame({}), pd.DataFrame({})]) works).
Also, such dataframes cannot interoperate with pandas through the from_dataframe function from pandas-dev/pandas#46141:
>>> from pandas.api.exchange import from_dataframe
>>> from_dataframe(df)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File ".../pandas/core/exchange/from_dataframe.py", line 57, in from_dataframe
    return _from_dataframe(df.__dataframe__(allow_copy=allow_copy))
  File ".../pandas/core/exchange/from_dataframe.py", line 77, in _from_dataframe
    for chunk in df.get_chunks():
  File ".../vaex/packages/vaex-core/vaex/dataframe_protocol.py", line 750, in get_chunks
    n_chunks = n_chunks if n_chunks is not None else self.num_chunks()
  File ".../vaex/packages/vaex-core/vaex/dataframe_protocol.py", line 712, in num_chunks
    if isinstance(self.get_column(0)._col.values, pa.ChunkedArray):
  File ".../vaex/packages/vaex-core/vaex/dataframe_protocol.py", line 721, in get_column
    return _VaexColumn(self._df[:, i], allow_copy=self._allow_copy)
  File ".../vaex/packages/vaex-core/vaex/dataframe.py", line 5355, in __getitem__
    df = df[item[0]]
  File ".../vaex/packages/vaex-core/vaex/dataframe.py", line 5371, in __getitem__
    stop = stop or len(self)
TypeError: 'NoneType' object cannot be interpreted as an integer
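Both tracebacks end the same way: a row count that is None gets used where an integer is expected. The sketch below is a toy model in plain Python, not Vaex code. The class ToyFrame, its _length attribute, and the helper error_of are hypothetical, and it assumes (I have not verified this in the Vaex source) that the row count is simply left as None when there are no columns to derive it from.

```python
class ToyFrame:
    """Toy stand-in for a dataframe; NOT Vaex's actual implementation."""

    def __init__(self, columns):
        self.columns = dict(columns)
        # Assumption: the row count is taken from the first column,
        # so it stays None when there are no columns at all.
        first = next(iter(self.columns.values()), None)
        self._length = None if first is None else len(first)

    def __len__(self):
        # Returns None for a zero-column frame, which len() rejects.
        return self._length

    def fits_in_preview(self, n=10):
        # Mirrors the `if N <= n:` comparison from the first traceback.
        return self._length <= n


def error_of(func):
    """Call func and return the message of the TypeError it raises, if any."""
    try:
        func()
    except TypeError as exc:
        return str(exc)
    return None


empty = ToyFrame({})
repr_error = error_of(empty.fits_in_preview)   # same failure mode as __repr__
len_error = error_of(lambda: len(empty))       # same failure mode as __getitem__
print(repr_error)
print(len_error)
```

If that assumption holds, any code path that calls len() on the dataframe or compares its length to a number would fail in one of these two ways, which would explain why the errors show up in unrelated operations.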
I searched around and couldn't figure out whether such dataframes are supported by Vaex in the first place. I have no use case for them myself; it's just that such dataframes are valid in other dataframe libraries (like pandas). If they're not supported, the constructors should possibly raise ValueError when a zero-column dataframe is being initialized.
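To make the ValueError suggestion concrete, here is a minimal sketch of such a guard. The helper name check_has_columns and the error message are hypothetical and not part of Vaex's API; where a check like this would actually belong in the constructors is for the maintainers to decide.

```python
def check_has_columns(columns):
    """Hypothetical guard: reject a column mapping that defines no columns."""
    if len(columns) == 0:
        # Fail early with a clear message instead of a later TypeError.
        raise ValueError("cannot create a dataframe with zero columns")
    return columns


# A non-empty mapping passes through unchanged.
ok = check_has_columns({"x": [1, 2, 3]})

# An empty mapping is rejected up front.
try:
    check_has_columns({})
    rejected = False
except ValueError:
    rejected = True
```

Failing at construction time would at least give a clear error message, rather than a TypeError from deep inside __repr__ or __getitem__.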
Vaex was built locally from source (upstream master) on Ubuntu 20.04.
Same issue!