ContinuumIO/anaconda-package-data

Pandas/pyArrow/read_parquet error

Closed · 1 comment

phwuil commented

As requested by @sophiamyang, I am passing on an issue I originally opened for condastats, since that package depends on the data pipeline in this repo:

  • condastats version: 0.2.1
  • Python version: Python 3.11.3
  • Operating System: linux (Manjaro/Plasma)

Description

Calling condastats.cli.overall fails with an internal error in the pandas -> PyArrow layer:

    dataconda = condastats.cli.overall([conda_module], monthly=True)
                ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "[...]/lib/python3.11/site-packages/condastats/cli.py", line 62, in overall
    df = dd.read_parquet(
         ^^^^^^^^^^^^^^^^
  File "[...]/python3.11/site-packages/dask/backends.py", line 138, in wrapper
    raise type(e)(
ValueError: An error occurred while calling the read_parquet method registered to the pandas backend.
Original Message: ArrowStringArray requires a PyArrow (chunked) array of string type

phwuil commented

I found the cause, thanks to @nicrie: pandas<2.0.0 is required.
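
For anyone hitting the same traceback, here is a minimal sketch of a guard that checks the installed pandas version against the pandas<2.0.0 bound reported above before calling condastats. The helper names `pandas_is_compatible` and `installed_pandas_version` are hypothetical and not part of condastats or pandas. The check looks only at the major version number, which is a simplification.

```python
from importlib.metadata import PackageNotFoundError, version
from typing import Optional


def pandas_is_compatible(pandas_version: str) -> bool:
    """Return True if the version string has a major component below 2.

    Hypothetical helper: approximates the "pandas<2.0.0" bound from this
    thread by looking only at the major version number.
    """
    major = int(pandas_version.split(".")[0])
    return major < 2


def installed_pandas_version() -> Optional[str]:
    """Return the installed pandas version, or None if pandas is absent."""
    try:
        return version("pandas")
    except PackageNotFoundError:
        return None


found = installed_pandas_version()
if found is None:
    print("pandas is not installed")
elif pandas_is_compatible(found):
    print(f"pandas {found} satisfies pandas<2.0.0")
else:
    print(f"pandas {found} is outside the bound; pin pandas<2.0.0")
```

If the check reports a version outside the bound, pinning it in the environment (for example `pip install "pandas<2.0.0"`) should apply the workaround described in this thread.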