Benchmarks I/O operations on pandas DataFrames.
Currently tests these file formats:
- Python pickle
- CSV
- HDF5
- Parquet
- Feather
- DuckDB
Supports some compression formats, depending on the file format.
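For orientation, here is a minimal sketch of the kind of measurement such a benchmark makes: time one write and one read per format and record the file size. It uses only plain pandas calls and two formats (pickle and CSV) to avoid extra dependencies. The helper `time_call` and the `timings` layout are hypothetical names chosen for illustration; this is not how dfio itself measures or stores results.

```python
import os
import tempfile
import time

import numpy as np
import pandas as pd


def time_call(fn, *args, **kwargs):
    """Return (result, elapsed seconds) for a single call of fn."""
    start = time.perf_counter()
    result = fn(*args, **kwargs)
    return result, time.perf_counter() - start


# Small synthetic dataframe; the size and columns are arbitrary.
rng = np.random.default_rng(0)
df = pd.DataFrame({"x": np.arange(100_000), "y": rng.normal(size=100_000)})

# One (writer, reader) pair per format, each taking a file path.
formats = {
    "pickle": (
        lambda p: df.to_pickle(p, compression=None),
        lambda p: pd.read_pickle(p, compression=None),
    ),
    "csv": (
        lambda p: df.to_csv(p, index=False),
        pd.read_csv,
    ),
}

timings = {}
with tempfile.TemporaryDirectory() as tmp:
    for name, (write, read) in formats.items():
        path = os.path.join(tmp, f"data.{name}")
        _, write_s = time_call(write, path)
        loaded, read_s = time_call(read, path)
        timings[name] = {
            "write_s": write_s,
            "read_s": read_s,
            "bytes": os.path.getsize(path),
            "rows": len(loaded),
        }

for name, t in timings.items():
    print(name, t)
```

A single timed call is noisy; a real benchmark would repeat each operation several times and account for filesystem caching.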
Requires these Python packages:
- brotli
- duckdb
- fastparquet
- fixfmt
- pandas
- pyarrow
- pytables
- zstandard
A requirements.txt file is coming soon.
- (Optional) Generate a dataframe of random data to benchmark:
  python -m dfio.gen --help
  Or bring your own data, in uncompressed Python pickle format.
- Run benchmarks:
  python -m dfio.benchmark --help
  This writes a file with benchmark results, by default ./dfio-benchmark.json. Multiple runs are appended to the same file.
- Show results:
  python -m dfio.analyze --help
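If you bring your own data instead of generating it, the file needs to be an uncompressed Python pickle. A minimal sketch using pandas follows; the column names, row count, and the file name `my-data.pkl` are made up for illustration, and how the file is then passed to the benchmark is not shown here (see `python -m dfio.benchmark --help`).

```python
import os
import tempfile

import numpy as np
import pandas as pd


def write_uncompressed_pickle(df: pd.DataFrame, path: str) -> None:
    """Write df as a plain pickle.

    compression=None turns off pandas' default behaviour of inferring a
    compression codec from the file extension.
    """
    df.to_pickle(path, compression=None)


# Example dataframe; replace this with your own data.
rng = np.random.default_rng(seed=0)
df = pd.DataFrame(
    {
        "id": np.arange(1000),
        "value": rng.normal(size=1000),
        "label": rng.choice(["a", "b", "c"], size=1000),
    }
)

# Hypothetical output location; any writable path works.
path = os.path.join(tempfile.mkdtemp(), "my-data.pkl")
write_uncompressed_pickle(df, path)

# Sanity check: the file reads back to an identical dataframe.
restored = pd.read_pickle(path, compression=None)
assert restored.equals(df)
```

Passing `compression=None` explicitly matters mainly when the file name ends in something like `.gz` or `.zst`, where pandas would otherwise compress the output.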