Using 2019-2022 data from NYC Taxi Data.
Number of rows: 179,807,942 Dataset size: 24.4GB
Framework | Read | Groupby | Total |
---|---|---|---|
Pandas | dies | n/a | n/a |
Polars | 1ms | 2.3s | 2.3s |
Vaex | 416ms | 5.3s | 5.3s |
- Data is stored in parquet files on local disk
- Using M1 MBA with 8GB RAM-- unsure how memory swap impacts result