/DataFrameShowDown

Polars vs Spark vs Pandas vs DataFusion. Guess who wins?

Primary LanguagePython

DataFrameShowDown

Polars vs Spark vs Pandas vs DataFusion. Guess who wins?

See full blog post. https://www.confessionsofadataguy.com/dataframe-showdown-polars-vs-spark-vs-pandas-vs-datafusion-guess-who-wins/

There once was a day when no one used DataFrames that much. Back before Spark had really gone mainstream, Data Scientists were still plinking around with Pandas a lot. My My, what would your mother say? How things have changed. Now everyone wants a piece of the DataFrame pie. I mean it tastes so good, doesn’t it?

Would anyone like a nice big slice of groupBy, maybe agg is what you need? No? Can you say distributed data set? Whatever it is you’re looking for, I’m quite sure a nice old DataFrame can give it to you. With so many options to choose from … what do you choose? I don’t know, whatever works best for you. But, it does set the stage nicely for a clash of the titans per see.