awesome-big-medium-data-frameworks

Not sure if they are awesome, but listing them anyway.

Tools

| Name | Language | Classification | Comment |
| --- | --- | --- | --- |
| Intel® Scalable Dataframe Compiler | Python | Big | Claims to be "orders of magnitude faster than alternatives like Apache Spark" |
| Spark | Scala (main), Python, R, Julia (weak) | Big | De facto industry standard. Basically killed Hadoop |
| Dask | Python | Medium/Big | |
| disk.frame | R | Medium | Soft-deprecated |
| Husky | C++, Scala (weaker), Python (weaker) | Medium?/Big | |
| JuliaDB.jl | Julia | Medium/Big | Couldn't get it to work on the Fannie Mae data |
| DataFusion | Rust | Big | Apache Arrow DataFusion and Ballista query engines |
| ballista | Rust | Big | Spark, but in Rust |
| vega | Rust | Big | Another Spark killer in Rust |
| vaex | Python | Medium/Big | |
| tuplex | Python | Medium/Big | Compiles a subset of Python to machine code where possible |
| nebula | | Medium/Big? | Seems to be JavaScript-based |
| arrow | | Medium | Has a Dataset API in some implementations, e.g. R |

File Formats

| Name | Notes |
| --- | --- |
| ROOT | |

Resources

Quora: What are some credible Apache Spark killers out there? What are their chances of success?