Series (one-dimensional) and dataframes (two-dimensional) for fast data exploration in Elixir

Primary language: Elixir. License: MIT.

Explorer

Explorer brings series (one-dimensional) and dataframes (two-dimensional) for fast data exploration to Elixir. Its high-level features are:

  • Simply typed series: :float, :integer, :boolean, :string, :date, and :datetime.
  • A powerful but constrained and opinionated API, so you spend less time looking for the right function and more time doing data manipulation.
  • Pluggable backends, providing a uniform API whether you're working in-memory or (forthcoming) on remote databases or even Spark dataframes.
  • The first (and default) backend is based on NIF bindings to the blazing-fast Polars library.
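As a rough illustration of the typed series mentioned above, the sketch below builds a few series from plain Elixir lists and inspects the dtype Explorer infers for each. It assumes the 0.3.x API (`Explorer.Series.from_list/1` and `Explorer.Series.dtype/1`); the variable names are only for illustration.

```elixir
# Install Explorer for this script (same version as in "Getting started").
Mix.install([{:explorer, "~> 0.3.1"}])

alias Explorer.Series

# Each series holds values of a single type; the dtype is inferred from the list.
series = [
  Series.from_list([1, 2, 3]),
  Series.from_list([1.5, 2.5]),
  Series.from_list([true, false]),
  Series.from_list(["a", "b"]),
  Series.from_list([~D[2022-01-01], ~D[2022-01-02]])
]

# Collect the dtype atom of every series.
dtypes = Enum.map(series, &Series.dtype/1)
IO.inspect(dtypes, label: "dtypes")
```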

The API is heavily influenced by Tidy Data and borrows much of its design from dplyr. Its philosophy follows this passage from dplyr's documentation:

  • By constraining your options, it helps you think about your data manipulation challenges.
  • It provides simple “verbs”, functions that correspond to the most common data manipulation tasks, to help you translate your thoughts into code.
  • It uses efficient backends, so you spend less time waiting for the computer.

The aim here isn't to have the fastest dataframe library around (though it certainly helps that we're building on Polars, one of the fastest). Instead, we're aiming to bridge the best of many worlds:

  • the elegance of dplyr
  • the speed of Polars
  • the joy of Elixir

That means you can expect the guiding principles to be 'Elixir-ish'. For example, you won't see the underlying data mutated, even if that's the most efficient implementation. Explorer functions will always return a new dataframe or series.
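A minimal sketch of what this immutability looks like in practice, assuming the 0.3.x `Explorer.Series` API (`from_list/1`, `add/2`, `to_list/1`). The series names `prices`, `fees`, and `totals` are hypothetical.

```elixir
# Install Explorer for this script (same version as in "Getting started").
Mix.install([{:explorer, "~> 0.3.1"}])

alias Explorer.Series

prices = Series.from_list([10, 20, 30])
fees = Series.from_list([1, 2, 3])

# Element-wise addition returns a new series; neither input is modified.
totals = Series.add(prices, fees)

original_values = Series.to_list(prices)
total_values = Series.to_list(totals)

IO.inspect(original_values, label: "prices (unchanged)")
IO.inspect(total_values, label: "totals (new series)")
```

Because `prices` still holds its original values after the call, it can be reused safely in later steps of a pipeline.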

Getting started

In order to use Explorer, you will need Elixir installed. Then create an Elixir project via the mix build tool:

mix new my_app

Then add Explorer as a dependency in your mix.exs:

def deps do
  [
    {:explorer, "~> 0.3.1"}
  ]
end

Alternatively, inside a script or Livebook:

Mix.install([
  {:explorer, "~> 0.3.1"}
])

Contributing

Explorer uses Rust for its default backend implementation. While Rust is not necessary to use Explorer as a package, you need Rust tooling installed on your machine if you want to compile from source, which is the case when contributing to Explorer. In particular, you will need Rust Nightly, which can be installed with Rustup.

Once you have made your changes, run mix ci to lint and format both the Elixir and Rust code.

Sponsors

Amplified