/elm-array

A reimplementation of Elm's core Array module

Primary LanguageElm

elm-array

This project is an attempt to reimplement the logic of Elm's core Array module in Elm. The existing standard library Array module is implemented in Javascript, and has known bugs. This project is intended to fix the existing bugs in Array, while simualtaneously creating a more maintainable foundation for future work (thanks to the benefits of functional purity and static typing). It is very much a work in progress.

The data structure

The Array type is backed by a data structure called a "Relaxed Radix Balanced Tree", or RRB-Tree. RRB-Trees support logarithmic-time indexing, updating, appending, and concatenation, while preserving immutability. Online information on RRB-Trees is scarce, and the original paper that describes them is short and often unclear. The following appear to be some of the best supplemental resources:

Design

This Array implementation is designed to be, at minimum, a drop-in replacement for the existing Array module in the standard library, with exactly the same API.

This implementation is not and cannot be written entirely in Elm. RRB-Trees internally make use of fixed-size, copy-on-modify, random-access, memory-contiguous arrays (with a lowercase "a"), which do not exist in the Elm standard library. Consequently, these simple arrays (called Tables in the codebase to avoid confusion with Arrays) must be implemented in a small Javascript module. The complex logic of the RRB-Tree itself is implemented in Elm.

Current status

This repository currently includes a 100% API-compatible reimplementation of Array's logic in Elm, which passes all core unit tests. It is not yet ready for production use, but it captures all the essential features of the RRB-Tree. With further testing, profiling, and cleanup, it will hopefully be usable as a better-than-ever replacement for Array in the not-too-distant future.

Roadmap

Well-defined tasks:

  • Implement every function in the existing Array API
  • Reimplement NaiveTable in Javascript
  • Address every TODO comment in the codebase
  • Add justifications for every assumeJust call in the codebase, for use as both comments and potential error messages
  • For debug purposes: write a function to test if tree conforms to invariants, routinely call this function after every tree operation, and raise an error / log a warning if an invariant is broken.

Larger, open-ended tasks:

  • Test thoroughly, using at minimum the existing standard library unit tests
  • Profile and benchmark thoroughly, especially in comparison to...
    • The existing Array module (under conditions where it actually works)
    • Simpler, theoretically-less-performant data structure like Lists and Tables, especially for small, common use cases
    • Simpler tree-based persistent array structures
  • General code cleanup
  • Community code review