/spork

Spoon's Operations Research Kit

Primary LanguageClojureEclipse Public License 1.0EPL-1.0

https://img.shields.io/clojars/v/spork.svg [https://clojars.org/spork Clojars Project]

Originated 30 Aug 2012 Updated 24 Feb 2017

Hello! Welcome to Spoon’s Operations Research Kit, hereafter known as SPORK.

This document serves to highlight areas that are under development, or features that might be deprecated, as well as provide a high-level overview of the various libraries and novel features lurking in the repository.

What is SPORK?

SPORK is a collection of libraries and utilities that I have built over the years that provide a platform for much of the domain-specific work I do in Operations Research. Specifically, munging data, creating complex discrete event simulations, doing graph theoretic stuff, optimizing, sampling, interoperating, making cartoons (visualization/animation0 and on and on.

Why should I care?

SPORK is written in a functional style in Clojure, with an emphasis on reasoning about large programs without navel gazing into oblivion. In my work, I need to balance correctness (paycheck / credibility is on the line if it’s wrong) with a bit of performance.

SPORK brings a bevy of useful libraries and utilities to the table, along with some [I believe] novel ways of doing business - if your business includes discrete event simulation among other things. While it’s used to support “ORSA” work, it’s really a toolset that can address a bevy of computational tasks.

Since this library has evolved - as have I - since 2012, I note that there are multiple quality libraries in the ecosystem that may cover your bases. They weren’t around/accessible when I needed them, so you may see some - by now - mundane things show up in here.

Feel free to ditch anything for a better alternative. In some instances, I may be actively migrating toward better stuff from the community. There may still be some value in seeing working examples and alternate implementations. If nothing else, I offer my meager tools for pedagogical values and learning about Clojure.

Motivation

Exploit clojure and the benefits of the functional programming paradigm in a deciedly mutable/imperative/object-oriented dominated realm: Discrete Event Simulation.

Bitten by the snake of side-effects, mutation, and spooky action-at-a-distance one too many times…I placed my faith in Clojure (and other FP platforms). I think they offer a better alternative.

So far I haven’t been disappointed (been almost a decade).

Organization

spork.ai

Collection of data structures and functions that support defining entity behavior in simulations. Written using a functional style.

behavior

Behavior tree implementation. Pretty decent, and close to the literature.

fsm

Finite state machine data and functions.

spork.cljgraph

Signifcant and fairly well tested/profiled abstract graph and network library. More or less a functional port of Sedgewick’s Graph Algorithms in C, up to and “almost” including the Network Simplex algorithm.

core

top-level porcelain for invoking all sorts of graph related queries, including defining graphs (directed and otherwise), custom walks, single-shortest-path queries, and a whole lot more.

Also has simple graph visualization support via JUNG and graphml output (for visualization in yED for instance).

flow

Network flow extensions and algorithms. Includes edmonds-karp, and a naive (but useful) mincost-flow based on SSSP residual flows.

Mutable and array-backed implementations available. Pedagogical value could be improved due to some macrology that could be averted with what I now know. It’s cool to do flows in clojure.

spork.data

Slew of data structures that support other spork libraries. Most are functional and implement clojure protocols and interfaces.

spork.entitysystem

Draft implementation of an entitystore as in the Component Entity System (or is it Entity Component System) architecture popular in games. Provides protocols and APIs for defining, querying, and updating persistent and mutable stores. Current implementation is a column-store, with an experimental row-store in development.

Also provides a DSL for defining entity construction functions - or specs.

spork.incanter

Naive extensions for incanter. Recent developments mean there are actually a boatload more coming in the near future. We could benefit from modularity here…since SPORK implicitly requires incanter even if you don’t use it.

spork.mining

Early attempt to port examples from “Collective Intelligence” by Segaran to clojure. Intended to be a pedagogical tool and a simple data-mining library. Stalled out. There are better options pre-built.

spork.events

Generic event datastructures for use with a reactive event system.

native

Wrappers around the Swing adapater-madness to create first-class event streams compatible with the reactive observable event system in spork.events.observe.

observe

An early port of the F# observable architecture to clojure. Brought events-as-data and functional paradigms (map, filter , reduce, etc.) to GUI programming in Swing. General purpose despite the obvious GUI application.

There are likely better options these days, including core.async. Still, worth a look and currently used in production.

spork.examples

Currently barren set of examples. More to come. Typically there are short examples and tests at the bottom of each namespace. I’ll look to consolidate those here to show interesting use cases for SPORK.

spork.protocols

Shared protocols and core functionality. Separated to appease the clojure compiler.

spork.cljgui

Massive and early wrapping of Swing inspired by “Joy of Clojure.” Not the prettiest, but it has a useful set of features.

components.swing

This is where all the gui widgets and custom components live. Currently in production.

spork.sketch

Declarative 2D rendering in clojure. Simple sketching API that can produce suprisingly complex imagery. Currently has some extra features in there for plotting and the like, but the core is pretty small.

spork.graphics2d

2D graphics substrate that can render to Swing or alternative contexts. Provides clojure wrappers for image, color, canvas, and other idiomatic 2D features provided by the host. Intended to be a portable rendering layer with different backends.

spork.geometry

Wrapper for various shapes defined by the host platform, as well as user-defined shapes.

spork.opt

Generic optimization modeling functions, primarily targeting combinatorial optimization methods like GA, Simulation Annealing, etc. Currently on Hiatus.

spork.mvc

Simple model-view-controller protocol and wrapper for Swing and other GUI substrates.

spork.sim

Pure functional Discrete Event Simulation library.

core

Currently a place-holder, with an empty simulation context. Intended to be porcelain API over spork.sim.simcontext, but I never ended up needing it.

data

Primitive functions and protocols for “events” and event sequences the form the basis for discrete event simulation.

agenda

Port of the agend from “Structure and Interpretation of Computer Programs” by Abelson and Sussman. Provides a persistent timeline of events, maintaining total order of all pending events as well as functions and protocols for querying and manipulating the agenda.

pure

Facilities for implementing a a pure event system.

Network

Event propogation network based on the observable/observer model implemented as a reduction with pure functions.

impure

place-holder for a mutable implementation for event propogation.

updates

datastructure that maintains entities scheduled for updates at periods in simulated time. Similar to a scheduler.

simcontext

Primary API for defining, manpulating, and computing Discrete Event Simulations. Provides a simulation context datastructure that forms the basis of pure functional simulation, and extends/implements all relevent protocols for event sequences, agendas, updater, and the spork.entitysystem.store entitystore. Provides an organized means to manage all information relevant to the simulation in a high-level API that makes shoving entities around at variable-width time steps with event propogation in a persistent, functional context not only possible, but fun! Did I mention that it’s persistent? You can audit your simulation, reason about transitions between simulation context values, even time travel. All of this occurs at a high-level, rather than low-level mutation-at-a-distance.

This gives rise to simulation histories, which are a pending addition to SPORK (currently in production/fielding).

spork.trends

Hacky visual components for rendering animated scatter plots and stacked area charts in spork.sketch components. Used for animated visualizations.

spork.util

Signifcant collection of utilities. While some are eligible for elevation to the pantheon of “Separate Libraries,” just about everything starts off as a smallish utility here.

array

Utilities for munging primitive arrays.

bitset

A hacky bitset implementation/wrapper, intended for use with genetic algorithms.

bridging

Data translation between two table schemas. Not as impressive as it used to be, supplanted by spork.util.table ops.

cellular

Experimental implemention of localized state within deeply nested datastructures. Attempts to mitigate the performance penalties of multiple assoc/dissoc operations by allowing temporary contexts where we can define “cells” of mutability, and maintain handles to them. Take you nested map and temporarily get some mutable places… we’ll clean up after you’re done.

clipboard

Simple API for copying/pasting to the system clipboard. Great for moving data from the REPL to other programs via the clipboard.

collections

Experimental and disappointing attempt to squeeze some marginal speed gains out of common operations like assoc/dissoc. Teaching value.

combinatoric

Attempt to implement a sparse combinatoric map where the keys are lexicographically ordered combinations. Allows one to conceivably define combinations upon a set, and then rapidly compute / hash thet nth combination to provide a sparse hash-map. Intended to be used to support combinatorial optimization routines. There’s a better known implementation than this too.

comparison

Slew of functions for defining and composing comparators for sorting and the like.

datetime

Unpopular, backwoods cousin of a real date-time library, like clj-time. Then again, it got the job done when it was needed.

diff

Simple diffing functions for nested maps. Used for delta compression routines among other things.

eager

Probably no longer needed. Non-lazy, “fast” [at the time] replacement for core clojure functions. Bypasses seqs.

excel

Extended wrapper + fork around an early version of Docjure. Provides deep integration with spork.util.table and a nice API that allows you to rip to and from Excel workbooks without touching Excel.

general

Useful functions that one accumulates over the years.

generators

Unfold and the like.

help

Obsolete help system for an obsolete little clojure environment.

indexed

Useful? operations on indexed datastructures like vectors. Things like reducing backwards (without intermediate seqs) and others.

inspection

Really cool (imo) extension to clojure.reflect and clojure.inspector. Allows one to inspect objects from the repl and discern their lineage, i.e. interfaces, protocols, methods, fields, etc. Also provides a nice ML-like type signature for the discerning functional programmer.

interop

Helpful macro that allows consenting adults to “crack open” the private fields in the steel cages and booby-trapped classes that enterprise java programmers devoted so much time securing from perfidy. Creates getters and setters in a ns that gives you a functional API for mutating said hidden treasures.

io

File system routines, convenience apis, path stuff, zip file management.

log

Logging protocol, mostly to service Clojure’s demand for non-circular dependencies.

mailbox

Primitive async mailbox implementation for an Agent. Intended to manage GUI, rendering, and other stuff.

metaprogramming

Useful tricks, macros, and other black magic for bending Clojure to your will. Practical and possibly pedadogical value.

numerics

Smallish packaging of math, numerics, and some numerical routines. Pedagocial value, likely obsolete in today’s ecosystem.

parsing

General utilities and api for converting text to (typically, maybe?) typed data. Necessary if you want to read those big files. String canonicalization is coming very soon as well, to help compress huge tables of snowflake java strings. Heavily used by spork.util.table.

processor

Failed experiment to create something like a build system.

ranges

Simple data structures for defining and working with numerical ranges. Intended use was with spork.opt.

record

Operations for defining records, including some conveneinced wrappers around defrecord that act more like Common Lisp (i.e. default values).

reducers

Mostly obsolete now that core has caught up. At the time, provided patches for reducible ranges, and other goodies.

sampling

Domain Specific Language for defining rules for stochastic sampling functions over a set of discrete samples.

stats

Simple statistical functions for use with util.sampling, and simulation. Some overlap with incanter, some not.

table

Fairly robust implementation of a column-oriented table in clojure. Defines operations for manipulating said table at the field, row, or record level. Also defines joins and SQL-like querying language. Allows type definitions and parsing schemas for reading tables effeciently using spork.util.parse.

tags

Simple fact database that “tags” subjects with facts. Basically a logic db / triple store before I knew what that meant.

temporal

Useful operations for working with data that is sampled at varying intervals, namely output from discrete event simulations. Generic API allows anything with a start, duration, and name key to be handled as if the signal were continuous rather than discrete. Useful for stitching together said signals.

topotree

A primitive implementation of a “purely functional” doubly-linked-list using a directed graph as its backend. Not used much.

vecmath

Basic vector math, where vectors are implemented as clojure pvectors. Not recommended for heavy duty work, but it’s got some charm.

vector

Useful operations on vectors, like transpose and other shaping operations. Intended for use with spork.util.table column store.

vectors

Vector implementation using custom types.

xml

Light wrapper around clojure.xml. Pedantic.

zip

Wrapper around clojure.zip with some no-longer novel features, like reducible/mappable/filterable zipper traversals. Better libraries probably exist.

zipfile

Simple API for creating/compressing/decompressing GZip and Lz4 files.

Modularity (Pending)

Currently, SPORK exists in its communal form. Much of the hacking and development occurs across different domains for a given project. Still, I plan to automatically dissect and distribute SPORK - optionally - as individual, modular projects.

Differences From Traditional Simulation Implementations

SPORK.sim presents a significant departure from traditional Discrete Event Simulations, since the architecture embraces functional programming and persistent data structures wholly, rather than focusing on mutation.

SPORK’s goal is to provide a simulation history, which is a sequence of simulation contexts - or snapshots - of the entire “world” in the simulation. This history is lazily generated, yet allows us to define means for observing entities - typically the domain of clunky logging facilities and mutation in most simulations - that can operate on the entire stream of differential changes to the initial simulation context. SPORK.sim combines several novel features:

  • building the simulation transition functions (or step functions) as a composition of smaller functions
    • (as opposed to OOP-based classes or imperative mutation)
  • Optionally persistent, lazily computed stream of simulation history
  • a database layer based on the Component-Entity-System paradigm
  • Maintains the ability to have a classic observable/observer simulation model without sacrificing functional purity.
    • Where convenient (i.e. for logging, visualization, other side-effects).
  • Event-step (i.e. variable-width time-step) simulation.
  • High-level transforms on the simulation history.
    • Familiar idioms like map, filter, reduce work out of the box, since history is merely a time-indexed stream of simulation contexts.
    • Allows for precise reasoning about causality, tracing, debugging, etc.
  • Entity behaviors based on Behavior Trees, but can be as simple as a 2-line function.

Developer Notes

Portions of the code base are somewhat roughshod, owing to the rapid-fire pragmatic nature to support existing research software. Additionally, there are experimental and potentially deprecated namespaces. I have pruned out the name spaces where there appears to be little to no technical or pedagogical value and placed them in a temporary /obe folder.

Annotations are provided at the beginning of any namespace that looks “iffy”.

There are ongoing efforts to prune the codebase of noise, and to enhance the documentation. Certain heavily-used libs, like spork.cljgraph, have fairly decent docs.

Where possible, SPORK is an attempt at a literate style of programming; You can feed the source to Marginalia to produce a companion set of documentation that should be useful.

License

Copyright Tom Spoon 2012-2017

Spoon’s Operations Research Kit and encompassing libraries are licensed under the Eclipse Public License v 1.0.

Segments of inline open source code that have both license and attributions provided for the gracious authors. Please contact me via github if I missed someone and need to issue a correction.