links

Just a bunch of useful links

Scala

Serialization

Concurrency, Actors

Async Database Libs

  • Asyncpools - Akka-based async connection pool for Slick. Akka 2.2 / Scala 2.10.
  • Postgresql-Async - Netty-based async drivers for PostgreSQL and MySQL

Caching

  • Cacheable - a clever memoization / caching library (with Guava, Redis, Memcached or EHCache backends) using Scala 2.10 macros to remember function parameters

Big Data Processing

Spark

  • Jaws - Spark SQL REST server, includes query cancellation, logs, load balancing.

Geospatial and Graph

  • GeoTrellis - distributed raster processing, adding Vector/geom support, Akka Cluster and Spark implementations!

  • Spatial framework for Hadoop - PostGIS-like operators / UDFs for Hive. We want this for Spark!

  • trails - parser combinators for graph traversal. Supports Tinker/Blueprints/Neo4j APIs.

  • scala-graph - in-memory graph API based on scala collections. Work in progress.

Collections, Numeric Processing, Fast Loops

  • Breeze, Spire, and Saddle - Scala numeric libraries
    • spire-ops - a set of macros for no-overhead implicit operator enrichment
  • ScalaXY - collection of macros for performant for loops, extension methods, etc.
  • Squants - The Scala API for Quantities, Units of Measure and Dimensional Analysis
  • FastTuple - a dynamic (runtime-defined) C-style struct library, with support for off-heap storage. Would work really well for in-memory queries.
    • The excellent blog covers on- and off-heap access and allocation patterns on the JVM very thoroughly.
  • Unboxing, Runtime Specialization - a cool post on how to do really fast aggregations using unboxed integers
  • product-collections - useful library for working with collections of tuples
  • SuperFastHash - also see Murmur3
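
For a quick feel of what a Murmur3-style hash is good for, here is a minimal sketch using the `MurmurHash3` object that ships in the Scala standard library (`scala.util.hashing`). The `HashSketch` object and the `bucketFor` helper are hypothetical names made up for this illustration; they are not part of SuperFastHash or any linked library.

```scala
import scala.util.hashing.MurmurHash3

// Hypothetical helper for illustration only: maps a string key to one of
// `buckets` partitions using the standard library's Murmur3 implementation.
object HashSketch {
  def bucketFor(key: String, buckets: Int): Int = {
    require(buckets > 0, "buckets must be positive")
    val h = MurmurHash3.stringHash(key)
    // The hash may be negative, so normalise the remainder into [0, buckets).
    ((h % buckets) + buckets) % buckets
  }
}
```

The same key always lands in the same bucket, which is the property you want for partitioning or sharding keys.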

Big Data Storage

  • Phantom - Scala DSL for Cassandra, supports CQL3 collections, CQL generation from data models, async API based on Datastax driver
  • Athena - Asynchronous Cassandra client built on Akka-IO
  • Stubbed Cassandra - super useful for testing C* apps
  • Pithos - an S3-API-compatible object store for Cassandra
  • Sirius - Akka-based in-memory fast key-value store for JVM objects, with Paxos consistency, persistence/txn logs, HA recovery
  • Storehaus - Twitter's key-value wrapper around Redis, MySQL, and other stores. Has neat merge() functionality for aggregating values, lists, etc.
  • MapDB - Not a database, but rather a database engine with tunable consistency / ACIDness; support for off-heap memory; fast performance; indexing and other features.
  • HPaste - a nice Scala client for HBase

Web / REST / General

  • Scalaj-http - really simple API for making HTTP / REST calls. The latest Spray-client has been vastly simplified as well, though.

  • REPL as a service - would be kick ass if integrated into Spark

  • IScala - Scala backend for IPython. Looks promising. There is also Scala Notebook, but it's more of a research project.

  • Scaposer - i18n / .po file library

  • Adding Reflection to Scala Macros - example of using reflection in an annotation macro to add automatic ByteBuffer serialization to case classes :)

  • Scaldi - A lightweight dependency injection library, with Akka integration

  • How to use Typesafe Config across multiple environments

  • Scala-rainbow - super simple terminal color output, easier than Console.XXX

  • SExt - Supplies some missing Standard Library functions, like pretty-printing data structures, unfold, etc.

  • ScalaUtils - ===, !== with tolerance for floats, an OR operator for types for easy validation (Int Or One[ErrorMessage])
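
To show why tolerance-based equality and an "Or" result type are handy, here is a minimal sketch of both ideas in plain Scala. It does not use the ScalaUtils API: `ToleranceSketch`, `approxEq` and `parseAge` are hypothetical names invented for this example, and the standard library's `Either` stands in for a type like `Int Or One[ErrorMessage]`.

```scala
// Hypothetical sketch in plain Scala; NOT the ScalaUtils API.
object ToleranceSketch {
  // True when a and b differ by at most tol (the idea behind === with a tolerance).
  def approxEq(a: Double, b: Double, tol: Double): Boolean =
    math.abs(a - b) <= tol

  type ErrorMessage = String

  // Either[ErrorMessage, Int] stands in for an "Int or an error message" result:
  // Right carries the validated value, Left carries the reason it was rejected.
  def parseAge(s: String): Either[ErrorMessage, Int] =
    try {
      val n = s.trim.toInt
      if (n >= 0) Right(n) else Left(s"negative age: $n")
    } catch {
      case _: NumberFormatException => Left(s"not a number: $s")
    }
}
```

With plain `==`, `0.1 + 0.2 == 0.3` is false because of floating-point rounding, while `approxEq(0.1 + 0.2, 0.3, 1e-9)` is true. Callers of `parseAge` can pattern match on `Right` and `Left` instead of catching exceptions.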

Build, Tooling

  • Run Scala scripts with dependencies - i.e., you don't need a project file

  • sbt-assembly 0.10.2 supports adding a shell script to your jar to make it executable! No more "java ...." to start your Scala program, and no more ps ax | grep java | grep ....

  • Other useful SBT plugins - sbt-sonatype, sbt-pom-reader, sbt-sound, plugins page

  • SCoverage - statement coverage tool, much more useful than line-based or branch-based tools. Has SBT plugin. Blog post on why it's an improvement.

  • sbt-jmh - Plugin for running JMH microbenchmarks from SBT projects

  • SBT Shell Prompt with Git and project name :) (SBT 0.13 only)

  • SBT updates - Tool for discovering updated versions of SBT dependencies

  • Thyme and Parsley - microbenchmarking and profiling tools, seems useful

  • ScalaStyle - Scala style checker / linter

  • Linter - Scala linter compiler plugin

  • utest - a small test framework

  • lions share - a neat JVM heap and GC analysis tool, with charts and SBT integration.

SBuild seems like a promising replacement for SBT. Still Scala, but much, much simpler, more like a Scala version of Make. Supports Maven dependencies and ScalaTest.

JVM Other

Databases

Indexing and OLAP

ML and Data Science

  • LearnDS - A set of IPython notebooks for learning data science

Distributed Systems

Sublime Text

I love Sublime and use it for everything, even Scala! Going to put my Sublime stuff in a separate page.

Best Practices and Design

Other Random Stuff

  • A list of great docs

  • JQ - JSON processor for the shell. Super useful with RESTful servers.

  • Underscore-CLI - a Node.js-based command-line JSON parser

  • MacroPy - Scala-like macros, case classes, pattern matching, parser combinators for Python (!!)

  • Scala 2.11 vs Swift - Apple's new iOS language is often compared to Scala.

  • Real World OCaml

  • Gherkin - a Lisp implemented in bash !!

  • Nimrod - a neat, compile-straight-to-binary, static systems language with beautiful Python-like syntax, union types, generics, macros, first-class functions. What Go should have been.

  • Bret Victor - A set of excellent essays and talks from a great visual designer