Just a bunch of useful links
Scala Design Patterns - great stuff, how you do (or don't) traditional Java / OOP patterns in Scala
The Human Side of Scala - great post on styling Scala for readability
Sneaking Scala Through the Back Door - how to promote Scala in an organization
Effective Scala - Twitter's guide to writing good Scala code
Between Zero & Hero - tips and tricks for the intermediate Scala developer
Scala School 2 - Twitter's next generation interactive Scala tutorial
Type of Types - an unfinished tutorial on the Scala type system
Monads are not Metaphors - a great explanation of monads
Important compiler flags
Recursive Types - signatures like class Foo[T <: Foo[T]], useful for inheritance and proper return types. Though if you hit this, there are probably better ways of solving the problem, i.e. via composition.
- Simple Binary Encoding - supposedly 20-50x faster than Google Protobuf !!
- Comparison of Cap'n Proto, SBE, FlatBuffers from the Cap'n Proto people
- Jawn - @d6's new fast JSON parser, parses to multiple ASTs including rojoma-json, spray-json, argonaut
- Extracting case class param names using Macros
- Fast-Serialization - a drop in replacement for Java Serialization but much faster
- Akka's ByteString class - immutable rope class for fast byte additions
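The recursive type signature mentioned above (class Foo[T <: Foo[T]]) is easier to judge with a small example. This is only a minimal sketch: Shape, Circle, Square and doubled are hypothetical names made up for illustration and do not come from any of the linked posts.

```scala
// Hypothetical names throughout; only the Foo[T <: Foo[T]] shape comes from the list above.
object RecursiveTypes {
  // Recursive (F-bounded) type: T must itself be a Shape[T],
  // so methods declared here can return the concrete subtype.
  trait Shape[T <: Shape[T]] {
    def scaled(factor: Double): T
  }

  case class Circle(radius: Double) extends Shape[Circle] {
    def scaled(factor: Double): Circle = Circle(radius * factor)
  }

  case class Square(side: Double) extends Shape[Square] {
    def scaled(factor: Double): Square = Square(side * factor)
  }

  // Generic code keeps the precise type: doubling a Circle yields a Circle,
  // not a plain Shape.
  def doubled[T <: Shape[T]](shape: T): T = shape.scaled(2.0)
}
```

The payoff is that RecursiveTypes.doubled(Circle(1.5)) is statically a Circle, with no cast. The cost is the extra type parameter on every signature that touches Shape, which is why composition is often the simpler route.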
Concurrency, Actors
CKite - Raft Scala implementation, Finagle, MapDB etc.
SafeFuture CancellableFuture etc - very useful
Execute Futures serially - in nonblocking fashion
Scala.Rx - "Reactive variables" - smart variables that auto-update when the values they depend on change
Monifu - a nice set of wrappers around j.u.c.Atomic*, as well as super-lightweight cancellable tasks and futures utilities. Accompanying blog post.
CEP using Akka Streams - great example of using Akka's new Streams for distributed stream processing with backpressure
akka instrumentation - an experiment to walk the actor tree and see stuff at runtime
- rxmon - Akka monitoring via RxJava
Actor Provisioning pattern - if you have a long, failure-prone initialization procedure for an actor, this trait splits out the work, to say another actor and dispatcher
Running an Akka cluster with Docker Containers
Ask, Tell, and Per-Request Actors - why one company moved from Ask/Futures to per-request
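For the "execute Futures serially" item above, one common way to do it with only the standard library is to fold over the inputs and chain each task with flatMap. This is a sketch under that assumption, not the code from the linked post, and serially is a hypothetical helper name.

```scala
import scala.concurrent.{ExecutionContext, Future}

object SerialFutures {
  // Runs `task` on each item one at a time. The Future for the next item is
  // only created inside flatMap, i.e. after the previous one has completed,
  // and no thread blocks while waiting.
  def serially[A, B](items: Seq[A])(task: A => Future[B])
                    (implicit ec: ExecutionContext): Future[Vector[B]] =
    items.foldLeft(Future.successful(Vector.empty[B])) { (acc, item) =>
      acc.flatMap(done => task(item).map(result => done :+ result))
    }
}
```

By contrast, Future.traverse starts all the tasks eagerly, so they may run concurrently. If any task fails, the fold short-circuits and later tasks are never started.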
Async Database Libs
- Asyncpools - Akka-based async connection pool for Slick. Akka 2.2 / Scala 2.10.
- Postgresql-Async - Netty-based async drivers for PostgreSQL and MySQL
- Cacheable - a clever memoization / caching library (with Guava, Redis, Memcached or EHCache backends) using Scala 2.10 macros to remember function parameters
Big Data Processing
Great list of Big Data Projects
Debasish G's list of streaming papers and algorithms - esp stuff on CountMinSketch and HyperLogLog
Summingbird - For any dataset that can be aggregated using a monoid, promises to unify Storm, Hadoop, and in the future, Akka and Spark with a single DSL. Also has a neat library of monoids built in.
Making Zookeeper Resilient, an excellent blog post from Pinterest
Probability Monad - super useful for stats or random data generation
stringmetric - Approximate string matching and phonetic algorithms
Factorie - a Scala library for Natural Language Processing
- Jaws - Spark SQL REST server, includes query cancellation, logs, load balancing.
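The Summingbird entry above hinges on the idea that any dataset aggregated with a monoid can be combined in any grouping, batch or streaming. A rough illustration of that idea follows. The Monoid trait and sumAll helper here are hand-rolled, hypothetical names and are not Summingbird's actual API.

```scala
object MonoidSketch {
  // A monoid: an associative `plus` together with an identity element `zero`.
  trait Monoid[A] {
    def zero: A
    def plus(x: A, y: A): A
  }

  object Monoid {
    implicit val intSum: Monoid[Int] = new Monoid[Int] {
      val zero: Int = 0
      def plus(x: Int, y: Int): Int = x + y
    }

    // Maps combine key by key, so partial counts from different
    // partitions or time windows can be merged in any grouping.
    implicit def mapMonoid[K, V](implicit v: Monoid[V]): Monoid[Map[K, V]] =
      new Monoid[Map[K, V]] {
        val zero: Map[K, V] = Map.empty[K, V]
        def plus(x: Map[K, V], y: Map[K, V]): Map[K, V] =
          y.foldLeft(x) { case (acc, (key, value)) =>
            acc.updated(key, v.plus(acc.getOrElse(key, v.zero), value))
          }
      }
  }

  def sumAll[A](xs: Seq[A])(implicit m: Monoid[A]): A =
    xs.foldLeft(m.zero)(m.plus)
}
```

Because plus is associative, summing two partitions separately and then merging the partial results gives the same answer as summing everything in one pass, which is what lets one job definition target different execution engines.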
Geospatial and Graph
GeoTrellis - distributed raster processing, adding Vector/geom support, Akka Cluster and Spark implementations!
Spatial framework for Hadoop - PostGIS-like operators / UDFs for Hive. We want this for Spark!
trails - parser combinators for graph traversal. Supports Tinker/Blueprints/Neo4j APIs.
scala-graph - in-memory graph API based on scala collections. Work in progress.
Collections, Numeric Processing, Fast Loops
- Breeze, Spire, and Saddle - Scala numeric libraries
- spire-ops - a set of macros for no-overhead implicit operator enrichment
- ScalaXY - collection of macros for performant for loops, extension methods etc
- Squants - The Scala API for Quantities, Units of Measure and Dimensional Analysis
- FastTuple - a dynamic (runtime-defined) C-style struct library, with support for off-heap storage. Would work really well for in-memory queries.
- and the excellent blog covers all of the on- and off-heap access and allocation patterns on the JVM very thoroughly.
- Unboxing, Runtime Specialization - a cool post on how to do really fast aggregations using unboxed integers
- product-collections - useful library for working with collections of tuples
- SuperFastHash - also see Murmur3
Big Data Storage
- Phantom - Scala DSL for Cassandra, supports CQL3 collections, CQL generation from data models, async API based on Datastax driver
- Athena - Asynchronous Cassandra client built on Akka-IO
- Stubbed Cassandra - super useful for testing C* apps
- Pithos - an S3-API-compatible object store for Cassandra
- Sirius - Akka-based in-memory fast key-value store for JVM objects, with Paxos consistency, persistence/txn logs, HA recovery
- Storehaus - Twitter's key-value wrapper around Redis, MySQL, and other stores. Has a neat merge() functionality for aggregation of values, lists, etc.
- MapDB - Not a database, but rather a database engine with tunable consistency / ACIDness; support for off-heap memory; fast performance; indexing and other features.
- HPaste - a nice Scala client for HBase
Web / REST / General
Scalaj-http - really simple REST API. Although, the latest Spray-client has been vastly simplified as well.
REPL as a service - would be kick ass if integrated into Spark
IScala - Scala backend for IPython. Looks promising. There is also Scala Notebook but it's more of a research project.
Scaposer - i18n / .po file library
Adding Reflection to Scala Macros - example of using reflection in an annotation macro to add automatic ByteBuffer serialization to case classes :)
Scaldi - A lightweight dependency injection library, with Akka integration
How to use Typesafe Config across multiple environments
Scala-rainbow - super simple terminal color output, easier than Console.XXX
SExt - Supplies some missing Standard Library functions, like pretty-printing data structures, unfold, etc.
ScalaUtils - ===, !== with tolerance for floats, an OR operator for types for easy validation (Int Or One[ErrorMessage])
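To give a feel for the two ScalaUtils ideas mentioned above (equality with a tolerance, and an "Or" type for validation), here is a standard-library-only sketch. It deliberately does not use ScalaUtils itself: approxEqual and parseAge are hypothetical helpers, and Either stands in for the library's Or type.

```scala
object ValidationSketch {
  // Approximate equality for doubles, standing in for === with a tolerance.
  def approxEqual(a: Double, b: Double, tolerance: Double): Boolean =
    math.abs(a - b) <= tolerance

  type ErrorMessage = String

  // Either plays the role of an "Or" type here:
  // Right carries the valid value, Left carries the error.
  def parseAge(input: String): Either[ErrorMessage, Int] =
    try {
      val age = input.trim.toInt
      if (age >= 0) Right(age) else Left("age must be non-negative: " + age)
    } catch {
      case _: NumberFormatException => Left("not a number: " + input)
    }
}
```

Returning the error as a value keeps validation failures in the type signature, so callers have to handle them instead of catching exceptions.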
Build, Tooling
Run Scala scripts with dependencies - i.e. you don't need a project file
sbt-assembly 0.10.2 supports adding a shell script to your jar to make it executable! No more "java ...." to start your Scala program, and no more ps ax | grep java | grep ....
Other useful SBT plugins - sbt-sonatype, sbt-pom-reader, sbt-sound, plugins page
SCoverage - statement coverage tool, much more useful than line-based or branch-based tools. Has SBT plugin. Blog post on why it's an improvement.
sbt-jmh - Plugin for running SBT projects with the JMH microbenchmarking harness
SBT Shell Prompt with Git and project name :) (SBT 0.13 only)
SBT updates - Tool for discovering updated versions of SBT dependencies
Thyme and Parsley - microbenchmarking and profiling tools, seems useful
ScalaStyle - Scala style checker / linter
Linter - Scala linter compiler plugin
utest - a small micro test framework
lions share - a neat JVM heap and GC analysis tool, with charts and SBT integration.
SBuild seems like a promising replacement for SBT. Still Scala, but much much simpler, more like Scala version of Make. With MVN dependency and ScalaTest support.
JVM Other
- Quick dumping your JVM heap using GDB -- too bad it doesn't work on OSX.
- jHiccup -- "Hiccup" or GC pause analysis tool
- Bintray - friendlier alternative to Sonatype OSS / Maven central. Also see bintray-sbt plugin.
Indexing and OLAP
- Adaptive Radix Trees - cache friendly indexing for in-memory databases
- Quotient Cubes - semantic grouping and rollup algorithm for OLAP cubes. Ruby implementation.
- Top K queries and cubes
- Scalable In-memory Aggregation - column-oriented, in memory with bitmap indexing and memoization
ML and Data Science
- LearnDS - A set of IPython notebooks for learning data science
Distributed Systems
- Raft Visualization - great 5-min visualization of the distributed consensus protocol
Sublime Text
I love Sublime and use it for everything, even Scala! Going to put my Sublime stuff in a separate page.
Best Practices and Design
- Semver - Semantic versioning, how to deal with dev workflows and corner cases -- a must read
- Pragmatic RESTful API Design - really good stuff
- Blameless Post-Mortems - why they are crucial to good culture
- GitHub Flow - how GitHub does continuous deploys, using pull requests for an automated, process-free development workflow. Some gems include naming branches descriptively and browsing the work currently in progress by looking at active branches.
- Pull Requests and other good GitHub Practices
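One practical corner case behind the Semver entry above is that versions must be compared numerically, component by component, not as strings. A minimal sketch, assuming plain MAJOR.MINOR.PATCH versions only (pre-release and build metadata are ignored); Version and parse are hypothetical names.

```scala
object SemVerSketch {
  // Orders by major, then minor, then patch, as semantic versioning prescribes.
  case class Version(major: Int, minor: Int, patch: Int) extends Ordered[Version] {
    def compare(that: Version): Int =
      Ordering[(Int, Int, Int)].compare(
        (major, minor, patch),
        (that.major, that.minor, that.patch))
  }

  // Accepts exactly three dot-separated numbers of up to 9 digits each,
  // so toInt cannot overflow.
  private val Pattern = """(\d{1,9})\.(\d{1,9})\.(\d{1,9})""".r

  def parse(s: String): Option[Version] = s match {
    case Pattern(major, minor, patch) =>
      Some(Version(major.toInt, minor.toInt, patch.toInt))
    case _ => None
  }
}
```

With this, 1.10.0 sorts after 1.9.3, whereas a plain string comparison would put it before.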
Other Random Stuff
JQ - JSON processor for the shell. Super useful with RESTful servers.
Underscore-CLI - a Node.js based command line JSON parser
MacroPy - Scala-like macros, case classes, pattern matching, parser combos for Python (!!)
Scala 2.11 vs Swift - Apple's new iOS language is often compared to Scala.
Gherkin - a Lisp implemented in bash !!
Nimrod - a neat, compile-straight-to-binary, static systems language with beautiful Python-like syntax, union types, generics, macros, first-class functions. What Go should have been.
Bret Victor - A set of excellent essays and talks from a great visual designer