/probabilistic-data-structures

require('lx') presentation on probabilistic data structures

Start

$ rm slides.md; while inotifywait -e close_write slides/*; do cat slides/*.md > slides.md; done
$ touch slides/*; reveal-md slides.md

Related

References

Applications

  • web analytics, data streaming
      • distinct visitors (hll)
      • advertising (hll per feature)
  • real-time auditing/probing
      • ex: number of requests per IP (cm-sketch)
  • big data processing
      • estimation (ex: Presto) (hll)
  • space-efficient optimizations
      • Cassandra: avoid costly IO lookups (bloom)

PipelineDB

Refs

Code sample

$ mkdir -p /mnt/ram
$ mount -t ramfs -o size=20m ramfs /mnt/ram