mosche
Principal Data Engineer, Apache Beam Committer and Open Source enthusiast - Open to work!
Talend
Munich, Germany
mosche's Stars
TileDB-Inc/TileDB-Spark
Spark interface to the TileDB storage manager [please see README]
polynote/polynote
A better notebook for Scala (and more)
vegas-viz/Vegas
The missing MatPlotLib for Scala + Spark
saddle/saddle
SADDLE: Scala Data Library
sterglee/scalalab
ScalaLab: Efficient MATLAB-like scientific computing for the Java platform with the current Scala 2.13. For Scala 3, the equivalent project is dottylab: https://github.com/sterglee/dottylab
haifengl/smile
Statistical Machine Intelligence & Learning Engine
apache/sedona
A cluster computing framework for processing large-scale geospatial data
harsha2010/magellan
Geo Spatial Data Analytics on Spark
tdunning/t-digest
A new data structure for accurate on-line accumulation of rank-based statistics such as quantiles and trimmed means
airlift/airlift
Airlift framework for building REST services
citusdata/postgresql-hll
PostgreSQL extension adding HyperLogLog data structures as a native data type
addthis/stream-lib
Stream summarizer and cardinality estimator.
aggregateknowledge/java-hll
Java library for the HyperLogLog algorithm
swoop-inc/spark-alchemy
Collection of open-source Spark tools & frameworks that have made the data engineering and data science teams at Swoop highly productive
softwaremill/tapir
Rapid development of self-documenting APIs
binhnguyennus/awesome-scalability
The Patterns of Scalable, Reliable, and Performant Large-Scale Systems
nats-io/nats-streaming-server
NATS Streaming System Server
facebookresearch/ParlAI
A framework for training and evaluating AI models on a variety of openly available dialogue datasets.
hmemcpy/milewski-ctfp-pdf
Bartosz Milewski's 'Category Theory for Programmers' unofficial PDF and LaTeX source
softwaremill/sttp
The Scala HTTP client you always wanted!
tonsky/FiraCode
Free monospaced font with programming ligatures
typelevel/spire
Powerful new number types and numeric abstractions for Scala.
rolandtritsch/scala.g8
My Scala project template
apache/pulsar
Apache Pulsar - distributed pub-sub messaging system
apache/mxnet
Lightweight, Portable, Flexible Distributed/Mobile Deep Learning with Dynamic, Mutation-aware Dataflow Dep Scheduler; for Python, R, Julia, Scala, Go, Javascript and more
apache/superset
Apache Superset is a Data Visualization and Data Exploration Platform
PacktPublishing/Mastering-Machine-Learning-with-Spark-2.x
Mastering Machine Learning with Spark 2.x, published by Packt
holdenk/spark-validator
A library you can include in your Spark job to validate the counters and perform operations on success. Goal is scala/java/python support.
typelevel/frameless
Expressive types for Spark.
groupon/sparklint
A tool for monitoring and tuning Spark jobs for efficiency.