devmanhinton's Stars
CenturyLinkLabs/dray
An engine for managing the execution of container-based workflows.
apache/arrow
Apache Arrow is the universal columnar format and multi-language toolbox for fast data interchange and in-memory analytics
wesm/feather
Feather: fast, interoperable binary data frame storage for Python, R, and more powered by Apache Arrow
ZuzooVn/machine-learning-for-software-engineers
A complete daily plan for studying to become a machine learning engineer.
springml/spark-sftp
Spark connector for SFTP
LucaCanali/sparkMeasure
This is the development repository for sparkMeasure, a tool and library designed for efficient analysis and troubleshooting of Apache Spark jobs. It focuses on easing the collection and examination of Spark metrics, making it a practical choice for both developers and data engineers.
freeCodeCamp/devdocs
API Documentation Browser
apache/griffin
Mirror of Apache Griffin
ziishaned/learn-regex
Learn regex the easy way
0xAX/linux-insides
A little bit about the Linux kernel
jenkinsci/job-dsl-plugin
A Groovy DSL for Jenkins Jobs - Sweeeeet!
schickling/chromeless
🖥 Chrome automation made simple. Runs locally or headless on AWS Lambda.
synack/docker-rsync
Docker + rsync
apache/incubator-toree
Mirror of Apache Toree (Incubating)
JerryLead/SparkInternals
Notes talking about the design and implementation of Apache Spark
facebook/prophet
Tool for producing high-quality forecasts for time series data that has multiple seasonalities with linear or non-linear growth.
twitter/AnomalyDetection
Anomaly Detection with R
akullpp/awesome-java
A curated list of awesome frameworks, libraries and software for the Java programming language.
aaronlevin/scala-gitrev
blackrock/TopNotch
A framework for systematically quality controlling big data.
databricks/spark-sql-perf
kenbot/goggles
Pleasant, yet principled Scala optics DSL
spark-jobserver/spark-jobserver
REST job server for Apache Spark
apache-spark-on-k8s/spark
Apache Spark enhanced with a native Kubernetes scheduler back-end. NOTE: this repository is being ARCHIVED, as all new development for the Kubernetes scheduler back-end is now at https://github.com/apache/spark/
scalameta/scalameta
Library to read, analyze, transform and generate Scala programs
tuhdo/os01
Bootstrap yourself to write an OS from scratch. A book for self-learners.
google/guice
Guice (pronounced 'juice') is a lightweight dependency injection framework for Java 11 and above, brought to you by Google.
twosigma/flint
A Time Series Library for Apache Spark
typelevel/doobie
Functional JDBC layer for Scala.
Netflix/dynomite
A generic dynamo implementation for different k-v storage engines