sungjuly's Stars
donnemartin/system-design-primer
Learn how to design large-scale systems. Prep for the system design interview. Includes Anki flashcards.
DopplerHQ/awesome-interview-questions
:octocat: A curated awesome list of lists of interview questions. Feel free to contribute! :mortar_board:
microsoft/ML-For-Beginners
12 weeks, 26 lessons, 52 quizzes, classic Machine Learning for all
apache/superset
Apache Superset is a Data Visualization and Data Exploration Platform
recommenders-team/recommenders
Best Practices on Recommendation Systems
mlflow/mlflow
Open source platform for the machine learning lifecycle
so-fancy/diff-so-fancy
Good-lookin' diffs. Actually… nah… The best-lookin' diffs. :tada:
github/gh-ost
GitHub's Online Schema-migration Tool for MySQL
great-expectations/great_expectations
Always know what to expect from your data.
datahub-project/datahub
The Metadata Platform for your Data Stack
VGraupera/1on1-questions
Mega list of 1 on 1 meeting questions compiled from a variety of sources
testcontainers/testcontainers-java
Testcontainers is a Java library that supports JUnit tests, providing lightweight, throwaway instances of common databases, Selenium web browsers, or anything else that can run in a Docker container.
bentoml/BentoML
The easiest way to serve AI apps and models - Build Model Inference APIs, Job queues, LLM apps, Multi-model pipelines, and much more!
pditommaso/awesome-pipeline
A curated list of awesome pipeline toolkits inspired by Awesome Sysadmin
rundeck/rundeck
Enable Self-Service Operations: Give specific users access to your existing tools, services, and scripts
facebookresearch/mmf
A modular framework for vision & language multimodal research from Facebook AI Research (FAIR)
apache/hudi
Upserts, Deletes And Incremental Processing on Big Data.
JerryLead/SparkInternals
Notes talking about the design and implementation of Apache Spark
hashicorp/waypoint
A tool to build, deploy, and release any application on any platform.
uptrace/bun
SQL-first Golang ORM
FactoryBoy/factory_boy
A test fixtures replacement for Python
arl/statsviz
Visualise your Go program runtime metrics in real time in the browser
Netflix/vectorflow
uber/queryparser
Parsing and analysis of Vertica, Hive, and Presto SQL.
facebookarchive/bistro
Bistro is a flexible distributed scheduler, a high-performance framework supporting multiple paradigms while retaining ease of configuration, management, and monitoring.
etsy/boundary-layer
Builds Airflow DAGs from configuration files. Powers all DAGs on the Etsy Data Platform
ubisoft/mobydq
:whale: Tool to automate data quality checks on data pipelines
Kuniz/alfnaversearch
Naver Search Workflow for Alfred (Alfred workflow for Naver search/dictionary/map autocomplete)
naver/hadoop
Public hadoop release repository
Yelp/aws_logs_to_parquet_converter
Spark batch converter to convert AWS S3 server side logs to Parquet file format