initialkommit
Data Engineer, Python, SQL, Scala, Spark, Hadoop, EMR, Zeppelin, etc.
Kakao Entertainment, Pangyo
initialkommit's Stars
datastacktv/data-engineer-roadmap
Roadmap to becoming a data engineer in 2021
oleg-agapov/data-engineering-book
Accumulated knowledge and experience in the field of Data Engineering
puckel/docker-airflow
Docker Apache Airflow
SpaceVim/SpaceVim
A community-driven modular vim/neovim distribution - The ultimate vimrc
karlredman/Vimwiki-Gollum-Integration
This is a guide and tutorial, with tools and 'out of the box' examples, for integrating Vimwiki with Gollum Wiki on Linux systems.
alexramirez/mac-setup
A very brief and basic list related to the Mac computer setup I like to work with.
channable/opnieuw
One weird trick to make your code more reliable
minio/mc
Simple | Fast tool to manage MinIO clusters ☁️
vimwiki/vimwiki
Personal Wiki for Vim
BoostIO/BoostNote-Legacy
This repository is outdated and new Boost Note app is available! We've launched a new Boost Note app which supports real-time collaborative writing. https://github.com/BoostIO/BoostNote-App
parksb/handmade-blog
✍️ A static blog generator for people who want to start a blog quickly
webpro/dotfiles
Dotfiles for macOS
ekampf/PySpark-Boilerplate
A boilerplate for writing PySpark Jobs
gtoonstra/etl-with-airflow
ETL best practices with airflow, with examples
python-poetry/poetry
Python packaging and dependency management made easy
danielvdende/data-testing-with-airflow
jamalex/notion-py
Unofficial Python API client for Notion.so
GitAlias/gitalias
Git alias commands for faster easier version control
apache/airflow
Apache Airflow - A platform to programmatically author, schedule, and monitor workflows
dsaidgovsg/airflow-pipeline
An Airflow docker image preconfigured to work well with Spark and Hadoop/EMR
hyonaldo/spark-submit-examples
maven build examples for spark-submit
PrefectHQ/prefect
Prefect is a workflow orchestration tool empowering developers to build, observe, and react to data pipelines
erikbern/git-of-theseus
Analyze how a Git repo grows over time
yorks/mpfhandler
A multi-process timed rotating log file handler (based on logging.RotatingFileHandler and ConcurrentLogHandler)
spotify/chartify
Python library that makes it easy for data scientists to create charts.
spotify/luigi
Luigi is a Python module that helps you build complex pipelines of batch jobs. It handles dependency resolution, workflow management, visualization etc. It also comes with Hadoop support built in.
ericwayman/luigi_gdb_pipeline_demo
An example to illustrate using Luigi to manage a data science workflow in Greenplum Database
a-hacker/PyCon2018-Luigi
banksalad/K-Format
🇰🇷 Python library for defining Korean-style fixed-length message formats (fixed-format message communication)
drduh/macOS-Security-and-Privacy-Guide
Guide to securing and improving privacy on macOS