initialkommit
Data Engineer, Python, SQL, Scala, Spark, Hadoop, EMR, Zeppelin, etc.
Kakao Entertainment, Pangyo
initialkommit's Stars
apache/airflow
Apache Airflow - A platform to programmatically author, schedule, and monitor workflows
python-poetry/poetry
Python packaging and dependency management made easy
drduh/macOS-Security-and-Privacy-Guide
Guide to securing and improving privacy on macOS
SpaceVim/SpaceVim
A modular Vim/Neovim configuration
spotify/luigi
Luigi is a Python module that helps you build complex pipelines of batch jobs. It handles dependency resolution, workflow management, visualization etc. It also comes with Hadoop support built in.
PrefectHQ/prefect
Prefect is a workflow orchestration framework for building resilient data pipelines in Python.
BoostIO/BoostNote-Legacy
This repository is outdated and new Boost Note app is available! We've launched a new Boost Note app which supports real-time collaborative writing. https://github.com/BoostIO/BoostNote-App
datastacktv/data-engineer-roadmap
Roadmap to becoming a data engineer in 2021
vimwiki/vimwiki
Personal Wiki for Vim
jamalex/notion-py
Unofficial Python API client for Notion.so
puckel/docker-airflow
Docker Apache Airflow
spotify/chartify
Python library that makes it easy for data scientists to create charts.
minio/mc
Unix-like utilities for object store
erikbern/git-of-theseus
Analyze how a Git repo grows over time
GitAlias/gitalias
Git alias commands for faster easier version control
gtoonstra/etl-with-airflow
ETL best practices with airflow, with examples
webpro/dotfiles
Dotfiles for macOS
oleg-agapov/data-engineering-book
Accumulated knowledge and experience in the field of Data Engineering
ekampf/PySpark-Boilerplate
A boilerplate for writing PySpark Jobs
channable/opnieuw
One weird trick to make your code more reliable
alexramirez/mac-setup
A very brief and basic list related to the mac computer setup I like to work with.
parksb/handmade-blog
✍️ A static blog generator for people who want to start a blog quickly
danielvdende/data-testing-with-airflow
dsaidgovsg/airflow-pipeline
An Airflow docker image preconfigured to work well with Spark and Hadoop/EMR
karlredman/Vimwiki-Gollum-Integration
This is a guide and tutorial, with tools and 'out of the box' examples, for integrating Vimwiki with Gollum Wiki on Linux systems.
yorks/mpfhandler
A multi-process timed rotating logging file handler (based on logging.RotatingFileHandler and ConcurrentLogHandler)
banksalad/K-Format
🇰🇷 Python library for Korean-style fixed-length format definitions (전문 통신, fixed-format message exchange)
ericwayman/luigi_gdb_pipeline_demo
An example to illustrate using Luigi to manage a data science workflow in Greenplum Database
a-hacker/PyCon2018-Luigi
hyonaldo/spark-submit-examples
Maven build examples for spark-submit