kim-dabin's Stars
citusdata/docker
🚢 Docker images and configuration for Citus
bhavul/System-Design-Cheatsheet
Studying system design can be daunting. This gives you a table for studying different problems, understanding what components they require, their pros and cons, and how to mitigate their weaknesses.
Sungchul-P/aws-cdk-examples
channel-io/monthly-channel
Monthly Channel (월간채널)
josephmachado/docker_for_data_engineers
Code for blog at: https://www.startdataengineering.com/post/docker-for-de/
DarkTornado/KakaoLink.js
astronomer/astro-sdk
Astro SDK allows rapid and clean development of {Extract, Load, Transform} workflows using Python and SQL, powered by Apache Airflow.
wangAoqi666/bigdata-interview
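The most complete big data interview handbook for major tech companies: big data interview questions, big data interviews, Wang Aoqi's road to big data, the road to big data mastery, and Flink/Spark/Hadoop/Hbase/Hive/Impala/Hbase/MapReduce/YARN/HDFS/Kafka/Flume/Linux/Java/Scala... interview questions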
kaushikj/video2pdf
EKarton/Lecture-Video-to-PDF
Making lecture videos readable
microsoft/generative-ai-for-beginners
21 Lessons, Get Started Building with Generative AI 🔗 https://microsoft.github.io/generative-ai-for-beginners/
edbullen/DockerSpark245
Spark cluster in Docker containers with sample training Jupyter notebooks
Spatacoli/e-reader
The Python source code for my Raspberry Pi 4 e-reader.
joeycastillo/The-Open-Book
docker/awesome-compose
Awesome Docker Compose samples
scalingexcellence/scrapybook
Scrapy Book Code
baabaaox/ScrapyDouban
Scrapy crawler for Douban Movies / Douban Books
mmas/docker-scrapy-tor
Scrapy environment with Tor for anonymous IP routing and Privoxy as an HTTP proxy
8W9aG/scrapy-tor-downloader
Scrapy middleware with TOR support for more robust scrapers or anonymous scraping.
heckenmann/tor-scrapy
Web crawler using a Tor proxy, Elasticsearch, and Scrapy
mheinl/OnionCrawler
Scrapy spider to recursively crawl for TOR hidden services
eksctl-io/eksctl
The official CLI for Amazon EKS
Swalloow/airflow-korean-timetable
Airflow Timetable for Korean working days
palantir/pyspark-style-guide
This is a guide to PySpark code style presenting common situations and the associated best practices based on the most frequent recurring topics across the PySpark repos we've encountered.
istresearch/scrapy-cluster
This Scrapy project uses Redis and Kafka to create a distributed, on-demand scraping cluster.
geekan/scrapy-examples
Multifarious Scrapy examples. Spiders for alexa / amazon / douban / douyu / github / linkedin etc.
scrapy/scrapy
Scrapy, a fast high-level web crawling & scraping framework for Python.
yasserg/crawler4j
Open Source Web Crawler for Java
danifus/pyzipper
Python zipfile extensions
databricks/Spark-The-Definitive-Guide
Spark: The Definitive Guide's Code Repository