Pinned Repositories
hudi
Upserts, Deletes And Incremental Processing on Big Data.
AutoRobRedPackage
Fully automatic red packet grabbing, with a built-in window-closing feature
big-data-literature
A collection of papers, books, and other materials on big data technologies
carmhuo.github.io
Study notes and reflections
carSystem1
SGM carSystem
CrawlSpider
Crawler script written in Python
datamining
Data mining tests in Python
DataX
DataX is the open-source version of Alibaba Cloud DataWorks Data Integration.
delta
An open-source storage layer that brings scalable, ACID transactions to Apache Spark™ and big data workloads.
Douglas-Peucker
A Java program that implements the Douglas-Peucker algorithm.
carmhuo's Repositories
carmhuo/scrapy-redis
Redis-based components for Scrapy that allow distributed crawling
carmhuo/pg2dm-python
carmhuo/snownlp
Python library for processing Chinese text
carmhuo/dirbot
Scrapy project to scrape public web directories (educational)
carmhuo/james_blog
Example Jekyll Blog source for The Docker Book
carmhuo/hadoop-docker
Hadoop docker image
carmhuo/incubator-tez
Mirror of Apache Tez (Incubating)
carmhuo/dotfiles
Config files.
carmhuo/snowseg
Tools for Chinese word segmentation and POS tagging, written in Python