chengat1314
I am a Data geek, working on big data analytics. Data Engineering + Data Science
Grab, Singapore
chengat1314's Stars
kdeldycke/awesome-engineering-team-management
👔 How to transition from software development to engineering management
trinodb/trino
Official repository of Trino, the distributed SQL query engine for big data, formerly known as PrestoSQL (https://trino.io)
marcotcr/lime
Lime: Explaining the predictions of any machine learning classifier
minio/minio
MinIO is a high-performance, S3 compatible object store, open sourced under GNU AGPLv3 license.
gunnarmorling/awesome-opensource-data-engineering
An Awesome List of Open-Source Data Engineering Projects
flink-china/flink-forward-china-2018
Flink Forward China 2018 Slides
Netflix/iceberg
Iceberg is a table format for large, slow-moving tabular data
wypb/spark-summit-north-america-2018-06
spark-summit-north-america-2018-06. For more details, please visit
pinterest/secor
Secor is a service implementing Kafka log persistence
prestodb/presto
The official home of the Presto distributed SQL query engine for big data
atlassian/themis
Autoscaling EMR clusters and Kinesis streams on Amazon Web Services (AWS)
chengat1314/hadoop-cluster-docker
Run Hadoop Cluster within Docker Containers
phatak-dev/spark2.0-examples
Examples of Spark 2.0
jakevdp/sklearn_tutorial
Materials for my scikit-learn tutorial
apache/phoenix
Apache Phoenix
spark-notebook/spark-notebook
Interactive and Reactive Data Science using Scala and Spark.
segmentio/analytics.js
The hassle-free way to integrate analytics into any web application.
chengat1314/DataX
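DataX is an offline data synchronization tool/platform widely used within Alibaba Group. It provides efficient data synchronization between heterogeneous data sources, including MySQL, Oracle, HDFS, Hive, OceanBase, HBase, OTS, and ODPS.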
chengat1314/skill-map
StuQ skill maps
cloudera/livy
Livy is an open source REST interface for interacting with Apache Spark from anywhere
spark-jobserver/spark-jobserver
REST job server for Apache Spark
apache/superset
Apache Superset is a Data Visualization and Data Exploration Platform
chengat1314/CoolplaySpark
Coolplay Spark: Spark source code analysis, Spark libraries, and more
solos/regexdict
regex dict: a regular expression dictionary
chengat1314/gsync
RSync for Google Drive - GSync
chengat1314/fingerprintjs2
Modern & flexible browser fingerprinting library, a successor to the original fingerprintjs
suede/breakwall
Automatically exported from code.google.com/p/breakwall
projectkudu/kudu
Kudu is the engine behind git/hg deployments, WebJobs, and various other features in Azure Web Sites. It can also run outside of Azure.
chengat1314/android_sdk
This is the Android SDK of
tensorflow/tensorflow
An Open Source Machine Learning Framework for Everyone