Pinned Repositories
airflow
Apache Airflow - A platform to programmatically author, schedule, and monitor workflows
DataX
DataX is the open-source version of Alibaba Cloud DataWorks Data Integration.
datax-distribute
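DataX distributed service: splits the job and the taskGroup into two separate processes that communicate over RPC, which provides distributed execution and avoids the resource limits of a single process.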
datax-service
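Secondary development on top of DataX: exposes the data-push service over RPC, invoked by passing a JSON configuration, and fixes multiple DataX bugs. The project also uses Nacos as both configuration center and service registry, and adds the kafkawriter, rabbitmqwriter, esreader, and hivereader plugins. The HDFS plugin is enhanced to support pushing to partitioned tables and passing dynamic parameters (for example, time-based incremental extraction). See the example module for usage. The service has run stably at a publicly listed company for half a year, with 100+ tasks in total and more than 1 billion records pushed per day. For usage details, plugin development, and DataX internals, see https://blog.csdn.net/xiaoyao1999hn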
dubbo-rest-example
dubbo rest filter
flink
Apache Flink
hera
hera: a distributed task scheduling system (built for data departments)
hugegraph
HugeGraph Database core component, including graph engine, API, and built-in backends
incubator-dolphinscheduler
Dolphin Scheduler is a distributed, easily extensible visual DAG workflow scheduling system, dedicated to solving complex dependencies in data processing so that the scheduling system works out of the box.
streamx
Make stream processing easier! A Flink & Spark development scaffold. StreamX was originally created to make Flink development easier, and it focuses on managing development phases and tasks. Our ultimate goal is to build a one-stop big data solution integrating stream processing, batch processing, data warehouse, and data lake.
caosuwenwu's Repositories
caosuwenwu/hera
hera: a distributed task scheduling system (built for data departments)
caosuwenwu/airflow
Apache Airflow - A platform to programmatically author, schedule, and monitor workflows
caosuwenwu/DataX
DataX is the open-source version of Alibaba Cloud DataWorks Data Integration.
caosuwenwu/datax-service
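Secondary development on top of DataX: exposes the data-push service over RPC, invoked by passing a JSON configuration, and fixes multiple DataX bugs. The project also uses Nacos as both configuration center and service registry, and adds the kafkawriter, rabbitmqwriter, esreader, and hivereader plugins. The HDFS plugin is enhanced to support pushing to partitioned tables and passing dynamic parameters (for example, time-based incremental extraction). See the example module for usage. The service has run stably at a publicly listed company for half a year, with 100+ tasks in total and more than 1 billion records pushed per day. For usage details, plugin development, and DataX internals, see https://blog.csdn.net/xiaoyao1999hn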
caosuwenwu/dubbo-rest-example
dubbo rest filter
caosuwenwu/flink
Apache Flink
caosuwenwu/hugegraph
HugeGraph Database core component, including graph engine, API, and built-in backends
caosuwenwu/incubator-dolphinscheduler
Dolphin Scheduler is a distributed, easily extensible visual DAG workflow scheduling system, dedicated to solving complex dependencies in data processing so that the scheduling system works out of the box.
caosuwenwu/mybatis-3
MyBatis SQL mapper framework for Java
caosuwenwu/netty
Netty project - an event-driven asynchronous network application framework
caosuwenwu/SpringBoot-Simple-Demo
Development template. Environment: IntelliJ IDEA, JDK 8, Maven 3.5.x, Lombok 1.16.18. Frameworks: Spring Boot, Swagger2, Druid, Log4j2, MyBatis, MyBatis Plus, MySQL, H2, Thymeleaf.
caosuwenwu/tunnel
PostgreSQL data synchronization tool (implemented in Java), with Hive support
caosuwenwu/xxl-rpc
Source code analysis (focused on practical Netty usage). A high-performance, distributed RPC framework (XXL-RPC).
caosuwenwu/zeus
Taobao Zeus: supports Hadoop MapReduce, Hive, and shell jobs. The front end is written in Java using GWT, Google's rich-client toolkit. It has since been redeveloped as hera (https://github.com/scxwhite/hera).
caosuwenwu/datax-distribute
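DataX distributed service: splits the job and the taskGroup into two separate processes that communicate over RPC, which provides distributed execution and avoids the resource limits of a single process.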
caosuwenwu/streamx
Make stream processing easier! A Flink & Spark development scaffold. StreamX was originally created to make Flink development easier, and it focuses on managing development phases and tasks. Our ultimate goal is to build a one-stop big data solution integrating stream processing, batch processing, data warehouse, and data lake.
caosuwenwu/chronus
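Chronus is an open-source, production-grade distributed scheduler, rewritten by the 360 Finance technology team on the basis of Alibaba's open-source project TBSchedule.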
caosuwenwu/chunjun
flinkx: data exchange and synchronization
caosuwenwu/CloudShuffleService
Cloud Shuffle Service(CSS) is a general purpose remote shuffle solution for compute engines, including Spark/Flink/MapReduce.
caosuwenwu/datax-web
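A visual web UI for DataX: select data sources to generate data synchronization tasks in one click. Supports RDBMS, Hive, HBase, ClickHouse, MongoDB, and other data sources; batch creation of RDBMS synchronization tasks; integration with an open-source scheduling system; distributed execution; incremental data synchronization; real-time viewing of run logs; monitoring of executor resources; killing running processes; and encryption of data source information.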
caosuwenwu/dubbo-study
A hand-written Dubbo framework
caosuwenwu/flink-streaming-platform-web
A web platform for real-time stream computing, based on Flink SQL
caosuwenwu/FlinkSQL
Develop real-time Flink programs with SQL, modeled on Alibaba's Blink
caosuwenwu/God-Of-BigData
Learning path
caosuwenwu/mysql-connector-j
MySQL Connector/J
caosuwenwu/netty-to-dubbo
A hand-written Dubbo framework based on Netty
caosuwenwu/orc
Mirror of Apache Orc
caosuwenwu/spark_study
Spark source code study
caosuwenwu/tinkerpop
Apache TinkerPop - a graph computing framework