spancer
Big data practitioner and smart-factory data architect. Expert in big data architecture, search engines, big data analysis, and agile development.
Changsha
Pinned Repositories
bigdata-docker-builds
Docker images for building Hadoop 3.2, Hive 3.1, HBase 2.3, Presto 0.247, Flink 1.11.3 on YARN, etc.
bigdata-docker-compose
Deploy a big data platform using Docker Compose. Components include Hadoop, Hive, HBase, Presto, Flink, Elasticsearch, Kafka, etc.
CS-Notes
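:books: Essential fundamentals for technical interviews: LeetCode solutions, Java, C++, Python, backend interviews, operating systems, computer networks, system design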
elasticlake
Open-source data lake built on top of Apache Iceberg
elasticsearch-ansj-analysis-plugin
Ansj analysis plugin for Elasticsearch
FiboRulex
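FiboRulex: a real-time AI decision engine, rule engine, risk-control engine, and data-flow engine. Rules are configured through a visual interface with no tedious development, which saves labor, improves efficiency, enables real-time monitoring, reduces the error rate, and allows adjustment at any time. It supports rule sets, scorecards, decision trees, list-library management, machine learning models, third-party data integration, custom development, and more.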
flink-es-demo
Quickly implement vehicle collision analysis, cloned-license-plate vehicle analysis, and tailing analysis based on ES.
flink-iceberg-demo
Flink and Iceberg integration tests, with jobs running on YARN.
prestodb-hbase-connector
PrestoDB HBase connector, using ZooKeeper to hold the metadata.
zeus
Zeus is an open-source analytical engine for big data held in a data lake; it was designed to provide OLAP (Online Analytical Processing) capability in the big data era. You can use Zeus to store, query, analyze, and manage data.
spancer's Repositories
spancer/elasticlake
Open-source data lake built on top of Apache Iceberg
spancer/nebula
A distributed, fast open-source graph database featuring horizontal scalability and high availability
spancer/OneBlog
:alien: OneBlog, a clean, attractive, powerful, and responsive Java blog
spancer/chatbot
https://pan.baidu.com/s/1jagzKBGAChKkt3cF_ajSlw Extraction code: st5f. Copy this text and then open the Baidu Netdisk mobile app for easier access.
spancer/conf
The simplest Dockerfile for Confluence.
spancer/cratedb
CrateDB is a distributed SQL database that makes it simple to store and analyze massive amounts of machine data in real-time.
spancer/dgraph
Native GraphQL Database with graph backend
spancer/docker-spark-iceberg
spancer/efaqa-corpus-zh
❤️ Emotional First Aid Dataset: psychological counseling Q&A and chatbot corpus
spancer/flink-ml
Machine learning library of Apache Flink
spancer/flinky
flinky is an out-of-the-box, one-stop real-time computing platform dedicated to the construction and practice of Unified Batch & Streaming and Unified Data Lake & Data Warehouse. Based on Apache Flink, Dinky provides the ability to connect many big data frameworks including OLAP and Data Lake.
spancer/flinky-website
Dinky website
spancer/genie
Distributed Big Data Orchestration Service
spancer/gobblin
A distributed data integration framework that simplifies common aspects of big data integration such as data ingestion, replication, organization and lifecycle management for both streaming and batch data ecosystems.
spancer/iceberg
Apache Iceberg
spancer/janusgraph
JanusGraph: an open-source, distributed graph database
spancer/jpmml-sklearn
Java library and command-line application for converting Scikit-Learn pipelines to PMML
spancer/karmada
Open, Multi-Cloud, Multi-Cluster Kubernetes Orchestration
spancer/kubernetes-client
Java client for Kubernetes & OpenShift
spancer/metacat
spancer/metaflow
:rocket: Build and manage real-life data science projects with ease!
spancer/mlops
YMIR, a streamlined model development product.
spancer/money-transfer-project-template-java
spancer/neptune-sklearn
Experiment tracking and model registry for Scikit learn. 🧩 Visualize, organize, and compare model metrics, parameters, dataset versions, and more.
spancer/presto-retention-udf
spancer/presto-teach
Shared resources for Presto and Trino: development documentation, source code reading, and secondary (custom) development.
spancer/QLExpress
QLExpress is a powerful, lightweight, dynamic language for the Java platform aimed at improving developers’ productivity in different business scenes.
spancer/seldon-core
An MLOps framework to package, deploy, monitor and manage thousands of production machine learning models
spancer/Trisk
Trisk on Flink
spancer/wal2json
PostgreSQL log copy, a JSON output plugin for changeset extraction