Pinned Repositories
airbyte
Airbyte is an open-source EL(T) platform that helps you replicate your data in your warehouses, lakes and databases.
airbyte-helm
awesome-datascience
:memo: An awesome Data Science repository to learn and apply for real world problems.
awesome-scalability
The Patterns of Scalable, Reliable, and Performant Large-Scale Systems
awesome-sysadmin
A curated list of amazingly awesome open source sysadmin resources inspired by Awesome PHP.
BigData-Notes
A Beginner's Guide to Big Data :star:
book-notes
Notes from books and other interesting things that I've read. Table of contents at the end 👇
citus
Scalable PostgreSQL for multi-tenant and real-time analytics workloads
dag-factory
Dynamically generate Apache Airflow DAGs from YAML configuration files
data-science-ipython-notebooks
Data science Python notebooks: Deep learning (TensorFlow, Theano, Caffe, Keras), scikit-learn, Kaggle, big data (Spark, Hadoop MapReduce, HDFS), matplotlib, pandas, NumPy, SciPy, Python essentials, AWS, and various command lines.
brucemen711's Repositories
brucemen711/BigData-Notes
A Beginner's Guide to Big Data :star:
brucemen711/airbyte
Airbyte is an open-source EL(T) platform that helps you replicate your data in your warehouses, lakes and databases.
brucemen711/airbyte-helm
brucemen711/awesome-datascience
:memo: An awesome Data Science repository to learn and apply for real world problems.
brucemen711/awesome-sysadmin
A curated list of amazingly awesome open source sysadmin resources inspired by Awesome PHP.
brucemen711/book-notes
Notes from books and other interesting things that I've read. Table of contents at the end 👇
brucemen711/dag-factory
Dynamically generate Apache Airflow DAGs from YAML configuration files
brucemen711/dbt-core
dbt enables data analysts and engineers to transform their data using the same practices that software engineers use to build applications.
brucemen711/dbt-duckdb-template
A dbt duckdb template
brucemen711/dbt-duckdb-udf
A dbt-duckdb-udf plugin
brucemen711/dbt-spark
dbt-spark contains all of the code enabling dbt to work with Apache Spark and Databricks
brucemen711/debezium-examples
Examples for running Debezium (Configuration, Docker Compose files etc.)
brucemen711/duckdb-udf
brucemen711/hive-metastore-docker
Example for article Running Spark 3 with standalone Hive Metastore 3.0
brucemen711/iceberg
Apache Iceberg
brucemen711/iceberg-stack-docker
Iceberg Stack
brucemen711/incubator-dolphinscheduler
Apache DolphinScheduler is a distributed and extensible workflow scheduler platform with powerful DAG visual interfaces, dedicated to solving complex job dependencies in the data pipeline and providing various types of jobs available `out of the box`.
brucemen711/integrations-extras
Community developed integrations and plugins for the Datadog Agent.
brucemen711/k8s-example
example
brucemen711/kamu-cli
New generation decentralized data warehouse and streaming data pipeline
brucemen711/Machine-Learning-Yearning-Vietnamese-Translation
brucemen711/mml-book.github.io
Companion webpage to the book "Mathematics For Machine Learning"
brucemen711/presto
Official home of the community managed version of Presto, the distributed SQL query engine for big data, under the auspices of the Presto Software Foundation.
brucemen711/puppet-clickhouse-1
Install and manage ClickHouse DBMS. Requires the xml-simple Ruby gem to be installed.
brucemen711/ranger
Mirror of Apache Ranger
brucemen711/react-flow
Highly customizable library for building interactive node-based UIs, editors, flow charts and diagrams
brucemen711/spark
Apache Spark
brucemen711/the_silver_searcher
A code-searching tool similar to ack, but faster.
brucemen711/vitess
Vitess is a database clustering system for horizontal scaling of MySQL.
brucemen711/wormhole
Wormhole is a SPaaS (Stream Processing as a Service) Platform