Pinned Repositories
AIPND-Exercises
data-lake-streaming-music
The intention of this project is to create a data lake to hold and process logs from a streaming music platform
data-pipeline-songs-analysis
DataEngineerFoundationalSkills
Foundational and adjacent skills in data engineering.
delta
An open-source storage framework that enables building a Lakehouse architecture with compute engines including Spark, PrestoDB, Flink, Trino, and Hive and APIs
diane
Hive helper functions for Apache Spark users
hio
HDFS filesystem and object store helper methods
netflix-content-reviews
This project is a data pipeline that collects Reddit opinions about Netflix content. The data feeds a Twitter bot that tweets whenever someone writes on Reddit about a movie or series in the Netflix catalog. A data warehouse will also be created to serve an analytics dashboard that answers a few questions.
simpledb
jodie
Delta Lake and filesystem helper methods
brayanjuls's Repositories
brayanjuls/diane
Hive helper functions for Apache Spark users
brayanjuls/hio
HDFS filesystem and object store helper methods
brayanjuls/data-lake-streaming-music
The intention of this project is to create a data lake to hold and process logs from a streaming music platform
brayanjuls/data-pipeline-songs-analysis
brayanjuls/DataEngineerFoundationalSkills
Foundational and adjacent skills in data engineering.
brayanjuls/netflix-content-reviews
This project is a data pipeline that collects Reddit opinions about Netflix content. The data feeds a Twitter bot that tweets whenever someone writes on Reddit about a movie or series in the Netflix catalog. A data warehouse will also be created to serve an analytics dashboard that answers a few questions.
brayanjuls/airflow-pipeline
An Airflow docker image preconfigured to work well with Spark and Hadoop/EMR
brayanjuls/delta
An open-source storage framework that enables building a Lakehouse architecture with compute engines including Spark, PrestoDB, Flink, Trino, and Hive and APIs
brayanjuls/simpledb
brayanjuls/arrow-datafusion
Apache Arrow DataFusion SQL Query Engine
brayanjuls/arrow2
Transmute-free Rust library to work with the Arrow format
brayanjuls/brayanjuls.github.io
Personal webpage
brayanjuls/data-scrapbook
A collection of images and captions to explain core data concepts
brayanjuls/datawerehouse-songplays-analysis
Data modeling of an OLAP database from the streaming music datasets
brayanjuls/delta-examples
Delta Lake examples
brayanjuls/delta-rs
A native Rust library for Delta Lake, with bindings into Python
brayanjuls/desponge
brayanjuls/incubator-pinot
Apache Pinot (Incubating) - A realtime distributed OLAP datastore
brayanjuls/International-Football-results-analysis
brayanjuls/jodie
Delta Lake and filesystem helper methods
brayanjuls/levi
Delta Lake helper methods. No Spark dependency.
brayanjuls/limbo
Limbo is a work-in-progress, in-process OLTP database management system, compatible with SQLite.
brayanjuls/plt
λΠ Programming Language Theory
brayanjuls/polars
Fast multi-threaded, hybrid-out-of-core query engine focussing on DataFrame front-ends
brayanjuls/r2-take-home-assesment
brayanjuls/rickandmorty_pubsub_injector
brayanjuls/rickandmortyprocess
brayanjuls/risinglight-tutorial
Let's build an OLAP database from scratch! 🚧 UNDER CONSTRUCTION 🚧
brayanjuls/toy-olap-db
OLAP DB implementation from scratch for educational purposes
brayanjuls/twitter4s
An asynchronous non-blocking Scala client for both the Twitter Rest and Streaming API