danielbeach
Data Engineer. Data lover. Data warehouse expert. Python, Rust, SQL, Databricks, and Delta Lake are all I need in life.
Iowa
Pinned Repositories
data-engineering-practice
Data Engineering Practice Problems
DataEngineeringProjects
Some example projects for Data Engineers to build, end-to-end.
dataEngineeringTemplate
Template for Data Engineering and Data Pipeline projects
GreatExpectationsWithDatabricks
Getting Great Expectations set up to run on Databricks with Spark DataFrames.
lakescum
A Python package that helps Databricks Unity Catalog users read and query Delta Lake tables with Polars, DuckDB, or PyArrow.
PythonVsRustAWSLambda
Testing the runtime difference between Python and Rust for AWS Lambda.
reepicheep
This is a `Rust`-based package to help with complex medicine (pill) management cycles.
sniffer
CSV and flat-file sniffer built in Rust.
tinytimmy
A simple, easy-to-use Data Quality (DQ) tool built with Python.
unitTestPySpark
How to unit test your PySpark code.
danielbeach's Repositories
danielbeach/data-engineering-practice
Data Engineering Practice Problems
danielbeach/tinytimmy
A simple, easy-to-use Data Quality (DQ) tool built with Python.
danielbeach/sniffer
CSV and flat-file sniffer built in Rust.
danielbeach/DataEngineeringProjects
Some example projects for Data Engineers to build, end-to-end.
danielbeach/reepicheep
This is a `Rust`-based package to help with complex medicine (pill) management cycles.
danielbeach/lakescum
A Python package that helps Databricks Unity Catalog users read and query Delta Lake tables with Polars, DuckDB, or PyArrow.
danielbeach/PythonVsRustAWSLambda
Testing the runtime difference between Python and Rust for AWS Lambda.
danielbeach/RustForDataPipelines
Testing whether Rust can be used for a normal Data Engineering pipeline.
danielbeach/polarsVpandasOnAwsLambda
Using Polars and Pandas on AWS Lambda to process data.
danielbeach/polars-DeltaLake
Trying out the Polars DataFrame library with Delta Lake ... feat. Python.
danielbeach/PolarsVsPySpark
Can Polars crunch 27 GB of data faster than PySpark?
danielbeach/DuckdbAndDeltaLake
Learning how to query a remote Delta Lake on S3 with DuckDB.
danielbeach/RustOnApacheAirflow
The ultimate Data Engineering Chadstack. Apache Airflow running Rust. Bring it.
danielbeach/fine-tune-openLLaMA
This repo shows how to fine-tune the OpenLLaMA (7B) model on a GPU.
danielbeach/rustAsyncExample
A quick example of using Rust to do async HTTP requests/downloads.
danielbeach/datafusion-sql-cli
Playing around and making ETL tools with DataFusion's CLI SQL tool.
danielbeach/graphRS
Building a Network/Graph from scratch, and understanding it with Rust.
danielbeach/PolarsDateTimeManipulation
Polars date and time manipulation
danielbeach/puddleglum
Rust-based package for answering questions about S3 buckets and files.
danielbeach/DataEngineeringWithFortran
Trying to use Fortran to write a data pipeline
danielbeach/DSAforTheRestOfUs
Introduction to DSA (Data Structures and Algorithms) with Rust.
danielbeach/pyarrow-v-duckdb-v-polars
Compare PyArrow, DuckDB, and Polars for writing data pipelines.
danielbeach/RayonWithRustVsPython
Trying out Rayon with Rust vs. Python thread and process pools.
danielbeach/scrounger
A `Rust`-based Python package offering a faster alternative to `vulture` for finding dead and unused code in Python repositories.
danielbeach/solaSearch
Project to store, relate, and make publicly available various ancient texts.
danielbeach/sparklepop
SparklePop is a simple Python package designed to check the free disk space of an AWS RDS instance.
danielbeach/TheBearVsTheDuck
Compare DuckDB vs. Polars for data pipelines.
danielbeach/csvRustIteration
Learning how to download, unpack, and read CSV files in Rust.
danielbeach/learning-daft
Trying out Daft for DataFrames.
danielbeach/skein
Rust-based attempt at distributed processing, built as a learning exercise.