Pinned Repositories
advanced-git
https://udemy.com/course/git-advanced-commands/
airflow
Apache Airflow - A platform to programmatically author, schedule, and monitor workflows
BerlinHousing
Scraped listings from https://www.immobilienscout24.de/ and used the Google Maps API with folium to build an interactive map of listings across Berlin.
braze-client
A Python client for the Braze REST API
dagaster-datapipeline
dagster
A data orchestrator for machine learning, analytics, and ETL.
dagster-celery-k8s-example
An example project exploring Dagster with Celery on Kubernetes (k8s)
databricks-workflow
Example of a scalable IoT data processing pipeline setup using Databricks
datapipeline_tests
An example implementation of testing data pipelines (Airflow)
rate-my-post
Response prediction system for Stack Exchange communities. Built on AWS.
somasays's Repositories
somasays/rate-my-post
Response prediction system for Stack Exchange communities. Built on AWS.
somasays/advanced-git
https://udemy.com/course/git-advanced-commands/
somasays/airflow
Apache Airflow - A platform to programmatically author, schedule, and monitor workflows
somasays/dagaster-datapipeline
somasays/dagster
A data orchestrator for machine learning, analytics, and ETL.
somasays/dagster-celery-k8s-example
An example project exploring Dagster with Celery on Kubernetes (k8s)
somasays/databricks-workflow
Example of a scalable IoT data processing pipeline setup using Databricks
somasays/datapipeline_tests
An example implementation of testing data pipelines (Airflow)
somasays/demo-scene
👾 Scripts and samples to support Confluent Demos and Talks. ⚠️ Might be rough around the edges ;-) 👉 For automated tutorials and QA'd code, see https://github.com/confluentinc/examples/
somasays/dbrew
Brew and orchestrate your data products seamlessly into actionable deployments with DBrew, the CLI tool tailored for modern data teams.
somasays/dbt-airbnb
Learning dbt
somasays/gitignore
A collection of useful .gitignore templates
somasays/GPTs
Leaked prompts of GPTs
somasays/impf_slot_checker
somasays/Knowlytics
somasays/kotlin-test-starter
somasays/lp_migrate_to_k8s
Manning Live project - Migrate to Kubernetes
somasays/mlflow
Open source platform for the machine learning lifecycle
somasays/mlflow-example
somasays/mr.aiverson
somasays/neo4j-examples-and-tips
A somewhat curated list of Neo4j examples and tips, mostly around SDN with OGM (SDN 5.x), SDN 6 (previously SDN/RX) and testing.
somasays/nyc_311
A data pipeline for NYC 311 data.
somasays/postgres-gpt
PostgresGPT is a Python library that enables the creation of SQL queries from natural language for PostgreSQL databases.
somasays/PySpark-Boilerplate
A boilerplate for writing PySpark Jobs
somasays/schema-registry
Confluent Schema Registry for Kafka
somasays/seatunnel
SeaTunnel is a next-generation super high-performance, distributed, massive data integration tool.
somasays/slight
A data management service to define and create data products declaratively.
somasays/spark-movies-etl
Spark batch data pipeline that ingests and transforms movies data.
somasays/spark-streaming-flexing
Flexing my Spark Streaming muscles, or so I think
somasays/terraform-aws-redshift-cluster
Terraform module to provision an AWS Redshift Cluster