ikmalzulkifli's Stars
josephmachado/spark_submit_airflow
Simple repo to demonstrate how to submit a spark job to EMR from Airflow
alex/what-happens-when
An attempt to answer the age-old interview question "What happens when you type google.com into your browser and press enter?"
data-engineering-community/data-engineering-wiki
The best place to learn data engineering. Built and maintained by the data engineering community.
josephmachado/python-v-sql-for-data-transform
Python or SQL for data transformation
janaom/gcp-de-project-streaming-pubsub-beam-dataflow
This project demonstrates an end-to-end solution for processing and analyzing real-time conversations data from a JSON file using GCP services and infrastructure automation, showcasing data storage, streaming, processing, and analysis at scale.
josephmachado/sde_de101_josephmachado
Sample repo for startdataengineering DE 101 free course
josephmachado/python_essentials_for_data_engineers
Code for blog at https://www.startdataengineering.com/post/python-for-de/
josephmachado/cost_effective_data_pipelines
Cost Efficient Data Pipelines with DuckDB
Rhoteh/exam-testing-engine-vumingo
Free full version of exam testing engine vumingo
AlexTheAnalyst/Power-BI
TheAlgorithms/Python
All Algorithms implemented in Python
darshilparmar/python-for-data-engineering
This repo contains all the code used in the Python for Data Engineering Course
Thevesh/analysis-election-msia
Data on Malaysian parliamentary election results + dataviz with the consolidated datasets
Thevesh/data
Data which, to the best of my knowledge, I am the first / only to collate and make freely available in a machine-readable way. I will delete files for which I discover a better previous source.
josephmachado/data
open data for blog content at https://www.startdataengineering.com/
josephmachado/soho
Minimalist Hugo theme based on Hyde
josephmachado/spark_submit_airflow-
Simple repo to demonstrate how to submit a spark job
josephmachado/unit_test_dbt
Unit test example in dbt
josephmachado/sde_superset_demo
Apache Superset Demo
josephmachado/trigger_spark_with_lambda
Simple example showing how to trigger a spark job with AWS Lambda
josephmachado/idempotent-data-pipeline
Making data pipelines idempotent
josephmachado/e2e_datapipeline_test
Example repo to create end-to-end tests for a data pipeline.
josephmachado/files
public file hosting
josephmachado/josephmachado
Profile readme
josephmachado/data_test_ci
Repository showing how to automate data testing as part of CI
josephmachado/docker-trino-cluster
Multi-node Presto cluster in Docker containers
josephmachado/dbt_development
Repo to explain development, CI/CD cycle in dbt
josephmachado/bitcoinMonitor
Near real time ETL to populate a dashboard.
josephmachado/online_store
End to end data engineering project
josephmachado/simple_dbt_project
Code for dbt tutorial