This repository holds notebooks I've created to demonstrate various AWS Analytics services in a simple notebook format.

File | Description |
---|---|
jwyant_hudi.ipynb | A notebook demonstrating Hudi's capabilities on an EMR cluster. Requires a running EMR cluster; I recommend using the EMR Notebooks feature |
xml-demo.ipynb | A simple notebook demonstrating loading XML data into a Spark dataframe |
load_store_sales_to_ddb.ipynb | Example of using Apache Spark to load Parquet data from S3 into Amazon DynamoDB |
unload_ddb_to_s3.ipynb | Example of using Apache Spark to unload data from DynamoDB to Parquet in S3 |