Advanced SQL Material Prep

VM Setup Guide: https://docs.google.com/document/d/1lqg1ISPt8mezwkp_yapuwRHbmotTTP4sRr9wIR6qZBo/

This repo contains the material for generating data for the Advanced SQL class.

Generating one week of data takes approximately:

  • ~60 minutes
  • 120 MB disk
  • ? MB of memory
  • 100% CPU

Building one year of data takes approximately:

  • 22 hours
  • 6 GB disk
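
As a rough sanity check on the disk figures, the per-week number can be extrapolated linearly. The sketch below is a hypothetical helper, not part of this repo; the function name `estimate_disk_mb` and the variable `MB_PER_WEEK` are made up for illustration, and it assumes disk usage grows linearly with the number of weeks generated.

```shell
#!/bin/sh
# Hypothetical helper (not part of this repo): linear disk estimate
# based on the ~120 MB per week figure quoted above.
MB_PER_WEEK=120

estimate_disk_mb() {
    # $1 = number of weeks of data to generate
    echo $(( $1 * MB_PER_WEEK ))
}

estimate_disk_mb 1    # one week
estimate_disk_mb 52   # one year
```

For 52 weeks this gives 6240 MB, which lines up with the roughly 6 GB quoted for one year. Build time does not follow the same rule: 52 weeks at 60 minutes each would be 52 hours, while the measured figure is 22 hours, so use the measured times rather than extrapolating.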

To Build For Class

make build

# To push to Docker Hub...
make push

Then use the VM Setup Guide linked above and the config.sh script on the VM to prepare the machine for class.

Resources

Links:

Books

  • Database Internals: https://www.amazon.com/Database-Internals-Deep-Distributed-Systems/dp/1492040347
  • Python Data Science Handbook: https://jakevdp.github.io/PythonDataScienceHandbook/
  • Designing Data-Intensive Applications: https://www.amazon.com/Designing-Data-Intensive-Applications-Reliable-Maintainable/dp/1449373321
  • Operating Systems: Three Easy Pieces (Part 3): https://pages.cs.wisc.edu/~remzi/OSTEP/