
[Course-2020-2023] taught at Duke MIDS. This is also a Coursera course that covers MLOps, ML Engineering, and the foundations of Cloud Computing for Data Science.


Data Analysis at Scale in the Cloud

Course taught at Duke MIDS, Spring 2020-2022 by Noah Gift.

Guest Lecture 2022-Async

GPT 3:

Prequel Material

These resources could be helpful before starting this course.

Duke/Coursera: Foundations of Data Engineering Course (Launching early 2022)

Course1: Python and Pandas for Data Engineering

Course2: Linux and Bash for Data Engineering

GitHub Repos for Projects in Course

Week 1: Using Linux
Week 2: Using Bash
Week 3: Building Bash Scripts
Week 4: Composing File and Data Management Solutions with Linux

Course3: Python and SQL for Data Engineering

Course4: Building Data Engineering Solutions with Python for Web Applications, Command-Line Tools and Notebooks

Sequel Material

These resources could be helpful after starting this course.

Duke/Coursera: Applied Data Engineering Course (Launching late 2022)

GitHub Repos Referenced in Duke Coursera Course

Course 1: Cloud Computing Foundations

Course 2: Cloud Computing Building Blocks

Lecture Topics:

Getting Started: [Week 1]

Cloud Computing Foundations: [Week 2]

Virtualization and Containers: [Week 3 & Week 4]

Challenges and Opportunities in Distributed Computing: [Week 5 & Week 6]

Cloud Storage: [Week 7 & Week 8]

Serverless: [Week 9 & Week 10]

MLOps, Big Data and Edge Computer Vision: [Week 11 & Week 12 & Week 13]

General

Student Example Projects

A practical guide to Data Science, Machine Learning Engineering and Data Engineering

Read Cloud Computing for Data Book

Free book: Developing-on-AWS-with-CSharp

Next Steps: Take Coursera MLOps Course


Text and Code License

The text and code content of notebooks and documents is released under the CC-BY-NC-ND license.