Source code accompanying book: Data Science on the Google Cloud Platform, Valliappa Lakshmanan, O'Reilly 2017

Primary language: Jupyter Notebook. License: Apache License 2.0 (Apache-2.0).

data-science-on-gcp

Source code accompanying book:

Data Science on the Google Cloud Platform
Valliappa Lakshmanan
O'Reilly, Jan 2017

Try out the code on Google Cloud Platform

Open in Cloud Shell

Purchase book

Read online or download a PDF of the book

Buy on Amazon.com

Updates to book

Cloud computing changes quickly, so some important topics are not covered in the book. The articles below update the book in the sections stated:

| Where it goes | Link to article | Key update |
| --- | --- | --- |
| Addition to Ch 5 | How to train and predict regression and classification ML models using only SQL - using BigQuery ML | You can start experimenting with Machine Learning models much earlier, in the data exploration phase itself. |
| Replace last section of Ch 2 | Scheduling data ingest using Cloud Functions and Cloud Scheduler | A better way to do periodic data ingest: instead of using AppEngine Cron, use Cloud Scheduler to launch a Cloud Function. |
| Update Ch 9 | How to deploy Jupyter notebooks as components of a Kubeflow ML pipeline | Developing a TensorFlow model in a notebook is now best done using the Keras API, in Eager mode. You can even quickly run the notebook routinely using Kubeflow pipelines, while you extract the pieces out into containerized components for production. |
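
As a rough illustration of the Ch 2 update, here is a minimal sketch of an HTTP-triggered Cloud Function that a Cloud Scheduler job could call on a schedule. It is not the code from the book or the linked article. The function names `ingest` and `parse_ingest_request`, and the JSON fields `year` and `month`, are assumptions made up for this example, and the actual download-and-upload step is left as a placeholder.

```python
"""Sketch of an HTTP-triggered Cloud Function for periodic ingest.

Assumption: a Cloud Scheduler job POSTs a JSON body such as
{"year": 2015, "month": 3} to the function's URL. The schema and
all names here are hypothetical.
"""
import json


def parse_ingest_request(body: str) -> dict:
    """Validate the JSON body sent by the scheduler job."""
    payload = json.loads(body or "{}")
    year = int(payload["year"])
    month = int(payload["month"])
    if not 1 <= month <= 12:
        raise ValueError(f"month out of range: {month}")
    return {"year": year, "month": month}


def ingest(request):
    """Entry point for the function.

    `request` is the Flask request object that the Python runtime
    passes to HTTP functions; only `get_data` is used here.
    """
    try:
        params = parse_ingest_request(request.get_data(as_text=True))
    except (KeyError, TypeError, ValueError) as err:
        # A malformed body is reported back to the caller as HTTP 400.
        return (f"bad request: {err}", 400)
    # Placeholder: the real ingest work (download, clean, upload) would go here.
    return (f"ingest requested for {params['year']}-{params['month']:02d}", 200)
```

Keeping the validation in a plain function that takes a string makes it easy to test locally without deploying anything or constructing a real request object.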