/gp-magic-query

Demo and training of Greenplum multi-modal analytics

Primary language: Jupyter Notebook. License: MIT.

This is a demo of advanced analytics queries that can be run in Pivotal Greenplum.

What is the desired outcome?

Teach advanced analytics techniques using a single data set.

Motivation

Greenplum is a multimodal database that supports advanced analytics such as text, geospatial, graph, and time series analytics. The platform can combine all of these types of analytical queries on a single data set to gain insights from the data. In this session we use a single real-world data set and show how to start with basic SQL queries and Python user-defined functions, then advance to text search, text analytics, geospatial analytics, graph analytics, time series analysis, and machine learning with Python. All of these queries run in place on a single data set in a single database. Each analytics discipline is covered from basic principles with simple code examples and output. Users can reproduce the data set and the queries after the session to practice on their own in Greenplum.

Course Slides

The Google Slides for the course are found here.

Course Flow

Follow the steps below to go through the course:

  1. Deploy Greenplum Env
  2. Load Twitter Data
  3. Learn to Connect to GPDB with a Python Client
  4. Write Server Side Python User Defined Functions
  5. Text Search
  6. Time Series
  7. Geospatial
  8. Graph
  9. Putting it all Together
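
As a rough preview of steps 3 and 4, the sketch below shows one way a Python client might register a server-side Python user-defined function and then call it from SQL. It is not taken from the course notebooks: the `tweets` table, its `body` column, the `word_count` function, and the helper names `build_udf_sql` and `run_demo` are all hypothetical, and it assumes the `psycopg2` driver is installed and a PL/Python language (`plpython3u` or `plpythonu`, depending on the Greenplum version) is enabled in the database.

```python
# Minimal sketch, not the course's own code. All table, column, and
# function names here are hypothetical placeholders.


def build_udf_sql(language="plpython3u"):
    """Return DDL for a hypothetical server-side word_count function.

    Pass language="plpythonu" for installations that only ship the
    Python 2 procedural language.
    """
    return (
        "CREATE OR REPLACE FUNCTION word_count(body text)\n"
        "RETURNS integer AS $$\n"
        "    # This body executes inside the database, once per row.\n"
        "    return len(body.split()) if body else 0\n"
        "$$ LANGUAGE " + language + ";"
    )


# Hypothetical query: distribution of tweet lengths measured in words.
QUERY_SQL = (
    "SELECT word_count(body) AS n_words, count(*) AS n_tweets "
    "FROM tweets GROUP BY 1 ORDER BY 1;"
)


def run_demo(dsn, language="plpython3u"):
    """Connect, create the UDF, run the query, and return the rows.

    `dsn` is a libpq connection string such as
    "host=localhost dbname=demo user=gpadmin" (values are assumptions).
    """
    import psycopg2  # imported lazily so the SQL builders need no driver

    conn = psycopg2.connect(dsn)
    try:
        with conn.cursor() as cur:
            cur.execute(build_udf_sql(language))
            cur.execute(QUERY_SQL)
            rows = cur.fetchall()
        conn.commit()
        return rows
    finally:
        conn.close()
```

Calling `run_demo("host=localhost dbname=demo user=gpadmin")` would need a running Greenplum instance with a matching table. The connection values are placeholders, and the course steps above describe the actual environment and data load.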