Big Data


Module 1 - What is Big Data?

Characteristics of Big Data
What are the V’s of Big Data?
The Impact of Big Data

Module 2 - Big Data - Beyond the Hype

Big Data Examples
Sources of Big Data
Big Data Adoption

Module 3 - Big Data and Data Science

The Big Data Platform
Big Data and Data Science
Skills for Data Scientists
The Data Science Process

Module 4 - Big Data Use Cases

Big Data Exploration
The Enhanced 360 View of a Customer
Security and Intelligence
Operations Analysis

Module 5 - Processing Big Data

Ecosystems of Big Data
The Hadoop Framework


What You’ll Learn

Through narrated lectures, recorded demonstrations, and hands-on exercises, you will learn how to:
  • Use Apache Spark to run data science and machine learning workflows at scale
  • Use Spark SQL and DataFrames to work with structured data
  • Use MLlib, Spark’s machine learning library
  • Use PySpark, Spark’s Python API
  • Use sparklyr, a dplyr-compatible R interface to Spark
  • Use Cloudera Data Science Workbench (CDSW)
  • Use other Cloudera platform components, including HDFS, Hive, Impala, and Hue