Big Data
Syllabus
Module 1 - What is Big Data?
Characteristics of Big Data
What are the V’s of Big Data?
The Impact of Big Data
Module 2 - Big Data - Beyond the Hype
Big Data Examples
Sources of Big Data
Big Data Adoption
Module 3 - Big Data and Data Science
The Big Data Platform
Big Data and Data Science
Skills for Data Scientists
The Data Science Process
Module 4 - Big Data Use Cases
Big Data Exploration
The Enhanced 360 View of a Customer
Security and Intelligence
Operations Analysis
Module 5 - Processing Big Data
Ecosystems of Big Data
The Hadoop Framework
What You’ll Learn
Through narrated lecture, recorded demonstrations, and hands-on exercises, you will learn how to:
- Use Apache Spark to run data science and machine learning workflows at scale
- Use Spark SQL and DataFrames to work with structured data
- Use MLlib, Spark’s machine learning library
- Use PySpark, Spark’s Python API
- Use sparklyr, a dplyr-compatible R interface to Spark
- Use Cloudera Data Science Workbench (CDSW)
- Use other Cloudera platform components, including HDFS, Hive, Impala, and Hue