Spark_lessons_in_Scala

How to manipulate data with Spark in Scala

Here is a summary of the Spark exercises in Scala, by chapter:

- Chapter 02: SparkSession, creating a DataFrame from a range and keeping the values divisible by 2, reading CSV, physical plans, temp views, sorting by count(destinations), max, renaming columns, order by...

- Chapter 05: Basic Structured Operations: schemas, columns and expressions, records and rows, DataFrame transformations

- Chapter 06: Working with different types of data: Spark types, Booleans, numbers, dates and timestamps, nulls in data, user-defined functions, JSON...

- Chapter 07: Aggregations: functions, grouping, window functions, grouping sets, user-defined aggregation functions

- Chapter 08: Joins: inner, outer, left outer, right outer, left semi, left anti, natural, and cross (Cartesian) joins; challenges when using joins and how Spark performs joins

- Chapter 09: Data sources (CSV, JSON, Parquet, ORC, SQL databases, text files)
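
To give a feel for the Chapter 02 topics, here is a minimal sketch, not the code from the exercises themselves. The object name `Chapter02Sketch`, the helper names, the column names `dest` and `count`, and the in-memory sample rows are all assumptions made for illustration. It assumes Spark SQL is on the classpath and runs in local mode.

```scala
import org.apache.spark.sql.{DataFrame, SparkSession}
import org.apache.spark.sql.functions.{col, desc}

// Hypothetical helper object; names are illustrative only.
object Chapter02Sketch {

  // DataFrame of the numbers 0 until n that are divisible by 2.
  def evenNumbers(spark: SparkSession, n: Long): DataFrame =
    spark.range(n).toDF("number").where(col("number") % 2 === 0)

  // Total flights per destination, aggregate column renamed, largest first.
  // Assumes the input has a string column "dest" and a numeric column "count".
  def flightsPerDestination(flights: DataFrame): DataFrame =
    flights
      .groupBy("dest")
      .sum("count")
      .withColumnRenamed("sum(count)", "total")
      .sort(desc("total"))

  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .appName("chapter02-sketch")
      .master("local[*]")
      .getOrCreate()
    import spark.implicits._

    // Made-up sample rows standing in for a CSV file. With a real file one
    // would read it instead, for example (the path is a placeholder):
    //   spark.read.option("header", "true").option("inferSchema", "true")
    //     .csv("path/to/flights.csv")
    val flights = Seq(
      ("United States", 15L),
      ("Canada", 3L),
      ("United States", 5L)
    ).toDF("dest", "count")

    evenNumbers(spark, 10).show()

    val perDestination = flightsPerDestination(flights)
    perDestination.explain() // prints the physical plan
    perDestination.show()

    // The same data queried through a temp view with SQL.
    flights.createOrReplaceTempView("flights")
    spark.sql("SELECT max(count) AS max_count FROM flights").show()

    spark.stop()
  }
}
```

Keeping the transformations in small functions that take and return a `DataFrame` makes them easy to call from `spark-shell` or from a unit test with a local session.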