Explanations of all the Spark SQL, RDD, DataFrame, and Dataset examples in this project are available at https://sparkbyexamples.com/. All of these examples are written in Scala and tested in our development environment.
Table of Contents (Spark Examples in Scala)
Spark RDD Examples
- Create a Spark RDD using Parallelize
- Spark – Read multiple text files into single RDD?
- Spark load CSV file into RDD
- Different ways to create Spark RDD
- Spark – How to create an empty RDD?
- Spark RDD Transformations with examples
- Spark RDD Actions with examples
- Spark Pair RDD Functions
- Spark Repartition() vs Coalesce()
- Spark Shuffle Partitions
- Spark Persistence Storage Levels
- Spark RDD Cache and Persist with Example
- Spark Broadcast Variables
- Spark Accumulators Explained
- Convert Spark RDD to DataFrame | Dataset
Spark SQL Tutorial
- Spark Create DataFrame with Examples
- Spark DataFrame withColumn
- Ways to Rename column on Spark DataFrame
- Spark – How to Drop a DataFrame/Dataset column
- Working with Spark DataFrame Where Filter
- Spark SQL “case when” and “when otherwise”
- Collect() – Retrieve data from Spark RDD/DataFrame
- Spark – How to remove duplicate rows
- How to Pivot and Unpivot a Spark DataFrame
- Spark SQL Data Types with Examples
- Spark SQL StructType & StructField with examples
- Spark schema – explained with examples
- Spark Groupby Example with DataFrame
- Spark – How to Sort DataFrame column explained
- Spark SQL Join Types with examples
- Spark DataFrame Union and UnionAll
- Spark map vs mapPartitions transformation
- Spark foreachPartition vs foreach | what to use?
- Spark DataFrame Cache and Persist Explained
- Spark SQL UDF (User Defined Functions)
- Spark SQL DataFrame Array (ArrayType) Column
- Working with Spark DataFrame Map (MapType) column
- Spark SQL – Flatten Nested Struct column
- Spark – Flatten nested array to single array column
- Spark explode array and map columns to rows
Spark SQL Functions
- Spark SQL String Functions Explained
- Spark SQL Date and Time Functions
- Spark SQL Array functions complete list
- Spark SQL Map functions – complete list
- Spark SQL Sort functions – complete list
- Spark SQL Aggregate Functions
- Spark Window Functions with Examples
Spark Data Source API
- Spark Read CSV file into DataFrame
- Spark Read and Write JSON file into DataFrame
- Spark Read and Write Apache Parquet
- Spark Read XML file using Databricks API
- Read & Write Avro files using Spark DataFrame
- Using Avro Data Files From Spark SQL 2.3.x or earlier
- Spark Read from & Write to HBase table | Example
- Create Spark DataFrame from HBase using Hortonworks
- Spark Read ORC file into DataFrame
- Spark 3.0 Read Binary File into DataFrame