This project applies Spark SQL to determine key metrics about home sales data, using Spark to create temporary views, partition the data, cache and uncache a temporary table, and verify that the table has been uncached.
This repository contains code that analyzes a home sales dataset with Apache Spark SQL. It includes the following functionality:
- Reading a CSV file from an AWS S3 bucket into a DataFrame.
- Creating temporary views of the DataFrame.
- Executing SQL queries to analyze the dataset.
- Caching and uncaching tables for performance optimization.
- Writing and querying partitioned, Parquet-formatted data.