/Movies-ETL

The Movies Extract-Transform-Load (ETL) Analysis repo contains movie data extracted from Wikipedia and Kaggle in CSV and JSON file formats. In the transform phase the datasets were cleaned and merged, and the cleaned data was then loaded into a movie_data SQL database. Regular expressions (regex), which match strings of characters defined by search patterns, played a critical role in cleaning the box office, budget, release date, and running time data. Lambda functions (anonymous functions) were also used in the transform phase.
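As a rough illustration of how regex and a lambda function might work together when cleaning a box office column, here is a minimal sketch. The patterns, the `parse_dollars` helper, and the sample values are assumptions made up for this example; they are not taken from the repo's notebooks, which may handle more formats.

```python
import re

# Hypothetical patterns; the repo's actual patterns may differ.
# Form 1: "$1.5 million" or "$2 billion"
FORM_ONE = re.compile(r"\$\s*(\d+(?:\.\d+)?)\s*(million|billion)", re.IGNORECASE)
# Form 2: "$1,234,567"
FORM_TWO = re.compile(r"\$\s*(\d{1,3}(?:,\d{3})+)")

MULTIPLIERS = {"million": 10**6, "billion": 10**9}


def parse_dollars(text):
    """Convert a dollar string to a float, or return None if no pattern matches."""
    if not isinstance(text, str):
        return None
    match = FORM_ONE.search(text)
    if match:
        return float(match.group(1)) * MULTIPLIERS[match.group(2).lower()]
    match = FORM_TWO.search(text)
    if match:
        return float(match.group(1).replace(",", ""))
    return None


# Made-up sample values; scraped fields sometimes arrive as lists of fragments.
raw_box_office = ["$1.5 million", "$1,234,567", ["$2", "billion"], "unknown", None]

# A lambda (anonymous function) joins list fragments into a single string
# and leaves every other value untouched.
joined = list(map(lambda v: " ".join(v) if isinstance(v, list) else v, raw_box_office))

cleaned = [parse_dollars(value) for value in joined]
```

Values that match neither pattern come back as `None`, so they can be reviewed or dropped before the load step rather than silently coerced.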

Primary Language: Jupyter Notebook
