/andalucian-water

Water Reservoirs in Andalucia


Andalucia Water reservoir data

Overview

If you just want to use the data, it's probably easiest to work with the cleaned CSV tabular data, and then look at the analysis.ipynb notebook for examples of how this data can be used.

At a high level, this data contains the fill status of the water reservoirs in Andalucia, Spain. Andalucia is (as of 2024) going through an extended drought, with many reservoirs running low and widespread water cuts expected. This data is likely useful for understanding which reservoirs are most at risk and for predicting where the biggest issues will arise.
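As a rough illustration of working with the cleaned CSV, here is a minimal sketch that computes how full each reservoir is. The column names (`date`, `reservoir`, `capacity_hm3`, `stored_hm3`) and the sample rows are assumptions made up for this example; check the actual CSV header and the analysis.ipynb notebook for the real names.

```python
import csv
import io

# Made-up sample rows standing in for the cleaned CSV; the real file's
# column names and values may differ.
SAMPLE_CSV = """date,reservoir,capacity_hm3,stored_hm3
2024-01-15,Example A,100.0,25.0
2024-01-15,Example B,40.0,30.0
"""


def fill_percentages(csv_file):
    """Return {reservoir: percent full} for each row of the CSV."""
    result = {}
    for row in csv.DictReader(csv_file):
        capacity = float(row["capacity_hm3"])
        stored = float(row["stored_hm3"])
        result[row["reservoir"]] = 100.0 * stored / capacity
    return result


fills = fill_percentages(io.StringIO(SAMPLE_CSV))
print(fills)
```

To run this against a local copy of the real file, pass `open(path, newline="")` instead of the `io.StringIO` object and adjust the column names to match.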

Data preparation

The raw data is available from "Rediam", an environmental data platform. This provides daily data on reservoir status, but the only ways of accessing it are (1) PDF files with tables or (2) an online tool with limited download ability.

Here, I work with the PDF files. These can be downloaded in bulk, but they require a fair bit of processing to extract the data in a clean format.

To the providers of public data: Thank you, but... this data already exists in a parseable form in your system. If you could add that to the download folder, it would be easier to read and also dramatically more space efficient. But I take anything I can get!

The parse.ipynb notebook contains the code for parsing the PDF documents and turning them into a dataframe with simplified column names and some data cleaning. The resulting data can be downloaded from s3.
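To give a feel for what "simplified column names and some data cleaning" can involve, here is a small sketch using only the standard library. It is not the code from parse.ipynb: the function names, the example headers, and the assumption that the PDF tables use Spanish number formatting (dot for thousands, comma for decimals) are all hypothetical.

```python
import re
import unicodedata


def simplify_column_name(name):
    """Turn a header such as 'Capacidad (hm³)' into 'capacidad_hm3'."""
    # NFKD splits accented letters into base letter + accent and maps
    # compatibility characters such as '³' to '3'.
    text = unicodedata.normalize("NFKD", name)
    # Drop whatever is still non-ASCII (the detached accents), then lowercase.
    text = text.encode("ascii", "ignore").decode("ascii").lower()
    # Collapse every run of other characters into a single underscore.
    text = re.sub(r"[^a-z0-9]+", "_", text)
    return text.strip("_")


def parse_spanish_number(value):
    """Parse '1.234,5' as 1234.5; return None for blank or dash cells."""
    value = value.strip()
    if value in ("", "-"):
        return None
    # Assumed format: '.' separates thousands, ',' marks the decimals.
    return float(value.replace(".", "").replace(",", "."))


print(simplify_column_name("Capacidad (hm³)"))
print(parse_spanish_number("1.234,5"))
```

Normalizing headers once, right after extraction, keeps the rest of the analysis independent of how the PDF happened to label its columns.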

Data used

Raw

  1. Reservoir state: From portalrediam.cica.es. We parse the PDFs to obtain cleaned tabular data (see below).

  2. Reservoirs: From portalrediam.cica.es, available on s3

  3. Demarcaciones hidrográficas (river basin districts): From Miteco, available on s3

Cleaned

  1. Reservoir states: We provide the cleaned tabular version on s3.