Geographic Information Systems

Geographic Information Systems project

Creation of an Open Data Cube and analysis of pollutant emissions in Lombardy

Note: See the documentation for complete information; this file covers only the installation of the Open Data Cube.

Open Data Cube


Installation

Follow the Open Data Cube Documentation (Installation).

Requirements

Python v3.8.6

  • Create and activate a new virtual environment:
    python -m venv env
    .\env\Scripts\activate
  • Install the requirements (including some .whl files):
    pip install -r requirements.txt
  • Install and launch Jupyter Notebook:
    pip install jupyter
    jupyter notebook

Database setup

  • Install PostgreSQL

  • Add PostgreSQL to the system environment variables (Path):

    C:/Program Files/PostgreSQL/13/bin
    C:/Program Files/PostgreSQL/13/lib
  • Create database:

    psql -U postgres
    > CREATE DATABASE datacube;
  • Create a configuration file at ~/.datacube.conf:

    [datacube]
    db_database: datacube
    
    # A blank host will use a local socket. Specify a hostname (such as localhost) to use TCP.
    db_hostname: localhost
    
    # Credentials are optional: you might have other Postgres authentication configured.
    # The default username otherwise is the current user id.
    db_username: datacube
    db_password: 0
    

Datacube installation

Get datacube-core-develop from here.

  • Install datacube:

    cd datacube-core-develop
    python setup.py install
  • Check datacube version:

    datacube --version
  • If you run into errors with numpy, this may resolve them:

    pip install numpy==1.19.3
  • Initialize database schema:

    datacube -v system init
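
Once initialised, the database connection and schema can be verified with the standard ODC check command (an optional sanity check):

    datacube system check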

Dataset Sentinel5P (NetCDF)

We are using the dataset.nc NetCDF file, which follows the HARP conventions.

Note: The datasets used are not included in the repository due to their large size.

We need to create two .yaml files in order to define the product and add the dataset to the Open Data Cube.

Product YAML

We create a product.yaml file in order to define the product in Open Data Cube.
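
A minimal sketch of what such a product definition can look like (the product name, measurement name, units and nodata value below are illustrative assumptions, not taken from the repository):

    name: s5p_no2
    description: Sentinel-5P NO2 tropospheric column (illustrative)
    metadata_type: eo3
    metadata:
      product:
        name: s5p_no2
    measurements:
      - name: tropospheric_NO2_column_number_density
        dtype: float32
        nodata: .nan
        units: "mol/m^2"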

  • Add product product.yaml to datacube:

    datacube product add product.yaml

Dataset YAML

We are using the Dataset.ipynb Jupyter notebook.

We specify all the parameters required by the EO3 convention, extracting them from the NetCDF file.

Dataset metadata documents define critical metadata about a dataset including:

  • Available data measurements
  • Platform and sensor names
  • Geospatial extents and projection
  • Acquisition time
  • Provenance information

We analyze our Sentinel-5P dataset (dataset.nc) and create the dataset.yaml file.
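
A compact sketch of the shape of such an EO3 dataset document (the UUID, extents, grid, timestamp and measurement name below are illustrative placeholders, not values extracted from the real dataset):

    $schema: https://schemas.opendatacube.org/dataset
    id: 00000000-0000-0000-0000-000000000000
    product:
      name: s5p_no2
    crs: EPSG:4326
    geometry:
      type: Polygon
      coordinates: [[[8.5, 44.7], [8.5, 46.7], [11.5, 46.7], [11.5, 44.7], [8.5, 44.7]]]
    grids:
      default:
        shape: [400, 600]
        transform: [0.005, 0.0, 8.5, 0.0, -0.005, 46.7, 0.0, 0.0, 1.0]
    measurements:
      tropospheric_NO2_column_number_density:
        path: dataset.nc
        layer: tropospheric_NO2_column_number_density
    properties:
      datetime: 2021-01-01T12:00:00Z
      eo:platform: sentinel-5p
      eo:instrument: TROPOMI
    lineage: {}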

  • Validate dataset.yaml (the --thorough option attempts to read the data/measurements and checks that their properties match the product):

    eo3-validate "dataset.yaml"
    eo3-validate --thorough "dataset.yaml"
  • Add dataset dataset.yaml to datacube:

    datacube dataset add --auto-match dataset.yaml
  • If you run into errors with shapely, one of these may resolve them:

    pip uninstall shapely
    pip install C:/GitHub/unibg-gis/whls/Shapely-1.7.1-cp38-cp38-win_amd64.whl

    or

    from shapely import speedups
    speedups.disable()
  • If you run into errors with sqlalchemy, this may resolve them:

    pip install sqlalchemy==1.3.20

Automated YAML Script

If the data have already been prepared at the source for inclusion in an Open Data Cube project, they can be imported into the database without further steps: they already come packaged with ODC-compatible dataset and product documents and are therefore ready for immediate indexing.

If, on the other hand, the data come from external or incompatible sources, these dataset and product documents have to be generated by hand, which is what we did for the first import of the data into the Data Cube. We then created a so-called "Data Preparation Script": a Python script that reads the metadata of the input file and automatically generates the dataset and product documents required for import.

The automatic generation script is located in /odc/yaml_generator/generate_yaml.py.
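
The core of such a script can be sketched as follows, assuming HARP-style latitude/longitude coordinate variables and the illustrative product name s5p_no2 used above (variable names, the product name and the placeholder timestamp are assumptions; refer to generate_yaml.py for the real implementation):

    # Illustrative sketch only -- see /odc/yaml_generator/generate_yaml.py for the real script.
    import uuid
    import yaml
    import xarray as xr

    NC_PATH = "dataset.nc"
    VAR = "tropospheric_NO2_column_number_density"  # assumed HARP variable name

    ds = xr.open_dataset(NC_PATH)

    # Spatial extent and grid geometry derived from the coordinate variables.
    lat, lon = ds["latitude"].values, ds["longitude"].values
    lat_min, lat_max = float(lat.min()), float(lat.max())
    lon_min, lon_max = float(lon.min()), float(lon.max())
    res_lon = (lon_max - lon_min) / lon.size
    res_lat = (lat_max - lat_min) / lat.size

    # EO3 dataset document with the same structure sketched in the "Dataset YAML" section.
    doc = {
        "$schema": "https://schemas.opendatacube.org/dataset",
        "id": str(uuid.uuid4()),
        "product": {"name": "s5p_no2"},
        "crs": "EPSG:4326",
        "geometry": {
            "type": "Polygon",
            "coordinates": [[[lon_min, lat_min], [lon_min, lat_max],
                             [lon_max, lat_max], [lon_max, lat_min],
                             [lon_min, lat_min]]],
        },
        "grids": {"default": {
            "shape": [lat.size, lon.size],
            # Affine transform: pixel indices -> EPSG:4326 coordinates (north-up grid).
            "transform": [res_lon, 0.0, lon_min, 0.0, -res_lat, lat_max, 0.0, 0.0, 1.0],
        }},
        "measurements": {VAR: {"path": NC_PATH, "layer": VAR}},
        "properties": {
            # Placeholder: the real script reads the acquisition time from the NetCDF metadata.
            "datetime": "2021-01-01T12:00:00Z",
            "eo:platform": "sentinel-5p",
            "eo:instrument": "TROPOMI",
        },
        "lineage": {},
    }

    with open("dataset.yaml", "w") as f:
        yaml.safe_dump(doc, f, sort_keys=False)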

Analysis

See the full docs!
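
Once the product and the dataset have been indexed, the data can be loaded into an xarray Dataset for analysis. A minimal sketch, assuming the illustrative product name s5p_no2 and an approximate bounding box over Lombardy (both are assumptions, not taken from the notebooks):

    import datacube

    dc = datacube.Datacube(app="pollutant-analysis")

    # Load the NO2 measurements over an approximate Lombardy bounding box (WGS84 degrees).
    data = dc.load(
        product="s5p_no2",
        x=(8.5, 11.5),
        y=(44.7, 46.7),
        output_crs="EPSG:4326",
        resolution=(-0.01, 0.01),
    )
    print(data)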

Authors

Team

  • Samuele Ferri: Site, GitHub
  • Lorenzo Conti
  • Fabio Sangregorio
  • Simone Sudati


License

This project is released under the MIT License.