PyPSA/pypsa-eur

Tutorial example Sector-Coupled does not run through: KeyError: "None of [Index(['diameter_mm', 'max_cap_M_m3_per_d'], dtype='object')] are in the [columns]"

fhg-isi opened this issue · 1 comments

Checklist

  • [x] I am using the current master branch or the latest release. Please indicate.
  • [x] I am running on an up-to-date pypsa-eur environment. Update via conda env update -f envs/environment.yaml.

Describe the Bug

I followed the sector-coupled tutorial at

https://pypsa-eur.readthedocs.io/en/latest/tutorial_sector.html#

and ran

snakemake -call all --configfile config/test/config.overnight.yaml

The run fails in rule build_gas_network with the KeyError shown under "Error Message". Maybe the format of data/gas_network/scigrid-gas/data/IGGIELGN_PipeSegments.geojson changed?
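
One way to test this guess is to look at the Python type of the values in the method column after loading. The sketch below uses small hand-made Series in place of the real file. The helper name value_types and the sample values are made up for illustration and are not part of pypsa-eur.

```python
import pandas as pd


def value_types(column: pd.Series) -> dict:
    """Count how many values of each Python type a column holds."""
    return column.map(lambda v: type(v).__name__).value_counts().to_dict()


# Hypothetical stand-ins for gpd.read_file(fn)["method"]; the values are invented.
as_strings = pd.Series(['{"diameter_mm": "raw"}', '{"diameter_mm": "Median"}'])
as_dicts = pd.Series([{"diameter_mm": "raw"}, {"diameter_mm": "Median"}])

print(value_types(as_strings))  # only 'str': nested fields arrive as JSON text
print(value_types(as_dicts))    # only 'dict': nested fields arrive already parsed
```

If the real column reports only str, the nested fields are coming through as JSON text and need parsing before they can be expanded into columns.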

Error Message

localrule build_powerplants:
    input: resources/test/networks/base.nc, data/custom_powerplants.csv
    output: resources/test/powerplants.csv
    log: logs/test/build_powerplants.log
    jobid: 24
    reason: Input files updated by another job: resources/test/networks/base.nc
    resources: tmpdir=/tmp, mem_mb=5000, mem_mib=4769

ERROR:root:Uncaught exception
Traceback (most recent call last):
  File "/home/projekt-resilient03/pypsa-eur/.snakemake/scripts/tmpvirewhbh.build_gas_network.py", line 154, in <module>
    gas_network = load_dataset(snakemake.input.gas_network)
                  ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/projekt-resilient03/pypsa-eur/.snakemake/scripts/tmpvirewhbh.build_gas_network.py", line 62, in load_dataset
    method = df.method.apply(pd.Series)[["diameter_mm", "max_cap_M_m3_per_d"]]
             ~~~~~~~~~~~~~~~~~~~~~~~~~~^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/projekt-resilient03/conda/envs/pypsa-eur/lib/python3.11/site-packages/pandas/core/frame.py", line 4108, in __getitem__
    indexer = self.columns._get_indexer_strict(key, "columns")[1]
              ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/projekt-resilient03/conda/envs/pypsa-eur/lib/python3.11/site-packages/pandas/core/indexes/base.py", line 6200, in _get_indexer_strict
    self._raise_if_missing(keyarr, indexer, axis_name)
  File "/home/projekt-resilient03/conda/envs/pypsa-eur/lib/python3.11/site-packages/pandas/core/indexes/base.py", line 6249, in _raise_if_missing
    raise KeyError(f"None of [{key}] are in the [{axis_name}]")
KeyError: "None of [Index(['diameter_mm', 'max_cap_M_m3_per_d'], dtype='object')] are in the [columns]"
RuleException:
CalledProcessError in file /home/projekt-resilient03/pypsa-eur/rules/build_sector.smk, line 91:
Command 'set -euo pipefail;  /home/projekt-resilient03/conda/envs/pypsa-eur/bin/python3.11 /home/projekt-resilient03/pypsa-eur/.snakemake/scripts/tmpvirewhbh.build_gas_network.py' returned non-zero exit status 1.
[Thu Jun 27 14:26:09 2024]
Error in rule build_gas_network:
    jobid: 15
    input: data/gas_network/scigrid-gas/data/IGGIELGN_PipeSegments.geojson
    output: resources/test/gas_network.csv
    log: logs/test/build_gas_network.log (check log file(s) for error details)
    conda-env: /home/projekt-resilient03/pypsa-eur/.snakemake/conda/afe33fb9496f9501dfc3a366a00c7f3c_

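Maybe adapt the load_dataset function in build_gas_network.py as shown below, or fix the structure of the data. The change assumes that the param and method columns now contain JSON strings rather than dicts, so each value is parsed with json.loads before it is expanded into columns.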

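import json

import geopandas as gpd
import pandas as pd


def load_dataset(fn):
    df = gpd.read_file(fn)
    # Parse the JSON strings in param and method into one column per key.
    param = df.param.apply(json_string_to_series)
    method_series = df.method.apply(json_string_to_series)
    method = method_series[["diameter_mm", "max_cap_M_m3_per_d"]]
    # Mark these columns as coming from the method field.
    method.columns = method.columns + "_method"
    df = pd.concat([df, param, method], axis=1)
    # Drop the nested source columns that are present in the file.
    to_drop = ["param", "uncertainty", "method", "tags"]
    to_drop = df.columns.intersection(to_drop)
    df.drop(to_drop, axis=1, inplace=True)
    return df


def json_string_to_series(json_string):
    # Decode the JSON text to a dict, then wrap it in a Series
    # so that apply() turns each key into a column.
    data_dictionary = json.loads(json_string)
    series = pd.Series(data_dictionary)
    return series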
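
To illustrate why the original line fails and why parsing first helps, here is a minimal sketch using pandas only. It assumes the method values are JSON strings. The two sample values are invented for the demonstration and are not taken from the SciGRID_gas file.

```python
import json

import pandas as pd


def json_string_to_series(json_string):
    # Same helper as proposed above.
    return pd.Series(json.loads(json_string))


# Hypothetical sample of the "method" column, with values as JSON strings.
method = pd.Series(
    [
        '{"diameter_mm": "raw", "max_cap_M_m3_per_d": "Lasso"}',
        '{"diameter_mm": "Median", "max_cap_M_m3_per_d": "raw"}',
    ]
)
wanted = ["diameter_mm", "max_cap_M_m3_per_d"]

# Original approach: pd.Series("...") wraps each string in a one-element
# Series, so the expanded frame has no column named after the JSON keys.
expanded_raw = method.apply(pd.Series)
try:
    expanded_raw[wanted]
    raised = False
except KeyError:
    raised = True

# Proposed approach: decode the JSON first, then expand the keys into columns.
expanded = method.apply(json_string_to_series)[wanted]

print("KeyError with apply(pd.Series):", raised)
print(expanded)
```

If the values were already dicts, as the original code seems to have expected, apply(pd.Series) would expand them directly and json.loads would raise a TypeError instead, so the helper only fits the string case.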