LCA data sets for machine learning

This repository features data sets that can be used to learn Life Cycle Assessment (LCA) models.

Currently, the only data set available is one for learning an LCA model of cars. It is generated with the open-source model carculator.

Usage

The data for the carculator model is available as zipped .csv files in the /data folder and as NumPy arrays in the data_numpy folder. It can be loaded with the Python scripts in the scripts folder, which also contains the script used to generate the data set.

Loading the data

Data can be loaded with the load_data function in load_data.py, which reads the .npz files in the data_numpy folder:

data = load_data(path_to_data, num_files=10)
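To check which arrays a given .npz file holds before loading, NumPy's np.load can list them. The following is a minimal sketch: it writes a temporary stand-in file with made-up array names, since the actual keys stored in the data_numpy files are not documented here.

```python
import os
import tempfile

import numpy as np


def list_npz_arrays(path):
    """Return {array_name: shape} for every array stored in an .npz file."""
    with np.load(path) as archive:
        return {name: archive[name].shape for name in archive.files}


# Stand-in file with hypothetical array names; replace example_path with the
# path to a real file from the data_numpy folder.
tmp_dir = tempfile.mkdtemp()
example_path = os.path.join(tmp_dir, "example.npz")
np.savez(example_path, X=np.zeros((3, 5)), y=np.zeros((5, 2)))

shapes = list_npz_arrays(example_path)
print(shapes)
```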

The data will be loaded into a dictionary with the following structure:

{
  parameter_names: [] list containing the parameter names (dim N)
  nominal_parameters: [] list containing a 1 for nominal parameters and a 0 for numerical parameters
  attribution_names: [] list containing the attribution names (dim D)
  impact_category_names: [] list containing the impact categories
  X: np.array of size NxM containing the parameter values
  y: { impact_category: np.array of size MxD } containing the impact results for the corresponding impact category
}
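The sketch below shows one way to turn such a dictionary into sample-wise arrays for a learning library. It does not call load_data; instead it builds a small synthetic dictionary that follows the documented layout, assuming the keys are plain strings and the shapes are exactly as listed (X is N x M, each y entry is M x D). All parameter, attribution and category names in it are made up.

```python
import numpy as np

# Synthetic stand-in for the dictionary returned by load_data.
# N parameters, M samples, D attributions; all names and values are invented.
N, M, D = 3, 4, 2
data = {
    "parameter_names": ["param_a", "param_b", "param_c"],
    "nominal_parameters": [1, 0, 0],
    "attribution_names": ["attr_a", "attr_b"],
    "impact_category_names": ["category_a"],
    "X": np.arange(N * M, dtype=float).reshape(N, M),
    "y": {"category_a": np.ones((M, D))},
}


def to_samples_first(data, impact_category):
    """Return (features, targets) with one row per sample."""
    # X is documented as N x M (parameters x samples); transpose to M x N.
    features = np.asarray(data["X"]).T
    targets = np.asarray(data["y"][impact_category])
    if features.shape[0] != targets.shape[0]:
        raise ValueError("X and y disagree on the number of samples")
    return features, targets


features, targets = to_samples_first(data, "category_a")

# Split the columns into nominal and numerical parameters using the 0/1 flags.
nominal_mask = np.array(data["nominal_parameters"], dtype=bool)
X_nominal = features[:, nominal_mask]
X_numeric = features[:, ~nominal_mask]

print(features.shape, targets.shape, X_nominal.shape, X_numeric.shape)
```

If the arrays in the real data set turn out to be stored samples-first already, the transpose should be dropped; the shape check will flag a mismatch either way.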

Generating data

To generate data, install the carculator module with pip install carculator. The data is then generated using the LCAModel class:

import configparser

# LCAModel is provided by the data generation script in the scripts folder.
car_config = configparser.ConfigParser()
car_config.read("scripts/carculator_config.ini")
lca_model = LCAModel(car_config)
lca_model.generate_data(min_data_set_size)

Configuration options are available in the scripts/carculator_config.ini file.
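One detail worth knowing when working with the .ini file: configparser's read() silently skips files it cannot find and returns the list of files it actually parsed. The sketch below wraps the call so that a wrong path fails loudly, and dumps all sections and options for inspection. The helper names, the section and the option in it are hypothetical stand-ins, not the repository's actual configuration.

```python
import configparser
import os
import tempfile


def read_config(path):
    """Read an .ini file and fail if the file could not be parsed."""
    config = configparser.ConfigParser()
    # read() returns the list of files successfully read; empty means failure.
    if not config.read(path):
        raise FileNotFoundError(f"could not read config file: {path}")
    return config


def config_as_dict(config):
    """Return {section: {option: value}} with all values as strings."""
    return {section: dict(config[section]) for section in config.sections()}


# Hypothetical stand-in file; point ini_path at the repository's .ini file instead.
ini_path = os.path.join(tempfile.mkdtemp(), "example_config.ini")
with open(ini_path, "w") as handle:
    handle.write("[example_section]\nexample_option = 42\n")

options = config_as_dict(read_config(ini_path))
print(options)
```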
