Green-Software-Foundation/carbon-aware-sdk

[Feature Contribution]: Multiple data source support

Closed this issue · 7 comments

What happened?

I proposed this feature in #343 (comment).

Carbon Aware SDK currently supports just one data source for each data type, as follows:

{
  "DataSources": {
    "EmissionsDataSource": "ElectricityMaps",
    "ForecastDataSource": "WattTime",

Granularity and covered area vary between data sources. Some SDK users want to use WattTime as their first priority, but they might want to use ElectricityMaps when they need emissions data for an area that WattTime does not cover.

So I'd like to propose multiple data source support, as follows:

{
  "DataSources": {
    "EmissionsDataSources": ["WattTime", "ElectricityMaps"],
    "ForecastDataSources": ["JSON", "WattTime", "ElectricityMaps"],

In the above case, both EmissionsDataSources and ForecastDataSources are arrays. A lower index means a higher priority. For example, in EmissionsDataSources, WattTime has the highest priority, but if WattTime has no emissions data for the given location, the SDK would fall back to ElectricityMaps.
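The priority-and-fallback behavior described above can be sketched roughly as follows. This is only an illustration in Python, not the SDK's implementation: the `first_available` helper, the lookup signature, and the coverage tables with their location names and values are all hypothetical and made up for the example.

```python
from typing import Callable, List, Optional, Tuple

# Hypothetical lookup signature: takes a location name and returns a value,
# or None when the data source does not cover that location.
Lookup = Callable[[str], Optional[float]]


def first_available(location: str, sources: List[Tuple[str, Lookup]]) -> Tuple[str, float]:
    """Try each source in list order (lower index = higher priority) and
    return (source_name, value) from the first one that has data."""
    for name, lookup in sources:
        value = lookup(location)
        if value is not None:
            return name, value
    raise LookupError(f"no configured data source covers {location!r}")


# Made-up coverage tables, for illustration only (not real emissions data).
watttime = {"eastus": 420.0}
electricitymaps = {"eastus": 390.0, "japaneast": 480.0}

# Mirrors "EmissionsDataSources": ["WattTime", "ElectricityMaps"]
emissions_sources = [
    ("WattTime", watttime.get),
    ("ElectricityMaps", electricitymaps.get),
]

print(first_available("eastus", emissions_sources))     # served by WattTime
print(first_available("japaneast", emissions_sources))  # falls back to ElectricityMaps
```

Returning the source name together with the value keeps it visible which data source actually answered the request.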

Code of Conduct

  • I agree to follow this project's Code of Conduct

Feature Commitment

  • I commit to contributing this feature as a PR and working with the GSF to merge this feature into the Carbon Aware SDK.

@YaSuenag This seems worthy of an ADR.

@bderusha Can we merge this proposal into ADR-0006? They are strongly related, and I guess this proposal would make a big change to ADR-0006, so I think it makes sense to work on them together.

This seems like it is more related to ADR-0008. Either way, happy to collaborate.

My personal preference is to create new ADRs and add a reference to existing ADRs impacted by the new one.

For example:

  • Create ADR-0016: Multiple Data Source Support
  • Update ADR-0008 status to "superseded by ADR-0016"

Unless @vaughanknight or @Willmish has a different preference for this project.

From call #349: Data from WattTime and data from ElectricityMaps differ (one uses marginal emissions, the other uses average), so values might not be comparable between locations that are served by different data sources.

So the important thing to point out here is: how do we ensure users are aware of where the data comes from, and that it might be different or not comparable between locations if it comes from different data sources (marginal vs. average)?

  • Suggestion from @YaSuenag: add the data source to the request body
  • @vaughanknight: this would also require a /datasources endpoint (similar to /locations)
  • How do we look at the two different data types?
  • @danuw: suggested investigating coverage maps of data sources (which countries are unique to specific data sources vs. which ones overlap)
  • @danuw: Should we look into more data sources?
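One way to make the origin of each value visible, as discussed in the points above, is to tag every result with its data source and signal type. The following Python sketch is hypothetical: the `EmissionsResult` type, its field names, and both helpers are assumptions for illustration and not part of the SDK. Which source is marginal and which is average is also an assumption here, since the call notes only say that one uses marginal and the other uses average.

```python
from dataclasses import dataclass
from typing import Iterable, List


@dataclass(frozen=True)
class EmissionsResult:
    """Hypothetical result record that carries its provenance."""
    location: str
    rating: float
    data_source: str
    signal_type: str  # "marginal" or "average"


def comparable(a: EmissionsResult, b: EmissionsResult) -> bool:
    """Two results are treated as comparable only if they use the same signal type."""
    return a.signal_type == b.signal_type


def list_data_sources(results: Iterable[EmissionsResult]) -> List[str]:
    """Distinct data sources behind a set of results, e.g. for a /datasources-style listing."""
    return sorted({r.data_source for r in results})


# Made-up values; the marginal/average assignment below is an assumption.
a = EmissionsResult("eastus", 420.0, "WattTime", "marginal")
b = EmissionsResult("japaneast", 480.0, "ElectricityMaps", "average")

print(comparable(a, b))           # False: marginal vs. average
print(list_data_sources([a, b]))  # ['ElectricityMaps', 'WattTime']
```

With provenance attached to each result, a caller can warn or refuse when asked to rank locations whose values come from different signal types.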

Update from meeting #355: not critical for 1.1, rather a feature for 1.2. Open to discussion.

This issue has not had any activity in 120 days. Please review this issue and ensure it is still relevant. If no more activity is detected on this issue for the next 20 days, it will be closed automatically.

This issue has not had any activity for too long. If you believe this issue has been closed in error, please contact an administrator to re-open it, or, if absolutely relevant and necessary, create a new issue referencing this one.