Using Elasticsearch with Enipedia

enipedia-search

What?

This is the code for the web page at http://enipedia.tudelft.nl/Elasticsearch.html. The main motivation is that data on power plants worldwide is not published in a standardized format. Data relevant to a single plant may be spread across multiple databases that do not (re)use standard identifiers. The interface was created to help with this situation by making it easy to search across multiple databases that may contain relevant data.

The search is not limited to the web page: you can also send API requests to http://enipedia.tudelft.nl/search.
Elasticsearch is used behind the scenes, so the Elasticsearch documentation can help you write your own queries.

There are a few interesting types of queries for which examples are included further below in the documentation:

  • Common Terms Query - This reduces the importance of commonly occurring terms in suggesting matches. This is also useful when searching across data in different languages, as it will automatically reduce the importance of terms like "power plant" and "kraftwerk" which may frequently appear in the data.
  • Fuzzy Like This - This is useful if you're searching for the name of a power plant that may have different spellings due to translation or the conversion of diacritical characters to ASCII characters.
  • Geographic - You can search for anything within a geographic bounding box or within a given distance from a point.

Available Databases

In the examples below, you'll see URLs such as http://enipedia.tudelft.nl/search/geo,osm,wikipedia/_search, which indicate that the geo, osm, and wikipedia databases are to be searched. You can add or remove names from this comma-separated list to search over as few or as many databases as you want. Note the addition of ?pretty=true at the end of the URL, which pretty-prints the JSON results to make them easier to read.
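
The URL pattern described above can be sketched as a small shell helper. This is only an illustration: the function name search_url is hypothetical and not part of this repository, and it assumes the endpoint layout shown in the examples (database names joined by commas, followed by /_search).

```shell
# Hypothetical helper: build the search URL for one or more databases.
# The body runs in a subshell, so changing IFS does not leak to the caller.
search_url() (
  # "$*" joins all arguments using the first character of IFS.
  IFS=,
  printf 'http://enipedia.tudelft.nl/search/%s/_search?pretty=true\n' "$*"
)

# Example (not executed here):
#   curl -H "Content-Type: application/json" -X POST -d @query.json "$(search_url geo osm)"
```

Quoting the command substitution keeps the shell from treating the ? in the URL as a glob character.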

API Call Examples

Text

Search for "Maasvlakte" using Common Terms Query

curl -H "Content-Type: application/json" -X POST -d '{
  "from": 0,
  "size": 10,
  "query": {
    "common": {
      "_all": {
        "query": "Maasvlakte",
        "cutoff_frequency": 0.001
      }
    }
  }
}' 'http://enipedia.tudelft.nl/search/geo,osm,wikipedia/_search?pretty=true'

Search for "Maasvlakte" using Fuzzy Like This query

curl -H "Content-Type: application/json" -X POST -d '{
  "from": 0,
  "size": 10,
  "query": {
    "fuzzy_like_this": {
      "like_text": "Maasvlakte"
    }
  }
}' 'http://enipedia.tudelft.nl/search/geo,osm,wikipedia/_search?pretty=true'

Search over both country and name

Search for the Fierza plant in Albania. The boost on Country prioritizes records located in Albania. Many results will be returned for Albania, so look at the score of each hit: the one correct match should score above the others.

curl -H "Content-Type: application/json" -X POST -d '{
  "query": {
    "bool": {
      "should": [
        {
          "match": {
            "Name": "Fierza"
          }
        },
        {
          "match": {
            "Country": {
              "query": "Albania",
              "boost": 2
            }
          }
        }
      ]
    }
  }
}' 'http://enipedia.tudelft.nl/search/geo/_search?pretty=true'
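
Since the score is what separates the correct match from the rest, it can help to list just the scores of the returned hits. The following is a rough sketch, not part of this repository: the name list_scores is hypothetical, and it assumes the usual Elasticsearch response layout in which ?pretty=true prints each hit's "_score" field on its own line. It will not work on non-pretty-printed output.

```shell
# Hypothetical helper: print the score of each hit, one per line, in the
# order returned. Relies on pretty-printed output (one "_score" per line).
# "max_score" is not matched because the pattern requires a quote directly
# before _score.
list_scores() {
  sed -n 's/.*"_score" *: *\([0-9.eE+-][0-9.eE+-]*\).*/\1/p'
}

# Example (not executed here): pipe the output of the curl command above
# into the helper.
#   curl ... 'http://enipedia.tudelft.nl/search/geo/_search?pretty=true' | list_scores
```

A large gap between the first score and the second suggests a clear match; similar scores suggest the results need a closer look.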

Geographic Queries

Searches can be done within a bounding box, within a distance from a point, and within a defined polygon.

Search for anything within a geographic bounding box

curl -H "Content-Type: application/json" -X POST -d '{
  "from": 0,
  "size": 10,
  "query": {
    "filtered": {
      "query": {
        "match_all": {}
      }
    }
  },
  "filter": {
    "geo_bounding_box": {
      "location": {
        "top_left": "51.9746049736781,3.9879854492187405",
        "bottom_right": "51.932286908856256,4.1253145507812405"
      }
    }
  }
}' 'http://enipedia.tudelft.nl/search/geo,osm,wikipedia/_search?pretty=true'

Search for "Maasvlakte" within a geographic bounding box

curl -H "Content-Type: application/json" -X POST -d '{
  "from": 0,
  "size": 10,
  "query": {
    "common": {
      "_all": {
        "query": "Maasvlakte",
        "cutoff_frequency": 0.001
      }
    }
  },
  "filter": {
    "geo_bounding_box": {
      "location": {
        "top_left": "51.9746049736781,3.9879854492187405",
        "bottom_right": "51.932286908856256,4.1253145507812405"
      }
    }
  }
}' 'http://enipedia.tudelft.nl/search/geo,osm,wikipedia/_search?pretty=true'
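
The corner strings in the bounding box examples above are written as "latitude,longitude", and top_left has to lie north-west of bottom_right. As a sketch of how a request body could be generated from four numbers, the hypothetical helper below (bbox_body is not part of this repository) builds a compact match_all body with the same top-level filter and rejects swapped latitudes. It assumes awk is available for the numeric comparison.

```shell
# Hypothetical helper.
# Usage: bbox_body TOP_LAT LEFT_LON BOTTOM_LAT RIGHT_LON
bbox_body() {
  # Guard against swapped corners: the top edge must be north of the bottom edge.
  if ! awk -v top="$1" -v bottom="$3" 'BEGIN { exit !(top > bottom) }'; then
    echo "bbox_body: top latitude must be greater than bottom latitude" >&2
    return 1
  fi
  # Corners use the "lat,lon" string form, as in the examples above.
  printf '{"from":0,"size":10,"query":{"match_all":{}},"filter":{"geo_bounding_box":{"location":{"top_left":"%s,%s","bottom_right":"%s,%s"}}}}\n' "$1" "$2" "$3" "$4"
}

# Example (not executed here):
#   curl -H "Content-Type: application/json" -X POST \
#     -d "$(bbox_body 51.97 3.98 51.93 4.12)" \
#     'http://enipedia.tudelft.nl/search/geo,osm,wikipedia/_search?pretty=true'
```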

Search for anything within 10 km of a specific geographic point

curl -H "Content-Type: application/json" -X POST -d '{
  "from": 0,
  "size": 10,
  "query": {
    "filtered": {
      "query": {
        "match_all": {}
      }
    }
  },
  "filter": {
    "geo_distance": {
      "distance": "10km",
      "location": {
        "lat": 52,
        "lon": 4
      }
    }
  }
}' 'http://enipedia.tudelft.nl/search/geo,osm,wikipedia/_search?pretty=true'

Search for anything within 10 km of a specific geographic point and sort results by distance

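See the Elasticsearch documentation on sorting by distance, and its note on scoring by distance (i.e. taking features other than distance into account). The sort below uses the "plane" distance type, a faster but less accurate calculation that is reasonable over short distances such as 10 km.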

curl -H "Content-Type: application/json" -X POST -d '{
  "from": 0,
  "size": 10,
  "query": {
    "filtered": {
      "query": {
        "match_all": {}
      }
    }
  },
  "filter": {
    "geo_distance": {
      "distance": "10km",
      "location": {
        "lat": 52,
        "lon": 4
      }
    }
  },
  "sort": [
    {
      "_geo_distance": {
        "location": {
          "lat": 52,
          "lon": 4
        },
        "order": "asc",
        "unit": "km",
        "distance_type": "plane"
      }
    }
  ]
}' 'http://enipedia.tudelft.nl/search/geo,osm,wikipedia/_search?pretty=true'

Find something mentioning coal within 10 km of a specific geographic point

curl -H "Content-Type: application/json" -X POST -d '{
  "from": 0,
  "size": 10,
  "query": {
    "common": {
      "_all": {
        "query": "Coal",
        "cutoff_frequency": 0.001
      }
    }
  },
  "filter": {
    "geo_distance": {
      "distance": "10km",
      "location": {
        "lat": 52,
        "lon": 4
      }
    }
  }
}' 'http://enipedia.tudelft.nl/search/geo,osm,wikipedia/_search?pretty=true'
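
Search for anything within a polygon

The introduction to this section mentions searching within a defined polygon, but none of the examples above shows it. The following is a sketch only, assuming the server accepts the Elasticsearch geo_polygon filter on the same location field used by the other geographic examples. The helper name polygon_body and the coordinates in the usage comment are made up for illustration.

```shell
# Hypothetical helper: build a request body that matches everything inside
# a polygon. Each argument is one corner, written as "lat,lon".
# The body runs in a subshell so the loop variables stay local.
polygon_body() (
  points=""
  for p in "$@"; do
    # Append the corner as a quoted JSON string, comma-separated.
    points="${points:+$points,}\"$p\""
  done
  printf '{"from":0,"size":10,"query":{"match_all":{}},"filter":{"geo_polygon":{"location":{"points":[%s]}}}}\n' "$points"
)

# Example (not executed here; illustrative triangle):
#   curl -H "Content-Type: application/json" -X POST \
#     -d "$(polygon_body 52.0,4.0 52.0,4.2 51.9,4.1)" \
#     'http://enipedia.tudelft.nl/search/geo,osm,wikipedia/_search?pretty=true'
```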