pelias/schema

Elasticsearch 6.3+ support (FYI)


Note: I know Pelias only supports ES v2.0, but while hacking at the schema to get the portland-metro project to load on ES 6.3 (I would have used 6.4, but AWS ES doesn't have it yet), I listed the things that broke for me in case they're useful to the development team. If these are already known, feel free to close this issue.

To make this work, I had to set the apiVersion parameter to '6.3' in the Elasticsearch Client constructor.
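
As a rough sketch of that change, assuming the legacy elasticsearch npm client (which accepts an apiVersion option in its constructor), the options could be built like this. The buildClientOptions helper and the localhost:9200 host are illustrative assumptions, not code from the Pelias repositories.

```javascript
// Hypothetical helper: builds the options object for the Elasticsearch client.
// Only `host` and `apiVersion` are set; everything else is left at defaults.
function buildClientOptions(host, apiVersion) {
  return { host: host, apiVersion: apiVersion };
}

const clientOptions = buildClientOptions('localhost:9200', '6.3');

// With the `elasticsearch` package installed, the options would be passed as:
//   const elasticsearch = require('elasticsearch');
//   const client = new elasticsearch.Client(clientOptions);
console.log(clientOptions);
```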

This was an attempt to just load the index definition and not actually support the Pelias code. I'm sure many changes would have to be made to different projects to support these changes (e.g., query changes to support custom types). Additionally, this was just an attempt to get node scripts/create_index.js to pass without necessarily replacing/fixing functionality.

Synonym Files

The foo => bar syntax in directionals.txt and full_token_address_suffix_expansion.txt fails for the directional rules. I had to comment out all lines like these:

southwest => sw
southeast => se
northwest => nw
northeast => ne
north => n
south => s
east => e
west => w

The syntax works fine in the custom_*.txt synonym files and for the other synonyms in the aforementioned files.

Example error:

{
  "error": {
    "root_cause": [
      {
        "type": "illegal_argument_exception",
        "reason": "failed to build synonyms"
      }
    ],
    "type": "illegal_argument_exception",
    "reason": "failed to build synonyms",
    "caused_by": {
      "type": "parse_exception",
      "reason": "Invalid synonym rule at line 97",
      "caused_by": {
        "type": "illegal_argument_exception",
        "reason": "term: n analyzed to a token (north) with position increment != 1 (got: 0)"
      }
    }
  },
  "status": 400
}
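
The workaround described above (commenting out the failing directional rules) could be scripted roughly as follows. This is a minimal sketch: commentOutRules is a hypothetical helper, and it assumes the synonym files use the Solr-style format in which lines starting with # are comments.

```javascript
// Hypothetical helper: prefixes `lhs => rhs` rules with '# ' when the
// left-hand side is one of the given terms. Other lines pass through as-is.
function commentOutRules(text, terms) {
  const blocked = new Set(terms);
  return text
    .split('\n')
    .map((line) => {
      // Capture the left-hand side of an explicit mapping rule.
      // Lines that already start with '#' never match.
      const match = /^\s*([^#=]+?)\s*=>/.exec(line);
      if (match && blocked.has(match[1].trim())) {
        return '# ' + line;
      }
      return line;
    })
    .join('\n');
}

const directionals = [
  'southwest', 'southeast', 'northwest', 'northeast',
  'north', 'south', 'east', 'west'
];

const sample = 'north => n\nsouth => s\nsome_other_rule => x';
console.log(commentOutRules(sample, directionals));
```

The file contents would be read and written back with fs.readFileSync and fs.writeFileSync; that part is left out to keep the sketch free of file paths.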

"type": "string" → "type": "text"

Replaced in all mappings/partial/*.json files.

Example error:

{
  "error": {
    "root_cause": [
      {
        "type": "mapper_parsing_exception",
        "reason": "No handler for type [string] declared on field [continent]"
      }
    ],
    "type": "mapper_parsing_exception",
    "reason": "Failed to parse mapping [_default_]: No handler for type [string] declared on field [continent]",
    "caused_by": {
      "type": "mapper_parsing_exception",
      "reason": "No handler for type [string] declared on field [continent]"
    }
  },
  "status": 400
}
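
Instead of editing each partial by hand, the replacement could be applied to a loaded mapping object. The sketch below uses a hypothetical stringToText helper and a made-up sample mapping. It blindly maps every string type to text, as described above; note that Elasticsearch 5 and later split string into text and keyword, so fields that were not analyzed may want keyword instead.

```javascript
// Hypothetical helper: returns a deep copy of a mapping in which every
// `"type": "string"` entry has become `"type": "text"`.
function stringToText(node) {
  if (Array.isArray(node)) {
    return node.map(stringToText);
  }
  if (node === null || typeof node !== 'object') {
    return node;
  }
  const out = {};
  for (const [key, value] of Object.entries(node)) {
    if (key === 'type' && value === 'string') {
      out[key] = 'text';
    } else {
      out[key] = stringToText(value);
    }
  }
  return out;
}

// Illustrative mapping fragment, not copied from the Pelias partials.
const before = {
  properties: {
    continent: { type: 'string' },
    center_point: { type: 'geo_point' }
  }
};

console.log(JSON.stringify(stringToText(before), null, 2));
```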

"store": "yes" → "store": "true"

Replaced in all mappings/partial/*.json files and document.js's address_parts.properties[].type

Example error:

{
  "error": {
    "root_cause": [
      {
        "type": "mapper_parsing_exception",
        "reason": "Failed to parse mapping [_default_]: Could not convert [continent.store] to boolean"
      }
    ],
    "type": "mapper_parsing_exception",
    "reason": "Failed to parse mapping [_default_]: Could not convert [continent.store] to boolean",
    "caused_by": {
      "type": "illegal_argument_exception",
      "reason": "Could not convert [continent.store] to boolean",
      "caused_by": {
        "type": "illegal_argument_exception",
        "reason": "Failed to parse value [yes] as only [true] or [false] are allowed."
      }
    }
  },
  "status": 400
}

"index": "no" → "index": "false"

Replaced in mappings/partial/boundingbox.json

Example error:

{
  "error": {
    "root_cause": [
      {
        "type": "mapper_parsing_exception",
        "reason": "Failed to parse mapping [_default_]: Could not convert [bounding_box.index] to boolean"
      }
    ],
    "type": "mapper_parsing_exception",
    "reason": "Failed to parse mapping [_default_]: Could not convert [bounding_box.index] to boolean",
    "caused_by": {
      "type": "illegal_argument_exception",
      "reason": "Could not convert [bounding_box.index] to boolean",
      "caused_by": {
        "type": "illegal_argument_exception",
        "reason": "Failed to parse value [no] as only [true] or [false] are allowed."
      }
    }
  },
  "status": 400
}
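
The store and index fixes above follow the same pattern, so they could be handled by one pass over a mapping object. This is a sketch under assumptions: booleanizeFlags is a hypothetical helper, and it emits JSON booleans rather than the strings "true" and "false" used in the manual edits. Values such as "not_analyzed" are deliberately left untouched, since they have no boolean equivalent.

```javascript
// Mapping parameters that must be boolean in newer Elasticsearch versions.
const FLAG_KEYS = new Set(['store', 'index']);

// Legacy string values and their boolean equivalents.
const LEGACY_VALUES = new Map([
  ['yes', true],
  ['no', false],
  ['true', true],
  ['false', false]
]);

// Hypothetical helper: returns a deep copy of a mapping in which legacy
// "yes"/"no" values of `store` and `index` have become booleans.
function booleanizeFlags(node) {
  if (Array.isArray(node)) {
    return node.map(booleanizeFlags);
  }
  if (node === null || typeof node !== 'object') {
    return node;
  }
  const out = {};
  for (const [key, value] of Object.entries(node)) {
    if (FLAG_KEYS.has(key) && LEGACY_VALUES.has(value)) {
      out[key] = LEGACY_VALUES.get(value);
    } else {
      out[key] = booleanizeFlags(value);
    }
  }
  return out;
}

// Illustrative mapping fragment, not copied from the Pelias partials.
const legacy = {
  properties: {
    continent: { type: 'text', store: 'yes' },
    bounding_box: { type: 'text', index: 'no' }
  }
};

console.log(JSON.stringify(booleanizeFlags(legacy), null, 2));
```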

geo_point type

centroid.js uses parameters that are no longer supported by ES (or at least there isn't an obvious translation). I commented them out and 🤞.

  /* `lat_lon` enabled since both the geo distance and bounding box filters can either be executed using in memory checks, or using the indexed lat lon values */
  'lat_lon': true,

  /* store geohashes (with prefixes) in order to facilitate the geohash_cell filter */
  'geohash': true,
  'geohash_prefix': true,
  'geohash_precision': 18

Example error:

{
  "error": {
    "root_cause": [
      {
        "type": "mapper_parsing_exception",
        "reason": "Mapping definition for [center_point] has unsupported parameters:  [geohash : true] [geohash_precision : 18] [lat_lon : true] [geohash_prefix : true]"
      }
    ],
    "type": "mapper_parsing_exception",
    "reason": "Failed to parse mapping [_default_]: Mapping definition for [center_point] has unsupported parameters:  [geohash : true] [geohash_precision : 18] [lat_lon : true] [geohash_prefix : true]",
    "caused_by": {
      "type": "mapper_parsing_exception",
      "reason": "Mapping definition for [center_point] has unsupported parameters:  [geohash : true] [geohash_precision : 18] [lat_lon : true] [geohash_prefix : true]"
    }
  },
  "status": 400
}
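
Dropping the four rejected parameters could also be done programmatically. The sketch below only removes the keys named in the error message; stripLegacyGeoParams is a hypothetical helper, and nothing here restores the behaviour those options used to provide.

```javascript
// The parameters listed as unsupported in the error above.
const LEGACY_GEO_PARAMS = [
  'lat_lon',
  'geohash',
  'geohash_prefix',
  'geohash_precision'
];

// Hypothetical helper: returns a shallow copy of a geo_point mapping
// without the legacy parameters.
function stripLegacyGeoParams(mapping) {
  const out = {};
  for (const [key, value] of Object.entries(mapping)) {
    if (!LEGACY_GEO_PARAMS.includes(key)) {
      out[key] = value;
    }
  }
  return out;
}

const centerPoint = {
  type: 'geo_point',
  lat_lon: true,
  geohash: true,
  geohash_prefix: true,
  geohash_precision: 18
};

console.log(JSON.stringify(stripLegacyGeoParams(centerPoint)));
// prints {"type":"geo_point"}
```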

Multiple types

Multiple types per index are no longer supported (per the breaking changes), and I'm not sure how to get around that.

Just to get the index to create, I commented out all types after _default_: doc, in schema.js.

Example error:

{
  "error": {
    "root_cause": [
      {
        "type": "illegal_argument_exception",
        "reason": "Rejecting mapping update to [pelias] as the final mapping would have more than 1 type: [venue, country, address, dependency, locality, county, borough, macroregion, localadmin, macrocounty, street, neighbourhood, postalcode, region]"
      }
    ],
    "type": "illegal_argument_exception",
    "reason": "Rejecting mapping update to [pelias] as the final mapping would have more than 1 type: [venue, country, address, dependency, locality, county, borough, macroregion, localadmin, macrocounty, street, neighbourhood, postalcode, region]"
  },
  "status": 400
}
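
The stopgap above (keeping only the _default_ mapping) amounts to filtering the mappings object before the index is created. The following is only a sketch of that stopgap, not a real migration to a single-type index; keepTypes and the sample mappings object are assumptions made for illustration.

```javascript
// Hypothetical helper: returns a copy of `mappings` containing only the
// type names listed in `keep`.
function keepTypes(mappings, keep) {
  const out = {};
  for (const name of keep) {
    if (Object.prototype.hasOwnProperty.call(mappings, name)) {
      out[name] = mappings[name];
    }
  }
  return out;
}

// Stand-in for the shared document mapping used by every type.
const doc = { properties: {} };

// Illustrative subset of the types named in the error above.
const mappings = {
  _default_: doc,
  venue: doc,
  country: doc,
  address: doc
};

console.log(Object.keys(keepTypes(mappings, ['_default_'])));
```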

Hey @mklaber,
Thanks much for this report. It will really help with pelias/pelias#461 and pelias/pelias#719

The synonyms failure is the most interesting, as we weren't aware of that. We've made some changes to our synonyms code fairly recently so that might have done it.

We at least knew of most of the other issues, but it's really great to get confirmation that there are solutions. For the multiple types in particular we have pull requests pelias/model#95 and #293 which will help when the time comes. We'll probably merge those once we support Elasticsearch 5.

One question for you: were you able to get the pelias/schema repository to the point where you could run the unit or integration tests? (they can be run with npm test and npm run integration respectively, the latter requires an Elasticsearch host available at localhost:9200). In particular the integration tests are very helpful in ensuring we're setting up everything properly in Elasticsearch, so having those close to passing would be a great step forward in our journey towards ES5/6.

@orangejulius sorry for the slow response. No, wasn't able to get npm test to pass but I also didn't fix tests as I made changes described above (though, in hindsight, should have). Didn't even try npm run integration.

Hey @mklaber,
I've now catalogued all your reports here over in pelias/pelias#719. We've already fixed several of them :)

So I'm going to close this issue in favor of that one. Thanks again for the report, if you do notice anything that works well, or poorly, please let us know over in that issue!

☮️

This was a great analysis BTW, thanks very much for contributing it.

As of yesterday, we are supporting ES@5.6, which is the first step to being able to support 6.3 :)

Happy to help!