bitemyapp/bloodhound

ES 6.4.2

vlatkoB opened this issue · 9 comments

Hi,
I adapted bloodhound V5 as V6 to work with ES 6.4.2. The fork is at https://bitbucket.org/vlatkoB/bloodhound (hg, not git).

This is from the current changelog:

  • Removed MappingName; "_type" is hardcoded to "_doc", as an index can have only one mapping type
  • Removed "_all" from tests, because it does not exist any more
  • Added ParentChildJoin relation for use in mapping and indexing parents/child docs
    • Removed parent arg from some functions
    • Added ParentIdQuery
  • TemplateQueryInline renamed to TemplateQuery and adapted to V6
    • QueryTemplateQuery removed from Query
    • Added searchByTemplate function
  • NodeStats and NodeInfo adapted to new fields
    • Added ThreadPoolSearch to ThreadPoolType
  • IndexTemplate supports list of TemplatePatterns, but only one mapping
  • Field type "string" changed to "text"
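
For readers unfamiliar with the ES 6 request shapes behind these changelog items, here is a minimal sketch of an index-template body as raw JSON, built in Python. It is not bloodhound's API: the helper name `index_template`, the patterns, and the `message` field are assumptions made up for illustration. It shows the single `_doc` mapping type, a list of index patterns, and a `text` field.

```python
import json


def index_template(patterns, properties):
    """Build an ES 6.x index-template body (hypothetical helper, not bloodhound)."""
    return {
        # ES 6 takes a list of patterns here.
        "index_patterns": list(patterns),
        # Only one mapping type per index; "_doc" is the conventional name.
        "mappings": {"_doc": {"properties": properties}},
    }


# Illustrative values only: "text" is used where older mappings said "string".
body = index_template(["tweets-*", "blogs-*"], {"message": {"type": "text"}})
print(json.dumps(body, indent=2))
```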

Considering I started to read more deeply about ES less than a week ago and am using it on a very basic level, I'd appreciate it if someone who uses bloodhound/ES for some "real" tasks could give it a try.

All tests pass, but it is not completely done; I still have to read more on ES and its breaking changes...

vlatko

This looks promising, thank you! /cc @MichaelXavier where are you at ES version-wise in production?

It may be worth mentioning the biggest breaking design changes in ES (that I'm aware of so far):

  1. an index can only have 1 mapping type
  2. /twitter/{user,blog}/.. should now be /twitteruser/.. and /twitterblog/..
  3. no more "normal" parent/child relationship (as in, parent and child having different fields). Both parent and child must be in one index, and you differentiate between them by a special join field.
    Basically, it means both parent and child have the same fields, so user <> blogs is not practical any more.
  4. HEAD doesDocumentExist cannot check the parent/child relation
  5. _all field does not exist any more (but a solution exists by using "copy_to" .= "my_all")
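
To make points 3 and 5 concrete, here is a rough sketch of the raw ES 6 request bodies involved, built as Python dicts. It is not bloodhound code. The field names `join_field`, `name`, `title` and `my_all` and the `user`/`blog` relation are assumptions chosen to match the example above.

```python
import json

JOIN_FIELD = "join_field"  # hypothetical field name


def mapping():
    """One index, one "_doc" type, holding both parent (user) and child (blog)."""
    return {"mappings": {"_doc": {"properties": {
        # copy_to gathers values into a custom catch-all field, replacing "_all".
        "name": {"type": "text", "copy_to": "my_all"},
        "title": {"type": "text", "copy_to": "my_all"},
        "my_all": {"type": "text"},
        # The join field declares the relation: user is the parent of blog.
        JOIN_FIELD: {"type": "join", "relations": {"user": "blog"}},
    }}}}


def parent_doc(name):
    return {"name": name, JOIN_FIELD: {"name": "user"}}


def child_doc(title, parent_id):
    # A child must also be indexed with routing set to the parent's id.
    return {"title": title, JOIN_FIELD: {"name": "blog", "parent": parent_id}}


def parent_id_query(parent_id):
    """Find the blog documents that belong to one user."""
    return {"query": {"parent_id": {"type": "blog", "id": parent_id}}}


print(json.dumps(mapping(), indent=2))
print(json.dumps(child_doc("First post", "1")))
print(json.dumps(parent_id_query("1")))
```

Both document kinds share the same `properties`, which is why separate user and blog field sets no longer fit naturally.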

As you may expect, I'm still on ESV1 because I never have a block of time open long enough to build a new cluster. Don't mind me though, I'll take responsibility for V1 for as long as I'm still using it.

I would like to try the library, but I have a 6.4 version. Should I use the fork, or is this going to be merged into master soonish? Cheers,

FYI, I squashed and moved the fork to GitHub. The repo on BitBucket is deprecated.

@bitemyapp Are you still actively maintaining this library? I would be willing to help, but the lack of reaction on this thread makes me wonder whether it would be safer to go with another Elasticsearch client (Python or otherwise).

@PierreR I'm still here, I believe Michael is too. At present adding a new major (the 6 in 6.4.2) Elasticsearch version means duplicating a fair chunk of the source tree and requires a maintainer.

One problem is that @vlatkoB didn't follow the existing pattern of devoting a major sub-directory to the new major version, cf. https://github.com/vlatkoB/bloodhound/tree/master/src/Database

I am not confident in the mutual compatibility of a "V5" module for V5 and V6 of Elasticsearch without more cleverness or dirty tricks than I have historically tolerated in this library. It's meant to reflect, as simply as possible, the underlying structure of Elasticsearch's API and query DSL.

I appreciate @vlatkoB's efforts toward V6 compatibility but it's not actionable for me at this time for two major reasons:

  1. the 6.x.x support isn't distinct from the Database.V5 modules. It is possible this could be solved by copying it and calling that V6 but it's really more complicated than that. Ideally, @vlatkoB would rebase his V6 compatibility patches onto the current V5 sub-tree.

  2. Currently the approximate division of responsibility is that @MichaelXavier keeps V1 working and I try to keep V5 working. It's not clear to me that @vlatkoB is willing to take on maintainership of a new V6 sub-tree. Tracking Elasticsearch's API breakages isn't much fun, so I wouldn't blame him if he wasn't willing to sign up for that.

Between those two major issues, the non-trivial amount of work required, and the lack of demand for an explicitly compatible V6 sub-tree, I'm not in a rush to try to merge something somebody didn't bother to put up as a pull request to begin with.

Further, my level of activity on this library is partly reflected by whether an employer or client needs me to fix or add something in the library. I've successfully avoided needing to use Elasticsearch of late, so I happily have not needed to make changes to the library for my own sake. Any work I've done in the last year or two has been 100% for the sake of the library's users. Nobody's offered to pay me to do more work on the library, so I give it the amount of time that best pleases me, same as almost every other open source maintainer.

@PierreR what are you offering to help with?

@bitemyapp I did follow the existing pattern with a V6 directory, but it seems I messed something up when moving the fork from BitBucket to GitHub. It lost the V6 sub-directory completely. I was in a rush and didn't check. :-(
V6 is still visible on https://bitbucket.org/vlatkoB/bloodhound. I'll try to find some time to fix that on the GitHub repo soon.

As for maintaining it, I'm not sure I can commit to that because I'm not actively using ES. I did V6 only for some internal testing I needed.

EDIT: GitHub fork fixed. Sorry it took some time.

@bitemyapp Thanks for your detailed and informative reply. What I could do to help is act as a tester for the V6 version (if a PR is merged at some point). I might help further and propose some PR fixes for V6, but that really depends on my usage of the library. At this point, knowing there isn't any serious user for V6, I am not so sure I am going to dive deeper. Cheers!