bitemyapp/bloodhound

Aggregation stuff changed in 6.x

bitemyapp opened this issue · 7 comments

I am loathe to copy the tree again so we should start thinking about what's workable for patchy DSL overrides. You can see what broke by running the test suite against the current 6.x release.

One way would be to parameterize the parts of the datatypes that use aggregations. A Proxy for the version works only if the only variation is in the JSON. If the representation is different, the hole in the type has to be for the whole aggregation repr.

Could be an excuse to fuck with Backpack too. I think this is the sort of thing that's meant to be good at? I'm not yet sure what it would net us over a type class in this case.

I do the proxy thing in katip-elasticsearch. It is quite tedious and verbose as you have to define a big typeclass of all the overlap in type and functionality, then redefine all those for each instance. I'm not sure that you'd end up with that all that less code, nor do I really know if there's a performance impact with this approach. There could be a way to do this more granularly. I could kind of see there several typeclasses describing broad groups of functionality such that say V5 and V6 could have instances but not 1. I don't know if this would work in practice though because in my estimation, most API disruption between versions comes from changes in protocols and available options rather than totally new functionality (but maybe I'm just a boring ES user 😆 )

A Backpack solution would probably look like this: an abstract internal library which imported a signature for the types that changed between versions. Another internal library for the different implementations of the signature—the definitions of the types.

The public library would instantiate the abstract internal library in two ways and re-export the modules.

Could be an excuse to fuck with Backpack too. I think this is the sort of thing that's meant to be good at? I'm not yet sure what it would net us over a type class in this case.

One possible advantage would be the following: a particular user of the library is not likely to be interested in various versions of the API at the same time, so why burden him with the task of understaning the type class that enables such flexibility? He would probably prefer having separate modules with simpler signatures.

@danidiaz

is not likely to be interested in various versions of the API at the same time

Bzzzzt. Elasticsearch versions matter big time. You've got the use-case backwards. This is about code-reuse internally, not hiding the differences in representation or the version used for the user. The user neeeeeeds to know what version of ES the client they're using expects and to ensure those versions match.

Is there any reason to use Backpack when it's about internal code reuse when instances, data types, and functions could vary in some cases but not others?

Bzzzzt. Elasticsearch versions matter big time. You've got the use-case backwards. This is about code-reuse internally, not hiding the differences in representation or the version used for the user. The user neeeeeeds to know what version of ES the client they're using expects and to ensure those versions match.

For a client that only needs to work with Version 6, it's easier choose module "V6" and ignore module "V5", than having a single module for both versions with heavier typeclass machinery and more complex signatures. I believe Backpack would make easier to have separate modules while still reusing the code that doesn't change.

@danidiaz that sounds about like what I'd end up doing with nested datatypes. I'd love to see a worked example but I'm not sure how to keep that sort of thing small.

@bytemyapp I put up a crude proof-of-concept here: https://github.com/danidiaz/bloodhound/tree/done

I used Cabal 2.2 because I'm not sure about the current level of support of Backpack in Stack.

The example defines two parallel hierarchies V5A and V5B of modules that in reality are produced by instantiating a single abstract internal library V5 with two different implementations of the Aggregation module.

Going into more detail, the internal libraries are:

  • pindeps Just to avoid repeating version bounds.

  • bloodhound-import Bloodhound.Import seems to be used across the codebase.

  • v5-internal-minus-aggregation A bunch of modules that do not depend on Aggregation and will not vary across instantiations.

  • v5-abstract A bunch of modules that depend on Aggregation, where Aggregation has been cut out and turned into a signature. So basically this is an abstract library with an Aggregation-shaped hole.

    (While creating the signature, I had to leave out a typeclass defined in Aggregation. Backpack doesn't seem to like declaring new typeclasses in sigs.)

  • v5-aggregation-impls Contains the two different implementations of Aggregation (well, actually they're not different, I didn't bother changing anything). Notice that the implementations don't know that the signature even exists, they just happen to be compatible with it.

  • The main library exposes the unchaged V1 modules, instantiates v5-abstract in two different ways and then re-rexports the generated modules.

    One limitation I found is that Haddock doesn't seem to handle reexported-modules very well, or at all. Maybe the generated modules could be re-exported from "wrapper" modules created for that purpose?

    These wrapper modules could also be useful to hold extra functions present in one version of the lib but not in the other.

Just as a heads-up, stack currently does not work well with backpack in my experience: commercialhaskell/stack#4013