edouarda/brigand

folds are somewhat slow

ldionne opened this issue · 19 comments

While investigating the cause of timeouts on Metabench, I came across these:

fold_left-gcc49

fold_right-gcc49

It turns out that they are about as slow on GCC 6 too, so this is not only an issue with GCC 4.9. I would suggest fixing this, or I may have to prune Brigand's folds from metaben.ch (or reduce their sample size).

Do you have an idea of what is causing this?

No, I don't. I haven't looked at your implementation of folds.

Ok thanks for the feedback we will look into it.

Is @odinthenerd (or someone else) still planning on changing brigand to use some the aliased-based TMP he's developed in Kvasir? Is there general interest in that change?

I think it's still planned but let's discuss it here.

yes its still on my radar but I have no exact time frame, I'm quite busy now. If someone else wants to give it a try I would be glad to mentor the effort.

Okay, if I recall correctly from your C++Now talk you've got a little one around now, congratulations :) Compilation time and Brigand documentation is something my advisor has actually asked about a few times so I can allocate time to working on this. It would probably make sense to write the documentation as the code is changed rather than going back through all of it after. I'll go over @odinthenerd's blog posts again, then look through Kvasir.

@nilsdeppe that's really great if you can help. It seems that most major contributors are caught up in life events that slow down Brigand development. :-)

I concur 😣
My point is however that a lot of the documentation can probably be written now as the interface won't change much in some corners.

the only thing standing in the way of kvasir::mpl's fold being pretty much copied and pasted is brigand lambdas. In general we need to make a decision whether we want context dependent lambdas (like in boost) or jit-able lambdas (using pin). What I mean by jit-able is that we walk the tree of composed functions which make up the lambda turning each one into a continuation once and every subsequent use will find this memoized continuation version and use it. Since the number of lambda invocations is typically orders of magnitude higher than the number of unique lambda signatures this should be much faster.

I think a good first step would be to make a "continuation from brigand lambda" factory which takes any brigand lambda and turns it into something callable as a continuation. We could do some simple pattern matching at this point and change trivial cases to something fast. The speed of non trivial cases will continue to be bad unless we dump the context dependent support but we could just wrap what we habe for now without much effort.

Its kind of a question of who we are catering to:

  • those using trivial lambdas probably don't care
  • those migrating from boost.mpl probably want compatibility
  • if someone wants speed they will be using continuation style lambdas any way

Alright, so I'll get CMake setup for both Doxygen and Standardese so we can support both as much as possible. I agree, the interface should see little change.

I think we should aim for speed with Brigand since there's Boost.hana for friendliness. While that would still break boost.mpl support, I would guess people are willing to put in some effort to change the code if it means reduced compile time. Since the much faster compile time also allows for more metaprogramming I wouldn't be surprised if people moving from boost.mpl would want to perform more operations at compile time. In short, I think it makes most sense to try to really push hard on the speed front, maybe having two interfaces if possible: one that's easier to use but somewhat slower, and one that's a bit harder to use but faster.

Well we're not 100% MPL compatible anyway. What matters is that the transition is clear and smooth.

Right, for which I think good documentation will be key. Maybe what we would want is to a have a transition guide explaining how the various examples in the MPL documentation would translate to Brigand.

Yes, also it would help if the differences are minor. It's already the case for many trivial tasks.

Yes, I completely agree. I think this points us down a pretty clear road

as I explained here the placeholder style gets a bit clunky with eager meta functions and explicit composition also has its limits. Whether or not continuation style lambdas are easier to understand is debatable and its easy to be biased about matters of style but one can surly argue that the fact that everything is composable with everything and everything works with everything else with no special placeholders, protect, bind, pin etc. etc. is a bit intriguing.

You do still need wrapper to turn non continuation things into continuations but since essentially everything that is eager needs some kind of wrapper then you are going to have wrappers any way. The fact that you don't need wrappers for algorithms or a growing number of traits etc. means you will probably have less wrappers in total.

Since the STL traits classes were not designed with non SFINAE friendly speed in mind I think we will be able to do much better in the kvasir::mpl equivalents of most of them. Both are wrapping the same intrinsics on modern compilers but we can create less types in this wrapping process by increasing the chances of memoization and by continuation style composition.

@odinthenerd that's really interesting. Even not having used the continuation syntax yet, I personally find it easier to read. If I've understood correctly I should read left to right with the last argument to call being the meta variable to operator on? To me this seems much more natural than the current mess I've managed to obtain in many places using placeholders... I'd be in favor of this interface, even if it means wrappers are sometimes necessary. Also, if wrappers for the STL are provided then the need for wrappers will be greatly decreased since any meta function a user such as myself would write could be written in the "correct" form from the beginning.

Yes that's correct, left to right as long as its linear, if its more of a tree it gets more complex but that goes for both cases. There are also more powerful things one can do with continuations like recursion and value based decision making within lambdas, they are essentially a touring complete sub language within TMP.

I will get into more advanced lambda stuff in my next blog post as soon as I have time to write it ;)

FYI, I'll resume dev on brigand by moving everything to C++17 to clean up old things and try to gain more speed.

I tagged the last point of the master head to be the last C++11ish version.