rubysolo/dentaku

Support for ANY/ALL to evaluate conditions for array items

vlazar opened this issue · 10 comments

Thank you for you awesome gem!
I've been using it for years without any issues.

Now I have this new use case where to avoid using external logic to prepare context for calculator first it would be great to be able to do the same things you can do with Enumerable#all? and Enumerable#any? methods in Ruby.

users = [{ a_score: 3, b_score: 7 }]
users.all? { |u| u.a_score > 1 && u.score < 10 }

First I've checked if hash works and it does:

calc = Dentaku::Calculator.new
> calc.evaluate("user.a_score > 1 AND user.b_score < 10", user: { a_score: 3, b_score: 7 })
# => true

Then I've added a custom function to evaluate condition for each array item:

> calc.add_function(:all, :boolean, ->(arr, expr) { arr.all? {|v| calc.evaluate(expr, v) } })
> calc.evaluate("ALL(users, 'a_score > 1 AND b_score < 10')", users: [{ a_score: 3, b_score: 7 }])
# => true

It also works, but I want a properly defined and more generic solution, not this quick hack.

  1. I don't want this function definition referring to calc instance and calling evaluate with each item from array as a context.
  2. I don't want to put conditions in ALL() call in quotes: ALL(users, 'a_score > 1 AND b_score < 10')
  3. I don't know what the right syntax would be for ALL function, but it shouldn't be limited to only supporting array of hashes. Array of integers/strings/etc. for example should also work somehow, so the syntaxt needs to be generic and make sense everywhere else in Dentaku.
  4. Similar to Ruby's all? conditions it should be possible to reference not only an array item in conditions but anything from context or maybe calling some a function and so on.

So the following should be also possible to describe:

users = [{ a_score: 3, b_score: 7 }]
something_other = 42
users.all? { |u| u.a_score > 1 && u.score < 10 && something_other == 42 }

scores = [3,7]
something_other = 42
scores.all? { |v| v > 2 && something_other == 42 }

Slightly related to my use case here is #208 as I would sometimes need this to work (and currently it doesn't):

> calc.evaluate('users[0].a_score', users: [{ a_score: 3, b_score: 7 }])
# => nil

As a workaround I can add FIRST() function so that:

> calc.evaluate("FIRST(users, 'a_score')", users: [{ a_score: 3, b_score: 7 }])
# => 3
  • Would it make sense for ANY/ALL functions I described to be added to Dentaku?
  • Are these functions easy enough to implement for someone like me without prior knowledge of parsers and other CS magic which makes Detaku work?

The special syntax which could work for both the array of hashes and array of simple values:

With array of simple values like scores: [3,5]. Evaluates to true when item > 1 AND item < 10 condition is true for all items in array:

ALL(scores > 1 AND scores < 10)

With array of hashes like scores: [{ a: 3 }, { a: 5}]. Evaluates to true when item.a > 1 AND item.a < 10 condition is true for all items in array:

ALL(scores.a > 1 AND scores.a < 10)

But that I imagine would complicate parsing a lot, as now the expression inside ALL() should no longer be evaluated directly, but needs to be changed first to make sense for each item in array and only then to be evaluated against each item.

Also referencing some other value from the context should also work somehow:

ALL(scores.a > 1 AND scores.a < 10 AND SOME_FN(some_unrelated_to_array_value)  == 42)

This sounds like a good addition. The syntax should probably be something like ALL(collection, item_var_name, expression) so that you will know how to refer to each item in the expression.

Calculator#store accepts a block, which will "overlay" some value(s) on the current calculator memory and then remove them after evaluation, so I think the best implementation would be to iterate over the collection, store the item with the given name, then evaluate the expression. This approach should allow implementing several Enum-type functions.

If you would like to tackle this, that would be great and I can provide support if you have questions. If not, I understand and I will plan to implement at some point, but I'm not sure how soon that would be.

I would love to tackle this. I have some questions though to help me understand if I'll be able to do it.

I like the proposed syntax and thank you for pointing me to Calculator#store. My new implementation is much better with it already:

calc = Dentaku::Calculator.new

Dentaku::AST::Function.register(:all, :logical, ->(collection, attr, expr) {
  collection.all? do |item|
    calc.store(Hash[attr, item]) { calc.evaluate(expr) }
  end
})

calc.evaluate(
  "ALL(users, 'user', 'user.a_score > 1 AND user.b_score < 10')",
  users: [{ a_score: 3, b_score: 7 }]
)
# => true

Now the questions are:

  1. In function definition I'm still using a reference to calc instantiated outside of a block. Functions seems to be designed to be pure and hence have no access to current calculator instance. Should ALL/ANY and other functions be treated as special case and current calculator instance be injected somehow for them? If yes, then where it could be done? If not, should I just create a new calculator instance inside a function definition? Wouldn't this implementation be too expensive?
  2. I assume you meant the syntax should be ALL(users, user, user.a_score > 1 AND user.b_score < 10), and not the one from my example above ALL(users, 'user', 'user.a_score > 1 AND user.b_score < 10')? So my second question is what should be done to make such syntax work? If you can point me to a similar example on how to avoid '' there then I think I can make everything work.

Bad news / good news. The bad news is that (as you pointed out) regarding function access to the calculator -- currently there is no access, so it will take a bit of a refactor to make that work. The good news is that the parser already handles the syntax (without quotes).

Better news: This turned out to be fairly straightforward! Do you think #220 will work for your use case?

@rubysolo Oh my! Thank you so much! I'll test it with all possible use cases soon and will let you know. But I think everything would work just fine.

@rubysolo I've checked #220 with all my current and future use cases and it is working perfectly for me! 🎉

It handles simple cases as shown in specs in #220 :

calculator.evaluate!('ANY(users, u, u.age > 33)', users: [
  {name: "Bob",  age: 44 },
  {name: "Jane", age: 27 }
])

It handles more complex expressions (added AND u.gender = 'female' to the expression):

calculator.evaluate!('ANY(users, u, u.age > 33 AND u.gender = 'female'")', users: [
  {name: "Bob",  age: 44, gender: "male" },
  {name: "Jane", age: 27, gender: "female" }
])

It also allows to nest ALL/ANY (this is a must of course for proper implementation), which was something not possible with my workaround implementation and quoted syntax above.

calculator.evaluate!('ANY(users, u, u.age > 33 AND u.gender = 'female' AND ANY(u.hobbies, h, h = 'guitar')")', users: [
  {name: "Bob",  age: 44, gender: "male", hobbies: ["guitar", "baseball"] },
  {name: "Jane", age: 27, gender: "female", ["drawing", "singing"] }
])

I have other functions in mind, once you ship #220 I would be happy to contribute these, if you'll find them generally useful to be included in Dentaku itself.

FIrst function I have in mind is MAP. My use case is this:

  1. I have a list of expressions which is a feature for power users with expressions entered in plaintext. This list of expressions is there to describe and transform an incoming API data and define useful calculations to be used in much simpler UI based expression builder for all users.
  2. UI expression builder basically allows to use expressions defined in previous advanced feature as inputs. The basic building block on UI is 3 text inputs in single line: [input] [operator] [value]. It allows to build simple conditions like user_age > 33, group them and join with AND/OR.

Now in addition to input with single values like user_age there is a requirement to be able to build conditions for multiple values coming from API.

Let's say there is an API request with this data:

{
  "users": {
    { "name": "Bob", "age": 44 },
    { "name": "Jane", "age": 27 }
  }
}

So at level 1 there will be an expression defined like MAP(users, u, u.age) with a name like all_users_age which would return an array [44, 27] when evaluated by Dentaku.

And on the UI level, inputs which are arrays are handled a bit different from single value, meaning you still have the same UI with 3 text inputs to define condition, but instead of being able to select only user_age for the first text input ([input]) there will be 2 separate pseudo items for ANY/ALL functions. So you'll be able to configure things like ALL(all_users_age, age, age > 33) using the same UI you'll use to build conditions for single values like user_age > 33.

For this to work I'm now using an improved MAP function based on your ANY/ALL implementation in #220:

module Dentaku
  module AST
    class Map < Function
      def self.min_param_count
        3
      end

      def self.max_param_count
        3
      end

      def deferred_args
        [1, 2]
      end

      def value(context = {})
        collection      = @args[0].value(context)
        item_identifier = @args[1].identifier
        expression      = @args[2]

        collection.map do |item_value|
          expression.value(
            context.update(
              FlatHash.from_hash(item_identifier => item_value)
            )
          )
        end
      end
    end
  end
end

Dentaku::AST::Function.register_class(:map, Dentaku::AST::Map)

Do you think this function can be included in Dentaku too or do you feel like this use case is too specific?

If you think it should be included in Dentaku I'll submit a PR with tests.

Another function I have in mind is a PLUCK, just to be able to do PLUCK(users, age) instead of MAP(users, u, u.age).

The implementation is as follows, but it has a bug: it only works when keys for hashes in array are strings, but doesn't work when symbols are used as keys. And the ANY/ALL from #220 and MAP I've described above works with symbol keys too.

calc.evaluate("PLUCK(users, age)", users: [{ age: 44 }, { age: 27 }]) #  => nil
calc.evaluate("PLUCK(users, age)", users: [{ "age" => 44 }, { "age" => 27 }]) #  => [24, 77]

calc.evaluate("MAP(users, u, u.age)", users: [{ age: 44 }, { age: 27 }]) #  => [24, 77]
calc.evaluate("MAP(users, u, u.age)", users: [{ "age" => 44 }, { "age" => 27 }]) #  => [24, 77]
module Dentaku
  module AST
    class Pluck < Function
      def self.min_param_count
        2
      end

      def self.max_param_count
        2
      end

      def deferred_args
        [1]
      end

      def value(context = {})
        collection      = @args[0].value(context)
        property        = @args[1]

        collection.map do |item|
          property.value(
            context.update(
              FlatHash.from_hash(item)
            )
          )
        end
      end
    end
  end
end

Dentaku::AST::Function.register_class(:pluck, Dentaku::AST::Pluck)

I'm not sure yet what would be the correct fix, but what I was actually doing before was not using Dentaku evaluation and instead did this:

  collection.map do |item|
    item[attr.to_sym] || item[attr]
  end

And it seems to match current Dentaku behavior, but I'm not sure if it covers all cases and I don't want to couple to my expectations/current behavior of Dentaku which might change in the future.

Do you think this function can be included in Dentaku too or do you feel like this use case is too specific?

If you think it should be included in Dentaku I'll submit a PR with tests.

Great! I'll merge this. I think map and pluck would be useful additions -- looking forward to the PRs, and we can figure out the string/symbol key issue there.