palkan/action_policy-graphql

ActionPolicy does not work well with graphql-ruby dataloaders (not thread/fiber safe)

Bertg opened this issue ยท 9 comments

Bertg commented

Tell us about your environment

Ruby Version: ruby 3.2.2

Framework Version (Rails, whatever): rails (7.0.4.3), graphql (2.0.21)

Action Policy Version: (0.6.5)

Action Policy GraphQL Version: (0.5.3)

What did you do?

We started using dataloader in our policies to help reduce the amount of queries.

We pass the graphql context as part of the policy context, which allows us to use our dataloaders in the database.

What did you expect to happen?

We expected everything to keep working as it did, but with less or equal amount of queries.

What actually happened?

We started getting completely wrong results when using expose_authorization_rules.

During our investigation we monkey patched the allowance_to method like this:

def allowance_to(rule, record = :__undef__, **options)
  policy = lookup_authorization_policy(record, **options)

  policy.apply(authorization_rule_for(policy, rule))
  policy.result.tap { |a| puts "policy_result: #{rule} #{a.inspect}" }
end

We got following (sample) output.

policy_result: publish? <PublicGraph::Policies::FolderNodePolicy#publish?: false>
policy_result: move_into? <PublicGraph::Policies::FolderNodePolicy#move_into?: false>
policy_result: create_item? <PublicGraph::Policies::FolderNodePolicy#move_into?: false>
policy_result: create_item_in? <PublicGraph::Policies::FolderNodePolicy#move_into?: false>
policy_result: create_folder_in? <PublicGraph::Policies::FolderNodePolicy#move_into?: false>

As you can see a discrepancy on the rule is exposed. The result that is returned is not that of the rule requested.

In order to understand what is happening it is important to know that resolving of these policies have (somewhere in the stack) a call to dataloader. Dataloader uses fibers to do its "magic", basically stopping execution of functions as long as it can.

This seems to be incompatible with the (default on) ActionPolicy::Behaviours::Memoized and ActionPolicy::Behaviours::ThreadMemoized. When calling the apply method (in allowance_to) the result is stored in the instance. When the thread is yielded (by the fiber scheduler) other threads can replace this result with an other call to apply.

As I see it this implementation can not be considered thread safe, especially in its default configuration.

Thanks for the report!

Just to confirm that it relates to the ThreadMemoized, can you please try to turn it off and see if it helps:

ActionPolicy::PerThreadCache.enabled = false

(Though, I think, Memoized may cause the same problems, depending on whether the same authorization behaviour is reused by loaders)

Bertg commented

@palkan We have, through monkey patching disabled ActionPolicy::Behaviours::Memoized and the issue resolved itself. In test mode the ThreadMemoised is not active anyway (in default setup) so it does not affect the outcome.

According to us the sequence of events is something like this:

The graph tries to return an authorisation rule `create_item' on object `FolderNode#foo`
  ActionPolicy `allowance_to` is invoked and caches the policy object with cache_key `FolderNode/foo`
    ActionPolicy invokes the `apply` method with `create_item` on the policy object
      Dataloader yields the execution back to the Fiber scheduler
The graph tries to return an authorisation rule `move_into' on object `FolderNode#foo`
  ActionPolicy is invoked and finds the policy object with cache_key `FolderNode/foo`
    ActionPolicy invokes the `apply` method with `move_into ` on the policy object
      Dataloader yields the execution back to the Fiber scheduler
The graph has no other fields to run, and yields back to dataloader
Dataloader loads its data
`allowance_to` for `create_item` can continue after the `apply` method, and returns the cached policy object
๐Ÿ‘† here is the issue. The policy object result is now `move_into` as that was the last apply that was called
`allowance_to` for `move_into` can continue after the `apply` method, and returns the cached policy object

Thanks for the details! That will help with reproduction in tests.

Also, I think it's worth extracting the minimal behaviour implementation into a separate module, so it's possible to opt-out from extensions (such as memoization) without monkey-patching.

Bertg commented

Hi @palkan I was wondering if you had any idea on how to proceed with this. I saw you started a branch some time ago, but it hasn't moved much (publicly) since then.

I was looking into the code, and from where I'm sitting, it looks like some major overhaul of the internal implementation is required. I'd love to help where I could , but as it looks like a big change, some guidance on how you'd like to do this would be needed.

npupko commented

@Bertg Could you add some context, please.
Is it causes problems with expose_authorization_rules only or any usage of dataloader inside the policy will cause this bug?
I am currently adding policies to the application and the approach I use for now is:

class RenewalPolicy < BasePolicy
  authorize :dataloader, optional: true

  def update?
    scopes = dataloader.with(Sources::ScopesForUser).load(user.id)
    global_scopes = scopes.include?("...")
    # CODE
  end
end
Bertg commented

We have only discovered it in expose_authorization_rules. When actually checking the policies, the scheduler behaves in a sequential way, so the issue would not happen there. (Although, I guess it could in some other situations).

The issue here is how graphql-ruby resolves the values when returning the graph results, combined with the not parameterised memorisation that action_policy does.

To be clear: In the following bot of code, from action_policy the @result value is memorised, and not "scoped" to the rule argument. Subsequent methods relying on @result are thus not thread safe.

      # Returns a result of applying the specified rule (true of false).
      # Unlike simply calling a predicate rule (`policy.manage?`),
      # `apply` also calls pre-checks.
      def apply(rule)
        @result = self.class.result_class.new(self.class, rule)

        catch :policy_fulfilled do
          result.load __apply__(resolve_rule(rule))
        end

        result.value
      end

I mentioned before, that this is exactly what happens during the response generation phase in graphql-ruby. Several fibres call the apply rule on the same policy (and thus the @result value keeps changing). allowance_to then just returns the last @result instance set.

So, in short ActionPolicy:: Policy#apply has internal caching/memoisation/state that makes ActionPolicy::Behaviour#allowance_to (last 2 lines) not thread safe.

palkan commented

@Bertg Thanks a lot for the details.

Sorry, hadn't had enough time to proceed with this issue: still don't have a reproduction setup :(

it looks like some major overhaul of the internal implementation is required

That's true. I plan to start working on v1.0 release soon, and this particular refactoring is on my list.