rails-on-services/apartment

Proposal: Connection Resetting Middleware for SET search_path

Opened this issue · 9 comments

Steps to reproduce

Use PgBouncer or an equivalent (e.g AWS RDS Proxy) in front of your database with Rails

Expected behavior

PgBouncer to be able to reuse connections efficiently

Actual behavior

Near 1-1 connections because connections become session-pinned when SET is encountered

System configuration

  • Database: postgres 16.4 + AWS RDS Proxy

  • Apartment version: latest (HEAD)

  • Apartment config (in config/initializers/apartment.rb or so):

    • use_schemas: true
  • Rails (or ActiveRecord) version: 7.0.8.6

  • Ruby version: 3.3.5

Proposed solution

class Apartment::ResetConnection
  def initialize(app)
    @app = app
  end

  def call(env)
    @app.call(env)
  ensure
    # Reset search path to allow connections to be unpinned if the SET pinned them
    Apartment.connection.execute('RESET search_path')
  end
end

Alternatively, this could be done in the postgres adapter:

  class PostgresqlAdapter < AbstractAdapter
      ....
      #   Reset schema search path to the default schema_search_path
      #
      #   @return {String} default schema search path
      #
      def reset
        @current = default_tenant
        Apartment.connection.schema_search_path = full_search_path
        Apartment.connection.execute('RESET search_path')
      end

      ....
  end

I'm not super familiar with how this works in Apartment, so thought I'd open an issue with the proposal rather than a PR. What do you think, would this be possible? Would it cause any issues?

hmmm, I'd think that a SET search path would be issued whenever a query was about to be sent through a connection, so it doesn't matter what the previous value of search path was. This is why I don't think there's been much done to issue an additional SQL statement to reset the search path on the connection. Can you tell me more about your use case? When are you specifying a tenant, i.e., the search path?

@mnovelo We set the tenant during authentication, i.e after parsing the JWT token and authenticating a user. So, I agree that it doesn't matter what the previous value of search path was, but when RDS Proxy comes across a SET command (SET search_path, SET timezone, doesn't matter). Here's some info on that: https://jpcamara.com/2023/04/12/pgbouncer-is-useful.html

Amazon has a similar tool to PgBouncer called RDS Proxy, and it has a feature called “connection pinning”. If it detects a statement that is incompatible with transaction mode, it will automatically hold that connection for that client for the duration of their session.

This is both highly useful and simultaneously problematic. It means query behavior is consistent with your expectations (🙌🏼) but also that you can silently kill all concurrency benefits (😑). If enough queries are run that trigger connection pinning, all of a sudden you may throttle your throughput. But it does give you an escape hatch for safely running statements which are not transaction compatible without having to jump through any hoops.

The problem is that this leads to pretty much all connections being session-pinned because of Apartment setting the search path, and not resetting it

Screenshot 2024-11-19 at 11 00 25

Oh, I see. Then, something like what you're proposing would be helpful for people using either AWS RDS Proxy or PgBouncer. I'd prefer for this to be something that has to be enabled by configuration. This would execute an unnecessary query for those not using AWS RDS Proxy or PgBouncer, so I'd want this behavior disabled by default. I'm open to names for this configuration, like enforce_search_path_reset or similar.

@mnovelo I see what you're saying, on the other hand, wouldn't it make sense that the reset method actually resets things that it has set, such as search path? Although it executes an additional query, it's not a very heavy one, and it really just cleans up what it did by setting the search path in the first place. I think it seems like reasonable cleanup?

I think enforce_search_path_reset sounds like a good name for such a configuration, or reset_search_path_before_disconnect

Let's go with enforce_search_path_reset with a default of false. While I see what you're saying about cleanup, it's similar to the tenant_presence_check, which is also an arguably good practice to have enabled. What we've found in our production environment is that such an additional safeguard was not worth the overhead of an extra query on each request. At this time, I don't see this being a necessity outside of connection pooling solutions like RDS Proxy and PgBouncer, which is why I'm defaulting it to false. I'm happy to default it to true in the future if more people find it beneficial.

@mnovelo When do you think this could be added?

I'm looking to include it in v4.0.0.alpha1 which I hope to release by EOY with Rails 7.2 and 8 support

This wasn't included in v3.2.0, but I hope to add such support in v4.0.0. Your help would be greatly appreciated! #312

@mnovelo Awesome!

As for the proposed solutions above, what's your opinion?

For a middleware, I could imagine that if any fibers or threads spawned during the runtime of the request checked out connections and called switch to switch into a tenant, then those wouldn't be reset, only Apartment.connection would be reset

As for the other proposed solution, adding it to the reset method, this would then call RESET search_path even if the request isn't necessarily finished yet, and would conflict with the behaviour of setting search_path back to public or whatever it was before switching.

Are there perhaps any hooks we can hook into, called when a connection is checked back in?