ankane/blazer

Connection pool issue when using Blazer with a read replica and Rails automatic connection switching

weilandia opened this issue · 4 comments

After moving to a multidb setup in our app with automatic role switching, we're running into an issue where /queries/tables?data_source=main intermittently returns a 500 with No connection pool for 'Blazer::Connection::Adapter96700' found.

NOTE: We were also getting an ActiveRecord::ConnectionNotEstablished error intermittently when running queries after the switch, but this seems to have been fixed by increasing our db pool size.

BACKTRACE:

lib/active_record/connection_adapters/abstract/connection_handler.rb:243 retrieve_connection  
lib/active_record/connection_handling.rb:287 retrieve_connection  
lib/active_record/connection_handling.rb:254 connection  
lib/active_record/sanitization.rb:199 replace_bind_variables  
lib/active_record/sanitization.rb:168 sanitize_sql_array  
lib/blazer/adapters/sql_adapter.rb:289 add_schemas  
lib/blazer/adapters/sql_adapter.rb:54 tables  
/usr/local/lib/ruby/3.3.0/forwardable.rb:240 tables  
app/controllers/blazer/queries_controller.rb:206 tables  
lib/action_controller/metal/basic_implicit_render.rb:6 send_action  
lib/abstract_controller/base.rb:224 process_action  
lib/action_controller/metal/rendering.rb:165 process_action  
lib/abstract_controller/callbacks.rb:259 block in process_action  
lib/active_support/callbacks.rb:110 run_callbacks  
lib/abstract_controller/callbacks.rb:258 process_action  
lib/action_controller/metal/rescue.rb:25 process_action  
lib/action_controller/metal/instrumentation.rb:74 block in process_action  
lib/appsignal/hooks/active_support_notifications.rb:19 block in instrument  
lib/active_support/notifications/instrumenter.rb:58 instrument  
lib/appsignal/hooks/active_support_notifications.rb:18 instrument  
lib/action_controller/metal/instrumentation.rb:73 process_action  
lib/action_controller/metal/params_wrapper.rb:261 process_action  
lib/searchkick/controller_runtime.rb:15 process_action  
lib/active_record/railties/controller_runtime.rb:32 process_action  
lib/abstract_controller/base.rb:160 process  
lib/action_view/rendering.rb:40 process  
lib/action_controller/metal.rb:227 dispatch  
lib/action_controller/metal.rb:309 dispatch  
lib/action_dispatch/routing/route_set.rb:49 dispatch  
lib/action_dispatch/routing/route_set.rb:32 serve  
lib/action_dispatch/journey/router.rb:51 block in serve  
lib/action_dispatch/journey/router.rb:131 block in find_routes  
lib/action_dispatch/journey/router.rb:124 each  
lib/action_dispatch/journey/router.rb:124 find_routes  
lib/action_dispatch/journey/router.rb:32 serve  
lib/action_dispatch/routing/route_set.rb:882 call  
lib/rails/engine.rb:536 call  
lib/rails/railtie.rb:226 public_send  
lib/rails/railtie.rb:226 method_missing  
lib/action_dispatch/routing/mapper.rb:22 block in class:Constraints  
lib/action_dispatch/routing/mapper.rb:51 serve  
lib/action_dispatch/journey/router.rb:51 block in serve  
lib/action_dispatch/journey/router.rb:131 block in find_routes  
lib/action_dispatch/journey/router.rb:124 each  
lib/action_dispatch/journey/router.rb:124 find_routes  
lib/action_dispatch/journey/router.rb:32 serve  
lib/action_dispatch/routing/route_set.rb:882 call  
lib/rack/attack.rb:127 call  
lib/prerender_rails.rb:124 call  
lib/active_record/middleware/database_selector.rb:67 block in call  
lib/active_support/notifications/instrumenter.rb:58 instrument  
lib/active_record/middleware/database_selector/resolver.rb:58 block in read_from_primary  
lib/active_record/connection_handling.rb:361 with_role_and_shard  
lib/active_record/connection_handling.rb:147 connected_to  
lib/active_record/middleware/database_selector/resolver.rb:57 read_from_primary  
lib/active_record/middleware/database_selector/resolver.rb:37 read  
lib/active_record/middleware/database_selector.rb:77 select_database  
lib/active_record/middleware/database_selector.rb:66 call
...

I ran into this exact problem after updating to use a primary / primary_replica. We're on the default pool size right now. How many did you update to? I'm already running 30.

@bdschultzAU it actually ended up not working. We ended up moving Blazer back to the primary for now.

Thanks for the response, @weilandia. I came to that conclusion too. I monkey-patched it to force Blazer to use the writer node for now.
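
For anyone landing here: the patch itself was not posted, so the following is only a guess at what "force Blazer to use the writer node" could look like. It is a minimal sketch that assumes Rails 6.1+ role switching and that wrapping Blazer's controller actions in ActiveRecord::Base.connected_to(role: :writing) is enough for your setup. The module name BlazerWriterRole and the initializer file name are hypothetical, and it has not been confirmed to resolve the error reported above.

```ruby
# config/initializers/blazer_writer_role.rb (hypothetical file name)
#
# Sketch only: run Blazer's controller actions under the writing role,
# regardless of which role the database selector middleware picked.
module BlazerWriterRole
  # `handler` is anything that responds to `connected_to(role:)`.
  # In a Rails app this would be ActiveRecord::Base.
  def self.wrap(handler)
    handler.connected_to(role: :writing) { yield }
  end
end

# Only wire this up when Rails and Blazer are actually loaded.
if defined?(Rails) && defined?(Blazer)
  Rails.application.config.to_prepare do
    # Assumption: Blazer's controllers inherit from Blazer::BaseController.
    Blazer::BaseController.around_action do |_controller, action|
      BlazerWriterRole.wrap(ActiveRecord::Base) { action.call }
    end
  end
end
```

The role switch is kept in a small module so it can be exercised without booting Rails. The around_action only applies to Blazer's own controllers, so the rest of the app keeps using automatic role switching.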

Closed because we changed our reporting infra, but I think this is still a problem.