Run multiple replicas of the bot app in k8s
Opened this issue · 2 comments
How do we run multiple replicas of the bot app in k8s while having only one instance handle events until it dies? We run our bot on k8s, but sometimes a pod gets rescheduled to another node during scale up/down, which means a 4-5 minute break in the bot's operation while it starts again. With 1 replica everything works fine, but with more than 1 there is a race condition. Is there any way to set this up so that 2+ instances of the bot are supported without making a mess of event/command handling, i.e. only the first-started instance responds rather than both (currently roughly 50% of events go to the 1st bot and 50% to the 2nd)?
@slack/bolt version
^3.15.0
Your App and Receiver Configuration
const app = new App({
  token: process.env.SLACK_BOT_TOKEN,
  signingSecret: process.env.SLACK_SIGNING_SECRET,
  socketMode: true,
  appToken: process.env.SLACK_APP_TOKEN,
  port: 3000
});
Node.js runtime version
v16.13.0
Steps to reproduce:
(Share the commands to run, source code, and project settings)
- run the 1st instance with one say() message for a given command
- run a 2nd instance with a different message for the same command
- run the command and watch the output
Expected result:
Only one bot handles all events
Actual result:
50% of the time the 1st bot catches the command and replies with its message, 50% of the time the 2nd bot replies with the other message
Hi, @MrPatryk! Thank you for your question! 🙌
Based on what you've described, it does sound like the expected behavior of Bolt with socket mode.
By design, when there are multiple connections, Socket Mode will deliver each event to one of the open socket connections, chosen at random.
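That behavior can be pictured with a toy model, purely illustrative (the dispatch function and instance names here are invented, not part of Slack's API): each event reaches exactly one connection, so with two replicas each one sees roughly half of the traffic.

```javascript
// Toy model of the behavior described above: with N open Socket Mode
// connections, each event is delivered to exactly one of them.
function dispatch(event, connections) {
  const pick = Math.floor(Math.random() * connections.length);
  return connections[pick]; // only this connection receives the event
}

// Simulate 1000 events arriving while two replicas are connected.
const winners = [];
for (let i = 0; i < 1000; i++) {
  winners.push(dispatch({ id: i }, ['instance-1', 'instance-2']));
}

const share1 = winners.filter((w) => w === 'instance-1').length / 1000;
console.log(share1); // roughly 0.5 over many events
```

This is why each replica answers about half of the commands, and why the two different say() messages alternate unpredictably.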
As mentioned here, this might be a good potential workaround:
If you need to scale to multiple app instances, then you may need to use 1 app that establishes a SocketMode connection and adds each request to a job queue for other instances to handle. We don't have a working example of this but if you want to share your solution then we'd greatly appreciate it!
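Since no official example exists, here is a minimal sketch of that idea. The JobQueue class and handler names are invented for illustration, and an in-memory queue stands in for what would be an external broker (Redis, RabbitMQ, etc.) shared by all pods; the key property is that each enqueued event is claimed by exactly one consumer.

```javascript
// Sketch of the suggested pattern: one Socket Mode "receiver" enqueues
// incoming events, and worker replicas claim jobs from a shared queue.
// A minimal in-memory FIFO stands in for a real external broker.
class JobQueue {
  constructor() {
    this.jobs = [];
  }
  enqueue(job) {
    this.jobs.push(job);
  }
  claim() {
    return this.jobs.shift(); // hand-off: only one consumer gets each job
  }
}

const queue = new JobQueue();

// Receiver side: instead of handling the command inline, enqueue it.
function onSlashCommand(payload) {
  queue.enqueue({ type: 'command', payload });
}

// Worker side: any replica can drain jobs; each job is processed once.
function drain(queue, handler) {
  const handled = [];
  let job;
  while ((job = queue.claim()) !== undefined) {
    handled.push(handler(job));
  }
  return handled;
}

// Simulate two incoming commands and one worker draining the queue.
onSlashCommand({ command: '/hello', user: 'U1' });
onSlashCommand({ command: '/hello', user: 'U2' });
const results = drain(queue, (job) => job.payload.user);
console.log(results); // each command handled exactly once, in order
```

In a real deployment only the single receiver pod would set socketMode: true; the worker pods would skip the Slack connection entirely and just consume from the broker.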
👋 It looks like this issue has been open for 30 days with no activity. We'll mark this as stale for now, and wait 10 days for an update or for further comment before closing this issue out. If you think this issue needs to be prioritized, please comment to get the thread going again! Maintainers also review issues marked as stale on a regular basis and comment or adjust status if the issue needs to be reprioritized.