Feature: Add support for running multiple replicas
guerzon opened this issue · 10 comments
Requirements:
Proposal:
Hi, is it possible to run Vaultwarden with multiple replicas for an HA deployment?
Thank you @guerzon for this chart.
Hi @fhera!
Right now there are issues with running multiple copies of Vaultwarden. For example, the data directory (/data by default) contains application data such as attachments, and it has to be visible to all replicas.
It is possible to put it on an NFS filesystem, but I'm not sure if that's something you can or even want to do. There might be other cloud-native alternatives for this, though. Personally, I would like to see S3 support so we could point /data to an S3 bucket instead. If you or your organization has the resources, I encourage you to sponsor or submit a pull request on https://github.com/dani-garcia/vaultwarden.
One hack I found was to disable ORG_ATTACHMENT_LIMIT, but I'm pretty sure this is not enough.
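If you do go the NFS route, a minimal sketch of what a shared /data volume could look like is below; the server address, export path, and sizes are placeholders, not something this chart currently provides:

apiVersion: v1
kind: PersistentVolume
metadata:
  name: vaultwarden-data
spec:
  capacity:
    storage: 10Gi
  accessModes:
    - ReadWriteMany
  nfs:
    server: nfs.example.internal   # placeholder NFS server
    path: /exports/vaultwarden     # placeholder export path
---
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: vaultwarden-data
spec:
  accessModes:
    - ReadWriteMany
  storageClassName: ""             # bind statically to the PV above
  volumeName: vaultwarden-data
  resources:
    requests:
      storage: 10Gi

All replicas could then mount this claim at /data.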
I opened a discussion here to discuss the topic further.
Lester
I don't think the chart should take care of concurrent access to /data/; that is something usually managed by Kubernetes itself, using ReadWriteMany PVCs. It should be sufficient to let the chart set the PVC access mode, and a good Kubernetes admin must know whether they need an RWO or RWX volume.
All in all, given how Bitwarden works there are not many concurrent accesses, because the clients sync the vault when needed and then don't access the server at all, unless users use the web UI instead of a client.
One need for multiple replicas could indeed be HA: having two pods would help during rolling updates or if a node fails. But even then Kubernetes takes care of it, rescheduling pods on other nodes and doing, well, rolling upgrades, so there will always be a running pod.
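For the rolling-update case, a minimal sketch of a Deployment update strategy that keeps the existing pod serving until its replacement is ready (standard Kubernetes fields; the surrounding Deployment and any volume constraints are assumed):

spec:
  replicas: 1
  strategy:
    type: RollingUpdate
    rollingUpdate:
      maxUnavailable: 0   # never take the only pod down first
      maxSurge: 1         # start the replacement alongside it

Note that with a ReadWriteOnce volume the surge pod can only start where it can attach that volume, so an RWX volume (or no persistent volume) makes this smoother.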
I think @akelge is right. The chart should support replicating the deployment and not try to take care of what Kubernetes itself should handle.
I've tested this a bit and it seems to work with this configuration:
kind: StatefulSet (each replica has its own persistent storage, which is not synced)
replicas: 3
database: postgres
extra: the Service is configured with Traefik sticky-session annotations:
traefik.ingress.kubernetes.io/service.sticky.cookie: "true"
traefik.ingress.kubernetes.io/service.sticky.cookie.name: "vaultwarden-sticky"
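For reference, a sketch of a Service carrying those annotations; the name, selector, and ports are placeholders rather than the chart's actual values:

apiVersion: v1
kind: Service
metadata:
  name: vaultwarden
  annotations:
    traefik.ingress.kubernetes.io/service.sticky.cookie: "true"
    traefik.ingress.kubernetes.io/service.sticky.cookie.name: "vaultwarden-sticky"
spec:
  selector:
    app.kubernetes.io/name: vaultwarden   # placeholder selector
  ports:
    - name: http
      port: 80
      targetPort: 8080                    # adjust to the container port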
Without the sticky sessions strange stuff happens: I get logged out immediately after logging in. I assume there is some sort of in-memory data that is not shared with the other replicas.
I don't use attachments, and the icon cache does not warrant shared storage as icons can be re-downloaded when needed, so if this stays stable it seems decent enough.
I am using a LoadBalancer Service and Apache as a proxy for Vaultwarden.
When running more than one replica of Vaultwarden, I am usually required to log in multiple times.
You need to set up some sort of sticky sessions, otherwise it doesn't work properly.
I added
sessionAffinity: ClientIP
sessionAffinityConfig:
  clientIP:
    timeoutSeconds: 10800
to my LoadBalancer Service YAML, and this seems to do the job.
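For context, a minimal sketch of where those fields sit in a LoadBalancer Service manifest; the name, selector, and ports are placeholders:

apiVersion: v1
kind: Service
metadata:
  name: vaultwarden
spec:
  type: LoadBalancer
  sessionAffinity: ClientIP        # route a given client to the same pod
  sessionAffinityConfig:
    clientIP:
      timeoutSeconds: 10800        # keep the affinity for 3 hours
  selector:
    app.kubernetes.io/name: vaultwarden   # placeholder selector
  ports:
    - port: 80
      targetPort: 8080                    # adjust to the container port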
A PR would be very welcome for this feature.