superfly/litefs

Proxy forwarding requests before app ready to handle them

Closed this issue · 3 comments

Important note up front: I'm not sure whether this is a real issue or me trying to do something dumb 😅 Also, sorry in advance if it's not an issue. Normally I'd create a discussion, but that feature seems to be disabled in this repo.

I've tried to set up LiteFS on fly.io with replication across a few regions. I wanted to use fly.io's auto-stopping machines to reduce the cost of running my website in a lot of regions. Without the proxy that comes with LiteFS, fly.io seems to wait on a "cold start" until my app can handle a request. With the proxy, however, I get a connection error for the first few seconds instead.

Should I not auto-stop machines with this setup, or am I missing some option or way to fix this? I've tried to dive into the source code, and it seems that the last exec command from the config is just invoked asynchronously and then the proxy starts.

Here is my repo at the latest commit with LiteFS setup in case it helps - https://github.com/pawelblaszczyk5/next-fly-playground/tree/d039b040bfea9b143af280e884a9fd73784086b8

It's a Next.js app. In the Dockerfile I've mostly changed only LiteFS-related things, apart from adding a swap file setup and adding DB migrations to the LiteFS config, which you can find in another directory.

Okay, I've just had an aha moment 😄 It can't work the way I was imagining. I shouldn't use auto-stopping with replication, because there wouldn't be anywhere to replicate the data to. I had the wrong idea of the concept 😅

If anyone could confirm that I came to the right conclusion, I'd gladly close this issue.

@pawelblaszczyk5 Good question. Auto-stop is tricky with LiteFS for a couple reasons. First, as you mentioned, the primary won't have a place to replicate to if only the primary is running. That's a durability issue. You could always run 2 candidate nodes and that would give the primary somewhere to replicate to.

Second, LiteFS needs to sync up on startup and that can take some time depending on the size of the database and the number of changes since shutdown. The proxy needs to be more intelligent about queuing requests until a node is marked as ready but that hasn't been added yet.
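To make the "queue requests until the node is marked as ready" idea concrete, here is a minimal sketch in Python. It is not LiteFS code and does not reflect how the LiteFS proxy is implemented; `ReadinessGate`, `handle_request`, the `forward` callback, and the 503 fallback are all hypothetical names and choices made for illustration only.

```python
import threading


class ReadinessGate:
    """Hypothetical gate that blocks callers until the node is marked ready."""

    def __init__(self) -> None:
        self._ready = threading.Event()

    def mark_ready(self) -> None:
        # Called once startup work (e.g. syncing the database) has finished.
        self._ready.set()

    def wait_until_ready(self, timeout: float) -> bool:
        # Returns True as soon as the node is ready, or False after `timeout` seconds.
        return self._ready.wait(timeout)


def handle_request(gate, forward, timeout=5.0):
    """Hold the request while the node starts up, then forward it.

    `forward` is a hypothetical callable that sends the request to the app
    and returns a (status, body) tuple. If the node is still not ready when
    the timeout expires, respond with 503 instead of forwarding.
    """
    if not gate.wait_until_ready(timeout):
        return 503, "node not ready"
    return forward()
```

With a gate like this, a request that arrives during a cold start waits for `mark_ready()` instead of failing with a connection error straight away. The timeout bounds how long a client can be held if startup takes too long.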

As of now, I wouldn't suggest using auto-stop with LiteFS. It's something we want to get to but it's not something that works well right now.

Thanks a lot for your time!