dwyl/learn-devops

How to run PostgreSQL on VPS with High Availability?

nelsonic opened this issue ยท 13 comments

Linode has a Postgres App in the "Marketplace":
https://www.linode.com/marketplace/apps/linode/postgresql
It runs on Debian which is fine to start with.
I would definitely prefer to run it on OpenBSD: dwyl/learn-security#73

Requirements

  • High Security
  • Low maintenance.
    • Full documentation for setup and maintenance
  • At least two nodes (master+replica in separate data centres) so we can have failover.
  • Lower total cost than AWS RDS https://aws.amazon.com/rds/postgresql/pricing

I'm bumping this up to priority-1 because this is now costing us/me money each month and I feel it. #62 ๐Ÿ’ธ ๐Ÿ”ฅ
Going to spend a couple of hours on this today.

At the time of writing, Linode (the hosting service we are using for Hits) does not have a managed database product.
Digital ocean has such a product: https://www.digitalocean.com/pricing/#managed-databases

digital-ocean-managed-databases

But the minimum we would pay is $50/month given how much data (25GB) we already have in our Hits app.
That would be borderline "OK" in the short term, but the second we need slightly more disk space,
the next level up is 38 GB and the price doubles to $100/month spending $1200/year on a "hobby" app is an eye-watering amount of money!!

As for the AWS RDS pricing you practically need a PhD to understand it! https://aws.amazon.com/rds/aurora/pricing
image

The cheapest instance that can be used AWS RDS Postgres is db.t3.medium which costs $0.093/hour
This is $67.89/month for each instance. So the minimum we would be paying (with replication) would be $135.78/month which is daylight robbery.

And don't even get me started on the "Serverless" option: https://aws.amazon.com/rds/aurora/serverless
That's the definition of a ripoff!

  • $0.06 per ACU-hour
  • $0.10 per GiB-month
    Imagine you use 1 "ACU" for the full month (duh!) and 25GB of storage, it's: $43.8 + $2.50 = $46.30/month
    That's not even unreasonable on the surface. But for basic apps that require the database to be "always on",
    it's silly that the minimum charge is $44/month. We can do much better by hand-crafting a DB setup on VPS.

Thinking of using the Block Storage: https://www.linode.com/products/block-storage
image

The storage is independent of the instance(s) so we can use a 1GB ($5/month) Linode instance and unlimited storage!
image

3x Replication:
image

"You can also increase the size of a block storage volume at any time. This gives you the flexibility to update your storage capacity as your needs change with a few easy steps."

This is exactly what I want for a DB!

According to this thread: https://www.linode.com/community/questions/18893/how-do-i-move-my-database-files-to-a-block-storage-volume it's possible to use Block Storage as the data directory in Postgres.

It's going to require a couple of hours of DevOps (documenting the setup) but I think it will be worth it!
We can easily run the Hits App for **$10/month with 50GB of Disk space. And the only additional cost per month will be the disk allocation.

AWS has a Amazon Elastic Block Store which we have used in the past, it was one of their earliest products. But it does not have replication or high availability by default. You need to pay extra for that: https://aws.amazon.com/ebs/pricing
aws-ebs-pricing

$0.116/GB-month for the EBS and $0.053/GB-month for the EBS Snapshots (backups) = 0.169/GB-month i.e. 69% more expensive per GB than Linode. Doesn't seem like a good idea.

Then we would still need to use an AWS EC2 instance where the price for a comparable instance (1GB RAM) being a t3.micro is $0.0118/hour or 8.614/month https://aws.amazon.com/ec2/pricing/on-demand
Screenshot 2020-07-19 at 15 50 54

We could attempt to run it on a t3.nano with 500mb RAM at $0.0059/hour or 4.307/month, but I have my doubts as to the performance of such an instance. ๐Ÿ˜•
Read this: https://www.cloudsqueeze.ai/amazons-t3-who-should-use-it-when-how-and-the-why/index.html
image

Basically you will get an "unexpected bill for additional virtual CPU" when you go over the measly 2 VCPU baseline and it's charged at 0.05/hour i.e. more than the cost of the instance!!! So imagine people are online for 16h/day and they are making requests that tax our DB server's CPU, the additional CPU would be 16h x $0.05 x 365 / 12 = $24/month
They charge you for the full hour where the CPU spike occurred, so even if there are higher requests in 1 minute per hour we will pay for the full hour. I think this rules out AWS four our "Hits" App purposes.

Going to proceed with my exploration of Linode Block storage. ๐Ÿ”

On further reading, it turns out that DigitalOcean Block storage is based on SSDs and they have at-rest encryption:
https://www.digitalocean.com/products/block-storage

digitalocean-block-storage-ssds

And they have a tutorial for migrating the Postgres data directory to their Block storage:
https://www.digitalocean.com/community/tutorials/how-to-move-a-postgresql-data-directory-to-a-new-location-on-ubuntu-18-04 (not just a forum topic, an actual step-by-step tutorial).

I think the performance and reliability benefits of SSD are worth it.
https://www.enterprisestorageforum.com/storage-hardware/ssd-vs-hdd.html
image

Considering reviving my DigitialOcean account and try to run a Postgres instance with Block Storage.

Briefly considered using this Supabase Postgres marketplace image:
https://cloud.digitalocean.com/marketplace/5eb4c8436e9007280dbcd8a3?i=933568
image

But while Supabase looks sick! see: https://github.com/supabase/supabase
As previously mentioned in: dwyl/how-to-choose-a-database#14
It's still considered "Alpha" ...
image

So I think I'm just going to do a manual / first principals setup of Postgres on Ubuntu 18.04 and document it.

No other Postgres available in the DigitalOcean Marketplace: https://marketplace.digitalocean.com

digital-ocean-supabase-postgres

Going to install Postgres and do writeup. ๐Ÿ“

Select the instance Operating System and Plan:
do-create-01-select-instance-type

Scroll down to the "Add Block Storage" section and enter 1gb into the field:
do-create-02-block-storage

We're going to increase the size in a later step, so keep it low for now just to set it up.
https://www.digitalocean.com/docs/volumes/how-to/increase-size/

Select Ext4 for the filesystem and then select the datacenter region that bests suits your needs.

Scroll down until you see the "Choose a hostname" field. Enter a relevant name, in our case "hits":
do-create-04-hostname

Scroll to the bottom of the page and click "Create Droplet".

Progress update:

This morning I created a new DO droplet (VPS instance) with attached block storage.
Installed Postgres on the instance and changed the data directory of Postgres to be the block storage.
This is all documented in: dwyl/hits#82
This is technically not "High Availability" because there is only one Postgres instance.
If the Postgres data directory were to be compromised or corrupted, we would lose everything.
But for the purposes of "hits" we can already do the filesystem to postgres migration with this setup.

Note: I will still definitely pursue the HA route to have at least two Postgres nodes, one as a "hot" backup.
Obviously this doubles the cost of running the service. But it's the only way to have a backup. Or is it ...?
Considering using DO Object Storage: #65

My next step is to install NGINX on the droplet and open port 80/443.
Luckily we've done this before: https://github.com/dwyl/learn-microsoft-azure#12-install--run-nginx-to-test

Fly.io have made this a lot simpler: https://fly.io/docs/reference/postgres/
We have already used it to deploy the Hits App https://hits.dwyl.com/ see: dwyl/hits#128
Lowering the priority of this issue to priority-4 ("maybe someday") and removing it from the App backlog.

Closing as for now we are using Fly.io ๐Ÿš€
Will re-open if/when needed. ๐Ÿ‘Œ