Mongo VS Postgres - Large-table performance comparison

This is a simple benchmark that tests how quickly Postgres and MongoDB can concurrently read and write records to a table/collection that already holds at least 100M records.

This kind of benchmark is important to show which DB performs better as an event log store - think thermostat readings, a user connecting to or disconnecting from a chat room, and similar events that just get logged and never updated.
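
To make that workload concrete, below is a minimal sketch (in Python, not the actual harness) of the append-only access pattern being measured. The table shape, column names, and connection string are illustrative assumptions only:

    # Minimal sketch of the append-only event-log workload (NOT the real harness).
    # Table shape, column names and the DSN are illustrative assumptions.
    import json
    import random
    import threading
    import time

    import psycopg2

    DSN = "postgresql://postgres:postgres@localhost:5432/benchmark"  # assumed

    DDL = """
    CREATE TABLE IF NOT EXISTS events (
        id          bigserial   PRIMARY KEY,
        device_id   integer     NOT NULL,
        recorded_at timestamptz NOT NULL DEFAULT now(),
        payload     jsonb       NOT NULL
    )
    """

    def writer(stop):
        # Events are only ever inserted, never updated.
        conn = psycopg2.connect(DSN)
        cur = conn.cursor()
        while not stop.is_set():
            cur.execute(
                "INSERT INTO events (device_id, payload) VALUES (%s, %s)",
                (random.randint(1, 10_000),
                 json.dumps({"temp_c": random.uniform(15, 30)})),
            )
            conn.commit()

    def reader(stop):
        # Typical event-log read: the latest events for one device.
        conn = psycopg2.connect(DSN)
        cur = conn.cursor()
        while not stop.is_set():
            cur.execute(
                "SELECT recorded_at, payload FROM events"
                " WHERE device_id = %s ORDER BY recorded_at DESC LIMIT 100",
                (random.randint(1, 10_000),),
            )
            cur.fetchall()

    if __name__ == "__main__":
        setup = psycopg2.connect(DSN)
        setup.cursor().execute(DDL)
        setup.commit()

        stop = threading.Event()
        threads = [threading.Thread(target=fn, args=(stop,)) for fn in (writer, reader)]
        for t in threads:
            t.start()
        time.sleep(10)  # run the mixed read/write load for a fixed window
        stop.set()
        for t in threads:
            t.join()

The real benchmark does the same kind of thing at much higher concurrency, against both databases, and with the table pre-seeded to 100M+ records.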

"But why not go for InfluxDB/TimescaleDB/Clickhouse?" I hear you ask. I don't want to add unnecessary complexity if I don't have to. Having a new DB that I have to learn to use and maintain is something I'd like to avoid given the size of my team. The only reason I'm looking into Mongo is because I have an expert in-house.

Running

To run the benchmark, execute:

docker compose run -it --rm app ./benchmark

While you can run this locally, my suggestion is to rent a VPS and let the benchmark run there overnight. The benchmark is very taxing on your SSD, and a VPS gives you a consistent platform for comparison.

There are many configuration options that you can pass to ./benchmark. To see them, pass --help. They let you configure the concurrency and size of the read-write test, the base size of the database under test, which tests to perform against which database instances, and more.
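
For example, to list all available options:

docker compose run -it --rm app ./benchmark --help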

Notes

Here are a few things I learned while performing this benchmark. Watch out for them when running your own:

Results

The following are the results of this benchmark with the test cases defined in cases.yml (see docker-compose.yml for the individual DB configs).

Test cases:

  • cluster_mongo - tests how MongoDB performs with journaling enabled. I can't use a database that can lose 1 min of data, or one without a hot standby or read replica.
  • optimized_postgres - tests how PG performs when configured to use the same amount of resources as MongoDB.
  • mongo - tests how the stock MongoDB config compares to the one with journaling enabled.
  • async_postgres - tests how fast PG write performance can go if 100 ms of data loss is acceptable (see the client-side sketch below this list).
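
The exact server configs live in docker-compose.yml, but the durability trade-off behind cluster_mongo and async_postgres can also be sketched from the client side. The connection strings and the j=True / synchronous_commit = off choices below are illustrative assumptions, not the repo's actual settings:

    # Client-side sketch of the durability knobs behind the test cases above.
    # Connection strings and settings are illustrative assumptions, not the
    # exact values used in docker-compose.yml.
    import json

    import psycopg2
    from pymongo import MongoClient, WriteConcern

    # cluster_mongo-style durability: only acknowledge a write once it has
    # been written to the on-disk journal (j=True).
    mongo = MongoClient("mongodb://localhost:27017")  # assumed URI
    events = mongo["benchmark"].get_collection(
        "events", write_concern=WriteConcern(j=True)
    )
    events.insert_one({"device_id": 1, "temp_c": 21.5})

    # async_postgres-style durability: commits return before the WAL reaches
    # disk, so a crash can lose the most recently acknowledged writes (the
    # ~100 ms window mentioned above).
    pg = psycopg2.connect("postgresql://postgres:postgres@localhost:5432/benchmark")  # assumed DSN
    cur = pg.cursor()
    cur.execute("SET synchronous_commit TO off")
    cur.execute(
        "INSERT INTO events (device_id, payload) VALUES (%s, %s)",
        (1, json.dumps({"temp_c": 21.5})),
    )
    pg.commit()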

System:

AWS m7a.large
CPU: 2 vCPU
RAM: 8 GB
SWAP: 5 GB
DISK: 200 GB

Output:

Comparison:

Conclusions:

Disclaimer

Please remember that all benchmarks are bullshit. What is important to me might not be important to you.

I'm interested in how these databases perform when there are 100M+ records in a table. A DB that is super performant with 1-5M records isn't of much use to me, since my dataset currently holds 500M records and is growing.

I didn't test partitioning because if partitioning were needed I'd just go with Postgres or MariaDB anyway, no matter how large the performance difference. As I said, having one DB to look after and maintain is much easier for me than having two.

Other people's benchmarks & other relevant links