We set up two CouchDB instances and let them talk to each other (replicate their data). Then we access the two CouchDB instances via an nginx load balancer.
We create our own up-to-date CouchDB Docker image, building CouchDB from source. Also, we build ourselves a little nginx load balancer.
The Makefile has all the help you need.
But first, make sure you have logged in with your Giant Swarm account and the swarm
CLI. Otherwise, adapt the Makefile to match your requirements, especially the USERNAME
variable in the top.
Then, use these commands to build the images:
cd couchdb
make docker-build
This will create an image based on Debian Wheezy. Building might take a bit.
Use this command:
make docker-run
This will launch a Docker container from your custom CouchDB image. You can connect to it using the IP address of your interface and port 5984.
This works the same way as with the couchdb image, but in a different folder.
cd ../nginx-couchdb-lb
make docker-build
Make sure you have logged in before ussing docker login https://registry.giantswarm.io/
. Then, use these commands to push your images:
cd ../couchdb
make docker-push
cd ../nginx-couchdb-lb
make docker-push
Using the provided swarm.json
file, we will create a Giant Swarm application. We use our username to disambiguate the domain entries.
swarm up --var=username=`swarm user`
Both CouDBs are accessible by with their own host entry via port 80.
curl a.couchdemo-`swarm user`.gigantic.io
curl b.couchdemo-`swarm user`.gigantic.io
First, we create a database on A
curl -X PUT http://admin:THEcouchDBdemo@a.couchdemo.gigantic.io/demo
curl http://a.couchdemo.gigantic.io/_all_dbs
Then we create a first document:
curl -X PUT \
http://admin:THEcouchDBdemo@a.couchdemo.gigantic.io/demo/d516241ea9546cb9bd911f902b000aee \
-d '{"title": "First test document", "description": "Created on A"}'
curl -X POST \
http://admin:THEcouchDBdemo@a.couchdemo.gigantic.io/_replicate \
-d '{"source": "demo","target": "http://admin:THEcouchDBdemo@b.couchdemo.gigantic.io/demo", "create_target": true, "continuous": true}' \
-H "Content-Type: application/json"
curl -X POST \
http://admin:THEcouchDBdemo@b.couchdemo.gigantic.io/_replicate \
-d '{"source": "demo","target": "http://admin:THEcouchDBdemo@a.couchdemo.gigantic.io/demo", "create_target": true, "continuous": true}' \
-H "Content-Type: application/json"
curl -X PUT \
http://admin:THEcouchDBdemo@b.couchdemo.gigantic.io/demo/72f5c2e7265683f6f341d7e1b2000ba0 \
-d '{"title": "Second test document", "description": "Created on B"}'
curl http://a.couchdemo.gigantic.io/demo/72f5c2e7265683f6f341d7e1b2000ba0
swarm stop couch-demo/couchb
curl -X PUT \
http://admin:THEcouchDBdemo@a.couchdemo.gigantic.io/demo/371bd2c52d2bc0ed6b240fa4d6001965 \
-d '{"title": "Third test document", "description": "Created on A"}'
swarm start couch-demo/couchb
After B is back up, it should get the new doc created in B's absence form A.
curl http://b.couchdemo.gigantic.io/demo/371bd2c52d2bc0ed6b240fa4d6001965
If we wanted our clients to be unaware of the number of CouchDB backends and spread load nicely via all instances, a load balancer might be useful.
We already built the image above, but didn't make use of it yet. Here we test it's up:
curl a.couchdemo-`swarm user`.gigantic.io
Call the above command several times to see that the response is changing slightly. The uuid
returned should differ between requests, depending by witch host the response is served.
Since both CouchDB instance replicate their content, we don't really care which instance is accessed.
The provided backup script (written in Python) runs a test on both instances directly and then over the load balancer to see if there gains in throughput. Here is how to set it up using virtualenv
and pip
.
virtualenv venv
source venv/bin/activate
pip install requests couchdb
Once this is done, call it using
python benchmark.py
The script creates a database called benchmark
. When it already exists, the database is deleted and re-created.
Then, documents are created first on one instnce, then on the other instance, then via the load balancer. In the next step, the same documents are read in randomized order from one host, then from the other host, then via the load balancer.
In our tests, the speed gain from using the load balancer is minimal. The benefit that the CouchDB cluster is hidden from the clients is far greater.