Class project (Distributed Program Systems) - a DynamoDB clone created at FIIT STU
Run $ bash init.sh
to initialize all the necessary docker machines and containers. Our DynamoDB consists of 3 Ruby on Rails nodes by default. We strongly advise you to keep at least 3 nodes in your system configuration, as we have not designed DynamoDB to work under other circumstances for the purposes of this project.
- NGINX (proxy):
$ docker-machine ip sm-docker-0
- Consul (service discovery):
$ docker-machine ip docker-SD
- Kibana (dashboard):
$ docker-machine ip docker-LOG
- Kibana: http://kibana_ip:8080
- Consul: http://consul_ip:8500
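For example, you can look the dashboard addresses up straight from your shell (assuming the default machine names created by the init script):
$ KIBANA_IP=$(docker-machine ip docker-LOG)
$ CONSUL_IP=$(docker-machine ip docker-SD)
$ echo "Kibana: http://$KIBANA_IP:8080"
$ echo "Consul: http://$CONSUL_IP:8500"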
If you wish to remove the whole project from your machine, feel free to run $ bash uninstall
. This will automatically remove all the docker machines that the init
script created for you.
If you want to add another RoR container to your DynamoDB, run $ bash rails_app_gener.sh -vm $VM_NAME -ds $CONSUL_IP
. $VM_NAME should be a valid docker-machine name of the machine you want to run the container on, and $CONSUL_IP should be the IP address of the machine that runs service discovery.
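For example (sm-docker-1 is just a placeholder for whichever machine you want to target):
$ CONSUL_IP=$(docker-machine ip docker-SD)
$ bash rails_app_gener.sh -vm sm-docker-1 -ds $CONSUL_IP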
If you want to keep the machines but remove all the DynamoDB containers created by init
or by you, without breaking the swarm, execute $ bash remove_dynamo_containers
and it will do the trick for you.
You can access the GUI at http://$(docker-machine ip sm-docker-0)
/. You can read and write values for a specific hash key. You don't need to specify a vector clock when writing a value for a key for the first time, although you can (given that you have already read that key and DynamoDB returned an initial vector clock). On the other hand, you have to provide a vector clock when storing a value for a key that you have already stored some value for; otherwise DynamoDB will assign a new vector clock to that value (DynamoDB treats the value as a completely new version of the object).
This is what the GUI looks like:
DynamoDB has two main functions: reading and writing data.
GET http://proxy_ip/node/read_key?&key=12345
for reading values of key = 12345.
If you wish to specify a read quorum, add another parameter: http://proxy_ip/node/read_key?&key=12345&read_quorum=2
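For example, with curl (proxy_ip is the NGINX IP reported by docker-machine ip sm-docker-0):
$ PROXY_IP=$(docker-machine ip sm-docker-0)
$ curl "http://$PROXY_IP/node/read_key?&key=12345&read_quorum=2"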
POST http://proxy_ip/node/write_key
and specify the required request parameters:
{ "key":"113", "value":"random_value" }
There are some optional parameters:
"write_quorum":"2", "vector_clock":{ "68535db237f3=1;a36909a4384f=1;f1f28dd2e53a=0;": { "68535db237f3": "1", "value": "some_random_stored_value", "f1f28dd2e53a": "0", "a36909a4384f": "1" } }
The vector clock can be acquired by calling the read_key
API method for a key that already has stored data.
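A possible curl session for the first write and the follow-up read (the JSON body form is an assumption about how the Rails controller accepts parameters; for a subsequent update, repeat the POST and include the vector_clock object exactly as read_key returned it):
$ PROXY_IP=$(docker-machine ip sm-docker-0)
$ curl -X POST "http://$PROXY_IP/node/write_key" -d '{ "key":"113", "value":"random_value", "write_quorum":"2" }'
$ curl "http://$PROXY_IP/node/read_key?&key=113"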
We've been able to create a multi-host network consisting of 3 docker-machines, each running several containers. So far we have used Consul, Registrator, Docker Swarm and NGINX. Containers can communicate with each other (only ping atm) across VMs.
Added an init script which starts the docker-machines, and added a modified Dockerfile for the nodes.
TODO:
-use Consul-Template for updating proxy config when new container joins (or leaves) the network.
-create dummy applications that will be run on DataNodes with simple REST APIs.
Init and uninstall scripts polished and issues with java_image containers (Registrator ignored them) fixed. Dummy Java applications have been created under the node-app folder.
TODO and IN_PROGRESS:
-Consul-Template integration (proxy conf update in consul).
-rsyslog / ELK stack integration.
Consul-Template has been successfully integrated and proxy config updates now work automatically. We came across some problems with Registrator (it assigned the localhost IP, so containers were unreachable) - we had to use another Registrator image that works well with the overlay network. We are currently having some issues with the rsyslog init scripts; Kibana and Logstash should be linked together soon.
IN_PROGRESS:
-rsyslog / ELK stack integration.
TODO:
Dynamo
We have successfully finished integrating the ELK stack into our system. Logs are being centralized, Kibana is set up (and displays logs), and we have created a single REST API method ($PROXY_IP/dynamo-node-webapp/dynamo/hello/$USER_INPUT) which responds with "Hello $USER_INPUT!" from one of the Java app containers. We use Consul's key-value store only for storing the log server's IP at the moment.
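For example (assuming the proxy IP comes from docker-machine ip sm-docker-0):
$ curl http://$(docker-machine ip sm-docker-0)/dynamo-node-webapp/dynamo/hello/World
Hello World!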
TODO:
Dynamo
We have decided to switch from the Java framework to Ruby on Rails for the DynamoDB nodes. We have managed to implement consistent hashing (95% done) and we are currently working on the read-write quorum, vector clocks, container metrics, service discovery broadcasts and additional SD configuration.
TODO:
quorum
vector clocks
metrics (zabbix?)
ELK graphs
Consul broadcasts
Implemented quorum and replication of hash keys across nodes. Whenever a node fails or a new one registers to the network, the others adapt. A new node first receives all the data it will be responsible for, and only then joins the network. Consul records the change of the DynamoDB configuration in its key-value store and broadcasts it to all nodes. They then update their data, replicate what is new, delete what is old and no longer needed, and adjust to the changes. If the quorum requirements couldn't be satisfied, the user gets notified.
TODO:
vector clocks
metrics (zabbix?)
ELK graphs
Vector clocks have been successfully implemented. Causality conflicts are dealt with automatically (a hash table data structure helps). Broadcasts are being tested, and so are health checks. The only thing left undone is a GUI for better UX. That will be implemented tomorrow, along with some basic graphs in ELK.
TODO:
metrics (zabbix?)
ELK graphs
GUI
We can finally say that we have successfully implemented DynamoDB with all its key features: consistent hashing, vector clocks, replication and fault tolerance. The only things left undone are the integration of a monitoring platform (Zabbix, for example) and advanced ELK graphs for better visualization of system activity.
TODO:
metrics (zabbix?)