This is a demo for MongoDB+Elastic
Sync mongodb data to Elastic:
https://www.linkedin.com/pulse/5-way-sync-data-from-mongodb-es-kai-hao
In this demo we will demonstrate:
- Use the mongo-connector method to sync the data from MongoDB to Elastic
- Perform some queries from Elastic
Tear down
sudo docker ps -a | awk '{print $1}' | xargs --no-run-if-empty sudo docker rm -f
Prep (one time thing):
mkdir -p /data/mtools /data/mongodb /data/elastic
# fix for elasticsearch docker issue: https://github.com/docker-library/elasticsearch/issues/98
sudo sysctl -w vm.max_map_count=262144
Start containers:
sudo docker run --name mongodb -p 27017:27017 -v /data/mongodb:/data/db -d mongo:3.4.0-rc5 --replSet mset
sudo docker run --name elastic -p 9200:9200 -p 9300:9300 --link mongodb:mongodb -v /data/elastic:/usr/share/elasticsearch/data -d elasticsearch
sudo docker run -v /data/mtools:/data/mtools --link mongodb:mongodb -w /data/mtools -it stennie/ubuntu-mtools
Load
# mgenerate -p 4 -n 100000 -c tracking --port 9999 tracking-template.json
Create index
db.tracking.ensureIndex({courier:1})
db.tracking.ensureIndex({status:1})
db.tracking.ensureIndex({dest:1})
db.tracking.ensureIndex({origin:1})
Date.timeFunc(function(){
db.tracking.aggregate( [
{$match: {origin:"China"}},
{
$facet: {
"categorizedByStatus": [
{ $sortByCount: "$status" }
],
"categorizedByDays": [
{ $match: { days: { $exists: 1 } } },
{
$bucket: {
groupBy: "$days",
boundaries: [ 1, 2, 3,4, 5,6,7,8,9,10 ],
default: "Other",
output: {
"count": { $sum: 1 }
}
}
}
],
"categorizedByOrigin": [
{ $sortByCount: "$origin" }
]
}
}
])
}); // end timeFunc
When run with 100K documents, it took no time 1 million: 2500 ms 20 million:
sudo pip install mongo-connector
sudo pip install elastic2-doc-manager
mongo-connector -m localhost:27017 -t localhost:9200 -d elastic2_doc_manager
Note about the mongo-connector:
First it will discover the mongo nodes by connecting to the seed node. However if the mongo is running in the container, the hostname configuration in the replset config may not be what is the one used in the mongo-connector parameter. As such, one must manually config a DNS resolve for the hostname in the replset config. i.e.,
# cat /etc/hosts
127.0.0.1 localhost 6dca744f8c34
Note the hostname starts with 6dca