Unable to determine mongo collection for indexed document
wesleyarchbell opened this issue · 22 comments
I am unable to determine which MongoDB collection a document indexed in ES belongs to. I have created a separate river for each MongoDB collection, but they all write to the same index. Is there any way to determine, for a given document, which MongoDB collection it belongs to?
Hello,
You have a couple of options with the current release:
- Create a new index for each river.
- Use a script filter to add an additional attribute to the document to be indexed.
Let me know if either of these 2 options works for you.
I could also create a new setting, options/include_collection,
which provides the attribute name where the collection name will be stored.
Thanks,
Richard.
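As a rough illustration of the first option, the sketch below generates river settings in which each collection is sent to its own index, so the index name itself identifies the source collection. The settings layout follows the river definitions posted in this thread; the helper name `river_settings` and the `<collection>-index` naming scheme are assumptions made for the example, not something the plugin prescribes.

```shell
#!/bin/sh
# Hypothetical helper: print river settings that send one MongoDB
# collection to its own index, named "<collection>-index" (assumed scheme).
river_settings() {
  db="$1"
  collection="$2"
  cat <<EOF
{
  "type": "mongodb",
  "mongodb": {
    "db": "${db}",
    "collection": "${collection}"
  },
  "index": {
    "name": "${collection}-index",
    "type": "book"
  }
}
EOF
}

# Example: print the settings for the "wiley" collection of database "readcloud".
river_settings readcloud wiley
```

The output could then be piped to something like `curl -XPUT "http://$ES_HOST:$ES_PORT/_river/readcloud_wiley/_meta" -d @-`. Elasticsearch index names must be lowercase, so this naming scheme only works as-is for lowercase collection names.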
For the time being I just do a MongoDB find by document id in each collection to work out which collection it belongs to, as I didn't want to have to go and update documents, but I reckon the settings option is the way to go.
Thanks, which release will this make it into?
Hi,
That's available in version 1.6.11 just released today.
Thanks,
Richard.
Thanks. If I use this version (1.6.11), will it still work with older versions of MongoDB and Elasticsearch, specifically MongoDB 2.4.4 and ES 0.90.1?
Hi,
Yes, it should work.
Thanks,
Richard.
Ok thanks will give it a try
One more thing: where is 'options/include_collection' set? In which file?
Hi,
In the river settings. The wiki should be updated.
Thanks,
Richard.
OK Thanks :)
I've just tried the new option with v1.6.11, but I don't see any new field in the source fields.
Hi,
I will not be able to help for a few days (currently on vacation).
But you can compare against the example available in HEAD (issues/101 folder). There is also a test case available.
Can you also provide your river settings?
Thanks,
Richard.
No worries, thanks will check it out. Have a good break man :)
Hi Richard, when you get back, would you be able to have a look at this? I've reviewed the test for fix #101, but my data does not reflect the change in the document source fields, i.e. there is no 'include_collection' field after I create the river with the options/include_collection setting. I can see that the field is included in one of the rivers by using the head plugin and viewing it in the browser:
_index | _type | _id | _score | type | db | collection | include_collection | name | throttle_size
_river | readcloud_wiley | _meta | 1 | book | rdb | wiley | wiley | book-index | 50
But it is not present in the document source.
The data gets inserted via a bulk import (mongoexport and mongorestore, after the river is created for each collection).
Thanks in advance.
I have successfully re-executed the test located here [1].
@wesleyarchbell can you please share the river settings used?
[1] - https://github.com/richardwilly98/elasticsearch-river-mongodb/tree/master/resources/issues/101
Thanks,
Richard.
Hi Richard, sorry for the delay. Here are my settings for a river:
{
  "type":"mongodb",
  "mongodb":{
    "db":"test",
    "collection":"collection123"
  },
  "options": {
    "include_collection":"collection123"
  },
  "index":{
    "name":"book-index",
    "throttle_size":"50",
    "type":"book"
  }
}
@wesleyarchbell I have added additional traces. Can you please try the snapshot version available here [1]?
- Stop ES
- In $ES_HOME\plugins\river-mongodb, replace elasticsearch-river-mongodb-1.6.11.jar with elasticsearch-river-mongodb-1.6.12-SNAPSHOT.jar
- Restart ES
Enable logging for the river:
In $ES_HOME\config\logging.yml, add the following to the logger section:
  river.mongodb: TRACE
  org.elasticsearch.river.mongodb.MongoDBRiver$Indexer: TRACE
Please send me the ES log file.
[1] - https://dl.dropboxusercontent.com/u/64847502/elasticsearch-river-mongodb-1.6.12-SNAPSHOT.zip
https://www.dropbox.com/s/fm9am817rk3x3nr/elasticsearch.log
It seems the include_collection option is empty.
This is the exact curl command I use to create the river:
curl -XPUT 'http://'"$ES_HOST"':'"$ES_PORT"'/_river/'"$DATABASE"'_'"$collection"'/_meta' -d '
{
  "type":"mongodb",
  "mongodb":{
    "db":"'"$DATABASE"'",
    "collection":"'"$collection"'"
  },
  "options": {
    "include_collection":"'"$collection"'"
  },
  "index":{
    "name":"book-index",
    "throttle_size":"50",
    "type":"book"
  }
}'
@wesleyarchbell from the code it seems that the options section is not recognized.
Can you please execute curl -XGET http://localhost:9200/_river/{river-name}/_meta
The curl command results in:
{"_index":"_river","_type":"readcloud_wiley","_id":"_meta","_version":1,"exists":true, "_source" :
{
  "type":"mongodb",
  "mongodb":{
    "db":"readcloud",
    "collection":"wiley"
  },
  "options": {
    "include_collection":"wiley"
  },
  "index":{
    "name":"book-index",
    "throttle_size":"50",
    "type":"book"
  }
}}
options should be inside mongodb:
{
  "type":"mongodb",
  "mongodb":{
    "db":"readcloud",
    "collection":"wiley",
    "options": {
      "include_collection":"wiley"
    }
  },
  "index":{
    "name":"book-index",
    "throttle_size":"50",
    "type":"book"
  }
}
D'oh! Thanks Richard.
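For reference, the corrected river definition could be generated along the lines of the sketch below, with `options` nested inside `mongodb`. As described earlier in the thread, the value of `include_collection` is the name of the attribute that will hold the collection name, so the sketch passes a fixed attribute name rather than the collection name itself. The attribute name `mongo_collection` and the helper `river_settings` are assumptions chosen for illustration.

```shell
#!/bin/sh
# Hypothetical helper: print river settings with "options" nested inside
# "mongodb", as in the corrected definition in this thread.
river_settings() {
  db="$1"
  collection="$2"
  attribute="$3"   # name of the field that will hold the collection name
  cat <<EOF
{
  "type": "mongodb",
  "mongodb": {
    "db": "${db}",
    "collection": "${collection}",
    "options": {
      "include_collection": "${attribute}"
    }
  },
  "index": {
    "name": "book-index",
    "throttle_size": "50",
    "type": "book"
  }
}
EOF
}

# Print the settings for one collection; "mongo_collection" is an assumed name.
river_settings readcloud wiley mongo_collection
```

To register the river, the output could be piped to curl, for example `river_settings "$DATABASE" "$collection" mongo_collection | curl -XPUT "http://$ES_HOST:$ES_PORT/_river/${DATABASE}_${collection}/_meta" -d @-`, where `ES_HOST`, `ES_PORT`, `DATABASE`, and `collection` are the variables from the original curl command. Using a here-document inside a function avoids the nested single and double quoting of the inline JSON body.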