breakroom/snap

Hotswap vs Reindex

Closed this issue · 2 comments

@spec hotswap(Enumerable.t(), module(), String.t(), map(), Keyword.t()) ::

Hello all,

Let me first thank you for a well-written library. It is really easy to read. I also like how easy the tests are to follow.

We, as developers, can't predict the future and will inevitably have to change the document structure in an index. Why is the hotswap API built the way it is, instead of using the native Reindex APIs that ostensibly do the same thing? What was the design decision there?

OpenSearch: https://opensearch.org/docs/2.1/opensearch/reindex-data/
Elasticsearch: https://www.elastic.co/guide/en/elasticsearch/reference/current/docs-reindex.html

I'm not an OpenSearch/Elasticsearch expert, so are there any gotchas to be aware of?

Thanks for your kind words!

Hotswap builds a new index from source data, usually in your relational database. It gives you the flexibility to define entirely new fields, change mapping types, etc. — anything you can think of — and then swap seamlessly over.

I've not used reindex, but my understanding is that it lets you move the documents in one index to another index, inside ES. This won't let you add a completely new field that wasn't present in the original documents, because the data simply won't be there.

Depending on your needs you might be able to transform your data inside ES into the new format. In my experience that's a less common operation, but there's nothing stopping you using snap to make requests that perform it.
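As a rough sketch of what such an in-ES transformation could look like, the snippet below builds a request body for the native `POST /_reindex` endpoint, with an optional Painless script that derives a new field from data already in each document. `ReindexSketch`, the index names and the `full_name` field are all made up for illustration and are not part of snap; sending the body through snap's generic request functions is an assumption noted in the final comment.

```elixir
defmodule ReindexSketch do
  @moduledoc """
  Hypothetical helper (not part of snap): builds the JSON body for the
  Elasticsearch/OpenSearch `POST /_reindex` endpoint as a plain map.
  """

  @doc """
  Copies every document from `source` into `dest`. When `script` is given,
  it is applied as a Painless script to each document on the way.
  """
  def body(source, dest, script \\ nil) do
    base = %{
      "source" => %{"index" => source},
      "dest" => %{"index" => dest}
    }

    if script do
      Map.put(base, "script", %{"lang" => "painless", "source" => script})
    else
      base
    end
  end
end

# Derive a new field from fields that already exist in each document.
# Index and field names here are invented for the example.
body =
  ReindexSketch.body(
    "people_v1",
    "people_v2",
    "ctx._source.full_name = ctx._source.first_name + ' ' + ctx._source.last_name"
  )

# Assumption: the body is sent via snap's generic request functions, e.g.
# Snap.post(MyApp.Cluster, "/_reindex", body)
```

Note that the script can only combine or reshape values already stored in `_source`, which is the limitation described above: anything that lives only in your database still needs a rebuild from source data.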

If you discover a higher-level interface into reindex that you think would be useful for others, you're welcome to submit a PR, but I suggest sharing your thoughts early to make sure it's something we can absorb.

@tomtaylor Having worked more with the library, I now see the wisdom of your words. Our dataset is small enough that a hotswap far outperforms a reindex and is also much easier to code.