crbelaus/trans

Performance and Convinience: Indexing translations fields separately or use of postgres operator @> and GIN index

igormisha opened this issue · 4 comments

First of all, Trans is a great tool which I'm looking forward to use extensively in my projects, thanks for sharing!
For now it's just a question but may be it will become an enhancement proposal.
Filtering big table by many translated fields in translations column using translated macros in where clauses will require indexing in order to be fast enough.
So, what is the best way to do it?
Should we add the btree indexes for every field inside every language in translations?
May be it is better to add GIN index to the whole translations column and use @> operator for filtering instead of x->y->>z operators?
Possibly add one more macro like existing translated macro or add an argument to translated macro for such possibility?
Also may be the documentation needs some more info about indexing for improving performance.

Thank you for your comment @igormisha !! 🙌

I had never thought about the problem you are having, and it is indeed an important one. You are completely right about the necessity of having it covered by Trans both in the code and in the documentation.

GIN indexes are a little bit obscure to me. Let me investigate a little bit about how they work and what we could do with them so I can prepare a plan for addressing this functionality.

Thanks again!

As for the GIN indexes and using the @> operator in this case everything is reduced to addition index like this to migration file:

create table(:posts) do
  add :title, :string
  add :body, :text
  add :category, :string
  add :translations, :map 
end
execute "CREATE INDEX posts_translations_index ON posts USING gin(translations);"

And the fragment which compares that, for example, russian translation of category is equal to "Образование" for the where clause will look like this:
fragment(posts.category @> '{"translations": {"ru": "Образование"}}'::jsonb)

After my initial post I've got that you need to keep things as generic as possible to support different databases and this feature appears to be postgresql specific.

Very interesting @igormisha, thank you very much.

I see no problem about having some Postgresql specific functionality in Trans as long as it improves performance and makes the library easier to use.
It may be the moment to add a Trans.Postgres (or something like that) submodule that contains the PostgreSQL-specific functionality. It could have functions for generating the GIN index and for querying it.

A long time has passed since this discussion.

Since Trans itself does only provide functions that access the translations, the comparison operator themselves can be implemented by the Trans users. Today I am leaning towards improving the documentation regarding database indexes for translations.

A pull request improving the docs on this area would be really appreciated.