toluaina/pgsync

Is there a way to sync Postgresql Views with Elasticsearch?

Closed this issue · 6 comments

PGSync version: 2.2.1

Postgres version: 13.6

Elasticsearch version: 8.1.0

Redis version: 5.0.7

Python version: 3.8.3

Problem Description:
I have created an index in Elasticsearch for Postgresql Views. Whenever I do changes in the Postgres View table, the same changes have to happen in the Elasticsearch index. I tried running the command pgsync with schema.json and got the below error.
Is there a way to sync Postgresql Views with Elasticsearch? Because this is my requirement.

Error Message (if any):

$ pgsync --config /home/vijee/Elastic/schema.json
 - dept_cmpy


2022-03-29 12:50:09.632:ERROR:pgsync.elastichelper: Exception (psycopg2.errors.UndefinedColumn) column dept_cmpy_1.xmin does not exist
LINE 3: WHERE CAST(CAST(dept_cmpy_1.xmin AS TEXT) AS BIGINT) < 699
                        ^

[SQL: SELECT count(*) AS count_1 
FROM public.dept_cmpy AS dept_cmpy_1 
WHERE CAST(CAST(dept_cmpy_1.xmin AS TEXT) AS BIGINT) < %(param_1)s]
[parameters: {'param_1': 699}]
(Background on this error at: https://sqlalche.me/e/14/f405)
Traceback (most recent call last):
  File "/home/vijee/anaconda3/lib/python3.8/site-packages/sqlalchemy/engine/base.py", line 1802, in _execute_context
    self.dialect.do_execute(
  File "/home/vijee/anaconda3/lib/python3.8/site-packages/sqlalchemy/engine/default.py", line 732, in do_execute
    cursor.execute(statement, parameters)
psycopg2.errors.UndefinedColumn: column dept_cmpy_1.xmin does not exist
LINE 3: WHERE CAST(CAST(dept_cmpy_1.xmin AS TEXT) AS BIGINT) < 699


Got the same issue, but believe it is not possible at this moment, it is not part of the supported features.

I believe that if you can tweak the code a little you could make it use a view as the source table for the query, but it would only trigger on changes of 1 of the tables.

Adding a parameter for monitor table, which will have the the trigger logic, and the view could be in the source table could be a solution.

Also wish to thank @toluaina for the great work you have put into this.

Yes I'm afraid views are not supported yet. I hope to do this whan I have a bit of time

I've added support for views in the master branch.
This requires some addition to the schema.json.

For views, you need to include:

  • base_tables: Each node now requires this entry as a list of tables contributing to the view
  • primary_key: This is a list of primary_key for the view
  • foreign_key: Child nodes should include this to indicate the relationship to the parent node

@trymkb

closed as resolved. thanks for your patience on this

I looked into the implementation. Look like pgsync trigger an mat view refresh on any insert/update on the any of base table.
Our case is a bit different. We have aMatView which combines multiple tables. We only update it once every hour as it's too costly to refresh on every update on the table.

Any idea how we can table this ?

@toluaina You mentioned views are supported now but it is still looking for the xmin column in the views