noi-techpark/it.bz.opendatahub.sparql

mobility-sync: We need to update "small tables", not delete and insert them

Closed this issue · 7 comments

Piiit commented

@jcte02 I have tested the script today, but the "small tables" strategy does not work. We cannot delete for example a "type" record, because it has a foreign key constraint on measurements. So, that part needs to be updated and then new types inserted into that table.

{"level":"error","ts":1656492433.6113122,"caller":"root/application.go:324","msg":"error truncating destination table in replica database","application":"mobility-sync","version":"development","error":"ERROR: update or delete on table \"type\" violates foreign key constraint \"fk_measurement_type_id_type_pk\" on table \"measurement\" (SQLSTATE=23503)","table":"type","stacktrace":"github.com/noi-techpark/it.bz.opendatahub.sparql/infrastructure/utils/mobility-sync/internal/cmd/root.(*Application).transferTable\n\t/home/pemoser/projects/noi/it.bz.opendatahub/sparql/infrastructure/utils/mobility-sync/internal/cmd/root/application.go:324\ngithub.com/noi-techpark/it.bz.opendatahub.sparql/infrastructure/utils/mobility-sync/internal/cmd/root.(*Application).Synchronize\n\t/home/pemoser/projects/noi/it.bz.opendatahub/sparql/infrastructure/utils/mobility-sync/internal/cmd/root/application.go:154\ngithub.com/noi-techpark/it.bz.opendatahub.sparql/infrastructure/utils/mobility-sync/cmd.glob..func2\n\t/home/pemoser/projects/noi/it.bz.opendatahub/sparql/infrastructure/utils/mobility-sync/cmd/root.go:149\ngithub.com/spf13/cobra.(*Command).execute\n\t/home/pemoser/go/pkg/mod/github.com/spf13/cobra@v1.4.0/command.go:860\ngithub.com/spf13/cobra.(*Command).ExecuteC\n\t/home/pemoser/go/pkg/mod/github.com/spf13/cobra@v1.4.0/command.go:974\ngithub.com/spf13/cobra.(*Command).Execute\n\t/home/pemoser/go/pkg/mod/github.com/spf13/cobra@v1.4.0/command.go:902\ngithub.com/noi-techpark/it.bz.opendatahub.sparql/infrastructure/utils/mobility-sync/cmd.Execute\n\t/home/pemoser/projects/noi/it.bz.opendatahub/sparql/infrastructure/utils/mobility-sync/cmd/root.go:170\nmain.main\n\t/home/pemoser/projects/noi/it.bz.opendatahub/sparql/infrastructure/utils/mobility-sync/main.go:6\nruntime.main\n\t/usr/lib/go-1.18/src/runtime/proc.go:250"}
Piiit commented

@jcte02 @bcogrel Lets use the Kanban again, so we have a better overview. However, lets have the meeting first ...

https://github.com/orgs/noi-techpark/projects/19

Piiit commented

In my bash script I handled that table like this:

function upsert() {
    TBL=$1
    shift
    GLOB=$@
    ls $GLOB &>/dev/null || {
        echo "..... UPSERT: Skipping... No file found for table $TBL with glob $GLOB"
        return
    }
    $PSQLVKG -c "create table if not exists intimev2.${TBL}_tmp as table intimev2.${TBL} with no data"
    for FILE in $(ls $GLOB); do
        echo "..... UPSERT: Processing $FILE..."
        $PSQLVKG -c "truncate intimev2.${TBL}_tmp;"
        cat $FILE | $PSQLVKG -c "copy intimev2.${TBL}_tmp from stdin with delimiter ',' csv;"
        QRY=$(while read data; do echo "$data"; done)
        $PSQLVKG -c "$QRY"
        mv $FILE $OUT/DONE
        echo "..... UPSERT: READY."
    done
    $PSQLVKG -c "drop table intimev2.${TBL}_tmp"
}


upsert "type" $PWD/$OUT/type.csv << EOF
    update intimev2.type
    set
        cname = type_tmp.cname,
        created_on = type_tmp.created_on,
        cunit = type_tmp.cunit,
        description = type_tmp.description,
        rtype = type_tmp.rtype,
        meta_data_id = type_tmp.meta_data_id
    from intimev2.type_tmp
    where type.id = type_tmp.id
    and type.cname = type_tmp.cname;

    insert into intimev2.type select * from intimev2.type_tmp on conflict do nothing;
EOF

Seems like you're using the old schema without the deferred constraints.
To which database server are you pointing the tool at?

Piiit commented

I tried with the new VKG server: virtual-knowledge-graph.co90ybcr8iim.eu-west-1.rds.amazonaws.com
and schema = vkg

...and the old testing DB, which has a very similar schema.

Piiit commented

@jcte02 You are right with the deferred constraints, let me retest... maybe I used the wrong db

Piiit commented

@jcte02 You are right, sorry for this inconvenience and extra work... I had the old address in the yaml config...

Piiit commented

Hmm... maybe I closed it too fast, now I still get foreign key constraint errors, using the new DB for sure now:

{
    "level": "error",
    "ts": 1656515415.4358892,
    "caller": "root/application.go:169",
    "msg": "error committing transaction",
    "application": "mobility-sync",
    "version": "development",
    "error": "ERROR: update or delete on table \"type\" violates foreign key constraint \"fk_measurementhistory_type_id_type_pk\" on table \"measurementhistory\" (SQLSTATE=23503)",
    "transaction": "type & type_metadata",
    "stacktrace": "github.com/noi-techpark/it.bz.opendatahub.sparql/infrastructure/utils/mobility-sync/internal/cmd/root.(*Application).Synchronize\n\t/home/pemoser/projects/noi/it.bz.opendatahub/sparql/infrastructure/utils/mobility-sync/internal/cmd/root/application.go:169\ngithub.com/noi-techpark/it.bz.opendatahub.sparql/infrastructure/utils/mobility-sync/cmd.glob..func2\n\t/home/pemoser/projects/noi/it.bz.opendatahub/sparql/infrastructure/utils/mobility-sync/cmd/root.go:149\ngithub.com/spf13/cobra.(*Command).execute\n\t/home/pemoser/go/pkg/mod/github.com/spf13/cobra@v1.4.0/command.go:860\ngithub.com/spf13/cobra.(*Command).ExecuteC\n\t/home/pemoser/go/pkg/mod/github.com/spf13/cobra@v1.4.0/command.go:974\ngithub.com/spf13/cobra.(*Command).Execute\n\t/home/pemoser/go/pkg/mod/github.com/spf13/cobra@v1.4.0/command.go:902\ngithub.com/noi-techpark/it.bz.opendatahub.sparql/infrastructure/utils/mobility-sync/cmd.Execute\n\t/home/pemoser/projects/noi/it.bz.opendatahub/sparql/infrastructure/utils/mobility-sync/cmd/root.go:170\nmain.main\n\t/home/pemoser/projects/noi/it.bz.opendatahub/sparql/infrastructure/utils/mobility-sync/main.go:6\nruntime.main\n\t/usr/lib/go-1.18/src/runtime/proc.go:250"
}

Config:

interval: "5m0s"

mobility.addr: prod-pg-bdp.co90ybcr8iim.eu-west-1.rds.amazonaws.com:5432
mobility.database: bdp
mobility.password: 
mobility.user: bdp_readonly

mobility.timeout.dial: 2m
mobility.timeout.read: 2m
mobility.timeout.write: 2m

replica.addr: virtual-knowledge-graph.co90ybcr8iim.eu-west-1.rds.amazonaws.com:5432
replica.database: vkg
replica.password: 
replica.user: ontopic

replica.timeout.dial: 2m
replica.timeout.read: 2m
replica.timeout.write: 2m