IATI/IATI-Datastore

Having more than one worker causes database conflicts

Closed · 1 comment

If more than one iati queue background worker is running at the same time, we get IntegrityErrors due to violations of the Organisation uniqueness constraint. Presumably two workers each try to insert the same organisation row.

In case it’s useful, here’s an example stack trace of this:

11:42:44 IntegrityError: (IntegrityError) duplicate key value violates unique constraint "organisation_ref_name_type_key"
DETAIL:  Key (ref, name, type)=(, Directorate-general Development Cooperation and Humanitarian Aid, 10) already exists.
 'INSERT INTO organisation (ref, name, type) VALUES (%(ref)s, %(name)s, %(type)s) RETURNING organisation.id' {'ref': u'', 'type': u'10', 'name': u'Directorate-general Development Cooperation and Humanitarian Aid'}
Traceback (most recent call last):
  File "/IATI-Datastore/env/lib/python2.7/site-packages/rq/worker.py", line 411, in perform_job
    rv = job.perform()
  File "/IATI-Datastore/env/lib/python2.7/site-packages/rq/job.py", line 343, in perform
    self._result = self.func(*self.args, **self.kwargs)
  File "/IATI-Datastore/iati_datastore/iatilib/crawler.py", line 297, in update_activities
    parse_resource(resource)
  File "/IATI-Datastore/iati_datastore/iatilib/crawler.py", line 250, in parse_resource
    parse_activity(new_identifiers, old_xml, resource)
  File "/IATI-Datastore/iati_datastore/iatilib/crawler.py", line 232, in parse_activity
    db.session.flush()
  File "/IATI-Datastore/env/lib/python2.7/site-packages/sqlalchemy/orm/scoping.py", line 149, in do
    return getattr(self.registry(), name)(*args, **kwargs)
  File "/IATI-Datastore/env/lib/python2.7/site-packages/sqlalchemy/orm/session.py", line 1814, in flush
    self._flush(objects)
  File "/IATI-Datastore/env/lib/python2.7/site-packages/sqlalchemy/orm/session.py", line 1896, in _flush
    flush_context.execute()
  File "/IATI-Datastore/env/lib/python2.7/site-packages/sqlalchemy/orm/unitofwork.py", line 372, in execute
    rec.execute(self)
  File "/IATI-Datastore/env/lib/python2.7/site-packages/sqlalchemy/orm/unitofwork.py", line 525, in execute
    uow
  File "/IATI-Datastore/env/lib/python2.7/site-packages/sqlalchemy/orm/persistence.py", line 63, in save_obj
    table, insert)
  File "/IATI-Datastore/env/lib/python2.7/site-packages/sqlalchemy/orm/persistence.py", line 565, in _emit_insert_statements
    execute(statement, params)
  File "/IATI-Datastore/env/lib/python2.7/site-packages/sqlalchemy/engine/base.py", line 664, in execute
    params)
  File "/IATI-Datastore/env/lib/python2.7/site-packages/sqlalchemy/engine/base.py", line 764, in _execute_clauseelement
    compiled_sql, distilled_params
  File "/IATI-Datastore/env/lib/python2.7/site-packages/sqlalchemy/engine/base.py", line 878, in _execute_context
    context)
  File "/IATI-Datastore/env/lib/python2.7/site-packages/sqlalchemy/engine/base.py", line 871, in _execute_context
    context)
  File "/IATI-Datastore/env/lib/python2.7/site-packages/sqlalchemy/engine/default.py", line 320, in do_execute
    cursor.execute(statement, parameters)
IntegrityError: (IntegrityError) duplicate key value violates unique constraint "organisation_ref_name_type_key"
DETAIL:  Key (ref, name, type)=(, Directorate-general Development Cooperation and Humanitarian Aid, 10) already exists.
 'INSERT INTO organisation (ref, name, type) VALUES (%(ref)s, %(name)s, %(type)s) RETURNING organisation.id' {'ref': u'', 'type': u'10', 'name': u'Directorate-general Development Cooperation and Humanitarian Aid'}

Running max one worker sucks – it would be great to fix this one.
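
One possible direction, as a rough sketch only: treat the duplicate-key error as "another worker got there first" and reuse the existing row instead of failing the job. The example below uses the standard-library `sqlite3` module as a stand-in for the real Postgres/SQLAlchemy setup, and `get_or_create_organisation` is a hypothetical helper, not something that exists in iatilib. The table layout is assumed from the constraint name and the INSERT statement in the trace.

```python
import sqlite3


def get_or_create_organisation(conn, ref, name, type_):
    """Return the id of the matching organisation row, inserting it if needed.

    Hypothetical helper for illustration; not part of iatilib.
    """
    try:
        # "with conn" commits on success and rolls back if the insert fails.
        with conn:
            cur = conn.execute(
                "INSERT INTO organisation (ref, name, type) VALUES (?, ?, ?)",
                (ref, name, type_),
            )
            return cur.lastrowid
    except sqlite3.IntegrityError:
        # The unique constraint fired: the row already exists
        # (for example, another worker inserted it first), so reuse it.
        row = conn.execute(
            "SELECT id FROM organisation "
            "WHERE ref = ? AND name = ? AND type = ?",
            (ref, name, type_),
        ).fetchone()
        return row[0]


# Assumed schema, mirroring the (ref, name, type) unique constraint.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE organisation ("
    "id INTEGER PRIMARY KEY, "
    "ref TEXT NOT NULL, name TEXT NOT NULL, type TEXT NOT NULL, "
    "UNIQUE (ref, name, type))"
)

org_name = "Directorate-general Development Cooperation and Humanitarian Aid"
first_id = get_or_create_organisation(conn, "", org_name, "10")
second_id = get_or_create_organisation(conn, "", org_name, "10")
row_count = conn.execute("SELECT COUNT(*) FROM organisation").fetchone()[0]
```

The second call hits the constraint and falls back to the SELECT, so both calls return the same id and only one row is stored. In the actual crawler the equivalent would probably need a savepoint around the insert (SQLAlchemy's `session.begin_nested()`), so that the failed insert does not roll back the rest of the activity being parsed, but that is an assumption about code not shown here.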