RockefellerArchiveCenter/fornax

Archivematica ingests are starting while the previous ingest(s) are still processing

Closed this issue · 3 comments

Describe the bug

This line should check whether there is a still-processing ingest before starting the transfer in Archivematica. However, the check is not working and transfers are still starting, which causes a pileup in Archivematica.

To reproduce

Run this service over a few successive packages.

Expected behavior

Transfers do not start while an ingest is in-process in the Archivematica pipeline.

Impact on your work

I just confirmed with Artefactual that Archivematica could indeed blow up if too much is happening at once--though it managed to get through ~10 simultaneous ingests/transfers.

Additional context

I do not yet know why this is happening--my next step is to run this process manually and see if the issue lies with what we're getting back from Archivematica or something else.

@helrond I figured out what's happening--a bag was set to CLEANED_UP (probably a problematic package that we wanted to mark as complete to get it unstuck) but did not have an Archivematica UUID. So this could be chalked up to human error, but I'm not sure if we want to modify the check at all?

After thinking about this for a couple of minutes more I think there is an actual issue--we want to check last and not first:

>>> SIP.objects.filter(process_status__in=[SIP.APPROVED, SIP.CLEANED_UP], origin=sip.origin).first().created
datetime.datetime(2021, 10, 28, 14, 52, 0, 92865, tzinfo=<UTC>)
>>> SIP.objects.filter(process_status__in=[SIP.APPROVED, SIP.CLEANED_UP], origin=sip.origin).last().created
datetime.datetime(2022, 1, 15, 6, 4, 14, 917304, tzinfo=<UTC>)
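
To make the first/last difference concrete without a database, here is a minimal plain-Python sketch. FakeSIP and most_recently_started are hypothetical stand-ins (not the real SIP model or any fornax code), the UUID strings are placeholders, and the timestamps are the two from the output above, truncated to the minute.

```python
from dataclasses import dataclass
from datetime import datetime, timezone
from typing import Optional


@dataclass
class FakeSIP:
    """Hypothetical stand-in for the SIP model; not the real Django class."""
    created: datetime
    archivematica_uuid: Optional[str]


def most_recently_started(sips):
    """Return the newest SIP by creation time, or None if there are none."""
    return max(sips, key=lambda s: s.created, default=None)


sips = [
    FakeSIP(datetime(2021, 10, 28, 14, 52, tzinfo=timezone.utc), "old-uuid"),
    FakeSIP(datetime(2022, 1, 15, 6, 4, tzinfo=timezone.utc), "new-uuid"),
]

# Roughly what .first() gave us: the oldest matching record.
oldest = min(sips, key=lambda s: s.created)
# Roughly what .last() gives us: the record whose status we actually care about.
newest = most_recently_started(sips)
```

Checking the status of `oldest` tells us nothing about whether the pipeline is busy right now, which is why the check needs the newest record.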

This issue is still not resolved (though the last() fix was necessary). Basically, client.get_unit_status(last_started.archivematica_uuid) == 'PROCESSING' never evaluates to True, because get_unit_status returns a dict rather than a status string. What it returns looks like:

{'type': 'transfer', 'path': '/var/archivematica/sharedDirectory/currentlyProcessing/59193ace-30c3-4a3b-a656-9232ebc7ce0e.tar.gz', 'directory': '59193ace-30c3-4a3b-a656-9232ebc7ce0e.tar.gz', 'name': '59193ace-30c3-4a3b-a656-9232ebc7ce0e.tar.gz', 'uuid': '1651add2-d21b-445a-abd4-444450648ba9', 'microservice': 'Extract zipped bag transfer', 'status': 'PROCESSING', 'message': 'Fetched status for 1651add2-d21b-445a-abd4-444450648ba9 successfully.'}

The check should instead be client.get_unit_status(last_started.archivematica_uuid)['status'] == 'PROCESSING'
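
A minimal sketch of the difference, assuming the response is always a dict shaped like the one quoted above. is_processing is a hypothetical helper name, not something in fornax or the Archivematica client, and the response here is a trimmed copy of that output.

```python
def is_processing(unit_status):
    """True when a get_unit_status-style response reports PROCESSING.

    Assumption: unit_status is a dict with a 'status' key, as in the
    response quoted above. Anything else is treated as not processing.
    """
    return isinstance(unit_status, dict) and unit_status.get("status") == "PROCESSING"


# Trimmed copy of the response quoted above.
response = {
    "type": "transfer",
    "uuid": "1651add2-d21b-445a-abd4-444450648ba9",
    "microservice": "Extract zipped bag transfer",
    "status": "PROCESSING",
}

# The original comparison: a dict never equals a string, so this is False.
broken_check = response == "PROCESSING"
# Comparing the 'status' value instead gives the intended result.
fixed_check = is_processing(response)
```

Using .get() rather than ['status'] is a choice made for this sketch so that a response without a status key reads as "not processing" instead of raising KeyError; whether that is the right behavior for the service is a separate question.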