thoth-station/integration-tests

Provenance checks do not finish on time (aws-prod)

mayaCostantini opened this issue · 2 comments

Describe the bug

See the integration tests report for aws-prod (2022-07-11, version 0.11.2):

Traceback (most recent call last):
  File "/opt/app-root/lib64/python3.8/site-packages/behave/model.py", line 1329, in run
    match.run(runner.context)
  File "/opt/app-root/lib64/python3.8/site-packages/behave/matchers.py", line 98, in run
    self.func(context, *args, **kwargs)
  File "features/steps/provenance_check.py", line 57, in step_impl
    raise RuntimeError("provenance-checker took too much time to finish")
RuntimeError: provenance-checker took too much time to finish

Captured logging:
INFO:thamos.lib:Successfully submitted provenance check analysis 'provenance-checker-220711022402-5a020e0151794ec6' to 'https://api.prod.thoth-station.ninja/api/v1'

This affects both the provenance_flask and provenance_flask_error test cases.
It looks like provenance checks are not being scheduled, even though the status message says the analysis is queued:

{
  "error": "Analysis 'provenance-checker-220711081404-295f34a40aeaf941' is being queued and scheduled for processing",
  "parameters": {
    "analysis_id": "provenance-checker-220711081404-295f34a40aeaf941"
  },
  "status": {
    "finished_at": null,
    "reason": null,
    "started_at": null,
    "state": "pending"
  }
}

(See Argo UI for aws-prod).
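To make this failure mode easier to tell apart from a slow run, the wait loop could report whether the analysis ever left the queue. Below is a minimal sketch, not the actual code in features/steps/provenance_check.py. The names `wait_for_analysis`, `AnalysisTimeout` and the `get_status` callable are hypothetical; the only thing taken from this report is the shape of the status payload (`started_at`, `finished_at`, `state`). It assumes `finished_at` stays null until the workflow is done.

```python
import time
from typing import Any, Callable, Dict


class AnalysisTimeout(RuntimeError):
    """Raised when an analysis does not finish before the deadline."""


def wait_for_analysis(
    get_status: Callable[[], Dict[str, Any]],
    timeout: float = 900.0,
    interval: float = 5.0,
    clock: Callable[[], float] = time.monotonic,
    sleep: Callable[[float], None] = time.sleep,
) -> Dict[str, Any]:
    """Poll get_status() until the analysis finishes or the timeout expires.

    get_status is a hypothetical callable returning a dict shaped like the
    status response quoted above. clock and sleep are injectable for testing.
    """
    deadline = clock() + timeout
    while True:
        response = get_status()
        status = response.get("status", {})
        # Assumption: a non-null finished_at means the workflow is done.
        if status.get("finished_at") is not None:
            return response
        if clock() >= deadline:
            # Distinguish "never scheduled" from "started but too slow".
            if status.get("started_at") is None:
                hint = "never left the queue"
            else:
                hint = "started but did not finish"
            raise AnalysisTimeout(
                f"analysis {hint} (last state: {status.get('state')!r})"
            )
        sleep(interval)
```

With the payload shown above (state "pending", started_at null), this would raise with "never left the queue", which points at scheduling rather than at the provenance-checker itself.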

Expected behavior

Provenance checks are scheduled and finish on time.

/kind bug
/priority critical-urgent
/sig stack-guidance

Verified this.

  • Requested a provenance check for the standard example:
    https://khemenu.thoth-station.ninja/api/v1/provenance/python?origin=git%40github.com%3Athoth-station%2Fadviser.git&debug=false&force=false
  • Works correctly:
STEP                                                 TEMPLATE                                                         PODNAME                                                      DURATION  MESSAGE
 ✔ provenance-checker-221107184722-a5e7957d1772a12b  provenance-checker                                                                                                                                                     
 ├─✔ provenance-check                                provenance-check/provenance-check                                provenance-checker-221107184722-a5e7957d1772a12b-2149388737  15s                                      
 ├─○ graph-sync-provenance-check                     graph-sync/graph-sync                                                                                                                   when '0 == 1' evaluated false  
 ├─○ parse-provenance-checker-output                 parse-provenance-checker-output/parse-provenance-checker-output                                                                         when '0 == 1' evaluated false  
 └─○ send-messages                                   send-messages/send-messages                                                                                                             when '0 == 1' evaluated false  
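
For anyone reading the request URL above, the query string is percent-encoded. A small sketch using only the Python standard library decodes it to show which parameters were sent; it only parses the URL and does not call the API.

```python
from urllib.parse import parse_qs, urlsplit

# URL copied from the comment above.
url = (
    "https://khemenu.thoth-station.ninja/api/v1/provenance/python"
    "?origin=git%40github.com%3Athoth-station%2Fadviser.git"
    "&debug=false&force=false"
)

parts = urlsplit(url)
# parse_qs returns a list per key; each key appears once here.
params = {key: values[0] for key, values in parse_qs(parts.query).items()}

print(parts.path)
print(params)
```

The `origin` parameter decodes to git@github.com:thoth-station/adviser.git, with `debug` and `force` both set to false.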

I verified the integration tests as well; they run successfully without any issues.

Cluster execution:
Screenshot from 2022-11-07 13-53-51

integration-test local execution:
Screenshot from 2022-11-07 13-57-04

Closing this issue, as the problem is no longer present.
Thanks everyone.