Unable to re-solve already solved package with an error
fridex opened this issue · 11 comments
Describe the bug
One of the reasons thoth-station/adviser#1850 is failing is missing dependency information for google-resumable-media==1.2.0,
which causes the resolver to look for another resolution path (unsuccessfully). The reason behind this is a failed solver run recorded in the database. If I try to solve the mentioned package locally or in the cluster using solver-rhel-8-py38, the solver succeeds. It looks like we have wrong data/results synced in the database.
Traceback (most recent call last):
File "/opt/app-root/lib64/python3.8/site-packages/sqlalchemy/engine/base.py", line 1276, in _execute_context
self.dialect.do_execute(
File "/opt/app-root/lib64/python3.8/site-packages/sqlalchemy/engine/default.py", line 608, in do_execute
cursor.execute(statement, parameters)
psycopg2.errors.UniqueViolation: duplicate key value violates unique constraint "python_package_version_package_name_package_version_python__key"
DETAIL: Key (package_name, package_version, python_package_index_id, os_name, os_version, python_version)=(google-resumable-media, 1.2.0, 1, rhel, 8, 3.8) already exists.
The above exception was the direct cause of the following exception:
Traceback (most recent call last):
File "/opt/app-root/lib64/python3.8/site-packages/thoth/storages/graph/models_base.py", line 52, in get_or_create
session.commit()
File "/opt/app-root/lib64/python3.8/site-packages/sqlalchemy/orm/session.py", line 1046, in commit
self.transaction.commit()
File "/opt/app-root/lib64/python3.8/site-packages/sqlalchemy/orm/session.py", line 504, in commit
self._prepare_impl()
File "/opt/app-root/lib64/python3.8/site-packages/sqlalchemy/orm/session.py", line 483, in _prepare_impl
self.session.flush()
File "/opt/app-root/lib64/python3.8/site-packages/sqlalchemy/orm/session.py", line 2540, in flush
self._flush(objects)
File "/opt/app-root/lib64/python3.8/site-packages/sqlalchemy/orm/session.py", line 2682, in _flush
transaction.rollback(_capture_exception=True)
File "/opt/app-root/lib64/python3.8/site-packages/sqlalchemy/util/langhelpers.py", line 68, in __exit__
compat.raise_(
File "/opt/app-root/lib64/python3.8/site-packages/sqlalchemy/util/compat.py", line 182, in raise_
raise exception
File "/opt/app-root/lib64/python3.8/site-packages/sqlalchemy/orm/session.py", line 2642, in _flush
flush_context.execute()
File "/opt/app-root/lib64/python3.8/site-packages/sqlalchemy/orm/unitofwork.py", line 422, in execute
rec.execute(self)
File "/opt/app-root/lib64/python3.8/site-packages/sqlalchemy/orm/unitofwork.py", line 586, in execute
persistence.save_obj(
File "/opt/app-root/lib64/python3.8/site-packages/sqlalchemy/orm/persistence.py", line 239, in save_obj
_emit_insert_statements(
File "/opt/app-root/lib64/python3.8/site-packages/sqlalchemy/orm/persistence.py", line 1135, in _emit_insert_statements
result = cached_connections[connection].execute(
File "/opt/app-root/lib64/python3.8/site-packages/sqlalchemy/engine/base.py", line 1011, in execute
return meth(self, multiparams, params)
File "/opt/app-root/lib64/python3.8/site-packages/sqlalchemy/sql/elements.py", line 298, in _execute_on_connection
return connection._execute_clauseelement(self, multiparams, params)
File "/opt/app-root/lib64/python3.8/site-packages/sqlalchemy/engine/base.py", line 1124, in _execute_clauseelement
ret = self._execute_context(
File "/opt/app-root/lib64/python3.8/site-packages/sqlalchemy/engine/base.py", line 1316, in _execute_context
self._handle_dbapi_exception(
File "/opt/app-root/lib64/python3.8/site-packages/sqlalchemy/engine/base.py", line 1510, in _handle_dbapi_exception
util.raise_(
File "/opt/app-root/lib64/python3.8/site-packages/sqlalchemy/util/compat.py", line 182, in raise_
raise exception
File "/opt/app-root/lib64/python3.8/site-packages/sqlalchemy/engine/base.py", line 1276, in _execute_context
self.dialect.do_execute(
File "/opt/app-root/lib64/python3.8/site-packages/sqlalchemy/engine/default.py", line 608, in do_execute
cursor.execute(statement, parameters)
sqlalchemy.exc.IntegrityError: (psycopg2.errors.UniqueViolation) duplicate key value violates unique constraint "python_package_version_package_name_package_version_python__key"
DETAIL: Key (package_name, package_version, python_package_index_id, os_name, os_version, python_version)=(google-resumable-media, 1.2.0, 1, rhel, 8, 3.8) already exists.
[SQL: INSERT INTO python_package_version (package_name, package_version, os_name, os_version, python_version, entity_id, python_package_index_id, python_package_metadata_id, is_missing, provides_source_distro) VALUES (%(package_name)s, %(package_version)s, %(os_name)s, %(os_version)s, %(python_version)s, %(entity_id)s, %(python_package_index_id)s, %(python_package_metadata_id)s, %(is_missing)s, %(provides_source_distro)s) RETURNING python_package_version.id]
[parameters: {'package_name': 'google-resumable-media', 'package_version': '1.2.0', 'os_name': 'rhel', 'os_version': '8', 'python_version': '3.8', 'entity_id': 2438371, 'python_package_index_id': 1, 'python_package_metadata_id': 313979, 'is_missing': False, 'provides_source_distro': True}]
(Background on this error at: http://sqlalche.me/e/13/gkpj)
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "app.py", line 252, in <module>
cli()
File "/opt/app-root/lib64/python3.8/site-packages/click/core.py", line 829, in __call__
return self.main(*args, **kwargs)
File "/opt/app-root/lib64/python3.8/site-packages/click/core.py", line 782, in main
rv = self.invoke(ctx)
File "/opt/app-root/lib64/python3.8/site-packages/click/core.py", line 1066, in invoke
return ctx.invoke(self.callback, **ctx.params)
File "/opt/app-root/lib64/python3.8/site-packages/click/core.py", line 610, in invoke
return callback(*args, **kwargs)
File "app.py", line 226, in cli
_do_sync(
File "app.py", line 125, in _do_sync
stats = sync_documents(
File "/opt/app-root/lib64/python3.8/site-packages/thoth/storages/sync.py", line 574, in sync_documents
stats_change = handler(
File "/opt/app-root/lib64/python3.8/site-packages/thoth/storages/sync.py", line 131, in sync_solver_documents
graph.sync_solver_result(document)
File "/opt/app-root/lib64/python3.8/site-packages/thoth/storages/graph/postgres.py", line 5078, in sync_solver_result
python_package_version = self._create_python_package_version(
File "/opt/app-root/lib64/python3.8/site-packages/thoth/storages/graph/postgres.py", line 3869, in _create_python_package_version
python_package_version, _ = PythonPackageVersion.get_or_create(
File "/opt/app-root/lib64/python3.8/site-packages/thoth/storages/graph/models_base.py", line 62, in get_or_create
return session.query(cls).filter_by(**kwargs).one(), True
File "/opt/app-root/lib64/python3.8/site-packages/sqlalchemy/orm/query.py", line 3500, in one
raise orm_exc.NoResultFound("No row was found for one()")
sqlalchemy.orm.exc.NoResultFound: No row was found for one()
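The failure mode above can be reproduced in miniature: the table's unique constraint covers only a subset of the columns, so the INSERT collides with an existing row, but the fallback SELECT in `get_or_create` filters on all columns (including `python_package_metadata_id`) and therefore finds nothing. A minimal sketch using stdlib `sqlite3` (the table shape and column names are illustrative, not the real thoth-storages schema):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    """CREATE TABLE python_package_version (
        id INTEGER PRIMARY KEY,
        package_name TEXT, package_version TEXT, metadata_id INTEGER,
        UNIQUE (package_name, package_version)  -- constraint excludes metadata_id
    )"""
)
# Row synced by an earlier solver run, with a different metadata id.
conn.execute(
    "INSERT INTO python_package_version (package_name, package_version, metadata_id) "
    "VALUES ('google-resumable-media', '1.2.0', 1)"
)

# get_or_create with metadata_id=2: the INSERT violates the unique constraint...
try:
    conn.execute(
        "INSERT INTO python_package_version (package_name, package_version, metadata_id) "
        "VALUES ('google-resumable-media', '1.2.0', 2)"
    )
except sqlite3.IntegrityError as exc:
    print("insert failed:", exc)

# ...and the fallback SELECT filters on *all* inserted columns, so it
# finds no row either -- the NoResultFound seen in the traceback.
row = conn.execute(
    "SELECT id FROM python_package_version "
    "WHERE package_name='google-resumable-media' AND package_version='1.2.0' "
    "AND metadata_id=2"
).fetchone()
print("lookup result:", row)  # → lookup result: None
```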
To Reproduce
Steps to reproduce the behavior:
- Go to prod and schedule the solver solver-rhel-8-py38 for google-resumable-media==1.2.0
- See that the solver finishes successfully
- Graph sync fails with the exception reported above
Expected behavior
Graph sync should sync the solver result.
Interestingly, the solver document synced previously has no package information:
{
"metadata": {
"analyzer": "thoth-solver",
"analyzer_version": "1.6.3",
"arguments": {
"python": {
"exclude_packages": null,
"index": "https://pypi.org/simple",
"no_pretty": false,
"no_transitive": true,
"output": "/mnt/workdir/solver-rhel-8-py38-4f4eb1b6",
"requirements": "google-resumable-media===1.2.0",
"virtualenv": "/home/solver/venv"
},
"thoth-solver": {
"verbose": false
}
},
"datetime": "2020-12-15T21:12:22.560044",
"distribution": {
"codename": "Ootpa",
"id": "rhel",
"like": "fedora",
"version": "8.3",
"version_parts": {
"build_number": "",
"major": "8",
"minor": "3"
}
},
"document_id": "solver-rhel-8-py38-4f4eb1b6",
"duration": 10,
"hostname": "solver-rhel-8-py38-4f4eb1b6-3956294335",
"os_release": {
"id": "rhel",
"name": "Red Hat Enterprise Linux",
"platform_id": "platform:el8",
"redhat_bugzilla_product": "Red Hat Enterprise Linux 8",
"redhat_bugzilla_product_version": "8.3",
"redhat_support_product": "Red Hat Enterprise Linux",
"redhat_support_product_version": "8.3",
"version": "8.3 (Ootpa)",
"version_id": "8.3"
},
"python": {
"api_version": 1013,
"implementation_name": "cpython",
"major": 3,
"micro": 3,
"minor": 8,
"releaselevel": "final",
"serial": 0
},
"thoth_deployment_name": "ocp4-stage",
"timestamp": 1608066742
},
"result": {
"environment": {
"implementation_name": "cpython",
"implementation_version": "3.8.3",
"os_name": "posix",
"platform_machine": "x86_64",
"platform_python_implementation": "CPython",
"platform_release": "4.18.0-193.14.3.el8_2.x86_64",
"platform_system": "Linux",
"platform_version": "#1 SMP Mon Jul 20 15:02:29 UTC 2020",
"python_full_version": "3.8.3",
"python_version": "3.8",
"sys_platform": "linux"
},
"environment_packages": [
{
"package_name": "pipdeptree",
"package_version": "1.0.0"
}
],
"errors": [],
"platform": "linux-x86_64",
"tree": [],
"unparsed": [],
"unresolved": [
{
"index_url": "https://pypi.org/simple",
"is_provided_package": true,
"is_provided_package_version": false,
"package_name": "google-resumable-media",
"version_spec": "===1.2.0"
}
]
}
}
Should we modify the solver workflow as follows?

Add a new condition resync to the workflow scheduling, which is False by default, for task number 2 of the workflow:

1. solver workflow-task
2. allow-resync workflow-task: check the solver version of the run, check if the package name, version, index + solver combination is already solved and, in that case, delete the data (runs only under the resync=True condition, otherwise this task does not run)
3. graph-sync workflow-task

Who is going to schedule solver workflows with resync=True? A new CronWorkflow component that checks for a newly available solver version and schedules the solver for packages solved with a solver version below a certain one, so Thoth keeps updating its own knowledge and we don't put all data in the database?

graph-refresh will keep scheduling unsolved packages, always using the latest solver available, so there is no conflict with the check introduced.

wdyt @fridex @harshad16 @goern ?
Should we modify the solver workflow as follows? Add a new condition resync to the workflow scheduling, which is False by default, for task number 2 of the workflow:

1. solver workflow-task
2. allow-resync workflow-task: check the solver version of the run, check if the package name, version, index + solver combination is already solved and, in that case, delete the data (resync=True condition, otherwise this task does not run)
3. graph-sync workflow-task

I think we can reuse the THOTH_FORCE_SYNC parameter used in the graph-sync task. This way, the user/component responsible for scheduling the graph-sync task will be aware that a force sync of computed results is being done.
Who is going to schedule solver workflows with resync=True? A new CronWorkflow component that checks for a newly available solver version and schedules the solver for packages solved with an older solver version, so Thoth keeps updating its own knowledge and we don't put all data in the database?

It might be good if we start scheduling workflows on our own (no code/component for this yet). If we spot a bug in solver data, we then have the logic to decide which data should be recomputed (trigger solver workflows).

graph-refresh will keep scheduling unsolved packages, always using the latest solver available, so there is no conflict with the check introduced.

This will be expensive in terms of resources. If we determine which solver results are affected (which have wrong data), we can trigger an update just for the affected components, rather than recomputing all the data we have once again (which takes weeks).
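Selecting only the affected results, as suggested above, could be driven by a query over the solved records. A hedged sketch with stdlib `sqlite3` (the `solved` table, its columns, and the cutoff variable are all illustrative; the real thoth-storages API differs, and a production version should compare versions with `packaging.version` rather than strings):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE solved (package_name TEXT, package_version TEXT, solver_version TEXT)"
)
conn.executemany(
    "INSERT INTO solved VALUES (?, ?, ?)",
    [
        ("google-resumable-media", "1.2.0", "1.6.3"),
        ("requests", "2.25.1", "1.7.0"),
    ],
)

MIN_SOLVER_VERSION = "1.7.0"  # resync anything solved by an older solver

# Naive string comparison only works because these versions share one shape;
# use packaging.version.Version for real version ordering.
to_resync = conn.execute(
    "SELECT package_name, package_version FROM solved WHERE solver_version < ?",
    (MIN_SOLVER_VERSION,),
).fetchall()
print(to_resync)  # → [('google-resumable-media', '1.2.0')]
```

Each returned package would then be scheduled as a solver workflow with resync=True, leaving up-to-date records untouched.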
/reopen
@fridex: Reopened this issue.
In response to this:
/reopen
Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.
Issues go stale after 90d of inactivity.
Mark the issue as fresh with /remove-lifecycle stale.
Stale issues rot after an additional 30d of inactivity and eventually close.
If this issue is safe to close now please do so with /close.
/lifecycle stale
Stale issues rot after 30d of inactivity.
Mark the issue as fresh with /remove-lifecycle rotten.
Rotten issues close after an additional 30d of inactivity.
If this issue is safe to close now please do so with /close.
/lifecycle rotten
Rotten issues close after 30d of inactivity.
Reopen the issue with /reopen.
Mark the issue as fresh with /remove-lifecycle rotten.
/close
@sesheta: Closing this issue.
In response to this:
Rotten issues close after 30d of inactivity.
Reopen the issue with /reopen.
Mark the issue as fresh with /remove-lifecycle rotten.
/close
/remove-lifecycle rotten
/lifecycle frozen
/sig stack-guidance
This issue was already resolved in a past release of the thoth-station/storages
module. The details follow:
The cause of the issue was the introduction of the python_package_metadata_id
field in the PythonPackageVersion
table via the function _create_python_package_version
while the unique key constraint covered only the column set at
storages/thoth/storages/graph/models.py
Line 92 in f8d5406
So on each transaction inserting a new entry into the table
PythonPackageVersion
, it would throw a sqlalchemy.exc.IntegrityError
, as an entry matching the unique constraint from storages/thoth/storages/graph/models.py
(Line 92 in f8d5406) already existed with a different
python_package_metadata_id
.
For example, given the entry: [parameters: {'package_name': 'google-resumable-media', 'package_version': '1.2.0', 'os_name': 'rhel', 'os_version': '8', 'python_version': '3.8', 'entity_id': 2438371, 'python_package_index_id': 1, 'python_package_metadata_id': 313979, 'is_missing': False, 'provides_source_distro': True}]
when SQL then tried to get the existing entry, it was unable to find it, because the
lookup matched on all the fields (including python_package_metadata_id
), and no such row exists. This issue got fixed with commit 47a2b66, part of a PR,
with the help of PRs #2310 and #2602.
This particular issue is fixed.
Closing the issue.
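For reference, a get-or-create that stays robust against this class of bug looks up the existing row using only the unique-constraint columns, not every inserted field. A minimal sketch, assuming a hypothetical `pkg` table with stdlib `sqlite3` (names and schema are illustrative, not the actual thoth-storages fix):

```python
import sqlite3

UNIQUE_COLUMNS = ("package_name", "package_version")  # columns of the unique constraint

def get_or_create(conn, **fields):
    """Insert a row; on a unique-key collision, fetch the existing row
    by the unique-constraint columns only."""
    cols = ", ".join(fields)
    placeholders = ", ".join("?" for _ in fields)
    try:
        cur = conn.execute(
            f"INSERT INTO pkg ({cols}) VALUES ({placeholders})",
            tuple(fields.values()),
        )
        return cur.lastrowid, True
    except sqlite3.IntegrityError:
        # Key point: filter by the unique key, NOT by every inserted field
        # (filtering on all fields is what caused the NoResultFound above).
        where = " AND ".join(f"{c}=?" for c in UNIQUE_COLUMNS)
        row = conn.execute(
            f"SELECT id FROM pkg WHERE {where}",
            tuple(fields[c] for c in UNIQUE_COLUMNS),
        ).fetchone()
        return row[0], False

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE pkg (id INTEGER PRIMARY KEY, package_name TEXT, "
    "package_version TEXT, metadata_id INTEGER, "
    "UNIQUE (package_name, package_version))"
)
rid1, created1 = get_or_create(conn, package_name="a", package_version="1.0", metadata_id=1)
rid2, created2 = get_or_create(conn, package_name="a", package_version="1.0", metadata_id=2)
print(rid1, created1, rid2, created2)  # → 1 True 1 False
```

The second call collides on the unique key but still resolves to the existing row, even though its metadata_id differs.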