Digest is based on a partially ordered model schema blocking reproducibility
Opened this issue · 3 comments
Prerequisites
- I checked the documentation and found no answer to my problem
- I checked the existing issues and made sure there are no similar bug reports
Category
Bug (unexpected behavior)
Expected Behavior
Digests should be based on completely sorted json. Completely sorted means recursively sorting all keys, arrays, and nested objects. It seems that nested objects are sorted by key but, arrays are not.
Observed Behavior
Digest is based on a json representation of the open api model schema. However, the json is only ordered by keys.
For example given a model
class SuperImportantCheck(Model):
"""Plus random docstring"""
check: bool
message: str
counter: int
when Model.build_schema_digest()
is called it runs schema = model.schema_json(indent=None, sort_keys=True)
for the given SuperImportantCheck
Model
, schema
will equal
{"description": "Plus random docstring", "properties": {"check": {"title": "Check", "type": "boolean"}, "counter": {"title": "Counter", "type": "integer"}, "message": {"title": "Message", "type": "string"}}, "required": ["check", "message", "counter"], "title": "SuperImportantCheck", "type": "object"}
Notably, schema.required
maps to an array with the unordered parameters ["check", "message", "counter"]
.
The unordered nature of the schema json means other agents hoping to interact with an agent must implement a Model with the same ordering such that the digests match.
To Reproduce
Modify uAgents/python/src/model.py
to contain
@staticmethod
def build_schema_digest(model: Union["Model", Type["Model"]]) -> str:
schema = model.schema_json(indent=None, sort_keys=True)
print(schema)
digest = hashlib.sha256(schema.encode("utf8")).digest().hex()
return f"model:{digest}"
Then run python python/tests/test_model.py
Version
v0.17.0
Environment Details (Optional)
No response
Failure Logs (Optional)
No response
Additional Information (Optional)
No response
Thank you for opening this issue.
As you may have seen in the branches section or PR #374 just to name an example, we are very much aware of the topic.
Unfortunately we've been holding off of changing the way the digest is created for now because of backwards compatibility. Introducing a new way of calculating the digest will ultimately break compatibility for many existing agents so we need to be careful when rolling out such a change.
To minimise the amount of breaking changes we're working hard in-house to come up with a better solution for manifests in general which in turn will have an impact on how the digest will be created.
Coming back to your observation and what you've written under "expected behaviour":
Digests should be based on completely sorted json.
That is an assumption caused probably by the nature of our documentation. I'm convinced that there is no point in the docs where we state that Models used for agent to agent communication can differ in structure/order. It was always meant to be this way to allow for some flexibility but at the same time you'll never see examples or integrations where two agents that want to communicate don't share the exact same model definition (including order).
So for me that is an implicit definition and I want to apologise for being unclear in that regard.
We'll do our best to come up with a new way of treating the manifest in the near future and hope this will not block your progress or project in any way.
Leaving this issue open as a reminder
I had the same concerns regarding backward compatibility. I ran into this issue while implementing digest in an NPM package. To ensure that our agents are compatible with existing agents, we're implementing a digest that should be identical to the Python digest. I would like your opinion on our implementation of digest. We can continue the discussion at #539. Current progress on our digest implementation can be found at https://github.com/Luceium/uAgents-NPM/blob/digest/src/model.ts#L74.