Objective unicity check
Closed this issue · 7 comments
The unicity of an objective is currently only based on the description content. This allows objective with the same metrics to be uploaded multiple time so that they are associated with different datasets.
This however prevents the creation of an objective with the same description but a different metrics archive.
We could have the unicity check include both description and metrics, which would allow the same flexibility when it comes to re-uploading a metrics with another description but would also allow to upload multiple metrics with the same description.
It should be reviewed and validated in the substra specification before implementing it.
@Kelvin-M I don't see anything in the spec preventing us from this change, I think we can do this update
@jmorel It may be easier to just compute the hash on the metrics file instead of description no ?
After giving this some thought, I think the unicity should be on metrics file + datamanager key + data samples keys. This way you can have multiple objective sharing the same metrics code but with different data sets. And you can have the same data set but with different metrics code. I think this covers every use case.
And it removes the description from the check, which in my opinion makes no sense.
@jmorel I think it will be difficult to add datamanager key and data samples keys as paramater for the hash.
Doing something like :
data_keys = ''.join([data_manager_pkhash] + sorted(data_sample_pkhashes))
pkhash = get_hash(metrics, key=data_keys)
will be difficult beccause backend db do not store or have access to the data_manager_pkhash
and data_sample_pkhashes
relation with the objective for the full save.
You can see in backend/backend/substrapp/models/objective.py
for instance :)