comic/evalutils

Error message for UniqueImagesValidator does not tell the user which images are duplicates

Louisvh opened this issue · 1 comment

  • evalutils version: 0.4.2
  • Python version: 3.11
  • Operating System: Linux

Description

The error that UniqueImagesValidator produces doesn't tell the user which images are duplicates. This is somewhat problematic when end-users/clinicians run into the error, as finding the duplicates can turn into manual labor for people who don't know how to script this search.

jmsmkn commented

The error message is generated here:

try:
    hashes = df["hash"]
except KeyError:
    raise ValidationError("Column `hash` not found in DataFrame.")
if len(set(hashes)) != len(hashes):
    raise ValidationError(
        "The images are not unique, please submit a unique image for "
        "each case."
    )

You can override this exception in your own code if you like, or please feel free to submit a PR with an enhancement.
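
As a starting point for such an override or PR, here is a minimal sketch of a check that names the duplicated images in the error message. It is not the evalutils implementation. `validate_unique_images` is a hypothetical standalone function, `ValidationError` is a local stand-in for the library's exception so that the example runs on its own, and the `path` column is an assumption about what identifies each image in the DataFrame. If that column is missing, the sketch falls back to row positions.

```python
from collections import defaultdict


class ValidationError(ValueError):
    """Stand-in for evalutils' ValidationError so the sketch runs on its own."""


def validate_unique_images(df, name_column="path"):
    """Raise ValidationError naming every group of rows that share a hash.

    `df` can be a pandas DataFrame or any mapping of column name to values.
    `name_column` is an assumed column that identifies each image.
    """
    try:
        hashes = list(df["hash"])
    except KeyError:
        raise ValidationError("Column `hash` not found in DataFrame.")

    # Label each row by its name column, or by its position if that is absent.
    if name_column in df:
        names = [str(name) for name in df[name_column]]
    else:
        names = [f"row {i}" for i in range(len(hashes))]

    # Collect the labels of all rows that share the same hash.
    groups = defaultdict(list)
    for image_hash, name in zip(hashes, names):
        groups[image_hash].append(name)

    duplicates = [group for group in groups.values() if len(group) > 1]
    if duplicates:
        # Groups are separated by "; ", members of one group by ", ".
        details = "; ".join(", ".join(group) for group in duplicates)
        raise ValidationError(
            "The images are not unique, please submit a unique image for "
            f"each case. Duplicated images: {details}"
        )
```

Grouping by hash rather than only comparing lengths keeps the original behaviour for unique inputs, while giving an end user the names of the files they need to look at.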