ARPA-SIMC/arkimet

strange warning with arki-check

Closed this issue · 2 comments

Hi,

I got this message with arki-check, but I'm sure that I didn't delete any data:

cosmo_5M_itr:2023/05-22.grib: possibly deleted data found not tracked by index: 423857604b would be freed by a repack

Could this be caused by duplicated data?

It might be duplicated data.
Duplicated data is not freed until a repack occurs (@spanezz correct me if I'm wrong).

The documentation is ambiguous since it states "replace: when yes, importing duplicate data will replace the existing version." (https://arpa-simc.github.io/arkimet/datasets.html)
What actually happens is that the most recent duplicate import replaces the existing version in the index while the actual data (in this case, grib messages) are kept until a repack is launched.

If that's the case I see two possible improvements:

  • the warning message could be something like: "possibly deleted or duplicated data found..."
  • the replace option documentation could be more detailed

@brancomat is correct: that message means that there is data in the segment that is not tracked by the index, which normally happens when data is deleted or replaced (a replace is a delete of the old version and an append of the new one).

I'll now update the message and the documentation as you suggested.
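For readers landing on this issue later, here is a minimal sketch of the behaviour described above: a replace repoints the index at newly appended data, and the old bytes stay in the segment until a repack. This is a toy model, not arkimet's code; `ToySegment`, `import_item`, `untracked_bytes`, and the keys used below are hypothetical names invented for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class ToySegment:
    """Toy model (not arkimet): an append-only data file plus an index
    that maps a key to the (offset, size) of its current version."""
    data: bytearray = field(default_factory=bytearray)
    index: dict = field(default_factory=dict)

    def import_item(self, key: str, payload: bytes) -> None:
        # Importing a duplicate key acts as a replace: the new payload is
        # appended and the index entry is repointed. The old bytes remain
        # in the file, no longer tracked by the index.
        self.index[key] = (len(self.data), len(payload))
        self.data += payload

    def untracked_bytes(self) -> int:
        # Bytes present in the file that no index entry points to.
        tracked = sum(size for _, size in self.index.values())
        return len(self.data) - tracked

    def repack(self) -> int:
        """Rewrite the file keeping only indexed data; return bytes freed."""
        freed = self.untracked_bytes()
        new_data = bytearray()
        new_index = {}
        # Copy the surviving items in their original file order.
        for key, (offset, size) in sorted(self.index.items(),
                                          key=lambda kv: kv[1][0]):
            new_index[key] = (len(new_data), size)
            new_data += self.data[offset:offset + size]
        self.data, self.index = new_data, new_index
        return freed


seg = ToySegment()
seg.import_item("t2m@00", b"A" * 100)
seg.import_item("t2m@00", b"B" * 100)  # duplicate import, replaces the first
seg.import_item("tp@00", b"C" * 50)
print(seg.untracked_bytes())  # 100
print(seg.repack())           # 100
print(seg.untracked_bytes())  # 0
```

In this model nothing was explicitly deleted, yet 100 bytes are untracked after the duplicate import, which is the situation the warning reports.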