Use in the MedSea: content duplicate check takes too long
imab4bsh commented
Problem: The content duplicate check first identifies profiles suspected of being content duplicates by comparing only the deep portion of each profile (deeper than 800 db). Afterwards, the entire profiles of those suspects are compared.
Since profiles in the MedSea are shallower, this pre-selection does not narrow anything down: all profiles are checked against all profiles, and the duplicate check takes too long.
Therefore, the profile portion used to find the suspicious profiles should be definable by the user.
Problem found by: @Anto79124
To do: Identify where this needs to be changed and add an argument for the pressure threshold to the deep content comparison function.
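
A minimal sketch of the idea, not the actual code of the duplicate check: the names `deep_signature`, `find_suspect_pairs`, and `min_pressure`, as well as the dict-based profile layout, are hypothetical and only illustrate how a user-defined threshold could replace the fixed 800 db. Profiles are grouped by a signature of their portion deeper than `min_pressure`, and only profiles that share a signature become suspect pairs for the full comparison.

```python
from collections import defaultdict
from itertools import combinations


def deep_signature(pressure, values, min_pressure, decimals=3):
    """Hashable signature of the profile portion deeper than min_pressure.

    Hypothetical helper: a profile with no data below min_pressure
    yields an empty tuple.
    """
    return tuple(
        (round(p, decimals), round(v, decimals))
        for p, v in zip(pressure, values)
        if p > min_pressure
    )


def find_suspect_pairs(profiles, min_pressure=800.0):
    """Return index pairs of profiles whose deep portions are identical.

    min_pressure is the user-defined threshold (same unit as the pressure
    values); 800.0 mirrors the fixed value described in this issue.
    """
    groups = defaultdict(list)
    for idx, prof in enumerate(profiles):
        sig = deep_signature(prof["pressure"], prof["temperature"], min_pressure)
        groups[sig].append(idx)

    pairs = []
    for members in groups.values():
        # Every pair inside a group must later be compared over the full profile.
        pairs.extend(combinations(members, 2))
    return sorted(pairs)


# Three shallow example profiles (made-up values); the first two are identical.
shallow_profiles = [
    {"pressure": [10, 100, 300, 500], "temperature": [20.0, 15.0, 14.0, 13.5]},
    {"pressure": [10, 100, 300, 500], "temperature": [20.0, 15.0, 14.0, 13.5]},
    {"pressure": [10, 100, 300, 500], "temperature": [21.0, 16.0, 14.2, 13.7]},
]

pairs_default = find_suspect_pairs(shallow_profiles)                      # fixed 800 threshold
pairs_shallow = find_suspect_pairs(shallow_profiles, min_pressure=200.0)  # user-defined threshold
```

In this sketch, the default threshold leaves all three shallow profiles with an empty signature, so every profile is paired with every other one, which is the behaviour described above. With `min_pressure=200.0`, only the two identical profiles remain as a suspect pair.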