Label names mixed up in results and results_unique
Hi @erdogant,
first, thank you very much for the work you put into this project and for making it public! When experimenting with your code base, I came across an error when accessing the 'labels' of cl.results and cl.results_unique. It seems to me that the labels somehow get mixed up between the two. Here is an example:
'img_name_1' is assigned the label '2' in cl.results_unique['labels']. However, when I iterate over cl.results['labels'] and search for the file with the same name 'img_name_1', this image (sometimes) belongs to a different label, let's say '5'.
My goal is to extract random images and the most centered (unique) image per cluster label, which is why I would like to match both sets of labels. Do you perhaps have a different idea of how I could do this?
Thank you very much!
Best,
Maximilian
Thank you for your issue. I could not reproduce the issue. Can you maybe demonstrate this with a small example?
from clustimage import Clustimage
cl = Clustimage()
# load example with flowers
pathnames = cl.import_example(data='flowers')
# Cluster flowers
cl.fit_transform(pathnames)
# Make plot
cl.clusteval.plot()
cl.clusteval.scatter(density=True, s=100, params_scatterd={'edgecolor': 'black'})
cl.results['labels']
cl.results_unique['labels']
Thanks for the reply. I reran your example, and everything looks fine. It is probably an error in my code, so I will have to look into it. I will close this for now, and in case I find something I will reopen the issue. Thanks for your help!
Update: I found out that the keys "filenames" and "pathnames" are apparently not interchangeable. If I use "pathnames" for both results and results_unique, the labels match. If I mix "filenames" and "pathnames", the labels do not match.
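A minimal sketch of the matching by "pathnames", for anyone with the same goal (one unique image plus a few random images per cluster). It uses plain dicts as stand-ins for cl.results and cl.results_unique and assumes that 'labels' and 'pathnames' in each are index-aligned sequences. The helper names group_by_label and sample_per_cluster are hypothetical and not part of clustimage.

```python
import random
from collections import defaultdict


def group_by_label(labels, pathnames):
    """Map each cluster label to the list of full pathnames assigned to it."""
    groups = defaultdict(list)
    for label, path in zip(labels, pathnames):
        groups[int(label)].append(str(path))
    return dict(groups)


def sample_per_cluster(results, results_unique, n_random=3, seed=0):
    """Return, per cluster label, the unique image and up to n_random other members.

    Matching is done on the full pathname, never on the bare filename.
    """
    rng = random.Random(seed)
    members = group_by_label(results["labels"], results["pathnames"])
    picked = {}
    for label, unique_path in zip(results_unique["labels"], results_unique["pathnames"]):
        label, unique_path = int(label), str(unique_path)
        # Exclude the unique image itself from the random candidates.
        candidates = [p for p in members.get(label, []) if p != unique_path]
        picked[label] = {
            "unique": unique_path,
            "random": rng.sample(candidates, min(n_random, len(candidates))),
        }
    return picked


# Stand-in data (hypothetical); with clustimage one would pass cl.results and cl.results_unique.
results = {
    "labels": [0, 0, 1, 1, 1],
    "pathnames": ["/a/img_1.png", "/a/img_2.png", "/b/img_1.png", "/b/img_3.png", "/b/img_4.png"],
}
results_unique = {"labels": [0, 1], "pathnames": ["/a/img_2.png", "/b/img_1.png"]}

picked = sample_per_cluster(results, results_unique, n_random=2)
```

Keying everything on the full path means two files with the same basename in different folders are never confused with each other.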
Again, I could not reproduce this error. Is there any way you can show the error using one of these four data sets?
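import os
import numpy as np
from clustimage import Clustimage
cl = Clustimage()
# Load one of the four example data sets (each assignment overwrites X)
X = cl.import_example(data='flowers')
X = cl.import_example(data='scenes')
X = cl.import_example(data='mnist')
X = cl.import_example(data='faces')
cl.fit_transform(X)
# Check that every filename equals the basename of its pathname
np.all(np.array(list(map(os.path.basename, cl.results['pathnames'])))==cl.results['filenames'])
# True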
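One possible explanation, which this thread does not confirm, is that the user's own data contains the same filename in more than one directory, so a lookup by "filenames" can hit the wrong image even though the check above passes. A small sketch for testing that assumption on any list of paths follows. The helper name duplicated_basenames and the example paths are hypothetical.

```python
import os
from collections import Counter


def duplicated_basenames(pathnames):
    """Return the sorted basenames that occur more than once in pathnames."""
    counts = Counter(os.path.basename(str(p)) for p in pathnames)
    return sorted(name for name, n in counts.items() if n > 1)


# Hypothetical paths; with clustimage one would pass cl.results['pathnames'].
example_paths = [
    "/data/set_a/img_1.png",
    "/data/set_b/img_1.png",
    "/data/set_b/img_2.png",
]
duplicates = duplicated_basenames(example_paths)
print(duplicates)
```

An empty list would rule this cause out. Any name it returns is ambiguous as a lookup key and should be matched by its full pathname instead.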