Discussion on different concept results
bakachan19 opened this issue · 2 comments
Hi.
Thank you for this library. It is really helpful.
I am using concept modeling to cluster images and do some analysis on the results.
I modified find_concepts() (which was originally meant to find the top 5 related concepts for a search term) to find the top 5 related concepts for a given image, by passing the path to an image and obtaining its embedding with the embedding model.
However, I noticed that in many cases the top-1 most related cluster is different from the cluster returned by fit_transform(). Sometimes the concept is in second position, but in many cases it is in positions >2. Any idea why this might be happening?
Thank you for your time.
Best wishes.
The find_concepts function is merely a quick search function and does not behave the same way .transform does. .find_concepts applies cosine similarity between the image and concept embeddings to quickly find a match. However, this is not an exact representation of the training process during .fit, which involves clustering and dimensionality reduction.
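To make the difference concrete, here is a small toy sketch in plain Python. It does not use the library itself. The 2-D embeddings, the labels, and the helper names (cosine, mean_vector, rank_concepts) are all made up for illustration, and it assumes that a concept embedding is simply the mean of its members' embeddings, which may not match what the library does internally. The labels stand in for what clustering would return during fitting.

```python
import math

def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)

def mean_vector(vectors):
    """Element-wise mean of a list of vectors."""
    return [sum(column) / len(vectors) for column in zip(*vectors)]

def rank_concepts(query, concept_embeddings):
    """Concept ids sorted by descending cosine similarity to the query."""
    return sorted(
        concept_embeddings,
        key=lambda cid: cosine(query, concept_embeddings[cid]),
        reverse=True,
    )

# Hypothetical image embeddings and the labels a clustering step assigned.
# Image 2 sits at the edge of cluster 0, pointing toward cluster 1.
embeddings = [[1.0, 0.0], [0.9, 0.1], [0.5, 0.5], [0.4, 0.6], [0.3, 0.7]]
labels = [0, 0, 0, 1, 1]

# Assumption: one embedding per concept, the mean of its members.
concept_embeddings = {
    cid: mean_vector([e for e, l in zip(embeddings, labels) if l == cid])
    for cid in sorted(set(labels))
}

query_index = 2
assigned = labels[query_index]
ranking = rank_concepts(embeddings[query_index], concept_embeddings)
position = ranking.index(assigned) + 1  # 1-based rank of the assigned concept

print("assigned concept:", assigned)
print("cosine ranking:", ranking)
print("rank of assigned concept:", position)
```

In this toy data, image 2 was assigned to concept 0, but it comes out closer in angle to the mean of concept 1, so the cosine search ranks its own concept second. Points near the edge of a cluster can behave the same way in real data, where the clustering also runs in a reduced space rather than on the raw embeddings.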
Ohh, I see.
Thank you!