MaartenGr/Concept

Question about the Function transform

xinli2008 opened this issue · 7 comments

Thank you for your excellent work :) I have a question about the transform function.
The docstring says that, given the images and image_embeddings, it returns "Predictions: Concept predictions for each image".
But when I read the code of transform, the output does not look like the concept prediction for each image.
Can you explain it? Thank you very much!

The .transform function returns the predictions for each image. Take the last few lines of the .transform method as shown below:

Concept/concept/_model.py

Lines 193 to 195 in d270607

umap_embeddings = self.umap_model.transform(image_embeddings)
predictions, _ = hdbscan.approximate_predict(self.hdbscan_model, umap_embeddings)
return predictions

With that, we reduce the embeddings to a lower-dimensional representation and feed the result to the fitted HDBSCAN model, which assigns each image to a cluster. The resulting cluster labels, predictions, are the concept prediction for each image.
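To make the shape of that output concrete, here is a minimal sketch of how the returned labels could be lined up with the input images. It assumes predictions holds one integer label per image, in the same order as the images. The helper name group_by_concept and the example file names are hypothetical and not part of the library.

```python
from collections import defaultdict


def group_by_concept(image_paths, predictions):
    """Map each predicted concept label to the images assigned to it."""
    if len(image_paths) != len(predictions):
        raise ValueError("expected exactly one prediction per image")
    groups = defaultdict(list)
    for path, label in zip(image_paths, predictions):
        # int() also accepts NumPy integer labels, so an array of labels works too.
        groups[int(label)].append(path)
    return dict(groups)


# Hypothetical example data. HDBSCAN uses the label -1 for points it
# treats as noise, so such images belong to no concept.
image_paths = ["a.jpg", "b.jpg", "c.jpg", "d.jpg"]
predictions = [0, 2, 0, -1]
groups = group_by_concept(image_paths, predictions)
```

Each key in groups is a concept label and each value is the list of images predicted to belong to that concept.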

Sorry to bother you again. I tried to use the following code to find the best concept for each image:
concept_model = ConceptModel()
new_concepts = concept_model.transform(image_list)

The error details are:
Traceback (most recent call last):
  File "_model.py", line 629, in <module>
    new_concepts = concept_model.transform(image_list)
  File "_model.py", line 197, in transform
    umap_embeddings = self.umap_model.transform(image_embeddings)
  File "/home/lixin/enter/envs/PR-VIST/lib/python3.7/site-packages/umap/umap_.py", line 2802, in transform
    if self._raw_data.shape[0] == 1:
AttributeError: 'UMAP' object has no attribute '_raw_data'
Can you give me some advice on how to fix it? Thank you @MaartenGr

Definitely not a bother! Although I am not familiar with the error, I would advise making sure you have the newest version of umap-learn installed. If that does not work out, creating a completely fresh environment and re-installing in theory should resolve your issue.
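As a starting point for that check, the sketch below reports which version of umap-learn is installed in the active environment. It uses importlib.metadata, which requires Python 3.8 or newer; on Python 3.7, as in the traceback above, running pip show umap-learn gives the same information. Separately, the snippet above calls .transform on a newly created ConceptModel without fitting it first. Whether that is the cause here is an assumption that this thread does not confirm, but a UMAP model only has data to project against once it has been fitted. The helpers installed_version and looks_fitted are hypothetical names used for illustration.

```python
from importlib import metadata  # available from Python 3.8


def installed_version(package):
    """Return the installed version string of a package, or None if it is missing."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


def looks_fitted(reducer):
    """Heuristic check: umap-learn sets `embedding_` on the model during fit."""
    return hasattr(reducer, "embedding_")


# Prints None for the version if umap-learn is not installed in this environment.
print("umap-learn:", installed_version("umap-learn"))
```

If looks_fitted returns False for the model's UMAP instance, fitting the model on your images before calling .transform would be the first thing to try.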

Thank you for your advice, I will follow those instructions. I have another question, and I would appreciate it if you could share any ideas you have.
I have a sequence of images (perhaps 10 images). If I want to find their topic or theme (wedding, vacation, etc.), do you have any ideas?

To find the topics of a set of images, I would advise going through the README and simply replacing those images with the images that you have. Do note that you would want at least a few hundred images to get a good clustering going.
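As a rough sketch of the preparation step, the helper below gathers image paths from a folder and checks whether there are enough of them before fitting. The folder layout, the set of file extensions, and the threshold of 200 (standing in for "a few hundred") are all assumptions chosen for illustration. The resulting list of paths would then be passed to the model as the README describes.

```python
from pathlib import Path

# Assumed set of image extensions; extend it to match your data.
IMAGE_SUFFIXES = {".jpg", ".jpeg", ".png"}
# Hypothetical threshold standing in for "a few hundred" images.
MIN_IMAGES = 200


def collect_image_paths(folder):
    """Return a sorted list of image file paths found under `folder` (recursively)."""
    return sorted(
        str(p)
        for p in Path(folder).rglob("*")
        if p.is_file() and p.suffix.lower() in IMAGE_SUFFIXES
    )


def enough_for_clustering(paths, minimum=MIN_IMAGES):
    """Return True if there are at least `minimum` images to cluster."""
    return len(paths) >= minimum


# Example usage (the folder name is hypothetical):
# paths = collect_image_paths("my_photos")
# if not enough_for_clustering(paths):
#     print(f"Only {len(paths)} images found; clusters may be unreliable.")
```

With only around 10 images, a check like this would flag that there is too little data for the clustering step to find meaningful groups.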

Thank you for your advice; I have some ideas in mind. I have another question: if I predict the topic of a set of images, how can I evaluate the results? I could not find any image-to-topic dataset. I look forward to your reply.

To my knowledge, there is currently no dataset that contains both images and topics. Topic modeling is typically evaluated through coherence, which cannot easily be generalized to images. Since concept modeling is rather new, I do not think there is a set of standard procedures yet for evaluating concepts.