Question about the description
Opened this issue · 3 comments
Sorry if this is a naive question, but I don't understand how Myriad gains the ability to describe IAD results. It seems that you freeze the LLM module, so how can MiniGPT-4 learn to describe industrial anomalies? With only the designed prompt and embeddings, the conclusion that "MiniGPT-4 can make such descriptions" seems impossible.
Please refer to our paper on arXiv (https://arxiv.org/abs/2310.19070). We encode the anomaly maps predicted by vision experts with our proposed Expert Perception modules and feed them into both the LLM and the Q-Former (MiniGPT-4) as instructions.
The paper will be updated to the newest version. We are also working hard to prepare our model, data, and code for reproduction. Sorry for the wait.
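For readers following the thread, here is a minimal sketch of the general idea of turning an expert's anomaly map into extra input tokens, so that a frozen LLM can still receive anomaly information through its input. It is not the Myriad implementation: the function names, the average pooling, the per-score linear projection, and the token ordering are all assumptions made for illustration.

```python
# Illustrative sketch only; NOT the Myriad code. Every name, shape, and the
# pooling/projection scheme below is an assumption for demonstration.
from typing import List

Matrix = List[List[float]]


def pool_anomaly_map(anomaly_map: Matrix, grid: int) -> List[float]:
    """Average-pool an H x W anomaly map into grid * grid patch scores."""
    h, w = len(anomaly_map), len(anomaly_map[0])
    ph, pw = h // grid, w // grid  # patch height and width
    scores = []
    for gy in range(grid):
        for gx in range(grid):
            patch = [
                anomaly_map[y][x]
                for y in range(gy * ph, (gy + 1) * ph)
                for x in range(gx * pw, (gx + 1) * pw)
            ]
            scores.append(sum(patch) / len(patch))
    return scores


def project_to_tokens(
    scores: List[float], weight: List[float], bias: List[float]
) -> Matrix:
    """Map each scalar patch score to a d-dim token: score * weight + bias.

    In a real system `weight` and `bias` would be trainable parameters of a
    small adapter, while the LLM itself stays frozen.
    """
    return [[s * w + b for w, b in zip(weight, bias)] for s in scores]


def build_llm_input(
    expert_tokens: Matrix, image_tokens: Matrix, text_tokens: Matrix
) -> Matrix:
    """Concatenate expert, image, and text tokens into one input sequence."""
    return expert_tokens + image_tokens + text_tokens
```

The point of the sketch is that only the small projection needs training; the language model consumes the resulting tokens like any other input embeddings.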
Thanks for your reply! And I am also looking forward to the new version of your excellent work!
I am still wondering about the details of the vision-language pairs in the IAD datasets mentioned in the paper.
Do you use the same image descriptions as AnomalyGPT, as illustrated below (extracted from the AnomalyGPT paper)?
How do you construct the answers in the training data?
No, we do not use complex constructed answers for IAD instruction.
Simple answers such as "Yes, there is a defect/are anomalies." work pretty well for MiniGPT-4. Additional information might be confusing when training MiniGPT-4. However, we have also tried multi-task joint learning, including counting, detection, and object type classification, for training.
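As a rough illustration of what such simple instruction data could look like, here is a minimal sketch that builds question/answer pairs from image-level labels. The question wording, the negative answer, the `Sample` type, and the file paths are hypothetical; only the short positive answer style comes from the reply above.

```python
# Illustrative sketch only; the prompt text, the negative answer, and the
# Sample/field names are assumptions, not the authors' released data format.
from dataclasses import dataclass
from typing import Dict, List

QUESTION = "Is there any anomaly in the image?"  # hypothetical instruction
POSITIVE_ANSWER = "Yes, there are anomalies."    # style described in the reply
NEGATIVE_ANSWER = "No, there is no anomaly."     # assumed counterpart


@dataclass
class Sample:
    image_path: str
    is_anomalous: bool


def build_pair(sample: Sample) -> Dict[str, str]:
    """Build one instruction/answer pair with a deliberately short answer."""
    answer = POSITIVE_ANSWER if sample.is_anomalous else NEGATIVE_ANSWER
    return {"image": sample.image_path, "question": QUESTION, "answer": answer}


def build_dataset(samples: List[Sample]) -> List[Dict[str, str]]:
    """Convert labeled samples into a list of instruction/answer pairs."""
    return [build_pair(s) for s in samples]


if __name__ == "__main__":
    demo = [Sample("good/000.png", False), Sample("broken/001.png", True)]
    for pair in build_dataset(demo):
        print(pair)
```

The multi-task variants mentioned above (counting, detection, object type classification) would need their own question and answer templates, which are not specified in this thread.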