linyq2117/CLIP-ES

initial cam

Closed this issue · 2 comments

Hi,

First of all, thank you for this great work! I’m currently using the following code to save the initial CAM and the refined CAM:

np.save(os.path.join(args.cam_out_dir, output_filename), {
    "highres_cam": grayscale_cam_highres.astype(np.float32),
    "attn_highres": cam_refined_highres.astype(np.float32),
})

The refined CAM (attn_highres) looks great and provides very accurate results, but the initial CAM (highres_cam) seems quite unusual or strange. Could you please help explain why this might be happening?
[Attached screenshots, including "Screenshot 2024-10-07 5:48:29 PM"]

Thanks for your interest!

The results do look abnormal. I re-ran this demo and the result is normal in my case, so it seems that the part that saves the CAMs is incorrect. You can directly follow our code in lines 209-214:

np.save(os.path.join(args.cam_out_dir, im.replace('jpg', 'npy')),
                {"keys": keys.numpy(),
                "highres": highres_cam_all_scales.cpu().numpy().astype(np.float16),
                "attn_highres": refined_cam_all_scales.cpu().numpy().astype(np.float16),
                })

Note that the shape of grayscale_cam_highres is (h, w), while highres_cam_all_scales is (c, h, w) and stores the CAMs for all c classes in an image. If the image has only one class, you can use grayscale_cam_highres, but you need to reshape it to (1, h, w). Below is the visualization code I used, for your reference. You can check it and compare it with yours.
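
For the single-class case described above, a minimal sketch of the reshape-and-save step might look like the following. It is not taken from the CLIP-ES repository: `pack_single_class_cam` is a hypothetical helper, the random arrays stand in for grayscale_cam_highres and cam_refined_highres, and it assumes both CAMs are 2-D (h, w) arrays. Only the dict layout ("keys", "highres", "attn_highres") follows the snippet quoted above.

```python
import os
import tempfile

import numpy as np


def pack_single_class_cam(initial_cam, refined_cam, class_index):
    """Hypothetical helper: wrap two (h, w) CAMs in the (c, h, w) dict layout with c = 1."""
    initial_cam = np.asarray(initial_cam)
    refined_cam = np.asarray(refined_cam)
    if initial_cam.ndim != 2 or refined_cam.ndim != 2:
        raise ValueError("expected 2-D (h, w) CAMs")
    return {
        "keys": np.array([class_index]),
        # np.newaxis adds the leading class axis: (h, w) -> (1, h, w)
        "highres": initial_cam[np.newaxis].astype(np.float16),
        "attn_highres": refined_cam[np.newaxis].astype(np.float16),
    }


# Stand-in data (assumption): replace with the real (h, w) CAMs from the script.
h, w = 4, 6
cam_dict = pack_single_class_cam(np.random.rand(h, w), np.random.rand(h, w), class_index=7)

out_path = os.path.join(tempfile.mkdtemp(), "example.npy")
np.save(out_path, cam_dict)

# np.save pickles the dict into a 0-d object array,
# so loading needs allow_pickle=True followed by .item()
loaded = np.load(out_path, allow_pickle=True).item()
print(loaded["highres"].shape, loaded["attn_highres"].shape)  # (1, 4, 6) (1, 4, 6)
```

With this layout, indexing `loaded["highres"][i]` in a visualization loop yields one (h, w) map per class rather than a single row of a 2-D CAM.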


import cv2
import numpy as np
from pytorch_grad_cam.utils.image import show_cam_on_image
import os
from PIL import Image
from tqdm import tqdm

if __name__ == "__main__":
    class_names = ['aeroplane', 'bicycle', 'bird', 'boat', 'bottle',
                       'bus', 'car', 'cat', 'chair', 'cow',
                       'diningtable', 'dog', 'horse', 'motorbike', 'person',
                       'pottedplant', 'sheep', 'sofa', 'train', 'tvmonitor',
                       ]
    cam_dict_root = '/root/code/CLIP-ES/output/voc12/demo/'#2007_000876.npy
    img_root = '/root/datasets/VOC2012/JPEGImages/'
    cam_list = sorted(os.listdir(cam_dict_root))

    save_path = '/root/code/CLIP-ES/vis/demo'
    # Create the output directory if it does not exist yet
    os.makedirs(save_path, exist_ok=True)

    for idx in tqdm(cam_list):
        cam_path = os.path.join(cam_dict_root, idx)
        cam_dict = np.load(cam_path, allow_pickle=True).item()
        keys = cam_dict['keys']

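        img_path = os.path.join(img_root, idx.replace('.npy', '.jpg'))
        # PIL loads RGB; reverse the channel order to BGR, which cv2.imwrite expects
        image1 = np.array(Image.open(img_path).convert('RGB'))[:, :, ::-1]
        # Rescale to [0, 1], as show_cam_on_image requires a float image in that range
        image1 = (image1 - image1.min()) / (image1.max() - image1.min())
        bgr_img = image1.astype(np.float32)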

        cams = cam_dict['highres']
        for i, k in enumerate(keys):
            category = class_names[k]
            cam = cams[i]
            visualization = show_cam_on_image(bgr_img, cam, use_rgb=False)
            cv2.imwrite(os.path.join(save_path,idx.replace('.npy', '_highres_'+category+'.jpg')), visualization)

        attn_cams = cam_dict['attn_highres']
        for i, k in enumerate(keys):
            category = class_names[k]
            cam = attn_cams[i]
            visualization = show_cam_on_image(bgr_img, cam, use_rgb=False)
            cv2.imwrite(os.path.join(save_path, idx.replace('.npy', '_attn_highres_' + category + '.jpg')), visualization)
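
Since the thread traces the odd-looking output to the shape of the saved array, a quick check on the loaded CAMs before visualizing can help narrow things down. The sketch below is illustrative only: `describe_cam_problems` is a hypothetical helper, not part of CLIP-ES or pytorch_grad_cam. It assumes the CAMs should be a (c, h, w) array matching the image size, with values in [0, 1], which is the range show_cam_on_image scales by 255 before applying the colormap.

```python
import numpy as np


def describe_cam_problems(cams, image_hw):
    """Hypothetical helper: list reasons a saved CAM array may render oddly."""
    cams = np.asarray(cams, dtype=np.float32)
    problems = []
    if cams.ndim != 3:
        problems.append(f"expected (c, h, w), got shape {cams.shape}")
    elif cams.shape[1:] != tuple(image_hw):
        problems.append(f"CAM size {cams.shape[1:]} differs from image size {tuple(image_hw)}")
    if not np.isfinite(cams).all():
        problems.append("contains NaN or inf")
    elif cams.size and (cams.min() < 0.0 or cams.max() > 1.0):
        problems.append(f"values span [{cams.min():.3f}, {cams.max():.3f}], outside [0, 1]")
    return problems


# Stand-in arrays (assumption): in practice pass cam_dict['highres'] and the image's (h, w).
good = np.random.rand(2, 4, 6)
bad = np.random.rand(4, 6) * 3.0 + 1.5  # missing class axis and values above 1
print(describe_cam_problems(good, (4, 6)))
print(describe_cam_problems(bad, (4, 6)))
```

An empty list means none of these particular issues were found; it does not guarantee the CAM itself is meaningful.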

Thanks