Could not reproduce the accuracy reported in the paper
Closed this issue · 5 comments
I am using the pre-trained weights to extract embeddings and then computing the distance between them to re-identify images. But I am not getting the results I expected and that are reported in the paper. Please let me know whether I need to train on my own dataset first. Also, please review the code I am using to extract the embeddings.
```python
import os
import sys
import torch
import random
import numpy as np
import csv
import matplotlib.pyplot as plt
from PIL import Image
from torchvision import transforms
from torch.backends import cudnn
from reid.utils.logging import Logger
from reid.models.msinet import msinet_x1_0
from reid.utils.serialization import copy_state_dict


def count_parameters(model):
    # Parameter count in millions, excluding the classifier head
    return np.sum(np.fromiter((np.prod(v.size()) for name, v in model.named_parameters() if 'classifier' not in name), dtype=np.float32)) / 1e6


def preprocess_image(image_path, height, width):
    transform = transforms.Compose([
        transforms.Resize((height, width)),
        transforms.ToTensor(),
        transforms.Normalize(mean=[0.485, 0.456, 0.406], std=[0.229, 0.224, 0.225]),
    ])
    image = Image.open(image_path).convert('RGB')
    image = transform(image)
    image = image.unsqueeze(0)  # Add batch dimension
    return image


def extract_embedding(model, image_tensor):
    model.eval()
    with torch.no_grad():
        embedding = model(image_tensor.cuda())
    return embedding.cpu().numpy()


def euclidean_distance(embedding1, embedding2):
    return np.linalg.norm(embedding1 - embedding2)


class Args:
    def __init__(self):
        # data
        self.source_dataset = 'market1501'
        self.target_dataset = 'none'
        self.batch_size = 64
        self.test_batch_size = 128
        self.workers = 4
        self.height = 256
        self.width = 256
        self.num_instance = 4
        # model
        self.arch = 'resnet50'
        self.pretrained = False
        self.reset_params = False
        self.genotypes = 'msmt'
        # loss
        self.margin = 0.3
        self.sam_mode = 'none'
        self.sam_ratio = 2.0
        # optimizer
        self.optim = 'sgd'
        self.lr = 0.065
        self.weight_decay = 5e-4
        self.momentum = 0.9
        self.milestones = [150, 225, 300]
        self.warmup_step = 10
        # training configs
        self.resume = ''
        self.evaluate = False
        self.epochs = 350
        self.seed = 0
        self.print_freq = 100
        self.eval_interval = 40
        # misc
        self.data_dir = './data'
        self.logs_dir = './logs'
        self.pretrain_dir = './pretrained'


def main():
    args = Args()
    seed = args.seed
    random.seed(seed)
    np.random.seed(seed)
    torch.manual_seed(seed)
    cudnn.deterministic = True
    cudnn.benchmark = True
    sys.stdout = Logger(os.path.join(args.logs_dir, 'log.txt'))
    # vars() prints the settings rather than the object's default repr
    print('Running with:\n{}'.format(vars(args)))
    num_classes = 751  # Set number of classes according to your dataset
    model = msinet_x1_0(args, num_classes)
    print('Model Params: {}'.format(count_parameters(model)))
    model = model.cuda()
    pretrained_weights = os.path.join(args.pretrain_dir, 'msinet_msmt.pth.tar')
    if os.path.isfile(pretrained_weights):
        checkpoint = torch.load(pretrained_weights)
        copy_state_dict(checkpoint['state_dict'], model)
    else:
        print(f"No pretrained weights found at {pretrained_weights}")
    # Path of the input (query) image
    input_image_path = "./bb/0031_c1s1_002576_04.jpg"
    # Load input image and extract its embedding
    input_image_tensor = preprocess_image(input_image_path, args.height, args.width)
    input_embedding = extract_embedding(model, input_image_tensor)
    # Directory containing images
    image_directory = './bb'  # Change this to your image directory
    # Load other images and extract embeddings
    image_paths = [os.path.join(image_directory, f) for f in os.listdir(image_directory) if f.endswith('.jpg')]
    embeddings = []
    for image_path in image_paths:
        image_tensor = preprocess_image(image_path, args.height, args.width)
        embedding = extract_embedding(model, image_tensor)
        embeddings.append((embedding, image_path))
    # Calculate Euclidean distances between input embedding and other embeddings
    distances = [(i, euclidean_distance(input_embedding, embedding[0])) for i, embedding in enumerate(embeddings)]
    # Sort distances
    distances.sort(key=lambda x: x[1])
    # Display the input image followed by the 19 closest images
    fig, axes = plt.subplots(5, 4, figsize=(15, 15))
    for i in range(5):
        for j in range(4):
            if i == 0 and j == 0:
                # Display input image
                img = Image.open(input_image_path).convert('RGB')
                axes[i, j].imshow(img)
                axes[i, j].set_title("Input Image")
            else:
                # Display other images
                idx = i * 4 + j - 1
                if idx < len(distances):
                    img = Image.open(image_paths[distances[idx][0]]).convert('RGB')
                    axes[i, j].imshow(img)
                    axes[i, j].set_title(f"Rank {idx + 1}\nDistance: {distances[idx][1]:.2f}")
            axes[i, j].axis('off')
    plt.tight_layout()
    plt.show()


if __name__ == '__main__':
    main()
```
Thanks for the interest. The provided weights are pre-trained on ImageNet. So yes, you do need to fine-tune the model on your own dataset.
For customized datasets, you can create a new Python file similar to the ones that already exist in the `reid/data` folder. The training script in this repo can then be used directly.
Let me know if you run into more problems with this.
Thank you so much for your prompt response. I will certainly train it on a custom dataset. Could you share weights that you have trained on the VehicleID dataset? I want to test on images of vehicles and bikers.
I'm sorry, but currently I don't have any models trained on Re-ID datasets.
If you want to use the model for vehicles and bikers re-identification, it is better to construct a dataset more similar to the actual scenarios. Although MSINet has improved the generalization by a large margin, the direct cross-domain performance is still relatively poor.
Another work of mine on continual Re-ID improves generalization with a color distribution shuffle operation, which might also be useful for you. Please refer to https://github.com/vimar-gu/ColorPromptReID/blob/57ed2ac17c5239542a426818051cb588defa4b42/reid/trainers.py#L41
Thank you so much! I have retrained the model on my own dataset, which has 851 classes. The pictures are of bikers, and I am currently using MSINet. These are the results:
=> Computing DistMat with euclidean distance
Validation Results - Epoch[349]
mAP: 82.4%
CMC curve, Rank-1 :75.0%
CMC curve, Rank-5 :100.0%
CMC curve, Rank-10 :100.0%
I just want to run inference to match embeddings. Could you let me know how, or share a simple script that produces the embeddings and the Euclidean distance between two embeddings? I am currently trying, but I get several errors, such as a size mismatch, which I think is caused by the class mismatch.
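If the size mismatch does come from the classifier layer (a checkpoint trained with 851 identities loaded into a model built with a different class count), one common workaround is to load only the weights whose names and shapes match. The following is a minimal sketch of that idea, not code from the repository: `filter_matching_weights` and the layer names are hypothetical, and NumPy arrays stand in for tensors so the snippet runs on its own. Whether the repo's `copy_state_dict` already does something similar is not confirmed here.

```python
import numpy as np


def filter_matching_weights(checkpoint_state, model_state):
    """Keep only checkpoint entries whose name and shape match the model."""
    kept, skipped = {}, []
    for name, weight in checkpoint_state.items():
        if name in model_state and tuple(weight.shape) == tuple(model_state[name].shape):
            kept[name] = weight
        else:
            skipped.append(name)
    return kept, skipped


# NumPy arrays stand in for tensors here; the layer names are hypothetical.
checkpoint_state = {
    "backbone.weight": np.zeros((512, 64)),
    "classifier.weight": np.zeros((851, 512)),  # trained with 851 identities
}
model_state = {
    "backbone.weight": np.zeros((512, 64)),
    "classifier.weight": np.zeros((751, 512)),  # model built with 751 classes
}

kept, skipped = filter_matching_weights(checkpoint_state, model_state)
print("loaded:", sorted(kept))
print("skipped:", skipped)
```

With PyTorch, the same function could be given `checkpoint['state_dict']` and `model.state_dict()`, followed by `model.load_state_dict(kept, strict=False)`. The simpler alternative is to build the model with the class count used in training; the script earlier in this thread sets `num_classes = 751`.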
You can refer to the code in `reid/utils/metrics.py` (lines 131 to 136 in 2a8845b).
Here the distance between groups of features is calculated. The distance calculation is not related to classes; the features should be in the shape of `[sample_number]*[feature_len]`. The matrix calculation is indeed kind of tricky. Try figuring it out by using different operations :)
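For readers who want a starting point, the pairwise computation can be sketched with the expansion ||q - g||^2 = ||q||^2 + ||g||^2 - 2 q·g. This is an illustrative NumPy version under that assumption, not the code from `reid/utils/metrics.py`; the function name `pairwise_euclidean` and the toy feature values are hypothetical.

```python
import numpy as np


def pairwise_euclidean(query, gallery):
    """Return an [m, n] matrix of Euclidean distances.

    query:   array of shape [m, feature_len]
    gallery: array of shape [n, feature_len]
    """
    query = np.asarray(query, dtype=np.float64)
    gallery = np.asarray(gallery, dtype=np.float64)
    # ||q - g||^2 = ||q||^2 + ||g||^2 - 2 * q.g, computed for all pairs at once
    q_sq = np.sum(query ** 2, axis=1, keepdims=True)      # [m, 1]
    g_sq = np.sum(gallery ** 2, axis=1, keepdims=True).T  # [1, n]
    sq_dist = q_sq + g_sq - 2.0 * query @ gallery.T
    # Rounding can leave tiny negative values; clamp before the square root
    return np.sqrt(np.maximum(sq_dist, 0.0))


# Toy features standing in for model embeddings (hypothetical values)
query_feats = np.array([[0.0, 0.0], [1.0, 1.0]])
gallery_feats = np.array([[3.0, 4.0], [1.0, 1.0], [0.0, 1.0]])

dist_mat = pairwise_euclidean(query_feats, gallery_feats)  # shape [2, 3]
ranking = np.argsort(dist_mat, axis=1)  # gallery indices, nearest first
```

Each row of `dist_mat` corresponds to one query, and each row of `ranking` lists gallery indices from nearest to farthest. Only the feature length has to agree between the two inputs; the number of classes plays no role.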