[BUG] Cannot train yolov3
lucasjinreal opened this issue · 3 comments
Hi, I'm not sure if it's a bug or something I missed on my part, but I have a question about yolov3's preprocess_image.
Here is the error I got:
mask[b, a, gj, gi] = 1
IndexError: index 23 is out of bounds for dimension 3 with size 14
This means the labels (wh) are not scaled to match my resized image. For example, if the image is resized to 544 and the stride is 32, the grid is 544 / 32 = 17 cells per side, so the label index should not exceed 17, but I got an index like 23.
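The arithmetic above can be checked with a minimal sketch. The helper names are hypothetical and not part of the library. The 448 input size is an assumption inferred from the error message (a dimension of size 14 at stride 32), and the 750 px coordinate is an illustrative value chosen to reproduce index 23.

```python
def grid_size(image_size: int, stride: int) -> int:
    """Number of cells per side of the feature map at the given stride."""
    return image_size // stride


def cell_index(coord_px: float, stride: int) -> int:
    """Grid cell that a pixel coordinate falls into."""
    return int(coord_px // stride)


def in_bounds(coord_px: float, image_size: int, stride: int) -> bool:
    """True if the coordinate maps to a valid cell of the grid."""
    return 0 <= cell_index(coord_px, stride) < grid_size(image_size, stride)


if __name__ == "__main__":
    stride = 32
    print(grid_size(544, stride))  # cells per side for a 544 input
    print(grid_size(448, stride))  # cells per side for a 448 input (assumed)
    # Hypothetical box center at x = 750 px, still in un-rescaled coordinates.
    print(cell_index(750, stride), in_bounds(750, 448, stride))
```

A coordinate only yields an index this large if it is still expressed in the coordinate system of a larger image than the one the grid was built for.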
And this is the preprocess_image code:
def preprocess_image(self, batched_inputs, training):
    """
    Normalize, pad and batch the input images.
    """
    images = [x["image"].to(self.device) for x in batched_inputs]
    bs = len(images)
    images = [self.normalizer(x) for x in images]
    images = ImageList.from_tensors(
        images, size_divisibility=0, pad_ref_long=True)
    logger.info('images ori shape: {}'.format(images.tensor.shape))
    logger.info('images ori shape: {}'.format(images.image_sizes))
    # sync image size for all gpus
    comm.synchronize()
    if training and self.iter % self.change_iter == 0:
        if self.iter < self.max_iter - 20000:
            meg = torch.LongTensor(1).to(self.device)
            comm.synchronize()
            if comm.is_main_process():
                size = np.random.choice(self.multi_size)
                meg.fill_(size)
            if comm.get_world_size() > 1:
                comm.synchronize()
                dist.broadcast(meg, 0)
            self.size = meg.item()
            comm.synchronize()
        else:
            self.size = 608
    if training:
        # resize image inputs
        modes = ['bilinear', 'nearest', 'bicubic', 'area']
        mode = modes[random.randrange(4)]
        if mode == 'bilinear' or mode == 'bicubic':
            images.tensor = F.interpolate(
                images.tensor, size=[self.size, self.size], mode=mode, align_corners=False)
        else:
            images.tensor = F.interpolate(
                images.tensor, size=[self.size, self.size], mode=mode)
        if "instances" in batched_inputs[0]:
            gt_instances = [
                x["instances"].to(self.device) for x in batched_inputs
            ]
        elif "targets" in batched_inputs[0]:
            log_first_n(
                logging.WARN,
                "'targets' in the model inputs is now renamed to 'instances'!",
                n=10)
            gt_instances = [
                x["targets"].to(self.device) for x in batched_inputs
            ]
        else:
            gt_instances = None
        targets = [
            torch.cat(
                [instance.gt_classes.float().unsqueeze(-1), instance.gt_boxes.tensor], dim=-1
            )
            for instance in gt_instances
        ]
        labels = torch.zeros((bs, 100, 5))
        for i, target in enumerate(targets):
            labels[i][:target.shape[0]] = target
        labels[:, :, 1:] = labels[:, :, 1:] / 512. * self.size
    else:
        labels = None
    self.iter += 1
    return images, labels
The image is resized twice, but the labels don't seem to change accordingly. Any idea how this error arises? (Maybe your code has some automatic way to handle images and labels together, but I don't know where.)
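To illustrate the concern, here is a minimal sketch of scaling boxes by the actual resize ratio on each axis, compared with the fixed `/ 512. * self.size` rule in the code above. The function `scale_boxes` is hypothetical and not part of the library. It assumes the boxes are `(x1, y1, x2, y2)` in pixel coordinates of the tensor before `F.interpolate`, and that this tensor can be larger than 512. Neither assumption is confirmed by the code shown here.

```python
from typing import List, Sequence, Tuple


def scale_boxes(
    boxes: Sequence[Sequence[float]],
    src_hw: Tuple[int, int],
    dst_hw: Tuple[int, int],
) -> List[List[float]]:
    """Scale (x1, y1, x2, y2) boxes from a src (h, w) image to a dst (h, w) image."""
    src_h, src_w = src_hw
    dst_h, dst_w = dst_hw
    return [
        [
            x1 * dst_w / src_w,  # x coordinates follow the width ratio
            y1 * dst_h / src_h,  # y coordinates follow the height ratio
            x2 * dst_w / src_w,
            y2 * dst_h / src_h,
        ]
        for x1, y1, x2, y2 in boxes
    ]


if __name__ == "__main__":
    stride, size = 32, 448      # assumed training size for this iteration
    x = 750                     # hypothetical x coordinate in an 800 px wide tensor

    fixed_rule = x / 512.0 * size                             # rule from the code above
    per_axis = scale_boxes([[x, 0, x, 0]], (800, 800), (size, size))[0][0]

    # Compare the resulting grid indices against the size // stride cells available.
    print(int(fixed_rule // stride), int(per_axis // stride), size // stride)
```

If the tensor before interpolation is always exactly 512 x 512, both rules agree. The difference only appears when the source size differs from 512.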
Is this yolov3 on the COCO dataset or on your own dataset?
COCO. I found it hard to converge on COCO as well, so I changed to a rectangular input size rather than a forced resize. Also, your code has a bug in build_target which swaps w and h.
Thanks for your report. We will try it in our internal version.