Low GPU utilization

Why is GPU utilization so low when I train? My CUDA environment is set up correctly.

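I/O is currently inefficient, and we are working on that. The model is also fairly small. Together, these keep GPU utilization low.
Setting OMP_NUM_THREADS=2 for training can speed it up, i.e. OMP_NUM_THREADS=2 python train.py. Values greater than 2 also work.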
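
For anyone launching training from a Python script rather than a shell, here is a minimal sketch of passing the same variable to a child process. The helper name env_with_omp_threads is made up for illustration, and the child here only prints the variable instead of running train.py, so the snippet works anywhere.

```python
import os
import subprocess
import sys


def env_with_omp_threads(num_threads):
    """Return a copy of the current environment with OMP_NUM_THREADS set.

    Hypothetical helper for illustration; it is not part of the repository.
    """
    env = dict(os.environ)
    env["OMP_NUM_THREADS"] = str(num_threads)
    return env


# Stand-in for the training script: a child interpreter that reports the
# value it sees. For real training the command would be
# [sys.executable, "train.py"].
child_code = "import os; print(os.environ['OMP_NUM_THREADS'])"
result = subprocess.run(
    [sys.executable, "-c", child_code],
    env=env_with_omp_threads(2),
    capture_output=True,
    text=True,
    check=True,
)
seen_by_child = result.stdout.strip()
print(seen_by_child)
```

The variable has to be in the environment before the training process starts, which is why it is passed through env here rather than set afterwards.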

OK, got it. Thanks.

I have one more question. In the multibox_loss.py you provided, when eiou_loss is computed, it looks like loc_t is never assigned the target values. Or am I misunderstanding something?

libfacedetection.train/src/multibox_loss.py

Line 89 in e72b34b

    
           loss_bbox_eiou = eiou_loss(loc_p[:, 0:4], loc_t[:, 0:4], variance=self.variance, smooth_point=self.smooth_point, reduction='sum')

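In this line, loc_p is the network's prediction and loc_t is the ground-truth label. We use eiou to measure the distance between the predicted box and the ground-truth box.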
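
The formula inside eiou_loss is not shown in this thread. As a rough illustration of measuring the "distance" between two boxes by their overlap, here is a sketch of plain IoU, which is an assumption standing in for the repository's eiou. The function box_iou and the sample boxes are hypothetical. The real eiou_loss also takes encoded offsets and a variance argument, which this sketch ignores.

```python
def box_iou(box_a, box_b):
    """Plain intersection-over-union of two boxes given as (x1, y1, x2, y2).

    Illustrative only: this is ordinary IoU, not the repository's eiou_loss.
    """
    # Corners of the intersection rectangle.
    ix1 = max(box_a[0], box_b[0])
    iy1 = max(box_a[1], box_b[1])
    ix2 = min(box_a[2], box_b[2])
    iy2 = min(box_a[3], box_b[3])

    # Width and height are clamped at zero when the boxes do not overlap.
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)

    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


pred = (0.0, 0.0, 2.0, 2.0)   # made-up predicted box
truth = (1.0, 1.0, 3.0, 3.0)  # made-up ground-truth box

iou = box_iou(pred, truth)
iou_distance = 1.0 - iou      # smaller means the boxes agree more
print(iou, iou_distance)
```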

Yes, right. My understanding is that loc_t should equal truths here, but I don't see any assignment in the code

libfacedetection.train/src/multibox_loss.py

Line 71 in e72b34b

truths = targets[idx][:, 0:14].data

In the code, loc_t is only assigned here

libfacedetection.train/src/multibox_loss.py

Line 67 in e72b34b

loc_t = torch.Tensor(num, num_priors, 14)

Am I misunderstanding something? Thanks for your reply.

libfacedetection.train/src/multibox_loss.py

Lines 66 to 75 in e72b34b

    
    # match priors (default boxes) and ground truth boxes
    loc_t = torch.Tensor(num, num_priors, 14)
    conf_t = torch.LongTensor(num, num_priors)
    iou_t = torch.Tensor(num, num_priors)
    for idx in range(num):
        truths = targets[idx][:, 0:14].data
        labels = targets[idx][:, -1].data
        defaults = priors.data
        iou_t[idx] = match(self.threshold, truths, defaults, self.variance, labels, loc_t, conf_t, idx)
    iou_t = iou_t.view(num, num_priors, 1)

The match function on line 74 matches anchors to ground-truth boxes, and the matched ground truth is written into loc_t (line 148 of utils.py):

libfacedetection.train/src/utils.py

Lines 98 to 151 in e72b34b

    
    def match(threshold, truths, priors, variances, labels, loc_t, conf_t, idx):
        """Match each prior box with the ground truth box of the highest jaccard
        overlap, encode the bounding boxes, then return the matched indices
        corresponding to both confidence and location preds.
        Args:
            threshold: (float) The overlap threshold used when mathing boxes.
            truths: (tensor) Ground truth boxes, Shape: [num_obj, num_priors].
            priors: (tensor) Prior boxes from priorbox layers, Shape: [n_priors,4].
            variances: (tensor) Variances corresponding to each prior coord,
                Shape: [num_priors, 4].
            labels: (tensor) All the class labels for the image, Shape: [num_obj].
            loc_t: (tensor) Tensor to be filled w/ endcoded location targets.
            conf_t: (tensor) Tensor to be filled w/ matched indices for conf preds.
            idx: (int) current batch index
        Return:
            The matched indices corresponding to 1)location and 2)confidence preds.
        """
        # jaccard index
        overlaps = jaccard(
            truths,
            point_form(priors)
        )
        # (Bipartite Matching)
        # [1,num_objects] best prior for each ground truth
        best_prior_overlap, best_prior_idx = overlaps.max(1, keepdim=True)

        # ignore hard gt
        valid_gt_idx = best_prior_overlap[:, 0] >= 0.2
        best_prior_idx_filter = best_prior_idx[valid_gt_idx, :]
        if best_prior_idx_filter.shape[0] <= 0:
            loc_t[idx] = 0
            conf_t[idx] = 0
            return torch.zeros((1, priors.shape[0]))

        # [1,num_priors] best ground truth for each prior
        best_truth_overlap, best_truth_idx = overlaps.max(0, keepdim=True)
        best_truth_idx.squeeze_(0)
        best_truth_overlap.squeeze_(0)
        best_prior_idx.squeeze_(1)
        best_prior_idx_filter.squeeze_(1)
        best_prior_overlap.squeeze_(1)
        best_truth_overlap.index_fill_(0, best_prior_idx_filter, 2)  # ensure best prior
        # TODO refactor: index  best_prior_idx with long tensor
        # ensure every gt matches with its prior of max overlap
        for j in range(best_prior_idx.size(0)):
            best_truth_idx[best_prior_idx[j]] = j
        matches = truths[best_truth_idx]          # Shape: [num_priors,14]
        conf = labels[best_truth_idx]          # Shape: [num_priors]
        conf[best_truth_overlap < threshold] = 0  # label as background
        loc = encode(matches, priors, variances)
        loc_t[idx] = loc    # [num_priors,14] encoded offsets to learn
        conf_t[idx] = conf  # [num_priors] top class label for each prior

        return best_truth_overlap
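
The key point in the code above is that match receives loc_t and conf_t as arguments and fills them in place, so the caller never has to assign them itself. A minimal sketch of that pattern follows. Plain Python lists stand in for tensors, and toy_match with its 1-D "nearest truth" rule is a made-up simplification, not the repository's matching logic. Index assignment on a torch tensor that was passed in as an argument is visible to the caller in the same way.

```python
def toy_match(truths, priors, loc_t, conf_t, idx):
    """Hypothetical stand-in for match(): fills row idx of loc_t and conf_t.

    Each prior is paired with its nearest truth (1-D positions, for brevity)
    and the offset to that truth is stored as the target to learn.
    """
    row_loc = []
    row_conf = []
    for prior in priors:
        nearest = min(truths, key=lambda truth: abs(truth - prior))
        row_loc.append(nearest - prior)
        row_conf.append(1)  # every prior is labeled positive in this toy
    # These writes go into the caller's buffers, not into local copies.
    loc_t[idx] = row_loc
    conf_t[idx] = row_conf


num, num_priors = 2, 3
# Preallocated and not yet meaningful, like torch.Tensor(num, num_priors, 14).
loc_t = [[None] * num_priors for _ in range(num)]
conf_t = [[None] * num_priors for _ in range(num)]

priors = [0.0, 5.0, 10.0]
targets = [[1.0, 8.0], [4.0]]  # made-up ground truth for two images

for idx in range(num):
    toy_match(targets[idx], priors, loc_t, conf_t, idx)

print(loc_t)
print(conf_t)
```

After the loop, loc_t holds the targets for every image in the batch even though the calling code contains no explicit assignment to it.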

Ahh, so that's where it was hiding. Thank you very much for your reply.
