qiank10/MVDNet

labels about ground-truth

Etah0409 opened this issue · 11 comments

Hi, thanks for sharing the bounding-box annotations for the ORR dataset. I have some questions about the annotations (such as units and meaning). Could you please add some notes?
Best wishes

The format of the 2D labels is:
Class, ID, Center Loc X (m), Center Loc Y (m), Width (m), Length (m), Yaw (degree)

The format of the 3D labels is the same as the KITTI labels.

The yaw angle is the clockwise angle from the heading direction of the ego vehicle in the bird's eye view.
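
For reference, here is a minimal sketch of parsing one 2D label line under that format (the helper name parse_label_2d is illustrative, not from the repo):

def parse_label_2d(line):
    # Class, ID, Center Loc X (m), Center Loc Y (m), Width (m), Length (m), Yaw (deg)
    fields = line.strip().split(' ')
    return {
        'class': fields[0],
        'id': int(fields[1]),
        'center_x_m': float(fields[2]),  # meters, ego-vehicle frame
        'center_y_m': float(fields[3]),  # meters, ego-vehicle frame
        'width_m': float(fields[4]),
        'length_m': float(fields[5]),
        'yaw_deg': float(fields[6]),     # clockwise from ego heading, bird's eye view
    }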

Thanks for your kind reply! It really helps a lot.

Hi there,
I have been working with this demo recently, but I ran into a contradictory problem when drawing the output results.
I used matplotlib.pyplot.imshow for the raw radar image and plt.scatter for the lidar point cloud, and the two match each other well, as shown below:
[image: raw radar image overlaid with projected lidar points, label boxes, and prediction boxes]
In this figure, the blue boxes are label boxes and the red dots are their centers; the red boxes are the model's prediction boxes and the blue dots are their centers. The yellow points are projected from the lidar point cloud, and the background image is the raw radar image.
As you can see, the label box centers (x, y) land in the right places in the figure above. But no matter how I shift a center point by plus or minus half of the box width or length, I cannot move it onto a corner of the corresponding lidar cluster in a way that works for the boxes on both sides of the road.
To check whether it is correct, here is my result-viewing code:

import matplotlib.pyplot as plt
import matplotlib.patches as patches
import numpy as np
import cv2
from PIL import Image
import os

inputDataPath = '/data/2019-01-10-11-46-21-radar-oxford-10k/processed/'

def readLidarData(filename):
    lidarData = np.fromfile(filename, dtype=np.float32)
    lidarData = np.reshape(lidarData, (-1, 4))
    [pcNum, pcDim] = np.shape(lidarData)
    x_arr = []
    y_arr = []
    z_arr = []
    intensity_arr = []
    for i in range(pcNum):
        # coordinate transfer: meters in the ego frame -> pixels in the
        # 320 x 320 bird's-eye-view image (5 px/m over a +/-32 m range)
        indx_x = 319 - 5 * (lidarData[i, 0] + 32)
        indx_y = 5 * (lidarData[i, 1] + 32)
        indx_z = 5 * lidarData[i, 2]
        intensity = lidarData[i, 3]
        if indx_x > 320 or indx_x < 0 or indx_y > 320 or indx_y < 0:
            continue
        # pack them together
        x_arr.append(indx_x)
        y_arr.append(indx_y)
        z_arr.append(indx_z)
        intensity_arr.append(intensity)
    return x_arr, y_arr, z_arr, intensity_arr

def readRawIndex():
    # map frame index -> timestamp using the evaluation split file
    dict_temp = {}
    fn_evalIndex = os.path.join(inputDataPath, 'ImageSets', 'eval.txt')
    file = open(fn_evalIndex)
    for line in file.readlines():
        line = line.strip()
        k = line.split(' ')[0]
        v = line.split(' ')[1]
        dict_temp[k] = v
    return dict_temp

def readResult():
    filename = './output/mvdnet/instances_predictions.csv'
    return np.loadtxt(filename)

def readLabel(labelFilename):
    this_labelFilename = os.path.join(inputDataPath, 'label_2d', labelFilename)
    class_arr = []
    id_arr = []
    leftup_x_arr = []
    leftup_y_arr = []
    width_arr = []
    height_arr = []
    angle_arr = []
    with open(this_labelFilename, "r") as f:
        for line in f.readlines():
            tmp = line.split(' ')
            class_arr.append(tmp[0])
            id_arr.append(int(tmp[1]))
            # center - width/2 + center_image
            # leftup_x = 5 * (float(tmp[2]) + float(tmp[4]) / 2) + 319 / 2
            # leftup_y = 5 * (float(tmp[3]) + float(tmp[5]) / 2) + 319 / 2
            leftup_x = 5 * float(tmp[2]) + 319 / 2
            leftup_y = 5 * float(tmp[3]) + 319 / 2
            leftup_x_arr.append(leftup_x)
            leftup_y_arr.append(leftup_y)
            width_arr.append(5 * float(tmp[4]))
            height_arr.append(5 * float(tmp[5]))
            # angle adjust
            angle = float(tmp[6])
            # if angle >= -112.5 and angle < 67.5:
            #     direction = 0
            # else:
            #     direction = 1
            # if angle < -112.5:
            #     angle = angle + 180
            # else:
            #     angle = angle - 180
            angle_arr.append(angle)
    return class_arr, id_arr, leftup_x_arr, leftup_y_arr, width_arr, height_arr, angle_arr

def drawOutput(resultMat, rawIndexDict):
    [objNum, resDim] = np.shape(resultMat)
    frameNum = 2799
    fig = plt.figure()
    for i in range(objNum):
        # i += 100
        frameNumNow = int(resultMat[i, 0])
        timestamp = rawIndexDict[str(frameNumNow)]
        fig.canvas.set_window_title(str(timestamp))
        # read raw data images
        if frameNumNow != frameNum:
            print("frame num: {}".format(frameNum))
            # read lidar data
            lidarFilename = '{}.bin'.format(timestamp)
            lidar_rawDataFilename = os.path.join(inputDataPath, 'lidar', lidarFilename)
            [x_arr, y_arr, z_arr, intensity_arr] = readLidarData(lidar_rawDataFilename)
            # read radar data
            radarFilename = '{}.jpg'.format(timestamp)
            radar_rawDataFilename = os.path.join(inputDataPath, 'radar', radarFilename)
            # draw point cloud and radar image
            plt.scatter(y_arr, x_arr, s=0.05, c='yellow')
            plt.imshow(Image.open(radar_rawDataFilename))
            # load label bbox (read label 2d)
            labelFilename = '{}.txt'.format(timestamp)
            [class_arr, id, label_leftup_x, label_leftup_y,
             label_width, label_height, label_angle] = readLabel(labelFilename)
            for j in range(len(id)):
                plt.gca().add_patch(patches.Rectangle(
                    (label_leftup_x[j], label_leftup_y[j]),
                    label_width[j], label_height[j],
                    # angle = 180 * angle / np.pi,
                    angle=180 - label_angle[j],
                    edgecolor='blue',
                    facecolor='none',
                    lw=2))
                plt.plot(label_leftup_x[j], label_leftup_y[j], 'ro')
        # load result bbox
        height = resultMat[i, 3]
        width = resultMat[i, 4]
        leftup_x = resultMat[i, 1]  # + width/2
        leftup_y = resultMat[i, 2]  # + height/2
        angle = resultMat[i, 5]
        # draw them together
        if frameNumNow == frameNum:
            plt.gca().add_patch(patches.Rectangle(
                (leftup_x, leftup_y), width, height,
                # angle = 180 * angle / np.pi,
                angle=90 - angle,
                edgecolor='red',
                facecolor='none',
                lw=2))
            plt.plot(leftup_x, leftup_y, 'bo')
        else:
            # plt.hold(False)
            plt.pause(0.1)
            if frameNumNow == 2830:
                plt.pause(0)
            plt.clf()
            frameNum = frameNumNow

def main():
    resultMat = readResult()
    rawDataIndexDict = readRawIndex()
    drawOutput(resultMat, rawDataIndexDict)

if __name__ == "__main__":
    main()

I'm keen to get some help.
Thank you.

Sorry, I may have solved this problem by doing the rotation before the translation. My fault, sorry.
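
For anyone who hits the same thing, here is a minimal sketch of that fix (my own illustration with a hypothetical helper name, not code from the repo): rotate the half-size offset by the box's yaw first, and only then translate by the box center, rather than offsetting the center by half the width/length along the axes.

import numpy as np

def box_corner(cx, cy, w, l, yaw_deg):
    # Rotate the half-size offset by the yaw first, then translate by
    # the center. The yaw sign may need flipping for image axes (y down).
    yaw = np.deg2rad(yaw_deg)
    R = np.array([[np.cos(yaw), -np.sin(yaw)],
                  [np.sin(yaw),  np.cos(yaw)]])
    offset = R @ np.array([-w / 2.0, -l / 2.0])  # rotate first...
    return np.array([cx, cy]) + offset           # ...then translate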

I want to visualize the labels too. Can you provide the code? Thank you for your help!

Sorry to tell you that the code above is all I have.
Thanks.

@yyxr75 Could you tell me how to get instances_predictions.csv? When I run eval.py, I get instances_predictions.pth. Is there code to convert .pth to .csv?

Thank you.

Have you solved it? Thanks.

Have you solved it? Thanks.

@MaiRajborirug @yyxr75

@neverstoplearn @yyxr75 I can't find instances_predictions.csv, so I wrote my own readResult function to read the results from instances_predictions.pth. Let me know if you find any bugs or offset/scale adjustments needed in the function. My visualization for frame '2799' (1547121487422169.bin) looks weird: the ground truth (blue) and the prediction (red) should overlap more. They seem to share the same top-left points (blue and red dots), but the code might draw the bounding boxes' width in the incorrect direction.

import numpy as np
import torch

def readResult(filename='../output/mvdnet/instances_predictions.pth'):
    '''Returns an array with one row per detection:
    frameNumNow = resultMat[i, 0]  -> e.g. 2799
    leftup_x    = resultMat[i, 1]  # + width/2
    leftup_y    = resultMat[i, 2]  # + height/2
    height      = resultMat[i, 3]
    width       = resultMat[i, 4]
    angle       = resultMat[i, 5]
    (red color for predictions)
    '''
    checkpoint = torch.load(filename)
    len_eval = len(checkpoint)  # number of evaluation frames
    resultMat = []
    for i in range(len_eval):
        cars = checkpoint[i]['instances']
        frameNumNow = int(checkpoint[i]['image_id'])
        for j in range(len(cars)):
            leftup_x = cars[j]['bbox'][0]  # + width/2
            leftup_y = cars[j]['bbox'][1]  # + height/2
            height = cars[j]['bbox'][2]
            width = cars[j]['bbox'][3]
            angle = cars[j]['bbox'][4]
            score = cars[j]['score']
            resultArr = [frameNumNow, leftup_x, leftup_y, height, width, angle, score]
            resultMat.append(resultArr)
    resultMat = np.array(resultMat)
    return resultMat

[image: visualization result for frame 2799 with ground-truth (blue) and predicted (red) boxes]
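
A hedged guess at the cause, in case someone can confirm: detectron2-style rotated boxes are usually stored center-based as (cx, cy, w, h, angle), with the angle in degrees counter-clockwise, while matplotlib's Rectangle takes a corner anchor and rotates about it. If that convention applies here, the values read as leftup_x/leftup_y above are really the box center, and the anchor corner has to be recovered by rotating the half-size offset before translating, e.g.:

import numpy as np
import matplotlib.patches as patches

def rotated_box_patch(cx, cy, w, h, angle_deg, color='red'):
    # Assumes a center-based (cx, cy, w, h, angle) box. Rectangle rotates
    # about its anchor corner, so place the anchor at the rotated position
    # of the (-w/2, -h/2) corner relative to the center.
    a = np.deg2rad(angle_deg)
    R = np.array([[np.cos(a), -np.sin(a)],
                  [np.sin(a),  np.cos(a)]])
    corner = np.array([cx, cy]) + R @ np.array([-w / 2.0, -h / 2.0])
    return patches.Rectangle(corner, w, h, angle=angle_deg,
                             edgecolor=color, facecolor='none', lw=2)

Note that with imshow the y-axis points down, so the on-screen rotation direction is mirrored and the angle sign may need negating.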

Thanks.

Can you correctly associate the 3D bboxes with their corresponding objects? I ran every coordinate transformation I could think of and then used Open3D to visualize the scene and the boxes. I tried my best, but I can't line them up. 😭😭
I've been stumped by this question for a long time. Can you help me?
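
Not an answer, but a starting point for sanity-checking conventions: a minimal Open3D sketch that overlays one oriented 3D box on a lidar scan. The file name, box values, axis order, and yaw sign here are all assumptions to be adjusted for this dataset.

import numpy as np
import open3d as o3d

def make_box(center, extent, yaw):
    # Oriented box from center (x, y, z), extent (w, l, h), and yaw in
    # radians about the z-axis; axis order and yaw sign are assumptions.
    R = o3d.geometry.get_rotation_matrix_from_axis_angle(np.array([0.0, 0.0, yaw]))
    box = o3d.geometry.OrientedBoundingBox(np.asarray(center, dtype=float), R,
                                           np.asarray(extent, dtype=float))
    box.color = (1.0, 0.0, 0.0)
    return box

# load one scan stored as float32 (x, y, z, intensity) rows
points = np.fromfile('lidar/1547121487422169.bin', dtype=np.float32).reshape(-1, 4)
pcd = o3d.geometry.PointCloud(
    o3d.utility.Vector3dVector(points[:, :3].astype(np.float64)))
# example box values only; replace with a parsed label entry
o3d.visualization.draw_geometries([pcd, make_box((5.0, 2.0, 0.0), (1.8, 4.5, 1.6), 0.3)])

If a box lands near but not on its object, the usual suspects are a swapped x/y axis, a flipped yaw sign, or a box center given at the bottom face instead of the geometric center (as in KITTI).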