facebookresearch/co-tracker

how to draw trajectory of predicted points

Shengnan-Zhu opened this issue · 9 comments

Hi, @nikitakaraevv, thanks for your excellent work! I'd like to know how to draw the trajectories of tracked points as shown in the README. I've tried setting tracks_leave_trace=-1 in Visualizer, but the result looks a little different from the one in the README. I'd greatly appreciate it if you could share the commands you used!

Hi @Shengnan-Zhu, that's how you can do it:

vis = Visualizer(
    mode="rainbow",
    tracks_leave_trace=-1,
)
vis.visualize(
    video,
    pred_tracks,
    segm_mask=sample.segmentation,
    compensate_for_camera_motion=True
)

You need to pass a segmentation mask for the first frame of the video, and set compensate_for_camera_motion=True.
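For reference, a minimal sketch of preparing such a first-frame mask (assuming a single-channel PNG the size of the video frames, as in demo.py; the in-memory buffer below just stands in for the file passed via --mask_path):

```python
import io

import numpy as np
import torch
from PIL import Image

# Stand-in for a real mask file; demo.py reads it from --mask_path.
buf = io.BytesIO()
Image.fromarray((np.random.rand(480, 640) > 0.5).astype(np.uint8) * 255).save(
    buf, format="PNG"
)
buf.seek(0)

# Load the first-frame mask and add batch/channel dims -> [1, 1, H, W],
# the shape expected for segm_mask.
segm_mask = np.array(Image.open(buf))
segm_mask = torch.from_numpy(segm_mask)[None, None]
```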

Thanks for your reply! I tried your suggestion and passed the segm_mask (the one used with the CoTrackerPredictor model) as segm_mask in visualize(). I then hit an indexing problem across devices (CPU vs. GPU), and after moving segm_mask to CUDA the error became: "can't convert cuda:0 device type tensor to numpy. Use Tensor.cpu() to copy the tensor to host memory first.". Any further help would be appreciated, thanks!

Here's part of my code:

segm_mask = np.array(Image.open(os.path.join(args.mask_path)))
segm_mask = torch.from_numpy(segm_mask)[None, None]
assert video.shape[3:5] == segm_mask.shape[2:4], "Video and mask dimensions do not match"
args.output_name = args.output_name + "_mask"
            
pred_tracks, pred_visibility = model(
    video,
    grid_size=args.grid_size,
    grid_query_frame=args.grid_query_frame,
    backward_tracking=args.backward_tracking,
    segm_mask=segm_mask,
)
print("computed")

# save a video with predicted tracks
seq_name = args.video_path.split("/")[-1]
vis = Visualizer(save_dir="./saved_videos", pad_value=120, linewidth=2, tracks_leave_trace=-1)
vis.visualize(
    video,
    pred_tracks,
    pred_visibility,
    query_frame=args.grid_query_frame,
    filename=args.output_name,
    compensate_for_camera_motion=True,
    segm_mask=segm_mask.cuda(),
)

Can you try calling video.cpu(), pred_tracks.cpu() and segm_mask.cpu() before passing them to vis.visualize?

Yes, I tried video.cpu(), pred_tracks.cpu(), pred_visibility.cpu() and segm_mask.cpu(), but I still get an error.
Here's the error log:

Traceback (most recent call last):
  File "/home/shengnan/co-tracker/demo.py", line 119, in <module>
    vis.visualize(video.cpu(), pred_tracks.cpu(), pred_visibility.cpu(),
  File "/home/shengnan/co-tracker/cotracker/utils/visualizer.py", line 94, in visualize
    res_video = self.draw_tracks_on_video(
  File "/home/shengnan/co-tracker/cotracker/utils/visualizer.py", line 215, in draw_tracks_on_video
    res_video[t] = self._draw_pred_tracks(
  File "/home/shengnan/co-tracker/cotracker/utils/visualizer.py", line 269, in _draw_pred_tracks
    coord_y = (int(tracks[s, i, 0]), int(tracks[s, i, 1]))
ValueError: cannot convert float NaN to integer

and my code:

vis.visualize(
    video.cpu(),
    pred_tracks.cpu(),
    pred_visibility.cpu(),
    query_frame=args.grid_query_frame,
    filename=args.output_name,
    compensate_for_camera_motion=True,
    segm_mask=segm_mask.cpu(),
)

Then I tried replacing the NaNs in tracks with 0, and the demo runs normally, but the resulting video looks strange.
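For reference, that NaN replacement can be done in one call with torch.nan_to_num (a minimal sketch; pred_tracks here is a random stand-in tensor, not real model output):

```python
import torch

# Stand-in for model output: [batch, time, num_points, 2], with some NaNs.
pred_tracks = torch.randn(1, 8, 4, 2)
pred_tracks[0, 3, 2] = float("nan")

assert torch.isnan(pred_tracks).any()        # confirm NaNs are present
pred_tracks = torch.nan_to_num(pred_tracks)  # replace NaN with 0.0
assert not torch.isnan(pred_tracks).any()    # coordinates can be cast to int now
```

Zeroing the coordinates only hides the symptom, though; the NaNs themselves come from how the tracks were estimated.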

bmx-bumps_mask_pred_track.mp4

What is your grid_query_frame?

For some reason, you have NaNs in the estimated tracks. Could you send me this video?

Sure, here is the video and mask:
bmx-bumps.zip

I didn't set grid_query_frame, so it's probably the default value of 0. And the mask file is for the first frame of the video.

I think the reason for this is that you're passing segm_mask as input to the model. In this case, the model only estimates the motion of the object points and can't rely on any background points to compensate for camera motion.
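A toy illustration of that failure mode (NumPy only; the names are made up and not CoTracker API): camera-motion compensation needs query points where the mask is 0, and a mask-restricted grid has none.

```python
import numpy as np

mask = np.zeros((100, 100), dtype=np.uint8)
mask[30:70, 30:70] = 1  # object region

# Full grid: queries land on both the object and the background.
ys, xs = np.mgrid[0:100:10, 0:100:10]
full_grid = np.stack([ys.ravel(), xs.ravel()], axis=1)
background = full_grid[mask[full_grid[:, 0], full_grid[:, 1]] == 0]
assert len(background) > 0  # these points anchor the camera-motion estimate

# Mask-restricted grid (segm_mask passed to the model): object points only.
masked_grid = full_grid[mask[full_grid[:, 0], full_grid[:, 1]] == 1]
masked_bg = masked_grid[mask[masked_grid[:, 0], masked_grid[:, 1]] == 0]
assert len(masked_bg) == 0  # nothing left to compensate with
```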

That's the result I get when running this command with your video and mask:
python demo.py --grid_size 30 --video_path ./bmx-bumps.mp4 --mask_path ./mask.png

This is the code:

    pred_tracks, pred_visibility = model(
        video,
        grid_size=args.grid_size,
        grid_query_frame=args.grid_query_frame,
        backward_tracking=args.backward_tracking,
    )
    print("computed")

    # save a video with predicted tracks
    seq_name = args.video_path.split("/")[-1]
    vis = Visualizer(
        save_dir="./saved_videos",
        pad_value=120,
        linewidth=3,
        tracks_leave_trace=-1,
    )

    vis.visualize(
        video,
        pred_tracks.cpu(),
        query_frame=args.grid_query_frame,
        segm_mask=segm_mask,
        compensate_for_camera_motion=True,
    )
video_pred_track.mp4

Yes, you are right! And this time the result looks correct. Thanks very much for your help! I will close this issue.