Cartucho/OpenLabeling

New Features Discussion

Cartucho opened this issue · 20 comments

The purpose of this tool is to make labeling as easy and fast as possible.

Initial ideas:

  • Use pre-labeled images from Yolo v2;
  • Video Object Tracking;
  • Feature Matching + Homography;
  • Superpixel Segmentation;
  • Grabcut Segmentation;

Share your opinions here.

Video object tracking and feature matching / homography would greatly increase labeling efficiency for those labeling frames from a video. You could label one frame and then get 30 frames labeled for free, for example.
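To make the homography idea concrete, here is a minimal sketch of carrying one labeled box over to the next frame. It assumes a 3x3 homography `H` between the two frames is already available (in practice it might be estimated from feature matches, e.g. with OpenCV's `cv2.findHomography`, which is not shown here). The function names `apply_homography` and `propagate_bbox` are hypothetical and are not part of this repo.

```python
def apply_homography(H, x, y):
    """Map the point (x, y) through a 3x3 homography given as nested lists."""
    w = H[2][0] * x + H[2][1] * y + H[2][2]
    return (
        (H[0][0] * x + H[0][1] * y + H[0][2]) / w,
        (H[1][0] * x + H[1][1] * y + H[1][2]) / w,
    )


def propagate_bbox(H, bbox):
    """Warp the corners of (xmin, ymin, xmax, ymax) and return the enclosing box."""
    xmin, ymin, xmax, ymax = bbox
    corners = [(xmin, ymin), (xmax, ymin), (xmax, ymax), (xmin, ymax)]
    warped = [apply_homography(H, x, y) for x, y in corners]
    xs = [p[0] for p in warped]
    ys = [p[1] for p in warped]
    # The warped quadrilateral is generally not axis-aligned, so take its extents.
    return (min(xs), min(ys), max(xs), max(ys))


# Example: the camera shifted by 5 px right and 3 px up between frames.
H = [[1, 0, 5], [0, 1, -3], [0, 0, 1]]
print(propagate_bbox(H, (10, 20, 50, 60)))  # (15.0, 17.0, 55.0, 57.0)
```

A homography only models planar or distant scenes well, so boxes propagated this way would probably still need manual correction.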

I don't know what "Use pre-labeled images from Yolo v2" means. Does this mean using Yolo v2 or another pretrained network to pre-label the images, and then allowing the user to resize or correct those labels, maybe increasing labeling efficiency? I'm unsure about this one because I think the screen might be cluttered, the modifications might take just as much work as pure labeling, and one is limited to the classes of the pre-trained network for pre-labels. I'm attracted to this theme though: using pre-trained models to increase labeling efficiency. I think it can be done somehow. And now I remember this video: https://youtu.be/t4kyRyKyOpo?t=13m3s (the labeling idea is shown from 13m3s to 14m3s). The idea in this video is classification-based, not bounding-box-based, but many bounding-box networks treat the problem like a classification problem by considering many regions.

Superpixel Segmentation: this one seems to be good for many project types. I think with this, one could convert a single click into a rough segmentation, and then automatically turn that into a bounding box, for bounding box projects. And for segmentation projects, one could click a few more times to get a precise segmentation.
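As a rough illustration of the "single click to bounding box" idea, the sketch below assumes a superpixel label map has already been computed by some segmentation algorithm (not shown) and is available as a 2D list of integer ids. The helper name `bbox_from_click` is hypothetical.

```python
def bbox_from_click(labels, click_x, click_y):
    """Return (xmin, ymin, xmax, ymax) of the superpixel under the click.

    `labels` is a 2D list where labels[y][x] is the superpixel id of that pixel.
    """
    target = labels[click_y][click_x]
    xs, ys = [], []
    for y, row in enumerate(labels):
        for x, value in enumerate(row):
            if value == target:
                xs.append(x)
                ys.append(y)
    return (min(xs), min(ys), max(xs), max(ys))


# Tiny 5x4 label map: superpixel 1 sits in the middle of background 0.
labels = [
    [0, 0, 0, 0, 0],
    [0, 1, 1, 0, 0],
    [0, 1, 1, 1, 0],
    [0, 0, 1, 0, 0],
]
print(bbox_from_click(labels, 2, 2))  # (1, 1, 3, 3)
```

For segmentation projects, further clicks could add more superpixel ids to the selected set before the mask is saved.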

Grabcut segmentation: this one seems focused on segmentation projects at the expense of bounding box projects. I value superpixel segmentation more, since it's more flexible.

I'm working on a bounding box project that involves video, so I'm biased towards video object tracking and feature matching / homography.

@MattKleinsmith
What do you think of changing the name of the repo to:
a) SmartLabeling
b) OpenLabeling

The goal of this tool is independent of Yolo, so I think the name should be changed. It's also better to change it early, before the repo starts getting references.

Between a and b I prefer b. It seems more welcoming.

Is there a way to erase a box surrounding an object?

@gmanolak Double-click to select the bbox, and then click the x.

rcabg commented

Hey @Cartucho and @MattKleinsmith,

I'm working on the video tracking feature. I have something but I need some feedback before continuing. Should I create a pull request and discuss there?

Cheers!

Hello @rcabg that would be great! Please make it a separate PR.

I have also made a draft version, so we could merge them together and hopefully find a way to add that feature.

I have implemented "click-drag resize" for the bbox, instead of the "quick delete" -> "draw new one" flow in the current version.
I found it useful in one case:
when the tracker creates inexact bboxes for some frames, we can resize the bbox without deleting it. This matters because deleting currently causes the boxes in later frames to be deleted as well.
If you guys feel it is useful, I will clean up the code a bit and make a pull request.
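A minimal sketch of what such a resize step could look like, assuming boxes are stored as `(xmin, ymin, xmax, ymax)` and that the UI already knows which edge or corner was grabbed. The function name `resize_bbox` and the handle names are hypothetical, not taken from the actual PR.

```python
def resize_bbox(bbox, handle, mouse_x, mouse_y):
    """Move the grabbed edge(s) of a box to the current mouse position.

    `handle` is a string such as "left", "top", "bottom-right" or "top-left".
    """
    xmin, ymin, xmax, ymax = bbox
    if "left" in handle:
        xmin = mouse_x
    if "right" in handle:
        xmax = mouse_x
    if "top" in handle:
        ymin = mouse_y
    if "bottom" in handle:
        ymax = mouse_y
    # Re-normalize in case the user dragged an edge past the opposite one.
    return (min(xmin, xmax), min(ymin, ymax), max(xmin, xmax), max(ymin, ymax))


print(resize_bbox((10, 10, 50, 50), "bottom-right", 70, 40))  # (10, 10, 70, 40)
print(resize_bbox((10, 10, 50, 50), "left", 60, 0))           # (50, 10, 60, 50)
```

Because the box is updated in place rather than deleted and redrawn, labels in later frames that belong to the same tracked object would not need to be removed.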

Yes! That was on my TODO list, haha! Great!
Make a PR and I will test and help you out.

Yeah, I will clean up the code a bit and make a pull request!
Have you planned to integrate the tracker with a state-of-the-art deep learning object detection model, to reduce manual labeling as much as possible?
For example, we may have a very long video (a whole day, for instance) and want the program to label it automatically for us. Then we can correct the labels afterwards. I think this could save a lot of labeling time.

Yeah, I agree with you although we would need to try its usability.
As it was previously argued here, it may turn out that having to fix automatically generated labels might be as hard as pure labeling.

There are great deep-learning trackers, better than the ones currently implemented in OpenCV. Using a state-of-the-art tracker would greatly improve the predictions for each object's bounding box in a video.

I have made a pull request for click-drag resizing. Would you mind taking a look?

@vuthede Now, with the resizing, we can greatly improve the video tracker (so that when one box is resized, the other associated labels are re-adjusted).

Also, another great feature would be if we allowed the users to label with a single click and drag.

Yeah, I thought about the first one too.
Could you clarify what you mean by labeling with a single click and drag?

Instead of having to click twice per bounding box, e.g.:

  1. click: left-top
  2. click: right-bottom

the user could click only once and drag the mouse, e.g.:

  1. click : left-top
  2. move mouse with click 1 still pressed
  3. release click 1.
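The press / move / release sequence above can be sketched as a small state machine. This is only an illustration under assumptions: the class `DragBoxDrawer` and its method names are hypothetical, and in an OpenCV window the three methods would probably be driven from a `cv2.setMouseCallback` handler on `cv2.EVENT_LBUTTONDOWN`, `cv2.EVENT_MOUSEMOVE` and `cv2.EVENT_LBUTTONUP`.

```python
class DragBoxDrawer:
    """Collects boxes drawn with one press-drag-release gesture."""

    def __init__(self):
        self.anchor = None   # point where the button was pressed
        self.current = None  # latest mouse position while dragging (for preview)
        self.boxes = []      # finished boxes as (xmin, ymin, xmax, ymax)

    def on_press(self, x, y):
        self.anchor = (x, y)
        self.current = (x, y)

    def on_move(self, x, y):
        # Only track movement while the button is held down.
        if self.anchor is not None:
            self.current = (x, y)

    def on_release(self, x, y):
        if self.anchor is None:
            return
        ax, ay = self.anchor
        self.anchor = None
        self.current = None
        # Ignore zero-area boxes, e.g. a click without any drag.
        if ax != x and ay != y:
            self.boxes.append((min(ax, x), min(ay, y), max(ax, x), max(ay, y)))


drawer = DragBoxDrawer()
drawer.on_press(40, 30)
drawer.on_move(20, 50)
drawer.on_release(10, 60)
print(drawer.boxes)  # [(10, 30, 40, 60)]
```

Sorting the corners on release means the user can drag in any direction, not only from left-top to right-bottom.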

Oh yeah, thanks, I understand. If you plan to work on other features, I think I can help you finish some of the ones you have just mentioned.

Also, another cool thing to add would be the option between 1. rectangle (object detection) versus 2. pixel labeling (image segmentation).

It would be cool to have an option to choose which file type the draw_bboxes_from_file method draws the boxes from: .xml or .txt.
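One way such an option could work is to dispatch on the file extension. The sketch below is not the repo's draw_bboxes_from_file; it assumes the .txt files use the YOLO layout (`class x_center y_center width height`, normalized to the image size) and the .xml files use PASCAL VOC style `object` / `bndbox` elements. All function names here are hypothetical.

```python
import xml.etree.ElementTree as ET


def boxes_from_yolo_txt(text, img_w, img_h):
    """Parse YOLO-style lines into (class_id, xmin, ymin, xmax, ymax) in pixels."""
    boxes = []
    for line in text.splitlines():
        parts = line.split()
        if len(parts) != 5:
            continue  # skip blank or malformed lines
        class_id = int(parts[0])
        xc, yc, w, h = (float(v) for v in parts[1:])
        boxes.append((
            class_id,
            round((xc - w / 2) * img_w),
            round((yc - h / 2) * img_h),
            round((xc + w / 2) * img_w),
            round((yc + h / 2) * img_h),
        ))
    return boxes


def boxes_from_voc_xml(text):
    """Parse PASCAL VOC style XML into (name, xmin, ymin, xmax, ymax) in pixels."""
    root = ET.fromstring(text)
    boxes = []
    for obj in root.iter("object"):
        bb = obj.find("bndbox")
        coords = (int(float(bb.findtext(tag))) for tag in ("xmin", "ymin", "xmax", "ymax"))
        boxes.append((obj.findtext("name"), *coords))
    return boxes


def load_boxes(filename, text, img_w, img_h):
    """Choose the parser from the file extension."""
    lower = filename.lower()
    if lower.endswith(".xml"):
        return boxes_from_voc_xml(text)
    if lower.endswith(".txt"):
        return boxes_from_yolo_txt(text, img_w, img_h)
    raise ValueError("unsupported annotation file: " + filename)


print(load_boxes("frame_0.txt", "0 0.5 0.5 0.2 0.4", 100, 200))
```

Note that the .txt variant needs the image size to recover pixel coordinates, while the .xml variant already stores them.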

@VytautasDv one of the users said he would submit a PR for this #59

Moving a box edge and dragging the image both use the right mouse button, so they clash. I think using the middle mouse button to drag the image would be more convenient.