Initialization during inference

Question

MaxEAB opened this issue 2 years ago · 6 comments

Hi, is it possible to initialize tracking using a mask and a frame (or a sequence of frames) saved in advance? This is for single-object tracking where the object to be tracked is known in advance.

Answer 1 · 2023-03-22T15:55:02.000Z

Yes. See https://github.com/hkchengrex/XMem/blob/main/docs/INFERENCE.md#on-custom-data

Feel free to comment if you run into problems.

Answer 2 · 2023-03-22T20:36:58.000Z

Hi, thanks for the reply. I meant a scenario in which the initial reference frame is known and saved in advance, but the tracking happens live, using the saved reference images for initialization.

Basically, the user will not actively initiate the process.

Answer 3 · 2023-03-23T06:42:15.000Z

I am not sure if I understand. Where do the images to be segmented come from? What do you mean by live?

Answer 4 · 2023-03-23T06:43:28.000Z

If the user is not initiating... who is?

Answer 5 · 2023-03-23T13:49:29.000Z

Suppose we are interested in tracking a specific type of object, say a certain make/model of a car, which is known in advance. And we want the same car to be identified (semantically segmented) by a street camera, without anyone actively initializing the first frame.

Is it possible to save a few reference images and masks beforehand so that the network can use those to initialize inference, instead of the user choosing a first frame? (This means the network would have to handle an abrupt change of scene/background between the reference frame and the current frame.)

Answer 6 · 2023-03-23T17:32:12.000Z

I see. Thank you for the explanation.
In that case, I would still treat the input as a "video", except that the first frame is quite different from the second. I would also reset the sensory memory at the second frame because of this difference. For multiple objects, we can save all of them to the working memory (and force them to remain in the working memory at all times). Some inference logic would have to be rewritten.
Whether it would work well depends largely on how different the reference and the query are.
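
A minimal sketch of that flow, for illustration only. It assumes a hypothetical wrapper object with three methods (`memorize`, `reset_sensory`, `segment`); these names are made up here and are not XMem's actual API, so the real inference code would have to be adapted to provide something like them.

```python
from typing import Any, Iterable, Iterator, Protocol, Sequence, Tuple

Frame = Any  # e.g. an image array or tensor
Mask = Any   # e.g. a per-pixel label map


class Tracker(Protocol):
    """Hypothetical interface (an assumption, not XMem's real API)."""

    def memorize(self, frame: Frame, mask: Mask) -> None:
        """Store a reference frame and its mask so that it is never evicted."""

    def reset_sensory(self) -> None:
        """Clear short-term (sensory) state carried over from the last frame."""

    def segment(self, frame: Frame) -> Mask:
        """Segment a new frame using whatever is currently in memory."""


def track_from_references(
    tracker: Tracker,
    references: Sequence[Tuple[Frame, Mask]],
    live_frames: Iterable[Frame],
) -> Iterator[Mask]:
    """Initialize from saved (frame, mask) pairs, then segment live frames."""
    if not references:
        raise ValueError("at least one reference (frame, mask) pair is required")

    # Treat the saved references as the start of the "video".
    for frame, mask in references:
        tracker.memorize(frame, mask)

    # The first live frame may look nothing like the references, so drop
    # the sensory state before segmenting it.
    tracker.reset_sensory()

    for frame in live_frames:
        yield tracker.segment(frame)
```

Because this is a generator, nothing is memorized until the first result is requested, and the live frames can come from an open-ended source such as a camera stream.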