Real-Time Inference

Question

Henistein opened this issue 5 months ago · 5 comments

I noticed that in all the demo scripts the video is first processed as a whole and then propagated. Would it be possible to run on a video in real time?

Thank you in advance!

Answer 1 · 2024-08-13T11:24:31.000Z

Thanks for your issue. I think you can try a smaller SAM 2 model for faster inference, and you could check SAM 2's official repo to see whether there are any acceleration methods for this.

Answer 2 · 2024-08-13T14:32:01.000Z

@rentainhe sorry, maybe I expressed myself badly. What I meant was whether it would be possible to run Grounded SAM 2 end to end, as on a stream. Imagine I have a real-time video stream and want to segment it live while keeping track of each object. Would that be possible?
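
For illustration, the live setup described above (handle each frame as it arrives and keep a stable ID per object) can be sketched roughly as follows. This is a minimal sketch under stated assumptions, not Grounded SAM 2's or SAM 2's API: `detect` is a hypothetical per-frame callable that returns bounding boxes, `track_stream` and `iou` are illustrative names, and IDs are carried across frames by simple greedy IoU matching on boxes rather than by SAM 2's memory-based mask propagation.

```python
from typing import Callable, Dict, Iterable, Iterator, List, Tuple

Box = Tuple[float, float, float, float]  # (x1, y1, x2, y2)


def iou(a: Box, b: Box) -> float:
    """Intersection over union of two axis-aligned boxes."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def track_stream(
    frames: Iterable[object],
    detect: Callable[[object], List[Box]],  # hypothetical per-frame detector
    iou_threshold: float = 0.3,
) -> Iterator[Dict[int, Box]]:
    """Yield {track_id: box} for each frame as soon as that frame is processed."""
    tracks: Dict[int, Box] = {}
    next_id = 0
    for frame in frames:
        boxes = detect(frame)
        # Score every (existing track, new box) pair, best overlap first.
        pairs = sorted(
            ((iou(tb, b), tid, j)
             for tid, tb in tracks.items()
             for j, b in enumerate(boxes)),
            reverse=True,
        )
        assigned: Dict[int, Box] = {}
        used = set()
        for score, tid, j in pairs:
            if score < iou_threshold:
                break  # remaining pairs overlap too little to be the same object
            if tid in assigned or j in used:
                continue
            assigned[tid] = boxes[j]
            used.add(j)
        # Boxes that matched no existing track start a new ID.
        for j, b in enumerate(boxes):
            if j not in used:
                assigned[next_id] = b
                next_id += 1
        # Tracks with no match in this frame are dropped (no re-identification).
        tracks = assigned
        yield dict(tracks)


# Toy usage with made-up detections standing in for a real model.
detections = {
    0: [(0, 0, 10, 10), (50, 50, 60, 60)],
    1: [(51, 50, 61, 60), (1, 0, 11, 10)],  # same objects, listed in a different order
}
results = list(track_stream([0, 1], lambda f: detections[f]))
```

Because `track_stream` is a generator, each result is available before the next frame is read, which is the property a live stream needs. Whether the whole pipeline runs in real time still depends on how fast the per-frame model is.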

Answer 3 · 2024-08-14T02:19:32.000Z

I think SAM 2 currently only supports non-streaming video input. I have no idea how to support streaming input at this time. Maybe we can look to the community to see if there are any solutions for this issue.

Answer 4 · 2024-08-15T03:40:58.000Z

@Henistein, I hope my implementation will be helpful, though there is latency.
https://github.com/patrick-tssn/Streaming-Grounded-SAM-2

Answer 5 · 2024-08-20T12:11:52.000Z

Hi, @Henistein
Can your code take prompts like Grounded SAM 2 does? I mean, I want to track roads and cars in a video with continuous IDs. Is that possible with your implementation?

TIA