Real-time SLAM
mattiapiz opened this issue · 10 comments
Hi, I have a question about the implementation of the algorithm.
In particular, from what I saw, everything is done offline: you take an RGB-D video, pass it to the Co-SLAM algorithm, and it gives you the reconstructed mesh of the scene and the estimated camera trajectory. So my question is: does the "Real-time SLAM" in the title refer only to the fact that mapping and tracking are done simultaneously, or is it actually done in real time, so that we could use it to map an environment and localize a robot in it?
Am I missing something? The second question is, if everything is offline, what is the purpose of having the trajectory of the camera?
Thanks a lot for sharing your code.
Hi @mattiapiz, Co-SLAM is an online algorithm, as the RGB-D video is passed to Co-SLAM frame by frame.
Thanks for the quick reply. So I'm doing something wrong, then.
In my case, I have a RealSense D435i with which I capture a video of the room I want to map. However, following the guide you provided for the Azure camera, the video is first recorded, then split into frames, reconstructed to obtain the bounds, and finally the Co-SLAM algorithm is run.
- Is there a smart way to pass the frames obtained from the camera in real time? (Any suggestions would help.)
- In Nice-SLAM there is a script, visualizer.py, for visualizing the result. Is a similar script also available for Co-SLAM?
I don't know if I'm asking too much, but in any case, I appreciate your availability :)
I would suggest you try Open3D for live capture with the RealSense and feed the frames to Co-SLAM. We have not released our code for visualization.
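For example, a rough sketch along these lines (untested; the output layout, config values, and file naming are my own assumptions, not part of Co-SLAM) could dump frames with Open3D's RealSense wrapper into a folder a Co-SLAM dataloader can point at:

```python
# Rough sketch (untested) of live capture with Open3D's RealSense wrapper.
# Output folder, config values, and file naming are assumptions; adapt them
# to whatever layout your Co-SLAM config expects.
import os
import open3d as o3d

out_dir = "./data/realsense_live"            # hypothetical output folder
os.makedirs(os.path.join(out_dir, "color"), exist_ok=True)
os.makedirs(os.path.join(out_dir, "depth"), exist_ok=True)

rs_cfg = o3d.t.io.RealSenseSensorConfig({
    "serial": "",                            # empty string: first device found
    "color_format": "RS2_FORMAT_RGB8",
    "color_resolution": "640,480",
    "depth_format": "RS2_FORMAT_Z16",
    "depth_resolution": "640,480",
    "fps": "30",
})

rs = o3d.t.io.RealSenseSensor()
rs.init_sensor(rs_cfg, 0)
rs.start_capture()

for fid in range(300):                       # capture a fixed number of frames
    rgbd = rs.capture_frame(True, True)      # wait for a frame, align depth to color
    o3d.t.io.write_image(f"{out_dir}/color/frame_{fid:05d}.png", rgbd.color)
    o3d.t.io.write_image(f"{out_dir}/depth/frame_{fid:05d}.png", rgbd.depth)

rs.stop_capture()
```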
Thank you for the suggestion.
So, I can run Open3D to capture the frames on one side and Co-SLAM on the other side, passing it the folder where the frames are saved. It might indeed make sense. The only thing is that Co-SLAM operates at a lower frame rate compared to the camera's frame rate. Do you suggest reducing the camera's frame rate, or is it okay to have a delay between the real-time frames captured by the camera and those provided by the algorithm? Also, regarding the visualizer, do you plan to release it in the future, or is there a way to obtain it somehow? It would be helpful to see how the map is constructed in real-time.
Hi @mattiapiz, you can use the code here:
You can simply skip frames to achieve real-time performance. For the visualizer, we do not currently plan to release the code. I would suggest you save the mesh and display it with Open3D in another process.
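For the second part, a minimal sketch of that idea could look like this (untested; the mesh path and refresh rate are assumptions): Co-SLAM keeps overwriting a mesh file on disk, and a separate script polls the file and refreshes an Open3D window.

```python
# Minimal sketch (untested) of viewing the reconstruction in a separate
# process: poll a mesh file that the SLAM process keeps overwriting and
# refresh an Open3D window. The mesh path and refresh rate are assumptions.
import time
import open3d as o3d

mesh_path = "./output/mesh.ply"              # hypothetical path the SLAM process writes to

vis = o3d.visualization.Visualizer()
vis.create_window("Co-SLAM reconstruction")
first = True

while True:
    mesh = o3d.io.read_triangle_mesh(mesh_path)   # may be empty if read mid-write
    if not mesh.is_empty():
        mesh.compute_vertex_normals()
        vis.clear_geometries()
        vis.add_geometry(mesh, reset_bounding_box=first)
        first = False
    if not vis.poll_events():                # window closed
        break
    vis.update_renderer()
    time.sleep(1.0)                          # re-read the mesh roughly once per second
```

If a read happens while the file is being rewritten, the mesh may come back empty for that iteration; the loop just skips the update and tries again on the next tick.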
Hi @HengyiWang, thank you very much for the answers. I will try to follow your suggestions and hopefully succeed.
Regarding visualization, so far I have only looked at the final mesh that way. However, my idea is to take the Nice-SLAM visualizer and make it compatible with Co-SLAM data.
I have one last question concerning multi-processing. Specifically, I have two RTX 4000 GPUs that I would like to use for both mapping and tracking. However, even when using coslam_mp.py, only one GPU is recognized. Is it possible to use both of them? In Nice-SLAM, I was able to separate tracking from mapping.
I don't know if the last question is off-topic; if so, sorry about that, and I can open a new issue for it.
Best regards.
Hi @mattiapiz, it is possible to use two GPUs for it. However, I only have access to a desktop with one GPU at the moment, so I have not had a chance to test and modify the code for multi-processing with two GPUs :(
But it is definitely a good idea. I will see if I can get a suitable machine and make some modifications to the code to support two GPUs when I have time.
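In the meantime, the general pattern would be one process per GPU, with frames handed over through queues. The sketch below is just an illustration of that pattern with placeholder workers and dummy frames; it is not how coslam_mp.py is actually structured.

```python
# Hedged sketch of splitting tracking and mapping across two GPUs with
# torch.multiprocessing. The workers are placeholders; real tracking/mapping
# logic would replace the commented-out steps. Assumes two CUDA devices.
import torch
import torch.multiprocessing as mp

def tracking_worker(frame_q, keyframe_q, gpu_id):
    # Tracking process: owns its own GPU and estimates a pose per incoming frame.
    dev = torch.device(f"cuda:{gpu_id}")
    while True:
        frame = frame_q.get()
        if frame is None:                    # sentinel: capture finished
            keyframe_q.put(None)
            break
        frame = frame.to(dev, non_blocking=True)
        # ... pose optimization against a frozen copy of the map would go here ...
        keyframe_q.put(frame.cpu())          # hand (key)frames over to the mapper

def mapping_worker(keyframe_q, gpu_id):
    # Mapping process: owns the other GPU and updates the scene representation.
    dev = torch.device(f"cuda:{gpu_id}")
    while True:
        keyframe = keyframe_q.get()
        if keyframe is None:
            break
        keyframe = keyframe.to(dev, non_blocking=True)
        # ... joint optimization of the map (and poses) would go here ...

if __name__ == "__main__":
    mp.set_start_method("spawn")             # required when CUDA is used in child processes
    frame_q, keyframe_q = mp.Queue(), mp.Queue()

    tracker = mp.Process(target=tracking_worker, args=(frame_q, keyframe_q, 0))
    mapper = mp.Process(target=mapping_worker, args=(keyframe_q, 1))
    tracker.start()
    mapper.start()

    for _ in range(10):                      # dummy RGB frames just for illustration
        frame_q.put(torch.rand(3, 480, 640))
    frame_q.put(None)

    tracker.join()
    mapper.join()
```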
Thank you very much for responding, I am grateful to you.
Currently, I am trying to run the code on a little car, and unfortunately I don't have deep enough knowledge of the code to transfer data from one GPU to another and let the algorithm run on multiple GPUs.
I will wait for the release of the code for multiple GPUs if that's the case :).
I have just one last question, I hope. Regarding how the algorithm consumes the data, I would like to make sure that the training of the networks is done frame by frame in both the single-process and multi-process code. In particular, if I understood correctly, it takes one image, performs mapping or tracking, and then takes the next one, right?
Best
Yes, it is done frame by frame. For every input frame, we initialize the pose via a constant-speed assumption and perform tracking (every frame) + mapping (every 5 frames). The tutorial for capturing the data is for research purposes; it does not mean Co-SLAM is an offline method.
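In pseudocode, that per-frame loop looks roughly like this (a hedged sketch, not the actual Co-SLAM source; `track_frame`, `map_frame`, and the dummy frames are placeholders):

```python
# Hedged sketch of the per-frame loop described above: constant-speed pose
# initialization, tracking on every frame, mapping every 5th frame.
# track_frame/map_frame and the random frames are placeholders.
import torch

def track_frame(rgb, depth, pose_init):
    # Placeholder: the real method optimizes the camera pose with the map frozen.
    return pose_init

def map_frame(rgb, depth, pose):
    # Placeholder: the real method jointly optimizes the scene representation.
    pass

map_every = 5                  # mapping frequency mentioned above
frames = [(torch.rand(3, 480, 640), torch.rand(480, 640)) for _ in range(20)]

poses = []                     # estimated camera-to-world poses (4x4 matrices)
for frame_id, (rgb, depth) in enumerate(frames):
    if frame_id < 2:
        pose_init = torch.eye(4) if not poses else poses[-1].clone()
    else:
        # Constant-speed assumption: re-apply the last relative motion.
        delta = poses[-1] @ torch.linalg.inv(poses[-2])
        pose_init = delta @ poses[-1]

    pose = track_frame(rgb, depth, pose_init)   # tracking on every frame
    poses.append(pose)

    if frame_id % map_every == 0:
        map_frame(rgb, depth, pose)             # mapping every 5th frame
```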
Thanks very much @HengyiWang