Checkpoint Loading Error in Demo_sim_Erender.py

Question


Opened this issue a year ago · 4 comments

Hi Chen,

I followed all the instructions provided in the documentation but encountered an error related to checkpoint loading. The error message is as follows:

pybullet build time: Nov 28 2023 23:51:11
Traceback (most recent call last):
  File "demo_sim_Erender.py", line 77, in <module>
    run_demo()
  File "demo_sim_Erender.py", line 16, in run_demo
    pipeline = CorrespondenceBasedPipeline(
  File "/root/CNS/cns/benchmark/pipeline.py", line 70, in __init__
    self.control = GraphVSController(ckpt_path, self.device)
  File "/root/CNS/cns/benchmark/controller.py", line 13, in __init__
    self.net: GraphVS = torch.load(ckpt_path, map_location=self.device)["net"]
  File "/root/miniconda3/envs/myconda/lib/python3.8/site-packages/torch/serialization.py", line 789, in load
    return _load(opened_zipfile, map_location, pickle_module, **pickle_load_args)
  File "/root/miniconda3/envs/myconda/lib/python3.8/site-packages/torch/serialization.py", line 1131, in _load
    result = unpickler.load()
  File "/root/miniconda3/envs/myconda/lib/python3.8/site-packages/torch/nn/modules/module.py", line 1269, in __getattr__
    raise AttributeError("'{}' object has no attribute '{}'".format(
AttributeError: 'Linear' object has no attribute '_lazy_load_hook'

It seems that the issue is related to the environment or potential differences in the PyTorch versions. I am using the following environment:

Python version: 3.8
Torch version: 1.13.1

I would appreciate it if you could provide some insights into the following:

PyTorch Compatibility: Could you confirm whether the provided checkpoint is compatible with Python 3.8 and PyTorch 1.13.1? If not, could you please specify the recommended versions?

Loading Checkpoint in Different Environments: Are there any known issues or considerations when loading the checkpoint in an environment different from the one it was saved in?

Possible Solutions: Do you have any suggestions or solutions to resolve this issue? I have already checked the PyTorch version and made sure it aligns with the project requirements.

Thank you for your time and assistance. I appreciate the effort you put into maintaining this open-source project, and I look forward to your guidance on resolving this matter.

Best regards,

Answer 1 · 2024-01-29T17:50:42.000Z

Sorry for the late reply. This model was trained with python=3.7.12, torch=1.13.1, and torch_geometric=2.2.0. The issue may arise from a different version of torch_geometric: the _lazy_load_hook method of the Linear object mentioned in the error message seems to have been removed in later PyG versions.

Solution: We have now uploaded the "state dict" of the model (see checkpoints/cns_state_dict.pth). Some scripts have also been updated, so you can run demo_sim_Erender.py with the latest checkpoint and code. Hope this helps.
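
To illustrate why a state dict is more portable than a pickled model object, here is a minimal sketch that uses only the standard library and toy classes, not torch or PyG. The names `toy_layers`, `OldLinear`, `NewLinear`, and `install` are hypothetical stand-ins, and the failure shown is one plausible mechanism for the error above, not a confirmed diagnosis. In PyTorch the analogous calls are `Module.state_dict()` and `Module.load_state_dict()`.

```python
import pickle
import sys
import types


class OldLinear:
    """Toy stand-in for a layer as defined by the library version used for training."""

    def __init__(self):
        self.weight = [0.5, -1.0]
        # A bound method stored on the instance; pickle records it by name only.
        self.pre_hooks = [self._lazy_load_hook]

    def _lazy_load_hook(self):
        pass

    def state_dict(self):
        # Plain data only: no references to methods or class internals.
        return {"weight": list(self.weight)}


class NewLinear:
    """Toy stand-in for the same layer after an upgrade removed the hook method."""

    def __init__(self):
        self.weight = [0.0, 0.0]

    def load_state_dict(self, state):
        self.weight = list(state["weight"])


def install(cls):
    """Expose `cls` as toy_layers.Linear to simulate an installed library version."""
    cls.__module__ = "toy_layers"
    cls.__name__ = cls.__qualname__ = "Linear"
    module = types.ModuleType("toy_layers")
    module.Linear = cls
    sys.modules["toy_layers"] = module


# "Training" environment: the old library version is installed.
install(OldLinear)
layer = OldLinear()
whole_object = pickle.dumps({"net": layer})               # pickles the object itself
weights_only = pickle.dumps({"net": layer.state_dict()})  # pickles plain data

# "Inference" environment: the upgraded library version is installed.
install(NewLinear)
try:
    pickle.loads(whole_object)
    whole_object_error = None
except AttributeError as exc:
    whole_object_error = str(exc)

# The state dict loads because the layer is rebuilt from the current class.
restored = NewLinear()
restored.load_state_dict(pickle.loads(weights_only)["net"])

print("whole object:", whole_object_error)
print("state dict  :", restored.weight)
```

The pickled object depends on the class definition that is importable at load time, while the state dict depends only on parameter names and values, so it survives changes to methods and hooks between library versions.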

Answer 2 · 2024-01-30T07:06:21.000Z

Thank you for your previous assistance. It worked! I have a new issue with the algorithm under two scenarios:

1. 'Hard' mode with fewer than 50 match points.
2. Objects at the edge or partially in the image.

These conditions significantly reduce the algorithm's effectiveness. Do you have any suggestions for improvements?

Answer 3 · 2024-01-30T07:09:03.000Z

Thank you for your previous assistance. It worked! I have a new issue with the algorithm under two scenarios:

1. 'Hard' mode with fewer than 50 match points.
2. Objects at the edge or partially in the image.

These conditions significantly reduce the algorithm's effectiveness. Do you have any suggestions for improvements?

In the 'Hard' mode case with fewer than 50 match points, the object is just a simple screw.

Answer 4 · 2024-01-30T09:34:49.000Z

This is an intrinsic drawback of this work: our controller uses only the point positions (image or point features are not utilized) to compute the control rate. Therefore, if the frontend fails (or provides few keypoints and erroneous matches), the whole algorithm fails. This is typical when servoing textureless scenes or symmetric objects.

Here are the possible solutions:

For scenario 1: Improve the quality of the correspondences by using a detector-based frontend that provides denser keypoints and matches, such as SuperGlue or LightGlue. However, these learning-based image matching methods tend to fail under large in-plane rotations.
For scenario 2: This may be an out-of-distribution problem, since our training stage doesn't contain partially observed scenes. You may need to change the environment randomization and retrain the model from scratch. Another possible solution is to pre-rotate the camera before servoing so that the object lies roughly at the center of the camera's field of view.
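
The pre-rotation idea for scenario 2 can be sketched as follows. This is a rough illustration under a pinhole camera model, assuming the intrinsics (fx, fy, cx, cy) are known and that pixel u grows to the right and v grows downward. The function name `centering_angles` is hypothetical and not part of this repository, and mapping the angles to a robot command depends on your setup.

```python
import math
from typing import Iterable, Tuple


def centering_angles(
    keypoints: Iterable[Tuple[float, float]],
    fx: float,
    fy: float,
    cx: float,
    cy: float,
) -> Tuple[float, float]:
    """Return (pan, tilt) in radians that point the optical axis at the keypoint centroid.

    pan  > 0 means the centroid is to the right of the image center.
    tilt > 0 means the centroid is below the image center.
    """
    pts = list(keypoints)
    if not pts:
        raise ValueError("at least one keypoint is required")

    # Centroid of the detected keypoints in pixel coordinates.
    u = sum(p[0] for p in pts) / len(pts)
    v = sum(p[1] for p in pts) / len(pts)

    # Normalized image coordinates: the viewing ray is (x, y, 1) in the camera frame.
    x = (u - cx) / fx
    y = (v - cy) / fy

    # Pan first so the ray has no horizontal component, then tilt onto it.
    pan = math.atan2(x, 1.0)
    tilt = math.atan2(y, math.hypot(x, 1.0))
    return pan, tilt


# Example: an object detected near the right edge of a 640x480 image.
pan, tilt = centering_angles(
    [(600.0, 200.0), (620.0, 260.0)], fx=500.0, fy=500.0, cx=320.0, cy=240.0
)
print(f"pan={math.degrees(pan):.1f} deg, tilt={math.degrees(tilt):.1f} deg")
```

Rotating the camera by roughly these angles before servoing should bring the object closer to the image center, which is nearer to the conditions the model saw during training.
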
We have now uploaded the classical IBVS method as an alternative. You can switch to IBVS (set ckpt_path="checkpoints/ibvs_config.json") to see whether it solves your problem.
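
For reference, the textbook form of the classical IBVS control law for point features is given below. This is the standard formulation, not necessarily what the repository's IBVS option implements. Here s stacks the current normalized image coordinates (x, y) of the matched points, s* stacks the desired ones, Z is the (estimated) depth of each point, lambda is a positive gain, and v_c = (v_x, v_y, v_z, omega_x, omega_y, omega_z) is the commanded camera velocity.

```latex
\begin{aligned}
e &= s - s^{*}, \qquad v_c = -\lambda\, \hat{L}_e^{+}\, e, \\
L_{(x,y)} &=
\begin{bmatrix}
-\dfrac{1}{Z} & 0 & \dfrac{x}{Z} & xy & -(1 + x^{2}) & y \\[6pt]
0 & -\dfrac{1}{Z} & \dfrac{y}{Z} & 1 + y^{2} & -xy & -x
\end{bmatrix}
\end{aligned}
```

The matrix \hat{L}_e stacks one 2x6 block L_{(x,y)} per matched point, and \hat{L}_e^{+} is its pseudo-inverse. Like the learned controller, this law relies only on point positions (plus depth), so it is equally sensitive to few or erroneous matches.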