Issue with inference
Closed · 1 comment
First of all, thanks for the Looking3D repo and the nice work. I am trying out the inference part of the code and ran into some errors when following the README.md, specifically during the inference step.
I have already set up the sample folder as follows:

```
Looking3D
└── sample
    ├── query_example.png    # image from the dataset with shape_id 179
    ├── query_example_2.png  # image from the dataset with shape_id 179
    ├── query_example_3.png  # image from the dataset with shape_id 179
    └── mv_images
        ├── 179_3.0_0_20.json
        ├── 179_3.0_0_20.npy
        ├── 179_3.0_0_20.png
        ├── 179_3.0_18_20.json
        ├── 179_3.0_18_20.npy
        ├── 179_3.0_18_20.png
        └── ...
```
I then wrote a small inference script as instructed in the README.md, providing the stated query_path, mv_path, resume_ckpt, device, and topk. When I run the script, it fails with this error:
```
(Looking3D) C:\Users\MyPC\Looking3D>python predict.py
A matching Triton is not available, some optimizations will not be enabled.
Error caught was: No module named 'triton'
constructing SpatialTransformer of depth 3 w/ 128 channels and 8 heads
WARNING: SpatialTransformer: Found context dims [256] of depth 1, which does not match the specified 'depth' of 3. Setting context_dim to [256, 256, 256] now.
Setting up MemoryEfficientCrossAttention. Query dim is 512, context_dim is None and using 8 heads with a dimension of 64.
Setting up MemoryEfficientCrossAttention. Query dim is 512, context_dim is 256 and using 8 heads with a dimension of 64.
Setting up MemoryEfficientCrossAttention. Query dim is 512, context_dim is None and using 8 heads with a dimension of 64.
Setting up MemoryEfficientCrossAttention. Query dim is 512, context_dim is 256 and using 8 heads with a dimension of 64.
Setting up MemoryEfficientCrossAttention. Query dim is 512, context_dim is None and using 8 heads with a dimension of 64.
Setting up MemoryEfficientCrossAttention. Query dim is 512, context_dim is 256 and using 8 heads with a dimension of 64.
model loaded successfully
Traceback (most recent call last):
  File "predict.py", line 3, in <module>
    pred_labels = predict(query_path = r"C:\Users\MyPC\Looking3D\sample\query", \
  File "C:\Users\MyPC\Looking3D\demo.py", line 83, in predict
    result = forward_cmt(batch, models, is_train = False, topk = topk)
  File "C:\Users\MyPC\Looking3D\train.py", line 465, in forward_cmt
    imgs, mesh, labels, bbox, pos_enc3d = batch['query_imgs'], batch['mesh_images'], batch['labels'], batch['bbox'], batch['mesh_pos_enc3d']
KeyError: 'query_imgs'
```
Digging into the forward_cmt function at train.py line 465, the batch dictionary has different keys from the ones the code looks up (there is no 'query_imgs', 'mesh_images', or 'mesh_pos_enc3d'). Printing the batch keys shows the actual names:

- 'query_imgs' → 'imgs'
- 'mesh_images' → 'mesh'
- 'mesh_pos_enc3d' → 'pos_enc3d'
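Instead of editing train.py, the mismatch can also be patched on the batch itself before it reaches forward_cmt. A minimal sketch of such a helper (my own workaround, not part of the repo; key names are taken from the traceback and the printed batch keys):

```python
def remap_batch_keys(batch):
    """Rename the loader's batch keys to the ones forward_cmt expects.

    The loader emits 'imgs', 'mesh', and 'pos_enc3d', while train.py
    line 465 looks up 'query_imgs', 'mesh_images', and 'mesh_pos_enc3d'.
    """
    key_map = {
        "imgs": "query_imgs",
        "mesh": "mesh_images",
        "pos_enc3d": "mesh_pos_enc3d",
    }
    # Keep every other key ('labels', 'bbox', ...) untouched.
    return {key_map.get(k, k): v for k, v in batch.items()}
```

Calling `forward_cmt(remap_batch_keys(batch), models, is_train=False, topk=topk)` then avoids the KeyError without touching the training code.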
After changing the dict keys, inference seems to work, although some predictions appear to be wrong:
```
(Looking3D) C:\Users\MyPC\Looking3D>python predict.py
A matching Triton is not available, some optimizations will not be enabled.
Error caught was: No module named 'triton'
constructing SpatialTransformer of depth 3 w/ 128 channels and 8 heads
WARNING: SpatialTransformer: Found context dims [256] of depth 1, which does not match the specified 'depth' of 3. Setting context_dim to [256, 256, 256] now.
Setting up MemoryEfficientCrossAttention. Query dim is 512, context_dim is None and using 8 heads with a dimension of 64.
Setting up MemoryEfficientCrossAttention. Query dim is 512, context_dim is 256 and using 8 heads with a dimension of 64.
Setting up MemoryEfficientCrossAttention. Query dim is 512, context_dim is None and using 8 heads with a dimension of 64.
Setting up MemoryEfficientCrossAttention. Query dim is 512, context_dim is 256 and using 8 heads with a dimension of 64.
Setting up MemoryEfficientCrossAttention. Query dim is 512, context_dim is None and using 8 heads with a dimension of 64.
Setting up MemoryEfficientCrossAttention. Query dim is 512, context_dim is 256 and using 8 heads with a dimension of 64.
model loaded successfully
-> Query_path : C:\Users\MyPC\Looking3D\sample\query\query_example.png
Anoamly_pred_label : 0
Conf_score : 99.99982126805662
-> Query_path : C:\Users\MyPC\Looking3D\sample\query\query_example_2.png
Anoamly_pred_label : 1
Conf_score : 86.26554012298584
-> Query_path : C:\Users\MyPC\Looking3D\sample\query\query_example_3.png
Anoamly_pred_label : 0
Conf_score : 99.99998801540357
```
I have not retrained the model from scratch, nor have I trained it on my own dataset. I will update this issue if I run into any more errors. Thanks in advance.
As this issue has been open for a long time, I am closing it for now. Please reopen it if required.