ImageBatch of the S3DIS dataset
ruomingzhai opened this issue · 15 comments
Hi, Dr. Robert.
I ran into a problem when customizing my model based on your DeepViewAgg project, again with the S3DIS dataset.
I added a custom classification label for each pixel in the images, i.e., I extended the ImageMapping class with an additional values[3] in ImageMapping.values, as follows:
But when I train the model with my custom dataset class, the ImageBatch (derived from SameSettingImageBatch?) presents different downscale values such as 1, 0.5, and 0.25. Why does it show different scales? I thought the scale should be the same for all images.
The program runs fine with downscale=1.0 but breaks down with downscale=0.5 or 0.25.
I traced the code to the upscale_images() function in torch_points3d/core/multimodal/image.py and it seems you recalculate the pixel coordinates and update them in values[1].values[0].
The pixel coordinate indices exceed the output of the pretrained image model, which is fixed to [batch, C_dim, 1024, 512].
Hope to get some help from you.
Best regards,
Hi @ruomingzhai
With the information you provided, I cannot tell what your problem is. Can you please provide some code snippets of:
- the modifications you made to ImageMapping
- the command you run and the full error message traceback you get
A remark though: if you try to extend ImageMapping with some additional pixel-level information, you should not append it to ImageMapping.values (which carries image-level info), but to ImageMapping.values[1].values (which carries pixel-level info).
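To illustrate the distinction, here is a rough sketch (the nested values layout follows the description above, but the helper and its exact calls are illustrative assumptions, not the project's API):

import torch

def attach_pixel_labels(mapping, pixel_labels):
    # Hypothetical helper. ImageMapping is a nested CSR-like structure:
    # - mapping.values holds image-level data (one entry per point-image view)
    # - mapping.values[1].values holds pixel-level data (one entry per mapped pixel)
    assert pixel_labels.dim() == 1
    assert pixel_labels.shape[0] == mapping.pixels.shape[0], \
        "pixel-level values must have one row per entry of mapping.pixels"
    # Pixel-level info goes next to the pixel coordinates...
    mapping.values[1].values.append(pixel_labels)
    # ...not next to the image-level entries in mapping.values.
    return mapping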
Thank you for your attention!
First of all, the modifications I made to ImageMapping:
- I add the pixel-level information in the process function of the MapImages class as follows: the image_sp_ids attribute is a tensor having the same number of columns as the pixels attribute.
- In the from_dense function, I compress the image_sp_ids together with the other attributes, as follows:
So, you mean I should add my image_sp_ids attribute to the image_ids, which refers to ImageMapping.values[1]?
- Secondly, I use the ADE20KResNet18_C1_deepsup model as the pretrained image model and confine the output tensor to [i, c, 516, 1024] as follows:
- Third, I add a function get_mapped_features_sp and related attributes to retrieve my image_sp_idx during training:
def get_mapped_features_sp(self, interpolate=False):
    scale = 1 / self.downscale
    # If not interpolating, set the mapping to the proper scale
    mappings = self.mappings if interpolate \
        else self.mappings.rescale_images(scale)
    # if not interpolate:
    #     self.mappings.rescale_images(scale)
    # Index the features with/without interpolation
    if interpolate and scale != 1:
        resolution = torch.Tensor([self.mapping_size]).to(self.device)
        coords = mappings.pixels / (resolution - 1)
        coords = coords[:, [1, 0]]  # pixel mappings are in (W, H) format
        batch = mappings.sp_map_indexing[0]
        x = sparse_interpolation(self.x, coords, batch)
    else:
        mod_idx, sp_sorting_idx, sp_csr, sp_im_csr = mappings.sp_map_indexing
        x = self.x[mod_idx]  # [num_views, 512, height, width]
        self.mappings.sp_csr_idx = sp_csr
        self.mappings.sp_sorting_idx = sp_sorting_idx
        self.mappings.sp_im_csr = sp_im_csr
    return x
def sp_map_indexing(self):
    # image_ids
    idx_batch = self.images.repeat_interleave(
        self.values[1].pointers[1:] - self.values[1].pointers[:-1])
    sp_sorting_idx = lexargsort(idx_batch, self.values[3])  # [N_i, 1]
    # pixel indices
    idx_batch = idx_batch[sp_sorting_idx]
    idx_height = self.pixels[sp_sorting_idx, 1]
    idx_width = self.pixels[sp_sorting_idx, 0]
    idx = (idx_batch.long(), ..., idx_height.long(), idx_width.long())
    # self --- ImageMapping.sp_sorting_idx
    # self._sp_sorting_idx = sp_sorting_idx
    # SP_CSR: self --- ImageMapping.sp_csr_idx
    sp_pointer = self.values[3][sp_sorting_idx]
    sp_idx = torch.cat([
        torch.LongTensor([0]).to(self.device),
        torch.where(sp_pointer[1:] != sp_pointer[:-1])[0] + 1])
    sp_im_idx = torch.cat([
        torch.LongTensor([0]).to(self.device),
        torch.where(idx_batch[sp_idx[1:]] != idx_batch[sp_idx[:-1]])[0] + 1,
        torch.LongTensor([sp_idx.shape[0]]).to(self.device)])
    sp_csr_idx = torch.cat([
        sp_idx, torch.LongTensor([sp_pointer.shape[0]]).to(self.device)])
    # self._sp_csr_idx = sp_csr_idx
    sp_3d_idx = self.points.repeat_interleave(self.pointers[1:] - self.pointers[:-1])
    return idx, sp_3d_idx, sp_csr_idx, sp_im_idx
The error happens at:
x = self.x[mod_idx]  # [num_views, 512, height, width]
as follows:
] && "index out of bounds" failed.
/pytorch/aten/src/ATen/native/cuda/IndexKernel.cu:84: operator(): block: [41,0,0], thread: [87,0,0] Assertion index >= -sizes[i] && index < sizes[i] && "index out of bounds" failed.
/pytorch/aten/src/ATen/native/cuda/IndexKernel.cu:84: operator(): block: [41,0,0], thread: [88,0,0] Assertion index >= -sizes[i] && index < sizes[i] && "index out of bounds" failed.
/pytorch/aten/src/ATen/native/cuda/IndexKernel.cu:84: operator(): block: [41,0,0], thread: [89,0,0] Assertion index >= -sizes[i] && index < sizes[i] && "index out of bounds" failed.
/pytorch/aten/src/ATen/native/cuda/IndexKernel.cu:84: operator(): block: [41,0,0], thread: [90,0,0] Assertion index >= -sizes[i] && index < sizes[i] && "index out of bounds" failed.
/pytorch/aten/src/ATen/native/cuda/IndexKernel.cu:84: operator(): block: [41,0,0], thread: [91,0,0] Assertion index >= -sizes[i] && index < sizes[i] && "index out of bounds" failed.
/pytorch/aten/src/ATen/native/cuda/IndexKernel.cu:84: operator(): block: [41,0,0], thread: [92,0,0] Assertion index >= -sizes[i] && index < sizes[i] && "index out of bounds" failed.
/pytorch/aten/src/ATen/native/cuda/IndexKernel.cu:84: operator(): block: [41,0,0], thread: [93,0,0] Assertion index >= -sizes[i] && index < sizes[i] && "index out of bounds" failed.
/pytorch/aten/src/ATen/native/cuda/IndexKernel.cu:84: operator(): block: [41,0,0], thread: [94,0,0] Assertion index >= -sizes[i] && index < sizes[i] && "index out of bounds" failed.
/pytorch/aten/src/ATen/native/cuda/IndexKernel.cu:84: operator(): block: [41,0,0], thread: [95,0,0] Assertion index >= -sizes[i] && index < sizes[i] && "index out of bounds" failed.
0%| | 0/2000 [01:04<?, ?it/s]
Traceback (most recent call last):
File "s3dis_preprocess.py", line 120, in <module>
initial_trainer()
File "s3dis_preprocess.py", line 95, in initial_trainer
trainer.train()
File "/root/share/code/DeepViewAgg/torch_points3d/trainer.py", line 147, in train
self._train_epoch(epoch)
File "/root/share/code/DeepViewAgg/torch_points3d/trainer.py", line 202, in _train_epoch
self._model.optimize_parameters(epoch, self._dataset.batch_size)
File "/root/share/code/DeepViewAgg/torch_points3d/models/base_model.py", line 245, in optimize_parameters
self.forward(epoch=epoch) # first call forward to calculate intermediate results
File "/root/share/code/DeepViewAgg/torch_points3d/models/segmentation/multimodal/sparseconv3d.py", line 299, in forward
mm_data_dict = self.backbone_no3d(self.input)
File "/root/.local/lib/python3.8/site-packages/torch/nn/modules/module.py", line 727, in _call_impl
result = self.forward(*input, **kwargs)
File "/root/share/code/DeepViewAgg/torch_points3d/applications/multimodal/no3d.py", line 116, in forward
mm_data_dict = self.down_modules[i](mm_data_dict)
File "/root/.local/lib/python3.8/site-packages/torch/nn/modules/module.py", line 727, in _call_impl
result = self.forward(*input, **kwargs)
File "/root/share/code/DeepViewAgg/torch_points3d/modules/multimodal/modules.py", line 97, in forward
mm_data_dict = mod_branch(mm_data_dict, m)
File "/root/.local/lib/python3.8/site-packages/torch/nn/modules/module.py", line 727, in _call_impl
result = self.forward(*input, **kwargs)
File "/root/share/code/DeepViewAgg/torch_points3d/modules/multimodal/modules.py", line 485, in forward
x_mod = mod_data.get_mapped_features_sp(interpolate=self.interpolate)
File "/root/share/code/DeepViewAgg/torch_points3d/core/multimodal/image.py", line 1579, in get_mapped_features_sp
return [im.get_mapped_features_sp(interpolate=interpolate) for im in self]
File "/root/share/code/DeepViewAgg/torch_points3d/core/multimodal/image.py", line 1579, in <listcomp>
return [im.get_mapped_features_sp(interpolate=interpolate) for im in self]
File "/root/share/code/DeepViewAgg/torch_points3d/core/multimodal/image.py", line 1327, in get_mapped_features_sp
x = self.x[mod_idx] #[num_views,512,heigh,wide]
RuntimeError: CUDA error: device-side assert triggered
It seems mod_idx exceeds the valid index range of the image output size. I found that the image pixel indices change when the scale value is not 1.0, in this code:
mappings = self.mappings if interpolate \
    else self.mappings.rescale_images(scale)
So I think it is a problem with the scale value, not with the code I added.
Hope I have explained my problem clearly!
I took the liberty of editing your previous message for readability.
I can't tell for sure what is happening here; some more explanation of what your modifications are for might be helpful:
- the type and shape of your image_sp_ids, what it represents, etc.
- how do you use ADE20KResNet18_C1_deepsup as an image model? Which model config are you using? Are you sure you can fit a whole image encoder-decoder in your GPU memory?
If I had to guess what you are trying to do, I would say image_sp_ids holds the indices of the superpixels in which the mappings fall. Still guessing, this means your image_sp_ids should be a 1D LongTensor of shape [ImageMapping.pixels.shape[0]] (i.e. it has the same number of rows as the pixels attribute, not the same number of columns as you previously mentioned), is that correct? When debugging, you can use ImageMapping.debug() to check that there are no errors in how you constructed your mappings. This will not catch all possible errors, but can rule out simple ones.
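As a quick sanity check, something along these lines could help (a hypothetical helper; mapping and image_sp_ids stand for your own objects):

import torch

def check_pixel_level_attribute(mapping, image_sp_ids):
    # Verify the custom per-pixel attribute is shaped the way the mappings expect
    assert isinstance(image_sp_ids, torch.Tensor) and image_sp_ids.dtype == torch.long, \
        "superpixel ids should be a LongTensor"
    assert image_sp_ids.dim() == 1, "expected a 1D tensor"
    assert image_sp_ids.shape[0] == mapping.pixels.shape[0], \
        "one superpixel id per mapped pixel (same number of rows as mapping.pixels)"
    # Let the built-in consistency checks rule out simple construction errors
    mapping.debug()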
You seem to have replaced ImageMapping.get_mapped_features with your ImageMapping.get_mapped_features_sp. From what I see, the traceback seems to indicate the error comes from there; I don't think x = self.x[mod_idx] itself is the problem.
Make sure you set CUDA_LAUNCH_BLOCKING=1 when debugging CUDA-related errors. Otherwise you won't get trustworthy error logs:
CUDA_LAUNCH_BLOCKING=1 python train.py <your kwargs here>
This might give you a more specific error message.
Overall, I am guessing your error is related to the scale ratio between the input image and the output image feature map. ImageMapping.get_mapped_features should normally be able to tell how you rescaled your image and return a big [num_mappings, num_features] tensor of the 2D features associated with the mappings. One current limitation of this code is that it does not support rescaling image width and height separately (i.e. the width/height aspect ratio must always stay constant). So make sure your input and output image feature maps have the same aspect ratio. Have a look here to see where the downscale attribute is updated based on the input/output image feature map size ratio.
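For intuition, the relationship between the two sizes and the downscale roughly boils down to this (a sketch with assumed names, not the project's exact code):

def compute_downscale(img_size, out_size):
    # img_size and out_size are (width, height) tuples of the input image
    # and of the 2D model's output feature map. Names are illustrative.
    scale_w = img_size[0] / out_size[0]
    scale_h = img_size[1] / out_size[1]
    # Width and height must shrink by the same factor (constant aspect ratio)
    assert abs(scale_w - scale_h) < 1e-6, "input/output aspect ratios must match"
    downscale = scale_w
    # The feature map is expected to be at most as large as the input image
    assert downscale >= 1, f"Expected scalar larger than 1 but got {downscale} instead"
    return downscale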
Thanks!
You are right. The image_sp_ids is a 1D tensor with the same number of rows as the pixels attribute.
I want better image-mapped features from a more complete encoder-decoder image model, and it is quite a heavy load for a 16 GB GPU because it always breaks down. So I changed resolution_3d to 0.3 m just to get the code working first. I am not sure it is a good idea to train this large 2D model right now.
The ADE20KResNet18_C1_deepsup model is shown as follows:
class ADE20KResNet18_C1_deepsup(nn.Module):
    """ResNet-18 encoder pretrained on ADE20K with C1_deepsup.
    Adapted from https://github.com/CSAILVision/semantic-segmentation-pytorch
    """

    def __init__(self, *args, frozen=False, pretrained=True, **kwargs):
        super().__init__()

        # Adapt the default config to use ResNet18 + PPM-Deepsup model
        ARCH = 'resnet18dilated-c1_deepsup'
        DIR = osp.join(PRETRAINED_DIR, 'ade20k', ARCH)
        MITCfg.merge_from_file(osp.join(DIR, f'{ARCH}.yaml'))
        MITCfg.MODEL.arch_encoder = MITCfg.MODEL.arch_encoder.lower()
        MITCfg.MODEL.arch_decoder = MITCfg.MODEL.arch_decoder.lower()
        MITCfg.DIR = DIR

        # Absolute paths of model weights
        MITCfg.MODEL.weights_encoder = osp.join(
            MITCfg.DIR, 'encoder_' + MITCfg.TEST.checkpoint)
        MITCfg.MODEL.weights_decoder = osp.join(
            MITCfg.DIR, 'decoder_' + MITCfg.TEST.checkpoint)
        assert osp.exists(MITCfg.MODEL.weights_encoder) and \
               osp.exists(MITCfg.MODEL.weights_decoder), \
            "checkpoint does not exist!"

        decoder_pretrained = kwargs.get('decoder_pretrained', False)

        # Build encoder and decoder from pretrained weights
        old_stdout = sys.stdout  # backup current stdout
        sys.stdout = open(os.devnull, "w")
        self.pretrained = pretrained
        self.encoder = MITModelBuilder.build_encoder(
            arch=MITCfg.MODEL.arch_encoder,
            fc_dim=MITCfg.MODEL.fc_dim,
            weights=MITCfg.MODEL.weights_encoder if pretrained else '')
        self.decoder_pretrained = decoder_pretrained
        if self.decoder_pretrained:
            self.decoder = MITModelBuilder.build_decoder(
                arch=MITCfg.MODEL.arch_decoder,
                fc_dim=MITCfg.MODEL.fc_dim,
                num_class=MITCfg.DATASET.num_class,
                weights=MITCfg.MODEL.weights_decoder if pretrained else '',
                use_softmax=True)
        else:
            self.decoder = nn.Sequential(
                nn.Conv2d(512, 256, 1),
                nn.Upsample(scale_factor=8, mode="bilinear", align_corners=True))
        sys.stdout = old_stdout  # reset old stdout

        # Convert PPM from a classifier into a feature map extractor
        # self.decoder = PPMFeatMap.from_pretrained(self.decoder)

        # If the model is frozen, it will always remain in eval mode
        # and the parameters will have requires_grad=False
        self.frozen = frozen
        if self.frozen:
            self.training = False

        self.normalize_feature = True

    def forward(self, x, *args, out_size=None, **kwargs):
        enc = self.encoder(x, return_feature_maps=True)
        if self.decoder_pretrained:
            pred = self.decoder(enc, segSize=[512, 1024])  # [240, 320]
        else:
            pred = self.decoder(enc[3])
        if self.normalize_feature:
            pred = F.normalize(pred, p=2, dim=1)
        return pred
About the error, I just found that I call get_mapped_features followed by my custom get_mapped_features_sp in the 2D forward function:
mod_data = self.forward_conv(mod_data)
if self.superpixel_pool is not None:
    x_mod = mod_data.get_mapped_features(interpolate=self.interpolate)
    x_mod = self.forward_superpixel_pool(x_3d, x_mod, mod_data)  # attention-constrained 2D features
    x_mod = mod_data.get_mapped_features_sp(interpolate=self.interpolate)
    # aggregate the 2D pixel features per superpixel (reduction=mean)
    sp_csr = mod_data.get_sp_csr_idx()
    sp_sorting = mod_data.get_sp_sorting_idx()
    sp_im_csr = mod_data.get_sp_im_csr()
    # idx = (sp_sorting[0], ...)  # aggregate the 3D point features per superpixel (reduction=mean)
    x_mod = self.forward_atomic_pool(x_3d, x_mod, sp_csr)
Maybe implementing these similar get_mapped_features functions is the reason why the downscale changes. I will investigate it these days.
I used your suggested ImageMapping.debug(), and its error message shows:
    self.debug()
  File "/root/share/code/DeepViewAgg/torch_points3d/core/multimodal/image.py", line 344, in debug
    assert self._downscale >= 1, \
AssertionError: Expected scalar larger than 1 but got 0.25 instead
I don't know why this happens.
I must warn you that the 2D model is the part of the architecture that is the most memory-intensive, much more than the 3D encoder-decoder. In this regard, using a 2D decoder will be especially costly, because you want to produce feature maps at a large resolution, with a lot of channels. We address this issue in our paper and in the code, by showing that using a full image decoder is overkill:
- Because the 2D-3D mappings are inherently sparse in datasets like S3DIS and KITTI-360, we only pass a fraction of the full-resolution output feature maps to the 3D points. So computing a full-resolution output feature map is memory-inefficient.
- Instead, we proposed a pyramid interpolation scheme to extract multi-scale features from the 2D encoder feature maps, much like an FPN would do, but without having to compute the whole 2D decoder feature maps. See the Res16UNet34-PointPyramid-early-cityscapes-interpolate model for instance. This relies on bilinear interpolation of the low-resolution feature maps only at the desired mapping pixel coordinates (see the sketch after this list).
- This pyramid interpolation scheme proved useful for the KITTI-360 dataset, but not essential for S3DIS, because the pretrained 2D ResNet-18 encoder we use for S3DIS outputs relatively "high-resolution" features at its lower stage.
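To make the bilinear-interpolation idea concrete, here is a minimal sketch using F.grid_sample (the function name, the per-view [num_views, P, 2] pixel layout and the argument conventions are assumptions for illustration, not the project's sparse_interpolation API):

import torch
import torch.nn.functional as F

def interpolate_at_pixels(feat_maps, pixels, image_size):
    # feat_maps:  [num_views, C, h, w] low-resolution 2D feature maps
    # pixels:     [num_views, P, 2] pixel coordinates in (W, H) order, expressed
    #             in the full-resolution input image frame
    # image_size: (W, H) of the full-resolution input images
    W, H = image_size
    # Normalize (x, y) coordinates to [-1, 1], as expected by grid_sample
    x = pixels[..., 0] / (W - 1) * 2 - 1
    y = pixels[..., 1] / (H - 1) * 2 - 1
    grid = torch.stack((x, y), dim=-1).unsqueeze(1)            # [num_views, 1, P, 2]
    out = F.grid_sample(feat_maps, grid, align_corners=True)   # [num_views, C, 1, P]
    return out.squeeze(2).transpose(1, 2)                      # [num_views, P, C]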
If you still want to use a full 2D encoder-decoder, it is possible. I have already done so with the ADE20KResNet18PPM encoder-decoder. You can have a look at the XYZ-RGB-PPM-late model config here for pointers. Besides, here are some insights about using an image decoder:
- Reducing your 2D resolution will save you much more memory than reducing your 3D resolution, and might better preserve your final 3D semantic segmentation performance.
- You could consider freezing your 2D model to save memory and compute at train time. This means you will not fine-tune your 2D model.
- If fine-tuning the 2D model is important to you (it does improve performance in my experience), you can use checkpointing (see the checkpointing parameter for UnimodalBranch). This allows you to fit larger models in memory when training, but at the cost of longer training time (see the sketch after this list).
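For illustration, gradient checkpointing around a 2D branch looks roughly like this in plain PyTorch (a generic sketch, not the UnimodalBranch implementation):

import torch
from torch import nn
from torch.utils.checkpoint import checkpoint

class CheckpointedImageBranch(nn.Module):
    # Trade compute for memory: activations inside the wrapped block are not
    # stored during the forward pass and are recomputed during backward.

    def __init__(self):
        super().__init__()
        # Stand-in for a heavy pretrained 2D encoder-decoder
        self.branch = nn.Sequential(
            nn.Conv2d(3, 64, 3, stride=2, padding=1), nn.ReLU(),
            nn.Conv2d(64, 128, 3, stride=2, padding=1), nn.ReLU(),
            nn.Conv2d(128, 256, 3, stride=2, padding=1), nn.ReLU())

    def forward(self, x):
        # With the legacy (reentrant) checkpoint, at least one input must
        # require grad for gradients to reach the wrapped parameters
        return checkpoint(self.branch, x)

images = torch.rand(2, 3, 256, 512, requires_grad=True)
feats = CheckpointedImageBranch()(images)   # [2, 256, 32, 64]
feats.sum().backward()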
As for your current error, please make sure you set CUDA_LAUNCH_BLOCKING=1 when investigating errors happening on the GPU. Your previously-posted error traceback does not look complete to me and does not indicate where the error actually comes from.
Your above error indicates that you have self.downscale set to a value smaller than 1, which is not acceptable. This might happen if your output 2D feature map resolution is larger than your input image resolution, or if you manually modified self.downscale in an unexpected way somewhere. As suggested yesterday, have a look here to see where the downscale attribute is updated based on the input/output image feature map size ratio.
Sorry, I forgot to mention that I have been running the program with CUDA_LAUNCH_BLOCKING=1 from the first time. I think the error comes from the wrong scale. So I want to ask you: is it possible for the scale to be 0.25?
Did you set the scale yourself manually somewhere? What are the shapes of your input images and output feature maps? Can you share which model config you are using? In the piece of code you showed, your scale should normally be larger than 1. If it is smaller, it means your output feature maps have a higher resolution than your input image (unlikely).
I only set the image size in the image decoder to pred = self.decoder(enc, segSize=[512,1024]), but I don't think that is the problem. I will debug the code to find where the scale changes. Thank you for your tips!
def forward(self, x, *args, out_size=None, **kwargs):
    enc = self.encoder(x, return_feature_maps=True)
    if self.decoder_pretrained:
        pred = self.decoder(enc, segSize=[512, 1024])
    else:
        pred = self.decoder(enc[3])
    if self.normalize_feature:
        pred = F.normalize(pred, p=2, dim=1)
    return pred
Hi, sorry to bother you; my problem has not been solved yet.
I followed your suggestions and traced the bug with ImageMapping.debug(). It shows that each SameSettingImageBatch has a different img_size, as shown below.
I think this is the reason why the downscale changes while preparing the image batch before training. It did not happen with my ScanNet dataset, and I guess it derives from the data transform "CropImageGroups". Does it happen to you? I can't control its cropping operation. Hope to get help from you.
Looking forward to hearing your suggestions.
I traced back to the CropImageGroups data transform and I think it may be the reason why ImageData.img_size changes and results in the changing downscale. I added some code in CropImageGroups as follows:
And the result is:
image original_size :(512, 256) cropping_size:(128, 64) scale :0.25
image original_size :(512, 256) cropping_size:(128, 128) scale :0.5
image original_size :(512, 256) cropping_size:(128, 128) scale :0.5
image original_size :(512, 256) cropping_size:(256, 256) scale :1.0
image original_size :(512, 256) cropping_size:(128, 64) scale :0.25
image original_size :(512, 256) cropping_size:(128, 128) scale :0.5
image original_size :(512, 256) cropping_size:(256, 128) scale :0.5
image original_size :(512, 256) cropping_size:(256, 256) scale :1.0
image original_size :(512, 256) cropping_size:(128, 128) scale :0.5
image original_size :(512, 256) cropping_size:(256, 256) scale :1.0
image original_size :(512, 256) cropping_size:(128, 128) scale :0.5
image original_size :(512, 256) cropping_size:(256, 256) scale :1.0
So is it because my resolution_3d=0.3 leads to different image sizes?
Hope to hear your suggestions! It bothers me a lot!
Now, I may try to preprocess the point cloud with resolution_3d=0.02 (the default value) and check whether the same bug happens.
If you look at the conf/data/segmentation/multimodal/s3disfused-sparse.yaml config, here is what each transform does:
- SelectMappingFromPointId: reads the mapping_index key in the loaded 3D points Data object and loads the corresponding images and mappings
- CenterRoll: rotates the image sphere around the Z axis to place the mapped pixels in the center of the image (for S3DIS equirectangular images only). This prepares the next steps, PickImagesFromMappingArea and CropImageGroups
- PickImagesFromMappingArea: drops images which do not contain enough mapped pixels
- CropImageGroups: crops each image with a bounding box including all its mapped pixels. This produces images of various sizes, depending on how far the image is located from the 3D points. But since we prefer processing images in same-size batches, the bounding boxes are adjusted to the nearest box among a family of boxes (eg [64, 64], [128, 64], [128, 128], [256, 128], [256, 256], [512, 256], [512, 512], ...). The output of this transform is an ImageData and not a SameSettingImageData. An ImageData is a holder class to carry a list of SameSettingImageData with different sizes, precisely what CropImageGroups outputs (see the sketch after this list)
- PickImagesFromMemoryCredit: selects a subset of the images based on their size and how informative their mappings are. This transform ensures that the number of images in the batch does not bypass a chosen "pixel credit". This is an augmentation and caps the GPU memory usage
- JitterMappingFeatures: adds some noise to the mapping features
- ColorJitter: adds some noise to the image radiometry
- RandomHorizontalFlip: randomly flips images (and mappings) horizontally
- Normalize: normalizes the image radiometry
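As an illustration of the box-snapping idea behind CropImageGroups (the size family comes from the list above; the function name and selection rule are simplifying assumptions, not the actual implementation):

# Candidate crop sizes (width, height), from smallest to largest
CROP_SIZES = [(64, 64), (128, 64), (128, 128), (256, 128),
              (256, 256), (512, 256), (512, 512)]

def snap_crop_size(bbox_width, bbox_height, sizes=CROP_SIZES):
    # Return the smallest allowed crop size that fully contains the bounding
    # box of the mapped pixels. Images sharing a crop size can then be
    # batched together as one SameSettingImageData.
    for w, h in sizes:
        if bbox_width <= w and bbox_height <= h:
            return w, h
    # Fall back to the largest size if the bounding box exceeds every candidate
    return sizes[-1]

# Example: a 100 x 60 pixel bounding box is snapped to a (128, 64) crop
print(snap_crop_size(100, 60))  # (128, 64)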
So it is perfectly normal for CropImageGroups to alter the image size; that is what it is designed for.
I think the error may come from the fact that you hardcoded an output feature map size in your image decoder:
self.decoder(enc, segSize=[512,1024])
If this does what I think it does, it means your decoder does not respect the cropping sizes produced by CropImageGroups and it outputs feature maps which are even larger than some of your input feature maps (which is unlikely, unless you work on super-resolution). If that is the case, I suggest you make sure the output of your image model is the same size as your input, or smaller with the same scaling applied on height and width (for now, the code does not support aspect ratio changes); see the sketch below.
PS: the 3D point cloud resolution should not have anything to do with this; you can change it however you want.
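A minimal sketch of what respecting the crop size could look like in the forward shown earlier (reusing the attribute names from that snippet; whether the MIT decoder accepts an arbitrary segSize this way is an assumption to verify):

def forward(self, x, *args, out_size=None, **kwargs):
    # x: [num_views, 3, H, W], where H and W follow the CropImageGroups crop size
    enc = self.encoder(x, return_feature_maps=True)
    if self.decoder_pretrained:
        # Derive the output size from the input crop instead of hardcoding
        # [512, 1024], so the downscale ratio stays >= 1 and identical for
        # height and width
        seg_size = out_size if out_size is not None else x.shape[-2:]
        pred = self.decoder(enc, segSize=list(seg_size))
    else:
        pred = self.decoder(enc[3])
    if self.normalize_feature:
        pred = F.normalize(pred, p=2, dim=1)
    return pred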
Hi, Dr. Robert.
The bug indeed derives from the fixed image size in self.decoder(enc, segSize=[512,1024])!
I can train the model now, but it randomly breaks down with the resnet18dilated-c1_deepsup encoder:
File "/root/DeepViewAgg/torch_points3d/modules/multimodal/modalities/image.py", line 779, in forward
enc = self.encoder(x)
File "/root/.local/lib/python3.8/site-packages/torch/nn/modules/module.py", line 727, in _call_impl
result = self.forward(*input, **kwargs)
File "/root/.local/lib/python3.8/site-packages/mit_semseg/models/models.py", line 259, in forward
x = self.maxpool(x)
File "/root/.local/lib/python3.8/site-packages/torch/nn/modules/module.py", line 727, in _call_impl
result = self.forward(*input, **kwargs)
File "/root/.local/lib/python3.8/site-packages/torch/nn/modules/pooling.py", line 153, in forward
return F.max_pool2d(input, self.kernel_size, self.stride,
File "/root/.local/lib/python3.8/site-packages/torch/_jit_internal.py", line 267, in fn
return if_false(*args, **kwargs)
File "/root/.local/lib/python3.8/site-packages/torch/nn/functional.py", line 585, in _max_pool2d
return torch.max_pool2d(
RuntimeError: non-empty 3D or 4D input tensor expected but got ndim: 4
Hi @ruomingzhai
So it seems the reason for your initial problem was that your modifications broke the way the code makes use of images batched together at various sizes, following CropImageGroups.
The initial problem for this issue has been solved, so I will close it.
Your latest comment is clearly about a different problem and concerns an input or output feature shape in the resnet18dilated-c1_deepsup 2D model, which I did not build.
I am really glad 😊 you find this project interesting and are trying to make use of it. However, there are limits to how much time I can dedicate to support here:
- ✔️ I can provide help to clarify how the things I built work and make some suggestions on how to extend the current codebase.
- ❌ I cannot provide support for things I did not build myself and that break the way the project works.
👉 I would recommend making sure you understand the paper and reading the code and comments more closely. Typically, this means understanding how the provided model works and the size of the inputs and outputs in each block of the model.
Under these circumstances, if you run into a new challenge which you cannot solve after having thoroughly investigated it yourself, feel free to open an issue with detailed information on the modifications you made and the error logs.
Thanks in advance and good luck 🍀 !