IGNF/FLAIR-1

Mismatch between Hugging Face Pretrained Model Description and GitHub Implementation

Closed this issue · 7 comments

I have noticed a discrepancy between the pretrained model mentioned on the Hugging Face website and its actual implementation available in the GitHub repository. Specifically:

Hugging Face Model Page: https://huggingface.co/IGNF/FLAIR-INC_rgbie_15cl_resnet34-unet/blob/main/FLAIR-INC_rgbie_15cl_resnet34-unet_weights.pth

Error:
You should probably TRAIN this model on a down-stream task to be able to use it for predictions and inference.
Traceback (most recent call last):
File "\?\C:\Users\Rajaram\anaconda3\envs\GPU\Scripts\flair-detect-script.py", line 33, in
sys.exit(load_entry_point('flair', 'console_scripts', 'flair-detect')())
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "e:\gpu\unet\flair-1-main\src\zone_detect\main.py", line 143, in main
sliced_dataframe, profile, resolution, model = prepare(config, device)
^^^^^^^^^^^^^^^^^^^^^^^
File "e:\gpu\unet\flair-1-main\src\zone_detect\main.py", line 116, in prepare
model = load_model(config)
^^^^^^^^^^^^^^^^^^
File "e:\gpu\unet\flair-1-main\src\zone_detect\model.py", line 84, in load_model
model.load_state_dict(state_dict=state_dict, strict=True)
File "C:\Users\Rajaram\anaconda3\envs\GPU\Lib\site-packages\torch\nn\modules\module.py", line 2215, in load_state_dict
raise RuntimeError('Error(s) in loading state_dict for {}:\n\t{}'.format(
RuntimeError: Error(s) in loading state_dict for UperNetForSemanticSegmentation:
Missing key(s) in state_dict: "backbone.embeddings.patch_embeddings.projection.weight", "backbone.embeddings.patch_embeddings.projection.bias",
"backbone.embeddings.norm.weight", "backbone.embeddings.norm.bias",
"backbone.encoder.layers.0.blocks.0.layernorm_before.weight", "backbone.encoder.layers.0.blocks.0.layernorm_before.bias",
"backbone.encoder.layers.0.blocks.0.attention.self.relative_position_bias_table", "backbone.encoder.layers.0.blocks.0.attention.self.relative_position_index",
"backbone.encoder.layers.0.blocks.0.attention.self.query.weight", "backbone.encoder.layers.0.blocks.0.attention.self.query.bias",
"backbone.encoder.layers.0.blocks.0.attention.self.key.weight", "backbone.encoder.layers.0.blocks.0.attention.self.key.bias",
"backbone.encoder.layers.0.blocks.0.attention.self.value.weight", "backbone.encoder.layers.0.blocks.0.attention.self.value.bias",
"backbone.encoder.layers.0.blocks.0.attention.output.dense.weight", "backbone.encoder.layers.0.blocks.0.attention.output.dense.bias",
"backbone.encoder.layers.0.blocks.0.layernorm_after.weight", "backbone.encoder.layers.0.blocks.0.layernorm_after.bias",
"backbone.encoder.layers.0.blocks.0.intermediate.dense.weight", "backbone.encoder.layers.0.blocks.0.intermediate.dense.bias",
"backbone.encoder.layers.0.blocks.0.output.dense.weight", "backbone.encoder.layers.0.blocks.0.output.dense.bias",
"backbone.encoder.layers.0.blocks.1.layernorm_before.weight", "backbone.encoder.layers.0.blocks.1.layernorm_before.bias",
"backbone.encoder.layers.0.blocks.1.attention.self.relative_position_bias_table", "backbone.encoder.layers.0.blocks.1.attention.self.relative_position_index",
"backbone.encoder.layers.0.blocks.1.attention.self.query.weight",

Hello @rajaram6052150
The pre-trained models currently available on our HF page have been trained with segmentation-models-pytorch. This might indeed not be clear at the moment; we will update the model cards and model names when we release pre-trained models trained with HF.
So for now, your config file should use something like:

model_weights: ../FLAIR-INC_rgbie_15cl_resnet34-unet_weights.pth
model_framework: 
    model_provider: SegmentationModelsPytorch
    HuggingFace:
        org_model: 
    SegmentationModelsPytorch:
        encoder_decoder: resnet34_unet
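
As a side note, a minimal sketch of loading these weights directly with segmentation-models-pytorch, assuming 5 input bands and 15 classes as suggested by the model name (rgbie, 15cl); the exact state-dict layout may differ:

import torch
import segmentation_models_pytorch as smp

# 'rgbie' suggests 5 input bands (RGB + infrared + elevation) and
# '15cl' suggests 15 output classes -- both inferred from the model name.
model = smp.Unet(
    encoder_name="resnet34",
    encoder_weights=None,  # weights come from the checkpoint, not ImageNet
    in_channels=5,
    classes=15,
)

state_dict = torch.load(
    "FLAIR-INC_rgbie_15cl_resnet34-unet_weights.pth", map_location="cpu"
)
# Some checkpoints nest the weights under a 'state_dict' key; unwrap if so.
if isinstance(state_dict, dict) and "state_dict" in state_dict:
    state_dict = state_dict["state_dict"]
model.load_state_dict(state_dict)
model.eval()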

Hello @agarioud ,

Thank you for the clarification regarding the pre-trained models. However, I have checked the specified path in the GitHub repository, and it appears that the file FLAIR-INC_rgbie_15cl_resnet34-unet_weights.pth is not present there.

Could you please provide the correct path or a link to download the pre-trained model file? It would be very helpful for continuing with the setup.

Thank you!

Best regards,
Rajaram

You had the right link: https://huggingface.co/IGNF/FLAIR-INC_rgbie_15cl_resnet34-unet/blob/main/FLAIR-INC_rgbie_15cl_resnet34-unet_weights.pth
Once downloaded locally, you should adjust the 'model_weights' path in the config file to point to it.
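
If you prefer fetching it programmatically, a small sketch with huggingface_hub (repo id and filename taken from the link above):

from huggingface_hub import hf_hub_download

# Downloads the file into the local HF cache and returns its path.
weights_path = hf_hub_download(
    repo_id="IGNF/FLAIR-INC_rgbie_15cl_resnet34-unet",
    filename="FLAIR-INC_rgbie_15cl_resnet34-unet_weights.pth",
)
print(weights_path)  # point 'model_weights' in the config to this local path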

Thank you for your help, @agarioud, and sorry to bother you again.
I’ve successfully set up and run inference using the FLAIR-INC_rgbie_15cl_resnet34-unet model after addressing the issue with the model_weights path. However, the output images are all coming out completely black.

Here are the details of my setup and problem:

Setup:
- Model weights path: correctly set to the weights file downloaded from Hugging Face.
- Input image: a georeferenced raster with the following metadata:
    - Driver: GTiff
    - Dtype: uint8
    - Number of bands: 5
    - Image shape: (5, 512, 512)
    - CRS: EPSG:2154
- Configuration parameters:
    - Output type: argmax
    - Normalization: custom
    - Image size for detection (img_pixels_detection): 512
    - Overlap margin: 128
Issue:
- Observation: the output rasters are completely black for all processed images.
- Errors: none are reported during execution; the model loads and runs inference without issues.

@rajaram6052150, the flair-detect command is meant to infer over a 'large' area, with overlapping inferences. If your image is 512×512 and img_pixels_detection is also 512 but with a 128-pixel overlap, it may yield some inconsistencies. Also, the output rasters are 'black', but do they contain values?
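
A quick way to check, sketched with rasterio (the output path below is a placeholder for whatever flair-detect wrote):

import numpy as np
import rasterio

with rasterio.open("output/zone_argmax.tif") as src:  # placeholder path
    data = src.read(1)

# argmax outputs hold class indices (0..14 for 15 classes), which look
# black in a standard 8-bit viewer even when predictions are present.
print(np.unique(data, return_counts=True))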

Big thanks for the guidance, @agarioud. It turns out that the black images do indeed contain values representing the different classes, and the segmentation results are accurate. Your suggestion helped us identify this.

I also wanted to ask: Will this implementation work with JPG or PNG images as well, or do I need to modify the code for those formats?

Glad it worked out, @rajaram6052150; happy to help.
flair-detect won't work with JPG/PNG images, as it is meant to work with georeferenced inputs. I haven't planned to add support for these formats in the near future, so you might want to either convert your inputs or indeed modify the code.
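
If you go the conversion route, a rough sketch with rasterio and Pillow; the paths, CRS and transform below are placeholders, since a plain PNG/JPG carries no real georeferencing:

import numpy as np
import rasterio
from rasterio.transform import from_origin
from PIL import Image

img = np.array(Image.open("input.png"))  # placeholder path, (H, W, bands)
img = np.moveaxis(img, -1, 0)            # rasterio expects (bands, H, W)

# Dummy georeferencing: 1 unit per pixel, arbitrary origin and CRS.
transform = from_origin(0.0, float(img.shape[1]), 1.0, 1.0)
with rasterio.open(
    "input_georef.tif", "w", driver="GTiff",
    height=img.shape[1], width=img.shape[2], count=img.shape[0],
    dtype=str(img.dtype), crs="EPSG:2154", transform=transform,
) as dst:
    dst.write(img)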
Best,