
Segment-Anything-2 (SAM 2) fine-tuning with COCO-format data


Segment-Anything-2-Finetune Project

Welcome to the Segment-Anything-2-Finetune project! This repository is designed to train and evaluate a segmentation model using Meta's Segment-Anything-2 (SAM 2) and datasets in COCO format.

Features

  • Dataset Configuration: COCO format.
  • Training Options: Train the model using either bounding boxes or points. Points are generated from the bounding boxes: each point represents the center of its bounding box.
  • Mask Utilization: Multiple masks are utilized for each point or bounding box, with the lowest segmentation loss mask being used for training (multimask_output=True).
  • Loss Function: Uses the same technique described in the SAM 2 paper: "use an ℓ1 loss to more aggressively supervise the IoU predictions and to apply a sigmoid activation to the IoU logits to restrict the output into the range between 0 and 1. For multi-mask predictions (on the first click), we supervise the IoU predictions of all masks to encourage better learning of when a mask might be bad, but only supervise the mask logits with the lowest segmentation loss (linear combination of focal and dice loss)."
  • Efficient Training: Save and load image embeddings to reduce training time. Loading embeddings from a previous epoch/run saves roughly 35% of training time.
  • Validation Output: Save segmented validation images to a specified directory.
  • Iterative Sampling: Repeatedly process the same image to enhance the model's performance. In each iteration after the first, the model uses the masks generated from the previous iteration as prompts, in conjunction with the original prompt. This approach aims to reduce false negatives.
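The center-point generation described in Training Options can be sketched as follows. This is an illustrative helper, not this repository's actual code; it assumes COCO's `[x, y, w, h]` box layout and SAM 2's point-prompt convention of `(N, num_points, 2)` coordinates with label `1` for foreground:

```python
import numpy as np

def bbox_centers(bboxes):
    """Convert COCO-format boxes [x, y, w, h] into center-point prompts.

    Returns (N, 1, 2) point coordinates and (N, 1) labels, where label 1
    marks a foreground point (one point per object).
    """
    bboxes = np.asarray(bboxes, dtype=np.float32)
    centers = bboxes[:, :2] + bboxes[:, 2:] / 2.0  # (x + w/2, y + h/2)
    points = centers[:, None, :]                   # add a num_points axis
    labels = np.ones((len(bboxes), 1), dtype=np.int32)
    return points, labels
```

For example, `bbox_centers([[10, 20, 30, 40]])` yields a single point at `(25, 40)`, the center of that box.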

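A minimal sketch of the loss recipe quoted above, for one object with multiple candidate masks: focal and dice losses are computed per mask, the IoU predictions of all masks are supervised with an ℓ1 loss after a sigmoid, but only the lowest-segmentation-loss mask supervises the mask logits. The loss weights here are illustrative assumptions, not values taken from this repository:

```python
import torch
import torch.nn.functional as F

def segmentation_loss(mask_logits, iou_logits, gt_mask,
                      w_focal=20.0, w_dice=1.0, w_iou=1.0):
    """Multi-mask loss sketch following the SAM 2 recipe.

    mask_logits: (M, H, W) logits for M candidate masks of one object.
    iou_logits:  (M,) predicted-IoU logits for those masks.
    gt_mask:     (H, W) binary ground-truth mask.
    """
    gt = gt_mask.float().unsqueeze(0).expand_as(mask_logits)  # (M, H, W)
    prob = torch.sigmoid(mask_logits)

    # Focal loss per mask (gamma = 2).
    bce = F.binary_cross_entropy_with_logits(mask_logits, gt, reduction="none")
    p_t = prob * gt + (1 - prob) * (1 - gt)
    focal = ((1 - p_t) ** 2 * bce).mean(dim=(1, 2))

    # Dice loss per mask.
    inter = (prob * gt).sum(dim=(1, 2))
    dice = 1 - (2 * inter + 1) / (prob.sum(dim=(1, 2)) + gt.sum(dim=(1, 2)) + 1)

    seg = w_focal * focal + w_dice * dice  # (M,) segmentation loss per mask

    # l1 loss on sigmoid(IoU logits) vs. actual IoU, for *all* masks.
    pred = (prob > 0.5).float()
    inter_b = (pred * gt).sum(dim=(1, 2))
    union = pred.sum(dim=(1, 2)) + gt.sum(dim=(1, 2)) - inter_b
    actual_iou = inter_b / union.clamp(min=1)
    iou_loss = torch.abs(torch.sigmoid(iou_logits) - actual_iou).mean()

    # Only the lowest-loss mask supervises the mask logits.
    best = seg.argmin()
    return seg[best] + w_iou * iou_loss
```

The `argmin` selection is what `multimask_output=True` buys during training: the model is free to propose several hypotheses, and only the best one is penalized on its mask quality.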
How to Use

1. Configure Settings: Open the configuration file. Adjust the settings to match your desired parameters.

2. Run Training: Run the train.py file to start training.

Special thanks to luca-medeiros and sagieppel. Their code served as the base for this project.

License

This project is licensed under the same terms as the SAM 2 model.

Citing SAM 2

@article{ravi2024sam2,
  title={SAM 2: Segment Anything in Images and Videos},
  author={Ravi, Nikhila and Gabeur, Valentin and Hu, Yuan-Ting and Hu, Ronghang and Ryali, Chaitanya and Ma, Tengyu and Khedr, Haitham and R{\"a}dle, Roman and Rolland, Chloe and Gustafson, Laura and Mintun, Eric and Pan, Junting and Alwala, Kalyan Vasudev and Carion, Nicolas and Wu, Chao-Yuan and Girshick, Ross and Doll{\'a}r, Piotr and Feichtenhofer, Christoph},
  journal={arXiv preprint arXiv:2408.00714},
  url={https://arxiv.org/abs/2408.00714},
  year={2024}
}