SeC: Advancing Complex Video Object Segmentation via Progressive Concept Construction

🚀🚀🚀 Official implementation of SeC: Advancing Complex Video Object Segmentation via Progressive Concept Construction

Zhixiong Zhang* · Shuangrui Ding* · Xiaoyi Dong✉ · Songxin He · Jianfan Lin · Junsong Tang
Yuhang Zang · Yuhang Cao · Dahua Lin · Jiaqi Wang✉

Demo Video

demo.mp4

📜 News

🏆 [2025/8/13] SeC sets a new state-of-the-art on the latest MOSE v2 leaderboard!

🚀 [2025/7/22] The Paper and Project Page are released!

💡 Highlights

  • 🔥 We introduce Segment Concept (SeC), a concept-driven segmentation framework for video object segmentation that integrates Large Vision-Language Models (LVLMs) for robust, object-centric representations.
  • 🔥 SeC dynamically balances semantic reasoning with feature matching, adaptively adjusting computational effort based on scene complexity for optimal segmentation performance.
  • 🔥 We propose the Semantic Complex Scenarios Video Object Segmentation (SeCVOS) benchmark, designed to evaluate segmentation in challenging scenarios.
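The second highlight can be pictured with a toy gating rule. The sketch below is not SeC's actual mechanism: it assumes scene complexity can be approximated by how much a frame differs from its predecessor, and the function names, the `"reason"`/`"match"` labels, and the threshold are all hypothetical.

```python
def mean_abs_diff(prev, curr):
    """Mean absolute difference between two equally sized grayscale frames,
    each given as a flat list of pixel intensities."""
    if not prev or len(prev) != len(curr):
        raise ValueError("frames must be non-empty and the same size")
    return sum(abs(a - b) for a, b in zip(prev, curr)) / len(prev)


def plan_frames(frames, threshold=30.0):
    """Label each frame 'reason' (expensive path) or 'match' (cheap path).

    Hypothetical rule, not taken from the SeC code: the first frame and any
    frame that differs strongly from its predecessor take the expensive
    path; every other frame takes the cheap one.
    """
    plan = []
    for i, frame in enumerate(frames):
        if i == 0 or mean_abs_diff(frames[i - 1], frame) > threshold:
            plan.append("reason")
        else:
            plan.append("match")
    return plan


if __name__ == "__main__":
    # Four tiny 4-pixel "frames" with an abrupt change before the third one.
    frames = [[10] * 4, [12] * 4, [200] * 4, [198] * 4]
    print(plan_frames(frames))  # ['reason', 'match', 'reason', 'match']
```

The point of the sketch is only the control flow: the costly step runs on a subset of frames chosen by a cheap per-frame signal, so total compute scales with how often the scene changes rather than with video length.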

✨ SeC Performance

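| Model      | SA-V val | SA-V test | LVOS v2 val | MOSE val | DAVIS 2017 val | YTVOS 2019 val | SeCVOS |
|------------|----------|-----------|-------------|----------|----------------|----------------|--------|
| SAM 2.1    | 78.6     | 79.6      | 84.1        | 74.5     | 90.6           | 88.7           | 58.2   |
| SAMURAI    | 79.8     | 80.0      | 84.2        | 72.6     | 89.9           | 88.3           | 62.2   |
| SAM2.1Long | 81.1     | 81.2      | 85.9        | 75.2     | 91.4           | 88.7           | 62.3   |
| SeC (Ours) | 82.7     | 81.7      | 86.5        | 75.3     | 91.3           | 88.6           | 70.0   |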
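To read the table at a glance, the margin between SeC and the strongest competing row on each benchmark can be computed directly. This is a small sketch using only the numbers above; the helper name `margin_over_best_baseline` is made up for illustration.

```python
# Scores copied from the table above, in column order.
BENCHMARKS = ["SA-V val", "SA-V test", "LVOS v2 val", "MOSE val",
              "DAVIS 2017 val", "YTVOS 2019 val", "SeCVOS"]
SCORES = {
    "SAM 2.1":    [78.6, 79.6, 84.1, 74.5, 90.6, 88.7, 58.2],
    "SAMURAI":    [79.8, 80.0, 84.2, 72.6, 89.9, 88.3, 62.2],
    "SAM2.1Long": [81.1, 81.2, 85.9, 75.2, 91.4, 88.7, 62.3],
    "SeC (Ours)": [82.7, 81.7, 86.5, 75.3, 91.3, 88.6, 70.0],
}


def margin_over_best_baseline(ours="SeC (Ours)"):
    """Per benchmark: SeC's score minus the best score among the other rows."""
    margins = {}
    for i, bench in enumerate(BENCHMARKS):
        best_other = max(v[i] for k, v in SCORES.items() if k != ours)
        # Round to one decimal to match the table's precision.
        margins[bench] = round(SCORES[ours][i] - best_other, 1)
    return margins


if __name__ == "__main__":
    for bench, delta in margin_over_best_baseline().items():
        print(f"{bench:<16}{delta:+.1f}")
```

On these numbers the largest margin is on SeCVOS (+7.7 over SAM2.1Long), while SeC trails the best baseline by 0.1 on DAVIS 2017 val and YTVOS 2019 val.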

👨‍💻 TODO

  • Release SeC training code
  • Release SeCVOS benchmark annotations
  • Release SeC inference code and checkpoints

🛠️ Usage

1. Install environment and dependencies

Please make sure to use the correct versions of transformers and peft.

conda create -n sec python=3.10
conda activate sec
pip install torch==2.5.1 torchvision==0.20.1 torchaudio==2.5.1 --index-url https://download.pytorch.org/whl/cu121
pip install -r requirements.txt
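After installation, a quick way to see which versions actually ended up in the environment is to query the package metadata. The sketch below only reports versions; it does not know which versions SeC requires, so compare its output against requirements.txt yourself. The package list is an assumption based on the names mentioned above.

```python
from importlib.metadata import PackageNotFoundError, version

# Packages named in the install steps above (assumed list, extend as needed).
PACKAGES = ("torch", "torchvision", "transformers", "peft")


def installed_versions(packages=PACKAGES):
    """Map each package name to its installed version, or None if absent."""
    found = {}
    for name in packages:
        try:
            found[name] = version(name)
        except PackageNotFoundError:
            found[name] = None
    return found


if __name__ == "__main__":
    for name, ver in installed_versions().items():
        print(f"{name:<14}{ver or 'NOT INSTALLED'}")
```

Run it inside the activated `sec` environment so that it inspects the same interpreter the notebook and inference scripts will use.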

2. Download the Pretrained Checkpoints

Download the SeC checkpoint from 🤗HuggingFace and place it in the following directory:

saved_models
  ├── SeC-4B
  │   └── config.json
  │   └── generation_config.json
  ...
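A misplaced checkpoint is an easy mistake to make, so it can help to check the layout before running anything. The following sketch looks only for the two files named in the tree above; the real checkpoint contains more files (the elided part), which this check deliberately ignores. The function name is hypothetical.

```python
from pathlib import Path

# Only the two files the tree above actually names.
EXPECTED_FILES = ("config.json", "generation_config.json")


def missing_checkpoint_files(root="saved_models", model_dir="SeC-4B"):
    """Return the expected files that are absent under root/model_dir."""
    ckpt = Path(root) / model_dir
    return [name for name in EXPECTED_FILES if not (ckpt / name).is_file()]


if __name__ == "__main__":
    missing = missing_checkpoint_files()
    if missing:
        print("Missing from saved_models/SeC-4B:", ", ".join(missing))
    else:
        print("Checkpoint layout looks as expected.")
```

Run it from the repository root, since the default `root` is a relative path.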

3. Quick Start

If you want to test SeC inference on a single video, please refer to demo.ipynb.

4. Run the inference and evaluate the results

Inference instructions are in INFERENCE.md.

Evaluation instructions are in EVALUATE.md. To evaluate performance on seen and unseen categories in the LVOS dataset, refer to the evaluation code available here.

❀️ Acknowledgments and License

This repository is licensed under the Apache License 2.0.

This repo benefits from SAM 2, SAM2Long and Sa2VA. Thanks to their authors for their wonderful work.

✒️ Citation

If you find our work helpful for your research, please consider giving us a star ⭐ and a citation 📝

@article{zhang2025sec,
    title     = {SeC: Advancing Complex Video Object Segmentation via Progressive Concept Construction},
    author    = {Zhixiong Zhang and Shuangrui Ding and Xiaoyi Dong and Songxin He and Jianfan Lin and Junsong Tang and Yuhang Zang and Yuhang Cao and Dahua Lin and Jiaqi Wang},
    journal   = {arXiv preprint arXiv:2507.15852},
    year      = {2025}
}