SwinDocSegmenter

Description

PyTorch implementation of the paper SwinDocSegmenter: An End-to-End Unified Domain Adaptive Transformer for Document Instance Segmentation. The model is implemented on top of the detectron2 framework. It can be used to analyze complex layouts, including magazines, scientific reports, historical documents, patents, and so on, as shown in the following examples.

[Example images: Magazines, Scientific Reports, Tables, Others]

Getting Started

Step 1: Clone this repository and change directory to repository root

git clone https://github.com/ayanban011/SwinDocSegmenter.git 
cd SwinDocSegmenter

Step 2: Set up and activate the conda environment with the required dependencies:

Follow the installation instructions.

Step 3: To test the model, download the best pretrained model weights from the Model Zoo, then run the evaluation:

python ./train_net.py \
    --config-file maskdino_R50_bs16_50ep_4s_dowsample1_2048.yaml \
    --eval-only \
    --num-gpus 1 \
    MODEL.WEIGHTS ./model_final.pth
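
The evaluation command above can also be assembled from a script, which makes it easy to confirm that the downloaded weights are in place before launching. The following is a minimal sketch that uses only the flags shown above; `build_eval_command` is a hypothetical helper and is not part of this repository.

```python
import sys
from pathlib import Path


def build_eval_command(config_file, weights, num_gpus=1):
    """Return the Step 3 evaluation command as an argument list.

    Hypothetical helper: it only mirrors the flags shown in the README
    and fails early if the weights file is missing.
    """
    if not Path(weights).is_file():
        raise FileNotFoundError(f"weights not found: {weights}")
    return [
        sys.executable, "./train_net.py",
        "--config-file", str(config_file),
        "--eval-only",
        "--num-gpus", str(num_gpus),
        "MODEL.WEIGHTS", str(weights),
    ]
```

The returned list can be passed to `subprocess.run(cmd, check=True)` from the repository root.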

Step 4: To train the model from scratch, use the following command, setting --num-gpus to the number of GPUs ('n') you want to train on:

python train_net.py --num-gpus 1 --config-file config_path SOLVER.IMS_PER_BATCH SET_TO_SOME_REASONABLE_VALUE SOLVER.BASE_LR SET_TO_SOME_REASONABLE_VALUE
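
The command leaves SOLVER.IMS_PER_BATCH and SOLVER.BASE_LR open. One common heuristic, which the authors do not state, is the linear scaling rule: change the learning rate in proportion to the total batch size. The sketch below assumes a reference batch size of 16 (suggested by `bs16` in the config file name) and uses a placeholder reference learning rate that should be replaced with the value from your config file.

```python
def scale_learning_rate(ref_lr, ref_batch_size, new_batch_size):
    """Linear scaling rule: learning rate grows in proportion to batch size."""
    if ref_batch_size <= 0 or new_batch_size <= 0:
        raise ValueError("batch sizes must be positive")
    return ref_lr * new_batch_size / ref_batch_size


REF_BATCH_SIZE = 16  # assumption, taken from "bs16" in the config file name
REF_LR = 1e-4        # placeholder, replace with SOLVER.BASE_LR from your config

new_batch_size = 4   # total images per batch across all GPUs
new_lr = scale_learning_rate(REF_LR, REF_BATCH_SIZE, new_batch_size)

# Print the overrides in the KEY VALUE form accepted by train_net.py
print(f"SOLVER.IMS_PER_BATCH {new_batch_size} SOLVER.BASE_LR {new_lr:g}")
```

This is only a starting point; the final values still need to be checked against GPU memory and training stability.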

Step 5: To train on a custom dataset, register it in train_net.py and update the config file:

In train_net.py

from detectron2.data import MetadataCatalog
from detectron2.data.datasets import register_coco_instances

def main(args):
    # Register the COCO-format training and validation splits
    register_coco_instances("dataset_train", {}, "path to the ground truth json file", "path to the training image folder")
    register_coco_instances("dataset_val", {}, "path to the ground truth json file", "path to the validation image folder")

    # Set the class names for both splits
    MetadataCatalog.get("dataset_train").thing_classes = ['name of the classes']
    MetadataCatalog.get("dataset_val").thing_classes = ['name of the classes']
    ...

if __name__ == "__main__":
    ...
    # Set the same class names here as well
    MetadataCatalog.get("dataset_train").thing_classes = ['name of the classes']
    MetadataCatalog.get("dataset_val").thing_classes = ['name of the classes']
    ...
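
Rather than typing the class names by hand, they can be read from the ground truth file. The following is a minimal sketch, assuming the annotations follow the standard COCO layout with a top-level `categories` list of `id` and `name` entries; `load_class_names` is a hypothetical helper and is not part of this repository.

```python
import json


def load_class_names(annotation_json):
    """Return category names from a COCO-format file, ordered by category id."""
    with open(annotation_json, "r", encoding="utf-8") as f:
        coco = json.load(f)
    # Sort by id so the list order is stable regardless of file order
    categories = sorted(coco["categories"], key=lambda c: c["id"])
    return [c["name"] for c in categories]
```

The returned list can be assigned to `thing_classes` for both splits, and its length gives the value for NUM_CLASSES in the config file.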

In Config File

...
SEM_SEG_HEAD:
    ...
    NUM_CLASSES: #no. of classes
...
DATASETS:
  TRAIN: ("dataset_train",)
  TEST: ("dataset_val",)
...

Model Zoo

In this section, we release the pre-trained weights for the best SwinDocSegmenter model variants trained on benchmark datasets.

Dataset	Config-file	Weights	AP
PubLayNet	config-publay	model	93.72
PRImA	config-prima	model	54.39
HJ Dataset	config-hj	model	84.65
TableBank	config-table	model	98.04
DocLayNet	config-doclay	model	76.85

Citation

If you find this useful for your research, please cite it as follows:

@article{banerjee2023swindocsegmenter,
  title={SwinDocSegmenter: An End-to-End Unified Domain Adaptive Transformer for Document Instance Segmentation},
  author={Banerjee, Ayan and Biswas, Sanket and Llad{\'o}s, Josep and Pal, Umapada},
  journal={arXiv preprint arXiv:2305.04609},
  year={2023}
}
