lanfeng4659/STR-TDSL

Testing

Opened this issue · 10 comments

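Hello, I set up the environment and downloaded your trained weights, model_7709.pth, to compare against the experimental results in the paper. However, testing on iiit_str with this weight file gives only 72.89 mAP. I don't know what is going on, or whether the cause is my environment or something else.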

Could you share the log.txt file from your test run?

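I posted my question late and assumed you would reply the next day, so I apologize for the delay. The first file is the log for the SVT dataset and the second is the log for the IIIT-STR dataset (I renamed the maskrcnn_benchmark package to mnbk).
Uploading log_iiit10000.txt…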

Below is the log from my own test run, which you can use for comparison.

2023-06-27 17:55:55,377 maskrcnn_benchmark INFO: Using 1 GPUs
2023-06-27 17:55:55,377 maskrcnn_benchmark INFO: AMP_VERBOSE: False
DARTS:
ARCH_START_ITER: 5000
LR_A: 0.001
LR_END: 0.0001
TIE_CELL: False
T_MAX: 2500
WD_A: 0.001
DATALOADER:
ASPECT_RATIO_GROUPING: True
NUM_WORKERS: 8
SIZE_DIVISIBILITY: 32
DATASETS:
TEST: ('iiit_test',)
TEXT:
NUM_CHARS: 25
VOC_SIZE: 97
TRAIN: ()
DTYPE: float32
INPUT:
AUGMENT: PSSAugmentation
BRIGHTNESS: 0.125
CONTRAST: 0.125
CROP_PROB_TRAIN: 0.0
CROP_SIZE_TRAIN: -1
FLIP_PROB_TRAIN: 0.0
HUE: 0.5
MAX_SIZE_TEST: 1333
MAX_SIZE_TRAIN: 1333
MIN_SIZE_RANGE_TRAIN: (-1, -1)
MIN_SIZE_TEST: 800
MIN_SIZE_TRAIN: (800,)
PIXEL_MEAN: [103.53, 116.28, 123.675]
PIXEL_STD: [57.375, 57.12, 58.395]
SATURATION: 0.5
TO_BGR255: True
VERTICAL_FLIP_PROB_TRAIN: 0.0
IS_LOAD_OPTIMIZER: True
IS_LOAD_SCHEDULER: True
MODEL:
ALIGN:
IS_CHINESE: False
NUM_CONVS: 2
POOLER_CANONICAL_SCALE: 160
POOLER_RESOLUTION: (4, 15)
POOLER_SCALES: (0.25, 0.125, 0.0625)
PREDICTOR: ctc
PYRAMID_LAYERS: (2, 3, 4, 5)
USE_ALONG_LOSS: False
USE_BOX_AUG: False
USE_CHARACTER_AWARENESS: False
USE_CHAR_COUNT: False
USE_COMMON_SPACE: False
USE_CONTRASTIVE_LOSS: False
USE_CTC_LOSS: False
USE_DOMAIN_ALIGN_LOSS: False
USE_DOMAIN_CLASSIFIER: False
USE_DYNAMIC_SIMILARITY: False
USE_FOCAL_L1_LOSS: False
USE_GLOBAL_LOCAL_SIMILARITY: False
USE_HANMING: False
USE_IOU_PREDICTOR: False
USE_LOOK_UP: False
USE_NO_RNN: False
USE_N_GRAM_ED: False
USE_PYRAMID: False
USE_RES_LINK: False
USE_RETRIEVAL: True
USE_STEP: False
USE_TEXTNESS: False
USE_WORD_AUG: False
USE_WORD_INSTANCE_AUG: False
ATTENTION:
IS_CHINESE: False
NUM_CONVS: 4
POOLER_CANONICAL_SCALE: 160
POOLER_RESOLUTION: (14, 64)
POOLER_SCALES: (0.125, 0.0625, 0.03125)
PREDICTOR: ctc
USE_BOX_AUG: False
USE_RETRIEVAL: True
USE_WORD_AUG: False
BACKBONE:
CONV_BODY: R-50
FREEZE_BN: False
FREEZE_CONV_BODY_AT: 2
CHAR_INST_ON: False
CHAR_ON: False
CLS_AGNOSTIC_BBOX_REG: False
DEVICE: cuda
EAST:
CENTER_SAMPLE: True
FPN_STRIDES: [8, 16, 32, 64, 128]
INFERENCE_TH: 0.2
LOC_LOSS_TYPE: giou
LOSS_ALPHA: 0.25
LOSS_GAMMA: 2.0
NMS_TH: 0.6
NUM_CLASSES: 2
NUM_CONVS: 2
POS_RADIUS: 1.5
PRE_NMS_TOP_N: 1000
PRIOR_PROB: 0.01
SIZES_OF_INTEREST: [64, 128, 256, 512]
USE_BN: False
USE_DEFORMABLE: False
USE_GN: True
USE_LIGHTWEIGHT: False
USE_RELU: True
FBNET:
ARCH: default
ARCH_DEF:
BN_TYPE: bn
DET_HEAD_BLOCKS: []
DET_HEAD_LAST_SCALE: 1.0
DET_HEAD_STRIDE: 0
DW_CONV_SKIP_BN: True
DW_CONV_SKIP_RELU: True
KPTS_HEAD_BLOCKS: []
KPTS_HEAD_LAST_SCALE: 0.0
KPTS_HEAD_STRIDE: 0
MASK_HEAD_BLOCKS: []
MASK_HEAD_LAST_SCALE: 0.0
MASK_HEAD_STRIDE: 0
RPN_BN_TYPE:
RPN_HEAD_BLOCKS: 0
SCALE_FACTOR: 1.0
WIDTH_DIVISOR: 1
FCOS:
CENTER_SAMPLE: True
FPN_STRIDES: [8, 16, 32, 64, 128]
INFERENCE_TH: 0.05
LOC_LOSS_TYPE: giou
LOSS_ALPHA: 0.25
LOSS_GAMMA: 2.0
NMS_TH: 0.6
NUM_CLASSES: 2
NUM_CONVS: 4
POS_RADIUS: 1.5
PRE_NMS_TOP_N: 1000
PRIOR_PROB: 0.01
SIZES_OF_INTEREST: [64, 128, 256, 512]
USE_BN: False
USE_DEFORMABLE: False
USE_GN: True
USE_LIGHTWEIGHT: False
USE_RELU: True
FCOS_ON: True
FPN:
USE_BN: False
USE_DEFORMABLE: False
USE_GN: False
USE_RELU: False
GROUP_NORM:
DIM_PER_GP: -1
EPSILON: 1e-05
NUM_GROUPS: 32
HEAD:
BBOX_LOSS:
ALPHA: 0.5
BETA: 0.11
GAMMA: 1.5
TYPE: IOULoss
WEIGHT: 1.0
INST_ON: False
KEYPOINT_ON: False
KE_ON: False
MASK_ON: False
META_ARCHITECTURE: OneStage
MSR_ON: False
NECK:
CONV_BODY: fpn-align
IN_CHANNELS: 256
LAST_STRIDE: 2
NUM_LEVELS: 5
REFINE_LEVEL: 1
REFINE_TYPE: non_local
USE_DEFORMABLE: False
USE_GN: False
OFFSET:
KERNEL_SIZE: 3
PREDICTOR: polar
STOP_OFFSETS: 1500
ONE_STAGE_HEAD: align
POLYGON_DET: False
RESNETS:
BACKBONE_OUT_CHANNELS: 256
DEFORMABLE_GROUPS: 1
DEFORM_POOLING: False
MAX_DCN_LAYER: 15
NUM_GROUPS: 1
RES2_OUT_CHANNELS: 256
RES5_DILATION: 1
STAGE_WITH_CONTEXT: (False, False, False, False)
STAGE_WITH_DCN: (False, False, False, False)
STEM_FUNC: StemWithFixedBatchNorm
STEM_OUT_CHANNELS: 64
STRIDE_IN_1X1: True
TRANS_FUNC: BottleneckWithFixedBatchNorm
WIDTH_PER_GROUP: 64
WITH_MODULATED_DCN: False
RETINANET:
ANCHOR_SIZES: (32, 64, 128, 256, 512)
ANCHOR_STRIDES: (8, 16, 32, 64, 128)
ASPECT_RATIOS: (0.5, 1.0, 2.0)
BBOX_REG_BETA: 0.11
BBOX_REG_WEIGHT: 4.0
BG_IOU_THRESHOLD: 0.4
FG_IOU_THRESHOLD: 0.5
INFERENCE_TH: 0.05
LOSS_ALPHA: 0.25
LOSS_GAMMA: 2.0
NMS_TH: 0.4
NUM_CLASSES: 81
NUM_CONVS: 4
OCTAVE: 2.0
PRE_NMS_TOP_N: 1000
PRIOR_PROB: 0.01
SCALES_PER_OCTAVE: 3
STRADDLE_THRESH: 0
USE_C5: False
RETINANET_ON: False
RETRIEVAL_ONLY: False
ROI_BOX_HEAD:
CLASS_WEIGHT: 1.0
CONV_HEAD_DIM: 256
DILATION: 1
FEATURE_EXTRACTOR: ResNet50Conv5ROIFeatureExtractor
MLP_HEAD_DIM: 1024
NUM_CLASSES: 81
NUM_STACKED_CONVS: 4
POOLER_RESOLUTION: 14
POOLER_SAMPLING_RATIO: 0
POOLER_SCALES: (0.0625,)
PREDICTOR: FastRCNNPredictor
USE_DFPOOL: False
USE_GN: False
ROI_HEADS:
BATCH_SIZE_PER_IMAGE: 512
BBOX_REG_WEIGHTS: (10.0, 10.0, 5.0, 5.0)
BG_IOU_THRESHOLD: 0.5
DETECTIONS_PER_IMG: 100
FG_IOU_THRESHOLD: 0.5
NMS: 0.5
POSITIVE_FRACTION: 0.25
SCORE_THRESH: 0.05
USE_FPN: False
ROI_INST_HEAD:
PREDICTOR: EmbeddingPredictor
ROI_KEYPOINT_HEAD:
CONV_LAYERS: (512, 512, 512, 512, 512, 512, 512, 512)
FEATURE_EXTRACTOR: KeypointRCNNFeatureExtractor
MLP_HEAD_DIM: 1024
NUM_CLASSES: 17
POOLER_RESOLUTION: 14
POOLER_SAMPLING_RATIO: 0
POOLER_SCALES: (0.0625,)
PREDICTOR: KeypointRCNNPredictor
RESOLUTION: 14
SHARE_BOX_FEATURE_EXTRACTOR: True
ROI_MASK_HEAD:
CONV_LAYERS: (256, 256, 256, 256)
DILATION: 1
FEATURE_EXTRACTOR: ResNet50Conv5ROIFeatureExtractor
MLP_HEAD_DIM: 1024
POOLER_RESOLUTION: 14
POOLER_SAMPLING_RATIO: 0
POOLER_SCALES: (0.0625,)
POSTPROCESS_MASKS: False
POSTPROCESS_MASKS_THRESHOLD: 0.5
PREDICTOR: MaskRCNNC4Predictor
RESOLUTION: 14
SHARE_BOX_FEATURE_EXTRACTOR: True
USE_DFPOOL: False
USE_GN: False
RPN:
ANCHOR_SIZES: (32, 64, 128, 256, 512)
ANCHOR_STRIDE: (16,)
ASPECT_RATIOS: (0.5, 1.0, 2.0)
BATCH_SIZE_PER_IMAGE: 256
BG_IOU_THRESHOLD: 0.3
FG_IOU_THRESHOLD: 0.7
FPN_POST_NMS_PER_BATCH: True
FPN_POST_NMS_TOP_N_TEST: 2000
FPN_POST_NMS_TOP_N_TRAIN: 2000
MIN_SIZE: 0
NMS_THRESH: 0.7
POSITIVE_FRACTION: 0.5
POST_NMS_TOP_N_TEST: 1000
POST_NMS_TOP_N_TRAIN: 2000
PRE_NMS_TOP_N_TEST: 6000
PRE_NMS_TOP_N_TRAIN: 12000
RPN_HEAD: SingleConvRPNHead
STRADDLE_THRESH: 0
USE_FPN: False
RPN_ONLY: True
WEIGHT: catalog://ImageNetPretrained/MSRA/R-50
OUTPUT_DIR: Log/evluation
PATHS_CATALOG: /workspace/wanghao/projects/STR-TDSL/maskrcnn_benchmark/config/paths_catalog.py
PROCESS:
NMS_THRESH: 0.4
PNMS: False
SOLVER:
BASE_LR: 0.001
BIAS_LR_FACTOR: 2
CHECKPOINT_PERIOD: 2500
GAMMA: 0.1
IMS_PER_BATCH: 16
MAX_ITER: 40000
MOMENTUM: 0.9
ONE_STAGE_HEAD_LR_FACTOR: 1.0
POLY_POWER: 0.9
SCHEDULER: multistep
STEPS: (30000,)
WARMUP_FACTOR: 0.3333333333333333
WARMUP_ITERS: 500
WARMUP_METHOD: linear
WEIGHT_DECAY: 0.0005
WEIGHT_DECAY_BIAS: 0
SYNCBN: False
TEST:
BBOX_AUG:
ENABLED: False
H_FLIP: False
MAX_SIZE: 4000
SCALES: ()
SCALE_H_FLIP: False
DETECTIONS_PER_IMG: 100
EXPECTED_RESULTS: []
EXPECTED_RESULTS_SIGMA_TOL: 4
IMS_PER_BATCH: 1
2023-06-27 17:55:55,377 maskrcnn_benchmark INFO: Collecting env info (might take some time)
2023-06-27 17:56:00,256 maskrcnn_benchmark INFO:
PyTorch version: 1.2.0
Is debug build: No
CUDA used to build PyTorch: 10.0.130

OS: Ubuntu 20.04.4 LTS
GCC version: (Ubuntu 7.5.0-6ubuntu2) 7.5.0
CMake version: Could not collect

Python version: 3.7
Is CUDA available: Yes
CUDA runtime version: 10.2.89
GPU models and configuration:
GPU 0: NVIDIA TITAN X (Pascal)
GPU 1: NVIDIA TITAN X (Pascal)
GPU 2: NVIDIA TITAN X (Pascal)
GPU 3: NVIDIA TITAN X (Pascal)

Nvidia driver version: 470.182.03
cuDNN version: /usr/lib/x86_64-linux-gnu/libcudnn.so.7.6.5

Versions of relevant libraries:
[pip3] numpy==1.18.5
[pip3] torch==1.2.0
[pip3] torchaudio==0.8.0
[pip3] torchvision==0.4.0
[conda] blas 1.0 mkl https://mirrors.tuna.tsinghua.edu.cn/anaconda/pkgs/main
[conda] mkl 2020.1 217 https://mirrors.tuna.tsinghua.edu.cn/anaconda/pkgs/main
[conda] mkl-service 2.3.0 py37he904b0f_0 https://mirrors.tuna.tsinghua.edu.cn/anaconda/pkgs/main
[conda] mkl_fft 1.1.0 py37h23d657b_0 https://mirrors.tuna.tsinghua.edu.cn/anaconda/pkgs/main
[conda] mkl_random 1.1.1 py37h0573a6f_0 https://mirrors.tuna.tsinghua.edu.cn/anaconda/pkgs/main
[conda] torch 1.2.0 pypi_0 pypi
[conda] torchaudio 0.8.0 pypi_0 pypi
[conda] torchvision 0.4.0 pypi_0 pypi
Pillow (7.2.0)
<class 'maskrcnn_benchmark.data.datasets.iiit.IIITDataset'>
2023-06-27 17:56:05,592 maskrcnn_benchmark.utils.checkpoint INFO: Loading checkpoint from model_rec_synth_ic17_7709.pth
2023-06-27 17:56:06,320 maskrcnn_benchmark.inference INFO: Start evaluation on iiit_test dataset(10000 images).
2023-06-27 18:15:12,922 maskrcnn_benchmark.inference INFO: Total run time: 0:19:06.601513 (0.1146601513147354 s / img per device, on 1 devices)
2023-06-27 18:15:12,922 maskrcnn_benchmark.inference INFO: Model inference time: 0:15:10.037243 (0.0910037243127823 s / img per device, on 1 devices)
2023-06-27 18:15:45,493 maskrcnn_benchmark.inference INFO: Evaluating bbox proposals
Model:model_rec_synth_ic17_7709.pth,mAP:0.7709092824651216,best mAP:0.7709092824651216
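
For anyone comparing their own log against the one above, a minimal sketch along these lines may help surface config or environment differences. It assumes both logs are plain-text dumps in the same format as above. The helper names (`normalize`, `diff_logs`) are hypothetical and not part of STR-TDSL or maskrcnn_benchmark.

```python
import difflib
import re

# Matches the leading "2023-06-27 17:55:55,377 " timestamp on logger lines,
# so that runs started at different times do not show up as differences.
TIMESTAMP = re.compile(r"^\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2},\d{3} ")


def normalize(log_text):
    """Strip timestamps and surrounding whitespace; drop blank lines."""
    lines = []
    for raw in log_text.splitlines():
        line = TIMESTAMP.sub("", raw).strip()
        if line:
            lines.append(line)
    return lines


def diff_logs(reference_text, candidate_text):
    """Return only the differing lines: '- ' from reference, '+ ' from candidate."""
    delta = difflib.ndiff(normalize(reference_text), normalize(candidate_text))
    return [d for d in delta if d.startswith(("- ", "+ "))]
```

Calling `diff_logs(open("reference_log.txt").read(), open("my_log.txt").read())` (both file names are placeholders) lists the lines that differ. Timing lines and a renamed package (for example mnbk instead of maskrcnn_benchmark) will also show up and can be ignored. Mismatches in entries such as `MIN_SIZE_TEST`, the PyTorch version, or the checkpoint file name are the ones worth checking first.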

Hello, please let me know once the problem is resolved. If it turns out to be a code issue, I will push an update.

Okay, thank you very much. I will set up an environment matching yours, run the test again, and report the result here once it finishes.

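I compared my setup against your configuration and retested, but the result is the same as last time. You could update the code and I will test again, since the problem may lie in the virtual environment I configured.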

Hello, I retested using a few workarounds, and the results now match those reported in your paper. Thank you very much for your help.