JosephKJ/OWOD

Not able to detect unknowns

Jiyang-Zheng opened this issue · 3 comments

Hi,

I have used the command

python tools/train_net.py --num-gpus 4 --dist-url='tcp://127.0.0.1:52125' --resume --config-file ./configs/OWOD/t1/t1_train.yaml SOLVER.IMS_PER_BATCH 4 SOLVER.BASE_LR 0.01 OUTPUT_DIR "./output/t1"

to train the model, however, it seems like the model does not detect any unknown objects as

[32m[12/09 00:10:18 d2.data.build]: �[0mRemoved 0 images with no usable annotations. 16551 images left. �[32m[12/09 00:10:18 d2.data.build]: �[0mValid classes: range(0, 20) �[32m[12/09 00:10:18 d2.data.build]: �[0mRemoving earlier seen class objects and the unknown objects... �[32m[12/09 00:10:21 d2.data.build]: �[0mDistribution of instances among all 81 categories: �[36m| category | #instances | category | #instances | category | #instances | |:-------------:|:-------------|:-------------:|:-------------|:----------:|:-------------| | aeroplane | 1285 | bicycle | 1208 | bird | 1820 | | boat | 1397 | bottle | 2116 | bus | 909 | | car | 4008 | cat | 1616 | chair | 4338 | | cow | 1058 | diningtable | 1057 | dog | 2079 | | horse | 1156 | motorbike | 1141 | person | 15576 | | pottedplant | 1724 | sheep | 1347 | sofa | 1211 | | train | 984 | tvmonitor | 1193 | truck | 0 | | traffic light | 0 | fire hydrant | 0 | stop sign | 0 | | parking meter | 0 | bench | 0 | elephant | 0 | | bear | 0 | zebra | 0 | giraffe | 0 | | backpack | 0 | umbrella | 0 | handbag | 0 | | tie | 0 | suitcase | 0 | microwave | 0 | | oven | 0 | toaster | 0 | sink | 0 | | refrigerator | 0 | frisbee | 0 | skis | 0 | | snowboard | 0 | sports ball | 0 | kite | 0 | | baseball bat | 0 | baseball gl.. | 0 | skateboard | 0 | | surfboard | 0 | tennis racket | 0 | banana | 0 | | apple | 0 | sandwich | 0 | orange | 0 | | broccoli | 0 | carrot | 0 | hot dog | 0 | | pizza | 0 | donut | 0 | cake | 0 | | bed | 0 | toilet | 0 | laptop | 0 | | mouse | 0 | remote | 0 | keyboard | 0 | | cell phone | 0 | book | 0 | clock | 0 | | vase | 0 | scissors | 0 | teddy bear | 0 | | hair drier | 0 | toothbrush | 0 | wine glass | 0 | | cup | 0 | fork | 0 | knife | 0 | | spoon | 0 | bowl | 0 | unknown | 0 | | | | | | | | | total | 47223 | | | | |�[0m �[32m[12/09 00:10:21 d2.data.build]: �[0mNumber of datapoints: 16551 �[32m[12/09 00:10:21 d2.data.common]: �[0mSerializing 16551 elements to byte tensors and concatenating them all ...

Therefore, I have used 't1_test.yaml' with 'model_final.pth' from the backup directory and directly do the inference, the result still shows that zero unknown was detected.

Is there anything we need to modify in order to make the model to be able to detect unknown?

Thanks

Full log as followed

`[32m[12/08 23:02:05 detectron2]: �[0mEnvironment info:


sys.platform linux
Python 3.6.13 |Anaconda, Inc.| (default, Jun 4 2021, 14:25:59) [GCC 7.5.0]
numpy 1.19.5
detectron2 0.2.1 @/scratch1/zhe030/OWOD/detectron2
Compiler GCC 7.5
CUDA compiler not available
detectron2 arch flags /scratch1/zhe030/OWOD/detectron2/_C.cpython-36m-x86_64-linux-gnu.so
DETECTRON2_ENV_MODULE
PyTorch 1.8.0+cu111 @/home/zhe030/.local/lib/python3.6/site-packages/torch
PyTorch debug build False
GPU available True
GPU 0,1,2,3 Tesla P100-SXM2-16GB (arch=6.0)
CUDA_HOME /apps/cuda/11.1.1/
Pillow 8.4.0
torchvision 0.9.0+cu111 @/home/zhe030/.local/lib/python3.6/site-packages/torchvision
torchvision arch flags 3.5, 5.0, 6.0, 7.0, 7.5, 8.0, 8.6
fvcore 0.1.1.dev200512
cv2 4.5.4


PyTorch built with:

  • GCC 7.3
  • C++ Version: 201402
  • Intel(R) Math Kernel Library Version 2020.0.0 Product Build 20191122 for Intel(R) 64 architecture applications
  • Intel(R) MKL-DNN v1.7.0 (Git Hash 7aed236906b1f7a05c0917e5257a1af05e9ff683)
  • OpenMP 201511 (a.k.a. OpenMP 4.5)
  • NNPACK is enabled
  • CPU capability usage: AVX2
  • CUDA Runtime 11.1
  • NVCC architecture flags: -gencode;arch=compute_37,code=sm_37;-gencode;arch=compute_50,code=sm_50;-gencode;arch=compute_60,code=sm_60;-gencode;arch=compute_70,code=sm_70;-gencode;arch=compute_75,code=sm_75;-gencode;arch=compute_80,code=sm_80;-gencode;arch=compute_86,code=sm_86
  • CuDNN 8.0.5
  • Magma 2.5.2
  • Build settings: BLAS_INFO=mkl, BUILD_TYPE=Release, CUDA_VERSION=11.1, CUDNN_VERSION=8.0.5, CXX_COMPILER=/opt/rh/devtoolset-7/root/usr/bin/c++, CXX_FLAGS= -Wno-deprecated -fvisibility-inlines-hidden -DUSE_PTHREADPOOL -fopenmp -DNDEBUG -DUSE_KINETO -DUSE_FBGEMM -DUSE_QNNPACK -DUSE_PYTORCH_QNNPACK -DUSE_XNNPACK -O2 -fPIC -Wno-narrowing -Wall -Wextra -Werror=return-type -Wno-missing-field-initializers -Wno-type-limits -Wno-array-bounds -Wno-unknown-pragmas -Wno-sign-compare -Wno-unused-parameter -Wno-unused-variable -Wno-unused-function -Wno-unused-result -Wno-unused-local-typedefs -Wno-strict-overflow -Wno-strict-aliasing -Wno-error=deprecated-declarations -Wno-stringop-overflow -Wno-psabi -Wno-error=pedantic -Wno-error=redundant-decls -Wno-error=old-style-cast -fdiagnostics-color=always -faligned-new -Wno-unused-but-set-variable -Wno-maybe-uninitialized -fno-math-errno -fno-trapping-math -Werror=format -Wno-stringop-overflow, LAPACK_INFO=mkl, PERF_WITH_AVX=1, PERF_WITH_AVX2=1, PERF_WITH_AVX512=1, TORCH_VERSION=1.8.0, USE_CUDA=ON, USE_CUDNN=ON, USE_EXCEPTION_PTR=1, USE_GFLAGS=OFF, USE_GLOG=OFF, USE_MKL=ON, USE_MKLDNN=ON, USE_MPI=OFF, USE_NCCL=ON, USE_NNPACK=ON, USE_OPENMP=ON,

�[32m[12/08 23:02:05 detectron2]: �[0mCommand line arguments: Namespace(config_file='./configs/OWOD/t1/t1_train.yaml', dist_url='tcp://127.0.0.1:52125', eval_only=False, machine_rank=0, num_gpus=4, num_machines=1, opts=['SOLVER.IMS_PER_BATCH', '4', 'SOLVER.BASE_LR', '0.01', 'OUTPUT_DIR', './output/t1'], resume=True)
�[32m[12/08 23:02:05 detectron2]: �[0mContents of args.config_file=./configs/OWOD/t1/t1_train.yaml:
BASE: "../../Base-RCNN-C4-OWOD.yaml"
MODEL:
WEIGHTS: "detectron2://ImageNetPretrained/MSRA/R-50.pkl"
DATASETS:
TRAIN: ('t1_voc_coco_2007_train', ) # t1_voc_coco_2007_train, t1_voc_coco_2007_ft
TEST: ('voc_coco_2007_test', 't1_voc_coco_2007_known_test') # voc_coco_2007_test, t1_voc_coco_2007_test, t1_voc_coco_2007_val
SOLVER:
STEPS: (12000, 16000)
MAX_ITER: 18000
WARMUP_ITERS: 100
OUTPUT_DIR: "./output/t1"
OWOD:
PREV_INTRODUCED_CLS: 0
CUR_INTRODUCED_CLS: 20
�[32m[12/08 23:02:05 detectron2]: �[0mRunning with full config:
CUDNN_BENCHMARK: False
DATALOADER:
ASPECT_RATIO_GROUPING: True
FILTER_EMPTY_ANNOTATIONS: True
NUM_WORKERS: 4
REPEAT_THRESHOLD: 0.0
SAMPLER_TRAIN: TrainingSampler
DATASETS:
PRECOMPUTED_PROPOSAL_TOPK_TEST: 1000
PRECOMPUTED_PROPOSAL_TOPK_TRAIN: 2000
PROPOSAL_FILES_TEST: ()
PROPOSAL_FILES_TRAIN: ()
TEST: ('voc_coco_2007_test', 't1_voc_coco_2007_known_test')
TRAIN: ('t1_voc_coco_2007_train',)
GLOBAL:
HACK: 1.0
INPUT:
CROP:
ENABLED: False
SIZE: [0.9, 0.9]
TYPE: relative_range
FORMAT: BGR
MASK_FORMAT: polygon
MAX_SIZE_TEST: 1333
MAX_SIZE_TRAIN: 1333
MIN_SIZE_TEST: 800
MIN_SIZE_TRAIN: (480, 512, 544, 576, 608, 640, 672, 704, 736, 768, 800)
MIN_SIZE_TRAIN_SAMPLING: choice
RANDOM_FLIP: horizontal
MODEL:
ANCHOR_GENERATOR:
ANGLES: [[-90, 0, 90]]
ASPECT_RATIOS: [[0.5, 1.0, 2.0]]
NAME: DefaultAnchorGenerator
OFFSET: 0.0
SIZES: [[32, 64, 128, 256, 512]]
BACKBONE:
FREEZE_AT: 2
NAME: build_resnet_backbone
DEVICE: cuda
FPN:
FUSE_TYPE: sum
IN_FEATURES: []
NORM:
OUT_CHANNELS: 256
KEYPOINT_ON: False
LOAD_PROPOSALS: False
MASK_ON: False
META_ARCHITECTURE: GeneralizedRCNN
PANOPTIC_FPN:
COMBINE:
ENABLED: True
INSTANCES_CONFIDENCE_THRESH: 0.5
OVERLAP_THRESH: 0.5
STUFF_AREA_LIMIT: 4096
INSTANCE_LOSS_WEIGHT: 1.0
PIXEL_MEAN: [103.53, 116.28, 123.675]
PIXEL_STD: [1.0, 1.0, 1.0]
PROPOSAL_GENERATOR:
MIN_SIZE: 0
NAME: RPN
RESNETS:
DEFORM_MODULATED: False
DEFORM_NUM_GROUPS: 1
DEFORM_ON_PER_STAGE: [False, False, False, False]
DEPTH: 50
NORM: FrozenBN
NUM_GROUPS: 1
OUT_FEATURES: ['res4']
RES2_OUT_CHANNELS: 256
RES5_DILATION: 1
STEM_OUT_CHANNELS: 64
STRIDE_IN_1X1: True
WIDTH_PER_GROUP: 64
RETINANET:
BBOX_REG_LOSS_TYPE: smooth_l1
BBOX_REG_WEIGHTS: (1.0, 1.0, 1.0, 1.0)
FOCAL_LOSS_ALPHA: 0.25
FOCAL_LOSS_GAMMA: 2.0
IN_FEATURES: ['p3', 'p4', 'p5', 'p6', 'p7']
IOU_LABELS: [0, -1, 1]
IOU_THRESHOLDS: [0.4, 0.5]
NMS_THRESH_TEST: 0.5
NORM:
NUM_CLASSES: 80
NUM_CONVS: 4
PRIOR_PROB: 0.01
SCORE_THRESH_TEST: 0.05
SMOOTH_L1_LOSS_BETA: 0.1
TOPK_CANDIDATES_TEST: 1000
ROI_BOX_CASCADE_HEAD:
BBOX_REG_WEIGHTS: ((10.0, 10.0, 5.0, 5.0), (20.0, 20.0, 10.0, 10.0), (30.0, 30.0, 15.0, 15.0))
IOUS: (0.5, 0.6, 0.7)
ROI_BOX_HEAD:
BBOX_REG_LOSS_TYPE: smooth_l1
BBOX_REG_LOSS_WEIGHT: 1.0
BBOX_REG_WEIGHTS: (10.0, 10.0, 5.0, 5.0)
CLS_AGNOSTIC_BBOX_REG: False
CONV_DIM: 256
FC_DIM: 1024
NAME:
NORM:
NUM_CONV: 0
NUM_FC: 0
POOLER_RESOLUTION: 14
POOLER_SAMPLING_RATIO: 0
POOLER_TYPE: ROIAlignV2
SMOOTH_L1_BETA: 0.0
TRAIN_ON_PRED_BOXES: False
ROI_HEADS:
BATCH_SIZE_PER_IMAGE: 512
IN_FEATURES: ['res4']
IOU_LABELS: [0, 1]
IOU_THRESHOLDS: [0.5]
NAME: Res5ROIHeads
NMS_THRESH_TEST: 0.5
NUM_CLASSES: 81
POSITIVE_FRACTION: 0.25
PROPOSAL_APPEND_GT: True
SCORE_THRESH_TEST: 0.05
ROI_KEYPOINT_HEAD:
CONV_DIMS: (512, 512, 512, 512, 512, 512, 512, 512)
LOSS_WEIGHT: 1.0
MIN_KEYPOINTS_PER_IMAGE: 1
NAME: KRCNNConvDeconvUpsampleHead
NORMALIZE_LOSS_BY_VISIBLE_KEYPOINTS: True
NUM_KEYPOINTS: 17
POOLER_RESOLUTION: 14
POOLER_SAMPLING_RATIO: 0
POOLER_TYPE: ROIAlignV2
ROI_MASK_HEAD:
CLS_AGNOSTIC_MASK: False
CONV_DIM: 256
NAME: MaskRCNNConvUpsampleHead
NORM:
NUM_CONV: 0
POOLER_RESOLUTION: 14
POOLER_SAMPLING_RATIO: 0
POOLER_TYPE: ROIAlignV2
RPN:
BATCH_SIZE_PER_IMAGE: 256
BBOX_REG_LOSS_TYPE: smooth_l1
BBOX_REG_LOSS_WEIGHT: 1.0
BBOX_REG_WEIGHTS: (1.0, 1.0, 1.0, 1.0)
BOUNDARY_THRESH: -1
HEAD_NAME: StandardRPNHead
IN_FEATURES: ['res4']
IOU_LABELS: [0, -1, 1]
IOU_THRESHOLDS: [0.3, 0.7]
LOSS_WEIGHT: 1.0
NMS_THRESH: 0.7
POSITIVE_FRACTION: 0.5
POST_NMS_TOPK_TEST: 1000
POST_NMS_TOPK_TRAIN: 2000
PRE_NMS_TOPK_TEST: 6000
PRE_NMS_TOPK_TRAIN: 12000
SMOOTH_L1_BETA: 0.0
SEM_SEG_HEAD:
COMMON_STRIDE: 4
CONVS_DIM: 128
IGNORE_VALUE: 255
IN_FEATURES: ['p2', 'p3', 'p4', 'p5']
LOSS_WEIGHT: 1.0
NAME: SemSegFPNHead
NORM: GN
NUM_CLASSES: 54
WEIGHTS: detectron2://ImageNetPretrained/MSRA/R-50.pkl
OUTPUT_DIR: ./output/t1
OWOD:
CLUSTERING:
ITEMS_PER_CLASS: 20
MARGIN: 10.0
MOMENTUM: 0.99
START_ITER: 1000
UPDATE_MU_ITER: 3000
Z_DIMENSION: 128
COMPUTE_ENERGY: False
CUR_INTRODUCED_CLS: 20
ENABLE_CLUSTERING: True
ENABLE_THRESHOLD_AUTOLABEL_UNK: True
ENABLE_UNCERTAINITY_AUTOLABEL_UNK: False
ENERGY_SAVE_PATH:
FEATURE_STORE_SAVE_PATH: feature_store
NUM_UNK_PER_IMAGE: 1
PREV_INTRODUCED_CLS: 0
SKIP_TRAINING_WHILE_EVAL: False
TEMPERATURE: 1.5
SEED: -1
SOLVER:
BASE_LR: 0.01
BIAS_LR_FACTOR: 1.0
CHECKPOINT_PERIOD: 5000
CLIP_GRADIENTS:
CLIP_TYPE: value
CLIP_VALUE: 1.0
ENABLED: False
NORM_TYPE: 2.0
GAMMA: 0.1
IMS_PER_BATCH: 4
LR_SCHEDULER_NAME: WarmupMultiStepLR
MAX_ITER: 18000
MOMENTUM: 0.9
NESTEROV: False
REFERENCE_WORLD_SIZE: 0
STEPS: (12000, 16000)
WARMUP_FACTOR: 0.001
WARMUP_ITERS: 100
WARMUP_METHOD: linear
WEIGHT_DECAY: 0.0001
WEIGHT_DECAY_BIAS: 0.0001
WEIGHT_DECAY_NORM: 0.0
TEST:
AUG:
ENABLED: False
FLIP: True
MAX_SIZE: 4000
MIN_SIZES: (400, 500, 600, 700, 800, 900, 1000, 1100, 1200)
DETECTIONS_PER_IMAGE: 100
EVAL_PERIOD: 0
EXPECTED_RESULTS: []
KEYPOINT_OKS_SIGMAS: []
PRECISE_BN:
ENABLED: False
NUM_ITER: 200
VERSION: 2
VIS_PERIOD: 0
�[32m[12/08 23:02:05 detectron2]: �[0mFull config saved to ./output/t1/config.yaml
�[32m[12/08 23:02:05 d2.utils.env]: �[0mUsing a generated random seed 5295504
�[32m[12/08 23:02:07 d2.modeling.roi_heads.fast_rcnn]: �[0mInvalid class range: [20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36, 37, 38, 39, 40, 41, 42, 43, 44, 45, 46, 47, 48, 49, 50, 51, 52, 53, 54, 55, 56, 57, 58, 59, 60, 61, 62, 63, 64, 65, 66, 67, 68, 69, 70, 71, 72, 73, 74, 75, 76, 77, 78, 79]
�[32m[12/08 23:02:07 d2.modeling.roi_heads.fast_rcnn]: �[0mFeature store not found in ./output/t1/feature_store/feat.pt. Creating new feature store.
�[32m[12/08 23:02:07 d2.engine.defaults]: �[0mModel:
GeneralizedRCNN(
(backbone): ResNet(
(stem): BasicStem(
(conv1): Conv2d(
3, 64, kernel_size=(7, 7), stride=(2, 2), padding=(3, 3), bias=False
(norm): FrozenBatchNorm2d(num_features=64, eps=1e-05)
)
)
(res2): Sequential(
(0): BottleneckBlock(
(shortcut): Conv2d(
64, 256, kernel_size=(1, 1), stride=(1, 1), bias=False
(norm): FrozenBatchNorm2d(num_features=256, eps=1e-05)
)
(conv1): Conv2d(
64, 64, kernel_size=(1, 1), stride=(1, 1), bias=False
(norm): FrozenBatchNorm2d(num_features=64, eps=1e-05)
)
(conv2): Conv2d(
64, 64, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False
(norm): FrozenBatchNorm2d(num_features=64, eps=1e-05)
)
(conv3): Conv2d(
64, 256, kernel_size=(1, 1), stride=(1, 1), bias=False
(norm): FrozenBatchNorm2d(num_features=256, eps=1e-05)
)
)
(1): BottleneckBlock(
(conv1): Conv2d(
256, 64, kernel_size=(1, 1), stride=(1, 1), bias=False
(norm): FrozenBatchNorm2d(num_features=64, eps=1e-05)
)
(conv2): Conv2d(
64, 64, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False
(norm): FrozenBatchNorm2d(num_features=64, eps=1e-05)
)
(conv3): Conv2d(
64, 256, kernel_size=(1, 1), stride=(1, 1), bias=False
(norm): FrozenBatchNorm2d(num_features=256, eps=1e-05)
)
)
(2): BottleneckBlock(
(conv1): Conv2d(
256, 64, kernel_size=(1, 1), stride=(1, 1), bias=False
(norm): FrozenBatchNorm2d(num_features=64, eps=1e-05)
)
(conv2): Conv2d(
64, 64, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False
(norm): FrozenBatchNorm2d(num_features=64, eps=1e-05)
)
(conv3): Conv2d(
64, 256, kernel_size=(1, 1), stride=(1, 1), bias=False
(norm): FrozenBatchNorm2d(num_features=256, eps=1e-05)
)
)
)
(res3): Sequential(
(0): BottleneckBlock(
(shortcut): Conv2d(
256, 512, kernel_size=(1, 1), stride=(2, 2), bias=False
(norm): FrozenBatchNorm2d(num_features=512, eps=1e-05)
)
(conv1): Conv2d(
256, 128, kernel_size=(1, 1), stride=(2, 2), bias=False
(norm): FrozenBatchNorm2d(num_features=128, eps=1e-05)
)
(conv2): Conv2d(
128, 128, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False
(norm): FrozenBatchNorm2d(num_features=128, eps=1e-05)
)
(conv3): Conv2d(
128, 512, kernel_size=(1, 1), stride=(1, 1), bias=False
(norm): FrozenBatchNorm2d(num_features=512, eps=1e-05)
)
)
(1): BottleneckBlock(
(conv1): Conv2d(
512, 128, kernel_size=(1, 1), stride=(1, 1), bias=False
(norm): FrozenBatchNorm2d(num_features=128, eps=1e-05)
)
(conv2): Conv2d(
128, 128, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False
(norm): FrozenBatchNorm2d(num_features=128, eps=1e-05)
)
(conv3): Conv2d(
128, 512, kernel_size=(1, 1), stride=(1, 1), bias=False
(norm): FrozenBatchNorm2d(num_features=512, eps=1e-05)
)
)
(2): BottleneckBlock(
(conv1): Conv2d(
512, 128, kernel_size=(1, 1), stride=(1, 1), bias=False
(norm): FrozenBatchNorm2d(num_features=128, eps=1e-05)
)
(conv2): Conv2d(
128, 128, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False
(norm): FrozenBatchNorm2d(num_features=128, eps=1e-05)
)
(conv3): Conv2d(
128, 512, kernel_size=(1, 1), stride=(1, 1), bias=False
(norm): FrozenBatchNorm2d(num_features=512, eps=1e-05)
)
)
(3): BottleneckBlock(
(conv1): Conv2d(
512, 128, kernel_size=(1, 1), stride=(1, 1), bias=False
(norm): FrozenBatchNorm2d(num_features=128, eps=1e-05)
)
(conv2): Conv2d(
128, 128, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False
(norm): FrozenBatchNorm2d(num_features=128, eps=1e-05)
)
(conv3): Conv2d(
128, 512, kernel_size=(1, 1), stride=(1, 1), bias=False
(norm): FrozenBatchNorm2d(num_features=512, eps=1e-05)
)
)
)
(res4): Sequential(
(0): BottleneckBlock(
(shortcut): Conv2d(
512, 1024, kernel_size=(1, 1), stride=(2, 2), bias=False
(norm): FrozenBatchNorm2d(num_features=1024, eps=1e-05)
)
(conv1): Conv2d(
512, 256, kernel_size=(1, 1), stride=(2, 2), bias=False
(norm): FrozenBatchNorm2d(num_features=256, eps=1e-05)
)
(conv2): Conv2d(
256, 256, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False
(norm): FrozenBatchNorm2d(num_features=256, eps=1e-05)
)
(conv3): Conv2d(
256, 1024, kernel_size=(1, 1), stride=(1, 1), bias=False
(norm): FrozenBatchNorm2d(num_features=1024, eps=1e-05)
)
)
(1): BottleneckBlock(
(conv1): Conv2d(
1024, 256, kernel_size=(1, 1), stride=(1, 1), bias=False
(norm): FrozenBatchNorm2d(num_features=256, eps=1e-05)
)
(conv2): Conv2d(
256, 256, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False
(norm): FrozenBatchNorm2d(num_features=256, eps=1e-05)
)
(conv3): Conv2d(
256, 1024, kernel_size=(1, 1), stride=(1, 1), bias=False
(norm): FrozenBatchNorm2d(num_features=1024, eps=1e-05)
)
)
(2): BottleneckBlock(
(conv1): Conv2d(
1024, 256, kernel_size=(1, 1), stride=(1, 1), bias=False
(norm): FrozenBatchNorm2d(num_features=256, eps=1e-05)
)
(conv2): Conv2d(
256, 256, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False
(norm): FrozenBatchNorm2d(num_features=256, eps=1e-05)
)
(conv3): Conv2d(
256, 1024, kernel_size=(1, 1), stride=(1, 1), bias=False
(norm): FrozenBatchNorm2d(num_features=1024, eps=1e-05)
)
)
(3): BottleneckBlock(
(conv1): Conv2d(
1024, 256, kernel_size=(1, 1), stride=(1, 1), bias=False
(norm): FrozenBatchNorm2d(num_features=256, eps=1e-05)
)
(conv2): Conv2d(
256, 256, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False
(norm): FrozenBatchNorm2d(num_features=256, eps=1e-05)
)
(conv3): Conv2d(
256, 1024, kernel_size=(1, 1), stride=(1, 1), bias=False
(norm): FrozenBatchNorm2d(num_features=1024, eps=1e-05)
)
)
(4): BottleneckBlock(
(conv1): Conv2d(
1024, 256, kernel_size=(1, 1), stride=(1, 1), bias=False
(norm): FrozenBatchNorm2d(num_features=256, eps=1e-05)
)
(conv2): Conv2d(
256, 256, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False
(norm): FrozenBatchNorm2d(num_features=256, eps=1e-05)
)
(conv3): Conv2d(
256, 1024, kernel_size=(1, 1), stride=(1, 1), bias=False
(norm): FrozenBatchNorm2d(num_features=1024, eps=1e-05)
)
)
(5): BottleneckBlock(
(conv1): Conv2d(
1024, 256, kernel_size=(1, 1), stride=(1, 1), bias=False
(norm): FrozenBatchNorm2d(num_features=256, eps=1e-05)
)
(conv2): Conv2d(
256, 256, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False
(norm): FrozenBatchNorm2d(num_features=256, eps=1e-05)
)
(conv3): Conv2d(
256, 1024, kernel_size=(1, 1), stride=(1, 1), bias=False
(norm): FrozenBatchNorm2d(num_features=1024, eps=1e-05)
)
)
)
)
(proposal_generator): RPN(
(rpn_head): StandardRPNHead(
(conv): Conv2d(1024, 1024, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
(objectness_logits): Conv2d(1024, 15, kernel_size=(1, 1), stride=(1, 1))
(anchor_deltas): Conv2d(1024, 60, kernel_size=(1, 1), stride=(1, 1))
)
(anchor_generator): DefaultAnchorGenerator(
(cell_anchors): BufferList()
)
)
(roi_heads): Res5ROIHeads(
(pooler): ROIPooler(
(level_poolers): ModuleList(
(0): ROIAlign(output_size=(14, 14), spatial_scale=0.0625, sampling_ratio=0, aligned=True)
)
)
(res5): Sequential(
(0): BottleneckBlock(
(shortcut): Conv2d(
1024, 2048, kernel_size=(1, 1), stride=(2, 2), bias=False
(norm): FrozenBatchNorm2d(num_features=2048, eps=1e-05)
)
(conv1): Conv2d(
1024, 512, kernel_size=(1, 1), stride=(2, 2), bias=False
(norm): FrozenBatchNorm2d(num_features=512, eps=1e-05)
)
(conv2): Conv2d(
512, 512, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False
(norm): FrozenBatchNorm2d(num_features=512, eps=1e-05)
)
(conv3): Conv2d(
512, 2048, kernel_size=(1, 1), stride=(1, 1), bias=False
(norm): FrozenBatchNorm2d(num_features=2048, eps=1e-05)
)
)
(1): BottleneckBlock(
(conv1): Conv2d(
2048, 512, kernel_size=(1, 1), stride=(1, 1), bias=False
(norm): FrozenBatchNorm2d(num_features=512, eps=1e-05)
)
(conv2): Conv2d(
512, 512, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False
(norm): FrozenBatchNorm2d(num_features=512, eps=1e-05)
)
(conv3): Conv2d(
512, 2048, kernel_size=(1, 1), stride=(1, 1), bias=False
(norm): FrozenBatchNorm2d(num_features=2048, eps=1e-05)
)
)
(2): BottleneckBlock(
(conv1): Conv2d(
2048, 512, kernel_size=(1, 1), stride=(1, 1), bias=False
(norm): FrozenBatchNorm2d(num_features=512, eps=1e-05)
)
(conv2): Conv2d(
512, 512, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False
(norm): FrozenBatchNorm2d(num_features=512, eps=1e-05)
)
(conv3): Conv2d(
512, 2048, kernel_size=(1, 1), stride=(1, 1), bias=False
(norm): FrozenBatchNorm2d(num_features=2048, eps=1e-05)
)
)
)
(box_predictor): FastRCNNOutputLayers(
(cls_score): Linear(in_features=2048, out_features=82, bias=True)
(bbox_pred): Linear(in_features=2048, out_features=324, bias=True)
(hingeloss): HingeEmbeddingLoss()
)
)
)
�[32m[12/08 23:02:32 d2.data.build]: �[0mRemoved 0 images with no usable annotations. 16551 images left.
�[32m[12/08 23:02:32 d2.data.build]: �[0mValid classes: range(0, 20)
�[32m[12/08 23:02:32 d2.data.build]: �[0mRemoving earlier seen class objects and the unknown objects...
�[32m[12/08 23:02:34 d2.data.build]: �[0mDistribution of instances among all 81 categories:
�[36m| category | #instances | category | #instances | category | #instances |
|:-------------:|:-------------|:-------------:|:-------------|:----------:|:-------------|
| aeroplane | 1285 | bicycle | 1208 | bird | 1820 |
| boat | 1397 | bottle | 2116 | bus | 909 |
| car | 4008 | cat | 1616 | chair | 4338 |
| cow | 1058 | diningtable | 1057 | dog | 2079 |
| horse | 1156 | motorbike | 1141 | person | 15576 |
| pottedplant | 1724 | sheep | 1347 | sofa | 1211 |
| train | 984 | tvmonitor | 1193 | truck | 0 |
| traffic light | 0 | fire hydrant | 0 | stop sign | 0 |
| parking meter | 0 | bench | 0 | elephant | 0 |
| bear | 0 | zebra | 0 | giraffe | 0 |
| backpack | 0 | umbrella | 0 | handbag | 0 |
| tie | 0 | suitcase | 0 | microwave | 0 |
| oven | 0 | toaster | 0 | sink | 0 |
| refrigerator | 0 | frisbee | 0 | skis | 0 |
| snowboard | 0 | sports ball | 0 | kite | 0 |
| baseball bat | 0 | baseball gl.. | 0 | skateboard | 0 |
| surfboard | 0 | tennis racket | 0 | banana | 0 |
| apple | 0 | sandwich | 0 | orange | 0 |
| broccoli | 0 | carrot | 0 | hot dog | 0 |
| pizza | 0 | donut | 0 | cake | 0 |
| bed | 0 | toilet | 0 | laptop | 0 |
| mouse | 0 | remote | 0 | keyboard | 0 |
| cell phone | 0 | book | 0 | clock | 0 |
| vase | 0 | scissors | 0 | teddy bear | 0 |
| hair drier | 0 | toothbrush | 0 | wine glass | 0 |
| cup | 0 | fork | 0 | knife | 0 |
| spoon | 0 | bowl | 0 | unknown | 0 |
| | | | | | |
| total | 47223 | | | | |�[0m
�[32m[12/08 23:02:34 d2.data.build]: �[0mNumber of datapoints: 16551
�[32m[12/08 23:02:34 d2.data.common]: �[0mSerializing 16551 elements to byte tensors and concatenating them all ...
�[32m[12/08 23:02:36 d2.data.common]: �[0mSerialized dataset takes 7.57 MiB
�[32m[12/08 23:02:36 d2.data.dataset_mapper]: �[0mAugmentations used in training: [ResizeShortestEdge(short_edge_length=(480, 512, 544, 576, 608, 640, 672, 704, 736, 768, 800), max_size=1333, sample_style='choice'), RandomFlip()]
�[32m[12/08 23:02:36 d2.data.build]: �[0mUsing training sampler TrainingSampler
�[32m[12/08 23:02:36 fvcore.common.checkpoint]: �[0mLoading checkpoint from detectron2://ImageNetPretrained/MSRA/R-50.pkl `

Have you solved it?

Yes. It turns out that the final weights were stored in 't1', 'tx_ft' .etc. The naming of the folders causes confusion.