JosephKJ/OWOD

question about t1_test.yaml

PowderYu opened this issue · 7 comments

when I run
python tools/train_net.py --num-gpus 2 --eval-only --config-file ./configs/OWOD/t1/t1_test.yaml SOLVER.IMS_PER_BATCH 8 SOLVER.BASE_LR 0.005 OUTPUT_DIR "./output/t1_final" MODEL.WEIGHTS "/data/yu/code/OWOD-master/output/t1/model_final.pth"

It gave the following error:

Command Line Args: Namespace(config_file='./configs/OWOD/t1/t1_test.yaml', dist_url='tcp://127.0.0.1:50153', eval_only=True, machine_rank=0, num_gpus=2, num_machines=1, opts=['SOLVER.IMS_PER_BATCH', '8', 'SOLVER.BASE_LR', '0.005', 'OUTPUT_DIR', './output/t1_final', 'MODEL.WEIGHTS', '/data/yu/code/OWOD-master/output/t1/model_final.pth'], resume=False)
[W ProcessGroupNCCL.cpp:1569] Rank 0 using best-guess GPU 0 to perform barrier as devices used by this process are currently unknown. This can potentially cause a hang if this rank to GPU mapping is incorrect.Specify device_ids in barrier() to force use of a particular device.
[W ProcessGroupNCCL.cpp:1569] Rank 1 using best-guess GPU 1 to perform barrier as devices used by this process are currently unknown. This can potentially cause a hang if this rank to GPU mapping is incorrect.Specify device_ids in barrier() to force use of a particular device.
�[32m[09/14 19:42:57 detectron2]: �[0mRank of current process: 0. World size: 2
�[32m[09/14 19:42:59 detectron2]: �[0mEnvironment info:
----------------------  -------------------------------------------------------------------------------------
sys.platform            linux
Python                  3.7.0 (default, Oct  9 2018, 10:31:47) [GCC 7.3.0]
numpy                   1.21.2
detectron2              0.2.1 @/data/yu/code/OWOD-master/detectron2
Compiler                GCC 5.4
CUDA compiler           CUDA 11.1
detectron2 arch flags   8.6
DETECTRON2_ENV_MODULE   <not set>
PyTorch                 1.9.0+cu111 @/home/yuxiaoji/.conda/envs/OWOD/lib/python3.7/site-packages/torch
PyTorch debug build     False
GPU available           True
GPU 0,1,2,3             NVIDIA GeForce RTX 3090 (arch=8.6)
CUDA_HOME               /usr/local/cuda-11.1
Pillow                  8.3.1
torchvision             0.10.0+cu111 @/home/yuxiaoji/.conda/envs/OWOD/lib/python3.7/site-packages/torchvision
torchvision arch flags  3.5, 5.0, 6.0, 7.0, 7.5, 8.0, 8.6
fvcore                  0.1.1.dev200512
cv2                     4.5.3
----------------------  -------------------------------------------------------------------------------------
PyTorch built with:
  - GCC 7.3
  - C++ Version: 201402
  - Intel(R) Math Kernel Library Version 2020.0.0 Product Build 20191122 for Intel(R) 64 architecture applications
  - Intel(R) MKL-DNN v2.1.2 (Git Hash 98be7e8afa711dc9b66c8ff3504129cb82013cdb)
  - OpenMP 201511 (a.k.a. OpenMP 4.5)
  - NNPACK is enabled
  - CPU capability usage: AVX2
  - CUDA Runtime 11.1
  - NVCC architecture flags: -gencode;arch=compute_37,code=sm_37;-gencode;arch=compute_50,code=sm_50;-gencode;arch=compute_60,code=sm_60;-gencode;arch=compute_70,code=sm_70;-gencode;arch=compute_75,code=sm_75;-gencode;arch=compute_80,code=sm_80;-gencode;arch=compute_86,code=sm_86
  - CuDNN 8.0.5
  - Magma 2.5.2
  - Build settings: BLAS_INFO=mkl, BUILD_TYPE=Release, CUDA_VERSION=11.1, CUDNN_VERSION=8.0.5, CXX_COMPILER=/opt/rh/devtoolset-7/root/usr/bin/c++, CXX_FLAGS= -Wno-deprecated -fvisibility-inlines-hidden -DUSE_PTHREADPOOL -fopenmp -DNDEBUG -DUSE_KINETO -DUSE_FBGEMM -DUSE_QNNPACK -DUSE_PYTORCH_QNNPACK -DUSE_XNNPACK -DSYMBOLICATE_MOBILE_DEBUG_HANDLE -O2 -fPIC -Wno-narrowing -Wall -Wextra -Werror=return-type -Wno-missing-field-initializers -Wno-type-limits -Wno-array-bounds -Wno-unknown-pragmas -Wno-sign-compare -Wno-unused-parameter -Wno-unused-variable -Wno-unused-function -Wno-unused-result -Wno-unused-local-typedefs -Wno-strict-overflow -Wno-strict-aliasing -Wno-error=deprecated-declarations -Wno-stringop-overflow -Wno-psabi -Wno-error=pedantic -Wno-error=redundant-decls -Wno-error=old-style-cast -fdiagnostics-color=always -faligned-new -Wno-unused-but-set-variable -Wno-maybe-uninitialized -fno-math-errno -fno-trapping-math -Werror=format -Wno-stringop-overflow, LAPACK_INFO=mkl, PERF_WITH_AVX=1, PERF_WITH_AVX2=1, PERF_WITH_AVX512=1, TORCH_VERSION=1.9.0, USE_CUDA=ON, USE_CUDNN=ON, USE_EXCEPTION_PTR=1, USE_GFLAGS=OFF, USE_GLOG=OFF, USE_MKL=ON, USE_MKLDNN=ON, USE_MPI=OFF, USE_NCCL=ON, USE_NNPACK=ON, USE_OPENMP=ON, 

�[32m[09/14 19:42:59 detectron2]: �[0mCommand line arguments: Namespace(config_file='./configs/OWOD/t1/t1_test.yaml', dist_url='tcp://127.0.0.1:50153', eval_only=True, machine_rank=0, num_gpus=2, num_machines=1, opts=['SOLVER.IMS_PER_BATCH', '8', 'SOLVER.BASE_LR', '0.005', 'OUTPUT_DIR', './output/t1_final', 'MODEL.WEIGHTS', '/data/yu/code/OWOD-master/output/t1/model_final.pth'], resume=False)
�[32m[09/14 19:42:59 detectron2]: �[0mContents of args.config_file=./configs/OWOD/t1/t1_test.yaml:
_BASE_: "../../Base-RCNN-C4-OWOD.yaml"
MODEL:
  WEIGHTS: "/home/joseph/workspace/OWOD/output/t1_clustering_with_save/model_final.pth"
  ROI_HEADS:
    NMS_THRESH_TEST: 0.4
TEST:
  DETECTIONS_PER_IMAGE: 50
DATASETS:
  TRAIN: ('t1_voc_coco_2007_train', ) # t1_voc_coco_2007_train, t1_voc_coco_2007_ft
  TEST: ('voc_coco_2007_test', )   # voc_coco_2007_test
SOLVER:
  STEPS: (12000, 16000)
  MAX_ITER: 18000
  WARMUP_ITERS: 100
OUTPUT_DIR: "./output/temp_3"
OWOD:
  PREV_INTRODUCED_CLS: 0
  CUR_INTRODUCED_CLS: 20
�[32m[09/14 19:42:59 detectron2]: �[0mRunning with full config:
CUDNN_BENCHMARK: False
DATALOADER:
  ASPECT_RATIO_GROUPING: True
  FILTER_EMPTY_ANNOTATIONS: True
  NUM_WORKERS: 4
  REPEAT_THRESHOLD: 0.0
  SAMPLER_TRAIN: TrainingSampler
DATASETS:
  PRECOMPUTED_PROPOSAL_TOPK_TEST: 1000
  PRECOMPUTED_PROPOSAL_TOPK_TRAIN: 2000
  PROPOSAL_FILES_TEST: ()
  PROPOSAL_FILES_TRAIN: ()
  TEST: ('voc_coco_2007_test',)
  TRAIN: ('t1_voc_coco_2007_train',)
GLOBAL:
  HACK: 1.0
INPUT:
  CROP:
    ENABLED: False
    SIZE: [0.9, 0.9]
    TYPE: relative_range
  FORMAT: BGR
  MASK_FORMAT: polygon
  MAX_SIZE_TEST: 1333
  MAX_SIZE_TRAIN: 1333
  MIN_SIZE_TEST: 800
  MIN_SIZE_TRAIN: (480, 512, 544, 576, 608, 640, 672, 704, 736, 768, 800)
  MIN_SIZE_TRAIN_SAMPLING: choice
  RANDOM_FLIP: horizontal
MODEL:
  ANCHOR_GENERATOR:
    ANGLES: [[-90, 0, 90]]
    ASPECT_RATIOS: [[0.5, 1.0, 2.0]]
    NAME: DefaultAnchorGenerator
    OFFSET: 0.0
    SIZES: [[32, 64, 128, 256, 512]]
  BACKBONE:
    FREEZE_AT: 2
    NAME: build_resnet_backbone
  DEVICE: cuda
  FPN:
    FUSE_TYPE: sum
    IN_FEATURES: []
    NORM: 
    OUT_CHANNELS: 256
  KEYPOINT_ON: False
  LOAD_PROPOSALS: False
  MASK_ON: False
  META_ARCHITECTURE: GeneralizedRCNN
  PANOPTIC_FPN:
    COMBINE:
      ENABLED: True
      INSTANCES_CONFIDENCE_THRESH: 0.5
      OVERLAP_THRESH: 0.5
      STUFF_AREA_LIMIT: 4096
    INSTANCE_LOSS_WEIGHT: 1.0
  PIXEL_MEAN: [103.53, 116.28, 123.675]
  PIXEL_STD: [1.0, 1.0, 1.0]
  PROPOSAL_GENERATOR:
    MIN_SIZE: 0
    NAME: RPN
  RESNETS:
    DEFORM_MODULATED: False
    DEFORM_NUM_GROUPS: 1
    DEFORM_ON_PER_STAGE: [False, False, False, False]
    DEPTH: 50
    NORM: FrozenBN
    NUM_GROUPS: 1
    OUT_FEATURES: ['res4']
    RES2_OUT_CHANNELS: 256
    RES5_DILATION: 1
    STEM_OUT_CHANNELS: 64
    STRIDE_IN_1X1: True
    WIDTH_PER_GROUP: 64
  RETINANET:
    BBOX_REG_LOSS_TYPE: smooth_l1
    BBOX_REG_WEIGHTS: (1.0, 1.0, 1.0, 1.0)
    FOCAL_LOSS_ALPHA: 0.25
    FOCAL_LOSS_GAMMA: 2.0
    IN_FEATURES: ['p3', 'p4', 'p5', 'p6', 'p7']
    IOU_LABELS: [0, -1, 1]
    IOU_THRESHOLDS: [0.4, 0.5]
    NMS_THRESH_TEST: 0.5
    NORM: 
    NUM_CLASSES: 80
    NUM_CONVS: 4
    PRIOR_PROB: 0.01
    SCORE_THRESH_TEST: 0.05
    SMOOTH_L1_LOSS_BETA: 0.1
    TOPK_CANDIDATES_TEST: 1000
  ROI_BOX_CASCADE_HEAD:
    BBOX_REG_WEIGHTS: ((10.0, 10.0, 5.0, 5.0), (20.0, 20.0, 10.0, 10.0), (30.0, 30.0, 15.0, 15.0))
    IOUS: (0.5, 0.6, 0.7)
  ROI_BOX_HEAD:
    BBOX_REG_LOSS_TYPE: smooth_l1
    BBOX_REG_LOSS_WEIGHT: 1.0
    BBOX_REG_WEIGHTS: (10.0, 10.0, 5.0, 5.0)
    CLS_AGNOSTIC_BBOX_REG: False
    CONV_DIM: 256
    FC_DIM: 1024
    NAME: 
    NORM: 
    NUM_CONV: 0
    NUM_FC: 0
    POOLER_RESOLUTION: 14
    POOLER_SAMPLING_RATIO: 0
    POOLER_TYPE: ROIAlignV2
    SMOOTH_L1_BETA: 0.0
    TRAIN_ON_PRED_BOXES: False
  ROI_HEADS:
    BATCH_SIZE_PER_IMAGE: 512
    IN_FEATURES: ['res4']
    IOU_LABELS: [0, 1]
    IOU_THRESHOLDS: [0.5]
    NAME: Res5ROIHeads
    NMS_THRESH_TEST: 0.4
    NUM_CLASSES: 81
    POSITIVE_FRACTION: 0.25
    PROPOSAL_APPEND_GT: True
    SCORE_THRESH_TEST: 0.05
  ROI_KEYPOINT_HEAD:
    CONV_DIMS: (512, 512, 512, 512, 512, 512, 512, 512)
    LOSS_WEIGHT: 1.0
    MIN_KEYPOINTS_PER_IMAGE: 1
    NAME: KRCNNConvDeconvUpsampleHead
    NORMALIZE_LOSS_BY_VISIBLE_KEYPOINTS: True
    NUM_KEYPOINTS: 17
    POOLER_RESOLUTION: 14
    POOLER_SAMPLING_RATIO: 0
    POOLER_TYPE: ROIAlignV2
  ROI_MASK_HEAD:
    CLS_AGNOSTIC_MASK: False
    CONV_DIM: 256
    NAME: MaskRCNNConvUpsampleHead
    NORM: 
    NUM_CONV: 0
    POOLER_RESOLUTION: 14
    POOLER_SAMPLING_RATIO: 0
    POOLER_TYPE: ROIAlignV2
  RPN:
    BATCH_SIZE_PER_IMAGE: 256
    BBOX_REG_LOSS_TYPE: smooth_l1
    BBOX_REG_LOSS_WEIGHT: 1.0
    BBOX_REG_WEIGHTS: (1.0, 1.0, 1.0, 1.0)
    BOUNDARY_THRESH: -1
    HEAD_NAME: StandardRPNHead
    IN_FEATURES: ['res4']
    IOU_LABELS: [0, -1, 1]
    IOU_THRESHOLDS: [0.3, 0.7]
    LOSS_WEIGHT: 1.0
    NMS_THRESH: 0.7
    POSITIVE_FRACTION: 0.5
    POST_NMS_TOPK_TEST: 1000
    POST_NMS_TOPK_TRAIN: 2000
    PRE_NMS_TOPK_TEST: 6000
    PRE_NMS_TOPK_TRAIN: 12000
    SMOOTH_L1_BETA: 0.0
  SEM_SEG_HEAD:
    COMMON_STRIDE: 4
    CONVS_DIM: 128
    IGNORE_VALUE: 255
    IN_FEATURES: ['p2', 'p3', 'p4', 'p5']
    LOSS_WEIGHT: 1.0
    NAME: SemSegFPNHead
    NORM: GN
    NUM_CLASSES: 54
  WEIGHTS: /data/yu/code/OWOD-master/output/t1/model_final.pth
OUTPUT_DIR: ./output/t1_final
OWOD:
  CLUSTERING:
    ITEMS_PER_CLASS: 20
    MARGIN: 10.0
    MOMENTUM: 0.99
    START_ITER: 1000
    UPDATE_MU_ITER: 3000
    Z_DIMENSION: 128
  COMPUTE_ENERGY: False
  CUR_INTRODUCED_CLS: 20
  ENABLE_CLUSTERING: True
  ENABLE_THRESHOLD_AUTOLABEL_UNK: True
  ENABLE_UNCERTAINITY_AUTOLABEL_UNK: False
  ENERGY_SAVE_PATH: 
  FEATURE_STORE_SAVE_PATH: feature_store
  NUM_UNK_PER_IMAGE: 1
  PREV_INTRODUCED_CLS: 0
  SKIP_TRAINING_WHILE_EVAL: False
  TEMPERATURE: 1.5
SEED: -1
SOLVER:
  BASE_LR: 0.005
  BIAS_LR_FACTOR: 1.0
  CHECKPOINT_PERIOD: 5000
  CLIP_GRADIENTS:
    CLIP_TYPE: value
    CLIP_VALUE: 1.0
    ENABLED: False
    NORM_TYPE: 2.0
  GAMMA: 0.1
  IMS_PER_BATCH: 8
  LR_SCHEDULER_NAME: WarmupMultiStepLR
  MAX_ITER: 18000
  MOMENTUM: 0.9
  NESTEROV: False
  REFERENCE_WORLD_SIZE: 0
  STEPS: (12000, 16000)
  WARMUP_FACTOR: 0.001
  WARMUP_ITERS: 100
  WARMUP_METHOD: linear
  WEIGHT_DECAY: 0.0001
  WEIGHT_DECAY_BIAS: 0.0001
  WEIGHT_DECAY_NORM: 0.0
TEST:
  AUG:
    ENABLED: False
    FLIP: True
    MAX_SIZE: 4000
    MIN_SIZES: (400, 500, 600, 700, 800, 900, 1000, 1100, 1200)
  DETECTIONS_PER_IMAGE: 50
  EVAL_PERIOD: 0
  EXPECTED_RESULTS: []
  KEYPOINT_OKS_SIGMAS: []
  PRECISE_BN:
    ENABLED: False
    NUM_ITER: 200
VERSION: 2
VIS_PERIOD: 0
�[32m[09/14 19:42:59 detectron2]: �[0mFull config saved to ./output/t1_final/config.yaml
�[32m[09/14 19:42:59 d2.utils.env]: �[0mUsing a generated random seed 59387917
�[32m[09/14 19:42:59 d2.modeling.roi_heads.fast_rcnn]: �[0mInvalid class range: [20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36, 37, 38, 39, 40, 41, 42, 43, 44, 45, 46, 47, 48, 49, 50, 51, 52, 53, 54, 55, 56, 57, 58, 59, 60, 61, 62, 63, 64, 65, 66, 67, 68, 69, 70, 71, 72, 73, 74, 75, 76, 77, 78, 79]
�[32m[09/14 19:42:59 d2.modeling.roi_heads.fast_rcnn]: �[0mFeature store not found in ./output/t1_final/feature_store/feat.pt. Creating new feature store.
�[32m[09/14 19:42:59 d2.engine.defaults]: �[0mModel:
GeneralizedRCNN(
  (backbone): ResNet(
    (stem): BasicStem(
      (conv1): Conv2d(
        3, 64, kernel_size=(7, 7), stride=(2, 2), padding=(3, 3), bias=False
        (norm): FrozenBatchNorm2d(num_features=64, eps=1e-05)
      )
    )
    (res2): Sequential(
      (0): BottleneckBlock(
        (shortcut): Conv2d(
          64, 256, kernel_size=(1, 1), stride=(1, 1), bias=False
          (norm): FrozenBatchNorm2d(num_features=256, eps=1e-05)
        )
        (conv1): Conv2d(
          64, 64, kernel_size=(1, 1), stride=(1, 1), bias=False
          (norm): FrozenBatchNorm2d(num_features=64, eps=1e-05)
        )
        (conv2): Conv2d(
          64, 64, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False
          (norm): FrozenBatchNorm2d(num_features=64, eps=1e-05)
        )
        (conv3): Conv2d(
          64, 256, kernel_size=(1, 1), stride=(1, 1), bias=False
          (norm): FrozenBatchNorm2d(num_features=256, eps=1e-05)
        )
      )
      (1): BottleneckBlock(
        (conv1): Conv2d(
          256, 64, kernel_size=(1, 1), stride=(1, 1), bias=False
          (norm): FrozenBatchNorm2d(num_features=64, eps=1e-05)
        )
        (conv2): Conv2d(
          64, 64, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False
          (norm): FrozenBatchNorm2d(num_features=64, eps=1e-05)
        )
        (conv3): Conv2d(
          64, 256, kernel_size=(1, 1), stride=(1, 1), bias=False
          (norm): FrozenBatchNorm2d(num_features=256, eps=1e-05)
        )
      )
      (2): BottleneckBlock(
        (conv1): Conv2d(
          256, 64, kernel_size=(1, 1), stride=(1, 1), bias=False
          (norm): FrozenBatchNorm2d(num_features=64, eps=1e-05)
        )
        (conv2): Conv2d(
          64, 64, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False
          (norm): FrozenBatchNorm2d(num_features=64, eps=1e-05)
        )
        (conv3): Conv2d(
          64, 256, kernel_size=(1, 1), stride=(1, 1), bias=False
          (norm): FrozenBatchNorm2d(num_features=256, eps=1e-05)
        )
      )
    )
    (res3): Sequential(
      (0): BottleneckBlock(
        (shortcut): Conv2d(
          256, 512, kernel_size=(1, 1), stride=(2, 2), bias=False
          (norm): FrozenBatchNorm2d(num_features=512, eps=1e-05)
        )
        (conv1): Conv2d(
          256, 128, kernel_size=(1, 1), stride=(2, 2), bias=False
          (norm): FrozenBatchNorm2d(num_features=128, eps=1e-05)
        )
        (conv2): Conv2d(
          128, 128, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False
          (norm): FrozenBatchNorm2d(num_features=128, eps=1e-05)
        )
        (conv3): Conv2d(
          128, 512, kernel_size=(1, 1), stride=(1, 1), bias=False
          (norm): FrozenBatchNorm2d(num_features=512, eps=1e-05)
        )
      )
      (1): BottleneckBlock(
        (conv1): Conv2d(
          512, 128, kernel_size=(1, 1), stride=(1, 1), bias=False
          (norm): FrozenBatchNorm2d(num_features=128, eps=1e-05)
        )
        (conv2): Conv2d(
          128, 128, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False
          (norm): FrozenBatchNorm2d(num_features=128, eps=1e-05)
        )
        (conv3): Conv2d(
          128, 512, kernel_size=(1, 1), stride=(1, 1), bias=False
          (norm): FrozenBatchNorm2d(num_features=512, eps=1e-05)
        )
      )
      (2): BottleneckBlock(
        (conv1): Conv2d(
          512, 128, kernel_size=(1, 1), stride=(1, 1), bias=False
          (norm): FrozenBatchNorm2d(num_features=128, eps=1e-05)
        )
        (conv2): Conv2d(
          128, 128, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False
          (norm): FrozenBatchNorm2d(num_features=128, eps=1e-05)
        )
        (conv3): Conv2d(
          128, 512, kernel_size=(1, 1), stride=(1, 1), bias=False
          (norm): FrozenBatchNorm2d(num_features=512, eps=1e-05)
        )
      )
      (3): BottleneckBlock(
        (conv1): Conv2d(
          512, 128, kernel_size=(1, 1), stride=(1, 1), bias=False
          (norm): FrozenBatchNorm2d(num_features=128, eps=1e-05)
        )
        (conv2): Conv2d(
          128, 128, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False
          (norm): FrozenBatchNorm2d(num_features=128, eps=1e-05)
        )
        (conv3): Conv2d(
          128, 512, kernel_size=(1, 1), stride=(1, 1), bias=False
          (norm): FrozenBatchNorm2d(num_features=512, eps=1e-05)
        )
      )
    )
    (res4): Sequential(
      (0): BottleneckBlock(
        (shortcut): Conv2d(
          512, 1024, kernel_size=(1, 1), stride=(2, 2), bias=False
          (norm): FrozenBatchNorm2d(num_features=1024, eps=1e-05)
        )
        (conv1): Conv2d(
          512, 256, kernel_size=(1, 1), stride=(2, 2), bias=False
          (norm): FrozenBatchNorm2d(num_features=256, eps=1e-05)
        )
        (conv2): Conv2d(
          256, 256, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False
          (norm): FrozenBatchNorm2d(num_features=256, eps=1e-05)
        )
        (conv3): Conv2d(
          256, 1024, kernel_size=(1, 1), stride=(1, 1), bias=False
          (norm): FrozenBatchNorm2d(num_features=1024, eps=1e-05)
        )
      )
      (1): BottleneckBlock(
        (conv1): Conv2d(
          1024, 256, kernel_size=(1, 1), stride=(1, 1), bias=False
          (norm): FrozenBatchNorm2d(num_features=256, eps=1e-05)
        )
        (conv2): Conv2d(
          256, 256, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False
          (norm): FrozenBatchNorm2d(num_features=256, eps=1e-05)
        )
        (conv3): Conv2d(
          256, 1024, kernel_size=(1, 1), stride=(1, 1), bias=False
          (norm): FrozenBatchNorm2d(num_features=1024, eps=1e-05)
        )
      )
      (2): BottleneckBlock(
        (conv1): Conv2d(
          1024, 256, kernel_size=(1, 1), stride=(1, 1), bias=False
          (norm): FrozenBatchNorm2d(num_features=256, eps=1e-05)
        )
        (conv2): Conv2d(
          256, 256, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False
          (norm): FrozenBatchNorm2d(num_features=256, eps=1e-05)
        )
        (conv3): Conv2d(
          256, 1024, kernel_size=(1, 1), stride=(1, 1), bias=False
          (norm): FrozenBatchNorm2d(num_features=1024, eps=1e-05)
        )
      )
      (3): BottleneckBlock(
        (conv1): Conv2d(
          1024, 256, kernel_size=(1, 1), stride=(1, 1), bias=False
          (norm): FrozenBatchNorm2d(num_features=256, eps=1e-05)
        )
        (conv2): Conv2d(
          256, 256, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False
          (norm): FrozenBatchNorm2d(num_features=256, eps=1e-05)
        )
        (conv3): Conv2d(
          256, 1024, kernel_size=(1, 1), stride=(1, 1), bias=False
          (norm): FrozenBatchNorm2d(num_features=1024, eps=1e-05)
        )
      )
      (4): BottleneckBlock(
        (conv1): Conv2d(
          1024, 256, kernel_size=(1, 1), stride=(1, 1), bias=False
          (norm): FrozenBatchNorm2d(num_features=256, eps=1e-05)
        )
        (conv2): Conv2d(
          256, 256, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False
          (norm): FrozenBatchNorm2d(num_features=256, eps=1e-05)
        )
        (conv3): Conv2d(
          256, 1024, kernel_size=(1, 1), stride=(1, 1), bias=False
          (norm): FrozenBatchNorm2d(num_features=1024, eps=1e-05)
        )
      )
      (5): BottleneckBlock(
        (conv1): Conv2d(
          1024, 256, kernel_size=(1, 1), stride=(1, 1), bias=False
          (norm): FrozenBatchNorm2d(num_features=256, eps=1e-05)
        )
        (conv2): Conv2d(
          256, 256, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False
          (norm): FrozenBatchNorm2d(num_features=256, eps=1e-05)
        )
        (conv3): Conv2d(
          256, 1024, kernel_size=(1, 1), stride=(1, 1), bias=False
          (norm): FrozenBatchNorm2d(num_features=1024, eps=1e-05)
        )
      )
    )
  )
  (proposal_generator): RPN(
    (rpn_head): StandardRPNHead(
      (conv): Conv2d(1024, 1024, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
      (objectness_logits): Conv2d(1024, 15, kernel_size=(1, 1), stride=(1, 1))
      (anchor_deltas): Conv2d(1024, 60, kernel_size=(1, 1), stride=(1, 1))
    )
    (anchor_generator): DefaultAnchorGenerator(
      (cell_anchors): BufferList()
    )
  )
  (roi_heads): Res5ROIHeads(
    (pooler): ROIPooler(
      (level_poolers): ModuleList(
        (0): ROIAlign(output_size=(14, 14), spatial_scale=0.0625, sampling_ratio=0, aligned=True)
      )
    )
    (res5): Sequential(
      (0): BottleneckBlock(
        (shortcut): Conv2d(
          1024, 2048, kernel_size=(1, 1), stride=(2, 2), bias=False
          (norm): FrozenBatchNorm2d(num_features=2048, eps=1e-05)
        )
        (conv1): Conv2d(
          1024, 512, kernel_size=(1, 1), stride=(2, 2), bias=False
          (norm): FrozenBatchNorm2d(num_features=512, eps=1e-05)
        )
        (conv2): Conv2d(
          512, 512, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False
          (norm): FrozenBatchNorm2d(num_features=512, eps=1e-05)
        )
        (conv3): Conv2d(
          512, 2048, kernel_size=(1, 1), stride=(1, 1), bias=False
          (norm): FrozenBatchNorm2d(num_features=2048, eps=1e-05)
        )
      )
      (1): BottleneckBlock(
        (conv1): Conv2d(
          2048, 512, kernel_size=(1, 1), stride=(1, 1), bias=False
          (norm): FrozenBatchNorm2d(num_features=512, eps=1e-05)
        )
        (conv2): Conv2d(
          512, 512, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False
          (norm): FrozenBatchNorm2d(num_features=512, eps=1e-05)
        )
        (conv3): Conv2d(
          512, 2048, kernel_size=(1, 1), stride=(1, 1), bias=False
          (norm): FrozenBatchNorm2d(num_features=2048, eps=1e-05)
        )
      )
      (2): BottleneckBlock(
        (conv1): Conv2d(
          2048, 512, kernel_size=(1, 1), stride=(1, 1), bias=False
          (norm): FrozenBatchNorm2d(num_features=512, eps=1e-05)
        )
        (conv2): Conv2d(
          512, 512, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False
          (norm): FrozenBatchNorm2d(num_features=512, eps=1e-05)
        )
        (conv3): Conv2d(
          512, 2048, kernel_size=(1, 1), stride=(1, 1), bias=False
          (norm): FrozenBatchNorm2d(num_features=2048, eps=1e-05)
        )
      )
    )
    (box_predictor): FastRCNNOutputLayers(
      (cls_score): Linear(in_features=2048, out_features=82, bias=True)
      (bbox_pred): Linear(in_features=2048, out_features=324, bias=True)
      (hingeloss): HingeEmbeddingLoss()
    )
  )
)
�[32m[09/14 19:42:59 fvcore.common.checkpoint]: �[0mLoading checkpoint from /data/yu/code/OWOD-master/output/t1/model_final.pth
�[32m[09/14 19:43:01 d2.data.build]: �[0mKnown classes: range(0, 20)
�[32m[09/14 19:43:01 d2.data.build]: �[0mLabelling known instances the corresponding label, and unknown instances as unknown...
�[32m[09/14 19:43:01 d2.data.build]: �[0mDistribution of instances among all 81 categories:
�[36m|   category    | #instances   |   category    | #instances   |  category  | #instances   |
|:-------------:|:-------------|:-------------:|:-------------|:----------:|:-------------|
|   aeroplane   | 361          |    bicycle    | 700          |    bird    | 800          |
|     boat      | 607          |    bottle     | 2339         |    bus     | 429          |
|      car      | 3463         |      cat      | 522          |   chair    | 3996         |
|      cow      | 392          |  diningtable  | 1477         |    dog     | 697          |
|     horse     | 455          |   motorbike   | 587          |   person   | 18378        |
|  pottedplant  | 1043         |     sheep     | 387          |    sofa    | 686          |
|     train     | 385          |   tvmonitor   | 683          |   truck    | 0            |
| traffic light | 0            | fire hydrant  | 0            | stop sign  | 0            |
| parking meter | 0            |     bench     | 0            |  elephant  | 0            |
|     bear      | 0            |     zebra     | 0            |  giraffe   | 0            |
|   backpack    | 0            |   umbrella    | 0            |  handbag   | 0            |
|      tie      | 0            |   suitcase    | 0            | microwave  | 0            |
|     oven      | 0            |    toaster    | 0            |    sink    | 0            |
| refrigerator  | 0            |    frisbee    | 0            |    skis    | 0            |
|   snowboard   | 0            |  sports ball  | 0            |    kite    | 0            |
| baseball bat  | 0            | baseball gl.. | 0            | skateboard | 0            |
|   surfboard   | 0            | tennis racket | 0            |   banana   | 0            |
|     apple     | 0            |   sandwich    | 0            |   orange   | 0            |
|   broccoli    | 0            |    carrot     | 0            |  hot dog   | 0            |
|     pizza     | 0            |     donut     | 0            |    cake    | 0            |
|      bed      | 0            |    toilet     | 0            |   laptop   | 0            |
|     mouse     | 0            |    remote     | 0            |  keyboard  | 0            |
|  cell phone   | 0            |     book      | 0            |   clock    | 0            |
|     vase      | 0            |   scissors    | 0            | teddy bear | 0            |
|  hair drier   | 0            |  toothbrush   | 0            | wine glass | 0            |
|      cup      | 0            |     fork      | 0            |   knife    | 0            |
|     spoon     | 0            |     bowl      | 0            |  unknown   | 23320        |
|               |              |               |              |            |              |
|     total     | 61707        |               |              |            |              |�[0m
�[32m[09/14 19:43:01 d2.data.build]: �[0mNumber of datapoints: 10246
�[32m[09/14 19:43:01 d2.data.common]: �[0mSerializing 10246 elements to byte tensors and concatenating them all ...
�[32m[09/14 19:43:01 d2.data.common]: �[0mSerialized dataset takes 6.34 MiB
�[32m[09/14 19:43:01 d2.data.dataset_mapper]: �[0mAugmentations used in training: [ResizeShortestEdge(short_edge_length=(800, 800), max_size=1333, sample_style='choice')]
�[32m[09/14 19:43:01 d2.evaluation.pascal_voc_evaluation]: �[0mLoading energy distribution from ./output/t1_final/energy_dist_20.pkl
�[32m[09/14 19:43:01 d2.evaluation.evaluator]: �[0mStart inference on 5123 images
/home/yuxiaoji/.conda/envs/OWOD/lib/python3.7/site-packages/torch/nn/functional.py:718: UserWarning: Named tensors and all their associated APIs are an experimental feature and subject to change. Please do not use them for anything important until they are released as stable. (Triggered internally at  /pytorch/c10/core/TensorImpl.h:1156.)
  return torch.max_pool2d(input, kernel_size, stride, padding, dilation, ceil_mode)
/home/yuxiaoji/.conda/envs/OWOD/lib/python3.7/site-packages/torch/nn/functional.py:718: UserWarning: Named tensors and all their associated APIs are an experimental feature and subject to change. Please do not use them for anything important until they are released as stable. (Triggered internally at  /pytorch/c10/core/TensorImpl.h:1156.)
  return torch.max_pool2d(input, kernel_size, stride, padding, dilation, ceil_mode)
Traceback (most recent call last):
  File "tools/train_net.py", line 161, in <module>
    args=(args, ),
  File "/data/yu/code/OWOD-master/detectron2/engine/launch.py", line 59, in launch
    daemon=False,
  File "/home/yuxiaoji/.conda/envs/OWOD/lib/python3.7/site-packages/torch/multiprocessing/spawn.py", line 230, in spawn
    return start_processes(fn, args, nprocs, join, daemon, start_method='spawn')
  File "/home/yuxiaoji/.conda/envs/OWOD/lib/python3.7/site-packages/torch/multiprocessing/spawn.py", line 188, in start_processes
    while not context.join():
  File "/home/yuxiaoji/.conda/envs/OWOD/lib/python3.7/site-packages/torch/multiprocessing/spawn.py", line 150, in join
    raise ProcessRaisedException(msg, error_index, failed_process.pid)
torch.multiprocessing.spawn.ProcessRaisedException: 

-- Process 1 terminated with the following error:
Traceback (most recent call last):
  File "/home/yuxiaoji/.conda/envs/OWOD/lib/python3.7/site-packages/torch/multiprocessing/spawn.py", line 59, in _wrap
    fn(i, *args)
  File "/data/yu/code/OWOD-master/detectron2/engine/launch.py", line 94, in _distributed_worker
    main_func(*args)
  File "/data/yu/code/OWOD-master/tools/train_net.py", line 133, in main
    res = Trainer.test(cfg, model)
  File "/data/yu/code/OWOD-master/detectron2/engine/defaults.py", line 508, in test
    results_i = inference_on_dataset(model, data_loader, evaluator)
  File "/data/yu/code/OWOD-master/detectron2/evaluation/evaluator.py", line 145, in inference_on_dataset
    evaluator.process(inputs, outputs)
  File "/data/yu/code/OWOD-master/detectron2/evaluation/pascal_voc_evaluation.py", line 122, in process
    classes = self.update_label_based_on_energy(logits, classes)
  File "/data/yu/code/OWOD-master/detectron2/evaluation/pascal_voc_evaluation.py", line 103, in update_label_based_on_energy
    p_known = self.compute_prob(energy, self.known_dist)
  File "/data/yu/code/OWOD-master/detectron2/evaluation/pascal_voc_evaluation.py", line 88, in compute_prob
    pdf = distribution.log_prob(dx).exp()
  File "/home/yuxiaoji/.conda/envs/OWOD/lib/python3.7/site-packages/torch/distributions/transformed_distribution.py", line 149, in log_prob
    log_prob = log_prob + _sum_rightmost(self.base_dist.log_prob(y),
  File "/home/yuxiaoji/.conda/envs/OWOD/lib/python3.7/site-packages/torch/distributions/transformed_distribution.py", line 138, in log_prob
    self._validate_sample(value)
  File "/home/yuxiaoji/.conda/envs/OWOD/lib/python3.7/site-packages/torch/distributions/distribution.py", line 277, in _validate_sample
    raise ValueError('The value argument must be within the support')
ValueError: The value argument must be within the support

Very much looking forward to your reply~~thanks

I run run.sh instead of t1_test alone. In fact, this error occurs when the program in run.sh is executed to t1_test. I am curious why there is no such error when running t1_train and t1_val. The same problem exists in tasks 2, 3, and 4. I would like to know where is my problem? thanks

After parsing through the error, it seems that you are erroring out while trying to compute the likelihood. I can suggest two things:

  1. are you using reliability version 0.5.6? Considering that many are able to run the code without any trouble, I suspect this to be a setup problem.
  2. Can you try recreating the energy distribution parameters by running the validation step again?

Kindly reopen this if the issue persists. Thanks!

Hello. I am still having this issue even with reliability version 0.5.6.
I ran validation step several times, but still doesn't work :(

Hello. I am still having this issue even with reliability version 0.5.6. I ran validation step several times, but still doesn't work :(
I solved this problem by modifying the following function, just need to add a parameter.I think this is because of the pytorch version

def create_distribution(self, scale, shape, shift):
        wd = Weibull(scale=scale, concentration=shape, validate_args=False)
        transforms = AffineTransform(loc=shift, scale=1.)
        weibull = TransformedDistribution(wd, transforms, validate_args=False)
        return weibull

Hello. I am still having this issue even with reliability version 0.5.6. I ran validation step several times, but still doesn't work :(
I solved this problem by modifying the following function, just need to add a parameter.I think this is because of the pytorch version

def create_distribution(self, scale, shape, shift):
        wd = Weibull(scale=scale, concentration=shape, validate_args=False)
        transforms = AffineTransform(loc=shift, scale=1.)
        weibull = TransformedDistribution(wd, transforms, validate_args=False)
        return weibull

I can't fix the problem even though I add that parameter...Can this problem be related to the version of pytorch or detectron2? Mine are pytorch==1.8.0, detectron2==0.2.1. Thank you very much!

Hello. I am still having this issue even with reliability version 0.5.6. I ran validation step several times, but still doesn't work :(
I solved this problem by modifying the following function, just need to add a parameter.I think this is because of the pytorch version

def create_distribution(self, scale, shape, shift):
        wd = Weibull(scale=scale, concentration=shape, validate_args=False)
        transforms = AffineTransform(loc=shift, scale=1.)
        weibull = TransformedDistribution(wd, transforms, validate_args=False)
        return weibull

Thanks, that will work. Change the function in detectron2/evaluation/pascal_voc_evaluation/PascalVOCDetectionEvaluator.

Hello. I am still having this issue even with reliability version 0.5.6. I ran validation step several times, but still doesn't work :(
I solved this problem by modifying the following function, just need to add a parameter.I think this is because of the pytorch version

def create_distribution(self, scale, shape, shift):
        wd = Weibull(scale=scale, concentration=shape, validate_args=False)
        transforms = AffineTransform(loc=shift, scale=1.)
        weibull = TransformedDistribution(wd, transforms, validate_args=False)
        return weibull

Thank you, it works.
But could you example why you set this parameters?