Training with custom dataset
manhcntt21 commented
Hi, I am training on a custom dataset, but the run hangs here.
I searched on Google but still can't fix it.
On some other runs it fails with an error like this:
[WinError 1455] The paging file is too small for this operation to complete. Error loading "D:\CAIDATPHANMEM\miniconda3\envs\uniformer\lib\site-packages\torch\lib\cudnn_adv_infer64_8.dll" or one of its dependencies.
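A possible cause, offered as an assumption rather than a confirmed diagnosis: on Windows each DataLoader worker is a separately spawned process that imports torch and loads its own copy of the CUDA/cuDNN DLLs, so `NUM_WORKERS: 8` in the config logged below may exhaust the paging file. Two things commonly tried are enlarging the Windows paging file and lowering the worker count. The sketch below illustrates the second option in plain Python; `pick_num_workers` and its `windows_cap` parameter are hypothetical names, not part of the UniFormer code.

```python
import sys


def pick_num_workers(requested: int, platform: str = sys.platform,
                     windows_cap: int = 0) -> int:
    """Return a DataLoader worker count that is conservative on Windows.

    Hypothetical helper: on Windows every worker is a spawned process that
    re-imports torch, so fewer workers means fewer DLL copies that need
    paging-file space. Other platforms keep the requested value.
    """
    if requested < 0:
        raise ValueError("requested must be non-negative")
    if platform.startswith("win"):
        return min(requested, windows_cap)
    return requested


# Mirror of the relevant part of the logged config (NUM_WORKERS is 8 there).
cfg = {"DATA_LOADER": {"NUM_WORKERS": 8, "PIN_MEMORY": True}}
cfg["DATA_LOADER"]["NUM_WORKERS"] = pick_num_workers(
    cfg["DATA_LOADER"]["NUM_WORKERS"]
)
print(cfg["DATA_LOADER"]["NUM_WORKERS"])
```

If the hang and the error disappear with zero workers, the count can be raised step by step (for example `windows_cap=2`) until the problem returns; fewer workers will make data loading slower.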
This is my full log:
[11/03 14:42:23][INFO] train_net.py: 408: Train with config:
[11/03 14:42:23][INFO] train_net.py: 409: {'AUG': {'AA_TYPE': 'rand-m7-n4-mstd0.5-inc1',
'COLOR_JITTER': 0.4,
'ENABLE': True,
'INTERPOLATION': 'bicubic',
'NUM_SAMPLE': 2,
'RE_COUNT': 1,
'RE_MODE': 'pixel',
'RE_PROB': 0.25,
'RE_SPLIT': False},
'AVA': {'ANNOTATION_DIR': '/mnt/vol/gfsai-flash3-east/ai-group/users/haoqifan/ava/frame_list/',
'BGR': False,
'DETECTION_SCORE_THRESH': 0.9,
'EXCLUSION_FILE': 'ava_val_excluded_timestamps_v2.2.csv',
'FRAME_DIR': '/mnt/fair-flash3-east/ava_trainval_frames.img/',
'FRAME_LIST_DIR': '/mnt/vol/gfsai-flash3-east/ai-group/users/haoqifan/ava/frame_list/',
'FULL_TEST_ON_VAL': False,
'GROUNDTRUTH_FILE': 'ava_val_v2.2.csv',
'IMG_PROC_BACKEND': 'cv2',
'LABEL_MAP_FILE': 'ava_action_list_v2.2_for_activitynet_2019.pbtxt',
'TEST_FORCE_FLIP': False,
'TEST_LISTS': ['val.csv'],
'TEST_PREDICT_BOX_LISTS': ['ava_val_predicted_boxes.csv'],
'TRAIN_GT_BOX_LISTS': ['ava_train_v2.2.csv'],
'TRAIN_LISTS': ['train.csv'],
'TRAIN_PCA_JITTER_ONLY': True,
'TRAIN_PREDICT_BOX_LISTS': [],
'TRAIN_USE_COLOR_AUGMENTATION': False},
'BENCHMARK': CfgNode({'NUM_EPOCHS': 5, 'LOG_PERIOD': 100, 'SHUFFLE': True}),
'BN': {'NORM_TYPE': 'batchnorm',
'NUM_BATCHES_PRECISE': 200,
'NUM_SPLITS': 1,
'NUM_SYNC_DEVICES': 1,
'USE_PRECISE_STATS': False,
'WEIGHT_DECAY': 0.0},
'DATA': {'DECODING_BACKEND': 'decord',
'ENSEMBLE_METHOD': 'sum',
'IMAGE_TEMPLATE': '{:05d}.jpg',
'INPUT_CHANNEL_NUM': [3],
'INV_UNIFORM_SAMPLE': False,
'LABEL_PATH_TEMPLATE': 'somesomev1_rgb_{}_split.txt',
'MEAN': [0.45, 0.45, 0.45],
'MULTI_LABEL': False,
'NUM_FRAMES': 8,
'PATH_LABEL_SEPARATOR': ',',
'PATH_PREFIX': '',
'PATH_TO_DATA_DIR': 'E:/master/datasets/uniformer/data_vid',
'PATH_TO_PRELOAD_IMDB': '',
'RANDOM_FLIP': True,
'REVERSE_INPUT_CHANNEL': False,
'SAMPLING_RATE': 8,
'STD': [0.225, 0.225, 0.225],
'TARGET_FPS': 30,
'TEST_CROP_SIZE': 224,
'TRAIN_CROP_SIZE': 224,
'TRAIN_JITTER_ASPECT_RELATIVE': [0.75, 1.3333],
'TRAIN_JITTER_MOTION_SHIFT': False,
'TRAIN_JITTER_SCALES': [256, 320],
'TRAIN_JITTER_SCALES_RELATIVE': [0.08, 1.0],
'TRAIN_PCA_EIGVAL': [0.225, 0.224, 0.229],
'TRAIN_PCA_EIGVEC': [[-0.5675, 0.7192, 0.4009],
[-0.5808, -0.0045, -0.814],
[-0.5836, -0.6948, 0.4203]],
'USE_OFFSET_SAMPLING': True},
'DATA_LOADER': {'ENABLE_MULTI_THREAD_DECODE': False,
'NUM_WORKERS': 8,
'PIN_MEMORY': True},
'DEMO': {'BUFFER_SIZE': 0,
'CLIP_VIS_SIZE': 10,
'COMMON_CLASS_NAMES': ['watch (a person)',
'talk to (e.g., self, a person, a group)',
'listen to (a person)',
'touch (an object)',
'carry/hold (an object)',
'walk',
'sit',
'lie/sleep',
'bend/bow (at the waist)'],
'COMMON_CLASS_THRES': 0.7,
'DETECTRON2_CFG': 'COCO-Detection/faster_rcnn_R_50_FPN_3x.yaml',
'DETECTRON2_THRESH': 0.9,
'DETECTRON2_WEIGHTS': 'detectron2://COCO-Detection/faster_rcnn_R_50_FPN_3x/137849458/model_final_280758.pkl',
'DISPLAY_HEIGHT': 0,
'DISPLAY_WIDTH': 0,
'ENABLE': False,
'FPS': 30,
'GT_BOXES': '',
'INPUT_FORMAT': 'BGR',
'INPUT_VIDEO': '',
'LABEL_FILE_PATH': '',
'NUM_CLIPS_SKIP': 0,
'NUM_VIS_INSTANCES': 2,
'OUTPUT_FILE': '',
'OUTPUT_FPS': -1,
'PREDS_BOXES': '',
'SLOWMO': 1,
'STARTING_SECOND': 900,
'THREAD_ENABLE': False,
'UNCOMMON_CLASS_THRES': 0.3,
'VIS_MODE': 'thres',
'WEBCAM': -1},
'DETECTION': {'ALIGNED': True,
'ENABLE': False,
'ROI_XFORM_RESOLUTION': 7,
'SPATIAL_SCALE_FACTOR': 16},
'DIST_BACKEND': 'gloo',
'LOG_MODEL_INFO': True,
'LOG_PERIOD': 10,
'MIXUP': {'ALPHA': 0.8,
'CUTMIX_ALPHA': 1.0,
'ENABLE': True,
'LABEL_SMOOTH_VALUE': 0.1,
'PROB': 1.0,
'SWITCH_PROB': 0.5},
'MODEL': {'ARCH': 'uniformer',
'CHECKPOINT_NUM': [0, 0, 4, 0],
'DROPCONNECT_RATE': 0.0,
'DROPOUT_RATE': 0.5,
'FC_INIT_STD': 0.01,
'HEAD_ACT': 'softmax',
'LOSS_FUNC': 'soft_cross_entropy',
'MODEL_NAME': 'Uniformer',
'MULTI_PATHWAY_ARCH': ['slowfast'],
'NUM_CLASSES': 400,
'SINGLE_PATHWAY_ARCH': ['2d',
'c2d',
'i3d',
'slow',
'x3d',
'mvit',
'uniformer'],
'USE_CHECKPOINT': True},
'MULTIGRID': {'BN_BASE_SIZE': 8,
'DEFAULT_B': 0,
'DEFAULT_S': 0,
'DEFAULT_T': 0,
'EPOCH_FACTOR': 1.5,
'EVAL_FREQ': 3,
'LONG_CYCLE': False,
'LONG_CYCLE_FACTORS': [(0.25, 0.7071067811865476),
(0.5, 0.7071067811865476),
(0.5, 1),
(1, 1)],
'LONG_CYCLE_SAMPLING_RATE': 0,
'SHORT_CYCLE': False,
'SHORT_CYCLE_FACTORS': [0.5, 0.7071067811865476]},
'MVIT': {'CLS_EMBED_ON': True,
'DEPTH': 16,
'DIM_MUL': [],
'DROPOUT_RATE': 0.0,
'DROPPATH_RATE': 0.1,
'EMBED_DIM': 96,
'HEAD_MUL': [],
'MLP_RATIO': 4.0,
'MODE': 'conv',
'NORM': 'layernorm',
'NORM_STEM': False,
'NUM_HEADS': 1,
'PATCH_2D': False,
'PATCH_KERNEL': [3, 7, 7],
'PATCH_PADDING': [2, 4, 4],
'PATCH_STRIDE': [2, 4, 4],
'POOL_KVQ_KERNEL': None,
'POOL_KV_STRIDE': [],
'POOL_Q_STRIDE': [],
'QKV_BIAS': True,
'SEP_POS_EMBED': False,
'ZERO_DECAY_POS_CLS': True},
'NONLOCAL': {'GROUP': [[1], [1], [1], [1]],
'INSTANTIATION': 'dot_product',
'LOCATION': [[[]], [[]], [[]], [[]]],
'POOL': [[[1, 2, 2], [1, 2, 2]],
[[1, 2, 2], [1, 2, 2]],
[[1, 2, 2], [1, 2, 2]],
[[1, 2, 2], [1, 2, 2]]]},
'NUM_GPUS': 1,
'NUM_SHARDS': 1,
'OUTPUT_DIR': './exp/uniformer_s8x8_k400',
'RESNET': {'DEPTH': 50,
'INPLACE_RELU': True,
'NUM_BLOCK_TEMP_KERNEL': [[3], [4], [6], [3]],
'NUM_GROUPS': 1,
'SPATIAL_DILATIONS': [[1], [1], [1], [1]],
'SPATIAL_STRIDES': [[1], [2], [2], [2]],
'STRIDE_1X1': False,
'TRANS_FUNC': 'bottleneck_transform',
'WIDTH_PER_GROUP': 64,
'ZERO_INIT_FINAL_BN': False},
'RNG_SEED': 6666,
'SHARD_ID': 0,
'SLOWFAST': {'ALPHA': 8,
'BETA_INV': 8,
'FUSION_CONV_CHANNEL_RATIO': 2,
'FUSION_KERNEL_SZ': 5},
'SOLVER': {'BASE_LR': 0.0004,
'BASE_LR_SCALE_NUM_SHARDS': True,
'CLIP_GRADIENT': 20,
'COSINE_AFTER_WARMUP': True,
'COSINE_END_LR': 1e-06,
'DAMPENING': 0.0,
'GAMMA': 0.1,
'LRS': [],
'LR_POLICY': 'cosine',
'MAX_EPOCH': 100,
'MOMENTUM': 0.9,
'NESTEROV': True,
'OPTIMIZING_METHOD': 'adamw',
'STEPS': [],
'STEP_SIZE': 1,
'WARMUP_EPOCHS': 10.0,
'WARMUP_FACTOR': 0.1,
'WARMUP_START_LR': 1e-06,
'WEIGHT_DECAY': 0.05,
'ZERO_WD_1D_PARAM': True},
'TENSORBOARD': {'CATEGORIES_PATH': '',
'CLASS_NAMES_PATH': '',
'CONFUSION_MATRIX': {'ENABLE': False,
'FIGSIZE': [8, 8],
'SUBSET_PATH': ''},
'ENABLE': True,
'HISTOGRAM': {'ENABLE': False,
'FIGSIZE': [8, 8],
'SUBSET_PATH': '',
'TOPK': 10},
'LOG_DIR': '',
'MODEL_VIS': {'ACTIVATIONS': False,
'COLORMAP': 'Pastel2',
'ENABLE': False,
'GRAD_CAM': {'COLORMAP': 'viridis',
'ENABLE': True,
'LAYER_LIST': [],
'USE_TRUE_LABEL': False},
'INPUT_VIDEO': False,
'LAYER_LIST': [],
'MODEL_WEIGHTS': False,
'TOPK_PREDS': 1},
'PREDICTIONS_PATH': '',
'WRONG_PRED_VIS': {'ENABLE': False,
'SUBSET_PATH': '',
'TAG': 'Incorrectly classified videos.'}},
'TEST': {'BATCH_SIZE': 64,
'CHECKPOINT_FILE_PATH': '',
'CHECKPOINT_TYPE': 'pytorch',
'DATASET': 'kinetics',
'ENABLE': True,
'NUM_ENSEMBLE_VIEWS': 1,
'NUM_SPATIAL_CROPS': 1,
'SAVE_RESULTS_PATH': ''},
'TRAIN': {'AUTO_RESUME': True,
'BATCH_SIZE': 4,
'CHECKPOINT_CLEAR_NAME_PATTERN': (),
'CHECKPOINT_EPOCH_RESET': False,
'CHECKPOINT_FILE_PATH': './path_to_models/uniformer_small_k400_8x8.pth',
'CHECKPOINT_INFLATE': False,
'CHECKPOINT_PERIOD': 1,
'CHECKPOINT_TYPE': 'pytorch',
'DATASET': 'kinetics',
'ENABLE': True,
'EVAL_PERIOD': 5},
'UNIFORMER': {'ATTENTION_DROPOUT_RATE': 0,
'DEPTH': [3, 4, 8, 3],
'DROPOUT_RATE': 0,
'DROP_DEPTH_RATE': 0.1,
'EMBED_DIM': [64, 128, 320, 512],
'HEAD_DIM': 64,
'MLP_RATIO': 4,
'PRETRAIN_NAME': 'uniformer_small_in1k',
'QKV_BIAS': True,
'QKV_SCALE': None,
'REPRESENTATION_SIZE': None,
'SPLIT': False,
'STAGE_TYPE': [0, 0, 1, 1],
'STD': False},
'X3D': {'BN_LIN5': False,
'BOTTLENECK_FACTOR': 1.0,
'CHANNELWISE_3x3x3': True,
'DEPTH_FACTOR': 1.0,
'DIM_C1': 12,
'DIM_C5': 2048,
'SCALE_RES2': False,
'WIDTH_FACTOR': 1.0}}
[11/03 14:42:23][INFO] uniformer.py: 287: Use checkpoint: True
[11/03 14:42:23][INFO] uniformer.py: 288: Checkpoint number: [0, 0, 4, 0]
[11/03 14:42:23][INFO] uniformer.py: 412: Inflate: patch_embed1.proj.weight, torch.Size([64, 3, 4, 4]) => torch.Size([64, 3, 3, 4, 4])
[11/03 14:42:23][INFO] uniformer.py: 412: Inflate: patch_embed2.proj.weight, torch.Size([128, 64, 2, 2]) => torch.Size([128, 64, 1, 2, 2])
[11/03 14:42:23][INFO] uniformer.py: 412: Inflate: patch_embed3.proj.weight, torch.Size([320, 128, 2, 2]) => torch.Size([320, 128, 1, 2, 2])
[11/03 14:42:23][INFO] uniformer.py: 412: Inflate: patch_embed4.proj.weight, torch.Size([512, 320, 2, 2]) => torch.Size([512, 320, 1, 2, 2])
[11/03 14:42:23][INFO] uniformer.py: 412: Inflate: blocks1.0.pos_embed.weight, torch.Size([64, 1, 3, 3]) => torch.Size([64, 1, 3, 3, 3])
[11/03 14:42:23][INFO] uniformer.py: 412: Inflate: blocks1.0.conv1.weight, torch.Size([64, 64, 1, 1]) => torch.Size([64, 64, 1, 1, 1])
[11/03 14:42:23][INFO] uniformer.py: 412: Inflate: blocks1.0.conv2.weight, torch.Size([64, 64, 1, 1]) => torch.Size([64, 64, 1, 1, 1])
[11/03 14:42:23][INFO] uniformer.py: 412: Inflate: blocks1.0.attn.weight, torch.Size([64, 1, 5, 5]) => torch.Size([64, 1, 5, 5, 5])
[11/03 14:42:23][INFO] uniformer.py: 412: Inflate: blocks1.0.mlp.fc1.weight, torch.Size([256, 64, 1, 1]) => torch.Size([256, 64, 1, 1, 1])
[11/03 14:42:23][INFO] uniformer.py: 412: Inflate: blocks1.0.mlp.fc2.weight, torch.Size([64, 256, 1, 1]) => torch.Size([64, 256, 1, 1, 1])
[11/03 14:42:23][INFO] uniformer.py: 412: Inflate: blocks1.1.pos_embed.weight, torch.Size([64, 1, 3, 3]) => torch.Size([64, 1, 3, 3, 3])
[11/03 14:42:23][INFO] uniformer.py: 412: Inflate: blocks1.1.conv1.weight, torch.Size([64, 64, 1, 1]) => torch.Size([64, 64, 1, 1, 1])
[11/03 14:42:23][INFO] uniformer.py: 412: Inflate: blocks1.1.conv2.weight, torch.Size([64, 64, 1, 1]) => torch.Size([64, 64, 1, 1, 1])
[11/03 14:42:23][INFO] uniformer.py: 412: Inflate: blocks1.1.attn.weight, torch.Size([64, 1, 5, 5]) => torch.Size([64, 1, 5, 5, 5])
[11/03 14:42:23][INFO] uniformer.py: 412: Inflate: blocks1.1.mlp.fc1.weight, torch.Size([256, 64, 1, 1]) => torch.Size([256, 64, 1, 1, 1])
[11/03 14:42:23][INFO] uniformer.py: 412: Inflate: blocks1.1.mlp.fc2.weight, torch.Size([64, 256, 1, 1]) => torch.Size([64, 256, 1, 1, 1])
[11/03 14:42:23][INFO] uniformer.py: 412: Inflate: blocks1.2.pos_embed.weight, torch.Size([64, 1, 3, 3]) => torch.Size([64, 1, 3, 3, 3])
[11/03 14:42:23][INFO] uniformer.py: 412: Inflate: blocks1.2.conv1.weight, torch.Size([64, 64, 1, 1]) => torch.Size([64, 64, 1, 1, 1])
[11/03 14:42:23][INFO] uniformer.py: 412: Inflate: blocks1.2.conv2.weight, torch.Size([64, 64, 1, 1]) => torch.Size([64, 64, 1, 1, 1])
[11/03 14:42:23][INFO] uniformer.py: 412: Inflate: blocks1.2.attn.weight, torch.Size([64, 1, 5, 5]) => torch.Size([64, 1, 5, 5, 5])
[11/03 14:42:23][INFO] uniformer.py: 412: Inflate: blocks1.2.mlp.fc1.weight, torch.Size([256, 64, 1, 1]) => torch.Size([256, 64, 1, 1, 1])
[11/03 14:42:23][INFO] uniformer.py: 412: Inflate: blocks1.2.mlp.fc2.weight, torch.Size([64, 256, 1, 1]) => torch.Size([64, 256, 1, 1, 1])
[11/03 14:42:23][INFO] uniformer.py: 412: Inflate: blocks2.0.pos_embed.weight, torch.Size([128, 1, 3, 3]) => torch.Size([128, 1, 3, 3, 3])
[11/03 14:42:23][INFO] uniformer.py: 412: Inflate: blocks2.0.conv1.weight, torch.Size([128, 128, 1, 1]) => torch.Size([128, 128, 1, 1, 1])
[11/03 14:42:23][INFO] uniformer.py: 412: Inflate: blocks2.0.conv2.weight, torch.Size([128, 128, 1, 1]) => torch.Size([128, 128, 1, 1, 1])
[11/03 14:42:23][INFO] uniformer.py: 412: Inflate: blocks2.0.attn.weight, torch.Size([128, 1, 5, 5]) => torch.Size([128, 1, 5, 5, 5])
[11/03 14:42:23][INFO] uniformer.py: 412: Inflate: blocks2.0.mlp.fc1.weight, torch.Size([512, 128, 1, 1]) => torch.Size([512, 128, 1, 1, 1])
[11/03 14:42:23][INFO] uniformer.py: 412: Inflate: blocks2.0.mlp.fc2.weight, torch.Size([128, 512, 1, 1]) => torch.Size([128, 512, 1, 1, 1])
[11/03 14:42:23][INFO] uniformer.py: 412: Inflate: blocks2.1.pos_embed.weight, torch.Size([128, 1, 3, 3]) => torch.Size([128, 1, 3, 3, 3])
[11/03 14:42:23][INFO] uniformer.py: 412: Inflate: blocks2.1.conv1.weight, torch.Size([128, 128, 1, 1]) => torch.Size([128, 128, 1, 1, 1])
[11/03 14:42:23][INFO] uniformer.py: 412: Inflate: blocks2.1.conv2.weight, torch.Size([128, 128, 1, 1]) => torch.Size([128, 128, 1, 1, 1])
[11/03 14:42:23][INFO] uniformer.py: 412: Inflate: blocks2.1.attn.weight, torch.Size([128, 1, 5, 5]) => torch.Size([128, 1, 5, 5, 5])
[11/03 14:42:23][INFO] uniformer.py: 412: Inflate: blocks2.1.mlp.fc1.weight, torch.Size([512, 128, 1, 1]) => torch.Size([512, 128, 1, 1, 1])
[11/03 14:42:23][INFO] uniformer.py: 412: Inflate: blocks2.1.mlp.fc2.weight, torch.Size([128, 512, 1, 1]) => torch.Size([128, 512, 1, 1, 1])
[11/03 14:42:23][INFO] uniformer.py: 412: Inflate: blocks2.2.pos_embed.weight, torch.Size([128, 1, 3, 3]) => torch.Size([128, 1, 3, 3, 3])
[11/03 14:42:23][INFO] uniformer.py: 412: Inflate: blocks2.2.conv1.weight, torch.Size([128, 128, 1, 1]) => torch.Size([128, 128, 1, 1, 1])
[11/03 14:42:23][INFO] uniformer.py: 412: Inflate: blocks2.2.conv2.weight, torch.Size([128, 128, 1, 1]) => torch.Size([128, 128, 1, 1, 1])
[11/03 14:42:23][INFO] uniformer.py: 412: Inflate: blocks2.2.attn.weight, torch.Size([128, 1, 5, 5]) => torch.Size([128, 1, 5, 5, 5])
[11/03 14:42:23][INFO] uniformer.py: 412: Inflate: blocks2.2.mlp.fc1.weight, torch.Size([512, 128, 1, 1]) => torch.Size([512, 128, 1, 1, 1])
[11/03 14:42:23][INFO] uniformer.py: 412: Inflate: blocks2.2.mlp.fc2.weight, torch.Size([128, 512, 1, 1]) => torch.Size([128, 512, 1, 1, 1])
[11/03 14:42:23][INFO] uniformer.py: 412: Inflate: blocks2.3.pos_embed.weight, torch.Size([128, 1, 3, 3]) => torch.Size([128, 1, 3, 3, 3])
[11/03 14:42:23][INFO] uniformer.py: 412: Inflate: blocks2.3.conv1.weight, torch.Size([128, 128, 1, 1]) => torch.Size([128, 128, 1, 1, 1])
[11/03 14:42:23][INFO] uniformer.py: 412: Inflate: blocks2.3.conv2.weight, torch.Size([128, 128, 1, 1]) => torch.Size([128, 128, 1, 1, 1])
[11/03 14:42:23][INFO] uniformer.py: 412: Inflate: blocks2.3.attn.weight, torch.Size([128, 1, 5, 5]) => torch.Size([128, 1, 5, 5, 5])
[11/03 14:42:23][INFO] uniformer.py: 412: Inflate: blocks2.3.mlp.fc1.weight, torch.Size([512, 128, 1, 1]) => torch.Size([512, 128, 1, 1, 1])
[11/03 14:42:23][INFO] uniformer.py: 412: Inflate: blocks2.3.mlp.fc2.weight, torch.Size([128, 512, 1, 1]) => torch.Size([128, 512, 1, 1, 1])
[11/03 14:42:23][INFO] uniformer.py: 412: Inflate: blocks3.0.pos_embed.weight, torch.Size([320, 1, 3, 3]) => torch.Size([320, 1, 3, 3, 3])
[11/03 14:42:23][INFO] uniformer.py: 412: Inflate: blocks3.1.pos_embed.weight, torch.Size([320, 1, 3, 3]) => torch.Size([320, 1, 3, 3, 3])
[11/03 14:42:23][INFO] uniformer.py: 412: Inflate: blocks3.2.pos_embed.weight, torch.Size([320, 1, 3, 3]) => torch.Size([320, 1, 3, 3, 3])
[11/03 14:42:23][INFO] uniformer.py: 412: Inflate: blocks3.3.pos_embed.weight, torch.Size([320, 1, 3, 3]) => torch.Size([320, 1, 3, 3, 3])
[11/03 14:42:23][INFO] uniformer.py: 412: Inflate: blocks3.4.pos_embed.weight, torch.Size([320, 1, 3, 3]) => torch.Size([320, 1, 3, 3, 3])
[11/03 14:42:23][INFO] uniformer.py: 412: Inflate: blocks3.5.pos_embed.weight, torch.Size([320, 1, 3, 3]) => torch.Size([320, 1, 3, 3, 3])
[11/03 14:42:23][INFO] uniformer.py: 412: Inflate: blocks3.6.pos_embed.weight, torch.Size([320, 1, 3, 3]) => torch.Size([320, 1, 3, 3, 3])
[11/03 14:42:23][INFO] uniformer.py: 412: Inflate: blocks3.7.pos_embed.weight, torch.Size([320, 1, 3, 3]) => torch.Size([320, 1, 3, 3, 3])
[11/03 14:42:23][INFO] uniformer.py: 412: Inflate: blocks4.0.pos_embed.weight, torch.Size([512, 1, 3, 3]) => torch.Size([512, 1, 3, 3, 3])
[11/03 14:42:23][INFO] uniformer.py: 412: Inflate: blocks4.1.pos_embed.weight, torch.Size([512, 1, 3, 3]) => torch.Size([512, 1, 3, 3, 3])
[11/03 14:42:23][INFO] uniformer.py: 412: Inflate: blocks4.2.pos_embed.weight, torch.Size([512, 1, 3, 3]) => torch.Size([512, 1, 3, 3, 3])
[11/03 14:42:23][INFO] uniformer.py: 410: Ignore: head.weight
[11/03 14:42:23][INFO] uniformer.py: 410: Ignore: head.bias
[11/03 14:42:23][INFO] build.py: 45: load pretrained model
[11/03 14:42:23][INFO] misc.py: 183: Model:
Uniformer(
(patch_embed1): SpeicalPatchEmbed(
(norm): LayerNorm((64,), eps=1e-05, elementwise_affine=True)
(proj): Conv3d(3, 64, kernel_size=(3, 4, 4), stride=(2, 4, 4), padding=(1, 0, 0))
)
(patch_embed2): PatchEmbed(
(norm): LayerNorm((128,), eps=1e-05, elementwise_affine=True)
(proj): Conv3d(64, 128, kernel_size=(1, 2, 2), stride=(1, 2, 2))
)
(patch_embed3): PatchEmbed(
(norm): LayerNorm((320,), eps=1e-05, elementwise_affine=True)
(proj): Conv3d(128, 320, kernel_size=(1, 2, 2), stride=(1, 2, 2))
)
(patch_embed4): PatchEmbed(
(norm): LayerNorm((512,), eps=1e-05, elementwise_affine=True)
(proj): Conv3d(320, 512, kernel_size=(1, 2, 2), stride=(1, 2, 2))
)
(pos_drop): Dropout(p=0, inplace=False)
(blocks1): ModuleList(
(0): CBlock(
(pos_embed): Conv3d(64, 64, kernel_size=(3, 3, 3), stride=(1, 1, 1), padding=(1, 1, 1), groups=64)
(norm1): BatchNorm3d(64, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
(conv1): Conv3d(64, 64, kernel_size=(1, 1, 1), stride=(1, 1, 1))
(conv2): Conv3d(64, 64, kernel_size=(1, 1, 1), stride=(1, 1, 1))
(attn): Conv3d(64, 64, kernel_size=(5, 5, 5), stride=(1, 1, 1), padding=(2, 2, 2), groups=64)
(drop_path): Identity()
(norm2): BatchNorm3d(64, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
(mlp): CMlp(
(fc1): Conv3d(64, 256, kernel_size=(1, 1, 1), stride=(1, 1, 1))
(act): GELU()
(fc2): Conv3d(256, 64, kernel_size=(1, 1, 1), stride=(1, 1, 1))
(drop): Dropout(p=0, inplace=False)
)
)
(1): CBlock(
(pos_embed): Conv3d(64, 64, kernel_size=(3, 3, 3), stride=(1, 1, 1), padding=(1, 1, 1), groups=64)
(norm1): BatchNorm3d(64, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
(conv1): Conv3d(64, 64, kernel_size=(1, 1, 1), stride=(1, 1, 1))
(conv2): Conv3d(64, 64, kernel_size=(1, 1, 1), stride=(1, 1, 1))
(attn): Conv3d(64, 64, kernel_size=(5, 5, 5), stride=(1, 1, 1), padding=(2, 2, 2), groups=64)
(drop_path): DropPath(drop_prob=0.006)
(norm2): BatchNorm3d(64, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
(mlp): CMlp(
(fc1): Conv3d(64, 256, kernel_size=(1, 1, 1), stride=(1, 1, 1))
(act): GELU()
(fc2): Conv3d(256, 64, kernel_size=(1, 1, 1), stride=(1, 1, 1))
(drop): Dropout(p=0, inplace=False)
)
)
(2): CBlock(
(pos_embed): Conv3d(64, 64, kernel_size=(3, 3, 3), stride=(1, 1, 1), padding=(1, 1, 1), groups=64)
(norm1): BatchNorm3d(64, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
(conv1): Conv3d(64, 64, kernel_size=(1, 1, 1), stride=(1, 1, 1))
(conv2): Conv3d(64, 64, kernel_size=(1, 1, 1), stride=(1, 1, 1))
(attn): Conv3d(64, 64, kernel_size=(5, 5, 5), stride=(1, 1, 1), padding=(2, 2, 2), groups=64)
(drop_path): DropPath(drop_prob=0.012)
(norm2): BatchNorm3d(64, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
(mlp): CMlp(
(fc1): Conv3d(64, 256, kernel_size=(1, 1, 1), stride=(1, 1, 1))
(act): GELU()
(fc2): Conv3d(256, 64, kernel_size=(1, 1, 1), stride=(1, 1, 1))
(drop): Dropout(p=0, inplace=False)
)
)
)
(blocks2): ModuleList(
(0): CBlock(
(pos_embed): Conv3d(128, 128, kernel_size=(3, 3, 3), stride=(1, 1, 1), padding=(1, 1, 1), groups=128)
(norm1): BatchNorm3d(128, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
(conv1): Conv3d(128, 128, kernel_size=(1, 1, 1), stride=(1, 1, 1))
(conv2): Conv3d(128, 128, kernel_size=(1, 1, 1), stride=(1, 1, 1))
(attn): Conv3d(128, 128, kernel_size=(5, 5, 5), stride=(1, 1, 1), padding=(2, 2, 2), groups=128)
(drop_path): DropPath(drop_prob=0.018)
(norm2): BatchNorm3d(128, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
(mlp): CMlp(
(fc1): Conv3d(128, 512, kernel_size=(1, 1, 1), stride=(1, 1, 1))
(act): GELU()
(fc2): Conv3d(512, 128, kernel_size=(1, 1, 1), stride=(1, 1, 1))
(drop): Dropout(p=0, inplace=False)
)
)
(1): CBlock(
(pos_embed): Conv3d(128, 128, kernel_size=(3, 3, 3), stride=(1, 1, 1), padding=(1, 1, 1), groups=128)
(norm1): BatchNorm3d(128, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
(conv1): Conv3d(128, 128, kernel_size=(1, 1, 1), stride=(1, 1, 1))
(conv2): Conv3d(128, 128, kernel_size=(1, 1, 1), stride=(1, 1, 1))
(attn): Conv3d(128, 128, kernel_size=(5, 5, 5), stride=(1, 1, 1), padding=(2, 2, 2), groups=128)
(drop_path): DropPath(drop_prob=0.024)
(norm2): BatchNorm3d(128, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
(mlp): CMlp(
(fc1): Conv3d(128, 512, kernel_size=(1, 1, 1), stride=(1, 1, 1))
(act): GELU()
(fc2): Conv3d(512, 128, kernel_size=(1, 1, 1), stride=(1, 1, 1))
(drop): Dropout(p=0, inplace=False)
)
)
(2): CBlock(
(pos_embed): Conv3d(128, 128, kernel_size=(3, 3, 3), stride=(1, 1, 1), padding=(1, 1, 1), groups=128)
(norm1): BatchNorm3d(128, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
(conv1): Conv3d(128, 128, kernel_size=(1, 1, 1), stride=(1, 1, 1))
(conv2): Conv3d(128, 128, kernel_size=(1, 1, 1), stride=(1, 1, 1))
(attn): Conv3d(128, 128, kernel_size=(5, 5, 5), stride=(1, 1, 1), padding=(2, 2, 2), groups=128)
(drop_path): DropPath(drop_prob=0.029)
(norm2): BatchNorm3d(128, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
(mlp): CMlp(
(fc1): Conv3d(128, 512, kernel_size=(1, 1, 1), stride=(1, 1, 1))
(act): GELU()
(fc2): Conv3d(512, 128, kernel_size=(1, 1, 1), stride=(1, 1, 1))
(drop): Dropout(p=0, inplace=False)
)
)
(3): CBlock(
(pos_embed): Conv3d(128, 128, kernel_size=(3, 3, 3), stride=(1, 1, 1), padding=(1, 1, 1), groups=128)
(norm1): BatchNorm3d(128, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
(conv1): Conv3d(128, 128, kernel_size=(1, 1, 1), stride=(1, 1, 1))
(conv2): Conv3d(128, 128, kernel_size=(1, 1, 1), stride=(1, 1, 1))
(attn): Conv3d(128, 128, kernel_size=(5, 5, 5), stride=(1, 1, 1), padding=(2, 2, 2), groups=128)
(drop_path): DropPath(drop_prob=0.035)
(norm2): BatchNorm3d(128, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
(mlp): CMlp(
(fc1): Conv3d(128, 512, kernel_size=(1, 1, 1), stride=(1, 1, 1))
(act): GELU()
(fc2): Conv3d(512, 128, kernel_size=(1, 1, 1), stride=(1, 1, 1))
(drop): Dropout(p=0, inplace=False)
)
)
)
(blocks3): ModuleList(
(0): SABlock(
(pos_embed): Conv3d(320, 320, kernel_size=(3, 3, 3), stride=(1, 1, 1), padding=(1, 1, 1), groups=320)
(norm1): LayerNorm((320,), eps=1e-06, elementwise_affine=True)
(attn): Attention(
(qkv): Linear(in_features=320, out_features=960, bias=True)
(attn_drop): Dropout(p=0, inplace=False)
(proj): Linear(in_features=320, out_features=320, bias=True)
(proj_drop): Dropout(p=0, inplace=False)
)
(drop_path): DropPath(drop_prob=0.041)
(norm2): LayerNorm((320,), eps=1e-06, elementwise_affine=True)
(mlp): Mlp(
(fc1): Linear(in_features=320, out_features=1280, bias=True)
(act): GELU()
(fc2): Linear(in_features=1280, out_features=320, bias=True)
(drop): Dropout(p=0, inplace=False)
)
)
(1): SABlock(
(pos_embed): Conv3d(320, 320, kernel_size=(3, 3, 3), stride=(1, 1, 1), padding=(1, 1, 1), groups=320)
(norm1): LayerNorm((320,), eps=1e-06, elementwise_affine=True)
(attn): Attention(
(qkv): Linear(in_features=320, out_features=960, bias=True)
(attn_drop): Dropout(p=0, inplace=False)
(proj): Linear(in_features=320, out_features=320, bias=True)
(proj_drop): Dropout(p=0, inplace=False)
)
(drop_path): DropPath(drop_prob=0.047)
(norm2): LayerNorm((320,), eps=1e-06, elementwise_affine=True)
(mlp): Mlp(
(fc1): Linear(in_features=320, out_features=1280, bias=True)
(act): GELU()
(fc2): Linear(in_features=1280, out_features=320, bias=True)
(drop): Dropout(p=0, inplace=False)
)
)
(2): SABlock(
(pos_embed): Conv3d(320, 320, kernel_size=(3, 3, 3), stride=(1, 1, 1), padding=(1, 1, 1), groups=320)
(norm1): LayerNorm((320,), eps=1e-06, elementwise_affine=True)
(attn): Attention(
(qkv): Linear(in_features=320, out_features=960, bias=True)
(attn_drop): Dropout(p=0, inplace=False)
(proj): Linear(in_features=320, out_features=320, bias=True)
(proj_drop): Dropout(p=0, inplace=False)
)
(drop_path): DropPath(drop_prob=0.053)
(norm2): LayerNorm((320,), eps=1e-06, elementwise_affine=True)
(mlp): Mlp(
(fc1): Linear(in_features=320, out_features=1280, bias=True)
(act): GELU()
(fc2): Linear(in_features=1280, out_features=320, bias=True)
(drop): Dropout(p=0, inplace=False)
)
)
(3): SABlock(
(pos_embed): Conv3d(320, 320, kernel_size=(3, 3, 3), stride=(1, 1, 1), padding=(1, 1, 1), groups=320)
(norm1): LayerNorm((320,), eps=1e-06, elementwise_affine=True)
(attn): Attention(
(qkv): Linear(in_features=320, out_features=960, bias=True)
(attn_drop): Dropout(p=0, inplace=False)
(proj): Linear(in_features=320, out_features=320, bias=True)
(proj_drop): Dropout(p=0, inplace=False)
)
(drop_path): DropPath(drop_prob=0.059)
(norm2): LayerNorm((320,), eps=1e-06, elementwise_affine=True)
(mlp): Mlp(
(fc1): Linear(in_features=320, out_features=1280, bias=True)
(act): GELU()
(fc2): Linear(in_features=1280, out_features=320, bias=True)
(drop): Dropout(p=0, inplace=False)
)
)
(4): SABlock(
(pos_embed): Conv3d(320, 320, kernel_size=(3, 3, 3), stride=(1, 1, 1), padding=(1, 1, 1), groups=320)
(norm1): LayerNorm((320,), eps=1e-06, elementwise_affine=True)
(attn): Attention(
(qkv): Linear(in_features=320, out_features=960, bias=True)
(attn_drop): Dropout(p=0, inplace=False)
(proj): Linear(in_features=320, out_features=320, bias=True)
(proj_drop): Dropout(p=0, inplace=False)
)
(drop_path): DropPath(drop_prob=0.065)
(norm2): LayerNorm((320,), eps=1e-06, elementwise_affine=True)
(mlp): Mlp(
(fc1): Linear(in_features=320, out_features=1280, bias=True)
(act): GELU()
(fc2): Linear(in_features=1280, out_features=320, bias=True)
(drop): Dropout(p=0, inplace=False)
)
)
(5): SABlock(
(pos_embed): Conv3d(320, 320, kernel_size=(3, 3, 3), stride=(1, 1, 1), padding=(1, 1, 1), groups=320)
(norm1): LayerNorm((320,), eps=1e-06, elementwise_affine=True)
(attn): Attention(
(qkv): Linear(in_features=320, out_features=960, bias=True)
(attn_drop): Dropout(p=0, inplace=False)
(proj): Linear(in_features=320, out_features=320, bias=True)
(proj_drop): Dropout(p=0, inplace=False)
)
(drop_path): DropPath(drop_prob=0.071)
(norm2): LayerNorm((320,), eps=1e-06, elementwise_affine=True)
(mlp): Mlp(
(fc1): Linear(in_features=320, out_features=1280, bias=True)
(act): GELU()
(fc2): Linear(in_features=1280, out_features=320, bias=True)
(drop): Dropout(p=0, inplace=False)
)
)
(6): SABlock(
(pos_embed): Conv3d(320, 320, kernel_size=(3, 3, 3), stride=(1, 1, 1), padding=(1, 1, 1), groups=320)
(norm1): LayerNorm((320,), eps=1e-06, elementwise_affine=True)
(attn): Attention(
(qkv): Linear(in_features=320, out_features=960, bias=True)
(attn_drop): Dropout(p=0, inplace=False)
(proj): Linear(in_features=320, out_features=320, bias=True)
(proj_drop): Dropout(p=0, inplace=False)
)
(drop_path): DropPath(drop_prob=0.076)
(norm2): LayerNorm((320,), eps=1e-06, elementwise_affine=True)
(mlp): Mlp(
(fc1): Linear(in_features=320, out_features=1280, bias=True)
(act): GELU()
(fc2): Linear(in_features=1280, out_features=320, bias=True)
(drop): Dropout(p=0, inplace=False)
)
)
(7): SABlock(
(pos_embed): Conv3d(320, 320, kernel_size=(3, 3, 3), stride=(1, 1, 1), padding=(1, 1, 1), groups=320)
(norm1): LayerNorm((320,), eps=1e-06, elementwise_affine=True)
(attn): Attention(
(qkv): Linear(in_features=320, out_features=960, bias=True)
(attn_drop): Dropout(p=0, inplace=False)
(proj): Linear(in_features=320, out_features=320, bias=True)
(proj_drop): Dropout(p=0, inplace=False)
)
(drop_path): DropPath(drop_prob=0.082)
(norm2): LayerNorm((320,), eps=1e-06, elementwise_affine=True)
(mlp): Mlp(
(fc1): Linear(in_features=320, out_features=1280, bias=True)
(act): GELU()
(fc2): Linear(in_features=1280, out_features=320, bias=True)
(drop): Dropout(p=0, inplace=False)
)
)
)
(blocks4): ModuleList(
(0): SABlock(
(pos_embed): Conv3d(512, 512, kernel_size=(3, 3, 3), stride=(1, 1, 1), padding=(1, 1, 1), groups=512)
(norm1): LayerNorm((512,), eps=1e-06, elementwise_affine=True)
(attn): Attention(
(qkv): Linear(in_features=512, out_features=1536, bias=True)
(attn_drop): Dropout(p=0, inplace=False)
(proj): Linear(in_features=512, out_features=512, bias=True)
(proj_drop): Dropout(p=0, inplace=False)
)
(drop_path): DropPath(drop_prob=0.088)
(norm2): LayerNorm((512,), eps=1e-06, elementwise_affine=True)
(mlp): Mlp(
(fc1): Linear(in_features=512, out_features=2048, bias=True)
(act): GELU()
(fc2): Linear(in_features=2048, out_features=512, bias=True)
(drop): Dropout(p=0, inplace=False)
)
)
(1): SABlock(
(pos_embed): Conv3d(512, 512, kernel_size=(3, 3, 3), stride=(1, 1, 1), padding=(1, 1, 1), groups=512)
(norm1): LayerNorm((512,), eps=1e-06, elementwise_affine=True)
(attn): Attention(
(qkv): Linear(in_features=512, out_features=1536, bias=True)
(attn_drop): Dropout(p=0, inplace=False)
(proj): Linear(in_features=512, out_features=512, bias=True)
(proj_drop): Dropout(p=0, inplace=False)
)
(drop_path): DropPath(drop_prob=0.094)
(norm2): LayerNorm((512,), eps=1e-06, elementwise_affine=True)
(mlp): Mlp(
(fc1): Linear(in_features=512, out_features=2048, bias=True)
(act): GELU()
(fc2): Linear(in_features=2048, out_features=512, bias=True)
(drop): Dropout(p=0, inplace=False)
)
)
(2): SABlock(
(pos_embed): Conv3d(512, 512, kernel_size=(3, 3, 3), stride=(1, 1, 1), padding=(1, 1, 1), groups=512)
(norm1): LayerNorm((512,), eps=1e-06, elementwise_affine=True)
(attn): Attention(
(qkv): Linear(in_features=512, out_features=1536, bias=True)
(attn_drop): Dropout(p=0, inplace=False)
(proj): Linear(in_features=512, out_features=512, bias=True)
(proj_drop): Dropout(p=0, inplace=False)
)
(drop_path): DropPath(drop_prob=0.100)
(norm2): LayerNorm((512,), eps=1e-06, elementwise_affine=True)
(mlp): Mlp(
(fc1): Linear(in_features=512, out_features=2048, bias=True)
(act): GELU()
(fc2): Linear(in_features=2048, out_features=512, bias=True)
(drop): Dropout(p=0, inplace=False)
)
)
)
(norm): BatchNorm3d(512, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
(pre_logits): Identity()
(head): Linear(in_features=512, out_features=400, bias=True)
)
[11/03 14:42:23][INFO] misc.py: 184: Params: 21,400,400
[11/03 14:42:23][INFO] misc.py: 185: Mem: 0.0800790786743164 MB
[11/03 14:42:26][WARNING] jit_analysis.py: 499: Unsupported operator aten::add encountered 42 time(s)
[11/03 14:42:26][WARNING] jit_analysis.py: 499: Unsupported operator aten::gelu encountered 14 time(s)
[11/03 14:42:26][WARNING] jit_analysis.py: 499: Unsupported operator prim::PythonOp.CheckpointFunction encountered 4 time(s)
[11/03 14:42:26][WARNING] jit_analysis.py: 499: Unsupported operator aten::div encountered 7 time(s)
[11/03 14:42:26][WARNING] jit_analysis.py: 499: Unsupported operator aten::mul encountered 7 time(s)
[11/03 14:42:26][WARNING] jit_analysis.py: 499: Unsupported operator aten::softmax encountered 7 time(s)
[11/03 14:42:26][WARNING] jit_analysis.py: 499: Unsupported operator aten::mean encountered 1 time(s)
[11/03 14:42:26][WARNING] jit_analysis.py: 511: The following submodules of the model were never called during the trace of the graph. They may be unused, or they were accessed by direct calls to .forward() or via other python methods. In the latter case they will have zeros for statistics, though their statistics will still contribute to their parent calling module.
blocks1.1.drop_path, blocks1.2.drop_path, blocks2.0.drop_path, blocks2.1.drop_path, blocks2.2.drop_path, blocks2.3.drop_path, blocks3.0, blocks3.0.attn, blocks3.0.attn.attn_drop, blocks3.0.attn.proj, blocks3.0.attn.proj_drop, blocks3.0.attn.qkv, blocks3.0.drop_path, blocks3.0.mlp, blocks3.0.mlp.act, blocks3.0.mlp.drop, blocks3.0.mlp.fc1, blocks3.0.mlp.fc2, blocks3.0.norm1, blocks3.0.norm2, blocks3.0.pos_embed, blocks3.1, blocks3.1.attn, blocks3.1.attn.attn_drop, blocks3.1.attn.proj, blocks3.1.attn.proj_drop, blocks3.1.attn.qkv, blocks3.1.drop_path, blocks3.1.mlp, blocks3.1.mlp.act, blocks3.1.mlp.drop, blocks3.1.mlp.fc1, blocks3.1.mlp.fc2, blocks3.1.norm1, blocks3.1.norm2, blocks3.1.pos_embed, blocks3.2, blocks3.2.attn, blocks3.2.attn.attn_drop, blocks3.2.attn.proj, blocks3.2.attn.proj_drop, blocks3.2.attn.qkv, blocks3.2.drop_path, blocks3.2.mlp, blocks3.2.mlp.act, blocks3.2.mlp.drop, blocks3.2.mlp.fc1, blocks3.2.mlp.fc2, blocks3.2.norm1, blocks3.2.norm2, blocks3.2.pos_embed, blocks3.3, blocks3.3.attn, blocks3.3.attn.attn_drop, blocks3.3.attn.proj, blocks3.3.attn.proj_drop, blocks3.3.attn.qkv, blocks3.3.drop_path, blocks3.3.mlp, blocks3.3.mlp.act, blocks3.3.mlp.drop, blocks3.3.mlp.fc1, blocks3.3.mlp.fc2, blocks3.3.norm1, blocks3.3.norm2, blocks3.3.pos_embed, blocks3.4.drop_path, blocks3.5.drop_path, blocks3.6.drop_path, blocks3.7.drop_path, blocks4.0.drop_path, blocks4.1.drop_path, blocks4.2.drop_path
[11/03 14:42:26][INFO] misc.py: 186: Flops: 12.149269504 G
[11/03 14:42:26][WARNING] jit_analysis.py: 499: Unsupported operator aten::layer_norm encountered 18 time(s)
[11/03 14:42:26][WARNING] jit_analysis.py: 499: Unsupported operator aten::add encountered 42 time(s)
[11/03 14:42:26][WARNING] jit_analysis.py: 499: Unsupported operator aten::batch_norm encountered 15 time(s)
[11/03 14:42:26][WARNING] jit_analysis.py: 499: Unsupported operator aten::gelu encountered 14 time(s)
[11/03 14:42:26][WARNING] jit_analysis.py: 499: Unsupported operator prim::PythonOp.CheckpointFunction encountered 4 time(s)
[11/03 14:42:26][WARNING] jit_analysis.py: 499: Unsupported operator aten::div encountered 7 time(s)
[11/03 14:42:26][WARNING] jit_analysis.py: 499: Unsupported operator aten::mul encountered 7 time(s)
[11/03 14:42:26][WARNING] jit_analysis.py: 499: Unsupported operator aten::softmax encountered 7 time(s)
[11/03 14:42:26][WARNING] jit_analysis.py: 499: Unsupported operator aten::mean encountered 1 time(s)
[11/03 14:42:26][WARNING] jit_analysis.py: 511: The following submodules of the model were never called during the trace of the graph. They may be unused, or they were accessed by direct calls to .forward() or via other python methods. In the latter case they will have zeros for statistics, though their statistics will still contribute to their parent calling module.
blocks1.1.drop_path, blocks1.2.drop_path, blocks2.0.drop_path, blocks2.1.drop_path, blocks2.2.drop_path, blocks2.3.drop_path, blocks3.0, blocks3.0.attn, blocks3.0.attn.attn_drop, blocks3.0.attn.proj, blocks3.0.attn.proj_drop, blocks3.0.attn.qkv, blocks3.0.drop_path, blocks3.0.mlp, blocks3.0.mlp.act, blocks3.0.mlp.drop, blocks3.0.mlp.fc1, blocks3.0.mlp.fc2, blocks3.0.norm1, blocks3.0.norm2, blocks3.0.pos_embed, blocks3.1, blocks3.1.attn, blocks3.1.attn.attn_drop, blocks3.1.attn.proj, blocks3.1.attn.proj_drop, blocks3.1.attn.qkv, blocks3.1.drop_path, blocks3.1.mlp, blocks3.1.mlp.act, blocks3.1.mlp.drop, blocks3.1.mlp.fc1, blocks3.1.mlp.fc2, blocks3.1.norm1, blocks3.1.norm2, blocks3.1.pos_embed, blocks3.2, blocks3.2.attn, blocks3.2.attn.attn_drop, blocks3.2.attn.proj, blocks3.2.attn.proj_drop, blocks3.2.attn.qkv, blocks3.2.drop_path, blocks3.2.mlp, blocks3.2.mlp.act, blocks3.2.mlp.drop, blocks3.2.mlp.fc1, blocks3.2.mlp.fc2, blocks3.2.norm1, blocks3.2.norm2, blocks3.2.pos_embed, blocks3.3, blocks3.3.attn, blocks3.3.attn.attn_drop, blocks3.3.attn.proj, blocks3.3.attn.proj_drop, blocks3.3.attn.qkv, blocks3.3.drop_path, blocks3.3.mlp, blocks3.3.mlp.act, blocks3.3.mlp.drop, blocks3.3.mlp.fc1, blocks3.3.mlp.fc2, blocks3.3.norm1, blocks3.3.norm2, blocks3.3.pos_embed, blocks3.4.drop_path, blocks3.5.drop_path, blocks3.6.drop_path, blocks3.7.drop_path, blocks4.0.drop_path, blocks4.1.drop_path, blocks4.2.drop_path
[11/03 14:42:26][INFO] misc.py: 191: Activations: 65.24801599999999 M
[11/03 14:42:26][INFO] misc.py: 196: nvidia-smi
[11/03 14:42:26][INFO] checkpoint_amp.py: 507: Load from given checkpoint file.
[11/03 14:42:26][INFO] checkpoint_amp.py: 213: Loading network weights from ./path_to_models/uniformer_small_k400_8x8.pth.
[11/03 14:42:26][INFO] kinetics.py: 76: Constructing Kinetics train...
[11/03 14:42:26][INFO] kinetics.py: 123: Constructing kinetics dataloader (size: 1275) from E:/master/datasets/uniformer/data_vid\train.csv
[11/03 14:42:26][INFO] kinetics.py: 76: Constructing Kinetics val...
[11/03 14:42:26][INFO] kinetics.py: 123: Constructing kinetics dataloader (size: 200) from E:/master/datasets/uniformer/data_vid\val.csv
[11/03 14:42:26][INFO] tensorboard_vis.py: 54: To see logged results in Tensorboard, please launch using the command `tensorboard --port=<port-number> --logdir ./exp/uniformer_s8x8_k400\runs-kinetics`
[11/03 14:42:26][INFO] train_net.py: 451: Start epoch: 1
Andy1621 commented
Thanks for your question!
The current codebase needs some time to prepare the dataset.
It looks like the run is waiting for data, and data loading needs CPU. Another process may be occupying the CPU and blocking the run.
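One way to check whether the run is really waiting on data is to time how long the first batch takes to arrive. This is only a rough sketch: `time_first_batch` is a hypothetical helper, not something in the UniFormer codebase, and it accepts any iterable, so a plain list stands in for the real train loader here.

```python
import time
from typing import Any, Iterable, Tuple


def time_first_batch(loader: Iterable[Any]) -> Tuple[Any, float]:
    """Return the first item from `loader` and the seconds spent waiting for it.

    Hypothetical helper for illustration; not part of the UniFormer codebase.
    """
    start = time.perf_counter()
    first = next(iter(loader))  # blocks until the first batch is ready
    elapsed = time.perf_counter() - start
    return first, elapsed


if __name__ == "__main__":
    # A plain list stands in for the real train loader (assumption).
    batch, seconds = time_first_batch([[1, 2, 3], [4, 5, 6]])
    print(f"first batch {batch} arrived after {seconds:.4f} s")
```

If this call alone takes minutes on the real loader, the stall is probably in data preparation rather than in the model. In that case, lowering the number of loader workers may help. In SlowFast-style configs that setting is usually `DATA_LOADER.NUM_WORKERS`, but that is an assumption here, so check your own yaml.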
Andy1621 commented
As there is no more activity, I am closing the issue. Don't hesitate to reopen it if necessary.
manhcntt21 commented
Andy1621 commented
When testing, you have to change the yaml as described in the README. Or you can simply change the code as in my new repo, UniFormerV2.