Sense-X/UniFormer

Resuming training from the n-th epoch

manhcntt21 opened this issue · 0 comments

Hi

I have a question. I trained on my custom dataset for, say, 2 epochs. Now I want to continue training from this 2nd epoch. Is there a way to do that?

I set TRAIN.CHECKPOINT_FILE_PATH to the 2nd epoch's checkpoint, but it does not seem to be working.
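
In case it is useful, this is roughly how the contents of that checkpoint could be inspected to confirm which epoch it stores (a minimal sketch; that the file is a plain torch.save() dict with "epoch" and "model_state" keys is my assumption based on the usual SlowFast checkpoint format, and may differ in this repo):

import torch

# The .pyth checkpoint is assumed to be a regular torch.save() file containing a dict.
ckpt = torch.load(
    "./exp/uniformer_s8x8_k400/checkpoints/checkpoint_epoch_00002.pyth",
    map_location="cpu",
)

# Assumed key names: "epoch" (last finished epoch) and "model_state" (the weights).
print("stored epoch:", ckpt.get("epoch"))
print("number of weight tensors:", len(ckpt.get("model_state", {})))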

This is the log:

$ bash ./exp/uniformer_s8x8_k400/run.sh
...
[11/17 12:04:36][INFO] uniformer.py: 287: Use checkpoint: True
[11/17 12:04:36][INFO] uniformer.py: 288: Checkpoint number: [0, 0, 4, 0]
[11/17 12:04:36][INFO] uniformer.py: 412: Inflate: patch_embed1.proj.weight, torch.Size([64, 3, 4, 4]) => torch.Size([64, 3, 3, 4, 4])
[11/17 12:04:36][INFO] uniformer.py: 412: Inflate: patch_embed2.proj.weight, torch.Size([128, 64, 2, 2]) => torch.Size([128, 64, 1, 2, 2])
[11/17 12:04:36][INFO] uniformer.py: 412: Inflate: patch_embed3.proj.weight, torch.Size([320, 128, 2, 2]) => torch.Size([320, 128, 1, 2, 2])
[11/17 12:04:36][INFO] uniformer.py: 412: Inflate: patch_embed4.proj.weight, torch.Size([512, 320, 2, 2]) => torch.Size([512, 320, 1, 2, 2])
[11/17 12:04:36][INFO] uniformer.py: 412: Inflate: blocks1.0.pos_embed.weight, torch.Size([64, 1, 3, 3]) => torch.Size([64, 1, 3, 3, 3])
[11/17 12:04:36][INFO] uniformer.py: 412: Inflate: blocks1.0.conv1.weight, torch.Size([64, 64, 1, 1]) => torch.Size([64, 64, 1, 1, 1])
[11/17 12:04:36][INFO] uniformer.py: 412: Inflate: blocks1.0.conv2.weight, torch.Size([64, 64, 1, 1]) => torch.Size([64, 64, 1, 1, 1])
[11/17 12:04:36][INFO] uniformer.py: 412: Inflate: blocks1.0.attn.weight, torch.Size([64, 1, 5, 5]) => torch.Size([64, 1, 5, 5, 5])
[11/17 12:04:36][INFO] uniformer.py: 412: Inflate: blocks1.0.mlp.fc1.weight, torch.Size([256, 64, 1, 1]) => torch.Size([256, 64, 1, 1, 1])
[11/17 12:04:36][INFO] uniformer.py: 412: Inflate: blocks1.0.mlp.fc2.weight, torch.Size([64, 256, 1, 1]) => torch.Size([64, 256, 1, 1, 1])
[11/17 12:04:36][INFO] uniformer.py: 412: Inflate: blocks1.1.pos_embed.weight, torch.Size([64, 1, 3, 3]) => torch.Size([64, 1, 3, 3, 3])
[11/17 12:04:36][INFO] uniformer.py: 412: Inflate: blocks1.1.conv1.weight, torch.Size([64, 64, 1, 1]) => torch.Size([64, 64, 1, 1, 1])
[11/17 12:04:36][INFO] uniformer.py: 412: Inflate: blocks1.1.conv2.weight, torch.Size([64, 64, 1, 1]) => torch.Size([64, 64, 1, 1, 1])
[11/17 12:04:36][INFO] uniformer.py: 412: Inflate: blocks1.1.attn.weight, torch.Size([64, 1, 5, 5]) => torch.Size([64, 1, 5, 5, 5])
[11/17 12:04:36][INFO] uniformer.py: 412: Inflate: blocks1.1.mlp.fc1.weight, torch.Size([256, 64, 1, 1]) => torch.Size([256, 64, 1, 1, 1])
[11/17 12:04:36][INFO] uniformer.py: 412: Inflate: blocks1.1.mlp.fc2.weight, torch.Size([64, 256, 1, 1]) => torch.Size([64, 256, 1, 1, 1])
[11/17 12:04:36][INFO] uniformer.py: 412: Inflate: blocks1.2.pos_embed.weight, torch.Size([64, 1, 3, 3]) => torch.Size([64, 1, 3, 3, 3])
[11/17 12:04:36][INFO] uniformer.py: 412: Inflate: blocks1.2.conv1.weight, torch.Size([64, 64, 1, 1]) => torch.Size([64, 64, 1, 1, 1])
[11/17 12:04:36][INFO] uniformer.py: 412: Inflate: blocks1.2.conv2.weight, torch.Size([64, 64, 1, 1]) => torch.Size([64, 64, 1, 1, 1])
[11/17 12:04:36][INFO] uniformer.py: 412: Inflate: blocks1.2.attn.weight, torch.Size([64, 1, 5, 5]) => torch.Size([64, 1, 5, 5, 5])
[11/17 12:04:36][INFO] uniformer.py: 412: Inflate: blocks1.2.mlp.fc1.weight, torch.Size([256, 64, 1, 1]) => torch.Size([256, 64, 1, 1, 1])
[11/17 12:04:36][INFO] uniformer.py: 412: Inflate: blocks1.2.mlp.fc2.weight, torch.Size([64, 256, 1, 1]) => torch.Size([64, 256, 1, 1, 1])
[11/17 12:04:36][INFO] uniformer.py: 412: Inflate: blocks2.0.pos_embed.weight, torch.Size([128, 1, 3, 3]) => torch.Size([128, 1, 3, 3, 3])
[11/17 12:04:36][INFO] uniformer.py: 412: Inflate: blocks2.0.conv1.weight, torch.Size([128, 128, 1, 1]) => torch.Size([128, 128, 1, 1, 1])
[11/17 12:04:36][INFO] uniformer.py: 412: Inflate: blocks2.0.conv2.weight, torch.Size([128, 128, 1, 1]) => torch.Size([128, 128, 1, 1, 1])
[11/17 12:04:36][INFO] uniformer.py: 412: Inflate: blocks2.0.attn.weight, torch.Size([128, 1, 5, 5]) => torch.Size([128, 1, 5, 5, 5])
[11/17 12:04:36][INFO] uniformer.py: 412: Inflate: blocks2.0.mlp.fc1.weight, torch.Size([512, 128, 1, 1]) => torch.Size([512, 128, 1, 1, 1])
[11/17 12:04:36][INFO] uniformer.py: 412: Inflate: blocks2.0.mlp.fc2.weight, torch.Size([128, 512, 1, 1]) => torch.Size([128, 512, 1, 1, 1])
[11/17 12:04:36][INFO] uniformer.py: 412: Inflate: blocks2.1.pos_embed.weight, torch.Size([128, 1, 3, 3]) => torch.Size([128, 1, 3, 3, 3])
[11/17 12:04:36][INFO] uniformer.py: 412: Inflate: blocks2.1.conv1.weight, torch.Size([128, 128, 1, 1]) => torch.Size([128, 128, 1, 1, 1])
[11/17 12:04:36][INFO] uniformer.py: 412: Inflate: blocks2.1.conv2.weight, torch.Size([128, 128, 1, 1]) => torch.Size([128, 128, 1, 1, 1])
[11/17 12:04:36][INFO] uniformer.py: 412: Inflate: blocks2.1.attn.weight, torch.Size([128, 1, 5, 5]) => torch.Size([128, 1, 5, 5, 5])
[11/17 12:04:36][INFO] uniformer.py: 412: Inflate: blocks2.1.mlp.fc1.weight, torch.Size([512, 128, 1, 1]) => torch.Size([512, 128, 1, 1, 1])
[11/17 12:04:36][INFO] uniformer.py: 412: Inflate: blocks2.1.mlp.fc2.weight, torch.Size([128, 512, 1, 1]) => torch.Size([128, 512, 1, 1, 1])
[11/17 12:04:36][INFO] uniformer.py: 412: Inflate: blocks2.2.pos_embed.weight, torch.Size([128, 1, 3, 3]) => torch.Size([128, 1, 3, 3, 3])
[11/17 12:04:36][INFO] uniformer.py: 412: Inflate: blocks2.2.conv1.weight, torch.Size([128, 128, 1, 1]) => torch.Size([128, 128, 1, 1, 1])
[11/17 12:04:36][INFO] uniformer.py: 412: Inflate: blocks2.2.conv2.weight, torch.Size([128, 128, 1, 1]) => torch.Size([128, 128, 1, 1, 1])
[11/17 12:04:36][INFO] uniformer.py: 412: Inflate: blocks2.2.attn.weight, torch.Size([128, 1, 5, 5]) => torch.Size([128, 1, 5, 5, 5])
[11/17 12:04:36][INFO] uniformer.py: 412: Inflate: blocks2.2.mlp.fc1.weight, torch.Size([512, 128, 1, 1]) => torch.Size([512, 128, 1, 1, 1])
[11/17 12:04:36][INFO] uniformer.py: 412: Inflate: blocks2.2.mlp.fc2.weight, torch.Size([128, 512, 1, 1]) => torch.Size([128, 512, 1, 1, 1])
[11/17 12:04:36][INFO] uniformer.py: 412: Inflate: blocks2.3.pos_embed.weight, torch.Size([128, 1, 3, 3]) => torch.Size([128, 1, 3, 3, 3])
[11/17 12:04:36][INFO] uniformer.py: 412: Inflate: blocks2.3.conv1.weight, torch.Size([128, 128, 1, 1]) => torch.Size([128, 128, 1, 1, 1])
[11/17 12:04:36][INFO] uniformer.py: 412: Inflate: blocks2.3.conv2.weight, torch.Size([128, 128, 1, 1]) => torch.Size([128, 128, 1, 1, 1])
[11/17 12:04:36][INFO] uniformer.py: 412: Inflate: blocks2.3.attn.weight, torch.Size([128, 1, 5, 5]) => torch.Size([128, 1, 5, 5, 5])
[11/17 12:04:36][INFO] uniformer.py: 412: Inflate: blocks2.3.mlp.fc1.weight, torch.Size([512, 128, 1, 1]) => torch.Size([512, 128, 1, 1, 1])
[11/17 12:04:36][INFO] uniformer.py: 412: Inflate: blocks2.3.mlp.fc2.weight, torch.Size([128, 512, 1, 1]) => torch.Size([128, 512, 1, 1, 1])
[11/17 12:04:36][INFO] uniformer.py: 412: Inflate: blocks3.0.pos_embed.weight, torch.Size([320, 1, 3, 3]) => torch.Size([320, 1, 3, 3, 3])
[11/17 12:04:36][INFO] uniformer.py: 412: Inflate: blocks3.1.pos_embed.weight, torch.Size([320, 1, 3, 3]) => torch.Size([320, 1, 3, 3, 3])
[11/17 12:04:36][INFO] uniformer.py: 412: Inflate: blocks3.2.pos_embed.weight, torch.Size([320, 1, 3, 3]) => torch.Size([320, 1, 3, 3, 3])
[11/17 12:04:36][INFO] uniformer.py: 412: Inflate: blocks3.3.pos_embed.weight, torch.Size([320, 1, 3, 3]) => torch.Size([320, 1, 3, 3, 3])
[11/17 12:04:36][INFO] uniformer.py: 412: Inflate: blocks3.4.pos_embed.weight, torch.Size([320, 1, 3, 3]) => torch.Size([320, 1, 3, 3, 3])
[11/17 12:04:36][INFO] uniformer.py: 412: Inflate: blocks3.5.pos_embed.weight, torch.Size([320, 1, 3, 3]) => torch.Size([320, 1, 3, 3, 3])
[11/17 12:04:36][INFO] uniformer.py: 412: Inflate: blocks3.6.pos_embed.weight, torch.Size([320, 1, 3, 3]) => torch.Size([320, 1, 3, 3, 3])
[11/17 12:04:36][INFO] uniformer.py: 412: Inflate: blocks3.7.pos_embed.weight, torch.Size([320, 1, 3, 3]) => torch.Size([320, 1, 3, 3, 3])
[11/17 12:04:36][INFO] uniformer.py: 412: Inflate: blocks4.0.pos_embed.weight, torch.Size([512, 1, 3, 3]) => torch.Size([512, 1, 3, 3, 3])
[11/17 12:04:36][INFO] uniformer.py: 412: Inflate: blocks4.1.pos_embed.weight, torch.Size([512, 1, 3, 3]) => torch.Size([512, 1, 3, 3, 3])
[11/17 12:04:36][INFO] uniformer.py: 412: Inflate: blocks4.2.pos_embed.weight, torch.Size([512, 1, 3, 3]) => torch.Size([512, 1, 3, 3, 3])
[11/17 12:04:36][INFO] uniformer.py: 410: Ignore: head.weight
[11/17 12:04:36][INFO] uniformer.py: 410: Ignore: head.bias
[11/17 12:04:36][INFO] build.py:  45: load pretrained model
[11/17 12:04:37][INFO] misc.py: 183: Model:
...
[11/17 12:04:37][INFO] misc.py: 184: Params: 21,400,400
[11/17 12:04:37][INFO] misc.py: 185: Mem: 0.0800790786743164 MB
e:\master\uniformer\video_classification\slowfast\models\uniformer.py:85: UserWarning: __floordiv__ is deprecated, and its behavior will change in a future version of pytorch. It currently rounds toward 0 (like the 'trunc' function NOT 'floor'). This results in incorrect rounding for negative values. To keep the current behavior, use torch.div(a, b, rounding_mode='trunc'), or for actual floor division, use torch.div(a, b, rounding_mode='floor').
  qkv = self.qkv(x).reshape(B, N, 3, self.num_heads, C // self.num_heads).permute(2, 0, 3, 1, 4)
[11/17 12:04:39][WARNING] jit_analysis.py: 499: Unsupported operator aten::add encountered 42 time(s)
[11/17 12:04:39][WARNING] jit_analysis.py: 499: Unsupported operator aten::gelu encountered 14 time(s)
[11/17 12:04:39][WARNING] jit_analysis.py: 499: Unsupported operator prim::PythonOp.CheckpointFunction encountered 4 time(s)
[11/17 12:04:39][WARNING] jit_analysis.py: 499: Unsupported operator aten::div encountered 7 time(s)
[11/17 12:04:39][WARNING] jit_analysis.py: 499: Unsupported operator aten::mul encountered 7 time(s)
[11/17 12:04:39][WARNING] jit_analysis.py: 499: Unsupported operator aten::softmax encountered 7 time(s)
[11/17 12:04:39][WARNING] jit_analysis.py: 499: Unsupported operator aten::mean encountered 1 time(s)
[11/17 12:04:39][WARNING] jit_analysis.py: 511: The following submodules of the model were never called during the trace of the graph. They may be unused, or they were accessed by direct calls to .forward() or via other python methods. In the latter case they will have zeros for statistics, though their statistics will still contribute to their parent calling module.
blocks1.1.drop_path, blocks1.2.drop_path, blocks2.0.drop_path, blocks2.1.drop_path, blocks2.2.drop_path, blocks2.3.drop_path, blocks3.0, blocks3.0.attn, blocks3.0.attn.attn_drop, blocks3.0.attn.proj, blocks3.0.attn.proj_drop, blocks3.0.attn.qkv, blocks3.0.drop_path, blocks3.0.mlp, blocks3.0.mlp.act, blocks3.0.mlp.drop, blocks3.0.mlp.fc1, blocks3.0.mlp.fc2, blocks3.0.norm1, blocks3.0.norm2, blocks3.0.pos_embed, blocks3.1, blocks3.1.attn, blocks3.1.attn.attn_drop, blocks3.1.attn.proj, blocks3.1.attn.proj_drop, blocks3.1.attn.qkv, blocks3.1.drop_path, blocks3.1.mlp, blocks3.1.mlp.act, blocks3.1.mlp.drop, blocks3.1.mlp.fc1, blocks3.1.mlp.fc2, blocks3.1.norm1, blocks3.1.norm2, blocks3.1.pos_embed, blocks3.2, blocks3.2.attn, blocks3.2.attn.attn_drop, blocks3.2.attn.proj, blocks3.2.attn.proj_drop, blocks3.2.attn.qkv, blocks3.2.drop_path, blocks3.2.mlp, blocks3.2.mlp.act, blocks3.2.mlp.drop, blocks3.2.mlp.fc1, blocks3.2.mlp.fc2, blocks3.2.norm1, blocks3.2.norm2, blocks3.2.pos_embed, blocks3.3, blocks3.3.attn, blocks3.3.attn.attn_drop, blocks3.3.attn.proj, blocks3.3.attn.proj_drop, blocks3.3.attn.qkv, blocks3.3.drop_path, blocks3.3.mlp, blocks3.3.mlp.act, blocks3.3.mlp.drop, blocks3.3.mlp.fc1, blocks3.3.mlp.fc2, blocks3.3.norm1, blocks3.3.norm2, blocks3.3.pos_embed, blocks3.4.drop_path, blocks3.5.drop_path, blocks3.6.drop_path, blocks3.7.drop_path, blocks4.0.drop_path, blocks4.1.drop_path, blocks4.2.drop_path
[11/17 12:04:39][INFO] misc.py: 186: Flops: 12.149269504 G
e:\master\uniformer\video_classification\slowfast\models\uniformer.py:85: UserWarning: __floordiv__ is deprecated, and its behavior will change in a future version of pytorch. It currently rounds toward 0 (like the 'trunc' function NOT 'floor'). This results in incorrect rounding for negative values. To keep the current behavior, use torch.div(a, b, rounding_mode='trunc'), or for actual floor division, use torch.div(a, b, rounding_mode='floor').
  qkv = self.qkv(x).reshape(B, N, 3, self.num_heads, C // self.num_heads).permute(2, 0, 3, 1, 4)
[11/17 12:04:40][WARNING] jit_analysis.py: 499: Unsupported operator aten::layer_norm encountered 18 time(s)
[11/17 12:04:40][WARNING] jit_analysis.py: 499: Unsupported operator aten::add encountered 42 time(s)
[11/17 12:04:40][WARNING] jit_analysis.py: 499: Unsupported operator aten::batch_norm encountered 15 time(s)
[11/17 12:04:40][WARNING] jit_analysis.py: 499: Unsupported operator aten::gelu encountered 14 time(s)
[11/17 12:04:40][WARNING] jit_analysis.py: 499: Unsupported operator prim::PythonOp.CheckpointFunction encountered 4 time(s)
[11/17 12:04:40][WARNING] jit_analysis.py: 499: Unsupported operator aten::div encountered 7 time(s)
[11/17 12:04:40][WARNING] jit_analysis.py: 499: Unsupported operator aten::mul encountered 7 time(s)
[11/17 12:04:40][WARNING] jit_analysis.py: 499: Unsupported operator aten::softmax encountered 7 time(s)
[11/17 12:04:40][WARNING] jit_analysis.py: 499: Unsupported operator aten::mean encountered 1 time(s)
[11/17 12:04:40][WARNING] jit_analysis.py: 511: The following submodules of the model were never called during the trace of the graph. They may be unused, or they were accessed by direct calls to .forward() or via other python methods. In the latter case they will have zeros for statistics, though their statistics will still contribute to their parent calling module.
blocks1.1.drop_path, blocks1.2.drop_path, blocks2.0.drop_path, blocks2.1.drop_path, blocks2.2.drop_path, blocks2.3.drop_path, blocks3.0, blocks3.0.attn, blocks3.0.attn.attn_drop, blocks3.0.attn.proj, blocks3.0.attn.proj_drop, blocks3.0.attn.qkv, blocks3.0.drop_path, blocks3.0.mlp, blocks3.0.mlp.act, blocks3.0.mlp.drop, blocks3.0.mlp.fc1, blocks3.0.mlp.fc2, blocks3.0.norm1, blocks3.0.norm2, blocks3.0.pos_embed, blocks3.1, blocks3.1.attn, blocks3.1.attn.attn_drop, blocks3.1.attn.proj, blocks3.1.attn.proj_drop, blocks3.1.attn.qkv, blocks3.1.drop_path, blocks3.1.mlp, blocks3.1.mlp.act, blocks3.1.mlp.drop, blocks3.1.mlp.fc1, blocks3.1.mlp.fc2, blocks3.1.norm1, blocks3.1.norm2, blocks3.1.pos_embed, blocks3.2, blocks3.2.attn, blocks3.2.attn.attn_drop, blocks3.2.attn.proj, blocks3.2.attn.proj_drop, blocks3.2.attn.qkv, blocks3.2.drop_path, blocks3.2.mlp, blocks3.2.mlp.act, blocks3.2.mlp.drop, blocks3.2.mlp.fc1, blocks3.2.mlp.fc2, blocks3.2.norm1, blocks3.2.norm2, blocks3.2.pos_embed, blocks3.3, blocks3.3.attn, blocks3.3.attn.attn_drop, blocks3.3.attn.proj, blocks3.3.attn.proj_drop, blocks3.3.attn.qkv, blocks3.3.drop_path, blocks3.3.mlp, blocks3.3.mlp.act, blocks3.3.mlp.drop, blocks3.3.mlp.fc1, blocks3.3.mlp.fc2, blocks3.3.norm1, blocks3.3.norm2, blocks3.3.pos_embed, blocks3.4.drop_path, blocks3.5.drop_path, blocks3.6.drop_path, blocks3.7.drop_path, blocks4.0.drop_path, blocks4.1.drop_path, blocks4.2.drop_path
[11/17 12:04:40][INFO] misc.py: 191: Activations: 65.24801599999999 M
[11/17 12:04:40][INFO] misc.py: 196: nvidia-smi
Thu Nov 17 12:04:40 2022
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 522.25       Driver Version: 522.25       CUDA Version: 11.8     |
|-------------------------------+----------------------+----------------------+
| GPU  Name            TCC/WDDM | Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|                               |                      |               MIG M. |
|===============================+======================+======================|
|   0  NVIDIA GeForce ... WDDM  | 00000000:26:00.0  On |                  N/A |
|  0%   51C    P2    28W / 120W |   2424MiB /  6144MiB |     11%      Default |
|                               |                      |                  N/A |
+-------------------------------+----------------------+----------------------+

+-----------------------------------------------------------------------------+
| Processes:                                                                  |
|  GPU   GI   CI        PID   Type   Process name                  GPU Memory |
|        ID   ID                                                   Usage      |
|=============================================================================|
|    0   N/A  N/A      1456    C+G   ...\PowerToys.FancyZones.exe    N/A      |
|    0   N/A  N/A      5844    C+G   ...werToys.PowerLauncher.exe    N/A      |
|    0   N/A  N/A      9248    C+G   ...6.0.3.0\GoogleDriveFS.exe    N/A      |
|    0   N/A  N/A     10196    C+G   C:\Windows\explorer.exe         N/A      |
|    0   N/A  N/A     11424    C+G   ...ropbox\Client\Dropbox.exe    N/A      |
|    0   N/A  N/A     11960    C+G   ...8bbwe\Microsoft.Notes.exe    N/A      |
|    0   N/A  N/A     12932    C+G   ...5n1h2txyewy\SearchApp.exe    N/A      |
|    0   N/A  N/A     13532    C+G   ...bbwe\Microsoft.Photos.exe    N/A      |
|    0   N/A  N/A     13536    C+G   ...me\Application\chrome.exe    N/A      |
|    0   N/A  N/A     15480    C+G   ...2txyewy\TextInputHost.exe    N/A      |
|    0   N/A  N/A     16648    C+G   ...ram Files\LGHUB\lghub.exe    N/A      |
|    0   N/A  N/A     17376    C+G   ...perience\NVIDIA Share.exe    N/A      |
|    0   N/A  N/A     19824      C   ...envs\uniformer\python.exe    N/A      |
|    0   N/A  N/A     20416    C+G   ...ons\Grammarly.Desktop.exe    N/A      |
|    0   N/A  N/A     22376    C+G   ...werToys.ColorPickerUI.exe    N/A      |
|    0   N/A  N/A     24568    C+G   ...y\ShellExperienceHost.exe    N/A      |
|    0   N/A  N/A     26792    C+G   ...zpdnekdrzrea0\Spotify.exe    N/A      |
|    0   N/A  N/A     28376    C+G   ...icrosoft VS Code\Code.exe    N/A      |
+-----------------------------------------------------------------------------+
bn 30, non bn 102, zero 154
[11/17 12:04:40][INFO] checkpoint_amp.py: 501: Load from last checkpoint, ./exp/uniformer_s8x8_k400\checkpoints\checkpoint_epoch_00002.pyth.
[11/17 12:04:40][INFO] checkpoint_amp.py: 213: Loading network weights from ./exp/uniformer_s8x8_k400\checkpoints\checkpoint_epoch_00002.pyth.
[11/17 12:04:40][INFO] kinetics.py:  76: Constructing Kinetics train...
[11/17 12:04:40][INFO] kinetics.py: 123: Constructing kinetics dataloader (size: 1275) from ./data_vid\train.csv
[11/17 12:04:40][INFO] kinetics.py:  76: Constructing Kinetics val...
[11/17 12:04:40][INFO] kinetics.py: 123: Constructing kinetics dataloader (size: 200) from ./data_vid\val.csv
[11/17 12:04:40][INFO] tensorboard_vis.py:  54: To see logged results in Tensorboard, please launch using the command             `tensorboard  --port=<port-number> --logdir ./exp/uniformer_s8x8_k400\runs-kinetics`
[11/17 12:04:40][INFO] train_net.py: 451: Start epoch: 3
D:\CAIDATPHANMEM\miniconda3\envs\uniformer\lib\site-packages\torchvision\transforms\_functional_video.py:5: UserWarning: The _functional_video module is deprecated. Please use the functional module instead.
  warnings.warn(
D:\CAIDATPHANMEM\miniconda3\envs\uniformer\lib\site-packages\torchvision\transforms\_transforms_video.py:25: UserWarning: The _transforms_video module is deprecated. Please use the transforms module instead.
  warnings.warn(
[11/17 12:04:43][INFO] test_net.py: 157: Test with config:
[11/17 12:04:43][INFO] test_net.py: 158: AUG:
  AA_TYPE: rand-m7-n4-mstd0.5-inc1
  COLOR_JITTER: 0.4
  ENABLE: True
  INTERPOLATION: bicubic
  NUM_SAMPLE: 2
  RE_COUNT: 1
  RE_MODE: pixel
  RE_PROB: 0.25
  RE_SPLIT: False
AVA:
  ANNOTATION_DIR: /mnt/vol/gfsai-flash3-east/ai-group/users/haoqifan/ava/frame_list/
  BGR: False
  DETECTION_SCORE_THRESH: 0.9
  EXCLUSION_FILE: ava_val_excluded_timestamps_v2.2.csv
  FRAME_DIR: /mnt/fair-flash3-east/ava_trainval_frames.img/
  FRAME_LIST_DIR: /mnt/vol/gfsai-flash3-east/ai-group/users/haoqifan/ava/frame_list/
  FULL_TEST_ON_VAL: False
  GROUNDTRUTH_FILE: ava_val_v2.2.csv
  IMG_PROC_BACKEND: cv2
  LABEL_MAP_FILE: ava_action_list_v2.2_for_activitynet_2019.pbtxt
  TEST_FORCE_FLIP: False
  TEST_LISTS: ['val.csv']
  TEST_PREDICT_BOX_LISTS: ['ava_val_predicted_boxes.csv']
  TRAIN_GT_BOX_LISTS: ['ava_train_v2.2.csv']
  TRAIN_LISTS: ['train.csv']
  TRAIN_PCA_JITTER_ONLY: True
  TRAIN_PREDICT_BOX_LISTS: []
  TRAIN_USE_COLOR_AUGMENTATION: False
BENCHMARK:
  LOG_PERIOD: 100
  NUM_EPOCHS: 5
  SHUFFLE: True
BN:
  NORM_TYPE: batchnorm
  NUM_BATCHES_PRECISE: 200
  NUM_SPLITS: 1
  NUM_SYNC_DEVICES: 1
  USE_PRECISE_STATS: False
  WEIGHT_DECAY: 0.0
DATA:
  DECODING_BACKEND: decord
  ENSEMBLE_METHOD: sum
  IMAGE_TEMPLATE: {:05d}.jpg
  INPUT_CHANNEL_NUM: [3]
  INV_UNIFORM_SAMPLE: False
  LABEL_PATH_TEMPLATE: somesomev1_rgb_{}_split.txt
  MEAN: [0.45, 0.45, 0.45]
  MULTI_LABEL: False
  NUM_FRAMES: 8
  PATH_LABEL_SEPARATOR: ,
  PATH_PREFIX:
  PATH_TO_DATA_DIR: ./data_vid
  PATH_TO_PRELOAD_IMDB:
  RANDOM_FLIP: True
  REVERSE_INPUT_CHANNEL: False
  SAMPLING_RATE: 8
  STD: [0.225, 0.225, 0.225]
  TARGET_FPS: 30
  TEST_CROP_SIZE: 224
  TRAIN_CROP_SIZE: 224
  TRAIN_JITTER_ASPECT_RELATIVE: [0.75, 1.3333]
  TRAIN_JITTER_MOTION_SHIFT: False
  TRAIN_JITTER_SCALES: [256, 320]
  TRAIN_JITTER_SCALES_RELATIVE: [0.08, 1.0]
  TRAIN_PCA_EIGVAL: [0.225, 0.224, 0.229]
  TRAIN_PCA_EIGVEC: [[-0.5675, 0.7192, 0.4009], [-0.5808, -0.0045, -0.814], [-0.5836, -0.6948, 0.4203]]
  USE_OFFSET_SAMPLING: True
DATA_LOADER:
  ENABLE_MULTI_THREAD_DECODE: False
  NUM_WORKERS: 8
  PIN_MEMORY: True
DEMO:
  BUFFER_SIZE: 0
  CLIP_VIS_SIZE: 10
  COMMON_CLASS_NAMES: ['watch (a person)', 'talk to (e.g., self, a person, a group)', 'listen to (a person)', 'touch (an object)', 'carry/hold (an object)', 'walk', 'sit', 'lie/sleep', 'bend/bow (at the waist)']
  COMMON_CLASS_THRES: 0.7
  DETECTRON2_CFG: COCO-Detection/faster_rcnn_R_50_FPN_3x.yaml
  DETECTRON2_THRESH: 0.9
  DETECTRON2_WEIGHTS: detectron2://COCO-Detection/faster_rcnn_R_50_FPN_3x/137849458/model_final_280758.pkl
  DISPLAY_HEIGHT: 0
  DISPLAY_WIDTH: 0
  ENABLE: False
  FPS: 30
  GT_BOXES:
  INPUT_FORMAT: BGR
  INPUT_VIDEO:
  LABEL_FILE_PATH:
  NUM_CLIPS_SKIP: 0
  NUM_VIS_INSTANCES: 2
  OUTPUT_FILE:
  OUTPUT_FPS: -1
  PREDS_BOXES:
  SLOWMO: 1
  STARTING_SECOND: 900
  THREAD_ENABLE: False
  UNCOMMON_CLASS_THRES: 0.3
  VIS_MODE: thres
  WEBCAM: -1
DETECTION:
  ALIGNED: True
  ENABLE: False
  ROI_XFORM_RESOLUTION: 7
  SPATIAL_SCALE_FACTOR: 16
DIST_BACKEND: gloo
LOG_MODEL_INFO: True
LOG_PERIOD: 10
MIXUP:
  ALPHA: 0.8
  CUTMIX_ALPHA: 1.0
  ENABLE: True
  LABEL_SMOOTH_VALUE: 0.1
  PROB: 1.0
  SWITCH_PROB: 0.5
MODEL:
  ARCH: uniformer
  CHECKPOINT_NUM: [0, 0, 4, 0]
  DROPCONNECT_RATE: 0.0
  DROPOUT_RATE: 0.5
  FC_INIT_STD: 0.01
  HEAD_ACT: softmax
  LOSS_FUNC: soft_cross_entropy
  MODEL_NAME: Uniformer
  MULTI_PATHWAY_ARCH: ['slowfast']
  NUM_CLASSES: 400
  SINGLE_PATHWAY_ARCH: ['2d', 'c2d', 'i3d', 'slow', 'x3d', 'mvit', 'uniformer']
  USE_CHECKPOINT: True
MULTIGRID:
  BN_BASE_SIZE: 8
  DEFAULT_B: 0
  DEFAULT_S: 0
  DEFAULT_T: 0
  EPOCH_FACTOR: 1.5
  EVAL_FREQ: 3
  LONG_CYCLE: False
  LONG_CYCLE_FACTORS: [(0.25, 0.7071067811865476), (0.5, 0.7071067811865476), (0.5, 1), (1, 1)]
  LONG_CYCLE_SAMPLING_RATE: 0
  SHORT_CYCLE: False
  SHORT_CYCLE_FACTORS: [0.5, 0.7071067811865476]
MVIT:
  CLS_EMBED_ON: True
  DEPTH: 16
  DIM_MUL: []
  DROPOUT_RATE: 0.0
  DROPPATH_RATE: 0.1
  EMBED_DIM: 96
  HEAD_MUL: []
  MLP_RATIO: 4.0
  MODE: conv
  NORM: layernorm
  NORM_STEM: False
  NUM_HEADS: 1
  PATCH_2D: False
  PATCH_KERNEL: [3, 7, 7]
  PATCH_PADDING: [2, 4, 4]
  PATCH_STRIDE: [2, 4, 4]
  POOL_KVQ_KERNEL: None
  POOL_KV_STRIDE: []
  POOL_Q_STRIDE: []
  QKV_BIAS: True
  SEP_POS_EMBED: False
  ZERO_DECAY_POS_CLS: True
NONLOCAL:
  GROUP: [[1], [1], [1], [1]]
  INSTANTIATION: dot_product
  LOCATION: [[[]], [[]], [[]], [[]]]
  POOL: [[[1, 2, 2], [1, 2, 2]], [[1, 2, 2], [1, 2, 2]], [[1, 2, 2], [1, 2, 2]], [[1, 2, 2], [1, 2, 2]]]
NUM_GPUS: 1
NUM_SHARDS: 1
OUTPUT_DIR: ./exp/uniformer_s8x8_k400
RESNET:
  DEPTH: 50
  INPLACE_RELU: True
  NUM_BLOCK_TEMP_KERNEL: [[3], [4], [6], [3]]
  NUM_GROUPS: 1
  SPATIAL_DILATIONS: [[1], [1], [1], [1]]
  SPATIAL_STRIDES: [[1], [2], [2], [2]]
  STRIDE_1X1: False
  TRANS_FUNC: bottleneck_transform
  WIDTH_PER_GROUP: 64
  ZERO_INIT_FINAL_BN: False
RNG_SEED: 6666
SHARD_ID: 0
SLOWFAST:
  ALPHA: 8
  BETA_INV: 8
  FUSION_CONV_CHANNEL_RATIO: 2
  FUSION_KERNEL_SZ: 5
SOLVER:
  BASE_LR: 0.0004
  BASE_LR_SCALE_NUM_SHARDS: True
  CLIP_GRADIENT: 20
  COSINE_AFTER_WARMUP: True
  COSINE_END_LR: 1e-06
  DAMPENING: 0.0
  GAMMA: 0.1
  LRS: []
  LR_POLICY: cosine
  MAX_EPOCH: 2
  MOMENTUM: 0.9
  NESTEROV: True
  OPTIMIZING_METHOD: adamw
  STEPS: []
  STEP_SIZE: 1
  WARMUP_EPOCHS: 10.0
  WARMUP_FACTOR: 0.1
  WARMUP_START_LR: 1e-06
  WEIGHT_DECAY: 0.05
  ZERO_WD_1D_PARAM: True
TENSORBOARD:
  CATEGORIES_PATH:
  CLASS_NAMES_PATH:
  CONFUSION_MATRIX:
    ENABLE: False
    FIGSIZE: [8, 8]
    SUBSET_PATH:
  ENABLE: True
  HISTOGRAM:
    ENABLE: False
    FIGSIZE: [8, 8]
    SUBSET_PATH:
    TOPK: 10
  LOG_DIR:
  MODEL_VIS:
    ACTIVATIONS: False
    COLORMAP: Pastel2
    ENABLE: False
    GRAD_CAM:
      COLORMAP: viridis
      ENABLE: True
      LAYER_LIST: []
      USE_TRUE_LABEL: False
    INPUT_VIDEO: False
    LAYER_LIST: []
    MODEL_WEIGHTS: False
    TOPK_PREDS: 1
  PREDICTIONS_PATH:
  WRONG_PRED_VIS:
    ENABLE: False
    SUBSET_PATH:
    TAG: Incorrectly classified videos.
TEST:
  BATCH_SIZE: 64
  CHECKPOINT_FILE_PATH:
  CHECKPOINT_TYPE: pytorch
  DATASET: kinetics
  ENABLE: True
  NUM_ENSEMBLE_VIEWS: 1
  NUM_SPATIAL_CROPS: 1
  SAVE_RESULTS_PATH:
TRAIN:
  AUTO_RESUME: True
  BATCH_SIZE: 8
  CHECKPOINT_CLEAR_NAME_PATTERN: ()
  CHECKPOINT_EPOCH_RESET: False
  CHECKPOINT_FILE_PATH: ./exp/uniformer_s8x8_k400/checkpoints/checkpoint_epoch_00002.pyth
  CHECKPOINT_INFLATE: False
  CHECKPOINT_PERIOD: 1
  CHECKPOINT_TYPE: pytorch
  DATASET: kinetics
  ENABLE: True
  EVAL_PERIOD: 5
UNIFORMER:
  ATTENTION_DROPOUT_RATE: 0
  DEPTH: [3, 4, 8, 3]
  DROPOUT_RATE: 0
  DROP_DEPTH_RATE: 0.1
  EMBED_DIM: [64, 128, 320, 512]
  HEAD_DIM: 64
  MLP_RATIO: 4
  PRETRAIN_NAME: uniformer_small_in1k
  QKV_BIAS: True
  QKV_SCALE: None
  REPRESENTATION_SIZE: None
  SPLIT: False
  STAGE_TYPE: [0, 0, 1, 1]
  STD: False
X3D:
  BN_LIN5: False
  BOTTLENECK_FACTOR: 1.0
  CHANNELWISE_3x3x3: True
  DEPTH_FACTOR: 1.0
  DIM_C1: 12
  DIM_C5: 2048
  SCALE_RES2: False
  WIDTH_FACTOR: 1.0
[11/17 12:04:43][INFO] uniformer.py: 287: Use checkpoint: True
[11/17 12:04:43][INFO] uniformer.py: 288: Checkpoint number: [0, 0, 4, 0]
[11/17 12:04:43][INFO] uniformer.py: 412: Inflate: patch_embed1.proj.weight, torch.Size([64, 3, 4, 4]) => torch.Size([64, 3, 3, 4, 4])
[11/17 12:04:43][INFO] uniformer.py: 412: Inflate: patch_embed2.proj.weight, torch.Size([128, 64, 2, 2]) => torch.Size([128, 64, 1, 2, 2])
[11/17 12:04:43][INFO] uniformer.py: 412: Inflate: patch_embed3.proj.weight, torch.Size([320, 128, 2, 2]) => torch.Size([320, 128, 1, 2, 2])
[11/17 12:04:43][INFO] uniformer.py: 412: Inflate: patch_embed4.proj.weight, torch.Size([512, 320, 2, 2]) => torch.Size([512, 320, 1, 2, 2])
[11/17 12:04:43][INFO] uniformer.py: 412: Inflate: blocks1.0.pos_embed.weight, torch.Size([64, 1, 3, 3]) => torch.Size([64, 1, 3, 3, 3])
[11/17 12:04:43][INFO] uniformer.py: 412: Inflate: blocks1.0.conv1.weight, torch.Size([64, 64, 1, 1]) => torch.Size([64, 64, 1, 1, 1])
[11/17 12:04:43][INFO] uniformer.py: 412: Inflate: blocks1.0.conv2.weight, torch.Size([64, 64, 1, 1]) => torch.Size([64, 64, 1, 1, 1])
[11/17 12:04:43][INFO] uniformer.py: 412: Inflate: blocks1.0.attn.weight, torch.Size([64, 1, 5, 5]) => torch.Size([64, 1, 5, 5, 5])
[11/17 12:04:43][INFO] uniformer.py: 412: Inflate: blocks1.0.mlp.fc1.weight, torch.Size([256, 64, 1, 1]) => torch.Size([256, 64, 1, 1, 1])
[11/17 12:04:43][INFO] uniformer.py: 412: Inflate: blocks1.0.mlp.fc2.weight, torch.Size([64, 256, 1, 1]) => torch.Size([64, 256, 1, 1, 1])
[11/17 12:04:43][INFO] uniformer.py: 412: Inflate: blocks1.1.pos_embed.weight, torch.Size([64, 1, 3, 3]) => torch.Size([64, 1, 3, 3, 3])
[11/17 12:04:43][INFO] uniformer.py: 412: Inflate: blocks1.1.conv1.weight, torch.Size([64, 64, 1, 1]) => torch.Size([64, 64, 1, 1, 1])
[11/17 12:04:43][INFO] uniformer.py: 412: Inflate: blocks1.1.conv2.weight, torch.Size([64, 64, 1, 1]) => torch.Size([64, 64, 1, 1, 1])
[11/17 12:04:43][INFO] uniformer.py: 412: Inflate: blocks1.1.attn.weight, torch.Size([64, 1, 5, 5]) => torch.Size([64, 1, 5, 5, 5])
[11/17 12:04:43][INFO] uniformer.py: 412: Inflate: blocks1.1.mlp.fc1.weight, torch.Size([256, 64, 1, 1]) => torch.Size([256, 64, 1, 1, 1])
[11/17 12:04:43][INFO] uniformer.py: 412: Inflate: blocks1.1.mlp.fc2.weight, torch.Size([64, 256, 1, 1]) => torch.Size([64, 256, 1, 1, 1])
[11/17 12:04:43][INFO] uniformer.py: 412: Inflate: blocks1.2.pos_embed.weight, torch.Size([64, 1, 3, 3]) => torch.Size([64, 1, 3, 3, 3])
[11/17 12:04:43][INFO] uniformer.py: 412: Inflate: blocks1.2.conv1.weight, torch.Size([64, 64, 1, 1]) => torch.Size([64, 64, 1, 1, 1])
[11/17 12:04:43][INFO] uniformer.py: 412: Inflate: blocks1.2.conv2.weight, torch.Size([64, 64, 1, 1]) => torch.Size([64, 64, 1, 1, 1])
[11/17 12:04:43][INFO] uniformer.py: 412: Inflate: blocks1.2.attn.weight, torch.Size([64, 1, 5, 5]) => torch.Size([64, 1, 5, 5, 5])
[11/17 12:04:43][INFO] uniformer.py: 412: Inflate: blocks1.2.mlp.fc1.weight, torch.Size([256, 64, 1, 1]) => torch.Size([256, 64, 1, 1, 1])
[11/17 12:04:43][INFO] uniformer.py: 412: Inflate: blocks1.2.mlp.fc2.weight, torch.Size([64, 256, 1, 1]) => torch.Size([64, 256, 1, 1, 1])
[11/17 12:04:43][INFO] uniformer.py: 412: Inflate: blocks2.0.pos_embed.weight, torch.Size([128, 1, 3, 3]) => torch.Size([128, 1, 3, 3, 3])
[11/17 12:04:43][INFO] uniformer.py: 412: Inflate: blocks2.0.conv1.weight, torch.Size([128, 128, 1, 1]) => torch.Size([128, 128, 1, 1, 1])
[11/17 12:04:43][INFO] uniformer.py: 412: Inflate: blocks2.0.conv2.weight, torch.Size([128, 128, 1, 1]) => torch.Size([128, 128, 1, 1, 1])
[11/17 12:04:43][INFO] uniformer.py: 412: Inflate: blocks2.0.attn.weight, torch.Size([128, 1, 5, 5]) => torch.Size([128, 1, 5, 5, 5])
[11/17 12:04:43][INFO] uniformer.py: 412: Inflate: blocks2.0.mlp.fc1.weight, torch.Size([512, 128, 1, 1]) => torch.Size([512, 128, 1, 1, 1])
[11/17 12:04:43][INFO] uniformer.py: 412: Inflate: blocks2.0.mlp.fc2.weight, torch.Size([128, 512, 1, 1]) => torch.Size([128, 512, 1, 1, 1])
[11/17 12:04:43][INFO] uniformer.py: 412: Inflate: blocks2.1.pos_embed.weight, torch.Size([128, 1, 3, 3]) => torch.Size([128, 1, 3, 3, 3])
[11/17 12:04:43][INFO] uniformer.py: 412: Inflate: blocks2.1.conv1.weight, torch.Size([128, 128, 1, 1]) => torch.Size([128, 128, 1, 1, 1])
[11/17 12:04:43][INFO] uniformer.py: 412: Inflate: blocks2.1.conv2.weight, torch.Size([128, 128, 1, 1]) => torch.Size([128, 128, 1, 1, 1])
[11/17 12:04:43][INFO] uniformer.py: 412: Inflate: blocks2.1.attn.weight, torch.Size([128, 1, 5, 5]) => torch.Size([128, 1, 5, 5, 5])
[11/17 12:04:43][INFO] uniformer.py: 412: Inflate: blocks2.1.mlp.fc1.weight, torch.Size([512, 128, 1, 1]) => torch.Size([512, 128, 1, 1, 1])
[11/17 12:04:43][INFO] uniformer.py: 412: Inflate: blocks2.1.mlp.fc2.weight, torch.Size([128, 512, 1, 1]) => torch.Size([128, 512, 1, 1, 1])
[11/17 12:04:43][INFO] uniformer.py: 412: Inflate: blocks2.2.pos_embed.weight, torch.Size([128, 1, 3, 3]) => torch.Size([128, 1, 3, 3, 3])
[11/17 12:04:43][INFO] uniformer.py: 412: Inflate: blocks2.2.conv1.weight, torch.Size([128, 128, 1, 1]) => torch.Size([128, 128, 1, 1, 1])
[11/17 12:04:43][INFO] uniformer.py: 412: Inflate: blocks2.2.conv2.weight, torch.Size([128, 128, 1, 1]) => torch.Size([128, 128, 1, 1, 1])
[11/17 12:04:43][INFO] uniformer.py: 412: Inflate: blocks2.2.attn.weight, torch.Size([128, 1, 5, 5]) => torch.Size([128, 1, 5, 5, 5])
[11/17 12:04:43][INFO] uniformer.py: 412: Inflate: blocks2.2.mlp.fc1.weight, torch.Size([512, 128, 1, 1]) => torch.Size([512, 128, 1, 1, 1])
[11/17 12:04:43][INFO] uniformer.py: 412: Inflate: blocks2.2.mlp.fc2.weight, torch.Size([128, 512, 1, 1]) => torch.Size([128, 512, 1, 1, 1])
[11/17 12:04:43][INFO] uniformer.py: 412: Inflate: blocks2.3.pos_embed.weight, torch.Size([128, 1, 3, 3]) => torch.Size([128, 1, 3, 3, 3])
[11/17 12:04:43][INFO] uniformer.py: 412: Inflate: blocks2.3.conv1.weight, torch.Size([128, 128, 1, 1]) => torch.Size([128, 128, 1, 1, 1])
[11/17 12:04:43][INFO] uniformer.py: 412: Inflate: blocks2.3.conv2.weight, torch.Size([128, 128, 1, 1]) => torch.Size([128, 128, 1, 1, 1])
[11/17 12:04:43][INFO] uniformer.py: 412: Inflate: blocks2.3.attn.weight, torch.Size([128, 1, 5, 5]) => torch.Size([128, 1, 5, 5, 5])
[11/17 12:04:43][INFO] uniformer.py: 412: Inflate: blocks2.3.mlp.fc1.weight, torch.Size([512, 128, 1, 1]) => torch.Size([512, 128, 1, 1, 1])
[11/17 12:04:43][INFO] uniformer.py: 412: Inflate: blocks2.3.mlp.fc2.weight, torch.Size([128, 512, 1, 1]) => torch.Size([128, 512, 1, 1, 1])
[11/17 12:04:43][INFO] uniformer.py: 412: Inflate: blocks3.0.pos_embed.weight, torch.Size([320, 1, 3, 3]) => torch.Size([320, 1, 3, 3, 3])
[11/17 12:04:43][INFO] uniformer.py: 412: Inflate: blocks3.1.pos_embed.weight, torch.Size([320, 1, 3, 3]) => torch.Size([320, 1, 3, 3, 3])
[11/17 12:04:43][INFO] uniformer.py: 412: Inflate: blocks3.2.pos_embed.weight, torch.Size([320, 1, 3, 3]) => torch.Size([320, 1, 3, 3, 3])
[11/17 12:04:43][INFO] uniformer.py: 412: Inflate: blocks3.3.pos_embed.weight, torch.Size([320, 1, 3, 3]) => torch.Size([320, 1, 3, 3, 3])
[11/17 12:04:43][INFO] uniformer.py: 412: Inflate: blocks3.4.pos_embed.weight, torch.Size([320, 1, 3, 3]) => torch.Size([320, 1, 3, 3, 3])
[11/17 12:04:43][INFO] uniformer.py: 412: Inflate: blocks3.5.pos_embed.weight, torch.Size([320, 1, 3, 3]) => torch.Size([320, 1, 3, 3, 3])
[11/17 12:04:43][INFO] uniformer.py: 412: Inflate: blocks3.6.pos_embed.weight, torch.Size([320, 1, 3, 3]) => torch.Size([320, 1, 3, 3, 3])
[11/17 12:04:43][INFO] uniformer.py: 412: Inflate: blocks3.7.pos_embed.weight, torch.Size([320, 1, 3, 3]) => torch.Size([320, 1, 3, 3, 3])
[11/17 12:04:43][INFO] uniformer.py: 412: Inflate: blocks4.0.pos_embed.weight, torch.Size([512, 1, 3, 3]) => torch.Size([512, 1, 3, 3, 3])
[11/17 12:04:43][INFO] uniformer.py: 412: Inflate: blocks4.1.pos_embed.weight, torch.Size([512, 1, 3, 3]) => torch.Size([512, 1, 3, 3, 3])
[11/17 12:04:43][INFO] uniformer.py: 412: Inflate: blocks4.2.pos_embed.weight, torch.Size([512, 1, 3, 3]) => torch.Size([512, 1, 3, 3, 3])
[11/17 12:04:43][INFO] uniformer.py: 410: Ignore: head.weight
[11/17 12:04:43][INFO] uniformer.py: 410: Ignore: head.bias
[11/17 12:04:43][INFO] build.py:  45: load pretrained model
[11/17 12:04:43][INFO] misc.py: 183: Model:
...
...
...
[11/17 12:04:43][INFO] misc.py: 184: Params: 21,400,400
[11/17 12:04:43][INFO] misc.py: 185: Mem: 0.0800790786743164 MB
e:\master\uniformer\video_classification\slowfast\models\uniformer.py:85: UserWarning: __floordiv__ is deprecated, and its behavior will change in a future version of pytorch. It currently rounds toward 0 (like the 'trunc' function NOT 'floor'). This results in incorrect rounding for negative values. To keep the current behavior, use torch.div(a, b, rounding_mode='trunc'), or for actual floor division, use torch.div(a, b, rounding_mode='floor').
  qkv = self.qkv(x).reshape(B, N, 3, self.num_heads, C // self.num_heads).permute(2, 0, 3, 1, 4)
[11/17 12:04:46][WARNING] jit_analysis.py: 499: Unsupported operator aten::add encountered 42 time(s)
[11/17 12:04:46][WARNING] jit_analysis.py: 499: Unsupported operator aten::gelu encountered 14 time(s)
[11/17 12:04:46][WARNING] jit_analysis.py: 499: Unsupported operator prim::PythonOp.CheckpointFunction encountered 4 time(s)
[11/17 12:04:46][WARNING] jit_analysis.py: 499: Unsupported operator aten::div encountered 7 time(s)
[11/17 12:04:46][WARNING] jit_analysis.py: 499: Unsupported operator aten::mul encountered 7 time(s)
[11/17 12:04:46][WARNING] jit_analysis.py: 499: Unsupported operator aten::softmax encountered 7 time(s)
[11/17 12:04:46][WARNING] jit_analysis.py: 499: Unsupported operator aten::mean encountered 1 time(s)
[11/17 12:04:46][WARNING] jit_analysis.py: 511: The following submodules of the model were never called during the trace of the graph. They may be unused, or they were accessed by direct calls to .forward() or via other python methods. In the latter case they will have zeros for statistics, though their statistics will still contribute to their parent calling module.
blocks1.1.drop_path, blocks1.2.drop_path, blocks2.0.drop_path, blocks2.1.drop_path, blocks2.2.drop_path, blocks2.3.drop_path, blocks3.0, blocks3.0.attn, blocks3.0.attn.attn_drop, blocks3.0.attn.proj, blocks3.0.attn.proj_drop, blocks3.0.attn.qkv, blocks3.0.drop_path, blocks3.0.mlp, blocks3.0.mlp.act, blocks3.0.mlp.drop, blocks3.0.mlp.fc1, blocks3.0.mlp.fc2, blocks3.0.norm1, blocks3.0.norm2, blocks3.0.pos_embed, blocks3.1, blocks3.1.attn, blocks3.1.attn.attn_drop, blocks3.1.attn.proj, blocks3.1.attn.proj_drop, blocks3.1.attn.qkv, blocks3.1.drop_path, blocks3.1.mlp, blocks3.1.mlp.act, blocks3.1.mlp.drop, blocks3.1.mlp.fc1, blocks3.1.mlp.fc2, blocks3.1.norm1, blocks3.1.norm2, blocks3.1.pos_embed, blocks3.2, blocks3.2.attn, blocks3.2.attn.attn_drop, blocks3.2.attn.proj, blocks3.2.attn.proj_drop, blocks3.2.attn.qkv, blocks3.2.drop_path, blocks3.2.mlp, blocks3.2.mlp.act, blocks3.2.mlp.drop, blocks3.2.mlp.fc1, blocks3.2.mlp.fc2, blocks3.2.norm1, blocks3.2.norm2, blocks3.2.pos_embed, blocks3.3, blocks3.3.attn, blocks3.3.attn.attn_drop, blocks3.3.attn.proj, blocks3.3.attn.proj_drop, blocks3.3.attn.qkv, blocks3.3.drop_path, blocks3.3.mlp, blocks3.3.mlp.act, blocks3.3.mlp.drop, blocks3.3.mlp.fc1, blocks3.3.mlp.fc2, blocks3.3.norm1, blocks3.3.norm2, blocks3.3.pos_embed, blocks3.4.drop_path, blocks3.5.drop_path, blocks3.6.drop_path, blocks3.7.drop_path, blocks4.0.drop_path, blocks4.1.drop_path, blocks4.2.drop_path
[11/17 12:04:46][INFO] misc.py: 186: Flops: 12.149269504 G
e:\master\uniformer\video_classification\slowfast\models\uniformer.py:85: UserWarning: __floordiv__ is deprecated, and its behavior will change in a future version of pytorch. It currently rounds toward 0 (like the 'trunc' function NOT 'floor'). This results in incorrect rounding for negative values. To keep the current behavior, use torch.div(a, b, rounding_mode='trunc'), or for actual floor division, use torch.div(a, b, rounding_mode='floor').
  qkv = self.qkv(x).reshape(B, N, 3, self.num_heads, C // self.num_heads).permute(2, 0, 3, 1, 4)
[11/17 12:04:46][WARNING] jit_analysis.py: 499: Unsupported operator aten::layer_norm encountered 18 time(s)
[11/17 12:04:46][WARNING] jit_analysis.py: 499: Unsupported operator aten::add encountered 42 time(s)
[11/17 12:04:46][WARNING] jit_analysis.py: 499: Unsupported operator aten::batch_norm encountered 15 time(s)
[11/17 12:04:46][WARNING] jit_analysis.py: 499: Unsupported operator aten::gelu encountered 14 time(s)
[11/17 12:04:46][WARNING] jit_analysis.py: 499: Unsupported operator prim::PythonOp.CheckpointFunction encountered 4 time(s)
[11/17 12:04:46][WARNING] jit_analysis.py: 499: Unsupported operator aten::div encountered 7 time(s)
[11/17 12:04:46][WARNING] jit_analysis.py: 499: Unsupported operator aten::mul encountered 7 time(s)
[11/17 12:04:46][WARNING] jit_analysis.py: 499: Unsupported operator aten::softmax encountered 7 time(s)
[11/17 12:04:46][WARNING] jit_analysis.py: 499: Unsupported operator aten::mean encountered 1 time(s)
[11/17 12:04:46][WARNING] jit_analysis.py: 511: The following submodules of the model were never called during the trace of the graph. They may be unused, or they were accessed by direct calls to .forward() or via other python methods. In the latter case they will have zeros for statistics, though their statistics will still contribute to their parent calling module.
blocks1.1.drop_path, blocks1.2.drop_path, blocks2.0.drop_path, blocks2.1.drop_path, blocks2.2.drop_path, blocks2.3.drop_path, blocks3.0, blocks3.0.attn, blocks3.0.attn.attn_drop, blocks3.0.attn.proj, blocks3.0.attn.proj_drop, blocks3.0.attn.qkv, blocks3.0.drop_path, blocks3.0.mlp, blocks3.0.mlp.act, blocks3.0.mlp.drop, blocks3.0.mlp.fc1, blocks3.0.mlp.fc2, blocks3.0.norm1, blocks3.0.norm2, blocks3.0.pos_embed, blocks3.1, blocks3.1.attn, blocks3.1.attn.attn_drop, blocks3.1.attn.proj, blocks3.1.attn.proj_drop, blocks3.1.attn.qkv, blocks3.1.drop_path, blocks3.1.mlp, blocks3.1.mlp.act, blocks3.1.mlp.drop, blocks3.1.mlp.fc1, blocks3.1.mlp.fc2, blocks3.1.norm1, blocks3.1.norm2, blocks3.1.pos_embed, blocks3.2, blocks3.2.attn, blocks3.2.attn.attn_drop, blocks3.2.attn.proj, blocks3.2.attn.proj_drop, blocks3.2.attn.qkv, blocks3.2.drop_path, blocks3.2.mlp, blocks3.2.mlp.act, blocks3.2.mlp.drop, blocks3.2.mlp.fc1, blocks3.2.mlp.fc2, blocks3.2.norm1, blocks3.2.norm2, blocks3.2.pos_embed, blocks3.3, blocks3.3.attn, blocks3.3.attn.attn_drop, blocks3.3.attn.proj, blocks3.3.attn.proj_drop, blocks3.3.attn.qkv, blocks3.3.drop_path, blocks3.3.mlp, blocks3.3.mlp.act, blocks3.3.mlp.drop, blocks3.3.mlp.fc1, blocks3.3.mlp.fc2, blocks3.3.norm1, blocks3.3.norm2, blocks3.3.pos_embed, blocks3.4.drop_path, blocks3.5.drop_path, blocks3.6.drop_path, blocks3.7.drop_path, blocks4.0.drop_path, blocks4.1.drop_path, blocks4.2.drop_path
[11/17 12:04:46][INFO] misc.py: 191: Activations: 65.24801599999999 M
[11/17 12:04:46][INFO] misc.py: 196: nvidia-smi
Thu Nov 17 12:04:46 2022
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 522.25       Driver Version: 522.25       CUDA Version: 11.8     |
|-------------------------------+----------------------+----------------------+
| GPU  Name            TCC/WDDM | Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|                               |                      |               MIG M. |
|===============================+======================+======================|
|   0  NVIDIA GeForce ... WDDM  | 00000000:26:00.0  On |                  N/A |
|  0%   51C    P2    32W / 120W |   2494MiB /  6144MiB |      9%      Default |
|                               |                      |                  N/A |
+-------------------------------+----------------------+----------------------+

+-----------------------------------------------------------------------------+
| Processes:                                                                  |
|  GPU   GI   CI        PID   Type   Process name                  GPU Memory |
|        ID   ID                                                   Usage      |
|=============================================================================|
|    0   N/A  N/A      1456    C+G   ...\PowerToys.FancyZones.exe    N/A      |
|    0   N/A  N/A      5844    C+G   ...werToys.PowerLauncher.exe    N/A      |
|    0   N/A  N/A      9248    C+G   ...6.0.3.0\GoogleDriveFS.exe    N/A      |
|    0   N/A  N/A     10196    C+G   C:\Windows\explorer.exe         N/A      |
|    0   N/A  N/A     11424    C+G   ...ropbox\Client\Dropbox.exe    N/A      |
|    0   N/A  N/A     11960    C+G   ...8bbwe\Microsoft.Notes.exe    N/A      |
|    0   N/A  N/A     12932    C+G   ...5n1h2txyewy\SearchApp.exe    N/A      |
|    0   N/A  N/A     13532    C+G   ...bbwe\Microsoft.Photos.exe    N/A      |
|    0   N/A  N/A     13536    C+G   ...me\Application\chrome.exe    N/A      |
|    0   N/A  N/A     15480    C+G   ...2txyewy\TextInputHost.exe    N/A      |
|    0   N/A  N/A     16648    C+G   ...ram Files\LGHUB\lghub.exe    N/A      |
|    0   N/A  N/A     17376    C+G   ...perience\NVIDIA Share.exe    N/A      |
|    0   N/A  N/A     20416    C+G   ...ons\Grammarly.Desktop.exe    N/A      |
|    0   N/A  N/A     21528      C   ...envs\uniformer\python.exe    N/A      |
|    0   N/A  N/A     22376    C+G   ...werToys.ColorPickerUI.exe    N/A      |
|    0   N/A  N/A     24568    C+G   ...y\ShellExperienceHost.exe    N/A      |
|    0   N/A  N/A     26792    C+G   ...zpdnekdrzrea0\Spotify.exe    N/A      |
|    0   N/A  N/A     28376    C+G   ...icrosoft VS Code\Code.exe    N/A      |
+-----------------------------------------------------------------------------+
[11/17 12:04:46][INFO] checkpoint.py: 213: Loading network weights from ./exp/uniformer_s8x8_k400\checkpoints\checkpoint_epoch_00002.pyth.
[11/17 12:04:46][INFO] kinetics.py:  76: Constructing Kinetics test...
Traceback (most recent call last):
  File "C:\Users\Admin\Documents\master\UniFormer\video_classification\tools\run_net.py", line 31, in <module>
    main()
  File "C:\Users\Admin\Documents\master\UniFormer\video_classification\tools\run_net.py", line 27, in main
    launch_job(cfg=cfg, init_method=args.init_method, func=test)
  File "e:\master\uniformer\video_classification\slowfast\utils\misc.py", line 296, in launch_job
    torch.multiprocessing.spawn(
  File "D:\CAIDATPHANMEM\miniconda3\envs\uniformer\lib\site-packages\torch\multiprocessing\spawn.py", line 230, in spawn
    return start_processes(fn, args, nprocs, join, daemon, start_method='spawn')
  File "D:\CAIDATPHANMEM\miniconda3\envs\uniformer\lib\site-packages\torch\multiprocessing\spawn.py", line 188, in start_processes
    while not context.join():
  File "D:\CAIDATPHANMEM\miniconda3\envs\uniformer\lib\site-packages\torch\multiprocessing\spawn.py", line 150, in join
    raise ProcessRaisedException(msg, error_index, failed_process.pid)
torch.multiprocessing.spawn.ProcessRaisedException:

-- Process 0 terminated with the following error:
Traceback (most recent call last):
  File "D:\CAIDATPHANMEM\miniconda3\envs\uniformer\lib\site-packages\torch\multiprocessing\spawn.py", line 59, in _wrap
    fn(i, *args)
  File "e:\master\uniformer\video_classification\slowfast\utils\multiprocessing.py", line 60, in run
    ret = func(cfg)
  File "C:\Users\Admin\Documents\master\UniFormer\video_classification\tools\test_net.py", line 168, in test
    test_loader = loader.construct_loader(cfg, "test")
  File "e:\master\uniformer\video_classification\slowfast\datasets\loader.py", line 112, in construct_loader
    dataset = build_dataset(dataset_name, cfg, split)
  File "e:\master\uniformer\video_classification\slowfast\datasets\build.py", line 31, in build_dataset
    return DATASET_REGISTRY.get(name)(cfg, split)
  File "e:\master\uniformer\video_classification\slowfast\datasets\kinetics.py", line 77, in __init__
    self._construct_loader()
  File "e:\master\uniformer\video_classification\slowfast\datasets\kinetics.py", line 121, in _construct_loader
    self._split_idx, path_to_file
  File "D:\CAIDATPHANMEM\miniconda3\envs\uniformer\lib\site-packages\torch\utils\data\dataset.py", line 83, in __getattr__
    raise AttributeError
AttributeError


And this is my config in run.sh:

work_path=$(dirname $0)
PYTHONPATH=$PYTHONPATH:./slowfast \
python tools/run_net.py \
  --cfg $work_path/config.yaml \
  DATA.PATH_TO_DATA_DIR ./data_vid \
  DATA.PATH_LABEL_SEPARATOR "," \
  TRAIN.EVAL_PERIOD 5 \
  TRAIN.CHECKPOINT_PERIOD 1 \
  TRAIN.BATCH_SIZE 8 \
  TRAIN.CHECKPOINT_FILE_PATH ./exp/uniformer_s8x8_k400/checkpoints/checkpoint_epoch_00002.pyth \
  NUM_GPUS 1 \
  UNIFORMER.DROP_DEPTH_RATE 0.1 \
  SOLVER.MAX_EPOCH 2 \
  SOLVER.BASE_LR 4e-4 \
  SOLVER.WARMUP_EPOCHS 10.0 \
  DATA.TEST_CROP_SIZE 224 \
  TEST.NUM_ENSEMBLE_VIEWS 1 \
  TEST.NUM_SPATIAL_CROPS 1 \
  RNG_SEED 6666 \
  OUTPUT_DIR $work_path
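
Looking at the log again, the checkpoint itself does seem to be picked up (checkpoint_amp.py reports "Load from last checkpoint ... checkpoint_epoch_00002.pyth" and then "Start epoch: 3"), but I also pass SOLVER.MAX_EPOCH 2, so there may simply be no epochs left to run before the script moves on to the test phase, which is where it crashes. A tiny illustration of that reading (assuming the train loop iterates over range(start_epoch, cfg.SOLVER.MAX_EPOCH), which I have not verified for this exact version):

# "Start epoch: 3" in the log is 1-indexed; internally the resumed index would be 2.
start_epoch = 2   # resumed from checkpoint_epoch_00002.pyth
max_epoch = 2     # SOLVER.MAX_EPOCH from run.sh above

print(list(range(start_epoch, max_epoch)))  # [] -> no training epochs would run

If that reading is correct, I would presumably need to raise SOLVER.MAX_EPOCH when resuming.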

Thank you in advance!