Python 3.10

Model Training and Quantitative Evaluation Pipeline.

  1. For the MOSI, MOSEI, SIMSv2, and MIntRec feature datasets, download the feature file (unaligned_v171_a25_l50.pkl) from one of the following links. The links also contain additional files, such as background noise.

    BaiduYun Disk code: zzx1

    Google Drive
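After downloading, it can help to inspect the feature file before training. The following is a minimal sketch, assuming the file is a standard Python pickle. Its internal layout is not documented in this section, so the helper only reports the top-level type and, for dictionaries, the keys. `summarize_pickle` is a hypothetical helper name, not part of this repository.

```python
import pickle
from pathlib import Path


def summarize_pickle(path):
    """Load a pickle file and describe its top-level structure."""
    # Only unpickle files from a source you trust: pickle can execute code.
    with Path(path).open("rb") as fh:
        obj = pickle.load(fh)
    summary = {"type": type(obj).__name__}
    if isinstance(obj, dict):
        # Assumption: a dict layout is plausible for feature files but not guaranteed.
        summary["keys"] = sorted(str(k) for k in obj)
    return summary


if __name__ == "__main__":
    feature_file = Path("unaligned_v171_a25_l50.pkl")  # file name from step 1
    if feature_file.exists():
        print(summarize_pickle(feature_file))
```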

  2. Use the following scripts to run our code.

    2.1 Running <selected_model> on the <selected_database> dataset with feature-level noise-based augmentation.

    Construct the feature-level noisy augmentation with feat_noise_cons.py. The generated noisy datasets are saved at NOISY_DATASET_ROOT_DIR/DATABASE/NOISETYPE/noisy_feature_INTENSITY_SEED.pkl

    python feat_noise_cons.py --dataset <selected_database> --injected-noise <selected_noise_type> --noise-intensity <selected_intensity> --inject-noise-seed <selected_seed>
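When generating several intensities and seeds, it is convenient to know which output files already exist. The sketch below builds the expected path from the template above and lists the combinations that are still missing. It assumes that the intensity and seed appear in the file name exactly as `str()` renders them, which the script may not do; `noisy_feature_path` and `missing_outputs` are hypothetical helpers.

```python
from itertools import product
from pathlib import Path


def noisy_feature_path(root, dataset, noise_type, intensity, seed):
    """Build the expected output path of feat_noise_cons.py."""
    # Template: NOISY_DATASET_ROOT_DIR/DATABASE/NOISETYPE/noisy_feature_INTENSITY_SEED.pkl
    # Assumption: intensity and seed are formatted with str().
    return Path(root) / dataset / noise_type / f"noisy_feature_{intensity}_{seed}.pkl"


def missing_outputs(root, dataset, noise_type, intensities, seeds):
    """List (intensity, seed) pairs whose noisy feature file is not on disk yet."""
    return [
        (intensity, seed)
        for intensity, seed in product(intensities, seeds)
        if not noisy_feature_path(root, dataset, noise_type, intensity, seed).exists()
    ]
```

Each missing pair can then be passed to feat_noise_cons.py through `--noise-intensity` and `--inject-noise-seed`.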

    2.2 Applying <noise_type> to the raw dataset in <video_path> with raw-video-level noise-based augmentation.

    Construct the raw-video-level noisy augmentation with real_noise_cons.py. The generated noisy data is saved at <save_dir>.

    python real_noise_cons.py --video-dir <video_path> --noise-type <noise_type> --save-dir <save_dir>
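To apply several noise types to the same videos, the command above can be generated once per noise type. This is a minimal sketch that uses only the flags shown above; the per-noise subdirectory layout under `save_root` is an assumption made here for illustration, and the valid noise-type names must be taken from the script itself.

```python
import shlex
from pathlib import Path


def real_noise_commands(video_dir, noise_types, save_root):
    """Yield one real_noise_cons.py command per noise type, each with its own save dir."""
    for noise_type in noise_types:
        # Assumption: one subdirectory per noise type keeps the outputs separate.
        save_dir = Path(save_root) / noise_type
        yield [
            "python", "real_noise_cons.py",
            "--video-dir", str(video_dir),
            "--noise-type", noise_type,
            "--save-dir", str(save_dir),
        ]


if __name__ == "__main__":
    # Placeholder values: substitute real paths and the noise types the script accepts.
    for cmd in real_noise_commands("<video_path>", ["<noise_type>"], "<save_root>"):
        print(shlex.join(cmd))  # to execute: subprocess.run(cmd, check=True)
```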

    2.3 Training phase: run <selected_model> on the <selected_database> dataset with the default configuration, without augmentation.

    python main.py --model <selected_model> --dataset <selected_database>

    Note: Models that use paired perfect and noisy instances (TFR-Net, NIAT, and EMT-DLFR) require noise-based augmentation. For example, with TFRNet:

    python main.py --model TFRNet --augmentation feat_random_drop --dataset <selected_database>
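The augmentation requirement in the note above can be checked before launching a run. The sketch below builds the main.py command and refuses to build one for a paired-instance model without `--augmentation`. Only `TFRNet` and `feat_random_drop` are confirmed by the commands above; the identifiers `NIAT` and `EMT_DLFR` are assumed from the results tables, and `train_command` is a hypothetical helper.

```python
# Models that, per the note above, train on paired perfect and noisy instances.
# Assumption: "NIAT" and "EMT_DLFR" are the CLI identifiers; only "TFRNet" is confirmed.
PAIRED_MODELS = {"TFRNet", "NIAT", "EMT_DLFR"}


def train_command(model, dataset, augmentation=None):
    """Return the argv list for main.py, enforcing the augmentation requirement."""
    if model in PAIRED_MODELS and augmentation is None:
        raise ValueError(f"{model} requires --augmentation (e.g. feat_random_drop)")
    cmd = ["python", "main.py", "--model", model]
    if augmentation is not None:
        cmd += ["--augmentation", augmentation]
    return cmd + ["--dataset", dataset]
```

The returned list can be passed directly to `subprocess.run(cmd, check=True)`.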

    2.4 Test phase: evaluate <selected_model> on the <selected_database> dataset, where <model_save_dir> is the path to the saved model.

    python test.py --model <selected_model> --dataset <selected_database> --model-save-dir <model_save_dir>

Experimental results

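* indicates that data augmentation is applied and that the augmentation type matches the validation noise type. The experimental results on the MOSI and SIMSv2 datasets are shown below. In each table, the first four metric columns (Acc-2, F1, MAE, Corr) report MOSI and the last four report SIMSv2.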

The test noise type is Random Drop

MOSI SIMSv2
Model Acc-2 F1 MAE Corr Acc-2 F1 MAE Corr
T2FN 62.23/64.07 57.08/59.19 130.64 30.68 64.97 63.70 43.43 38.76
TPFN 62.16/64.26 53.81/56.22 130.25 31.73 64.06 62.96 43.51 39.67
CTFN 63.14/65.52 54.95/57.76 123.12 32.72 62.35 60.09 44.59 38.32
MMIN 62.62/64.89 53.98/56.64 128.54 27.98 63.33 62.23 43.71 38.65
GCNET 62.90/64.23 58.81/60.40 128.33 29.03 63.84 63.01 44.27 37.98
----- ----------- ----------- ------ ----- ----- ----- ----- -----
T2FN* 63.11/63.69 61.76/62.46 128.20 32.53 65.70 64.21 43.33 37.86
TPFN* 63.76/63.61 61.23/61.18 128.99 37.91 66.89 63.06 42.93 38.99
CTFN* 64.67/65.60 62.56/63.63 123.31 37.62 66.54 65.05 42.23 41.81
MMIN* 64.74/65.53 63.47/64.39 124.13 36.20 66.76 64.74 43.05 40.59
GCNET* 62.65/62.98 61.04/61.51 130.87 31.71 66.32 63.70 43.95 39.61
TFRNet* 66.88/67.39 65.87/66.48 120.36 43.86 67.47 65.93 43.99 42.72
NIAT* 67.47/67.92 66.64/67.19 147.98 45.64 66.32 66.19 58.31 38.95
EMT_DLFR* 68.33/68.67 67.07/67.51 123.34 46.47 67.93 67.22 43.80 43.72

The test noise type is Structural Drop

MOSI SIMSv2
Model Acc-2 F1 MAE Corr Acc-2 F1 MAE Corr
T2FN 62.36/63.87 58.16/59.92 128.62 30.62 66.56 65.88 42.28 41.95
TPFN 63.73/65.69 59.22/61.48 121.49 35.23 64.76 64.04 42.94 42.65
CTFN 65.47/67.61 61.49/63.92 117.04 38.15 63.48 61.86 43.58 41.52
MMIN 64.59/66.66 60.64/63.05 120.86 34.88 65.90 65.45 42.30 43.05
GCNET 61.77/64.12 53.34/56.11 131.15 28.11 64.78 62.80 44.28 40.93
----- ----------- ----------- ------ ----- ----- ----- ----- -----
T2FN* 63.76/64.22 63.17/63.74 124.52 36.86 66.15 62.11 44.81 38.77
TPFN* 65.50/66.70 64.37/65.72 119.03 39.22 67.54 65.56 42.81 42.15
CTFN* 64.18/64.93 63.64/64.53 123.85 39.34 66.73 66.04 42.01 42.86
MMIN* 65.43/67.05 63.49/65.31 119.99 36.80 68.19 66.09 42.93 42.64
GCNET* 63.76/64.76 62.59/63.75 129.27 34.26 67.44 63.80 45.91 40.53
TFRNet* 66.26/66.60 64.44/64.90 119.75 45.25 67.55 66.84 42.18 45.09
NIAT* 69.67/70.65 69.13/70.23 118.09 50.86 64.94 63.66 51.31 39.31
EMT_DLFR* 70.30/71.00 69.98/70.79 107.07 53.17 68.85 68.37 43.63 45.71
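For intuition about the two feature-level noise types reported above, the following sketch shows one common way to define them on a sequence of feature vectors: random drop zeroes each time step independently, while structural drop zeroes one contiguous block. These definitions are an assumption based on how the terms are typically used for missing-feature robustness; they are not taken from feat_noise_cons.py, and the function names are hypothetical.

```python
import random


def random_drop(sequence, rate, seed=0):
    """Zero each time step independently with probability `rate`."""
    rng = random.Random(seed)
    return [
        [0.0] * len(frame) if rng.random() < rate else list(frame)
        for frame in sequence
    ]


def structural_drop(sequence, rate, seed=0):
    """Zero one contiguous block covering about `rate` of the time steps."""
    rng = random.Random(seed)
    length = len(sequence)
    block = min(length, round(rate * length))
    start = rng.randint(0, length - block)  # randint bounds are inclusive
    return [
        [0.0] * len(frame) if start <= t < start + block else list(frame)
        for t, frame in enumerate(sequence)
    ]


if __name__ == "__main__":
    frames = [[1.0, 2.0] for _ in range(10)]  # toy sequence: 10 steps, 2 features
    print(random_drop(frames, rate=0.3))
    print(structural_drop(frames, rate=0.3))
```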

The test noise type is Audio BG Park

MOSI SIMSv2
Model Acc-2 F1 MAE Corr Acc-2 F1 MAE Corr
T2FN 64.84/65.23 64.76/65.24 121.89 38.19 62.07 60.21 48.03 26.85
TPFN 63.31/63.12 63.00/62.94 128.30 37.57 62.19 59.57 48.40 27.58
CTFN 65.68/66.39 65.39/66.19 118.84 41.94 62.04 60.61 46.92 28.05
MMIN 63.65/63.57 63.37/63.41 123.85 40.17 59.32 58.87 47.60 27.08
GCNET 64.96/65.11 65.00/65.25 125.63 38.11 59.76 58.92 49.22 22.59
----- ----------- ----------- ------ ----- ----- ----- ----- -----
T2FN* 64.25/64.75 64.01/64.63 123.60 37.54 62.16 59.29 48.84 24.90
TPFN* 63.46/63.17 63.04/62.90 126.75 39.16 62.74 60.05 47.76 27.37
CTFN* 65.26/66.01 64.97/65.85 121.28 41.73 62.94 61.23 47.00 29.23
MMIN* 66.32/67.62 65.58/66.99 118.78 40.68 61.72 61.55 49.29 26.29
GCNET* 64.65/65.07 64.35/64.89 126.13 37.40 61.34 60.27 47.19 26.72
TFRNet* 66.41/67.47 66.18/67.35 116.95 42.56 61.80 61.95 51.71 27.58
NIAT* 66.34/66.87 66.37/67.01 122.01 44.94 63.04 62.50 50.16 27.09
EMT_DLFR* 65.10/65.10 64.72/64.85 122.14 44.96 63.81 63.85 50.16 29.88

The test noise type is Audio Color W

MOSI SIMSv2
Model Acc-2 F1 MAE Corr Acc-2 F1 MAE Corr
T2FN 64.14/64.72 63.96/64.64 122.88 36.56 62.63 60.88 48.19 26.33
TPFN 62.44/62.17 61.96/61.82 129.65 35.89 61.95 59.61 48.45 26.64
CTFN 65.71/66.34 65.46/66.19 119.86 40.87 63.13 61.50 46.67 28.55
MMIN 63.10/62.94 62.69/62.65 125.53 38.27 60.31 59.90 47.27 27.14
GCNET 63.96/64.18 63.94/64.29 127.60 37.33 59.10 58.58 50.22 25.35
----- ----------- ----------- ------ ----- ----- ----- ----- -----
T2FN* 62.28/62.48 62.32/62.64 129.35 35.00 63.02 59.26 48.45 26.50
TPFN* 62.30/62.62 62.03/62.46 129.42 35.21 63.39 61.90 47.46 28.02
CTFN* 64.30/64.89 64.21/64.92 123.47 41.28 63.32 61.27 47.31 29.15
MMIN* 64.92/65.57 64.22/65.00 124.94 39.31 62.62 61.54 48.59 26.75
GCNET* 64.89/65.58 64.79/65.59 125.00 38.51 62.05 59.72 48.07 26.52
TFRNet* 64.29/64.98 64.11/64.93 122.92 40.94 63.07 61.81 48.96 29.80
NIAT* 65.61/66.29 65.16/65.98 120.96 43.15 63.29 62.64 50.15 26.05
EMT_DLFR* 65.95/66.48 65.91/66.55 115.92 43.14 63.36 63.34 48.84 29.32

The test noise type is Video Gblur

MOSI SIMSv2
Model Acc-2 F1 MAE Corr Acc-2 F1 MAE Corr
T2FN 77.92/79.14 77.90/79.19 91.45 67.28 77.61 77.58 33.91 64.83
TPFN 77.70/78.81 77.74/78.92 94.78 67.24 76.91 76.81 34.92 63.98
CTFN 78.62/80.02 78.58/80.05 88.87 70.15 76.87 76.88 34.50 63.17
MMIN 78.57/79.42 78.59/79.49 91.51 69.90 76.47 76.53 34.39 63.76
GCNET 77.65/78.86 77.61/78.90 95.47 66.40 76.44 76.10 36.06 61.60
----- ----------- ----------- ------ ----- ----- ----- ----- -----
T2FN* 77.80/78.96 77.78/79.02 92.31 67.31 76.34 76.35 34.93 61.90
TPFN* 78.42/79.83 78.41/79.88 92.06 69.38 76.34 76.35 34.13 63.45
CTFN* 77.47/78.43 77.51/78.54 94.22 70.48 77.14 77.09 34.00 64.36
MMIN* 78.98/80.56 78.84/80.49 90.96 69.40 76.31 76.29 34.65 63.85
GCNET* 75.54/76.46 75.59/76.59 99.14 64.62 76.26 76.17 36.75 61.20
TFRNet* 80.43/81.88 80.42/81.92 82.74 75.64 76.06 76.03 37.16 62.52
NIAT* 81.80/83.66 81.74/83.67 78.25 78.51 76.51 76.54 36.89 59.87
EMT_DLFR* 82.44/84.16 82.39/84.17 72.30 78.88 77.41 77.45 33.71 64.92

The test noise type is Video Impulse

MOSI SIMSv2
Model Acc-2 F1 MAE Corr Acc-2 F1 MAE Corr
T2FN 77.95/79.43 77.86/79.40 93.27 68.07 77.33 77.14 36.27 63.20
TPFN 78.45/79.85 78.44/79.90 91.78 69.38 76.77 76.63 35.40 63.92
CTFN 78.07/79.44 78.09/79.51 93.37 69.53 76.98 76.98 34.52 63.14
MMIN 78.93/80.58 78.81/80.53 89.67 70.53 76.34 76.42 34.44 63.63
GCNET 77.18/78.33 77.03/78.26 96.12 65.24 76.35 76.30 35.55 61.92
----- ----------- ----------- ------ ----- ----- ----- ----- -----
T2FN* 77.81/79.32 77.73/79.32 92.09 67.83 76.23 76.24 35.30 33.87
TPFN* 77.60/78.98 77.49/78.95 94.12 67.78 77.30 77.22 34.85 32.44
CTFN* 77.83/79.21 77.85/79.28 93.25 69.32 76.82 76.79 34.20 33.98
MMIN* 79.40/81.05 79.27/80.99 90.80 69.39 76.84 76.78 34.71 33.57
GCNET* 77.26/78.48 77.07/78.38 98.77 66.19 75.47 75.23 38.17 24.39
TFRNet* 81.02/82.20 81.02/82.30 83.38 75.68 76.36 76.31 35.89 27.09
NIAT* 81.72/84.27 81.49/84.15 79.99 78.17 76.11 76.02 39.61 13.62
EMT_DLFR* 82.87/84.79 82.74/84.73 71.48 79.23 76.69 76.80 35.10 28.06
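The result rows above share a fixed nine-column layout, so they can be parsed for further comparison. The sketch below assumes that columns 2 to 5 belong to MOSI and columns 6 to 9 to SIMSv2, following the table headers. The meaning of the two slash-separated values in the MOSI Acc-2 and F1 columns is not stated in this section, so they are kept as an uninterpreted pair; `parse_row` is a hypothetical helper.

```python
def parse_row(line):
    """Parse one whitespace-separated result row into a nested dict."""
    fields = line.split()
    if len(fields) != 9:
        raise ValueError(f"expected 9 columns, got {len(fields)}")

    def pair(text):
        # "62.23/64.07" -> (62.23, 64.07); the two values are kept as given.
        return tuple(float(x) for x in text.split("/"))

    name = fields[0]
    return {
        "model": name.rstrip("*"),
        "augmented": name.endswith("*"),  # * marks runs with data augmentation
        "mosi": {
            "acc2": pair(fields[1]),
            "f1": pair(fields[2]),
            "mae": float(fields[3]),
            "corr": float(fields[4]),
        },
        "simsv2": {
            "acc2": float(fields[5]),
            "f1": float(fields[6]),
            "mae": float(fields[7]),
            "corr": float(fields[8]),
        },
    }


if __name__ == "__main__":
    row = "T2FN 62.23/64.07 57.08/59.19 130.64 30.68 64.97 63.70 43.43 38.76"
    print(parse_row(row))
```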