georgian-io/Multimodal-Toolkit

Inference Error with Text Features Only

e-hossam96 opened this issue · 7 comments

I have an issue when inferring with text features only.

I loaded my test data as follows:

test_dataset = load_data(
    data_df=test,
    tokenizer=tokenizer,
    text_cols=['text'],
    label_col='labels',
    label_list=data_args.column_info['label_list'],
    categorical_cols=None,
    numerical_cols=None,
    sep_text_token_str=tokenizer.sep_token,
)

During training I used combine_feat_method='gating_on_cat_and_num_feats_then_sum', and I use the Hugging Face Trainer as suggested in the demo notebook.

During inference I get the following error in tabular_combiner.py:

File /usr/local/lib/python3.9/dist-packages/multimodal_transformers/model/tabular_combiner.py:441, in TabularFeatCombiner.forward(self, text_feats, cat_feats, numerical_feats)
    438     g_mult_num = 0
    440 H = g_mult_cat + g_mult_num + self.h_bias
--> 441 norm = torch.norm(text_feats, dim=1) / torch.norm(H, dim=1)
    442 alpha = torch.clamp(norm * self.tabular_config.gating_beta, min=0, max=1)
    443 combined_feats = text_feats + alpha[:, None] * H

File /usr/local/lib/python3.9/dist-packages/torch/functional.py:1501, in norm(input, p, dim, keepdim, out, dtype)
   1499 if p == "fro" and (dim is None or isinstance(dim, int) or len(dim) <= 2):
   1500     if out is None:
-> 1501         return torch.linalg.vector_norm(input, 2, _dim, keepdim, dtype=dtype)
   1502     else:
   1503         return torch.linalg.vector_norm(input, 2, _dim, keepdim, dtype=dtype, out=out)

IndexError: Dimension out of range (expected to be in range of [-1, 0], but got 1)

Appreciate your help in advance.
Thanks
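
To illustrate what I think is going on (this is my reading of the traceback, not confirmed against the library's internals): when both categorical and numerical features are absent, g_mult_cat and g_mult_num are set to the scalar 0, so H collapses to the 1-D bias parameter instead of a (batch, dim) tensor, and torch.norm(H, dim=1) has no dimension 1 to reduce over. A minimal sketch, with h_bias as a stand-in for self.h_bias:

```python
import torch

hidden_dim = 8
h_bias = torch.zeros(hidden_dim)  # stand-in for self.h_bias, shape (8,)

# With no cat/num features, both gated terms are the scalar 0,
# so H is just the 1-D bias instead of a (batch, hidden_dim) tensor.
g_mult_cat = 0
g_mult_num = 0
H = g_mult_cat + g_mult_num + h_bias  # shape (8,), not (batch, 8)

try:
    torch.norm(H, dim=1)  # same call as tabular_combiner.py line 441
except IndexError as e:
    print("IndexError:", e)
```

This matches the "expected to be in range of [-1, 0]" message: a 1-D tensor only has dimension 0 (or -1).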

Hi, same here. It happens for all combine strategies, not only the text-only case.

test_predictions = trainer.predict(val_dataset)

is giving me the following error: AxisError: axis 1 is out of bounds for array of dimension 1.

Did you find a solution?

Thank you & Regards

After trying all combination mechanisms, only attention_on_cat_and_numerical_feats and weighted_feature_sum_on_transformer_cat_and_numerical_feats worked for me when masking the numerical and categorical features at test time.
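
For reference, by masking I mean keeping the same tabular columns the model was trained with but filling them with neutral values, instead of passing categorical_cols=None / numerical_cols=None to load_data. That way the combiner still receives tensors of the expected (batch, dim) shape. A minimal sketch, with hypothetical column names ('num_feat', 'cat_feat' are illustrative, not part of the toolkit's API):

```python
import pandas as pd

# Hypothetical test frame with the same columns as the training data.
test = pd.DataFrame({
    'text': ['a sample sentence'],
    'labels': [0],
    'num_feat': [3.7],
    'cat_feat': ['B'],
})

# Neutralize the tabular columns rather than dropping them, then pass
# numerical_cols=['num_feat'], categorical_cols=['cat_feat'] to load_data
# as during training.
test['num_feat'] = 0.0
test['cat_feat'] = 'unknown'  # placeholder; pick a category your encoder saw during training
```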

I'm still investigating why the other mechanisms fail. I found that sticking to Python < 3.10 produces a different error from the one above. If you're still working on this, I recommend using the combination methods above and pinning PyTorch 1.13 instead of 2.0 to avoid errors.

Good luck