DeepSeek-Coder-V2 inference warnings

Question

Qlalq opened this issue 4 months ago · 0 comments

Loading checkpoint shards: 100%|█████████████████████████████████████████████████████████████████████████████████████████████████████| 4/4 [00:10<00:00,  2.52s/it]
The attention mask and the pad token id were not set. As a consequence, you may observe unexpected behavior. Please pass your input's `attention_mask` to obtain reliable results.
Setting `pad_token_id` to `eos_token_id`:100001 for open-end generation.
The attention mask is not set and cannot be inferred from input because pad token is same as eos token. As a consequence, you may observe unexpected behavior. Please pass your input's `attention_mask` to obtain reliable results.
The `seen_tokens` attribute is deprecated and will be removed in v4.41. Use the `cache_position` model input instead.

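During inference the terminal prints the warnings above, and the model fails a test taken from the training set: it was trained on input A with output B, but at inference input A produces output C. The training loss is shown in the figure. Have you run into and solved a similar problem before?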
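
For reference, the first three warnings ask for an explicit `attention_mask` and `pad_token_id` in the `generate` call. Below is a minimal sketch of how that could look, assuming `model` and `tokenizer` are a Hugging Face causal LM and its tokenizer loaded elsewhere. The helper name `generate_with_mask` is hypothetical, and greedy decoding (`do_sample=False`) is my own choice so that the "input A should give output B" check is deterministic.

```python
def generate_with_mask(model, tokenizer, prompt, max_new_tokens=64):
    """Generate a completion while passing attention_mask and pad_token_id explicitly.

    Hypothetical helper: `model` and `tokenizer` are assumed to be a Hugging Face
    causal LM and its tokenizer, already loaded by the caller.
    """
    # Calling the tokenizer (rather than tokenizer.encode) returns both
    # input_ids and attention_mask.
    inputs = tokenizer(prompt, return_tensors="pt")

    # Fall back to the EOS id when the tokenizer defines no pad token,
    # which is what generate() does implicitly when it prints the warning.
    pad_id = tokenizer.pad_token_id
    if pad_id is None:
        pad_id = tokenizer.eos_token_id

    output_ids = model.generate(
        input_ids=inputs["input_ids"].to(model.device),
        attention_mask=inputs["attention_mask"].to(model.device),
        pad_token_id=pad_id,
        max_new_tokens=max_new_tokens,
        do_sample=False,  # assumption: greedy decoding for a reproducible check
    )

    # Drop the prompt tokens so only the newly generated part is decoded.
    prompt_len = inputs["input_ids"].shape[1]
    return tokenizer.decode(output_ids[0][prompt_len:], skip_special_tokens=True)
```

Note that for a single prompt without padding the attention mask is all ones anyway, so passing it explicitly may silence the warnings without changing the output. The sketch does not address the `seen_tokens` deprecation notice.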