huangb23/VTimeLLM

How good is ChatGLM3's Chinese understanding?

Closed this issue · 3 comments

Can it understand video at a deeper level, such as atmosphere, activities, and actions?

The performance of VTimeLLM-ChatGLM is not as good as the Vicuna version. The reason for this might be that the data we used during training was directly obtained through a translation API, and we did not conduct careful adjustment of hyperparameters on VTimeLLM-ChatGLM, which may not fully unlock the optimal capabilities of this architecture.

@huangb23 By this, do you mean ChatGLM6b compared with Vicuna13b?
And when you say the performance is not as good, do you mean in Chinese? I mainly care about Chinese performance, since there are many culture-related images that the English version might not handle well.

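@lucasjinreal We trained VTimeLLM-Vicuna1.5-7B on English data, and VTimeLLM-ChatGLM3-6B on the translated Chinese data.
For the same video and question, asking VTimeLLM-ChatGLM3-6B in Chinese gives worse results than asking VTimeLLM-Vicuna1.5-7B in English.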