BUAADreamer/Chinese-LLaVA-Med

Using Qwen 14B to judge response quality: is it up to the task?

wwewwt opened this issue · 1 comments

We would like to request your feedback on the performance of two AI assistants in response to the user question displayed above. The user asks the question on observing an image. For your reference, the visual content in the image is represented with caption describing the same image.
Please rate the helpfulness, relevance, accuracy, level of details of their responses. Each assistant receives an overall score on a scale of 1 to 10, where a higher score indicates better overall performance.
Please first output a single line containing only two values indicating the scores for Assistant 1 and 2, respectively. The two scores are separated by a space. In the subsequent line, please provide a comprehensive explanation of your evaluation, avoiding any potential bias and ensuring that the order in which the responses were presented does not affect your judgment.

This prompt is borrowed from LLaVA-Med, where it is used to prompt GPT-4. I use it here only as an example, since I don't have the budget or tokens to call GPT-4.
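
Since the prompt asks the judge to put exactly two space-separated scores on its first line, one quick way to see whether a smaller model such as Qwen 14B follows the format is to parse that line and count how often parsing fails. Below is a minimal sketch in Python; the function name `parse_scores` and the choice to treat out-of-range or malformed lines as failures are my own assumptions, not part of LLaVA-Med or this repo.

```python
from typing import Optional, Tuple


def parse_scores(review: str) -> Optional[Tuple[float, float]]:
    """Extract the two scores from the first line of a judge reply.

    Hypothetical helper: returns (score_1, score_2), or None when the
    first line does not contain exactly two numbers in the range 1..10.
    """
    lines = review.strip().splitlines()
    if not lines:
        return None
    # Tolerate "8, 6" as well as "8 6"; anything else counts as malformed.
    parts = lines[0].replace(",", " ").split()
    if len(parts) != 2:
        return None
    try:
        s1, s2 = float(parts[0]), float(parts[1])
    except ValueError:
        return None
    # The prompt specifies a 1-to-10 scale, so reject values outside it.
    if not (1 <= s1 <= 10 and 1 <= s2 <= 10):
        return None
    return s1, s2


if __name__ == "__main__":
    print(parse_scores("8 6\nAssistant 1 gave more detail."))
    print(parse_scores("Assistant 1 is better."))
```

A reply that starts with prose instead of scores returns `None`, so the share of `None` results over a batch of replies gives a rough idea of how reliably the judge model keeps to the requested format.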