mshumer/gpt-prompt-engineer

Shall we add a function to calculate Elo scores for its generations and for prompts we write by hand?

catundchat opened this issue · 1 comment

I'd like to compare prompts generated by gpt-prompt-engineer against my handwritten prompts, or simply test the Elo scores of prompts I have written. But when I try to modify the function test_candidate_prompts, something goes wrong. Has anyone run into the same problem?

I tried simply adding 5 prompts directly in the function test_candidate_prompts: 2 of them are handwritten and 3 of them are empty placeholders. The result is a little weird. The format is like:

prompts = [
"As an empathetic and understanding AI assistant, you specialize in parenting-related issues. You discreetly master the theories and techniques required by parenting experts",
"As a manistic, patient, and encouraging. You will consider the user's emotions and feelings, analyze and empathize with their answers, and express them in a caring and supportive manner. ",
"prompt3",
"prompt4",
"prompt5"
]

The weird thing is that prompt 1 ranks first, prompt 2 ranks third, and prompt 3, which is literally just "prompt3", takes second place.
(screenshot of the resulting Elo rankings)
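For reference, a minimal sketch of how an Elo comparison over a fixed prompt list could work. This is an assumption-laden illustration, not the repo's actual test_candidate_prompts: the `judge` callback (which would normally ask a model such as GPT-4 to pick the better generation), the K-factor, and the 1200 starting rating are all hypothetical choices.

```python
import itertools

K = 32  # hypothetical K-factor; the repo may use a different value

def expected_score(r_a, r_b):
    """Probability that A beats B under the Elo model."""
    return 1 / (1 + 10 ** ((r_b - r_a) / 400))

def update_elo(r_a, r_b, score_a, k=K):
    """Return updated ratings after one match; score_a is 1, 0.5, or 0."""
    e_a = expected_score(r_a, r_b)
    return r_a + k * (score_a - e_a), r_b + k * ((1 - score_a) - (1 - e_a))

def rank_prompts(prompts, judge, rounds=10):
    """Round-robin Elo tournament over `prompts`.

    `judge(a, b)` is a hypothetical callback returning 1 if prompt a's
    generation wins, 0 if b's wins, 0.5 for a draw (e.g. by asking an
    LLM to compare outputs on the same test case).
    """
    ratings = {p: 1200.0 for p in prompts}
    for _ in range(rounds):
        for a, b in itertools.combinations(prompts, 2):
            score = judge(a, b)
            ratings[a], ratings[b] = update_elo(ratings[a], ratings[b], score)
    return sorted(ratings.items(), key=lambda kv: kv[1], reverse=True)
```

One caveat this sketch makes visible: if the judge is noisy or biased (e.g. it sometimes prefers a degenerate prompt like "prompt3"), a handful of lucky wins early on can leave a weak prompt with an inflated rating, which may explain the strange ranking above.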