KhoomeiK/LlamaGym

Why are all the past messages for any given question -> answer pair included?

Opened this issue · 0 comments

LlamaGym/llamagym/agent.py

Lines 95 to 107 in 92d7827

for i in range(2, len(messages), 2):
    prompt = self.tokenizer.apply_chat_template(
        messages[: i + 1], tokenize=False, add_generation_prompt=False
    )
    conversation_chunks = prompt.split("[/INST] ")
    query = "[/INST] ".join(conversation_chunks[:-1]) + "[/INST] "
    response = conversation_chunks[-1]
    query = self.tokenizer(query, return_tensors="pt").input_ids[0]
    response = self.tokenizer(response, return_tensors="pt").input_ids[0]
    queries.append(query)
    responses.append(response)

This code iterates over every assistant message and forms each query by prepending the entire preceding conversation to it. I read up on the literature and couldn't find anything stating that training can be done this way.

I'd assume that one would only use the current question and answer pair.
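For concreteness, here is a minimal sketch of the alternative I'm suggesting: tokenize only the latest (user, assistant) turn for each pair, instead of the full history. This is a standalone function rather than the agent method, and it assumes a Llama-2-style chat template where `"[/INST] "` separates the prompt from the assistant's reply, as in the quoted snippet; the function name is hypothetical.

```python
def single_pair_examples(messages, tokenizer):
    """Build (query, response) token tensors from only the current
    user/assistant pair, dropping all earlier conversation turns.

    Assumes `messages` alternates roles so assistant messages sit at
    even indices (system at 0, user at 1, assistant at 2, ...), matching
    the `range(2, len(messages), 2)` loop in the quoted code.
    """
    queries, responses = [], []
    for i in range(2, len(messages), 2):
        # Only the current user/assistant pair, no earlier history.
        pair = messages[i - 1 : i + 1]
        prompt = tokenizer.apply_chat_template(
            pair, tokenize=False, add_generation_prompt=False
        )
        conversation_chunks = prompt.split("[/INST] ")
        query = "[/INST] ".join(conversation_chunks[:-1]) + "[/INST] "
        response = conversation_chunks[-1]
        queries.append(tokenizer(query, return_tensors="pt").input_ids[0])
        responses.append(tokenizer(response, return_tensors="pt").input_ids[0])
    return queries, responses
```

Each query would then contain a single `[INST] ... [/INST]` block, so the PPO queries stay short and don't grow with episode length.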