maitrix-org/llm-reasoners

--> 245 assert torch.all(prompt_tokens[: len(prefix_tokens)] == prefix_tokens), (prompt_tokens, prefix_tokens)


I am getting this assertion error when using openchat-3.6.

Hi, could you provide the complete command and error information?

world_model = BlocksWorldModelToT(base_model=model, prompt=prompt)
config = BWConfigToT(base_model=model, prompt=prompt)
algorithm = BeamSearch(beam_size=4, max_depth=7)
reasoner_tot = Reasoner(world_model=world_model, search_config=config, search_algo=algorithm)
result_tot = reasoner_tot(example)
print(result_tot)

I am playing with a set of blocks where I need to arrange the blocks into stacks. Here are the actions I can do

Pick up a block
Unstack a block from on top of another block
Put down a block
Stack a block on top of another block

I have the following restrictions on my actions:
I can only pick up or unstack one block at a time.
I can only pick up or unstack a block if my hand is empty.
I can only pick up a block if the block is on the table and the block is clear. A block is clear if the block has no other blocks on top of it and if the block is not picked up.
I can only unstack a block from on top of another block if the block I am unstacking was really on top of the other block.
I can only unstack a block from on top of another block if the block I am unstacking is clear.
Once I pick up or unstack a block, I am holding the block.
I can only put down a block that I am holding.
I can only stack a block on top of another block if I am holding the block being stacked.
I can only stack a block on top of another block if the block onto which I am stacking the block is clear.
Once I put down or stack a block, my hand becomes empty.

[STATEMENT]
As initial conditions I have that, the red block is clear, the orange block is clear, the hand is empty, the orange block is on top of the blue block, the red block is on the table and the blue block is on the table.
My goal is to have that the blue block is on top of the orange block.

My plan is as follows:

[PLAN]
unstack the orange block from on top of the blue block
put down the orange block
pick up the blue block
stack the blue block on top of the orange block
[PLAN END]

[STATEMENT]
As initial conditions I have that, the blue block is clear, the orange block is clear, the hand is empty, the red block is on top of the yellow block, the orange block is on top of the red block, the blue block is on the table and the yellow block is on the table.
My goal is to have that the blue block is on top of the yellow block and the orange block is on top of the blue block.

My plan is as follows:

[PLAN]
unstack the orange block from on top of the red block
put down the orange block
unstack the red block from on top of the yellow block
put down the red block
pick up the blue block
stack the blue block on top of the yellow block
pick up the orange block
stack the orange block on top of the blue block
[PLAN END]

[STATEMENT]
As initial conditions I have that, the red block is clear, the yellow block is clear, the hand is empty, the red block is on top of the blue block, the blue block is on top of the orange block, the orange block is on the table and the yellow block is on the table.
My goal is to have that the blue block is on top of the orange block and the yellow block is on top of the red block.

My plan is as follows:

[PLAN]
pick up the yellow block
stack the yellow block on top of the red block
[PLAN END]

[STATEMENT]
As initial conditions I have that, the blue block is clear, the yellow block is clear, the hand is empty, the red block is on top of the orange block, the blue block is on top of the red block, the orange block is on the table and the yellow block is on the table.
My goal is to have that the blue block is on top of the red block and the yellow block is on top of the blue block.

My plan is as follows:

[PLAN]
pick up the yellow block
stack the yellow block on top of the blue block
[PLAN END]

[STATEMENT]
As initial conditions I have that, the blue block is clear, the orange block is clear, the hand is empty, the orange block is on top of the red block, the red block is on the table and the blue block is on the table.
My goal is to have that the red block is on top of the blue block.

My plan is as follows:

[PLAN]
['I am playing with a set of blocks where I need to arrange the blocks into stacks. Here are the actions I can do\n\nPick up a block\nUnstack a block from on top of another block\nPut down a block\nStack a block on top of another block\n\nI have the following restrictions on my actions:\nI can only pick up or unstack one block at a time.\nI can only pick up or unstack a block if my hand is empty.\nI can only pick up a block if the block is on the table and the block is clear. A block is clear if the block has no other blocks on top of it and if the block is not picked up.\nI can only unstack a block from on top of another block if the block I am unstacking was really on top of the other block.\nI can only unstack a block from on top of another block if the block I am unstacking is clear.\nOnce I pick up or unstack a block, I am holding the block.\nI can only put down a block that I am holding.\nI can only stack a block on top of another block if I am holding the block being stacked.\nI can only stack a block on top of another block if the block onto which I am stacking the block is clear.\nOnce I put down or stack a block, my hand becomes empty.\n\n[STATEMENT]\nAs initial conditions I have that, the red block is clear, the orange block is clear, the hand is empty, the orange block is on top of the blue block, the red block is on the table and the blue block is on the table.\nMy goal is to have that the blue block is on top of the orange block.\n\nMy plan is as follows:\n\n[PLAN]\nunstack the orange block from on top of the blue block\nput down the orange block\npick up the blue block\nstack the blue block on top of the orange block\n[PLAN END]\n\n[STATEMENT]\nAs initial conditions I have that, the blue block is clear, the orange block is clear, the hand is empty, the red block is on top of the yellow block, the orange block is on top of the red block, the blue block is on the table and the yellow block is on the table.\nMy goal is to have that the blue block is on 
top of the yellow block and the orange block is on top of the blue block.\n\nMy plan is as follows:\n\n[PLAN]\nunstack the orange block from on top of the red block\nput down the orange block\nunstack the red block from on top of the yellow block\nput down the red block\npick up the blue block\nstack the blue block on top of the yellow block\npick up the orange block\nstack the orange block on top of the blue block\n[PLAN END]\n\n[STATEMENT]\nAs initial conditions I have that, the red block is clear, the yellow block is clear, the hand is empty, the red block is on top of the blue block, the blue block is on top of the orange block, the orange block is on the table and the yellow block is on the table.\nMy goal is to have that the blue block is on top of the orange block and the yellow block is on top of the red block.\n\nMy plan is as follows:\n\n[PLAN]\npick up the yellow block\nstack the yellow block on top of the red block\n[PLAN END]\n\n[STATEMENT]\nAs initial conditions I have that, the blue block is clear, the yellow block is clear, the hand is empty, the red block is on top of the orange block, the blue block is on top of the red block, the orange block is on the table and the yellow block is on the table.\nMy goal is to have that the blue block is on top of the red block and the yellow block is on top of the blue block.\n\nMy plan is as follows:\n\n[PLAN]\npick up the yellow block\nstack the yellow block on top of the blue block\n[PLAN END]\n\n[STATEMENT]\nAs initial conditions I have that, the blue block is clear, the orange block is clear, the hand is empty, the orange block is on top of the red block, the red block is on the table and the blue block is on the table.\nMy goal is to have that the red block is on top of the blue block.\n\nMy plan is as follows:\n\n[PLAN]\npick up the red block']

AssertionError Traceback (most recent call last)
Cell In[12], line 5
3 algorithm = BeamSearch(beam_size=4, max_depth=7)
4 reasoner_tot = Reasoner(world_model=world_model, search_config=config, search_algo=algorithm)
----> 5 result_tot = reasoner_tot(example)
6 print(result_tot)

File /kcr/llm-reasoners/reasoners/base.py:183, in Reasoner.__call__(self, example, prompt, **kwargs)
181 self.world_model.update_example(example, prompt=prompt)
182 self.search_config.update_example(example, prompt=prompt)
--> 183 return self.search_algo(self.world_model, self.search_config, **kwargs)

File /kcr/llm-reasoners/reasoners/algorithm/beam_search.py:250, in BeamSearch.__call__(self, world, config)
244 raise ValueError(f"If unbiased stochastic sampling is used,
245 please make sure the reward function returns
246 a dictionary with keys 'acc_action_prob', which
247 is the accumulated action probability, and
248 'cur_action_prob', which is the current action probability.")
249 else:
--> 250 fast_reward, fast_reward_aux = config.fast_reward(state, action)
251 reward = config.reward(state, action, **aux, **fast_reward_aux)
253 # if the reward is a tuple, then it is (reward, aux)

Cell In[11], line 87, in BWConfigToT.fast_reward(self, state, action)
79 def fast_reward(self, state: BWStateToT, action: BWAction) -> tuple[float, dict]:
80 # We use two rewards here:
81 # 1. Intuition: The loglikelihood of the action given the prompt.
82 # 2. Self-eval: Ask the language model whether this step is "Good".
83 inputs = self.prompt["icl"].replace("<action>", "\n".join(state.action_history + [""])) \
84 .replace("<init_state>", utils.extract_init_state(self.example)) \
85 .replace("<goals>", utils.extract_goals(self.example, return_raw=True))[:-1]
---> 87 intuition = self.base_model.get_loglikelihood(inputs, [inputs + "\n" + action])[0]
89 self_eval_prompt = (self.prompt["self-eval"].replace("<init_state>", utils.extract_init_state(self.example))
90 .replace("<goals>", utils.extract_goals(self.example, return_raw=True))
91 .replace("<action>", action))
92 self_eval = self.base_model.get_loglikelihood(self_eval_prompt, [self_eval_prompt + "good"])[0]

File ~/anaconda3/envs/reasoners/lib/python3.10/site-packages/torch/utils/_contextlib.py:115, in context_decorator.<locals>.decorate_context(*args, **kwargs)
112 @functools.wraps(func)
113 def decorate_context(*args, **kwargs):
114 with ctx_factory():
--> 115 return func(*args, **kwargs)

File /kcr/llm-reasoners/reasoners/lm/hf_model.py:246, in HFModel.get_loglikelihood(self, prefix, contents, **kwargs)
244 print(contents)
245 for prompt_tokens in prompts_tokens.input_ids:
--> 246 assert torch.all(prompt_tokens[: len(prefix_tokens)] == prefix_tokens), (prompt_tokens, prefix_tokens)
248 tokens = prompts_tokens
249 logits = self.model(**tokens, return_dict=True).logits

AssertionError: (tensor([ 40, 1097, 5737, 449, 264, 743, 315, 10215, 1405, 358,
1205, 311, 31993, 279, 10215, 1139, 41050, 13, 5810, 527,
279, 6299, 358, 649, 656, 271, 38053, 709, 264, 2565,
198, 1844, 7848, 264, 2565, 505, 389, 1948, 315, 2500,
2565, 198, 19648, 1523, 264, 2565, 198, 4434, 264, 2565,
389, 1948, 315, 2500, 2565, 271, 40, 617, 279, 2768,
17294, 389, 856, 6299, 512, 40, 649, 1193, 3820, 709,
477, 653, 7848, 832, 2565, 520, 264, 892, 627, 40,
649, 1193, 3820, 709, 477, 653, 7848, 264, 2565, 422,
856, 1450, 374, 4384, 627, 40, 649, 1193, 3820, 709,
264, 2565, 422, 279, 2565, 374, 389, 279, 2007, 323,
279, 2565, 374, 2867, 13, 362, 2565, 374, 2867, 422,
279, 2565, 706, 912, 1023, 10215, 389, 1948, 315, 433,
323, 422, 279, 2565, 374, 539, 13061, 709, 627, 40,
649, 1193, 653, 7848, 264, 2565, 505, 389, 1948, 315,
2500, 2565, 422, 279, 2565, 358, 1097, 653, 7848, 287,
574, 2216, 389, 1948, 315, 279, 1023, 2565, 627, 40,
649, 1193, 653, 7848, 264, 2565, 505, 389, 1948, 315,
2500, 2565, 422, 279, 2565, 358, 1097, 653, 7848, 287,
374, 2867, 627, 12805, 358, 3820, 709, 477, 653, 7848,
264, 2565, 11, 358, 1097, 10168, 279, 2565, 627, 40,
649, 1193, 2231, 1523, 264, 2565, 430, 358, 1097, 10168,
627, 40, 649, 1193, 5729, 264, 2565, 389, 1948, 315,
2500, 2565, 422, 358, 1097, 10168, 279, 2565, 1694, 42415,
627, 40, 649, 1193, 5729, 264, 2565, 389, 1948, 315,
2500, 2565, 422, 279, 2565, 8800, 902, 358, 1097, 75172,
279, 2565, 374, 2867, 627, 12805, 358, 2231, 1523, 477,
5729, 264, 2565, 11, 856, 1450, 9221, 4384, 382, 58,
25651, 5441, 933, 2170, 2926, 4787, 358, 617, 430, 11,
279, 2579, 2565, 374, 2867, 11, 279, 19087, 2565, 374,
2867, 11, 279, 1450, 374, 4384, 11, 279, 19087, 2565,
374, 389, 1948, 315, 279, 6437, 2565, 11, 279, 2579,
2565, 374, 389, 279, 2007, 323, 279, 6437, 2565, 374,
389, 279, 2007, 627, 5159, 5915, 374, 311, 617, 430,
279, 6437, 2565, 374, 389, 1948, 315, 279, 19087, 2565,
382, 5159, 3197, 374, 439, 11263, 1473, 58, 95179, 933,
359, 7848, 279, 19087, 2565, 505, 389, 1948, 315, 279,
6437, 2565, 198, 631, 1523, 279, 19087, 2565, 198, 30345,
709, 279, 6437, 2565, 198, 7848, 279, 6437, 2565, 389,
1948, 315, 279, 19087, 2565, 198, 58, 95179, 11424, 2595,
58, 25651, 5441, 933, 2170, 2926, 4787, 358, 617, 430,
11, 279, 6437, 2565, 374, 2867, 11, 279, 19087, 2565,
374, 2867, 11, 279, 1450, 374, 4384, 11, 279, 2579,
2565, 374, 389, 1948, 315, 279, 14071, 2565, 11, 279,
19087, 2565, 374, 389, 1948, 315, 279, 2579, 2565, 11,
279, 6437, 2565, 374, 389, 279, 2007, 323, 279, 14071,
2565, 374, 389, 279, 2007, 627, 5159, 5915, 374, 311,
617, 430, 279, 6437, 2565, 374, 389, 1948, 315, 279,
14071, 2565, 323, 279, 19087, 2565, 374, 389, 1948, 315,
279, 6437, 2565, 382, 5159, 3197, 374, 439, 11263, 1473,
58, 95179, 933, 359, 7848, 279, 19087, 2565, 505, 389,
1948, 315, 279, 2579, 2565, 198, 631, 1523, 279, 19087,
2565, 198, 359, 7848, 279, 2579, 2565, 505, 389, 1948,
315, 279, 14071, 2565, 198, 631, 1523, 279, 2579, 2565,
198, 30345, 709, 279, 6437, 2565, 198, 7848, 279, 6437,
2565, 389, 1948, 315, 279, 14071, 2565, 198, 30345, 709,
279, 19087, 2565, 198, 7848, 279, 19087, 2565, 389, 1948,
315, 279, 6437, 2565, 198, 58, 95179, 11424, 2595, 58,
25651, 5441, 933, 2170, 2926, 4787, 358, 617, 430, 11,
279, 2579, 2565, 374, 2867, 11, 279, 14071, 2565, 374,
2867, 11, 279, 1450, 374, 4384, 11, 279, 2579, 2565,
374, 389, 1948, 315, 279, 6437, 2565, 11, 279, 6437,
2565, 374, 389, 1948, 315, 279, 19087, 2565, 11, 279,
19087, 2565, 374, 389, 279, 2007, 323, 279, 14071, 2565,
374, 389, 279, 2007, 627, 5159, 5915, 374, 311, 617,
430, 279, 6437, 2565, 374, 389, 1948, 315, 279, 19087,
2565, 323, 279, 14071, 2565, 374, 389, 1948, 315, 279,
2579, 2565, 382, 5159, 3197, 374, 439, 11263, 1473, 58,
95179, 933, 30345, 709, 279, 14071, 2565, 198, 7848, 279,
14071, 2565, 389, 1948, 315, 279, 2579, 2565, 198, 58,
95179, 11424, 2595, 58, 25651, 5441, 933, 2170, 2926, 4787,
358, 617, 430, 11, 279, 6437, 2565, 374, 2867, 11,
279, 14071, 2565, 374, 2867, 11, 279, 1450, 374, 4384,
11, 279, 2579, 2565, 374, 389, 1948, 315, 279, 19087,
2565, 11, 279, 6437, 2565, 374, 389, 1948, 315, 279,
2579, 2565, 11, 279, 19087, 2565, 374, 389, 279, 2007,
323, 279, 14071, 2565, 374, 389, 279, 2007, 627, 5159,
5915, 374, 311, 617, 430, 279, 6437, 2565, 374, 389,
1948, 315, 279, 2579, 2565, 323, 279, 14071, 2565, 374,
389, 1948, 315, 279, 6437, 2565, 382, 5159, 3197, 374,
439, 11263, 1473, 58, 95179, 933, 30345, 709, 279, 14071,
2565, 198, 7848, 279, 14071, 2565, 389, 1948, 315, 279,
6437, 2565, 198, 58, 95179, 11424, 2595, 58, 25651, 5441,
933, 2170, 2926, 4787, 358, 617, 430, 11, 279, 6437,
2565, 374, 2867, 11, 279, 19087, 2565, 374, 2867, 11,
279, 1450, 374, 4384, 11, 279, 19087, 2565, 374, 389,
1948, 315, 279, 2579, 2565, 11, 279, 2579, 2565, 374,
389, 279, 2007, 323, 279, 6437, 2565, 374, 389, 279,
2007, 627, 5159, 5915, 374, 311, 617, 430, 279, 2579,
2565, 374, 389, 1948, 315, 279, 6437, 2565, 382, 5159,
3197, 374, 439, 11263, 1473, 58, 95179, 933, 30345, 709,
279, 2579, 2565], device='cuda:0'), tensor([ 40, 1097, 5737, 449, 264, 743, 315, 10215, 1405, 358,
1205, 311, 31993, 279, 10215, 1139, 41050, 13, 5810, 527,
279, 6299, 358, 649, 656, 271, 38053, 709, 264, 2565,
198, 1844, 7848, 264, 2565, 505, 389, 1948, 315, 2500,
2565, 198, 19648, 1523, 264, 2565, 198, 4434, 264, 2565,
389, 1948, 315, 2500, 2565, 271, 40, 617, 279, 2768,
17294, 389, 856, 6299, 512, 40, 649, 1193, 3820, 709,
477, 653, 7848, 832, 2565, 520, 264, 892, 627, 40,
649, 1193, 3820, 709, 477, 653, 7848, 264, 2565, 422,
856, 1450, 374, 4384, 627, 40, 649, 1193, 3820, 709,
264, 2565, 422, 279, 2565, 374, 389, 279, 2007, 323,
279, 2565, 374, 2867, 13, 362, 2565, 374, 2867, 422,
279, 2565, 706, 912, 1023, 10215, 389, 1948, 315, 433,
323, 422, 279, 2565, 374, 539, 13061, 709, 627, 40,
649, 1193, 653, 7848, 264, 2565, 505, 389, 1948, 315,
2500, 2565, 422, 279, 2565, 358, 1097, 653, 7848, 287,
574, 2216, 389, 1948, 315, 279, 1023, 2565, 627, 40,
649, 1193, 653, 7848, 264, 2565, 505, 389, 1948, 315,
2500, 2565, 422, 279, 2565, 358, 1097, 653, 7848, 287,
374, 2867, 627, 12805, 358, 3820, 709, 477, 653, 7848,
264, 2565, 11, 358, 1097, 10168, 279, 2565, 627, 40,
649, 1193, 2231, 1523, 264, 2565, 430, 358, 1097, 10168,
627, 40, 649, 1193, 5729, 264, 2565, 389, 1948, 315,
2500, 2565, 422, 358, 1097, 10168, 279, 2565, 1694, 42415,
627, 40, 649, 1193, 5729, 264, 2565, 389, 1948, 315,
2500, 2565, 422, 279, 2565, 8800, 902, 358, 1097, 75172,
279, 2565, 374, 2867, 627, 12805, 358, 2231, 1523, 477,
5729, 264, 2565, 11, 856, 1450, 9221, 4384, 382, 58,
25651, 5441, 933, 2170, 2926, 4787, 358, 617, 430, 11,
279, 2579, 2565, 374, 2867, 11, 279, 19087, 2565, 374,
2867, 11, 279, 1450, 374, 4384, 11, 279, 19087, 2565,
374, 389, 1948, 315, 279, 6437, 2565, 11, 279, 2579,
2565, 374, 389, 279, 2007, 323, 279, 6437, 2565, 374,
389, 279, 2007, 627, 5159, 5915, 374, 311, 617, 430,
279, 6437, 2565, 374, 389, 1948, 315, 279, 19087, 2565,
382, 5159, 3197, 374, 439, 11263, 1473, 58, 95179, 933,
359, 7848, 279, 19087, 2565, 505, 389, 1948, 315, 279,
6437, 2565, 198, 631, 1523, 279, 19087, 2565, 198, 30345,
709, 279, 6437, 2565, 198, 7848, 279, 6437, 2565, 389,
1948, 315, 279, 19087, 2565, 198, 58, 95179, 11424, 2595,
58, 25651, 5441, 933, 2170, 2926, 4787, 358, 617, 430,
11, 279, 6437, 2565, 374, 2867, 11, 279, 19087, 2565,
374, 2867, 11, 279, 1450, 374, 4384, 11, 279, 2579,
2565, 374, 389, 1948, 315, 279, 14071, 2565, 11, 279,
19087, 2565, 374, 389, 1948, 315, 279, 2579, 2565, 11,
279, 6437, 2565, 374, 389, 279, 2007, 323, 279, 14071,
2565, 374, 389, 279, 2007, 627, 5159, 5915, 374, 311,
617, 430, 279, 6437, 2565, 374, 389, 1948, 315, 279,
14071, 2565, 323, 279, 19087, 2565, 374, 389, 1948, 315,
279, 6437, 2565, 382, 5159, 3197, 374, 439, 11263, 1473,
58, 95179, 933, 359, 7848, 279, 19087, 2565, 505, 389,
1948, 315, 279, 2579, 2565, 198, 631, 1523, 279, 19087,
2565, 198, 359, 7848, 279, 2579, 2565, 505, 389, 1948,
315, 279, 14071, 2565, 198, 631, 1523, 279, 2579, 2565,
198, 30345, 709, 279, 6437, 2565, 198, 7848, 279, 6437,
2565, 389, 1948, 315, 279, 14071, 2565, 198, 30345, 709,
279, 19087, 2565, 198, 7848, 279, 19087, 2565, 389, 1948,
315, 279, 6437, 2565, 198, 58, 95179, 11424, 2595, 58,
25651, 5441, 933, 2170, 2926, 4787, 358, 617, 430, 11,
279, 2579, 2565, 374, 2867, 11, 279, 14071, 2565, 374,
2867, 11, 279, 1450, 374, 4384, 11, 279, 2579, 2565,
374, 389, 1948, 315, 279, 6437, 2565, 11, 279, 6437,
2565, 374, 389, 1948, 315, 279, 19087, 2565, 11, 279,
19087, 2565, 374, 389, 279, 2007, 323, 279, 14071, 2565,
374, 389, 279, 2007, 627, 5159, 5915, 374, 311, 617,
430, 279, 6437, 2565, 374, 389, 1948, 315, 279, 19087,
2565, 323, 279, 14071, 2565, 374, 389, 1948, 315, 279,
2579, 2565, 382, 5159, 3197, 374, 439, 11263, 1473, 58,
95179, 933, 30345, 709, 279, 14071, 2565, 198, 7848, 279,
14071, 2565, 389, 1948, 315, 279, 2579, 2565, 198, 58,
95179, 11424, 2595, 58, 25651, 5441, 933, 2170, 2926, 4787,
358, 617, 430, 11, 279, 6437, 2565, 374, 2867, 11,
279, 14071, 2565, 374, 2867, 11, 279, 1450, 374, 4384,
11, 279, 2579, 2565, 374, 389, 1948, 315, 279, 19087,
2565, 11, 279, 6437, 2565, 374, 389, 1948, 315, 279,
2579, 2565, 11, 279, 19087, 2565, 374, 389, 279, 2007,
323, 279, 14071, 2565, 374, 389, 279, 2007, 627, 5159,
5915, 374, 311, 617, 430, 279, 6437, 2565, 374, 389,
1948, 315, 279, 2579, 2565, 323, 279, 14071, 2565, 374,
389, 1948, 315, 279, 6437, 2565, 382, 5159, 3197, 374,
439, 11263, 1473, 58, 95179, 933, 30345, 709, 279, 14071,
2565, 198, 7848, 279, 14071, 2565, 389, 1948, 315, 279,
6437, 2565, 198, 58, 95179, 11424, 2595, 58, 25651, 5441,
933, 2170, 2926, 4787, 358, 617, 430, 11, 279, 6437,
2565, 374, 2867, 11, 279, 19087, 2565, 374, 2867, 11,
279, 1450, 374, 4384, 11, 279, 19087, 2565, 374, 389,
1948, 315, 279, 2579, 2565, 11, 279, 2579, 2565, 374,
389, 279, 2007, 323, 279, 6437, 2565, 374, 389, 279,
2007, 627, 5159, 5915, 374, 311, 617, 430, 279, 2579,
2565, 374, 389, 1948, 315, 279, 6437, 2565, 382, 5159,
3197, 374, 439, 11263, 1473, 58, 95179, 60],
device='cuda:0'))
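The two tensors in the assertion message agree for most of their length, so it helps to find the exact index where they diverge. The following is a minimal sketch of that comparison using plain Python lists, not the library's own code; `first_mismatch` is a hypothetical helper name, and only the final tokens of each tensor from the message above are copied in.

```python
def first_mismatch(prompt_tokens, prefix_tokens):
    """Return the first index where prefix_tokens stops matching
    prompt_tokens, or None if prefix_tokens is a true prefix."""
    for i, (p, q) in enumerate(zip(prompt_tokens, prefix_tokens)):
        if p != q:
            return i
    # The prefix is longer than the prompt: it cannot be a prefix.
    if len(prefix_tokens) > len(prompt_tokens):
        return len(prompt_tokens)
    return None


# Last tokens of the two tensors printed in the AssertionError above.
prompt_tail = [58, 95179, 933, 30345, 709, 279, 2579, 2565]
prefix_tail = [58, 95179, 60]

idx = first_mismatch(prompt_tail, prefix_tail)
print(idx, prompt_tail[idx], prefix_tail[idx])  # -> 2 933 60
```

In this output the prefix ends with token 60 where the full prompt has token 933, so the mismatch sits at the boundary between the prefix and the appended action rather than somewhere inside the few-shot examples.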

The result above is with openchat-3.6. The prompt and example were built as follows:

from reasoners.benchmark import BWEvaluator
import json

# dataset loaded from here
with open('examples/CoT/blocksworld/prompts/pool_prompt_v1.json') as f:
    prompt = json.load(f)
evaluator = BWEvaluator(config_file='examples/CoT/blocksworld/data/bw_config.yaml',
                        domain_file='examples/CoT/blocksworld/data/generated_domain.pddl',
                        data_path='examples/CoT/blocksworld/data/split_v1/split_v1_step_4_data.json',
                        init_prompt=prompt)
prompt = evaluator.sample_prompt(shuffle_prompt=False, num_shot=4)
example = evaluator.full_dataset[1]
cot_inputs = (prompt['icl'].replace('<init_state>', example["init"])
              .replace('<goals>', example["goal"])
              .replace('<action>', ''))

Our prompts are designed for base models. Using a chat model may lead to unexpected behavior.
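One mechanism that can make a prefix check like the one in `get_loglikelihood` fail is a tokenizer that merges characters across the boundary between the prefix and the appended text. The sketch below illustrates this with a toy greedy tokenizer; the vocabulary and `toy_tokenize` are hypothetical and are not the openchat-3.6 tokenizer. Whether this is what happens with the real model is an assumption, since the issue does not show the token strings.

```python
# Hypothetical vocabulary in which "]\n" exists as a single merged piece.
VOCAB = ["]\n", "]", "\n", "[", "PLAN", "pick", " up"]


def toy_tokenize(text, vocab=VOCAB):
    """Greedy longest-match tokenizer over a tiny illustrative vocabulary."""
    pieces = sorted(vocab, key=len, reverse=True)
    tokens, i = [], 0
    while i < len(text):
        for piece in pieces:
            if text.startswith(piece, i):
                tokens.append(piece)
                i += len(piece)
                break
        else:
            # Unknown character: emit it as its own token.
            tokens.append(text[i])
            i += 1
    return tokens


def is_token_prefix(prefix, full):
    """Mirror the shape of the failing assertion: do the prefix's tokens
    match the first tokens of the full text?"""
    prefix_tokens = toy_tokenize(prefix)
    return toy_tokenize(full)[: len(prefix_tokens)] == prefix_tokens


action = "pick up"

# Prefix without the trailing newline, newline added when the action is appended.
stripped = "[PLAN]"
print(is_token_prefix(stripped, stripped + "\n" + action))  # -> False

# Prefix that keeps the newline, action appended directly.
kept = "[PLAN]\n"
print(is_token_prefix(kept, kept + action))  # -> True
```

In the toy case, keeping the newline inside the prefix makes the check pass because the merged piece then falls entirely within the prefix. Applying the same idea to `fast_reward` (not stripping the last character of `inputs`) is untested here and may interact with how the prompts were designed.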