agential-ai/agential

[Feature Request]: WebShop for CRITIC

Opened this issue · 0 comments

Feature Description

Familiarize yourself with the repository and take a look at the CRITIC repo, paper, and WebShop.

Currently, the CRITIC implementation only has prompts for HotpotQA and TriviaQA.

Add the relevant prompts and logic to the current CRITIC implementation. You'll see that an agent's structure is divided into cog/agent, cog/modules, and cog/functional. This task will mainly require modifications in cog/prompts, but you'll also need to test your code in the other relevant modules, cog/functional and cog/agent. (CRITIC does not use cog/modules.)
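As a rough illustration, a new WebShop prompt in cog/prompts could follow the same few-shot template pattern as the existing HotpotQA/TriviaQA prompts. The constant name and placeholder fields below are hypothetical, not taken from the repo — check the existing prompt files for the actual conventions:

```python
# Hypothetical WebShop prompt template for CRITIC.
# The constant name and the {examples}/{question}/{scratchpad} placeholders
# are illustrative; mirror whatever the existing HotpotQA/TriviaQA prompts use.
WEBSHOP_CRITIC_INSTRUCTION = """Solve the web shopping task by interacting with the environment.

{examples}

Instruction: {question}
{scratchpad}"""

# The agent would fill the template before each LLM call:
prompt = WEBSHOP_CRITIC_INSTRUCTION.format(
    examples="<few-shot examples here>",
    question="Buy a red t-shirt under $20.",
    scratchpad="",
)
```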

What to submit:

  • Set up your environment via the CONTRIBUTING.md
  • Make a Pull Request (PR)
  • Add the prompts for the specified benchmark
  • Write a short notebook tmp.ipynb in cog/agent showcasing the agent run on a sample question from the benchmark
    • Add print statements for all calls to the LLM, both for easier debugging and so I can easily verify the outputs
  • Include a thorough description of your changes within the PR
  • Request a review from @alckasoc
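One simple way to print every LLM call in the notebook is to wrap the LLM in a small logging adapter rather than scattering prints through the agent code. The sketch below is a generic pattern, not the repo's actual LLM interface — adapt the call signature to whatever the CRITIC agent expects:

```python
# Minimal sketch of logging every LLM call for the tmp.ipynb demo.
# `llm` is a stand-in callable (prompt -> response); the real agent's
# LLM interface in this repo may differ.

class PrintingLLM:
    """Wraps any callable LLM so each prompt/response pair is printed."""

    def __init__(self, llm):
        self.llm = llm

    def __call__(self, prompt: str) -> str:
        print("=== PROMPT ===")
        print(prompt)
        response = self.llm(prompt)
        print("=== RESPONSE ===")
        print(response)
        return response

# Usage with a dummy LLM for illustration:
dummy = PrintingLLM(lambda p: "a sample completion")
result = dummy("Instruction: Buy a red t-shirt under $20.")
```

The wrapped object behaves exactly like the original LLM, so it can be passed to the agent in place of the real one while leaving the agent code untouched.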

CRITIC may not have been tested on this benchmark. If so, refer to other methods that have been tested on it, and check the project lifecycle document.
If any additional logic is needed to test CRITIC on this benchmark, include those specifications in the PR description.

Feel free to ask me questions on Slack if you're confused! Good luck!