daveshap/Raspberry

Paper -> CoT pipeline: Optimize prompt for paper grading rubric


Requirements:

  • Finalize rubric questions
  • Optimize rest of prompt if necessary

Deliverable: the final prompt

Here are the current rubric questions. The model is instructed to answer each question with a simple yes or no, along with an explanation of the reasoning behind the answer:

  • Is there a clear, well-defined central question explicitly stated in the paper?
  • Does the paper provide a definitive answer to this central question?
  • Is the answer derived through multi-step reasoning that includes at least 3 distinct logical steps or connections?
  • Is the reasoning leading to the answer logically coherent and well-structured?
  • Can the reasoning be explained to a layperson (defined as an educated adult without specific expertise in the paper's field) with some effort?
  • Does the paper minimize jargon in the reasoning process, or does it explain necessary technical terms used to derive the answer?
  • Are there illustrative examples or analogies in the reasoning that aid in understanding the answer?
  • Does the reasoning provide significant insights or depth specifically related to the question and its answer?
  • Does the paper provide sufficient information for the key reasoning steps to be independently verified or reproduced?
  • Is the paper suitable for extracting a clear question and an answer arrived at by comprehensible, complex reasoning?
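As a rough illustration of how these questions could be assembled into a grading prompt and how the yes/no replies could be read back, here is a minimal sketch. It is not the pipeline's actual prompt: the instruction wording, the one-line-per-question reply format, and the names `build_rubric_prompt`, `parse_rubric_answers`, and `count_yes` are all assumptions made for this example. Counting "yes" answers as a score is likewise an assumption, not something this issue specifies.

```python
import re

# The ten rubric questions, copied from the list above.
RUBRIC_QUESTIONS = [
    "Is there a clear, well-defined central question explicitly stated in the paper?",
    "Does the paper provide a definitive answer to this central question?",
    "Is the answer derived through multi-step reasoning that includes at least 3 distinct logical steps or connections?",
    "Is the reasoning leading to the answer logically coherent and well-structured?",
    "Can the reasoning be explained to a layperson (defined as an educated adult without specific expertise in the paper's field) with some effort?",
    "Does the paper minimize jargon in the reasoning process, or does it explain necessary technical terms used to derive the answer?",
    "Are there illustrative examples or analogies in the reasoning that aid in understanding the answer?",
    "Does the reasoning provide significant insights or depth specifically related to the question and its answer?",
    "Does the paper provide sufficient information for the key reasoning steps to be independently verified or reproduced?",
    "Is the paper suitable for extracting a clear question and an answer arrived at by comprehensible, complex reasoning?",
]


def build_rubric_prompt(paper_text, questions=RUBRIC_QUESTIONS):
    """Assemble a grading prompt (hypothetical wording, not the final prompt)."""
    numbered = "\n".join(f"{i}. {q}" for i, q in enumerate(questions, start=1))
    return (
        "Grade the paper below against each rubric question.\n"
        "For every question, reply on one line in the form\n"
        "'<number>. <yes|no> - <explanation of the reasoning>'.\n\n"
        f"Rubric questions:\n{numbered}\n\n"
        f"Paper:\n{paper_text}\n"
    )


# Matches lines such as "3. Yes - The argument has four steps."
ANSWER_LINE = re.compile(r"^\s*(\d+)\.\s*(yes|no)\b\W*(.*)$", re.IGNORECASE)


def parse_rubric_answers(response, n_questions=len(RUBRIC_QUESTIONS)):
    """Map question number -> (is_yes, explanation); other lines are ignored."""
    answers = {}
    for line in response.splitlines():
        match = ANSWER_LINE.match(line)
        if match and 1 <= int(match.group(1)) <= n_questions:
            is_yes = match.group(2).lower() == "yes"
            answers[int(match.group(1))] = (is_yes, match.group(3).strip())
    return answers


def count_yes(answers):
    """Number of questions answered 'yes' (an assumed way to summarize a grade)."""
    return sum(1 for is_yes, _ in answers.values() if is_yes)
```

Keeping the questions in a plain list means the rubric can be revised without touching the prompt template or the parser.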

Here is the full archive of artifacts (inference data for profiling and CoT extraction, JSONL training files) for a test run of 100 papers through the pipeline:

paper-cot-extraction-test-data.tar.gz

The CoT extraction logs are in results/inference, in files whose names end with -paper-profiling.txt.
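To locate those logs without unpacking the whole archive, something like the following sketch may help. It assumes the archive is a gzip-compressed tarball and that member paths contain `results/inference/`. The exact layout inside the archive is not stated here, so the check is deliberately loose. The function name `list_profiling_logs` is made up for this example.

```python
import tarfile


def list_profiling_logs(archive_path,
                        directory="results/inference/",
                        suffix="-paper-profiling.txt"):
    """Return sorted member names of the profiling logs inside the archive.

    Assumes member paths contain `directory`, possibly below a top-level
    folder, and that log file names end with `suffix`.
    """
    with tarfile.open(archive_path, "r:gz") as archive:
        return sorted(
            member.name
            for member in archive.getmembers()
            if member.isfile()
            and directory in member.name
            and member.name.endswith(suffix)
        )


if __name__ == "__main__":
    for name in list_profiling_logs("paper-cot-extraction-test-data.tar.gz"):
        print(name)
```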

We've decided the current list of questions is sufficient until we need to run a larger number of papers, which will require funding.