Paper -> CoT pipeline: Optimize prompt for paper grading rubric

Question

Opened this issue 3 months ago · 3 comments

Requirements:

Finalize rubric questions
Optimize rest of prompt if necessary

Deliverable: the final prompt

Answer 1 · 2024-10-02T03:27:13.000Z

Here are the current rubric questions. The model is instructed to give a simple yes/no answer to each question, along with an explanation of the reasoning behind the answer:

Is there a clear, well-defined central question explicitly stated in the paper?
Does the paper provide a definitive answer to this central question?
Is the answer derived through multi-step reasoning that includes at least 3 distinct logical steps or connections?
Is the reasoning leading to the answer logically coherent and well-structured?
Can the reasoning be explained to a layperson (defined as an educated adult without specific expertise in the paper's field) with some effort?
Does the paper minimize jargon in the reasoning process, or does it explain necessary technical terms used to derive the answer?
Are there illustrative examples or analogies in the reasoning that aid in understanding the answer?
Does the reasoning provide significant insights or depth specifically related to the question and its answer?
Does the paper provide sufficient information for the key reasoning steps to be independently verified or reproduced?
Is the paper suitable for extracting a clear question and an answer arrived at by comprehensible, complex reasoning?
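
To make the yes/no-plus-explanation format concrete, here is a minimal sketch of how the rubric questions above could be assembled into a prompt and how the answers could be parsed back out. It is not the pipeline's actual prompt: the function names (build_rubric_prompt, parse_rubric_response) and the "<number>. <Yes|No> - <explanation>" line format are assumptions made for illustration, and only the first three questions are included to keep it short.

```python
import re

# Rubric questions quoted from the list above (first three only, for brevity).
RUBRIC_QUESTIONS = [
    "Is there a clear, well-defined central question explicitly stated in the paper?",
    "Does the paper provide a definitive answer to this central question?",
    "Is the answer derived through multi-step reasoning that includes at least 3 distinct logical steps or connections?",
]


def build_rubric_prompt(questions):
    """Number the questions and request one answer line per question.

    The requested line format is an assumption, not the pipeline's real prompt.
    """
    lines = [
        "Answer each question about the paper with Yes or No, followed by a brief explanation.",
        "Use exactly one line per question, formatted as: <number>. <Yes|No> - <explanation>",
        "",
    ]
    lines += [f"{i}. {q}" for i, q in enumerate(questions, start=1)]
    return "\n".join(lines)


# Matches the assumed answer format, e.g. "3. Yes - explanation" or "3. No: explanation".
_ANSWER_RE = re.compile(r"^\s*(\d+)\.\s*(yes|no)\b\s*[-:,.]?\s*(.*)$", re.IGNORECASE)


def parse_rubric_response(text):
    """Return {question_number: (is_yes, explanation)} for every line that matches."""
    results = {}
    for line in text.splitlines():
        match = _ANSWER_RE.match(line)
        if match:
            number = int(match.group(1))
            is_yes = match.group(2).lower() == "yes"
            results[number] = (is_yes, match.group(3).strip())
    return results
```

Lines that do not follow the assumed format are skipped rather than guessed at, so a missing question number in the parsed result signals that the model's answer needs a manual look.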

Answer 2 · 2024-10-03T17:17:27.000Z

Here is the full archive of artifacts (inference data for profiling and CoT extraction, JSONL training files) for a test run of 100 papers through the pipeline:

paper-cot-extraction-test-data.tar.gz

The CoT extraction logs are in results/inference, in files with the -paper-profiling.txt suffix.
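
For anyone who wants to pull just those logs out of the archive, here is a rough sketch using Python's standard tarfile module. The helper names (select_log_names, list_logs_in_archive) are hypothetical, and the archive's internal layout is an assumption: the filter accepts results/inference either at the top level or nested under another directory, since the exact structure is not described here.

```python
import tarfile

LOG_DIR = "results/inference"        # directory named in the comment above
LOG_SUFFIX = "-paper-profiling.txt"  # file suffix named in the comment above


def select_log_names(member_names):
    """Keep paths that sit under results/inference and end with the log suffix."""
    selected = []
    for name in member_names:
        # Pad with a leading slash so top-level and nested layouts match the same way.
        padded = "/" + name.lstrip("./")
        if f"/{LOG_DIR}/" in padded and padded.endswith(LOG_SUFFIX):
            selected.append(name)
    return sorted(selected)


def list_logs_in_archive(archive_path):
    """Open a .tar.gz archive and return the names of the matching log files."""
    with tarfile.open(archive_path, "r:gz") as archive:
        return select_log_names(archive.getnames())
```

Calling list_logs_in_archive("paper-cot-extraction-test-data.tar.gz") on a local copy of the archive should then return the log paths, provided the layout assumption holds.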

Answer 3 · 2024-10-12T18:12:42.000Z

We've decided the current list of questions is sufficient until we need to run a larger number of papers; we'll need funding for that.