AnswerDotAI/cold-compress

Benchmark prompt summarizers with frontier LLMs (GPT-4o / Opus)

griff4692 opened this issue · 1 comment

See this section of the writeup

Inspiration

The goal is to test different methods for taking an existing prompt and shortening it.

The simplest version just involves sending a frontier LLM (e.g., GPT-4o or Opus) an instruction along the lines of "summarize this prompt".
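A minimal sketch of what that call could look like. The instruction wording, helper names, and model choice are illustrative assumptions, not something specified in this issue:

```python
# Hypothetical sketch of prompt compression via a frontier LLM.
# The instruction text and model name are assumptions for illustration.
COMPRESSION_INSTRUCTION = (
    "Summarize this prompt as concisely as possible while preserving "
    "all information needed to answer it:\n\n{prompt}"
)


def build_compression_request(prompt: str) -> list[dict]:
    """Build the chat messages for a single prompt-compression call."""
    return [
        {"role": "user", "content": COMPRESSION_INSTRUCTION.format(prompt=prompt)}
    ]


def compress_prompt(prompt: str, model: str = "gpt-4o") -> str:
    """Ask the LLM to return a shortened version of `prompt`."""
    from openai import OpenAI  # requires OPENAI_API_KEY in the environment

    client = OpenAI()
    resp = client.chat.completions.create(
        model=model, messages=build_compression_request(prompt)
    )
    return resp.choices[0].message.content
```

The same `build_compression_request` helper could be pointed at Opus through the Anthropic SDK instead; only the client call changes.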

Evaluation metrics:

  • Compression Ratio
  • Benchmark performance (lm-eval-harness is integrated into gpt-fast).
  • Performance should be tested across settings (RAG, n-shot, short prompts, long prompts, etc.)
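For the first metric, a minimal sketch of how compression ratio could be computed. Whitespace tokenization here is a stand-in assumption for the model's actual tokenizer; higher means more compression:

```python
def compression_ratio(original: str, compressed: str) -> float:
    """Ratio of original to compressed length, in (whitespace) tokens.

    A real implementation would count tokens with the target model's
    tokenizer; word counts are a rough illustrative proxy.
    """
    # Guard against an empty compressed prompt to avoid division by zero.
    return len(original.split()) / max(len(compressed.split()), 1)
```

For example, a 40-word prompt compressed to 10 words gives a ratio of 4.0.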

De-prioritized