Benchmark prompt summarizers with frontier LLMs (GPT-4o / Opus)
griff4692 opened this issue · 1 comment
griff4692 commented
See this section of the writeup
Inspiration
The goal is to test different methods for taking an existing prompt and shortening it.
The simplest baseline is a meta-prompt along the lines of "summarize this prompt".
Evaluation metrics:
- Compression Ratio
- Benchmark performance (lm-eval-harness is integrated into gpt-fast).
- This should be tested across settings (RAG, n-shot, short prompt, long prompt, etc.)
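A minimal sketch of the proposed setup: wrap an existing prompt in a "summarize this prompt" meta-prompt and compute a compression ratio. The template wording and the character-length proxy are assumptions (a real run would count tokens with the target model's tokenizer, and the frontier-model call itself is left out):

```python
# Hypothetical meta-prompt template; the exact wording is an assumption,
# not taken from the writeup.
SUMMARIZE_TEMPLATE = (
    "Summarize this prompt as concisely as possible while preserving "
    "all instructions and constraints:\n\n{prompt}"
)


def build_summarize_request(prompt: str) -> str:
    """Build the request sent to the frontier model (e.g. GPT-4o / Opus)."""
    return SUMMARIZE_TEMPLATE.format(prompt=prompt)


def compression_ratio(original: str, summary: str) -> float:
    """Character-length proxy for compression: original / summary.

    A tokenizer-based count would be more faithful to what the
    downstream model actually sees.
    """
    if not summary:
        raise ValueError("summary is empty")
    return len(original) / len(summary)
```

Benchmark performance would then come from running the shortened prompts through lm-eval-harness and comparing scores against the originals.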
griff4692 commented
De-prioritized