Priompt

Priompt (priority + prompt) is a JSX-based prompting library. It uses priorities to decide what to include in the context window.

Priompt is an attempt at a prompt design library, inspired by web design libraries like React. Read more about the motivation here.

Installation

Install from npm:

npm install @anysphere/priompt && npm install -D @anysphere/priompt-preview

yarn add @anysphere/priompt && yarn add @anysphere/priompt-preview --dev

pnpm add @anysphere/priompt && pnpm add -D @anysphere/priompt-preview

Examples

Read examples/README.md to run the examples.

Principles

Prompts are rendered from a JSX component, which can look something like this:

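// Assumes the TypeScript JSX factory is configured for Priompt, as in the examples.
import * as Priompt from "@anysphere/priompt";
import {
  AssistantMessage,
  PromptElement,
  PromptProps,
  SystemMessage,
  UserMessage,
} from "@anysphere/priompt";

function ExamplePrompt(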
  props: PromptProps<{
    name: string,
    message: string,
    history: { case: "user" | "assistant", message: string }[],
  }>
): PromptElement {
  // charAt(0) returns "" for an empty name instead of throwing on undefined.
  const capitalizedName = props.name.charAt(0).toUpperCase() + props.name.slice(1);
  return (
    <>
      <SystemMessage>
        The user's name is {capitalizedName}. Please respond to them kindly.
      </SystemMessage>
      {props.history.map((m, i) => (
        <scope prel={-(props.history.length - i)}>
          {m.case === "user" ? (
            <UserMessage>{m.message}</UserMessage>
          ) : (
            <AssistantMessage>{m.message}</AssistantMessage>
          )}
        </scope>
      ))}
      <UserMessage>{props.message}</UserMessage>
      <empty tokens={1000} />
    </>
  );
}

A component is rendered only once. Each child has a priority, where a higher priority means that the child is more important to include in the prompt. If no priority is specified, the child is included if and only if its parent is included. Absolute priorities are specified with p and relative ones are specified with prel.

In the example above, we always include the system message and the latest user message, and we include as many messages from the history as possible, with later messages prioritized over earlier ones.
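To make the priority arithmetic concrete, here is a minimal sketch of how the prel values in the example order the history. It does not use Priompt itself: historyPriorities and the basePriority value of 1000 are hypothetical names and numbers chosen for illustration, and the sketch assumes that a relative priority is simply added to the priority of the enclosing scope.

```typescript
// Illustrative only: assumes prel is added to the enclosing scope's priority.
type HistoryMessage = { case: "user" | "assistant"; message: string };

function historyPriorities(
  history: HistoryMessage[],
  basePriority: number // hypothetical priority of the enclosing scope
): number[] {
  // Same expression as in ExamplePrompt: prel = -(history.length - i)
  return history.map((_, i) => basePriority + -(history.length - i));
}

const history: HistoryMessage[] = [
  { case: "user", message: "Hi" },
  { case: "assistant", message: "Hello! How can I help?" },
  { case: "user", message: "What is Priompt?" },
];

// The oldest message gets the lowest priority, the newest the highest.
const priorities = historyPriorities(history, 1000);
console.log(priorities); // [997, 998, 999]
```

Because every history message ends up below the base priority, the system message and the latest user message are the last things to be dropped.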

The key promise of the Priompt renderer is:

Let $T$ be the token limit and $\text{Prompt}(p_\text{cutoff})$ be the function that creates a prompt by including all scopes with priority $p_\text{scope} \geq p_\text{cutoff}$, and no others. Then, the rendered prompt is $\mathbf{P} = \text{Prompt}(p_\text{opt-cutoff})$ where $p_\text{opt-cutoff}$ is the minimum value such that $|\text{Prompt}(p_\text{opt-cutoff})| \leq T$.
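The promise can be sketched with a small model. This is an illustration under simplifying assumptions, not Priompt's renderer: scopes are flat, each has an already-resolved absolute priority and a known token count, and the names Scope, promptSize, and optimalCutoff are hypothetical. It tries candidate cutoffs from lowest to highest and returns the first one that fits.

```typescript
// Illustrative model, not Priompt's implementation: flat scopes with
// resolved absolute priorities and known token counts.
type Scope = { priority: number; tokens: number };

// |Prompt(cutoff)|: total tokens of all scopes with priority >= cutoff.
function promptSize(scopes: Scope[], cutoff: number): number {
  return scopes
    .filter((s) => s.priority >= cutoff)
    .reduce((sum, s) => sum + s.tokens, 0);
}

// Returns the lowest cutoff whose prompt fits within tokenLimit.
// Only priorities that actually occur need to be tried as cutoffs.
function optimalCutoff(scopes: Scope[], tokenLimit: number): number {
  const candidates = Array.from(new Set(scopes.map((s) => s.priority))).sort(
    (a, b) => a - b
  );
  for (const cutoff of candidates) {
    if (promptSize(scopes, cutoff) <= tokenLimit) return cutoff;
  }
  return Infinity; // nothing fits: every scope is excluded
}

const scopes: Scope[] = [
  { priority: 1000, tokens: 20 }, // system message
  { priority: 997, tokens: 400 }, // oldest history message
  { priority: 998, tokens: 300 },
  { priority: 999, tokens: 200 }, // newest history message
  { priority: 1000, tokens: 30 }, // latest user message
];

const cutoff = optimalCutoff(scopes, 600);
console.log(cutoff, promptSize(scopes, cutoff)); // 998 550
```

With a limit of 600 tokens, including everything would take 950 tokens, so the oldest history message is dropped and the remaining 550 tokens are rendered.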

The building blocks of a priompt prompt are:

<scope>: lets you set a priority, with p for an absolute priority or prel for a relative one.
<first>: includes the first child with a sufficiently high priority and excludes all children after it. This is useful for fallbacks, such as saying "(result omitted)" when the result is too long.
<empty>: reserves empty space, which is useful for leaving room for generated tokens.
<capture>: captures the output and parses it right within the prompt.
<isolate>: isolates a section of the prompt with its own token limit. This is useful for guaranteeing that the start of the prompt stays the same for caching purposes. It would be nice to extend this to allow token limits like 100% - 100.
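The fallback behavior of <first> can be sketched as a small model. This is not Priompt's implementation: Child and resolveFirst are hypothetical names, the priorities 500 and 1000 are arbitrary, and the sketch assumes that each child is compared against a single priority cutoff in document order.

```typescript
// Illustrative model of the <first> rule; not Priompt's implementation.
type Child = { priority: number; text: string };

// Returns the first child (in document order) whose priority meets the
// cutoff, or undefined when no child qualifies. Later children are skipped.
function resolveFirst(children: Child[], cutoff: number): Child | undefined {
  return children.find((c) => c.priority >= cutoff);
}

const toolResult: Child[] = [
  { priority: 500, text: "<full, potentially very long result>" },
  { priority: 1000, text: "(result omitted)" },
];

console.log(resolveFirst(toolResult, 400)?.text); // "<full, potentially very long result>"
console.log(resolveFirst(toolResult, 800)?.text); // "(result omitted)"
console.log(resolveFirst(toolResult, 2000)?.text); // undefined
```

Giving the short fallback a higher priority than the full result means that, as the cutoff rises, the prompt degrades from the full result to the placeholder before dropping the section entirely.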

You can create components all you want, just like in React.

Future

Some building blocks we're thinking of adding:

<max>: specify a limit on the number of tokens within a scope
onExcluded={() => {...}}: a callback for when a particular scope is excluded, which allows you to do things like "summarize this result when it doesn't fit in the prompt anymore".

We're also thinking about making a framework around Priompt for agents. It would look something like interactive web design but for agents, where onClicks are simulated by having the agent call a function. We would love ideas here!

Caveats

We've discovered that adding priorities to everything is sort of an anti-pattern. It is possible that priorities are the wrong abstraction. We have found them useful, though, for including long files in the prompt line by line.
The Priompt renderer has no built-in support for creating cacheable prompts. If you overuse priorities, it is easy to make hard-to-cache prompts, which may increase your cost or latency for LLM inference. We are interested in good solutions here, but for now it is up to the prompt designer to think about caching.
The current version of Priompt handles only around 10K scopes reasonably fast (this is enough for most use cases). If you want to include a really long file in the prompt (>10K lines) and you split it line by line, you probably want to implement something like "for lines farther than 1000 lines away from the cursor position we have coarser scopes of 10 lines at a time".
For latency-critical prompts, monitor the time usage in the Priompt preview dashboard. If there are too many scopes, you may want to optimize for performance.
The Priompt renderer is not always guaranteed to produce the perfect $p_\text{opt-cutoff}$. For example, if a higher-priority child of a <first> has more tokens than a lower-priority child, the currently implemented binary search renderer may return a (very slightly) incorrect result.
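The coarser-scope idea for very long files can be sketched as follows. This helper is not part of Priompt: chunkLines, Chunk, fineRadius, and coarseSize are hypothetical names, and only the defaults of 1000 and 10 come from the caveat above. Each returned chunk would then be wrapped in one scope.

```typescript
// Illustrative helper, not part of Priompt: groups the lines of a long file
// into chunks, one scope per chunk. Lines within fineRadius of the cursor get
// their own chunk; lines farther away are grouped coarseSize at a time.
type Chunk = { startLine: number; endLine: number }; // 0-based, endLine exclusive

function chunkLines(
  lineCount: number,
  cursorLine: number,
  fineRadius = 1000,
  coarseSize = 10
): Chunk[] {
  const chunks: Chunk[] = [];
  const isNear = (line: number) => Math.abs(line - cursorLine) <= fineRadius;
  let line = 0;
  while (line < lineCount) {
    if (isNear(line)) {
      chunks.push({ startLine: line, endLine: line + 1 });
      line += 1;
    } else {
      // Grow the coarse chunk up to coarseSize lines, stopping at the end of
      // the file or at the first line that is near the cursor.
      let end = line + 1;
      while (end < lineCount && end - line < coarseSize && !isNear(end)) end += 1;
      chunks.push({ startLine: line, endLine: end });
      line = end;
    }
  }
  return chunks;
}

// A 25,000-line file with the cursor on line 12,000.
console.log(chunkLines(25000, 12000).length); // 4301
```

In this example the scope count drops from 25,000 to 4,301, which stays under the roughly 10K scopes mentioned above while keeping line-level granularity near the cursor.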

Contributions

Contributions are very welcome! This entire repo is MIT-licensed.