`harness`

`harness` is a tool for testing the efficacy of prompts and prompt + model combinations.

Navigation

What and Why
Installation
Usage
Examples

Note

August 21, 2024 — This is brand new and there will be lots of updates coming soon.

What and Why

I made this silly little tool because I'm tired of wondering, subjectively, if one prompt works better than another one. I want a way to test them against each other. So that's what this is.

Installation

First install Fabric, which is another project of ours. To install Fabric, make sure Go is installed, and then run the following command.

# Install Fabric directly from the repo
go install github.com/danielmiessler/fabric@latest

# Run the setup to set up your directories and keys
fabric --setup

Environment Variables

If everything works you are good to go, but you may need to set some environment variables in your ~/.bashrc or ~/.zshrc file. Here is an example of what you can add:

# Golang environment variables
export GOROOT=/usr/local/go
export GOPATH=$HOME/go
export PATH=$GOPATH/bin:$GOROOT/bin:$HOME/.local/bin:$PATH

Then once fabric runs fine, you're pretty much done. harness just runs Fabric using Bash.
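If you want a quick sanity check before moving on, something like the sketch below can help. It assumes Go's default GOPATH of $HOME/go when the variable is unset, and `path_contains` is a made-up helper name for this example, not part of Fabric or harness.

```shell
#!/usr/bin/env bash
# Return success if directory $1 is an entry in the colon-separated list $2
# (defaults to the current PATH). Hypothetical helper for illustration.
path_contains() {
  case ":${2:-$PATH}:" in
    *":$1:"*) return 0 ;;
    *) return 1 ;;
  esac
}

# Assumption: Go installs binaries into $GOPATH/bin, and GOPATH falls back to $HOME/go.
go_bin="${GOPATH:-$HOME/go}/bin"

if path_contains "$go_bin"; then
  echo "ok: $go_bin is on PATH"
else
  echo "missing: add $go_bin to PATH"
fi

# command -v succeeds only if the shell can resolve the fabric binary.
if command -v fabric >/dev/null 2>&1; then
  echo "ok: fabric found at $(command -v fabric)"
else
  echo "missing: fabric is not on PATH yet"
fi
```

If both lines say ok, the environment variables above are probably doing their job.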

Usage

To use harness, just do the following:

Clone this repo.
cd $your_harness_directory
Put your input in input.md (like the transcript you're analyzing, or whatever).
Put your first prompt in prompt1.md.
Put your second prompt in prompt2.md.
Run ./harness.sh.
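
Before running, it can be worth confirming that the three files from the steps above actually exist and aren't empty. This is only a sketch: `missing_inputs` is a hypothetical helper, and harness itself may or may not do a check like this.

```shell
#!/usr/bin/env bash
# Print the name of each expected file that is missing or empty in directory $1
# (defaults to the current directory). Hypothetical helper for illustration.
missing_inputs() {
  local dir
  local f
  dir="${1:-.}"
  for f in input.md prompt1.md prompt2.md; do
    # -s is true only when the file exists and has a size greater than zero.
    [ -s "$dir/$f" ] || echo "$f"
  done
}

missing="$(missing_inputs .)"
if [ -n "$missing" ]; then
  echo "Missing or empty:" $missing
else
  echo "All inputs present; ready to run ./harness.sh"
fi
```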

Multi-mode

Harness has a cool feature where you can try multiple runs and do analysis on the full set of results (because LLMs have a lot of variance from run to run).

To do that, just add a number to the end of the command.

./harness.sh 10

Now you should be able to see which is better across multiple runs, and if you don't see much of a difference across like 10 runs, the differences are probably pretty small.
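
To make the idea of comparing across runs concrete, here is a tiny sketch of tallying per-run winners. The labels and the ten sample results are made up for illustration; they are not real harness output, and the actual output format of harness.sh may look nothing like this.

```shell
#!/usr/bin/env bash
# Read one winner label per line ("prompt1" or "prompt2") on stdin and print totals.
# Hypothetical helper; the label format is an assumption for this example.
tally_wins() {
  awk '
    $1 == "prompt1" { a++ }
    $1 == "prompt2" { b++ }
    END { printf "prompt1=%d prompt2=%d runs=%d\n", a, b, a + b }
  '
}

# Ten invented results, hard-coded so the example runs without calling a model.
printf '%s\n' prompt1 prompt2 prompt1 prompt1 prompt2 \
              prompt1 prompt1 prompt2 prompt1 prompt1 | tally_wins
# prints: prompt1=7 prompt2=3 runs=10
```

A 7 to 3 split over 10 runs is a hint, not proof; if the split hovers near even, the two prompts are probably close.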

Enjoy!

NOTES

Caution

There is no security or input validation of any kind on this thing. It's a shell script, so like don't put this in production or anything.
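
If you do wrap this in something else, one cheap safeguard is to check the run count yourself before passing it along. A minimal sketch, assuming the only argument is the number of runs; `is_positive_int` is a hypothetical helper, not something harness ships.

```shell
#!/usr/bin/env bash
# Return success only if $1 is made up entirely of digits and is greater than zero.
# Hypothetical helper for illustration.
is_positive_int() {
  case "$1" in
    ''|*[!0-9]*) return 1 ;;   # empty, or contains a non-digit character
    *) [ "$1" -gt 0 ] ;;       # digits only: reject zero
  esac
}

# Default to a single run when no argument is given.
runs="${1:-1}"
if is_positive_int "$runs"; then
  echo "would run: ./harness.sh $runs"
else
  echo "refusing: run count must be a positive integer, got '$runs'" >&2
fi
```

This only guards the run count. It does nothing about the contents of input.md or the prompt files, so the caution above still applies.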
