
Primary language: TypeScript. License: MIT.

Anthropic Cog

This is a Crank Cog for Anthropic Claude, providing steps and assertions for you to validate the state and behavior of your Claude instance.

Installation

Ensure you have the crank CLI and docker installed and running locally, then run the following. You'll be prompted to enter your Anthropic credentials once the Cog is successfully installed.

$ crank cog:install stackmoxie/anthropic

Note: You can always re-authenticate later.

Usage

Authentication

You will be asked for the following authentication details on installation. To avoid prompts in a CI/CD context, you can provide the same details as environment variables.

Field    Install-Time Environment Variable     Description
apiKey   CRANK_STACKMOXIE_ANTHROPIC__APIKEY    Anthropic API Key

# Re-authenticate by running this
$ crank cog:auth stackmoxie/anthropic
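
In a CI/CD pipeline it can help to fail fast when the variable is missing, before any scenario runs. The following is a minimal sketch and not part of the Cog: `missingAuthVars` is a hypothetical helper, and the environment is passed in as a plain object so the snippet stays self-contained (in Node.js you would typically pass `process.env`).

```typescript
// Environment variable named in the authentication table above.
const REQUIRED_AUTH_VARS: string[] = ['CRANK_STACKMOXIE_ANTHROPIC__APIKEY'];

// Hypothetical helper: returns the required variables that are unset or blank.
function missingAuthVars(env: { [name: string]: string | undefined }): string[] {
  return REQUIRED_AUTH_VARS.filter((name) => {
    const value = env[name];
    return value === undefined || value.trim() === '';
  });
}

// Example: an environment where the API key has not been provided.
const missing = missingAuthVars({ PATH: '/usr/bin' });
if (missing.length > 0) {
  console.log('Missing Crank auth variables: ' + missing.join(', '));
}
```

Checking for blank values as well as unset ones catches the common case of a CI secret that exists but was never filled in.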

Steps

Once installed, the following steps will be available for use in any of your Scenario files.

Each step below lists its name (ID), expression, and expected data.

Compare Anthropic model A and B prompt responses from completion (CompletionEqualsAb)

  Expression:
    Anthropic model (?<modela>[a-zA-Z0-9_-]+) and (?<modelb>[a-zA-Z0-9_-]+) responses to "(?<prompt>[a-zA-Z0-9_ -\p{P}]+)" should (?<operator>be set|not be set|be less than|be greater than|be one of|be|contain|not be one of|not be|not contain|match|not match) ?(?<expectation>.+)?
  Expected Data:
    - prompt: User Prompt to send to Anthropic Model
    - modela: Anthropic Model A to use for completion
    - modelb: Anthropic Model B to use for completion
    - operator: Check Logic (be, not be, contain, not contain, be greater than, be less than, be set, not be set, be one of, not be one of)
    - expectation: Expected Anthropic model response value

Check Anthropic prompt response from completion (CompletionEquals)

  Expression:
    Anthropic model (?<model>[a-zA-Z0-9_-]+) response to "(?<prompt>[a-zA-Z0-9_ -\p{P}]+)" should (?<operator>be set|not be set|be less than|be greater than|be one of|be|contain|not be one of|not be|not contain|match|not match) ?(?<expectation>.+)?
  Expected Data:
    - prompt: User Prompt to send to Anthropic Model
    - model: Anthropic Model to use for completion
    - operator: Check Logic (be, not be, contain, not contain, be greater than, be less than, be set, not be set, be one of, not be one of)
    - expectation: Expected Anthropic model response value

Check Anthropic prompt response FRES reading ease evaluation (CompletionReadability)

  Expression:
    Anthropic model (?<model>[a-zA-Z0-9_-]+) school level of the response to "(?<prompt>[a-zA-Z0-9_ -\p{P}]+)" should (?<operator>be less than|be greater than|be one of|be|not be one of|not be) ?(?<schoollevel>.+)?
  Expected Data:
    - prompt: User Prompt to send to Anthropic Model
    - model: Anthropic Model to use for completion
    - operator: Check Logic (be, not be, be greater than, be less than, be one of, not be one of)
    - schoollevel: Expected School Level

Check Anthropic semantic similarity of response to provided text from completion (CompletionSemanticSimilarity)

  Expression:
    Anthropic model (?<model>[a-zA-Z0-9_-]+) response to "(?<prompt>[a-zA-Z0-9_ -\p{P}]+)" semantically compared with "(?<comparetext>[a-zA-Z0-9_ -\p{P}]+)" should (?<operator>be set|not be set|be less than|be greater than|be one of|be|contain|not be one of|not be|not contain|match|not match) ?(?<semanticsimilarity>.+)?
  Expected Data:
    - prompt: User Prompt to send to Anthropic Model
    - model: Anthropic Model to use for completion
    - operator: Check Logic (be, not be, contain, not contain, be greater than, be less than, be set, not be set, be one of, not be one of)
    - comparetext: Expected text to compare to Anthropic response
    - semanticsimilarity: Expected Semantic Similarity Score

Check Anthropic prompt response word count from completion (CompletionWordCount)

  Expression:
    Anthropic model (?<model>[a-zA-Z0-9_-]+) word count in a response to "(?<prompt>[a-zA-Z0-9_ -\p{P}]+)" should (?<operator>be set|not be set|be less than|be greater than|be one of|be|contain|not be one of|not be|not contain|match|not match) ?(?<expectation>.+)?
  Expected Data:
    - prompt: User Prompt to send to Anthropic Model
    - model: Anthropic Model to use for completion
    - operator: Check Logic (be, not be, contain, not contain, be greater than, be less than, be set, not be set, be one of, not be one of)
    - expectation: Expected Anthropic word count

Check Anthropic prompt token cost given a prompt and model (CompletionTokenCost)

  Expression:
    Anthropic model (?<model>[a-zA-Z0-9_-]+) ?(?<type>.+)? token cost in response to "(?<prompt>[a-zA-Z0-9_ -]+)" should (?<operator>be set|not be set|be less than|be greater than|be one of|be|contain|not be one of|not be|not contain|match|not match) ?(?<expectation>.+)? tokens
  Expected Data:
    - prompt: User Prompt to send to Anthropic Model
    - model: Anthropic Model to use for completion
    - type: Specify which token output to show (prompt/completion/total)
    - operator: Check Logic (be, not be, contain, not contain, be greater than, be less than, be set, not be set, be one of, not be one of)
    - expectation: Expected Anthropic input/output/total token cost

Check Anthropic prompt response time from request to completion (CompletionResponseTime)

  Expression:
    Anthropic model (?<model>[a-zA-Z0-9_-]+) response time in response to "(?<prompt>[a-zA-Z0-9_ -]+)" should (?<operator>be set|not be set|be less than|be greater than|be one of|be|contain|not be one of|not be|not contain|match|not match) ?(?<expectation>.+)? ms
  Expected Data:
    - prompt: User Prompt to send to Anthropic Model
    - model: Anthropic Model to use for completion
    - operator: Check Logic (be, not be, contain, not contain, be greater than, be less than, be set, not be set, be one of, not be one of)
    - expectation: Expected Anthropic response time in milliseconds
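
To see how a sentence in a Scenario file lines up with the named fields, the sketch below runs the CompletionEquals expression against a sample sentence. It assumes the expression is applied as an ordinary JavaScript regular expression without the `u` flag, which this README does not confirm. `parseStep` is a hypothetical helper and `my-model` is a placeholder, not a real model name.

```typescript
// Expression copied from the CompletionEquals step; "\\p" becomes "\p" once the string is parsed.
const EXPRESSION =
  'Anthropic model (?<model>[a-zA-Z0-9_-]+) response to "(?<prompt>[a-zA-Z0-9_ -\\p{P}]+)" should ' +
  '(?<operator>be set|not be set|be less than|be greater than|be one of|be|contain|not be one of|not be|not contain|match|not match)' +
  ' ?(?<expectation>.+)?';

// Assumption: compiled as a plain JavaScript RegExp without the "u" flag
// (with that flag, JavaScript rejects the " -\p{P}" range in the character class).
const pattern = new RegExp(EXPRESSION);

type StepData = { [field: string]: string | undefined };

// Hypothetical helper: returns the named groups for a sentence, or null if it does not match.
function parseStep(sentence: string): StepData | null {
  const match = pattern.exec(sentence);
  if (match === null) {
    return null;
  }
  // The cast keeps the sketch independent of the configured TypeScript "lib" version.
  const groups = (match as unknown as { groups?: StepData }).groups;
  return groups === undefined ? null : groups;
}

const data = parseStep('Anthropic model my-model response to "Say hello" should contain hello');
if (data !== null) {
  console.log(data.model, '|', data.prompt, '|', data.operator, '|', data.expectation);
}
```

The order of the alternatives in the operator group matters: longer phrases such as "be greater than" come before the bare "be", so the shorter operator only matches when none of the longer ones do.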

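The operator names accepted by the step expressions read like simple comparisons. As a rough illustration only, the sketch below shows one way such operators could be evaluated. It is an assumption, not the Cog's actual comparison code: the `check` function, the comma-separated handling of "be one of", and the numeric parsing are all hypothetical.

```typescript
type Operator =
  | 'be' | 'not be' | 'contain' | 'not contain'
  | 'be greater than' | 'be less than'
  | 'be set' | 'not be set'
  | 'be one of' | 'not be one of'
  | 'match' | 'not match';

// Assumption: "be one of" takes a comma-separated list of allowed values.
function splitList(list: string): string[] {
  return list.split(',').map((item) => item.trim());
}

// Hypothetical evaluator: compares an actual value with the expectation text from a step.
function check(operator: Operator, actual: string, expectation: string = ''): boolean {
  switch (operator) {
    case 'be': return actual === expectation;
    case 'not be': return actual !== expectation;
    case 'contain': return actual.indexOf(expectation) !== -1;
    case 'not contain': return actual.indexOf(expectation) === -1;
    case 'be greater than': return parseFloat(actual) > parseFloat(expectation);
    case 'be less than': return parseFloat(actual) < parseFloat(expectation);
    case 'be set': return actual.trim() !== '';
    case 'not be set': return actual.trim() === '';
    case 'be one of': return splitList(expectation).indexOf(actual) !== -1;
    case 'not be one of': return splitList(expectation).indexOf(actual) === -1;
    case 'match': return new RegExp(expectation).test(actual);
    case 'not match': return !new RegExp(expectation).test(actual);
    default: return false;
  }
}

// Example: a word count of 42 checked against an upper bound of 100.
console.log(check('be less than', '42', '100'));
```
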
Development and Contributing

Pull requests are welcome. For major changes, please open an issue first to discuss what you would like to change. Please make sure to add or update tests as appropriate.

Setup

  1. Install Node.js (v12.x+ recommended)
  2. Clone this repository.
  3. Install dependencies via npm install
  4. Run npm start to validate the Cog works locally (ctrl+c to kill it)
  5. Run crank cog:install --source=local --local-start-command="npm start" to register your local instance of this Cog. You may need to append a --force flag or run crank cog:uninstall stackmoxie/anthropic if you've already installed the distributed version of this Cog.

Adding/Modifying Steps

Modify code in src/steps and validate your changes by running crank cog:step stackmoxie/anthropic and selecting your step.

To add new steps, create new step classes in src/steps. Use existing steps as a starting point for your new step(s). Note that you will need to run crank registry:rebuild in order for your new steps to be recognized.

Always add tests for your steps in the test/steps folder. Use existing tests as a guide.

Modifying the API Client or Authentication Details

Modify the ClientWrapper class at src/client/client-wrapper.ts.

  • If you need to add or modify authentication details, see the expectedAuthFields static property.
  • If you need to expose additional logic from the wrapped API client, add a new public method to the wrapper class or mixins, which can then be called in any step.
  • It's also possible to swap out the wrapped API client completely. You should only have to modify code within this class or mixins to achieve that.
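
The three points above can be pictured with a small, self-contained sketch. It is not the code in src/client/client-wrapper.ts: the `AuthField` and `CompletionClient` interfaces and the `getWordCount` method are hypothetical stand-ins, and only the `ClientWrapper` name, the `expectedAuthFields` static property, and the `apiKey` field come from this README.

```typescript
// Hypothetical shape of one authentication field; the real type lives in the Cog's source.
interface AuthField {
  field: string;
  description: string;
}

// Hypothetical minimal interface for whatever API client is being wrapped.
interface CompletionClient {
  complete(model: string, prompt: string): Promise<string>;
}

class ClientWrapper {
  // Auth details requested at install time (apiKey, per the Authentication section).
  public static expectedAuthFields: AuthField[] = [
    { field: 'apiKey', description: 'Anthropic API Key' },
  ];

  private client: CompletionClient;

  // Taking the client as a parameter means swapping it out only touches this class.
  constructor(client: CompletionClient) {
    this.client = client;
  }

  // Example of exposing extra logic as a public method that a step could call.
  public async getWordCount(model: string, prompt: string): Promise<number> {
    const text = await this.client.complete(model, prompt);
    return text.trim().split(/\s+/).filter((word) => word.length > 0).length;
  }
}

// Usage with a fake client, so the sketch runs without network access or credentials.
const fakeClient: CompletionClient = {
  complete: async () => 'Hello from a fake client',
};
new ClientWrapper(fakeClient)
  .getWordCount('my-model', 'Say hello')
  .then((count) => console.log(count)); // prints 5
```

A fake client like this one is also a convenient way to exercise wrapper methods in tests without calling the real API.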

Note that you will need to run crank registry:rebuild in order for any changes to authentication fields to be reflected. Afterward, you can re-authenticate this Cog by running crank cog:auth stackmoxie/anthropic.

Tests and Housekeeping

Tests live in the test directory; run them with npm test. Check that your code meets the project's lint standards by running npm run lint.