testing gpt / other llms on the cyborg environment
writeup: https://docs.google.com/document/d/1-2tfITqtReGDWNxjrSJ5jQocOZjX0y9aG255o7SKL5U/edit?usp=sharing
- results will probably be much better with rl agent guiding llm instead of only llm planning
- maybe try fine tuning on task specific stuff (on oss models for now, costly for openai models)
- ignore securitybot.py, it was unfinished attempt at reimplementing the depending on yourself when you should paper
Probably good to use DSPy, will be testing on the dspy-test branch.
$6.65 - observation 1 - 100 steps with only sleep action for blue agent cost
$0.94 - observation 2 - 26 steps
$1.87 - observation 5 - 48 steps
$3.76 - observation 7 - 48 steps
$0.83 - observation 11 - 100 steps
$4.58 - 83 steps
observations 1 - $6.65
observations 2 - $0.94
observation 5 - $1.87
observation 7 - $3.76
observation 11 - $0.83
Total Spent:
API requests: 1,210
Tokens: 2,784,561
Credits: $34.08
help i have used around $40 now and i don't think i can have any more money for adding more gpt 4 credits.....
observations with analyse and monitor actions successfully working file: llama3.py
should probably add actual logging of results from analyse actions too so can take further actions from there, and maybe analyzing more than the first detected host?
currently using the groq api for llama3-8b-8192 because it's really fast and free.
You can get an api key here: https://console.groq.com/keys
results later