OpenFn/apollo

Develop AI Chat panel for Lightning

This is a high level project proposal for the gen-ai / apollo server. You should contact @josephjclark before getting started.

Overview

This is a proposal for some kind of high-level chat-based AI. Users should be able to type into a text box; what they type is sent to an LLM, which returns a response.

The idea is basically to train or fine-tune a model which knows about OpenFn. Instead of people going to ChatGPT and asking "what is OpenFn" or "how do I use the dhis2 adaptor", I want them to come to our model. We should have confidence that it will give better-quality answers to anything relating to OpenFn (bonus points if it refuses to answer questions unrelated to OpenFn).

It's not so much a chatbot; it's more like ChatGPT, but specialised.

Our initial target is to get an off-the-shelf model integrated into Lightning (our web-based workflow automation app, aka our frontend) without worrying too much about the quality and content of the returned chat. A little prompt engineering should be enough to get us started.

Details

The solution requires something like a chat.py module being added to the server.

Chat should take a question from a user, wrap it up in a prompt, and call the appropriate model.

Ideally it should preserve some context for follow-up questions. I don't know exactly how ChatGPT does this.
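(For what it's worth, chat-style APIs are typically stateless: the client re-sends the prior messages with every request, which is what the history property below is for.) A minimal sketch, assuming the OpenAI Python client; the model name is a placeholder:

# Sketch: context is preserved by replaying prior messages on every call.
# Assumes the openai Python client (>= 1.0); the model name is a placeholder.
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

history = [
    {"role": "user", "content": "What is OpenFn?"},
    {"role": "assistant", "content": "OpenFn is a workflow automation platform ..."},
]

# The follow-up only makes sense because the earlier messages are re-sent
follow_up = {"role": "user", "content": "How do I use the dhis2 adaptor with it?"}

completion = client.chat.completions.create(
    model="gpt-4o",  # placeholder
    messages=history + [follow_up],
)
print(completion.choices[0].message.content)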

Users should include their own API key. Documentation should tell users how to get one.

If the model is successful we'll one day build a frontend to it and host it somewhere. But for now the spec is to create an interface through REST or the CLI.
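To make the REST side concrete, here's a minimal sketch of what the endpoint could look like. This assumes Flask; the route, the api_key field, and generate() are all hypothetical, since apollo's actual framework and module layout may differ:

# Minimal sketch of a REST interface for chat.py, assuming Flask.
# The endpoint path, api_key field, and generate() are hypothetical.
from flask import Flask, jsonify, request

app = Flask(__name__)

@app.route("/services/chat", methods=["POST"])
def chat():
    payload = request.get_json()
    content = payload["content"]          # the only required field
    history = payload.get("history", [])  # optional, defaults to empty
    context = payload.get("context", {})  # optional job context
    api_key = payload.get("api_key")      # the user's own key (see above)

    # generate() is a hypothetical chat.py function: it wraps the question
    # in a prompt, calls the model, and returns (answer, updated_history)
    response, history = generate(content, history, context, api_key)
    return jsonify({"response": response, "history": history})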

Lightning integration

On the Lightning side, we need to display the whole chat in a side panel (as one of the Output tabs). It should be quite minimal.

But this issue is just to cover the backend chat service.

The server needs to support the following payload:

{
  content: the user's question or input
  history: [{ role, content }], the chat history so far
  context: {
    expression: the job expression
    adaptor: the adaptor name and version
    input: the input data going into the job
    output: the output from the last run
    log: the log from the last run
  }
}

Almost every property is optional. The prompt needs to be robust enough to add context when it's available, but not freak out if it's not there.
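One way to sketch that robustness: fold in whatever context is present and silently skip the rest. The labels, wording, and function name here are illustrative, not a final prompt design:

# Sketch: build whatever context is present into the prompt, skip what isn't.
# Field labels and wording are illustrative only.
def build_prompt(content: str, context: dict) -> str:
    labels = {
        "expression": "Job expression",
        "adaptor": "Adaptor",
        "input": "Input data",
        "output": "Output from the last run",
        "log": "Log from the last run",
    }
    sections = []
    for key, label in labels.items():
        value = context.get(key)
        if value:  # silently skip anything that's missing
            sections.append(f"{label}:\n{value}")
    context_block = "\n\n".join(sections)
    return f"{context_block}\n\nQuestion: {content}" if sections else content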

It will return this:

{
  response: the response from GPT (i.e. the last message, the answer to the question)
  history: the chat history so far (needed for the next input)
}

I would like the history to EXCLUDE the prompt added by the service: it should just be the raw inputs and outputs, and we only build the prompt into the next question. I don't know if that'll work. We'll chew up millions of tokens if we have to duplicate all the input stuff across every message in the chat history.
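For what it's worth, this should work: the chat APIs are stateless, so the stored history doesn't have to match what was actually sent. A sketch building on the build_prompt() idea above (the model name is a placeholder, and client is the OpenAI client from the earlier sketch):

# Sketch: keep history raw; the wrapped prompt is sent but never stored.
# Reuses the illustrative build_prompt() above; client is an OpenAI client.
def ask(client, content, history, context):
    wrapped = build_prompt(content, context)  # prompt only lives in this request
    completion = client.chat.completions.create(
        model="gpt-4o",  # placeholder
        messages=history + [{"role": "user", "content": wrapped}],
    )
    answer = completion.choices[0].message.content
    new_history = history + [
        {"role": "user", "content": content},  # the raw question, no prompt
        {"role": "assistant", "content": answer},
    ]
    return answer, new_history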