stateful/runme

Support using Foyle's AI capabilities as a plugin

Closed this issue · 10 comments

Per the discussion on Discord, we'd like to experiment with integrating RunMe with Foyle's AI capabilities. Foyle aims to be a DevOps copilot. One of the key problems it's tackling is training an AI copilot to be an expert in your infrastructure. Foyle relies on a notebook-like experience to collect implicit human feedback, which is used to train the AI (blog post). Like RunMe, Foyle is built using VSCode Notebooks. RunMe is far more mature and has more features than Foyle, so it likely makes sense to reuse RunMe as Foyle's notebook rather than building a separate stack.

Towards that end, this issue is tracking the minimal implementation to get RunMe to support calling out to Foyle and surfacing the results. The current thinking is they would be two separate extensions to start with and RunMe would use gRPC to communicate with Foyle. A more detailed design is provided in Foyle Tech Note 004.

The first step is designing a shared gRPC service to allow the two extensions to communicate (#573).

  • Create minimal Runme code to generate completions using Foyle (#1356)
  • Ensure Runme logs data necessary for retraining Foyle
  • Related to jlewi/foyle#110

Here's a more detailed look at what we need to do for logging

Gaps In Current RunMe Logging

Here are the current limitations of logging in RunMe for the purposes of supporting Foyle retraining

  • Server logging isn't always enabled when using VSCode
  • When server logs are enabled, they aren't persisted to files; they are echoed to an output terminal in VSCode
    • This means they are ephemeral and aren't easily accessible to Foyle for retraining
  • Server logs don't record the actual executed command at info level
    • The request is logged at debug level here
    • However, I don't think it would be logged using the proto JSON serialization format, which would make deserializing it difficult
      • Per StackOverflow, it won't be serialized in the proto JSON format

Proposed Changes to RunMe Code Base

Server Changes

  • Server should set up a second logger to always log to a JSON file

    • We can use a tee logger as in Foyle to create a logger that sends messages to the console and to the JSON file
  • The logs should be stored in ${CONFIGDIR}/logs

  • Add a command line flag to the serve command to enable Foyle logs

  • Add a configDir flag to the serve command

  • Execute logger should be keyed by known_id and known_name if present

  • We need to log the Execute request at info level in JSON format (see StackOverflow)

    • An issue here is how to log it in proto JSON format
    • The simplest thing to do is to just add code to serialize the request to JSON and then log the result
    • The alternative is to use the proto plugin go-proto-zap-marshaler to auto-generate LogMarshalObject methods for all protos
      • That only makes sense if there's going to be widespread logging of protos in JSON format. Since we only need to update a single log message right now, I don't think it's worth introducing it as part of this change

vscode-runme Changes

I think the main change we'd need to make to RunMe's VSCode extension is to plumb through the server changes:

  • We should introduce a Foyle experiments flag to enable Foyle
  • Update start in runmeServer.ts to pass along the appropriate command line flags if the Foyle experiment is enabled

@sourishkrout Could you PTAL and let me know if this looks good or if you have any suggestions before I get started on implementing the changes?

Overall this looks good to me. One thing I'm wondering 🤔 is how do we make it clear that logging is no longer exclusively for "human consumption" (troubleshooting, debugging, etc)? As far as I understand your proposal Foyle's training will be based on the logs being stable and machine-readable, no @jlewi?

That's correct, although only a subset of the logs are arguably intended to be machine readable. I'd argue that one of the main benefits of adopting structured logging is that it produces logs that can be consumed by humans or by machines. Nonetheless, there's some subtlety here.

The first is configuring the logger. In particular, it isn't necessarily good UX if the console logger and the machine logger both use the same format: for the console you might want human-readable output, while for the machine logger you may always want JSON. Furthermore, for the machine-readable logs you may not want the user to be able to adjust the log level, because raising it could drop critical messages.

This part we can solve by configuring separate loggers for the console and the machine logs and then routing logs to both. This is what we are already doing in Foyle.

The second part is logging and processing data so that we avoid brittleness in our pipeline. We have the following log line:

logger.Debug("received initial request", zap.Any("req", req))

This line will be critical to Foyle's learning process. This means that if someone removes or edits it, they could break certain functionality, which most developers would not expect. I don't have a good solution for that. In principle, we could use a combination of linters and unit tests to catch those breakages.

👍 @jlewi. Don't see a problem in changing logging in Runme to suit the needs here.

This means if someone removes or edits the above log line they could break certain functionality; this is rather unexpected for most developers. I don't have good solutions for that. In principle, we could use a combination of linters and unit tests to catch those breakages.

This is primarily what I'd like to guard against. I think a combo of unit tests would help, and perhaps we can even rename logger.(...) to something like learningLogs.(...) so that developers think twice. In any case, unit tests are usually the best guard in this scenario to maintain a "contract".

  • An issue here is how to log it in proto JSON format

  • The simplest thing to do is to just add code to serialize the request to JSON and then log the result

  • The alternative is to use the proto plugin go-proto-zap-marshaler to auto-generate LogMarshalObject methods for all protos

    • That only makes sense if there's going to be widespread logging of protos in JSON format. Since we only need to update a single log message right now, I don't think it's worth introducing it as part of this change

I suggest going the simple route first and see how far that'll get us.

After #585, the only remaining change on the RunMe side is to add a VSCode flag to enable Foyle, which would launch the RunMe server with the flags to enable logging. That's only a blocker for training on RunMe data; using Foyle should still work without it.

I believe Foyle should be available in RunMe vscode as soon as stateful/vscode-runme#1356 lands in a release. It looks like it just missed the 3.5.5 release:
https://github.com/stateful/vscode-runme/releases/tag/3.5.5

It looks like there was a 3.5.6 release:
https://github.com/stateful/vscode-runme/releases/tag/3.5.6

But that didn't include the ailogger experiment. That should be in the next release.

RunMe 3.5.7 (https://github.com/stateful/vscode-runme/releases/tag/3.5.7) was released earlier today.

v3.5.8 was just released and includes the fix for vscode-runme#1389

Woo Hoo! Thanks @sourishkrout