
kyma_poc

Kyma (Greek for wave, https://www.howtopronounce.com/kyma) is a proof-of-concept demonstration using Generative AI inside of Konveyor.io to help with code modernization efforts.

Slides: Konveyor KymaML: GenAI Code Migration

Overview

  • Kyma is focused on leveraging data that Konveyor already collects via Migration Waves and its view of an organization's Application Portfolio.
  • Kyma will look at a given application's analysis report and will generate potential patches for each incident.
    • This is done by looking at similar incidents that were solved in other applications: we grab the code diff that fixed each incident and work with an LLM to leverage how the organization solved the problem in the past, applied to this specific application and incident.
      • For the POC this will involve looking at applications that have both Java EE and Quarkus solutions. We will simulate the experience in Konveyor where we have access to a larger application portfolio, can find similar applications that were already migrated, and can use them as examples.
  • The approach leverages prompt engineering with few-shot examples and a series of agents that work with the LLM to improve the generated patch.
    • Note: model training and fine-tuning are not required in this phase. We have plans for fine-tuning in conjunction with Open Data Hub in the future, leveraging open source foundation models from huggingface.co, but that is out of scope for the current phase.
  • We are assuming that Kyma will work with various models; our focus is on the tooling/prompt engineering, and we expect to treat the model coordinates as an interchangeable entity. This allows us to experiment with evolving models.
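The few-shot flow described above can be sketched as follows. This is a hypothetical illustration, not Kyma's actual code; the function and field names are invented for the example.

```python
# Hypothetical sketch of few-shot prompt assembly for an analysis incident.
# Names and structure are illustrative only, not Kyma's actual API.

def build_few_shot_prompt(incident_message, source_snippet, solved_examples):
    """Show the LLM how similar incidents were fixed elsewhere in the
    portfolio, then ask it to propose a patch for this incident."""
    parts = [
        "You are helping migrate a Java EE application to Quarkus.",
        f"Analysis incident: {incident_message}",
    ]
    for example in solved_examples:  # diffs harvested from already-migrated apps
        parts.append(f"Example fix from a similar application:\n{example['diff']}")
    parts.append(f"Now propose a patch for this code:\n{source_snippet}")
    return "\n\n".join(parts)

prompt = build_few_shot_prompt(
    "javax.ejb.Stateless is not supported in Quarkus",
    "@Stateless\npublic class OrderService { }",
    [{"diff": "-import javax.ejb.Stateless;\n"
              "+import jakarta.enterprise.context.ApplicationScoped;"}],
)
```

The key idea is that the examples come from the organization's own migrated applications, so the generated patch follows conventions the Migrators already use.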

What are our criteria for success of the Proof of Concept

  • We want to assess what level of quality we can get from prompt engineering alone, i.e. no fine-tuning.
    • Future phases may leverage fine-tuning, but this phase is about a simpler approach: building tooling to collect useful data in Konveyor and using prompt engineering to help Migrators with code changes.
  • This POC is intended to be a small Python project that works with a few sample applications and analysis reports to gauge the usefulness of generated patches in the domain of modernizing Java EE applications to Quarkus.
  • Success for us is being able to generate patches that save the Migrator time when modernizing an application.
    • We assume the patches will not be 100% functional all the time and will require some user intervention. Success will be measured by how much intervention is required and whether the patches help enough to save time.

Setup

  1. python -m venv env
  2. source env/bin/activate
  3. pip install -r ./requirements.txt
  4. Install podman so you can run Kantra for static code analysis

Running

Run an analysis of a sample app (example for MacOS)

  1. cd data
  2. ./fetch.sh # this will git clone some sample source code apps
  3. ./darwin_restart_podman_machine.sh # sets up the podman VM on macOS so it will mount the host filesystem into the VM
  4. ./darwin_get_latest_kantra_cli.sh # fetches 'kantra' our analyzer tool
  5. ./analyzer_coolstuff.sh # Analyzes 'coolstuff-javaee' directory and writes an analysis output to example_reports/coolstuff-javaee/output.yaml
    • Follow that example to analyze other sample applications
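The generated output.yaml groups incidents under rulesets and rules. A trimmed, hypothetical excerpt is shown below to give a feel for the structure; exact field names may differ by analyzer version.

```yaml
# Illustrative excerpt only; consult your generated output.yaml for the real schema.
- name: konveyor-analysis            # ruleset
  violations:
    javax-to-jakarta-import-00001:   # rule id (illustrative)
      description: Replace javax imports with jakarta
      category: mandatory
      incidents:
        - uri: file:///opt/input/source/src/main/java/com/example/OrderService.java
          message: Replace `javax.ejb` with `jakarta.ejb`
          lineNumber: 3
```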

Generate a Markdown version of a given analysis report

  • This will take the YAML output and convert it to Markdown for an easier view
  1. Start in the project root and execute
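A minimal sketch of such a YAML-to-Markdown conversion, assuming the report has already been parsed (e.g. with PyYAML) into Python lists and dicts. The structure and names are illustrative; Kyma's actual script may differ.

```python
# Hypothetical sketch: render a parsed analysis report (a list of rulesets,
# as loaded from output.yaml) into Markdown for easier reading.

def report_to_markdown(rulesets):
    lines = []
    for ruleset in rulesets:
        lines.append(f"# Ruleset: {ruleset['name']}")
        for rule_id, violation in ruleset.get("violations", {}).items():
            lines.append(f"## {rule_id}")
            lines.append(violation.get("description", ""))
            for inc in violation.get("incidents", []):
                lines.append(
                    f"- `{inc['uri']}` line {inc.get('lineNumber', '?')}: "
                    f"{inc['message']}"
                )
    return "\n\n".join(lines)

sample = [{
    "name": "konveyor-analysis",
    "violations": {
        "javax-to-jakarta-import-00001": {
            "description": "Replace javax imports with jakarta",
            "incidents": [{"uri": "file:///src/OrderService.java",
                           "lineNumber": 3,
                           "message": "Replace javax.ejb with jakarta.ejb"}],
        }
    },
}]
markdown = report_to_markdown(sample)
```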

Generate result from LLM interaction

  1. export OPENAI_API_KEY="mysecretkey"
  2. ./generate_coolstuff.sh
  3. View the results for the above at example_output/coolstuff-quarkus
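Because the model coordinates are treated as interchangeable (see Overview), the generation step boils down to swapping a model name in the request. A hypothetical sketch of building such a request payload; the actual HTTP call would go through the OpenAI client using the OPENAI_API_KEY exported above.

```python
# Hypothetical sketch: the model name is just a parameter, so swapping
# 'gpt-3.5-turbo-16k' for 'gpt-4-1106-preview' is a one-line change.
# This builds the payload only; the real call is made via the OpenAI client.

def build_request(model, prompt):
    return {
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
        "temperature": 0.1,  # low temperature for more deterministic patches
    }

payload = build_request("gpt-4-1106-preview",
                        "Propose a patch for this incident")
```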

Notes

  • Rough running times when processing coolstuff-store with:
    • -t "quarkus" -t "jakarta-ee" -t "jakarta-ee8+" -t "jakarta-ee9+" -t "cloud-readiness"
      • 'gpt-3.5-turbo-16k'
        • time ./generate_coolstuff.sh 5.12s user 3.46s system 1% cpu 11:02.25 total
      • 'gpt-4-1106-preview'
        • ./generate_coolstuff.sh 4.86s user 3.73s system 0% cpu 15:52.24 total