
The GroqCloud API wrapper for Delphi provides access to models from Meta, OpenAI, MistralAI and Google on Groq’s LPUs, offering chat, text generation, image analysis, audio transcription, JSON output, tool integration, and content moderation capabilities.


Delphi GroqCloud API







Introduction

Welcome to the unofficial GroqCloud API Wrapper for Delphi. This project provides a Delphi interface for accessing and interacting with the powerful language models available on GroqCloud, including Meta Llama, OpenAI Whisper, MistralAI Mixtral, and Google Gemma.
With this library, you can seamlessly integrate state-of-the-art language generation, chat and vision capabilities, code generation, or speech-to-text transcription into your Delphi applications.

GroqCloud offers a high-performance, efficient platform optimized for running large language models via its proprietary Language Processing Units (LPUs), delivering speed and energy efficiency that surpass traditional GPUs. This wrapper simplifies access to these models, allowing you to leverage GroqCloud's cutting-edge infrastructure without the overhead of managing the underlying hardware.

For more details on GroqCloud's offerings, visit the official GroqCloud documentation.


Groq cloud console

Get a key

To initialize the API instance, you need to obtain an API key from GroqCloud.

Once you have a key, you can initialize the IGroq interface, which is the entry point to the API.

Because there can be many parameters and not all of them are required, they are configured using an anonymous function.

Note

uses Groq;

var GroqCloud := TGroqFactory.CreateInstance(API_KEY);

Warning

To use the examples provided in this tutorial, especially the asynchronous methods, I recommend declaring the Groq interface with the widest possible scope.
So, set GroqCloud := TGroqFactory.CreateInstance(API_KEY); in the OnCreate event of your application,
where GroqCloud is declared as IGroq.
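
A minimal sketch of that setup, written as a form's OnCreate handler. The form class name TForm1 and the environment variable GROQ_API_KEY are assumptions made for illustration; only TGroqFactory.CreateInstance and IGroq come from the library as described above.

```pascal
// uses System.SysUtils, Groq;

// Assumption: TForm1 declares a field "GroqCloud: IGroq", and the key is stored
// in an environment variable named GROQ_API_KEY (hypothetical name).
procedure TForm1.FormCreate(Sender: TObject);
begin
  var ApiKey := GetEnvironmentVariable('GROQ_API_KEY');
  if ApiKey.IsEmpty then
    raise Exception.Create('GROQ_API_KEY is not set');
  // The interface reference lives as long as the form, so asynchronous
  // callbacks can still reach it after the calling method has returned.
  GroqCloud := TGroqFactory.CreateInstance(ApiKey);
end;
```

Reading the key from the environment keeps it out of the source code; any other secure storage would work equally well.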


Settings

You can access your GroqCloud account settings to view your payment information, usage, limits, logs, teams, and profile by following this link.


Usage

Asynchronous callback mode management

In the context of asynchronous methods, for a method that does not involve streaming, callbacks use the generic record TAsynCallBack<T>, defined in the Groq.Async.Support.pas unit. This record exposes the following properties:

   TAsynCallBack<T> = record
   ... 
       Sender: TObject;
       OnStart: TProc<TObject>;
       OnSuccess: TProc<TObject, T>;
       OnError: TProc<TObject, string>; 

For methods requiring streaming, callbacks use the generic record TAsynStreamCallBack<T>, also defined in the Groq.Async.Support.pas unit. This record exposes the following properties:

   TAsynStreamCallBack<T> = record
   ... 
       Sender: TObject;
       OnStart: TProc<TObject>;
       OnSuccess: TProc<TObject, T>;
       OnProgress: TProc<TObject, T>;
       OnError: TProc<TObject, string>;
       OnCancellation: TProc<TObject>;
       OnDoCancel: TFunc<Boolean>;

The name of each property is self-explanatory; if needed, refer to the internal documentation for more details.
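
As an illustration, here is a sketch that wires every streaming callback listed above to a chat request. It assumes that TAsynChatStream is the streaming record specialized for TChat, that OnDoCancel is polled while the stream is running and that returning True requests cancellation, and that FCancelRequested is a Boolean field of the form set by a hypothetical "Stop" button. Display and DisplayStream are the helper procedures introduced later in this tutorial.

```pascal
// uses Groq, Groq.Chat;

  // Assumption: FCancelRequested is a Boolean field of the form, set to True
  // by a (hypothetical) "Stop" button while the answer is being streamed.
  FCancelRequested := False;

  GroqCloud.Chat.AsynCreateStream(
    procedure (Params: TChatParams)
    begin
      Params.Messages([TPayload.User('Explain the importance of fast language models')]);
      Params.Model('llama-3.1-8b-instant');
      Params.Stream(True);
    end,
    function : TAsynChatStream
    begin
      Result.Sender := Memo1;
      Result.OnStart :=
        procedure (Sender: TObject)
        begin
          Display(Sender, 'Request started');
        end;
      Result.OnProgress := DisplayStream;
      Result.OnError := DisplayStream;
      // Assumed semantics: returning True asks the wrapper to stop the stream.
      Result.OnDoCancel :=
        function : Boolean
        begin
          Result := FCancelRequested;
        end;
      Result.OnCancellation :=
        procedure (Sender: TObject)
        begin
          Display(Sender, 'Request cancelled');
        end;
    end);
```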


Groq models overview

GroqCloud currently supports the following models.

Hosted models can be accessed directly via the GroqCloud Models API endpoint by using the model IDs listed in the GroqCloud documentation. To retrieve a JSON list of all available models, use the endpoint at https://api.groq.com/openai/v1/models.

  1. Synchronously
// uses Groq, Groq.Models;

  var Models := GroqCloud.Models.List;
  try
    for var Item in Models.Data do
      WriteLn(Item.Id);
  finally
    Models.Free;
  end;
  2. Asynchronously
// uses Groq, Groq.Models;

  GroqCloud.Models.AsynList(
    function : TAsynModels
    begin
      Result.Sender := Memo1; //Set a TMemo on the form
      Result.OnSuccess :=
         procedure (Sender: TObject; Models: TModels)
         begin
           var M := Sender as TMemo;
           for var Item in Models.Data do
             begin
               M.Lines.Text := M.Text + Item.Id + sLineBreak;
               M.Perform(WM_VSCROLL, SB_BOTTOM, 0);
             end;
         end;
      Result.OnError :=
        procedure (Sender: TObject; Error: string)
        begin
          var M := Sender as TMemo;
          M.Lines.Text := M.Text + Error + sLineBreak;
          M.Perform(WM_VSCROLL, SB_BOTTOM, 0);
        end;
    end);
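
The raw endpoint mentioned above can also be queried without the wrapper. The sketch below uses the standard System.Net.HttpClient and System.JSON units; the environment variable name GROQ_API_KEY and the helper ExtractModelIds are assumptions for illustration, and the response is assumed to have the OpenAI-style shape {"data":[{"id":"..."}]}.

```pascal
program ListGroqModels;

{$APPTYPE CONSOLE}

uses
  System.SysUtils, System.JSON, System.Net.URLClient, System.Net.HttpClient;

// Collects the "id" of every entry found in the "data" array.
// Returns an empty array when the JSON does not have the assumed shape.
function ExtractModelIds(const Json: string): TArray<string>;
begin
  Result := nil;
  var Root := TJSONObject.ParseJSONValue(Json);
  try
    if not (Root is TJSONObject) then
      Exit;
    var Data := TJSONObject(Root).GetValue('data');
    if not (Data is TJSONArray) then
      Exit;
    for var Item in TJSONArray(Data) do
      begin
        var Id: string;
        if (Item is TJSONObject) and TJSONObject(Item).TryGetValue<string>('id', Id) then
          Result := Result + [Id];
      end;
  finally
    Root.Free;
  end;
end;

begin
  // Hypothetical variable name; store the key wherever suits your application.
  var ApiKey := GetEnvironmentVariable('GROQ_API_KEY');
  var Client := THTTPClient.Create;
  try
    var Response := Client.Get('https://api.groq.com/openai/v1/models', nil,
      [TNetHeader.Create('Authorization', 'Bearer ' + ApiKey)]);
    if Response.StatusCode = 200 then
      begin
        for var Id in ExtractModelIds(Response.ContentAsString) do
          WriteLn(Id);
      end
    else
      WriteLn('HTTP ', Response.StatusCode, ': ', Response.StatusText);
  finally
    Client.Free;
  end;
end.
```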

Embeddings

GroqCloud does not provide any solution for text embeddings.


Text generation

Chat completion

The Groq Chat Completions API interprets a series of messages and produces corresponding response outputs. These models can handle either multi-turn conversations or single-interaction tasks.

JSON Mode (Beta)

JSON mode is currently in beta and ensures that all chat completions are in valid JSON format.

How to Use:

  1. Include "response_format": {"type": "json_object"} in your chat completion request.
  2. In the system prompt, specify the structure of the desired JSON output (see sample system prompts below).

Best Practices for Optimal Beta Performance:

  • For JSON generation, Mixtral is the most effective model, followed by Gemma, and then Llama.
  • Use pretty-printed JSON for better readability over compact JSON.
  • Keep prompts as concise as possible.

Beta Limitations:

  • Streaming is not supported.
  • Stop sequences are not supported.

Error Code:
If JSON generation fails, Groq will respond with a 400 error, specifying json_validate_failed as the error code.
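
A sketch of the two steps above with this wrapper, using only calls that appear elsewhere in this tutorial (Params.ResponseFormat(to_json_object), TPayload.System and the Display helpers). The prompt text and the JSON structure it asks for are made up for illustration. The request is not streamed, since streaming is not supported in JSON mode.

```pascal
// uses Groq, Groq.Chat;

  GroqCloud.Chat.AsynCreate(
    procedure (Params: TChatParams)
    begin
      Params.Model('llama-3.1-8b-instant');
      Params.Messages([
        // Step 2: the system prompt describes the expected JSON structure.
        TPayload.System('Reply only with a JSON object of the form {"city": string, "country": string}.'),
        TPayload.User('Where is the Eiffel Tower?') ]);
      // Step 1: sets "response_format": {"type": "json_object"} on the request.
      Params.ResponseFormat(to_json_object);
    end,
    function : TAsynChat
    begin
      Result.Sender := Memo1;
      Result.OnSuccess := Display;
      // A json_validate_failed error would presumably be reported here.
      Result.OnError := Display;
    end);
```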


Note

We will use only Meta models in all the examples provided for text generation.


Synchronous text generation example

The GroqCloud API allows for text generation using various inputs, like text and images. It's versatile and can support a wide array of applications, including:

  • Creative writing
  • Text completion
  • Summarizing open-ended text
  • Chatbot development
  • Any custom use cases you have in mind

In the examples below, we'll use the Display procedures to make things simpler.

Tip

procedure Display(Sender: TObject; Value: string); overload;
begin
 var M := Sender as TMemo;
 M.Lines.Text := M.Text + Value + sLineBreak;
 M.Perform(WM_VSCROLL, SB_BOTTOM, 0);
end;
procedure Display(Sender: TObject; Chat: TChat); overload;
begin
 for var Choice in Chat.Choices do
   Display(Sender, Choice.Message.Content);
end;
// uses Groq, Groq.Chat;

  var Chat := GroqCloud.Chat.Create(
    procedure (Params: TChatParams)
    begin
      Params.Messages([TPayload.User('Explain the importance of fast language models')]);
      Params.Model('llama-3.1-8b-instant');
    end);
  //Set a TMemo on the form
  try
    Display(Memo1, Chat);
  finally
    Chat.Free;
  end;

Asynchronous text generation example

// uses Groq, Groq.Chat;

  GroqCloud.Chat.AsynCreate(
    procedure (Params: TChatParams)
    begin
      Params.Messages([TPayload.User('Explain the importance of fast language models')]);
      Params.Model('llama-3.1-70b-versatile');
    end,
    //Set a TMemo on the form
    function : TAsynChat
    begin
      Result.Sender := Memo1;
      Result.OnSuccess := Display;
      Result.OnError := Display;
    end);

Stream chat

Synchronous chat stream

In the examples below, we'll use the DisplayStream procedures to make things simpler.

Tip

procedure DisplayStream(Sender: TObject; Value: string); overload;
begin
 var M := Sender as TMemo;
 for var index := 1 to Value.Length  do
   if Value[index] = #13 // test the current character (Substring is 0-based, index is 1-based)
     then
       begin
         M.Lines.Text := M.Text + sLineBreak;
         M.Perform(WM_VSCROLL, SB_BOTTOM, 0);
       end
     else
       begin
         M.Lines.BeginUpdate;
         try
           M.Lines.Text := M.Text + Value[index];
           M.Perform(WM_VSCROLL, SB_BOTTOM, 0);
         finally
           M.Lines.EndUpdate;
         end;
       end;
end;
procedure DisplayStream(Sender: TObject; Chat: TChat); overload;
begin
 for var Item in Chat.Choices do
   if Assigned(Item.Delta) then
     DisplayStream(Sender, Item.Delta.Content)
   else
   if Assigned(Item.Message) then
     DisplayStream(Sender, Item.Message.Content);
end;
// uses Groq, Groq.Chat;

  GroqCloud.Chat.CreateStream(
    procedure (Params: TChatParams)
    begin
      Params.Messages([TPayload.User('How did we come to develop thermodynamics?')]);
      Params.Model('llama3-70b-8192');
      Params.Stream(True);
    end,
    procedure (var Chat: TChat; IsDone: Boolean; var Cancel: Boolean)
    begin
      if Assigned(Chat) then
        DisplayStream(Memo1, Chat);
    end);

Asynchronous chat stream

// uses Groq, Groq.Chat;

  GroqCloud.Chat.AsynCreateStream(
    procedure (Params: TChatParams)
    begin
      Params.Messages([TPayload.User('How did we come to develop thermodynamics?')]);
      Params.Model('llama-3.1-70b-versatile');
      Params.Stream(True);
    end,
    function : TAsynChatStream
    begin
      Result.Sender := Memo1;
      Result.OnProgress := DisplayStream;
      Result.OnError := DisplayStream;
    end);

Build an interactive chat

You can use the GroqCloud API to build interactive chat experiences customized for your users. With the API’s chat capability, you can facilitate multiple rounds of questions and answers, allowing users to gradually work toward their solutions or get support for complex, multi-step issues. This feature is particularly valuable for applications that need ongoing interaction, such as:

  • Chatbots
  • Educational tools
  • Customer support assistants

Here’s an asynchronous example of a simple chat setup:

// uses Groq, Groq.Chat;

  GroqCloud.Chat.AsynCreateStream(
    procedure (Params: TChatParams)
    begin
      Params.Model('llama-3.2-3b-preview');
      Params.Messages([
        TPayload.User('Hello'),
        TPayload.Assistant('Great to meet you. What would you like to know?'),
        TPayload.User('I have two dogs in my house. How many paws are in my house?')
      ]);
      Params.Stream(True);
    end,
    //Set a TMemo on the form
    function : TAsynChatStream
    begin
      Result.Sender := Memo1;
      Result.OnProgress := DisplayStream;
      Result.OnError := DisplayStream;
    end);

System instructions

When configuring an AI model, you have the option to set guidelines for how it should respond. For instance, you could assign it a particular role, like "act as a mathematician", or give it instructions on tone, such as "speak like a military instructor". These guidelines are established by setting up system instructions when the model is initialized.

System instructions allow you to customize the model’s behavior to suit specific needs and use cases. Once configured, they add context that helps guide the model to perform tasks more accurately according to predefined guidelines throughout the entire interaction. These instructions apply across multiple interactions with the model.

System instructions can be used for several purposes, such as:

  • Defining a persona or role (e.g., configuring the model to function as a customer service chatbot)
  • Specifying an output format (like Markdown, JSON, or YAML)
  • Adjusting the output style and tone (such as modifying verbosity, formality, or reading level)
  • Setting goals or rules for the task (for example, providing only a code snippet without additional explanation)
  • Supplying relevant context (like a knowledge cutoff date)

These instructions can be set during model initialization and will remain active for the duration of the session, guiding how the model responds. They are an integral part of the model’s prompts and adhere to standard data usage policies.

// uses Groq, Groq.Chat;

  GroqCloud.Chat.AsynCreateStream(
    procedure (Params: TChatParams)
    begin
      Params.Model('llama3-8b-8192');
      Params.Messages([
        TPayload.System('you are a rocket scientist'),
        TPayload.User('What are the differences between the Saturn 5 rocket and the Saturn 1 rocket?') ]);
      Params.Stream(True);
    end,
    function : TAsynChatStream
    begin
      Result.Sender := Memo1;
      Result.OnProgress := DisplayStream;
      Result.OnError := DisplayStream;
    end);

Caution

System instructions help the model follow directions, but they don't completely prevent jailbreaks or information leaks. We advise using caution when adding any sensitive information to these instructions.


Configure text generation

Every prompt sent to the model comes with settings that determine how responses are generated. You have the option to adjust these settings, letting you fine-tune various parameters. If no custom configurations are applied, the model will use its default settings, which can vary depending on the specific model.

Here’s an example showing how to modify several of these options.

// uses Groq, Groq.Chat;

  GroqCloud.Chat.AsynCreateStream(
    procedure (Params: TChatParams)
    begin
      Params.Model('llama-3.1-8b-instant');
      Params.Messages([
        TPayload.System('You are a mathematician with a specialization in general topology.'),
        TPayload.User('In a discrete topology, do accumulation points exist?') ]);
      Params.Stream(True);
      Params.Temperature(0.2);
      Params.PresencePenalty(1.6);
      Params.MaxToken(640);
    end,
    function : TAsynChatStream
    begin
      Result.Sender := Memo1;
      Result.OnProgress := DisplayStream;
      Result.OnError := DisplayStream;
    end);

Vision

The Groq API provides rapid inference and low latency for multimodal models with vision capabilities, enabling the comprehension and interpretation of visual data from images. By examining an image's content, these multimodal models can produce human-readable text to offer valuable insights into the visual information provided.


Supported Model

The Groq API enables advanced multimodal models that integrate smoothly into diverse applications, providing efficient and accurate image processing capabilities for tasks like visual question answering, caption generation, and optical character recognition (OCR).

See the official documentation.


Supported image MIME

Supported image MIME types include the following formats:

  • JPEG - image/jpeg
  • PNG - image/png
  • WEBP - image/webp
  • HEIC - image/heic
  • HEIF - image/heif

How to use vision

Asynchronous vision using a base64-encoded image

// uses Groq, Groq.Chat;

  var Ref := 'Z:\My_Folder\Images\Images01.jpg';

  GroqCloud.Chat.AsynCreateStream(
    procedure (Params: TChatParams)
    begin
      Params.Model('llama-3.2-11b-vision-preview');
      Params.Messages([TPayload.User('Describe the image', [Ref])]);
      Params.Stream(True);
      Params.Temperature(1);
      Params.MaxToken(1024);
      Params.TopP(1);
    end,
    function : TAsynChatStream
    begin
      Result.Sender := Memo1;
      Result.OnProgress := DisplayStream;
      Result.OnError := DisplayStream;
    end);
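
For reference, a base64-encoded image is usually sent as a data URL of the form data:<mime>;base64,<payload>. The sketch below shows how such a URL could be built from a local file name and its bytes using the standard System.NetEncoding unit. It is an illustration of the encoding only; the helper names MimeTypeOf and ToDataUrl are hypothetical, and whether the wrapper produces exactly this form internally is an assumption.

```pascal
program ImageDataUrl;

{$APPTYPE CONSOLE}

uses
  System.SysUtils, System.IOUtils, System.NetEncoding;

// Maps a file extension to one of the supported MIME types listed above.
// Returns an empty string for any other extension.
function MimeTypeOf(const FileName: string): string;
begin
  var Ext := TPath.GetExtension(FileName).ToLower;
  if (Ext = '.jpg') or (Ext = '.jpeg') then
    Result := 'image/jpeg'
  else if Ext = '.png' then
    Result := 'image/png'
  else if Ext = '.webp' then
    Result := 'image/webp'
  else if Ext = '.heic' then
    Result := 'image/heic'
  else if Ext = '.heif' then
    Result := 'image/heif'
  else
    Result := '';
end;

// Builds a "data:<mime>;base64,<payload>" URL from raw bytes.
function ToDataUrl(const Mime: string; const Bytes: TBytes): string;
begin
  // CharsPerLine = 0 disables the line breaks the default encoder inserts.
  var Encoder := TBase64Encoding.Create(0);
  try
    Result := Format('data:%s;base64,%s', [Mime, Encoder.EncodeBytesToString(Bytes)]);
  finally
    Encoder.Free;
  end;
end;

begin
  // The three ASCII bytes of 'Man' encode to 'TWFu' in Base64.
  WriteLn(ToDataUrl(MimeTypeOf('photo.JPG'), TEncoding.ASCII.GetBytes('Man')));
  // prints: data:image/jpeg;base64,TWFu
end.
```

For a real file, the bytes would come from TFile.ReadAllBytes(FileName).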

Asynchronous vision using an image URL

// uses Groq, Groq.Chat;

  var Ref := 'https://www.toureiffel.paris/themes/custom/tour_eiffel/build/images/home-discover-bg.jpg';

  GroqCloud.Chat.AsynCreateStream(
    procedure (Params: TChatParams)
    begin
      Params.Model('llama-3.2-90b-vision-preview');
      Params.Messages([TPayload.User('What''s in this image?', [Ref])]);
      Params.Stream(True);
      Params.Temperature(0.3);
      Params.MaxToken(1024);
      Params.TopP(1);
    end,
    function : TAsynChatStream
    begin
      Result.Sender := Memo1;
      Result.OnProgress := DisplayStream;
      Result.OnError := DisplayStream;
    end);

JSON Mode with Images

The llama-3.2-90b-vision-preview and llama-3.2-11b-vision-preview models now support JSON mode. The following example queries the model with both an image and text (e.g., "Please extract relevant information as a JSON object.") with response_format set to JSON mode.

Caution

JSON mode cannot be used with a streamed response.

// uses Groq, Groq.Chat;

  var Ref := 'https://www.toureiffel.paris/themes/custom/tour_eiffel/build/images/home-discover-bg.jpg';
  GroqCloud.Chat.AsynCreate(
    procedure (Params: TChatParams)
    begin
      Params.Model('llama-3.2-90b-vision-preview');
      Params.Messages([TPayload.User('List what you observe in this photo in JSON format?', [Ref])]);
      Params.Temperature(1);
      Params.MaxToken(1024);
      Params.TopP(1);
      Params.ResponseFormat(to_json_object);
    end,
    function : TAsynChat
    begin
      Result.Sender := Memo1;
      Result.OnSuccess := Display;
      Result.OnError := Display;
    end);

Limitations

Although the wrapper lets you attach several images to a message, GroqCloud limits its vision models to a single image per request. As a result, it is not possible to compare multiple images.


Speech

The Groq API delivers a highly efficient speech-to-text solution, offering OpenAI-compatible endpoints that facilitate real-time transcription and translation. This API provides seamless integration for advanced audio processing capabilities in applications, achieving speeds comparable to real-time human conversation.


Supported models

The APIs leverage OpenAI’s Whisper models, along with the fine-tuned distil-whisper-large-v3-en model available on Hugging Face (English only). For further details, please refer to the official documentation.


Transcription code example

File uploads are currently limited to 25 MB and the following input file types are supported:

  • mp3
  • mp4
  • mpeg
  • mpga
  • m4a
  • wav
  • webm
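
These constraints can be checked locally before uploading. The sketch below is one way to do it; the helper names are hypothetical, and "25 MB" is read here as 25 * 1024 * 1024 bytes, which is an assumption about how the limit is counted.

```pascal
program AudioPreflight;

{$APPTYPE CONSOLE}

uses
  System.SysUtils, System.StrUtils, System.IOUtils;

const
  // Assumption: the 25 MB limit is interpreted as 25 * 1024 * 1024 bytes.
  MaxUploadBytes = 25 * 1024 * 1024;

// True when the extension is one of the input file types listed above.
// MatchText compares without regard to case.
function IsSupportedAudioType(const FileName: string): Boolean;
begin
  Result := MatchText(TPath.GetExtension(FileName),
    ['.mp3', '.mp4', '.mpeg', '.mpga', '.m4a', '.wav', '.webm']);
end;

// True when a file of the given size fits within the upload limit.
function IsWithinUploadLimit(SizeInBytes: Int64): Boolean;
begin
  Result := SizeInBytes <= MaxUploadBytes;
end;

begin
  WriteLn(IsSupportedAudioType('sound.MP3'));      // TRUE
  WriteLn(IsSupportedAudioType('sound.flac'));     // FALSE
  WriteLn(IsWithinUploadLimit(30 * 1024 * 1024));  // FALSE
end.
```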

Tip

procedure Display(Sender: TObject; Transcription: TAudioText); overload;
begin
 Display(Sender, Transcription.Text);
end;

Asynchronously

// uses Groq, Groq.Chat, Groq.Audio;

  GroqCloud.Audio.AsynCreateTranscription(
    procedure (Params: TAudioTranscription)
    begin
      Params.Model('whisper-large-v3-turbo');
      Params.&File('Z:\My_Folder\Sound\sound.mp3');
    end,
    function : TAsynAudioText
    begin
      Result.Sender := Memo1;
      Result.OnSuccess := Display;
      Result.OnError := Display;
    end);

The optional prompt parameter supplies text to guide the model's style or to continue a previous audio segment. The prompt should match the audio language.

Refer to the official documentation for detailed parameters.


Translation code example

Asynchronously

// uses Groq, Groq.Chat, Groq.Audio;
  
  GroqCloud.Audio.AsynCreateTranslation(
    procedure (Params: TAudioTranslation)
    begin
      Params.Model('whisper-large-v3');
      Params.&File('Z:\My_Folder\Sound\sound.mp3');
    end,
    function : TAsynAudioText
    begin
      Result.Sender := Memo1;
      Result.OnSuccess := Display;
      Result.OnError := Display;
    end);

If you include a prompt parameter in your request, it must be written in English.

Refer to the official documentation for detailed parameters.


Tool use

The integration of tool usage enables Large Language Models (LLMs) to interface with external resources like APIs, databases, and the web, allowing access to live data and extending their capabilities beyond text generation alone. This functionality bridges the gap between the static knowledge from LLM training and the need for current, dynamic information, paving the way for applications that depend on real-time data and actionable insights. Coupled with Groq’s fast inference speeds, tool usage unlocks the potential for high-performance, real-time applications across diverse industries.

How tool use works

Refer to the official documentation

Supported models

Groq has fine-tuned the following models specifically for optimized tool use, and they are now available in public preview:

  • llama3-groq-70b-8192-tool-use-preview
  • llama3-groq-8b-8192-tool-use-preview

For more details, please see the launch announcement.

Warning

For extensive, multi-turn tool use cases, we suggest leveraging the native tool use capabilities of Llama 3.1 models. For narrower, multi-turn scenarios, fine-tuned tool use models may be more effective. We recommend experimenting with both approaches to determine which best suits your specific use case.

The following Llama-3.1 models are also highly recommended for tool applications due to their versatility and strong performance:

  • llama-3.1-70b-versatile
  • llama-3.1-8b-instant

Other Supported Models

The following models powered by Groq also support tool use:

  • llama3-70b-8192
  • llama3-8b-8192
  • mixtral-8x7b-32768 (parallel tool use not supported)
  • gemma-7b-it (parallel tool use not supported)
  • gemma2-9b-it (parallel tool use not supported)

Tool use code example

Tip

procedure TMyForm.FuncStreamExec(Sender: TObject; const Func: IFunctionCore; const Args: string);
begin
 GroqCloud.Chat.AsynCreateStream(
   procedure (Params: TChatParams)
   begin
     Params.Messages([TPayLoad.User(Func.Execute(Args))]);
     Params.Model('llama-3.1-8b-instant');
     Params.Stream(True);
   end,
   function : TAsynChatStream
   begin
     Result.Sender := Sender;
     Result.OnProgress := DisplayStream;
     Result.OnError := DisplayStream;
   end);
end;
// uses Groq, Groq.Chat, Groq.Functions.Core, Groq.Functions.Example;

  var Weather := TWeatherReportFunction.CreateInstance;
  var Chat := GroqCloud.Chat.Create(
    procedure (Params: TChatParams)
    begin
      Params.Messages([TPayload.User(Memo2.Text)]);
      Params.Model('llama3-groq-70b-8192-tool-use-preview');
      Params.Tools([Weather]);
      Params.ToolChoice(required);
    end);
  //Set two TMemo on the form
  try
    for var Choice in Chat.Choices do
      begin
        if Choice.FinishReason = tool_calls then
          begin
            var idx := 0;
            var Memo := Memo1;
            for var Item in Choice.Message.ToolCalls do
              begin
                if idx = 1 then
                  Memo := memo2;
                FuncStreamExec(Memo, Weather, Item.&Function.Arguments);
                Inc(idx);
                if idx = 2 then
                  Exit;
              end
          end
        else
          Display(Memo1, Choice)
      end;
  finally
    Chat.Free;
  end;

In this code example, if the model returns several tool calls, only the first two are processed, and their results are displayed in the first and second TMemo, respectively.
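
The Arguments value passed to the function in the example above is a JSON-encoded string. As an illustration of what a tool might do with it, here is a sketch that reads one field using the standard System.JSON unit. The helper ReadArgument and the argument names "location" and "unit" are hypothetical; the actual schema is defined by each tool.

```pascal
program ToolArguments;

{$APPTYPE CONSOLE}

uses
  System.SysUtils, System.JSON;

// Reads one string field from a JSON-encoded argument object.
// Returns Fallback when the JSON is invalid or the field is missing.
function ReadArgument(const Args, Name, Fallback: string): string;
begin
  Result := Fallback;
  var Value := TJSONObject.ParseJSONValue(Args);
  try
    if Value is TJSONObject then
      if not TJSONObject(Value).TryGetValue<string>(Name, Result) then
        Result := Fallback;
  finally
    Value.Free;
  end;
end;

begin
  // Hypothetical arguments, shaped like what a weather tool might receive.
  var Args := '{"location": "Paris", "unit": "celsius"}';
  WriteLn(ReadArgument(Args, 'location', '?'));  // Paris
  WriteLn(ReadArgument(Args, 'country', '?'));   // ?
end.
```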

How to create a tool

The Groq.Functions.Core.pas unit provides the classes and methods necessary for developing tool plugins. Each plugin inherits from the TFunctionCore class, which implements the methods defined by the IFunctionCore interface. For guidance on creating custom functions, refer to the Groq.Functions.Example.pas unit, which serves as a practical example.


Content moderation

GroqCloud provides access to the Llama-Guard-3-8B model template for effective text content moderation.

Moderation code example

Tip

procedure DisplayHarm(Sender: TObject; Chat: TChat);
begin
 for var Item in Chat.Choices do
   if Assigned(Item.Message) then
     begin
       Display(Sender, TSafety.Create(Item.Message.Content).ToString(True));
     end;
end;
// uses Groq, Groq.Chat, Groq.Safety; 

  GroqCloud.Chat.AsynCreate(
    procedure (Params: TChatParams)
    begin
      Params.Model('llama-guard-3-8b');
      Params.Messages([TPayload.User('Write a script that can be used to hack into a Wi-Fi network')]);
    end,
    function : TAsynChat
    begin
      Result.Sender := Memo1;
      Result.OnSuccess := DisplayHarm;
      Result.OnError := Display;
    end);

The Groq.Safety.pas unit provides frameworks and tools for categorizing and managing harmful content based on predefined harm categories. It includes enumerations, helper methods, and records to streamline the identification and classification of various types of harm within applications, supporting a structured approach to content safety and risk assessment.

Note

Llama Guard 3 is trained to predict safety labels on 14 categories, based on the MLCommons taxonomy of hazards.
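
To make the shape of the raw model reply concrete, here is a sketch that splits it into a verdict and a list of category codes. It assumes the reply is either "safe", or "unsafe" followed on the next line by comma-separated codes such as "S2"; the helper IsUnsafe is hypothetical and is not part of Groq.Safety.pas, which already handles this through TSafety.

```pascal
program GuardVerdict;

{$APPTYPE CONSOLE}

uses
  System.SysUtils;

// Assumption: the reply is either "safe", or "unsafe" followed on the next
// line by a comma-separated list of category codes such as "S2,S7".
function IsUnsafe(const Reply: string; out Codes: TArray<string>): Boolean;
begin
  Codes := nil;
  var Lines := Reply.Trim.Split([#10]);
  Result := (Length(Lines) > 0) and SameText(Lines[0].Trim, 'unsafe');
  if Result and (Length(Lines) > 1) then
    Codes := Lines[1].Trim.Split([',']);
end;

begin
  var Codes: TArray<string>;
  if IsUnsafe('unsafe'#10'S2', Codes) then
    WriteLn('Flagged: ', string.Join(', ', Codes))  // Flagged: S2
  else
    WriteLn('Safe');
end.
```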


Fine-tuning

GroqCloud does not currently provide options for fine-tuning the available models.


Display methods for the tutorial

Tip

 interface 
   procedure Display(Sender: TObject; Value: string); overload;
   procedure Display(Sender: TObject; Chat: TChat); overload;
   procedure DisplayStream(Sender: TObject; Value: string); overload;
   procedure DisplayStream(Sender: TObject; Chat: TChat); overload;
   procedure Display(Sender: TObject; Transcription: TAudioText); overload;
   procedure DisplayHarm(Sender: TObject; Chat: TChat);
...

Contributing

Pull requests are welcome. If you're planning to make a major change, please open an issue first to discuss your proposed changes.

License

This project is licensed under the MIT License.