This progressive tutorial is for building your own AI chat application informed with your enterprise data. In Chapter 1, we start with building a simple ChatGPT-like application using Semantic Kernel (SK). Chapter 2 imports files into a “Memories Store” for reference by the SK orchestrator when chatting. Having the data from these files allows SK to build better prompts so the AI can offer better answers to questions – this is a key part of the Retrieval Augmented Generation (RAG) pattern. Chapter 3 extends the context of the chat application by using Azure Cognitive Search for data indexing and retrieval.
In all, this tutorial creates a minimal implementation for using Semantic Kernel as a foundation for enabling enterprise data ingestion, long-term memory, plug-ins, and more.
Special thanks to Adam Hurwitz for his SemanticQuestion10K sample, which was used in Chapter 2.
- Chapter 1: ChatGPT
- Chapter 2: Memories of Enterprise Data
- Chapter 3: Azure Cognitive Search and Retrieval Augmented Generation
- Appendix
In this chapter, we will create a minimal chat application that uses Semantic Kernel as a foundation for enterprise data ingestion, long-term memory, plug-ins, and more. We will write a C# Azure Function from scratch that wraps all AI calls using SK, and we will use a prebuilt C# console app as the UI for the chat app.
Before you get started, make sure you have the following requirements in place:
- Visual Studio Code with extensions:
- .NET 7.0 SDK for building and deploying .NET 7 projects.
- Azure Functions Core Tools 4.x for managing Azure Functions.
- OpenAI API key for using the OpenAI API (sign up with OpenAI if you do not have one).
Then, open a terminal and clone this repo with the following command:
git clone https://github.com/Azure-Samples/semantic-kernel-rag-chat
- Open a new Visual Studio Code window and click on the Azure extension (or press SHIFT+ALT+A).
- Mouse over WORKSPACE (in the lower left pane) and select Create Function (i.e., +⚡) to create a new local Azure function project.
- Select Browse and create a folder called myfunc inside the cloned repo's src directory to house your Azure Function code (e.g., semantic-kernel-rag-chat/src/myfunc). Then use the selections below when creating the project:

  | Selection | Value |
  | --- | --- |
  | Language | C# |
  | Runtime | .NET 7 Isolated |
  | Template | Http trigger |
  | Function name | MyChatFunction |
  | Namespace | My.MyChatFunction |
  | Access rights | Function |

  Now close and reopen Visual Studio Code, this time opening the semantic-kernel-rag-chat folder so you can view and interact with the entire repository.
- Open a terminal window, change to the directory with your Azure Function project file (e.g., semantic-kernel-rag-chat/src/myfunc), and run the dotnet command below to add the Semantic Kernel NuGet package to your project.

      dotnet add package Microsoft.SemanticKernel --prerelease -v 0.14.547.1-preview

  In addition, use the commands below to configure .NET User Secrets and then securely store your OpenAI API key.

      dotnet add package Microsoft.Extensions.Configuration.UserSecrets
      dotnet user-secrets init --id semantic-kernel-rag-chat
      dotnet user-secrets set "OPENAI_APIKEY" "<your OpenAI API key>"

  Make sure to specify semantic-kernel-rag-chat as the --id parameter. This will enable you to access your secrets from any of the projects in this repository.
- Back in your Azure Function project in Visual Studio Code, open the Program.cs file and replace everything in the file with the content below. This updates the HostBuilder to read configuration variables from user secrets and sets up a reference to the SK runtime.

  We will walk through the code step-by-step, but you can find the complete code files in step 7 of this section.

      using Microsoft.Extensions.Configuration;
      using Microsoft.Extensions.DependencyInjection;
      using Microsoft.Extensions.Hosting;
      using Microsoft.Extensions.Logging;
      using Microsoft.SemanticKernel;
      using Microsoft.SemanticKernel.AI.ChatCompletion;

      var hostBuilder = new HostBuilder()
          .ConfigureFunctionsWorkerDefaults();

      hostBuilder.ConfigureAppConfiguration((context, config) =>
      {
          config.AddUserSecrets<Program>();
      });

      hostBuilder.Build().Run();
- Add the Semantic Kernel by adding a ConfigureServices call below the existing ConfigureAppConfiguration and populating it with an instance of the kernel.

  The kernel below is configured to use an OpenAI ChatGPT model (e.g., gpt-3.5-turbo) for chat completions.

      hostBuilder.ConfigureServices(services =>
      {
          services.AddSingleton<IKernel>(sp =>
          {
              IConfiguration configuration = sp.GetRequiredService<IConfiguration>();
              string openAiApiKey = configuration["OPENAI_APIKEY"];

              IKernel kernel = new KernelBuilder()
                  .WithLogger(sp.GetRequiredService<ILogger<IKernel>>())
                  .Configure(config => config.AddOpenAIChatCompletionService(
                      modelId: "gpt-3.5-turbo",
                      apiKey: openAiApiKey))
                  .Build();

              return kernel;
          });
      });
- Enable a chat service and in-memory storage for the chat history by adding two more singletons inside the ConfigureServices call.

      services.AddSingleton<IChatCompletion>(sp =>
          sp.GetRequiredService<IKernel>().GetService<IChatCompletion>());

      const string instructions = "You are a helpful friendly assistant.";
      services.AddSingleton<ChatHistory>(sp =>
          sp.GetRequiredService<IChatCompletion>().CreateNewChat(instructions));
- Open your function code file (e.g., MyChatFunction.cs) and add the chat completion using statement to the top.

      using Microsoft.SemanticKernel.AI.ChatCompletion;

  Replace the private members and constructor to include the chat history and chat completion services. The chat history gives the AI a record of the conversation (the AI itself is stateless), and the chat completion service makes the calls to the AI.

      private readonly ILogger _logger;
      private readonly IChatCompletion _chat;
      private readonly ChatHistory _chatHistory;

      public MyChatFunction(ILoggerFactory loggerFactory, ChatHistory chatHistory, IChatCompletion chat)
      {
          _logger = loggerFactory.CreateLogger<MyChatFunction>();
          _chat = chat;
          _chatHistory = chatHistory;
      }
- Update the Run method to read the user's chat message, add it to the chat history, use our chat service to call ChatGPT and generate a reply message, add the AI's reply to our chat history, and finally, send the reply back to the caller.

      [Function("MyChatFunction")]
      public async Task<HttpResponseData> Run(
          [HttpTrigger(AuthorizationLevel.Function, "post")] HttpRequestData req)
      {
          _chatHistory!.AddMessage(ChatHistory.AuthorRoles.User, await req.ReadAsStringAsync() ?? string.Empty);

          string reply = await _chat.GenerateMessageAsync(_chatHistory, new ChatRequestSettings());

          _chatHistory.AddMessage(ChatHistory.AuthorRoles.Assistant, reply);

          HttpResponseData response = req.CreateResponse(HttpStatusCode.OK);
          response.WriteString(reply);
          return response;
      }
- The complete code files (with additional comments).

  Program.cs

      using Microsoft.Extensions.Configuration;
      using Microsoft.Extensions.DependencyInjection;
      using Microsoft.Extensions.Hosting;
      using Microsoft.Extensions.Logging;
      using Microsoft.SemanticKernel;
      using Microsoft.SemanticKernel.AI.ChatCompletion;

      var hostBuilder = new HostBuilder()
          .ConfigureFunctionsWorkerDefaults();

      hostBuilder.ConfigureAppConfiguration((context, config) =>
      {
          config.AddUserSecrets<Program>();
      });

      hostBuilder.ConfigureServices(services =>
      {
          services.AddSingleton<IKernel>(sp =>
          {
              // Retrieve the OpenAI API key from the configuration.
              IConfiguration configuration = sp.GetRequiredService<IConfiguration>();
              string openAiApiKey = configuration["OPENAI_APIKEY"];

              // Construct a semantic kernel and connect the OpenAI chat completion APIs.
              IKernel kernel = new KernelBuilder()
                  .WithLogger(sp.GetRequiredService<ILogger<IKernel>>())
                  .Configure(config => config.AddOpenAIChatCompletionService(
                      modelId: "gpt-3.5-turbo",
                      apiKey: openAiApiKey))
                  .Build();

              return kernel;
          });

          // Provide a chat completion service client to our function.
          services.AddSingleton<IChatCompletion>(sp =>
              sp.GetRequiredService<IKernel>().GetService<IChatCompletion>());

          // Provide a persistent in-memory chat history store with the
          // initial ChatGPT system message.
          const string instructions = "You are a helpful friendly assistant.";
          services.AddSingleton<ChatHistory>(sp =>
              sp.GetRequiredService<IChatCompletion>().CreateNewChat(instructions));
      });

      hostBuilder.Build().Run();

  MyChatFunction.cs

      using System.Net;
      using Microsoft.Azure.Functions.Worker;
      using Microsoft.Azure.Functions.Worker.Http;
      using Microsoft.Extensions.Logging;
      using Microsoft.SemanticKernel.AI.ChatCompletion;

      namespace My.MyChatFunction
      {
          public class MyChatFunction
          {
              private readonly ILogger _logger;
              private readonly IChatCompletion _chat;
              private readonly ChatHistory _chatHistory;

              public MyChatFunction(ILoggerFactory loggerFactory, ChatHistory chatHistory, IChatCompletion chat)
              {
                  _logger = loggerFactory.CreateLogger<MyChatFunction>();
                  _chat = chat;
                  _chatHistory = chatHistory;
              }

              [Function("MyChatFunction")]
              public async Task<HttpResponseData> Run(
                  [HttpTrigger(AuthorizationLevel.Function, "post")] HttpRequestData req)
              {
                  // Add the user's chat message to the history.
                  _chatHistory!.AddMessage(ChatHistory.AuthorRoles.User, await req.ReadAsStringAsync() ?? string.Empty);

                  // Send the chat history to the AI and receive a reply.
                  string reply = await _chat.GenerateMessageAsync(_chatHistory, new ChatRequestSettings());

                  // Add the AI's reply to the chat history for next time.
                  _chatHistory.AddMessage(ChatHistory.AuthorRoles.Assistant, reply);

                  // Send the AI's response back to the caller.
                  HttpResponseData response = req.CreateResponse(HttpStatusCode.OK);
                  response.WriteString(reply);
                  return response;
              }
          }
      }
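The function above reads the user's message from the raw POST body and writes the AI's reply back as the response body. The prebuilt chatconsole app takes care of this exchange for you. As a rough sketch of that contract (not the repository's actual client code), a minimal console client might look like the following. The `ChatClient` class and `BuildRequest` helper are hypothetical names, the default URL is the local address used as an example in this tutorial, and the `text/plain` content type is an assumption.

```csharp
using System;
using System.Net.Http;
using System.Text;
using System.Threading.Tasks;

// Assumed local URL; pass the URL printed by `func start` as the first argument to override it.
string url = args.Length > 0 ? args[0] : "http://localhost:7071/api/MyChatFunction";
using var http = new HttpClient();

while (true)
{
    Console.Write("Input: ");
    var message = Console.ReadLine();
    if (string.IsNullOrWhiteSpace(message)) break; // An empty line (or end of input) ends the chat.

    // Send the message as the request body and print the reply body.
    using HttpRequestMessage request = ChatClient.BuildRequest(url, message);
    using HttpResponseMessage response = await http.SendAsync(request);
    response.EnsureSuccessStatusCode();
    Console.WriteLine("AI: " + await response.Content.ReadAsStringAsync());
}

// Hypothetical helper, introduced only for this sketch.
public static class ChatClient
{
    // Builds a POST request whose body is the user's message as plain text.
    public static HttpRequestMessage BuildRequest(string url, string message) =>
        new HttpRequestMessage(HttpMethod.Post, url)
        {
            Content = new StringContent(message, Encoding.UTF8, "text/plain")
        };
}
```

Because the chat history is registered as a singleton, every caller of the function shares one conversation. That is fine for a local test, but a multi-user deployment would need a per-user or per-session history.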
- Run your Azure Function locally by opening a terminal, changing directory to your Azure Function project (e.g., semantic-kernel-rag-chat/src/myfunc), and starting the function by running

      func start

  Make a note of the URL displayed (e.g., http://localhost:7071/api/MyChatFunction).
- Start the test console application. Open a second terminal and change directory to the chatconsole project folder (e.g., semantic-kernel-rag-chat/src/chatconsole) and run the application using the Azure Function URL.

      dotnet run http://localhost:7071/api/MyChatFunction
- Type a message and press enter to verify that we are able to chat with the AI!

      Input: Hello, how are you?
      AI: Hello! As an AI language model, I don't have feelings, but I'm functioning properly and ready to assist you. How can I help you today?
- Now let's try to ask about something that is not in the current AI model, such as "What was Microsoft's cloud revenue for 2022?"

      Input: What was Microsoft's cloud revenue for 2022?
      AI: I'm sorry, but I cannot provide information about Microsoft's cloud revenue for 2022 as it is not yet available. Microsoft's fiscal year 2022 ends on June 30, 2022, and the company typically releases its financial results a few weeks after the end of the fiscal year. However, Microsoft's cloud revenue for fiscal year 2021 was $59.5 billion, an increase of 34% from the previous year.

  As you can see, the AI is a bit out of date with its answers.
In Chapter 1 we created an Azure Function hosting the Semantic Kernel, which makes it easy to send the API calls we want to make to the AI. This gives us a shared, production-ready endpoint that we could use from any solution we want to build.
Next we'll add a 'knowledge base' to the chat to help answer questions such as those above more accurately.
Semantic Kernel's memory stores are used to integrate data from your knowledge base into AI interactions. Any data can be added to a knowledge base and you have full control of that data and who it is shared with. SK uses embeddings to encode data and store it in a vector database. Using a vector database also allows us to use vector search engines to quickly find the most relevant data for a given query that we then share with the AI. In this chapter, we'll add a memory store to our chat function, import the Microsoft revenue data, and use it to answer the question from Chapter 1.
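To make the idea of vector search concrete, here is a small, self-contained sketch that ranks a few stored texts by cosine similarity to a query vector. It is not Semantic Kernel or Qdrant code: the `VectorSearch` helper, the sample texts, and the three-dimensional vectors are made up for illustration. Real embeddings are much larger (this chapter configures a vector size of 1536), and a vector database uses an index rather than the brute-force scan shown here.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Toy three-dimensional "embeddings" (hypothetical values, for illustration only).
var memories = new Dictionary<string, float[]>
{
    ["Cloud revenue increased"] = new[] { 0.9f, 0.1f, 0.0f },
    ["LinkedIn revenue increased"] = new[] { 0.7f, 0.7f, 0.0f },
    ["Office locations"] = new[] { 0.0f, 0.1f, 0.9f },
};
float[] query = { 1.0f, 0.0f, 0.0f };

// Print the two stored texts whose vectors point most nearly in the query's direction.
foreach (var (text, score) in VectorSearch.TopMatches(memories, query, limit: 2))
{
    Console.WriteLine($"{score:F2}  {text}");
}

// Hypothetical helper, introduced only for this sketch.
public static class VectorSearch
{
    // Cosine similarity: dot product divided by the product of the vector lengths.
    public static double CosineSimilarity(float[] a, float[] b)
    {
        if (a.Length != b.Length)
            throw new ArgumentException("Vectors must have the same length.");

        double dot = 0, normA = 0, normB = 0;
        for (int i = 0; i < a.Length; i++)
        {
            dot += a[i] * b[i];
            normA += a[i] * a[i];
            normB += b[i] * b[i];
        }
        if (normA == 0 || normB == 0) return 0; // A zero vector has no direction.
        return dot / (Math.Sqrt(normA) * Math.Sqrt(normB));
    }

    // Brute-force search: score every memory, then keep the best `limit` results.
    public static IEnumerable<(string Text, double Score)> TopMatches(
        IReadOnlyDictionary<string, float[]> memories, float[] query, int limit)
    {
        return memories
            .Select(m => (Text: m.Key, Score: CosineSimilarity(m.Value, query)))
            .OrderByDescending(m => m.Score)
            .Take(limit);
    }
}
```

The `limit` and minimum relevance score arguments passed to the memory search later in this chapter play the same role as `limit` here: they bound how many of the closest matches are handed to the AI.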
Before you get started, make sure you have the following additional requirements in place:
- Docker Desktop for hosting the Qdrant vector search engine.
Note that a different vector store, such as Pinecone or Weaviate, could be used instead.
- Open a terminal window, change to the directory with your project file (e.g., semantic-kernel-rag-chat/src/myfunc), and run the dotnet command below to add the Semantic Kernel Qdrant Memory Store to your project.

      dotnet add package Microsoft.SemanticKernel.Connectors.Memory.Qdrant --prerelease -v 0.14.547.1-preview
- Open your Program code file (e.g., Program.cs) and add the Qdrant memory store using statement to the top.

  We will walk through the code step-by-step, but you can find the complete code files in step 5 of this section.

      using Microsoft.SemanticKernel.Connectors.Memory.Qdrant;

  Replace the builder code we wrote in Chapter 1, where we instantiate the kernel, to include a Qdrant memory store and an OpenAI embedding generation service.

      QdrantMemoryStore memoryStore = new QdrantMemoryStore(
          host: "http://localhost",
          port: 6333,
          vectorSize: 1536,
          logger: sp.GetRequiredService<ILogger<QdrantMemoryStore>>());

      IKernel kernel = new KernelBuilder()
          .WithLogger(sp.GetRequiredService<ILogger<IKernel>>())
          .Configure(config => config.AddOpenAIChatCompletionService(
              modelId: "gpt-3.5-turbo",
              apiKey: openAiApiKey))
          .Configure(c => c.AddOpenAITextEmbeddingGenerationService(
              modelId: "text-embedding-ada-002",
              apiKey: openAiApiKey))
          .WithMemoryStorage(memoryStore)
          .Build();
- Open MyChatFunction.cs and add the following using statements to the top.

      using System.Text;
      using Microsoft.SemanticKernel;
      using Microsoft.SemanticKernel.Memory;

  Update the constructor to take an instance of the kernel and store it as a member variable.

      private readonly ILogger _logger;
      private readonly IKernel _kernel;
      private readonly IChatCompletion _chat;
      private readonly ChatHistory _chatHistory;

      public MyChatFunction(ILoggerFactory loggerFactory, IKernel kernel, ChatHistory chatHistory, IChatCompletion chat)
      {
          _logger = loggerFactory.CreateLogger<MyChatFunction>();
          _kernel = kernel;
          _chat = chat;
          _chatHistory = chatHistory;
      }

  Replace where we add the user's message to the chat history (_chatHistory!.AddMessage(ChatHistory.AuthorRoles.User,...) with a call that will search for related memories and include them in the user's message to the AI.

      // _chatHistory!.AddMessage(ChatHistory.AuthorRoles.User, await req.ReadAsStringAsync() ?? string.Empty);
      string message = await SearchMemoriesAsync(_kernel, await req.ReadAsStringAsync() ?? string.Empty);
      _chatHistory!.AddMessage(ChatHistory.AuthorRoles.User, message);
- And finally we'll add the SearchMemoriesAsync method to this class.

  The strategy of this memory search is to find the memories most similar to the user's input, widen each match to include the two memories before it and the two after it, and prepend all of that text to the user's message. These memories provide the AI with context for the user's input.

      private async Task<string> SearchMemoriesAsync(IKernel kernel, string query)
      {
          StringBuilder result = new StringBuilder();
          result.Append("The below is relevant information.\n[START INFO]");

          // Search for memories that are similar to the user's input.
          const string memoryCollectionName = "ms10k";
          IAsyncEnumerable<MemoryQueryResult> queryResults =
              kernel.Memory.SearchAsync(memoryCollectionName, query, limit: 3, minRelevanceScore: 0.77);

          // For each memory found, try to get previous and next memories.
          await foreach (MemoryQueryResult r in queryResults)
          {
              int id = int.Parse(r.Metadata.Id);
              MemoryQueryResult? rb2 = await kernel.Memory.GetAsync(memoryCollectionName, (id - 2).ToString());
              MemoryQueryResult? rb = await kernel.Memory.GetAsync(memoryCollectionName, (id - 1).ToString());
              MemoryQueryResult? ra = await kernel.Memory.GetAsync(memoryCollectionName, (id + 1).ToString());
              MemoryQueryResult? ra2 = await kernel.Memory.GetAsync(memoryCollectionName, (id + 2).ToString());

              if (rb2 != null) result.Append("\n " + rb2.Metadata.Id + ": " + rb2.Metadata.Description + "\n");
              if (rb != null) result.Append("\n " + rb.Metadata.Description + "\n");
              if (r != null) result.Append("\n " + r.Metadata.Description + "\n");
              if (ra != null) result.Append("\n " + ra.Metadata.Description + "\n");
              if (ra2 != null) result.Append("\n " + ra2.Metadata.Id + ": " + ra2.Metadata.Description + "\n");
          }

          result.Append("\n[END INFO]");
          result.Append($"\n{query}");

          return result.ToString();
      }
- The complete code files (with additional comments).

  Program.cs

      using Microsoft.Extensions.Configuration;
      using Microsoft.Extensions.DependencyInjection;
      using Microsoft.Extensions.Hosting;
      using Microsoft.Extensions.Logging;
      using Microsoft.SemanticKernel;
      using Microsoft.SemanticKernel.AI.ChatCompletion;
      using Microsoft.SemanticKernel.Connectors.Memory.Qdrant;

      var hostBuilder = new HostBuilder()
          .ConfigureFunctionsWorkerDefaults();

      hostBuilder.ConfigureAppConfiguration((context, config) =>
      {
          config.AddUserSecrets<Program>();
      });

      hostBuilder.ConfigureServices(services =>
      {
          services.AddSingleton<IKernel>(sp =>
          {
              // Retrieve the OpenAI API key from the configuration.
              IConfiguration configuration = sp.GetRequiredService<IConfiguration>();
              string openAiApiKey = configuration["OPENAI_APIKEY"];

              // Create a memory store that will be used to store memories.
              QdrantMemoryStore memoryStore = new QdrantMemoryStore(
                  host: "http://localhost",
                  port: 6333,
                  vectorSize: 1536,
                  logger: sp.GetRequiredService<ILogger<QdrantMemoryStore>>());

              // Create the kernel with chat completion, embedding generation, and memory storage.
              IKernel kernel = new KernelBuilder()
                  .WithLogger(sp.GetRequiredService<ILogger<IKernel>>())
                  .Configure(config => config.AddOpenAIChatCompletionService(
                      modelId: "gpt-3.5-turbo",
                      apiKey: openAiApiKey))
                  .Configure(c => c.AddOpenAITextEmbeddingGenerationService(
                      modelId: "text-embedding-ada-002",
                      apiKey: openAiApiKey))
                  .WithMemoryStorage(memoryStore)
                  .Build();

              return kernel;
          });

          // Register the chat completion service.
          services.AddSingleton<IChatCompletion>(sp =>
              sp.GetRequiredService<IKernel>().GetService<IChatCompletion>());

          // Create a new chat history.
          const string instructions = "You are a helpful friendly assistant.";
          services.AddSingleton<ChatHistory>(sp =>
              sp.GetRequiredService<IChatCompletion>().CreateNewChat(instructions));
      });

      hostBuilder.Build().Run();

  MyChatFunction.cs

      using System.Net;
      using System.Text;
      using Microsoft.Azure.Functions.Worker;
      using Microsoft.Azure.Functions.Worker.Http;
      using Microsoft.Extensions.Logging;
      using Microsoft.SemanticKernel;
      using Microsoft.SemanticKernel.AI.ChatCompletion;
      using Microsoft.SemanticKernel.Memory;

      namespace My.MyChatFunction
      {
          public class MyChatFunction
          {
              private readonly ILogger _logger;
              private readonly IKernel _kernel;
              private readonly IChatCompletion _chat;
              private readonly ChatHistory _chatHistory;

              public MyChatFunction(ILoggerFactory loggerFactory, IKernel kernel, ChatHistory chatHistory, IChatCompletion chat)
              {
                  _logger = loggerFactory.CreateLogger<MyChatFunction>();
                  _kernel = kernel;
                  _chat = chat;
                  _chatHistory = chatHistory;
              }

              [Function("MyChatFunction")]
              public async Task<HttpResponseData> Run(
                  [HttpTrigger(AuthorizationLevel.Function, "post")] HttpRequestData req)
              {
                  _logger.LogInformation("C# HTTP trigger function processed a request.");

                  // Search for related memories and include them in the user's message.
                  //_chatHistory!.AddMessage(ChatHistory.AuthorRoles.User, await req.ReadAsStringAsync() ?? string.Empty);
                  string message = await SearchMemoriesAsync(_kernel, await req.ReadAsStringAsync() ?? string.Empty);
                  _chatHistory!.AddMessage(ChatHistory.AuthorRoles.User, message);

                  string reply = await _chat.GenerateMessageAsync(_chatHistory, new ChatRequestSettings());

                  _chatHistory.AddMessage(ChatHistory.AuthorRoles.Assistant, reply);

                  HttpResponseData response = req.CreateResponse(HttpStatusCode.OK);
                  response.WriteString(reply);
                  return response;
              }

              private async Task<string> SearchMemoriesAsync(IKernel kernel, string query)
              {
                  StringBuilder result = new StringBuilder();
                  result.Append("The below is relevant information.\n[START INFO]");

                  const string memoryCollectionName = "ms10k";

                  IAsyncEnumerable<MemoryQueryResult> queryResults =
                      kernel.Memory.SearchAsync(memoryCollectionName, query, limit: 3, minRelevanceScore: 0.77);

                  // For each memory found, get previous and next memories.
                  await foreach (MemoryQueryResult r in queryResults)
                  {
                      int id = int.Parse(r.Metadata.Id);
                      MemoryQueryResult? rb2 = await kernel.Memory.GetAsync(memoryCollectionName, (id - 2).ToString());
                      MemoryQueryResult? rb = await kernel.Memory.GetAsync(memoryCollectionName, (id - 1).ToString());
                      MemoryQueryResult? ra = await kernel.Memory.GetAsync(memoryCollectionName, (id + 1).ToString());
                      MemoryQueryResult? ra2 = await kernel.Memory.GetAsync(memoryCollectionName, (id + 2).ToString());

                      if (rb2 != null) result.Append("\n " + rb2.Metadata.Id + ": " + rb2.Metadata.Description + "\n");
                      if (rb != null) result.Append("\n " + rb.Metadata.Description + "\n");
                      if (r != null) result.Append("\n " + r.Metadata.Description + "\n");
                      if (ra != null) result.Append("\n " + ra.Metadata.Description + "\n");
                      if (ra2 != null) result.Append("\n " + ra2.Metadata.Id + ": " + ra2.Metadata.Description + "\n");
                  }

                  result.Append("\n[END INFO]");
                  result.Append($"\n{query}");

                  return result.ToString();
              }
          }
      }
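The SearchMemoriesAsync method above widens each search hit to include the two memories before and after it. One side effect is that when two hits sit close together, their windows overlap and the same text is appended more than once. The sketch below shows one possible variation, assuming memories are keyed by consecutive integer IDs as the `int.Parse` call above implies: collect all IDs into a sorted set first, then build the prompt once. `ContextWindow`, `ExpandIds`, and `BuildPrompt` are hypothetical names, and the in-memory dictionary stands in for the memory store.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Text;

// Stand-in for the memory store: chunk text keyed by consecutive integer ID (hypothetical data).
var chunks = new Dictionary<int, string>
{
    [1] = "chunk one", [2] = "chunk two", [3] = "chunk three",
    [4] = "chunk four", [5] = "chunk five", [6] = "chunk six",
};

// Hits 3 and 4 have overlapping windows (1..5 and 2..6); each chunk is still emitted once.
Console.WriteLine(ContextWindow.BuildPrompt(chunks, new[] { 3, 4 }, "What was cloud revenue?"));

// Hypothetical helper, introduced only for this sketch.
public static class ContextWindow
{
    // Expands each hit to [hit - radius, hit + radius] and returns the distinct IDs in order.
    public static IReadOnlyList<int> ExpandIds(IEnumerable<int> hitIds, int radius = 2)
    {
        var ids = new SortedSet<int>();
        foreach (int hit in hitIds)
        {
            for (int id = hit - radius; id <= hit + radius; id++)
            {
                ids.Add(id);
            }
        }
        return ids.ToList();
    }

    // Wraps the selected chunks in the same [START INFO]/[END INFO] markers used above.
    public static string BuildPrompt(
        IReadOnlyDictionary<int, string> chunks, IEnumerable<int> hitIds, string query)
    {
        var sb = new StringBuilder("The below is relevant information.\n[START INFO]");
        foreach (int id in ExpandIds(hitIds))
        {
            // IDs that fall outside the collection are skipped, like the null checks above.
            if (chunks.TryGetValue(id, out var text))
            {
                sb.Append("\n " + text + "\n");
            }
        }
        sb.Append("\n[END INFO]");
        sb.Append($"\n{query}");
        return sb.ToString();
    }
}
```

Removing duplicates keeps the prompt shorter, which matters because everything placed in the user's message counts against the model's context limit.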
Before running our new code, we'll need to launch and populate the vector database.
In this section we deploy the Qdrant vector database locally and populate it with example data (i.e., Microsoft's 2022 10-K financial report). This will take approximately 15 minutes to import and will use OpenAI’s embedding generation service to create embeddings for the 10-K.
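Conceptually, importing a document means splitting it into chunks and storing each chunk under an ID. Because SearchMemoriesAsync parses each ID as an integer and then fetches the neighbors at id ± 1 and id ± 2, the IDs need to be consecutive integers in document order. The sketch below illustrates that idea only; it is not the importmemories tool, whose chunking strategy is not shown in this tutorial. The `Chunker` class and the blank-line paragraph split are assumptions.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical sample text; the real input in this tutorial is the 10-K text file.
string text = "Item 1. Business\n\nMicrosoft is a technology company.\n\n\nItem 2. Properties";

foreach (var (id, chunk) in Chunker.SplitIntoNumberedChunks(text))
{
    Console.WriteLine($"{id}: {chunk}");
}

// Hypothetical helper, introduced only for this sketch.
public static class Chunker
{
    // Splits text on blank lines and numbers the non-empty paragraphs 1, 2, 3, ...
    public static List<(string Id, string Chunk)> SplitIntoNumberedChunks(string text)
    {
        var result = new List<(string Id, string Chunk)>();
        string[] paragraphs = text
            .Replace("\r\n", "\n")
            .Split("\n\n", StringSplitOptions.RemoveEmptyEntries);

        int nextId = 1;
        foreach (string paragraph in paragraphs)
        {
            string trimmed = paragraph.Trim();
            if (trimmed.Length == 0) continue; // Skip whitespace-only fragments.

            // IDs are stored as strings but must parse as consecutive integers.
            result.Add((nextId.ToString(), trimmed));
            nextId++;
        }
        return result;
    }
}
```

In a real import, each chunk would also be sent to the embedding service and saved to the vector store together with its ID.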
- Start Docker Desktop and wait until it is running.
- Open a terminal and use Docker to pull down the container image for Qdrant.

      docker pull qdrant/qdrant
- Change directory to the root of this repo (e.g., semantic-kernel-rag-chat) and create a ./data/qdrant directory for Qdrant to use as persistent storage. Then start the Qdrant container on port 6333 using the ./data/qdrant folder as the persistent storage location.

      mkdir ./data/qdrant
      docker run --name mychat -p 6333:6333 -v "$(pwd)/data/qdrant:/qdrant/storage" qdrant/qdrant

  To stop the container, run the following in another terminal window:

      docker container stop mychat; docker container rm mychat;
- Open a second terminal and change directory to the importmemories project folder in this repo (e.g., semantic-kernel-rag-chat/src/importmemories). Run the importmemories tool with the command below to populate the vector database with your data.

  Make sure the --collection argument matches the memoryCollectionName constant in the SearchMemoriesAsync method above.

  Note: This may take several minutes to several hours depending on the size of your data. This repo contains Microsoft's 2022 10-K financial report data as an example, which should normally take about 15 minutes to import.

      dotnet run -- --memory-type qdrant --memory-url http://localhost:6333 --collection ms10k --text-file ../../data/ms10k.txt

  When importing your own data, try to import all files at the same time using multiple --text-file arguments. This example leverages incremental indexes, which are best constructed when all data is present.

  If you want to reset the memory store, delete and recreate the ./data/qdrant directory created in the previous step, or create a new directory to use.
- With Qdrant running and populated, run your Azure Function locally by opening a terminal, changing directory to your Azure Function project (e.g., semantic-kernel-rag-chat/src/myfunc), and starting the function by running

      func start

  Make a note of the URL displayed (e.g., http://localhost:7071/api/MyChatFunction).
- Start the test console application. Open a second terminal and change directory to the chatconsole project folder (e.g., semantic-kernel-rag-chat/src/chatconsole) and run the application using the Azure Function URL.

      dotnet run http://localhost:7071/api/MyChatFunction
- Type a message and press enter to verify that we are able to chat with the AI!

      Input: Hello, how are you?
      AI: Hello! As an AI language model, I don't have feelings, but I'm functioning properly and ready to assist you. How can I help you today?
- Now let's try asking the same question from before about Microsoft's 2022 revenue.

      Input: What was Microsoft's cloud revenue for 2022?
      AI: Microsoft's cloud revenue for 2022 was $91.2 billion.

  The AI now has the ability to search through the Microsoft 10-K financial report and find the answer to our question. Let's try another...

      Input: Did linkedin's revenue grow in 2022?
      AI: Yes, LinkedIn's revenue grew in 2022. It increased by $3.5 billion or 34% driven by a strong job market in the Talent Solutions business and advertising demand in the Marketing Solutions business.
Azure Cognitive Search is a powerful cloud search service that enables developers to build rich search experiences across their own private and heterogeneous data sources. With semantic search, Azure Cognitive Search can produce more semantically relevant results for text-based queries.
This is an alternative to the vector-based approach we took in Chapter 2. With semantic search, we no longer need to generate embeddings like we did in the previous chapter. Instead, a semantic re-ranking process is applied to the initial set of search results, using the context and meaning of words to elevate the results that are most relevant.
In this chapter, we will modify our chat function to use Azure Cognitive Search with semantic search. We will once again demonstrate how we can use this memory to generate more meaningful results in our chat application.
Before you get started, make sure you have the following additional requirements:
- An instance of the Azure Cognitive Search service, with semantic search enabled.
- To connect to the service, you will need the following two pieces of data:
  - AZURE_COGNITIVE_SEARCH_APIKEY: an admin key to your Azure Cognitive Search service
  - AZURE_COGNITIVE_SEARCH_URL: the URL to your Azure Cognitive Search endpoint
- Open a terminal window, change to the directory with your project file (e.g., semantic-kernel-rag-chat/src/myfunc), and run the dotnet command below to add the Semantic Kernel Azure Cognitive Search connector to your project.

      dotnet add package Microsoft.SemanticKernel.Connectors.Memory.AzureCognitiveSearch --prerelease -v 0.14.547.1-preview

  In addition, use the dotnet user-secrets commands below to securely store your Azure Cognitive Search API key and endpoint URL.

      dotnet user-secrets set "AZURE_COGNITIVE_SEARCH_APIKEY" "<your Azure Cognitive Search admin API key>"
      dotnet user-secrets set "AZURE_COGNITIVE_SEARCH_URL" "<your Azure Cognitive Search endpoint URL>"
- Open your Program code file (e.g., Program.cs) and add the Azure Cognitive Search connector using statement to the top.

      using Microsoft.SemanticKernel.Connectors.Memory.AzureCognitiveSearch;

  Replace the Qdrant memory store that we added in Chapter 2 with the Azure Cognitive Search memory connector.

      AzureCognitiveSearchMemory memory = new AzureCognitiveSearchMemory(
          configuration["AZURE_COGNITIVE_SEARCH_URL"],
          configuration["AZURE_COGNITIVE_SEARCH_APIKEY"]
      );

  Then, update the builder code where we instantiate the kernel. We can remove the OpenAI embedding generation service and the Qdrant memory store from the builder, and replace them with the Azure Cognitive Search memory that we just created.

      IKernel kernel = new KernelBuilder()
          .WithLogger(sp.GetRequiredService<ILogger<IKernel>>())
          .Configure(config => config.AddOpenAIChatCompletionService(
              modelId: "gpt-3.5-turbo",
              apiKey: openAiApiKey))
          .WithMemory(memory)
          .Build();

  No changes need to be made to SearchMemoriesAsync, since it uses the kernel's semantic memory abstraction to generate context for the query. While the underlying memory source has changed, this abstraction has not.
- The complete code files (with additional comments).
  Program.cs

      using Microsoft.Extensions.Configuration;
      using Microsoft.Extensions.DependencyInjection;
      using Microsoft.Extensions.Hosting;
      using Microsoft.Extensions.Logging;
      using Microsoft.SemanticKernel;
      using Microsoft.SemanticKernel.AI.ChatCompletion;
      using Microsoft.SemanticKernel.Connectors.Memory.AzureCognitiveSearch;

      var hostBuilder = new HostBuilder()
          .ConfigureFunctionsWorkerDefaults();

      hostBuilder.ConfigureAppConfiguration((context, config) =>
      {
          config.AddUserSecrets<Program>();
      });

      hostBuilder.ConfigureServices(services =>
      {
          services.AddSingleton<IKernel>(sp =>
          {
              // Retrieve the OpenAI API key from the configuration.
              IConfiguration configuration = sp.GetRequiredService<IConfiguration>();
              string openAiApiKey = configuration["OPENAI_APIKEY"];

              // Create a memory connector to Azure Cognitive Search that will be used to store memories.
              AzureCognitiveSearchMemory memory = new AzureCognitiveSearchMemory(
                  configuration["AZURE_COGNITIVE_SEARCH_URL"],
                  configuration["AZURE_COGNITIVE_SEARCH_APIKEY"]
              );

              // Create the kernel with chat completion and memory.
              IKernel kernel = new KernelBuilder()
                  .WithLogger(sp.GetRequiredService<ILogger<IKernel>>())
                  .Configure(config => config.AddOpenAIChatCompletionService(
                      modelId: "gpt-3.5-turbo",
                      apiKey: openAiApiKey))
                  .WithMemory(memory)
                  .Build();

              return kernel;
          });

          // Register the chat completion service.
          services.AddSingleton<IChatCompletion>(sp =>
              sp.GetRequiredService<IKernel>().GetService<IChatCompletion>());

          // Create a new chat history.
          const string instructions = "You are a helpful friendly assistant.";
          services.AddSingleton<ChatHistory>(sp =>
              sp.GetRequiredService<IChatCompletion>().CreateNewChat(instructions));
      });

      hostBuilder.Build().Run();
  MyChatFunction.cs (unchanged)

      using System.Net;
      using System.Text;
      using Microsoft.Azure.Functions.Worker;
      using Microsoft.Azure.Functions.Worker.Http;
      using Microsoft.Extensions.Logging;
      using Microsoft.SemanticKernel;
      using Microsoft.SemanticKernel.AI.ChatCompletion;
      using Microsoft.SemanticKernel.Memory;

      namespace My.MyChatFunction
      {
          public class MyChatFunction
          {
              private readonly ILogger _logger;
              private readonly IKernel _kernel;
              private readonly IChatCompletion _chat;
              private readonly ChatHistory _chatHistory;

              public MyChatFunction(ILoggerFactory loggerFactory, IKernel kernel, ChatHistory chatHistory, IChatCompletion chat)
              {
                  _logger = loggerFactory.CreateLogger<MyChatFunction>();
                  _kernel = kernel;
                  _chat = chat;
                  _chatHistory = chatHistory;
              }

              [Function("MyChatFunction")]
              public async Task<HttpResponseData> Run(
                  [HttpTrigger(AuthorizationLevel.Function, "post")] HttpRequestData req)
              {
                  _logger.LogInformation("C# HTTP trigger function processed a request.");

                  //_chatHistory!.AddMessage(ChatHistory.AuthorRoles.User, await req.ReadAsStringAsync() ?? string.Empty);
                  string message = await SearchMemoriesAsync(_kernel, await req.ReadAsStringAsync() ?? string.Empty);
                  _chatHistory!.AddMessage(ChatHistory.AuthorRoles.User, message);

                  string reply = await _chat.GenerateMessageAsync(_chatHistory, new ChatRequestSettings());

                  _chatHistory.AddMessage(ChatHistory.AuthorRoles.Assistant, reply);

                  HttpResponseData response = req.CreateResponse(HttpStatusCode.OK);
                  response.WriteString(reply);
                  return response;
              }

              private async Task<string> SearchMemoriesAsync(IKernel kernel, string query)
              {
                  StringBuilder result = new StringBuilder();
                  result.Append("The below is relevant information.\n[START INFO]");

                  const string memoryCollectionName = "ms10k";

                  IAsyncEnumerable<MemoryQueryResult> queryResults =
                      kernel.Memory.SearchAsync(memoryCollectionName, query, limit: 3, minRelevanceScore: 0.77);

                  // For each memory found, get previous and next memories.
                  await foreach (MemoryQueryResult r in queryResults)
                  {
                      int id = int.Parse(r.Metadata.Id);
                      MemoryQueryResult? rb2 = await kernel.Memory.GetAsync(memoryCollectionName, (id - 2).ToString());
                      MemoryQueryResult? rb = await kernel.Memory.GetAsync(memoryCollectionName, (id - 1).ToString());
                      MemoryQueryResult? ra = await kernel.Memory.GetAsync(memoryCollectionName, (id + 1).ToString());
                      MemoryQueryResult? ra2 = await kernel.Memory.GetAsync(memoryCollectionName, (id + 2).ToString());

                      if (rb2 != null) result.Append("\n " + rb2.Metadata.Id + ": " + rb2.Metadata.Description + "\n");
                      if (rb != null) result.Append("\n " + rb.Metadata.Description + "\n");
                      if (r != null) result.Append("\n " + r.Metadata.Description + "\n");
                      if (ra != null) result.Append("\n " + ra.Metadata.Description + "\n");
                      if (ra2 != null) result.Append("\n " + ra2.Metadata.Id + ": " + ra2.Metadata.Description + "\n");
                  }

                  result.Append("\n[END INFO]");
                  result.Append($"\n{query}");

                  return result.ToString();
              }
          }
      }
Before running our updated code, we'll need to populate an Azure Cognitive Search index.
In this section we create and populate an Azure Cognitive Search index with example data (i.e., Microsoft's 2022 10-K financial report). This will take approximately 5 minutes to import.
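Because SearchMemoriesAsync parses each memory id as an integer and fetches ids id - 2 through id + 2, the imported memories need consecutive integer ids. The source of the importmemories tool is not reproduced here, so the following is only a hedged sketch of how text could be split into sequentially numbered chunks. The ChunkParagraphs helper, the blank-line splitting rule, and the zero-based numbering are assumptions made for illustration.

```csharp
using System;
using System.Collections.Generic;

// Splits text on blank lines and numbers the resulting chunks 0, 1, 2, ...
// (Assumption: the real tool may chunk and number differently.)
static List<(string Id, string Text)> ChunkParagraphs(string text)
{
    var result = new List<(string Id, string Text)>();
    string[] paragraphs = text.Split(new[] { "\n\n" }, StringSplitOptions.RemoveEmptyEntries);
    int nextId = 0;
    foreach (string paragraph in paragraphs)
    {
        string trimmed = paragraph.Trim();
        if (trimmed.Length == 0)
        {
            continue; // skip whitespace-only paragraphs
        }
        result.Add((nextId.ToString(), trimmed));
        nextId++;
    }
    return result;
}

string sample = "Revenue grew.\n\nCloud revenue grew too.\n\n\n\nLinkedIn revenue grew.";
var memories = ChunkParagraphs(sample);
foreach (var memory in memories)
{
    Console.WriteLine($"{memory.Id}: {memory.Text}");
}
```

Each (Id, Text) pair would then be saved to the collection named by --collection, so that a later lookup of a neighboring id returns the adjacent passage.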
- Open a terminal and change directory to the importmemories project folder in this repo. Run the importmemories tool with the command below to populate the search index with your data.
  Make sure the --collection argument matches the memoryCollectionName constant in the SearchMemoriesAsync method above.
  Note: This may take several minutes to several hours depending on the size of your data. This repo contains Microsoft's 2022 10-K financial report as example data, which should normally take about 5 minutes to import.
  dotnet run -- --memory-type azurecognitivesearch --memory-url $AZURE_COGNITIVE_SEARCH_URL --collection ms10k --text-file ../../data/ms10k.txt
  If you want to reset the memory store, you can delete the index from your service via the Azure portal or the Azure Cognitive Search REST API. The index name is the same as the --collection argument (e.g., ms10k).
- With the Azure Cognitive Search service running and populated, run your Azure Function locally by opening a terminal, changing directory to your Azure Function project (e.g., semantic-kernel-rag-chat/src/myfunc), and starting the function by running
  func start
  Make a note of the URL displayed (e.g., http://localhost:7071/api/MyChatFunction).
- Start the test console application. Open a second terminal, change directory to the chatconsole project folder (e.g., semantic-kernel-rag-chat/src/chatconsole), and run the application using the Azure Function URL.
  dotnet run http://localhost:7071/api/MyChatFunction
- Type a message and press enter to verify that we are still able to chat with the AI.
  Input: Hi, how are you?
  AI: Hello! I'm an AI language model, so I don't have feelings, but I'm here to assist you. How can I help you today?
- Now let's try asking the same questions from before about Microsoft's 2022 revenue.
  Input: What was Microsoft's cloud revenue for 2022?
  AI: Microsoft's cloud revenue in fiscal year 2022 was $91.2 billion.
  Input: Did linkedin's revenue grow in 2022?
  AI: Yes, LinkedIn's revenue increased by 34% in fiscal year 2022 compared to the previous year.
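The chatconsole app used in the steps above is prebuilt, so its source is not shown here. Since MyChatFunction reads the request body as a string and writes the reply as a string, any client only needs to POST plain text to the function URL. The following is a minimal sketch of such a client. The BuildChatRequest helper and the text/plain content type are assumptions for illustration, and the URL is the local example from the steps above.

```csharp
using System;
using System.Net.Http;
using System.Text;

// Builds a POST request whose body is the raw chat message.
static HttpRequestMessage BuildChatRequest(string functionUrl, string message)
{
    return new HttpRequestMessage(HttpMethod.Post, functionUrl)
    {
        Content = new StringContent(message, Encoding.UTF8, "text/plain"),
    };
}

string url = "http://localhost:7071/api/MyChatFunction";
using var request = BuildChatRequest(url, "What was Microsoft's cloud revenue for 2022?");
Console.WriteLine($"{request.Method} {request.RequestUri}");

try
{
    using var client = new HttpClient();
    using HttpResponseMessage response = await client.SendAsync(request);
    Console.WriteLine("AI: " + await response.Content.ReadAsStringAsync());
}
catch (HttpRequestException ex)
{
    // Reached when the function is not running locally (for example, func start was not executed).
    Console.WriteLine($"Could not reach the function: {ex.Message}");
}
```

If func start is running in another terminal, the sketch prints the model's reply; otherwise it reports that the function could not be reached.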
We've now seen how to improve our Semantic Kernel chat application by applying the Retrieval Augmented Generation pattern with Azure Cognitive Search and semantic search. Try adding more data to your search index and explore what else you can create with Semantic Kernel.
For more guidance and ideas, check out the SK documentation, blog, and Discord community. Happy building!
- If you don't already have an Azure account, go to https://azure.microsoft.com, click on Try Azure for free, and select Start Free to start creating a free Azure account with your Microsoft or GitHub account. After signing in, you will be prompted to enter some information.
  This tutorial uses Azure Functions (pricing) and Azure Cognitive Search (pricing) that may incur a monthly cost. Visit here to get some free Azure credits to get you started.
- In Visual Studio Code, click on the Azure extension (or press SHIFT+ALT+A).
- Mouse-over RESOURCES and select Create Resource (i.e., +), select Create Function App in Azure..., then select your Azure Subscription.
- Enter a name for your deployed function, for example fn-mychatfunction.
- Set the runtime stack to .NET 7 Isolated and choose a location in which to deploy your Azure Function.
  If you don't have a preference, choose the recommended region.
- Wait until Create Function App completes, which should only take a minute or so.
- Mouse-over WORKSPACE and select Deploy (i.e., ☁️), then Deploy to Function App.
- Select the same Azure Subscription in which you created the Azure Function in Azure, then select the Azure Function you created above (e.g., fn-mychatfunction).
  It may take a minute or two to complete the deployment.
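Because MyChatFunction is declared with AuthorizationLevel.Function, requests to the deployed function need a function key, which Azure Functions accepts either in the code query string parameter or in the x-functions-key header. The sketch below shows one way to append the key to the function URL. The WithFunctionKey helper is a hypothetical name, the hostname is assumed from the example app name above (copy the real URL from the Azure portal), and the key is a placeholder.

```csharp
using System;

// Appends the function key as the "code" query parameter.
static Uri WithFunctionKey(string functionUrl, string functionKey)
{
    var builder = new UriBuilder(functionUrl)
    {
        // Keys can contain characters such as '=' or '/', so escape them.
        Query = "code=" + Uri.EscapeDataString(functionKey),
    };
    return builder.Uri;
}

// Assumed URL and placeholder key; replace both with your own values.
Uri deployed = WithFunctionKey(
    "https://fn-mychatfunction.azurewebsites.net/api/MyChatFunction",
    "<your-function-key>");
Console.WriteLine(deployed);
```

The resulting URL can be used wherever the local http://localhost:7071/api/MyChatFunction URL was used before. Treat the key like a password and avoid committing it to source control.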