microsoft/kernel-memory

GenerateEmbeddingsHandler - An item with the same key has already been added

Context / Scenario

I just want to add a document to the memory store, and I get this error.
Code:


 var ms = new System.IO.MemoryStream();
 await uploadForm.File.OpenReadStream(uploadForm.File.Size).CopyToAsync(ms);
 var d = await km.ImportDocumentAsync(ms, uploadForm.File.Name);

KM is set up as follows:

var km = new KernelMemoryBuilder()
       .WithOpenAIDefaults(openAIConfig.APIKey)
       .WithOpenAITextEmbeddingGeneration(openAIConfig)
       .WithSimpleVectorDb(new SimpleVectorDbConfig { Directory = "/tmp/robRezervacije/" })
       .WithSimpleFileStorage(new Microsoft.KernelMemory.ContentStorage.DevTools.SimpleFileStorageConfig { Directory = "/tmp/robRezervacije/" })
       .Build<MemoryServerless>();

What happened?

I got an exception:

System.ArgumentException: An item with the same key has already been added. Key: perls.pdf.partition.0.txt.AI.OpenAI.OpenAITextEmbeddingGenerator.TODO.text_embedding
   at System.Collections.Generic.Dictionary`2.TryInsert(TKey key, TValue value, InsertionBehavior behavior)
   at System.Collections.Generic.Dictionary`2.Add(TKey key, TValue value)
   at Microsoft.KernelMemory.Handlers.GenerateEmbeddingsHandler.InvokeAsync(DataPipeline pipeline, CancellationToken cancellationToken)

Importance

I cannot use Kernel Memory

Platform, Language, Versions

.NET 8, macOS Sonoma 14.4.1, KM 0.36.240415.2
C#

Relevant log output

`dbug: Microsoft.KernelMemory.Handlers.GenerateEmbeddingsHandler[0]
      Saving embedding file perls.pdf.partition.0.txt.AI.OpenAI.OpenAITextEmbeddingGenerator.TODO.text_embedding
trce: Microsoft.KernelMemory.Handlers.GenerateEmbeddingsHandler[0]
      Generating embeddings using AI.OpenAI.OpenAITextEmbeddingGenerator, file: perls.pdf.partition.0.txt
dbug: Microsoft.KernelMemory.Handlers.GenerateEmbeddingsHandler[0]
      Saving embedding file perls.pdf.partition.0.txt.AI.OpenAI.OpenAITextEmbeddingGenerator.TODO.text_embedding
fail: Microsoft.KernelMemory.Pipeline.BaseOrchestrator[0]
      Pipeline start failed
      System.ArgumentException: An item with the same key has already been added. Key: perls.pdf.partition.0.txt.AI.OpenAI.OpenAITextEmbeddingGenerator.TODO.text_embedding
         at System.Collections.Generic.Dictionary`2.TryInsert(TKey key, TValue value, InsertionBehavior behavior)
         at System.Collections.Generic.Dictionary`2.Add(TKey key, TValue value)
         at Microsoft.KernelMemory.Handlers.GenerateEmbeddingsHandler.InvokeAsync(DataPipeline pipeline, CancellationToken cancellationToken)
         at Microsoft.KernelMemory.Pipeline.InProcessPipelineOrchestrator.RunPipelineAsync(DataPipeline pipeline, CancellationToken cancellationToken)
         at Microsoft.KernelMemory.Pipeline.BaseOrchestrator.ImportDocumentAsync(String index, DocumentUploadRequest uploadRequest, CancellationToken cancellationToken)`
dluc commented

I think you're adding the same embedding generator twice when using WithOpenAIDefaults and WithOpenAITextEmbeddingGeneration.

Try removing WithOpenAIDefaults, changing this code:

       .WithOpenAIDefaults(openAIConfig.APIKey)
       .WithOpenAITextEmbeddingGeneration(openAIConfig)

to

       .WithOpenAITextGeneration(openAITextConfig)
       .WithOpenAITextEmbeddingGeneration(openAIEmbeddingConfig)
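
For anyone wondering why a duplicate registration surfaces as an ArgumentException: the log shows the embedding file name being derived from the partition file name plus the generator name, and the stack trace shows a Dictionary.Add call failing. The sketch below is a simplified model of that situation, not the actual GenerateEmbeddingsHandler code. DuplicateGeneratorDemo, EmbeddingFileName and SaveEmbeddings are hypothetical names made up for illustration; only the file name pattern is taken from the log above.

```csharp
using System;
using System.Collections.Generic;

// Simplified stand-in for the embedding step. This is NOT the real
// GenerateEmbeddingsHandler; all names here are hypothetical.
public static class DuplicateGeneratorDemo
{
    // Builds a file name following the pattern seen in the log output:
    // "<partition file>.<generator name>.TODO.text_embedding"
    public static string EmbeddingFileName(string partitionFile, string generatorName)
        => $"{partitionFile}.{generatorName}.TODO.text_embedding";

    // Adds one entry per registered generator. Dictionary.Add throws
    // ArgumentException if two generators produce the same file name.
    public static Dictionary<string, string> SaveEmbeddings(
        string partitionFile, IEnumerable<string> generatorNames)
    {
        var newFiles = new Dictionary<string, string>();
        foreach (var name in generatorNames)
        {
            newFiles.Add(EmbeddingFileName(partitionFile, name), "embedding data");
        }
        return newFiles;
    }

    public static void Main()
    {
        const string Generator = "AI.OpenAI.OpenAITextEmbeddingGenerator";

        // One generator registered: the file is added once.
        var ok = SaveEmbeddings("perls.pdf.partition.0.txt", new[] { Generator });
        Console.WriteLine($"Files saved with one generator: {ok.Count}");

        // The same generator registered twice: the second Add hits an existing key.
        try
        {
            SaveEmbeddings("perls.pdf.partition.0.txt", new[] { Generator, Generator });
        }
        catch (ArgumentException e)
        {
            Console.WriteLine($"Failed: {e.Message}");
        }
    }
}
```

With a single registration each partition yields a unique key; with the same generator registered twice, the second pass produces the identical key and Add rejects it, which matches the two "Saving embedding file" lines in the log for the same file.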

Ah, this works, thank you! I wasn't aware that WithOpenAIDefaults sets up both text generation and embeddings.