Indexing multiple Azure Blob Storage containers in Azure Cognitive Search

This repo shows how to index multiple Azure Blob Storage containers in Azure Cognitive Search with a single Azure Table Storage indexer. Azure Functions write blob metadata to an Azure Table Storage table, and the indexer reads from that table.

Skillset

Logical components:

Azure Cognitive Search assets:

Azure Blob Storage to Table Storage

The project provides two Azure Functions that copy blob metadata from Azure Blob Storage to Azure Table Storage, one event-based and one in batch mode:

  1. BlobToTable - Event Grid-triggered function that stores the blob name and container name in Azure Table Storage each time a blob event arrives (event-based mode)
  2. ContainerToTableHttp - HTTP-triggered function that copies the metadata of every blob in an Azure Blob Storage container to Azure Table Storage (batch mode)

Use the batch function for the initial ingestion, and the event-based function to keep the rows in Azure Table Storage consistent with later blob changes.
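As a rough illustration of the event-based path, the sketch below turns the `subject` of a Blob Storage Event Grid event (which has the form `/blobServices/default/containers/<container>/blobs/<blob>`) into a table row. The entity layout is an assumption made for this example, not necessarily the schema this repo uses: the container name becomes the `PartitionKey`, and the blob name is base64url-encoded into the `RowKey` because Table Storage keys cannot contain `/`. The helper names `parse_blob_subject` and `to_table_entity` are hypothetical.

```python
import base64
from typing import Dict, Tuple

# Prefix of the `subject` field in Blob Storage Event Grid events.
SUBJECT_PREFIX = "/blobServices/default/containers/"


def parse_blob_subject(subject: str) -> Tuple[str, str]:
    """Split an Event Grid blob subject into (container name, blob name)."""
    if not subject.startswith(SUBJECT_PREFIX):
        raise ValueError(f"Not a blob event subject: {subject!r}")
    # Container names cannot contain '/', so the first '/blobs/' is the separator.
    container, sep, blob_name = subject[len(SUBJECT_PREFIX):].partition("/blobs/")
    if not sep or not container or not blob_name:
        raise ValueError(f"Malformed blob event subject: {subject!r}")
    return container, blob_name


def to_table_entity(subject: str, blob_url: str) -> Dict[str, str]:
    """Build a table entity for one blob (assumed layout, see the text above)."""
    container, blob_name = parse_blob_subject(subject)
    # Table Storage keys may not contain '/', '\\', '#' or '?', so encode the name.
    row_key = base64.urlsafe_b64encode(blob_name.encode("utf-8")).decode("ascii")
    return {
        "PartitionKey": container,
        "RowKey": row_key,
        "BlobName": blob_name,
        "BlobUrl": blob_url,
    }
```

Keeping the container name in the partition key means rows from different containers share one table without colliding, which is what lets a single Table Storage indexer cover all of them.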

How to create an Event-Grid subscription for Azure Blob Storage

Application settings

"AzureWebJobsStorage": # Storage account connection string used by the Azure Functions runtime
"FUNCTIONS_WORKER_RUNTIME": "python",
"AzureBlobStorageConnectionString": # Storage account connection string used to read blob metadata and generate SAS tokens
"TableName": "droptable" # Name of the Azure Table Storage table that receives the blob metadata
"CopyMetadata": # Set to "1" to copy blob metadata in the event-based function (BlobToTable)
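
Azure Functions exposes application settings to the function code as environment variables. The following is a minimal sketch of how the settings above could be read in Python. The `Settings` class and `load_settings` helper are hypothetical names introduced for this example, and treating a missing `CopyMetadata` as disabled is an assumption.

```python
import os
from dataclasses import dataclass
from typing import Mapping


@dataclass(frozen=True)
class Settings:
    blob_connection_string: str
    table_name: str
    copy_metadata: bool


def load_settings(env: Mapping[str, str] = os.environ) -> Settings:
    """Read the application settings listed above from the environment."""
    return Settings(
        # Raises KeyError if a required setting is missing.
        blob_connection_string=env["AzureBlobStorageConnectionString"],
        table_name=env["TableName"],
        # Only the exact string "1" enables metadata copying (assumed default: off).
        copy_metadata=env.get("CopyMetadata", "0") == "1",
    )
```

Passing the environment in as a mapping keeps the helper easy to exercise locally with a plain dictionary instead of real application settings.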