The Azure Search OpenAI Demo is an excellent base for showing how to search unstructured data. That said, it isn't designed to support analysis of structured (tabular) datasets, which are better handled by other kinds of database. Even so, the demo's usability, quality, and popularity mean there is demand for a quick way to load small amounts of structured data into Cognitive Search and demonstrate GPT reasoning over it.
This is a fork of the original PDF loader (`prepdocs.py`), modified to load CSVs into CogSearch.
It's deliberately simplistic and (right now) loads each CSV as a single document in CogSearch.
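To illustrate the "one CSV = one document" idea, here is a minimal Python sketch. The endpoint, key, and field names (`id`, `content`, `sourcefile`) are placeholders/assumptions and may differ from the loader's actual index schema; `gptkbindex` is the demo's default index name.

```python
# Sketch only: flatten a whole CSV into one search document.
# Field names (id, content, sourcefile) are assumptions, not necessarily the demo's schema.
import csv

from azure.core.credentials import AzureKeyCredential
from azure.search.documents import SearchClient

def csv_to_text(path: str) -> str:
    """Join every row of the CSV into a single text blob."""
    with open(path, newline="") as f:
        return "\n".join(", ".join(row) for row in csv.reader(f))

search_client = SearchClient(
    endpoint="https://<your-search-service>.search.windows.net",  # placeholder
    index_name="gptkbindex",                                      # demo default, adjust if needed
    credential=AzureKeyCredential("<admin-key>"),                 # placeholder
)

search_client.upload_documents(documents=[{
    "id": "mydata-csv",                    # hypothetical document key
    "content": csv_to_text("mydata.csv"),  # the entire CSV becomes one document
    "sourcefile": "mydata.csv",
}])
```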
Do not load CSVs with more than a few hundred cells (rows × columns): they will be too big (too many tokens) for the default GPT model (`gpt-35-turbo`), which has a 4k-token limit.
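If you want a rough pre-flight check, a sketch like the following (not part of the loader) uses `tiktoken` to estimate how many tokens a CSV will consume; the file name and the 3,000-token threshold are arbitrary examples.

```python
# Sketch only: estimate the token count of a CSV before loading it.
import tiktoken

def csv_token_count(path: str, model: str = "gpt-3.5-turbo") -> int:
    encoding = tiktoken.encoding_for_model(model)
    with open(path) as f:
        return len(encoding.encode(f.read()))

tokens = csv_token_count("mydata.csv")  # hypothetical file name
if tokens > 3000:  # leave headroom below the 4k limit for the prompt and answer
    print(f"Warning: ~{tokens} tokens -- probably too large for gpt-35-turbo")
```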
- Download this repository: https://github.com/leongj/azure-search-demo-csv-loader/zipball/main
- Copy the `csvloader` directory into your working `azure-search-openai-demo` directory (so it becomes `azure-search-openai-demo/csvloader`).
- Copy your `.csv` file(s) into the `csvloader` folder.
- Open a PowerShell (or bash) terminal in the base folder (`azure-search-openai-demo`).
- (Optional) Delete the existing index (e.g. if it still contains the default sample content) by running `./csvloader/loadcsv.ps1 --deleteindex` (use `loadcsv.sh` on Linux/bash).
- Load your CSV files by running `./csvloader/loadcsv.ps1` (use `loadcsv.sh` on Linux/bash).
If the script runs successfully, you should be able to search the new content in the index immediately.
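One optional way to sanity-check the load from Python (a sketch; the endpoint, key, and index name `gptkbindex` are assumptions based on the demo's defaults):

```python
# Sketch only: list a few documents from the index to confirm the CSVs were loaded.
from azure.core.credentials import AzureKeyCredential
from azure.search.documents import SearchClient

client = SearchClient(
    endpoint="https://<your-search-service>.search.windows.net",  # placeholder
    index_name="gptkbindex",                                      # demo default, adjust if needed
    credential=AzureKeyCredential("<query-key>"),                 # placeholder
)

# A wildcard search returns whatever documents are in the index.
for result in client.search(search_text="*", top=5):
    print(result.get("sourcefile"), "-", str(result.get("content", ""))[:80])
```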