Primary language: C#. License: MIT.

Azure Search Power Skills

Power Skills are a collection of useful functions to be deployed as custom skills for Azure Cognitive Search. The skills can be used as templates or starting points for your own custom skills, or they can be deployed and used as they are if they happen to meet your requirements. We also invite you to contribute your own work by submitting a pull request.

Skills

This project provides the following custom skills:

| Skill | Description | Type | Language | Environment | Deployment |
| --- | --- | --- | --- | --- | --- |
| GeoPointFromName | Retrieves coordinates from place names and addresses. | Geography | C# | functions | ARM Template |
| AcronymLinker | Provides definitions for known acronyms. | Text | C# | functions | ARM Template |
| Anonymizer | Uses Presidio to analyze and anonymize PII entities. | Text | python | docker | Manual |
| BingEntitySearch | Finds rich and structured information about public figures, locations, or organizations. | Text | C# | functions | ARM Template |
| CustomEntityLookup | Finds custom entity names in text. This is a custom skill implementation of the custom entity lookup skill; consider using the built-in cognitive skill instead of this custom skill implementation. | Text | C# | functions | ARM Template |
| CustomNER | Extracts your custom entities using Natural Language Processing with Text Analytics Custom NER. | Text | python | functions | ARM Template |
| CustomTextClassifier | Extracts your custom text classifications using Natural Language Processing with Text Analytics Custom Text Classification. | Text | python | functions | ARM Template |
| Distinct | De-duplicates a list of terms. | Text | C# | functions | ARM Template |
| Summarizer | Uses a HuggingFace/Facebook BART model (BART-Large-CNN) to summarize text. | Text | python | docker | Manual |
| TextAnalyticsForHealth | A wrapper for the Text Analytics for Health API. | Text | C# | functions | ARM Template |
| TextQualityWatchdog | Uses a pretrained language model to detect low-quality text extracted during document cracking. | Text | python | functions | Manual |
| Tokenizer | Extracts non-stop words from a text. | Text | C# | functions | |
| AbbyyOCR | Uses ABBYY Cloud OCR to extract text from images. | Vision | C# | functions | ARM Template |
| FormRecognizer | Uses Form Recognizer to analyze a document. The skill supports the following model types: Layout, Invoice, Receipt, ID, Business Card, General key-value pairs, and Custom Form. | Vision | python | functions | Manual |
| AutoMLVisionClassifier | Gets your latest Data Labelling AML AutoML Vision model and runs inference on it. | Vision | python | docker | Manual |
| CustomVision | Classifies documents using Custom Vision models. | Vision | C# | functions | ARM Template |
| HocrGenerator | Transforms the result of OCR into the hOCR format. | Vision | C# | functions | ARM Template |
| ImageClustering | Uses clustering to automatically group and label images. | Vision | python | docker | Manual |
| ImageSegmentation | Breaks down a full image or PDF page into subimages and uploads them to Azure Blob Storage. | Vision | python | functions | Manual |
| ImageSimilarity | Uses ResNet to find the top-n most similar images. | Vision | python | docker | Manual |
| P&ID Parser | Extracts equipment tags and text blocks from piping and instrumentation diagrams. | Vision | python | docker | Manual |
| DecryptBlobFile | Downloads, decrypts, and returns a file that was previously encrypted and stored in Azure Blob Storage. | Utility | C# | functions | ARM Template |
| GetFileExtension | Returns the filename and extension as separate values, allowing you to filter on document type. | Utility | C# | functions | ARM Template |
| ImageStore | Stores and fetches base64-encoded images to and from blob storage. The knowledge store is a cleaner implementation of the pattern to save images to storage. | Utility | C# | functions | ARM Template |
| HelloWorld | A minimal skill that can be used as a starting point or template for your own skills. | Template | C# | functions | ARM Template |
| PythonFastAPI | A production web server and API scaffold for a Python power skill. | Template | python | docker | Terraform template |
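
All of these skills are meant to plug into Azure Cognitive Search as custom (Web API) skills, which exchange JSON in a common shape: the request carries a `values` array of records, each with a `recordId` and a `data` object, and the response returns one record per `recordId` with its own `data`, `errors`, and `warnings`. The sketch below illustrates that shape in Python with a de-duplication step similar in spirit to the Distinct skill. It is not the repository's implementation: the function name `run_distinct_skill` and the `words` and `distinct` field names are assumptions made for the example.

```python
from typing import Any, Dict, List


def run_distinct_skill(request_body: Dict[str, Any]) -> Dict[str, Any]:
    """Turn a custom-skill request into a custom-skill response.

    Hypothetical example: the 'words' input and 'distinct' output names
    are illustrative, not taken from the repository.
    """
    results: List[Dict[str, Any]] = []
    for record in request_body.get("values", []):
        # Every output record echoes the recordId of its input record.
        output: Dict[str, Any] = {
            "recordId": record.get("recordId"),
            "data": {},
            "errors": [],
            "warnings": [],
        }
        words = record.get("data", {}).get("words")
        if isinstance(words, list) and all(isinstance(w, str) for w in words):
            # dict.fromkeys drops duplicates while keeping first-seen order.
            output["data"]["distinct"] = list(dict.fromkeys(words))
        else:
            # Report the problem on this record only; other records still succeed.
            output["errors"].append(
                {"message": "Expected 'words' to be a list of strings."}
            )
        results.append(output)
    return {"values": results}
```

Reporting failures per record, rather than failing the whole request, lets the remaining records in a batch continue through the enrichment pipeline.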

Getting Started

Prerequisites

To use the functions in this project, you'll need an active Azure subscription. Most of the functions can be used on their own for quick evaluation and experimentation, but they are meant to be used as part of an Azure Cognitive Search pipeline. Each function may also have its own specific requirements, such as API keys for the services it relies on.
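
To use a deployed function in an Azure Cognitive Search pipeline, you reference it from a skillset as a custom Web API skill. The following Python sketch builds such a skill definition as a dictionary that can be serialized to JSON and placed in a skillset's `skills` array. The helper name `build_web_api_skill`, the URI placeholders, and the input and output names are assumptions for illustration; check the readme of the skill you deploy for its actual inputs and outputs.

```python
import json
from typing import Any, Dict


def build_web_api_skill(
    uri: str,
    inputs: Dict[str, str],
    outputs: Dict[str, str],
    context: str = "/document",
    batch_size: int = 1,
    timeout: str = "PT30S",
) -> Dict[str, Any]:
    """Build a custom Web API skill definition for a skillset.

    inputs maps each skill input name to a source path in the enriched document;
    outputs maps each skill output name to the target name it is stored under.
    """
    return {
        "@odata.type": "#Microsoft.Skills.Custom.WebApiSkill",
        "uri": uri,
        "httpMethod": "POST",
        "timeout": timeout,  # ISO 8601 duration
        "batchSize": batch_size,
        "context": context,
        "inputs": [{"name": n, "source": s} for n, s in inputs.items()],
        "outputs": [{"name": n, "targetName": t} for n, t in outputs.items()],
    }


# Hypothetical values: replace the placeholders with your own deployment details.
skill = build_web_api_skill(
    uri="https://<your-function-app>.azurewebsites.net/api/<function-name>?code=<function-key>",
    inputs={"words": "/document/keyphrases"},
    outputs={"distinct": "distinctKeyphrases"},
)
print(json.dumps(skill, indent=2))
```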

Visual Studio 2019 is recommended, but not required. You need a recent version of the C# compiler. Postman is highly recommended as a way to experiment and test skills.

Installation and deployment

If you use Visual Studio with the Azure development workload installed, no further installation is required, and the functions can be run locally by pressing F5.

Deployment of a function to Azure can be done through Visual Studio, the Deploy to Azure button, or continuous deployment.

Some functions may require setting environment variables or configuration entries. Please refer to the readme file in the function's directory.
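
For the Python skills, configuration of this kind is typically read from environment variables, which is also how Azure Functions exposes application settings to function code. The sketch below is one possible way to fail fast with a clear message when a setting is missing. The helper `load_settings` and the setting name `EXAMPLE_API_KEY` are hypothetical; the actual setting names are listed in each function's readme.

```python
import os
from typing import Dict, Iterable, Mapping


def load_settings(
    required: Iterable[str],
    env: Mapping[str, str] = os.environ,
) -> Dict[str, str]:
    """Return the required settings, or raise if any is missing or empty."""
    names = list(required)  # materialize so the names can be scanned twice
    missing = [name for name in names if not env.get(name)]
    if missing:
        raise RuntimeError("Missing required settings: " + ", ".join(missing))
    return {name: env[name] for name in names}


# Hypothetical usage: EXAMPLE_API_KEY is a made-up setting name.
# settings = load_settings(["EXAMPLE_API_KEY"])
```

Passing the environment as a parameter keeps the helper easy to test without modifying the real process environment.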

Quickstart

  1. Clone the repository
  2. Open the PowerSkills solution in Visual Studio
  3. Set the project for the function to test as the startup project
  4. Hit F5
  5. Experiment with calling the function using Postman
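
If you prefer a script to Postman for step 5, a few lines of Python using only the standard library can send the same request. This is a sketch under assumptions: port 7071 is the default for a locally running Azure Functions host, while the route `hello-world` and the `name` input are placeholders that you should replace with the route and inputs of the function you are testing.

```python
import json
import urllib.request
from typing import Any, Dict


def build_skill_request(
    url: str, record_id: str, data: Dict[str, Any]
) -> urllib.request.Request:
    """Build a POST request carrying a single custom-skill record."""
    body = json.dumps(
        {"values": [{"recordId": record_id, "data": data}]}
    ).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def call_skill(
    request: urllib.request.Request, timeout: float = 30.0
) -> Dict[str, Any]:
    """Send the request and decode the JSON response."""
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))


# Hypothetical usage against a locally running function (requires the host to be up):
# request = build_skill_request(
#     "http://localhost:7071/api/hello-world", "r1", {"name": "World"}
# )
# print(json.dumps(call_skill(request), indent=2))
```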

You can also create your own skills using our Hello World template skill as a starting point or, if you are using Python, our FastAPI template skill.

Up for grabs

Here are a few suggestions of simple contributions to get you started:

Resources