In this repository you will find the materials and challenges for the "Azure AI - Bilder, Text und Sprache mit Azure AI verstehen und durchsuchbar machen" bootcamp.
The associated bootcamp teaches you how to deploy and use the following two Azure technologies:
- Azure Cognitive Services
- Azure Search (including Cognitive Search)
The bootcamp is organized as six independent challenges, each one containing a high-level description and a few hints in case you get stuck. We'll touch on the following services:
| Service | Where? |
|---|---|
| Azure Search + Cognitive Search | Challenge 01 |
| Azure Cognitive Services - Computer Vision API and Custom Vision Service | Challenge 02 |
| Azure Cognitive Services - Speech Services | Challenge 03 |
| Azure Cognitive Services - Language Understanding | Challenge 04 |
| Azure Cognitive Services - Text Analytics API | Challenge 05 |
| Azure Cognitive Services - Search API | Challenge 06 |
You can solve these challenges in a programming language of your choice (some even in `curl` 🔨). For the sake of convenience, we are providing hints in Python, which you can easily (and for free) run in Azure Notebooks. SDK support for C# or .NET Core is available for most challenges; especially Azure Search features an easy-to-use .NET SDK. You can find code examples in the Azure documentation for the associated services.
🚩 Goal: Deploy an Azure Search instance and index a PDF-based data set
- Deploy an Azure Search instance
- Index the unstructured PDF data set from here. Which document contains the term `Content Moderator`?
❓ Questions:
- What is an Index? What is an Indexer? What is a Data Source? How do they relate to each other?
- Why would you want to use replicas? Why would you want more partitions?
- How would you index `json` documents sitting in Azure Blob?
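To get a feel for the moving parts, here is a minimal sketch of creating an index via the Azure Search REST API. The service name and admin key are placeholders, the field list is an assumption for blob-cracked PDF content, and `2019-05-06` is one of the stable API versions:

```python
import json
import urllib.request

def build_index_definition(name):
    # Minimal schema for PDF content cracked by a blob indexer:
    # a key field plus a full-text-searchable content field.
    return {
        "name": name,
        "fields": [
            {"name": "id", "type": "Edm.String", "key": True, "searchable": False},
            {"name": "content", "type": "Edm.String", "searchable": True},
            {"name": "metadata_storage_name", "type": "Edm.String", "searchable": False},
        ],
    }

def create_index(service_name, api_key, definition, api_version="2019-05-06"):
    # PUT /indexes/{name} creates (or updates) the index.
    url = (f"https://{service_name}.search.windows.net/indexes/"
           f"{definition['name']}?api-version={api_version}")
    request = urllib.request.Request(
        url,
        data=json.dumps(definition).encode("utf-8"),
        headers={"Content-Type": "application/json", "api-key": api_key},
        method="PUT",
    )
    return urllib.request.urlopen(request)

# Usage (needs a real service and admin key):
# create_index("<your-service>", "<admin-key>", build_index_definition("pdf-index"))
```

The data source (pointing at the Blob container holding the PDFs) and the indexer (connecting data source to index) are created the same way against the `/datasources` and `/indexers` endpoints, which is exactly the relationship the first question asks about.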
🚩 Goal: Index an unstructured data set with Cognitive Search
- Add another index to the Azure Search instance, but this time enable Cognitive Search
- Index an existing data set coming from Azure Blob (data set can be downloaded here). Which document contains the term `Pin to Dashboard`?
❓ Questions:
- Let's assume we've built a Machine Learning model that can detect beer glasses images (we'll do that in the next challenge) - how could we leverage this model directly in Azure Search for tagging our data?
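One answer is to wrap the model in a custom skill, i.e. a small Web API that Cognitive Search calls during enrichment. A sketch of the request/response contract such a skill has to honor; `detect_beer_glasses` is a hypothetical stand-in for the Custom Vision model we build in the next challenge:

```python
def detect_beer_glasses(image_url):
    # Placeholder: a real implementation would call the
    # Custom Vision prediction endpoint for this image.
    return ["beer glass"] if "beer" in image_url else []

def custom_skill(payload):
    # Cognitive Search sends {"values": [{"recordId": ..., "data": {...}}, ...]}
    # and expects the response to mirror the same recordIds.
    results = []
    for record in payload["values"]:
        tags = detect_beer_glasses(record["data"]["imageUrl"])
        results.append({
            "recordId": record["recordId"],
            "data": {"tags": tags},
            "errors": [],
            "warnings": [],
        })
    return {"values": results}
```

Hosted (e.g. as an Azure Function), this endpoint is referenced from the skillset, and its `tags` output can be mapped into an index field.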
🚩 Goal: Leverage OCR to make a hand-written or printed text document in images machine-readable
In the language of your choice (Python solution is provided), write two small scripts that
- Convert hand-written text from an image into text - Test data: 1, 2
- Convert printed text from an image into text - Test data: 1, 2
❓ Questions:
- How well does the OCR service work with German text? How well with English?
- What happens when the image is not oriented correctly?
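For the printed-text case, a sketch of calling the Computer Vision OCR endpoint and flattening its nested JSON; the endpoint and key are placeholders:

```python
import json
import urllib.request

def extract_text(ocr_result):
    # The OCR response nests text as regions -> lines -> words.
    lines = []
    for region in ocr_result.get("regions", []):
        for line in region.get("lines", []):
            lines.append(" ".join(word["text"] for word in line.get("words", [])))
    return lines

def run_ocr(endpoint, key, image_url, language="unk"):
    # "unk" lets the service detect the language (relevant for the
    # German-vs-English question above); detectOrientation handles
    # rotated images.
    url = f"{endpoint}/vision/v2.0/ocr?language={language}&detectOrientation=true"
    request = urllib.request.Request(
        url,
        data=json.dumps({"url": image_url}).encode("utf-8"),
        headers={"Content-Type": "application/json",
                 "Ocp-Apim-Subscription-Key": key},
        method="POST",
    )
    with urllib.request.urlopen(request) as response:
        return json.load(response)

# Usage (needs a real endpoint and key):
# print(extract_text(run_ocr("https://westeurope.api.cognitive.microsoft.com",
#                            "<key>", "<image-url>")))
```

Hand-written text goes through the asynchronous Recognize Text / Read operation instead, which returns an `Operation-Location` header you poll for the result.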
🚩 Goal: Detect beer glasses in images
- Use Custom Vision to detect beer glasses in images - Image Dataset for training and testing
❓ Questions:
- What could we do to increase the detection performance?
- What happens if the beer glasses are really small in the image?
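The Custom Vision prediction response carries a probability per detection, and thresholding it is one way to deal with small or uncertain hits. A minimal helper, sketched against the documented response shape (the default threshold is an assumption):

```python
def filter_detections(prediction_json, threshold=0.5):
    # Each prediction carries a tagName, a probability and (for object
    # detection) a boundingBox; keep only sufficiently confident hits.
    return [
        (p["tagName"], round(p["probability"], 2))
        for p in prediction_json.get("predictions", [])
        if p["probability"] >= threshold
    ]
```

Small objects tend to come back with low probabilities, so tuning this threshold is part of answering the questions above.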
🚩 Goal: Leverage Speech-to-Text and Text-to-Speech
In the language of your choice (Python solution is provided), write two small scripts or apps that
- Convert written text into speech (German or English)
- Convert speech into written text (German or English)
You can use this file: `data/test.wav` (English).
❓ Questions:
- What happens if you transcribe a long audio file with the speech-to-text API (>15s)?
- What happens if you select the wrong language in the text-to-speech API? How could you solve this problem?
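A sketch of the speech-to-text direction using the `azure-cognitiveservices-speech` SDK; the key and region are placeholders, and `locale_for` is a small helper invented here to map the two challenge languages to service locales:

```python
def locale_for(language):
    # The Speech service expects BCP-47 locales, not plain language names.
    return {"German": "de-DE", "English": "en-US"}[language]

def transcribe(wav_path, key, region, language="English"):
    # pip install azure-cognitiveservices-speech
    import azure.cognitiveservices.speech as speechsdk
    config = speechsdk.SpeechConfig(subscription=key, region=region)
    config.speech_recognition_language = locale_for(language)
    audio = speechsdk.AudioConfig(filename=wav_path)
    recognizer = speechsdk.SpeechRecognizer(speech_config=config, audio_config=audio)
    # recognize_once() stops after the first utterance (roughly 15 seconds),
    # which is the limitation the first question above hints at; longer
    # audio needs continuous recognition instead.
    return recognizer.recognize_once().text

# Usage (needs a real key):
# print(transcribe("data/test.wav", "<key>", "westeurope"))
```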
Now that we have converted a user's speech input into text, we'll try to determine the intent of that text in the next challenge.
🚩 Goal: Make your application understand the meaning of text
In the language of your choice (Python solution is provided), write two small scripts or apps that
- Translate the input text into German (using the Text Translator API)
- Detect the intent and entities of the text (German) - see examples below (using https://eu.luis.ai)
Let's use an example where we want to detect a Pizza order from the user. We also want to detect if the user wants to cancel an order.
LUIS example data:
2 Intents: "CreateOrder", "CancelOrder"
Utterances:
(CreateOrder) Ich moechte eine Pizza Salami bestellen
(CreateOrder) Vier Pizza Hawaii bitte
(CancelOrder) Bitte Bestellung 123 stornieren
(CancelOrder) Cancel bitte Bestellung 42
(CancelOrder) Ich will Order 933 nicht mehr
(None) Wieviel Uhr ist heute?
(None) Wie ist das Wetter in Berlin?
(None) Bitte Termin fuer Montag einstellen
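Once the app is trained and published, querying the endpoint returns a `topScoringIntent`. A sketch of picking the intent, with a fallback to `None` below a confidence threshold (the threshold value is an assumption):

```python
def top_intent(luis_response, threshold=0.5):
    # The published LUIS endpoint returns the best intent plus a score;
    # low-confidence results are safer to treat as the None intent.
    scoring = luis_response.get("topScoringIntent", {})
    if scoring.get("score", 0.0) < threshold:
        return "None"
    return scoring.get("intent", "None")
```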
🚩 Goal: Leverage Text Analytics API for extracting language, sentiment, key phrases, and entities from text
In the language of your choice (Python solution is provided), write a small script that
- Extracts sentiment, key phrases and entities from unstructured text using the Text Analytics API
❓ Questions:
- What happens if we do not pass in the `language` parameter while getting the sentiment?
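The sentiment endpoint takes a batch of documents, each with an optional `language` field; a sketch of building that payload (the helper name is our own):

```python
def build_documents(texts, language=None):
    # Each document needs a unique id; language is optional, and omitting
    # it is exactly the experiment the question above asks about.
    documents = []
    for i, text in enumerate(texts, start=1):
        document = {"id": str(i), "text": text}
        if language:
            document["language"] = language
        documents.append(document)
    return {"documents": documents}
```

The same payload shape is POSTed to the `/sentiment`, `/keyPhrases`, and `/entities` operations of the Text Analytics API.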
❓ Questions (referring back to Challenge 04 - Language Understanding):
- Why do we need to fill the `None` intent with examples?
- What is the `Review endpoint utterances` feature in LUIS?
🚩 Goal: Write a script for auto-suggestion of text
- Leverage Bing Autosuggest to make predictions on how a user might want to continue a half-written sentence
❓ Questions:
- What other services does Bing Search offer?
- How does the service react in case of a denial-of-service (DoS) attack?
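The Autosuggest response groups its suggestions; a small helper, sketched against the documented JSON shape, flattens them into plain strings:

```python
def suggestions(autosuggest_json):
    # Bing Autosuggest nests results as suggestionGroups -> searchSuggestions.
    return [
        suggestion["displayText"]
        for group in autosuggest_json.get("suggestionGroups", [])
        for suggestion in group.get("searchSuggestions", [])
    ]
```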
@clemenssiebler, Cloud Solution Architect for Azure at Microsoft