Welcome!

In this repository you will find the materials and challenges for the "Azure AI - Bilder, Text und Sprache mit Azure AI verstehen und durchsuchbar machen" bootcamp.

The associated bootcamp teaches you how to deploy and use the following two Azure technologies: Azure Search and Azure Cognitive Services.

The bootcamp is organized as six independent challenges, each one containing a high-level description and a few hints in case you get stuck. We'll touch on the following services:

Service | Where?
--- | ---
Azure Search + Cognitive Search | Challenge 01
Azure Cognitive Services - Computer Vision API and Custom Vision Service | Challenge 02
Azure Cognitive Services - Speech Services | Challenge 03
Azure Cognitive Services - Language Understanding | Challenge 04
Azure Cognitive Services - Text Analytics API | Challenge 05
Azure Cognitive Services - Search API | Challenge 06

Challenges

You can solve these challenges in a programming language of your choice (some even in curl 🔨). For the sake of convenience, we are providing hints in Python, which you can easily (and for free) run in Azure Notebooks. SDK support for C# or .NET Core is available for most challenges; Azure Search in particular features an easy-to-use .NET SDK. You can find code examples in the Azure documentation for the associated services.

Challenge 1 (Azure Search & Cognitive Search)

🚩 Goal: Deploy an Azure Search instance and index a PDF-based data set

  1. Deploy an Azure Search instance
  2. Index the unstructured PDF data set from here - which document contains the term Content Moderator?
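
If you get stuck with the REST calls, the sketch below shows one possible shape of the solution. The service name, admin key, connection string, container, resource names, and API version are placeholders/assumptions; adapt them to your own deployment.

```python
# A sketch, not the official solution: create a blob data source, an index and an
# indexer via the Azure Search REST API, then search the indexed PDFs for a term.
import requests

service, api_key, api_version = "<your-search-service>", "<admin-key>", "2019-05-06"
base = f"https://{service}.search.windows.net"
headers = {"Content-Type": "application/json", "api-key": api_key}

# Data source pointing at the blob container that holds the PDF data set
requests.post(f"{base}/datasources?api-version={api_version}", headers=headers, json={
    "name": "pdf-datasource",
    "type": "azureblob",
    "credentials": {"connectionString": "<blob-connection-string>"},
    "container": {"name": "<container-with-pdfs>"}
}).raise_for_status()

# Index with a searchable content field (the blob indexer writes extracted text to 'content')
requests.post(f"{base}/indexes?api-version={api_version}", headers=headers, json={
    "name": "pdf-index",
    "fields": [
        {"name": "id", "type": "Edm.String", "key": True},
        {"name": "content", "type": "Edm.String", "searchable": True},
        {"name": "metadata_storage_name", "type": "Edm.String", "searchable": True}
    ]
}).raise_for_status()

# Indexer connecting the data source to the index; the storage path is base64-encoded
# so it can be used as the document key
requests.post(f"{base}/indexers?api-version={api_version}", headers=headers, json={
    "name": "pdf-indexer",
    "dataSourceName": "pdf-datasource",
    "targetIndexName": "pdf-index",
    "fieldMappings": [{
        "sourceFieldName": "metadata_storage_path",
        "targetFieldName": "id",
        "mappingFunction": {"name": "base64Encode"}
    }]
}).raise_for_status()

# Once the indexer has finished, search for the term from the challenge
r = requests.get(f"{base}/indexes/pdf-index/docs", headers=headers,
                 params={"api-version": api_version, "search": '"Content Moderator"'})
for doc in r.json()["value"]:
    print(doc["metadata_storage_name"])
```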

Questions:

  1. What is an Index? What is an Indexer? What is a Data Source? How do they relate to each other?
  2. Why would you want to use replicas? Why would you want more partitions?
  3. How would you index JSON documents stored in Azure Blob storage?

🚩 Goal: Index an unstructured data set with Cognitive Search

  1. Add another index to the Azure Search instance, but this time enable Cognitive Search
  2. Index an existing data set coming from Azure Blob (data set can be downloaded here) - which document contains the term Pin to Dashboard?
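
For the Cognitive Search part, one possible approach is to define a skillset and reference it from a new indexer, roughly as sketched below. The chosen skills (entity recognition and key phrase extraction), all names, and the API version are assumptions, and the target index is assumed to already contain matching collection fields.

```python
# A sketch, assuming an enriched index with 'organizations' and 'keyPhrases' fields exists:
# define a skillset and attach it to an indexer so enrichment runs during indexing.
import requests

service, api_key, api_version = "<your-search-service>", "<admin-key>", "2019-05-06"
base = f"https://{service}.search.windows.net"
headers = {"Content-Type": "application/json", "api-key": api_key}

# Skillset: run entity recognition and key phrase extraction over the extracted text
requests.put(f"{base}/skillsets/demo-skillset?api-version={api_version}", headers=headers, json={
    "description": "Extract organizations and key phrases from blob content",
    "skills": [
        {
            "@odata.type": "#Microsoft.Skills.Text.EntityRecognitionSkill",
            "context": "/document",
            "inputs": [{"name": "text", "source": "/document/content"}],
            "outputs": [{"name": "organizations", "targetName": "organizations"}]
        },
        {
            "@odata.type": "#Microsoft.Skills.Text.KeyPhraseExtractionSkill",
            "context": "/document",
            "inputs": [{"name": "text", "source": "/document/content"}],
            "outputs": [{"name": "keyPhrases", "targetName": "keyPhrases"}]
        }
    ]
}).raise_for_status()

# Indexer referencing the skillset; outputFieldMappings move the enriched values into the index
requests.post(f"{base}/indexers?api-version={api_version}", headers=headers, json={
    "name": "enriched-indexer",
    "dataSourceName": "<your-blob-datasource>",
    "targetIndexName": "<your-enriched-index>",
    "skillsetName": "demo-skillset",
    "outputFieldMappings": [
        {"sourceFieldName": "/document/organizations", "targetFieldName": "organizations"},
        {"sourceFieldName": "/document/keyPhrases", "targetFieldName": "keyPhrases"}
    ]
}).raise_for_status()
```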

Questions:

  1. Let's assume we've built a Machine Learning model that can detect beer glasses in images (we'll do that in the next challenge) - how could we leverage this model directly in Azure Search for tagging our data?

🙈 Hints for challenge 1

Challenge 2 (Azure Cognitive Services - Vision & Custom Vision)

🚩 Goal: Leverage OCR to make hand-written or printed text in images machine-readable

In the language of your choice (Python solution is provided), write two small scripts that

  1. Convert hand-written text from an image into text - Test data: 1, 2
  2. Convert printed text from an image into text - Test data: 1, 2
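
A minimal sketch for the two scripts, assuming a Computer Vision resource in West Europe; the key and image URLs are placeholders. Printed text can use the synchronous OCR endpoint, while hand-written text uses the asynchronous Recognize Text endpoint, which has to be polled for the result.

```python
# A sketch: OCR for printed text (synchronous) and Recognize Text for hand-written
# text (asynchronous, poll the Operation-Location header until the result is ready).
import time
import requests

key = "<computer-vision-key>"
endpoint = "https://westeurope.api.cognitive.microsoft.com/vision/v2.0"
headers = {"Ocp-Apim-Subscription-Key": key, "Content-Type": "application/json"}

# 1) Printed text: synchronous OCR call
r = requests.post(f"{endpoint}/ocr", headers=headers,
                  params={"language": "unk", "detectOrientation": "true"},
                  json={"url": "<url-to-printed-text-image>"})
for region in r.json()["regions"]:
    for line in region["lines"]:
        print(" ".join(word["text"] for word in line["words"]))

# 2) Hand-written text: asynchronous call, then poll for the recognition result
r = requests.post(f"{endpoint}/recognizeText", headers=headers,
                  params={"mode": "Handwritten"},
                  json={"url": "<url-to-handwritten-text-image>"})
operation_url = r.headers["Operation-Location"]
result = {"status": "Running"}
while result["status"] in ("NotStarted", "Running"):
    time.sleep(1)
    result = requests.get(operation_url, headers=headers).json()
for line in result["recognitionResult"]["lines"]:
    print(line["text"])
```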

Questions:

  1. How well does the OCR service work with German text? How well with English?
  2. What happens when the image is not oriented correctly?

🚩 Goal: Detect beer glasses in images

  1. Use Custom Vision to detect beer glasses in images - Image Dataset for training and testing
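
Once the model is trained and published in the Custom Vision portal, its prediction endpoint can be called as sketched below. The project id, published iteration name, region, key, and file name are placeholders; the exact prediction URL is shown in the portal's Prediction URL dialog.

```python
# A sketch: send a local test image to the published Custom Vision object detection model.
import requests

prediction_key = "<prediction-key>"
prediction_url = ("https://westeurope.api.cognitive.microsoft.com/customvision/v3.0/"
                  "Prediction/<project-id>/detect/iterations/<published-name>/image")

with open("beer_glass_test.jpg", "rb") as f:
    r = requests.post(prediction_url,
                      headers={"Prediction-Key": prediction_key,
                               "Content-Type": "application/octet-stream"},
                      data=f.read())

# Each prediction carries a tag, a probability and a bounding box (relative coordinates)
for p in r.json()["predictions"]:
    if p["probability"] > 0.5:
        print(p["tagName"], round(p["probability"], 2), p["boundingBox"])
```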

Questions:

  1. What could we do to increase the detection performance?
  2. What happens if the beer glasses are really small in the image?

🙈 Hints for challenge 2

Challenge 3 (Azure Cognitive Services - Speech)

🚩 Goal: Leverage Speech-to-Text and Text-to-Speech

In the language of your choice (Python solution is provided), write two small scripts or apps that

  1. Convert written text into speech (German or English)
  2. Convert speech into written text (German or English)

You can use this file: data/test.wav (English).
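
A minimal sketch using the Speech SDK for Python (`pip install azure-cognitiveservices-speech`); the key and region are placeholders, and the wav file is the one referenced above.

```python
# A sketch: text-to-speech to the default speaker and speech-to-text from a wav file.
import azure.cognitiveservices.speech as speechsdk

speech_config = speechsdk.SpeechConfig(subscription="<speech-key>", region="westeurope")

# 1) Text-to-speech: synthesize a German sentence to the default speaker
speech_config.speech_synthesis_language = "de-DE"
synthesizer = speechsdk.SpeechSynthesizer(speech_config=speech_config)
synthesizer.speak_text_async("Willkommen beim Azure AI Bootcamp!").get()

# 2) Speech-to-text: transcribe the provided English wav file
speech_config.speech_recognition_language = "en-US"
audio_config = speechsdk.audio.AudioConfig(filename="data/test.wav")
recognizer = speechsdk.SpeechRecognizer(speech_config=speech_config, audio_config=audio_config)
result = recognizer.recognize_once()  # single-shot: recognizes one utterance (~15 seconds max)
print(result.text)
```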

Questions:

  1. What happens if you transcribe a long audio file with the speech-to-text API (>15s)?
  2. What happens if you select the wrong language in the text-to-speech API? How could you solve this problem?

Now that we have converted a user's speech input into text, we'll try to determine the intent of that text in the next challenge.

🙈 Hints for challenge 3

Challenge 4 (Azure Cognitive Services - Language)

🚩 Goal: Make your application understand the meaning of text

In the language of your choice (Python solution is provided), write two small scripts or apps that

  1. Translate the input text into German (using the Translator Text API)
  2. Detect the intent and entities of the text (German) - see examples below (using https://eu.luis.ai)
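
For the translation step, a minimal sketch against the Translator Text API v3 could look as follows (the key and region are placeholders).

```python
# A sketch: translate an English sentence into German with the Translator Text API v3.
import requests

key = "<translator-key>"
url = "https://api.cognitive.microsofttranslator.com/translate"
headers = {"Ocp-Apim-Subscription-Key": key,
           "Ocp-Apim-Subscription-Region": "westeurope",
           "Content-Type": "application/json"}

body = [{"Text": "I would like to order four pizzas, please."}]
r = requests.post(url, params={"api-version": "3.0", "to": "de"}, headers=headers, json=body)
print(r.json()[0]["translations"][0]["text"])  # German translation of the input text
```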

Let's use an example where we want to detect a Pizza order from the user. We also want to detect if the user wants to cancel an order.

LUIS example data:

2 Intents: "CreateOrder", "CancelOrder"

Utterances:

(CreateOrder) Ich moechte eine Pizza Salami bestellen 
(CreateOrder) Vier Pizza Hawaii bitte 

(CancelOrder) Bitte Bestellung 123 stornieren
(CancelOrder) Cancel bitte Bestellung 42
(CancelOrder) Ich will Order 933 nicht mehr

(None) Wieviel Uhr ist heute?
(None) Wie ist das Wetter in Berlin?
(None) Bitte Termin fuer Montag einstellen
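
Once the LUIS app is trained and published, its prediction endpoint can be queried as sketched below; the app id and endpoint key are placeholders, and since the app is authored on https://eu.luis.ai the runtime is assumed to live in West Europe.

```python
# A sketch: send an utterance to the published LUIS app and print intent and entities.
import requests

app_id, key = "<luis-app-id>", "<luis-endpoint-key>"
url = f"https://westeurope.api.cognitive.microsoft.com/luis/v2.0/apps/{app_id}"

r = requests.get(url, params={"subscription-key": key, "q": "Vier Pizza Hawaii bitte"})
result = r.json()
print(result["topScoringIntent"])  # e.g. {'intent': 'CreateOrder', 'score': ...}
print(result["entities"])
```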

Questions:

  1. Why do we need to fill the None intent with examples?
  2. What is the Review endpoint utterances feature in LUIS?

🙈 Hints for challenge 4

Challenge 5 (Azure Cognitive Services - Text Analytics)

🚩 Goal: Leverage the Text Analytics API for extracting language, sentiment, key phrases, and entities from text

In the language of your choice (Python solution is provided), write a small script that

  1. Extracts sentiment, key phrases and entities from unstructured text using the Text Analytics API

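A minimal sketch for the script above against the Text Analytics v2.1 REST endpoints; the key, region, and sample sentence are placeholders.

```python
# A sketch: call the sentiment, key phrase and entity endpoints for one document.
import requests

key = "<text-analytics-key>"
endpoint = "https://westeurope.api.cognitive.microsoft.com/text/analytics/v2.1"
headers = {"Ocp-Apim-Subscription-Key": key}

documents = {"documents": [
    {"id": "1", "language": "en",
     "text": "The bootcamp in Munich was great - the Azure services were easy to use!"}
]}

sentiment = requests.post(f"{endpoint}/sentiment", headers=headers, json=documents).json()
key_phrases = requests.post(f"{endpoint}/keyPhrases", headers=headers, json=documents).json()
entities = requests.post(f"{endpoint}/entities", headers=headers, json=documents).json()

print(sentiment["documents"][0]["score"])        # 0..1, higher = more positive
print(key_phrases["documents"][0]["keyPhrases"])
print(entities["documents"][0]["entities"])
```
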
Questions:

  1. What happens if we do not pass in the language parameter while getting the sentiment?

🙈 Hints for challenge 5

Challenge 6 (Azure Cognitive Services - Search)

🚩 Goal: Write a script for auto-suggestion of text

  1. Leverage Bing Autosuggest to predict how a user might want to continue a half-written sentence
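
A minimal sketch against the Bing Autosuggest v7 endpoint (the key and query are placeholders); it prints the suggested completions for a half-written query.

```python
# A sketch: request autosuggestions for a partial query and print them.
import requests

key = "<bing-autosuggest-key>"
url = "https://api.cognitive.microsoft.com/bing/v7.0/Suggestions"

r = requests.get(url, headers={"Ocp-Apim-Subscription-Key": key},
                 params={"q": "how to deploy azure se"})
for group in r.json()["suggestionGroups"]:
    for suggestion in group["searchSuggestions"]:
        print(suggestion["displayText"])
```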

Questions:

  1. What other services does Bing Search offer?
  2. How does the service react in case of a denial-of-service (DoS) attack?

🙈 Hints for challenge 6

Authors

@clemenssiebler, Cloud Solution Architect for Azure at Microsoft