/IA-ImageToText

[🎯Challenge] DIO's Artificial Intelligence Challenge, with a presentation on how AI works to create images and recognize text from those images.

Primary Language: C#

📸 Converting Image to Text ✍🏻

Cover image of Douglas's project for DIO

IDE: Visual Studio Code

Author: Douglas Yugo | Language: C# | Framework: .NET

📌Description

The project was developed as part of the conclusion of the DIO Bootcamp, with a focus on exploring Generative Artificial Intelligence resources, such as OpenAI and GitHub Copilot.

The main aim of the project was to recognize text in images stored in the "Inputs" folder. To do this, I created an API using C# and .NET that extracts the text contained in each image and displays the result on the console. To read and extract the text from the images, I used Tesseract, an open-source OCR tool.
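
As a rough illustration of the flow described above (not necessarily how the code in ImageRecognition is written), the sketch below loops over the images in a folder, runs the `tesseract` command-line tool on each one, and prints the recognized text to the console. It assumes that `tesseract` is installed and available on the PATH, that the folder is named `Inputs`, and that the two languages are English and Portuguese (`eng+por`); the helper names `BuildArguments`, `IsImage`, and `RecognizeText` are hypothetical.

```csharp
using System;
using System.Diagnostics;
using System.IO;
using System.Linq;

// Assumed values: folder name taken from this README, language pair is a guess.
const string inputDir = "Inputs";
const string languages = "eng+por";

// Builds the CLI arguments; using "stdout" as the output base makes
// Tesseract print the recognized text instead of writing a file.
static string BuildArguments(string imagePath, string langs) =>
    $"\"{imagePath}\" stdout -l {langs}";

// Keeps only files with common image extensions (hypothetical filter).
static bool IsImage(string path) =>
    new[] { ".png", ".jpg", ".jpeg", ".tif", ".tiff", ".bmp" }
        .Contains(Path.GetExtension(path).ToLowerInvariant());

// Runs Tesseract on one image and returns whatever it printed.
static string RecognizeText(string imagePath, string langs)
{
    var startInfo = new ProcessStartInfo("tesseract", BuildArguments(imagePath, langs))
    {
        RedirectStandardOutput = true,
        UseShellExecute = false,
    };
    using var process = Process.Start(startInfo)
        ?? throw new InvalidOperationException("Could not start tesseract.");
    string text = process.StandardOutput.ReadToEnd();
    process.WaitForExit();
    return text.Trim();
}

if (!Directory.Exists(inputDir))
{
    Console.WriteLine($"Folder '{inputDir}' not found; nothing to process.");
}
else
{
    foreach (string image in Directory.EnumerateFiles(inputDir).Where(IsImage))
    {
        Console.WriteLine($"--- {Path.GetFileName(image)} ---");
        Console.WriteLine(RecognizeText(image, languages));
    }
}
```

Calling the command-line tool keeps the sketch free of extra packages. A .NET wrapper library for Tesseract would be an equally reasonable choice.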


🤖Technologies

  • Copilot : Creating images for the presentation.
  • C# and .NET : I used these technologies to create the API with Tesseract.
  • Tesseract : To recognize text in images.

⚠Warning

    To use Tesseract, you'll need to install it on your machine. The Tesseract entry under Technologies links to its documentation, including the installation instructions.

    You also need to install the traineddata files, which teach Tesseract the languages it may find in the images. Each traineddata file covers one language.

    The images you run Tesseract on should preferably be in black and white. Try to follow the example images in the Inputs folder.

    The API may not recognize the text perfectly, depending on the fonts used in the image.
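
A small check like the one below can catch a missing language file before any image is processed. It is only a sketch: it assumes the languages are `eng` and `por`, and that the data files live in the folder named by the `TESSDATA_PREFIX` environment variable or, failing that, in a local `tessdata` folder. Where the files actually live depends on your Tesseract version and installation. The helper name `MissingLanguages` is hypothetical.

```csharp
using System;
using System.IO;
using System.Linq;

// Returns the language codes whose "<code>.traineddata" file is missing from tessdataDir.
static string[] MissingLanguages(string tessdataDir, string languages) =>
    languages.Split('+', StringSplitOptions.RemoveEmptyEntries)
        .Where(code => !File.Exists(Path.Combine(tessdataDir, code + ".traineddata")))
        .ToArray();

// Assumed location and language pair; adjust both to your installation.
string tessdataDir = Environment.GetEnvironmentVariable("TESSDATA_PREFIX") ?? "tessdata";
string[] missing = MissingLanguages(tessdataDir, "eng+por");

Console.WriteLine(missing.Length == 0
    ? "All language files found."
    : $"Missing in '{tessdataDir}': {string.Join(", ", missing)}");
```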


    📁Folders

    • Inputs : Folder where the images are stored.
    • Outputs : Folder where the Tesseract results are stored.
    • ImageRecognition : Folder containing the API that recognizes text in images.

    🧐Creation Process

    1. First, I looked for images and chose a superhero theme, such as Marvel and DC.
    2. I explored Azure and created the API in C# and .NET.
    3. I challenged myself to find an open-source tool and found Tesseract, then studied its documentation to learn how to use it.
    4. I had to edit the images and make them black and white with high contrast so that Tesseract could read them.
    5. Finally, I finished the API by improving how it uses Tesseract and by configuring two languages to reduce reading errors.
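
To make "black and white with high contrast" in step 4 concrete, here is a minimal sketch of a fixed-threshold conversion on raw pixel values. The images in this project were edited by hand, so this is an illustration of the idea rather than the method used here. The threshold of 128, the sample pixels, and the helper names `ToGray` and `Binarize` are assumptions made for the example.

```csharp
using System;
using System.Linq;

// Converts an RGB pixel to a gray value in 0..255 using the common BT.601 weights.
static byte ToGray(byte r, byte g, byte b) =>
    (byte)Math.Round(0.299 * r + 0.587 * g + 0.114 * b);

// Maps every gray value to pure black (0) or pure white (255).
static byte[] Binarize(byte[] gray, byte threshold) =>
    gray.Select(v => v >= threshold ? (byte)255 : (byte)0).ToArray();

// A tiny hypothetical 1x4 "image": dark text pixels on a light background.
byte[] row =
{
    ToGray(20, 20, 20),
    ToGray(200, 210, 190),
    ToGray(90, 80, 100),
    ToGray(250, 250, 250),
};
byte[] blackAndWhite = Binarize(row, 128);

Console.WriteLine(string.Join(" ", blackAndWhite));
```

For the sample row this prints `0 255 0 255`: the two dark pixels become black and the two light ones become white. That sharp separation between text and background is what helps an OCR engine.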

    🚀Results

    The results met expectations, correctly returning the text contained in the images. Below is a sample of the results obtained, along with a link to a short presentation explaining the project and the development of the DIO challenge.

    Gif with a system presentation

    💭Reflections

    The DIO challenge taught me a lot about using Copilot, especially about writing prompts that returned useful tips on Tesseract and helped me work through the problems I ran into. I also learned how to use some Azure tools, including integration with API_KEYS, which helped improve my C# and .NET skills. Finally, I was introduced to Tesseract, an open-source tool that I found extremely useful and impressive.