Open-Source-Models-with-Hugging-Face

Find open-source models on the Hugging Face Hub and use them for text, audio, image, and multimodal tasks with the Hugging Face Transformers library.

Overview

Use the Transformers library to turn a small language model into a chatbot that holds multi-turn conversations and answers follow-up questions.
Translate between languages, summarize documents, and measure the similarity between two pieces of text, which can be used for search and retrieval.
Convert audio to text with Automatic Speech Recognition (ASR), and convert text to audio using text-to-speech (TTS).
Perform zero-shot audio classification, which assigns labels to audio clips without fine-tuning the model.
Generate an audio narration describing an image by combining object detection and text-to-speech models.
Select objects or regions in an image by prompting a zero-shot image segmentation model with points that mark the object you want.
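
To make the text-similarity item above more concrete, here is a minimal sketch of how similarity scores can drive search and retrieval. It assumes each text has already been turned into an embedding vector by some model from the Hub, and it uses cosine similarity, a common choice for comparing embeddings. The vectors, the document ids, and the helper names `cosine_similarity` and `rank_by_similarity` are all hypothetical and exist only for illustration; they are not taken from this repository.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        # A zero vector has no direction; treat it as unrelated.
        return 0.0
    return dot / (norm_a * norm_b)


def rank_by_similarity(query_vec, doc_vecs):
    """Return (doc_id, score) pairs sorted from most to least similar."""
    scored = [
        (doc_id, cosine_similarity(query_vec, vec))
        for doc_id, vec in doc_vecs.items()
    ]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


# Placeholder embeddings (assumption): in practice these would come from
# an embedding model, not be written by hand.
query = [1.0, 0.0, 1.0]
documents = {
    "doc_a": [1.0, 0.0, 1.0],  # points in the same direction as the query
    "doc_b": [1.0, 1.0, 0.0],  # partially overlaps with the query
    "doc_c": [0.0, 1.0, 0.0],  # orthogonal to the query
}

ranking = rank_by_similarity(query, documents)
for doc_id, score in ranking:
    print(f"{doc_id}: {score:.3f}")
```

Retrieval then amounts to embedding the query once, scoring it against the stored document embeddings, and returning the top-ranked entries.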