Source models from the Hugging Face Hub to perform text, audio, image, and multimodal tasks using the Hugging Face Transformers library.
-
Use the Transformers library to turn a small language model into a chatbot that can hold multi-turn conversations and answer follow-up questions.
-
Translate between languages, summarize documents, and measure the similarity between two pieces of text, which can be used for search and retrieval.
-
Convert audio to text with Automatic Speech Recognition (ASR), and convert text to audio using text-to-speech (TTS).
-
Perform zero-shot audio classification to label audio clips without fine-tuning the model.
-
Generate an audio narration describing an image by combining object detection and text-to-speech models.
-
Segment objects or regions in an image by prompting a zero-shot image segmentation model with points that mark the object you want to select.
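
As a rough illustration of the text-similarity objective listed above, the sketch below ranks documents against a query by cosine similarity, a common way to compare embedding vectors for search and retrieval. It is a minimal sketch, not the course's own code: the three-dimensional vectors are made-up stand-ins for the embeddings a sentence embedding model would produce, and the names `cosine_similarity`, `rank_by_similarity`, `docs`, and `query` are hypothetical.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors (1.0 = same direction)."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    # A zero vector has no direction; treat it as dissimilar to everything.
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def rank_by_similarity(query_vec, doc_vecs):
    """Return (doc_id, score) pairs sorted from most to least similar to the query."""
    scored = [
        (doc_id, cosine_similarity(query_vec, vec))
        for doc_id, vec in doc_vecs.items()
    ]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


# Toy "embeddings", invented for illustration only (assumption: a real
# embedding model would return vectors with hundreds of dimensions).
docs = {
    "cats": [0.9, 0.1, 0.0],
    "dogs": [0.7, 0.3, 0.1],
    "taxes": [0.0, 0.2, 0.9],
}
query = [1.0, 0.0, 0.0]

ranking = rank_by_similarity(query, docs)
for doc_id, score in ranking:
    print(f"{doc_id}: {score:.3f}")
```

In a retrieval setting the same ranking step would run over embeddings of real passages, with the top-scoring passages returned as search results.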