Pinned Repositories
ast
Code for the Interspeech 2021 paper "AST: Audio Spectrogram Transformer".
audiocraft
Audiocraft is a library for audio processing and generation with deep learning. It features the state-of-the-art EnCodec audio compressor / tokenizer, along with MusicGen, a simple and controllable music generation LM with textual and melodic conditioning.
core
Production-ready AI assistant framework
DeepSpeech-Italian-Model
Tooling for producing an Italian model for DeepSpeech (public release available) and a text corpus
GLiNER
Generalist and Lightweight Model for Named Entity Recognition (Extract any entity types from texts) @ NAACL 24
gqa-it
Italian Question Answering on Image Scene Graphs
mamba
MyNN1
OCRmyImage
OmniFusion
OmniFusion — a multimodal model to communicate using text and images
piperino11's Repositories
piperino11/ast
Code for the Interspeech 2021 paper "AST: Audio Spectrogram Transformer".
piperino11/audiocraft
Audiocraft is a library for audio processing and generation with deep learning. It features the state-of-the-art EnCodec audio compressor / tokenizer, along with MusicGen, a simple and controllable music generation LM with textual and melodic conditioning.
piperino11/core
Production-ready AI assistant framework
piperino11/DeepSpeech-Italian-Model
Tooling for producing an Italian model for DeepSpeech (public release available) and a text corpus
piperino11/GLiNER
Generalist and Lightweight Model for Named Entity Recognition (Extract any entity types from texts) @ NAACL 24
piperino11/gqa-it
Italian Question Answering on Image Scene Graphs
piperino11/mamba
piperino11/MyNN1
piperino11/OCRmyImage
piperino11/OmniFusion
OmniFusion — a multimodal model to communicate using text and images
piperino11/parler-tts
Inference and training library for high-quality TTS models.
piperino11/squad-it
A large-scale dataset for Question Answering in Italian
piperino11/video-caption.pytorch
piperino11/skynet
AI core services for Jitsi
piperino11/u-deppllama
Dependency parsing with Large Language Models
piperino11/VAR
[GPT beats diffusion🔥] [scaling laws in visual generation📈] Official impl. of "Visual Autoregressive Modeling: Scalable Image Generation via Next-Scale Prediction"
piperino11/VoiceCraft
Zero-Shot Speech Editing and Text-to-Speech in the Wild