Awj2021's Stars
coqui-ai/TTS
🐸💬 - a deep learning toolkit for Text-to-Speech, battle-tested in research and production
talkingnow/HM-Conformer
grip-unina/poi-forensics
POI-Forensics
k2-fsa/icefall
PairCustomization/PairCustomization
wangkai930418/DPL
Dynamic Prompt Learning: Addressing Cross-Attention Leakage for Text-Based Image Editing (NeurIPS 2023)
bbenligiray/nus_wide_formatter
A tool to download and format the NUS-WIDE dataset for multi-label classification
sivannavis/samo
SAMO: Speaker Attractor Multi-Center One-Class Learning for Voice Anti-Spoofing
Purdue-M2/Robust_DM_Generated_Image_Detection
52CV/CVPR-2024-Papers
qiuqiangkong/audioset_tagging_cnn
KindXiaoming/pykan
Kolmogorov-Arnold Networks
ljwztc/CLIP-Driven-Universal-Model
[ICCV 2023] CLIP-Driven Universal Model; ranked first in the MSD competition.
facebookresearch/dino
PyTorch code for training Vision Transformers with the self-supervised learning method DINO
lucidrains/ema-pytorch
A simple way to keep track of an Exponential Moving Average (EMA) version of your PyTorch model
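The technique this library implements is just the standard EMA update, s = β·s + (1−β)·p, applied to every model weight. A minimal sketch in plain Python (illustrative names, not the library's actual API; the real package operates on torch tensors):

```python
# Sketch of the EMA update behind an "EMA copy" of model weights.
# `shadow` is the averaged copy; `params` are the live weights.
def ema_update(shadow, params, beta=0.999):
    """Blend shadow weights toward current params: s = beta*s + (1-beta)*p."""
    return [beta * s + (1.0 - beta) * p for s, p in zip(shadow, params)]

shadow = [0.0, 0.0]           # EMA copy, initialized to zeros for the demo
for step in range(3):
    params = [1.0, 2.0]       # pretend these are the weights after a train step
    shadow = ema_update(shadow, params, beta=0.5)

print(shadow)  # the shadow weights converge toward params as steps accumulate
```

With a large β (e.g. 0.999) the shadow copy changes slowly, which is why EMA weights are often used for evaluation and sampling in diffusion models.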
Awj2021/psla
Code for the TASLP paper "PSLA: Improving Audio Tagging With Pretraining, Sampling, Labeling, and Aggregation".
YuanGongND/psla
Code for the TASLP paper "PSLA: Improving Audio Tagging With Pretraining, Sampling, Labeling, and Aggregation".
DeepMicroscopy/QuiltCleaner
Automatic cleaning of the QUILT-1M pathology dataset
okaris/omni-zero
A diffusers pipeline for zero-shot stylised portrait creation
CreamyLong/stable-diffusion
Speechless at the original stable-diffusion
PathologyFoundation/plip
Pathology Language and Image Pre-Training (PLIP) is the first vision-and-language foundation model for pathology AI (Nature Medicine). PLIP is a large-scale pre-trained model that can be used to extract visual and language features from pathology images and text descriptions. The model is a fine-tuned version of the original CLIP model.
cccntu/fine-tune-models
openai/consistencydecoder
Consistency Distilled Diff VAE
whlzy/FiT
[ICML 2024 Spotlight] FiT: Flexible Vision Transformer for Diffusion Model
tencent-ailab/IP-Adapter
An image prompt adapter that enables a pretrained text-to-image diffusion model to generate images from an image prompt.
GuyTevet/motion-diffusion-model
The official PyTorch implementation of the paper "Human Motion Diffusion Model"
CompVis/taming-transformers
Taming Transformers for High-Resolution Image Synthesis
mlmed/torchxrayvision
TorchXRayVision: A library of chest X-ray datasets and models. Classifiers, segmentation, and autoencoders.
haofanwang/ControlNet-for-Diffusers
Use ControlNet with any base model in diffusers 🔥
CompVis/stable-diffusion
A latent text-to-image diffusion model