Awj2021's Stars
coqui-ai/TTS
🐸💬 - a deep learning toolkit for Text-to-Speech, battle-tested in research and production
talkingnow/HM-Conformer
grip-unina/poi-forensics
POI-Forensics
k2-fsa/icefall
PairCustomization/PairCustomization
wangkai930418/DPL
Dynamic Prompt Learning: Addressing Cross-Attention Leakage for Text-Based Image Editing (NeurIPS 2023)
bbenligiray/nus_wide_formatter
A tool to download and format the NUS-WIDE dataset for multi-label classification
sivannavis/samo
SAMO: Speaker Attractor Multi-Center One-Class Learning for Voice Anti-Spoofing
Purdue-M2/Robust_DM_Generated_Image_Detection
52CV/CVPR-2024-Papers
qiuqiangkong/audioset_tagging_cnn
KindXiaoming/pykan
Kolmogorov-Arnold Networks
ljwztc/CLIP-Driven-Universal-Model
[ICCV 2023] CLIP-Driven Universal Model; ranked first in the MSD competition.
facebookresearch/dino
PyTorch code for training Vision Transformers with the self-supervised learning method DINO
lucidrains/ema-pytorch
A simple way to keep track of an Exponential Moving Average (EMA) version of your PyTorch model
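The technique this library implements is just the standard EMA update, s = β·s + (1−β)·p, applied to every model weight. A minimal sketch in plain Python (illustrative names, not the library's actual API; the real package operates on torch tensors):

```python
# Sketch of the EMA update behind an "EMA copy" of model weights.
# `shadow` is the averaged copy; `params` are the live weights.
def ema_update(shadow, params, beta=0.999):
    """Blend shadow weights toward current params: s = beta*s + (1-beta)*p."""
    return [beta * s + (1.0 - beta) * p for s, p in zip(shadow, params)]

shadow = [0.0, 0.0]           # EMA copy, initialized to zeros for the demo
for step in range(3):
    params = [1.0, 2.0]       # pretend these are the weights after a train step
    shadow = ema_update(shadow, params, beta=0.5)

print(shadow)  # the shadow weights converge toward params as steps accumulate
```

With a large β (e.g. 0.999) the shadow copy changes slowly, which is why EMA weights are often used for evaluation and sampling in diffusion models.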
Awj2021/psla
Code for the TASLP paper "PSLA: Improving Audio Tagging With Pretraining, Sampling, Labeling, and Aggregation".
YuanGongND/psla
Code for the TASLP paper "PSLA: Improving Audio Tagging With Pretraining, Sampling, Labeling, and Aggregation".
DeepMicroscopy/QuiltCleaner
Automatic cleaning of the QUILT-1M pathology dataset
okaris/omni-zero
A diffusers pipeline for zero-shot stylised portrait creation
CreamyLong/stable-diffusion
Speechless at the original stable-diffusion
PathologyFoundation/plip
Pathology Language and Image Pre-Training (PLIP) is the first vision-and-language foundation model for pathology AI (Nature Medicine). PLIP is a large-scale pre-trained model that can be used to extract visual and language features from pathology images and text descriptions. The model is a fine-tuned version of the original CLIP model.
cccntu/fine-tune-models
openai/consistencydecoder
Consistency Distilled Diff VAE
whlzy/FiT
[ICML 2024 Spotlight] FiT: Flexible Vision Transformer for Diffusion Model
tencent-ailab/IP-Adapter
An image prompt adapter that enables a pretrained text-to-image diffusion model to generate images from an image prompt.
GuyTevet/motion-diffusion-model
The official PyTorch implementation of the paper "Human Motion Diffusion Model"
CompVis/taming-transformers
Taming Transformers for High-Resolution Image Synthesis
mlmed/torchxrayvision
TorchXRayVision: A library of chest X-ray datasets and models. Classifiers, segmentation, and autoencoders.
haofanwang/ControlNet-for-Diffusers
Use ControlNet with any base model in diffusers 🔥
CompVis/stable-diffusion
A latent text-to-image diffusion model