Awj2021's Stars

suno-ai/bark
🔊 Text-Prompted Generative Audio Model
Language: Jupyter Notebook 36.1k 332 4444.3k
coqui-ai/TTS
🐸💬 - a deep learning toolkit for Text-to-Speech, battle-tested in research and production
Language: Python 35.5k 295 1.1k 4.3k
facebookresearch/fairseq
Facebook AI Research Sequence-to-Sequence Toolkit written in Python.
Language: Python 30.5k 425 4.2k 6.4k
guoyww/AnimateDiff
Official implementation of AnimateDiff.
Language: Python 10.6k 104 361872
facebookresearch/dinov2
PyTorch code and models for the DINOv2 self-supervised learning method.
Language: Jupyter Notebook 9.2k 95 406825
yl4579/StyleTTS2
StyleTTS 2: Towards Human-Level Text-to-Speech through Style Diffusion and Adversarial Training with Large Speech Language Models
Language: Python 5k 77 197417
myshell-ai/MeloTTS
High-quality multi-lingual text-to-speech library by MyShell.ai. Supports English, Spanish, French, Chinese, Japanese and Korean.
Language: Python 4.8k 40 184630
YangLing0818/Diffusion-Models-Papers-Survey-Taxonomy
Diffusion model papers, survey, and taxonomy
3k 53 8251
sfikas/medical-imaging-datasets
A list of Medical imaging datasets.
2.3k 61 5415
tencent-ailab/V-Express
V-Express aims to generate a talking head video under the control of a reference image, an audio, and a sequence of V-Kps images.
Language: Python 2.3k 39 51281
IDEA-Research/Grounding-DINO-1.5-API
API for Grounding DINO 1.5: IDEA Research's Most Capable Open-World Object Detection Model Series
Language: Python 779 12 4522
google-research/omniglue
Code release for CVPR'24 submission 'OmniGlue'
Language: Python 575 10 2650
sail-sg/MDT
Masked Diffusion Transformer is the SOTA for image synthesis. (ICCV 2023)
Language: Python 526 17 5238
Jeff-LiangF/streamv2v
Official PyTorch implementation of StreamV2V.
Language: Python 450 7 851
G-U-N/Phased-Consistency-Model
[NeurIPS 2024] Boosting the performance of consistency models with PCM!
Language: Python 364 19 2012
Francis-Rings/MotionFollower
MotionFollower: Editing Video Motion via Lightweight Score-Guided Diffusion
Language: Python 191 16 616
wusize/CLIPSelf
[ICLR2024 Spotlight] Code Release of CLIPSelf: Vision Transformer Distills Itself for Open-Vocabulary Dense Prediction
Language: Python 168 6 329
Francis-Rings/MotionEditor
[CVPR2024] MotionEditor is the first diffusion-based model capable of video motion editing.
Language: Python 137 5 77
haofanwang/T2I-Adapter-for-Diffusers
Transfer the T2I-Adapter with any basemodel in diffusers🔥
134 6 58
khawar-islam/diffuseMix
Official PyTorch implementation of DiffuseMix: Label-Preserving Data Augmentation with Diffusion Models (CVPR'2024)
Language: Python 91 1 127
ChenhongyiYang/PPAL
[CVPR 2024] Plug and Play Active Learning for Object Detection
Language: Python 85 4 2110
EasonXiao-888/GrootVL
[NeurIPS2024 Spotlight] The official implementation of GrootVL: Tree Topology is All You Need in State Space Model
Language: Python 83 3 112
abachaa/VQA-Med-2019
Visual Question Answering in the Medical Domain VQA-Med 2019
82 2 224
Curt-Park/yolo-world-with-efficientvit-sam
YOLO-World + EfficientViT SAM
Language: Python 76 2 19
10Ring/LAA-Net
The official implementation for LAA-Net: Localized Artifact Attention Network for Quality-Agnostic and Generalizable Deepfake Detection
Language: Python 64 2 159
roymiles/VkD
[CVPR 2024] VkD: Improving Knowledge Distillation using Orthogonal Projections
Language: Python 47 1 102
zhengli97/DM-KD
Official PyTorch Code for "Is Synthetic Data From Diffusion Models Ready for Knowledge Distillation?" (https://arxiv.org/abs/2305.12954)
Language: Python 45 6 32
apapiu/mamba_small_bench
Trying out the Mamba architecture on small examples (cifar-10, shakespeare char level etc.)
Language: Python 42 3 54
YtongXie/PairAug
[CVPR2024] PairAug: What Can Augmented Image-Text Pairs Do for Radiology?
Language: Python 27 1 01
ies-research/multi-annotator-machine-learning
Training with Data Annotated by Multiple Error-prone Annotators
Language: Python 41