Awj2021's Stars
suno-ai/bark
🔊 Text-Prompted Generative Audio Model
coqui-ai/TTS
🐸💬 - a deep learning toolkit for Text-to-Speech, battle-tested in research and production
facebookresearch/fairseq
Facebook AI Research Sequence-to-Sequence Toolkit written in Python.
guoyww/AnimateDiff
Official implementation of AnimateDiff.
facebookresearch/dinov2
PyTorch code and models for the DINOv2 self-supervised learning method.
yl4579/StyleTTS2
StyleTTS 2: Towards Human-Level Text-to-Speech through Style Diffusion and Adversarial Training with Large Speech Language Models
myshell-ai/MeloTTS
High-quality multi-lingual text-to-speech library by MyShell.ai. Supports English, Spanish, French, Chinese, Japanese and Korean.
YangLing0818/Diffusion-Models-Papers-Survey-Taxonomy
Diffusion model papers, survey, and taxonomy
sfikas/medical-imaging-datasets
A list of medical imaging datasets.
tencent-ailab/V-Express
V-Express aims to generate a talking head video under the control of a reference image, an audio, and a sequence of V-Kps images.
IDEA-Research/Grounding-DINO-1.5-API
API for Grounding DINO 1.5: IDEA Research's Most Capable Open-World Object Detection Model Series
google-research/omniglue
Code release for CVPR'24 submission 'OmniGlue'
sail-sg/MDT
Masked Diffusion Transformer is the SOTA for image synthesis. (ICCV 2023)
Jeff-LiangF/streamv2v
Official PyTorch implementation of StreamV2V.
G-U-N/Phased-Consistency-Model
[NeurIPS 2024] Boosting the performance of consistency models with PCM!
Francis-Rings/MotionFollower
MotionFollower: Editing Video Motion via Lightweight Score-Guided Diffusion
wusize/CLIPSelf
[ICLR2024 Spotlight] Code Release of CLIPSelf: Vision Transformer Distills Itself for Open-Vocabulary Dense Prediction
Francis-Rings/MotionEditor
[CVPR2024] MotionEditor is the first diffusion-based model capable of video motion editing.
haofanwang/T2I-Adapter-for-Diffusers
Transfer the T2I-Adapter with any base model in diffusers🔥
khawar-islam/diffuseMix
Official PyTorch implementation of DiffuseMix: Label-Preserving Data Augmentation with Diffusion Models (CVPR'2024)
ChenhongyiYang/PPAL
[CVPR 2024] Plug and Play Active Learning for Object Detection
EasonXiao-888/GrootVL
[NeurIPS2024 Spotlight] The official implementation of GrootVL: Tree Topology is All You Need in State Space Model
abachaa/VQA-Med-2019
Visual Question Answering in the Medical Domain VQA-Med 2019
Curt-Park/yolo-world-with-efficientvit-sam
YOLO-World + EfficientViT SAM
10Ring/LAA-Net
The official implementation for LAA-Net: Localized Artifact Attention Network for Quality-Agnostic and Generalizable Deepfake Detection
roymiles/VkD
[CVPR 2024] VkD: Improving Knowledge Distillation using Orthogonal Projections
zhengli97/DM-KD
Official PyTorch Code for "Is Synthetic Data From Diffusion Models Ready for Knowledge Distillation?" (https://arxiv.org/abs/2305.12954)
apapiu/mamba_small_bench
Trying out the Mamba architecture on small examples (cifar-10, shakespeare char level etc.)
YtongXie/PairAug
[CVPR2024] PairAug: What Can Augmented Image-Text Pairs Do for Radiology?
ies-research/multi-annotator-machine-learning
Training with Data Annotated by Multiple Error-prone Annotators