Oguzhanercan's Stars
facebookresearch/fairseq
Facebook AI Research Sequence-to-Sequence Toolkit written in Python.
yangxiaofeng/rectified_flow_prior
Official code for the paper "Text-to-Image Rectified Flow as Plug-and-Play Priors" [ICLR 2025]
LingxiaoYang2023/DSG2024
Official PyTorch repository for “Guidance with Spherical Gaussian Constraint for Conditional Diffusion”
pisacode/voice
roudimit/whisper-flamingo
[Interspeech 2024] Whisper-Flamingo: Integrating Visual Features into Whisper for Audio-Visual Speech Recognition and Translation
fengredrum/finetune-whisper-lora
Fine-Tune Whisper with Transformers and PEFT
backspacetg/simul_whisper
Code for our INTERSPEECH paper Simul-Whisper: Attention-Guided Streaming Whisper with Truncation Detection
plutonium-239/memsave_torch
Lowering PyTorch's Memory Consumption for Selective Differentiation
google/RB-Modulation
Official code for "RB-Modulation: Training-Free Personalization of Diffusion Models using Stochastic Optimal Control"
chen-wl20/DreamCinema
DreamCinema: Cinematic Transfer with Free Camera and 3D Character
bghira/SimpleTuner
A general fine-tuning kit geared toward diffusion models.
YuxinWenRick/diffusion_memorization
Official repo for Detecting, Explaining, and Mitigating Memorization in Diffusion Models (ICLR 2024)
dvlab-research/ControlNeXt
Controllable video and image generation: SVD, Animate Anyone, ControlNet, ControlNeXt, LoRA
guoqincode/DiT-Visualization
Visualization of DiT self attention features
adelacvg/detail_tts
All generative models in one for a better TTS model
Young98CN/LoRA_Composer
LoRA-Composer: Leveraging Low-Rank Adaptation for Multi-Concept Customization in Training-Free Diffusion Models
unity-research/IP-Adapter-Instruct
IP Adapter Instruct
wenet-e2e/speech-synthesis-paper
List of speech synthesis papers.
JackAILab/ConsistentID
Customized ID consistency for humans
black-forest-labs/flux
Official inference repo for FLUX.1 models
XiangZ-0/HiT-SR
[ECCV 2024 - Oral] HiT-SR: Hierarchical Transformer for Efficient Image Super-Resolution
LeonHLJ/FouriScale
Official implementation of FouriScale (ECCV2024)
ruohaoguo/ovavss
Official Implementation of "Open-Vocabulary Audio-Visual Semantic Segmentation" [ACM MM 2024 Oral].
Algolzw/daclip-uir
[ICLR 2024] Controlling Vision-Language Models for Universal Image Restoration. 5th place in the NTIRE 2024 Restore Any Image Model in the Wild Challenge.
TheLartians/ModernCppStarter
🚀 Kick-start your C++! A template for modern C++ projects using CMake, CI, code coverage, clang-format, reproducible dependency management and much more.
phohenecker/switch-cuda
A simple bash script for switching between installed versions of CUDA.
voxel51/fiftyone
Refine high-quality datasets and visual AI models
tinygrad/tinygrad
You like pytorch? You like micrograd? You love tinygrad! ❤️
ManimCommunity/manim
A community-maintained Python framework for creating mathematical animations.
3b1b/manim
Animation engine for explanatory math videos