aroslanov's Stars
ali-vilab/In-Context-LoRA
Official repository of In-Context LoRA for Diffusion Transformers
MahdeenSky/SoftVC-VITS-MusicSingerChanger
Google Colab for testing SoftVC VITS Singing Voice Conversion, an AI capable of changing the singer within music files.
aigc-apps/EasyAnimate
📺 An End-to-End Solution for High-Resolution and Long Video Generation Based on Transformer Diffusion
NaruseMioShirakana/DragonianVoice
A C++ inference library for multiple SVC/TTS models
kijai/ComfyUI-CogVideoXWrapper
HallowSiddharth/VoiceCraftAI
VoiceCraftAI is a revolutionary AI tool to dub videos into multiple regional languages and lip-sync at the same time.
wenqsun/DimensionX
DimensionX: Create Any 3D and 4D Scenes from a Single Image with Controllable Video Diffusion
AlexxIT/go2rtc
Ultimate camera streaming application with support for RTSP, RTMP, HTTP-FLV, WebRTC, MSE, HLS, MP4, MJPEG, HomeKit, FFmpeg, etc.
kijai/ComfyUI-GIMM-VFI
instantX-research/InstantIR
InstantIR: Blind Image Restoration with Instant Generative Reference 🔥
instantX-research/Regional-Prompting-FLUX
Training-free Regional Prompting for Diffusion Transformers 🔥
cvlab-kaist/PF3plat
Official Implementation of "PF3plat: Pose-Free Feed-Forward 3D Gaussian Splatting"
logtd/ComfyUI-MochiEdit
ComfyUI nodes to edit videos using Genmo Mochi
cheahjs/free-llm-api-resources
A list of free LLM inference resources accessible via API.
AykutSarac/jsoncrack.com
✨ Innovative and open-source visualization application that transforms various data formats, such as JSON, YAML, XML, CSV and more, into interactive graphs.
microsoft/MoGe
MoGe: Unlocking Accurate Monocular Geometry Estimation for Open-Domain Images with Optimal Training Supervision
ttxskk/AiOS
[CVPR 2024] Official Code for "AiOS: All-in-One-Stage Expressive Human Pose and Shape Estimation"
lyogavin/airllm
AirLLM 70B inference with a single 4GB GPU
kestra-io/kestra
⚡ Workflow Automation Platform. Orchestrate & Schedule code in any language, run anywhere, 500+ plugins. Alternative to Zapier, Rundeck, Camunda, Airflow...
ZGCTroy/CamI2V
Official repo for the paper "CamI2V: Camera-Controlled Image-to-Video Diffusion Model"
alimama-creative/SDXL_EcomID_ComfyUI
huggingface/speech-to-speech
Speech To Speech: an effort for an open-sourced and modular GPT-4o
janhq/ichigo
Local realtime voice AI
shallowdream204/DreamClear
[NeurIPS 2024🔥] DreamClear: High-Capacity Real-World Image Restoration with Privacy-Safe Dataset Curation
cvg/depthsplat
DepthSplat: Connecting Gaussian Splatting and Depth
open-mmlab/Amphion
Amphion (/æmˈfaɪən/) is a toolkit for Audio, Music, and Speech Generation. Its purpose is to support reproducible research and help junior researchers and engineers get started in the field of audio, music, and speech generation research and development.
MrTornado24/DreamCraft3D_Plus
lllyasviel/IC-Light
More relighting!
WHU-USI3DV/VistaDream
[arXiv'24] VistaDream: Sampling multiview consistent images for single-view scene reconstruction
stanford-oval/storm
An LLM-powered knowledge curation system that researches a topic and generates a full-length report with citations.