ESK-01's Stars
hacksider/Deep-Live-Cam
Real-time face swap and one-click video deepfake with only a single image
Guyungy/damaihelper
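Ticket-grabbing script for concerts and shows, supporting multiple platforms including Damai (大麦网), Taopiaopiao (淘票票), and Binwandao (缤玩岛)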
nilaoda/N_m3u8DL-CLI
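[.NET] m3u8 downloader: an open-source command-line m3u8/HLS/DASH downloader supporting standard AES-128-CBC decryption, multithreading, custom request headers, and more. Supports Simplified Chinese, Traditional Chinese, and English.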
hzwer/CVPR2023-DMVFN
CVPR2023 (highlight) - A Dynamic Multi-Scale Voxel Flow Network for Video Prediction
naver/dust3r
DUSt3R: Geometric 3D Vision Made Easy
pq-yang/PGDiff
[NeurIPS 2023] PGDiff: Guiding Diffusion Models for Versatile Face Restoration via Partial Guidance
zsyOAOA/ResShift
ResShift: Efficient Diffusion Model for Image Super-resolution by Residual Shifting (NeurIPS@2023 Spotlight, TPAMI@2024)
aixcoder-plugin/aiXcoder-7B
official repository of aiXcoder-7B Code Large Language Model
unslothai/unsloth
Finetune Llama 3.2, Mistral, Phi, Qwen & Gemma LLMs 2-5x faster with 80% less memory
OpenInterpreter/open-interpreter
A natural language interface for computers
lllyasviel/stable-diffusion-webui-forge
binary-husky/gpt_academic
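Provides a practical interactive interface for LLMs such as GPT/GLM, specially optimized for reading, polishing, and writing papers. Modular design with support for custom shortcut buttons and function plugins; project analysis and self-explanation for Python, C++, and other projects; PDF/LaTeX paper translation and summarization; parallel querying of multiple LLMs; and local models such as chatglm3. Integrates Tongyi Qianwen, deepseekcoder, iFlytek Spark, ERNIE Bot, llama2, rwkv, claude2, moss, and others.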
google/magika
Detect file content types with deep learning
RVC-Boss/GPT-SoVITS
1 min voice data can also be used to train a good TTS model! (few shot voice cloning)
LargeWorldModel/LWM
Large World Model -- Modeling Text and Video with Millions Context
meta-llama/codellama
Inference code for CodeLlama models
Fanghua-Yu/SUPIR
SUPIR aims at developing Practical Algorithms for Photo-Realistic Image Restoration In the Wild. Our new online demo is also released at suppixel.ai.
collabora/WhisperFusion
WhisperFusion builds upon the capabilities of WhisperLive and WhisperSpeech to provide seamless conversations with an AI.
YangLing0818/RPG-DiffusionMaster
[ICML 2024] Mastering Text-to-Image Diffusion: Recaptioning, Planning, and Generating with Multimodal LLMs (RPG)
LiheYoung/Depth-Anything
[CVPR 2024] Depth Anything: Unleashing the Power of Large-Scale Unlabeled Data. Foundation Model for Monocular Depth Estimation
MooreThreads/Moore-AnimateAnyone
Character Animation (AnimateAnyone, Face Reenactment)
jianchang512/clone-voice
A voice cloning tool with a web interface, using your voice or any sound to record audio
myshell-ai/OpenVoice
Instant voice cloning by MIT and MyShell.
Hillobar/Rope
GUI-focused roop
open-mmlab/Amphion
Amphion (/æmˈfaɪən/) is a toolkit for Audio, Music, and Speech Generation. Its purpose is to support reproducible research and help junior researchers and engineers get started in the field of audio, music, and speech generation research and development.
cumulo-autumn/StreamDiffusion
StreamDiffusion: A Pipeline-Level Solution for Real-Time Interactive Generation
mistralai/mistral-inference
Official inference library for Mistral models
pytorch-labs/segment-anything-fast
A batched offline inference oriented version of segment-anything
netease-youdao/EmotiVoice
EmotiVoice 😊: a Multi-Voice and Prompt-Controlled TTS Engine
huggingface/alignment-handbook
Robust recipes to align language models with human and AI preferences