Pinned Repositories
AI-Youtube-Shorts-Generator
A Python tool that uses GPT-4, FFmpeg, and OpenCV to automatically analyze videos, extract the most interesting sections, and crop them for an improved viewing experience.
Amphion
Amphion (/æmˈfaɪən/) is a toolkit for Audio, Music, and Speech Generation. Its purpose is to support reproducible research and help junior researchers and engineers get started in the field of audio, music, and speech generation research and development.
api4sensevoice
API and WebSocket server for SenseVoice. It includes enhanced features such as voice activity detection (VAD), real-time streaming recognition, and speaker verification.
chatwiki
CompreFace
Leading free and open-source face recognition system
demucs
Code for the paper Hybrid Spectrogram and Waveform Source Separation
easegen-admin
easegen-front
echomimic
EchoMimic: Lifelike Audio-Driven Portrait Animations through Editable Landmark Conditioning
echomimic_v2
EchoMimicV2: Towards Striking, Simplified, and Semi-Body Human Animation
yaojun's Repositories
yaojun/CompreFace
Leading free and open-source face recognition system
yaojun/Modelscope_Faster_Whisper_Multi_Subtitle
Generate bilingual subtitles with one click using Faster-whisper and ModelScope. A bilingual subtitle generator based on offline large models.
yaojun/VideoLingo
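Netflix-level subtitle segmentation, translation, alignment, and even dubbing: a one-click, fully automated AI subtitle team for reposting videos.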
yaojun/Westlake-Omni
yaojun/AI-Youtube-Shorts-Generator
A Python tool that uses GPT-4, FFmpeg, and OpenCV to automatically analyze videos, extract the most interesting sections, and crop them for an improved viewing experience.
yaojun/ECCV2022-RIFE
ECCV2022 - Real-Time Intermediate Flow Estimation for Video Frame Interpolation
yaojun/PySceneDetect
Python and OpenCV-based scene cut/transition detection program & library.
yaojun/Grounded-SAM-2
Grounded SAM 2: Ground and Track Anything in Videos with Grounding DINO, Florence-2 and SAM 2
yaojun/segment-anything
The repository provides code for running inference with the Segment Anything Model (SAM), links for downloading the trained model checkpoints, and example notebooks that show how to use the model.
yaojun/Linly-Dubbing
Intelligent multilingual AI video dubbing and translation tool - Linly-Dubbing: "AI-empowered, language without borders"
yaojun/OpenVoice
Instant voice cloning by MIT and MyShell.
yaojun/faster-whisper
Faster Whisper transcription with CTranslate2
yaojun/whisperX
WhisperX: Automatic Speech Recognition with Word-level Timestamps (& Diarization)
yaojun/elevenlabs-python
The official Python API for ElevenLabs Text to Speech.
yaojun/GroundingDINO
[ECCV 2024] Official implementation of the paper "Grounding DINO: Marrying DINO with Grounded Pre-Training for Open-Set Object Detection"
yaojun/Real-ESRGAN
Real-ESRGAN aims at developing practical algorithms for general image/video restoration.
yaojun/elevenlabs-examples
yaojun/translation-agent
yaojun/Wav2Lip
This repository contains the code for "A Lip Sync Expert Is All You Need for Speech to Lip Generation In the Wild", published at ACM Multimedia 2020. For the HD commercial model, please try out Sync Labs.
yaojun/X-Portrait
Source code for the SIGGRAPH 2024 paper "X-Portrait: Expressive Portrait Animation with Hierarchical Motion Attention"
yaojun/Wav2Lip-GFPGAN
High quality Lip sync
yaojun/GFPGAN
GFPGAN aims at developing practical algorithms for real-world face restoration.
yaojun/Semantic-SAM
[ECCV 2024] Official implementation of the paper "Semantic-SAM: Segment and Recognize Anything at Any Granularity"
yaojun/demucs
Code for the paper Hybrid Spectrogram and Waveform Source Separation
yaojun/video-subtitle-remover
AI-based tool for removing hard-coded subtitles and text-like watermarks from images and videos, producing output files at the original resolution. Runs locally, with no third-party API required.
yaojun/chatwiki
yaojun/ultimatevocalremovergui
GUI for a Vocal Remover that uses Deep Neural Networks.
yaojun/YouDub-webui
yaojun/YouDub
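YouDub is an open-source tool that automates the translation and dubbing of high-quality YouTube videos so they can be brought to the Chinese-language internet. It uses AI speech recognition to convert the audio to text, a large language model to translate the text into Chinese, and AI voice cloning to turn the Chinese text into audio. The result is a Chinese-dubbed video that keeps the original YouTuber's voice.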
yaojun/music_source_separation