Pinned Repositories
chinese-ocr
data_augement
Image augmentation for machine learning experiments
F0Estimator
A neural network for estimating the dominant melody in audio.
five-video-classification-methods
imgwarp-opencv
Matching-Networks-for-One-Shot-Learning
Near-Duplicate-Video-Detection
open-chat-video-editor
Open source short video automatic generation tool
sceneReco
CTPN + CRNN scene text recognition
Super-Resolution
Super-resolution reconstruction
yijiuzai's Repositories
yijiuzai/open-chat-video-editor
Open source short video automatic generation tool
yijiuzai/animatediff-cli-prompt-travel
animatediff prompt travel
yijiuzai/ast
Code for the Interspeech 2021 paper "AST: Audio Spectrogram Transformer".
yijiuzai/AudioGPT
AudioGPT: Understanding and Generating Speech, Music, Sound, and Talking Head
yijiuzai/autocut
Cut videos with a text editor
yijiuzai/custom-diffusion
Custom Diffusion: Multi-Concept Customization of Text-to-Image Diffusion (CVPR 2023)
yijiuzai/dreambooth-for-diffusion
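The first fully packaged, all-in-one AutoDL image environment for training Stable Diffusion DreamBooth. It can train custom large models with your own styles and characters, works out of the box, and includes a detailed tutorial.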
yijiuzai/Fay
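Fay is a complete open-source project that includes the Fay controller and digital human models, which can be flexibly combined for different application scenarios: virtual streamer, live product sales, shopping guide, voice assistant, remote voice assistant, digital human interaction, digital human interviewer and psychological assessment, Jarvis, Her. This is an open-source project, not a product trial!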
yijiuzai/generative-ai-roadmap
The roadmap of generative AI: use cases and applications
yijiuzai/GigaSpeech
Large, modern dataset for speech recognition
yijiuzai/gl-transitions
The open collection of GL Transitions
yijiuzai/hecate
Automagically generate thumbnails, animated GIFs, and summaries from videos
yijiuzai/img2dataset
Easily turn large sets of image urls to an image dataset. Can download, resize and package 100M urls in 20h on one machine.
yijiuzai/Joint-beat-and-downbeat-estimation
yijiuzai/LIQE
[CVPR2023] Blind Image Quality Assessment via Vision-Language Correspondence: A Multitask Learning Perspective
yijiuzai/lora-label
LoRA Training Text Labeling Aid
yijiuzai/motion-diffusion-model
The official PyTorch implementation of the paper "Human Motion Diffusion Model"
yijiuzai/MultiDiffusion
Official Pytorch Implementation for "MultiDiffusion: Fusing Diffusion Paths for Controlled Image Generation" presenting "MultiDiffusion" (ICML 2023)
yijiuzai/open_llama
OpenLLaMA, a permissively licensed open source reproduction of Meta AI’s LLaMA 7B trained on the RedPajama dataset
yijiuzai/PaSST
Efficient Training of Audio Transformers with Patchout
yijiuzai/phenaki-pytorch
Implementation of Phenaki Video, which uses Mask GIT to produce text guided videos of up to 2 minutes in length, in Pytorch
yijiuzai/pytorch-lightning-learn
yijiuzai/SadTalker
[CVPR 2023] SadTalker: Learning Realistic 3D Motion Coefficients for Stylized Audio-Driven Single Image Talking Face Animation
yijiuzai/Text2Video-Zero
Text-to-Image Diffusion Models are Zero-Shot Video Generators
yijiuzai/video-retalking
[SIGGRAPH Asia 2022] VideoReTalking: Audio-based Lip Synchronization for Talking Head Video Editing In the Wild
yijiuzai/VideoCrafter
A Toolkit for Text-to-Video Generation and Editing
yijiuzai/VisCPM
Chinese and English bilingual multimodal large model series (Chat and Paint), based on the CPM foundation models
yijiuzai/vit-pytorch
Implementation of Vision Transformer, a simple way to achieve SOTA in vision classification with only a single transformer encoder, in Pytorch
yijiuzai/VToonify
[SIGGRAPH Asia 2022] VToonify: Controllable High-Resolution Portrait Video Style Transfer
yijiuzai/zihao_AIGC
AIGC works by Tongji Zihao (同济子豪兄)