yijiuzai

Music analysis, video analysis, multimodal, automatic editing, automatic generation

Shanghai

Pinned Repositories

chinese-ocr
Language:Python6 3 310
data_augement
Image augmentation for machine learning experiments
Language:Python1 3 02
F0Estimator
A neural network estimating audio dominant melody.
Language:Python1 1 03
five-video-classification-methods
Language:Python1 2 00
imgwarp-opencv
Language:C++11
Matching-Networks-for-One-Shot-Learning
Language:Python3 4 02
Near-Duplicate-Video-Detection
Language:Jupyter Notebook1 2 00
open-chat-video-editor
Open source short video automatic generation tool
Language:Python1 0 00
sceneReco
CTPN + CRNN scene text recognition
Language:Python3 2 12
Super-Resolution
Super-Resolution: super-resolution reconstruction
Language:MATLAB3 2 00

yijiuzai's Repositories

yijiuzai/open-chat-video-editor
Open source short video automatic generation tool
Language:Python1 0 00
yijiuzai/animatediff-cli-prompt-travel
animatediff prompt travel
yijiuzai/ast
Code for the Interspeech 2021 paper "AST: Audio Spectrogram Transformer".
Language:Jupyter Notebook0 0
yijiuzai/AudioGPT
AudioGPT: Understanding and Generating Speech, Music, Sound, and Talking Head
Language:Python0 0
yijiuzai/autocut
Edit videos with a text editor
yijiuzai/custom-diffusion
Custom Diffusion: Multi-Concept Customization of Text-to-Image Diffusion (CVPR 2023)
Language:Python0 0
yijiuzai/dreambooth-for-diffusion
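The first fully packaged, all-in-one AutoDL image environment for training Stable Diffusion DreamBooth. Train your own custom model styles and characters; works out of the box and includes a detailed tutorial.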
Language:Python0 0
yijiuzai/Fay
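Fay is a complete open-source project comprising the Fay controller and a digital human model, which can be flexibly combined for different scenarios: virtual streamer, live product sales, shopping guide, voice assistant, remote voice assistant, digital human interaction, digital human interviewer and psychological assessment, Jarvis, Her. This is an open-source project, not a product trial!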
yijiuzai/generative-ai-roadmap
The roadmap of generative AI: use cases and applications
yijiuzai/GigaSpeech
Large, modern dataset for speech recognition
Language:Shell0 0
yijiuzai/gl-transitions
The open collection of GL Transitions
Language:GLSL0 0
yijiuzai/hecate
Automagically generate thumbnails, animated GIFs, and summaries from videos
yijiuzai/img2dataset
Easily turn large sets of image urls to an image dataset. Can download, resize and package 100M urls in 20h on one machine.
Language:Python0 0
yijiuzai/Joint-beat-and-downbeat-estimation
yijiuzai/LIQE
[CVPR2023] Blind Image Quality Assessment via Vision-Language Correspondence: A Multitask Learning Perspective
yijiuzai/lora-label
LoRA Training Text Labeling Aid
yijiuzai/motion-diffusion-model
The official PyTorch implementation of the paper "Human Motion Diffusion Model"
yijiuzai/MultiDiffusion
Official Pytorch Implementation for "MultiDiffusion: Fusing Diffusion Paths for Controlled Image Generation" presenting "MultiDiffusion" (ICML 2023)
Language:Jupyter Notebook0 0
yijiuzai/open_llama
OpenLLaMA, a permissively licensed open source reproduction of Meta AI’s LLaMA 7B trained on the RedPajama dataset
0 0
yijiuzai/PaSST
Efficient Training of Audio Transformers with Patchout
Language:Python0 0
yijiuzai/phenaki-pytorch
Implementation of Phenaki Video, which uses Mask GIT to produce text guided videos of up to 2 minutes in length, in Pytorch
yijiuzai/pytorch-lightning-learn
yijiuzai/SadTalker
[CVPR 2023] SadTalker: Learning Realistic 3D Motion Coefficients for Stylized Audio-Driven Single Image Talking Face Animation
yijiuzai/Text2Video-Zero
Text-to-Image Diffusion Models are Zero-Shot Video Generators
yijiuzai/video-retalking
[SIGGRAPH Asia 2022] VideoReTalking: Audio-based Lip Synchronization for Talking Head Video Editing In the Wild
yijiuzai/VideoCrafter
A Toolkit for Text-to-Video Generation and Editing
yijiuzai/VisCPM
Chinese and English Multimodal Large Model Series (Chat and Paint), based on CPM foundation models
yijiuzai/vit-pytorch
Implementation of Vision Transformer, a simple way to achieve SOTA in vision classification with only a single transformer encoder, in Pytorch
yijiuzai/VToonify
[SIGGRAPH Asia 2022] VToonify: Controllable High-Resolution Portrait Video Style Transfer
yijiuzai/zihao_AIGC
AIGC works by 同济子豪兄 (Tongji Zihao)
0 0