JerryWei1985

JerryWei1985's Stars

facebookresearch/DiT
Official PyTorch Implementation of "Scalable Diffusion Models with Transformers"
Language: Python, 6.4k stars, 578 forks
adobe-fonts/source-code-pro
Monospaced font family for user interface and coding environments
Language: CSS, 19.9k stars, 1.6k forks
FoundationVision/VAR
[NeurIPS 2024 Oral][GPT beats diffusion🔥] [scaling laws in visual generation📈] Official impl. of "Visual Autoregressive Modeling: Scalable Image Generation via Next-Scale Prediction". An *ultra-simple, user-friendly yet state-of-the-art* codebase for autoregressive image generation!
Language: Python, 4.3k stars, 321 forks
dvlab-research/MGM
Official repo for "Mini-Gemini: Mining the Potential of Multi-modality Vision Language Models"
Language: Python, 3.2k stars, 280 forks
Stability-AI/generative-models
Generative Models by Stability AI
Language: Python, 24.8k stars, 2.7k forks
Stability-AI/stablediffusion
High-Resolution Image Synthesis with Latent Diffusion Models
Language: Python, 39.3k stars, 5.1k forks
bmaltais/kohya_ss
Language: Python, 9.7k stars, 1.3k forks
huggingface/diffusers
🤗 Diffusers: State-of-the-art diffusion models for image and audio generation in PyTorch and FLAX.
Language: Python, 26.4k stars, 5.4k forks
THUDM/CogVideo
text and image to video generation: CogVideoX (2024) and CogVideo (ICLR 2023)
Language: Python, 9.5k stars, 897 forks
guoyww/AnimateDiff
Official implementation of AnimateDiff.
Language: Python, 10.7k stars, 874 forks
TachibanaYoshino/AnimeGAN
A TensorFlow implementation of AnimeGAN for fast photo animation! This is the open-source code for the paper "AnimeGAN: a novel lightweight GAN for photo animation", which uses the GAN framework to transform real-world photos into anime images.
Language: Python, 4.5k stars, 663 forks
kanasimi/work_crawler
Download tool for comics and novels. Supported sites: 腾讯漫画, 大角虫漫画, 有妖气, 咪咕, SF漫画, 哦漫画, 看漫画, 漫画柜, 汗汗酷漫, 動漫伊甸園, 快看漫画, 微博动漫, 733动漫网, 大古漫画网, 漫画DB, 無限動漫, 動漫狂, 卡推漫画, 动漫之家, 动漫屋, 古风漫画网, 36漫画网, 亲亲漫画网, 乙女漫画, webtoons, 咚漫, ニコニコ静画, ComicWalker, ヤングエースUP, モアイ, pixivコミック, サイコミ; アルファポリス, カクヨム, ハーメルン, 小説家になろう, 起点中文网, 八一中文网, 顶点小说, 落霞小说网, 努努书坊, 笔趣阁 → epub.
Language: JavaScript, 3.2k stars, 319 forks
gpakosz/.tmux
🇫🇷 Oh my tmux! My self-contained, pretty & versatile tmux configuration made with ❤️
Language: Shell, 22.1k stars, 3.4k forks
instantX-research/InstantID
InstantID: Zero-shot Identity-Preserving Generation in Seconds 🔥
Language: Python, 11.2k stars, 814 forks
SYSTRAN/faster-whisper
Faster Whisper transcription with CTranslate2
Language: Python, 12.7k stars, 1.1k forks
m-bain/whisperX
WhisperX: Automatic Speech Recognition with Word-level Timestamps (& Diarization)
Language: Python, 12.7k stars, 1.3k forks
thuanz123/enhancing-transformers
An unofficial implementation of both ViT-VQGAN and RQ-VAE in Pytorch
Language: Python, 289 stars, 34 forks
lucidrains/magvit2-pytorch
Implementation of MagViT2 Tokenizer in Pytorch
Language: Python, 565 stars, 34 forks
google-research/magvit
Official JAX implementation of MAGVIT: Masked Generative Video Transformer
Language: Python, 956 stars, 42 forks
baaivision/Emu
Emu Series: Generative Multimodal Models from BAAI
Language: Python, 1.7k stars, 86 forks
facebookresearch/demucs
Code for the paper Hybrid Spectrogram and Waveform Source Separation
Language: Python, 8.4k stars, 1.1k forks
facebookresearch/seamless_communication
Foundational Models for State-of-the-Art Speech and Text Translation
Language: Jupyter Notebook, 11k stars, 1.1k forks
microsoft/LLaVA-Med
Large Language-and-Vision Assistant for Biomedicine, built towards multimodal GPT-4 level capabilities.
Language: Python, 1.6k stars, 202 forks
microsoft/ML-For-Beginners
12 weeks, 26 lessons, 52 quizzes, classic Machine Learning for all
Language: HTML, 70k stars, 14.6k forks
THUDM/CogVLM
A state-of-the-art-level open visual language model | multimodal pre-trained model
Language: Python, 6.1k stars, 420 forks
microsoft/generative-ai-for-beginners
21 Lessons, Get Started Building with Generative AI 🔗 https://microsoft.github.io/generative-ai-for-beginners/
Language: Jupyter Notebook, 65.5k stars, 33.5k forks
avaneev/r8brain-free-src
High-quality pro audio resampler / sample rate conversion C++ library. Very fast, for both audio resampling and time-series interpolation.
Language: C++, 582 stars, 61 forks
mct10/RepCodec
Models and code for RepCodec: A Speech Representation Codec for Speech Tokenization
Language: Python, 162 stars, 11 forks
mli/autocut
Cut videos with a text editor
Language: Python, 6.8k stars, 682 forks
microsoft/table-transformer
Table Transformer (TATR) is a deep learning model for extracting tables from unstructured documents (PDFs and images). This is also the official repository for the PubTables-1M dataset and GriTS evaluation metric.
Language: Python, 2.3k stars, 259 forks