Fengdalu

A Gray Cat.

Fengdalu's Stars

ggerganov/whisper.cpp
Port of OpenAI's Whisper model in C/C++
Language:C32.7k 299 1.2k3.3k
svc-develop-team/so-vits-svc
SoftVC VITS Singing Voice Conversion
Language:Python24.6k 175 1304.7k
huggingface/diffusers
🤗 Diffusers: State-of-the-art diffusion models for image and audio generation in PyTorch and FLAX.
Language:Python23.7k 195 3.7k4.9k
jantic/DeOldify
A Deep Learning based project for colorizing and restoring old images (and video!)
Language:Python17.7k 441 3782.5k
huggingface/peft
🤗 PEFT: State-of-the-art Parameter-Efficient Fine-Tuning.
Language:Python14.8k 105 9401.4k
ExistentialAudio/BlackHole
BlackHole is a modern macOS audio loopback driver that allows applications to pass audio to other applications with zero additional latency.
Language:C14.4k 121 394564
graphdeco-inria/gaussian-splatting
Original reference implementation of "3D Gaussian Splatting for Real-Time Radiance Field Rendering"
Language:Python12.3k 110 8041.5k
NVIDIA/apex
A PyTorch Extension: Tools for easy mixed precision and distributed training in PyTorch
Language:Python8.1k 102 1.2k1.3k
HumanAIGC/EMO
Emote Portrait Alive: Generating Expressive Portrait Videos with Audio2Video Diffusion Model under Weak Conditions
7.1k 318 255833
dreamgaussian/dreamgaussian
[ICLR 2024 Oral] Generative Gaussian Splatting for Efficient 3D Content Creation
Language:Python3.7k 46 140329
whitead/paper-qa
LLM Chain for answering questions from documents with citations
Language:Python3.7k 41 135356
mseitzer/pytorch-fid
Compute FID scores with PyTorch.
Language:Python3.2k 14 85494
Alpha-VLLM/Lumina-T2X
Lumina-T2X is a unified framework for Text to Any Modality Generation
Language:Python1.8k 27 6671
DanielSWolf/rhubarb-lip-sync
Rhubarb Lip Sync is a command-line tool that automatically creates 2D mouth animation from voice recordings. You can use it for characters in computer games, in animated cartoons, or in any other project that requires animating mouths based on existing recordings.
Language:C++1.7k 55 115207
thu-ml/unidiffuser
Code and models for the paper "One Transformer Fits All Distributions in Multi-Modal Diffusion"
Language:Python1.3k 17 3285
baofff/U-ViT
A PyTorch implementation of the paper "All are Worth Words: A ViT Backbone for Diffusion Models".
Language:Jupyter Notebook831 12 2457
suragnair/seqGAN
A simplified PyTorch implementation of "SeqGAN: Sequence Generative Adversarial Nets with Policy Gradient." (Yu, Lantao, et al.)
Language:Python632 15 24147
dvlab-research/LLaMA-VID
Official Implementation for LLaMA-VID: An Image is Worth 2 Tokens in Large Language Models
Language:Python615 11 9439
facebookresearch/fairseq2
FAIR Sequence Modeling Toolkit 2
Language:Python609 18 8857
xwjdsh/2048-ai
A simple AI for the 2048 game.
Language:Go318 13 528
yangshun/2048-python
🐍 2048
Language:Python315 12 4223
lingjzhu/CharsiuG2P
Multilingual G2P in 100 languages
Language:Jupyter Notebook262 10 1025
lingjzhu/charsiu
Charsiu: A neural phonetic aligner.
Language:Jupyter Notebook257 8 1734
kazgu/zotero-chatgpt
ChatGPT plugin for Zotero
Language:JavaScript196 6 169
mpc001/auto_avsr
Auto-AVSR: Lip-Reading Sentences Project
Language:Python147 5 3235
Vontigo/Vontigo
🛸 Vontigo is an open-source CMS built with SvelteKit, featuring 🤖 AI-powered (ChatGPT) content generation. With fast page loads and seamless routing, Vontigo offers a user-friendly interface with customizable themes and templates.
Language:Svelte135 7 2051
jasonppy/PromptingWhisper
Prompting Whisper for Audio-Visual Speech Recognition, Code-Switched Speech Recognition, and Zero-Shot Speech Translation
Language:Python129 4 711
ms-dot-k/Visual-Audio-Memory
PyTorch implementation of "Multi-modality Associative Bridging through Memory: Speech Sound Recollected from Face Video" (ICCV2021)
Language:Python17 1 54
Janie1996/AV4SER
PyTorch implementation for Audio-Visual Domain Adaptation Feature Fusion for Speech Emotion Recognition
Language:Python100
srv-sh/Visual-Audio-Memory
PyTorch implementation of "Multi-modality Associative Bridging through Memory: Speech Sound Recollected from Face Video" (ICCV2021)
Language:Python2 0 00
