vxltersmith's Stars
MoonInTheRiver/DiffSinger
DiffSinger: Singing Voice Synthesis via Shallow Diffusion Mechanism (SVS & TTS); AAAI 2022; Official code
vxltersmith/align_refine
vosen/ZLUDA
CUDA on non-NVIDIA GPUs
MCG-NJU/VideoMAE
[NeurIPS 2022 Spotlight] VideoMAE: Masked Autoencoders are Data-Efficient Learners for Self-Supervised Video Pre-Training
ControlNet/MARLIN
[CVPR] MARLIN: Masked Autoencoder for facial video Representation LearnINg
Rudrabha/Wav2Lip
This repository contains the code for "A Lip Sync Expert Is All You Need for Speech to Lip Generation In the Wild", published at ACM Multimedia 2020. For the HD commercial model, please try out Sync Labs
antgroup/echomimic
EchoMimic: Lifelike Audio-Driven Portrait Animations through Editable Landmark Conditioning
ZiqiaoPeng/SyncTalk
[CVPR 2024] This is the official source for our paper "SyncTalk: The Devil is in the Synchronization for Talking Head Synthesis"
ali-vilab/dreamtalk
Official implementation of the paper: DreamTalk: When Expressive Talking Head Generation Meets Diffusion Probabilistic Models
flashlight/wav2letter
Facebook AI Research's Automatic Speech Recognition Toolkit
SamsungLabs/SummaryMixing
This repository implements SummaryMixing, a simpler, faster and much cheaper replacement for self-attention in automatic speech recognition (see: https://arxiv.org/abs/2307.07421). The code is ready to be used with the SpeechBrain toolkit.
SJTMusicTeam/Muskits
An open-source music processing toolkit
pytorch/examples
A set of examples around PyTorch in Vision, Text, Reinforcement Learning, etc.
NVIDIA/TensorRT-LLM
TensorRT-LLM provides users with an easy-to-use Python API to define Large Language Models (LLMs) and build TensorRT engines that contain state-of-the-art optimizations to perform inference efficiently on NVIDIA GPUs. TensorRT-LLM also contains components to create Python and C++ runtimes that execute those TensorRT engines.
huggingface/candle
Minimalist ML framework for Rust
Unity-Technologies/ml-agents
The Unity Machine Learning Agents Toolkit (ML-Agents) is an open-source project that enables games and simulations to serve as environments for training intelligent agents using deep reinforcement learning and imitation learning.
microsoft/JARVIS
JARVIS, a system to connect LLMs with the ML community. Paper: https://arxiv.org/pdf/2303.17580.pdf
mlfoundations/open_clip
An open source implementation of CLIP.
salesforce/BLIP
PyTorch code for BLIP: Bootstrapping Language-Image Pre-training for Unified Vision-Language Understanding and Generation
madebyollin/taesd
Tiny AutoEncoder for Stable Diffusion
HKUNLP/reparam-discrete-diffusion
Reparameterized Discrete Diffusion Models for Text Generation
Stability-AI/generative-models
Generative Models by Stability AI
mts-ai/audiogram
githubharald/CTCDecoder
Connectionist Temporal Classification (CTC) decoding algorithms: best path, beam search, lexicon search, prefix search, and token passing. Implemented in Python.
openai/tiktoken
tiktoken is a fast BPE tokeniser for use with OpenAI's models.
run-llama/llama_index
LlamaIndex is a data framework for your LLM applications
k2-fsa/k2
FSA/FST algorithms, differentiable, with PyTorch compatibility.
XPixelGroup/DiffBIR
Official code for DiffBIR: Towards Blind Image Restoration with Generative Diffusion Prior
ShichenLiu/SoftRas
Project page of paper "Soft Rasterizer: A Differentiable Renderer for Image-based 3D Reasoning"
BrianPulfer/PapersReimplementations
Personal short implementations of Machine Learning papers