SakurajimaMaiii
Transfer learning, multimodal learning, and medical AI. NLP @aiwaves-cn
@aiwaves-cn Hangzhou, China
SakurajimaMaiii's Stars
langchain-ai/langchain
🦜🔗 Build context-aware reasoning applications
coqui-ai/TTS
🐸💬 - a deep learning toolkit for Text-to-Speech, battle-tested in research and production
facebookresearch/faiss
A library for efficient similarity search and clustering of dense vectors.
OpenTalker/SadTalker
[CVPR 2023] SadTalker: Learning Realistic 3D Motion Coefficients for Stylized Audio-Driven Single Image Talking Face Animation
BerriAI/litellm
Call all LLM APIs using the OpenAI format. Use Bedrock, Azure, OpenAI, Cohere, Anthropic, Ollama, Sagemaker, HuggingFace, Replicate (100+ LLMs)
Rudrabha/Wav2Lip
This repository contains the code for "A Lip Sync Expert Is All You Need for Speech to Lip Generation In the Wild", published at ACM Multimedia 2020. For the HD commercial model, please try out Sync Labs
bentoml/OpenLLM
Run any open-source LLMs, such as Llama 2, Mistral, as OpenAI compatible API endpoint in the cloud.
SJTU-IPADS/PowerInfer
High-speed Large Language Model Serving on PCs with Consumer-grade GPUs
OpenTalker/video-retalking
[SIGGRAPH Asia 2022] VideoReTalking: Audio-based Lip Synchronization for Talking Head Video Editing In the Wild
wgwang/awesome-LLMs-In-China
** large models
AutoGPTQ/AutoGPTQ
An easy-to-use LLM quantization package with user-friendly APIs, based on the GPTQ algorithm.
opendilab/awesome-RLHF
A curated list of reinforcement learning with human feedback resources (continually updated)
openai/weak-to-strong
Zz-ww/SadTalker-Video-Lip-Sync
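This project builds on SadTalker to implement Wav2Lip-style lip synthesis for video. It takes a video file as input and generates lip shapes driven by audio, then applies a configurable enhancement to the facial region to sharpen the synthesized lip (face) area and improve the clarity of the generated lips. It uses DAIN, a deep-learning frame-interpolation algorithm, to add frames to the generated video, filling in the motion transitions between synthesized lip frames so that the result looks smoother, more realistic, and more natural.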
ytongbai/LVM
casper-hansen/AutoAWQ
AutoAWQ implements the AWQ algorithm for 4-bit quantization with a 2x speedup during inference. Documentation:
microsoft/LLaVA-Med
Large Language-and-Vision Assistant for Biomedicine, built towards multimodal GPT-4 level capabilities.
numz/sd-wav2lip-uhq
Wav2Lip UHQ extension for Automatic1111
VILA-Lab/ATLAS
A principled instruction benchmark on formulating effective queries and prompts for large language models (LLMs). Our paper: https://arxiv.org/abs/2312.16171
open-mmlab/PIA
[CVPR 2024] PIA, your Personalized Image Animator. Animate your images by text prompt, combining with Dreambooth, achieving stunning videos. PIA, your personalized image animator, uses text prompts to turn images into wonderful animations
kaidic/LDAM-DRW
[NeurIPS 2019] Learning Imbalanced Datasets with Label-Distribution-Aware Margin Loss
dvlab-research/LLaMA-VID
Official Implementation for LLaMA-VID: An Image is Worth 2 Tokens in Large Language Models
kgl-prml/Contrastive-Adaptation-Network-for-Unsupervised-Domain-Adaptation
PyTorch implementation for Contrastive Adaptation Network
LLaMafia/llamafia.github
shengliu66/ELR
Official Implementation of Early-Learning Regularization Prevents Memorization of Noisy Labels
google-research/syn-rep-learn
Learning from synthetic data - code and models
ZhangYuanhan-AI/visual_prompt_retrieval
[NeurIPS 2023] Official implementation and model release of the paper "What Makes Good Examples for Visual In-Context Learning?"
nghiakvnvsd/wav2lip384
test-time-training/mttt
Re-Align/AlignTDS
Analyzing LLM Alignment via Token distribution shift