luomingshuang

Interested in CV, NLP, ASR and so on.

ICT, UCAS, Peng Cheng Lab, Shenzhen

Pinned Repositories

k2
FSA/FST algorithms, differentiable, with PyTorch compatibility.
Language:Cuda1.1k 74 383217
Awesome-Incremental-Learning
Awesome Incremental Learning
0 1 00
Cross-Modal-Pretraining-with-BERT
5 1 00
GE2E-SV-TI-Voxceleb-LMS
Language:Python3 3 01
icefall
Language:Python1 1 01
k2-speechbrain
In this repository, I try to combine k2 with speechbrain to decode accurately and quickly.
Language:Python16 6 12
lhotse
Language:Python0 1 00
M3GPT
M3GPT: An advanced multimodal, multitask framework for motion comprehension and generation.
Language:Python5 3 00
Prompt-Engineering-Guide
🐙 Guides, papers, lectures, and resources for prompt engineering
Language:Jupyter Notebook0 1 00
sherpa
Streaming and non-streaming ASR server in Python
Language:Python0 1 00

luomingshuang's Repositories

luomingshuang/M3GPT
M3GPT: An advanced multimodal, multitask framework for motion comprehension and generation.
Language:Python5 3 00
luomingshuang/OLMoE
OLMoE: Open Mixture-of-Experts Language Models
Language:Jupyter Notebook1 0 0
luomingshuang/3D_Human_Motion_Visualization
1 0
luomingshuang/AnimationGPT
AnimationGPT: An AIGC tool for generating game combat motion assets
Language:Python0 0
luomingshuang/awesome-avatar-plaza
Daily tracking of awesome avatar papers, including 2d talking head, 3d head avatar, body avatar.
luomingshuang/awesome-cn
A collection of awesome lists
Language:Python0 0
luomingshuang/awesome-diffusion-v2v
Awesome diffusion Video-to-Video (V2V). A collection of papers on diffusion model-based video editing, a.k.a. video-to-video (V2V) translation, plus benchmark code for video editing.
luomingshuang/Awesome-Embodied-Agent-with-LLMs
This is a curated list of "Embodied AI or robot with Large Language Models" research. Watch this repository for the latest updates!
0 0
luomingshuang/Awesome-Human-Motion-Video-Generation
Human Motion Video Generation: A Survey (https://www.techrxiv.org/users/836049/articles/1228135-human-motion-video-generation-a-survey)
luomingshuang/Awesome-Human-Video-Generation
A working list of recent human video generation methods. This repository focuses on half/full-body human video generation methods; NeRF, Gaussian splatting, motion pose, and talking head/portrait methods are not included. Our detailed survey is now online.
luomingshuang/Awesome-Unified-Multimodal-Models
📖 This is a repository for organizing papers, codes and other resources related to unified multimodal models.
luomingshuang/bvh
A modern C++ BVH construction and traversal library
Language:C++0 0
luomingshuang/conditional-flow-matching
TorchCFM: a Conditional Flow Matching library
Language:Python0 0
luomingshuang/Diffusion-Noise-Optimization
DNO: Optimizing Diffusion Noise Can Serve As Universal Motion Priors
luomingshuang/echomimic_v2
EchoMimicV2: Towards Striking, Simplified, and Semi-Body Human Animation
luomingshuang/Emu3
Next-Token Prediction is All You Need
Language:Python0 0
luomingshuang/HOI-Learning-List
A list of Human-Object Interaction Learning.
0 0
luomingshuang/lerobot
🤗 LeRobot: State-of-the-art Machine Learning for Real-World Robotics in Pytorch
Language:Python0 0
luomingshuang/LivePortrait
Bring portraits to life!
Language:Python0 0
luomingshuang/MMM
Official repository for "MMM: Generative Masked Motion Model"
Language:Jupyter Notebook0 0
luomingshuang/MotionEditor
[CVPR2024] MotionEditor is the first diffusion-based model capable of video motion editing.
Language:Python0 0
luomingshuang/motionfix
MotionFix: Text-Driven 3D Human Motion Editing [SIGGRAPH ASIA 2024]
luomingshuang/PHC
Official Implementation of the ICCV 2023 paper: Perpetual Humanoid Control for Real-time Simulated Avatars
luomingshuang/ProgMoGen
Programmable Motion Generation for Open-Set Motion Control Tasks (CVPR24)
Language:Python0 0
luomingshuang/sherpa-onnx
Speech-to-text, text-to-speech, and speaker recognition using next-gen Kaldi with onnxruntime without Internet connection. Support embedded systems, Android, iOS, Raspberry Pi, RISC-V, x86_64 servers, websocket server/client, C/C++, Python, Kotlin, C#, Go, NodeJS, Java, Swift, Dart, JavaScript, Flutter
Language:C++0 0
luomingshuang/SHOW
This is the codebase for SHOW in Generating Holistic 3D Human Motion from Speech [CVPR2023].
luomingshuang/StableMoFusion
Language:Python0 0
luomingshuang/transfusion-pytorch
Pytorch implementation of Transfusion, "Predict the Next Token and Diffuse Images with One Multi-Modal Model", from MetaAI
Language:Python0 0
luomingshuang/UniPose
[ECCV 2024] Official implementation of the paper "UniPose: Detecting Any Keypoints"
Language:Python0 0
luomingshuang/vision-lstm
xLSTM as Generic Vision Backbone
Language:Python0 0