johnson1228's Stars
xjasonlyu/1point3acres
1Point3Acres handy scripts.
rasbt/LLMs-from-scratch
Implementing a ChatGPT-like LLM in PyTorch from scratch, step by step
dendenxu/fast-gaussian-rasterization
A geometry-shader-based, global CUDA sorted high-performance 3D Gaussian Splatting rasterizer. Can achieve a 5-10x speedup in rendering compared to the vanialla diff-gaussian-rasterization.
NVIDIA/TensorRT-LLM
TensorRT-LLM provides users with an easy-to-use Python API to define Large Language Models (LLMs) and build TensorRT engines that contain state-of-the-art optimizations to perform inference efficiently on NVIDIA GPUs. TensorRT-LLM also contains components to create Python and C++ runtimes that execute those TensorRT engines.
lipku/LiveTalking
Real time interactive streaming digital human
Kedreamix/Linly-Talker
Digital Avatar Conversational System - Linly-Talker. 😄✨ Linly-Talker is an intelligent AI system that combines large language models (LLMs) with visual models to create a novel human-AI interaction method. 🤝🤖 It integrates various technologies like Whisper, Linly, Microsoft Speech Services, and SadTalker talking head generation system. 🌟🔬
judylime/grokking
youssefHosni/Data-Science-Interview-Questions-Answers
Curated list of data science interview questions and answers
kyutai-labs/moshi
cvlab-kaist/GaussianTalker
Official implementation of “GaussianTalker: Real-Time High-Fidelity Talking Head Synthesis with Audio-Driven 3D Gaussian Splatting” by Kyusun Cho, Joungbin Lee, Heeji Yoon, Yeobin Hong, Jaehoon Ko, Sangjun Ahn and Seungryong Kim
ictnlp/LLaMA-Omni
LLaMA-Omni is a low-latency and high-quality end-to-end speech interaction model built upon Llama-3.1-8B-Instruct, aiming to achieve speech capabilities at the GPT-4o level.
streamlit/streamlit
Streamlit — A faster way to build and share data apps.
QwenLM/Qwen2-Audio
The official repo of Qwen2-Audio chat & pretrained large audio language model proposed by Alibaba Cloud.
QwenLM/Qwen-Audio
The official repo of Qwen-Audio (通义千问-Audio) chat & pretrained large audio language model proposed by Alibaba Cloud.
TadasBaltrusaitis/OpenFace
OpenFace – a state-of-the art tool intended for facial landmark detection, head pose estimation, facial action unit recognition, and eye-gaze estimation.
EricGuo5513/text-to-motion
Official implementation for "Generating Diverse and Natural 3D Human Motions from Texts (CVPR2022)."
liangxuy/Inter-X
[CVPR 2024] Official implementation of the paper "Towards Versatile Human-Human Interaction Analysis"
TMElyralab/MusePose
MusePose: a Pose-Driven Image-to-Video Framework for Virtual Human Generation
shubham-goel/4D-Humans
4DHumans: Reconstructing and Tracking Humans with Transformers
Jeevan-kumar-Raj/Grokking-System-Design
Systems design is the process of defining the architecture, modules, interfaces, and data for a system to satisfy specified requirements. Systems design could be seen as the application of systems theory to product development.
jin-s13/COCO-WholeBody
ECCV2020 paper "Whole-Body Human Pose Estimation in the Wild"
facebookresearch/sapiens
High-resolution models for human tasks.
caizhongang/SMPLer-X
Official Code for "SMPLer-X: Scaling Up Expressive Human Pose and Shape Estimation"
ByteByteGoHq/system-design-101
Explain complex systems using visuals and simple terms. Help you prepare for system design interviews.
Tencent/MimicMotion
High-Quality Human Motion Video Generation with Confidence-aware Pose Guidance
jantic/DeOldify
A Deep Learning based project for colorizing and restoring old images (and video!)
microsoft/Bringing-Old-Photos-Back-to-Life
Bringing Old Photo Back to Life (CVPR 2020 oral)
facebookresearch/sam2
The repository provides code for running inference with the Meta Segment Anything Model 2 (SAM 2), links for downloading the trained model checkpoints, and example notebooks that show how to use the model.
breezedeus/CnOCR
CnOCR: Awesome Chinese/English OCR Python toolkits based on PyTorch. It comes with 20+ well-trained models for different application scenarios and can be used directly after installation. 【基于 PyTorch/MXNet 的中文/英文 OCR Python 包。】
BadToBest/EchoMimic
Lifelike Audio-Driven Portrait Animations through Editable Landmark Conditioning