Dingpx's Stars
meta-llama/llama3
The official Meta Llama 3 GitHub site
FoundationVision/VAR
[NeurIPS 2024 Best Paper][GPT beats diffusion] [scaling laws in visual generation] Official impl. of "Visual Autoregressive Modeling: Scalable Image Generation via Next-Scale Prediction". An *ultra-simple, user-friendly yet state-of-the-art* codebase for autoregressive image generation!
yisol/IDM-VTON
[ECCV2024] IDM-VTON : Improving Diffusion Models for Authentic Virtual Try-on in the Wild
dvlab-research/MGM
Official repo for "Mini-Gemini: Mining the Potential of Multi-modality Vision Language Models"
johnma2006/mamba-minimal
Simple, minimal implementation of the Mamba SSM in one file of PyTorch.
yuweihao/MambaOut
MambaOut: Do We Really Need Mamba for Vision?
InstantStyle/InstantStyle
InstantStyle: Free Lunch towards Style-Preserving in Text-to-Image Generation
mhamilton723/FeatUp
Official code for "FeatUp: A Model-Agnostic Framework for Features at Any Resolution" ICLR 2024
yyyujintang/Awesome-Mamba-Papers
Awesome Papers related to Mamba.
AGI-Edgerunners/LLM-Agents-Papers
A repo that lists papers related to LLM-based agents
YanjieZe/3D-Diffusion-Policy
[RSS 2024] 3D Diffusion Policy: Generalizable Visuomotor Policy Learning via Simple 3D Representations
DmitryRyumin/AAAI-2024-Papers
AAAI 2024 Papers: Explore a comprehensive collection of innovative research papers presented at one of the premier artificial intelligence conferences. Seamlessly integrate code implementations for better understanding. Experience the forefront of progress in artificial intelligence with this repository!
AILab-CVC/SEED-X
Multimodal Models in Real World
HL-hanlin/Ctrl-Adapter
Official implementation of Ctrl-Adapter: An Efficient and Versatile Framework for Adapting Diverse Controls to Any Diffusion Model
RoboFlamingo/RoboFlamingo
Code for RoboFlamingo
VDIGPKU/GALA3D
[ICML 2024] GALA3D: Towards Text-to-3D Complex Scene Generation via Layout-guided Generative Gaussian Splatting
h-zhao1997/cobra
[AAAI-25] Cobra: Extending Mamba to Multi-modal Large Language Model for Efficient Inference
1989Ryan/llm-mcts
[NeurIPS 2023] We use large language models as commonsense world model and heuristic policy within Monte-Carlo Tree Search, enabling better-reasoned decision-making for daily task planning problems.
bytedance/GR-1
Code for "Unleashing Large-Scale Video Generative Pre-training for Visual Robot Manipulation"
GuanxingLu/ManiGaussian
[ECCV 2024] ManiGaussian: Dynamic Gaussian Splatting for Multi-task Robotic Manipulation
astramind-ai/Mixture-of-depths
Unofficial implementation for the paper "Mixture-of-Depths: Dynamically allocating compute in transformer-based language models"
HKUNLP/diffusion-of-thoughts
[NeurIPS 2024] Code for the paper "Diffusion of Thoughts: Chain-of-Thought Reasoning in Diffusion Language Models"
kyegomez/Mixture-of-Depths
Implementation of the paper: "Mixture-of-Depths: Dynamically allocating compute in transformer-based language models"
AssassinWS/LLM-TAMP
LLM3: Large Language Model-based Task and Motion Planning with Motion Failure Reasoning
Hon-Wong/Elysium
[ECCV 2024] Elysium: Exploring Object-level Perception in Videos via MLLM
Dingpx/EAI
Official code of [AAAI2024] Expressive Forecasting of 3D Whole-body Human Motions
Nicolinho/RoboVLM
Westlake-DL/DL-Course-2024
Official Repository for Westlake Deep Learning Course (2024)
AlbertTan404/RoLD
PyTorch implementation of Robot Latent Diffusion
Eezekiel/Awesome-Motion-Diffusion-Models
A collection of resources and papers on Motion Diffusion Models.