marinero4972's Stars

facebookresearch/segment-anything-2
The repository provides code for running inference with the Meta Segment Anything Model 2 (SAM 2), links for downloading the trained model checkpoints, and example notebooks that show how to use the model.
Language: Jupyter Notebook · 11.1k stars · 944 forks
open-compass/VLMEvalKit
Open-source evaluation toolkit for large vision-language models (LVLMs), supporting ~100 VLMs and 40+ benchmarks
Language: Python · 1.1k stars · 156 forks
datawhalechina/self-llm
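"Open-Source Large Model Cookbook": quickly deploy open-source large models in a Linux environment; a deployment tutorial better suited to ** beginners
Language: Jupyter Notebook · 8.2k stars · 981 forks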
PixArt-alpha/PixArt-alpha
PixArt-α: Fast Training of Diffusion Transformer for Photorealistic Text-to-Image Synthesis
Language: Python · 2.7k stars · 172 forks
catcathh/UltraPixel
Implementation of UltraPixel: Advancing Ultra-High-Resolution Image Synthesis to New Peaks
Language:Python53820
Open3DA/LL3DA
[CVPR 2024] "LL3DA: Visual Interactive Instruction Tuning for Omni-3D Understanding, Reasoning, and Planning"; an interactive Large Language 3D Assistant.
Language:Python2289
TangYuan96/MiniGPT-3D
[MM 2024] [Needs an RTX 3090] MiniGPT-3D: Efficiently Aligning 3D Point Clouds with Large Language Models using 2D Priors
Language:Python604
dvlab-research/LISA
Project Page for "LISA: Reasoning Segmentation via Large Language Model"
Language: Python · 1.8k stars · 127 forks
qizekun/ShapeLLM
[ECCV 2024] ShapeLLM: Universal 3D Object Understanding for Embodied Interaction
Language:Python1219
ZiyuGuo99/Point-Bind_Point-LLM
Align 3D Point Cloud with Multi-modalities for Large Language Models
Language:Python41032
LLaVA-VL/LLaVA-NeXT
Language: Python · 2.5k stars · 186 forks
UMass-Foundation-Model/3D-LLM
Code for 3D-LLM: Injecting the 3D World into Large Language Models
Language:Python91155
QwenLM/Qwen
The official repo of Qwen (通义千问) chat & pretrained large language model proposed by Alibaba Cloud.
Language: Python · 13.6k stars · 1.1k forks
GraphPKU/MachineLearning2024
652
NVIDIA/MinkowskiEngine
Minkowski Engine is an auto-diff neural network library for high-dimensional sparse tensors
Language: Python · 2.4k stars · 361 forks
pengsongyou/openscene
[CVPR'23] OpenScene: 3D Scene Understanding with Open Vocabularies
Language:Python63544
YunzeMan/Lexicon3D
[NeurIPS 2024] Lexicon3D: Probing Visual Foundation Models for Complex 3D Scene Understanding
Language:Python284
ScanNet/ScanNet
Language: C · 1.8k stars · 346 forks
dk-liang/UniSeg3D
[NeurIPS 2024] A Unified Framework for 3D Scene Understanding
Language:Python853
OpenRobotLab/EmbodiedScan
[CVPR 2024 & NeurIPS 2024] EmbodiedScan: A Holistic Multi-Modal 3D Perception Suite Towards Embodied AI
Language:Python46034
InternLM/xtuner
An efficient, flexible and full-featured toolkit for fine-tuning LLM (InternLM2, Llama3, Phi3, Qwen, Mistral, ...)
Language: Python · 3.8k stars · 302 forks
OpenRobotLab/Grounded_3D-LLM
Code&Data for Grounded 3D-LLM with Referent Tokens
Language:Python771
OpenRobotLab/PointLLM
[ECCV 2024 Oral] PointLLM: Empowering Large Language Models to Understand Point Clouds
Language:Python53424
52CV/CVPR-2024-Papers
72143
graphdeco-inria/gaussian-splatting
Original reference implementation of "3D Gaussian Splatting for Real-Time Radiance Field Rendering"
Language: Python · 13.8k stars · 1.8k forks
3DTopia/LGM
[ECCV 2024 Oral] LGM: Large Multi-View Gaussian Model for High-Resolution 3D Content Creation.
Language: Python · 1.6k stars · 104 forks
skyhehe123/ScatterFormer
ScatterFormer: Efficient Voxel Transformer with Scattered Linear Attention (ECCV 2024)
Language:Python695
black-forest-labs/flux
Official inference repo for FLUX.1 models
Language: Python · 14.3k stars · 1k forks
deepseek-ai/DeepSeek-MoE
DeepSeekMoE: Towards Ultimate Expert Specialization in Mixture-of-Experts Language Models
Language:Python97647
davidmrau/mixture-of-experts
PyTorch Re-Implementation of "The Sparsely-Gated Mixture-of-Experts Layer" by Noam Shazeer et al. https://arxiv.org/abs/1701.06538
Language:Python95298