TzRain's Stars
karpathy/LLM101n
LLM101n: Let's build a Storyteller
facebookresearch/dino
PyTorch code for Vision Transformers training with the Self-Supervised learning method DINO
naver/dust3r
DUSt3R: Geometric 3D Vision Made Easy
amazon-science/mm-cot
Official implementation of "Multimodal Chain-of-Thought Reasoning in Language Models" (stay tuned; more will be added)
InternLM/InternLM-XComposer
InternLM-XComposer-2.5: A Versatile Large Vision Language Model Supporting Long-Contextual Input and Output
LLaVA-VL/LLaVA-NeXT
THUDM/AgentBench
A Comprehensive Benchmark to Evaluate LLMs as Agents (ICLR'24)
open-compass/VLMEvalKit
Open-source evaluation toolkit for large vision-language models (LVLMs), supporting ~100 VLMs and 40+ benchmarks
52CV/CVPR-2024-Papers
apchenstu/mvsnerf
[ICCV 2021] Our work presents a novel neural rendering approach that can efficiently reconstruct geometric and neural radiance fields for view synthesis.
haonan-li/CMMLU
CMMLU: Measuring massive multitask language understanding in Chinese
lupantech/ScienceQA
Data and code for NeurIPS 2022 Paper "Learn to Explain: Multimodal Reasoning via Thought Chains for Science Question Answering".
justimyhxu/GRM
Large Gaussian Reconstruction Model for Efficient 3D Reconstruction and Generation
google-research-datasets/conceptual-captions
Conceptual Captions is a dataset of (image URL, caption) pairs designed for training and evaluating machine-learned image captioning systems.
jiawei-ren/dreamgaussian4d
[arXiv 2023] DreamGaussian4D: Generative 4D Gaussian Splatting
microsoft/voxelpose-pytorch
Official implementation of "VoxelPose: Towards Multi-Camera 3D Human Pose Estimation in Wild Environment"
AILab-CVC/SEED-Bench
(CVPR 2024) A benchmark for evaluating Multimodal LLMs using multiple-choice questions.
yuweihao/MM-Vet
MM-Vet: Evaluating Large Multimodal Models for Integrated Capabilities (ICML 2024)
OpenGVLab/OmniCorpus
OmniCorpus: A Unified Multimodal Corpus of 10 Billion-Level Images Interleaved with Text
computational-imaging/GSM
Gaussian Shell Maps for Efficient 3D Human Generation (CVPR 2024)
kai422/IART
[CVPR 2024 Highlight] Enhancing Video Super-Resolution via Implicit Resampling-based Alignment.
open-compass/MMBench
Official Repo of "MMBench: Is Your Multi-modal Model an All-around Player?"
PhoenixZ810/MG-LLaVA
Official repository for the paper "MG-LLaVA: Towards Multi-Granularity Visual Instruction Tuning" (https://arxiv.org/abs/2406.17770).
opendatalab/laion5b-downloader
TideDra/VL-RLHF
An RLHF Infrastructure for Vision-Language Models
OpenGVLab/MMT-Bench
ICML'2024 | MMT-Bench: A Comprehensive Multimodal Benchmark for Evaluating Large Vision-Language Models Towards Multitask AGI
nttmdlab-nlp/SlideVQA
SlideVQA: A Dataset for Document Visual Question Answering on Multiple Images (AAAI2023)
Liuziyu77/MMDU
Official repository of MMDU dataset
OpenGVLab/LCL
Vision Model Pre-training on Interleaved Image-Text Data via Latent Compression Learning
XunshanMan/MVGFormer
Official implementation of the CVPR 2024 paper "Multiple View Geometry Transformers for 3D Human Pose Estimation" (MVGFormer).