SlongLiu
Ph.D. Student @ CST of Tsinghua University. Intern @IDEA-Research CVR group. Homepage: lsl.zone
THU | IDEA · Beijing | Shenzhen
SlongLiu's Stars
deepseek-ai/DeepSeek-V3
leverimmy/THU-Annual-Eat
A year has gone by: where did all the money you spent in Tsinghua's canteens actually go?
microsoft/OmniParser
A simple screen parsing tool towards pure vision based GUI agent
shxie2020/Awesome-UGVFM
A collection of vision foundation models unifying understanding and generation.
IDEA-Research/DINO-X-API
DINO-X: The World's Top-Performing Vision Model for Open-World Object Detection and Understanding
webdataset/webdataset
A high-performance Python-based I/O system for large (and small) deep learning problems, with strong support for PyTorch.
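For reference, a minimal sketch of the WebDataset loading pattern documented in its README. The shard URL is a placeholder, and the resize transform and batch settings are assumptions added so the streamed samples collate into fixed-shape batches:

```python
import torch
import webdataset as wds
from torchvision import transforms

# Placeholder shard spec (assumption); {000000..000009} expands to ten .tar shards.
url = "https://example.com/shards/train-{000000..000009}.tar"

preprocess = transforms.Compose([
    transforms.Resize((224, 224)),
    transforms.ToTensor(),
])

dataset = (
    wds.WebDataset(url)
    .shuffle(1000)                       # buffer-based shuffling of the streamed samples
    .decode("pil")                       # decode image members to PIL, .cls labels to int
    .to_tuple("jpg", "cls")              # pick the .jpg image and .cls label of each sample
    .map_tuple(preprocess, lambda y: y)  # fixed image shape so default collation works
)

# WebDataset is an IterableDataset, so it plugs into the standard PyTorch DataLoader.
loader = torch.utils.data.DataLoader(dataset, batch_size=16, num_workers=4)
```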
uni-medical/GMAI-VL
GMAI-VL & GMAI-VL-5.5M: A Large Vision-Language Model and A Comprehensive Multimodal Dataset Towards General Medical AI.
NVlabs/Hydra-MDP
BAAI-Agents/Cradle
The Cradle framework is a first attempt at General Computer Control (GCC). Cradle enables agents to master any computer task through strong reasoning, self-improvement, and skill curation, in a standardized general environment with minimal requirements.
baaivision/Emu3
Next-Token Prediction is All You Need
mit-han-lab/duo-attention
[ICLR 2025] DuoAttention: Efficient Long-Context LLM Inference with Retrieval and Streaming Heads
facebookresearch/lingua
Meta Lingua: a lean, efficient, and easy-to-hack codebase to research LLMs.
QwenLM/Qwen2.5-VL
Qwen2.5-VL is the multimodal large language model series developed by the Qwen team at Alibaba Cloud.
AIDC-AI/Ovis
A novel Multimodal Large Language Model (MLLM) architecture, designed to structurally align visual and textual embeddings.
JusticeFighterDance/JusticeFighter110
Evidence disclosure for the incident in which Tian Keyu (田柯宇) maliciously attacked a compute cluster.
real-stanford/diffusion_policy
[RSS 2023] Diffusion Policy: Visuomotor Policy Learning via Action Diffusion
QwenLM/Qwen
The official repo of Qwen (通义千问), the chat and pretrained large language model series developed by Alibaba Cloud.
NVIDIA/Megatron-Energon
Megatron's multi-modal data loader
All-Hands-AI/OpenHands
🙌 OpenHands: Code Less, Make More
shizhediao/Human-Contribution-Measurement
cfahlgren1/webllm-playground
Run LLMs in the Browser with MLC / WebLLM ✨
twke18/CAST
rt219/The-Emergence-of-Objectness
This is the officially released code for our paper, The Emergence of Objectness: Learning Zero-Shot Segmentation from Videos, accepted at NeurIPS 2021.
LLaVA-VL/LLaVA-NeXT
baaivision/tokenize-anything
[ECCV 2024] Tokenize Anything via Prompting
NVlabs/EAGLE
Eagle Family: Exploring Model Designs, Data Recipes and Training Strategies for Frontier-Class Multimodal LLMs
git-lfs/git-lfs
Git extension for versioning large files
baaivision/DenseFusion
DenseFusion-1M: Merging Vision Experts for Comprehensive Multimodal Perception
IDEA-Research/Grounded-SAM-2
Grounded SAM 2: Ground and Track Anything in Videos with Grounding DINO, Florence-2 and SAM 2
facebookresearch/sam2
The repository provides code for running inference with the Meta Segment Anything Model 2 (SAM 2), links for downloading the trained model checkpoints, and example notebooks that show how to use the model.
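The SAM 2 README documents a simple image-prediction API; below is a minimal sketch of that pattern. The checkpoint and config filenames follow the repo's released naming but may differ for your download, and the example image path and point prompt are assumptions:

```python
import numpy as np
import torch
from PIL import Image
from sam2.build_sam import build_sam2
from sam2.sam2_image_predictor import SAM2ImagePredictor

# Checkpoint/config names follow the repo's releases (assumption); adjust to your setup.
checkpoint = "./checkpoints/sam2.1_hiera_large.pt"
model_cfg = "configs/sam2.1/sam2.1_hiera_l.yaml"
predictor = SAM2ImagePredictor(build_sam2(model_cfg, checkpoint))

# Hypothetical input image and a single foreground point prompt (x, y); label 1 = foreground.
image = np.array(Image.open("example.jpg").convert("RGB"))
with torch.inference_mode():
    predictor.set_image(image)
    masks, scores, _ = predictor.predict(
        point_coords=np.array([[500, 375]]),
        point_labels=np.array([1]),
    )
```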