dddraxxx
Dong, Qihua. Interested in discovering intelligence in M-LLM and building general AI!
Northeastern University, SmileLab, Boston
dddraxxx's Stars
CompVis/latent-diffusion
High-Resolution Image Synthesis with Latent Diffusion Models
QwenLM/Qwen2.5
Qwen2.5 is the large language model series developed by the Qwen team at Alibaba Cloud.
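A minimal sketch of running a Qwen2.5 instruct model through Hugging Face transformers; the model id and generation settings below are illustrative choices, not taken from the repo.

```python
# Minimal sketch: loading a Qwen2.5 instruct model with Hugging Face transformers.
# The model id and generation settings are illustrative assumptions.
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "Qwen/Qwen2.5-7B-Instruct"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id, torch_dtype="auto", device_map="auto")

messages = [
    {"role": "system", "content": "You are a helpful assistant."},
    {"role": "user", "content": "Summarize latent diffusion in one sentence."},
]
# Build the chat prompt with the model's chat template, then generate a reply.
inputs = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)
output_ids = model.generate(inputs, max_new_tokens=128)
print(tokenizer.decode(output_ids[0][inputs.shape[-1]:], skip_special_tokens=True))
```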
ShiArthur03/ShiArthur03
facebookresearch/xformers
Hackable and optimized Transformers building blocks, supporting a composable construction.
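One of the most used building blocks is the memory-efficient attention op; a small sketch of calling it as a drop-in for scaled dot-product attention (shapes and dtypes are illustrative, and a CUDA build of xformers is assumed).

```python
# Sketch of xformers' memory-efficient attention; tensors are [batch, seq, heads, head_dim].
import torch
import xformers.ops as xops

q = torch.randn(2, 1024, 8, 64, device="cuda", dtype=torch.float16)
k = torch.randn(2, 1024, 8, 64, device="cuda", dtype=torch.float16)
v = torch.randn(2, 1024, 8, 64, device="cuda", dtype=torch.float16)

# Equivalent to softmax(q @ k^T / sqrt(d)) @ v, computed without materializing
# the full attention matrix.
out = xops.memory_efficient_attention(q, k, v)
print(out.shape)  # torch.Size([2, 1024, 8, 64])
```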
InternLM/xtuner
An efficient, flexible and full-featured toolkit for fine-tuning LLMs (InternLM2, Llama3, Phi3, Qwen, Mistral, ...)
QwenLM/Qwen2-VL
Qwen2-VL is the multimodal large language model series developed by the Qwen team at Alibaba Cloud.
poloclub/transformer-explainer
Transformer Explained Visually: Learn How LLM Transformer Models Work with Interactive Visualization
LLaVA-VL/LLaVA-NeXT
open-compass/VLMEvalKit
Open-source evaluation toolkit for large vision-language models (LVLMs), supporting 160+ VLMs and 50+ benchmarks.
lxtGH/OMG-Seg
OMG-LLaVA and OMG-Seg codebase [CVPR-24 and NeurIPS-24]
pytorch/data
A PyTorch repo for data loading and utilities to be shared by the PyTorch domain libraries.
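One utility from this repo is a stateful DataLoader whose iteration progress can be checkpointed and resumed mid-epoch; a minimal sketch, with the class name and behavior assumed from the torchdata documentation.

```python
# Sketch of torchdata's StatefulDataLoader, a DataLoader drop-in whose iteration
# state can be saved and restored (API assumed from the torchdata docs).
import torch
from torchdata.stateful_dataloader import StatefulDataLoader

dataset = torch.arange(100)
loader = StatefulDataLoader(dataset, batch_size=10, num_workers=0)

it = iter(loader)
next(it); next(it)            # consume two batches
state = loader.state_dict()   # snapshot mid-epoch progress

# A fresh loader resumes from the saved position instead of restarting the epoch.
resumed = StatefulDataLoader(dataset, batch_size=10, num_workers=0)
resumed.load_state_dict(state)
print(next(iter(resumed)))    # expected: the third batch, tensor([20, ..., 29])
```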
cheahjs/free-llm-api-resources
A list of free LLM inference resources accessible via API.
waspinator/pycococreator
Helper functions to create COCO datasets
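pycococreator assembles annotations into the standard COCO JSON layout; a hand-rolled sketch of that target structure (the structure is the well-known COCO format, not the library's own helper API, and all values are toy placeholders).

```python
# Hand-rolled sketch of the COCO annotation layout that pycococreator helps build.
import json

coco = {
    "info": {"description": "toy dataset"},
    "categories": [{"id": 1, "name": "person", "supercategory": "object"}],
    "images": [
        {"id": 1, "file_name": "0001.jpg", "width": 640, "height": 480},
    ],
    "annotations": [
        {
            "id": 1,
            "image_id": 1,
            "category_id": 1,
            "bbox": [100, 120, 50, 80],    # [x, y, width, height]
            "area": 50 * 80,
            "segmentation": [[100, 120, 150, 120, 150, 200, 100, 200]],  # polygon
            "iscrowd": 0,
        },
    ],
}

with open("instances_train.json", "w") as f:
    json.dump(coco, f)
```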
NVlabs/GroupViT
Official PyTorch implementation of GroupViT: Semantic Segmentation Emerges from Text Supervision, CVPR 2022.
iamhyc/Overleaf-Workshop
Open Overleaf/ShareLaTex projects in vscode, with full collaboration support.
conda/conda-pack
Package conda environments for redistribution
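Besides the CLI, conda-pack exposes a small Python API; a sketch, assuming the `pack()` entry point with `name`/`output` keyword arguments as described in its docs.

```python
# Minimal sketch of packing an environment with conda-pack's Python API
# (the name/output keyword arguments are assumed from the conda-pack docs).
import conda_pack

# Archive the environment "myenv" into a relocatable tarball; on the target
# machine it is unpacked with tar and activated via its bundled scripts.
conda_pack.pack(name="myenv", output="myenv.tar.gz")
```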
lvis-dataset/lvis-api
Python API for LVIS Dataset
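The LVIS API mirrors the pycocotools COCO API; a sketch of loading and querying annotations, with method names assumed from that COCO-style convention.

```python
# Sketch of querying LVIS annotations; the LVIS class mirrors the pycocotools
# COCO API (method names assumed from that convention).
from lvis import LVIS

lvis = LVIS("lvis_v1_val.json")                    # path to an LVIS annotation file

img_ids = lvis.get_img_ids()
ann_ids = lvis.get_ann_ids(img_ids=img_ids[:1])    # annotations of the first image
anns = lvis.load_anns(ann_ids)

for ann in anns:
    cat = lvis.load_cats([ann["category_id"]])[0]
    print(cat["name"], ann["bbox"])                # category name and [x, y, w, h] box
```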
tsb0601/MMVP
lil-lab/nlvr
Cornell NLVR and NLVR2 are natural language grounding datasets. Each example shows a visual input and a sentence describing it, and is annotated with the truth-value of the sentence.
ZrrSkywalker/MathVerse
[ECCV 2024] Does Your Multi-modal LLM Truly See the Diagrams in Visual Math Problems?
sail-sg/ptp
[CVPR 2023] Code for "Position-guided Text Prompt for Vision-Language Pre-training"
ByungKwanLee/Full-Segment-Anything
A PyTorch implementation that adds new features to Segment Anything: batched input for the full-grid prompt (automatic mask generation), post-processing that removes duplicated or small regions and holes, and support for flexible input image sizes.
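For reference, stock Segment Anything already exposes the full-grid automatic mask generator that this fork extends with batching; a sketch using the upstream API (checkpoint path and image file are illustrative).

```python
# Sketch of the stock Segment Anything automatic mask generator that this fork
# extends with batched full-grid prompting; checkpoint path is illustrative.
import cv2
from segment_anything import SamAutomaticMaskGenerator, sam_model_registry

sam = sam_model_registry["vit_h"](checkpoint="sam_vit_h_4b8939.pth").to("cuda")
mask_generator = SamAutomaticMaskGenerator(sam)

image = cv2.cvtColor(cv2.imread("example.jpg"), cv2.COLOR_BGR2RGB)
masks = mask_generator.generate(image)   # list of dicts: segmentation, bbox, area, ...
print(len(masks), masks[0]["bbox"])
```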
stevewongv/SSIS
Instance Shadow Detection with A Single-Stage Detector [SSIS & SSISv2] (CVPR 2021 Oral & TPAMI 2022)
scenarios/WeMM
PhyscalX/gradio-image-prompter
Image Prompter for Gradio
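A sketch of wiring this component into a Gradio app; the `ImagePrompter` class name and the `image`/`points` output keys are assumed from the repo's README.

```python
# Minimal sketch of using the ImagePrompter component in a Gradio interface
# (component name and output keys assumed from the repo README).
import gradio as gr
from gradio_image_prompter import ImagePrompter

def show(prompts):
    # prompts is expected to carry the uploaded image and the clicked point/box prompts
    return prompts["image"], prompts["points"]

demo = gr.Interface(
    fn=show,
    inputs=ImagePrompter(show_label=False),
    outputs=[gr.Image(show_label=False), gr.Dataframe(label="Points")],
)
demo.launch()
```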
harrytea/Detect-AnyShadow
Official PyTorch implementation for TCSVT 23 "Detect Any Shadow: Segment Anything for Video Shadow Detection"
cmprmsd/Overleaf-Image-Helper
Adds functionality to paste screenshots from your clipboard into Overleaf, both cloud and on-premise.
filipbasara0/simple-clip
A minimal, but effective implementation of CLIP (Contrastive Language-Image Pretraining) in PyTorch
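The core of any CLIP-style model is a symmetric contrastive (InfoNCE) loss over paired image/text embeddings; a generic sketch of that loss, not necessarily this repo's exact code.

```python
# Generic sketch of CLIP's symmetric contrastive (InfoNCE) loss over a batch of
# paired image/text embeddings; not necessarily this repo's exact implementation.
import torch
import torch.nn.functional as F

def clip_loss(image_emb, text_emb, logit_scale):
    # L2-normalize both modalities so the dot product is a cosine similarity.
    image_emb = F.normalize(image_emb, dim=-1)
    text_emb = F.normalize(text_emb, dim=-1)

    # Pairwise similarity matrix, scaled by the learned temperature.
    logits = logit_scale * image_emb @ text_emb.t()

    # Matching pairs lie on the diagonal; contrast in both directions.
    targets = torch.arange(logits.size(0), device=logits.device)
    loss_i2t = F.cross_entropy(logits, targets)
    loss_t2i = F.cross_entropy(logits.t(), targets)
    return (loss_i2t + loss_t2i) / 2

# Toy usage: batch of 8 paired 512-d embeddings with a fixed logit scale.
loss = clip_loss(torch.randn(8, 512), torch.randn(8, 512), logit_scale=torch.tensor(100.0))
print(loss.item())
```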
JierunChen/Ref-L4
Evaluation code for Ref-L4, a new REC benchmark in the LMM era
liujunzhuo/FineCops-Ref
Official repo for "FineCops-Ref: A new Dataset and Task for Fine-Grained Compositional Referring Expression Comprehension." EMNLP 2024