Pinned Repositories
daily-paper-computer-vision
A daily log of curated papers on computer vision, deep learning, machine learning, and related areas
LaVIN
[NeurIPS 2023] Official implementations of "Cheap and Quick: Efficient Vision-Language Instruction Tuning for Large Language Models"
LLaVA-HR
LLaVA-HR: High-Resolution Large Language-Vision Assistant
LWTransformer
Lightweight Transformer for Multi-modal Tasks
MCN
[CVPR 2020, oral] Multi-task Collaborative Network for Joint Referring Expression Comprehension and Segmentation
OneTeacher
Real-time-Global-Inference-Network
Code for paper "A Real-time Global Inference Network for One-stage Referring Expression Comprehension."
RepAdapter
Official implementation of "Towards Efficient Visual Adaption via Structural Re-parameterization".
SimREC
A lightweight codebase for referring expression comprehension and segmentation
wuenda
luogen1996's Repositories
luogen1996/LaVIN
[NeurIPS 2023] Official implementations of "Cheap and Quick: Efficient Vision-Language Instruction Tuning for Large Language Models"
luogen1996/RepAdapter
Official implementation of "Towards Efficient Visual Adaption via Structural Re-parameterization".
luogen1996/LLaVA-HR
LLaVA-HR: High-Resolution Large Language-Vision Assistant
luogen1996/MCN
[CVPR 2020, oral] Multi-task Collaborative Network for Joint Referring Expression Comprehension and Segmentation
luogen1996/OneTeacher
luogen1996/SimREC
A lightweight codebase for referring expression comprehension and segmentation
luogen1996/LWTransformer
Lightweight Transformer for Multi-modal Tasks
luogen1996/MoIL
luogen1996/datasets
TFDS is a collection of datasets ready to use with TensorFlow, Jax, ...
luogen1996/detectron2
Detectron2 is a platform for object detection, segmentation and other visual recognition tasks.
luogen1996/detr
End-to-End Object Detection with Transformers
luogen1996/fairseq
Facebook AI Research Sequence-to-Sequence Toolkit written in Python.
luogen1996/FGD
Focal and Global Knowledge Distillation for Detectors (CVPR 2022)
luogen1996/LaConvNet
luogen1996/LLaVA
[NeurIPS'23 Oral] Visual Instruction Tuning (LLaVA) built towards GPT-4V level capabilities and beyond.
luogen1996/lmbxmu
luogen1996/luogen1996
luogen1996/luogen1996.github.io
luogen1996/MAttNet
MAttNet: Modular Attention Network for Referring Expression Comprehension
luogen1996/mmf
A modular framework for vision & language multimodal research from Facebook AI Research (FAIR)
luogen1996/openvqa
A lightweight, scalable, and general framework for visual question answering research
luogen1996/RealGIN-Keras
luogen1996/segment-anything
The repository provides code for running inference with the Segment Anything Model (SAM), links for downloading the trained model checkpoints, and example notebooks that show how to use the model.
luogen1996/SeqTR
SeqTR: A Simple yet Universal Network for Visual Grounding
luogen1996/Swin-Transformer
This is an official implementation for "Swin Transformer: Hierarchical Vision Transformer using Shifted Windows".
luogen1996/TencentPretrain
Tencent Pre-training framework in PyTorch & Pre-trained Model Zoo
luogen1996/TRAR-VQA
This is the official PyTorch implementation of our ICCV 2021 paper "TRAR: Routing the Attention Spans in Transformers for Visual Question Answering" for the VQA task
luogen1996/VC-R-CNN
The official PyTorch implementation of the CVPR 2020 paper "Visual Commonsense R-CNN"
luogen1996/vision_transformer
luogen1996/zhiweichen0012