luogen1996

Pinned Repositories

daily-paper-computer-vision
A daily-curated record of papers on computer vision, deep learning, and machine learning
1 1 01
LaVIN
[NeurIPS 2023] Official implementation of "Cheap and Quick: Efficient Vision-Language Instruction Tuning for Large Language Models"
Language:Python489 6 4137
LLaVA-HR
LLaVA-HR: High-Resolution Large Language-Vision Assistant
Language:Python177 3 158
LWTransformer
Lightweight Transformer for Multi-modal Tasks
Language:Python15 2 01
MCN
[CVPR 2020, oral] Multi-task Collaborative Network for Joint Referring Expression Comprehension and Segmentation
Language:Python133 6 1225
OneTeacher
Language:Python78 1 139
Real-time-Global-Inference-Network
Code for the paper "A Real-time Global Inference Network for One-stage Referring Expression Comprehension".
Language:Python9 3 33
RepAdapter
Official implementation of "Towards Efficient Visual Adaption via Structural Re-parameterization".
Language:Python184 15 1424
SimREC
A lightweight codebase for referring expression comprehension and segmentation
Language:Python49 2 04
wuenda
Language:HTML5 2 02

luogen1996's Repositories

luogen1996/LaVIN
[NeurIPS 2023] Official implementation of "Cheap and Quick: Efficient Vision-Language Instruction Tuning for Large Language Models"
Language:Python489 6 4137
luogen1996/RepAdapter
Official implementation of "Towards Efficient Visual Adaption via Structural Re-parameterization".
Language:Python184 15 1424
luogen1996/LLaVA-HR
LLaVA-HR: High-Resolution Large Language-Vision Assistant
Language:Python177 3 158
luogen1996/MCN
[CVPR 2020, oral] Multi-task Collaborative Network for Joint Referring Expression Comprehension and Segmentation
Language:Python133 6 1225
luogen1996/OneTeacher
Language:Python78 1 139
luogen1996/SimREC
A lightweight codebase for referring expression comprehension and segmentation
Language:Python49 2 04
luogen1996/LWTransformer
Lightweight Transformer for Multi-modal Tasks
Language:Python15 2 01
luogen1996/MoIL
Language:Python1 2 00
luogen1996/datasets
TFDS is a collection of datasets ready to use with TensorFlow, Jax, ...
Language:Python1 0
luogen1996/detectron2
Detectron2 is a platform for object detection, segmentation and other visual recognition tasks.
Language:Python1 0
luogen1996/detr
End-to-End Object Detection with Transformers
Language:Python1 0
luogen1996/fairseq
Facebook AI Research Sequence-to-Sequence Toolkit written in Python.
Language:Python2 0
luogen1996/FGD
Focal and Global Knowledge Distillation for Detectors (CVPR 2022)
Language:Python1 0
luogen1996/LaConvNet
Language:Python3 1
luogen1996/LLaVA
[NeurIPS'23 Oral] Visual Instruction Tuning (LLaVA) built towards GPT-4V level capabilities and beyond.
Language:Python0 0
luogen1996/lmbxmu
1 0
luogen1996/luogen1996
2 0
luogen1996/luogen1996.github.io
Language:HTML2 0
luogen1996/MAttNet
MAttNet: Modular Attention Network for Referring Expression Comprehension
Language:Jupyter Notebook2 0
luogen1996/mmf
A modular framework for vision & language multimodal research from Facebook AI Research (FAIR)
Language:Python1 0
luogen1996/openvqa
A lightweight, scalable, and general framework for visual question answering research
Language:Python1 0
luogen1996/RealGIN-Keras
Language:Python2 0
luogen1996/segment-anything
The repository provides code for running inference with the SegmentAnything Model (SAM), links for downloading the trained model checkpoints, and example notebooks that show how to use the model.
Language:Jupyter Notebook0 0
luogen1996/SeqTR
SeqTR: A Simple yet Universal Network for Visual Grounding
Language:Python1 0
luogen1996/Swin-Transformer
This is an official implementation for "Swin Transformer: Hierarchical Vision Transformer using Shifted Windows".
Language:Python1 0
luogen1996/TencentPretrain
Tencent Pre-training framework in PyTorch & Pre-trained Model Zoo
Language:Python0 0
luogen1996/TRAR-VQA
Official PyTorch implementation of the ICCV 2021 paper "TRAR: Routing the Attention Spans in Transformers for Visual Question Answering" for the VQA task
Language:Python1 0
luogen1996/VC-R-CNN
The official PyTorch implementation of the CVPR 2020 paper "Visual Commonsense R-CNN"
Language:Python2 0
luogen1996/vision_transformer
Language:Jupyter Notebook1 0
luogen1996/zhiweichen0012
1 0