gersys

Sungkyunkwan Univ, South Korea

gersys's Stars

frostinassiky/denoiseloc
The official implementation of DenoiseLoc: Boundary Denoising for Video Activity Localization, ICLR 2024
Language:Python6
houzhijian/CONE
[2023 ACL] CONE: An Efficient COarse-to-fiNE Alignment Framework for Long Video Temporal Grounding
Language:Python284
waybarrios/guidance-based-video-grounding
[ICCV 2023] The official PyTorch implementation of the paper: "Localizing Moments in Long Video Via Multimodal Guidance"
Language:Python17
liyongqi67/MINDER
Language:Python451
RUC-NLPIR/GenIR-Survey
This is the repository for the GenIR survey.
1226
Pter61/context-i2w
Context-I2W: Mapping Images to Context-dependent words for Accurate Zero-Shot Composed Image Retrieval [AAAI 2024 Oral]
Language:Shell39
yzy-bupt/LDRE
[SIGIR'2024 Best Paper Honorable Mention] Official repository for "LDRE: LLM-based Divergent Reasoning and Ensemble for Zero-Shot Composed Image Retrieval"
Language:Python171
ninatu/howtocaption
Official implementation of "HowToCaption: Prompting LLMs to Transform Video Annotations at Scale." ECCV 2024
Language:Python45
DavidHuji/CapDec
CapDec: SOTA Zero Shot Image Captioning Using CLIP and GPT2, EMNLP 2022 (findings)
Language:Python18519
whwu95/Cap4Video
【CVPR'2023 Highlight & TPAMI】Cap4Video: What Can Auxiliary Captions Do for Text-Video Retrieval?
Language:Python23719
haotian-liu/LLaVA
[NeurIPS'23 Oral] Visual Instruction Tuning (LLaVA) built towards GPT-4V level capabilities and beyond.
Language:Python · 20.1k · 2.2k
Seonghoon-Yu/Pseudo-RIS
[ECCV 2024] Official code for "Pseudo-RIS: Distinctive Pseudo-supervision Generation for Referring Image Segmentation"
Language:Python14
ExplainableML/EgoCVR
[ECCV 2024] EgoCVR: An Egocentric Benchmark for Fine-Grained Composed Video Retrieval
Language:Python22
vl2g/CSTBIR
Language:Python121
haokunwen/Awesome-Composed-Image-Retrieval
Collection of Composed Image Retrieval (CIR) papers.
925
LLaVA-VL/LLaVA-NeXT
Language:Python · 2.8k · 224
jinhyunj/EaTR
Official pytorch repository for "Knowing Where to Focus: Event-aware Transformer for Video Grounding" (ICCV 2023)
Language:Python482
facebookresearch/sam2
The repository provides code for running inference with the Meta Segment Anything Model 2 (SAM 2), links for downloading the trained model checkpoints, and example notebooks that show how to use the model.
Language:Jupyter Notebook · 12.1k · 1.1k
Vision-CAIR/MiniGPT-4
Open-sourced codes for MiniGPT-4 and MiniGPT-v2 (https://minigpt-4.github.io, https://minigpt-v2.github.io/)
Language:Python · 25.4k · 2.9k
yunlong10/Awesome-LLMs-for-Video-Understanding
🔥🔥🔥Latest Papers, Codes and Datasets on Vid-LLMs.
1.5k · 75
meta-llama/llama-models
Utilities intended for use with Llama models.
Language:Python · 4.7k · 808
navervision/CompoDiff
Official Pytorch implementation of "CompoDiff: Versatile Composed Image Retrieval With Latent Diffusion" (TMLR 2024)
Language:Python803
ml-jku/cloob
Language:Python15411
Yui010206/SeViLA
[NeurIPS 2023] Self-Chained Image-Language Model for Video Localization and Question Answering
Language:Python17821
miccunifi/CIRCO
[ICCV 2023] - Composed Image Retrieval on Common Objects in context (CIRCO) dataset
Language:Python522
QinYang79/RDE
Noisy-Correspondence Learning for Text-to-Image Person Re-identification (CVPR 2024 Pytorch Code)
Language:Python623
ExplainableML/Vision_by_Language
[ICLR 2024] Official repository for "Vision-by-Language for Training-Free Compositional Image Retrieval"
Language:Python412
youngkyunJang/VDG
Visual Delta Generator with Large Multi-modal Model for Semi-supervised Composed Image Retrieval - CVPR2024
Language:Python151
miccunifi/SEARLE
[ICCV 2023] - Zero-shot Composed Image Retrieval with Textual Inversion
Language:Python1546
google-research/composed_image_retrieval
Language:Shell17217
