ekazakos

VILab, University of Bristol
Cambridge, UK

ekazakos's Stars

hiyouga/LLaMA-Factory
Unified Efficient Fine-Tuning of 100+ LLMs (ACL 2024)
Language: Python 36.8k 219 5.6k 4.5k
vllm-project/vllm
A high-throughput and memory-efficient inference and serving engine for LLMs
Language: Python 32.6k 268 5.7k 5k
Vision-CAIR/MiniGPT-4
Open-sourced codes for MiniGPT-4 and MiniGPT-v2 (https://minigpt-4.github.io, https://minigpt-v2.github.io/)
Language: Python 25.5k 218 4692.9k
roboflow/supervision
We write your reusable computer vision tools. 💜
Language: Python 24.5k 161 4501.8k
QwenLM/Qwen
The official repo of Qwen (通义千问) chat & pretrained large language model proposed by Alibaba Cloud.
Language: Python 15k 110 1.1k 1.2k
facebookresearch/sam2
The repository provides code for running inference with the Meta Segment Anything Model 2 (SAM 2), links for downloading the trained model checkpoints, and example notebooks that show how to use the model.
Language: Jupyter Notebook 13.3k 76 4001.3k
alshedivat/al-folio
A beautiful, simple, clean, and responsive Jekyll theme for academics
Language: HTML 11.6k 27 58911.4k
salesforce/LAVIS
LAVIS - A One-stop Library for Language-Vision Intelligence
Language: Jupyter Notebook 10.1k 97 676980
huggingface/lerobot
🤗 LeRobot: Making AI for Robotics more accessible with end-to-end learning
Language: Python 8k 85 155804
stanfordnlp/stanza
Stanford NLP Python library for tokenization, sentence segmentation, NER, and parsing of many human languages
Language: Python 7.3k 140 907895
gaomingqi/Track-Anything
Track-Anything is a flexible and interactive tool for video object tracking and segmentation, based on Segment Anything, XMem, and E2FGVI.
Language: Python 6.6k 62 140483
UX-Decoder/Segment-Everything-Everywhere-All-At-Once
[NeurIPS 2023] Official implementation of the paper "Segment Everything Everywhere All at Once"
Language: Python 4.4k 59 149412
InternLM/xtuner
An efficient, flexible and full-featured toolkit for fine-tuning LLM (InternLM2, Llama3, Phi3, Qwen, Mistral, ...)
Language: Python 4.1k 36 545323
open-mmlab/mmtracking
OpenMMLab Video Perception Toolbox. It supports Video Object Detection (VID), Multiple Object Tracking (MOT), Single Object Tracking (SOT), Video Instance Segmentation (VIS) with a unified framework.
Language: Python 3.6k 49 465597
visionml/pytracking
Visual tracking library based on PyTorch.
Language: Python 3.3k 83 411608
z-x-yang/Segment-and-Track-Anything
An open-source project dedicated to tracking and segmenting any objects in videos, either automatically or interactively. The primary algorithms utilized include the Segment Anything Model (SAM) for key-frame segmentation and Associating Objects with Transformers (AOT) for efficient tracking and propagation purposes.
Language: Jupyter Notebook 2.9k 52 154343
yunlong10/Awesome-LLMs-for-Video-Understanding
🔥🔥🔥Latest Papers, Codes and Datasets on Vid-LLMs.
1.7k 55 588
ttengwang/Caption-Anything
Caption-Anything is a versatile tool combining image segmentation, visual captioning, and ChatGPT, generating tailored captions with diverse controls for user preferences. https://huggingface.co/spaces/TencentARC/Caption-Anything https://huggingface.co/spaces/VIPLab/Caption-Anything
Language: Python 1.7k 16 24104
microsoft/X-Decoder
[CVPR 2023] Official Implementation of X-Decoder for generalized decoding for pixel, image and language
Language: Python 1.3k 34 69137
mbzuai-oryx/Video-ChatGPT
[ACL 2024 🔥] Video-ChatGPT is a video conversation model capable of generating meaningful conversation about videos. It combines the capabilities of LLMs with a pretrained visual encoder adapted for spatiotemporal video representation. We also introduce a rigorous 'Quantitative Evaluation Benchmarking' for video-based conversational models.
Language: Python 1.3k 15 123110
ashkamath/mdetr
Language: Python 985 19 98129
mbzuai-oryx/groundingLMM
[CVPR 2024 🔥] Grounding Large Multimodal Model (GLaMM), the first-of-its-kind model capable of generating natural language responses that are seamlessly integrated with object segmentation masks.
Language: Python 799 31 8038
FoundationVision/Groma
[ECCV2024] Grounded Multimodal Large Language Model with Localized Visual Tokenization
Language: Python 586 36 3661
mbzuai-oryx/Video-LLaVA
PG-Video-LLaVA: Pixel Grounding in Large Multimodal Video Models
Language: Python 248 14 1711
antoyang/TubeDETR
[CVPR 2022 Oral] TubeDETR: Spatio-Temporal Video Grounding with Transformers
Language: Python 173 3 229
Soldelli/Awesome-Temporal-Language-Grounding-in-Videos
A curated list of grounding natural language in video and related area. :-)
91 1 05
ROCm/vllm
A high-throughput and memory-efficient inference and serving engine for LLMs
Language: Python 51 6 129
ninatu/howtocaption
Official implementation of "HowToCaption: Prompting LLMs to Transform Video Annotations at Scale." ECCV 2024
Language: Python 49 5 10
JacobChalk/TIM
Codebase for the paper: "TIM: A Time Interval Machine for Audio-Visual Action Recognition"
Language: Python 39 3 375
Lzq5/Video-Text-Alignment
Language: Python 23 2 00
