engindeniz's Stars
meta-llama/llama-recipes
Scripts for fine-tuning Meta Llama 3 with composable FSDP & PEFT methods, covering single- and multi-node GPU setups. Supports default & custom datasets for applications such as summarization and Q&A. Also supports a number of inference solutions, such as HF TGI and vLLM, for local or cloud deployment. Includes demo apps showcasing Meta Llama 3 for WhatsApp & Messenger.
apple/ml-ferret
ml-explore/mlx
MLX: An array framework for Apple silicon
AILab-CVC/SEED-Bench
(CVPR 2024) A benchmark for evaluating multimodal LLMs using multiple-choice questions.
BradyFU/Awesome-Multimodal-Large-Language-Models
✨✨ Latest Advances on Multimodal Large Language Models
filipecalegario/awesome-generative-ai
A curated list of Generative AI tools, works, models, and references
mbzuai-oryx/Video-ChatGPT
[ACL 2024 🔥] Video-ChatGPT is a video conversation model capable of generating meaningful conversations about videos. It combines the capabilities of LLMs with a pretrained visual encoder adapted for spatiotemporal video representation. The repo also introduces a rigorous quantitative evaluation benchmark for video-based conversational models.
huggingface/transformers
🤗 Transformers: State-of-the-art Machine Learning for PyTorch, TensorFlow, and JAX.
dair-ai/Prompt-Engineering-Guide
🐙 Guides, papers, lectures, notebooks, and resources for prompt engineering
yuhangzang/UPT
THUDM/P-tuning-v2
An optimized deep prompt tuning strategy comparable to fine-tuning across scales and tasks
microsoft/VideoX
VideoX: a collection of video cross-modal models
google/latexify_py
A library to generate LaTeX expressions from Python code.
KMnP/vpt
❄️🔥 Visual Prompt Tuning [ECCV 2022] https://arxiv.org/abs/2203.12119
salesforce/LAVIS
LAVIS - A One-stop Library for Language-Vision Intelligence
antoyang/FrozenBiLM
[NeurIPS 2022] Zero-Shot Video Question Answering via Frozen Bidirectional Language Models
antoyang/just-ask
[ICCV 2021 Oral + TPAMI] Just Ask: Learning to Answer Questions from Millions of Narrated Videos
karpathy/nn-zero-to-hero
Neural Networks: Zero to Hero
showlab/EgoVLP
[NeurIPS 2022] Egocentric Video-Language Pretraining
microsoft/LAVENDER
A Unified Framework for Video-Language Understanding
cmhungsteve/Awesome-Transformer-Attention
A comprehensive paper list on Vision Transformers and attention, including papers, code, and related websites
microsoft/UniVL
An official implementation of "UniVL: A Unified Video and Language Pre-Training Model for Multimodal Understanding and Generation"
a-nagrani/ffmpeg-commands
Collection of useful FFmpeg commands for processing audio and video files.
opencv/opencv
Open Source Computer Vision Library
microsoft/SwinBERT
Research code for CVPR 2022 paper "SwinBERT: End-to-End Transformers with Sparse Attention for Video Captioning"
SwinTransformer/Video-Swin-Transformer
An official implementation of "Video Swin Transformer".
s9xie/Mini-Kinetics-200
Mini-Kinetics-200 data splits used in paper "Rethinking Spatiotemporal Feature Learning For Video Understanding"
jayleicn/TVCaption
[ECCV 2020] PyTorch code of MMT (a multimodal transformer captioning model) on TVCaption dataset
m-bain/frozen-in-time
Frozen in Time: A Joint Video and Image Encoder for End-to-End Retrieval [ICCV'21]
medhini/clip_it
CLIP-It! Language-Guided Video Summarization