bpiyush
1st year DPhil, VGG, Oxford. Past: MSc in AI from UvA | Research @ Wadhwani AI | B.S. in Mathematics @ IIT Kanpur
University of Oxford · Oxford
bpiyush's Stars
meta-llama/llama-recipes
Scripts for fine-tuning Meta Llama with composable FSDP & PEFT methods, covering single- and multi-node GPU setups. Supports default & custom datasets for applications such as summarization and Q&A, plus a number of candidate inference solutions such as HF TGI and vLLM for local or cloud deployment. Includes demo apps showcasing Meta Llama for WhatsApp & Messenger.
salesforce/LAVIS
LAVIS - A One-stop Library for Language-Vision Intelligence
cambrian-mllm/cambrian
Cambrian-1 is a family of multimodal LLMs with a vision-centric design.
google-gemini/generative-ai-python
The official Python library for the Google Gemini API
facebookresearch/TimeSformer
The official PyTorch implementation of the paper "Is Space-Time Attention All You Need for Video Understanding?"
FoundationVision/LlamaGen
Autoregressive Model Beats Diffusion: 🦙 Llama for Scalable Image Generation
DAMO-NLP-SG/VideoLLaMA2
VideoLLaMA 2: Advancing Spatial-Temporal Modeling and Audio Understanding in Video-LLMs
rhymes-ai/Aria
Codebase for Aria - an Open Multimodal Native MoE
PKU-YuanGroup/LanguageBind
[ICLR 2024 🔥] Extending Video-Language Pretraining to N-modality by Language-based Semantic Alignment
ylsung/Ladder-Side-Tuning
PyTorch code for "LST: Ladder Side-Tuning for Parameter and Memory Efficient Transfer Learning"
ziqipang/LM4VisualEncoding
[ICLR 2024 (Spotlight)] "Frozen Transformers in Language Models are Effective Visual Encoder Layers"
zhangfaen/finetune-Qwen2-VL
reka-ai/reka-vibe-eval
Multimodal language model benchmark, featuring challenging examples
joaanna/something_else
Code repository for the paper: 'Something-Else: Compositional Action Recognition with Spatial-Temporal Interaction Networks'
epic-kitchens/epic-kitchens-100-annotations
Annotations for the public release of the EPIC-KITCHENS-100 dataset
see2sound/see2sound
Official code for SEE-2-SOUND: Zero-Shot Spatial Environment-to-Spatial Sound
yukw777/VideoBLIP
Supercharged BLIP-2 that can handle videos
dylran/crowddiff
NeeluMadan/ViFM_Survey
Foundation Models for Video Understanding: A Survey
songrise/CLIP-Count
[ACM MM23] CLIP-Count: Towards Text-Guided Zero-Shot Object Counting
wenhuchen/Time-Sensitive-QA
Code and data for the NeurIPS 2021 paper "A Dataset for Answering Time-Sensitive Questions"
park-jungin/DualPath
ninatu/howtocaption
Official implementation of "HowToCaption: Prompting LLMs to Transform Video Annotations at Scale" (ECCV 2024)
Yiming-M/CLIP-EBC
The official implementation of the crowd counting model CLIP-EBC.
alibaba-mmai-research/DiST
ICCV2023: Disentangling Spatial and Temporal Learning for Efficient Image-to-Video Transfer Learning
wlin-at/MAXI
MAtch, eXpand and Improve: Unsupervised Finetuning for Zero-Shot Action Recognition with Language Knowledge (ICCV 2023)
leexinhao/ZeroI2V
Official implementation of "ZeroI2V: Zero-Cost Adaptation of Pre-trained Transformers from Image to Video" (ECCV2024)
zhaochen0110/Timo
Code and data for "Timo: Towards Better Temporal Reasoning for Language Models" (COLM 2024)
eisneim/clip-vip_video_search
Shows how to use CLIP-ViP for video search
Pshubham1012/Classification-approach