MohamedAfham
Grad Student @ TU Darmstadt | Passionate about Deep Learning Research
TU Darmstadt
Darmstadt, Germany
MohamedAfham's Stars
haotian-liu/LLaVA
[NeurIPS'23 Oral] Visual Instruction Tuning (LLaVA) built towards GPT-4V level capabilities and beyond.
voxel51/fiftyone
Refine high-quality datasets and visual AI models
wkentaro/gdown
Google Drive Public File Downloader when Curl/Wget Fails
facebookresearch/jepa
PyTorch code and models for V-JEPA self-supervised learning from video.
qianqianwang68/omnimotion
mbzuai-oryx/Video-ChatGPT
[ACL 2024 🔥] Video-ChatGPT is a video conversation model capable of generating meaningful conversation about videos. It combines the capabilities of LLMs with a pretrained visual encoder adapted for spatiotemporal video representation. We also introduce a rigorous 'Quantitative Evaluation Benchmarking' for video-based conversational models.
xiaobai1217/Awesome-Video-Datasets
Video datasets
microsoft/VideoX
VideoX: a collection of video cross-modal models
mbzuai-oryx/groundingLMM
[CVPR 2024 🔥] Grounding Large Multimodal Model (GLaMM), the first-of-its-kind model capable of generating natural language responses that are seamlessly integrated with object segmentation masks.
KaiserWhoLearns/CS-PhD-Application-fee-waivers
Collections of CS PhD Application Fee Waivers of schools in North America
muzairkhattak/ViFi-CLIP
[CVPR 2023] Official repository of paper titled "Fine-tuned CLIP models are efficient video learners".
mbzuai-oryx/Video-LLaVA
PG-Video-LLaVA: Pixel Grounding in Large Multimodal Video Models
mbzuai-oryx/VideoGPT-plus
Official Repository of paper VideoGPT+: Integrating Image and Video Encoders for Enhanced Video Understanding
u2seg/U2Seg
[CVPR 2024] Code release for "Unsupervised Universal Image Segmentation"
xlliu7/TadTR
[TIP 2022] End-to-end Temporal Action Detection with Transformer
vra/flopth
A simple program to calculate and visualize the FLOPs and parameters of PyTorch models, with a handy CLI and an easy-to-use Python API.
jameelhassan/PromptAlign
[NeurIPS 2023] Align Your Prompts: Test-Time Prompting with Distribution Alignment for Zero-Shot Generalization
egoschema/EgoSchema
mengyuest/AR-Net
Jathurshan0330/Cross-Modal-Transformer
Official repository of cross-modal transformer for interpretable automatic sleep stage classification. https://arxiv.org/abs/2208.06991
kahnchana/clippy
Perceptual Grouping in Contrastive Vision-Language Models (ICCV'23)
kkahatapitiya/LangRepo
Language Repository for Long Video Understanding
Muzammal-Naseer/DCViT-AT
Official repository for "Boosting Adversarial Transferability using Dynamic Cues" (ICLR 2023)
MCG-NJU/OCSampler
[CVPR 2022] OCSampler: Compressing Videos to One Clip with Single-step Sampling
auniquesun/CrossPoint-DDP
PyTorch DistributedDataParallel (DDP) implementation of the CVPR 2022 paper CrossPoint.
harinduravin/DualCam
DualCam Traffic Light Dataset created using two synchronized cameras, one narrow-angle and one wide-angle.
theamaya/CrossMoST
Cross-Modal Self-Training: Aligning Images and Point Clouds to learn Classification without Labels
theamaya/3DLatNav
BrinthanK/pyDeepP2SA
A Python package for particle size and shape analysis using deep learning
wadduwagelab/All-Optical-QPM