MohamedAfham

Grad Student @ TU Darmstadt | Passionate about Deep Learning Research

TU Darmstadt, Darmstadt, Germany

MohamedAfham's Stars

haotian-liu/LLaVA
[NeurIPS'23 Oral] Visual Instruction Tuning (LLaVA) built towards GPT-4V level capabilities and beyond.
Language:Python20.4k 159 1.5k2.3k
voxel51/fiftyone
Refine high-quality datasets and visual AI models
Language:Python8.9k 62 1.5k565
wkentaro/gdown
Google Drive Public File Downloader when Curl/Wget Fails
Language:Python4.3k 24 178350
facebookresearch/jepa
PyTorch code and models for V-JEPA self-supervised learning from video.
Language:Python2.7k 37 56254
qianqianwang68/omnimotion
Language:Python2.1k 123 55126
mbzuai-oryx/Video-ChatGPT
[ACL 2024 🔥] Video-ChatGPT is a video conversation model capable of generating meaningful conversation about videos. It combines the capabilities of LLMs with a pretrained visual encoder adapted for spatiotemporal video representation. We also introduce a rigorous 'Quantitative Evaluation Benchmarking' for video-based conversational models.
Language:Python1.2k 15 122108
xiaobai1217/Awesome-Video-Datasets
Video datasets
1.2k 28 1295
microsoft/VideoX
VideoX: a collection of video cross-modal models
Language:Python982 23 112161
mbzuai-oryx/groundingLMM
[CVPR 2024 🔥] Grounding Large Multimodal Model (GLaMM), the first-of-its-kind model capable of generating natural language responses that are seamlessly integrated with object segmentation masks.
Language:Python784 31 7538
KaiserWhoLearns/CS-PhD-Application-fee-waivers
Collections of CS PhD Application Fee Waivers of schools in North America
348 13 330
muzairkhattak/ViFi-CLIP
[CVPR 2023] Official repository of paper titled "Fine-tuned CLIP models are efficient video learners".
Language:Python250 9 2118
mbzuai-oryx/Video-LLaVA
PG-Video-LLaVA: Pixel Grounding in Large Multimodal Video Models
Language:Python245 14 1611
mbzuai-oryx/VideoGPT-plus
Official Repository of paper VideoGPT+: Integrating Image and Video Encoders for Enhanced Video Understanding
Language:Python218 5 2615
u2seg/U2Seg
[CVPR 2024] Code release for "Unsupervised Universal Image Segmentation"
Language:Python175 5 116
xlliu7/TadTR
[TIP 2022] End-to-end Temporal Action Detection with Transformer
Language:Python144 12 2812
vra/flopth
A simple program to calculate and visualize the FLOPs and parameters of PyTorch models, with a handy CLI and an easy-to-use Python API.
Language:Python119 3 89
jameelhassan/PromptAlign
[NeurIPS 2023] Align Your Prompts: Test-Time Prompting with Distribution Alignment for Zero-Shot Generalization
Language:Python97 3 1311
egoschema/EgoSchema
Language:Python72 1 210
mengyuest/AR-Net
Language:Python61 5 87
Jathurshan0330/Cross-Modal-Transformer
Official repository of cross-modal transformer for interpretable automatic sleep stage classification. https://arxiv.org/abs/2208.06991
Language:Jupyter Notebook46 2 44
kahnchana/clippy
Perceptual Grouping in Contrastive Vision-Language Models (ICCV'23)
Language:Jupyter Notebook37 3 15
kkahatapitiya/LangRepo
Language Repository for Long Video Understanding
Language:Python28 2 13
Muzammal-Naseer/DCViT-AT
Official repository for "Boosting Adversarial Transferability using Dynamic Cues" (ICLR 2023)
Language:Python19 1 12
MCG-NJU/OCSampler
[CVPR 2022] OCSampler: Compressing Videos to One Clip with Single-step Sampling
Language:Python17 1 33
auniquesun/CrossPoint-DDP
PyTorch DistributedDataParallel (DDP) implementation of the CVPR 2022 paper CrossPoint.
Language:Jupyter Notebook15 1 61
harinduravin/DualCam
DualCam Traffic Light Dataset created using two synchronous narrow angle and wide angle cameras.
Language:Python15 1 13
theamaya/CrossMoST
Cross-Modal Self-Training: Aligning Images and Point Clouds to learn Classification without Labels
Language:Python60
theamaya/3DLatNav
Language:Jupyter Notebook4 1 00
BrinthanK/pyDeepP2SA
A python package for particle size and shape analysis using deep learning
Language:Jupyter Notebook3 1 00
wadduwagelab/All-Optical-QPM
Language:Jupyter Notebook2 1 10
