sanjayss34's Stars
nerfstudio-project/nerfstudio
A collaboration-friendly studio for NeRFs
jwyang/faster-rcnn.pytorch
A faster PyTorch implementation of Faster R-CNN
NVIDIA/cutlass
CUDA Templates for Linear Algebra Subroutines
LLaVA-VL/LLaVA-NeXT
peteanderson80/bottom-up-attention
Bottom-up attention model for image captioning and VQA, based on Faster R-CNN and Visual Genome
shubham-goel/4D-Humans
4DHumans: Reconstructing and Tracking Humans with Transformers
airsplay/lxmert
PyTorch code for EMNLP 2019 paper "LXMERT: Learning Cross-Modality Encoder Representations from Transformers".
nikitakit/self-attentive-parser
High-accuracy NLP parser with models for 11 languages.
facebookresearch/av_hubert
A self-supervised learning framework for audio-visual speech
google/learned_optimization
jwyang/graph-rcnn.pytorch
[ECCV 2018] Official code for "Graph R-CNN for Scene Graph Generation"
uclanlp/visualbert
Code for the paper "VisualBERT: A Simple and Performant Baseline for Vision and Language"
CogComp/cogcomp-nlp
CogComp's Natural Language Processing Libraries and Demos: Modules include lemmatizer, ner, pos, prep-srl, quantifier, question type, relation-extraction, similarity, temporal normalizer, tokenizer, transliteration, verb-sense, and more.
mpc001/Visual_Speech_Recognition_for_Multiple_Languages
Visual Speech Recognition for Multiple Languages
brjathu/LART
Code repository for the paper "On the Benefits of 3D Pose and Tracking for Human Action Recognition" (CVPR 2023)
lil-lab/nlvr
Cornell NLVR and NLVR2 are natural language grounding datasets. Each example shows a visual input and a sentence describing it, and is annotated with the truth-value of the sentence.
airsplay/py-bottom-up-attention
PyTorch bottom-up attention with Detectron2
rowanz/merlot
MERLOT: Multimodal Neural Script Knowledge Models
Cyanogenoid/vqa-counting
[ICLR 2018] Learning to Count Objects in Natural Images for Visual Question Answering
HazyResearch/lolcats
Repo for "LoLCATs: On Low-Rank Linearizing of Large Language Models"
eladsegal/tag-based-multi-span-extraction
The official implementation of the EMNLP 2020 paper "A Simple and Effective Model for Answering Multi-span Questions".
allenai/medicat
Dataset of medical images, captions, subfigure-subcaption annotations, and inline textual references
allenai/faithful-nmn
Evaluating and improving the faithfulness of the interpretations offered by Neural Module Networks
danyaljj/infinitelyDeepNeuralNetworks
Deep Learning without any depth limitation
hack4impact-upenn/close-calls-philly
(Clean Air Council Spring 2017): A vehicle incident reporting web application
jacobkahn/cis-700-project