sanjayss34

sanjayss34's Stars

nerfstudio-project/nerfstudio
A collaboration friendly studio for NeRFs
Language: Python 9.7k 114 1.7k 1.3k
jwyang/faster-rcnn.pytorch
A faster pytorch implementation of faster r-cnn
Language: Python 7.7k 91 8392.3k
NVIDIA/cutlass
CUDA Templates for Linear Algebra Subroutines
Language: C++ 5.9k 109 1.2k 1k
LLaVA-VL/LLaVA-NeXT
Language: Python 3.2k 37 331278
peteanderson80/bottom-up-attention
Bottom-up attention model for image captioning and VQA, based on Faster R-CNN and Visual Genome
Language: Jupyter Notebook 1.4k 26 117378
shubham-goel/4D-Humans
4DHumans: Reconstructing and Tracking Humans with Transformers
Language: Python 1.3k 23 152121
airsplay/lxmert
PyTorch code for EMNLP 2019 paper "LXMERT: Learning Cross-Modality Encoder Representations from Transformers".
Language: Python 940 18 113159
nikitakit/self-attentive-parser
High-accuracy NLP parser with models for 11 languages.
Language: Python 875 20 98155
facebookresearch/av_hubert
A self-supervised learning framework for audio-visual speech
Language: Python 862 15 111138
google/learned_optimization
Language: Python 757 14 1663
jwyang/graph-rcnn.pytorch
[ECCV 2018] Official code for "Graph R-CNN for Scene Graph Generation"
Language: Python 732 29 113157
uclanlp/visualbert
Code for the paper "VisualBERT: A Simple and Performant Baseline for Vision and Language"
Language: Python 530 14 40104
CogComp/cogcomp-nlp
CogComp's Natural Language Processing Libraries and Demos: Modules include lemmatizer, ner, pos, prep-srl, quantifier, question type, relation-extraction, similarity, temporal normalizer, tokenizer, transliteration, verb-sense, and more.
Language: Java 475 62 385142
mpc001/Visual_Speech_Recognition_for_Multiple_Languages
Visual Speech Recognition for Multiple Languages
Language: Python 361 13 2956
brjathu/LART
Code repository for the paper "On the Benefits of 3D Pose and Tracking for Human Action Recognition", (CVPR 2023)
Language: Jupyter Notebook 258 7 3132
lil-lab/nlvr
Cornell NLVR and NLVR2 are natural language grounding datasets. Each example shows a visual input and a sentence describing it, and is annotated with the truth-value of the sentence.
Language: HTML 258 8 958
airsplay/py-bottom-up-attention
PyTorch bottom-up attention with Detectron2
Language: Python 231 5 2957
rowanz/merlot
MERLOT: Multimodal Neural Script Knowledge Models
Language: Python 223 14 1825
Cyanogenoid/vqa-counting
[ICLR 2018] Learning to Count Objects in Natural Images for Visual Question Answering
Language: Python 205 10 1348
HazyResearch/lolcats
Repo for "LoLCATs: On Low-Rank Linearizing of Large Language Models"
Language: Python 200 20 621
eladsegal/tag-based-multi-span-extraction
The official implementation of EMNLP 2020, "A Simple and Effective Model for Answering Multi-span Questions".
Language: Python 157 2 2037
allenai/medicat
Dataset of medical images, captions, subfigure-subcaption annotations, and inline textual references
Language: Python 135 8 916
allenai/faithful-nmn
Evaluating and improving the faithfulness of the interpretations offered by Neural Module Networks
Language: Python 13 8 1 1
danyaljj/infinitelyDeepNeuralNetworks
Deep Learning without any depth limitation
Language: Python 6 2 0 1
hack4impact-upenn/close-calls-philly
(Clean Air Council Spring 2017): A vehicle incident reporting web application
Language: Python 5 6 0 1
jacobkahn/cis-700-project
Language: Python 3 2 0 0