yash0307
Computer Vision, Machine Learning.
Czech Technical University, Carnegie Mellon University, IIIT Hyderabad
Prague, Czech Republic
yash0307's Stars
facebookresearch/segment-anything
The repository provides code for running inference with the Segment Anything Model (SAM), links for downloading the trained model checkpoints, and example notebooks that show how to use the model.
haotian-liu/LLaVA
[NeurIPS'23 Oral] Visual Instruction Tuning (LLaVA) built towards GPT-4V level capabilities and beyond.
Dao-AILab/flash-attention
Fast and memory-efficient exact attention
mlfoundations/open_clip
An open source implementation of CLIP.
kornia/kornia
Geometric Computer Vision Library for Spatial AI
salesforce/LAVIS
LAVIS - A One-stop Library for Language-Vision Intelligence
facebookresearch/xformers
Hackable and optimized Transformers building blocks, supporting a composable construction.
open-mmlab/mmsegmentation
OpenMMLab Semantic Segmentation Toolbox and Benchmark.
salesforce/BLIP
PyTorch code for BLIP: Bootstrapping Language-Image Pre-training for Unified Vision-Language Understanding and Generation
makcedward/nlpaug
Data augmentation for NLP
rom1504/img2dataset
Easily turn large sets of image URLs into an image dataset. Can download, resize, and package 100M URLs in 20h on one machine.
magicleap/SuperGluePretrainedNetwork
SuperGlue: Learning Feature Matching with Graph Neural Networks (CVPR 2020, Oral)
rom1504/clip-retrieval
Easily compute CLIP embeddings and build a CLIP retrieval system with them.
OFA-Sys/OFA
Official repository of OFA (ICML 2022). Paper: OFA: Unifying Architectures, Tasks, and Modalities Through a Simple Sequence-to-Sequence Learning Framework
zju3dv/LoFTR
Code for "LoFTR: Detector-Free Local Feature Matching with Transformers", CVPR 2021, T-PAMI 2022
PKU-YuanGroup/Chat-UniVi
[CVPR 2024 Highlight 🔥] Chat-UniVi: Unified Visual Representation Empowers Large Language Models with Image and Video Understanding
jy0205/LaVIT
LaVIT: Empower the Large Language Model to Understand and Generate Visual Content
deepglint/unicom
MLCD & UNICOM: Large-Scale Visual Representation Model
Tangshitao/QuadTreeAttention
QuadTree Attention for Vision Transformers (ICLR 2022)
facebookresearch/paco
This repo contains documentation and code needed to use the PACO dataset: data loaders, training and evaluation scripts for object, part, and attribute prediction models, query evaluation scripts, and visualization notebooks.
shabie/docformer
Implementation of DocFormer: End-to-End Transformer for Document Understanding, a multi-modal transformer based architecture for the task of Visual Document Understanding (VDU)
uta-smile/TCL
Code for TCL: Vision-Language Pre-Training with Triple Contrastive Learning, CVPR 2022.
weitong8591/differentiable_ransac
PyTorch Implementation of the ICCV 2023 paper: Generalized Differentiable RANSAC ($\nabla$-RANSAC).
facebookresearch/SWAG
Official repository for "Revisiting Weakly Supervised Pre-Training of Visual Perception Models". https://arxiv.org/abs/2201.08371.
facebookresearch/diht
Filtering, Distillation, and Hard Negatives for Vision-Language Pre-Training
rossumai/docile
DocILE: Document Information Localization and Extraction Benchmark
Yuting-Gao/DisCo-pytorch
Code for DisCo: Remedy Self-supervised Learning on Lightweight Models with Distilled Contrastive Learning
facebookresearch/nbm-spam
Training and evaluating NBM and SPAM for interpretable machine learning.
yash0307/RecallatK_surrogate
Code for Recall@k Surrogate Loss with Large Batches and Similarity Mixup, CVPR 2022.
manyids2/mkd_local_descriptor
Implementation of [Understanding and Improving Kernel Local Descriptors](https://arxiv.org/abs/1811.11147) using PyTorch.