feymanpriv's Stars
hpcaitech/ColossalAI
Making large AI models cheaper, faster and more accessible
milvus-io/milvus
A cloud-native vector database, storage for next generation AI applications
mlfoundations/open_clip
An open source implementation of CLIP.
openai/glide-text2im
GLIDE: a diffusion-based text-conditional image synthesis model
facebookresearch/pytorchvideo
A deep learning library for video understanding research.
microsoft/NUWA
A unified 3D Transformer Pipeline for visual synthesis
pengzhiliang/MAE-pytorch
Unofficial PyTorch implementation of Masked Autoencoders Are Scalable Vision Learners
pytorch/torchrec
PyTorch domain library for recommendation systems
milvus-io/bootcamp
Dealing with all unstructured data, such as reverse image search, audio search, molecular search, video analysis, question and answer systems, NLP, etc.
tczhangzhi/pytorch-distributed
A quickstart and benchmark for PyTorch distributed training.
xiaoweiChen/CMake-Cookbook
:book: A Chinese translation of "CMake Cookbook".
MCG-NJU/VideoMAE
[NeurIPS 2022 Spotlight] VideoMAE: Masked Autoencoders are Data-Efficient Learners for Self-Supervised Video Pre-Training
IDEA-Research/awesome-detection-transformer
A collection of papers on transformers for detection and segmentation. Awesome Detection Transformer for Computer Vision (CV)
facebookresearch/SLIP
Code release for SLIP: Self-supervision meets Language-Image Pre-training
raoyongming/DenseCLIP
[CVPR 2022] DenseCLIP: Language-Guided Dense Prediction with Context-Aware Prompting
jiasenlu/vilbert_beta
m-bain/frozen-in-time
Frozen in Time: A Joint Video and Image Encoder for End-to-End Retrieval [ICCV'21]
KeremTurgutlu/self_supervised
Implementation of popular SOTA self-supervised learning algorithms as Fastai Callbacks.
mmaaz60/mvits_for_class_agnostic_od
[ECCV'22] Official repository of paper titled "Class-agnostic Object Detection with Multi-modal Transformer".
ducha-aiki/pydegensac
Advanced RANSAC (DEGENSAC) with bells and whistles for H and F estimation
showlab/all-in-one
[CVPR2023] All in One: Exploring Unified Video-Language Pre-training
YoadTew/zero-shot-image-to-text
Implementation of Zero-Shot Image-to-Text Generation for Visual-Semantic Arithmetic
LibRerank-Community/LibRerank
LibRerank is a toolkit for re-ranking algorithms. There are a number of re-ranking algorithms, such as PRM, DLCM, GSF, miDNN, SetRank, EGRerank, Seq2Slate.
CryhanFang/CLIP2Video
sajjjadayobi/CLIPfa
CLIPfa: Connecting Farsi Text and Images
axelBarroso/Key.Net-Pytorch
[ICCV 2019] Key.Net: Keypoint Detection by Handcrafted and Learned CNN Filters - PyTorch
whwu95/DSANet
[ACMMM'2021] DSANet: Dynamic Segment Aggregation Network for Video-Level Representation Learning
feymanpriv/DOLG-paddle
Paddle Implementation of DOLG (ICCV 2021)
BUPT-PRIV/MAE-priv
jvzhao/video_understanding
Some resources on video understanding