xinyu1205
Ph.D. Student at Fudan University, homepage: xinyu1205.github.io
Fudan University, Shanghai, China
Pinned Repositories
Grounded-Segment-Anything
Grounded SAM: Marrying Grounding DINO with Segment Anything & Stable Diffusion & Recognize Anything - Automatically Detect, Segment and Generate Anything
ActionCLIP
This is the official implementation of the paper "ActionCLIP: A New Paradigm for Action Recognition"
ALBEF
Code for ALBEF: a new vision-language pre-training method
BLIP
PyTorch code for BLIP: Bootstrapping Language-Image Pre-training for Unified Vision-Language Understanding and Generation
CLIP
Contrastive Language-Image Pretraining
IDEA-pytorch
Code for paper: IDEA: Increasing Text Diversity via Online Multi-Label Recognition for Vision-Language Pre-training [ACM MM2022]
Implementation-and-hiding-analysis-of-reversible-steganography-algorithm-based-on-deep-learning
recognize-anything
Open-source and strong foundation image recognition models.
robust-loss-mlml
Code for paper: Simple and Robust Loss Design for Multi-Label Learning with Missing Labels
xinyu1205.github.io
xinyu1205's Repositories
xinyu1205/recognize-anything
Open-source and strong foundation image recognition models.
xinyu1205/robust-loss-mlml
Code for paper: Simple and Robust Loss Design for Multi-Label Learning with Missing Labels
xinyu1205/IDEA-pytorch
Code for paper: IDEA: Increasing Text Diversity via Online Multi-Label Recognition for Vision-Language Pre-training [ACM MM2022]
xinyu1205/Implementation-and-hiding-analysis-of-reversible-steganography-algorithm-based-on-deep-learning
xinyu1205/xinyu1205.github.io
xinyu1205/ActionCLIP
This is the official implementation of the paper "ActionCLIP: A New Paradigm for Action Recognition"
xinyu1205/ALBEF
Code for ALBEF: a new vision-language pre-training method
xinyu1205/BLIP
PyTorch code for BLIP: Bootstrapping Language-Image Pre-training for Unified Vision-Language Understanding and Generation
xinyu1205/CLIP
Contrastive Language-Image Pretraining
xinyu1205/daily_fudan
A small one-click script for "Safe Fudan" (平安复旦) that automates quick epidemic status reporting
xinyu1205/Grounded-Segment-Anything
Marrying Grounding DINO with Segment Anything & Tag2Text & Stable Diffusion & BLIP & Whisper - Automatically Recognize, Detect, Segment and Generate Anything with Image, Text, and Speech Inputs
xinyu1205/GroundingDINO
The official implementation of "Grounding DINO: Marrying DINO with Grounded Pre-Training for Open-Set Object Detection"
xinyu1205/img2dataset
Easily turn large sets of image urls to an image dataset. Can download, resize and package 100M urls in 20h on one machine.
xinyu1205/MiniGPT-4
MiniGPT-4: Enhancing Vision-language Understanding with Advanced Large Language Models
xinyu1205/moco
PyTorch implementation of MoCo: https://arxiv.org/abs/1911.05722
xinyu1205/object_detection_metrics
Object Detection Metrics
xinyu1205/query2labels
Official implementation of paper "Query2Label: A Simple Transformer Way to Multi-Label Classification".
xinyu1205/rentainhe
xinyu1205/ssl-small
Code implementation for paper "On the Efficacy of Small Self-Supervised Contrastive Models without Distillation Signals".
xinyu1205/transformers
🤗 Transformers: State-of-the-art Machine Learning for Pytorch, TensorFlow, and JAX.
xinyu1205/txsun1997.github.io
xinyu1205/xinyu1205