sunxm2357's Stars
allenai/unified-io-2
ROCm/xformers
Hackable and optimized Transformers building blocks, supporting a composable construction.
Zhongping-Zhang/MGT_Localization
Implementation for Machine-Generated Text Localization (ACL 2024 Findings)
Mrhuangyi/Ebooks-Shared
Ebook share
Brave-peng/books
A collection of assorted leisure-reading books (EPUB versions, readable directly on an iPad)
PengtaoJiang/Awesome-Weakly-Supervised-Semantic-Segmentation-Papers
Recent weakly supervised semantic segmentation papers
Computer-Vision-in-the-Wild/CVinW_Readings
A collection of papers on the topic of "Computer Vision in the Wild (CVinW)"
LAION-AI/CLIP_benchmark
CLIP-like model evaluation
piotr-teterwak/open_clip
An open source implementation of CLIP.
prismformore/Multi-Task-Transformer
Code of ICLR2023 paper "TaskPrompter: Spatial-Channel Multi-Task Prompting for Dense Scene Understanding" and ECCV2022 paper "Inverted Pyramid Multi-task Transformer for Dense Scene Understanding"
SimplifyJobs/New-Grad-Positions
A collection of full time roles in SWE, Quant, and PM for new grads.
sunxm2357/DIME-FM
Implementation of "DIME-FM: DIstilling Multimodal and Efficient Foundation Models"
ashafaei/pdf2pptx
Convert your (Beamer) PDF slides to (Powerpoint) PPTX
AILab-CVC/SEED-Bench
(CVPR 2024) A benchmark for evaluating Multimodal LLMs using multiple-choice questions.
vacancy/SceneGraphParser
A python toolkit for parsing captions (in natural language) into scene graphs (as symbolic representations).
EgoAlpha/prompt-in-context-learning
Awesome resources for in-context learning and prompt engineering: Mastery of the LLMs such as ChatGPT, GPT-3, and FlanT5, with up-to-date and cutting-edge updates.
Luodian/Otter
🦦 Otter, a multi-modal model based on OpenFlamingo (open-sourced version of DeepMind's Flamingo), trained on MIMIC-IT and showcasing improved instruction-following and in-context learning ability.
OpenGVLab/InternGPT
InternGPT (iGPT) is an open source demo platform where you can easily showcase your AI models. Now it supports DragGAN, ChatGPT, ImageBind, multimodal chat like GPT-4, SAM, interactive image editing, etc. Try it at igpt.opengvlab.com (an online demo system supporting DragGAN, ChatGPT, ImageBind, and SAM)
mlfoundations/open_clip
An open source implementation of CLIP.
mlfoundations/datacomp
DataComp: In search of the next generation of multimodal datasets
guozix/TaI-DPT
Vision-CAIR/MiniGPT-4
Open-sourced codes for MiniGPT-4 and MiniGPT-v2 (https://minigpt-4.github.io, https://minigpt-v2.github.io/)
NVlabs/GroupViT
Official PyTorch implementation of GroupViT: Semantic Segmentation Emerges from Text Supervision, CVPR 2022.
rom1504/img2dataset
Easily turn large sets of image urls to an image dataset. Can download, resize and package 100M urls in 20h on one machine.
Guang000/Awesome-Dataset-Distillation
A curated list of awesome papers on dataset distillation and related applications.
Joeclinton1/google-images-download
Python script to download hundreds of images from 'Google Images'. It is ready-to-run code!
ultralytics/google-images-download
Google/Bing Images Web Downloader
bethgelab/robustness
Robustness and adaptation of ImageNet scale models. Pre-Release, stay tuned for updates.
microsoft/scene_graph_benchmark
Image scene graph generation benchmark
Shenggan/awesome-distributed-ml
A curated list of awesome projects and papers for distributed training or inference