Dawn-LX's Stars
facebookresearch/segment-anything
The repository provides code for running inference with the Segment Anything Model (SAM), links for downloading the trained model checkpoints, and example notebooks that show how to use the model.
QSCTech/zju-icicles
Zhejiang University course guide sharing project
lm-sys/FastChat
An open platform for training, serving, and evaluating large language models. Release repo for Vicuna and Chatbot Arena.
Vision-CAIR/MiniGPT-4
Open-sourced codes for MiniGPT-4 and MiniGPT-v2 (https://minigpt-4.github.io, https://minigpt-v2.github.io/)
haotian-liu/LLaVA
[NeurIPS'23 Oral] Visual Instruction Tuning (LLaVA) built towards GPT-4V level capabilities and beyond.
onnx/onnx
Open standard for machine learning interoperability
gitalk/gitalk
Gitalk is a modern comment component based on Github Issue and Preact.
showlab/Tune-A-Video
[ICCV 2023] Tune-A-Video: One-Shot Tuning of Image Diffusion Models for Text-to-Video Generation
mlfoundations/open_flamingo
An open-source framework for training large multimodal models.
google-research/kubric
A data generation pipeline for creating semi-realistic synthetic multi-object videos with rich annotations such as instance segmentation masks, depth maps, and optical flow.
allenai/open-instruct
open-mmlab/Multimodal-GPT
Multimodal-GPT
Computer-Vision-in-the-Wild/CVinW_Readings
A collection of papers on the topic of "Computer Vision in the Wild (CVinW)"
vacancy/SceneGraphParser
A python toolkit for parsing captions (in natural language) into scene graphs (as symbolic representations).
wutong16/DistributionBalancedLoss
[ECCV 2020 Spotlight] PyTorch implementation for "Distribution-Balanced Loss for Multi-Label Classification in Long-Tailed Datasets"
OpenGVLab/LAMM
[NeurIPS 2023 Datasets and Benchmarks Track] LAMM: Multi-Modal Large Language Models and Applications as AI Agents
fredzzhang/upt
[CVPR'22] Official PyTorch implementation for paper "Efficient Two-Stage Detection of Human–Object Interactions with a Novel Unary–Pairwise Transformer"
yunqing-me/WatermarkDM
Code of the paper: A Recipe for Watermarking Diffusion Models
ChenDelong1999/polite-flamingo
🦩 Visual Instruction Tuning with Polite Flamingo - training multi-modal LLMs to be both clever and polite! (AAAI-24 Oral)
NVlabs/Bongard-LOGO
Bongard-LOGO is a Python code repository with the purpose of generating synthetic Bongard problems on a large scale with little human intervention.
richard-peng-xia/LMPT
[ACLW'24] LMPT: Prompt Tuning with Class-Specific Embedding Loss for Long-tailed Multi-Label Visual Recognition
Jeeseung-Park/ViPLO
[CVPR 2023] ViPLO - Official Pytorch Implementation
ZJUSCT/mirror-front
ZJU mirror front-end
YAIxPOZAlabs/MuseDiffusion
YAI 11 x @POZAlabs : Music generation & modification from Unclear midi SEquence with Diffusion model
ZJUSCT/MirrorsDotNet
Mirrors.NET, the Mirror Manager for ZJU Mirror
joyhsu0504/geoclidean_framework
ZJUSCT/mirror-issues
Code Unrelated Issues for ZJU Mirror
bobwan1995/Weakly-HOI
MIvanovska/TomatoDIFF
Official implementation of the paper "TomatoDIFF: On-plant Tomato Segmentation with Denoising Diffusion Models"
shaucky/Petdoctor
An AIR-based battle animation player for the Flash browser game 《赛尔号》 (Seer)