RollingWang

A vision algorithm engineer~

Meituan · Beijing

RollingWang's Stars

feizc/FluxMusic
Text-to-Music Generation with Rectified Flow Transformers
Language: Python · 1.5k · 116
InternLM/xtuner
An efficient, flexible and full-featured toolkit for fine-tuning LLM (InternLM2, Llama3, Phi3, Qwen, Mistral, ...)
Language: Python · 3.8k · 302
mlfoundations/open_clip
An open source implementation of CLIP.
Language: Python · 9.9k · 959
QwenLM/Qwen-VL
The official repo of Qwen-VL (通义千问-VL) chat & pretrained large vision language model proposed by Alibaba Cloud.
Language: Python · 4.9k · 373
OpenGVLab/InternVL
[CVPR 2024 Oral] InternVL Family: A Pioneering Open-Source Alternative to GPT-4o. An open-source multimodal dialogue model approaching GPT-4o performance.
Language: Python · 5.6k · 440
sunanhe/MKT
Official implementation of "Open-Vocabulary Multi-Label Classification via Multi-Modal Knowledge Transfer".
Language: Python · 1186
feizc/Diffusion-RWKV
Scaling RWKV-Like Architectures for Diffusion Models
Language: Python · 1145
yformer/EfficientSAM
EfficientSAM: Leveraged Masked Image Pretraining for Efficient Segment Anything
Language: Jupyter Notebook · 2.1k · 150
Q-Future/Q-Align
③[ICML2024] [IQA, IAA, VQA] All-in-one Foundation Model for visual scoring. Can efficiently fine-tune to downstream datasets.
Language: Python · 25116
zwx8981/LIQE
[CVPR2023] Blind Image Quality Assessment via Vision-Language Correspondence: A Multitask Learning Perspective
Language: Python · 17810
cloneofsimo/lora
Using Low-rank adaptation to quickly fine-tune diffusion models.
Language: Jupyter Notebook · 7k · 481
microsoft/LoRA
Code for loralib, an implementation of "LoRA: Low-Rank Adaptation of Large Language Models"
Language: Python · 10.4k · 669
lllyasviel/ControlNet
Let us control diffusion models!
Language: Python · 29.9k · 2.7k
CompVis/latent-diffusion
High-Resolution Image Synthesis with Latent Diffusion Models
Language: Jupyter Notebook · 11.6k · 1.5k
tgxs002/HPSv2
Human Preference Score v2: A Solid Benchmark for Evaluating Human Preferences of Text-to-Image Synthesis
Language: Jupyter Notebook · 37412
stanford-crfm/helm
Holistic Evaluation of Language Models (HELM), a framework to increase the transparency of language models (https://arxiv.org/abs/2211.09110). This framework is also used to evaluate text-to-image models in Holistic Evaluation of Text-to-Image Models (HEIM) (https://arxiv.org/abs/2311.04287).
Language: Python · 1.9k · 244
yuvalkirstain/PickScore
Language: Python · 42424
tgxs002/align_sd
Better Aligning Text-to-Image Models with Human Preference. ICCV 2023
Language: Python · 2649
THUDM/ImageReward
[NeurIPS 2023] ImageReward: Learning and Evaluating Human Preferences for Text-to-image Generation
Language: Python · 1.1k · 62
xinyu1205/recognize-anything
Open-source and strong foundation image recognition models.
Language: Jupyter Notebook · 2.8k · 271
AUTOMATIC1111/stable-diffusion-webui
Stable Diffusion web UI
Language: Python · 141k · 26.6k
alibaba/MNN
MNN is a blazing fast, lightweight deep learning framework, battle-tested by business-critical use cases in Alibaba
Language: C++ · 8.6k · 1.7k
UX-Decoder/Segment-Everything-Everywhere-All-At-Once
[NeurIPS 2023] Official implementation of the paper "Segment Everything Everywhere All at Once"
Language: Python · 4.3k · 388
Eurus-Holmes/Awesome-Multimodal-Research
A curated list of Multimodal Related Research.
Language: Python · 1.3k · 151
willard-yuan/awesome-cbir-papers
📝Awesome and classical image retrieval papers
1.7k · 293
amusi/awesome-ai-awesomeness
A curated list of awesome awesomeness about artificial intelligence
836114
amusi/CV-Company-List
A list of companies offering computer vision (CV) algorithm positions. Submissions via issues to extend the list are welcome.
94990
IDEA-Research/Grounded-Segment-Anything
Grounded SAM: Marrying Grounding DINO with Segment Anything & Stable Diffusion & Recognize Anything - Automatically Detect, Segment and Generate Anything
Language: Jupyter Notebook · 14.9k · 1.4k
woshidandan/Image-Aesthetics-and-Quality-Assessment
[ACMMM 2023, Official Code] for paper "EAT: An Enhancer for Aesthetics-Oriented Transformers". Official Weights and Demos provided. Currently one of the strongest open-source aesthetics assessment models.
Language: Python · 1067
DarrenPan/Awesome-CVPR2024-Low-Level-Vision
A Collection of Papers and Codes in CVPR2023/2022 about low level vision
64850
