Kizna1ver's Stars
xinntao/Real-ESRGAN
Real-ESRGAN aims at developing Practical Algorithms for General Image/Video Restoration.
HqWu-HITCS/Awesome-Chinese-LLM
A curated collection of open-source Chinese large language models, focusing on smaller-scale models that can be privately deployed at low training cost, covering base models, vertical-domain fine-tuning and applications, datasets, tutorials, and more.
IDEA-Research/Grounded-Segment-Anything
Grounded SAM: Marrying Grounding DINO with Segment Anything & Stable Diffusion & Recognize Anything - Automatically Detect, Segment and Generate Anything
BradyFU/Awesome-Multimodal-Large-Language-Models
:sparkles::sparkles: Latest Advances on Multimodal Large Language Models
stas00/ml-engineering
Machine Learning Engineering Open Book
salesforce/LAVIS
LAVIS - A One-stop Library for Language-Vision Intelligence
chaiNNer-org/chaiNNer
A node-based image processing GUI aimed at making chaining image processing tasks easy and customizable. Born as an AI upscaling application, chaiNNer has grown into an extremely flexible and powerful programmatic image processing application.
rom1504/img2dataset
Easily turn large sets of image URLs into an image dataset. Can download, resize and package 100M URLs in 20h on one machine.
jingyi0000/VLM_survey
Collection of AWESOME vision-language models for vision tasks
ChaofWang/Awesome-Super-Resolution
Collect super-resolution related papers, data, repositories
EvolvingLMMs-Lab/lmms-eval
Accelerating the development of large multimodal models (LMMs) with the one-click evaluation module lmms-eval.
apple/ml-aim
This repository provides the code and model checkpoints for AIMv1 and AIMv2 research projects.
eric-ai-lab/MiniGPT-5
Official implementation of paper "MiniGPT-5: Interleaved Vision-and-Language Generation via Generative Vokens"
DirtyHarryLYL/LLM-in-Vision
Recent LLM-based CV and related works. Welcome to comment/contribute!
CircleRadon/Osprey
[CVPR2024] The code for "Osprey: Pixel Understanding with Visual Instruction Tuning"
showlab/Awesome-MLLM-Hallucination
📖 A curated list of resources dedicated to hallucination of multimodal large language models (MLLM).
wl-zhao/VPD
[ICCV 2023] VPD is a framework that leverages the high-level and low-level knowledge of a pre-trained text-to-image diffusion model for downstream visual perception tasks.
UCSC-VLAA/CLIPA
[NeurIPS 2023] This repository includes the official implementation of our paper "An Inverse Scaling Law for CLIP Training"
apple/ml-veclip
The official repo for the paper "VeCLIP: Improving CLIP Training via Visual-enriched Captions"
LightDXY/FT-CLIP
CLIP Itself is a Strong Fine-tuner: Achieving 85.7% and 88.0% Top-1 Accuracy with ViT-B and ViT-L on ImageNet
hsouri/Battle-of-the-Backbones
XiaoxiaoGuo/fashion-iq
wendashi/Cool-GenAI-Fashion-Papers
🧢🕶️🥼👖👟🧳 A curated list of cool resources about GenAI-Fashion, including 📝papers, 👀workshops, 🚀companies & products, ...
OliverRensu/D-iGPT
[ICML 2024] This repository includes the official implementation of our paper "Rejuvenating image-GPT as Strong Visual Representation Learners"
xiaolul2/MGMap
[CVPR2024] The code for "MGMap: Mask-Guided Learning for Online Vectorized HD Map Construction"
xuewyang/Fashion_Captioning
ECCV2020 paper: Fashion Captioning: Towards Generating Accurate Descriptions with Semantic Rewards. Code and Data.
LiWentomng/Point2Mask
The code for "Point2Mask: Point-supervised Panoptic Segmentation via Optimal Transport", ICCV2023
CircleRadon/APro
The code for "Label-efficient Segmentation via Affinity Propagation". [NeurIPS2023]
RotsteinNoam/FuseCap
FuseCap: Large Language Model for Visual Data Fusion in Enriched Caption Generation
zijinxuxu/PDFNet
RGB-D fusion for two-hand reconstruction