lcxrocks's Stars
microsoft/LLM2CLIP
LLM2CLIP makes the SOTA pretrained CLIP model even stronger.
MLNLP-World/Paper-Writing-Tips
A repository maintained by the MLNLP community to help authors avoid small mistakes in paper submissions. Paper Writing Tips
locuslab/llava-token-compression
x-cls/superclass
[NeurIPS 2024] Classification Done Right for Vision-Language Pre-Training
Stanford-AIMI/RaVL
[NeurIPS 2024] RaVL: Discovering and Mitigating Spurious Correlations in Fine-Tuned Vision-Language Models
zejiangh/Semi-FSL
PyTorch implementation of the paper "Semi-Supervised Few-Shot Learning via Dependency Maximization and Instance Discriminant Analysis", available at https://link.springer.com/content/pdf/10.1007/s11265-022-01796-x.pdf
zhuhsingyuu/Frolic
Our implementation of Frolic. More details will be provided later.
thunlp/LLaVA-UHD
LLaVA-UHD: an LMM Perceiving Any Aspect Ratio and High-Resolution Images
MCG-NJU/AWT
[NeurIPS 2024] AWT: Transferring Vision-Language Models via Augmentation, Weighting, and Transportation
zhmiao/OpenLongTailRecognition-OLTR
PyTorch implementation of "Large-Scale Long-Tailed Recognition in an Open World" (CVPR 2019 Oral)
Imbalance-VLM/Imbalance-VLM
hzwer/WritingAIPaper
Writing AI Conference Papers: A Handbook for Beginners
QwenLM/Qwen2-VL
Qwen2-VL is a multimodal large language model series developed by the Qwen team at Alibaba Cloud.
lixinustc/GraphAdapter
An efficient tuning method for VLMs
apachecn/ml-mastery-zh
:book: Chinese translations of MachineLearningMastery blog posts
Huage001/LinFusion
Official PyTorch and Diffusers Implementation of "LinFusion: 1 GPU, 1 Minute, 16K Image"
kongds/E5-V
E5-V: Universal Embeddings with Multimodal Large Language Models
Zeyi-Lin/HivisionIDPhotos
⚡️HivisionIDPhotos: a lightweight and efficient AI ID photo tool.
YueYANG1996/LaBo
CVPR 2023: Language in a Bottle: Language Model Guided Concept Bottlenecks for Interpretable Image Classification
iamxym/Deep-Fourier-based-Arbitrary-scale-Super-resolution-for-Real-time-Rendering
SIGGRAPH 2024 Conference Paper: Deep Fourier-based Arbitrary-scale Super-resolution for Real-time Rendering
bfshi/scaling_on_scales
When do we not need larger vision models?
zengwang430521/TCFormer
Code for TCFormer from the paper "Not All Tokens Are Equal: Human-centric Visual Analysis via Token Clustering Transformer"
THUDM/CogVideo
Text- and image-to-video generation: CogVideoX (2024) and CogVideo (ICLR 2023)
LALBJ/PAI
[ECCV 2024] Paying More Attention to Image: A Training-Free Method for Alleviating Hallucination in LVLMs
facebookresearch/sam2
The repository provides code for running inference with the Meta Segment Anything Model 2 (SAM 2), links for downloading the trained model checkpoints, and example notebooks that show how to use the model.
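As a quick orientation, below is a minimal image-segmentation sketch following the usage pattern shown in the SAM 2 README; the config and checkpoint filenames, the example image path, and the single point prompt are assumptions tied to one released model variant, so adjust them to the checkpoint you actually download.

```python
import numpy as np
import torch
from PIL import Image
from sam2.build_sam import build_sam2
from sam2.sam2_image_predictor import SAM2ImagePredictor

# Assumed filenames for the "large" variant; match these to the
# checkpoint/config you downloaded from the repo.
predictor = SAM2ImagePredictor(
    build_sam2("sam2_hiera_l.yaml", "./checkpoints/sam2_hiera_large.pt")
)

image = np.array(Image.open("example.jpg").convert("RGB"))  # hypothetical image
with torch.inference_mode(), torch.autocast("cuda", dtype=torch.bfloat16):
    predictor.set_image(image)
    # One positive click at (x, y); label 1 marks foreground.
    masks, scores, _ = predictor.predict(
        point_coords=np.array([[512, 512]]),
        point_labels=np.array([1]),
    )

print(masks.shape, scores)  # masks: (num_masks, H, W)
```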
Vill-Lab/2024-AAAI-HPT
Learning Hierarchical Prompt with Structured Linguistic Knowledge for Vision-Language Models (AAAI 2024)
zhuhsingyuu/SSP
Our implementation of SSP
zhengli97/Awesome-Prompt-Adapter-Learning-for-VLMs
A curated list of awesome prompt/adapter learning methods for vision-language models like CLIP.
bytedance/tarsier
Tarsier: a family of large-scale video-language models designed to generate high-quality video descriptions, with strong general video understanding capability.
fishaudio/fish-speech
A brand-new TTS solution