aaronma2020

Nanjing University, Nanjing

aaronma2020's Stars

haotian-liu/LLaVA
[NeurIPS'23 Oral] Visual Instruction Tuning (LLaVA) built towards GPT-4V level capabilities and beyond.
Language: Python 22k 158 1.6k 2.4k
state-spaces/mamba
Mamba SSM architecture
Language: Python 14.4k 105 6291.3k
changgyhub/leetcode_101
LeetCode 101: A LeetCode Problem-Solving Guide
9.3k 145 911.2k
open-compass/opencompass
OpenCompass is an LLM evaluation platform, supporting a wide range of models (Llama3, Mistral, InternLM2, GPT-4, LLaMa2, Qwen, GLM, Claude, etc.) over 100+ datasets.
Language: Python 5.1k 27 665527
THUDM/VisualGLM-6B
Chinese and English multimodal conversational language model
Language: Python 4.1k 42 358423
open-compass/VLMEvalKit
Open-source evaluation toolkit for large multi-modality models (LMMs), supporting 220+ LMMs and 80+ benchmarks
Language: Python 2.1k 12 351305
baaivision/Emu
Emu Series: Generative Multimodal Models from BAAI
Language: Python 1.7k 22 8986
Farama-Foundation/chatarena
ChatArena (or Chat Arena) is a set of multi-agent language game environments for LLMs. The goal is to develop the communication and collaboration capabilities of AIs.
Language: Python 1.4k 19 23140
jackaduma/awesome_LLMs_interview_notes
LLMs interview notes and answers: this repository mainly collects interview questions and reference answers for large language model (LLM) algorithm engineers
1.2k 17 7302
mbzuai-oryx/groundingLMM
[CVPR 2024 🔥] Grounding Large Multimodal Model (GLaMM), the first-of-its-kind model capable of generating natural language responses that are seamlessly integrated with object segmentation masks.
Language: Python 856 32 8345
TinyLLaVA/TinyLLaVA_Factory
A Framework of Small-scale Large Multimodal Models
Language: Python 781 11 15583
awekrx/ChatGPT-MidJourney-prompt
This is a ChatGPT-based prompt generation model for Midjourney. The purpose of this model is to simplify the creation of images and increase their creativity. Given a partial hint, ChatGPT creates a follow-up that can be used to stimulate creativity and provide new ideas.
Language: Python 337 9 651
h-zhao1997/cobra
[AAAI-25] Cobra: Extending Mamba to Multi-modal Large Language Model for Efficient Inference
Language: Python 270 6 248
mertyg/vision-language-models-are-bows
Experiments and data for the paper "When and why vision-language models behave like bags-of-words, and what to do about it?" Oral @ ICLR 2023
Language: Python 270 7 3818
kaixindelele/ChatOpenReview
Crowdfunded open-source project: use OpenReview's high-quality review data to fine-tune a professional review and review-response LLM.
Language: Python 199 9 012
YujieLu10/LLMScore
LLMScore: Unveiling the Power of Large Language Models in Text-to-Image Synthesis Evaluation
Language: Python 128 3 710
allenai/aokvqa
Official repository for the A-OKVQA dataset
Language: Python 80 5 168
mathvision-cuhk/MATH-V
MATH-Vision dataset and code to measure Multimodal Mathematical Reasoning capabilities.
Language: Python 60 1 24
xuanlinli17/large_vlm_distillation_ood
Distilling Large Vision-Language Model with Out-of-Distribution Generalizability (ICCV 2023)
Language: Python 56 1 25
imJunaidAfzal/Prompt-Engineering
Prompt engineering for language models (GPT-3, GPT-4, ChatGPT) and text-to-image models (Stable Diffusion, Midjourney, DALL-E)
Language: Jupyter Notebook 37 2 05
cliang1453/task-aware-distillation
Less is More: Task-aware Layer-wise Distillation for Language Model Compression (ICML 2023)
Language: Python 34 1 14
KavrakiLab/Spec2Mol
Language: Python 22 0 413
HAWLYQ/InfoMetIC
Language: Python 12 2 70
PKU-ICST-MIPL/LFR-GAN_TOMM2023
Official repository for "LFR-GAN: Local Feature Refinement based Generative Adversarial Network for Text-to-Image Generation" (TOMM 2023).
Language: Python 11 2 11
njucckevin/KnowCap
Code for "Beyond Generic: Enhancing Image Captioning with Real-World Knowledge using Vision-Language Pre-Training Model"
Language: Python 10 1 40
ChangxinWang/BoFiCap
Bounding and Filling: A Fast and Flexible Framework for Image Captioning
Language: Python 9 1 42
aaronma2020/Food500-Cap
7 1 10
aaronma2020/BoFiCap
Bounding and Filling: A Fast and Flexible Framework for Image Captioning
1 0 00
aaronma2020/probing_vlp
1 1 00
aaronma2020/robust_captioning_metric
1 1 00
