turned2670

School of Software, Tsinghua University

turned2670's Stars

THUDM/GLM-4
GLM-4 series: Open Multilingual Multimodal Chat LMs
Language: Python · 4.8k · 393
OpenGVLab/InternVL
[CVPR 2024 Oral] InternVL Family: A Pioneering Open-Source Alternative to GPT-4o. An open-source multimodal chat model approaching GPT-4o performance.
Language: Python · 5.7k · 440
Mamadou-Keita/VLM-DETECT
[ICASSP 2024] The official repo for Harnessing the Power of Large Vision Language Models for Synthetic Image Detection
Language: Python · 142
WeOpenML/PandaLM
Language: Python · 88267
agisnetwork/agis-engine
Language: Python · 266
lupantech/ScienceQA
Data and code for NeurIPS 2022 Paper "Learn to Explain: Multimodal Reasoning via Thought Chains for Science Question Answering".
Language: Python · 58864
amazon-science/mm-cot
Official implementation for "Multimodal Chain-of-Thought Reasoning in Language Models" (stay tuned and more will be updated)
Language: Python · 3.8k · 309
keirp/automatic_prompt_engineer
Language: Python · 1.1k · 147
KindXiaoming/pykan
Kolmogorov Arnold Networks
Language: Jupyter Notebook · 14.8k · 1.3k
xiaobai1217/Awesome-Video-Datasets
Video datasets
1.1k · 91
ytdl-org/youtube-dl
Command-line program to download videos from YouTube.com and other video sites
Language: Python · 132k · 10k
gokulkarthik/hateclipper
Hate-CLIPper: Multimodal Hateful Meme Classification with Explicit Cross-modal Interaction of CLIP features - Accepted at EMNLP 2022 Workshop
Language: Jupyter Notebook · 408
OFA-Sys/Chinese-CLIP
Chinese version of CLIP which achieves Chinese cross-modal retrieval and representation generation.
Language: Python · 4.4k · 453
zsavvas/memes_pipeline
Memes Processing Pipeline that enables the track of memes across multiple Web communities.
Language: Python · 5618
aabhandari/CrisisHateMM
Language: Jupyter Notebook · 4
hiyouga/LLaMA-Factory
Efficiently Fine-Tune 100+ LLMs in WebUI (ACL 2024)
Language: Python · 31.9k · 3.9k
haotian-liu/LLaVA
[NeurIPS'23 Oral] Visual Instruction Tuning (LLaVA) built towards GPT-4V level capabilities and beyond.
Language: Python · 19.6k · 2.2k
Yuchen413/text2image_safety
Language: Python · 13312
X-PLUG/mPLUG-Owl
mPLUG-Owl: The Powerful Multi-modal Large Language Model Family
Language: Python · 2.3k · 171
tianyi-lab/HallusionBench
[CVPR'24] HallusionBench: You See What You Think? Or You Think What You See? An Image-Context Reasoning Benchmark Challenging for GPT-4V(ision), LLaVA-1.5, and Other Multi-modality Models
Language: Python · 2306
scenarios/WeMM
Language: Python · 8511
BradyFU/Awesome-Multimodal-Large-Language-Models
Latest Advances on Multimodal Large Language Models
12k · 769
whitzard-ai/jade-db
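"Stones from other hills may serve to polish jade": Fudan Whitzard AI releases JADE-DB, a demo dataset targeting domestic open-source and foreign commercial large models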
30019
AUTOMATIC1111/stable-diffusion-webui
Stable Diffusion web UI
Language: Python · 141k · 26.6k
YitingQu/meme-evolution
Language: Python · 11
openai/CLIP
CLIP (Contrastive Language-Image Pretraining), Predict the most relevant text snippet given an image
Language: Jupyter Notebook · 25k · 3.2k
rizavelioglu/hateful_memes-hate_detectron
Detecting Hate Speech in Memes Using Multimodal Deep Learning Approaches: Prize-winning solution to Hateful Memes Challenge. https://arxiv.org/abs/2012.12975
Language: Jupyter Notebook · 5319
Muennighoff/vilio
🥶Vilio: State-of-the-art VL models in PyTorch & PaddlePaddle
Language: Python · 8829
CompVis/stable-diffusion
A latent text-to-image diffusion model
Language: Jupyter Notebook · 67.8k · 10.1k
sahoonihar/ToxicBias_CoNLL_2022
1
