MemeCraft

Official implementation for: MemeCraft: Contextual and Stance-Driven Multimodal Meme Generation

The paper proposes a novel meme generation model that leverages large language models (LLMs) and large vision models (LVMs). The framework is as follows:

Human evaluation

Human annotators evaluated the memes generated by our model on five aspects:

Do the generated memes resemble publicly available online memes? [Authenticity]
Are the generated memes humorous? [Hilarity]
Do the generated memes communicate the intended message? (e.g., support climate action) [Message Conveyance]
Are the generated memes persuasive? [Persuasiveness]
Is the safety mechanism effective in reducing hateful meme generation? [Hatefulness]

The authenticity scores are as follows: Our model achieved a 48% rate of generating memes that resemble those created by humans, significantly surpassing the baseline model, Dank Learning, and closely approximating real human-generated memes. For more details on the evaluation, please refer to our paper.

Installation

The code has been tested with Python 3.11. To use it, first install the dependencies from requirements.txt.

Features

The dataset folder contains memes generated by MemeCraft and baseline models.
The script folder contains Python code for extracting text descriptions, generating meme text, and detecting hateful memes.

Usage

[vlm_text_description_generation.py] - Extracts text descriptions.
[llm_meme_text_generation.py] and [vlm_meme_text_generation.py] - Generate contextual meme text using an LLM or a VLM.
[prompt_demonstration.py] - Prompt demonstration examples.
[text_overlay.py] - Overlays text onto meme templates.
[hateful_memes_detection.py] - Identifies and excludes hateful memes.
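
To illustrate the text overlay step listed above, here is a minimal sketch of drawing a caption onto a template image. It is not the code in text_overlay.py: the function name overlay_caption, its parameters, and the use of the Pillow library are assumptions made for illustration, and Pillow may need to be installed separately if it is not already among the dependencies.

```python
from PIL import Image, ImageDraw, ImageFont


def overlay_caption(template, caption, margin=10):
    """Return a copy of `template` with `caption` centered near the top.

    Hypothetical helper for illustration; not part of the MemeCraft scripts.
    """
    image = template.convert("RGB")  # convert() returns a new image object
    draw = ImageDraw.Draw(image)
    font = ImageFont.load_default()  # a real meme would use a larger TrueType font

    # Center the caption horizontally, keeping at least `margin` pixels on the left.
    text_width = int(draw.textlength(caption, font=font))
    x = max((image.width - text_width) // 2, margin)
    y = margin

    # Draw a black outline by offsetting the text in each direction,
    # then draw the white caption on top so it stays readable on any background.
    for dx in (-1, 0, 1):
        for dy in (-1, 0, 1):
            if dx or dy:
                draw.text((x + dx, y + dy), caption, font=font, fill="black")
    draw.text((x, y), caption, font=font, fill="white")
    return image


# A blank gray canvas stands in for a real meme template.
template = Image.new("RGB", (400, 300), "gray")
meme = overlay_caption(template, "ONE DOES NOT SIMPLY")
```

The returned image can be written to disk with meme.save("meme_example.png"). The sketch places a single line at the top; wrapping long captions or adding a bottom caption would need extra logic.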

Citation

@article{hanw2024memecraft,
 author =       {Wang, Han and Lee, Roy Ka-Wei},
 title =        {MemeCraft: Contextual and Stance-Driven Multimodal Meme Generation},
 year =         {2024}
}
@misc{singh2020mmf,
 author =       {Singh, Amanpreet and Goswami, Vedanuj and Natarajan, Vivek and Jiang, Yu and Chen, Xinlei and Shah, Meet and
                Rohrbach, Marcus and Batra, Dhruv and Parikh, Devi},
 title =        {MMF: A multimodal framework for vision and language research},
 howpublished = {\url{https://github.com/facebookresearch/mmf}},
 year =         {2020}}

Contact Information

For questions or feedback, email [han_wang@sutd.edu.sg].