Official implementation for: MemeCraft: Contextual and Stance-Driven Multimodal Meme Generation
The paper proposes a novel meme generation model that leverages large language models (LLMs) and large vision models (LVMs). The framework is as follows:
We asked human evaluators to assess the memes generated by our model on five aspects:
- Do the generated memes resemble publicly available online memes? [Authenticity]
- Are the generated memes humorous? [Hilarity]
- Do the generated memes communicate the intended message? (e.g., support climate action) [Message Conveyance]
- Are the generated memes persuasive? [Persuasiveness]
- Is the safety mechanism effective in reducing hateful meme generation? [Hatefulness]
The authenticity scores are as follows: Our model achieved a 48% rate of generating memes that resemble those created by humans, significantly surpassing the baseline model, Dank Learning, and closely approximating real human-generated memes. For more details on the evaluation, please refer to our paper.
The code has been tested with Python 3.11. To use it, first install the dependencies from requirements.txt.
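As a rough sketch of the setup step, the snippet below checks the interpreter version and builds the `pip` command for `requirements.txt`. The helper names `python_is_supported`, `build_install_command`, and `install_dependencies` are illustrative and not part of this repository, and treating 3.11 as a minimum version is an assumption. Running `pip install -r requirements.txt` directly in a shell should be equivalent.

```python
import subprocess
import sys


def python_is_supported(minimum=(3, 11)):
    """Return True if the running interpreter is at least `minimum` (major, minor)."""
    # Assumption: 3.11 is used as a lower bound because it is the tested version.
    return sys.version_info[:2] >= tuple(minimum)


def build_install_command(requirements="requirements.txt"):
    """Build the pip command that installs the listed dependencies."""
    # sys.executable keeps pip tied to the interpreter running this script.
    return [sys.executable, "-m", "pip", "install", "-r", requirements]


def install_dependencies(requirements="requirements.txt"):
    """Warn on older interpreters, then run pip (requires network access)."""
    if not python_is_supported():
        print("Warning: the code has only been tested with Python 3.11.")
    subprocess.check_call(build_install_command(requirements))
```

Calling `install_dependencies()` from the repository root would then perform the installation.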
- The dataset folder contains memes generated by MemeCraft and baseline models.
- The script folder houses Python code for extracting text descriptions, generating meme text, overlaying text onto templates, and detecting hateful memes.
- [vlm_text_description_generation.py] - Extracts text descriptions.
- [llm_meme_text_generation.py] and [vlm_meme_text_generation.py] - Generate contextual meme text using LLM or VLM.
- [prompt_demonstration.py] - Prompt demonstration examples.
- [text_overlay.py] - Overlays text onto meme templates.
- [hateful_memes_detection.py] - Identifies and excludes hateful memes.
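The scripts listed above can be read as stages of one pipeline. The following is a minimal sketch of how such stages might be chained, not the repository's actual code: the `Meme` class, the `generate_meme` function, and all `stub_*` functions are hypothetical placeholders, and the stage order (describe, generate text, overlay, filter) simply follows the order of the list and is an assumption.

```python
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass
class Meme:
    template: str     # identifier or path of the meme template image
    description: str  # text description extracted from the template
    caption: str      # generated meme text
    image: str        # identifier or path of the captioned output image


def generate_meme(
    template: str,
    topic: str,
    stance: str,
    describe: Callable[[str], str],
    write_caption: Callable[[str, str, str], str],
    overlay: Callable[[str, str], str],
    is_hateful: Callable[[Meme], bool],
) -> Optional[Meme]:
    """Run the four stages in order; return None if the meme is filtered out."""
    description = describe(template)
    caption = write_caption(description, topic, stance)
    image = overlay(template, caption)
    meme = Meme(template, description, caption, image)
    return None if is_hateful(meme) else meme


# Placeholder stages so the sketch runs without any model or image library.
def stub_describe(template: str) -> str:
    return f"description of {template}"


def stub_write_caption(description: str, topic: str, stance: str) -> str:
    return f"[{stance} {topic}] caption for: {description}"


def stub_overlay(template: str, caption: str) -> str:
    return f"{template}+captioned"


def stub_is_hateful(meme: Meme) -> bool:
    return False


meme = generate_meme(
    "template.jpg", "climate action", "support",
    stub_describe, stub_write_caption, stub_overlay, stub_is_hateful,
)
print(meme.caption)
```

Passing the stages in as callables keeps the sketch independent of any particular LLM, VLM, or hateful-meme detector; in practice each stub would be replaced by a call into the corresponding script.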
@article{hanw2024memecraft,
  author = {Han Wang and Roy Ka-Wei Lee},
  title = {MemeCraft: Contextual and Stance-Driven Multimodal Meme Generation},
  year = {2024}
}

@misc{singh2020mmf,
  author = {Singh, Amanpreet and Goswami, Vedanuj and Natarajan, Vivek and Jiang, Yu and Chen, Xinlei and Shah, Meet and Rohrbach, Marcus and Batra, Dhruv and Parikh, Devi},
  title = {MMF: A multimodal framework for vision and language research},
  howpublished = {\url{https://github.com/facebookresearch/mmf}},
  year = {2020}
}
For questions or feedback, email [han_wang@sutd.edu.sg].