COSMOS: A Python repository from DenizUgur

COSMOS: Catching Out-of-Context Misinformation using Self-Supervised Learning

The COSMOS dataset consists of images and captions scraped from news articles and other websites, designed for training and evaluating models that detect out-of-context use of images. We refer readers to the paper for more details. To get access to the dataset, please fill out this form. We will provide you with a script to download the dataset. The official documentation for the project can be found here.

Dataset Description

Dataset Statistics

The COSMOS dataset consists of three splits: Training (about 160K images), Validation (about 40K images), and Test (1,700 images). The training and validation splits carry no out-of-context annotations; we use such annotations only at the end, on the test split, to evaluate our model. The dataset stats are listed below.

Table 1: Dataset stats.

Split	# Images	# Captions	Context Annotation
Train	161,752	360,749	No
Valid	41,006	90,036	No
Test	1,700	3,400	Yes

Data Format

The COSMOS training, validation and test sets are provided as JSON (JavaScript Object Notation) text files with the following attributes for every data sample stored as a dictionary:

File Structure for train.json and val.json

{
    "img_local_path": <img_path>,
    "articles": [
        { "caption": <caption1>,
          "article_url": <url1>,
          "caption_modified": <caption_mod1>,
          "entity_list": <entity_list1> },

        { "caption": <caption2>,
          "article_url": <url2>,
          "caption_modified": <caption_mod2>,
          "entity_list": <entity_list2> },

        { "caption": <caption3>,
          "article_url": <url3>,
          "caption_modified": <caption_mod3>,
          "entity_list": <entity_list3> },

        ......
    ],
    "maskrcnn_bboxes": [ [x1,y1,x2,y2], [x1,y1,x2,y2], ... ]
}

Table 2: Attributes in Train/Validation files.

Key	Description
`img_local_path`	Source path of the image in the dataset directory
`articles`	List of dicts containing metadata for every caption associated with the image
`caption`	Original caption scraped from the news website
`article_url`	Link to the website the image and caption were scraped from
`caption_modified`	Modified caption after applying spaCy NER (we used these captions as input to our model during experiments)
`entity_list`	List mapping each modified named entity in the caption to its corresponding hypernym
`maskrcnn_bboxes`	List of detected bounding boxes for the image. (x1, y1) is the start vertex of the rectangle and (x2, y2) is the end vertex

Note that for detecting bounding boxes, we used the pretrained Detectron2 model linked here. We detect up to 10 bounding boxes per image.
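
As a rough illustration of the layout above, the following sketch reads one train/val record and pulls out its caption pairs. The helper name `iter_captions` and all sample values are hypothetical. How records are arranged inside train.json (a single array or one object per line) is not stated here, so the example parses a single record from a string rather than opening the file.

```python
import json


def iter_captions(record):
    """Yield (original, modified) caption tuples for one train/val record."""
    for article in record.get("articles", []):
        yield article["caption"], article["caption_modified"]


# Hypothetical record with made-up values, following the structure above.
sample = json.loads("""
{
  "img_local_path": "train/0.jpg",
  "articles": [
    {"caption": "Alice visits Paris",
     "article_url": "https://example.com/a",
     "caption_modified": "PERSON visits GPE",
     "entity_list": []}
  ],
  "maskrcnn_bboxes": [[10, 20, 110, 220]]
}
""")

captions = list(iter_captions(sample))
num_boxes = len(sample["maskrcnn_bboxes"])
print(sample["img_local_path"], captions, num_boxes)
```

The modified caption is the one the table describes as model input, so a training loop would most likely consume the second element of each tuple.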

File Structure for test.json

{
    "img_local_path": <img_path>,
    "caption1": <caption1>,
    "caption1_modified": <caption1_modified>,
    "caption1_entities": <caption1_entities>,
    "caption2": <caption2>,
    "caption2_modified": <caption2_modified>,
    "caption2_entities": <caption2_entities>,
    "article_url": <article_url>,
    "label": "ooc/not-ooc",
    "maskrcnn_bboxes": [ [x1,y1,x2,y2], [x1,y1,x2,y2], ... ]
}

Table 3: Attributes in Test file.

Key	Description
`img_local_path`	Source path of the image in the dataset directory
`caption1`	First caption associated with the image
`caption1_modified`	Modified caption1 after applying spaCy NER
`caption1_entities`	List mapping each modified named entity in caption1 to its corresponding hypernym
`caption2`	Second caption associated with the image
`caption2_modified`	Modified caption2 after applying spaCy NER
`caption2_entities`	List mapping each modified named entity in caption2 to its corresponding hypernym
`article_url`	Link to the website the image and caption were scraped from
`label`	Class label indicating whether the two captions are out-of-context with respect to the image (1 = Out-of-Context, 0 = Not-Out-of-Context)
`maskrcnn_bboxes`	List of detected bounding boxes for the image. (x1, y1) is the start vertex of the rectangle and (x2, y2) is the end vertex
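
The structure listing shows the label as a string ("ooc/not-ooc"), while Table 3 describes it as 1/0. A defensive reader could accept both spellings. The sketch below is one way to do that; the function name `is_out_of_context` and the sample record are hypothetical, and which encoding the distributed test.json actually uses is an assumption to verify against the file.

```python
def is_out_of_context(label):
    """Normalize a test label to a bool.

    Assumption: the label may arrive as 1/0 (Table 3) or as the strings
    "ooc"/"not-ooc" (structure listing); both forms are accepted here.
    """
    if isinstance(label, str):
        text = label.strip().lower()
        if text in {"ooc", "1"}:
            return True
        if text in {"not-ooc", "0"}:
            return False
        raise ValueError(f"unrecognized label: {label!r}")
    if label in (0, 1):
        return bool(label)
    raise ValueError(f"unrecognized label: {label!r}")


# Hypothetical test record with made-up values.
record = {
    "img_local_path": "test/0.jpg",
    "caption1_modified": "PERSON speaks at a rally",
    "caption2_modified": "PERSON attends a funeral",
    "label": 1,
}

pair = (record["caption1_modified"], record["caption2_modified"])
print(pair, is_out_of_context(record["label"]))
```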

Getting started

The code is well-documented and should be easy to follow.

Source Code: Clone this repo with git clone and install the Python dependencies from requirements.txt. The source code is implemented in PyTorch, so familiarity with PyTorch is expected.
Dataset: Download the dataset by filling out the form here.
Visualize Dataset: It is difficult to view the dataset using only the JSON files. Navigate to the directory dataset_visualizer and follow the instructions to visualize the dataset using a simple Python-Flask based web tool.
Train and Test For Image-Text Matching Task: This code uses Detectron2 to extract features from objects present in the image. Please set up and install detectron2 first if you wish to use our feature detector for images. The minimal changes to the detectron2 source code needed to extract object features are provided in the detectron2_changes directory. Navigate to the detectron2 source code directory and simply run patch -p1 ../detectron2 < 0001-detectron2-mod.patch. Consider setting up detectron2 inside this directory; it worked seamlessly for me without many changes.
All the training parameters are configured via utils/config.py. Specify hyperparameters, text embeddings, threshold values, etc. in the config.py file. Model names are specified in the trainer script itself. Configure these parameters according to your needs and start training.
To train the model, execute the following command: python trainer_scipt.py -m train
Once training is finished, evaluate the model on Match vs No-Match accuracy with the following command: python trainer_scipt.py -m eval

Evaluating

Test For Out-of-Context Detection Accuracy: Once training is over, specify the model name in evaluate_ooc.py to evaluate the model on the out-of-context detection task.

To reproduce our results, execute the following commands for each section.

Section 3.1: Differential Sensing

    export COSMOS_IOU=0.5
    export COSMOS_DISABLE_ISFAKE=1
    export COSMOS_DISABLE_RECT_OPTIM=1

    python evaluate_ooc.py
    
    unset $(env | sed -n 's/^\(COSMOS.*\)=.*/\1/p')

Section 3.2: Fake-or-Fact

    export COSMOS_IOU=0.5
    export COSMOS_DISABLE_ISOPPOSITE=1
    export COSMOS_DISABLE_RECT_OPTIM=1

    python evaluate_ooc.py
    
    unset $(env | sed -n 's/^\(COSMOS.*\)=.*/\1/p')

Section 3.3: Object-Caption Matching

    export COSMOS_IOU=0.5
    export COSMOS_WORD_DISABLE=1

    python evaluate_ooc.py
    
    unset $(env | sed -n 's/^\(COSMOS.*\)=.*/\1/p')

Section 3.4: IoU Threshold Adjustment

    export COSMOS_IOU=0.25
    export COSMOS_WORD_DISABLE=1
    export COSMOS_DISABLE_RECT_OPTIM=1

    python evaluate_ooc.py
    
    unset $(env | sed -n 's/^\(COSMOS.*\)=.*/\1/p')

Section 3.1 + 3.2 + 3.3 + 3.4: Proposed Method

    python evaluate_ooc.py
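
The per-section runs above could also be driven from Python. The following is only a sketch: `SECTION_ENV`, `METHOD_VARS`, and `build_env` are hypothetical names, and the values are copied from the export lines above. Unlike the `unset` one-liner, which clears every variable starting with COSMOS, this version clears only the method-specific variables so that dataset variables such as COSMOS_BASE_DIR survive between runs.

```python
import os

# Environment overrides for each run, copied from the commands above.
SECTION_ENV = {
    "3.1": {"COSMOS_IOU": "0.5", "COSMOS_DISABLE_ISFAKE": "1",
            "COSMOS_DISABLE_RECT_OPTIM": "1"},
    "3.2": {"COSMOS_IOU": "0.5", "COSMOS_DISABLE_ISOPPOSITE": "1",
            "COSMOS_DISABLE_RECT_OPTIM": "1"},
    "3.3": {"COSMOS_IOU": "0.5", "COSMOS_WORD_DISABLE": "1"},
    "3.4": {"COSMOS_IOU": "0.25", "COSMOS_WORD_DISABLE": "1",
            "COSMOS_DISABLE_RECT_OPTIM": "1"},
    "proposed": {},
}

# Method-specific variables that must not leak in from the calling shell.
METHOD_VARS = (
    "COSMOS_IOU",
    "COSMOS_WORD_DISABLE",
    "COSMOS_DISABLE_ISOPPOSITE",
    "COSMOS_DISABLE_ISFAKE",
    "COSMOS_DISABLE_RECT_OPTIM",
)


def build_env(section, base_env=None):
    """Return a copy of the environment configured for one section's run."""
    env = dict(os.environ if base_env is None else base_env)
    for name in METHOD_VARS:
        env.pop(name, None)  # start from a clean slate for method flags
    env.update(SECTION_ENV[section])
    return env


# Usage, run from the repository root (not executed here):
#   import subprocess, sys
#   subprocess.run([sys.executable, "evaluate_ooc.py"],
#                  env=build_env("3.1"), check=True)
```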

Environment Variables

These variables modify the behaviour of our evaluation method.

Method specific variables

Variable Name	Description
COSMOS_DISABLE_ISOPPOSITE	Disables Section 3.1 Differential Sensing.
COSMOS_DISABLE_ISFAKE	Disables Section 3.2 Fake-or-Fact.
COSMOS_DISABLE_RECT_OPTIM	Disables Section 3.3 Object-Caption Matching.
COSMOS_IOU	Setting it to "0.5" disables Section 3.4, and "0.25" enables it.
COSMOS_WORD_DISABLE	Disables Section 3.1 and 3.2 altogether.
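
For readers unfamiliar with the quantity that COSMOS_IOU thresholds, the sketch below computes the standard intersection-over-union of two boxes in the [x1, y1, x2, y2] format used by `maskrcnn_bboxes`. This is the textbook definition, offered as an illustration; how evaluate_ooc.py computes and applies the threshold internally is not shown here, and the function name `iou` is hypothetical.

```python
def iou(box_a, box_b):
    """Intersection-over-union of two [x1, y1, x2, y2] boxes."""
    ax1, ay1, ax2, ay2 = box_a
    bx1, by1, bx2, by2 = box_b
    # Width and height of the overlap, clamped to zero when boxes are disjoint.
    inter_w = max(0.0, min(ax2, bx2) - max(ax1, bx1))
    inter_h = max(0.0, min(ay2, by2) - max(ay1, by1))
    inter = inter_w * inter_h
    union = (ax2 - ax1) * (ay2 - ay1) + (bx2 - bx1) * (by2 - by1) - inter
    return inter / union if union > 0 else 0.0


# Two 10x10 boxes overlapping by half: intersection 50, union 150.
score = iou([0, 0, 10, 10], [5, 0, 15, 10])
print(score >= 0.25, score >= 0.5)
```

In this example the pair clears the 0.25 threshold but not the 0.5 one, which gives a feel for how lowering the threshold admits more loosely overlapping boxes.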

Dataset specific variables

Variable Name	Description
COSMOS_BASE_DIR	Base directory.
COSMOS_DATA_DIR (optional)	Data directory.
COSMOS_TARGET_DIR (optional)	Target directory.

Comparison specific variables

Variable Name	Description
COSMOS_COMPARE	Setting this to "1" will enable comparison with original paper.
COSMOS_COMPARE_LEVEL	Choose between [0, 1, 2] (0 is default). Changes verbosity level.

Citation

If you find our dataset or paper useful for your research, please include the following citation:

@inproceedings{10.1145/3458305.3479968,
	author = {Akgul, Tankut and Civelek, Tugce Erkilic and Ugur, Deniz and Begen, Ali C.},
	title = {COSMOS on Steroids: A Cheap Detector for Cheapfakes},
	year = {2021},
	isbn = {9781450384346},
	publisher = {Association for Computing Machinery},
	address = {New York, NY, USA},
	url = {https://doi.org/10.1145/3458305.3479968},
	doi = {10.1145/3458305.3479968},
	numpages = {5},
	keywords = {RNN, fake, SBERT, BERT, differential sensing, IoU, Cheapfakes},
	location = {Istanbul, Turkey},
	series = {MMSys '21}
}
