NLP project for the Information Retrieval course

The goal of the project was to reproduce this paper on stance classification. The system is trained to determine the stance of an article headline (for, against, or observing) with respect to a claim. The team reproduced the full study and improved the model's accuracy.

A final report can be found in this paper.

Andrea Alfieri - @andreaalf97
Diego Albo - @DiegoStock12
Tomasz Motyka - @motykatomasz
Avinash Saravanan - @asarav

Board Management

If you want to assign someone to a card, convert the card to an issue.
To ensure that the board carries out automated actions, make sure you associate pull requests with issues.

Directory Structure (Explanation of Folders and Files)

Data
- CSV Files
  - Contains the original Emergent data that we work from. The cleaned file has had extraneous data and invalid information removed.
- Pickled Features
  - Contains the features that have been extracted. The files can be read using Pandas or Pickle.
- PPDB (Paraphrase Database)
Data Reading
- Contains files used for reading data in from the data folder.
Evaluation
- Contains files used for classifier training.
Feature Extraction
- Contains files used to extract features to their respective pickle files.
Feature Selection
- Contains ablation tests, statistical tests, and forward/backward selection which are used to determine the usefulness of features.
Visualization
- Contains scripts used to generate graphs and charts from extracted features.
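
As noted under Pickled Features, the extracted features can be read with Pandas or Pickle. The snippet below is a minimal sketch of that, not code taken from this repository: the helper `load_features`, the file name `example_feature.pkl`, and the feature values are all hypothetical, and the demo writes its own throwaway file so it runs without the project data.

```python
import pickle
import tempfile
from pathlib import Path

try:
    import pandas as pd  # optional; used when it is installed
except ImportError:
    pd = None


def load_features(path):
    """Load a pickled feature file with pandas if available, else with pickle."""
    if pd is not None:
        return pd.read_pickle(path)
    with open(path, "rb") as fh:
        return pickle.load(fh)


# Demo on a throwaway file. The file name and feature values are made up.
with tempfile.TemporaryDirectory() as tmp:
    demo_path = Path(tmp) / "example_feature.pkl"
    with open(demo_path, "wb") as fh:
        pickle.dump({"claim_id": [1, 2], "overlap": [0.5, 0.25]}, fh)
    features = load_features(demo_path)

print(features)
```

To use it on the real data, point `load_features` at one of the files in the Pickled Features folder. Only unpickle files you trust, since loading a pickle can execute arbitrary code.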
