Sarcasm Detector

About the Project
- Built With
Getting Started
- Prerequisites
Roadmap
Contributing
License
Contact
Acknowledgements

About The Project

Sarcasm Detector analyses text entered by a user and predicts whether it is sarcastic. Sarcasm, which can be both funny and nasty, plays an important part in human social interaction. Sarcasm detection is a narrow research field in NLP: a special case of sentiment analysis in which, instead of detecting sentiment across the whole spectrum, the focus is on sarcasm alone. The task is therefore to decide whether a given text is sarcastic or not.

People may disagree about the purpose of sarcasm, but by convention it uses positive words to convey a negative message. How it is used varies from person to person and depends heavily on culture, gender and many other factors; Americans and Indians, for example, perceive sarcasm differently. Moreover, a speaker being sarcastic does not mean the listener perceives it as intended. This subjectivity affects the performance of deep learning (DL) models.

Context

Past studies in sarcasm detection mostly use Twitter datasets collected with hashtag-based supervision, but such datasets are noisy in terms of both labels and language. Furthermore, many tweets are replies to other tweets, and detecting sarcasm in them requires the contextual tweets to be available.

To overcome the limitations related to noise in Twitter datasets, this News Headlines dataset for Sarcasm Detection was collected from two news websites. TheOnion aims at producing sarcastic versions of current events, and we collected all the headlines from its News in Brief and News in Photos categories (which are sarcastic). We collected real (and non-sarcastic) news headlines from HuffPost.

This new dataset has the following advantages over the existing Twitter datasets:

Since news headlines are written by professionals in a formal manner, there are no spelling mistakes or informal usage. This reduces sparsity and increases the chance of finding pre-trained embeddings.

Furthermore, since the sole purpose of TheOnion is to publish sarcastic news, we get high-quality labels with much less noise than in Twitter datasets.

Unlike tweets that are replies to other tweets, the news headlines we obtained are self-contained. This helps in teasing apart the real sarcastic elements.

Content

Each record consists of three attributes:

is_sarcastic: 1 if the record is sarcastic, otherwise 0

headline: the headline of the news article

article_link: link to the original news article, useful for collecting supplementary data
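
As a minimal sketch of how these records might be read in Python, the snippet below assumes the dataset is stored as JSON Lines (one JSON object per line) with the three attributes listed above. The helper names `parse_records` and `label_counts` and the sample lines are hypothetical and are not taken from this repository.

```python
import json
from collections import Counter


def parse_records(lines):
    """Parse an iterable of JSON Lines strings into a list of dicts.

    Assumption: each non-empty line is one JSON object with the keys
    is_sarcastic, headline and article_link.
    """
    records = []
    for line in lines:
        line = line.strip()
        if not line:
            continue  # skip blank lines
        record = json.loads(line)
        records.append(
            {
                "is_sarcastic": int(record["is_sarcastic"]),
                "headline": record["headline"],
                "article_link": record["article_link"],
            }
        )
    return records


def label_counts(records):
    """Count how many records carry each is_sarcastic label (0 or 1)."""
    return Counter(record["is_sarcastic"] for record in records)


# Made-up sample lines, used only to illustrate the record layout.
sample_lines = [
    '{"is_sarcastic": 1, "headline": "example sarcastic headline", "article_link": "https://example.com/a"}',
    '{"is_sarcastic": 0, "headline": "example regular headline", "article_link": "https://example.com/b"}',
]
sample_records = parse_records(sample_lines)
print(label_counts(sample_records))
```

Because `parse_records` accepts any iterable of lines, an open file object can be passed to it directly in place of `sample_lines`. Checking the label counts first is a quick way to see how balanced the two classes are before training.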

Further Details

General statistics of the data, instructions on how to read the data in Python, and a basic exploratory analysis can be found at this GitHub repo. A hybrid NN architecture trained on this dataset can be found at this GitHub repo.

Inspiration

Can you identify sarcastic sentences? Can you distinguish between fake news and legitimate news?

Built With

This project was built using the following frameworks and libraries:

TensorFlow
Keras

Getting Started

To run this project, follow the steps below.

Prerequisites

You need the following packages to build and run this project.

cmd:\ pip install tensorflow
cmd:\ pip install keras

Extra SETUP

Create a conda environment and set up the project inside it.
Install the requirements listed under Prerequisites in that environment.
You also need a Python IDE or notebook, such as Jupyter Notebook or Spyder.
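
One way to confirm that the environment is ready is to check that the interpreter can find the packages named under Prerequisites. The following is a minimal sketch that uses only the standard library; the helper name `missing_packages` is hypothetical.

```python
import importlib.util


def missing_packages(names):
    """Return the names from `names` that cannot be located for import."""
    return [name for name in names if importlib.util.find_spec(name) is None]


# Packages listed under Prerequisites.
required = ["tensorflow", "keras"]
missing = missing_packages(required)
if missing:
    print("Missing packages:", ", ".join(missing))
else:
    print("All prerequisites are installed.")
```

Run it from inside the activated conda environment, so that the check applies to the interpreter the project will actually use.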

Roadmap

See the open issues for a list of proposed features (and known issues).

Contributing

Contributions are what make the open source community such an amazing place to learn, inspire, and create. Any contributions you make are greatly appreciated.

Fork the Project
Create your Feature Branch (git checkout -b feature/AmazingFeature)
Commit your Changes (git commit -m 'Add some AmazingFeature')
Push to the Branch (git push origin feature/AmazingFeature)
Open a Pull Request

License

MIT license

Contact

Aditya Mangla - @aadimangla - aadimangla@gmail.com - adityamangla.com

Project Link: https://github.com/aadimangla/Sarcasm-Detector
