/plino

Flask based spam filtering system built on top of https://github.com/prodicus/spammy

An intelligent spam filtering system built using a custom Naive Bayes classifier

▶️ You can try it out at https://plino.herokuapp.com/

This app is built directly on the work I did on https://github.com/tasdikrahman/spammy



Demo


[Screenshots: desktop view, mobile view]

REST API usage

Yes, we do provide an API for our service!

using curl

General Syntax

$ curl -H "Content-Type: application/json" \
-X POST -d \
'{"email_text":"SAMPLE EMAIL TEXT"}' \
https://plino.herokuapp.com/api/v1/classify/

Show me an example

You thought I was lying!

$ curl -H "Content-Type: application/json" \
-X POST -d \
'{"email_text":"Dear Tasdik, I would like to immediately transfer 10000 thousand dollars to your account as my beloved husband has expired and I have nobody to ask for to transfer the money to your account. I come from the family of the royal prince of burkino fasa and I would be more than obliged to take your help on this matter. Would you care to share your bank account details with me in the next email conversation that we have? -regards -Liah herman"}' \
https://plino.herokuapp.com/api/v1/classify/

JSON response

{
  "email_class": "spam", 
  "email_text": "Dear Tasdik, I would like to immediately transfer 10000 thousand dollars to your account as my beloved husband has expired and I have nobody to ask for to transfer the money to your account. I come from the family of the royal prince of burkino fasa and I would be more than obliged to take your help on this matter. Would you care to share your bank account details with me in the next email conversation that we have? -regards -Liah herman", 
  "status": 200
}
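
The JSON response shown above can also be consumed from code. Here is a minimal sketch, assuming the response always has the shape shown (`email_class`, `email_text`, `status`). The helper name `parse_classification` is hypothetical and not part of plino, and the sample body is a shortened stand-in for a real response.

```python
import json


def parse_classification(response_body):
    """Return (email_class, status) from a classify response body.

    Assumes the JSON shape shown in this README; raises ValueError otherwise.
    """
    data = json.loads(response_body)  # invalid JSON also raises ValueError
    if not isinstance(data, dict):
        raise ValueError("expected a JSON object")
    if "email_class" not in data or "status" not in data:
        raise ValueError("unexpected response shape")
    return data["email_class"], data["status"]


# Shortened stand-in for a real response body (an assumption for illustration).
sample = '{"email_class": "spam", "email_text": "SAMPLE EMAIL TEXT", "status": 200}'
email_class, status = parse_classification(sample)
print(email_class, status)
```

Checking the shape up front keeps a changed or error response from surfacing later as a confusing `KeyError`.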

using requests

How can we forget our beloved requests module!

>>> import requests
>>> import json
>>> import pprint
>>>
>>> api_url = "https://plino.herokuapp.com/api/v1/classify/"
>>> payload = {
...     'email_text': 'Dear Tasdik, I would like to immediately transfer 10000 '
...                   'thousand dollars to your account as my beloved husband has '
...                   'expired and I have nobody to ask for to transfer the money '
...                   'to your account. I come from the family of the royal prince '
...                   'of burkino fasa and I would be more than obliged to take '
...                   'your help on this matter. Would you care to share your bank '
...                   'account details with me in the next email conversation that '
...                   'we have? -regards -Liah herman'
... }
>>>
>>> headers = {'content-type': 'application/json'}
>>> # query our API
>>> response = requests.post(api_url, data=json.dumps(payload), headers=headers)
>>> response.status_code
200
>>> pprint.pprint(response.json())
{'email_class': 'spam',
 'email_text': 'Dear Tasdik, I would like to immediately transfer 10000 '
               'thousand dollars to your account as my beloved husband has '
               'expired and I have nobody to ask for to transfer the money '
               'to your account. I come from the family of the royal prince '
               'of burkino fasa and I would be more than obliged to take '
               'your help on this matter. Would you care to share your bank '
               'account details with me in the next email conversation that '
               'we have? -regards -Liah herman',
 'status': 200}
>>> 

Using the Python 3 standard library

The requests module really makes our lives easy, and I use it all the time. But, sigh, there should be an example using the standard library, so here it is:

>>> import urllib.request
>>> import json
>>> import pprint 
>>>
>>> url = "https://plino.herokuapp.com/api/v1/classify/"
>>> req = urllib.request.Request(url)
>>> req.add_header(
...     'Content-Type',
...     'application/json; charset=utf-8'
... )
>>>
>>> body = {
...     'email_text': 'Dear Tasdik, I would like to immediately transfer 10000 '
...                   'thousand dollars to your account as my beloved husband has '
...                   'expired and I have nobody to ask for to transfer the money '
...                   'to your account. I come from the family of the royal prince '
...                   'of burkino fasa and I would be more than obliged to take '
...                   'your help on this matter. Would you care to share your bank '
...                   'account details with me in the next email conversation that '
...                   'we have? -regards -Liah herman'
... }
>>> json_data = json.dumps(body).encode('utf-8')   # needs to be bytes
>>> req.add_header('Content-Length', len(json_data))
>>>
>>> with urllib.request.urlopen(req, json_data) as f:
...   print(f.read().decode('utf-8'))
... 
{
  "email_class": "spam", 
  "email_text": "Dear Tasdik, I would like to immediately transfer 10000 thousand dollars to your account as my beloved husband has expired and I have nobody to ask for to transfer the money to your account. I come from the family of the royal prince of burkino fasa and I would be more than obliged to take your help on this matter. Would you care to share your bank account details with me in the next email conversation that we have? -regards -Liah herman", 
  "status": 200
}
>>> 
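
The standard-library steps above can be wrapped in a small helper. This is only a sketch: the function name `build_classify_request` is hypothetical, while the URL and the `email_text` field are taken from the examples above. Building the request does not touch the network.

```python
import json
import urllib.request

API_URL = "https://plino.herokuapp.com/api/v1/classify/"


def build_classify_request(email_text, url=API_URL):
    """Build (but do not send) a POST request for the classify endpoint."""
    # json.dumps escapes quotes and newlines in the email text for us.
    body = json.dumps({"email_text": email_text}).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        headers={"Content-Type": "application/json; charset=utf-8"},
        method="POST",
    )


req = build_classify_request('He said "send me your bank details"')
print(req.get_method(), req.full_url)
print(req.data.decode("utf-8"))
```

To send it, pass `req` to `urllib.request.urlopen` as in the session above. Letting `json.dumps` build the body avoids the quoting problems that hand-written JSON runs into when the email text itself contains quotes.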

Technologies used

Built upon the giant shoulders of (in no particular order)

Backend

and some more

Front end


Contributing

Installing it locally

$ virtualenv env              # Create virtual environment
$ source env/bin/activate     # Change default python to virtual one
(env)$ git clone https://github.com/tasdikrahman/plino.git
(env)$ cd plino
(env)$ pip install -r requirements.txt

Running it

$ make run

Refer to CONTRIBUTING.md for a detailed reference.

Contributors


FAQ

What is the classifier based on?

This repo is built directly on the work I did on tasdikrahman/spammy

What did you train the classifier on?

The pickled classifier was trained on close to 33,000 emails picked from the publicly available Enron dataset. The training emails are held in the full_corpus directory.
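
For readers unfamiliar with the technique, here is a toy sketch of how a Naive Bayes text classifier is trained and applied. It is not spammy's implementation: the class name `ToyNaiveBayes`, the whitespace tokenizer, the add-one smoothing, and the four training sentences are all assumptions made for illustration.

```python
import math
from collections import Counter, defaultdict


class ToyNaiveBayes:
    """Multinomial Naive Bayes with add-one (Laplace) smoothing."""

    def __init__(self):
        self.doc_counts = Counter()              # documents seen per label
        self.word_counts = defaultdict(Counter)  # word frequencies per label
        self.vocab = set()                       # every word seen in training

    def train(self, text, label):
        words = text.lower().split()
        self.doc_counts[label] += 1
        self.word_counts[label].update(words)
        self.vocab.update(words)

    def classify(self, text):
        total_docs = sum(self.doc_counts.values())
        scores = {}
        for label in self.doc_counts:
            # Log prior: fraction of training documents with this label.
            score = math.log(self.doc_counts[label] / total_docs)
            total_words = sum(self.word_counts[label].values())
            for word in text.lower().split():
                # Smoothed log likelihood of the word given the label.
                count = self.word_counts[label][word] + 1
                score += math.log(count / (total_words + len(self.vocab)))
            scores[label] = score
        return max(scores, key=scores.get)


# Made-up training data, far smaller than a real corpus.
clf = ToyNaiveBayes()
clf.train("transfer money to your bank account now", "spam")
clf.train("share your bank account details", "spam")
clf.train("meeting notes attached for tomorrow", "ham")
clf.train("see you at lunch tomorrow", "ham")
print(clf.classify("please transfer the money"))  # spam
print(clf.classify("lunch meeting tomorrow"))     # ham
```

Working in log space avoids numeric underflow when many small probabilities are multiplied, and the smoothing keeps a single unseen word from zeroing out a label's score.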

How accurate is it?

I will leave that for you to decide. But for the question's sake, decent enough! 😄


Roadmap

  • Deploying to heroku
  • Creating a REST API
  • Improving the UI
  • Writing tests
  • Simple API authentication

Legal Stuff

Licensed under GNU GPLv3

plino: A spam filtering system
Copyright (C) 2016  Tasdik Rahman

This program is free software: you can redistribute it and/or modify
it under the terms of the GNU General Public License as published by
the Free Software Foundation, either version 3 of the License, or
(at your option) any later version.

This program is distributed in the hope that it will be useful,
but WITHOUT ANY WARRANTY; without even the implied warranty of
MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
GNU General Public License for more details.

You should have received a copy of the GNU General Public License
along with this program.  If not, see <http://www.gnu.org/licenses/>.

You can find the full copy of the license in the LICENSE file.


Donation

If you have found my little bits of software to be of any use to you, do consider helping me pay my internet bills :)

PayPal Donate via PayPal!
Gratipay Support via Gratipay
Patreon Support me on Patreon
£ (GBP) Donate via TransferWise!
€ Euros Donate via TransferWise!
₹ (INR) Donate via instamojo