Pinned Repositories
azure-sdk-for-python
Microsoft Azure SDK for Python
ChatBot
This chatbot is built in Python with NLTK. It's a basic chatbot.
index
My_Site
Language-Detection-From-Text---Bi-gram-based
It uses a bigram language model and a bigram frequency addition classifier for the language identification task. It is trained on six languages: German, English, Spanish, French, Italian and Dutch. The text corpus comes from the Wortschatz Leipzig Corpora; both the training and test corpora were taken from this source. The training corpus consists of 30000 sentences from the news/web domain, and the test corpus of 10000 unseen sentences from the same domain. The six languages were chosen because they are also present in the LIGA Twitter dataset, which consists of 9066 tweets. Note: the directory paths for the training and test corpora in language-test.py, language-train.py and liga_test.py need to be set for your environment.
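The frequency addition idea can be sketched roughly as follows. This is a minimal illustration, not the repository's code: it assumes character bigrams, and the function names (char_bigrams, train, identify) and the toy sentences are hypothetical stand-ins for the Leipzig corpora.

```python
from collections import Counter


def char_bigrams(text):
    """Return the character bigrams of the lowercased text (assumed unit)."""
    text = text.lower()
    return [text[i:i + 2] for i in range(len(text) - 1)]


def train(corpus_by_language):
    """Build a table of relative bigram frequencies for each language."""
    models = {}
    for lang, sentences in corpus_by_language.items():
        counts = Counter()
        for sentence in sentences:
            counts.update(char_bigrams(sentence))
        total = sum(counts.values())
        models[lang] = {bg: n / total for bg, n in counts.items()}
    return models


def identify(text, models):
    """Add up each language's frequencies for the bigrams in text.

    The language with the highest sum is returned; unseen bigrams add 0.
    """
    scores = {
        lang: sum(freqs.get(bg, 0.0) for bg in char_bigrams(text))
        for lang, freqs in models.items()
    }
    return max(scores, key=scores.get)


# Hypothetical toy corpus, only to make the sketch runnable.
toy_corpus = {
    "English": ["the weather is nice today", "this is the house of the king"],
    "German": ["das wetter ist heute schön", "dies ist das haus des königs"],
}
models = train(toy_corpus)
print(identify("the king is in the house", models))
print(identify("das haus ist schön", models))
```

With a real corpus, the dictionary passed to train would hold the training sentences for each of the six languages.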
Mail-Spam-Filtering
It uses machine learning models to predict whether an email is spam or legitimate. The best way to follow the implementation is through my blog post, which describes the steps to build a spam filter from scratch: https://appliedmachinelearning.wordpress.com/2017/01/23/nlp-blog-post/ It is a Python implementation using the Naive Bayes classifier and Support Vector Machines from the Scikit-learn ML library. Results are shown on two publicly available corpora: the Ling-spam corpus and the Enron-spam corpus. The link for downloading the corpus/dataset is given in the blog post. Note: the directory paths used for training and testing the models in lingspam_filter.py and euron-spamfilter.py need to be set for your environment.
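To show what a Naive Bayes spam classifier does, here is a from-scratch multinomial Naive Bayes sketch with add-one smoothing. It is a standard-library stand-in, not the Scikit-learn classifier the repository uses, and the names train_nb and predict_nb as well as the toy messages are hypothetical.

```python
import math
from collections import Counter, defaultdict


def train_nb(messages, labels):
    """Count words per class and messages per class."""
    word_counts = defaultdict(Counter)
    class_counts = Counter(labels)
    for text, label in zip(messages, labels):
        word_counts[label].update(text.lower().split())
    vocab = {word for counts in word_counts.values() for word in counts}
    return word_counts, class_counts, vocab


def predict_nb(text, model):
    """Return the class with the highest log-posterior for the text."""
    word_counts, class_counts, vocab = model
    total_docs = sum(class_counts.values())
    best_label, best_score = None, -math.inf
    for label, n_docs in class_counts.items():
        score = math.log(n_docs / total_docs)  # log prior
        total_words = sum(word_counts[label].values())
        for word in text.lower().split():
            if word not in vocab:
                continue  # ignore words never seen in training
            # Add-one (Laplace) smoothing avoids zero probabilities.
            likelihood = (word_counts[label][word] + 1) / (total_words + len(vocab))
            score += math.log(likelihood)
        if score > best_score:
            best_label, best_score = label, score
    return best_label


# Hypothetical toy messages, not taken from the Ling-spam or Enron corpora.
messages = [
    "win a free prize now",
    "free money click now",
    "meeting agenda for monday",
    "please review the attached report",
]
labels = ["spam", "spam", "ham", "ham"]
model = train_nb(messages, labels)
print(predict_nb("claim your free prize now", model))
print(predict_nb("please review the monday agenda", model))
```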
Object-recognition
This blog post demonstrates how to achieve 90% accuracy on the object recognition task on the CIFAR-10 dataset with the help of the following concepts: 1. Deep network architecture 2. Data augmentation 3. Regularization
Predict-the-Happiness-HackerEarth-Challenge
It uses a two-layer fully connected (dense) neural network to predict whether hotel reviews on the TripAdvisor site express positive or negative sentiment. It is a Python implementation that uses the Keras library for the DNN. The problem statement comes from the HackerEarth challenge "Predict the Happiness". The model achieved an accuracy score of 88% when the prediction file (sample_submisson.csv) was uploaded to the challenge portal. The link for downloading the corpus/dataset is given in the blog post.
Sentiment-Analysis-using-tf-idf---Polarity-dataset
It uses machine learning models to do sentiment polarity analysis on movie reviews; that is, it classifies the opinions expressed in a text review (document) to determine whether the reviewer’s sentiment towards the movie is positive or negative.
Text-classification-and-clustering
It demonstrates text classification and text clustering using k-NN and k-means models on tf-idf features.
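As a rough illustration of the classification half, the sketch below builds tf-idf vectors and applies k-NN with cosine similarity. It is written from scratch and is not the repository's code: the idf formula (log(N/df) + 1), the cosine metric, the function names and the toy documents are all assumptions, and k-means clustering is not shown.

```python
import math
from collections import Counter


def tfidf_vectors(docs):
    """Return sparse tf-idf vectors (term -> weight) and the idf table.

    docs is a list of token lists. idf = log(N / df) + 1 is an assumed
    smoothing choice, one of several common variants.
    """
    n_docs = len(docs)
    df = Counter(term for doc in docs for term in set(doc))
    idf = {term: math.log(n_docs / count) + 1.0 for term, count in df.items()}
    vectors = [
        {t: (c / len(doc)) * idf[t] for t, c in Counter(doc).items()}
        for doc in docs
    ]
    return vectors, idf


def cosine(a, b):
    """Cosine similarity between two sparse vectors."""
    dot = sum(w * b.get(t, 0.0) for t, w in a.items())
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def knn_predict(query_tokens, vectors, labels, idf, k=3):
    """Label the query by majority vote among its k most similar documents."""
    tf = Counter(query_tokens)
    # Terms unseen in training have no idf and are dropped.
    query = {t: (c / len(query_tokens)) * idf[t] for t, c in tf.items() if t in idf}
    ranked = sorted(
        range(len(vectors)), key=lambda i: cosine(query, vectors[i]), reverse=True
    )
    votes = Counter(labels[i] for i in ranked[:k])
    return votes.most_common(1)[0][0]


# Hypothetical toy documents, only to make the sketch runnable.
texts = [
    "the match ended with a late goal",
    "the team won the cup final",
    "the striker scored a goal",
    "the election results were announced",
    "parliament passed the new law",
    "the minister won the election",
]
labels = ["sport", "sport", "sport", "politics", "politics", "politics"]
vectors, idf = tfidf_vectors([t.split() for t in texts])
print(knn_predict("the striker scored in the final".split(), vectors, labels, idf))
print(knn_predict("parliament announced the election law".split(), vectors, labels, idf))
```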
Titanic-Sink-Analysis
The project is a statistical analysis in R that predicts survival based on age, sex ratio, tickets, male, female, children etc.
Surendra414's Repositories
Surendra414/Mail-Spam-Filtering
It uses machine learning models to predict whether an email is spam or legitimate. The best way to follow the implementation is through my blog post, which describes the steps to build a spam filter from scratch: https://appliedmachinelearning.wordpress.com/2017/01/23/nlp-blog-post/ It is a Python implementation using the Naive Bayes classifier and Support Vector Machines from the Scikit-learn ML library. Results are shown on two publicly available corpora: the Ling-spam corpus and the Enron-spam corpus. The link for downloading the corpus/dataset is given in the blog post. Note: the directory paths used for training and testing the models in lingspam_filter.py and euron-spamfilter.py need to be set for your environment.
Surendra414/Predict-the-Happiness-HackerEarth-Challenge
It uses a two-layer fully connected (dense) neural network to predict whether hotel reviews on the TripAdvisor site express positive or negative sentiment. It is a Python implementation that uses the Keras library for the DNN. The problem statement comes from the HackerEarth challenge "Predict the Happiness". The model achieved an accuracy score of 88% when the prediction file (sample_submisson.csv) was uploaded to the challenge portal. The link for downloading the corpus/dataset is given in the blog post.
Surendra414/Text-classification-and-clustering
It demonstrates text classification and text clustering using k-NN and k-means models on tf-idf features.
Surendra414/ChatBot
This chatbot is built in Python with NLTK. It's a basic chatbot.
Surendra414/index
My_Site
Surendra414/Language-Detection-From-Text---Bi-gram-based
It uses a bigram language model and a bigram frequency addition classifier for the language identification task. It is trained on six languages: German, English, Spanish, French, Italian and Dutch. The text corpus comes from the Wortschatz Leipzig Corpora; both the training and test corpora were taken from this source. The training corpus consists of 30000 sentences from the news/web domain, and the test corpus of 10000 unseen sentences from the same domain. The six languages were chosen because they are also present in the LIGA Twitter dataset, which consists of 9066 tweets. Note: the directory paths for the training and test corpora in language-test.py, language-train.py and liga_test.py need to be set for your environment.
Surendra414/Object-recognition
This blog post demonstrates how to achieve 90% accuracy on the object recognition task on the CIFAR-10 dataset with the help of the following concepts: 1. Deep network architecture 2. Data augmentation 3. Regularization
Surendra414/Sentiment-Analysis-using-tf-idf---Polarity-dataset
It uses machine learning models to do sentiment polarity analysis on movie reviews; that is, it classifies the opinions expressed in a text review (document) to determine whether the reviewer’s sentiment towards the movie is positive or negative.
Surendra414/Titanic-Sink-Analysis
The project is a statistical analysis in R that predicts survival based on age, sex ratio, tickets, male, female, children etc.
Surendra414/azure-sdk-for-python
Microsoft Azure SDK for Python
Surendra414/dockerfiles-windows
Various Dockerfiles for Windows Containers
Surendra414/gol
Surendra414/hello-world