Chatbot for registering urban problems

Bot137

Bot137 is a demo chatbot for registering urban problems. It aims to improve the current issue-reporting process by giving citizens an easy, convenient way to report problems such as potholes, damaged sidewalks, and broken street lights directly to the responsible authorities.

Note that this chatbot is trained on a limited dataset generated by the creators and their friends. The data is unofficial; the purpose of the demo is to show the potential for improving the current 137 call center of the Municipality of Tehran.

The chatbot uses natural language processing to understand and categorize reported issues based on the information the user provides. Details of model training are given in the Text Classification Report below.

Running demo

git clone https://github.com/ahkarimi/bot137.git
cd bot137/Demo
pip install -r requirements.txt
python wsgi.py
Then open http://localhost:5000 in a browser.

Text Classification Report

Text classification is a common problem in Natural Language Processing (NLP) that involves assigning predefined categories to a given text document. In this report, we compare four popular machine learning algorithms for text classification: RandomForestClassifier, LinearSVC, MultinomialNB, and LogisticRegression.

Methodology:

For text representation, we used Term Frequency-Inverse Document Frequency (TF-IDF), a widely used weighting scheme for text classification. Each algorithm was trained on a dataset of text documents with their corresponding labels, and its performance was evaluated using accuracy.
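To make the TF-IDF representation concrete, here is a minimal standard-library sketch of the textbook weighting, tf × log(N / df). The `tfidf` helper and the sample reports are hypothetical and written only for illustration; library implementations (for example scikit-learn's `TfidfVectorizer`) use a smoothed, normalized variant, so their numbers will differ.

```python
import math
from collections import Counter


def tfidf(docs):
    """Return one {term: weight} dict per document (textbook tf * log(N / df))."""
    # Naive whitespace tokenization; real pipelines use a proper tokenizer.
    tokenized = [doc.lower().split() for doc in docs]
    n_docs = len(tokenized)
    # Document frequency: number of documents that contain each term.
    df = Counter(term for tokens in tokenized for term in set(tokens))
    weights = []
    for tokens in tokenized:
        counts = Counter(tokens)
        total = len(tokens)
        weights.append({
            term: (count / total) * math.log(n_docs / df[term])
            for term, count in counts.items()
        })
    return weights


# Invented sample reports, not taken from the project's data.
reports = [
    "pothole on the main street",
    "broken street light on the corner",
    "damaged sidewalk near the school",
]
vectors = tfidf(reports)
```

A term that occurs in every report (here "the") gets weight zero, while a term unique to one report (such as "pothole") gets the highest weight in that report, which is what makes the representation useful for telling categories apart.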

Results:

LogisticRegression achieved the highest accuracy at 95.30%, followed by LinearSVC at 94.92%, RandomForestClassifier at 93.40%, and MultinomialNB at 92.22%.
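A comparison of this kind could be set up roughly as follows. This is a sketch under assumptions, not the project's training code: it assumes scikit-learn (the classifier names in this report match its classes), default hyperparameters, a single stratified train/test split, and a tiny invented dataset. The `compare_classifiers` helper is hypothetical, and the scores it produces will not match the figures reported here.

```python
from sklearn.ensemble import RandomForestClassifier
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split
from sklearn.naive_bayes import MultinomialNB
from sklearn.pipeline import make_pipeline
from sklearn.svm import LinearSVC


def compare_classifiers(texts, labels, test_size=0.25, seed=0):
    """Train four TF-IDF pipelines and return {model name: test accuracy}."""
    X_train, X_test, y_train, y_test = train_test_split(
        texts, labels, test_size=test_size, random_state=seed, stratify=labels
    )
    models = {
        "RandomForestClassifier": RandomForestClassifier(random_state=seed),
        "LinearSVC": LinearSVC(random_state=seed),
        "MultinomialNB": MultinomialNB(),
        "LogisticRegression": LogisticRegression(max_iter=1000),
    }
    scores = {}
    for name, model in models.items():
        # The vectorizer is fitted on the training split only, inside the pipeline.
        pipeline = make_pipeline(TfidfVectorizer(), model)
        pipeline.fit(X_train, y_train)
        scores[name] = accuracy_score(y_test, pipeline.predict(X_test))
    return scores


# Invented toy data (assumption): three categories, four reports each.
texts = [
    "large pothole on the main street",
    "deep pothole near the bus stop",
    "pothole damaged my car tire",
    "road surface has a big pothole",
    "street light is broken on our block",
    "the lamp post light does not turn on",
    "broken light in the park at night",
    "street light flickers and goes dark",
    "sidewalk tiles are cracked and loose",
    "damaged sidewalk near the school",
    "sidewalk is blocked by broken pavement",
    "uneven sidewalk is dangerous for walkers",
]
labels = ["pothole"] * 4 + ["light"] * 4 + ["sidewalk"] * 4

scores = compare_classifiers(texts, labels)
for name, accuracy in sorted(scores.items(), key=lambda item: -item[1]):
    print(f"{name}: {accuracy:.2%}")
```

With so few samples the ranking is not meaningful; on a real dataset, cross-validation would give a more stable comparison than a single split.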

Conclusion:

Based on these results, LogisticRegression is the best performer for text classification in this demo, although all four algorithms fall within about three percentage points of each other. Performance may vary with the size and quality of the data and the specific requirements of the problem, so the results are best read as a starting point for further analysis and improvement.