Twitter_analyzer: A Jupyter Notebook repository from lmBored

From Tweets to Trends

Weird data challenge

Table of Contents

About The Project
- Built With
Getting Started
- Prerequisites
- Installation
Usage
Roadmap
Contributing
License
Contact
Acknowledgments

About The Project

This is a weird data challenge project, mainly about data preprocessing and data analysis (so, data science stuff). It also uses a sentiment analysis model.

Built With

Getting Started

Clone the project and download the data. Unzip the archive and put the JSON files inside the data/ folder. Note that a zip file might still exist inside data/ after extracting.

To remove the zip file, run:

rm -f data/data.zip
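
If you prefer to do the extraction and cleanup from Python (for example on a system without rm), the following is a minimal sketch. It assumes the downloaded archive is named data.zip and that the JSON files sit somewhere inside it; the function name extract_and_clean is hypothetical and not part of this repository.

```python
import zipfile
from pathlib import Path


def extract_and_clean(zip_path, data_dir="data"):
    """Extract zip_path into data_dir, then delete any zip left in data_dir."""
    zip_path, data_dir = Path(zip_path), Path(data_dir)
    data_dir.mkdir(parents=True, exist_ok=True)
    with zipfile.ZipFile(zip_path) as archive:
        archive.extractall(data_dir)
    # Same effect as `rm -f data/data.zip`, but for every top-level zip in data_dir.
    for leftover in data_dir.glob("*.zip"):
        leftover.unlink()
    # Return the JSON file names so the caller can check what was extracted.
    return sorted(p.name for p in data_dir.rglob("*.json"))


if __name__ == "__main__":
    archive = Path("data/data.zip")  # assumed archive location
    if archive.exists():
        print(extract_and_clean(archive))
    else:
        print(f"{archive} not found, nothing to extract")
```

Note that the cleanup also deletes the archive itself when it lives inside data/, so keep a copy elsewhere if you want to extract again later.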

Prerequisites

Install MySQL
Install Python 3

Installation

Clone the repo
```
git clone ...
```
Install python packages
```
pip install -r requirements.txt
```
Create your .env file
```
touch .env
```

Set up your .env file. It should follow this format:

HOST=your_host
USERNAME=your_username
PASSWORD=your_pass
DATABASE=your_dbname
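
To check that your .env file has all four keys before running anything, you can use a small standard-library sketch like the one below. The helper names parse_env and missing_keys are hypothetical. The project itself may read the file differently (for example with a library such as python-dotenv); this README does not say.

```python
from pathlib import Path

# The four keys listed in the format above.
REQUIRED_KEYS = ("HOST", "USERNAME", "PASSWORD", "DATABASE")


def parse_env(text):
    """Parse KEY=value lines, skipping blank lines and # comments."""
    config = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        config[key.strip()] = value.strip()
    return config


def missing_keys(config):
    """Return the required keys that are absent or empty."""
    return [key for key in REQUIRED_KEYS if not config.get(key)]


if __name__ == "__main__":
    env_path = Path(".env")
    config = parse_env(env_path.read_text()) if env_path.exists() else {}
    missing = missing_keys(config)
    print("Missing keys:", missing if missing else "none")
```

Reading the file directly, rather than going through the process environment, also avoids a clash with the USERNAME variable that some systems already define.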

Open the MySQL shell and create your database
```
create database your_dbname;
```

Usage

Load the tweets into a CSV file

python preprocess/ultimate_tweet_loader.py

When asked for input, type all.
Rename the resulting CSV file from tweets_dataset_all.csv to combined_dataset.csv
```
mv tweets_dataset_all.csv combined_dataset.csv
```
Run main.py
```
python main.py
```
Type csvadduser to load the user information into a CSV file.
Type setup to load the data from the CSV file into your MySQL tables and extract conversations from the tweet data.
Type categorize to categorize the data into topics.
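
The rename step above uses mv, which is not available in every shell. As a portable alternative, here is a minimal Python sketch. The function name rename_dataset is hypothetical; the two file names are the ones given in the steps above.

```python
from pathlib import Path


def rename_dataset(src="tweets_dataset_all.csv", dst="combined_dataset.csv"):
    """Rename src to dst, replacing dst if it already exists."""
    src_path, dst_path = Path(src), Path(dst)
    if not src_path.is_file():
        raise FileNotFoundError(f"{src_path} not found; run the tweet loader first")
    src_path.replace(dst_path)  # overwrites an existing dst
    return dst_path


if __name__ == "__main__":
    try:
        print("Dataset ready at", rename_dataset())
    except FileNotFoundError as error:
        print(error)
```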

For more examples, please refer to the Documentation

Roadmap

Use the GPU to run sentiment analysis in parallel batches
Use multithreading for I/O tasks when loading tweet data into a CSV file
Show the estimated time remaining while loading the data
- Add log file to report errors when loading the data
- Add checkpoints
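
As a rough illustration of the multithreading and time-remaining items above, here is one possible sketch. None of this is implemented in the project yet. It assumes each data file holds one JSON object per line, which this README does not confirm, and all function names are hypothetical.

```python
import json
import time
from concurrent.futures import ThreadPoolExecutor, as_completed
from pathlib import Path


def read_tweets(path):
    """Parse one file (assumed: one JSON object per line); count bad lines."""
    tweets, errors = [], 0
    with open(path, encoding="utf-8") as handle:
        for line in handle:
            line = line.strip()
            if not line:
                continue
            try:
                tweets.append(json.loads(line))
            except json.JSONDecodeError:
                errors += 1  # a log file could record these instead
    return tweets, errors


def estimate_remaining(elapsed, done, total):
    """Linear estimate: average time per finished file times files left."""
    return elapsed / done * (total - done) if done else float("inf")


def load_all(paths, workers=4):
    """Read files in a thread pool and print a running time estimate."""
    paths = list(paths)
    start = time.monotonic()
    results = {}
    with ThreadPoolExecutor(max_workers=workers) as pool:
        futures = {pool.submit(read_tweets, p): p for p in paths}
        for done, future in enumerate(as_completed(futures), start=1):
            results[futures[future]] = future.result()
            eta = estimate_remaining(time.monotonic() - start, done, len(paths))
            print(f"{done}/{len(paths)} files, about {eta:.1f}s remaining")
    return results


if __name__ == "__main__":
    loaded = load_all(sorted(Path("data").glob("*.json")))
    print(sum(len(tweets) for tweets, _ in loaded.values()), "tweets loaded")
```

Threads help here mainly because file reads spend their time waiting on disk. The JSON parsing itself still runs one thread at a time in standard CPython, so the speedup depends on how I/O-bound the load really is.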

See the open issues for a full list of proposed features (and known issues).

Contributing

Contributions are what make the open source community such an amazing place to learn, inspire, and create. Any contributions you make are greatly appreciated.

If you have a suggestion that would make this better, please fork the repo and create a pull request. You can also simply open an issue with the tag "enhancement". Don't forget to give the project a star! Thanks again!

Fork the Project
Create your Feature Branch (git checkout -b feature/AmazingFeature)
Commit your Changes (git commit -m 'Add some AmazingFeature')
Push to the Branch (git push origin feature/AmazingFeature)
Open a Pull Request

License

Distributed under the TU/e License. You have to pay 69 euros to clone the repo.

Contact

Go to Momentum at 9 a.m., knock on the side door 69 times, and I will appear. Otherwise, find my name hidden in this repository and reverse-search my contact details.

Project Link: https://github.com/lmBored/Twitter_analyzer

Acknowledgments
