/Twitter_analyzer

Primary language: Jupyter Notebook



From Tweets to Trends

Weird data challenge
Explore the docs »

View Demo · Report Bug · Request Feature

Table of Contents
  1. About The Project
  2. Getting Started
  3. Usage
  4. Roadmap
  5. Contributing
  6. License
  7. Contact
  8. Acknowledgments

About The Project


This is a weird data challenge project, mainly about data preprocessing and data analysis. It also uses a sentiment analysis model.

(back to top)

Built With

  • Python
  • MySQL

(back to top)

Getting Started

Clone the project and download the data. Unzip it and put the JSON files inside the data/ folder. Note that a zip file might still exist inside data/ after extracting.

To remove the zip file, run:

rm -f data/data.zip
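
The same extract-and-clean-up step can also be done from Python. This is a minimal sketch, assuming the archive is saved as data/data.zip (the name used in the command above). `extract_and_remove` is a hypothetical helper and is not part of this repo.

```python
import zipfile
from pathlib import Path


def extract_and_remove(zip_path: Path, dest: Path) -> list:
    """Extract every member of zip_path into dest, then delete the archive."""
    dest.mkdir(parents=True, exist_ok=True)
    with zipfile.ZipFile(zip_path) as archive:
        names = archive.namelist()
        archive.extractall(dest)
    zip_path.unlink()  # same effect as `rm -f data/data.zip`
    return names


if __name__ == "__main__":
    # Assumed location of the downloaded archive.
    archive_path = Path("data/data.zip")
    if archive_path.exists():
        print(extract_and_remove(archive_path, Path("data")))
```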

Prerequisites

Installation

  1. Clone the repo
    git clone ...
  2. Install python packages
    pip install -r requirements.txt
  3. Create your .env file
    touch .env
  4. Set up your .env file. It should follow this format:
    HOST=your_host
    USERNAME=your_username
    PASSWORD=your_pass
    DATABASE=your_dbname
  5. Open the MySQL shell and create your database
    create database your_dbname;
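
Before running anything against the database, it can help to check that the .env file contains all four keys listed in step 4. The sketch below uses only the standard library. `read_env` and `missing_keys` are hypothetical helpers, and the project itself may load the file differently (for example with a dotenv package).

```python
from pathlib import Path

# The four keys shown in the .env format above.
REQUIRED_KEYS = ("HOST", "USERNAME", "PASSWORD", "DATABASE")


def read_env(path=".env"):
    """Parse simple KEY=value lines, skipping blank lines and # comments."""
    settings = {}
    for raw in Path(path).read_text(encoding="utf-8").splitlines():
        line = raw.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, value = line.split("=", 1)  # split on the first "=" only
        settings[key.strip()] = value.strip()
    return settings


def missing_keys(settings):
    """Return the required keys that are absent or empty."""
    return [key for key in REQUIRED_KEYS if not settings.get(key)]


if __name__ == "__main__":
    if Path(".env").exists():
        print("Missing keys:", missing_keys(read_env(".env")))
```

An empty list means every expected key has a value. It does not check that the credentials actually work.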

(back to top)

Usage

  1. Load the tweets into a CSV file
    python preprocess/ultimate_tweet_loader.py
  2. When prompted, type all.
  3. Rename the resulting CSV file from tweets_dataset_all.csv to combined_dataset.csv
    mv tweets_dataset_all.csv combined_dataset.csv
  4. Run main.py
    python main.py
  5. Type csvadduser to load the user information into a CSV file.
  6. Type setup to load the data from the CSV files into your MySQL tables and extract conversations from the tweet data.
  7. Type categorize to categorize the data into topics.
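
To give a rough idea of what step 1 produces, here is a minimal sketch of turning tweet JSON files into a single CSV. It is not the code in preprocess/ultimate_tweet_loader.py. It assumes each file in data/ holds one JSON tweet object per line, and the column names (`id_str`, `created_at`, `text`) are assumptions about the data. `tweets_to_csv` is a hypothetical helper.

```python
import csv
import json
from pathlib import Path


def tweets_to_csv(data_dir, out_path, fields=("id_str", "created_at", "text")):
    """Write one CSV row per tweet found in data_dir/*.json; return the row count."""
    rows = 0
    with open(out_path, "w", newline="", encoding="utf-8") as handle:
        writer = csv.writer(handle)
        writer.writerow(fields)  # header row
        for json_file in sorted(Path(data_dir).glob("*.json")):
            with open(json_file, encoding="utf-8") as source:
                for line in source:
                    line = line.strip()
                    if not line:
                        continue
                    try:
                        tweet = json.loads(line)
                    except json.JSONDecodeError:
                        continue  # skip malformed lines instead of aborting
                    if not isinstance(tweet, dict):
                        continue
                    # Missing fields become empty cells.
                    writer.writerow([tweet.get(name, "") for name in fields])
                    rows += 1
    return rows
```

Under these assumptions, `tweets_to_csv("data", "combined_dataset.csv")` would write the file that step 4 expects.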

For more examples, please refer to the Documentation

(back to top)

Roadmap

  • Use the GPU to run sentiment analysis in parallel batches
  • Use multithreading for I/O tasks when loading tweet data into a CSV file
  • Show the estimated time remaining when loading the data
    • Add a log file to report errors when loading the data
    • Add checkpoints
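
As a starting point for the multithreading and time-remaining items, the sketch below runs a per-file task in a thread pool and prints a simple estimate after each file finishes. Nothing here exists in the repo yet. `count_lines` is a stand-in for the real per-file loading work, and `process_files` is a hypothetical name.

```python
import time
from concurrent.futures import ThreadPoolExecutor, as_completed
from pathlib import Path


def count_lines(path):
    """Stand-in for the per-file work: count the lines in one file."""
    with open(path, encoding="utf-8") as handle:
        return sum(1 for _ in handle)


def process_files(paths, worker=count_lines, max_workers=4, report=print):
    """Run worker over paths in a thread pool and report an ETA after each file."""
    paths = list(paths)
    results = {}
    started = time.monotonic()
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        futures = {pool.submit(worker, path): path for path in paths}
        for done, future in enumerate(as_completed(futures), start=1):
            results[futures[future]] = future.result()
            elapsed = time.monotonic() - started
            # Naive estimate: average time per finished file times files left.
            remaining = elapsed / done * (len(paths) - done)
            report(f"{done}/{len(paths)} files, about {remaining:.1f}s remaining")
    return results


if __name__ == "__main__":
    process_files(Path("data").glob("*.json"))
```

Threads suit this step because reading files is I/O-bound. The estimate assumes files take roughly equal time, so it will be rough when file sizes vary a lot.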

See the open issues for a full list of proposed features (and known issues).

(back to top)

Contributing

Contributions are what make the open source community such an amazing place to learn, inspire, and create. Any contributions you make are greatly appreciated.

If you have a suggestion that would make this better, please fork the repo and create a pull request. You can also simply open an issue with the tag "enhancement". Don't forget to give the project a star! Thanks again!

  1. Fork the Project
  2. Create your Feature Branch (git checkout -b feature/AmazingFeature)
  3. Commit your Changes (git commit -m 'Add some AmazingFeature')
  4. Push to the Branch (git push origin feature/AmazingFeature)
  5. Open a Pull Request

(back to top)

License

Distributed under the TU/e License. Have to pay 69 euros to clone the repo.

(back to top)

Contact

Go to momentum at 9am in the morning, knock on the side door 69 times, I will appear, else find my name being hidden in this repository and reverse search my contact.

Project Link: https://github.com/lmBored/Twitter_analyzer

(back to top)

Acknowledgments

(back to top)