Social Media Analysis: a scalable, flexibly deployable solution that analyses social media content


Social Media Analysis

End-to-end solution for analysing social media content

Table of Contents
  1. About The Project
  2. Getting Started
  3. Usage
  4. Roadmap
  5. Contributing
  6. License
  7. Acknowledgments

About The Project

This is a simple, scalable end-to-end solution that analyses tweets and users for selected topics. Here, the topics are crypto / blockchain related, such as crypto KOLs (key opinion leaders), applications and coins. It deploys:

  • Scalable crawler that crawls
    • Tweets on topics mentioning influencers such as Elon Musk and Vitalik, applications such as DeFi and Metaverse, and coins such as $BTC, $ETH and $ADA
    • Users mentioning the tweets above, snapshotted daily
  • Stream ingestion of crawled data into data warehouse
    • Tweets data
    • Users data
  • Clean and transform data into dimensions that model
    • Coin interests - grouped by crypto KOLs
    • Coin interests - grouped by crypto applications
  • BI Dashboard - shows insights into coin interests across different segments of social media users
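
To make the "coin interests grouped by crypto KOLs" dimension more concrete, here is a minimal Python sketch of the idea. It is not the project's actual transformation code: the tweet shape (a dict with a `text` field), the `KOLS` keyword map, the cashtag pattern and the function name `coin_interests_by_kol` are all assumptions made for illustration.

```python
import re
from collections import Counter, defaultdict

# Hypothetical KOL keywords; the project's real topic list may differ.
KOLS = {"elon musk": "Elon Musk", "vitalik": "Vitalik"}

# Assumed cashtag shape: a dollar sign followed by 2 to 6 letters ($BTC, $ETH).
CASHTAG = re.compile(r"\$([A-Za-z]{2,6})\b")


def coin_interests_by_kol(tweets):
    """Count, per KOL, how many tweets mention each coin cashtag."""
    interests = defaultdict(Counter)
    for tweet in tweets:
        text = tweet["text"]
        lowered = text.lower()
        # A set, so a coin repeated within one tweet is counted once.
        coins = {match.upper() for match in CASHTAG.findall(text)}
        for keyword, kol in KOLS.items():
            if keyword in lowered:
                interests[kol].update(coins)
    return {kol: dict(counts) for kol, counts in interests.items()}


# Made-up sample tweets, for illustration only.
sample = [
    {"text": "Elon Musk tweets about $BTC again"},
    {"text": "Vitalik on scaling $ETH, and some $btc talk"},
    {"text": "Elon Musk likes $DOGE and $BTC"},
]
result = coin_interests_by_kol(sample)
print(result)
```

In the deployed pipeline this kind of grouping would more likely happen inside the data warehouse; the sketch only shows the shape of the result, one counter of coins per KOL.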

(back to top)

Built With

(back to top)

Getting Started

Currently, the setup can only be done on a single machine. Everything runs in containers via docker-compose, so you will need Docker installed.

Prerequisites

You will need a Docker engine and the docker-compose CLI. I am on macOS and use a combination of colima + docker-compose.

  • docker-compose cli

    $ brew install docker-compose

You will need a Google Cloud Platform account to set up BigQuery, along with the credentials it needs to work.

You will need Python and a dependency manager such as pip. I use Poetry.

  • Poetry

    $ curl -sSL https://install.python-poetry.org | python3 -
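
Before moving on to installation, it can help to confirm that the tools named above are actually on your PATH. The following is a small sketch using only the Python standard library; the `REQUIRED_TOOLS` tuple and the `missing_tools` helper are hypothetical names, not part of this repository.

```python
import shutil

# Command names taken from the prerequisites above (assumed executable names).
REQUIRED_TOOLS = ("docker", "docker-compose", "poetry")


def missing_tools(tools=REQUIRED_TOOLS):
    """Return the tools that cannot be found on PATH."""
    return [tool for tool in tools if shutil.which(tool) is None]


missing = missing_tools()
if missing:
    print("Missing from PATH:", ", ".join(missing))
else:
    print("All prerequisite tools found.")
```

This only checks that the executables exist; it does not verify that the Docker engine is running.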

Installation

  1. Clone the repo

    $ git clone https://github.com/koksang/social-media-analysis.git
  2. To install dependencies locally (DBT in particular), run

    • Poetry

      $ poetry install

    Or if you use pip

    • Pip
      $ poetry export -f requirements.txt --without-hashes --output requirements.txt
      $ pip install -r requirements.txt
      
  3. Start the services defined in docker-compose.yaml as shown below. They will start in detached mode

    $ docker-compose -f infra/docker-compose.yaml up -d
  4. Set environment variables

    • Google Cloud specific

      $ export GOOGLE_APPLICATION_CREDENTIALS="<path to your Google application credentials file>"
      $ export GOOGLE_PROJECT_ID="<your Google project ID>"
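
A missing variable in step 4 is easy to overlook, so a quick check before running anything against BigQuery can save some debugging. Below is a minimal sketch, assuming only the two variable names from step 4; the `check_gcp_env` helper is hypothetical and not part of this repository.

```python
import os
from pathlib import Path


def check_gcp_env(env=None):
    """Return a list of problems found with the Google Cloud variables."""
    env = os.environ if env is None else env
    problems = []
    credentials = env.get("GOOGLE_APPLICATION_CREDENTIALS")
    if not credentials:
        problems.append("GOOGLE_APPLICATION_CREDENTIALS is not set")
    elif not Path(credentials).is_file():
        # The variable should point at an existing credentials file.
        problems.append(f"credentials file not found: {credentials}")
    if not env.get("GOOGLE_PROJECT_ID"):
        problems.append("GOOGLE_PROJECT_ID is not set")
    return problems


for problem in check_gcp_env():
    print("WARNING:", problem)
```

The check only confirms that the variables are set and that the credentials path points at a file; it does not validate the credentials with Google Cloud.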

(back to top)

Usage

Todo

(back to top)

Roadmap

  • Deploy BI Dashboard for insights visualization
  • Finalize docker-compose deployment
  • Convert docker-compose deployment into K8s
  • Use other BI dashboards (such as: apache/superset)
  • Crawl more
    • Nested tweet replies
    • Top trends
    • User profiles snapshot

See the open issues for a full list of proposed features (and known issues).

(back to top)

Contributing

Contributions are what make the open source community such an amazing place to learn, inspire, and create. Any contributions you make are greatly appreciated.

If you have a suggestion that would make this better, please fork the repo and create a pull request. You can also simply open an issue with the tag "enhancement". Don't forget to give the project a star! Thanks again!

  1. Fork the Project
  2. Create your Feature Branch (git checkout -b feature/AmazingFeature)
  3. Commit your Changes (git commit -m 'Add some AmazingFeature')
  4. Push to the Branch (git push origin feature/AmazingFeature)
  5. Open a Pull Request

(back to top)

License

Distributed under the MIT License. See LICENSE.txt for more information.

(back to top)

Acknowledgments

(back to top)