/sna-cryptocurrency-with-r

Mini project for learning Social Network Analysis with R, using Indonesian cryptocurrency tweets

Primary Language: R

Social Network Analysis with R and Gephi

Indonesian cryptocurrency tweets.
Similar to https://github.com/SokKanaTorajd/gemastik21, but without topic modelling.

Dataset

Uses the same dataset, available from https://www.kaggle.com/wijatama/indonesiancryptotweets.

Indonesian Stopwords

The stopword list combines Sastrawi's stopwords, Mas Devid's stopwords, and extra stopwords of my own.
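
As a rough sketch of how such lists could be merged, assuming each source is available as a character vector. The three vectors below are small hypothetical samples, not the actual lists, and the variable names are made up for illustration:

```r
# Hypothetical samples; in practice each vector would be read from its own
# file, for example with readLines().
sastrawi_stopwords <- c("yang", "dan", "di", "ke")
devid_stopwords    <- c("yang", "nya", "sih", "dan")
extra_stopwords    <- c("Yg", "gak ", "aja")

# Normalise case and surrounding whitespace, then keep each word only once.
id_stopwords <- unique(tolower(trimws(c(sastrawi_stopwords,
                                        devid_stopwords,
                                        extra_stopwords))))
```

Normalising before `unique()` avoids keeping near-duplicates such as "Yg" and "yg" as separate entries.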

Workflow Process

  1. Make sure RStudio and Gephi are installed. Gephi can be downloaded here.
  2. Download the dataset.
  3. Install the required packages, such as nurandi/katadasaR, and load the libraries.
  4. Import dataset and stopwords.
  5. Preprocess the text: remove duplicate tweets, lowercase, strip, and tokenize, then remove English and Indonesian stopwords.
  6. Rejoin the tokens, then remove tweets that contain fewer than 3 words.
  7. Create and filter bigrams. I only use bigrams that appear more than 10 times.
  8. Separate each bigram into source and target.
  9. Import required libraries for creating the network.
  10. Create and save the network.
  11. Open Gephi and load the graphml file, then feel free to explore whatever visualization you want.
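
Steps 5 to 10 can be sketched roughly as follows. This is a minimal illustration in base R, not the project's actual script: the toy tweets, the placeholder stopwords, the `min_count` threshold, and the output file name are all hypothetical, and igraph is an assumed choice of network library (the README does not name one). Stemming is left out, and the tokens are filtered directly instead of being rejoined first.

```r
# Hypothetical toy data standing in for the tweet text column of the dataset.
tweets <- c(
  "Harga bitcoin naik lagi hari ini!",
  "Harga bitcoin naik lagi hari ini!",           # exact duplicate
  "harga bitcoin turun, investor crypto panik",
  "Investor crypto beli bitcoin murah",
  "ok sip"                                        # too short to keep
)
stopwords <- c("lagi", "ini", "the", "and")       # placeholder ID + EN stopwords
min_count <- 1   # the workflow above uses 10; lowered here for the toy data

# Step 5: drop duplicates, lowercase, strip non-letters, tokenize,
# and remove stopwords.
clean  <- tolower(unique(tweets))
clean  <- trimws(gsub("[^a-z ]+", " ", clean))
tokens <- strsplit(clean, " +")
tokens <- lapply(tokens, function(w) w[!(w %in% stopwords)])

# Step 6: keep only tweets with at least 3 remaining words.
tokens <- tokens[lengths(tokens) >= 3]

# Steps 7-8: turn consecutive words into source/target pairs,
# count each pair, and keep pairs seen more than min_count times.
pairs <- do.call(rbind, lapply(tokens, function(w) {
  data.frame(source = head(w, -1), target = tail(w, -1),
             stringsAsFactors = FALSE)
}))
pairs$weight <- 1
edges <- aggregate(weight ~ source + target, data = pairs, FUN = sum)
edges <- edges[edges$weight > min_count, ]

# Steps 9-10: build a directed graph and save it as GraphML for Gephi.
# Assumes the igraph package; skipped if it is not installed.
out_file <- "bigram_network.graphml"              # hypothetical file name
if (requireNamespace("igraph", quietly = TRUE)) {
  g <- igraph::graph_from_data_frame(edges, directed = TRUE)
  igraph::write_graph(g, out_file, format = "graphml")
}
```

The `weight` column is carried into the graph as an edge attribute, so it should be available in Gephi for sizing or filtering edges.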

References

Taught by: Ujang Fahmi and Text Cleaning Bahasa Indonesia