/text-mining-corona-articles

Text Mining for Indonesian Online News Articles About Corona

Primary Language: Jupyter Notebook

TEXT MINING FOR INDONESIAN ONLINE NEWS ARTICLES ABOUT CORONA

Hi! In this notebook, we start our text mining journey by scraping a list of news articles about the coronavirus from tirto.id and detik.com using the BeautifulSoup package. The contents are saved to individual .tsv (tab-separated values) files, which are loaded again for further analysis. From there, we analyze the posting pattern of each site and train a Word2Vec model with the gensim package to analyze the semantic and syntactic similarity between the preprocessed words.
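The save-and-reload step and the posting-pattern count can be sketched with the Python standard library alone. This is a minimal illustration, not the notebook's actual code: the column names (`site`, `date`, `title`, `content`), the helper functions, and the sample rows are all assumptions made up for the example, and an in-memory buffer stands in for a real .tsv file.

```python
import csv
import io
from collections import Counter

# Assumed column layout; the notebook's real schema may differ.
FIELDS = ["site", "date", "title", "content"]

# Placeholder rows standing in for scraped articles (not real data).
ARTICLES = [
    {"site": "tirto.id", "date": "2020-03-02", "title": "Sample title A", "content": "Sample text A"},
    {"site": "detik.com", "date": "2020-03-02", "title": "Sample title B", "content": "Sample text B"},
    {"site": "detik.com", "date": "2020-03-02", "title": "Sample title C", "content": "Sample text C"},
    {"site": "tirto.id", "date": "2020-03-03", "title": "Sample title D", "content": "Sample text D"},
]


def write_tsv(rows, handle):
    """Write article dicts as tab-separated values with a header row."""
    writer = csv.DictWriter(handle, fieldnames=FIELDS, delimiter="\t")
    writer.writeheader()
    writer.writerows(rows)


def read_tsv(handle):
    """Load tab-separated rows back into a list of dicts."""
    return list(csv.DictReader(handle, delimiter="\t"))


def posts_per_site_and_date(rows):
    """Count how many articles each site published on each date."""
    return Counter((row["site"], row["date"]) for row in rows)


# In-memory stand-in for a .tsv file on disk.
buffer = io.StringIO()
write_tsv(ARTICLES, buffer)
buffer.seek(0)

loaded = read_tsv(buffer)
counts = posts_per_site_and_date(loaded)

for (site, date), n in sorted(counts.items()):
    print(site, date, n)
```

To work with files on disk instead, replace the buffer with `open(path, "w", newline="", encoding="utf-8")` for writing and the matching `open(path, newline="", encoding="utf-8")` for reading, using one path per site.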

REFERENCES

Article Contents

Stopwords List

About Word2Vec

External Media