Most common Hebrew words on Twitter
based on tweets from March 2018 to March 2019
General Info
- # of tweets ~ 20M
- # of tweets excluded ~ 1.8M
- excluded tweets with links, as these usually appear to be automated links to YouTube or clickbait
- # of words in corpus ~ 436M
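The filtering and counting steps above could be sketched roughly as follows. This is a minimal illustration, not the script that produced the list: the URL pattern, the Hebrew-letter tokenizer, and the names `has_link` and `count_words` are all assumptions, since the tokenization actually used is not described here.

```python
import re
from collections import Counter

# Assumption: a tweet "has a link" if it contains an http(s):// or www. URL.
URL_PATTERN = re.compile(r"https?://\S+|www\.\S+", re.IGNORECASE)

# Assumption: a word is a run of Hebrew letters (alef through tav).
# Punctuation such as geresh or hyphens splits words under this rule.
HEBREW_WORD = re.compile(r"[\u05D0-\u05EA]+")


def has_link(text):
    """Return True if the tweet text contains something that looks like a URL."""
    return URL_PATTERN.search(text) is not None


def count_words(tweets):
    """Count Hebrew words over tweets without links.

    Returns a (Counter of word frequencies, number of excluded tweets) pair.
    """
    counts = Counter()
    excluded = 0
    for text in tweets:
        if has_link(text):
            excluded += 1
            continue
        counts.update(HEBREW_WORD.findall(text))
    return counts, excluded
```

Calling `count_words(tweets)[0].most_common(100_000)` on an iterable of tweet texts would then give a ranked list of the same shape as the top 100k list, under the assumptions above.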
Extra
comparison to the Swadesh list
shows where each Swadesh word ranks in the list of the top 100k most common words on Twitter
words ranked 100001 were not found in the top 100k
single-letter words such as the Hebrew 'and' are mis-ranked: in Hebrew they are written as prefixes attached to the following word, so they do not appear independently in the 100k list
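The ranking convention above can be illustrated with a short lookup sketch. The names `NOT_FOUND_RANK`, `build_rank_index`, and `rank_of` are hypothetical, and the sketch assumes the top 100k list is available as a sequence of unique words ordered from most to least frequent.

```python
# Sentinel rank for words that are missing from the top 100k list.
NOT_FOUND_RANK = 100_001


def build_rank_index(words_by_frequency):
    """Map each word to its 1-based rank; the input must be sorted by frequency."""
    return {word: rank for rank, word in enumerate(words_by_frequency, start=1)}


def rank_of(word, rank_index):
    """Return the word's rank, or NOT_FOUND_RANK if it is not in the list."""
    return rank_index.get(word, NOT_FOUND_RANK)
```

A prefix such as the Hebrew 'and' is looked up as a standalone token here, so it falls through to `NOT_FOUND_RANK` even though it is very frequent in running text.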
Data
data can be provided upon request