The Swahili sentiment analysis dataset is a dataset for training binary (positive or negative) sentiment analysis models.
The data in swahili.csv is the result of cleaning and back-translating data from an open-source repository titled Swahili-sentiment-analysis by jinamizi.
This is the same data that was used to train the Spark NLP | Swahili Sentiment analysis model.
>>> import pandas as pd
>>> dataset = pd.read_csv('swahili.csv', usecols=['text', 'labels'])
>>> dataset.head()
                                 text    labels
0  team 2019merimera alikuwa takataka  negative
1                      sijafurahishwa  negative
2                       kubuni dosari  negative
3   bila kusema nilipoteza pesa zangu  negative
4        sema kupoteza pesa na wakati  negative
>>>
We also used the same data to train the Swahili-Sentiment-Analysis model.
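Before training a binary model, the string labels usually need to be turned into integers. The following is a minimal sketch of that step using only the standard library, not the pipeline used for the models mentioned above. It assumes the `labels` column holds the strings `positive` and `negative`; the mapping `LABEL_TO_ID` and the helper names `load_examples` and `label_counts` are hypothetical and chosen for illustration.

```python
import csv
from collections import Counter

# Assumed mapping: this README only says labels are positive or negative.
LABEL_TO_ID = {"negative": 0, "positive": 1}


def load_examples(csv_file):
    """Read (text, label_id) pairs from an open CSV file.

    The file is expected to have 'text' and 'labels' columns,
    matching the columns loaded with pandas above.
    """
    examples = []
    for row in csv.DictReader(csv_file):
        label = row["labels"].strip().lower()
        if label not in LABEL_TO_ID:
            continue  # skip rows whose label is not in the assumed mapping
        examples.append((row["text"], LABEL_TO_ID[label]))
    return examples


def label_counts(examples):
    """Count how many examples fall under each integer label."""
    return Counter(label_id for _, label_id in examples)


# Usage (assumes swahili.csv is in the working directory):
# with open("swahili.csv", newline="", encoding="utf-8") as f:
#     examples = load_examples(f)
# print(label_counts(examples))
```

Checking the label counts first shows whether the two classes are balanced, which affects how a train/test split should be drawn.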
If you run into any difficulty when using the data, please raise an issue so that we can assist you.
Please feel free to contribute to this repository, whether it's code, docs, or the data processing pipeline.
If you found this repository useful, please give it a star so that more people can find it.
All the credits to