Train relevance classifier
Closed this issue · 0 comments
kristian4cast commented
Problem to solve
As a scientist, I want to have a classifier that predicts the relevance of Tweets so that I can consider only "relevant" Tweets for my rain classification task.
Further details
Initial implementation was done during the maelstrom bootcamp (see the a2 repo).
"Relevance"/"relevant" Tweets are Tweets that contain sufficient information for a human/AI to determine if it was raining or not raining.
Proposal
- Bring the relevance dataset into a form such that a train and test set can be created for this task
  - Every Tweet that has a relevance score predicted by falcon should get a label derived from that score, either "relevant" or "not relevant"
    - The label depends on the relevance score threshold; we pick 0.5 for our first attempt
- Adapt the rain classifier to allow for classification of the relevance of Tweets
  - Use your rain classifier notebook as the basis for this (make a copy)
- Evaluate the performance of your model
  - Check manually whether the model's predictions make sense
    - For easy and for more difficult (especially misclassified) Tweets
  - Change the relevance score threshold and see how it affects performance
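A minimal sketch of the labelling and splitting step, to make the first bullet concrete. Everything here is an assumption for illustration: the field name `relevance_score`, the helper names, the example Tweets, and the choice that a score exactly at the threshold counts as "relevant". It uses only the standard library; the real notebook would more likely use the dataset tooling from the a2 repo.

```python
import random

THRESHOLD = 0.5  # first-attempt threshold named in the proposal


def label_from_score(score, threshold=THRESHOLD):
    """Map a relevance score to a binary label.

    Assumption: scores at or above the threshold count as "relevant".
    """
    return "relevant" if score >= threshold else "not relevant"


def train_test_split(rows, test_fraction=0.2, seed=0):
    """Shuffle rows reproducibly and split them into (train, test)."""
    shuffled = list(rows)
    random.Random(seed).shuffle(shuffled)
    n_test = int(round(len(shuffled) * test_fraction))
    return shuffled[n_test:], shuffled[:n_test]


# Hypothetical example rows; real data would come from the relevance dataset.
tweets = [
    {"text": "It is pouring here right now", "relevance_score": 0.93},
    {"text": "Rain rain go away", "relevance_score": 0.50},
    {"text": "Listening to Purple Rain", "relevance_score": 0.12},
    {"text": "Dry and sunny all day", "relevance_score": 0.81},
    {"text": "Make it rain!", "relevance_score": 0.49},
]

# Attach the label without modifying the original rows.
labelled = [dict(t, label=label_from_score(t["relevance_score"])) for t in tweets]
train, test = train_test_split(labelled, test_fraction=0.2, seed=0)
```

Keeping the threshold as a parameter rather than hard-coding it makes the later threshold experiments a matter of re-running the labelling step with a different value.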
Testing
What does success look like, and how can we measure that?
- Trained relevance classifier exists
- performance is estimated
- performance based on relevance score threshold is estimated
- weight: 5
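One way to measure "performance based on relevance score threshold" is sketched below. It is an illustration under stated assumptions, not the project's evaluation code: `binary_metrics`, `sweep_thresholds` and `predict_for_threshold` are hypothetical names, the scores and predictions are made up, and the stand-in predictor returns fixed predictions, whereas in practice the classifier would be retrained on the labels produced by each threshold.

```python
def binary_metrics(y_true, y_pred, positive="relevant"):
    """Accuracy, precision, recall and F1 for a binary labelling."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == positive and p == positive for t, p in pairs)
    fp = sum(t != positive and p == positive for t, p in pairs)
    fn = sum(t == positive and p != positive for t, p in pairs)
    correct = sum(t == p for t, p in pairs)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if precision + recall else 0.0)
    return {
        "accuracy": correct / len(pairs),
        "precision": precision,
        "recall": recall,
        "f1": f1,
    }


def sweep_thresholds(scores, predict_for_threshold, thresholds):
    """Re-label the scores at each threshold and score the predictions.

    predict_for_threshold is a hypothetical callable that returns the
    model's predicted labels for the same Tweets at a given threshold.
    """
    results = {}
    for threshold in thresholds:
        y_true = ["relevant" if s >= threshold else "not relevant"
                  for s in scores]
        y_pred = predict_for_threshold(threshold)
        results[threshold] = binary_metrics(y_true, y_pred)
    return results


# Made-up scores and a stand-in predictor that ignores the threshold.
scores = [0.9, 0.7, 0.4, 0.2]
fixed_predictions = ["relevant", "relevant", "not relevant", "not relevant"]
results = sweep_thresholds(scores, lambda th: fixed_predictions, (0.3, 0.5, 0.8))

for threshold, metrics in results.items():
    print(threshold, metrics)
```

Reporting precision and recall alongside accuracy matters here because moving the threshold changes the class balance, and accuracy alone can hide a model that mostly predicts the majority class.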