olgaliak/active-learning-detect

Predict and keep most predictions and filter later

abfleishman opened this issue · 3 comments

Sometimes a model performs very poorly at first. In that case it can help to set the confidence threshold to a low number so that you get some predictions and can visualize what is going on. That number is hard to pick in an informed way. It would be nice if predictions were made and kept above some very low threshold, and the user could then choose the threshold above which (or a lower and upper bound within which) to target for tagging. That way, when you run download_vott_json.py, an extra argument (or a setting in config.ini) would give the confidence threshold above which to sample.
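To make the request concrete, here is a rough sketch of the filter-later step. It assumes the predictions sit in a CSV with a `confidence` column; the column names, the sample rows, and the `select_for_tagging` helper are all hypothetical and are not taken from the repo.

```python
import csv
import io


def select_for_tagging(rows, lower=0.0, upper=1.0, conf_key="confidence"):
    """Return the rows whose confidence lies in [lower, upper].

    `conf_key` is an assumed column name, not the repo's actual schema.
    """
    return [r for r in rows if lower <= float(r[conf_key]) <= upper]


# Hypothetical prediction file: everything above a very low threshold was kept.
predictions_csv = io.StringIO(
    "filename,class,confidence\n"
    "img1.jpg,bird,0.05\n"
    "img2.jpg,bird,0.35\n"
    "img3.jpg,bird,0.92\n"
)
rows = list(csv.DictReader(predictions_csv))

# Choose the band to target for tagging at download time, not at prediction time.
selected = select_for_tagging(rows, lower=0.2, upper=0.8)
print([r["filename"] for r in selected])  # ['img2.jpg']
```

Because nothing is discarded at prediction time, the band can be changed and re-applied without re-running the model.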

Hey Abram! That sounds like a reasonable change to the process. I made a yash/features branch with a new download_vott_json.py. This version also uses the min_confidence in config.ini while downloading. Basically, you can set a higher min_confidence in your local config.ini than in the config.ini on your training machine; more predictions will then be kept in the CSV files, but the ones below your local threshold won't be shown in VoTT. This seemed like the easiest way to do what you proposed, but if it doesn't make sense, let me know!
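Roughly, the two-threshold idea looks like the sketch below. It is an illustration only, not the code on the branch: it assumes config.ini can be read with Python's `configparser` and has a section header (`[settings]` is made up, and the real file layout may differ), and the prediction values are invented.

```python
import configparser


def read_min_confidence(ini_text, default=0.5):
    """Return min_confidence from the first section that defines it.

    The section layout of the real config.ini is an assumption here.
    """
    parser = configparser.ConfigParser()
    parser.read_string(ini_text)
    for section in parser.sections():
        if parser.has_option(section, "min_confidence"):
            return parser.getfloat(section, "min_confidence")
    return default


# Hypothetical config contents for the two machines.
training_ini = "[settings]\nmin_confidence = 0.05\n"
local_ini = "[settings]\nmin_confidence = 0.5\n"

training_threshold = read_min_confidence(training_ini)
local_threshold = read_min_confidence(local_ini)

# Invented (filename, confidence) predictions.
predictions = [("img1.jpg", 0.03), ("img2.jpg", 0.2), ("img3.jpg", 0.9)]

# The training machine keeps everything above its low threshold in the CSV.
kept_in_csv = [p for p in predictions if p[1] >= training_threshold]
# The local download only surfaces predictions above the higher threshold.
shown_in_vott = [p for p in kept_in_csv if p[1] >= local_threshold]

print(len(kept_in_csv), len(shown_in_vott))  # 2 1
```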

Also, I made the edits in the GitHub editor, so there may be a few minor errors! Let me know if you run into any.

Hey Abram,

Let me know if there are any updates on this. Did you try out the new yash/features branch?

I have not tried the feature branch yet!