NullitOut
Plan:
- Look at the implementation code (Tue)
- Find dataset format (Tue)
- Preprocess our data (format it) (Fri)
- Run the Null It Out algorithm (Sun)
- Convert our dataset to embeddings (get_embeddings_based_dataset(comment_text)):
- Fasttext
- Bert
- Train toxicity classifier
- Profession classifier (start with logistic regression); Y_dev is 'toxicity' (0, 1)
- Save the performance (Acc)
- Build ethnicity classifier (Gender classifier) and run NIO
- get_projection_matrix()
- Get accuracy (we need to reduce this)
- Run toxicity classifier on our new embeddings
- Get performance (we want this to be somewhat the same as before)
- Plot the results (Sun)
- Analysis report (Sun)
- Presentation (Mon)