facebookresearch/dynabench

Suggestion: Can we add a way for get_random_example() in "turk" mode to prevent getting examples for the current annotator (by annotator_id)?

maxbartolo opened this issue · 6 comments

@maxbartolo is it not easier to check this on the frontend and just fetch a new example if the annotator_id matches?

@ktirumalafb wanna take a look at this one? Basically, we would insert a check here https://github.com/facebookresearch/dynabench/blob/main/api/controllers/examples.py#L83 that does something like

if credentials["id"] == "turk":
  annotator_id = ...  # placeholder: get this from an optional query var
  metadata = util.json_decode(example.metadata_json)
  if "annotator_id" not in metadata or metadata["annotator_id"] != annotator_id:
    bottle.abort(403, "Access denied")

@douwekiela that's what I'm currently doing, but handling it at the example level seems like a more general solution, and each interface creator won't have to figure out a way to handle these exceptions themselves.

I was thinking something along the lines of passing annotator_id in the query as you suggested (same as we currently handle tags), checking if "annotator_id" is in query_dict and passing it to getRandom() in https://github.com/facebookresearch/dynabench/blob/main/api/controllers/examples.py#L84.

Then, getRandom() can filter at the db level directly based on Example.metadata_json as it currently does for tags in

I also wouldn't filter out examples where "annotator_id" is not in metadata; I'd just filter out the examples whose annotator_id matches. If we want to ensure that we're validating the mTurk examples only, then that's what the tags are for.
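
To make the proposed rule concrete, here is a minimal in-memory sketch. It is not the actual getRandom() code, which would filter at the db level; the helper names `is_eligible` and `filter_examples` and the dict-based example records are hypothetical. It only illustrates the intended behaviour: drop examples whose stored annotator_id matches the current annotator, and keep examples that have no annotator_id at all.

```python
import json


def is_eligible(metadata_json, annotator_id):
    """Return True if an example may be served to `annotator_id`.

    Hypothetical helper: an example is excluded only when its metadata
    carries an annotator_id equal to the current annotator's.
    """
    if annotator_id is None:
        # No annotator_id in the query: nothing to filter on.
        return True
    metadata = json.loads(metadata_json) if metadata_json else {}
    return metadata.get("annotator_id") != annotator_id


def filter_examples(examples, annotator_id):
    """Keep only the examples the current annotator did not create."""
    return [
        ex for ex in examples
        if is_eligible(ex.get("metadata_json"), annotator_id)
    ]


examples = [
    {"id": 1, "metadata_json": json.dumps({"annotator_id": "A"})},
    {"id": 2, "metadata_json": json.dumps({"annotator_id": "B"})},
    {"id": 3, "metadata_json": json.dumps({})},  # no annotator_id: kept
]
remaining = filter_examples(examples, "A")
```

Here `remaining` holds examples 2 and 3: only the example created by annotator "A" is dropped.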

What a coincidence. I believe that I actually just fixed this. #823

Let me know if there are any issues with that fix

Looks great, thanks Tristan, you're awesome! My only comment is whether annotator_id should default to None if it's not passed in the query? It seems like not passing an annotator_id in the query currently might break things at

my_uid=query_dict["annotator_id"][0],
?

Yes, that would break things. Maybe @ktirumalafb could take that part of it on.
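
One possible shape for the None default, as a sketch only: it assumes query_dict maps each key to a list of values (which the `[0]` index above suggests), and uses `urllib.parse.parse_qs` purely to build such a dict for illustration. The helper name `get_annotator_id` is hypothetical and not part of the codebase.

```python
from urllib.parse import parse_qs


def get_annotator_id(query_dict):
    """Return the first annotator_id in the query, or None if absent.

    Hypothetical helper: avoids the KeyError that
    query_dict["annotator_id"][0] raises when the key is missing.
    """
    values = query_dict.get("annotator_id")
    return values[0] if values else None


with_id = get_annotator_id(parse_qs("tags=turk&annotator_id=A1B2"))
without_id = get_annotator_id(parse_qs("tags=turk"))
```

With this, `my_uid=get_annotator_id(query_dict)` would pass None when the query has no annotator_id, and the callee can then skip the annotator filter.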