Watts-Lab/surveyor

On-platform architecture


This is a plan that spans a few repositories, but let's sketch it out here.

At a high level, we have so far tried to run most of our Surveyor activity on our own platform rather than on Turk. This leads to a few issues:

  1. people are not used to it (and complain to us, but more importantly, they complain to Turk, who don't want us to do this and today threatened to block our account)
  2. payment requires an extra step, which creates more opportunities for us to make mistakes
  3. our servers are imperfect and sometimes this leads to issues where people can't submit surveys

I think we can solve all of these problems and, at the same time, engage with some of the other shifts we have been thinking about around multiple AWS accounts for Turk (https://github.com/Watts-Lab/CSSLab-Operations/issues/195). Here I will sketch that plan, and I want to discuss it before we decide to execute. However, because of issue 1 above, I think we should start moving on this within a week or so: we just got a warning from Turk, and in my experience they quickly escalate that to a complete account block if more reports are received.

What if we host our surveys on Turk instead of on Surveyor? This entails a few shifts (these are checkboxes so that once we decide on a list, we can convert them to issues and reduce steps to profit):

  • #115
  • #116
  • #117
  • #118
  • Provide easy options for internal researchers to issue HITs and target workers in the panel

Most of these parts are readily doable. Operationally, using Turk will mean we no longer need to worry about servers, and we can do things like autopay on completion or automatically bonus with Fair Work (a system that bonuses workers to reach $15/hour based on the median completion time of a HIT). One caveat is that direct rewards carry a higher fee for HITs with more than 9 assignments, but we could either pay through bonuses via Fair Work, or use qualification groups and many repeated HITs so that each HIT has at most 9 assignments.
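To make the fee trade-off concrete, here is a rough Python sketch of the arithmetic. The fee rates (20% base commission, plus an extra 20% once a HIT has 10 or more assignments) are my reading of the MTurk pricing page and should be checked before we rely on them. The function names are made up for illustration, and `fair_work_bonus` only mirrors the one-line description above, not Fair Work's actual implementation.

```python
from statistics import median

# ASSUMPTION: rates taken from the public MTurk pricing page; verify before use.
BASE_FEE = 0.20             # commission on rewards (and bonuses)
LARGE_HIT_EXTRA_FEE = 0.20  # extra commission when a HIT has 10+ assignments
MAX_SMALL_HIT = 9           # largest assignment count that avoids the extra fee


def reward_cost(reward, assignments):
    """Total dollars charged for one HIT paying `reward` to each assignment."""
    rate = BASE_FEE
    if assignments > MAX_SMALL_HIT:
        rate += LARGE_HIT_EXTRA_FEE
    return round(reward * assignments * (1 + rate), 2)


def split_into_small_hits(total_assignments):
    """Assignment counts for repeated HITs with at most 9 assignments each."""
    full, rest = divmod(total_assignments, MAX_SMALL_HIT)
    return [MAX_SMALL_HIT] * full + ([rest] if rest else [])


def fair_work_bonus(reward, completion_seconds, target_hourly=15.0):
    """Per-worker bonus that lifts pay to `target_hourly` at the median time.

    Hypothetical helper: it follows the description in this issue only.
    """
    median_hours = median(completion_seconds) / 3600
    return round(max(0.0, target_hourly * median_hours - reward), 2)
```

Under these assumed rates, 100 assignments at $1.00 cost $140 as a single HIT, versus $120 as twelve HITs of at most 9 assignments each. The repeated-HIT route needs a qualification (or similar) to stop the same worker from taking more than one of the small HITs.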

For non-survey notifications (e.g., Empirica experiments), we could still use a combination of notifications and qualifications with a base HIT. Further, this seems to make things a bit more portable, because we are not relying on a centralized notification mechanism, or even on a centralized data store. Instead, we would allow teams to use the base scripts and customize them for their specific uses across different projects and accounts. Just as a reminder there: qualifications can be granted without a prior working relationship, but notifications and bonuses cannot, so there may be some thinking to be done about when and how notifications are required instead of leveraging qualifications alone (something we have used with mixed success to build the base panel so far).
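As a sketch of what the base scripts might contain, the Python below builds a qualification requirement that restricts a HIT to panel members and sends notifications in batches. It assumes the boto3 MTurk client, whose `notify_workers` call accepts at most 100 worker IDs at a time. The helper names (`panel_requirement`, `notification_batches`, `notify_panel`) are hypothetical and not existing Surveyor code.

```python
NOTIFY_BATCH_SIZE = 100  # NotifyWorkers accepts at most 100 worker IDs per call


def panel_requirement(qualification_type_id):
    """QualificationRequirements value for create_hit: panel members only."""
    return [{
        "QualificationTypeId": qualification_type_id,
        "Comparator": "Exists",
        # Workers without the qualification cannot see, preview, or accept.
        "ActionsGuarded": "DiscoverPreviewAndAccept",
    }]


def notification_batches(worker_ids, size=NOTIFY_BATCH_SIZE):
    """Yield consecutive slices of worker_ids, each no longer than `size`."""
    for start in range(0, len(worker_ids), size):
        yield worker_ids[start:start + size]


def notify_panel(client, worker_ids, subject, message):
    """Notify workers in batches; returns the number of API calls made.

    `client` is expected to be boto3.client("mturk"). Per the reminder above,
    this only works for workers we have a prior working relationship with.
    """
    calls = 0
    for batch in notification_batches(worker_ids):
        client.notify_workers(
            Subject=subject, MessageText=message, WorkerIds=batch
        )
        calls += 1
    return calls
```

Because each team would pass in its own client, the same script could run against different AWS accounts without a shared notification service or data store.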

I think this makes sense as an operational design, and I'm interested to hear from others about any possible issues that we should consider before deciding whether this is a worthwhile switch to invest in.

Lastly, there are also a few low-stakes tests we can run to reduce uncertainty in some areas, and that's something we can chat about while discussing this.

CCing people for feedback but of course feel free to bring others who might have thoughts into this conversation: @TutiGomoka @rivera-lanasm @shapeseas @JamesPHoughton

Hey Mark, this looks great, sorry for the delay replying here!

some thoughts:

  • For the "Add an output format for Surveyor" step and the other four checkbox steps, I can fork the Surveyor repo, start there, and create issues for each of the steps. Could we set up some time to walk through the components of that codebase and conceptually map how we'd like to organize it for this plan? It sounds like a large overhaul of the repo, so I'm considering forking into a new repo rather than creating a branch.

  • Should we go into more detail regarding how this system will facilitate giving multiple teams access to send/receive surveys to/from MTurk, while maintaining distinct accounting attribution for each team? Basically what we've been discussing in https://github.com/Watts-Lab/CSSLab-Operations/issues/195.

Great.

I agree that this will probably end up as more than one repo, but where it overlaps with stuff we are already doing I’d rather we aim to consolidate if possible. Or at least, have a good argument why something should be a new thing. In any case, I do think starting to map these issues out in more detail would be great.

Next week I am also talking to a group that builds tools for running large Turk projects, and it's possible their tools will change our thinking around some of this. So perhaps we can do some planning here, but we need not start digging into the technical aspects until after that meeting, just in case their tools solve all our problems :-)