lookit/lookit-docs

UPDATE - Improve documentation on ineligibility edge cases

mekline opened this issue

TL;DR

We now provide researchers with a column telling them whether and why a specific child who participated is eligible or ineligible for a study. In some edge cases, this information is confusing or difficult to interpret because of permissions issues: researchers see that a child is marked as e.g. "Ineligible_Participation" but can't verify exactly why, because they don't have permission to view the underlying data.

The interaction between permissions/privacy and the difficulty of defining 'participated' criteria means that we probably need to 'solve' this with better documentation of the current behavior rather than code upgrades.

Narrative

As a researcher, I want to know exactly why each participant is ineligible for my study. But as the CHS Administrator, I want to abide by privacy restrictions and provide consistent on-site behavior across all researchers, in some cases even when this departs from a specific researcher's desired workflow.

In some cases this lack of information is straightforwardly the best option, even though it's confusing for the researcher. For instance, researchers can blacklist one or more studies from another lab that's too similar to their own, but cannot directly verify that a specific person participated in a specific other study (because that would violate the privacy of that other participant/researcher pair).

In other cases, there is some wiggle room over what the right course of action should be, but for a well-functioning platform we should NOT allow researchers to tune this at will. For instance, consider a case where a researcher self-blacklists a study to avoid repeat participation. A participant whose consent statement is rejected on try #1 will be marked as ineligible on try #2, because they will have an existing study session object (albeit a short one). This is compounded by the fact that this rejected participant becomes mostly invisible (PENDING: Details from Tiffany D. about 1/3/24 case) to the researcher, since their information is, appropriately, not included in the resulting dataset!
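To make that edge case concrete, here is a minimal Python sketch (with made-up names, not the actual CHS code) of the current rule: any existing response object for a blacklisted study counts as participation, even if consent was rejected and the session was very short.

```python
# Hypothetical sketch, NOT the real CHS implementation: shows why a
# consent-rejected session still triggers Ineligible_Participation
# under the "any response object counts" rule.
from dataclasses import dataclass


@dataclass
class Response:
    child_id: str
    study_id: str
    consent_approved: bool  # False when consent was rejected


def is_ineligible_by_participation(
    child_id: str, blacklisted_study_ids: set[str], responses: list[Response]
) -> bool:
    """Current rule: ANY response object for a blacklisted study counts,
    regardless of consent outcome or session length."""
    return any(
        r.child_id == child_id and r.study_id in blacklisted_study_ids
        for r in responses
    )


# Try #1: consent rejected, but a response object now exists anyway.
responses = [Response("child-1", "my-study", consent_approved=False)]

# Try #2: the same child is now marked ineligible for the self-blacklisted study.
print(is_ineligible_by_participation("child-1", {"my-study"}, responses))  # True
```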

We could start elaborating the rules for what 'counts' as having participated, e.g. to allow participated-but-consent-rejected children to count as NOT having done a study, but this gets extremely messy extremely quickly (cf. survey consent issues as they interact with emails!), and it would be a major effort to define a logic that (a) successfully covers all edge cases, (b) does so in a logically consistent fashion across the entire codebase, and (c) does so in a way that is easy for researchers to understand intuitively.

The alternative is to leave functionality as-is but improve documentation. Researchers should be prepared for the fact that the blacklist/whitelist criteria are very strict, get a good intuition about some of the edge cases that can come up, and be made aware of 'softer' alternatives (for instance, leaving a study off of its own blacklist, but then using the available child object info to check the details of previous participation and warn or re-route participants exactly as desired).
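As an illustration of that 'softer' pattern, the sketch below shows the kind of routing decision a researcher could implement themselves once the study is left off its own blacklist. In practice this logic would live in the study's own protocol/frame code; all names here (PastSession, route_returning_participant, the consent/completion fields) are hypothetical and not part of the CHS API.

```python
# Hypothetical sketch of the "softer" alternative: the study is NOT on its own
# blacklist, so returning children are never hard-blocked; the study decides
# for itself how to handle prior sessions.
from dataclasses import dataclass


@dataclass
class PastSession:
    study_id: str
    consent_approved: bool
    completed: bool


def route_returning_participant(past_sessions: list[PastSession], study_id: str) -> str:
    prior = [s for s in past_sessions if s.study_id == study_id]
    if not prior:
        return "run_study"          # first visit: proceed normally
    if any(s.completed and s.consent_approved for s in prior):
        return "warn_or_reroute"    # true repeat participant
    return "run_study"              # e.g. consent rejected last time: allow a retry


# A child whose only prior session had consent rejected is allowed back in.
history = [PastSession("my-study", consent_approved=False, completed=False)]
print(route_returning_participant(history, "my-study"))  # "run_study"
```

The point of this pattern is that the trade-off (strict exclusion vs. a warning or a retry) is made explicitly by the researcher, rather than by the platform-wide eligibility rule.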

Acceptance Criteria

  • Verify the edge cases in question - right now we have researcher reports, but we should review these for clarity/consistency
  • Documentation has been added, either to an existing page or a new one
  • Documentation has been reviewed by a few researchers to make sure it's comprehensible

Implementation Notes

Background on "has participated"

Since the current team (mid-2021 forward) has been touching relevant parts of the codebase, we have consistently defined "participated" to mean "a response object exists" for eligibility (aka black/white listing) and, I think, some other cases as well. This is consistent with most other site behavior we know about, but not with email, which seems to have stricter criteria, resulting in participants occasionally getting re-emailed, e.g. in cases where the video consent process is not used. (For external studies, we did define participated = a response object exists, and this seems to be intuitively clear to researchers, since all subsequent info past the initial click is recorded offsite.)
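The sketch below contrasts the two definitions purely for illustration. The eligibility predicate matches the "response object exists" rule described above; the email predicate shown (requiring a completed video-consent frame) is an assumption, used only to demonstrate how the two criteria can diverge and why re-emailing can happen.

```python
# Illustrative contrast only; these are assumptions, not the actual CHS code.
from dataclasses import dataclass


@dataclass
class Response:
    child_id: str
    study_id: str
    completed_consent_frame: bool  # e.g. False when no video consent process is used


def participated_for_eligibility(responses: list[Response], child_id: str, study_id: str) -> bool:
    # Rule used for black/white listing: any response object counts.
    return any(r.child_id == child_id and r.study_id == study_id for r in responses)


def participated_for_email(responses: list[Response], child_id: str, study_id: str) -> bool:
    # ASSUMED stricter email rule (for illustration only): a response counts
    # only if the consent frame was completed, so some participants can be re-emailed.
    return any(
        r.child_id == child_id and r.study_id == study_id and r.completed_consent_frame
        for r in responses
    )


r = [Response("child-1", "study-A", completed_consent_frame=False)]
print(participated_for_eligibility(r, "child-1", "study-A"))  # True  -> ineligible
print(participated_for_email(r, "child-1", "study-A"))        # False -> may be re-emailed
```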

A large and annoying piece of technical debt to pay down would be a thorough front-end AND back-end review of all possible workflows onsite that concern participation (i.e. from the Admin, Researcher, and Participant perspectives, including emails, study display, study eligibility, study eligibility marking in datasets, my study history, etc.) and resolution to a single standard. Assuming that we don't decide to imminently pay this down, the best course of action seems to be to continue using the "participated = response object exists" definition in a consistent manner, and make sure researchers know this fact as well as their options for softer implementations they may want to make use of.