palantir/policy-bot

bug: review approval dismissals are misleading when predicates are not satisfied

devinburnette opened this issue · 4 comments

For example, the use case we saw had the status predicate:

if:
  has_successful_status:
    - some_status_check

Policy bot said that a review is dismissed because it didn't have a valid pattern. I think we should probably just allow review dismissals for results with a skipped evaluation assuming that evaluation will be triggered again when and if the predicate is ever satisfied.

In other words, if the status is skipped, then evaluation can't happen so there's no chance of accidentally allowing an approval that should have been dismissed, until a final evaluation happens.

I'm not sure it's the best approach or exactly the best way to implement, but it's what I got. What do you think?

/cc @bluekeyes

To make sure I understand: the rule with the has_successful_status predicate also has github_review_comment_patterns set to some pattern (say "LGTM"), but it is the only rule in the policy using this pattern (or all other rules using it are also skipped), so reviews containing LGTM get dismissed?

Assuming that is accurate, this is something I did find a bit strange about the initial feature. How valuable is it to dismiss reviews that don't match a review comment pattern? To me, that feels more like getting an approval from someone on the wrong team (a case where the review is not dismissed), rather than a case where the review become invalid or stale (like it does if edited or if someone pushes a new commit.)

Would it make sense to only dismiss reviews if they are edited or if new commits are pushed? I think that would solve this problem (at least the most confusion versions of it), but I'd have to think about how to implement it.

I also have to think more about including reviews that match skipped rules. My initial reaction is that it feels wrong, because we otherwise exit evaluation pretty early for skipped rules. Doing partial evaluation beyond that point to get reviewers kind of breaks the the idea that skipped rules don't influence the result.

To make sure I understand: the rule with the has_successful_status predicate also has github_review_comment_patterns set to some pattern (say "LGTM"), but it is the only rule in the policy using this pattern (or all other rules using it are also skipped), so reviews containing LGTM get dismissed?

No there are other rules in the policy using has_successful_status but they both have github_review_comment_patterns set.

Assuming that is accurate, this is something I did find a bit strange about the initial feature. How valuable is it to dismiss reviews that don't match a review comment pattern? To me, that feels more like getting an approval from someone on the wrong team (a case where the review is not dismissed), rather than a case where the review become invalid or stale (like it does if edited or if someone pushes a new commit.)

I think it's very valuable for policies using github_review_comment_patterns where simply approving with no comment will appear to the reviewer to be sufficient, but that actually isn't the case and so it says on the PR that it has been approved maybe by all the correct people also, but policy bot is not satisfied because the pattern wasn't included. The feedback is helpful for that use case. However, with the current implementation it's a bit heavy handed as a fall through case for everything that isn't edited or invalidated by another commit. I hadn't considered skipped evaluations.

Would it make sense to only dismiss reviews if they are edited or if new commits are pushed? I think that would solve this problem (at least the most confusion versions of it), but I'd have to think about how to implement it.

Doing so would reintroduce the situations where an approved review still shows up in the PR as green, but conflicts with policy bot's evaluated approval. This wouldn't be the end of the world since that's how it worked before anyway, but not the most ideal solution. With the current logic this could be easily implemented by just return nil in the fall through case for the reason function and then just continue in the reviews loop so that it doesn't call the dismissal mutation.

I also have to think more about including reviews that match skipped rules. My initial reaction is that it feels wrong, because we otherwise exit evaluation pretty early for skipped rules. Doing partial evaluation beyond that point to get reviewers kind of breaks the the idea that skipped rules don't influence the result.

Yea, I haven't actually tried it, but I feel like based on the existing layout it could work, but I'll keep thinking about potential solutions because right now it's just broken as is because valid reviews are being dismissed just because stuff hasn't evaluated yet due to not meeting the condition of the predicate.

To pick a solution here, I think it would help to clarify the purpose of review dismissals. Basically, how inaccurate or incomplete can they be before the feature stops being useful? On one side, it's clear that dismissing reviews that are valid and influence approval is not acceptable. But I'm finding it hard to think about the other side: how many (or which types of) reviews can we miss dismissing before the feature feels buggy and broken instead of helpful?

Having no practical experience using the github_review_comment_patterns option, it seems similar to team conditions for approval to me. I don't see much of a difference between "my approval doesn't count because I used the wrong pattern" and "my approval doesn't count because I'm a member of the wrong team." What is this like in your experience? Are people much more likely to forget the pattern than they are to approve when they not part of the right team?

I ask because we never considered a version of review dismissal where it dismisses reviews from people who are not part of the set of allowed actors (for the record: I think that would be a bad feature), but we are considering variations on what happens when review comments do not match.


If we do want to go ahead with dismissing reviews that don't match patterns, I can get behind making AllowedCandidates the "set of allowed candidates as if this rule was evaluated" instead of "the set of allowed candidates after evaluating this rule", effectively collecting candidates from skipped rules too.

But to the point above about purpose, this defines dismissal as "dismiss all reviews that could not satisfy any rule under any condition." For policies where some rules set github_review_comment_patterns or invalidate_on_push and other don't, does this lead to a useful result? For example, in a policy where as long as I only edit README.md, I can push without invalidating approval, that one rule, skipped in most cases, now means we never dismiss stale reviews for other rules on push.

how many (or which types of) reviews can we miss dismissing before the feature feels buggy and broken instead of helpful?

This is a really good point. Okay so the original intent here was to try and reduce some confusion around the disconnect between github's "approval" and policy bot's approval, to address the question some folks have around, "my PR was approved now why has the status not turned green to allow me to merge". I think this feature does a great job of that. Additionally, github's native dismissal feature is all or nothing, so I liked the feature because where some rules are still valid in a world where you have a policy with two rules and only one rule invalidates on push for example.

What is this like in your experience? Are people much more likely to forget the pattern than they are to approve when they not part of the right team?

Yes, in my experience with the github_review_comment_patterns policies, people are just forgetting the pattern piece and just clicking approve with no comment, but in our case it's important that we decipher which review the approval was for. We very rarely see someone trying to do a review on a repo where they aren't part of the team, but I think the existing details page did a good job of stating that on it's own, but the nuance of github review comment patterns seemed like the dismissal message would still be helpful for that use case because you almost always mean to include the comment when hitting approve. But we do have cases like in the cases of the team membership example where someone wants to express their approval, but not trigger formal approval, and for that I felt like a comment is usually good enough.

this defines dismissal as "dismiss all reviews that could not satisfy any rule under any condition."

You're right, I think my intent got foggy once I began developing the feature. It was titled, dismiss stale reviews, not dismiss non-satisfying reviews. The implementation of that purpose sounds scary, but I see what you mean by it feeling broken if it's not inclusive of everything. I started to creep in non-satisfying reviews by adding the github comment pattern piece, but maybe that doesn't belong there and it would be more suited as additional information in the details view instead.