cartography-cncf/cartography

Github user.role is potentially inaccurate in graphs with multiple github orgs?

Closed this issue · 5 comments

In the process of working on pr 1373 I noticed what seems like an edge case for users who are in multiple Github organizations: imagine a user who is an 'ADMIN' to Org1, and a 'MEMBER' to Org2. The user node's ‘role’ property will say either 'ADMIN' or 'MEMBER', so, it will be incorrect with respect to one of the orgs.

This was not an issue for my PR specifically but in thinking through it, it did make me think of an alternative graph that might be clearer? Cartography could graph these relationships like so:

(User)-[MEMBER_OF|ADMIN_OF|UNAFFILIATED]-(Org)

And then the user.role property would goes away (because its info would now be encoded in the type of the relationship between user and org).

I haven't looked into implementing this yet, because I wasn't sure if it's of any interest, or if maybe there is a reason for the way things are currently done. It would be a 'breaking' change, since it removes a property from a node and would change the relationship type of some users (i.e. admins currently are MEMBER_OF but would become ADMIN_OF).

Does this make sense and, if we wanted to make this update in one of our upcoming sprints, would that be welcome? It's not critical for us, and we have other changes we'd like to make, but it might be something I can do.

  • Cartography release version 0.93
  • Python version: 3.12

Mm, yeah everything you wrote sounds correct. The code right now does not support the idea of a user having different 'role' properties for different orgs. How about a schema like this:

  1. A user belongs to a github org. This is the "tenant relationship". cartography convention is to have one label be that specific relationship so that we can determine what org a user belongs to in a uniform way -- that is, without needing to account for multiple possibilities of label values connecting back to the org.
(:GitHubOrganization)<-[:MEMBER_OF]-(:GitHubUser)
  1. Some users can be admins of github orgs:
(:GitHubOrganization)<-[:ADMIN_OF]-(:GitHubUser)

So in the case where a user is an admin, we would have 1 MEMBER_OF edge + 1 ADMIN_OF edge, so it would be a bit redundant, but that's fine. It'd look something like

(:GitHubUser)-[:MEMBER_OF]->(:GitHubOrganization)
          |                 ^
           -[:ADMIN_OF]----|

Thoughts?

cc: @heryxpc @ramonpetgrave64 @chandanchowdhury

I like it. I will wait for others to chime in.

Also, a little more thinking on why this could be valuable: maybe multiple orgs in one enterprise is a bit of an edge case, and I am pretty sure Github says it’s a contra pattern and that they try to dissuade it, but, I think it still happens, and it can't be bad for the graph to be more accurate, especially for a relationship as important as member vs admin.

(For my part, we have other Github module contributions we'll need to work on first, but, assuming this is still not done, I could then argue to my manager that I can do it while all the Github module stuff is fresh in my mind.)

I agree, cartography is all about verifying assumptions, especially if things are used in a way that is not intended

Just FYI that, barring surprises, I intend to work on this this week.

Issue was resolved in #1403, now merged.