OpenConext/OpenConext-profile

Under my-services, each service displays the same list of (wrong) attributes

thijskh opened this issue ยท 9 comments

Under my-services, each of the services I have logged into shows the same list of four attributes it supposedly received. The list should be different per SP depending of course on the attribute release policy for that SP.

At the moment of creating the new Profile application this was a conscious design decision. EngineBlock does not keep track of, nor could provide a list of attributes and values exposed per SP. Therefor it would be implossible to create this functionality.

There is currently an issue in the Profile Icebox that facilitates the functionality as described. This still requires a design decision and is pending an implementation in EngineBlock (as EngineBlock should provide the data). The current implementation is created similar to the functionality as described in this issue.

The old profile now displays the result of applying the current ARP for an SP to the current set of attributes received by profile for this user.

108068776 in my reading deals with the fact that this is not strictly accurate, since the ARP, or the set of the user's attributes could have changed over time. However, it's definitely a good approximation of what happens.

I currently see the following four attributes, and it's the same for every service:
screenshot from 2016-06-10 14 12 51
I don't understand how I would get to that list from reading 06201046.

The current profile is part of EngineBlock, and therefor can have the ARPs applied for the respective SPs over all attributes it has for the current user, as EngineBlock contains the ARP logic. The new profile is not part of EngineBlock and does not contain the ARP logic (and should not), therefor Profile cannot apply the same ARPs as EngineBlock (and thus not as the current Profile). The choice to create the current implementation was discussed during the development phase of Profile.

The list that is currently shown is the result of all attributes Profile receives from EngineBlock (after applying ARP for Profile if there is any), followed by filtering these attributes as is done in that class, as was required per link in the description of https://www.pivotaltracker.com/n/projects/1453004/stories/106201046. These attributes are then indeed shown for all SPs.

Hope that explains it ๐Ÿ˜„

Alright. I would then conclude that GivenName which is in the list or Surname which is also in the list would show up, while urn:mace:attribute-def:mobile, which is not in the list, is shown in Profile.

The original implementation of the filter seems to be a blacklist rather than a whitelist. This has been carried over into Profile. If that is incorrect or unwanted behavior it should be corrected. I think this could best be discussed with @arothuis and @relaxnow as the creators of the new and current versions of Profile ๐Ÿ˜„

So regardless of how we got here, I think the functional requirement is as follows.

As a user, in order to see which services I used and which attributes each service received, I want to see a list of each SP I've logged into. Because we have no historic record of what was released in the past exactly, for each SP, the list of currently released attributes (ARP) should be listed, with my current value behind it (as a fair approximation of reality).

At this point we can ignore Attribute Manipulations.

What would be required to make this possible?

There are a couple of considerations here:

  • the ARPs as applied by EngineBlock take all ARPs from all SPs in the requester chain to determine the list of allowable attributes
  • the Attributes released to EngineBlock are dictated by the IdP, which may be different when visiting profile.

What can be done:

Given is the list of attributes as received by Profile from EngineBlock upon login.

Option A

Make a custom (to be created) API call to EngineBlock with a list of SP EntityIDs as retrieved in an earlier API-call (where all consents given are retrieved). This API returns a list of ARPs, listing the EntityID specific ARP per SP. Profile then applies the ARPs itself on the received attributes and show the allowed attributes to the user.
Pro: relatively simple, ARPs could be cached by Profile if need be (for an X amount of time), call could be made against non-user-facing EB instance (for reduced load on user-facing instances).
Con: duplicated logic of applying the ARP (no matter how simple the logic is, it could lead to discrepancies if only one application changes), would only apply ARP of given SP, not of other SPs in the requester chain, only uses currently (for Profile released) attributes.

Option B

Make a custom (to be created) API call to EngineBlock with a list of SP EntityIDs as retrieved in an earlier API-call (where all consents given are retrieved) and the current attributes and values as received by Profile upon login. EngineBlock then applies the respective ARPs and returns a list of attributes and values which are shown to the user.
Pro: relatively simple, no duplicated logic, call could be made against non-user-facing EB instance.
Con: ARPs are always "live", no possibility to cache (unless that is created in EB), quite intensive for EB to do, would only apply ARP of given SP, not of other SPs in the requester chain, only uses currently (for Profile released) attributes.

Option C

A bit more event-driven. When a new consent is given, after sending the response to the user[1], EngineBlock makes a call to a new Profile API containing the following:

  • User NameID
  • List of attributes for which consent is given
  • (Possibly including attribute values, would depend on SURFconext privacy statement and whether or not wanted)

If no attribute values are sent, the values Profile received from EngineBlock upon login can be used to fill in as much of the values as possible, an explanation could be added for the missing values. This allows Profile to always use its local data, and would remove the necessity for the API call to EB to retrieve the given consents of a user (no API calls from Profile to EB would exist anymore).
Pro: no duplicated logic, writes only on consent given, granular approach to what data is shared between EB and Profile would be possible, everything is "cached" by default (no external API calls necessary by Profile), all applied ARPs would be taken into consideration, would use attributes as released by the IdP upon giving the consent, Attribute Manipulations would be applied as well (could become more important with the current development of moving AM from after consent to before consent).
Con: reliance on Profile to be available when EB is available as a failed call from EB to Profile might lead to outdated or missing data [2], must be done by user-facing instance of EB[2]


With respect to development impact, A and B are roughly the same amount of work, C would be a bit (handwaving, a 20%) more work.
With respect to correctness, A and B are equally correct but both are a rough approximation, C would be 100% correct.
With respect to additional operational requirements, all would be roughly the same amount of load on the system as a whole, with the load on C being a lot more stable and predictable as it would happen each consent vs each profile visit for A and B. The calls made by B require more work from EB and could be quite a bit of work when dealing with large amounts of consents.

Given these options, I would advise to go for Option C, although I can see that that might be a bit daunting. It is the more future-proof solution, the risks can be negated (now or in the future) and the strain on the platform is consistent and predictable. Option A and B would also be an approximation, whereas C would be correct (albeit some values may be missing if wanted).

If the choice would be A or B I would go for B. This may be more work for EB, but at the same time removes the possibility of an outdated consent showing due to some (possible) cache. Furthermore (and more important), the logic is not duplicated.

Thoughts?


[1]: This ensures the user doesn't see any impact in performance. Can be done using Symfony specific implementation that allows heavy lifting (e.g. sending emails, doing additional calls or event broadcasting) that should not influence the user to take place after the response has been sent to the user.
[2]: These risks would be negated with a simple job queue implementation. It would schedule a new task and that would be executed asap, possibly by a non-user-facing instance. The job queue is also required for the future implementation of deregistration of a user upon request (to allow successful deregistration in EB, SSA and Teams).

Thanks for the really extensive analysis. Important starting points for me are:

  • Profile is used sporadically in comparison with Engineblock logins. We expect only a subset of users to ever visit Profile.
  • My vision for Engine is that it concentrates as much as possible on the processing of logins; ideally I don't want it to be occupying itself more with auxiliary or peripheral functionality more than necessary.

While I can see the benefits of C, I'm hesitant to put profile-related code into EB. We've just separated out profile as a separate application. I'd like EB to provide generic api's for profile to do its job, not do too much profile-specific things. Also, given the first point, it might not be worth it to store lots of data for users who will never come by profile. A nice property of the current profile is that it doesn't store anything, which keeps it a simple application.

So that leaves A and B as options. Agreed that B is the better option of those two. My only concern would be the complexity inside EB. If the ARP code is indeed easily reused without creating significant new complexity in EB, I'd go for B. If somehow it turns out to be tricky or requires lots of refactoring, then I prefer A because I'd prefer to keep complexity out of EB where possible. The duplication of code in Profile is not the end of the world as applying ARP's is relatively straightforward. So even option A is not a bad option in itself, I think.

Concluding I propose that we do B unless implementing it in EB turns out to be complex, then we choose A.

For option B the complexity inside EngineBlock is low, both with the current metadata system as the new envisioned metadata system the actual application of ARPs on a set of attributes is simple - the API layer is simple and thin as well.

B it is then ๐Ÿ‘