adobe-apiplatform/user-sync.py

LDAP - Support different search_bases for the user & group queries

srtucker opened this issue · 5 comments

Is your feature request related to a problem? Please describe.
We have a large number of LDAP users that should not be included in the user sync, and all of the users that should be included are isolated in the sub-tree of a specific OU. I would like the ability to retrieve only the users in that sub-tree.

Describe the solution you'd like
Currently user-sync.py supports setting a single base_dn that is used as the search_base for both the user and group queries. I would like the ability to split these out and configure a different search_base for each query.
My thought would be to add the following optional settings to connector-ldap.yml:

  • user_search_base - if set, this value would be used for the user searches, otherwise use base_dn
  • group_search_base - if set, this value would be used for the group searches, otherwise use base_dn
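The fallback described in the two settings above can be sketched in Python roughly as follows. This is a hypothetical illustration, not existing user-sync.py code: the function name `resolve_search_bases`, the plain-dict `options` argument, and the example DNs are all assumptions made for the sketch.

```python
def resolve_search_bases(options):
    """Return (user_base, group_base), falling back to base_dn when a
    more specific search base is unset or empty.

    `options` is assumed to be the parsed connector-ldap.yml as a dict;
    user_search_base and group_search_base are the proposed (not yet
    existing) optional settings.
    """
    base_dn = options["base_dn"]
    user_base = options.get("user_search_base") or base_dn
    group_base = options.get("group_search_base") or base_dn
    return user_base, group_base


# Made-up example values: only the user search base is overridden.
options = {
    "base_dn": "DC=example,DC=com",
    "user_search_base": "OU=Enterprise Accounts,DC=example,DC=com",
}
user_base, group_base = resolve_search_bases(options)
print(user_base)
print(group_base)
```

With only user_search_base set, group searches would still use base_dn, so existing configurations would behave as they do today.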

Describe alternatives you've considered
I have tried to put together a filter to achieve the same result, but it does not appear that you can filter on OU path.

Additional context
None.

Thank you!

Thank you for reaching out. I'm curious to learn more about your use case.

  • What LDAP system are you using?
  • Does your LDAP system require two-step group membership lookup or will it permit Active Directory-style memberOf lookups?
  • What are your concerns with using a broad base_dn? In general the base DN should be as broad as possible to ensure the LDAP connector has full visibility of the users and groups that should be synced. Are you concerned about performance or security perhaps?

Hi Andrew,
We are using Active Directory. It does not require two steps and it is currently pulling group membership correctly.

My concerns are a bit mixed, but it is primarily performance and ensuring the right information is synced into the Adobe Admin Console.

We have at least two user accounts for each person: a normal account and an admin account. About 100 people also have one or more additional accounts for various higher levels of access to AD and servers. All of the accounts for the same person have the same mail field but different names. We also have service accounts and department accounts which, together with the "normal" person accounts, we call "Enterprise Accounts". These enterprise accounts are all under a specific OU, while all of the other accounts are in different locations. Only the enterprise accounts should ever be synced into the Adobe Admin Console (the others can't use SSO).

Our groups are in an entirely different location in AD than our enterprise accounts, and the only structure they share is the root of the domain, so I can't set a single base_dn without also including all of the accounts we don't want.

One concern is that the wrong account, with the wrong name, could be synced into Adobe. For example, mine could be imported as "Scott Tucker Server Admin" instead of "Scott Tucker", since both accounts have the same email address (the unique ID in Adobe).

The second concern is performance: the LDAP query and subsequent processing are more than twice as fast if I set the base_dn to the OU for just the users I want to sync (but then the groups don't work).

Please let me know if any of this needs clarification. Thank you!

Thank you for the additional info. For AD-type LDAP structures I don't think it's technically possible to do what you're describing. When the LDAP connector is told to get users for specific groups (when users is mapped and process_groups is enabled), it performs this workflow for each directory_group in your group mapping:

  1. Look up DN for given group CN using group_filter_format
  2. Interpolate group DN into group_member_filter_format
  3. Join group_member_filter_format and all_users_filter with & to make a query that targets users who belong to the group

Here's an example query string generated by the LDAP connector for my personal AD sandbox:

(&(memberOf=CN=adobe-all-apps,OU=Groups,DC=adobeccetest,DC=com)(&(objectClass=user)(objectCategory=person)(!(userAccountControl:1.2.840.113556.1.4.803:=2))))

This is the reason the base_dn must be broad enough to encompass all users and groups you wish to sync. No users are returned if the base DN covers only groups or only users.
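Steps 2 and 3 of the workflow above can be sketched in Python as follows. This is an illustration rather than the connector's actual code: the function name is hypothetical, the two filter values are inferred from the example query string, and step 1 (resolving the group CN to a DN) is assumed to have already happened.

```python
def build_group_user_filter(group_dn, group_member_filter_format, all_users_filter):
    """Interpolate the group DN into the membership filter (step 2),
    then AND it with the all-users filter (step 3)."""
    member_clause = group_member_filter_format.format(group_dn=group_dn)
    return "(&" + member_clause + all_users_filter + ")"


# Values inferred from the example query; your connector-ldap.yml may differ.
GROUP_MEMBER_FILTER_FORMAT = "(memberOf={group_dn})"
ALL_USERS_FILTER = (
    "(&(objectClass=user)(objectCategory=person)"
    "(!(userAccountControl:1.2.840.113556.1.4.803:=2)))"
)

# Assumed result of step 1 for the group CN "adobe-all-apps".
group_dn = "CN=adobe-all-apps,OU=Groups,DC=adobeccetest,DC=com"

query = build_group_user_filter(group_dn, GROUP_MEMBER_FILTER_FORMAT, ALL_USERS_FILTER)
print(query)
```

The printed string matches the example query above. Both the group lookup in step 1 and this user query are searched under the same base_dn, which is why a narrow base DN breaks one or the other.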

To avoid duplicate or unwanted users, there are a few things to keep in mind:

  • If you follow recommended practices (users: mapped and process_groups: True) then the UST only sees users that are members of mapped directory groups
  • Federated users are identified not only by email but also by username. The username is matched to the NameID of the SAML assertion. You should plan to configure your LDAP connector to map the username using the user_username_format. The most common setting for this when using AD is {userPrincipalName}, but it should be configured to map the same field you're setting as the NameID in your SAML setup
  • You can alter all_users_filter to exclude based on any attribute you want. So if you have device or utility accounts in your mapped groups, I imagine there is some means of identifying them based on an attribute
  • If that isn't feasible, you can look at setting up the extension config to identify unwanted users based on LDAP attribute. You can't filter the users at that part of the sync workflow, but you can ensure they don't get assigned licenses or even put them in a special user group
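As a rough sketch of the all_users_filter suggestion above, an exclusion clause can be wrapped around an existing filter like this. The helper names and the `employeeType=admin` attribute and value are made up for illustration; substitute whatever attribute actually distinguishes the unwanted accounts in your directory.

```python
def escape_filter_value(value):
    """Escape characters that are special inside an LDAP filter value
    (RFC 4515), so the value is matched literally."""
    replacements = {
        "\\": r"\5c",
        "*": r"\2a",
        "(": r"\28",
        ")": r"\29",
        "\x00": r"\00",
    }
    return "".join(replacements.get(ch, ch) for ch in value)


def exclude_by_attribute(all_users_filter, attribute, value):
    """AND the existing filter with a NOT clause on attribute=value."""
    return "(&{}(!({}={})))".format(
        all_users_filter, attribute, escape_filter_value(value)
    )


# Hypothetical attribute and value marking accounts that should not sync.
base_filter = "(&(objectClass=user)(objectCategory=person))"
print(exclude_by_attribute(base_filter, "employeeType", "admin"))
```

The resulting string would then be used as the all_users_filter value in the LDAP connector configuration.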

Let me know if you have any questions or need additional info.

I should also add that the CSV connector offers a great deal more flexibility. You would have to write your own script to query and filter the exact users you want, but if you can get them into a CSV file the UST can treat that file like an identity source.
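A minimal sketch of the filtering half of such a script is shown below, assuming the users have already been fetched from the directory by some other means. The record keys, the example DNs, and the column list are assumptions for illustration; check the column names against the sample users file for your UST version. The sub-tree test is a naive string suffix match and does not normalize DN spacing or escaping.

```python
import csv
import io

# Assumed column layout for the CSV identity source.
CSV_COLUMNS = ["firstname", "lastname", "email", "country",
               "groups", "type", "username", "domain"]


def in_subtree(dn, base_dn):
    """Naive check that dn sits at or below base_dn (case-insensitive)."""
    dn, base_dn = dn.lower(), base_dn.lower()
    return dn == base_dn or dn.endswith("," + base_dn)


def write_users_csv(records, user_base, out):
    """Write only the records under user_base; return how many were written."""
    writer = csv.DictWriter(out, fieldnames=CSV_COLUMNS)
    writer.writeheader()
    written = 0
    for rec in records:
        if not in_subtree(rec["dn"], user_base):
            continue  # account lives outside the wanted OU sub-tree
        writer.writerow({
            "firstname": rec["givenName"],
            "lastname": rec["sn"],
            "email": rec["mail"],
            "country": rec.get("country", ""),
            "groups": ",".join(rec.get("groups", [])),
            "type": "federatedID",
            "username": rec["userPrincipalName"],
            "domain": rec["mail"].split("@", 1)[1],
        })
        written += 1
    return written


# Made-up records standing in for the results of your own LDAP query.
records = [
    {"dn": "CN=Alex Doe,OU=Enterprise Accounts,DC=example,DC=com",
     "givenName": "Alex", "sn": "Doe", "mail": "adoe@example.com",
     "userPrincipalName": "adoe@example.com", "groups": ["adobe-all-apps"]},
    {"dn": "CN=Alex Doe Server Admin,OU=Admins,DC=example,DC=com",
     "givenName": "Alex", "sn": "Doe", "mail": "adoe@example.com",
     "userPrincipalName": "adoe-sa@example.com", "groups": []},
]
buffer = io.StringIO()
count = write_users_csv(records, "OU=Enterprise Accounts,DC=example,DC=com", buffer)
print(count)
print(buffer.getvalue())
```

Only the account under the chosen OU is written, even though both records share the same mail value.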

Hi Andrew, I looked into this further and it looks like it is not needed for our daily sync, which only processes matched users. It would only apply to our account delete process, which we run very rarely, so the extra overhead of querying all users is not that important. Between this and updating our mapping to use userPrincipalName for the username, all of my concerns are addressed. Thank you for your explanations; they were very helpful.