cleanup-org-users fail
vanillacandy opened this issue · 9 comments
Describe the bug
We are using pipeline+ldap combination.
To cleanup, I have the following code inside my pipeline.yml
- name: cleanup-org-users
  build_logs_to_retain: {{build_to_retain}}
  plan:
  - get: config-repo
    passed: [create-orgs]
    trigger: true
  - get: 15m
    trigger: true
  - task: cleanup-org-users
  - get: config-repo
The task cleaned up a number of users, but it got stuck on one org and could not proceed further.
When I checked in the UI, the org had already been cleaned up, but the error log suggests the task is still stuck on that org. The error is as follows:
2020/10/30 21:31:40 I1030 21:31:40.623767 20 cleanup_users.go:75] Unable to find user () GUID from uaa, using org user guid instead
14:31:40
2020/10/30 21:31:40 I1030 21:31:40.623835 20 cleanup_users.go:86] Removing User from org
14:31:40
error: Error removing user from org xx: cfclient error (CF-AssociationNotEmpty|10006): Please delete the user associations for your spaces in the org
To Reproduce
I checked that both the org and space yml files have
enable-remove-users: true
Desktop (please complete the following information):
The version of cf-mgmt we are using is tag: "1.0.43"
We have created an issue in Pivotal Tracker to manage this. Unfortunately, the Pivotal Tracker project is private so you may be unable to view the contents of the story.
The labels on this github issue will be updated when the story is started.
vmware case created here also: https://community.pivotal.io/s/case/5000e00001nEXNvAAO/cleanup-org-users
Hi @vanillacandy! Unfortunately I don't seem to have access to the VMware case on community.pivotal.io, so I'm sorry if my solution has already been suggested there.
The problem seems to be that cf-mgmt was unable to remove the user from the org, because a prerequisite of that operation is that the user has all of its org roles unset first.
You may be able to resolve this by:
- Using the cf-cli to list org users in the org: cf org-users YOUR-ORG (the UI does not always display all users that the CLI would)
- Comparing the list of users you get back with the list of users you expect to see in the org
- Manually unsetting any org-roles of org-users that should no longer be there (i.e. cf unset-org-role USERNAME ORG ROLE)
- Running the cleanup-org-users task again
If the list of users you get back from cf org-users is what you expect, then there might be something more complex going on that will require some extra debugging.
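The comparison described in the steps above can be sketched in Python. This is only an illustration and not part of cf-mgmt: the function names, the sample usernames, and the idea of copying the cf org-users output by hand into a dictionary are all assumptions. The role names are the ones cf unset-org-role accepts (OrgManager, BillingManager, OrgAuditor).

```python
def stale_org_roles(actual_roles, expected_users):
    """Return sorted (username, role) pairs for org users who are not expected.

    actual_roles: dict mapping an org role name to the usernames that hold it,
                  copied by hand from the `cf org-users YOUR-ORG` output.
    expected_users: set of usernames that should remain in the org.
    """
    stale = []
    for role, users in actual_roles.items():
        for user in users:
            if user not in expected_users:
                stale.append((user, role))
    return sorted(stale)


def unset_commands(org, stale):
    """Build the cf CLI commands that would unset each stale org role."""
    return [f"cf unset-org-role {user} {org} {role}" for user, role in stale]


# Hypothetical sample data, for illustration only.
actual = {"OrgManager": ["alice", "bob"], "OrgAuditor": ["carol"]}
expected = {"alice"}

for command in unset_commands("YOUR-ORG", stale_org_roles(actual, expected)):
    print(command)
```

The script only prints the commands, so they can be reviewed before anything is run against the foundation.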
Hopefully that helps!
Thank you, that's very helpful. I will try to do that.
Also, I ran the cleanup command on my second foundation. The task completed and reported that it had cleaned up all orgs. However, when I checked, some orgs had been cleaned up successfully while others had not been cleaned up at all, and there was no error saying the task had failed.
Hmmm, do you see log messages saying cf-mgmt did clean those orgs? Or are the log messages for those orgs not present? It may be worth checking your orgs.yml to see if any of the orgs that were not cleaned up might match any entries under the protected-orgs list.
There's no protected-orgs in the list.
The setup is as follows:
org.yml
org: XX
org-billingmanager:
  ldap_users: []
  users: []
  ldap_group: ""
org-manager:
  ldap_users: []
  users: []
  ldap_group: ""
org-auditor:
  ldap_users: []
  users: []
  ldap_group: ""
enable-org-quota: false
memory-limit: 10240
instance-memory-limit: -1
total-routes: -1
total-services: -1
paid-service-plans-allowed: false
enable-remove-users: true
###space.yml
space1:
org: orgname
space: XX
space-developer:
  ldap_users: []
  users:
  - user1
  ldap_groups:
  - group1
  - group2
space-manager:
  ldap_users: []
  users: []
  ldap_group: ""
space-auditor:
  ldap_users: []
  users: []
  ldap_groups:
  - ldapgroup1
allow-ssh: true
enable-space-quota: true
memory-limit: xx
instance-memory-limit: -1
total-routes: -1
total-services: -1
paid-service-plans-allowed: false
enable-security-group: false
enable-remove-users: true
###second space.yml
org: xx
space: space2
space-developer:
  ldap_users: []
  users:
  - user1
  - user2
  ldap_groups:
  - ldapgroup1
  - ldapgroup2
space-manager:
  ldap_users: []
  users: []
  ldap_group: ""
space-auditor:
  ldap_users: []
  users: []
  ldap_groups:
  - ldapgroup
allow-ssh: false
enable-space-quota: true
memory-limit: xx
instance-memory-limit: -1
total-routes: -1
total-services: -1
paid-service-plans-allowed: false
enable-security-group: false
enable-remove-users: true
....and more space.yml in similar format
It's probably worth mentioning that cleanup-org-users is really only a "clean up". The users it removes already have no permissions and are just lingering entries, so it really does no harm if the task fails. As long as there are no errors when setting/unsetting user roles in the org, this step is really just a nice housekeeping step. But if you like to see a pristine environment, carry on!
The message "Unable to find user () GUID from uaa, using org user guid instead" could imply that one of the users in one of your LDAP groups is not actually in uaa. The way this works is kind of confusing, but essentially when you set up LDAP to act as an authentication origin for UAA, it does not automatically add all of your users.
Instead the logic is:
- "userRyan" is trying to login to UAA
- "userRyan" is not found in UAA, check against other UAA origins to see if "userRyan" exists there
- "userRyan" exists in LDAP, so create a "shadow user" in UAA with the origin: ldap so we know to look there in the future
So it's possible that the user that cf-mgmt cleanup-org-users is trying to clean up has never logged in and therefore they do not "exist" in UAA.
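The login logic described above can be modelled with a few lines of Python. This is a toy model for illustration only, not UAA's actual code: the class and method names are made up, and password checking is left out entirely.

```python
class ToyUaa:
    """Toy model of the shadow-user behaviour; not real UAA code."""

    def __init__(self, ldap_usernames):
        self.ldap_usernames = set(ldap_usernames)  # users known to LDAP
        self.users = {}  # username -> origin, i.e. what UAA itself has stored

    def login(self, username):
        """Return the user's origin, creating a shadow user on first LDAP login."""
        if username in self.users:
            return self.users[username]
        if username in self.ldap_usernames:
            # First login: remember the user in UAA with origin "ldap".
            self.users[username] = "ldap"
            return "ldap"
        return None

    def find_user(self, username):
        """Look the user up in UAA only, the way a GUID lookup would."""
        return self.users.get(username)


uaa = ToyUaa(ldap_usernames=["userRyan"])
print(uaa.find_user("userRyan"))  # None: userRyan has never logged in
uaa.login("userRyan")
print(uaa.find_user("userRyan"))  # ldap: the shadow user now exists
```

In this model a lookup before the first login finds nothing, which is the situation the empty "user ()" in the log message would be consistent with.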
You can try to identify which user(s) this might be by running cf org-users XX -a. According to your provided ymls, it should only contain:
- user1
- user2
- members of group1
- members of group2
- members of ldapgroup
- members of ldapgroup1
- members of ldapgroup2
Then you can remove the individuals who should not be there by using this api: https://apidocs.cloudfoundry.org/14.0.0/organizations/remove_user_from_the_organization_by_username.html
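As a rough sketch of how the expected list above follows from the ymls, the Python below collects every user and LDAP group member named under any role and subtracts that set from the users actually in the org. It is not how cf-mgmt itself computes this: the function name, the LDAP group membership mapping, and the usernames dana, erin and frank are hypothetical, and the configs are written as plain dictionaries instead of being parsed from the yml files.

```python
ROLE_KEYS = (
    "org-billingmanager", "org-manager", "org-auditor",
    "space-developer", "space-manager", "space-auditor",
)


def expected_users(configs, ldap_members):
    """Collect every user that any org or space config still grants a role to.

    configs: list of dicts mirroring the org/space yml files.
    ldap_members: hypothetical mapping of LDAP group name -> member usernames.
    """
    expected = set()
    for config in configs:
        for role in ROLE_KEYS:
            entry = config.get(role) or {}
            expected.update(entry.get("users") or [])
            expected.update(entry.get("ldap_users") or [])
            groups = list(entry.get("ldap_groups") or [])
            if entry.get("ldap_group"):
                groups.append(entry["ldap_group"])
            for group in groups:
                expected.update(ldap_members.get(group, []))
    return expected


# Trimmed-down versions of the two space configs from this thread.
space1 = {"space-developer": {"users": ["user1"], "ldap_groups": ["group1", "group2"]},
          "space-auditor": {"users": [], "ldap_groups": ["ldapgroup1"]}}
space2 = {"space-developer": {"users": ["user1", "user2"],
                              "ldap_groups": ["ldapgroup1", "ldapgroup2"]},
          "space-manager": {"users": [], "ldap_group": ""}}

ldap_members = {"group1": ["dana"], "ldapgroup1": ["erin"]}  # made-up membership
actual_org_users = {"user1", "user2", "dana", "erin", "frank"}

orphans = sorted(actual_org_users - expected_users([space1, space2], ldap_members))
print(orphans)  # ['frank']
```

Note that a user named in any single space config ends up in the expected set, so that user would not count as an orphan.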
Hope that helps!
I have two scenarios.
- The pipeline gave me an error and would not proceed.
- The pipeline finished running cleanup-org-users, but not all orphan users were cleaned up.
The first scenario was resolved by following your suggestion above and comparing the differences between the UI and the cf CLI output. The second scenario persists, but I found out it was probably my mistake: one of the spaces within that org contains the user I thought was an orphan, so that user was not cleaned up.
Ah okay, then it sounds like that might be the issue! Let me know if that doesn't resolve the problem for you.