openshift/cloud-credential-operator

Operator Doesn't Delete User in AWS if User is Added to Group

mattpielm opened this issue · 8 comments

My IT department automatically adds every newly created IAM user to a group that denies a number of "unsafe" (and reasonably so) AWS permissions. When a user is created via a CredentialsRequest and the CredentialsRequest is later deleted, the delete operation hangs with:

    message: 'failed to deprovision resource: AWS Error: DeleteConflict: Cannot delete
      entity, must remove users from group first., status code: 409'
    reason: CloudCredDeprovisionFailure
    status: "True"
    type: CredentialsDeprovisionFailure

If I delete the user manually in AWS, the CredentialsRequest object is eventually deleted successfully.

How to reproduce (I think):

  • Create a CredentialsRequest in OpenShift
  • Use AWS tools to add the resulting IAM user to some AWS group
  • Delete the CredentialsRequest in OpenShift
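
The steps above could be scripted roughly as follows. This is an untested sketch rather than a verified reproduction: the `repro` function, the manifest path, the CredentialsRequest name, the namespace, and the group name are all placeholders, and it assumes the operator records the IAM user name in `.status.providerStatus.user` of the CredentialsRequest.

```shell
#!/usr/bin/env bash
# Sketch of the reproduction steps. Every name passed in is a placeholder.
# Assumption: the operator stores the IAM user name it created in
# .status.providerStatus.user of the CredentialsRequest.

repro() {
    local manifest="$1"
    local name="$2"
    local namespace="$3"
    local group="$4"
    local user

    # 1. Create the CredentialsRequest; the operator then creates the IAM user.
    oc apply -f "$manifest" || return 1

    # Look up the IAM user name (this may need a retry until the operator has reconciled).
    user=$(oc get credentialsrequest "$name" -n "$namespace" \
        -o jsonpath='{.status.providerStatus.user}') || return 1
    if [ -z "$user" ]; then
        echo "IAM user not provisioned yet" >&2
        return 1
    fi

    # 2. Add the user to a group behind the operator's back.
    aws iam add-user-to-group --user-name "$user" --group-name "$group" || return 1

    # 3. Delete the CredentialsRequest without blocking; deprovisioning is
    #    expected to fail with the DeleteConflict error.
    oc delete credentialsrequest "$name" -n "$namespace" --wait=false
}
```

A call might look like `repro my-cr.yaml my-cr openshift-cloud-credential-operator ForcedUserGroup`, after which the failure should show up in the conditions of the CredentialsRequest.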

Possible solutions:
I've run into this same issue with users created via Terraform. Terraform has a "force_delete" option that automatically removes the user from any groups. Also in Terraform, if you explicitly add the user to the forced AWS group during creation, Terraform knows about the group membership and removes the user from the group automatically. If the CredentialsRequest could specify "additionUserGroups" or something similar, or otherwise knew to remove the user from those groups before deleting it, the deletion should succeed.
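
As a rough illustration of the second idea (removing the user from whatever groups it is in before deleting it), the sequence of IAM calls could look like the sketch below. This is not the operator's code; `force_delete_user` is a hypothetical helper written against the AWS CLI. It only handles group membership, and `aws iam delete-user` can still fail if the user has access keys or policies left.

```shell
#!/usr/bin/env bash
# Hypothetical helper: remove an IAM user from every group it belongs to,
# then delete the user. Only group membership is handled here.

force_delete_user() {
    local user="$1"
    local group_names group

    # Ask IAM which groups the user is in; text output is whitespace-separated.
    group_names=$(aws iam list-groups-for-user --user-name "$user" \
        --query 'Groups[].GroupName' --output text) || return 1

    # IAM group names cannot contain whitespace, so word splitting is safe here.
    for group in $group_names; do
        echo "Removing $user from group $group"
        aws iam remove-user-from-group --user-name "$user" --group-name "$group" || return 1
    done

    aws iam delete-user --user-name "$user"
}
```

Unlike a fixed group list on the CredentialsRequest, this approach does not need to know in advance which groups an external process added the user to.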

More info:
I run into this during IPI cluster install when the process destroys the bootstrap resources; in that case I added force_delete to the installer Terraform templates so the install wouldn't fail. I also see this when destroying a cluster: openshift-install gets stuck on that same 409 error while deleting the users created via CredentialsRequests (for example, the ebs and machine-api users). I wrote this script to clear out the users and let the destroy process complete:

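# $env must be set to the cluster name prefix before running this.
# Find IAM users named <env>-<5 character id>-...
users=$(aws iam list-users | jq -r --arg prefix "$env" '.Users[] | select(.UserName | test("^" + $prefix + "-[a-z0-9]{5}-")) | .UserName')

if [ -n "$users" ]; then
    for user in $users; do
        echo "Removing $user..."
        # Leave the forced group first, otherwise delete-user returns the 409 DeleteConflict.
        aws iam remove-user-from-group --user-name "$user" --group-name ForcedUserGroup
        aws iam delete-user --user-name "$user"
    done
else
    echo "No users found"
fi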

Issues go stale after 90d of inactivity.

Mark the issue as fresh by commenting /remove-lifecycle stale.
Stale issues rot after an additional 30d of inactivity and eventually close.
Exclude this issue from closing by commenting /lifecycle frozen.

If this issue is safe to close now please do so with /close.

/lifecycle stale

/remove-lifecycle stale

@matt314 This might get more traction if you opened it as a card on our Jira board instead: https://issues.redhat.com/projects/HIVE

Stale issues rot after 30d of inactivity.

Mark the issue as fresh by commenting /remove-lifecycle rotten.
Rotten issues close after an additional 30d of inactivity.
Exclude this issue from closing by commenting /lifecycle frozen.

If this issue is safe to close now please do so with /close.

/lifecycle rotten
/remove-lifecycle stale

Rotten issues close after 30d of inactivity.

Reopen the issue by commenting /reopen.
Mark the issue as fresh by commenting /remove-lifecycle rotten.
Exclude this issue from closing again by commenting /lifecycle frozen.

/close

@openshift-bot: Closing this issue.
