Manage users access and group membership using configuration
thomas-vogels opened this issue · 0 comments
Is your feature request related to a problem?
This proposal addresses multiple issues commonly faced in a data warehouse by extending the current approach in Arthur:
- Needing a "system" user, which a BI tool uses
- Adding users from the Data or Analytics team with wider permissions
- Grouping users so that permissions can be based on groups
- Adding users and removing users from groups, thus changing their permissions
- Ensuring that users lose permissions (and access) when leaving the team.
(The first already exists. The rest is limited or doesn't yet exist.)
Instead of taking a manual approach of:
- Using a SQL client like
psql
to run commands (which wouldn't be recorded) - Using
arthur.py update_user
to add a user to a group
this suggests to instead create a configuration section
- with users
- and their groups
So that users, groups and memberships can be configured and automatically managed.
This means that we have one place that describes who has access to which schemas (via groups). And we know that that's reflected in the state of the database as well. We make onboarding and offboarding faster and easier, thus ensuring that we're using the right tools consistently and timely.
How this plays out:
- If a new group is mentioned (as user's group, as a reader or as a writer group), that group is created.
- If a new user is mentioned, that user is created.
- The user's groups in the database are compared with the groups in the configuration file.
- The user is removed from any group not listed in its configuration.
- The user is added to any group, where they are not already a member.
- A user who is no longer listed in the configuration loses access to the database. (Removing the user would require dropping any objects so managing owned objects is a follow-up feature.)
Describe the solution you'd like
Here's an example:
Example Step 1:
{
"data_warehouse": {
"users": [
{
"name": "default",
"group": "analyst"
},
{
"name": "alice",
"groups": ["analyst", "etl_ro"]
}
]
}
}
- This creates groups
analyst
andetl_ro
. - This creates user
alice
. - This adds
alice
toanalyst
andetl_ro
.
Example Step 2:
{
"data_warehouse": {
"users": [
{
"name": "default",
"group": "analyst"
},
{
"name": "alice",
"groups": ["etl_ro"]
},
{
"name": "bob",
"groups": ["analyst"]
}
]
}
}
- This creates the new user
bob
. - This removes
alice
from theanalyst
group. - This adds
bob
to theanalyst
group.
Example Step 3:
{
"data_warehouse": {
"users": [
{
"name": "default",
"group": "analyst"
},
{
"name": "bob",
"groups": ["etl_ro", "daas_ro"]
},
{
"name": "carol",
"groups": ["analyst", "daas_ro"]
}
]
}
}
- This removes the user
alice
(in this step, just by usingPASSWORD DISABLE
) - This adds the user `carol.
- This adds the group
daas_ro
. - This removes
bob
fromanalyst
and adds him toetl_ro
anddaas_ro
. - This adds
carol
toanalyst
anddaas_ro
.
Describe alternatives you've considered
- Still looking for a tool that would do that for us.
- Regarding database access, we should be leveraging IAM.
Also, out of scope of the configuration: managing a user's credentials by updating their passwords. (For now, continue to use arthur.py update_user
.
Additional context
We have sometimes schemas that aren't managed by Arthur or external. For those schemas, we have to decide whether we want to have Arthur manage permissions as well.