Introduce a status field on relations and entities
Opened this issue · 20 comments
The status can either be "draft" or "published", and unregistered users - and users with "read" permission only - may only be shown "published" entities and relations.
Just looking e.g. at Entity. The pseudo-fundamental 'name' field is actually a property (properties) specially named 'name'. I take this to mean that it is stored in a properties table rather than directly in an entities table (?). This sets a sort of precedence in which it is not clear to me if 'status' should be a field on the model called status, or like 'name' should be a specially named property. Which is preferred?
Hi Scott, great to have you on board! You're making a very good point and I'm not entirely sure what the best options is. The argument in favour of a first-class attribute on the entity table is that we don't need versioning or sourcing for the status, at least as far as I can imagine the use case going. The pseudo-fundamental (nice term) nature of name
has also not made it entirely easy to handle, so I'm not to keen to have another one of those.
On the other hand, a property would make this pluggable, so that instances where a status is not required could simply remove it and thus simplify the system.
My sense is that a table column may be easier, but I'm not strongly biased either way.
My sense is that a table column may be easier, but I'm not strongly biased either way.
My vote is for making it a table column. It's definitely a meta data attribute, that is only used for permission checking. It'll make queries a lot easier to read.
Thanks Friedrich, I'm looking forward to working on this project.
I just discussed this with Manuel, as well as a follow-up issue on how to deal with the permissions. Regarding the attribute (model-field) vs. related-property approach -- we are both leaning toward the opinion that this is a type of metadata that deserves to go directly on the model, which accords with your "not keen to have another" name-like property. So I will add it as a field directly on the model.
Regarding permission handling. Manuel and I think that treating draft/published as a recorded permission (ie. object-level role-based boolean permission in the grano_permission table) could be problematic as there is not really a built in construct which supports "I can't see this because I am a reader AND because it is in draft" -- rather only: "I can't see this because I am a reader". This is to say, we have a situation here where the permission depends on the state of the object - not on the explicitly assigned permissions -- and that state-based permission may come in conflict with the otherwise assigned permission.
For this reason, we are thinking that this is a special case scenario whereby if the object status is draft, then a stricter permission set will be applied -- overriding the permission settings for the object.
The thing that is not clear to me is where to best handle this special casing. From the looks of the way Permission.by_project_and_id is being used, I take it that the permission properties as stored in the db are being used directly. Do I modify the return values of by_project_and_id, to_dict_index, and to_dict based on the status of the object? One reason this does not feel right to me is that these methods really have no knowledge of the object state, so I feel like there is some other place I am not seeing where this responsibility would fit better.
ok, looking at this a bit more, I am confused. Are there in fact Entity level permissions - or is it the case that there are only project level permissions. Looking at things like the permissions_api and the way authz works, I am starting to think it is just the latter. Is this right?
Are there in fact Entity level permissions - or is it the case that there are only project level permissions. [...] I am starting to think it is just the latter. Is this right?
Yep, that's what I tried to explain :) The grano_permission
table only points to an Account
and a Project
, so permissions apply only to a user and a project. That is, project level.
Ha, ok. So I must have seemed kind of stubborn. Everything I stated above doesn't make sense then. I guess then I don't really understand how we might affect accessibility of Entities and Relationships if permissions are project specific.
Ok, that was confusing - sorry also for the lag in responding. I'm not sure project-level permissions are going to be enough in the long run, but it seemed less complex than proper ACLs for now.
As for how to affect the accessibility of entities/relations based on their state, I think we can go with a simple query parameter and a minimum publication threshold specified in the overall project settings (i.e if you modelled the status as an int, you could by default filter for things that are bigger than project.publication_threshold)?
sorry, confusion was my fault. I thought I understood something - Manuel was trying to talk me out of it, but my assumption bias skewed my understanding. I need to look into this a bit more. Will report back.
Awesome, let me know how you're doing :) Once you've acquainted yourself with the codebase a bit, I'd also like to email you some further info about where we're headed with this to make sure that you find a piece of the puzzle that interests you and have a good sense of how that relates to actual users.
This is what I now understand:
- Permissions are by project/account (ie. user) and expose read/edit/admin permissions for that user on that project.
- Permissions are vetted (typically in views) by calls to utilities exposed in the authz module. These are all project-level permissions
- authz functions take a project parameter. Examples in the entity views demonstrate that entity permissions are really the project level permissions -- where the project is the project associated with the entity, and the account is the session account from Flask's request.
All this said, I don't think this directly gives us a means for limiting access to entities by status. What I propose is to add new functions to authz, e.g.: entity_read, entity_edit, entity_manage, entity_delete. Each would take an entity, rather than a project, and the resulting boolean return value would be determined based on entity status and the appropriate project permission. E.g.:
def entity_read(entity):
"""For a published entity, user access is determined by project level access. For a draft entity,
only users with edit or admin permissions can access the entity."""
if entity.status == 'published':
return project_read(entity.project)
elif entity.status == 'draft':
return project_edit(entity.project) or project_manage(entity.project)
assert False # supporting only published, draft
This approach would require the appropriate view changes (ie. calls to authz entity_ functions instead of project_ functions). I have not looked at relations -- for the moment I am assuming that something similar would work there also.
I agree on the new authz functions, those seem useful. At the same time, I haven't really managed to think up a generic way of doing access control on collection queries, which is why those handle it explicitly.
When implementing the status field, it may make sense to define the field as an integer, which would allow you to write queries of the type: "give me all entities whose status is bigger than or equal to DRAFT". The default for that could be set per project, i.e. "give me all entities whose status is bigger or equal to the value given in the min_public_status field of the assiciated project".
I can do it this way, but I'm not sure I understand the motivation here. It seems like an arbitrary abstraction for some use case we don't yet have in mind. Maybe you have something in mind.
I am wary of a thresholded approach in general though. The way I've usually seen access control done is much more explicit than this where permission types (e.g. read, write) are arbitrary but explicit, and are determined according to a principal, role, or role predicate. This allows permission to be affected very explicitly on a case-by-case basis if needed. Thresholding by integers strikes me as abstract in a way that might make both reading and writing permissions and permissions structures/hierarchies difficult.
I am also now shying a bit away from expanding authz. This does not seem scalable to me. If we think that this new permission type (read permission affected by draft/publish) is as far as we will go with object-based permissions, then maybe this is ok. But if we are going to start doing permissions for objects, I can't help but wonder if we shouldn't consider a standard ACL approach. Then, e.g. the Entity class would be responsible for its own permissions (Pyramid does this via an acl list on the class) and there would be a single has_permission() authorization function that takes the perm type and the object.
Pyramid's ACL approach does not (from what I remember) directly provide a mechanism for the state-based permissions that we need. However this https://github.com/mikeboers/Flask-ACL pyramid-like approach for Flask might provide a hint in the idea of predicate-based principal assignment (I can imaging the principal on a read permission ACE being dynamically generated based on the object status). It's not clear to me that this Flask extension even works, but it is pretty lean and might be a good starting point for a full ACL for us -- or maybe some stripped down variation on this idea.
So, I don't know if you want to do the ACL thing. If you'd prefer to continue to grow the authz file, we can do that. However, I'd kind of like to get a sense of the anticipated use case for the thresholding approach if this is the direction you want to go.
So, this is what we are looking at in authz for a thresholded approach. If this is the direction you want to go, I will need to add the integer status field to Entity and modify the views before submitting the pull request:
PUBLISHED_THRESHOLD = 5 # minimum status value for publication
def entity_create():
return logged_in()
def entity_read(entity):
if not entity.project.private and entity.status >= PUBLISHED_THRESHOLD:
return True
q = _find_permission(entity.project).filter_by(reader=True).first()
if q:
if entity.status >= PUBLISHED_THRESHOLD:
return True
else:
if q.editor or q.admin:
return True
return False
def entity_edit(entity):
return project_edit(entity.project)
def entity_manage(entity):
return project_manage(entity.project)
def entity_delete(entity):
return entity_manage(entity)
Note: I'm not sure if .first()
is the right thing to do here on the permissions query. Is it possible that there would be more than one Permission for a user/project? I don't see unique restrictions on the Permission model.
I ended up doing the authz functions and using a threshold as you suggested. Let me know when you get a chance to review the pull request. I want to hold off on further changes related to this issue until I know we are on the same page. Thanks!
I'm horribly sorry for the lag, I was at a workshop last week and didn't get to review it yet - but will do so now. Would you mind also sharing your email, I'd like to send you some links for discussion!
So, getting ready to jump into Relation permissions. Is the idea to give a status flag to the Relation model itself, or should the Relation somehow inherit permissions based on the status of the source and target Entities?
I didn't consider this initially - I thought that giving a flag to relations would be the only option. But now you've proposed it, I actually like the implied visibility thing a whole lot - it saves us implementation work while making the whole thing so much easier to use both from a developer and user perspective. So my vote would be clear: let's make it implicit, based off both ends of the relation!
FYI I'm working a bit on implementing facets support (#54), which affects the relevant code areas. I'm therefore checking in some raw and rough code on this branch so you track development: https://github.com/granoproject/grano/tree/facets