cleanup_job condition is too restrictive

Question

cleanup_job condition is too restrictive

Closed this issue a year ago · 7 comments

Ref: https://github.com/lyft/cartography/blob/ab04b205ee942700e352e5c8911d8a596c8c38d2/cartography/graph/cleanupbuilder.py#L148

Description:
Cleanup job requires that 'TargetNodeMatcher PropertyRefs in the sub_resource_relationship must have set_in_kwargs=True.'.
This condition is very restrictive and block cleanup for simple intel modules.

Ex:

Use an APIKey on Gandi
Retrieve organization (and organization id)
Pulling DNS Zones (and try to set DNSZone a sub-resource of the organization)
Pulling DNS Records (and try to set Records a sub-resource of the DNSZone)
You will not be able to cleanup DNSZone and DNZRecords due to the restriction

To Reproduce:
You can test this issue on #1235

pytest -vvv tests/integration/cartography/intel/gandi

Logs:

tgm = TargetNodeMatcher(name=item.registered_domain)

    def _validate_target_node_matcher_for_cleanup_job(tgm: TargetNodeMatcher):
        """
        Raises ValueError if a single PropertyRef in the given TargetNodeMatcher does not have set_in_kwargs=True.
        Auto cleanups require the sub resource target node matcher to have set_in_kwargs=True because the GraphJob
        class injects the sub resource id via a query kwarg parameter. See GraphJob and GraphStatement classes.
        This is a private function meant only to be called when we clean up the sub resource relationship.
        """
        tgm_asdict: Dict[str, PropertyRef] = asdict(tgm)
    
        for key, prop_ref in tgm_asdict.items():
            if not prop_ref.set_in_kwargs:
>               raise ValueError(
                    f"TargetNodeMatcher PropertyRefs in the sub_resource_relationship must have set_in_kwargs=True. "
                    f"{key} has set_in_kwargs=False, please check by reviewing the full stack trace to know which object"
                    f"this message was raised from. Debug information: PropertyRef name = {prop_ref.name}.",
                )
E               ValueError: TargetNodeMatcher PropertyRefs in the sub_resource_relationship must have set_in_kwargs=True. name has set_in_kwargs=False, please check by reviewing the full stack trace to know which objectthis message was raised from. Debug information: PropertyRef name = registered_domain.

cartography/graph/cleanupbuilder.py:159: ValueError

Answer 1 · 2023-09-08T19:57:50.000Z

Here's why I restricted this.

In most cases, legacy Cartography modules have ingestion queries that use a keyword argument to identify the tenant. For example, here's AWS ECR identifying the owner account and here's GCP DNSZones identifying the owner project.

I wrote the querybuilder code as an abstraction around this. For example the parameters supplied to load() will look similar to how it did before where the tenant ID is specified as a kwarg:

https://github.com/lyft/cartography/blob/5c1aa119dad88bb2e934aee5bdcc77275f358ac7/cartography/intel/aws/emr.py#L67

and we tell the data model where to find that value here:

https://github.com/lyft/cartography/blob/5c1aa119dad88bb2e934aee5bdcc77275f358ac7/cartography/models/aws/emr.py#L48-L50

Theoretically, we could remove _validate_target_node_matcher_for_cleanup_job(), but I think having a specialized way to provide the tenant ID as a kwarg (as opposed to a key in the dict for other relationships) helps the module author be aware of tenant relationships. It's a very frustrating bug when cleanup jobs delete the assets in other tenants accidentally, so I'd like to keep this separated out.

I'm open to other thoughts and reconsidering but for the Gandi module and the Slack module, I don't think it's that much of a big lift to specify a kwarg to the tenant id in the call to load().

Answer 2 · 2023-09-08T20:02:01.000Z

Oh, one more follow up on the examples you provided (those are very good call-outs):

Pulling DNS Zones (and try to set DNSZone a sub-resource of the organization)
Pulling DNS Records (and try to set Records a sub-resource of the DNSZone)
You will not be able to cleanup DNSZone and DNZRecords due to the restriction

When the data model refactor is done, I intend for all assets to have a relationship back to the owning tenant. Then automatic cleanups will work.

Answer 3 · 2023-09-11T20:57:40.000Z

I'm catching myself going back and forth on whether this condition is too restrictive. @ramonpetgrave64 what do you think?

Answer 4 · 2023-09-11T21:22:13.000Z

@achantavy, it's primarily a matter of defining the edge resource rather than restricting it or not.

With the definition you provided (i.e., the edge resource is the link between the tenant and all the elements of that tenant) to represent a tenant, a zone within it, and a record, I need:

Tenant -> Zone (with tenant ID in the kwargs)
Tenant -> Record (with tenant ID in the kwargs)
Zone -> Record (which is another relationship)
Without the tenant-record link, there's no auto-clean.

With a broader definition (i.e., the edge resource is any composition link between an element and a set of elements that compose it), we can limit it to:

Tenant -> Zone
Zone -> Record
The schema is simpler and seems more natural to me.

However, having security that prevents a "Root node" like a tenant from being deleted seems like a good idea, but perhaps it needs to be implemented differently.

Answer 5 · 2023-09-11T21:30:01.000Z

I agree that having every item tie back to a tenant feels verbose, but I like it for consistency and keeping things organized especially in environments where there are multiple zones in multiple tenants with similar names.

Answer 6 · 2023-09-12T15:04:53.000Z

Also agree that the way we have it now requiring all nodes be directly attached to a Tenant node does unintuitive at first. But it lets us write the load and cleanup queries in a reliable way. Still, we'd appreciate help to make the query builders more flexible.

Side note: As is, I have doubts if it will work well when a resource exists in multiple tenants, like with AWS IAM's foreign principals.

Answer 7 · 2023-09-12T15:18:41.000Z

Side note: As is, I have doubts if it will work well when a resource exists in multiple tenants, like with AWS IAM's foreign principals.

I don't know of any examples of a resource belonging to multiple tenants. With AWS principals, a principal belongs to one tenant but can have trust relationships to another. Therefore there will still only be one tenant relationship.