amundsen-io/amundsen

Badges having same key but being applied to different node types (table vs column) overriding each other when publishing

mikaalanwar opened this issue · 5 comments

Current Behaviour

When two badges have the same "key" but a different "category", they overwrite each other at publish time, and whichever is published last wins. For example, I have a table badge with key "PII" and a column badge with key "PII". If I do something like the following, it updates the category of a single badge instead of recognising that these are two different badges (meant for different entities).

# Table-level badge: category 'compliance', name 'PII'
BadgeMetadata(
    start_label=TableMetadata.TABLE_NODE_LABEL,
    start_key=TableMetadata.TABLE_KEY_FORMAT.format(
        db=table.ref.database,
        cluster=table.ref.cluster,
        schema=table.ref.schema,
        tbl=table.ref.name
    ),
    badges=[Badge(
        category='compliance',
        name='PII'
    )]
)

.
.
.

# Column-level badge: same name 'PII', but category 'column'
BadgeMetadata(
    start_label=ColumnMetadata.COLUMN_NODE_LABEL,
    start_key=ColumnMetadata.COLUMN_KEY_FORMAT.format(
        db=column.table_ref.database,
        cluster=column.table_ref.cluster,
        schema=column.table_ref.schema,
        tbl=column.table_ref.name,
        col=column.name
    ),
    badges=[Badge(
        category='column',
        name='PII'
    )]
)

Expected Behaviour

Ideally, the uniqueness check for badges should also consider the category, or the type of node/entity (column vs. table) the badge is applied to. Otherwise, column badges can override table badges and vice versa.
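To illustrate the idea of including the category in the uniqueness check, here is a minimal sketch. It is not Amundsen code: `badge_key` and `publish_with_category` are hypothetical helpers, and a plain dict stands in for the graph store. It only shows that a key built from both category and name keeps the two "PII" badges apart.

```python
from typing import Dict, Iterable, Tuple


def badge_key(name: str, category: str) -> str:
    """Hypothetical composite key: category and name together identify a badge."""
    return f"{category}/{name}"


def publish_with_category(
    store: Dict[str, Dict[str, str]],
    badges: Iterable[Tuple[str, str]],
) -> None:
    """Upsert (name, category) pairs into `store` under the composite key."""
    for name, category in badges:
        store[badge_key(name, category)] = {"name": name, "category": category}


store: Dict[str, Dict[str, str]] = {}
publish_with_category(store, [("PII", "compliance"), ("PII", "column")])

# Publishing the same badges again is still idempotent.
publish_with_category(store, [("PII", "column"), ("PII", "compliance")])

print(sorted(store))  # ['column/PII', 'compliance/PII']
```

With this kind of key, publish order no longer matters: each (category, name) pair maps to its own entry.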

Possible Solution

The workaround I am currently considering is to give the table badge and the column badge different, unique names, but that doesn't seem ideal.

Steps to Reproduce

  1. Try creating badges using the pseudo-code above.
  2. The order of creation determines which category gets applied.
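The order dependence can be reproduced without a running Amundsen instance. The sketch below is a toy model, assuming the publisher upserts badge nodes by name alone; the `Badge` dataclass and `publish` function here are stand-ins written for this example, not the databuilder classes.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass(frozen=True)
class Badge:
    """Stand-in for a badge; not the databuilder class."""
    name: str
    category: str


def publish(store: Dict[str, Dict[str, str]], badges: List[Badge]) -> None:
    """Upsert each badge into `store`, keyed by name only (assumed behaviour)."""
    for badge in badges:
        node = store.setdefault(badge.name, {})
        node["category"] = badge.category  # a later publish overwrites this


table_badge = Badge(name="PII", category="compliance")
column_badge = Badge(name="PII", category="column")

table_then_column: Dict[str, Dict[str, str]] = {}
publish(table_then_column, [table_badge, column_badge])

column_then_table: Dict[str, Dict[str, str]] = {}
publish(column_then_table, [column_badge, table_badge])

print(table_then_column)  # {'PII': {'category': 'column'}}
print(column_then_table)  # {'PII': {'category': 'compliance'}}
```

In both runs only one "PII" entry exists, and its category is whatever was published last.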

Context

Yes, we have a scenario where a table badge and a column badge have the same name. A change in the order of badge creation (during some refactoring) caused the relevant "table" badge to disappear from the home page, because the category of that badge was changed from "compliance" to "column" (and we don't display column badges on our home screen).

Your Environment

This was the intended behavior of badges, similar to how two tables named exactly the same thing would be considered the same table. You can try naming the table and column badges slightly differently in the database; for display purposes, you can configure the badges to show any string, even if it's the same.
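A rough sketch of that suggestion, with hypothetical names throughout: `stored_badge_name` derives a distinct stored name per node type, and `DISPLAY_NAMES` maps each stored name back to the shared label. In a real deployment the display mapping would live wherever badge display is configured; the Python dict here only illustrates the idea.

```python
from typing import Dict


def stored_badge_name(label: str, node_type: str) -> str:
    """Hypothetical naming scheme: suffix the badge name with its node type."""
    return f"{label.lower()}_{node_type}"


# Hypothetical display mapping: distinct stored names, same visible label.
DISPLAY_NAMES: Dict[str, str] = {
    stored_badge_name("PII", "table"): "PII",
    stored_badge_name("PII", "column"): "PII",
}


def display_name(stored_name: str) -> str:
    """Return the configured label, falling back to the stored name."""
    return DISPLAY_NAMES.get(stored_name, stored_name)


table_name = stored_badge_name("PII", "table")
column_name = stored_badge_name("PII", "column")

print(table_name, column_name)  # pii_table pii_column
print(display_name(table_name), display_name(column_name))  # PII PII
```

The stored names no longer collide, while users still see "PII" on both tables and columns.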

@allisonsuarez In the badge case they're not the "same", as a column badge and a table badge are categorically different. Our initial thought was that the "category" should be included in the de-dup, and that this should be enough to differentiate the badges. The same reasoning applies to tables that have the same name but live in different storage layers (e.g. Redshift or BigQuery).

stale commented

This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs.

stale commented

This issue has been automatically closed for inactivity. If you still wish to make these changes, please open a new pull request or reopen this one.

stale commented

This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs.