odpi/egeria

[Enhancement] Support the capabilities of Apache Atlas classifications

mandy-chessell opened this issue · 3 comments

Existing/related issue?

No response

Please describe the new behavior that will improve Egeria

Apache Atlas supports a tagging system called Classifications. They are easy for end users to create and assign to glossary terms and entities. Classifications can have attributes and can be organized into a hierarchy to allow inheritance of attributes. These classifications are used to organize the assets in Apache Atlas and are also used in governance - for example, they are synchronized with Apache Ranger for rule-based security.

These classifications are not the same as the Classification Definitions (ClassificationDefs) supported by Egeria's type system - which have a more formal lifecycle and form part of the open metadata language.

Today, Apache Atlas's classifications match Egeria's InformalTags in intent.

Alternatives

Alternatives include:

  1. Use InformalTags whilst ignoring the attributes and inheritance values when synchronizing metadata between Egeria and Atlas - this means that Atlas classifications cannot be prepared in open metadata and pushed to Atlas. As such it is not a good choice and has been discarded.
  2. Extend Egeria's informal tags to match the capabilities of Apache Atlas's classifications.
    • Add support for attributes
    • Add an inheritance relationship between informal tags
      These extensions to the Informal Tags will allow the exchange of Tags/classifications between Egeria and Atlas.
  3. Create a new entity type and associated relationships to represent Atlas's Classification (such as ClassificationTag entity, ClassificationTagged relationship and ClassificationTagHierarchy).
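
To make the second option more concrete, here is a minimal Java sketch of a tag that carries attributes and inherits them from a parent tag. It is only an illustration: the class `InformalTagSketch` and its methods are hypothetical and are not part of Egeria or Apache Atlas, and it assumes a single parent per tag and plain string attribute values, which may be simpler than what either system needs.

```java
import java.util.LinkedHashMap;
import java.util.Map;

/**
 * Hypothetical model of an informal tag extended with attributes and an
 * inheritance link to a parent tag. Not an Egeria or Atlas type.
 */
class InformalTagSketch {
    private final String name;
    private final InformalTagSketch parent; // null for a root tag (single parent is an assumption)
    private final Map<String, String> attributes = new LinkedHashMap<>();

    InformalTagSketch(String name, InformalTagSketch parent) {
        this.name = name;
        this.parent = parent;
    }

    /** Sets an attribute directly on this tag and returns the tag for chaining. */
    InformalTagSketch with(String key, String value) {
        attributes.put(key, value);
        return this;
    }

    String getName() {
        return name;
    }

    /**
     * Returns the attributes visible on this tag: those inherited from the
     * parent chain, overridden by any attribute set on the tag itself.
     */
    Map<String, String> effectiveAttributes() {
        Map<String, String> result =
                (parent == null) ? new LinkedHashMap<>() : parent.effectiveAttributes();
        result.putAll(attributes);
        return result;
    }

    public static void main(String[] args) {
        InformalTagSketch pii = new InformalTagSketch("PII", null)
                .with("retentionDays", "365")
                .with("owner", "privacy-office");
        InformalTagSketch email = new InformalTagSketch("EmailAddress", pii)
                .with("retentionDays", "90");

        // EmailAddress keeps the inherited owner and overrides retentionDays.
        System.out.println(email.getName() + " -> " + email.effectiveAttributes());
    }
}
```

A sketch of the third option would hold much the same data, but in a separate entity type with its own relationships, so that it could be governed separately from ordinary informal tags.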

The second option is the simplest and enables existing support for informal tags to be used for maintenance. The downside is that InformalTags can typically be updated by any metadata consumer. Atlas classifications are used for controlling access in Apache Atlas and so an Egeria end user may inadvertently create a security breach.

The third option seems the safest, but is more work.

The choice is important because the type of tagging system used by Atlas is also seen in many data catalogs such as DataHub. Therefore a decision here will also affect other integrations.

Any Further Information?

No response

Would you be prepared to be assigned this issue to work on?

YES

This issue has been automatically marked as stale because it has not had recent activity. It will be closed in 20 days if no further activity occurs. Thank you for your contributions.
