ucoProject/UCO

Need ability to represent concept of a Selector

Background

UCO currently lacks a way to express a structured interrogatory mechanism (e.g., query, pattern, etc.) that reduces a set of data to a particular desired scope.
This ability is a critical requirement for multiple cyber application domains, including the security operations domain, where it is needed for representing signatures/rules; the cyber threat intelligence (CTI) domain, where it is needed for representing threat indicators; and the risk domain, where it is needed for representing specific data queries for the analytic evaluation of particular risk conditions. It is highly likely to be relevant to, and needed by, other application domains as well.

There is a need to be able to express relationships between such selector patterns and other CDO concepts (e.g., Observations, ObservableObjects, Alerts, Malware, Campaigns, RiskMeasures, RiskMeasurements, etc.).

A conceptual structure is needed that is flexible enough to allow subclassing for various particular pattern expression structures/syntaxes/forms.

Requirements

Requirement 1

Ability to name and textually describe a particular data selector pattern.

Requirement 2

Ability to express relationships (potentially temporally variant) between data selector patterns and other CDO concepts.

Requirement 3

Ability to explicitly express various data selector pattern syntaxes (SPARQL, SQL, YARA, GraphQL, STIX pattern, etc).

Requirement 4

Ability to explicitly express manual data selector patterns as general text, as a last resort, if no appropriate explicit data selector pattern syntax implementation is available.

Requirement 5

Ability to explicitly specify the type (syntax or structure) of a data selector pattern.

Requirement 6

Ability to specify a particular datasource targeted by a data selector pattern.

Requirement 7

Ability to express a particular execution vector (URL, command line, etc) for a data selector pattern.

Risk / Benefit analysis

Benefits

  • Ability to express specific data selector patterns to support cyber application domain needs (security operations, cyber threat intelligence, risk, etc.)
  • Provides general basis for a variety of current and potential future use cases involving data selector patterns.
  • Provides a flexible and extensible yet consistent approach.

Risks

None

Solution suggestion

  • add new selector namespace;
  • add new selector:Selector class as subclass of core:UcoObject;
  • add new vocabulary:SelectorTypeVocab vocabulary;
  • add new selector:SelectorPatternFacet class as abstract basis of subclass specialization for various data selector pattern syntaxes/structures/forms;
  • add new selector:GraphQLPatternFacet class as subclass of selector:SelectorPatternFacet;
  • add new selector:ManualPatternFacet class as subclass of selector:SelectorPatternFacet;
  • add new selector:SQLPatternFacet class as subclass of selector:SelectorPatternFacet;
  • add new selector:YARAPatternFacet class as subclass of selector:SelectorPatternFacet;
  • add new selector:SPARQLPatternFacet class as subclass of selector:SelectorPatternFacet;
  • add new selector:STIXPatternFacet class as subclass of selector:SelectorPatternFacet;
  • add new selector:RegexPatternFacet class as subclass of selector:SelectorPatternFacet;
  • add new selector:ByteArrayPatternFacet class as subclass of selector:SelectorPatternFacet;
  • add new selector:URLPatternFacet class as subclass of selector:SelectorPatternFacet;
  • add new selector:selectorType property;
  • add new selector:selectorTarget property;
  • add new selector:selectorExecutionVector property;
  • add new selector:patternString property as abstract basis of subproperty specialization for various data selector pattern serialized syntaxes;
  • add new selector:graphQLPattern property as subproperty of selector:patternString;
  • add new selector:manualPattern property as subproperty of selector:patternString;
  • add new selector:sqlPattern property as subproperty of selector:patternString;
  • add new selector:yaraPattern property as subproperty of selector:patternString;
  • add new selector:sparqlPattern property as subproperty of selector:patternString;
  • add new selector:stixPattern property as subproperty of selector:patternString;
  • add new selector:regexPattern property as subproperty of selector:patternString;
  • add new selector:byteArrayPattern property as subproperty of selector:patternString;
  • add new selector:urlPattern property as subproperty of selector:patternString;
  • add new associated property shapes on all new classes
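
As a minimal sketch of how the classes and properties listed above might fit together, the following Python builds a JSON-LD-shaped document for one Selector carrying one SQLPatternFacet. Several things here are assumptions, not part of the proposal: the selector namespace IRI, the kb: prefix, placing selectorType and selectorTarget on the Selector instead of the Facet, and using plain strings as their values. The function name make_sql_selector is hypothetical.

```python
import json
import uuid

# Assumed prefixes. The uco-core IRI is the published one; the selector
# namespace is only proposed, so its IRI here is a guess.
CONTEXT = {
    "kb": "http://example.org/kb/",
    "uco-core": "https://ontology.unifiedcyberontology.org/uco/core/",
    "selector": "https://ontology.unifiedcyberontology.org/uco/selector/",
}


def make_sql_selector(name, description, sql, target):
    """Return a JSON-LD-shaped dict: one Selector linked to one SQLPatternFacet."""
    selector_id = "kb:Selector-%s" % uuid.uuid4()
    facet_id = "kb:SQLPatternFacet-%s" % uuid.uuid4()
    selector = {
        "@id": selector_id,
        "@type": "selector:Selector",
        "uco-core:name": name,                # Requirement 1
        "uco-core:description": description,  # Requirement 1
        "selector:selectorType": "SQL",       # Requirement 5 (value form assumed)
        "selector:selectorTarget": target,    # Requirement 6 (value form assumed)
        "uco-core:hasFacet": [{"@id": facet_id}],
    }
    facet = {
        "@id": facet_id,
        "@type": "selector:SQLPatternFacet",
        "selector:sqlPattern": sql,           # Requirement 3
    }
    return {"@context": CONTEXT, "@graph": [selector, facet]}


doc = make_sql_selector(
    name="Open high-severity findings",
    description="Selects findings with severity above a threshold.",
    sql="SELECT id FROM findings WHERE severity > 7",
    target="findings.sqlite",
)
print(json.dumps(doc, indent=2))
```

Whether selectorType should take a vocabulary:SelectorTypeVocab member instead of a bare string is left open by this sketch.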

Solution discussion

The proposed solution provides a simple class for consistently, yet flexibly, representing and relating structured interrogatory mechanisms (e.g., query, pattern, etc.) to reduce a set of data to a particular desired scope.
The initial set of proposed pattern formats covers some obvious current choices but is not asserted to be complete. Other formats can be added going forward without breaking any existing structure.

This structure is necessary for the Risk application domain and is currently codified and operationally used (as proposed above) by the draft CDO-Risk application domain ontology (currently underpinning MITRE's Supply Chain Security System of Trust Framework).
The structure provides a mechanism to link and query particular data sources in order to evaluate particular risk conditions. It forms a basis for manual, semi-automated, and fully automated evaluation approaches.
The upper portion of the diagram below shows a simple overview of a portion of the CDO-Risk ontology, including use of the Selector class.
The lower portion of the diagram shows an example of how it is, and can be, used.

Selector CP diagrms-Risk - Selector drawio

This structure is necessary for the CTI application domain as it captures the core context of what a cyber threat indicator is identifying as relevant. In the CTI-CDO application domain ontology the pattern (Selector) will be separated from the relationship of what it indicates in order to support more flexible and powerful expression and analysis.
The below diagram provides a very simple example of how the proposed Selector class would be used to convey cyber threat indicators. In the diagram, objects in orange are class objects from the CTI-CDO application domain ontology.

Selector CP diagrms-CTI - Selectors drawio

This structure is necessary for the security operations domain as it provides a structured mechanism for expressing and relating the core concept of signatures/rules.
The below diagram provides a very simple example of how the proposed Selector class would be used within the security operations domain to convey the deployment of a particular signature to an IDS tool. The DeploySelector class object in the diagram would be a subclass of the DeploySelector (action subclass) class proposed in the Software CP.

Selector CP diagrms-SOC - Selectors drawio

Beyond its applicability across numerous application domains, it also provides significant value in clearly and consistently expressing and relating such patterns across multiple pattern formats, where particular contexts or tools may call for specific formats but the underlying pattern is the same.
The below diagram provides a simple example of a scenario where a STIX (non-CDO) Indicator is received for detecting a file with a particular hash.
To make use of this STIX Indicator within the CDO ecosystem, and for various contexts and purposes, the following could occur:

  • Capture the purely JSON-serialized STIX Indicator using a UCO ContentData object.
  • Utilize a fictional but completely possible STIX-CDO translation tool to convert the STIX serialization of the indicator to a CTI-CDO Indicator, by breaking the pattern out as a Selector and the indicated context out into an Indicator Relationship to a Malware object.
  • Perhaps it is desired to also express and potentially deploy this pattern in other formats such as YARA or SPARQL:
    • Utilize a fictional but completely possible Pattern Polyglot tool to convert between various pattern formats, in this case from STIX Pattern to YARA and to SPARQL, and capture the new pattern formats as separate Selectors.
    • Specify new CTI-CDO Indicators linking each of the new Selectors to the same Malware object that the original STIX Indicator asserted as indicating.
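
The outcome of the steps above can be sketched as data: three Selectors, one per pattern format, each related to the same Malware object. This is only an illustration under stated assumptions. The CTI-CDO Indicator relationship is not publicly specified here, so uco-core:Relationship with a hypothetical "Indicates" kind stands in for it. The node identifiers, the ex:sha256 property in the SPARQL text, and the placeholder hash (SHA-256 of empty input) are all invented for the example, and no actual translation between formats is performed.

```python
import hashlib

# Placeholder hash (SHA-256 of empty input); a real indicator would carry
# the hash of the malicious file.
FILE_HASH = hashlib.sha256(b"").hexdigest()
MALWARE_ID = "kb:Malware-example"  # hypothetical identifier

# (facet class, pattern property, pattern text) per format. Class and property
# names come from the proposal; the pattern bodies are illustrative only.
PATTERNS = {
    "STIX": ("selector:STIXPatternFacet", "selector:stixPattern",
             "[file:hashes.'SHA-256' = '%s']" % FILE_HASH),
    "YARA": ("selector:YARAPatternFacet", "selector:yaraPattern",
             ('import "hash"\nrule file_hash_match { condition: '
              'hash.sha256(0, filesize) == "%s" }') % FILE_HASH),
    "SPARQL": ("selector:SPARQLPatternFacet", "selector:sparqlPattern",
               'SELECT ?file WHERE { ?file ex:sha256 "%s" }' % FILE_HASH),
}


def build_graph(patterns, malware_id):
    """Return JSON-LD-shaped nodes: a Selector, Facet, and Relationship per format."""
    nodes = []
    for fmt, (facet_class, pattern_property, text) in patterns.items():
        selector_id = "kb:Selector-%s" % fmt
        facet_id = "kb:PatternFacet-%s" % fmt
        nodes.append({
            "@id": selector_id,
            "@type": "selector:Selector",
            "uco-core:hasFacet": [{"@id": facet_id}],
        })
        nodes.append({"@id": facet_id, "@type": facet_class, pattern_property: text})
        # Stand-in for the CTI-CDO Indicator relationship (assumption).
        nodes.append({
            "@id": "kb:Relationship-%s" % fmt,
            "@type": "uco-core:Relationship",
            "uco-core:source": {"@id": selector_id},
            "uco-core:target": {"@id": malware_id},
            "uco-core:kindOfRelationship": "Indicates",
            "uco-core:isDirectional": True,
        })
    return nodes


graph = build_graph(PATTERNS, MALWARE_ID)
```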

Selector CP diagrms-Selector translation drawio

Given the above example and diagram, which result in three differently formatted Selectors for the same underlying pattern, it would now be possible to deploy these Selectors to different tools across different application domains where such a pattern is relevant.
Within the CTI domain where the indicator originally came from we could express that the STIXPattern Selector is deployed to a CTI-relevant tool such as Trend Vision One from Trend Micro that consumes STIX pattern format.
Within the security operations domain we could express that the YARAPattern Selector is deployed to Suricata, a widely used security operations tool that consumes YARA pattern format.
Within a cross-domain context where CDO data from multiple application domains is aggregated into an overall CDO graph we could express that the SPARQLPattern Selector is deployed to a fictional but completely possible CDO Surveiller tool that runs queries against the graph in the background and alerts on any hits.

Selector CP diagrms-Selector Deployment drawio

The proposed Selector construct also makes pattern-based alerting much easier, clearer, more consistent and more flexible.
The below diagram provides a very simple example of expressing an Alert created by Suricata in a security operations domain context when the specified YARAPattern file hash Selector is matched. The Alert can provide details (as an Observation) of what the tool saw that it believes matches the pattern.

Selector CP diagrms-Selector triggered Alert drawio

If you look across the last three diagrams it is easy to see how the proposed Selector not only enables numerous tactical use cases but also supports the flow of data across application domains in a more strategic fashion.

Coordination

  • Tracking in Jira ticket OC-308
  • Administrative review completed, proposal announced to Ontology Committees (OCs) on TODO-date
  • Requirements to be discussed in OC meeting, date TBD
  • Requirements Review vote has not occurred
  • Requirements development phase completed.
  • Solution announced to OCs on TODO-date
  • Solutions Approval to be discussed in OC meeting, date TBD
  • Solutions Approval vote has not occurred
  • Solutions development phase completed.
  • Backwards-compatible implementation merged into develop for the next release
  • develop state with backwards-compatible implementation merged into develop-2.0.0
  • Backwards-incompatible implementation merged into develop-2.0.0 (or N/A)
  • Milestone linked
  • Documentation logged in pending release page
  • Prerelease publication: CASE develop branch updated to track UCO's updated develop branch
  • Prerelease publication: CASE develop-2.0.0 branch updated to track UCO's updated develop-2.0.0 branch

Thank you for starting this, @Bradichus.

Can you please relate this to observable:ObservablePattern?

@Bradichus, @sbarnum - I'd thought a SQL pattern would also be in this proposal. Could that be included, as it will be relevant for CASE applications that characterize SQL+SQLite resources?

Likewise, would YARA rules be in scope of this proposal?

This Issue is blocked by #562.

I am hesitant to bring this proposal forward for committee discussion without at least one JSON-LD- or Turtle-encoded example.

Parts of this proposal come close to existing concepts in UCO that have never received public demonstration, e.g., pattern:patternExpression in #562 that turned out to be incorrectly implemented, and nobody realized this because it was never tested.

I also see some potentially significant complexity in the proposed Facet and subproperty hierarchies. One risk we're going to run into is how SHACL behaves with subproperties, and implications this may have for end users. That is, UCO will have to consider whether it has ontology-wide requirements pertaining to end users and inferencing (/knowledge expansion, under RDFS entailment or OWL entailment).
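
To make the subproperty concern concrete, here is a small pure-Python sketch, with no RDF library involved. It assumes a data graph that asserts only selector:sqlPattern, and a consumer (for example, a SHACL property shape or a query) that looks for selector:patternString. Unless subproperty expansion (RDFS rule rdfs7) is applied first, the lookup on the superproperty finds nothing. The identifier kb:facet-1 and the helper names are hypothetical.

```python
# Subproperty declarations taken from the proposal's property list.
SUBPROPERTY_OF = {
    "selector:sqlPattern": "selector:patternString",
    "selector:yaraPattern": "selector:patternString",
}

# A data graph as a set of (subject, predicate, object) triples.
asserted = {("kb:facet-1", "selector:sqlPattern", "SELECT 1")}


def expand_subproperties(triples, subproperty_of):
    """Apply RDFS rule rdfs7: (s p o) and (p subPropertyOf q) entail (s q o)."""
    expanded = set(triples)
    for s, p, o in triples:
        seen = {p}
        q = subproperty_of.get(p)
        while q is not None and q not in seen:  # guard against cycles
            expanded.add((s, q, o))
            seen.add(q)
            q = subproperty_of.get(q)
    return expanded


def values(triples, subject, predicate):
    """Return the sorted objects of all triples matching subject and predicate."""
    return sorted(o for s, p, o in triples if s == subject and p == predicate)


without_inference = values(asserted, "kb:facet-1", "selector:patternString")
with_inference = values(
    expand_subproperties(asserted, SUBPROPERTY_OF),
    "kb:facet-1",
    "selector:patternString",
)
print(without_inference)  # []
print(with_inference)     # ['SELECT 1']
```

The gap between the two results is the decision point raised above: either users are required to run knowledge expansion before validation, or shapes must be written against each subproperty directly.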

Last, I see there is a Facet class hierarchy proposed, but no subclasses of the new selector:Selector. The proposal needs to include whether this is permitted, or an error, and why:

kb:Selector-88f86a50-dc38-48b2-a228-186aa8e74523
  a selector:Selector ;
  uco-core:hasFacet
    kb:ByteArrayPatternFacet-31de240e-ed2d-47e5-a425-34cb842850e0 ,
    kb:SQLPatternFacet-9992fe57-c822-4bff-bb16-581477784733
    .

kb:ByteArrayPatternFacet-31de240e-ed2d-47e5-a425-34cb842850e0
  a selector:ByteArrayPatternFacet ;
  # ...
  .

kb:SQLPatternFacet-9992fe57-c822-4bff-bb16-581477784733
  a selector:SQLPatternFacet ;
  # ...
  .

@sbarnum or @Bradichus: Please take any of the graphic illustrations and render it as JSON-LD or Turtle. Please cover two query forms that could be suggestive of different literal-data types: byte array patterns, and SQL or SPARQL. Maybe try finding a Unicode string of interest, such as the infinity character (∞, U+221E), in a string-valued field of a SQLite database using a byte array pattern and a (UTF-8 aware) SQLite pattern?
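
A rough sketch of the two search forms suggested above, using only Python's standard sqlite3 module. The table name, column names, and sample rows are invented for the demonstration. It assumes the database uses SQLite's default UTF-8 text encoding; in a UTF-16 database the byte pattern for U+221E would differ, which is one reason the literal data type of a byte array pattern matters.

```python
import sqlite3

INFINITY = "\u221e"                      # the infinity character
BYTE_PATTERN = INFINITY.encode("utf-8")  # three bytes: E2 88 9E

# Hypothetical table and rows, held in an in-memory database.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE notes (id INTEGER PRIMARY KEY, body TEXT)")
conn.executemany(
    "INSERT INTO notes (id, body) VALUES (?, ?)",
    [(1, "to infinity"), (2, "limit \u2192 \u221e"), (3, "8 on its side")],
)

# Form 1: a string-level SQL pattern; SQLite compares decoded text.
sql_hits = [
    row[0]
    for row in conn.execute(
        "SELECT id FROM notes WHERE instr(body, ?) > 0 ORDER BY id", (INFINITY,)
    )
]

# Form 2: a byte array pattern matched against the stored bytes of the field.
byte_hits = [
    row_id
    for row_id, raw in conn.execute(
        "SELECT id, CAST(body AS BLOB) FROM notes ORDER BY id"
    )
    if BYTE_PATTERN in raw
]

print(sql_hits, byte_hits)
conn.close()
```

Both forms select the same row here, but they carry different literal types (a text string versus a byte sequence), which an encoded Selector example would need to distinguish.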

If this demonstration comes by COB March 4th, I think we will be informed enough to hold a Requirements Review vote in the March 14th meeting.

Issue 562 has been scheduled for a Solutions Approval vote, which should remove a blocker on this proposal (550).

@sbarnum and @Bradichus: My position on this proposal stands, that it needs an encoded demonstration graph (even a small one) before coming for a Requirements Review vote. @sbarnum, thank you for the illustrated figures, but they do not appear to me to be sufficient information to write unit tests. For instance, the last figure (T-shaped, gray Action at top, blue File at bottom) doesn't make clear what the input and output of the Observation are. Prior discussion around Observation has met similar confusion on its inputs vs. outputs. (I have an opinion influenced by the similarly-named class sosa:Observation, but I would prefer to see your perspective, or further complete the Qualities proposal, before noting my view here.)