usnistgov/OSCAL

OSCAL Implementation: Component Definition

akarmel opened this issue · 61 comments

User Story:

As an OSCAL component owner or provider, I am able to publish an OSCAL "Component Definition" (as per the diagram below) that associates a component in an OSCAL implementation with an OSCAL Profile, identifying the specific controls or parts of controls implemented by the component from the Profile. Mappings to multiple Profiles can be handled either through multiple mappings in a single "component definition" or through multiple "component definitions".

Goals:

  1. Per the following representative diagram describing the elements of the OSCAL implementation schema, develop an OSCAL implementation "component definition" that associates a component in an OSCAL implementation with an OSCAL Profile.
Defined by the system owner/administrator
+----------------------------------+
| System Specification             |    
| + -----------------------------+ |
| | Aggregated into Capabilities | |        
| | +-------------------------+  | |
| | | Component Specification |  | |
| | +------|------------------+  | |
| +--------|---------------------+ |
+----------|-----------------------+
           |
          \/
+----------------------+
| Component Definition |
+----------------------+
Provided by the component owner or provider

Dependencies:

None.

Acceptance Criteria

  1. OSCAL Implementation schema has been developed to support the OSCAL "Component Definition" associating a component in an OSCAL implementation with an OSCAL Profile.
  2. OSCAL Implementation schema has been sufficiently documented to describe the functions contained therein.
  3. OSCAL Implementation schema PR has been reviewed by the OSCAL Team and merged into the OSCAL GitHub repo.

6/28/2018 Status

Reviewed the user story to reflect the scenario discussed during the meeting.

@shawndwells: The meetings mentioned in our comments are NIST-internal meetings. Thank you for your interest in our project. We are very interested in collaborating, in different ways, with other parties. Please let me or @david-waltermire-nist know in which way you would like to support our effort, and we will integrate you.

July 18 Status Update (Sprint 12 Acceptance Meeting)

The issue was not assigned to any member of the team in Sprint 12, so we will push it to Sprint 13.

8/30/2018 Status Meeting

@wendellpiez, @anweiss, and @JJediny will continue refining the prototype model and generating data sets to test against this model.

My current approach has been to analyze Microsoft's plethora of Audit Reports -> https://servicetrust.microsoft.com/Documents/ComplianceReports. There's a lot of great data in there that spans FedRAMP, ISO, HIPAA, etc. I've been trying to identify the "component definition" equivalents in that data and correlating those equivalents to applicable elements of TOSCA that might be useful in our actual model for "component definition". Interestingly enough, what I'm finding is that most of their "component definitions" are pretty high-level; just a few paragraphs in many cases with a sprinkling of architecture diagrams. Given that, I think we need to be a bit more specific as to who our intended consumers/producers of implementation are. Ultimately, if "component definitions" are simply passed through to assessment and assessment results, then the "component definitions" themselves aren't nearly as meaningful to a risk professional as are other elements of "implementation" and "assessments/assessment results". However, producers of implementation (both the org itself and CSPs/ISVs) may feel it necessary to document everything under the sun for a specific component which could easily detract from the organization's ability to effectively assess risk. The "middle ground", so to speak, is going to be a challenge to figure out ... and we're just talking about the "component definition" in this case.

Sprint 13 Progress September 27 2018

Most work this Sprint on this Issue is represented in notes we have exchanged or not as the case may be. 😁

However we have made progress - so much that I am (now) thinking that the component definition (not the "declaration") might be potentially very small and simple, and most of the work will be done by modeling "capabilities" either/both within the component declaration (not "definition") and/or cross-linked with components.

I am thinking of a "capability map". Components could be either named in line or linked. This would give us a way to connect controls (being addressed/implemented) with the components (and settings) on which they depend. Indexing such a map would provide useful inputs for SSPs such as Control Summaries, tables of responsibilities/roles, configurations, etc.
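To make the indexing idea concrete, here is a minimal sketch, assuming a capability map is simply a list of entries that each name the controls addressed and the components depended on. The dictionary keys, capability names, and component ids are all hypothetical and are not taken from any OSCAL schema.

```python
from collections import defaultdict

# Hypothetical capability map: each capability names the controls it
# addresses and the components (by id) on which it depends.
capability_map = [
    {"capability": "remote-access", "controls": ["ac-17", "ia-2"], "components": ["ssh", "pam"]},
    {"capability": "session-limits", "controls": ["ac-10"], "components": ["ssh"]},
]


def index_by_control(capabilities):
    """Return {control id: sorted list of component ids it depends on}."""
    index = defaultdict(set)
    for cap in capabilities:
        for control in cap["controls"]:
            index[control].update(cap["components"])
    return {control: sorted(components) for control, components in index.items()}


control_index = index_by_control(capability_map)
```

An index like `control_index` is the kind of input that could feed an SSP control summary or a responsibilities table.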

09/27/2018

Continuing to prototype with the initial "component definition" mock-up that was developed a couple weeks ago. Added a few comments to that working doc.

9/26/2018

This work will continue in Sprint 14. See https://hackmd.io/x43sK8IvSyOhW3no7fdh2w?view for progress.

@anweiss Sorry about that, didn't mean to edit the HackMD doc.

Anyway, my comment was:

If something like YAML is proposed for the markup, there really should be a way to 'include' text as well as provide internal cross-references to other components. Otherwise, this is too unwieldy for average folks.

This is one of the main issues that I had with OpenControl. The ability to automatically cross-map inside the document and to have an 'include' structure simply wasn't there.

We fell back to ReStructuredText for our docs and I'm hoping that we don't have to do some hybrid munge with OSCAL in the end.

@trevor-vaughan no worries at all! the YAML is being used purely to provide a human-readable mock up of some potential data elements to be included in those "implementation" sub-layers. And totally agree in that being able to both cross-reference within the document and include text from external sources is definitely something that should be supported.

@anweiss I just realized that a concrete example might be useful.

This is what we currently have for one component of the stack (ssh) https://simp.readthedocs.io/en/master/security_mapping/components/ssh/ssh.html and, as you can see, it auto-links back to the referencing policy under each control.

My hope is to be able to link back not only the prose but the technical implementation in a seamless document but that's going to need to be driven by an ability to cobble everything together from disparate sources.

As far as I can tell, this is in line with the 800-18 approach, just incredibly tedious if you have to keep it in more than one location.

This is helpful, thanks! The linking you've described is likely going to be provided by the "component specification" sub-schema as described at a high-level in the notes.

10/4/2018

This issue is still in the experimentation phase (@anweiss). @wendellpiez has some ideas/concerns:

  1. can create a small component fast and start piloting
  2. XML or JSON might not be the interface of choice for the data that a component needs to capture. Can we use MD, YAML, etc.?
    @anweiss - we need more consensus on the structure first.
    @david-waltermire-nist - we have to start somewhere to address this problem. Tooling can address human's concerns regarding the format.
    We will discuss this issue outside of this meeting by using examples. @anweiss generated an example in #244, and the metaschema-driven approach is working.
    @anweiss will provide a data set by COB Friday 10/5/2018 that @wendellpiez can use to move forward with modeling the component.

Here's some initial mock data based on the "component definition" model from the HackMD notes: https://gist.github.com/anweiss/8afd321b6bf2a9d4e1679657a1b8f2fe ... CC @wendellpiez. I've only included one component and one control for brevity. Some thoughts from my prototyping thus far:

  • Both provisioning and validation mechanisms should include some sort of id prop so they can be cross-referenced
  • Given that SCAP is a likely option for provisioning and validation mechanisms, constructs that allow for external references should be considered (e.g. refs to data stream, OVAL, XCCDF, etc) CC @david-waltermire-nist ... probably best reserved for discussions related to SCAP 2.0
    • Other provisioning and validation mechanisms should also be allowed
  • for implemented parameters, some sort of choice or conditional value should be allowed (as highlighted by the example referencing AC-10 Param 2 which allows for a multi-value constraint)
  • I believe itemId was kind of a catch-all for parts, props, etc, but wasn't sure
  • value fields should allow for multiple data types as this would potentially allow for somewhat stricter data typing constraints

My repo now has a sample made out of Andrew's example, passed through a "refinement" filter to produce an XML equivalent:

https://github.com/wendellpiez/OSCAL/blob/feature-component-definition-issue216/content/component-sandbox/recast-component-sample.xml

The JSON source data is provided in the same folder, as is the XSLT filter.

#102 is related to this issue.

@anweiss please look at my branch: I'm looking forward to hearing what you think. I have not taken it past the bare sketch but the sample is XSD valid. (This will also be a good chance for us to shake everything down again.)

The "component" metaschema has no docs in it to speak of, just placeholder text, so the generated docs are sketchy.

https://github.com/wendellpiez/OSCAL/tree/feature-component-definition-issue216/content/component-sandbox

Along with the metaschema I can also write up a readme.md file. If you have any questions for that, feel free to post them here (or we can find another place).

cc @redhatrises @david-waltermire-nist @iMichaela @akarmel @brianrufgsa @PCrayton what is ted's github?

Regarding @anweiss concerns as expressed in #102 ... I wonder how much of the contents of this component model can be produced by system queries or transformations in a back end, vs having to be written always by hand ... another way to put the question is, whether there is any redundancy here we can collapse or factor out.

@wendellpiez ideally external tools or the systems themselves will be producing this data, rather than being handwritten. I'm already seeing some redundancy in the implementsProfiles element, namely when two or more profiles select the same control from the underlying catalog and make no modifications to it. This is even demonstrated by our simple example with parameter ac-10_prm_2. But I also think these scenarios are going to be difficult to collapse unless the profile(s) being referenced are properly processed.

Also, on a separate note, I think we should distinguish between values. It may be better to have both a defaultValue field and an assessedValue field. Otherwise, there are too many unknowns with regard to the origin of that data.

@anweiss fantastic, cool. Redundancy does have a sometimes-useful feature: it can provide for cross-checking. Conflicting claims ought to be readily discoverable, and as such may sometimes be "informative". In the context of a pipeline where tooling is available at both ends (which is to say, this stuff will not all always be hand-authored), redundancy troubles me less for this reason among others. But this is something to keep an eye on.

Could we flag value with a role or some such, so <value role="fallback"> ?
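As a rough illustration of the role-flag idea, the sketch below reads several `value` elements and keys them by a `role` attribute. The `param` wrapper and the role names `default` and `assessed` are assumptions made up for this example (loosely following the defaultValue/assessedValue suggestion above), not agreed syntax.

```python
import xml.etree.ElementTree as ET

# Hypothetical markup: one parameter carrying two values whose origin is
# distinguished by a role flag.
SAMPLE = """
<param id="ac-10_prm_2">
  <value role="default">3</value>
  <value role="assessed">5</value>
</param>
"""


def values_by_role(param_xml):
    """Return {role: text} for every <value> child; unflagged values get 'unspecified'."""
    root = ET.fromstring(param_xml.strip())
    return {v.get("role", "unspecified"): v.text for v in root.findall("value")}


roles = values_by_role(SAMPLE)
```

With the origin of each value explicit, a consumer can tell a vendor default apart from a value observed during assessment.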

Just wanted to clarify: in my head I think of a component as a Service, Software, or Hardware. Also, I think there should be a ComponentType concept so you can group components by type, for example Hardware/Software/Service; then under Hardware, you could have networking, computer, etc.

Also a Component can be a child of another component correct?

@wendellpiez good points. And yea, flagging with a role, of sorts, might work. I think this warrants further discussion.

@bsilberberg spot on! I think we're also looking to model non-technical elements as components as well (e.g. policies, process, procedure, etc), but haven't dug into this yet. CC @david-waltermire-nist. And the relationships key is what can be used to define component relationships, where the relationship type is simply derived from the TOSCA relationship types (see http://docs.oasis-open.org/tosca/TOSCA-Simple-Profile-YAML/v1.2/cs01/TOSCA-Simple-Profile-YAML-v1.2-cs01.pdf, Section 5.7). Today those values can be Root, DependsOn, HostedOn, ConnectsTo, AttachesTo and RoutesTo, but I would certainly think there's room for flexibility here.
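A minimal sketch of how the relationships key might be modeled, assuming relationships are stored as (source, target, type) triples. The enum values are the relationship type names listed above; the component ids and the `targets_of` helper are hypothetical.

```python
from dataclasses import dataclass
from enum import Enum


class RelationshipType(Enum):
    # Names as listed in the comment above.
    ROOT = "Root"
    DEPENDS_ON = "DependsOn"
    HOSTED_ON = "HostedOn"
    CONNECTS_TO = "ConnectsTo"
    ATTACHES_TO = "AttachesTo"
    ROUTES_TO = "RoutesTo"


@dataclass(frozen=True)
class Relationship:
    source: str  # id of the component that declares the relationship
    target: str  # id of the related component
    type: RelationshipType


# Hypothetical components, for illustration only.
relationships = [
    Relationship("web-app", "app-server", RelationshipType.HOSTED_ON),
    Relationship("web-app", "database", RelationshipType.CONNECTS_TO),
    Relationship("app-server", "vm-host", RelationshipType.HOSTED_ON),
]


def targets_of(component_id, rel_type, rels):
    """Return ids of components related to component_id by rel_type."""
    return [r.target for r in rels if r.source == component_id and r.type is rel_type]
```

Keeping the type as an enum leaves room to add non-technical relationship types later (for example, between a policy and a system) without changing the triple structure.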

also, any thoughts on a standard format for component definition IDs? maybe SCAP CPEs or some sort of reverse DNS notation for any technical components?

Sprint 15 Progress Report Oct 25 2018

As noted above, we have the beginnings of a straw man model, but have not yet managed to exercise or improve it. Also it is mainly undocumented, largely because to do so requires more domain expertise than I have.

To develop the model we require data or at least more mockups. (Ideally enough to start framing a conceptual demo.) To develop the docs we need more attention to the model.

Progress: 20% - suggest we discuss ways of producing mockups even before documentation is done.

11/08/2018

Waiting on @tedsteffan to provide the sample based on S3 bucket and on @brianrufgsa to create a Crypto Module component sample.

@brianrufgsa we should take a look at the NIST Automated Cryptographic Validation Testing (ACVT) project -> https://csrc.nist.gov/projects/automated-cryptographic-validation-testing and https://github.com/usnistgov/ACVP. More specifically, we should re-use whatever API format they decide to use to model a specific crypto module. At the moment, the ACVP ACV server API does not provide any endpoints for listing validated modules, but my guess is that it will at some point. CC @david-waltermire-nist

My assumption, though, is that we need to include at least the following data:

  • Certificate Number
  • Module Name
  • Standard
  • Status
  • Sunset Date
  • Validation Dates
  • Overall Level
  • Caveat
  • Security Level Exceptions
  • Module Type
  • Embodiment
  • Description
  • Tested Configuration(s)
  • FIPS Algorithms
  • Allowed Algorithms
  • Software Versions
  • Vendor Info and Contact
  • Related Files
  • Lab Info
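As a sketch of how the list above could be used while drafting a crypto module sample, the snippet below turns each item into a field name and reports which fields a draft record still lacks. The snake_case names and the placeholder record are assumptions for illustration; they do not reflect the ACVP API or any real validation certificate.

```python
# Field names derived from the list above (snake_case spelling is an assumption).
REQUIRED_FIELDS = [
    "certificate_number", "module_name", "standard", "status", "sunset_date",
    "validation_dates", "overall_level", "caveat", "security_level_exceptions",
    "module_type", "embodiment", "description", "tested_configurations",
    "fips_algorithms", "allowed_algorithms", "software_versions",
    "vendor_info", "related_files", "lab_info",
]


def missing_fields(record):
    """Return the required field names that are absent from a draft record."""
    return [name for name in REQUIRED_FIELDS if name not in record]


# Placeholder draft, not a real certificate.
draft = {
    "certificate_number": "0000",
    "module_name": "Example Module",
    "standard": "FIPS 140-2",
}

still_needed = missing_fields(draft)
```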

OSCAL FR-SSP (v2).pdf
Adding diagrams here based on a conversation with Dave and Wendell today. This is a conceptual view of how to adopt a component approach to building an SSP while also supporting the "flat file" (legacy) SSP approach that exists today. This allows a CSP to start with the current approach, then slowly transition to the component approach.

I'm still finalizing the mock-up, which supports bundling of a group of smaller components into a larger component. Capabilities are modeled using the same component tags, rather than being treated differently. This allows each organization and compliance regime to differentiate "component" vs. "capability" as they see fit rather than forcing one definition.

Adding a mock-up of a component OSCAL/XML suggestion based on my recommendation in the comment above.
This is the potential and provisioned information. I still need to model an actual deployment in the SSP mock-up.
component.txt

NOTE: GitHub won't let me attach XML files here, so converted the extension to TXT. Content is XML.

11/15/2018

@david-waltermire-nist will update this issue to use the work from #246. @tedsteffan can base his examples on the work completed by @brianrufgsa.

@brianrufgsa @david-waltermire-nist in the latest conceptual model presented by @brianrufgsa, are we still planning to use the original "component definition" and "component specification" nomenclature? should we re-work the original model diagram at the root of the issue to reflect @brianrufgsa's latest conceptual proposal?

@anweiss - I believe this was one of the discussion topics for Thursday. The intention was always to blend my efforts and yours, and that nomenclature may still be fully appropriate.

Updated diagram to reflect "Component Definition" and "Component Specification" nomenclature. Some minor clarification tweaks as well on page 2.
OSCAL FR-SSP (v3).pdf

@anweiss, @tedsteffan, and @redhatrises: The attached PDF outlines the high-level (proposed) XML tags for component definition. It deliberately lacks the detail. Can each of you either concur, or provide feedback/concerns with this high-level approach?
Once we reach agreement on this, we can focus more on the more detailed content.

If you can't give feedback by noon Thursday, please let me know, as Wendell and I will be working on issues related to this Thursday afternoon.

Thank you!

OSCAL Component Definition Anatomy (v1).pdf

@brianrufgsa I'm a +1 on this high-level model. My only feedback is to see whether or not we can come up with an alternative to the word "potential". It's a bit ambiguous. Since its aim is to provide information about component characteristics in the context of (or lack thereof) a system, maybe something to the effect of context="none" or system-context="none"; rather than the type attribute

@anweiss - I'm fine with trying to find a better word than "potential". I'd like to defer to @wendellpiez on moving away from @type in favor of @context or @system-context. (Wendell, do you see any reason to prefer one over the other?)

To be clear, only two of three types are represented in that diagram. "potential" and "provisioned". The third type is for use in the SSP and is "implemented".

My main goal is to keep the same tag syntax, but be able to easily differentiate those three use cases. I am open to any approach for accomplishing that, as long as it is intuitive and conforms with any existing XML conventions/etiquette. (We'll discuss JSON after we get the XML nailed down.)

To use your nomenclature, we could use something like the following for all three:

  • type = "definition"
  • type = "provisioned"
  • type = "specification"

Or we could use something like:

  • context = "none"
  • context = "provisioned"
  • context = "implemented"

untitled
@brianrufgsa, @wendellpiez, and @anweiss: In the Crypto Module validation world against FIPS 140 (and in many similar programs), there is 'implementation guidance' that describes all the solutions (implementations) that would meet a particular requirement (a shall statement). To me, the section currently labeled 'Implementation Options' (see above) is more like 'Configuration Options', while the "Control Information (Potential)" section contains the Implementation Options or Guidance on how to properly implement the component.

The 'Provisioning' part describes 'options', but I am concerned that 'options' could be perceived as exclusive (it is Option 1 or Option 2, not both). To me they are rather '[Provisioned] control instances' (as in class instances) of the same control 'class' that exist only in the context of the current system implementation (see OO programming).

@iMichaela re. your comments on "provisioning", I would actually look at it in terms of "composition" rather than "inheritance". The "configuration options" are composed of "controls" that they can implement. This potentially allows for greater re-use of specific "configuration options" across "controls" in different domains and contexts (e.g. FISMA, PCI, etc); especially if the "configuration option" is exactly the same for 800-53 Control X and PCI DSS Control Y. In my mind, it's analogous to the polymorphism provided by composition over inheritance in OO programming.

Ultimately, as a vendor, I'd like to be able to re-use my configuration and attestation narratives as much as possible across different standards and the model should allow for that.

@anweiss I agree with the 'provision option' analogy to the polymorphism provided by composition. But in this case, the 'Control information (potential)' pulled up front, outside the provisioning options, does not make sense because each 'Provisioning option' might be composed of different control instances or sub-control instances. To me, a Component could have multiple sets of control-instances or subcontrol-instances, with each set offering, what you labeled above, a 'configuration option'. I am looking at the 'composition' at the control level to provide a secure configuration of the Component. Each set would provide a secure configuration that meets particular criteria. In some isolated cases we might be able to have the same 'configuration option' for an 800-53 Control X and a PCI DSS Control Y, but more often we will find that a component implementing set_1 composed of 800-53 control-instances will satisfy FISMA requirements configuring & securing the Component almost similar to set_2 of PCI DSS control-instances that are also configuring & securing the same Component to meet the financial industry requirements.

This is long, and probably better as an in-person discussion.
@iMichaela and @anweiss I feel like we are using the same words, but assigning different meaning to them.

First, (forgetting components for a moment) the way NIST 800-53 defines "controls", they are equivalent to system functional requirement statements that all happen to be related to security. They are requirements that must be satisfied by the system.

When a vendor produces a "component definition", it is the universe of every security-relevant configuration option (all possible settings and any related instructions). It is also a list of every possible security control (requirement) that MIGHT be satisfied by the product or service. The vendor can then offer a description for each of those controls as to how their product might satisfy the control (requirement). Those descriptions can later be adopted/tailored in an SSP, but that is closer to the end of the process.

A single product/component can have more than one set of provisioning content. (I really want to use the word "profile" here, but we already use that to mean something else in OSCAL.)
One provisioning option might be a CIS Benchmark. Another might be a DISA STIG. Both for the same product.

The CIS Benchmark would draw from the component definition above and list only the security settings relevant to satisfying that benchmark. This must definitionally be a subset of the configuration choices in the component definition above.
Likewise, this CIS Benchmark provisioning content would list the controls (requirements) that are actually satisfied IF the provisioning standard is followed, along with a description of how each of these controls (requirements) is satisfied as a result of deploying consistent with the CIS Benchmark. Again, these controls must definitionally be a subset of all the controls listed in the component definition.
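The subset rule described here can be checked mechanically. The following is a minimal sketch, assuming a component definition and a piece of provisioning content are each reduced to a set of setting names and a set of control ids; the setting names and the dictionary layout are hypothetical.

```python
# Hypothetical component definition: every security-relevant setting and
# every control the product MIGHT satisfy.
definition = {
    "settings": {"timeout", "max-sessions", "ciphers", "banner"},
    "controls": {"ac-2", "ac-10", "ac-12", "sc-8"},
}

# Hypothetical provisioning content (e.g. a benchmark) drawn from it.
benchmark = {
    "settings": {"timeout", "ciphers"},
    "controls": {"ac-12", "sc-8"},
}


def out_of_scope(definition, provisioning):
    """Return the settings/controls in provisioning that the definition lacks."""
    return {
        key: sorted(provisioning[key] - definition[key])
        for key in ("settings", "controls")
    }


def is_valid_subset(definition, provisioning):
    """True when provisioning only uses what the definition declares."""
    return all(not extras for extras in out_of_scope(definition, provisioning).values())
```

A provisioning file that names a control missing from the definition would show up in `out_of_scope`, which is one way tooling could flag an inconsistent pair of files.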

Two key points:

  1. I don't see a direct one-to-one relationship between individual configuration options and controls (requirements) that can be satisfied by a component. This may sometimes be true, but typically it will be a many-to-one or many-to-many relationship.

  2. (and more directly, to address @anweiss's concern) Component definition content can link to multiple different compliance regimes. It doesn't have to be limited to just one.

I think ideally, you want automatic reciprocity. If a control in 800-53 also satisfies PCI, you'd like to just point to one control from the component, and have it automatically link to both 800-53 and PCI. I think our model can eventually support that, once we define the "periodic table of security controls" and link the various regimes to it. Until then, the best that could be accomplished is pointing to more than one regime from the same component definition file.

Finally, where more than one regime is referenced from the component definition file, any individual provisioning content or implementation content can ignore any regimes that are not relevant. For example, if your component definition file pointed to both 800-53 and PCI controls, the CIS benchmark might also point to both, but the DISA STIG might only point to the 800-53 controls and ignore the PCI controls.
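A small sketch of that last point, assuming each control reference in the component definition is tagged with the regime it belongs to. The regime labels and the PCI control id are placeholders chosen for illustration.

```python
# Hypothetical control references in one component definition file,
# spanning two compliance regimes.
definition_controls = [
    {"regime": "800-53", "id": "ac-2"},
    {"regime": "800-53", "id": "sc-8"},
    {"regime": "pci-dss", "id": "pci-example-1"},  # placeholder id
]


def controls_for(controls, regimes):
    """Keep only the control references that belong to the given regimes."""
    return [c["id"] for c in controls if c["regime"] in regimes]


# One provisioning source points at both regimes; another ignores PCI.
benchmark_controls = controls_for(definition_controls, {"800-53", "pci-dss"})
stig_controls = controls_for(definition_controls, {"800-53"})
```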

@brianrufgsa I could not agree more with you on everything you described above.
The concept of a 'periodic table of security controls' will only work, though, once we decompose all controls (800-53, PCI, COBIT, ISO/IEC) into simple 'security requirements', 'flattening' the impact levels or disregarding them, but only for the purpose of constructing the 'periodic table of security controls'.

@brianrufgsa +1 ... I think you’ve summed this up nicely

We could call it "status" instead of "type" or "context", with no flag given amounting to "status='potential'".

Updated Component Anatomy Diagram

OSCAL Component Definition Anatomy (v2).pdf

Updated SSP / Component Diagrams

OSCAL FR-SSP (v4).pdf

Having suggested the name "status" I am now thinking about other possibilities.

  • mode="potential" (can do)
  • mode="provisional" (should do)
  • mode="implementation" (is doing)

Another possibility: applicability='potential | provisional | implemented'.

If you think 'provisional' doesn't really mean "as provisioned", I will listen.

Thank you @wendellpiez! I like "provisional" and "implemented". I think there were some strong arguments for avoiding the term "potential", which was part of why I was thinking "defined" or "definition".

In any case, I tend to believe clusters of terms like this are more intuitive if they are either all verbs or all nouns. In this case, I think most suggestions are verbs of being, so I'd like to suggest "defined", "provisional", and "implemented", with the grammar test being that you can say something "is defined", "is provisional", or "is implemented".

@brianrufgsa understood: these are hard choices to make especially given the temptations of debates over metaphysics (to say nothing of epistemology).

I like defined | provisional | implemented especially if there is a nice alignment (perhaps nominally enforceable) between (notions of) "provisional" and "is provisioned". Do you have a current feeling regarding the name of the flag?

After a long back and forth in email between me, @wendellpiez, and @iMichaela I'd like to propose the following for the three attribute uses:

@status="definition" for the vendor to indicate what a product/service can do (all possible security settings)

@status="specification" for the various provisioning information that might be provided (e.g., one specification for FISMA Moderate, another for PCI, one for GDPR, etc.).

@status="actual" (or no @status attribute) to indicate the content reflects how the product is actually configured within the given system.

This applies to both the <characteristics> and <satisfaction> elements.
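To show how a consumer might apply this proposal, here is a minimal sketch that reads the `status` attribute from `characteristics` and `satisfaction` elements and treats a missing attribute as "actual". The sample markup and the `status_of` helper are assumptions for illustration, not the published schema.

```python
import xml.etree.ElementTree as ET

# Hypothetical component using the three proposed status values.
SAMPLE = """
<component id="abc">
  <characteristics status="definition"/>
  <characteristics status="specification"/>
  <characteristics/>
  <satisfaction status="specification"/>
</component>
"""

ALLOWED = {"definition", "specification", "actual"}


def status_of(element):
    """Return the element's status, defaulting to 'actual' when the attribute is absent."""
    status = element.get("status", "actual")
    if status not in ALLOWED:
        raise ValueError(f"unexpected status: {status!r}")
    return status


root = ET.fromstring(SAMPLE.strip())
statuses = [(child.tag, status_of(child)) for child in root]
```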

Please note, I believe the use of "specification" here deviates slightly from above. @anweiss are you OK with that change?

@brianrufgsa works for me

New glitch as I'm working on the Metaschema. To better fit the "big picture", I think I need to reverse the satisfaction and control elements in the component definition.

I think satisfaction should be an element within control and sub-control, so it would look like this:
<control id='ia-2'>
   <satisfaction>
      <p>A description of how IA-2 is satisfied by this component.</p>
   </satisfaction>
</control>
The other way causes Metaschema complications (or I'm still too much of a novice with our Metaschema to understand how to simplify it).
Thoughts?
Concerns?

Makes sense. Without deviating too far - and in order to keep moving forward - I'll use 'control-satisfaction' instead of 'control'. We can change it later.

After a conversation with Wendell, the plan is now to flatten this and just use "satisfaction" and add an @id attribute, rather than group "control" or "control-satisfaction" elements under "satisfaction". There would just be multiple satisfaction elements at that first level, instead of them being grouped.

There have been several small changes, so I've updated the Component Anatomy diagram and am attaching it here.
OSCAL Component Definition Anatomy (v4).pdf

2/7/2019 - This structure is now represented in the SSP metaschema developed for issue #267.

I've started to compile a list of ideas for merging the initial component definition prototype with @brianrufgsa's SSP model. Feel free to take a look at https://hackmd.io/VjzeOpEKQoWe_-9uCI6pKg

@anweiss I like where this is going. I'm mostly tracking with you, but wanted to touch on two points:

First I can agree with a component file external to an SSP up to a point. At some point, the actual components are selected and configured. Those details need to be in the SSP. Also, those details will often require a system-specific modification to the generic satisfaction language. That system-specific portion should reside in the SSP, leaving the more generic write-up untouched in the external component file.

Second, parameter values are intended to "fill in the blank" for a requirement statement. Auditors review actual settings against requirement statements.

Your syntax seems to mix a component's "actual setting" (which demonstrates compliance) with the requirement statement.

The link is very helpful to auditors; however, re-using the same syntax is dangerous. I think it is important to use clearly different syntax so as to not confuse a requirement of "15 minutes" with the "15" setting that satisfies the requirement.

Somewhat related to the above, my thought process was to capture all the security-relevant settings of a component in the sub-element of within the SSP. This would only include the actual settings, and could be linked to the external file for unused settings/choices.

Putting the two topics together, I see a link to the external component file, and only see the components element in the SSP as containing the system-specific component details (not the entire component definition). To link component settings to parameters, I see something like this:

<components>
   <component id="abc" href="//link/to/external/component/file">
      <!-- origin, validation, and provisioning are referenced in the external file -->
      <characteristics context="implementation">
         <setting id="abc-timeout">
            <value>15</value>
            <unit>minutes</unit>
            <satisfies param-id="ac-2_prm_1" />
            <satisfies param-id="ac-4_prm_2" />
         </setting>
      </characteristics>
      <satisfaction control-id="ac-2" context="implementation">
         <p>This is copied initially from the external component file's generic explanation.</p>
         <p>It is then tailored as needed to describe the implementation in this specific system.</p>
      </satisfaction>
   </component>
</components>
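One way a tool could use the `satisfies` links in this proposal is to build a lookup from each parameter id to the actual setting that is claimed to satisfy it, which keeps the requirement and the setting visibly separate. This is a minimal sketch using Python's standard library against a trimmed copy of the XML above; the function name and the returned tuple shape are assumptions.

```python
import xml.etree.ElementTree as ET

# Trimmed copy of the proposed SSP fragment above.
SSP_FRAGMENT = """<components>
   <component id="abc" href="//link/to/external/component/file">
      <characteristics context="implementation">
         <setting id="abc-timeout">
            <value>15</value>
            <unit>minutes</unit>
            <satisfies param-id="ac-2_prm_1" />
            <satisfies param-id="ac-4_prm_2" />
         </setting>
      </characteristics>
   </component>
</components>"""


def settings_by_param(xml_text):
    """Return {param-id: [(setting id, value, unit), ...]} from <satisfies> links."""
    root = ET.fromstring(xml_text)
    result = {}
    for setting in root.iter("setting"):
        actual = (setting.get("id"), setting.findtext("value"), setting.findtext("unit"))
        for link in setting.findall("satisfies"):
            result.setdefault(link.get("param-id"), []).append(actual)
    return result


param_index = settings_by_param(SSP_FRAGMENT)
```

An auditor-facing report could then show each requirement parameter next to the component setting that answers it.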

I'd enjoy the opportunity to discuss this further.

All great points @brianrufgsa and agree with you! I like your proposal above. A lot of context (both system owner context and assessor context) can also be provided by the declarations model.

The concepts in this issue will be documented on the OSCAL website in issue #363. Closing this issue.