usnistgov/oscal-content

NIST SP 800-53 r4 and r5 Missing Label Property in Group

Opened this issue · 20 comments

Describe the bug

While label properties are present for controls, the label property is missing from groups.

Who is the bug affecting?

Tool developers and content consumers.

What is affected by this bug?

Tools that honor the label property are unable to display a label for groups.

When does this occur?

When working with the NIST SP 800-53 r4 and r5 OSCAL files.

How do we replicate the issue?

Load the content into a tool that is designed to present the values defined in label properties for controls and groups.

Alternatively:
Open the NIST SP 800-53 r4 or r5 catalog, find any group and note that no properties are present as direct children of that group.
Screenshot 2024-07-01 113742

Expected behavior (i.e. solution)

Each group should have a label property with its assigned label from the 800-53 publication. For example, the Access Control family should have a label property with a value of "AC".

Other Comments

@brian-comply0
What you describe is not a bug. It is a feature that can be enhanced, if the community needs it.

To my recollection, OSCAL representations of the 800-53 controls, Rev 4 and Rev5, never had <prop name="label" value="AC"> because they were not deemed necessary. The id of each group can be used programmatically by simply capitalizing it to identify the "label" solicited here.

After years of processing the 800-53 catalog in OSCAL without this set of props for the groups, I am afraid it is going to be ignored anyway. With that said, if the community makes a case for such an enhancement, the OSCAL team will consider it as an enhanced feature, not a bug, since it is not a bug.

@wendellpiez has done extensive work generating the catalogs and I would like to hear his perspective as well.

@iMichaela while it's true that specifically for 800-53 the ID and the label align, this is not guaranteed to be true of any catalog. Indeed, the ID value of a group could be a UUID value as discussed in other venues.

So if tool developers are simply capitalizing the label because they are only working with NIST-published catalogs right now, they will have to fix their tools when they start consuming other catalogs where a simple capitalization of the label doesn't provide a reasonable presentation.

The whole point of the label property is for content creators to provide the label as it is intended to be visibly published. Not including the label property essentially says there is no published label.

While it may be fair to say this is an enhancement rather than a bug, it is consistent with OSCAL syntax to include these label properties, and it doesn't break anything to include them. I've already submitted a PR to include them, so the only effort required for NIST is to review and approve the PR.

@brian-comply0 thanks for the effort, much appreciated.

To me this falls well within the realm of 'reasonable' and I'd be prepared to approve the change, given the facts that (a) you have identified the requirement while (b) it's also consistent (c) etc. etc.

In other words, it's the kind of feature to which I might have said "let's see if anyone says they need it" at the same time as I argued to leave it out for reasons @iMichaela offers.

I would also be curious as to the level of testing and whether any tests e.g. Schematron could be provided to ensure consistency / correctness. In this case testing the relation between the label and @id would do the trick. (Of course I also assume such a test is brain-dead since how were the labels created? but distributed brains sometimes need brain-dead testing.) We might test both that every group has a label, and that the label given corresponds to what we expect.

Such testing could be added to the Schematron file now in place in src/nist.gov/SP800-53/rev5/xml/validate-labels_SP800-53-catalog.sch. Feel free to ping me if you want to save steps.

@brian-comply0

So if tool developers are simply capitalizing the label because they are only working with NIST-published catalogs right now, they will have to fix their tools when they start consuming other catalogs where a simple capitalization of the label doesn't provide a reasonable presentation.

I am a little confused - label props are not a prescriptive mechanism in OSCAL for reconstructing a human-readable document. We did it in 800-53 because we started with a PDF document. Today, even 800-53 is in what NIST calls CPRT. Most likely we will have no label props in 800-53 Rev6.

What might be interesting would be a pipeline that would produce a fully-elaborated OSCAL version consistent with the current practice (including the various redundancies, many useful IMV) from a very minimalistic and 'space-efficient' version of a catalog (CPRT-similar or not).

I see valid points on both sides here. The question is, what is information and what is just echoes or noise. The argument is that these labels are information - maybe because they could vary (since they are no longer strictly rule-based) but they happen not to vary in this (important) instance. I.e. the fact of their consistency is part of what we wish to maintain and represent.

Whereas if we go back to the most parsimonious form possible, there are no labels, just values from which they are derived .. the IDs? and we never know we have had labels, or where.

Or is the rule in SP800-53 that "the label for a control Family is made from the initials of the title of the Family" (assuming that is so, as I think)? - and hence we don't even need IDs since all the titles are different?

(I hope you can see this argumentum ad absurdum - no I do not propose doing away with the IDs. It is about balance. What the CPRT designers decide is a different question of course 😎 .)

In any case, all these variants (elaborated or sparse) could be implemented via stable, deterministic mappings. From that point of view no information is being added. Just how it is found.

NB: within the larger context of catalogs that are similar to this one, I think the argument for adding the prop becomes clearer. It is just another data point, and as such it could be useful.

@iMichaela I agree with changing this from "bug" to enhancement, but lack sufficient permissions to change the GitHub label on the issue.

As for dropping all label properties in r6, I would strongly encourage you to socialize that with the larger community before doing so.

The labels are how the entire community refer to families and controls.

IDs are intended to be machine-consumable and were never intended to be exposed to content consumers. The whole point of the label property was for content creators to provide the human-readable label in the exact format specified by the content creator.

The direction you are describing will force tool developers and content creators to create a tight coupling between a machine-readable ID and a human-readable label by collapsing the two into one field. This adds unnecessary complication for tool developers and adds unnecessary constraints to other content creators.

Further, there is ambiguity when attempting to derive the part and sub-part labels from their ID fields. Especially when you get into Objectives.

I strongly recommend you keep the label property, constrain the cardinality to 0 or 1 and not attempt to overload the property by having multiple variants on the same control. Those variants can and should be handled as OSCAL extensions (using @ns rather than @class) for any org that wants to provide alternative labels.
Please, please do not eliminate the label properties from r6 without significant socialization.

@wendellpiez, you, @david-waltermire, and I had this conversation a few years ago, and from that we decided that the label property was needed.

The problem is that just because 800-53's labels are easily derived, does not mean that is the case in every catalog.

Indeed, I have a catalog based on the Federal PKI Certificate Policy where all of the labels are section numbers in the format of "1.2.3.4". As you may know, current OSCAL syntax does not allow an ID to start with a number. So I can't draw a 1-1 correlation between the ID and the label the way you can with 800-53.
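The mismatch between section-number labels and ID syntax can be sketched in a few lines of Python. This is a minimal illustration, not the OSCAL validator: the regular expression is a simplified, ASCII-only approximation of the OSCAL token pattern (an assumption), and the `s` prefix used below to make a valid ID is purely hypothetical.

```python
import re

# Simplified, ASCII-only approximation of the OSCAL "token" datatype
# (assumption): must start with a letter or underscore, then letters,
# digits, periods, hyphens, or underscores.
TOKEN_RE = re.compile(r"[A-Za-z_][A-Za-z0-9._-]*")


def is_valid_id(candidate: str) -> bool:
    """Return True if the candidate matches the simplified token pattern."""
    return TOKEN_RE.fullmatch(candidate) is not None


# A section-number label cannot be reused as an ID because it starts with a digit.
print(is_valid_id("1.2.3.4"))   # False
# A hypothetical prefixed ID is valid, but no longer equals the published label.
print(is_valid_id("s1.2.3.4"))  # True
```

In a catalog like this, the published label "1.2.3.4" can only live in the label property, since any valid ID has to differ from it.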

Now if the same tool is supposed to process both the FPKI CP and 800-53 you are asking for it to somehow "know" that with one catalog I should attempt to derive labels from IDs, but with this other catalog I should absolutely not attempt to derive labels from IDs.

We can say that if a label property is present, it should be used, but when no label property is present, how does a tool know when it should or should not attempt to derive a label from an ID or what rules to use for that derivation?

Based on the current OSCAL syntax rules, the answer is that you have to keep the concept of ID and label separate, and apply those two concepts consistently across content. If there is no label property, the tool should display no label. Period.

Even within 800-53, you need one derivation method for groups and top level controls (simply capitalize), another for control enhancements (capitalize, break apart at the period, and wrap the last part in parens), and a third for control parts (way too complicated to summarize here).
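To make the point concrete, the first two derivation rules might look roughly like the Python sketch below. It is an illustration only, written under assumptions: the function names are hypothetical, enhancement IDs are assumed to look like `ac-2.1`, and part IDs are assumed to contain an underscore (as in `ac-1_smt.a`), in which case the sketch gives up rather than guess.

```python
from typing import Optional


def derive_simple_label(oscal_id: str) -> str:
    """Groups and top-level controls: simply capitalize ('ac' -> 'AC')."""
    return oscal_id.upper()


def derive_enhancement_label(oscal_id: str) -> str:
    """Enhancements: capitalize, split at the last period, wrap the tail in parens."""
    base, _, number = oscal_id.upper().rpartition(".")
    return f"{base}({number})"


def derive_label(oscal_id: str) -> Optional[str]:
    """Pick a rule based on the shape of the ID (800-53-specific assumption)."""
    if "_" in oscal_id:
        # Control parts need a third, more complicated rule; not attempted here.
        return None
    if "." in oscal_id:
        return derive_enhancement_label(oscal_id)
    return derive_simple_label(oscal_id)


for example in ("ac", "ac-1", "ac-2.1", "ac-1_smt.a"):
    print(example, "->", derive_label(example))
```

Every branch here encodes knowledge about one catalog's ID conventions, which is the kind of bespoke logic a label property avoids.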

@brian-comply0 - As you highlight above, this is not a simple problem and no decision will be made easily. If the 800-53 will have similar structure in CPRT 800-53 v6, then we will maintain all labels available.

In the PR you submitted, you are proposing labels for groups that do not exist in the published pdf version of the original document. If you review the document you will find for the Access Control family the label "3.1" not "AC".

Here is a snapshot of the document available here: https://nvlpubs.nist.gov/nistpubs/SpecialPublications/NIST.SP.800-53r5.pdf

3.1 ACCESS CONTROL
Quick link to Access Control Summary Table
AC-1 POLICY AND PROCEDURES
Control:
a. Develop, document, and disseminate to [Assignment: organization-defined personnel or
roles]:
1. [Selection (one or more): Organization-level; Mission/business process-level; System-
level] access control policy that:
(a) Addresses purpose, scope, role
...

The website does not have a label "3.1", but introduces the "AC" abbreviation, which matches your proposed label prop:

AC | ACCESS CONTROL |  

We need to acknowledge that there will always be catalogs that do not need label properties. The real purpose of OSCAL is to generate machine-readable information for machines to consume, not another format for humans to peruse. For humans there is the NIST website, the MS Excel version, and even the CPRT version, which is considered the gold version or information source.

To stress the point @iMichaela is making -- whatever OSCAL we have is downstream of an authoritative source. For Rev 5, that used to be a published PDF document.

(Long section here left out on what this meant in practice and the challenges we met in order to make this happen.)

From Rev 6 that is planned to be different. The CPRT source will be what it will be.

If there is a PDF it will probably look quite different. Whether any PDF that looks like the current PDF even exists, is an open question (albeit we could make that easily enough with adequate OSCAL - next question). OSCAL Team (or your team) having a capability is not the same as some other team having that same capability (even here at NIST).

Assuming all the information is there, I assume that OSCAL can and will be produced from this CPRT source - that is the commitment as I understand it, at least currently.

Such OSCAL - downstream from authoritative public data and therefore something producible by anyone, in principle - might well have labels on all the groups. That is no problem, only a tiny detail at that point.

@brian-comply0 I don't think it's as scary as you make it sound!

I suggest again that we pick this up by extending the current Schematron to validate the coupling in this catalog between group labels and IDs.

It already does this for controls and parts of controls. As such, having the rule set to test the data ensures that expectations are in line with reality.

If you have no current easy means to apply a Schematron to an XML document, take a look at https://github.com/usnistgov/oscal-xproc3 -- a way to make standalone tools working on the command line or (at least in Windows) using drag and drop on the desktop (all open source, declarative and assessable), and including Schematron among its capabilities.

Indeed, were I not already buried (including with other possible spinoff projects), I might offer to pick this up as a demo project for you there ... that being said, it's more like home model rocketry than rocket science 🚀

If that seems too much to take on, please let me know if I can help with providing or testing the Schematron in the branch where it belongs (where the data is changed as well). The rule itself will be much simpler than the overhead.

@iMichaela page 8 (as printed - physical page 35) in that document has the following chart:
Screenshot 2024-07-02 101536
Rev 4 has a very similar chart.

While the chart uses the term "ID", it's clear they didn't mean an OSCAL machine-readable ID. They intended that as a human-consumable ID (aka label). And the IDs are in caps, not lower case.

Those labels are widely used across government. The FedRAMP PMO and every Agency where I've done cybersecurity work refers to the "Access Control" family as "the AC Family". The words "Access Control" are rarely used.

@wendellpiez respectfully, I'm not trying to make anything "sound scary." I'm communicating a real-world situation that was anticipated over four years ago when OSCAL was being designed and label properties were added to the syntax.

Now I'm presenting a real-world use case for that scenario.
Again, how am I supposed to build tools to consistently manage catalogs when one catalog author expects me to derive labels from IDs and another catalog author can't use labels for IDs because of OSCAL-imposed limitations on the ID field?

I have neither the means nor the experience to work in Schematron.
If you are saying that the only way you'll accept community edits to the 800-53 content is if the contributor is also skilled enough in Schematron to make the supporting changes, you are essentially blocking 99.9% of otherwise qualified contributors to this repo.

@brian-comply0 I like the proposed enhancement. But I kind of question the idea that we can accept a PR without even the most rudimentary testing, in good faith.

Maybe I got ahead of us by stressing that or by assuming Schematron. For that, apologies. Alternative suggestions are also welcome. It's not Schematron expertise we require, it's the ability to validate your (and our own) claims for consistency over data sets too large for us to review by hand. We need that capability and so do you.

Not only do I like the proposed change, but I appreciate very much how you have taken the trouble to propose it. (This in itself has third- and fourth-order benefits I could talk about.) What I would like to see further would be (a) that capability just mentioned, and (b) other voices bearing on the specific merits (since we have heard ourselves now).
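For contributors without Schematron, one alternative is a short script that checks the same two expectations: every group has a label, and the label equals the capitalized ID. The Python sketch below is only an illustration under assumptions: it uses the standard library rather than the repository's Schematron, the function name is hypothetical, and the embedded sample is a made-up fragment (not schema-valid, with one label deliberately wrong) rather than the real catalog.

```python
import xml.etree.ElementTree as ET

NS = "http://csrc.nist.gov/ns/oscal/1.0"  # OSCAL XML namespace

# Made-up fragment for illustration; 'at' lacks a label and 'au' has a wrong one.
SAMPLE = f"""<catalog xmlns="{NS}">
  <group id="ac"><title>Access Control</title><prop name="label" value="AC"/></group>
  <group id="at"><title>Awareness and Training</title></group>
  <group id="au"><title>Audit and Accountability</title><prop name="label" value="AA"/></group>
</catalog>"""


def check_group_labels(xml_text: str) -> list:
    """Report groups with no label prop or a label that is not the upper-cased id."""
    root = ET.fromstring(xml_text)
    problems = []
    for group in root.iter(f"{{{NS}}}group"):
        group_id = group.get("id", "")
        # Only props that are direct children of the group are considered.
        labels = [p.get("value") for p in group.findall(f"{{{NS}}}prop")
                  if p.get("name") == "label"]
        if not labels:
            problems.append(f"{group_id}: no label prop")
        elif labels != [group_id.upper()]:
            problems.append(f"{group_id}: label {labels} does not match {group_id.upper()!r}")
    return problems


for problem in check_group_labels(SAMPLE):
    print(problem)
```

The upper-cased-ID expectation is specific to 800-53 and would need to be replaced for any catalog whose labels are not derivable from IDs.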

I believe for a tool that processes the catalogs strictly as per the OSCAL schema, it does not matter whether the "label" property is present or not for groups, as it is optional. But for applications that display or create content for human consumption (Word, PDF, etc.) from OSCAL formats, having a label property might be helpful. Such applications should also have a default behaviour if the label property is not present in the content, as it is optional and hence not all catalogs may have labels for groups.

It should be up to the catalog creator to add "label" for groups. Mandating it for every catalog may not be desirable. If we want to mandate it then it should not be a property but a separate required element called "label" in the schema. But, in that case we should first make the "id" field in groups mandatory (currently it is optional).

@vikas-agarwal76 apologies, I am trying to read between the lines here - modeling questions aside, is your feeling that the content proprietor(s) in this case should not add the prop since if someone needs it they can always add it to a local version or copy?

If the answer is not to leave it alone (i.e., not add the prop), what should the content proprietor(s) do here?

I say 'content proprietors' because we are not in a position to rewrite the catalog - instead, we aim for adequate 'representational fidelity' as I call it - but also because it is public, anyone can rewrite it, use only parts of it etc. etc. But as it is widely copied and emulated, this particular simulacrum becomes an important one.

@wendellpiez I am just saying that since the label property is optional, not all catalogs may have it. Hence, applications should handle the situation when some optional props are not present.

@vikas-agarwal76 yes, noted. (You are right and I agree with you.) The question on this Issue is specifically as pertains to the catalog in question, i.e. the files in this repository. (So far you seem to be in the 'no' column - not only don't require them, don't provide them.)

@wendellpiez Apologies if I was not very clear. I was speaking in general about optional properties. I didn't mean don't provide them in this specific case. As I mentioned earlier, for applications that display or create content for human consumption (Word, PDF, etc.) from OSCAL formats, having a label property might be helpful. So, it's fine to add it if there is a requirement.

@iMichaela page 8 (as printed - physical page 35) in that document has the following chart:
Screenshot 2024-07-02 101536
Rev 4 has a very similar chart.

Hi @brian-comply0 - I am aware of the tables, but the label properties are used today for chapters, sections, and subsections. This was the point I was trying to make. I do not dispute the usability of the proposed labels for editorial tools which are converting OSCAL back to human-readable format, but I do strongly argue against a mandate for all catalogs to have such labels on the grounds that editorial tools will not work otherwise. Such a mandate would not be backwards compatible with OSCAL and could not happen, even if deemed necessary, until OSCAL 2.0.0. It can be perceived as a formatting best practice for OSCAL catalog owners, if the community members at large find label props important to have.
NIST can make a decision only when it comes to NIST-owned content. And even then, the representation needs to preserve the authoritative source. In the 800-53 rev 5.1.1, your PR brings a proper, justifiable enhancement. Thank you for raising this enhancement issue and for contributing.
For the Rev 4 catalog, I need to discuss it with the RMF team. It is not an easy decision.
Also - testing/validating the content's consistency, per @wendellpiez's point, is something we need to consider and include in our testing/validation process. I only did a visual review for the 800-53 rev 5.1.1 catalog.

@iMichaela I'm sorry if something I said sounded like I was suggesting the label properties be mandatory. That was never my thought, nor intention.

From an OSCAL perspective, I suggested the label property has a cardinality of 0 or 1 (rather than the metaschema default of 0 or more), which means it remains optional, but there can be no more than one in the OSCAL default namespace.

I further suggested that any additional label properties should be handled through OSCAL extensions (@ns) rather than the class (@class) attribute to reduce ambiguity for tools. The use of the class attribute is under-defined. Tool developers have no clear guidance on how to handle multiple properties of any type that use class, whereas they have clearer guidance on how to handle extensions.

From a tool perspective, I am saying that tools need to either display the content of a label property if present or display no label. Since there are no label properties for groups in 800-53, then no label would be presented by tools.
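The "use the label if present, otherwise show nothing" behavior could be sketched as follows for OSCAL JSON content. This is an illustrative sketch, not a reference implementation: the function names are hypothetical, and it assumes that a prop with no `ns` belongs to the OSCAL default namespace `http://csrc.nist.gov/ns/oscal`.

```python
from typing import Optional

# Assumed default namespace for props that carry no explicit "ns".
OSCAL_DEFAULT_NS = "http://csrc.nist.gov/ns/oscal"


def display_label(item: dict) -> Optional[str]:
    """Return the first default-namespace label prop value, or None if absent."""
    for prop in item.get("props", []):
        in_default_ns = prop.get("ns", OSCAL_DEFAULT_NS) == OSCAL_DEFAULT_NS
        if prop.get("name") == "label" and in_default_ns:
            return prop.get("value")
    return None  # No label prop: the tool displays no label and derives nothing.


def heading(item: dict) -> str:
    """Build a display heading from the label (if any) and the title."""
    label = display_label(item)
    title = item.get("title", "")
    return f"{label} {title}" if label else title


with_label = {"id": "ac", "title": "Access Control",
              "props": [{"name": "label", "value": "AC"}]}
without_label = {"id": "ac", "title": "Access Control"}

print(heading(with_label))     # AC Access Control
print(heading(without_label))  # Access Control
```

Note that the `id` is never consulted, so the same code works for any catalog regardless of how its IDs are formed.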

It is neither fair nor practical to expect tools to derive a label from an ID for the following reasons:

  • Every framework would have different derivation rules.
  • There is no mechanism for content creators to communicate those derivation rules to tools for consistent presentation.
  • The data type of OSCAL IDs does not align well with the labels used in some catalogs. (details in my earlier comment)

Tools would therefore need bespoke code for deriving 800-53 labels from IDs, and different bespoke code for each additional catalog/framework that uses this "derive from ID" approach. This runs counter to the goals of OSCAL.

From an 800-53 perspective, I agree it's the RMF Team's decision as to whether they include labels and which ones they use. The OSCAL Team should not be driving that decision.

That said, the two-capital-letter group labels in the chart I cited are what all consumers of 800-53 have been using for 20 years across Federal, state, and local governments, academia, and industry. Their presence or absence is significant, which is why I would strongly encourage the RMF team to include them, and why I submitted a PR with them added.

I look forward to hearing the RMF Team's position on this as they elect to either accept or reject the PR.

I suggest moving this conversation to Discussions, if there is further interest in identifying

  1. best practices for using props with name="label"
  2. the need for a dedicated assembly for labels.

The outcome of the discussion can then be considered for the next major version of OSCAL if the community comes up with a solution that breaks backwards compatibility.

From an OSCAL perspective, I suggested the label property has a cardinality of 0 or 1 (rather than the metaschema default of 0 or more), which means it remains optional, but there can be no more than one in the OSCAL default namespace.

@brian-comply0 - your proposal above is backward incompatible with implementations that use more than one prop with name="label"