cuelang/cue

spec: potential issue with pattern properties in the context of additional properties

myitcv opened this issue · 4 comments

What version of CUE are you using (cue version)?

$ cue version
cue version +aa61ee7f linux/amd64

Does this issue reproduce with the latest release?

Yes

What did you do?

In the course of looking at #893 I found myself referring to previous discussion about pattern properties and additional properties.

In f52a0ed we concluded that the semantics of pattern constraints ([k]: v) should remain as is, namely further constraining existing fields that match pattern k by value v. For example:

// case 1
package x

_#X: {
	test:     string
	[string]: "test"
}

a: _#X & {
}

results in:

a: {
    test: "test"
}

It also correctly restricts the value of the test field:

// case 2
b: _#X & {
	test: 5
}

gives:

b.test: conflicting values string and 5 (mismatched types string and int)

However, the [string] form of pattern also allows additional fields:

// case 3
c: _#X & {
	tap: "test"
}

gives:

c: {
    test: "test"
    tap:  "test"
}

Whereas:

// case 4
_#Y: {
	test:   string
	["^t"]: "test"
}

d: _#Y & {
	tap: "test"
}

results in:

d: field not allowed: tap

The behaviour of [string] in case 3 seems to agree with the following part of the spec:

A closed struct c is a struct whose instances may not declare any field with a name that does not match the name of field or the pattern of a pattern constraint defined in c.

Therefore, I think case 4 is a bug according to the spec.

But ignoring this for one second, I think this definition of closedness with respect to pattern properties is a problem. Because in f52a0ed we also talked about adding support for the equivalent of JSON schema additional properties, in the form [...k]: v. As things stand today according to the part of the spec quoted above, pattern properties not only apply to existing fields but also open a closed struct to allow further fields that match that pattern, fields that are then constrained by the value associated with that pattern (case 3 above).

So whilst the current implementation might suggest we don't need to support additional properties (because that can be done by pattern properties, like case 3 demonstrates above), it hides a different problem. Namely that it's impossible to define constraints that only apply to existing fields, or only apply to additional fields: with [string] is as it is implemented today, the constraints applies to both sets.

Unless I'm missing something here?

Assuming I am not, I think we need to:

  • add support for additional properties, e.g. [...string]: v and friends
  • "fix" [string]: v to only apply to existing properties

I think this is a breaking change, and a relatively significant one at that because wherever people use [string]: v today, it's more than likely the case they mean [...string]: v.

Something to discuss at least.

I agree that we need some way differentiate between existing and additional property constraints. My opinion is that the [...<>] prefix is better for additional.

Does the following define a closed struct with no fields or a set of potential fields (while still also being closed)?

#A: {
  ["^t.*"]: string
}

where I think this ought to be valid (aligning with the RHS of the or above)

ok: #A & { tap: "test" }

badKey: #A & { foo: "bar" }
badVal: #A & { tap: 1 }

(some assumptions about the query syntax were made here, namely the regex for the pattern constraint)

My opinion is that the [...<>] prefix is better for additional.

Per f52a0ed, that remains the plan from my perspective.

Does the following define a closed struct with no fields or a set of potential fields (while still also being closed)?

I think the answer has to be "yes, it is closed". Because you can't have an exception for unification, otherwise it's an exception for all cases.

The solution would be to use embedding:

exec cue eval

-- x.cue --
package x

#A: {
	["^t.*"]: string
}

ok: {
	#A
	tap: "test"
}

shouldBeBad1: {
	#A
	foo: "bar"
}

shouldBeBad2: {
	#A
	tap: 1
}

But note the output from the above is:

> exec cue eval
[stdout]
#A: {}
ok: {
    tap: "test"
}
shouldBeBad1: {
    foo: "bar"
}
shouldBeBad2: {
    tap: 1
}

when the shouldBeBad* cases should, well, be bad! So I think this points to another problem, most likely the conflation of existing vs additional fields.

(some assumptions about the query syntax were made here, namely the regex for the pattern constraint)

I don't follow this point?

mpvl commented

I think you meant:

// case 4
_#Y: {
	test:   string
	[=~"^t"]: "test"
}

d: _#Y & {
	tap: "test"
}

which actually works.

All other cases seem to be working as intended.

This issue has been migrated to cue-lang/cue#1024.

For more details about CUE's migration to a new home, please see cue-lang/cue#1078.