Use YAML, not space-separated string of keywords, to specify problem type
thorehusfeldt opened this issue · 6 comments
The new specification of `type` avoids the use of YAML as a specification language, which leads to a convoluted definition, unnecessarily lenient values, arbitrary implementations, and difficulty in providing editor support.
(For instance, `2023-07-draft-2024-05-05` uses inconsistent symmetry in its specification of “is incompatible with”, and would allow `type: "scoring scoring scoring"`. Is `"interactive multi-pass"` meant to be allowed?) While we’re at it, we manage to use `pass` with two different meanings (“the testcase passes, the problem uses two passes”) in the same value string: `pass-fail multi-pass` (!) That can’t be good.
Consider using YAML for this, since we’re using YAML for many other parts of the configuration files already. Here are some examples:
judgement: score
---
# empty, defaults to a single-pass, non-scoring problem
---
# same as above
judgement: verdict
submission: executable
passes: single
---
submission: answer # submit-answer
---
passes: interactive
---
judgement: score
submission: answer
---
judgement: score
passes: multiple
---
# FORBIDDEN
submission: answer
passes: multiple
The above can be specified neatly in CUE.
judgement: *"verdict" | "score"
submission: *"executable" | "answer"
if submission == "executable" {
passes: *"single" | "multiple" | "interactive"
}
or less neatly in JSON (I can’t be bothered right now). Thanks to the magic of schemastore.org, the JSON spec would then automatically make every JSON-schema-aware editor in the world (such as VS Code, Emacs, or Vim) check that people’s configuration files are correct (without typos and without contradictions). I cannot stress enough how useful that would be, in particular to new authors.
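For readers without CUE at hand, the same constraints can be sketched in plain Python. This is only an illustration: the function name `resolve_problem_type` and the `ALLOWED` table are made up here, the key and value names are taken from the YAML examples above, and it assumes that `passes` is rejected outright when `submission` is `answer` (as in the FORBIDDEN example).

```python
# Hypothetical sketch; key and value names follow the YAML examples above.
# The first entry of each tuple is treated as the default.
ALLOWED = {
    "judgement": ("verdict", "score"),
    "submission": ("executable", "answer"),
    "passes": ("single", "multiple", "interactive"),
}


def resolve_problem_type(config):
    """Fill in defaults and reject values or combinations that are not allowed."""
    unknown = set(config) - set(ALLOWED)
    if unknown:
        raise ValueError(f"unknown keys: {sorted(unknown)}")

    resolved = {}
    for key in ("judgement", "submission"):
        value = config.get(key, ALLOWED[key][0])
        if value not in ALLOWED[key]:
            raise ValueError(f"{key}: {value!r} is not one of {ALLOWED[key]}")
        resolved[key] = value

    if resolved["submission"] == "executable":
        value = config.get("passes", ALLOWED["passes"][0])
        if value not in ALLOWED["passes"]:
            raise ValueError(f"passes: {value!r} is not one of {ALLOWED['passes']}")
        resolved["passes"] = value
    elif "passes" in config:
        # Assumption: `passes` is only meaningful for executable submissions.
        raise ValueError("passes may only be set when submission is 'executable'")

    return resolved
```

The input is assumed to be the mapping a YAML loader would produce for one of the documents above; an empty mapping resolves to the single-pass, non-scoring default.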
Caveat
The actual names for fields and values are unimportant. For instance, we can do
scoring: *false | true
executable: *true | false
if executable {
passes: *"single" | "multiple" | "interactive"
}
Please don’t let your opinions about what the fields should be called get in the way of evaluating this proposal. It’s about using YAML instead of putting all of this into a string with some extra rules.
Also, I can see a case for `rounds` instead of `passes`, mainly to really make sure we avoid the clash with `pass-fail`. Say,
rounds: *1 | "multiple" | "interactive"
I consider these details secondary; they only become worth talking about after moving away from `"pass-fail multi-pass pass-fail"`.
Benefits
- Avoids various name clashes, if key/value names are picked carefully
- Supports problem author via schema-aware editor, avoiding ill-formed problem specifications
- Trivial to parse for the tool (which already parses YAML)
- Shorter and clearer to write down
- Enables automation (using `cue vet` or JSON), so we can verify problem specifications automatically.
> It’s about using YAML instead of putting all of this into a string with some extra rules.
Was it not changed from string to a YAML sequence / list recently? I definitely agree it should not be a string in some arbitrary format.
EDIT: Found it here #131
`multi-pass` is orthogonal to `interactive`, i.e. `interactive multi-pass` should be valid.
> Use YAML, not space-separated string of keywords, to specify problem type
This is already the case. Clearly this needs to be made more clear :).
I guess it's the "String or" part of "String or sequence of strings" that's confusing? The intent here is that it must be a sequence of strings from among the allowed values (i.e. `pass-fail`, `scoring`, `multi-pass`, `interactive`, `submit-answer`), but that if it's a sequence of length 1 we also allow it to not be a sequence at all. I.e. these are valid values for `type`:
type: pass-fail
---
type:
- pass-fail
---
type: [pass-fail]
---
type:
- pass-fail
- interactive
---
type: [pass-fail, interactive]
The first three mean exactly the same thing, as do the last two.
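A tool reading this field could fold the two spellings into one before doing anything else. The helper below is a minimal sketch of that idea, not something taken from the specification or from any existing tool; the name `normalize_type` is made up, and the argument is assumed to be the already-parsed YAML value.

```python
def normalize_type(value):
    """Return the parsed `type` value as a list of strings.

    A bare string is treated as a sequence of length 1, so that
    `type: pass-fail` and `type: [pass-fail]` end up identical.
    """
    if isinstance(value, str):
        return [value]
    if isinstance(value, list) and all(isinstance(v, str) for v in value):
        return list(value)
    raise TypeError("type must be a string or a sequence of strings")
```

With this in place, the rest of the tool only ever has to deal with the list form.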
> For instance, `2023-07-draft-2024-05-05` uses inconsistent symmetry in its specification of “is incompatible with”,
Does it? I've been staring at it for a bit (because this is something I was explicitly worried about getting wrong), and it looks perfectly symmetrical to me? X is incompatible with Y exactly when Y is incompatible with X.
> and would allow `type: "scoring scoring scoring"`.
(You mean `type: [scoring, scoring, scoring]`).
Sure, I guess it doesn't explicitly say that you can't provide the same value more than once. It's not a huge issue, because if it were allowed it should clearly (?) mean exactly the same as `type: scoring` or `type: [scoring]`, but I think it would make sense to disallow.
> Is `"interactive multi-pass"` meant to be allowed?)
Yes, definitely.
> Also, I can see a case for `rounds` instead of `passes`, mainly to really make sure we avoid the clash with `pass-fail`. Say,
> `rounds: *1 | "multiple" | "interactive"`
The `interactive` that we already have does not make sense as a value for rounds or passes. Are you intending some other meaning of "interactive" here? I could imagine a difference between a constant number of passes as opposed to a variable number of passes, and maybe the latter could be called "interactive"? I don't think that distinction is important, and if we want it I don't think "interactive" is a good name for that concept.
> I consider these details secondary; they only become worth talking about after moving away from `"pass-fail multi-pass pass-fail"`.
What does this mean? Type is not (and never was) a space-separated string, and you were never intended to provide multiple copies of the same value, so I think we have moved away from this (or we were never there)?
> (You mean `type: [scoring, scoring, scoring]`).
Well, apparently, I mean `["scoring", "scoring", "scoring"]`, and `"scoring scoring scoring"` was never an intended value in the first place. Thank you for this clarification; what an unfortunate misunderstanding.
So what is meant in the current specification `2023-07-draft-early-may` is that there are some fixed strings, and the value of `type` is one of those values or a list of them. So the syntax is
#type_indicator: "pass-fail" | "scoring" | "multi-pass" | "interactive" | "submit-answer"
type?: #type_indicator | [...#type_indicator]
Or maybe even (with a default)
#type_indicator: "pass-fail" | "scoring" | "multi-pass" | "interactive" | "submit-answer"
type: *"pass-fail" | #type_indicator | [...#type_indicator]
This allows:
type: pass-fail
---
type: ["pass-fail", "interactive"]
---
type:
- scoring
- multi-pass
Moreover, when `type` is a list:
- there shall be no repetitions, and
- there are some list values that may not both appear.
(I can write these down precisely later; pressed for time right now. But I maintain the position that we should specify these constraints using YAML, instead of providing them as “a list with some extra rules”, which we demonstrably fail to communicate clearly to each other.)
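To make concrete what “a list with some extra rules” asks of every tool, here is a rough Python sketch of the two rules. The allowed values are the ones listed in this thread; the incompatible pairs are placeholders for illustration only and are NOT taken from the specification. Storing each pair as a `frozenset` makes the relation symmetric by construction. The names `check_type_list`, `ALLOWED_TYPES` and `INCOMPATIBLE` are made up.

```python
from itertools import combinations

ALLOWED_TYPES = {"pass-fail", "scoring", "multi-pass", "interactive", "submit-answer"}

# ASSUMPTION: illustrative pairs only; the real list belongs to the specification.
# Each pair is unordered, so "X is incompatible with Y" implies the converse.
INCOMPATIBLE = {
    frozenset({"pass-fail", "scoring"}),
    frozenset({"submit-answer", "multi-pass"}),
    frozenset({"submit-answer", "interactive"}),
}


def check_type_list(types):
    """Return a list of human-readable problems; empty means the list is fine."""
    errors = []
    for t in types:
        if t not in ALLOWED_TYPES:
            errors.append(f"unknown type: {t}")
    if len(set(types)) != len(types):
        errors.append("repeated value")
    # Check every unordered pair of distinct values exactly once.
    for a, b in combinations(sorted(set(types)), 2):
        if frozenset({a, b}) in INCOMPATIBLE:
            errors.append(f"{a} is incompatible with {b}")
    return errors
```

Every tool that reads the field would have to carry some version of this logic, which is the duplication a declarative schema is meant to avoid.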