pandastrike/jsck

Explicitly allowing undefined

Closed this issue · 29 comments

moll commented

Hey,

I know the JSON schema spec has no reference to undefined as a valid type, but I've found it a very useful concept besides null. How does one allow undefined values with Jsck? With Themis I've found I can use the non-standard {type: ["undefined", "string"]} schema nicely.

Thanks!

Could you provide an example schema, a valid, and and invalid document?

In general, it is quite easy to allow a property to be undefined by merely refraining from making it be "required".

moll commented

Sure.

Schema with the proposed undefined type:

{ type: 'object',
  properties:
   { id: { type: [ 'undefined', 'number' ] }}
}

Valid documents:

{id: undefined}
{id: 42}
{}
Object.create({id: 42})

Invalid document:

{id: "69"}
{id: null}
Object.create({id: null})

Update: Added inherited properties example just in case.

I'm inclined to say that one should never distinguish

{id: undefined}

from

{}

null is different from undefined for a reason.

This preference is reflected in the JS behavior:

JSON.stringify({id: undefined})

Returns

{}

Counterpoint: I have said to several people before that "JSON Schema" is a misnomer. That it might be better to call it "ON Schema", because you can express the schemas and documents in 'most any form.

Counter-counterpoint: The ability to treat undefined as a value is one of the Bad Parts of Javascript.

Fixing JSCK to work the way you ask, @moll, is a one-liner that should not affect performance much.

in the test_type function, I would just add this as the first line:

    return true if typeof(data) == "undefined"

But I would really like to discourage the use of undefined as a value. Presumably you're not using it in an object literal, but it is being set as a value without direct intent.

moll commented

Great conversation with self you got going there. :-)

I've actually found convenient uses for the undefined and null duo — the former can be used to explicitly signal a known property with an unknown value and the latter a known property with an unset value. Most of the time it's irrelevant semiotics, but I'm using them to indicate lazy loaded or uninitialized properties.

I'm with you that {} and {name: undefined} should be considered equivalent in every other case. That's why I put them both in the valid box.

Counterpoint: I have said to several people before that "JSON Schema" is a misnomer. That it might be better to call it "ON Schema", because you can express the schemas and documents in 'most any form.

Hehe. I also cringe every time someone says "JSON object". Given that JSON, like XML, is always, by definition, a string, it's as if to say "plain text object" — a nonsensical utterance.

Fixing JSCK to work the way you ask, @moll, is a one-liner that should not affect performance much.

Thanks. Is that something you're willing to do to this public project or are you talking about monkey patching?

I'm looking for a reasonable alternative to that change, but will probably go with it tomorrow.

Given that JSON, like XML, is always, by definition, a string, it's as if to say "plain text object" — a nonsensical utterance

I wish to take it a little further than that, I suppose. So maybe it's not even "ON Schema" I want, it's "O Schema". Possibly "Reductionist Schema".

I've used JSON Schemas in JavaScript, CoffeeScript, Ruby, and Python, but never expressed them as actual JSON.

BTW @moll When I said "Fixing JSCK to work the way you ask", I was inaccurate.

I won't support "undefined" as a primitive type, but what I will do, and what is arguably the right option for all validators, is to have JSCK treat explicit undefined values in the same way it does actual undefinedness.

fwiw, I saw a Douglas Crockford talk last week in San Francisco where he said that he had experimented with abandoning null completely, in favor of undefined, as a kind of "Good Parts Evolved." I think his main impetus for doing so was that typeof null returns "object". anyway, he declared the experiment a success, and basically considers null part of JavaScript's "bad parts" now.

however, Crockford's whole "good parts" / "bad parts" thing is very much a matter of opinion. It's really just a style guide.

re the actual ticket, I didn't get this comment actually:

I won't support "undefined" as a primitive type, but what I will do, and what is arguably the right option for all validators, is to have JSCK treat explicit undefined values in the same way it does actual undefinedness.

@gilesbowkett

I don't want to add "undefined" as one of the primitive type names usable as the value of the JSON Schema "type" keyword.

I do want to treat undefined-as-a-value in the same way I treat an, er, undefined property.

moll commented

Ah. That, I don't think is right in all cases either.

A set, but undefined property can cause havoc when not expected (in pseudo-API):

var isValid = new Schema({
  type: "object",
  properties: {name: {type: "string"}, age: {type: "number"}}
})

var DEFAULTS = {name: "", age: 0}
var input = {name: undefined, age: 42}
var person = isValid(input) ? assign({}, DEFAULTS, input) : DEFAULTS

You're going to end up with person.name being the invalid undefined, rather than an empty string.
When using prototypical inheritance there's also a difference between a set undefined property and one not set at all. So it's not that clear cut.

Explicitly allowing undefined is useful for code that expects it. In other cases it's not a good idea.

I'm inclined to say that one should never distinguish
{id: undefined}
from
{}
null is different from undefined for a reason.

I take half of my agreement back. :-) In plain property access they're equivalent, but when iterating, they definitely matter as the case above shows.

The example with JSON.stringify ignoring undefined is a little misleading. It could be said that rather than it ignoring undefined, it just doesn't serialize things it can't serialize. When you add a custom serializer, it perfectly well calls it with undefined values.

var input = {name: undefined, age: 42}

Basically, I object to the use of undefined in a literal in any situation, ever.

moll commented

Perfectly fine objection. But don't forget that input is merely a placeholder for input out of your control. ;-)

Yeah, I get that sometimes you are dealing with bad input. But that's no reason to propagate bad design decisions, like settable undefined.

When you're dealing with a library that uses the bad parts of Javascript, I think it is reasonable to provide your own prophylactic layer, rather than expecting packages like JSCK to handle bad things.

moll commented

What's a settable undefined? Ending up with undefined properties is a piece of cake. That's why validation is useful — to ensure validity.

var person = req.body && {name: req.body.name, age: req.body.age}
moll commented

If you don't want the undefined type specification in the core Jsck, that's fine.
I do find it useful, so is there a way I could subclass Jsck (rather than fork) to add it for when I need it? Override the isPrimitive function perhaps? Thanks!

@moll I suspect the majority of cases can be handled by the change I proposed.

Would you be able to provide me with some use cases you think it can't handle?

So the major problem with this feature request, from my point of view, is that JSON Schema has a very useful universality to it. It's not unusual to build an app in one language, and then rebuild small pieces of it in a new language, with an emphasis on speed. The canonical example would probably be refactoring a monorail (monolithic Rails app) out into services, with those services written in faster languages like Node or Go. Because of this, we want to avoid language-specific features, e.g., undefined.

This seems to be to a request to allow extending the validator, which seems useful and interesting. It satisfies our usual maxim of No Hermetic Abstractions. And it's a scenario where using functions rather than code generation is an advantage. It intuitively seems easier to override or add a function that performs a given test than to do the same thing for a function that generates code.

I think that's what @automatthew was suggesting, but I'm not entirely sure. @gilesbowkett, this would satisfy your criteria that it doesn't alter JSON Schema, which would remain the default. But it would allow developers to tweak it to their needs. In my experience, this is a very common scenario, where you're interoperating with a service that, for whatever reason, isn't quite following the rules.

moll commented

@moll I suspect the majority of cases can be handled by the change I proposed.
Would you be able to provide me with some use cases you think it can't handle?

Indeed, that change will solve a particular case I have (because I was precisely after allowing undefined), but create validation holes elsewhere that I described above that I'm not comfortable with.

So the major problem with this feature request, from my point of view, is that JSON Schema has a very useful universality to it.
...
Because of this, we want to avoid language-specific features, e.g., undefined.

I don't mind the core ignoring that, hence my question about subclassing. ;-)

Extensibility and/or subclassing is definitely a good thing. I'll see what I can do about making JSCK easy to extend.

@automatthew given the mixin-based approach to supporting drafts…could we do the same thing for extending...just allow for additional mixins?

@dyoder probably that would work. I need to check though, to make sure that there aren't unintended consequences. And that there isn't a better way.

While I don't want JSCK to add additional primitives, I have no objection to giving JSCK the ability to allow users to add additional primitives.

@automatthew without having reflected a whole lot on the alternatives, i think mixins would be a fantastic way to do this. for two reasons: (a) you're already doing it that way to handle variations between drafts; and (b) it would require adding only an optional argument to creating validators.

Also, I just love mixins. That part is sort of irrational and has to do with having learned OOP via a dialect of Flavors Lisp, so I always think of Steve'sAmy's ice-cream.

I would like if at least JSCK allowed non-explicit undefined values from failng validation. I believe this is what @automatthew intended in #84 (comment). Even if the debate about the semantics around explicit undefined continues, that would be a good improvement

I've been running with that patch, and it seems to work well, however another change was required. If a field defined as enum is not defined, a similar change is needed at https://github.com/pandastrike/jsck/blob/master/src/common/comparison.coffee#L10

EDIT:Actually, upon further inspection, all tests including test_type appear to be handled if the undefined check is performed (solely) at https://github.com/pandastrike/jsck/blob/master/src/validator.coffee#L257. I'm not sure if this would be desirable if "undefined" was made an explicit type, but as a "don't let undefined values fail validation" mechanism it seems correct.

I'll be reviewing this today. Apologies for the delays.