nvie/decoders

[Bug] Builtin composite decoders cannot take decoders with non-`unknown` input types

treykasada opened this issue · 8 comments

Version

2.0.0-beta9

Steps to reproduce

  1. Define a Decoder<O, I> where I is a type other than unknown
  2. Attempt to pass the defined Decoder<O, I> to any of the builtin composite decoders

E.g.

import { Decoder, DecodeResult, optional } from 'decoders';
import { unknown } from 'decoders/annotate';
import { err, ok } from 'decoders/result';
import * as uuid from 'uuid';

const binaryUUIDFromStringUUID: Decoder<Uint8Array, string> = (uuidString: string): DecodeResult<Uint8Array> => {
  try {
    return ok(Uint8Array.from(uuid.parse(uuidString)));
  } catch (error) {
    return err(unknown(error));
  }
};

const optionalBinaryUUIDFromStringUUID = optional(binaryUUIDFromStringUUID);

Expected result

No type errors.

Actual result

TS2345: Argument of type 'Decoder<Uint8Array, string>' is not assignable to parameter of type 'Decoder<Uint8Array, unknown>'. 
  Type 'unknown' is not assignable to type 'string'.

Affected composite decoders/functions:

  • guard
  • compose
  • predicate
  • transform
  • array
  • nonEmptyArray
  • set
  • describe
  • dict
  • exact
  • inexact
  • mapping
  • object
  • either
  • lazy
  • maybe
  • nullable
  • optional
  • taggedUnion
  • tuple

Motivating use-cases

  • The binary UUID example from above
  • Decoding arrays with a known element type without having to re-assert the known type:
    const bigIntFromString: Decoder<bigint, string> = ...;
    const bigIntArrayFromStringArray: Decoder<bigint[], string[]> = array(bigIntFromString);
  • Decoding objects with known value types without having to re-assert the known types:
    const booleanFromString: Decoder<boolean, string> = ...;
    const bigIntFromString: Decoder<bigint, string> = ...;
    const recordFromStringifiedValueRecord: Decoder<{ foo: boolean, bar: bigint }, { foo: string, bar: string }> =
      object({ foo: booleanFromString, bar: bigIntFromString });
  • Piece-wise composition of decoders:
    const bigIntFromString: Decoder<bigint, string> = ...;
    const bigIntFromJSON: Decoder<bigint> = compose(compose(json, string), bigIntFromString);
nvie commented

Hi, thanks for being an early adopter for v2 ❤️! I've been iterating and making significant changes, even within each 2.0.0-beta version. Even between beta9 and the upcoming beta10 I've reworked a lot of the internals (see below), and I'm planning to drop compose() entirely soon in favor of a simpler approach for building custom decoders yourself. Please bear with me for the time being 🙏!

First of all, have you noticed that there now exists (only since v2.0.0-beta9!) a built-in uuid decoder, which will accept valid UUID strings? If that's all you need, you can simply use that. It won't do anything with the "uuid" package/library, though. If you need the binary version, you can easily build one yourself in a much simpler way than by using compose() or working with the decoders/annotate or decoders/result internal modules.

You can use a transform()1:

import { Decoder, describe, optional, transform, string } from 'decoders';
import * as uuid from 'uuid';

const binaryUUID: Decoder<Uint8Array> = describe(
  transform(string, value => uuid.parse(value)),
  'Must be UUID string'
);

const optionalBinaryUUID = optional(binaryUUID);

What transform() does here is first use the default string decoder to verify that the incoming value is actually a string (and not a number, for example), and then call uuid.parse() on that result. If uuid.parse() throws, its error message would get annotated on the input, but by wrapping this in a describe() you can provide a more user-friendly error message.

To implement simple decoders like this, you should not have to be exposed to the decoders/result or decoders/annotate helper modules at all!

Just as a heads-up: please note that these APIs will change one more time in the upcoming 2.0.0-beta10, where I'll move these to decoder "methods" for even better readability. In beta10 you'll have to write it like this:

const binaryUUID: Decoder<Uint8Array> =
  string
    .transform(value => uuid.parse(value))
    .describe('Must be UUID string');

(Please note that v2.0.0-beta10 hasn't been published yet, but if you look at https://decoders.cc, then you'll notice that's how it's documented there already.)

Footnotes

  1. In v1, this was called map(). ↩

nvie commented

As far as your motivating use cases are concerned:

Decoding arrays with a known element type without having to re-assert the known type:

Will become:

const bigIntFromString: Decoder<bigint> = string.transform(s => BigInt(s));
const bigIntArrayFromStringArray: Decoder<bigint[]> = array(bigIntFromString);

Decoding objects with known value types without having to re-assert the known types:

const booleanFromString: Decoder<boolean> = string.transform(s => s === 'true' /* or however you want */);
const recordFromStringifiedValueRecord: Decoder<{ foo: boolean, bar: bigint }> =
  object({ foo: booleanFromString, bar: bigIntFromString });

etc…

I think you get the point from these examples. Hope that helps!

Hey, thanks for the replies! 🙂

Just to give some context: What I'm really looking for in this library is the ability to both

  • Define "validators"/"parsers" for data of entirely unknown types
  • Define "refinements"/"transformations" for data of known types

As of 2.0.0-beta9, it looks like the library has really strong support for the first point, but only limited support for the second. I was able to achieve the second point by defining Decoders with specific input types (rather than unknown); the problem was just that none of the composite decoders seemed to work with them (which is why I raised this as a bug rather than a feature request, haha).

If you're planning on dropping compose though, it sounds like this has become more of a feature request! 😅

Should I maybe raise a new issue for that? Or would you rather I talk more about it here?

Oh, side-note on the API changes:

please note that these APIs will change one more time in the upcoming 2.0.0-beta10, where I'll move these to decoder "methods"

Thanks for the heads up! 🙂

Though I confess I'm a little sad to see this change. Another library in the same space called funtypes offers both a method-based and a function-based API for composing things, and we initially went with the method-based API. We found over time that the method-based API made it hard to understand how things were composed in some cases, and switching to the function-based API was a lot clearer. It's definitely not a deal-breaker though!

nvie commented

Thanks for the response! There are a few things to unpack here. Hope the below sheds a little light on our thinking for the 2.x changes!

Defining a "validator" / "parser" for data of unknown types is indeed exactly what a Decoder<T> is for! 👍

Defining "refinements" / "transformations" for existing decoders is also very much possible, with the new/upcoming .transform() and .refine() methods. These replace the unclearly named 1.x map, compose, and predicate decoders, which had quite a few gotchas. So I'm not removing the ability to compose pipelines so much as removing the weird / unintuitive / suggestive API that existed for it in 1.x. The goal for 2.x is to be able to do the same things, but more simply.

When your goal is refining data of a known type into something narrower or different, you won't need the decoders library at all. Normal functions are already perfect for that!

That is to say, if you want to convert a string to a bigint (as in your const bigIntFromString: Decoder<bigint, string> example), I wouldn't bother using decoders for that. The BigInt constructor is a normal function that already takes a string and returns a bigint value – no decoders needed!

[Screenshot: Screen Shot 2022-01-25 at 22 00 17@2x]
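
The screenshot presumably demonstrates this point in a console; the equivalent, as a plain snippet (no decoders involved):

```typescript
// BigInt is an ordinary known-to-known function: string in, bigint out.
const big: bigint = BigInt('1234567890123456789');
console.log(typeof big); // "bigint"

// And it throws on non-numeric input, like any normal function might:
try {
  BigInt('not a number');
} catch (e) {
  console.log('threw as expected');
}
```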

Now, if you want to accept bigint strings from untrusted (!) input positions, then you're in Decoder territory, and you can build a Decoder<bigint> yourself by first pulling the input through a string decoder and then invoking that same (safe) transformation function:

const bigint: Decoder<bigint> = string
  .transform(s => BigInt(s))
  .describe('Must be a numeric string');  // Suggested, but optional

Here, you're composing a bigint decoder from a string decoder and that very same string-to-bigint function. The function works for all known-to-known values, but it should not be called a "decoder". In this line of thinking, a decoder always goes from unknown to T, no exceptions.

tl;dr

  1. If you have a known string value already, use a normal string => bigint transformation function
  2. If you work with an unknown value, define a bigint decoder by composing it from a base string decoder, and that very same transformation function from (1) (see the example above)

I have seen this confusion more often with users coming from v1.x, and I hold the second I param of Decoder<T, I> accountable for it, as it may falsely suggest that defining a Decoder that takes a known input param is a good idea. It's fine to define such transformations, of course! But those things should not be called "decoders". In 2.x, we're correcting that mistake in the design and using a more precise definition of what a decoder is: an instrument to parse an unknown input value into a known and verified value at runtime. That's why in 2.x (starting with the upcoming beta10 release) the second I type param has been removed from the Decoder<T> type definition entirely.

If you're planning on dropping compose though, it sounds like this has become more of a feature request! 😅

As said above, the ability to compose decoders hasn't been removed, only the confusing API. You can still compose the bigint decoder (which works for string inputs and returns bigints) like so, no problem:

const biggies: Decoder<bigint[]> = array(bigint);
const maybeBiggy: Decoder<bigint | undefined> = optional(bigint);

Though I confess I'm a little sad to see this change. […] We found over time that the method-based API made it hard to understand how things were composed in some cases […]

I acknowledge this is indeed very much a trade-off. The reason we started this transition was that declaring extra predicate checks and transformations was pretty clunky before. Refinement, transformation, and tweaking error messages make sense for all decoders and are super common, so we figured that as methods they would be more discoverable through IDE completion and would make it much easier to implement custom decoders yourself.

Compare:

// ❌ v1.x
import { compose, number, predicate } from 'decoders';

const positive: Decoder<number> =
  compose(
    number,
    predicate((n) => n >= 0, 'Must be positive'),
  );

vs the upcoming 2.x version:

// ✅ v2.x (available in 2.0.0-beta10)
import { number } from 'decoders';

const positive: Decoder<number> =
  number.refine((n) => n >= 0, 'Must be positive');

Hopefully this explains some of the thinking, and hopefully you can agree this is a much simpler API and approach going forward. Thank you so much for helping with the testing – and for your patience while we're going through the motions with all the (upcoming) changes 🙏! Much appreciated!

nvie commented

Hey @treykasada – I just published 2.0.0-beta10, which enables the examples above. Would you like to give it a spin? Much appreciated! 🙏

Hey @nvie - thanks for your detailed comments! 🙂

We'll look at migrating to the new version at some point in the coming days. It'll require us to change how we're using the library a bit, though, as we're currently using compose in a few places and we weren't really expecting a breaking change of this magnitude in a beta release. 😅

Given that the API changes in the new version remove support for Decoders with non-unknown input types, the original issue no longer applies, so I'll go ahead and close this ticket. Thanks again for the discussion! 🙂

nvie commented

Yeah, I confess the beta iteration cycles have been a bit wild 🙈 But I think the end result is much nicer and more future-proof, and the churn now is mostly to avoid another major breaking change soon.

Btw, regarding your specific use of compose, there is also now a .then() method, which most closely resembles the old compose helper. (But I still think you won't need it for the aforementioned cases.)

Please do let me know if you need help converting anything in particular that isn't clearly documented enough โ€“ I much appreciate the early feedback and I'd be happy to help you out with anything.