Better guide and support for inconsistent structures
Closed this issue · 14 comments
Disclaimer
First of all I must say: this library is great!
Background
A few days ago I had to write 7 decoders for 7 different currency rate providers and ran into some trouble:
- the documentation for decoders covers only very simple cases; one needs to dig through the source code to discover all the power Thoth.Json.Net offers
- some common patterns are not supported out of the box, so I had to implement my own.
Solution
With an extended documentation page and a few more decoder operations (see DecodeX at the end), decoders for all the fancy, strange and inconsistent JSON become quite simple and very concise.
The question is:
would you consider extending the base Decode module a little? I could prepare a pull request. If not, then maybe I could just add a little bit to the documentation, showing some examples and solutions? That could help/attract new users.
Let me show you a few examples
The common theme to notice is: collect only the available currency rates and do not bother with the missing ones.
Currency Rate Data Provider A
Response can look like a list:
[ { "code": "GBPUSD", "rate": 1.3159, "timestamp": 1597732500 },
{ "code": "GBPZAR", "rate": "NA", "timestamp": 1597732500 },
{ "code": "USDHRK", "rate": 6.3305, "timestamp": 1597732500 } ]
or it can be a single valid rate:
{ "code": "GBPUSD", "rate": 1.3159, "timestamp": 1597732500 }
or it can be a single missing rate:
{ "code": "GBPZAR", "rate": "NA", "timestamp": 1597732500 }
Decoder:
let ratesDecoder: Decoder<RateObject list> =
    let rateDecoder: Decoder<RateObject option> =
        Decode.map3
            (fun (code: string) rate timestamp ->
                { Pair = pairOfString code
                  Rate = Rate rate
                  ValidFrom = Instant.FromUnixTimeSeconds timestamp })
            (Decode.field "code" Decode.string)
            // the sample responses above carry the value under "rate"
            (Decode.field "rate" Decode.decimal)
            (Decode.field "timestamp" Decode.int64)
        |> DecodeX.noFail
    // received only one rate (not a list)
    let caseA = rateDecoder |> Decode.map Option.toList
    // received list of rates
    let caseB = rateDecoder |> DecodeX.listFromOptions
    // caseB goes first: caseA is built on noFail and never fails,
    // so it would otherwise shadow caseB and yield [] for a list response
    Decode.oneOf [ caseB; caseA ]
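To make the three response shapes of provider A concrete, here is a minimal, self-contained sketch. It is not the original code: `SimpleRate` is a hypothetical stand-in for `RateObject` (no NodaTime, no `Pair` type), and `noFail` is written with `Decode.oneOf` and `Decode.succeed` rather than with the lambda form from DecodeX, which should behave the same way. Note the ordering inside `Decode.oneOf`: the single-object case can never fail once it is wrapped in `noFail`, so the list case is tried first.

```fsharp
#r "nuget: Thoth.Json.Net"
open Thoth.Json.Net

// Hypothetical, simplified stand-in for RateObject.
type SimpleRate = { Code: string; Rate: decimal; Timestamp: int64 }

// Combinator-only sketch of DecodeX.noFail: a failed decode becomes Ok None.
let noFail (decoder: Decoder<'a>) : Decoder<'a option> =
    Decode.oneOf [ Decode.map Some decoder; Decode.succeed None ]

let rateDecoder: Decoder<SimpleRate option> =
    Decode.map3
        (fun code rate ts -> { Code = code; Rate = rate; Timestamp = ts })
        (Decode.field "code" Decode.string)
        (Decode.field "rate" Decode.decimal)
        (Decode.field "timestamp" Decode.int64)
    |> noFail

let ratesDecoder: Decoder<SimpleRate list> =
    Decode.oneOf
        [ // list shape: Decode.list fails on a single object, so oneOf moves on
          Decode.list rateDecoder |> Decode.map (List.choose id)
          // single-object shape: never fails, therefore it must come last
          rateDecoder |> Decode.map Option.toList ]

let listJson =
    """[ { "code": "GBPUSD", "rate": 1.3159, "timestamp": 1597732500 },
         { "code": "GBPZAR", "rate": "NA", "timestamp": 1597732500 },
         { "code": "USDHRK", "rate": 6.3305, "timestamp": 1597732500 } ]"""

let singleJson = """{ "code": "GBPUSD", "rate": 1.3159, "timestamp": 1597732500 }"""
let missingJson = """{ "code": "GBPZAR", "rate": "NA", "timestamp": 1597732500 }"""

// Decode and keep only the codes, to make the outcome easy to read.
let codes json =
    Decode.fromString ratesDecoder json
    |> Result.map (List.map (fun r -> r.Code))

printfn "%A" (codes listJson)    // Ok ["GBPUSD"; "USDHRK"]
printfn "%A" (codes singleJson)  // Ok ["GBPUSD"]
printfn "%A" (codes missingJson) // Ok []
```

The missing rate is dropped silently in all three shapes, which matches the "collect only the available rates" theme.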
Currency Rate Data Provider B
{
  "ts": "2021-08-13T09:54:25Z",
  "EUR_PLN": { "bid": "4.5700", "ask": "4.5709" },
  "GBP_PLN": { "error": "some description" },
  "USD_PLN": { "bid": "3.8934", "ask": "3.8943" }
}
Decoder:
let ratesDecoder: Decoder<RateObject list> =
    let bidAskDecoder: Decoder<decimal option> =
        Decode.map2
            (fun bid ask -> (bid + ask) / 2M)
            (Decode.field "bid" Decode.decimal)
            (Decode.field "ask" Decode.decimal)
        |> DecodeX.noFail
    let tsDecoder: Decoder<Instant> =
        Decode.string
        |> DecodeX.nodaParse InstantPattern.ExtendedIso.Parse
    let rateObject ts (key, value) =
        { Pair = pairOfString key
          Rate = Rate value
          ValidFrom = ts }
    Decode.map2
        (rateObject >> List.map)
        (Decode.field "ts" tsDecoder)
        (DecodeX.keyValueOptions bidAskDecoder)
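The provider B decoder combines one known field ("ts") with an unknown set of keys. A stripped-down sketch of the same idea follows, assuming the timestamp is kept as a plain string instead of being parsed with NodaTime and the result is a simple tuple rather than `RateObject` values. The names `midDecoder`, `midsDecoder` and `responseDecoder` are made up for this illustration. The "ts" entry itself is also visited by `Decode.keyValuePairs`, but it has no "bid"/"ask" fields, so it is dropped along with the error entry.

```fsharp
#r "nuget: Thoth.Json.Net"
open Thoth.Json.Net

// Combinator-only sketch of DecodeX.noFail: a failed decode becomes Ok None.
let noFail (decoder: Decoder<'a>) : Decoder<'a option> =
    Decode.oneOf [ Decode.map Some decoder; Decode.succeed None ]

// Mid-price of one entry; None for { "error": ... } and for the "ts" string.
let midDecoder: Decoder<decimal option> =
    Decode.map2
        (fun (bid: decimal) (ask: decimal) -> (bid + ask) / 2M)
        (Decode.field "bid" Decode.decimal)
        (Decode.field "ask" Decode.decimal)
    |> noFail

// Visit every key of the object and keep only the ones that decoded.
let midsDecoder: Decoder<(string * decimal) list> =
    Decode.keyValuePairs midDecoder
    |> Decode.map (List.choose (fun (key, mid) -> mid |> Option.map (fun m -> key, m)))

// Known field and unknown keys are decoded from the same object.
let responseDecoder: Decoder<string * (string * decimal) list> =
    Decode.map2
        (fun ts mids -> ts, mids)
        (Decode.field "ts" Decode.string)
        midsDecoder

let json =
    """{ "ts": "2021-08-13T09:54:25Z",
         "EUR_PLN": { "bid": "4.5700", "ask": "4.5709" },
         "GBP_PLN": { "error": "some description" },
         "USD_PLN": { "bid": "3.8934", "ask": "3.8943" } }"""

let result = Decode.fromString responseDecoder json

match result with
| Ok (ts, mids) -> printfn "%s: %A" ts (mids |> List.map fst)
| Error e -> printfn "decoding failed: %s" e
```

Only the EUR_PLN and USD_PLN entries survive; GBP_PLN and the timestamp key are filtered out without any special-casing.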
Currency Rate Data Provider C
Can you guess the JSON by just reading the decoder below? Hint: start from bottom.
Decoder:
let ratesDecoder: Decoder<RateObject list> =
    let tsPattern =
        LocalDateTimePattern.CreateWithInvariantCulture("dd-MM-yyyy HH:mm:ss.FFF")
    let tz =
        DateTimeZoneProviders.Tzdb.["America/New_York"]
    let tsDecoder: Decoder<Instant> =
        Decode.map2
            (fun date time -> date + " " + time)
            (Decode.field "D791" Decode.string)
            (Decode.field "D768" Decode.string)
        |> DecodeX.nodaParse tsPattern.Parse
        |> Decode.map (fun it -> (it.InZoneLeniently tz).ToInstant())
    Decode.map5
        (fun c1 c2 bid ask ts ->
            { Pair = Currency c1, Currency c2
              // mid-price: parenthesised so the sum, not only ask, is halved
              Rate = Rate((bid + ask) / 2M)
              ValidFrom = ts })
        (Decode.field "S4577" Decode.string)
        (Decode.field "S4578" Decode.string)
        (Decode.field "D4" Decode.decimal)
        (Decode.field "D6" Decode.decimal)
        tsDecoder
    |> Decode.list
Currency Rate Data Provider D
Again, can you guess the JSON by just reading the decoder below?
Decoder:
let ratesDecoder: Decoder<RateObject list> =
    let rateObject ts (pair, rate) =
        { Pair = pair
          Rate = rate
          ValidFrom = Instant.FromUnixTimeSeconds ts }
    let quotesDecoder: Decoder<(Pair * Rate) list> =
        Decode.map3
            (fun c1 c2 mid -> (Currency c1, Currency c2), Rate mid)
            (Decode.field "base_currency" Decode.string)
            (Decode.field "quote_currency" Decode.string)
            (Decode.field "mid" Decode.decimal)
        |> DecodeX.noFail
        |> DecodeX.listFromOptions
    Decode.map2
        (rateObject >> List.map)
        (Decode.field "timestamp" Decode.int64)
        (Decode.field "quotes" quotesDecoder)
DecodeX
[<RequireQualifiedAccessAttribute>]
module DecodeX =
    /// successful decoding yields successful Some,
    /// unsuccessful decoding yields successful None
    let inline noFail (decoder: Decoder<'a>) : Decoder<'a option> =
        fun path token ->
            match decoder path token with
            | Ok x -> Ok(Some x)
            | Error _ -> Ok None
    /// collects only successful items in lists
    let inline collectOptions (decoder: Decoder<'a option list>) : Decoder<'a list> =
        decoder |> Decode.map (List.choose id)
    let inline listFromOptions (decoder: Decoder<'a option>) : Decoder<'a list> =
        decoder |> Decode.list |> collectOptions
    let inline keyValueOptions (decoder: Decoder<'a option>) : Decoder<(string * 'a) list> =
        decoder
        |> Decode.keyValuePairs
        |> Decode.map (
            List.collect
                (fun (key, aOpt) ->
                    match aOpt with
                    | Some a -> [ key, a ]
                    | None -> [])
        )
    let inline flattenResult (decoder: Decoder<Result<'a, string>>) : Decoder<'a> =
        decoder
        |> Decode.andThen
            (function
            | Ok value -> Decode.succeed value
            | Error str -> Decode.fail str)
    // this one is specific to NodaTime, so it certainly could not go into Thoth.Json.Net
    let inline nodaParse (nodaTimeParser: string -> ParseResult<'a>) (decoder: Decoder<string>) : Decoder<'a> =
        decoder
        |> Decode.map (nodaTimeParser >> NodaX.parserResultToResult)
        |> flattenResult
Hello @witoldsz,
thank you for taking the time to write this issue.
I have postponed the rework of the Thoth.Json documentation for a long time.
I am happy to announce that I am now working on it, both because you pinged me and because Nacara v1 has been reached, which gives me a good base for building documentation.
About extending the core library: I am not against it, I just need to think over the different propositions, both in terms of naming and whether they are general purpose enough.
So what I propose is:
- Work on the documentation rework
- Review your different propositions to see if we can add them as they are or need to improve the naming, etc.
The work on the documentation has already started; when it is ready, I can ping you to review it.
That way you can provide feedback and make additions where you think things are missing.
On another subject, you should avoid making your general purpose functions inline, because they will increase the bundle size each time you use them.
OK, so you will cover the case while working on the new documentation, is that right?
OK, so you will cover the case while working on the new documentation, is that right?
I would say yes, but I am not sure which case you are speaking about.
If you mean:
- the fact that numbers are encoded as strings, then yes
- adding examples for more complex encoders/decoders like DUs and composition using andThen, map, etc., then yes too
If there are more cases that you think should be covered, please mention them so we can discuss them and include them if needed.
OK, I will just wait to see what you come up with.
From my perspective, it would be great if the new docs covered cases like the ones I have described: { ts: date, EUR_USD: {rate}, EUR_CHF: {rate}, …} (so you do not know what the keys are), or when you want to define a decoder for the successful case and then collect over the list or map; you know, what I have described as the A, B and C data providers.
The documentation rework has been completed, please have a look and see if this covers the subjects you had in mind.
The new documentation is nice and all, but it does not cover the cases I have described above:
- how to decode object with unknown fields, e.g.
{ "EURUSD": 1.134323, "USDCZK": 21.6945, ...possibly many others }
- how to decode object with some fields known and some unknown (and I do not mean optional), e.g.
{ "time": ..., "EURUSD": ..., and possibly many others }
- how to decode object with known or unknown fields but with different shape of values, e.g.
{ "EURUSD": 1.134323, "USDCZK": "n/a", etc… }
- how to decode some bizarre situations like with one provider which can either return data like this:
{ "base_currency": "EUR", "quote_currency": "USD", "rate": 1.134323 }
or, if it has more than one result, inside an array:
[ { "base_currency": "EUR", "quote_currency": "USD", "rate": 1.134323 }, … ]
and you never know which shape you get.
To sum it up, basically all the cases I had when I was implementing decoders for several currency rates data providers, described in this ticket.
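As an illustration of the first and third bullets (unknown field names, values of differing shape), here is a small sketch built only from `Decode.keyValuePairs`, `Decode.oneOf` and `Decode.map`. The helper name `noFail` follows DecodeX from earlier in this thread; the decoder name `flatRatesDecoder` and the choice of a `Map` as the result are assumptions made for the example.

```fsharp
#r "nuget: Thoth.Json.Net"
open Thoth.Json.Net

// Combinator-only sketch of DecodeX.noFail: a failed decode becomes Ok None.
let noFail (decoder: Decoder<'a>) : Decoder<'a option> =
    Decode.oneOf [ Decode.map Some decoder; Decode.succeed None ]

// Every key is a currency pair; values that are not decimals (e.g. "n/a") are skipped.
let flatRatesDecoder: Decoder<Map<string, decimal>> =
    Decode.keyValuePairs (noFail Decode.decimal)
    |> Decode.map (
        List.choose (fun (pair, rate) -> rate |> Option.map (fun r -> pair, r))
        >> Map.ofList
    )

let json = """{ "EURUSD": 1.134323, "USDCZK": "n/a" }"""

let decoded = Decode.fromString flatRatesDecoder json

match decoded with
| Ok rates -> rates |> Map.iter (fun pair rate -> printfn "%s = %M" pair rate)
| Error e -> printfn "decoding failed: %s" e
```

Only EURUSD ends up in the map; the "n/a" entry is ignored rather than failing the whole decode.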
I was able to write all the decoders successfully because I have some background in FP and in Elm decoders, and because I studied the source code of Thoth.Json.Net to figure it all out. Someone with less background will end up using some other library/language with ugly imperative code to decode it (my project was to replace old JS code with F#).
@witoldsz Thank you for taking the time to look at the documentation.
I kind of knew that it would not cover all the things you listed in this issue, but I had a hard time processing it (I am on my 10th re-read now ahah 😅), because the cases you are describing are not simple.
From what I understand now:
- Add documentation about Decode.oneOf, because indeed I forgot about this helper, which is common. This will help cover cases where you can have different shapes for your type. For example:
  { "$type": "int", "value": 42 }
  42
  I am using simpler JSON than yours, but the idea should be the same.
- Add documentation about the case where you have an object with some fields whose names you don't know but need to retrieve. In that case, you used Decode.keyValuePairs to decode it and retrieve a list of fields.
- Review the list of decoders from DecodeX and decide if they should go into Thoth.Json, into a dedicated library, or just be placed in a guide.
Am I right?
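For reference, the two-shape example from the first point could be decoded roughly like this. It is only a sketch: the decoder name `valueDecoder` is made up here, and the "$type" field is simply ignored rather than checked.

```fsharp
#r "nuget: Thoth.Json.Net"
open Thoth.Json.Net

// Accept either a bare number or an object carrying the number under "value".
let valueDecoder: Decoder<int> =
    Decode.oneOf
        [ Decode.int
          Decode.field "value" Decode.int ]

let bare = Decode.fromString valueDecoder "42"
let wrapped = Decode.fromString valueDecoder """{ "$type": "int", "value": 42 }"""

printfn "%A" bare    // Ok 42
printfn "%A" wrapped // Ok 42
```

`Decode.oneOf` tries the decoders in order and returns the first success, so the more specific or more likely shape can be listed first.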
Well, it is not just the 3 points you've mentioned, but also the clever ways to combine them to get an outstanding result. It's like having LEGO blocks for the first time, or seeing the "manual" of chess: it does not mean you know how to build what you need or how to play the game. One needs some hints and maybe some extra tools (like the ones I had to build to cover all the data providers with ease).
For example, Decode.keyValuePairs itself is cool, but see what DecodeX.keyValueOptions can do with it if you know how to connect a few dots on the decoders diagram.
P.S.
I have just realized the new documentation does not include the reference manual, where one could see all the available operators with short description and (hopefully) links to Thoth.Json and Thoth.Json.Net code. That would be awesome!
but also the clever ways to combine them to get an outstanding result.
I get your point. Originally I had a "Guide" section in the new documentation but removed it. I think I will re-add it and use it as an example of how to use Thoth.Json for complex types. I "just" need to find a good idea for a clear and concise example to use.
I have just realized the new documentation does not include the reference manual, where one could see all the available operators with short description and (hopefully) links to Thoth.Json and Thoth.Json.Net code. That would be awesome!
Yes, it doesn't have it yet. The reason is that I think some of the F# used in Thoth.Json is not rendered well by Nacara yet. The current version included in Nacara is really a POC that I put together to test the concept and to have it for Nacara, as it simplified the documentation a lot: I was able to put the documentation in the code directly :)
I am in the process of rewriting it from scratch to make it robust and tested, so I can fix bugs without introducing regressions. It should come in the coming weeks.
I added documentation about Decode.oneOf to the released documentation.
I also worked on an advanced section which tries to work with one of the JSON samples you provided and goes step by step over the code.
I kind of had to invent the example, because I wasn't sure what your code looked like, but I think a lot of things are covered by this example. Could you please have a look at it?
It is in this PR: #126
And to have a better experience you should probably run the documentation locally:
- Pull the branch
- npm i
- ./build.js docs --watch
I am looking at the advanced/unknown-fields section of #126 and for me it seems more complicated than it actually is.
I am sure that with two or three more functions in Decode (especially keyValueOptions and noFail, maybe with better names?), the whole solution would be not that advanced and could just be an addition to the "inconsistent structure" chapter.
The more I work on the documentation/example, the more I am stuck...
Right now, I feel that to fix this issue, the discussion should be: should the proposed decoders be added to the core library?
And if yes, add a doc comment explaining their usage, so the documentation is focused on a single problem at a time: the one that the decoder solves.
Using doc comments will also make the documentation available from IDEs and from the API reference generated by Nacara. Example of API reference
Example of doc comment:
/// <summary>
/// Map the result of the given decoder into an `Option` type.
/// </summary>
///
/// <example>
///
/// Successful decoder:
///
/// <code lang="fsharp">
/// let json = "42"
///
/// Decode.fromString (Decode.ignoreFail Decode.int) json
///
/// // Returns: Ok (Some 42)
/// </code>
///
/// Failed decoder:
/// <code lang="fsharp">
/// let json = "\"abc\""
///
/// Decode.fromString (Decode.ignoreFail Decode.int) json
///
/// // Returns: Ok None
/// </code>
/// </example>
/// <param name="decoder">Decoder to transform</param>
/// <returns>
/// Returns `Some 'T` if the decoder succeeds, `None` otherwise.
/// </returns>
let ignoreFail (decoder : Decoder<'T>) : Decoder<'T option> =
    fun path token ->
        match decoder path token with
        | Ok x -> Ok(Some x)
        | Error _ -> Ok None
I am not against expanding the core library but here are my current thoughts:
I think one problem I have with the proposed decoders is that they feel like a mix of decoders and logic. But there are already decoders like that:
Decode.all
Decode.oneOf
Decode.keyValuePairs
So, I think this is just me not having used them enough to get comfortable with them.
Now, the other problem I have is the naming. For me the names don't make it explicit enough what they do.
Decode.noFail
Opinion
let inline noFail (decoder: Decoder<'a>) : Decoder<'a option> =
    fun path token ->
        match decoder path token with
        | Ok x -> Ok(Some x)
        | Error _ -> Ok None
Proposition
- Decode.mapToOption because it maps the decoder into an Option
- Decode.toOption because it transforms the decoder into an Option
Decode.collectOptions
Opinion
/// collects only successful items in lists
let inline collectOptions (decoder: Decoder<'a option list>) : Decoder<'a list> =
    decoder |> Decode.map (List.choose id)
This decoder is specific to list, but the name doesn't show that. Also, shouldn't we offer the same decoder for array and seq?
Proposition
Should we offer a more high-level decoder not restricted to 'a option list?
- Decode.List.choose <-- Introduces a new concept of sub-modules which provide specialised decoders for a specific type. Kind of what users do when creating manual decoders for their own types.
- Decode.listChoose
let listChoose
    (decoder : Decoder<'A list>)
    (chooser : ('A -> option<'B>)) : Decoder<'B list> =
    decoder
    |> Decode.map (List.choose chooser)
Decode.listFromOptions
Opinion
let inline listFromOptions (decoder: Decoder<'a option>) : Decoder<'a list> =
    decoder |> Decode.list |> collectOptions
Should we have a version for array and seq too?
Does this decoder add value compared to the user manually writing the chain?
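To help weigh that question, this is roughly what the manually written chain looks like next to the helper. It is a sketch only: `noFail` is written here with `Decode.oneOf` instead of the lambda form, and `listFromOptions` is a local copy of the proposed helper, not an existing Thoth.Json.Net function.

```fsharp
#r "nuget: Thoth.Json.Net"
open Thoth.Json.Net

// Combinator-only sketch of the proposed noFail.
let noFail (decoder: Decoder<'a>) : Decoder<'a option> =
    Decode.oneOf [ Decode.map Some decoder; Decode.succeed None ]

// Local copy of the proposed helper (not part of Thoth.Json.Net).
let listFromOptions (decoder: Decoder<'a option>) : Decoder<'a list> =
    decoder |> Decode.list |> Decode.map (List.choose id)

// With the helper.
let viaHelper: Decoder<int list> =
    Decode.int |> noFail |> listFromOptions

// The same chain written by hand at the call site.
let viaChain: Decoder<int list> =
    Decode.list (noFail Decode.int) |> Decode.map (List.choose id)

let json = """[1, "x", 3]"""

let fromHelper = Decode.fromString viaHelper json
let fromChain = Decode.fromString viaChain json

printfn "%A" fromHelper // Ok [1; 3]
printfn "%A" fromChain  // Ok [1; 3]
```

The manual chain is a single extra `Decode.map (List.choose id)`, so the helper mostly buys a name for the intent rather than a large saving in code.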
Decode.keyValueOptions
Opinion
let inline keyValueOptions (decoder: Decoder<'a option>) : Decoder<(string * 'a) list> =
    decoder
    |> Decode.keyValuePairs
    |> Decode.map (
        List.collect
            (fun (key, aOpt) ->
                match aOpt with
                | Some a -> [ key, a ]
                | None -> [])
    )
The name is not really explicit and doesn't match the existing decoder Decode.keyValuePairs.
Proposition
- Decode.keyValuePairsFromOption <-- I don't really like that name because it feels like it is building from an object, while in reality it is working on an object with an optional/option decoder.
- Decode.optionalKeyValuePairs <-- Feels "ok", but it is not consistent with the fact that the other optionalXXX decoders return 'A option and not 'A
- Decode.KeyValuePairs.choose <-- Follows the Decode.List.choose proposition, which creates sub-modules providing specialised decoders for a specific type.
Decode.flattenResult
Opinion
let inline flattenResult (decoder: Decoder<Result<'a, string>>) : Decoder<'a> =
    decoder
    |> Decode.andThen
        (function
        | Ok value -> Decode.succeed value
        | Error str -> Decode.fail str)
Seems good to me
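A short usage sketch may help show where flattenResult fits: it lets any `string -> Result<'a, string>` parser be plugged into a decoder, which is how nodaParse uses it earlier in the thread. The parser `parsePair` below is a hypothetical example and not part of the original code.

```fsharp
#r "nuget: Thoth.Json.Net"
open Thoth.Json.Net

// Same shape as the proposed flattenResult, without inline.
let flattenResult (decoder: Decoder<Result<'a, string>>) : Decoder<'a> =
    decoder
    |> Decode.andThen (function
        | Ok value -> Decode.succeed value
        | Error msg -> Decode.fail msg)

// Hypothetical parser: "EURUSD" -> ("EUR", "USD").
let parsePair (s: string) : Result<string * string, string> =
    if s.Length = 6 then
        Ok(s.Substring(0, 3), s.Substring(3, 3))
    else
        Error(sprintf "Expected a 6-letter currency pair, got '%s'" s)

// The parser's Error becomes a regular decoding failure.
let pairDecoder: Decoder<string * string> =
    Decode.string |> Decode.map parsePair |> flattenResult

let good = Decode.fromString pairDecoder "\"EURUSD\""
let bad = Decode.fromString pairDecoder "\"EUR\""

printfn "%A" good // Ok ("EUR", "USD")

match bad with
| Ok _ -> printfn "unexpected success"
| Error _ -> printfn "rejected, as expected"
```

Because the failure flows through `Decode.fail`, it composes with `Decode.oneOf` or a noFail-style wrapper like any other decoder error.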
Your proposition and all the thoughts provided above are great. I am looking forward to reading it all through once I find some extra time. I would argue against a rush :)
The next release will update the documentation to include the section we worked on together in the past (sorry for the delay).