hjson/hjson-rust

Upgrade to Serde 1.0.x using Serde's JSON implementation

Closed this issue ยท 20 comments

Since Hjson is an extension of JSON, and the Serde team keeps their JSON implementation up to date, I think it would make sense to keep hjson-rust as a fork of serde-rs/json. That way any changes the Serde team makes can be merged or modified to apply to Hjson. I've already been working on the deserialization part for a bit, and I'll upload my progress on Monday.

That's great, thanks for your efforts! This is probably better than upgrading from the old 0.8 serde implementation. I do hope that post 1.0 serde we will get fewer breaking changes though.

Code is now uploaded here: https://github.com/dqsully/hjson-rust. I haven't touched the documentation or seralization, and I haven't tested any of my deserialization changes yet. If anyone wants to help, have at it.

@dqsully Is your library up-to-date and working? Maybe enabling issues might be a good idea and make it a community driven successor to this library. The crate hasnt been updated within the last two years and doesn't work with serde 1+ anyway.

@dqsully Sorry to ping you here, but the repo has no issues enabled so this is the only way to actually contact you.
HJSON was meant to be used as a human readable JSON alternative. I was curious if preserving the comments is a hard thing to achieve.
Maybe some custom field annotation like the rename one from serde #[serde(rename = "new_name")] would be a good way to do that on the struct itself. So upon serializing, the implementation can use the info to add a comment to each field. Something along the lines of #[serde(comment = "A new comment appeared")]

I tried to take a look at the code but I am fairly new to Rust and do not really understand what is ging on tbh...

Sorry for the late response, I've been away for a little while. Unfortunately comments would either have to be stored in a separate data type or the deserialized data types would have to implement a comment-storing trait. I suppose it wouldn't be too bad to make a SerializeComments trait or something and have separate functions for serializing/deserializing with comments. It would just mean that you would have to implement SerializeComments yourself using a wrapping type, just like how you would implement Serialize and Deserialize for a foreign type in serde.

If you want to work on my code, most of the magic is in src/de.rs, src/ser.rs, src/read.rs, and a bit in src/error.rs. Because Hjson is a superset of JSON I decided to fork serde-json and expand the parsing to accommodate the extra syntax. Right now the only work I think my fork needs is to carefully merge in the changes from serde-json and to fix/rewrite a bunch of the documentation. And then it should be ready to be published.

Also, I didn't realize I had issues disabled, sorry about that! It's fixed now.

Any progress on this? I'd love to be able to use serde 1.0+, having to use 0.8 to use serde-hjson is quite painful!

Right now this requires an overhaul of the Hjson code I wrote in order for it to be compatible with the latest serde-json changes. I've made some progress but it's slow.

Be aware that I have removed the link to this project from the Serde website and documentation for now -- serde-rs/serde-rs.github.io@f41183d and serde-rs/serde@e9cd73f. Happy to add it back when a version compatible with Serde 1.0 is available. Please remind me when it's ready!

Is this not likely to be updated to support Serde 1.0? It's been an open issue for a while. Oddly config-rs appears to use this with Serde 1.0 in their cargo.toml which imported fine for me. But when I tried to use this crate with serde directly, I could not compile due to an error about Deserialization not being implemented for my struct with derive. Someone has pointed out to me that it's due to this crate not supporting a newer Serde?

Yeah, that is correct. Upgrading to Serde 1.0 is a bigger undertaking than I thought and I haven't had the time recently. I even tried piggybacking off of serde-json as an upstream, but Hjson is too different to make that viable.

@dqsully if you have any notes or other recommendations based on your porting experiences, I would be quite appreciative.

The problem I ran into trying to port serde-json into serde-hjson was to do with source peeking and scratch buffers. JSON is pretty straightforward to parse, you know immediately based on the next character what kind of type you will be parsing. But with Hjson, many invalid values end up interpreted as a quoteless string instead, so you always need to be able to "rewind" and re-parse data into a different type. For a while I was using a scratch buffer from serde-json for this purpose, but an update ended up conflicting and using that same buffer for incompatible reasons.

Overall I think serde-json is a great place to start from. In fact I would probably use it as a dependency if I were to rewrite serde-hjson, if nothing else to leverage their Value type. Serde-json provides a solid outline for implementing serde-hjson, but the underlying implementation will have to change.

I think there could be value in a partial port. Below is a table listing some of the notable features of hjson, along with cost and benefit columns. These are subjective, so I would appreciate others' input.

Feature Implementation costs Feature costs and benefits
Unquoted strings for keys Low. Requires type detection, but there is a clear termination character, : High. Quoting strings is a pain.
Unquoted strings for values Low. Like unquoted keys, but up to two termination patterns, e.g. \n and ,\n or just ,. High. Possible syntactically significant whitespace.
# comments at start of line Low. Clear start character and end character, # and \n Low. Syntactically significant whitespace.
# comments within lines Medium. Increased expectations for comments within values. Low. Syntactically significant whitespace.
// comments at start of line Low. Like # comments but start pattern is // Low. Like # comments.
// comments within lines Medium. Like # comments but start pattern is // Low. Like # comments.
/**/ comments at start of line Low. Clear start pattern and end pattern. High. Comments are useful, and often enough they span multiple lines. No syntactically significant whitespace.
/**/ comments within lines Medium. Increased expectations for comments within values. Low. Complicates the language and has questionable utility. Comments between statements are often sufficient.
Multiline strings Low. Clear start and end pattern(s): ''' and perhaps """. High. Multiline strings are a major pain point.
Whitespace handling for multiline strings Medium. Requires effort to strip whitespace up to and including first newline, then strip whitespace up to the first value and equivalent (or lesser?) whitespace for subsequent values. Medium. Proper indentation would be a UX improvement.
Trailing commas Low. Requires allowing lists, objects to be optionally terminated with ,] and ,} respectively. Medium. Improves diffs. Makes language more consistent.
No trailing commas Low. Adds additional termination character \n. Low. Makes language less consistent. Syntactically significant whitespace.

Skipping all of the features that add syntactically significant whitespace and the indentation handling for multiline strings, I am left with:

  • Unquoted strings for keys and values
  • /**/ comments between statements
  • Multiline strings
  • Optional trailing commas

A few notes:

  • Unquoted string values in Hjson are only terminated by \n, not by commas
  • Comments can be implemented without too much trouble by inserting comment parsing into the method that consumes whitespace, that way you don't have to worry about comments on lines with other data. Because of this, I think that all comment types should be implemented, because it's not significantly harder to do one type over the other
  • Multiline strings were one of the harder things to get right in my original port. You have to seek 1-2 characters ahead to search for more ' characters so you know the difference between an empty string and a multiline quote. Whitespace was also nontrivial

Here's my original implementations for some of those parser features (admittedly not my best work):

I discovered hjson recently and would like to use it for a config system. But this project seems to be not workable. There's a PR out for updating to serde 1.0. Can we get that over the finish line?
#24

That PR fails on CI. As long as that isn't fixed I don't see how progress can be made. The author didn't fix those errors. I can just assume that this is part of the hang-up.

Canop commented

I don't know if this comment is welcome here: I made an alternate serde 1.0 crate for Hjson (deserializer only): https://github.com/Canop/deser-hjson

It's not based on hjson-rust and may have not exactly the same use cases (I've written a FAQ).

There is nothing wrong with posting alternatives imo. And I like the deserializing-only approach. Makes maintaining easier as well! Thanks for sharing! I've always been a huge fan of HJSON

Closing due to having merged #24.