ruby/psych

JRuby: Provide some sort of ability to configure SnakeYAML LoaderOptions

chadlwilson opened this issue · 10 comments

Currently, it appears the JRuby Psych extension does not configure the SnakeYaml LoaderOptions or allow the user to override them somehow.

The parser seems to be constructed at

@JRubyMethod
public IRubyObject parse(ThreadContext context, IRubyObject yaml, IRubyObject path) {
Ruby runtime = context.runtime;
boolean tainted = yaml.isTaint() || yaml instanceof RubyIO;
try {
parser = new ParserImpl(readerFor(context, yaml));

There is an optional second arg to allow specifying options.

There is some discussion related to this because of CVEs being raised by OSS Fuzz about the risks of using SnakeYaml (or most YAML parsers really) with fully untrusted content. I am unclear if folks might be using JRuby and thus Psych to do this, e.g a YAML-based API or config approach. Having said this, the default LoaderOptions do seem to be relatively "secure by default" so this doesn't seem to be a security issue.

Possible things folks might want to do with LoaderOptions.

  1. Increase/decrease maxAliasesForCollections or nestingDepthLimit which are DoS/billion laughs mitigations
  2. Disallow duplicate keys allowDuplicateKeys

Related discussion

This is a good idea. I'm not sure the best way to include it without introducing an incompatibility. Does libyaml have such configuration options? It would be best if we could provide a uniform interface for setting options, and if possible map the same options to the same logic in both backend libraries.

I've gone ahead and merged #613 even though there's no equivalent in the libyaml extension. This at least gets exposes these settings for JRuby users, and we'll deal with aligning this API once the libyaml version gains the same ability.

Thanks @headius - that looks like a reasonable solution to me. Thanks for your efforts here and looking forward to JRuby 9.4.1.0! 😀

Let me know if you'd like help with the admin chores of backporting #613 to Psych 4.x and 3.x (for JRuby 9.4 and 9.3 respectively). My feeling is we might want to get fixes for both. Having said that, if there are other JRuby-specific fixes that might need backporting at the same time might be easier to batch them up somehow :-/

We are planning to release JRuby 9.4.1 with psych 5.1, and are considering doing the same for 9.3.11 but the safe_load change makes me a little reluctant there. Backporting might be possible but I don't know if there's any plans to maintain 3.x or 4.x, and this change bumps yaml compatibility up to 1.2 so that's a concern as well.

Oh, but I guess we could backport just the config changes. The API is not substantially different in that area so it would be a pretty easy patch.

@chadlwilson it should be trivial to backport the config patch to earlier versions of Psych. The question is whether the other maintainers have any appetite for branching and releasing updates to those versions.

OK, thanks! Personally I am not too worried as the JRuby 9.4 upgrade was pretty painless for us, so not reliant on JRuby 9.3 anymore. Given no need to backport for 9.4.x there's a bit more added friction than it'd be for 9.3.x as we'd be going back to psych 3.x.

Possible reason for considering is that the user at jruby/jruby#7543 was on JRuby 9.3. Maybe we can see if there is more noise or other friction preventing folks upgrading to 9.4 which they probably need to be doing anyway to get back to a supported Ruby version which their libraries will be maintained for (rather than Ruby 2.6).

Yeah really the only reason to upgrade 9.3 to SnakeYAML is because of the CVE noise. There are a number of reasons not to do it.

Yeah, my suggestion would be to not worry about the CVE noise and SnakeYAML engine change due to possible surface area of change and noise for you folks if it breaks things on 9.3, and if technically possible just cherry-pick the loader options config flexibility work back to Psych 3.x to allow folks to workaround YAML issues on 9.3 if it's an issue for them to upgrade to 9.4.

But even that might not be worth the effort :-)