rollbar/rollbar-gem

[bug] Receiving API reject 400 for some requests when using Ruby 2.6.0

johannesluedke opened this issue · 12 comments

For some error from my rails application, I get an error entry at rollbar

"Response
Timestamp: 2019-01-11 02:19 pm
Status code: 400
Response body:
{
  "message": "invalid or missing instance", 
  "err": 1
}

Request

Relative URL: /api/1/item/
Request body: {
  "{\"access_token\":\"<omitted>",\"data\":{\"timestamp\":1547212741,\"environment\":\"production\",\"level\":\"error\",\"language\":\"ruby\",\"framework\":\"Rails: 5.1.6.1\",\"server\":{\"host\":\"<omitted longer rest of request body>

How can this be fixed?

the request body length is ~ 65,527 characters

@johannesluedke we'll investigate.

@jessicahsieh could you verify that the payload is properly formed and that the request body is within the acceptable length for our API?

We hit this as well.

@johannesluedke did you start getting this after switching to ruby 2.6.0?

See https://mensfeld.pl/2019/01/exploring-a-critical-netprotocol-issue-in-ruby-2-6-0p0-and-how-it-can-lead-to-a-security-problem/

We switched back to 2.5.3 and the problem went away. Looks like 2.6.0 isn't ready for production use yet....

@johannesluedke could you show me your configuration and account/project name so i can investigate?

@jessewgibbs ill look into it. @letaniaferreira and i have been in contact with @benhutton via intercom and looks like the issue is related to ruby 2.6.0 although havent had a change to look into what is causing it. the payload is getting formatted properly for most of the errors but not all.

@jessicahsieh I wrote you a message via within rollbar.

--
@benhutton we use rollbar for a while now, and saw this error first on Jan 10, after we switched to Ruby 2.6.0 in the beginning of the year. So yes, it could be related. If Ruby 2.6.1 were out yet, one could try to upgrade to it

@johannesluedke @benhutton any update on whether there's a new Ruby release yet that solves this problem? I'm inclined to treat this as a known issue vs. trying to fix it since it seems like Ruby isn't behaving as expected.

@jessewgibbs there is no new Ruby release yet. The fix has been made, but the code hasn't been released as a new version yet.

As for whether or not you should try to fix things for the rollbar gem

  1. If you did fix it, I still wouldn't upgrade to ruby 2.6.0. Too many unknowns at this point. I'm waiting til 2.6.1 regardless. But I don't know that that should matter...
  2. You could see how many of your customers are actually affected by this by checking for occurrences of API error: invalid or missing instance since Christmas (which is when new ruby ships).
  3. You could see how many of your customers are potentially affected by this by checking backtraces for the string vendor/bundle/ruby/2.6.0. Just because a customer hasn't sent a too-big payload yet doesn't mean that they won't. Every user running ruby 2.6.0 is vulnerable to this right now.
  4. IMO, Rollbar is in a special class of tools. You are the thing that helps me figure out if other things are breaking. If you're breaking, I'm screwed! Thus, "Ruby isn't behaving as expected" coupled with "hopefully Ruby will ship a fix ASAP" is an adequate response for most other gems in my stack. But for Rollbar, I need to know that you're there to tell me that all those other gems are breaking, if they do in fact break.
  5. The particular error that occurred, an API error, did not get reported to us in all of our usual channels. I think I got it via email, because I get everything via email. But in slack, where most of our ops team is looking, it didn't match the settings we've configured Rollbar with to tell us when things are exploding.

So to summarize, I think you should ship a fix. I wouldn't use it. But I want Rollbar to be the type of company that would ship a fix for this. And I think it would serve other customers well.

Alternatively, you could email all of your customers that are running on ruby 2.6.0, pointing out the ruby bug and suggesting they downgrade to ruby 2.5.3 until ruby 2.6.1 releases. That would be an even more helpful thing to do, as not every rubyist knows about this error yet, I would imagine.

@benhutton thanks for the guidance on this. We've issued a warning to our customers advising against using Rollbar with Ruby 2.6.0 until the maintainers release the fix to https://bugs.ruby-lang.org/issues/15472.

Has anyone tried patching this with confirmed fix before downgrading?

I deployed a patch reopening protocol.rb with the single line change from core, temporarily, so far so good.

https://gist.github.com/rromanchuk/0b239ca4a31bd67c12c566250501837b#file-protocol-rb-L51

The bug is specific to multi-byte characters. (The length is erroneously calculated based on characters instead of bytes.)

There are two apparent ways to work around in rollbar-gem.

  1. If we try to avoid it using MAX_PAYLOAD_SIZE, Ruby IO#write_nonblock doesn't guarantee a size that won't get split. It's platform dependent, and can also be affected by system interrupts, and I'm sure other factors.
  2. We can patch Net::BufferedIO.write0, applying the same fix as was committed to Ruby, same as @rromanchuk did: https://gist.github.com/rromanchuk/0b239ca4a31bd67c12c566250501837b#file-protocol-rb-L51

Option #1 may not work at all times or on all platforms. Option #2 requires patching a core Ruby class (only for 2.6.0 of course), but should be expected to to work in all environments.

Is this fixed now that 2.6.1 is out?

@brandoncc Yes, this is fixed in 2.6.1.

For 2.6.0, PR #814 is currently pending, and may be included in the next rollbar-gem release.