appsignal/appsignal-ruby

Compiling native extension fails when having `psych >= 5`

manuelvanrijn opened this issue ยท 21 comments

Timebox: 0.5 days

Describe the bug

After we've downgraded back to psych 4 compilation worked again.

To Reproduce

We are on ruby 3.1.3p185 (2022-11-24 revision 1a6b16756e) [aarch64-linux-musl]

/app # gem install psych:5
/app # gem install appsignal:3.2.2

Error output:

Building native extensions. This could take a while...
ERROR:  Error installing appsignal:
	ERROR: Failed to build gem native extension.

    current directory: /usr/local/bundle/gems/appsignal-3.2.2/ext
/usr/local/bin/ruby -I /usr/local/lib/ruby/3.1.0 extconf.rb
/usr/local/lib/ruby/3.1.0/psych.rb:455:in `parse_stream': undefined method `parse' for #<Psych::Parser:0x0000ffffa35c9628 @handler=#<Psych::Handlers::DocumentStream:0x0000ffffa35c9b78 @stack=[], @last=nil, @root=nil, @start_line=nil, @start_column=nil, @end_line=nil, @end_column=nil, @block=#<Proc:0x0000ffffa35c96f0 /usr/local/lib/ruby/3.1.0/psych.rb:399>>, @external_encoding=0> (NoMethodError)

      parser.parse yaml, filename
            ^^^^^^
	from /usr/local/lib/ruby/3.1.0/psych.rb:399:in `parse'
	from /usr/local/lib/ruby/3.1.0/psych.rb:323:in `safe_load'
	from /usr/local/lib/ruby/3.1.0/psych.rb:369:in `load'
	from /usr/local/bundle/gems/appsignal-3.2.2/ext/base.rb:12:in `<top (required)>'
	from <internal:/usr/local/lib/ruby/3.1.0/rubygems/core_ext/kernel_require.rb>:85:in `require'
	from <internal:/usr/local/lib/ruby/3.1.0/rubygems/core_ext/kernel_require.rb>:85:in `require'
	from extconf.rb:1:in `<main>'

extconf failed, exit code 1

Gem files will remain installed in /usr/local/bundle/gems/appsignal-3.2.2 for inspection.
Results logged to /usr/local/bundle/extensions/aarch64-linux-musl/3.1.0/appsignal-3.2.2/gem_make.out

Thank you. This solved the problem for us. Ours was on Jenkins, using rvm, where psych 5.x somehow got installed, and was then secretly used to build appsignal.
gem uninstall psych
to list the versions, then
gem uninstall psych -v 5.0.1
to get rid of them
then
gem install psych -v 4.0.6

Thanks for reporting this issue, but I am unable to reproduce with the commands you provided.

The error is triggered when the Psych::Parser#parse method is not found. In Psych 4, this method is defined by the Psych C-extension. This error could occur when the C-extension failed to install, but that version of the psych gem is installed when Ruby is installed. I would expect the entire Ruby version to fail to install then.

In Psych 5 the parse method is defined in the Ruby gem. This method calls a C function called _native_parse, which is the renamed version of the parse function.

My best guess is that the wrong C-extension is loaded when the psych 5 gem is installed in your case. That particular scenario I have not been able to reproduce. I'm using chruby, not rvm, but my colleague using rvm can't reproduce this issue either.

To only way I get this error, is by installing psych using the gem install psych --default command (which installs version 5.0.1). That will break the entire install though so don't run it unless you like reinstalling Ruby. I couldn't even run gem list and gem uninstall after that.
My guess is that the psych 5 gem C-extension then becomes the default psych C-extension to load, but the gem install command (and others) still use the psych 4 codebase, shipped with Ruby itself.

Was psych 5 installed using gem install psych or some other way originally?

The issue itself isn't in the AppSignal gem, so I can't fix it.

tijn commented

@tombruijn, for us, psych 5 gets installed because rdoc depends on it. Bundler simply tries to install the latest version.

Appsignal doesn't list psych in the gemspec (because it's already installed with Ruby?) but rdoc explicitly specifies their dependency like such:

s.add_dependency 'psych', '>= 4.0.0'

I think that this issue can be resolved if AppSignal too could write out the dependency on psych but with an extra requirement since AppSignal appears to be incompatible with the newest version:

s.add_dependency 'psych', '>= 4.0.0', '< 5.0.0'

This will also stop dependabot from trying to upgrade libraries that cannot be upgraded.

@tombruijn Sorry for the delay. Here's an example how you can reproduce the problem:

  1. docker run --rm -ti ruby:3.1-alpine3.15 apk add --update --no-cache build-base && sh
  2. vi Gemfile
source 'https://rubygems.org'

gem 'sdoc'
gem 'appsignal', '3.2.2'
  1. bundle install
  2. vi Gemfile and change the version number of appsignal to 3.3.1. So:
source 'https://rubygems.org'

gem 'sdoc'
gem 'appsignal', '3.3.1'
  1. Run bundle install

crap... @manuelvanrijn, you beat me to it! :P

Created an repo which also can reproduces the error: https://github.com/Memoriam-tv/appsignal-ruby-test-report/actions

mvz commented

For me, just having psych 5 installed breaks the appsignal installation, even when psych 5 is not part of the bundle.

Check, @tombruijn update to 3.3.1 is failing in all our repo's because of this.

Thanks for the extra information!
I find it amazing I still can't reproduce this locally. Even with the instructions for the alpine container. That installs fine for me ๐Ÿคทโ€โ™‚๏ธ

I'll try some more with the test repo.

We may just need to add psych as a dependency for a while then. It's on my wishlist to remove the YAML dependency. It has broken so many times before already. (Private issue for removing YAML.)

Finally reproduced! I needed to make very sure that psych5 was installed first by bundler, which didn't always happen for some reason ๐Ÿคทโ€โ™‚๏ธ

A better way to test this is to first bundle without AppSignal and then with AppSignal in the Gemfile.

I can add psych as a dependency, but this causes the following:

  • If I lock it to psych version 4, you can't use it in apps using psych 5.
  • If I lock it to psych version 5, you can't use it in apps using psych 4. And it still fails to install anyway.

Either would not be ideal.

If I remove our YAML file parsing from the gem installation entirely, surprise! It reads YAML when we try to figure out Ruby's proxy settings, using Gem.configuration[:http_proxy] and fails on that instead.

I think this is a bug in Ruby. It looks to me it's loading the wrong psych extension and Ruby code combination.

I'll try to double sure confirm this theory and write up a report and/or try to see if we can figure out the proxy setting another way and go the YAML-less route.

tijn commented

If I lock it to psych version 4, you can't use it in apps using psych 5.

But that's fine since AppSignal is actually not compatible with psych 5 @tombruijn . I agree that it's not ideal but this is the reality for the AppSignal gem in its current state.

Going the YAML-less route sounds like a great idea although, in my opinion, that can be treated as a separate issue.

mvz commented

Adding the dependency won't be enough, since having psych 5 installed without it being part of the bundle is enough to trigger the problem.

You're right. It doesn't help to specify psych 4 in the bundle.

This is not a fun bug. I think I'm going for the non-YAML route and remove the http proxy thing that deep down also parses some YAML. That way it will work in all scenarios, expect when you rely on the Ruby gems http proxy config, for which people can then specify the HTTP_PROXY env var as a workaround.

Partial fix sent in as a PR: #913
That still has the issue with the Gem.configuration[:http_proxy] line causing a similar psych issue later on. I'll address that in another PR.

tijn commented

@tombruijn Oh, right! My "solution" only works if you have an isolated app with its own gems (like we have in our Docker setup). I completely forgot about other situations :-)

@tijn no worries. I also forgot to mention I tested our YAML-less install on Ruby 3.2 with pysch 5 and that does work. Meaning upgrading Ruby to version 3.2 is also an option.

We are compatible with psych 5, it's just that any Ruby version with version 4 in the standard library doesn't work, which is most Rubies.

I've submitted another part of the fix in PR #914. The two PRs #913 and #914, when merged, should fix it.

I've released the fix for this in AppSignal for Ruby gem 3.3.2.

Only caveat: We try to read the .gemrc file, but if this can't be done because of this issue it's ignored. Instead use the HTTP_PROXY env var.

Please try out this new version and let us know if it works or not!

tijn commented

Please try out this new version and let us know if it works or not!

I just installed the newest gem in all our apps @tombruijn and there's no trace of this issue anymore. ๐Ÿ‘

Looks good on my deploys! cheers!

I've reported it with Ruby upstream: https://bugs.ruby-lang.org/issues/19371

This issue was fixed in RubyGems 3.4: rubygems/rubygems#6490
Please update to the latest RubyGems with: gem update --system if you run into this issue.