CrossRef/pdfextract

font_metrics.rb:42:in `initialize': undefined method `ascent'

eelcovisser opened this issue · 15 comments

I installed pdf-extract using gem install and I'm getting the following error. A change in the library?

Update: downgrading to ruby-1.9.1 does not help

$ pdf-extract --trace extract --references --titles d912f50dae928909ed.pdf
/Users//.rvm/gems/ruby-1.9.3-p362/gems/pdf-extract-0.1.1/lib/font_metrics.rb:42:in `initialize': undefined method `ascent' for #<PDF::Reader::Font:0x007fc611c82650> (NoMethodError)
        from /Users//.rvm/gems/ruby-1.9.3-p362/gems/pdf-extract-0.1.1/lib/model/characters.rb:134:in `new'
        from /Users//.rvm/gems/ruby-1.9.3-p362/gems/pdf-extract-0.1.1/lib/model/characters.rb:134:in `block in build_fonts'
        from /Users//.rvm/gems/ruby-1.9.3-p362/gems/pdf-extract-0.1.1/lib/model/characters.rb:131:in `each'
        from /Users//.rvm/gems/ruby-1.9.3-p362/gems/pdf-extract-0.1.1/lib/model/characters.rb:131:in `build_fonts'
        from /Users//.rvm/gems/ruby-1.9.3-p362/gems/pdf-extract-0.1.1/lib/model/characters.rb:163:in `block (2 levels) in include_in'
        from /Users//.rvm/gems/ruby-1.9.3-p362/gems/pdf-extract-0.1.1/lib/pdf.rb:81:in `call'
        from /Users//.rvm/gems/ruby-1.9.3-p362/gems/pdf-extract-0.1.1/lib/pdf.rb:81:in `block (2 levels) in expand_listeners_to_callback_methods'
        from /Users//.rvm/gems/ruby-1.9.3-p362/gems/pdf-extract-0.1.1/lib/pdf.rb:170:in `block in invoke_calls'
        from /Users//.rvm/gems/ruby-1.9.3-p362/gems/pdf-extract-0.1.1/lib/pdf.rb:169:in `each'
        from /Users//.rvm/gems/ruby-1.9.3-p362/gems/pdf-extract-0.1.1/lib/pdf.rb:169:in `invoke_calls'
        from /Users//.rvm/gems/ruby-1.9.3-p362/gems/pdf-extract-0.1.1/lib/pdf-extract.rb:42:in `block in parse'
        from /Users//.rvm/gems/ruby-1.9.3-p362/gems/pdf-extract-0.1.1/lib/pdf-extract.rb:38:in `each'
        from /Users//.rvm/gems/ruby-1.9.3-p362/gems/pdf-extract-0.1.1/lib/pdf-extract.rb:38:in `parse'
        from /Users//.rvm/gems/ruby-1.9.3-p362/gems/pdf-extract-0.1.1/lib/pdf-extract.rb:53:in `view'
        from /Users//.rvm/gems/ruby-1.9.3-p362/gems/pdf-extract-0.1.1/bin/pdf-extract:115:in `block (4 levels) in <top (required)>'
        from /Users//.rvm/gems/ruby-1.9.3-p362/gems/pdf-extract-0.1.1/bin/pdf-extract:112:in `each'
        from /Users//.rvm/gems/ruby-1.9.3-p362/gems/pdf-extract-0.1.1/bin/pdf-extract:112:in `block (3 levels) in <top (required)>'
        from /Users//.rvm/gems/ruby-1.9.3-p362/gems/commander-4.1.3/lib/commander/command.rb:180:in `call'
        from /Users//.rvm/gems/ruby-1.9.3-p362/gems/commander-4.1.3/lib/commander/command.rb:180:in `call'
        from /Users//.rvm/gems/ruby-1.9.3-p362/gems/commander-4.1.3/lib/commander/command.rb:155:in `run'
        from /Users//.rvm/gems/ruby-1.9.3-p362/gems/commander-4.1.3/lib/commander/runner.rb:402:in `run_active_command'
        from /Users//.rvm/gems/ruby-1.9.3-p362/gems/commander-4.1.3/lib/commander/runner.rb:78:in `run!'
        from /Users//.rvm/gems/ruby-1.9.3-p362/gems/commander-4.1.3/lib/commander/delegates.rb:11:in `run!'
        from /Users//.rvm/gems/ruby-1.9.3-p362/gems/commander-4.1.3/lib/commander/import.rb:10:in `block in <top (required)>'

same here.. any idea what this is about?

kjw commented

Hi guys,

Can you try to force an install of the dependency "pdf-reader", version 0.1.1? It seems that later versions of this dependency have moved some methods around. Also, you will need a 1.9.3 version of Ruby, since the code uses Array#sort_by!, which I believe was introduced in 1.9.3.

It has been a while since I looked at this code but I'm planning to get back to it in the next few weeks. First task will be a rationalisation of dependencies - support the latest version of each, and support for Ruby >= 1.9.1.

Hi there, thanks a lot for getting back!

Tried installing version 0.1.1 of the pdf-reader gem, but that doesn't seem to exist (ERROR: Could not find a valid gem 'pdf-reader' (= 0.1.1) in any repository). In case that was a typo and you meant 1.1.1: I tried that and it does seem to work.

Worked with Ruby 1.9.1 too (well as far as I can see at least - didn't get any errors).

kjw commented

Ah yes, I meant 1.1.1! Odd that it works with 1.9.1. I guess I must be wrong about sort_by! only being 1.9.3.
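
One way to settle the sort_by! question on a given interpreter is to ask Ruby directly rather than rely on memory. A minimal sketch; the helper name sort_by_bang_available? and the sample data are made up for illustration:

```ruby
# Report whether the running interpreter provides Array#sort_by!.
def sort_by_bang_available?
  Array.method_defined?(:sort_by!)
end

puts "Ruby #{RUBY_VERSION}: Array#sort_by! available? #{sort_by_bang_available?}"

# In-place sort by a key, guarded so the script still runs where the
# method is missing.
words = %w[pear fig banana]
words.sort_by!(&:length) if sort_by_bang_available?
p words
```

Running this under each installed Ruby (for example via rvm) shows which ones can execute code that calls sort_by!.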

Just had the same issue, with the same work around succeeding.

I'm having the same issue. Was originally presenting under Ruby 2.0.0 and pdf-reader 1.1.1. Downgraded to Ruby 1.9.3, reacquired pdf-extract and pdf-reader 1.1.1 and error still persists. Any ideas? I am a newbie to Ruby so go easy on me :)

kjw commented

Hi there,

From looking at the feedback on GitHub Issues, it seems that people are getting pdf-extract to work with Ruby 1.9.3 and Ruby 2.0.0 so long as they switch to pdf-reader 1.1.1. Currently pdf-extract is defined with a dependency on pdf-reader 1.1.0, so I recommend doing a force install of pdf-reader:

$ gem uninstall pdf-reader
$ gem install pdf-reader -v 1.1.1

Though, if you are including pdf-extract as a gem dependency in a project managed by bundler, you will want to do this in your bundler-managed project directory:

$ bundle exec gem uninstall pdf-reader
$ bundle exec gem install pdf-reader -v 1.1.1

At least, I think that should change the version of the gem that bundler is using. Better would be to patch the gemspec file in pdf-extract and set its pdf-reader dependency to 1.1.1. Then you won't overwrite any changes to the bundler gems when you do a subsequent 'bundle install'.

Hope this helps...
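
For a bundler-managed project, another option is to pin the version in the project's Gemfile rather than uninstalling gems by hand. This is only a sketch: it assumes pdf-extract's own gemspec constraint permits pdf-reader 1.1.1 (otherwise Bundler will refuse to resolve), and the source URL is the standard RubyGems one rather than anything specific to this project:

```ruby
# Gemfile (sketch, not taken from the pdf-extract repository)
source "https://rubygems.org"

gem "pdf-extract"

# Pin the transitive dependency explicitly so that a later
# `bundle install` does not pull in a newer pdf-reader.
gem "pdf-reader", "1.1.1"
```

After editing the Gemfile, `bundle install` followed by `bundle exec pdf-extract ...` should run against the pinned version.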

kjw commented

Sorry I just reread your comment - you've already tried this.

Any chance you could post some output?

Ta.

Hey! Turns out that I had two versions of pdf-reader installed - 1.1.1 and 1.3.3. I removed both and just reinstalled 1.1.1 and now it works. Thank you!
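
A quick way to spot this situation is to list every installed version of the gem through the RubyGems API. A minimal sketch; installed_versions is a made-up helper name, and it only inspects the gem paths visible to the Ruby that runs it:

```ruby
require "rubygems"

# Return every installed version of a gem as strings, oldest first.
# An unknown gem name yields an empty array.
def installed_versions(gem_name)
  Gem::Specification.find_all_by_name(gem_name).map(&:version).sort.map(&:to_s)
end

versions = installed_versions("pdf-reader")
if versions.length > 1
  puts "Multiple pdf-reader versions installed: #{versions.join(', ')}"
else
  puts "pdf-reader versions installed: #{versions.inspect}"
end
```

If more than one version is listed, RubyGems may activate a different version from the one you expect, which matches the behaviour described above.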

Hi

I am getting this error. This is my setup:

pdf-extract (0.1.1)
pdf-reader (1.1.1)

I'd really like to get this great program running!!
Thanks in advance

I got the exact same problem:
undefined method `ascent' for #<PDF::Reader::Font:0x00000002576988>. Use --trace to view backtrace
pdf-extract --trace
/var/lib/gems/1.9.1/gems/commander-4.1.5/lib/commander/runner.rb:398:in `run_active_command': invalid command (Commander::Runner::InvalidCommandError)
        from /var/lib/gems/1.9.1/gems/commander-4.1.5/lib/commander/runner.rb:78:in `run!'
        from /var/lib/gems/1.9.1/gems/commander-4.1.5/lib/commander/delegates.rb:11:in `run!'
        from /var/lib/gems/1.9.1/gems/commander-4.1.5/lib/commander/import.rb:10:in `block in <top (required)>'

I have tried downgrading to pdf-reader 1.1.1 and upgrading to Ruby 1.9.3, but nothing worked. I would really like to use this nice tool, please help me...
thanks in advance

kjw commented

I think you're seeing two problems.

The first is a missing ascent method on PDF::Reader::Font. Not sure why that isn't present in pdf-reader 1.1.1 as someone above in this thread got pdf-extract working against that version.

Second is that pdf-extract is not accepting a '--trace' parameter - which is unfortunately stopping you from printing a trace for the first issue.

From what I remember pdf-extract at some point applied a monkey patch, or whatever the term is, to the pdf-reader Font class to include an ascent method. At some point I believe this was taken out because pdf-reader incorporated the method, I believe in version 1.1.1 onwards. Thus the method was taken out of pdf-extract. Not sure why this has now disappeared from pdf-reader, too.
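
To see whether the loaded Font class actually provides the method, the class can be asked directly. A minimal sketch: missing_instance_methods and StubFont are made-up names so the example runs without pdf-reader installed, and descent appears only to give the stub one method to find:

```ruby
# Return the names from `wanted` that instances of `klass` lack.
def missing_instance_methods(klass, wanted)
  wanted.reject { |name| klass.method_defined?(name) }
end

# Hypothetical stand-in for PDF::Reader::Font, defining descent only.
class StubFont
  def descent; end
end

missing = missing_instance_methods(StubFont, [:ascent, :descent])
p missing

# With pdf-reader installed, the same check against the real class
# would look like this (left commented out so the sketch stays runnable):
#   require "pdf/reader"
#   p missing_instance_methods(PDF::Reader::Font, [:ascent])
#   p Gem.loaded_specs["pdf-reader"].version  # version actually activated
```

Printing the activated version alongside the result helps tell a genuinely missing method apart from a different pdf-reader version being loaded than the one intended.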

I'm a bit lost trying to fix this issue here. Replacing pdf-reader-1.3.3 by pdf-reader-1.1.1 (the fix suggested above) breaks prawn-0.14.0 (which depends on pdf-reader ~> 1.2). This wouldn't be a problem, except that then pdf-extract complains it couldn't activate prawn-0.14.0.

pdf-extract extract --references Beu_2010_BAP_377-378.pdf 
/System/Library/Frameworks/Ruby.framework/Versions/2.0/usr/lib/ruby/2.0.0/rubygems/specification.rb:1990:in `raise_if_conflicts': Unable to activate prawn-0.14.0, because pdf-reader-1.1.1 conflicts with pdf-reader (~> 1.2) (Gem::LoadError)
    from /System/Library/Frameworks/Ruby.framework/Versions/2.0/usr/lib/ruby/2.0.0/rubygems/specification.rb:1163:in `activate'
    from /System/Library/Frameworks/Ruby.framework/Versions/2.0/usr/lib/ruby/2.0.0/rubygems/specification.rb:1199:in `block in activate_dependencies'
    from /System/Library/Frameworks/Ruby.framework/Versions/2.0/usr/lib/ruby/2.0.0/rubygems/specification.rb:1185:in `each'
    from /System/Library/Frameworks/Ruby.framework/Versions/2.0/usr/lib/ruby/2.0.0/rubygems/specification.rb:1185:in `activate_dependencies'
    from /System/Library/Frameworks/Ruby.framework/Versions/2.0/usr/lib/ruby/2.0.0/rubygems/specification.rb:1167:in `activate'
    from /System/Library/Frameworks/Ruby.framework/Versions/2.0/usr/lib/ruby/2.0.0/rubygems/core_ext/kernel_gem.rb:48:in `gem'
    from /usr/bin/pdf-extract:22:in `<main>'

How did you guys work around this issue?
Thanks

UPDATE: Ok, worked around the problem here by installing a previous version of prawn (0.12.0):

$ sudo gem install prawn -v 0.12.0
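
The conflict in the error message can be reproduced with RubyGems' own version classes. A minimal sketch; the "~> 1.2" constraint is the one quoted in the Gem::LoadError above, and the two versions are the ones mentioned in this thread:

```ruby
require "rubygems"

# prawn 0.14.0's constraint on pdf-reader, as quoted in the error message.
requirement = Gem::Requirement.new("~> 1.2")

# Check each pdf-reader version from the thread against that constraint.
results = %w[1.1.1 1.3.3].map do |v|
  [v, requirement.satisfied_by?(Gem::Version.new(v))]
end

results.each do |version, ok|
  puts "pdf-reader #{version} satisfies '#{requirement}'? #{ok}"
end
```

"~> 1.2" accepts any 1.x release from 1.2 upwards, so 1.1.1 is rejected while 1.3.3 is accepted, which is why prawn 0.14.0 cannot be activated alongside pdf-reader 1.1.1.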

The fix mentioned above doesn't fix anything for me! :(

For the record, afsartori's workaround worked for me on Ubuntu 14.04: I installed prawn 0.12.0 and pdf-reader 1.1.1, uninstalled pdf-reader 1.3.3, and was able to run pdf-extract successfully.