postmodern/spidr

Malformed mailto error

Closed this issue · 2 comments

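The Spidr gem does not seem to handle malformed mailto addresses properly: it raises an exception while extracting the links from a page instead of skipping the bad link. I can't see this problem arising very often, but it did in my case.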

  • Here is the HTML that is causing the error (notice the commas instead of dots in the href):

<a href="mailto:user@example,org,uk">user@example.org.uk</a>

  • The error:

/usr/lib/ruby/1.8/uri/generic.rb:732:in `merge': unrecognised opaque part for mailtoURL: user@example,org,uk (URI::InvalidComponentError)
from /usr/lib/ruby/gems/1.8/gems/spidr-0.2.5/lib/spidr/page.rb:509:in `to_absolute'
from /usr/lib/ruby/gems/1.8/gems/spidr-0.2.5/lib/spidr/page.rb:495:in `urls'
from /usr/lib/ruby/gems/1.8/gems/spidr-0.2.5/lib/spidr/page.rb:495:in `map'
from /usr/lib/ruby/gems/1.8/gems/spidr-0.2.5/lib/spidr/page.rb:495:in `urls'
from /usr/lib/ruby/gems/1.8/gems/spidr-0.2.5/lib/spidr/agent.rb:587:in `visit_page'
from /usr/lib/ruby/gems/1.8/gems/spidr-0.2.5/lib/spidr/agent.rb:513:in `get_page'
from /usr/lib/ruby/gems/1.8/gems/spidr-0.2.5/lib/spidr/agent.rb:678:in `prepare_request'
from /usr/lib/ruby/gems/1.8/gems/spidr-0.2.5/lib/spidr/agent.rb:507:in `get_page'
from /usr/lib/ruby/gems/1.8/gems/spidr-0.2.5/lib/spidr/agent.rb:573:in `visit_page'
from /usr/lib/ruby/gems/1.8/gems/spidr-0.2.5/lib/spidr/agent.rb:244:in `run'
from /usr/lib/ruby/gems/1.8/gems/spidr-0.2.5/lib/spidr/agent.rb:226:in `start_at'
from /usr/lib/ruby/gems/1.8/gems/spidr-0.2.5/lib/spidr/agent.rb:197:in `site'
from /usr/lib/ruby/gems/1.8/gems/spidr-0.2.5/lib/spidr/agent.rb:124:in `initialize'
from /usr/lib/ruby/gems/1.8/gems/spidr-0.2.5/lib/spidr/agent.rb:194:in `new'
from /usr/lib/ruby/gems/1.8/gems/spidr-0.2.5/lib/spidr/agent.rb:194:in `site'
from /usr/lib/ruby/gems/1.8/gems/spidr-0.2.5/lib/spidr/spidr.rb:96:in `site'

This was recently fixed in git by zapnap. I will do a version bump to Spidr 0.2.7.

http://github.com/postmodern/spidr/commit/282bf8acf81eaf29d7271ee933c70a16ea7f8f02

Please upgrade to Spidr 0.2.7. Re-open this issue if the problem persists.
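
For anyone who cannot upgrade right away, here is a rough sketch of the kind of guard that avoids the crash. It is not the code from the commit above, and safe_absolute is a hypothetical helper name, not part of Spidr's API. It only assumes Ruby's standard uri library, and it treats any link that URI cannot parse or merge as unusable:

```ruby
require 'uri'

# Hypothetical helper (not part of Spidr): resolve +link+ against +base+,
# returning nil instead of raising when the link is malformed.
def safe_absolute(base, link)
  URI(base).merge(link.to_s)
rescue URI::InvalidURIError, URI::InvalidComponentError
  # Malformed links, such as the mailto address above, end up here.
  nil
end

base  = 'http://example.com/dir/page.html'
links = ['other.html', 'mailto:user@example,org,uk']

# Keep only the links that could be resolved.
urls = links.map { |link| safe_absolute(base, link) }.compact
puts urls
```

Dropping the bad link silently is a judgment call; logging it would be just as reasonable if you want to find the broken markup later.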