Malformed mailto error
The Spidr gem does not seem to handle malformed mailto addresses properly. I don't expect this problem to come up very often, but it did in my case.
- Here is the HTML that causes the error (note the commas instead of dots):
<a href="mailto:user@example,org,uk">user@example.org.uk</a>
- The error:
    /usr/lib/ruby/1.8/uri/generic.rb:732:in `merge': unrecognised opaque part for mailtoURL: user@example,org,uk (URI::InvalidComponentError)
    from /usr/lib/ruby/gems/1.8/gems/spidr-0.2.5/lib/spidr/page.rb:509:in `to_absolute'
    from /usr/lib/ruby/gems/1.8/gems/spidr-0.2.5/lib/spidr/page.rb:495:in `urls'
    from /usr/lib/ruby/gems/1.8/gems/spidr-0.2.5/lib/spidr/page.rb:495:in `map'
    from /usr/lib/ruby/gems/1.8/gems/spidr-0.2.5/lib/spidr/page.rb:495:in `urls'
    from /usr/lib/ruby/gems/1.8/gems/spidr-0.2.5/lib/spidr/agent.rb:587:in `visit_page'
    from /usr/lib/ruby/gems/1.8/gems/spidr-0.2.5/lib/spidr/agent.rb:513:in `get_page'
    from /usr/lib/ruby/gems/1.8/gems/spidr-0.2.5/lib/spidr/agent.rb:678:in `prepare_request'
    from /usr/lib/ruby/gems/1.8/gems/spidr-0.2.5/lib/spidr/agent.rb:507:in `get_page'
    from /usr/lib/ruby/gems/1.8/gems/spidr-0.2.5/lib/spidr/agent.rb:573:in `visit_page'
    from /usr/lib/ruby/gems/1.8/gems/spidr-0.2.5/lib/spidr/agent.rb:244:in `run'
    from /usr/lib/ruby/gems/1.8/gems/spidr-0.2.5/lib/spidr/agent.rb:226:in `start_at'
    from /usr/lib/ruby/gems/1.8/gems/spidr-0.2.5/lib/spidr/agent.rb:197:in `site'
    from /usr/lib/ruby/gems/1.8/gems/spidr-0.2.5/lib/spidr/agent.rb:124:in `initialize'
    from /usr/lib/ruby/gems/1.8/gems/spidr-0.2.5/lib/spidr/agent.rb:194:in `new'
    from /usr/lib/ruby/gems/1.8/gems/spidr-0.2.5/lib/spidr/agent.rb:194:in `site'
    from /usr/lib/ruby/gems/1.8/gems/spidr-0.2.5/lib/spidr/spidr.rb:96:in `site'
This was recently fixed in git by zapnap. I will do a version bump to Spidr 0.2.7.
http://github.com/postmodern/spidr/commit/282bf8acf81eaf29d7271ee933c70a16ea7f8f02
Please upgrade to Spidr 0.2.7. Re-open this issue if the problem persists.
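For anyone who cannot upgrade right away, here is a rough sketch of the general idea: rescue the URI error while resolving each link, and skip the links that fail. This is not necessarily what the linked commit does. The helper name `safe_absolute_url` and the example base URL are made up for illustration, and the code uses only Ruby's standard `uri` library, not Spidr's internals.

```ruby
require 'uri'

# Hypothetical helper, not Spidr's own code: resolve +link+ against
# +base_url+, returning nil when Ruby's URI library rejects the link.
def safe_absolute_url(base_url, link)
  URI.parse(base_url.to_s).merge(link.to_s)
rescue URI::Error
  # URI::InvalidComponentError and URI::InvalidURIError both inherit
  # from URI::Error, so a malformed mailto: link ends up here.
  nil
end

# Illustrative input: one ordinary relative link plus the malformed
# mailto from this report.
links = ['contact.html', 'mailto:user@example,org,uk']

# Drop the links that could not be resolved instead of aborting the crawl.
urls = links.map { |link| safe_absolute_url('http://example.org/about/', link) }.compact
puts urls
```

Rescuing `URI::Error` rather than only `URI::InvalidComponentError` is a deliberate choice in this sketch, since other kinds of malformed links raise sibling errors from the same hierarchy.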