[BUG]
AllanKlaus opened this issue · 2 comments
AllanKlaus commented
Describe the bug
I get an error when I try to get the result from Nokogiri while crawling a site (the same call works under byebug).
To Reproduce
require 'nokogiri'
require 'open-uri'
lesson_link = 'https://escuelasabatica.co/escuela-sabatica-cuarto-trimestre-2020-educacion-cristiana-introduccion/'
lesson = Nokogiri::HTML(URI.open(lesson_link))
lesson.search('.post_content')
- See error
.rvm/gems/ruby-2.7.1/gems/ruby_jard-0.3.0/lib/ruby_jard/session.rb:69:in `write': "\xC2" from ASCII-8BIT to UTF-8 (Encoding::UndefinedConversionError)
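For context, Ruby raises this class of error when a string tagged as ASCII-8BIT (binary) contains bytes above 0x7F and is transcoded to UTF-8. Below is a minimal sketch of that behaviour using plain `String#encode`. It is not Jard's code, and the sample bytes and the `transcode_error` helper are assumptions made for illustration only.

```ruby
# Hypothetical sample: the UTF-8 bytes of a non-breaking space,
# but tagged as binary (ASCII-8BIT) via String#b.
raw = "\xC2\xA0".b

# Hypothetical helper: returns the exception raised while transcoding
# to UTF-8, or nil if the conversion succeeds.
def transcode_error(str)
  str.encode(Encoding::UTF_8)
  nil
rescue Encoding::UndefinedConversionError => e
  e
end

# Binary bytes above 0x7F have no defined mapping to UTF-8,
# so this conversion fails.
error = transcode_error(raw)

# If the bytes are already valid UTF-8, relabelling them (without
# converting) yields a usable UTF-8 string.
relabelled = raw.dup.force_encoding(Encoding::UTF_8)

puts error.class
puts relabelled.valid_encoding?
```

Relabelling with `force_encoding` only helps when the bytes really are UTF-8. Whether this matches the fix on Jard's master branch is not stated in this thread.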
Expected behavior
Retrieve the content of the object.
(byebug) lesson.search('.post_content')
[#<Nokogiri::XML::Element:0x3b9c name="div" attributes=[#<Nokogiri::XML::Attr:0x341c name="class" value="post_content">, #<Nokogiri::XML::Attr:0x3430 name="itemprop" value="articleBody">] children=[#<Nokogiri::XML::Text:0x3444 "\n">, #<Nokogiri::XML::Element:0x3480 name="style" attributes=[#<Nokogiri::XML::Attr:0x3458 name="type" value="text/css">] children=[#<Nokogiri::XML::CDATA:0x346c "\r\n\r\n.bssb-buttons {\r\ndisplay:inline-block;\r\nfloat:left;\r\npadding-bottom:7px;\r\n}\r\n\r\n.bssb-buttons > .facebook{\r\nfont-size: 13px;\r\nborder-radius: 3px;\r\nmargin-right: 6px;\r\nbackground: #2d5f9a;\r\nposition: relative;\r\ndisplay: inline-block;\r\ncursor: pointer;\r\nheight: 41px;\r\nwidth: 164px;\r\ncolor: #FFF;\r\nline-height:41px;\r\nbackground: url(https://escuelasabatica.co/wp-content/plugins/big-social-share-buttons/facebook.png) no-repeat 10px 12px #2D5F9A;\r\npadding-left: 35px;\r\nbox-shadow: 0px 4px 0px #A4A4A4, 0px 2px 0px rgba(0, 0, 0, 0.35);\r\n}\r\n\r\n.bssb-buttons > .twitter{\r\nfont-size: 13px;\r\nborder-radius: 3px;\r\nmargin-right: 6px;\r\nbackground: #00c3f3;\r\nposition: relative;\r\ndisplay: inline-block;\r\ncursor: pointer;\r\nheight: 41px;\r\nwidth: 165px;\r\ncolor: #FFF;\r\nline-height:41px;\r\nbackground: url(https://escuelasabatica.co/wp-content/plugins/big-social-share-buttons/twitter.png) no-repeat 10px 14px #00c3f3;\r\npadding-left:37px;\r\nbox-shadow: 0px 4px 0px #A4A4A4, 0px 2px 0px rgba(0, 0, 0, 0.35);\r\n}\r\n\r\n.bssb-buttons > .google {\r\nfont-size: 13px;\r\nborder-radius: 3px;\r\nmargin-right: 6px;\r\nbackground: #eb4026;\r\nposition: relative;\r\ndisplay: inline-block;\r\ncursor: pointer;\r\nheight: 41px;\r\nwidth: 165px;\r\ncolor: #FFF;\r\nline-height:41px;\r\nbackground: url(https://escuelasabatica.co/wp-content/plugins/big-social-share-buttons/google.png) no-repeat 10px 11px #eb4026;\r\npadding-left:37px;\r\nbox-shadow: 0px 4px 0px #A4A4A4, 0px 2px 0px rgba(0, 0, 0, 0.35);\r\n}\r\n\r\n\r\n ">]>,
Environment (please complete the following information):
- OS: Ubuntu 20
- Terminal: Terminator
- tput colors: 256
- echo $TERM: xterm-256color
- stty: speed 38400 baud; line = 0; \n-brkint -imaxbel iutf8
- Do you use tmux/screen or similar tty multiplexer? No
0x2c7 commented
Hi @AllanKlaus, could you try again with Jard from the master branch? If I'm not mistaken, this bug is already fixed but not released yet.
gem 'ruby_jard', git: 'https://github.com/nguyenquangminh0711/ruby_jard', ref: 'master'
AllanKlaus commented
> Hi @AllanKlaus, could you try again with Jard from the master branch? If I'm not mistaken, this bug is already fixed but not released yet.
> gem 'ruby_jard', git: 'https://github.com/nguyenquangminh0711/ruby_jard', ref: 'master'
It worked.