rubys/nokogumbo

Markup errors not reported

gkellogg opened this issue · 2 comments

Prior to 1.5 (e.g., 1.4.13), Nokogumbo would report markup errors such as the following:

doc = Nokogiri::HTML5.parse('<!DOCTYPE html> <html')
doc.errors #==> [#<Nokogiri::XML::SyntaxError: 1:22: ERROR: @1:22: Tokenizer error with an unimplemented error message.
<!DOCTYPE html> <html
                     ^>] 

With 1.5, this error is no longer reported. This is most likely due to a change in Gumbo, but it is unfortunate.

rubys commented

Try:

doc = Nokogiri::HTML5.parse('<!DOCTYPE html> <html', max_parse_errors: 100)

See: #65
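
Once errors are being collected again, a small helper can make them easier to scan. The following is only a sketch: `summarize_parse_errors` is a hypothetical name, not part of Nokogumbo, and it assumes nothing beyond `doc.errors` returning objects whose `to_s` starts with the position and message, as in the 1.4.13 output quoted above.

```ruby
# Hypothetical helper (not part of Nokogumbo): reduces each entry in
# doc.errors to the first line of its message, dropping the source
# excerpt and caret lines that follow it.
def summarize_parse_errors(doc)
  doc.errors.map { |error| error.to_s.lines.first.to_s.strip }
end

# Usage with Nokogumbo 1.5, assuming the gem is installed:
#   require 'nokogumbo'
#   doc = Nokogiri::HTML5.parse('<!DOCTYPE html> <html', max_parse_errors: 100)
#   puts summarize_parse_errors(doc)
```

The helper only reads `doc.errors`, so it behaves the same whether the list is empty (no `max_parse_errors` given) or populated.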

Thanks, that did it.