BerndGabriel/HtmlViewer

HTMLViewer as HTML Parser

Closed this issue · 2 comments

Hello,

Could HTMLViewer be used as an HTML parser instead of MSHTML's IHTMLDocument2?

Best regards.

The current HtmlViewer cannot parse HTML into a meaningful, modifiable document model.

Unfortunately, it interprets much of the HTML input while reading it. For example, it combines all inline tags into a single object to speed up text formatting and floating. It also applies CSS styling while reading.

Several years ago I started implementing a new THtmlViewer with a separate document model (branch HtmlViewer2), but I had to abandon it for lack of time.

The HTML and CSS parsers were working and created HtmlDocument.THtmlDocuments. If you are interested, I could commit the last changes, which are still on my hard disk.
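To illustrate what a separate, modifiable document model means in practice, here is a minimal sketch. It is not HtmlViewer or HtmlViewer2 code: it uses Python's standard `html.parser` module as a stand-in, and `Node`, `TreeBuilder`, and `parse` are hypothetical names made up for this example. The parser only builds a tree. It applies no styling and keeps every inline tag as its own node, so each one can be changed after parsing.

```python
from html import escape
from html.parser import HTMLParser

VOID_TAGS = {"br", "hr", "img", "input", "link", "meta"}


class Node:
    """Hypothetical tree node for illustration; not an HtmlViewer type."""

    def __init__(self, tag, attrs=None):
        self.tag = tag                  # element name, "#text", or "#document"
        self.attrs = dict(attrs or [])  # attribute name -> value (or None)
        self.text = ""                  # only used by "#text" nodes
        self.parent = None
        self.children = []

    def append(self, child):
        child.parent = self
        self.children.append(child)
        return child

    def serialize(self):
        if self.tag == "#text":
            return escape(self.text, quote=False)
        inner = "".join(child.serialize() for child in self.children)
        if self.tag == "#document":
            return inner
        attrs = "".join(
            f" {name}" if value is None else f' {name}="{escape(value)}"'
            for name, value in self.attrs.items()
        )
        if self.tag in VOID_TAGS:
            return f"<{self.tag}{attrs}>"
        return f"<{self.tag}{attrs}>{inner}</{self.tag}>"


class TreeBuilder(HTMLParser):
    """Builds a tree only: no CSS processing, no merging of inline tags."""

    def __init__(self):
        super().__init__()
        self.root = Node("#document")
        self.current = self.root

    def handle_starttag(self, tag, attrs):
        node = self.current.append(Node(tag, attrs))
        if tag not in VOID_TAGS:
            self.current = node  # descend into the new element

    def handle_endtag(self, tag):
        # Close the nearest open element with this name; ignore stray end tags.
        node = self.current
        while node is not self.root and node.tag != tag:
            node = node.parent
        if node is not self.root:
            self.current = node.parent

    def handle_data(self, data):
        text = Node("#text")
        text.text = data
        self.current.append(text)


def parse(html_source):
    builder = TreeBuilder()
    builder.feed(html_source)
    builder.close()
    return builder.root


doc = parse("<p>Hello <b>bold</b> and <i>italic</i></p>")
p = doc.children[0]
p.children[1].tag = "strong"  # modify a single inline element after parsing
print(doc.serialize())  # prints <p>Hello <strong>bold</strong> and <i>italic</i></p>
```

A parser that merges inline tags into one formatted object while reading cannot support the last three lines, because the individual `<b>` element no longer exists once reading is done.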

Hello,

Thank you for your suggestion.

The libxml HTML parser (http://xmlsoft.org/html/libxml-HTMLparser.html) is enough for my needs.

Best regards.