logstash-plugins/logstash-filter-xml

How to remove HTML markup with filter-xml?

DBARUNNER opened this issue · 1 comment

I am fetching data with the rss input, but the data that ends up in Elasticsearch still contains HTML markup. I just want the data, not the HTML markup.
How can I remove HTML markup with filter-xml?

kares commented

Since the plugin uses Nokogiri, it could be possible to have filter { xml => ... } act as filter { html => },
or alternatively to have filter { xml => { html => true ... } }. None of this is implemented at the moment.

Back to the original question: if it's proper HTML (or XHTML), you can just parse it like an XML document.
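
As a rough sketch of what parsing it like an XML document could look like: the config below assumes the HTML sits in the `message` field (adjust `source` if your rss events use a different field) and that it is well-formed, with a single root element. `message_text` is a made-up destination field name. The XPath expression collects every text node, so the markup itself is left behind.

```
filter {
  xml {
    # Assumption: the HTML/XHTML string is in the "message" field.
    source => "message"
    # Do not store the whole parsed document; only the xpath result is kept.
    store_xml => false
    # Collect all text nodes into the hypothetical "message_text" field.
    xpath => [ "//text()", "message_text" ]
  }
}
```

The destination field should end up as an array of strings, one per text node, so it may still need joining afterwards. Content that is not well-formed XML (unclosed tags, several top-level elements, undeclared entities such as `&nbsp;`) will most likely fail to parse, in which case the event gets the `_xmlparsefailure` tag instead.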