gjtorikian/html-pipeline

Sanitizing inline style attributes

Closed this issue · 2 comments

In v2, it was possible to sanitize inline style attributes by allowing only certain CSS properties.
This appears to be no longer possible in v3 due to the move from Nokogiri to Selma.

This prevents Thredded from updating to v3, because we rely on inline style sanitization to allow text-align on table cells (needed to fully support Markdown tables). See thredded/thredded#981.
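For illustration, the kind of allow-list filtering described above can be sketched in plain Ruby. This is not the v2 implementation and it uses no html-pipeline, Nokogiri, or Selma API; the names `ALLOWED_CSS` and `sanitize_style` are hypothetical, and the parsing is deliberately naive (it splits on `;` and `:` and only accepts simple keyword values).

```ruby
# Hypothetical allow-list: CSS property => permitted keyword values.
# Not part of html-pipeline, Selma, or Thredded.
ALLOWED_CSS = {
  "text-align" => %w[left right center justify]
}.freeze

# Keep only the declarations whose property AND value are on the allow-list.
# Returns the filtered style string, or nil when nothing survives.
def sanitize_style(style, allowed = ALLOWED_CSS)
  kept = style.to_s.split(";").filter_map do |declaration|
    property, value = declaration.split(":", 2).map { |part| part.strip.downcase }
    next if property.nil? || value.nil?
    next unless allowed.fetch(property, []).include?(value)

    "#{property}: #{value}"
  end
  kept.empty? ? nil : kept.join("; ")
end

puts sanitize_style("text-align: right; color: red").inspect
```

Checking the value as well as the property name means something like `text-align: url(...)` is dropped rather than passed through.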

I don't know how you're doing the markdown conversion here, but I think there's a misunderstanding. What you're actually trying to do should be possible with the default setup:

require "html_pipeline"

# Default Markdown conversion plus the default sanitization config
pipeline = HTMLPipeline.new(
  convert_filter: HTMLPipeline::ConvertFilter::MarkdownFilter.new,
  sanitization_config: HTMLPipeline::SanitizationFilter::DEFAULT_CONFIG
)

text = <<~TXT
| Option | Description |
| ------:| -----------:|
| data   | path to data files to supply the data that will be passed into templates. |
| engine | engine to be used for processing templates. Handlebars is the default. |
| ext    | extension to be used for dest files. |
TXT

pipeline.call(text)

#{:output=>
#  "<table>\n<thead>\n<tr>\n<th align=\"right\">Option</th>\n<th align=\"right\">Description</th>\n</tr>\n</thead>\n<tbody>\n<tr>\n<td align=\"right\">data</td>\n<td align=\"right\">path to data files to supply the data that will be passed into templates.</td>\n</tr>\n<tr>\n<td align=\"right\">engine</td>\n<td align=\"right\">engine to be used for processing templates. Handlebars is the default.</td>\n</tr>\n<tr>\n<td align=\"right\">ext</td>\n<td align=\"right\">extension to be used for dest files.</td>\n</tr>\n</tbody>\n</table>"}

As expected, the columns have an align="right". Does this not work for you?

Additionally, there is a way to manipulate the style attribute during the pipeline, but again, it shouldn't be necessary for this issue.

I'm going to close this as I think the problem is solved and I like to keep my issues clean, but feel free to continue the conversation.

The align attribute works, but it is deprecated: https://developer.mozilla.org/en-US/docs/Web/HTML/Element/th#deprecated_attributes
We use our own markdown filter that uses Kramdown under the hood.
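As a rough sketch of one way around the deprecated attribute, the align values in output like the above could be rewritten to an inline text-align style as a string-level post-processing step. This is an assumption for illustration only: `align_to_text_align` and `ALIGN_VALUES` are hypothetical names, no html-pipeline or Kramdown API is involved, and the regex only handles `<td>`/`<th>` tags whose sole attribute is `align`, as in the output shown earlier.

```ruby
# Hypothetical helper, applied to the final HTML string after the pipeline
# has run. Only matches <td>/<th> tags whose single attribute is align="...".
ALIGN_VALUES = %w[left right center].freeze

def align_to_text_align(html)
  html.gsub(/<(td|th) align="(\w+)">/) do
    tag = Regexp.last_match(1)
    value = Regexp.last_match(2).downcase
    if ALIGN_VALUES.include?(value)
      # Known value: express it as CSS instead of the deprecated attribute.
      %(<#{tag} style="text-align: #{value}">)
    else
      # Unknown value: drop the attribute entirely.
      "<#{tag}>"
    end
  end
end

puts align_to_text_align('<th align="right">Option</th>')
```

A real implementation would more likely do this on parsed elements inside the pipeline rather than with a regex on the output string.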