essandess/adblock2privoxy

converter does not recognise rules with leading TAB

Closed this issue · 4 comments

It turns out there are filter lists, such as this one, that indent their rules with a leading tab for readability. adblock2privoxy generates many empty rules for those lines, which leads Privoxy to block almost everything and makes the Internet unusable.

As a workaround, I used the code below, which downloads the ruleset locally and strips the leading whitespace before the converter processes it.

wget https://raw.githubusercontent.com/maciejtarmas/AlleBlock/master/alleblock.txt
sed -i -e 's/^[ \t]*//' alleblock.txt
adblock2privoxy http://local.website.com/alleblock.txt
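
The effect of the `sed` step can be checked on a small sample before running the converter. This is only a sketch: the `strip_leading_blanks` helper and the sample lines are made up for illustration, and it uses the POSIX class `[[:blank:]]` (space and tab) because not every `sed` understands `\t` inside a bracket expression.

```shell
# Hypothetical helper (not part of adblock2privoxy):
# strip leading spaces and tabs from every line read on stdin.
strip_leading_blanks() {
    sed -e 's/^[[:blank:]]*//'
}

# Sample filter list with indented rules, modelled on alleblock.txt.
sample=$(printf '! Strona główna\n\tallegro.pl##div[data-box-name="Showcase main"]\n \tallegro.pl##div[data-box-name="reklamy APE"]\n')

# Normalise the sample and print it; every rule now starts in column 1.
cleaned=$(printf '%s\n' "$sample" | strip_leading_blanks)
printf '%s\n' "$cleaned"
```

Piping through a function like this writes the cleaned list to stdout, which avoids relying on `sed -i`, whose syntax differs between GNU and BSD `sed`.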

I am not sure how this should be implemented in the converter, but looking at the code I found that InputParser.hs has the following:

lineSpaces :: Parser ()
lineSpaces = skipMany (satisfy isLineSpace) <?> "white space"
    where isLineSpace c = c == ' ' || c == '\t'

Maybe some changes to include tabs there could take care of this?

Please upstream issues like broken rules in other repos.

How are those rules broken? They work just fine in Adblock/uBlock.

I must have missed the issue. Please excerpt an example that illustrates the issue.

The filter file has a leading tab before its rules: https://github.com/maciejtarmas/AlleBlock/blob/master/alleblock.txt

! Strona główna

	allegro.pl##div[data-box-name="Showcase main"]
	allegro.pl##div[data-box-name="Showcase brand and marketing"]
	allegro.pl##div[data-box-name="reklamy APE"]

In such a case a2p creates empty rules instead of the appropriate element-hiding rules. a2p expects a rule to start at the beginning of the line, so when it finds a TAB there it treats the TAB as the rule and disregards the rest of the line.
One would expect the TAB to be skipped in such a case.
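
To tell whether a given filter list is affected before converting it, the indented rules can be counted. This is a rough sketch, not something a2p provides: `count_indented_rules` is a hypothetical helper, and the `example.com##.ad` line is an invented rule added only for contrast.

```shell
# Hypothetical helper: count lines on stdin that start with spaces or tabs
# and are followed by actual content (whitespace-only lines are ignored).
count_indented_rules() {
    grep -cE '^[[:blank:]]+[^[:blank:]]'
}

# Two TAB-indented rules, one comment, one empty line, one unindented rule.
printf '! Strona główna\n\n\tallegro.pl##div[data-box-name="Showcase main"]\n\tallegro.pl##div[data-box-name="reklamy APE"]\nexample.com##.ad\n' \
    | count_indented_rules   # prints 2
```

A non-zero count means the list contains rules that a2p would currently turn into empty rules, so it needs the whitespace-stripping step first.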