Remove subdomains if their parent domain is already included
ameshkov opened this issue · 2 comments
@Yuki2718 commented on Sat Mar 28 2020
Related: AdguardTeam/AdguardFilters#47398 (comment)
Since the DNS filter uses ABP syntax, there is no point in including subdomains when their parent domain is already included. Why not add a step that removes such subdomains when the list is compiled? That would save a little bandwidth for users.
Basically, it means that we should extend the "Compress" transformation to ABP-style lists:
https://github.com/AdguardTeam/HostlistCompiler#compress
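For illustration, here is a minimal Python sketch of what such a pass could look like for ABP-style lists. It is not HostlistCompiler's code: the function name `remove_redundant_subdomains` and the `PLAIN_RULE` pattern are hypothetical, and the sketch assumes that only bare `||domain^` rules (no modifiers, no wildcards, no `@@` exceptions) are safe to compare.

```python
import re

# Assumption: only the simplest ABP-style blocking rule, ||example.org^, is handled.
# Rules with modifiers ($important, ...), wildcards, or exceptions (@@) pass through untouched.
PLAIN_RULE = re.compile(r"^\|\|([a-z0-9.-]+)\^$")


def remove_redundant_subdomains(rules):
    """Drop ||sub.parent.com^ when ||parent.com^ is also present in the list."""
    # First pass: collect every domain blocked by a plain rule.
    blocked = set()
    for rule in rules:
        m = PLAIN_RULE.match(rule.strip().lower())
        if m:
            blocked.add(m.group(1))

    # Second pass: keep a plain rule only if none of its parent domains is blocked.
    kept = []
    for rule in rules:
        m = PLAIN_RULE.match(rule.strip().lower())
        if m:
            labels = m.group(1).split(".")
            # Proper parents of a.b.c are b.c and c.
            parents = (".".join(labels[i:]) for i in range(1, len(labels)))
            if any(parent in blocked for parent in parents):
                continue
        kept.append(rule)
    return kept
```

Note that a subdomain rule is removed only when the parent rule is already in the list; the sketch never widens a rule to its parent.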
I came here to ask the exact same thing. Obviously we can't just chop off subdomains and block the parent across the board, or else ||metrics.apple.com^ would turn into ||apple.com^ and block all of apple.com, which would usually be undesirable, of course.
However, if it's possible to implement this in a 'smart' way, that would be awesome. For example, if ||parent.com^ exists, then delete the redundant ||subdomain.parent.com^ entries. Likewise, if ||*telemetry*^ exists, delete all ||telemetry.domain.com^ entries because they're already covered, etc.
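The wildcard case could be sketched roughly as follows, in Python. Everything here is an assumption made for illustration: the helper names are hypothetical, and the matching is a simplification (`*` matches any run of characters, and `||` lets the pattern start at any label boundary of the hostname) rather than the exact semantics of a real DNS filtering engine.

```python
import re

# Assumption: only bare rules are considered, with no modifiers or exceptions.
PLAIN_RULE = re.compile(r"^\|\|([a-z0-9.-]+)\^$")
WILDCARD_RULE = re.compile(r"^\|\|([a-z0-9.*-]*\*[a-z0-9.*-]*)\^$")


def wildcard_to_regex(pattern):
    """Turn '*telemetry*' into a regex where each '*' matches any run of characters."""
    return re.compile(".*".join(re.escape(part) for part in pattern.split("*")))


def covered_by_wildcard(hostname, regexes):
    """True if any wildcard matches the hostname starting at a label boundary."""
    labels = hostname.split(".")
    suffixes = [".".join(labels[i:]) for i in range(len(labels))]
    return any(rx.fullmatch(s) for rx in regexes for s in suffixes)


def remove_wildcard_covered(rules):
    """Drop plain ||host^ rules that an existing wildcard rule already covers."""
    regexes = []
    for rule in rules:
        m = WILDCARD_RULE.match(rule.strip().lower())
        if m:
            regexes.append(wildcard_to_regex(m.group(1)))

    kept = []
    for rule in rules:
        m = PLAIN_RULE.match(rule.strip().lower())
        if m and covered_by_wildcard(m.group(1), regexes):
            continue
        kept.append(rule)
    return kept
```

This check compares every plain rule against every wildcard rule, so on a 500,000-line list it would be noticeably slower than the set lookup that suffices for the parent-domain case.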
I hoped this tool would already do that, but it only deleted a few comment lines from "List source 1" and appended "List source 2" to the end, without accounting for any of the above. The resulting file was 500,000 lines long; deduplicated as described above, it would have been at most half that size. I hope this can be figured out somehow, so please consider this a +1.