Treat http|https and www.site.com|site.com as dupes
http://site.com/
https://site.com/
http://www.site.com/
https://www.site.com/
The sites listed above are duplicates, but the program does not recognize them as such.
This is intentional, because there are cases where such sites differ.
If you want the above to be considered duplicates, enable expert mode and switch all rules to "off" except rule 7 (replace ^http: by https) and rule 8 (replace ^([^:]*://)www?\d*\. by $1). (The latter rule will also treat e.g. http://www2.site.com or https://www70.site.com as duplicates; if you do not want that, remove the \d* part.)
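For anyone who wants to check what these two rules do before changing the settings, here is a minimal sketch in Python. It is not the extension's code: the names RULES and normalize are made up for illustration, the $1 backreference is written as \1 because that is the syntax of Python's re module, and the replacement for rule 7 is assumed to be https: (with the colon) so that the :// separator survives.

```python
import re

# Hypothetical re-creation of rules 7 and 8 from the comment above.
# Assumption: rule 7 replaces "http:" with "https:" (colon kept).
RULES = [
    (re.compile(r"^http:"), "https:"),              # rule 7: unify the scheme
    (re.compile(r"^([^:]*://)www?\d*\."), r"\1"),   # rule 8: drop www, www2, www70, ...
]


def normalize(url: str) -> str:
    """Apply every rule in order and return the rewritten URL."""
    for pattern, replacement in RULES:
        url = pattern.sub(replacement, url, count=1)
    return url


urls = [
    "http://site.com/",
    "https://site.com/",
    "http://www.site.com/",
    "https://www.site.com/",
]

# All four URLs from the report collapse to the same normalized form.
print({normalize(u) for u in urls})
```

Two bookmarks count as duplicates in this sketch when their normalized URLs are equal, which is the behaviour the reply describes for expert mode.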
Perfect, excellent! That is why this extension is the best. I like these advanced rules: they give the user the freedom to decide what is or is not a duplicate. Congratulations on the work!
Just to help other people: I also added two more rules before rule 9 and disabled all the others.
^http://www. by http://
^https://www. by https://
Problem solved!
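One detail worth knowing about the two rules above: if the extension treats them as ordinary regular expressions, the unescaped dot after www matches any character, not just a literal dot. The short Python sketch below illustrates the difference. It assumes standard regex semantics, and the names strip_www_unescaped and strip_www_escaped are invented for the example.

```python
import re

# "." matches any single character; "\." matches only a literal dot.
UNESCAPED = re.compile(r"^http://www.")
ESCAPED = re.compile(r"^http://www\.")


def strip_www_unescaped(url: str) -> str:
    """Rule as written in the comment above (dot not escaped)."""
    return UNESCAPED.sub("http://", url, count=1)


def strip_www_escaped(url: str) -> str:
    """Same rule with the dot escaped."""
    return ESCAPED.sub("http://", url, count=1)


# Both variants agree on the common case.
print(strip_www_unescaped("http://www.site.com/"))   # http://site.com/
print(strip_www_escaped("http://www.site.com/"))     # http://site.com/

# They differ when another character follows "www".
print(strip_www_unescaped("http://www2.site.com/"))  # http://.site.com/
print(strip_www_escaped("http://www2.site.com/"))    # http://www2.site.com/
```

For plain www. hosts the two forms behave the same, so the rules as posted do their job. Escaping the dot only matters if your bookmarks include hosts such as www2.site.com.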