nmelnick/Domain-PublicSuffix

Does not or brokenly handle hostnames ending in a dot as common in DNS queries

Closed this issue · 2 comments

I have the following script named just-the-2nd-level-domain.pl:

#!/usr/bin/perl

use strict;
use warnings;
use 5.010;

use Domain::PublicSuffix;
my $suffix = Domain::PublicSuffix->new({ allow_unlisted_tld => 0 });

while (my $domain = <>) {
    chomp($domain);
    say ($suffix->get_root_domain($domain) or say $suffix->error());
}

But the results are unexpected if I add a trailing dot:

~ → echo www.example.com | bin/just-the-2nd-level-domain.pl
example.com
~ → echo www.example.com. | bin/just-the-2nd-level-domain.pl
www.example.com.

Cause is probably line 218 in lib/Domain/PublicSuffix.pm:

    218:                 $root_domain =~ s/^.*\.(.*?\.$suffix)$/$1/;

Because $suffix has been correctly set to com without trailing dot, that regexp will not match and hence nothing gets replaced. (An assertion that this regexp must replace at least something might help here to catch such issues in the future.)

So far I only looked through the code and haven't checked if this is really the cause. But I might implement the suspected fix and create a pull request with it later.

Note: I am using version 0.19 as packaged in Debian 11 Bullseye (current Debian Stable release as of this writing) with Perl 5.32 (from Debian as well).

Yep, this patch fixes the issue:

--- a/lib/Domain/PublicSuffix.pm
+++ b/lib/Domain/PublicSuffix.pm
@@ -215,7 +215,7 @@
 		$self->root_domain($suffix);
 	} else {
 		my $root_domain = $domain;
-		$root_domain =~ s/^.*\.(.*?\.$suffix)$/$1/;
+		$root_domain =~ s/^.*\.(.*?\.$suffix)\.?$/$1/;
 		$self->root_domain($root_domain);
 	}
 

Outcome with the patch:

$ echo www.example.com. | ~/bin/just-the-2nd-level-domain.pl
example.com

Will create a pull-request for that.

Thank you so much for your contribution!